Research and applications

Are Meaningful Use Stage 2 certified EHRs ready for interoperability? Findings from the SMART C-CDA Collaborative

John D D'Amore,1,2 Joshua C Mandel,3,4 David A Kreda,5 Ashley Swain,1 George A Koromia,1 Sumesh Sundareswaran,1 Liora Alschuler,1 Robert H Dolin,1 Kenneth D Mandl,3,4,6 Isaac S Kohane,3,6 Rachel B Ramoni6,7

For numbered affiliations see end of article.

Correspondence to John D D'Amore, Diameter Health, Inc., 1005 Boylston St #304, Newton, MA 02461, USA; jdamore@diameterhealth.com

JDD, JCM, and DAK contributed equally to this work.

Received 16 April 2014; Revised 3 June 2014; Accepted 5 June 2014; Published Online First 26 June 2014

ABSTRACT
Background and objective Upgrades to electronic health record (EHR) systems scheduled to be introduced in the USA in 2014 will advance document interoperability between care providers. Specifically, the second stage of the federal incentive program for EHR adoption, known as Meaningful Use, requires use of the Consolidated Clinical Document Architecture (C-CDA) for document exchange. In an effort to examine and improve C-CDA based exchange, the SMART (Substitutable Medical Applications and Reusable Technology) C-CDA Collaborative brought together a group of certified EHR and other health information technology vendors.
Materials and methods We examined the machine-readable content of collected samples for semantic correctness and consistency. This included parsing with the open-source BlueButton.js tool, testing with a validator used in EHR certification, scoring with an automated open-source tool, and manual inspection. We also conducted group and individual review sessions with participating vendors to understand their interpretation of C-CDA specifications and requirements.
Results We contacted 107 health information technology organizations and collected 91 C-CDA sample documents from 21 distinct technologies. Manual and automated document inspection led to 615 observations of errors and data expression variation across represented technologies. Based upon our analysis and vendor discussions, we identified 11 specific areas that represent relevant barriers to the interoperability of C-CDA documents.
Conclusions We identified errors and permissible heterogeneity in C-CDA documents that will limit semantic interoperability. Our findings also point to several practical opportunities to improve C-CDA document quality and exchange in the coming years.

BACKGROUND AND SIGNIFICANCE
Health Level 7 (HL7), a leading standards development organization for electronic health information, defines interoperability as 'the ability of two parties, either human or machine, to exchange data or information where this deterministic exchange preserves shared meaning.'1 In addition, semantic interoperability has been operationally defined to be 'the ability to import utterances from another computer without prior negotiation, and have your decision support, data queries and business rules continue to work reliably against these utterances.'2

In our study, we apply the operational definition of semantic interoperability to assess structured data within Consolidated Clinical Document Architecture (C-CDA) documents, which certified electronic health record (EHR) systems must produce to satisfy federal regulation of EHR adoption. We study core variation in document samples to examine if reliable semantic interoperability is possible.

EHR adoption and Meaningful Use
EHR use in the USA has risen rapidly since 2009, with certified EHRs now used by 78% of office-based physicians and 85% of hospitals.3 4 Meaningful Use (MU), a staged federal incentive program enacted as part of the American Recovery and Reinvestment Act of 2009, has paid incentives of US$21 billion to hospitals and physicians for installing and using certified EHRs pursuant to specific objectives.5 6 Stage 1 of the program (MU1) commenced in 2011, Stage 2 (MU2) in 2014, and Stage 3 is expected by 2017.
While the term interoperability can refer to messages, documents, and services, MU provides several objectives that prioritize document interoperability.7 Although multiple document standards existed prior to MU1, providers with installed EHRs rarely had the capability to send structured patient care summaries to external providers or patients, as noted by the President's Council of Advisors on Science and Technology and the Institute of Medicine.8 9 MU1 advanced document interoperability by requiring Continuity of Care Document (CCD) or Continuity of Care Record (CCR) implementation as part of EHR certification. Many vendors chose the CCD, which was created to harmonize the CCR with more widely implemented standards.10 11 In MU2, the C-CDA, an HL7 consolidation of the MU1 CCD with other clinical document types, became the primary standard for document-based exchange.12

C-CDA use in document interoperability
The C-CDA is a library of templates using extensible markup language (XML) to transmit patient-specific medical data in structured and unstructured formats.13 It builds upon HL7's Clinical Document Architecture release 2.0 (CDA) and the Reference Information Model (RIM), a consensus view of how information can be abstractly represented.14 The CDA constrains the RIM by applying principles of how to represent information in clinical documents. The C-CDA Implementation Guide 1.1 describes how to create nine CDA document types (table 1), each a combination of specific sections (eg, problems, allergies) and entries (eg, diagnosis of heart failure, medication allergy to penicillin). Moreover, different document types (eg, a history and physical vs discharge summary) share common sections to achieve consistency in data representation. MU2 objectives include the use of C-CDA documents for both human display and machine-readable data exchange.7 15 Since C-CDA implementation guidance requires both data structured in XML and specific terminologies, healthcare providers can generate machine-readable documents for individual care transitions and across a practice to prevent EHR vendor lock-in. Previous research has cataloged issues associated with past interoperability standards, but research specific to C-CDA is still limited given the nascent utilization of the standard.16–19

OBJECTIVES
Those of us ( JCM, DAK, KDM, ISK, RBR) involved in the SMART (Substitutable Medical Applications and Reusable Technology) Platforms Project, an Office of the National Coordinator for Health Information Technology (ONC)-funded Strategic Health IT Advanced Research Project, have been exploring ways to integrate medical apps across diverse EHRs.20 21 To assess the current state of C-CDA interoperability and prepare recommendations to improve document quality, the SMART team engaged Lantana Consulting Group in April 2013 to form the SMART C-CDA Collaborative. The Collaborative approached health information technology vendors for a study of C-CDA quality and variability. Vendors who participated in the Collaborative provided 2011 Certified EHR Technology for a majority of provider attestations for MU1 from 2011 to 2013.22 While several vendor applications received 2014 EHR certification before joining the Collaborative, most received it during the Collaborative's term, which ended in December 2013. To identify both application-specific and general means to improve the quality of C-CDA documents, we engaged vendors in discussions and document reviews to refine our analysis, as well as to hear how and why vendors made certain implementation decisions. Our interaction with vendors may have influenced the quality of C-CDA documents used during certification, but many reported that the feedback from discussion would only be incorporated into future application releases.

MATERIALS AND METHODS
Vendor outreach
We e-mailed invitations to organizations listed on the Certified Health IT Product List (http://oncchpl.force.com/ehrcert).23 In cases where SMART or Lantana had prior contact with individuals within vendor organizations, we sent personal invitations.
We posted public announcements on the SMART Platforms website and on HL7's Structured Document Working Group mailing list. We provided further details to interested organizations by phone, informing them of the means for sample collection and group discussions.

Collection of samples
As a condition of participation, we required vendors to submit at least one C-CDA document that had been generated by exporting a fictional patient's health record from their software application. To allay concerns, we allowed submitted documents to be kept private, but nonetheless encouraged vendor participants to select a sharing policy that included public posting to a GitHub repository managed by Boston Children's Hospital (https://github.com/chb/sample_ccdas).

Automated parsing of samples
C-CDA samples were parsed using the open-source BlueButton.js tool v0.0.19, to which one of the authors ( JCM) has contributed.24 We have previously used BlueButton.js to integrate C-CDA data into medical applications. Using node.js, we parsed each C-CDA twice: a first pass wrote all data to a text file, and a second pass only recorded parsing times to isolate file-writing time artifacts. All processing was performed on a quad-core AMD 2.2 GHz workstation with 6 GB of RAM running Windows 7 (Microsoft, Redmond, Washington, USA). We counted each non-null section and the JavaScript Object Notation data elements returned from the parser.
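The two-pass timing can be sketched as a small Node harness. The version below is illustrative rather than the Collaborative's actual script: the parseCcda function is a stand-in for the BlueButton.js parsing call (whose real API may differ), and ./samples is a hypothetical input directory.

```typescript
// Illustrative two-pass timing harness; not the Collaborative's actual script.
import { readFileSync, writeFileSync, readdirSync } from "fs";
import { join } from "path";
import { performance } from "perf_hooks";

// Stand-in for the BlueButton.js parser (the real library's API may differ):
// here it just counts <section> elements as a trivial proxy for parsed output.
function parseCcda(xml: string): Record<string, unknown> {
  return { sectionCount: (xml.match(/<section/g) ?? []).length };
}

// Parse every sample once; optionally write the extracted data to disk so the
// difference between the two passes isolates file-writing time.
function timePass(dir: string, writeOutput: boolean): number[] {
  const times: number[] = [];
  for (const file of readdirSync(dir).filter(f => f.endsWith(".xml"))) {
    const xml = readFileSync(join(dir, file), "utf8");
    const start = performance.now();
    const data = parseCcda(xml);
    if (writeOutput) writeFileSync(join(dir, file + ".json"), JSON.stringify(data));
    times.push(performance.now() - start); // per-document time in milliseconds
  }
  return times;
}

const mean = (xs: number[]) => xs.reduce((a, b) => a + b, 0) / xs.length;
console.log("pass 1 (parse + write), ms:", mean(timePass("./samples", true)));
console.log("pass 2 (parse only), ms:", mean(timePass("./samples", false)));
```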
Manual analysis of samples
While only vendor-supplied C-CDA samples were part of the formal analysis, C-CDA documents from HL7 and other non-vendor organizations were reviewed for comparison to collected samples. Two of the authors ( JDD, AS) performed the manual inspection, adapting techniques previously used for analysis of CDA documents.17 Only a single C-CDA document submission was required to participate in the SMART C-CDA Collaborative. To give equal weighting in our analysis to each vendor application, when multiple samples were submitted from a single application, we selected the one with as many domains as possible and largest in kilobytes, together a proxy for the most data.

Table 1 Data domains required by each C-CDA document type. For each of the nine document types (continuity of care document, consultation note, diagnostic imaging report, discharge summary, history and physical note, operative note, procedure note, progress note, and unstructured document), the table marks which of nine data domains are required (demographics, allergies, medications, plan of care, problems, procedures, results, social history, and vital signs) and counts the other sections required beyond these domains (0, 3, 2, 2, 8, 7, 4, 1, and 0, respectively). Only the required domains are shown for each C-CDA document type. Additional information required by MU2 (ie, care team, functional and cognitive status, plan of care goals and instructions, immunizations, and referral information) is also supported in C-CDA documents. Because C-CDA documents are open templates, vendors may add optional data domains in order to meet regulatory and business requirements. C-CDA, Consolidated Clinical Document Architecture; MU, Meaningful Use.

The manual inspection identified errors and heterogeneity in the studied samples, but was confined to seven domains from the 'Common Data Set' defined in MU2: demographics, problems, allergies, medications, results, vital signs, and smoking status.15 We defined an error in our study as any XML usage that conflicted with mandatory guidance from the HL7 C-CDA 1.1 Implementation Guide.13 Given this definition, any document with an error would not satisfy MU2 requirements for document interoperability. While many errors can be identified by automated software tools, some require human review (eg, where the dose of a structured medication entry contradicts dosing information in the narrative). Identifying heterogeneity in structured data meant finding variations in data inclusion, omission, or expression across examined documents that did not qualify as errors as defined above. Again, while some heterogeneity can be detected by automated software tools, human reviewers identified other types of heterogeneity which are currently not identifiable by software (eg, the omission of result interpretation as structured information when known from value and reference range). Our inspection recorded only the first instance of any specific error or heterogeneity found in each domain of each sample. Recording repeated instances of the same issue in an individual C-CDA would document data frequency and not prevalence of error types. We mapped observed errors and heterogeneity to one of six mutually exclusive categories: (1) incorrect data within XML elements; (2) terminology misuse or omission; (3) inappropriate or variable XML organization or identifiers; (4) inclusion versus omission of optional elements; (5) problematic reference to narrative text from structured body; and (6) inconsistent data representation.
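To make the classification concrete, the sketch below shows one way the per-category, per-domain observation counts later reported in table 3 could be accumulated. The type and field names are ours for illustration, not the study's review instrument.

```typescript
// Illustrative tally of manual-review observations by category and domain;
// the names below are not the study's actual instrument.
type Category =
  | "incorrect data" | "terminology" | "organization or identifiers"
  | "optionality" | "narrative reference" | "inconsistent representation" | "other";
type Domain =
  | "demographics" | "problems" | "allergies" | "medications"
  | "results" | "vital signs" | "smoking status";

interface Observation {
  sampleId: string;    // one of the 21 examined vendor samples
  domain: Domain;
  category: Category;
  isError: boolean;    // false = permissible heterogeneity rather than an error
  note: string;        // only the first instance per issue, domain, and sample
}

function tally(observations: Observation[]): Map<Category, Map<Domain, number>> {
  const counts = new Map<Category, Map<Domain, number>>();
  for (const o of observations) {
    const row = counts.get(o.category) ?? new Map<Domain, number>();
    row.set(o.domain, (row.get(o.domain) ?? 0) + 1);
    counts.set(o.category, row);
  }
  return counts;
}
```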
Automated analysis of samples
Automated analysis of the samples made use of the Transport Testing Tool (TTT) release v175 (http://transport-testing.nist.gov/ttt/) from the National Institute of Standards and Technology (NIST) and the SMART C-CDA Scorecard (http://ccda-scorecard.smartplatforms.org) from one of the authors ( JCM). TTT returns schema and schematron errors and warnings describing the conformance of a C-CDA document to the XML templates and conformance statements published by HL7. The SMART C-CDA Scorecard performs a set of semantic checks that official validation tools omit. These checks include the validation of RxNorm, Systematized Nomenclature of Medicine (SNOMED), Logical Observation Identifiers Names and Codes (LOINC), and Unified Code for Units of Measure (UCUM) use within a C-CDA document. The Scorecard computes a series of rubrics, each corresponding to a best practice for C-CDA implementation derived from discussion on an HL7 community mailing list. For example, two rubrics are: 'Document uses official C-CDA templateIds whenever possible' and 'Vitals are expressed with UCUM units.' The Scorecard assigns a score from zero to five for each rubric, allowing partial credit for documents with incomplete adherence to each rubric. No score is assigned for a rubric if no relevant data are available. These scores are combined into section-wide scores by dividing the number of points earned by the total points possible. A composite score reported as a percentage (0–100%) is produced by summing the number of points earned across sections and dividing by the total points possible.
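Written out as code, that scoring arithmetic reads roughly as below; this is our rendering of the description above, not the Scorecard's actual implementation.

```typescript
// Sketch of the rubric-to-composite scoring arithmetic described above;
// not the SMART C-CDA Scorecard's implementation.
interface RubricResult {
  section: string;          // eg, "medications", "vital signs"
  points: number | null;    // 0-5 with partial credit; null if no relevant data
}

const MAX_POINTS_PER_RUBRIC = 5;

// Section-wide score: points earned divided by points possible in that section.
function sectionScore(results: RubricResult[], section: string): number | null {
  const scored = results.filter(r => r.section === section && r.points !== null);
  if (scored.length === 0) return null; // no relevant data, no score
  const earned = scored.reduce((sum, r) => sum + (r.points as number), 0);
  return earned / (scored.length * MAX_POINTS_PER_RUBRIC);
}

// Composite score: points earned across all scored rubrics divided by points
// possible, reported as a percentage from 0 to 100.
function compositeScore(results: RubricResult[]): number | null {
  const scored = results.filter(r => r.points !== null);
  if (scored.length === 0) return null;
  const earned = scored.reduce((sum, r) => sum + (r.points as number), 0);
  return (100 * earned) / (scored.length * MAX_POINTS_PER_RUBRIC);
}
```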
Group web conferences
From July through December 2013, SMART and Lantana conducted ten 60-min group meetings to discuss C-CDA implementation. The protocol consisted of a short review of issues identified through analysis of the collected samples and polling each health information technology vendor to respond to each issue (eg, 'When do you include reference ranges as structured elements vs text strings?'). Written notes, compiled by one of us ( JDD) for each meeting, were published weekly on a participant message board, which allowed for feedback between meetings (https://trello.com/b/CicwWfdW/smart-c-cda-collaborative).

One-on-one vendor reviews
From September through December 2013, SMART and Lantana scheduled sessions with individual vendors to review their respective C-CDA samples. Reviews covered specific observations about errors and explored variation in C-CDA data representations. Each vendor could request a second session and submit an additional C-CDA sample. Vendor feedback from these sessions was blinded when used in subsequent group discussions.

RESULTS
Vendor outreach
Of the 107 individual organizations contacted, 44 (41%) responded to the invitation. Fourteen organizations submitted one or more samples from a single application and one organization provided multiple samples from three separate technologies. Several respondents did not submit a C-CDA sample. Supplemental samples came from four organizations who had openly published their C-CDAs. In total, 91 C-CDA documents were collected, with an average of 4.3 (range 1–20) documents per vendor application. Samples were categorized (table 2) by whether the vendor application had been certified for MU2 by study conclusion.23

Table 2 SMART C-CDA Collaborative tallies: vendors, applications, and C-CDA samples
MU2 certification status as of December 2013 | Vendors | Applications | C-CDA samples
Certified EHR | 12 | 14 | 55
Certified modular EHR for C-CDA exchange (HIE) | 3 | 3 | 13
Non-certified health IT | 4 | 4 | 23
Total | 19 | 21 | 91
Results are categorized by the certification status of a vendor's application as of December 2013, but the C-CDA samples submitted by a vendor may have been different from those submitted for EHR certification. C-CDA, Consolidated Clinical Document Architecture; EHR, electronic health record; HIE, health information exchange; IT, information technology; MU, Meaningful Use; SMART, Substitutable Medical Applications and Reusable Technology.

Automated parsing of samples
All 91 samples were parsed using BlueButton.js. Parsing results omitted smoking status because BlueButton.js does not support the C-CDA section of social history. Since not every C-CDA included every domain for possible data, 5.4 (range 2–6) sections were parsed per document. For the parsed sections, the number of non-null data elements totaled 10 220. The extracted data elements by section were: 1706 for demographics, 620 for problems, 909 for allergies, 1866 for medications, 3338 for results, and 1781 for vital signs. The average document size was 135 kB (SD 130 kB) with an average parsing time of 864 ms (SD 862 ms). Approximately 1 s was required to parse 149 kB of C-CDA data. Document size and average parsing time were highly correlated (R²=0.971) and the distribution was right-skewed (figure 1). Results for the two C-CDA parsing passes showed an average 2 ms increment for writing data versus only parsing the documents (R²=0.988); hence parsing accounts for essentially the entire computing time.

Figure 1 Parsing times for C-CDA document samples (N=91). C-CDA, Consolidated Clinical Document Architecture.

TTT/SMART C-CDA scorecard results
We used both the TTT and SMART C-CDA Scorecard to help detect and classify errors and types of heterogeneity. TTT focused on a document's adherence to a series of structural constraints described in the C-CDA 1.1 Implementation Guide, while the SMART C-CDA Scorecard assessed specific semantic issues with data content and terminology. We applied the TTT to each of the 21 samples that had been manually inspected and observed:
▸ Ten vendor applications returned no errors.
▸ The remaining 11 had an average of 71 errors (range 2–297), with the higher values observed among non-certified vendor applications.
▸ Warnings were issued for all samples, generally for omission of XML elements, with an average of 78 warnings (range 7–381) per vendor application.
We submitted the same samples for scoring by the SMART C-CDA Scorecard, obtaining an average score of 63% (range 23–100%; figure 2). As expected, no correlation (R²<0.01) was observed between TTT results and SMART C-CDA Scorecard scores because they examine wholly different aspects of C-CDA document correctness. De-identified group results were presented publicly and identified results were shared during individual vendor sessions.

Figure 2 SMART C-CDA Scorecard histogram for C-CDA samples (N=21). C-CDA, Consolidated Clinical Document Architecture; SMART, Substitutable Medical Applications and Reusable Technology.

Group and one-on-one vendor web conferences
Of the 19 organizations represented by C-CDA samples, 12 attended at least one group call. Six organizations who did not submit a sample during the outreach also joined the group calls. On average, eight organizations participated in each group call. Eleven organizations discussed their samples in one-on-one sessions with the research team, and three requested a second session. Individual sessions averaged 66 min (range 30–90 min) for a total of 930 min. Five organizations submitted revised samples to the Collaborative.

Manual analysis of samples
For the 21 vendor C-CDA samples we analyzed, we observed 615 errors and heterogeneities, assigning 607 (99%) to one of six mutually exclusive categories (table 3). Eight observations (1%) did not fit this schema.
Table 3 Categorized observations (N=615) across 21 C-CDA samples examined, by category and by examined domain from the MU common data set
Category | Demographics | Allergies | Medications | Problems | Results | Smoking status | Vital signs | Total
Incorrect data within XML elements | 10 | 12 | 27 | 24 | 5 | 14 | 5 | 97
Terminology misuse or omission | 9 | 40 | 29 | 12 | 31 | 2 | 19 | 142
Inappropriate or variable XML organization or identifiers | 7 | 20 | 13 | 17 | 23 | 10 | 20 | 110
Element optionality through inclusion or omission | 49 | 20 | 40 | 16 | 22 | 1 | 13 | 161
Problematic reference to narrative text from structured body | 0 | 6 | 10 | 11 | 3 | 6 | 9 | 45
Inconsistent data representation | 23 | 7 | 4 | 4 | 12 | 0 | 2 | 52
Not elsewhere classified | 1 | 3 | 2 | 1 | 1 | 0 | 0 | 8
Total | 99 | 108 | 125 | 85 | 97 | 33 | 68 | 615
Both errors and heterogeneity observations were recorded in each category with the exception of 'Inconsistent data representation', which only included heterogeneity. C-CDA, Consolidated Clinical Document Architecture; MU, Meaningful Use; XML, extensible markup language.

For each category, the research team selected up to two examples from examined C-CDA documents that illustrate one potential type of error or heterogeneity (table 4).

Table 4 Errors and heterogeneity examples in C-CDA samples
Category | Example | Type
Incorrect data | doseQuantity is '40 mg' but should be '1' to correspond to the RxNorm code that specifies tablet dosing | Error
Terminology misuse | RxNorm code 7982 is 'penicillin', while the display and narrative state 'codeine' | Error
Inappropriate organization | Code for a vaccine recorded in the diagnostic results section whereas it should be in immunizations | Error
Element optionality | Inclusion of an optional XML element: method code is optional and included on only one sample (eg, patient position as seated for blood pressure). Omission of an optional XML element: interpretation code is optional for results and often omitted or left blank; in this example normal can be inferred from the reference range | Heterogeneity
Reference to narrative text | Reference to allergic reaction (cough) has no reference to allergen (aspirin): narrative in the unstructured body, while the structured body only references the reaction | Heterogeneity
Inconsistent representation | Two samples showing a medication to be administered 'every day', but the timing units vary from hours to days between samples | Heterogeneity
C-CDA, Consolidated Clinical Document Architecture; XML, extensible markup language.
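As an illustration of the 'incorrect data' example in table 4, the check below flags a doseQuantity that restates strength already carried by an RxNorm semantic clinical drug code. The entry shape is a simplified stand-in for a parsed medication entry, not the original C-CDA XML, and the RxNorm code in the example is hypothetical.

```typescript
// Illustrative check for the table 4 "incorrect data" example: when the
// medication code is an RxNorm semantic clinical drug (strength and dose form
// already encoded, eg a specific tablet), doseQuantity should be a unitless
// count such as "1", not a mass such as "40 mg".
interface MedicationEntry {
  rxnormCode: string;
  isSemanticClinicalDrug: boolean;  // in practice looked up from the RxNorm term type
  doseQuantityValue: string;        // eg, "1" or "40"
  doseQuantityUnit?: string;        // eg, "mg"; absent for a plain dose-unit count
}

function checkDoseQuantity(m: MedicationEntry): string | null {
  if (m.isSemanticClinicalDrug && m.doseQuantityUnit !== undefined) {
    return `doseQuantity "${m.doseQuantityValue} ${m.doseQuantityUnit}" restates strength ` +
      `already encoded by RxNorm ${m.rxnormCode}; expected a unitless count such as "1"`;
  }
  return null;
}

// Example mirroring table 4: a tablet-level code paired with a "40 mg" dose.
console.log(checkDoseQuantity({
  rxnormCode: "123456",             // hypothetical code for a tablet product
  isSemanticClinicalDrug: true,
  doseQuantityValue: "40",
  doseQuantityUnit: "mg",
}));
```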
Summation: common trouble spots in C-CDA samples
Based upon our analysis and discussions with Collaborative participants, we identified 11 specific areas (ie, 'trouble spots') in examined C-CDA documents. Although not comprehensive, each trouble spot represents a relevant, common issue in C-CDA documents. Since not all vendors elected to publicize their participation in the Collaborative, de-identified results were presented in the last group call (figure 3). The severity and clinical relevance of these trouble spots vary according to the context of C-CDA document use. Data heterogeneity or omission may impose a minimal burden in cases where humans or computers can normalize or supplement information from other sources. In other cases, a missing or erroneous code (eg, terminology misuse; table 4) could disrupt vital care activities, such as automated surveillance for drug–allergy interactions. Because the severity of trouble spots depends upon specific clinical workflows, we confine our discussion to the knowable barriers they create to semantic interoperability.

Figure 3 Chief trouble spots in C-CDA documents (N=21). C-CDA, Consolidated Clinical Document Architecture.

DISCUSSION
We demonstrated that aggregated, structured data covering a range of clinical information from MU2 C-CDA samples can be parsed with the open-source BlueButton.js library. This allowed us to inspect manually and programmatically the structured content of vendor-supplied documents to answer this question: will the exchange of C-CDA documents generated by 2014 Certified EHR Technology be capable of achieving semantic interoperability? Analyzing these documents, we identified many barriers to such interoperability. This leads to recommendations on how to improve C-CDA document quality and, with such improvements, advance automated document exchange.

Barriers to semantic interoperability
Our observations identify several barriers that will challenge reliable import and interpretation of parsed C-CDA documents. It can be helpful to categorize these issues based on how erroneous data can be detected.

Present in automated detection
Some observed violations of the C-CDA specification are already detected by NIST's TTT validator. For example, the validator flagged inappropriate null values of 'UNC', which were likely intended to be 'UNK', meaning unknown. Such errors were unusual among EHR applications since certification requires the production of TTT-validated documents.

Potential for automated detection
Some barriers to semantic interoperability could be detected with additions to the TTT validator. One observed area was internal C-CDA consistency, which could be evaluated using logical correlations of structured entries. For example, if a C-CDA problem has an observation status asserting that the problem is biologically active, it would be incorrect for the concern status code to be 'completed' or for the patient's timing information to include a problem resolution date.
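One way such an internal-consistency rule could be expressed is sketched below, over a simplified representation of a parsed problem entry; the field names are illustrative rather than C-CDA element names.

```typescript
// Sketch of the internal-consistency rule described above, applied to a
// simplified view of a parsed problem entry (field names are illustrative,
// not C-CDA element names or XPaths).
interface ProblemEntry {
  observationStatus: "active" | "inactive" | "resolved";   // biological status of the problem
  concernStatusCode: "active" | "suspended" | "completed"; // status of the enclosing concern
  resolutionDate?: string; // ISO date; should only be present for resolved problems
}

function checkProblemConsistency(p: ProblemEntry): string[] {
  const issues: string[] = [];
  if (p.observationStatus === "active") {
    if (p.concernStatusCode === "completed") {
      issues.push("active problem recorded inside a completed concern");
    }
    if (p.resolutionDate !== undefined) {
      issues.push("active problem carries a resolution date");
    }
  }
  return issues;
}

// Example: both checks fire for this internally inconsistent entry.
console.log(checkProblemConsistency({
  observationStatus: "active",
  concernStatusCode: "completed",
  resolutionDate: "2013-11-02",
}));
```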
Terminology issues were prevalent and also amenable to automated detection. In several samples we observed the use of non-existent, deprecated, or misleading codes, and non-adherence to required value sets. For example, one sample used the deprecated LOINC code '41909-3', which has been superseded by LOINC code '39156-5' to represent body mass index. There were also more complex concerns. For example, medication allergies should be encoded at the ingredient level (eg, 'aspirin') or drug class level (eg, 'sulfonamides'), but some samples reported allergens at the semantic clinical drug level (eg, 'aspirin 81 mg oral tablet'). While the latter representation is syntactically correct, it is clinically questionable to say that someone is allergic to a specific form and dose of aspirin. To reconcile such terminology issues, receivers of C-CDA documents would need to perform substantial manual reconciliation or apply intricate normalizing logic to the hierarchy of potential RxNorm codes.

Issues difficult to detect automatically
Heterogeneity in data representation imposes interoperability barriers that are difficult to detect automatically without clear guidance. We frequently observed variations where the C-CDA specification does not provide uniform guidance. Telephone numbers illustrate this, where examples in the C-CDA show multiple ways to encode and no testable conformance is provided.13 In collected samples, we found 12 distinct patterns for recording a telephone number through combinations of dashes, parentheses, periods, and leading characters. These representations are straightforward for humans to interpret, but automated tools require specificity on permissible representations. Many variations in data representation may be addressed through lightweight data consumption normalization algorithms (eg, regular expressions for a telephone number).
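A lightweight normalization of the kind suggested here can be as small as the sketch below; the target 'tel:' form and the accepted patterns are illustrative choices on our part, not a C-CDA conformance rule.

```typescript
// Sketch of lightweight telephone-number normalization of the kind described
// above (illustrative; the "tel:" output form is our choice, not a C-CDA rule).
function normalizePhone(raw: string): string | null {
  // Strip a leading "tel:" prefix and any dashes, dots, parentheses, or spaces.
  const digits = raw.replace(/^tel:/i, "").replace(/[^\d+]/g, "");
  // Accept 10-digit US numbers with or without a leading +1 or 1.
  const match = digits.match(/^(?:\+?1)?(\d{10})$/);
  if (!match) return null; // unrecognized pattern: leave for human review
  const n = match[1];
  return `tel:+1-${n.slice(0, 3)}-${n.slice(3, 6)}-${n.slice(6)}`;
}

// Example: these variants all normalize to "tel:+1-617-555-0123".
["(617) 555-0123", "617.555.0123", "tel:1-617-555-0123"].forEach(v =>
  console.log(normalizePhone(v)));
```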
Data optionality introduces two large challenges for semantic interoperability. First, the data are not present for certain downstream clinical workflows and applications. For example, the absence of medication administration timing (eg, 'take every 8 hours') prevents generation of automated reminders to promote medication adherence. Second, the absence of data may only reflect that the certified technology never populates the element, and does not convey whether the data were known, unknown, or not structured in a vendor's application. Such heterogeneity creates instances where the receiver cannot disambiguate data context.

Many of these observations may have a straightforward explanation. Several vendors explained that they focused development efforts on C-CDA generation to pass TTT validation and less on provider demands for semantic interoperability. Almost all vendors commented that they had too few implementation examples to guide them in expressing common clinical data, and that guidance from regulatory and standards development organizations was ambiguous.

Improving C-CDA document quality
We identify four areas, spanning standards development, implementation, and policy, that can lead to improved C-CDA document quality. Each of the recommendations we make in these areas can be weighed for its potential benefit against its burden for implementation.

Provide richer samples in publicly accessible format
Vendors commented in the Collaborative that they did not always know how to represent data within the C-CDA. While the ONC created a website to assist in C-CDA implementation and testing (http://www.sitenv.org/) and HL7 increased its help desk content, vendors suggested these were inadequate and sometimes unclear. There is need for a site where public samples and common clinical scenarios of C-CDA documents, sections, and entries can be queried. We posted samples to the Boston Children's Hospital's public C-CDA repository, when permitted by vendors. HL7 also supports this goal through the commission of a CDA Example Task Force (http://wiki.hl7.org/index.php?title=CDA_Example_Task_Force). A simple and powerful solution would be to require every technology to publish C-CDA documents with standardized fictional data used in EHR certification. While vendors may take different implementation approaches, publication would foster transparent discussion between vendors, standards bodies, and providers.

Validate codes
Many errors cataloged by this research would not exist if certification tools used by testing bodies included terminology vetting to validate codes and value set membership. Because the C-CDA includes dozens of reference vocabularies in its implementation, testing for appropriate conformance to common vocabularies, such as SNOMED, LOINC, RxNorm, and UCUM, should be part of certification. Although many of the large value sets referenced by C-CDA are dynamic and subject to change, this is reasonably addressable if reference terminology systems were maintained and hosted by an authority, such as the National Library of Medicine. Because there is no such authority today, the SMART C-CDA Scorecard hosted unofficial vocabulary sources for its C-CDA scoring.
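The kind of terminology vetting proposed here could look like the sketch below, where the lookup tables stand in for hosted vocabulary releases; the code sets shown are placeholders seeded only with the two LOINC codes mentioned earlier.

```typescript
// Sketch of terminology vetting: check that a code exists in a hosted
// vocabulary release and is not deprecated, and that it falls within the
// value set bound to the template. The tables below are placeholders for
// real terminology releases, not actual vocabulary content.
interface CodedEntry { codeSystem: "LOINC" | "SNOMED" | "RXNORM" | "UCUM"; code: string; }

const activeCodes: Record<string, Set<string>> = {
  LOINC: new Set(["39156-5" /* body mass index */]),
  SNOMED: new Set(), RXNORM: new Set(), UCUM: new Set(),
};
const deprecatedCodes: Record<string, Set<string>> = {
  LOINC: new Set(["41909-3" /* superseded BMI code */]),
  SNOMED: new Set(), RXNORM: new Set(), UCUM: new Set(),
};

function vetCode(entry: CodedEntry, valueSet?: Set<string>): string[] {
  const problems: string[] = [];
  if (deprecatedCodes[entry.codeSystem].has(entry.code)) {
    problems.push(`deprecated ${entry.codeSystem} code ${entry.code}`);
  } else if (!activeCodes[entry.codeSystem].has(entry.code)) {
    problems.push(`unknown ${entry.codeSystem} code ${entry.code}`);
  }
  if (valueSet && !valueSet.has(entry.code)) {
    problems.push(`${entry.code} is outside the bound value set`);
  }
  return problems;
}

// Example: the deprecated BMI code observed in the samples would be flagged.
console.log(vetCode({ codeSystem: "LOINC", code: "41909-3" }));
```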
Reduce data optionality
MU2 regulations have taken steps to reduce optionality by requiring a Common Data Set. The Common Data Set, however, does not constrain C-CDA optionality at a granular level. In effect, this permits vendors to omit data. For example, vendors must include an appropriate RxNorm code for each medication in C-CDA documents but otherwise may populate its dose, timing, and route of administration with null values. Moreover, MU2 data requirements imperfectly correspond to HL7 C-CDA specifications. For example, no single document type in the C-CDA library requires all components of the Common Data Set, so optional sections must be used to meet MU2 requirements. We therefore recommend that regulations require EHR vendors to populate known data in C-CDA documents.

Monitor and track real-world document quality
In real-world clinical environments, a multitude of C-CDA documents will be generated to satisfy clinical workflows. To quantify and improve document quality, metrics could be calculated using a C-CDA quality surveillance service running within the firewall or at a trusted cloud provider. Such a service could use existing tools, such as the TTT validator and SMART C-CDA Scorecard. These services could also be offered through health information exchanges that transmit C-CDA documents between organizations.

Advancing C-CDA document exchange
MU2 requires providers to exchange C-CDA documents for 10% of care transitions and for certified EHR technology to be capable of ingesting select data upon receipt. This is a significant advance from MU1, where only data display and testing of exchange were required.6 25 According to MU2 regulations, however, the intake of clinical data for certified systems need not be fully automated.7 This is entirely appropriate given the issues identified in this research. To advance automated document exchange, we suggest vendors and policy makers consider Postel's Law. This law, also known in computing as the robustness principle, states 'be conservative in what you send, be liberal in what you accept from others.'26 Our recommendations to provide more robust examples, validate codes, and reduce data optionality will reduce variability in the export of C-CDA documents, addressing the first half of this principle. To improve the liberal consumption of C-CDA documents, intelligent parsing could normalize some aspects of heterogeneity. Such software could detect common variations in units, terminology, and data expression to return a more consistent data set from C-CDA documents. Our recommendation to monitor actual C-CDA document exchange would serve vendors well as they write normalizing algorithms. However, even the best engineering cannot reliably populate missing data or resolve conflicting statements, so there will be an upper bound on what can be normalized. While Postel's Law cannot be directly enforced through regulation, a combination of policy changes with community support of our recommendations would move real-world C-CDA exchange closer to realizing this principle.
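On the consumption side, part of that normalization can be as simple as mapping common unit spellings onto UCUM before downstream use; the mapping below is illustrative and far from exhaustive.

```typescript
// Sketch of "liberal acceptance" unit normalization: map common non-UCUM unit
// spellings seen in exports onto UCUM units. The table is illustrative only.
const toUcum: Record<string, string> = {
  "mmhg": "mm[Hg]",   // blood pressure
  "bpm": "/min",      // heart rate
  "lbs": "[lb_av]",   // body weight in pounds
  "kg": "kg",         // already a valid UCUM unit
  "cel": "Cel",       // temperature in degrees Celsius
};

function normalizeUnit(raw: string): { unit: string; normalized: boolean } {
  const key = raw.trim().toLowerCase();
  return key in toUcum
    ? { unit: toUcum[key], normalized: true }
    : { unit: raw, normalized: false }; // pass unknown units through unchanged
}

console.log(normalizeUnit("mmHg")); // { unit: "mm[Hg]", normalized: true }
```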
A further challenge for real-world document exchange also emerged in this study. Latency of C-CDA document production and consumption, while not a barrier to semantic interoperability, may limit application responsiveness. Automated parsing using BlueButton.js, which provides minimal C-CDA normalizing logic, requires up to several seconds for larger documents. Intelligent normalization, network latency, and C-CDA generation time, none of which we measured in our research, would add to such computational time. Together these considerations suggest a limited role for C-CDA usage in low-latency services, such as concurrent clinical decision support or other third-party applications.

Caveats
When providers adopt certified software for MU2, we expect C-CDA exports will exhibit many of the challenges we observed in our samples. Nonetheless, our findings about the readiness of C-CDA documents for interoperability have several limitations. First, our samples represented technologies that voluntarily submitted or publicized their C-CDA documents. While participating vendors represent a majority of the certified EHR market, we examined only fictional patient records from vendor test or development environments. These fictional records represented varying clinical scenarios individually created by vendors. Using a standard data set based on consistent clinical conditions from all technologies would have yielded greater comparative analysis. Second, our findings do not capture real-world implementation by hospitals and physicians, since MU2 had not yet been implemented during the research. We anticipate additional issues will surface in the thousands of upcoming C-CDA deployments. Third, we focused exclusively on seven clinical domains from the Common Data Set. Had we scrutinized other C-CDA domains, we would likely have recorded additional errors and heterogeneity. Finally, while we examined C-CDA documents and discussed their production rationale with vendors as external observers, we were unable to examine any vendor's C-CDA consumption and reconciliation algorithms. This would have provided further insight into the challenges of semantic interoperability but was beyond the scope of our research. These limitations in aggregate have likely caused us to understate the frequency and types of errors and heterogeneity that will be observed in real C-CDA exchange. Our findings, however, materially capture the problems facing C-CDA document exchange for MU2.

CONCLUSION
Although progress has been made since Stage 1 of MU, any expectation that C-CDA documents could provide complete and consistently structured patient data is premature. Based on the scope of errors and heterogeneity observed, C-CDA documents produced from technologies in Stage 2 of MU will omit key clinical information and often require manual data reconciliation during exchange. In an industry often faulted for locking down data and stifling interoperability, we were heartened by Collaborative participants who helped identify specific problems and equally specific ways to improve document exchange. This research demonstrated the power of group collaboration and the utility of open-source tools to parse documents and identify latent challenges to interoperability. Future policy, market adoption, and availability of widespread terminology validation will determine if C-CDA documents can mature into efficient workhorses of interoperability. Our findings suggest that knowledge and example repositories, increasing rigor in document production and validation, and data quality services that work for certification and post-certification validation will all be needed to advance C-CDA based technologies. However, without timely policy to move these elements forward, semantically robust document exchange will not happen anytime soon.

Author affiliations
1 Lantana Consulting Group, LLC, East Thetford, Vermont, USA
2 Diameter Health, Inc., Newton, Massachusetts, USA
3 Children's Hospital Informatics Program at Harvard-MIT Health Sciences and Technology, Boston, Massachusetts, USA
4 Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA
5 SMART Platforms Project, Center for Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
6 Center for Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
7 Department of Oral Health Policy and Epidemiology, Harvard School of Dental Medicine, Boston, Massachusetts, USA

Acknowledgements We would like to thank several organizations for their assistance in the planning and execution of the SMART C-CDA Collaborative. The Office of the National Coordinator for Health Information Technology assisted in the planning of the SMART C-CDA Collaborative. HL7 provided time on its Structured Document Working Group meetings to invite vendor participation and solicit feedback on proposed revisions to the C-CDA Implementation Guide. In addition, many individuals from Lantana Consulting Group beyond the listed authors provided assistance in outreach and execution of the Collaborative. Organizations who participated in the SMART C-CDA Collaborative and permitted public recognition of their support are Allscripts, Athenahealth, Cerner, the Electronic Medical Record Generation Project, Greenway, Infraware, InterSystems, Kinsights, Mirth, NextGen, Partners HealthCare, and Vitera. We would like to thank the many individuals at these organizations who submitted samples and participated in both group and individual review sessions. The SMART C-CDA Collaborative would not have been possible without their support and the support of other health information technology vendors who chose to remain anonymous. Many of these vendors also elected to publish C-CDA samples in a public repository to help advance collective knowledge of C-CDA implementation. We would like to specifically acknowledge Holly Miller of MedAllies, Brett Marquard of River Rock Associates, Peter O'Toole of mTuitive, and Lisa Nelson of Life Over Time Solutions for their extensive support in C-CDA discussions.

Contributors JDD, JCM, and DAK were the principal authors of the manuscript and led individual and group vendor sessions. JDD, JCM, SS, AS, and GAK contributed to the collection and review of C-CDA documents and to presentations to group vendor sessions. JDD and JCM led programming work for automated evaluation and C-CDA parsing. LA, RHD, KDM, ISK, and RBR provided organizational leadership including project design, assistance in vendor recruitment, background literature review, and editing of the article. SS, DAK, and RBR led daily project management and performed extensive editing of the manuscript.

Funding This work was funded by the Strategic Health IT Advanced Research Projects Award 90TR000101 from the Office of the National Coordinator of Health Information Technology.

Competing interests One of the authors ( JCM) provided design advice to the BlueButton.js initiative. BlueButton.js is a non-commercial, open-source project and no financial remuneration or explicit business relationship exists between BlueButton.js and any of the authors of this work.

Provenance and peer review Not commissioned; externally peer reviewed.

Data sharing statement Full detail on all publicly submitted C-CDA documents is available at https://github.com/chb/sample_ccdas and noted in the article.

Open Access This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
REFERENCES
1 Health Level 7 Wiki. SAIF Information Framework. http://wiki.hl7.org/index.php?title=SAIF_Information_Framework (accessed 7 Mar 2014).
2 Dolin RH, Alschuler L. Approaching semantic interoperability in Health Level Seven. J Am Med Inform Assoc 2011;18:99–103.
3 United States Department of Health and Human Services. ONC Data Brief No. 9. Adoption of electronic health record systems among U.S. non-federal acute care hospitals: 2008–2012. Office of the National Coordinator for Health Information Technology. http://www.healthit.gov/sites/default/files/oncdatabrief9final.pdf (accessed 7 Mar 2014).
4 DeSalvo K. Survey says: EHR incentive program is on track. Office of the National Coordinator for Health IT. http://www.healthit.gov/buzz-blog/from-the-onc-desk/survey-ehr-incentive-program/ (accessed 7 Mar 2014).
5 Conn J. EHR incentive payments could exceed $22.5B estimate. http://www.modernhealthcare.com/article/20140303/NEWS/303039952 (accessed 11 Apr 2014).
6 Blumenthal D, Tavenner M. The "meaningful use" regulation for electronic health records. N Engl J Med 2010;363:501–4.
7 United States Department of Health and Human Services. Health information technology: standards, implementation specifications, and certification criteria for electronic health record technology, 2014 edition; revisions to the permanent certification program for health information technology. Washington, DC: Office of the National Coordinator for Health Information Technology (US), 2012. Regulation Identification Number 0991-AB82.
8 Grossman C, Powers B, McGinnis JM; Institute of Medicine. Digital infrastructure for the learning health system: the foundation for continuous improvement in health and health care. Washington, DC: National Academies Press, 2011.
9 President's Council of Advisors on Science and Technology. Report to the President. Realizing the full potential of health information technology to improve healthcare for Americans. Washington, DC: Executive Office of the President. http://www.whitehouse.gov/administration/eop/ostp/pcast/docsreports (accessed 7 Mar 2014).
10 Ferranti JM, Musser RC, Kawamoto K, et al. The clinical document architecture and the continuity of care record: a critical analysis. J Am Med Inform Assoc 2006;13:245–52.
11 D'Amore JD, Sittig DF, Ness RB. How the continuity of care document can advance medical research and public health. Am J Public Health 2012;102:e1–4.
12 McNickle M, Brull R. 6 things to know about Consolidated CDA. Healthcare IT News. http://www.healthcareitnews.com/news/6-things-know-about-consolidated-cda (accessed 28 Mar 2014).
13 HL7 implementation guide for CDA release 2: IHE health story consolidation, release 1.1 - US Realm. https://www.hl7.org/implement/standards/product_brief.cfm?product_id=258 (accessed 7 Mar 2014).
14 Dolin RH, Alschuler L, Boyer S, et al. HL7 Clinical Document Architecture, Release 2. J Am Med Inform Assoc 2006;13:30–9.
15 United States Department of Health and Human Services. Medicare and Medicaid programs; electronic health record incentive program—stage 2. USA: Centers for Medicare and Medicaid Services, 2012. Regulation Identification Number 0938-AQ84.
16 Landgrebe J, Smith B. The HL7 approach to semantic interoperability. International Conference on Biomedical Ontology; 2011:139–46.
17 D'Amore JD, Sittig DF, Wright A, et al. The promise of the CCD: challenges and opportunity for quality improvement and population health. AMIA Annu Symp Proc 2011;2011:285–94.
18 Fickenscher KM. President's column: interoperability—the 30% solution: from dialog and rhetoric to reality. J Am Med Inform Assoc 2013;20:593–4.
19 Legg M. Standardisation of test requesting and reporting for the electronic health record. Clin Chim Acta 2014;432:148–56.
20 Mandl KD, Mandel JC, Murphy SN, et al. The SMART Platform: early experience enabling substitutable applications for electronic health records. J Am Med Inform Assoc 2012;19:597–603.
21 Mandl KD, Kohane IS. No small change for the health information economy. N Engl J Med 2009;360:1278–81.
22 United States. CMS Medicare and Medicaid EHR Incentive Program, electronic health record products used for attestation. http://catalog.data.gov/dataset/cms-medicare-and-medicaid-ehr-incentive-program-electronic-health-record-products-used-for (accessed 7 Mar 2014).
23 United States Department of Health and Human Services. Certified health IT product list. USA: Office of the National Coordinator for Health Information Technology. http://oncchpl.force.com/ehrcert?q=chpl (accessed 31 Dec 2013).
24 Blue Button Development Community. BlueButton.js. http://www.bluebuttonjs.com/ (accessed 17 Jan 2014).
25 United States Department of Health and Human Services. Health information technology: initial set of standards, implementation specifications, and certification criteria for electronic health record technology. Washington, DC: Office of the National Coordinator for Health Information Technology (US), 2010. Regulation Identification Number 0991-AB58.
26 Boone K. Postel's Principle, CDA and meaningful use. Healthcare IT News. http://www.healthcareitnews.com/blog/postels-principle-cda-meaningful-use (accessed 7 Mar 2014).