Transparent Reporting of Data Quality in Distributed Data Networks



Title: Transparent Reporting of Data Quality in Distributed Data Networks
Author: Kahn, Michael G.; Brown, Jeffrey S.; Chun, Alein T.; Davidson, Bruce N.; Meeker, Daniella; Ryan, Patrick B.; Schilling, Lisa M.; Weiskopf, Nicole G.; Williams, Andrew E.; Zozus, Meredith Nahm

Note: Order does not necessarily reflect citation order of authors.

Citation: Kahn, Michael G., Jeffrey S. Brown, Alein T. Chun, Bruce N. Davidson, Daniella Meeker, Patrick B. Ryan, Lisa M. Schilling, Nicole G. Weiskopf, Andrew E. Williams, and Meredith Nahm Zozus. 2015. “Transparent Reporting of Data Quality in Distributed Data Networks.” eGEMs 3 (1): 1052. doi:10.13063/2327-9214.1052.
Abstract: Introduction: Poor data quality can be a serious threat to the validity and generalizability of clinical research findings. The growing availability of electronic administrative and clinical data is accompanied by a growing concern about the quality of these data for observational research and other analytic purposes. Currently, there are no widely accepted guidelines for reporting data quality results that would enable investigators and consumers to independently determine if a data source is fit for use to support analytic inferences and reliable evidence generation. Model and Methods: We developed a conceptual model that captures the flow of data from the data originator across successive data stewards and finally to the data consumer. This “data lifecycle” model illustrates how data quality issues can result in data being returned to previous data custodians. We highlight the potential risks of poor data quality on clinical practice and research results. Because of the need to ensure transparent reporting of data quality issues, we created a unifying data-quality reporting framework and a complementary set of 20 data-quality reporting recommendations for studies that use observational clinical and administrative data for secondary data analysis. We obtained stakeholder input on the perceived value of each recommendation by soliciting public comments via two face-to-face meetings of informatics and comparative-effectiveness investigators, through multiple public webinars targeted to the health services research community, and through an open-access online wiki. Recommendations: Our recommendations propose reporting on both general and analysis-specific data quality features. The goals of these recommendations are to improve the reporting of data quality measures for studies that use observational clinical and administrative data, to ensure transparency and consistency in computing data quality measures, and to facilitate best practices and trust in the new clinical discoveries based on secondary use of observational data.
Published Version: doi:10.13063/2327-9214.1052
Terms of Use: This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at
