Publication:
If these data could talk

Thumbnail Image

Date

2017

Published Version

Journal Title

Journal ISSN

Volume Title

Publisher

Springer Nature
The Harvard community has made this article openly available. Please share how this access benefits you.

Research Projects

Organizational Units

Journal Issue

Citation

Pasquier, Thomas, Matthew K. Lau, Ana Trisovic, Emery R. Boose, Ben Couturier, Mercè Crosas, Aaron M. Ellison, Valerie Gibson, Chris R. Jones, and Margo Seltzer. 2017. “If These Data Could Talk.” Scientific Data 4 (September 5): 170114. doi:10.1038/sdata.2017.114.

Research Data

Abstract

In the last few decades, data-driven methods have come to dominate many fields of scientific inquiry. Open data and open-source software have enabled the rapid implementation of novel methods to manage and analyze the growing flood of data. However, it has become apparent that many scientfic fields exhibit distressingly low rates of repeatability and reproducibility. Although there are many dimensions to this issue, we believe that there is a lack of formalism used when describing end-to-end published results, from the data source to the analysis to the final published results. Even when authors do their best to make their research and data accessible, this lack of formalism reduces the clarity and effciency of reporting, which contributes to issues of reproducibility. Data provenance aids both repeatability and reproducibility through systematic and formal records of the relationships among data sources, processes, datasets, publications and researchers.

Description

Other Available Sources

Keywords

Terms of Use

This article is made available under the terms and conditions applicable to Open Access Policy Articles (OAP), as set forth at Terms of Service

Endorsement

Review

Supplemented By

Referenced By

Related Stories