Automating Open Science for Big Data

Title: Automating Open Science for Big Data
Author: Crosas, Merce; King, Gary (ORCID 0000-0002-5327-7631); Honaker, James Allen; Sweeney, Latanya

Note: Order does not necessarily reflect citation order of authors.

Citation: Crosas, M., G. King, J. Honaker, and L. Sweeney. 2015. “Automating Open Science for Big Data.” The ANNALS of the American Academy of Political and Social Science 659 (1) (April 9): 260–273. doi:10.1177/0002716215570847.
Abstract: The vast majority of social science research presently uses small (MB or GB scale) data sets. These fixed scale sets are commonly downloaded to the researcher's computer where the analysis is performed locally, and are often shared and cited with well-established technologies, such as the Dataverse Project, to support the published results. The trend towards Big Data - including large scale streaming data - is starting to transform research and has the potential to impact policy-making and our understanding of the social, economic, and political problems that affect human societies. However, this research poses new challenges in execution, accountability, preservation, reuse, and reproducibility. Downloading these data sets to a researcher's computer is infeasible or not practical; hence, analyses take place in the cloud, require unusual expertise, and benefit from collaborative teamwork and novel tool development. The advantage of these data sets in how informative they are also means that they are much more likely to contain highly sensitive personally identifiable information. In this paper, we discuss solutions to these new challenges so that the social sciences can realize the potential of Big Data.
Published Version: doi:10.1177/0002716215570847
Terms of Use: This article is made available under the terms and conditions applicable to Open Access Policy Articles.