Development of a New Uniform File Format for Neuroscience Data Across the Globe
Citation
Friedsam, Claudia. 2016. Development of a New Uniform File Format for Neuroscience Data Across the Globe. Master's thesis, Harvard Extension School.
Abstract
With about 10¹¹ neurons and about 10¹⁵ synaptic connections, the human brain is among the most complex biological systems ever observed (Rautenberg, Sobolev et al. 2011). Technical progress is driving the volume of collected data to unprecedented heights, and neuroscience is rapidly adding to its already extreme data volumes (Schwarz, Lebedev et al. 2014). At the same time, the discipline lags behind in the tools and concepts needed to fully exploit the information contained in this overwhelming amount of data. Scientific progress depends more and more on collaboration, which in turn requires data sharing (Teeters, Godfrey et al. 2015). Neurophysiological experiments are complex and diverse, and their data are recorded in a multitude of different formats. This lack of compatibility poses a major obstacle to efficient data sharing and collaboration (Breaking Down the Data Barriers in Neuroscience 2014).
Neurodata Without Borders (NWB) is a new initiative that aims to overcome these limitations by creating a common file format, the NWB format, to facilitate data sharing across individual labs. The objective of this thesis is to investigate how well this newly created file format is suited to become the common standard within the research community of cell-based neurophysiology. To address this question, we studied how well the NWB file format encodes the diversity of existing data in this field. Four carefully selected datasets served as model cases for the development and testing of the new file format. We investigated how diverse these datasets are and how well they represent the data categories that the workgroup defined as essential for the field of cell-based neurophysiology. Furthermore, we determined the degree of standardization that the new file format achieves by studying the number of empty fields and special custom fields needed to map the four datasets onto the NWB format.
To further benchmark the NWB format, we compared it with existing file formats and studied whether and how NWB overcomes their limitations. Finally, we developed a metric to evaluate how well the NWB format fulfills the requirements for a new common file format, as defined by the NWB workgroup. We hypothesize that the NWB format is suited to become the common file standard within the community of cell-based neurophysiology.
The results of this study demonstrate how much the NWB file format has already achieved. However, due to the small number of test datasets and the resulting lack of diversity, we reject our hypothesis at this point in time. More results are required to clarify the true potential of the NWB format, and time will show whether it will rise to become the universal data standard in the field of cell-based electrophysiology.
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page
http://nrs.harvard.edu/urn-3:HUL.InstRepos:33797333
Collections
- DCE Theses and Dissertations [1189]