Publication:
Sparse Patterns in High Dimensional Data: Discovering Order Parameters for Developmental Biology

Date

2020-09-28

Citation

Melton, Samuel. 2020. Sparse Patterns in High Dimensional Data: Discovering Order Parameters for Developmental Biology. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Recent technological advances allow increasingly detailed measurements of biological systems. These measurements, which range from sequencing every gene transcript in all the cells of a developing embryo to recording from tens of thousands of neurons in an animal's brain, provide comprehensive and quantitative descriptions of the system in the form of high dimensional data. Extracting meaningful insight and mechanistic understanding from these data remains difficult and poses an ongoing challenge in the biological sciences. Here, we formalize the challenges associated with the analysis of high dimensional data sets. Focusing on single cell gene expression measurements, we describe the barriers to bridging the gap between these data and experimental follow-ups in developmental biology. We then describe a class of statistical techniques, based on searching for sparse (or low dimensional) codes, that provide a framework for constructing meaningful quantitative representations of the underlying system, and we show that these codes allow the inference of cell states along developmental trajectories. Next, we show that the same sparse codes that define the state of a cell can also be used to infer the complex dynamical trajectories of cells as they undergo a series of lineage decisions and transitions. We demonstrate how these low dimensional codes, each consisting of just a small set of key genes, act as order parameters for developmental transitions. Finally, we use these statistical techniques to infer a map of the dynamics of human neocortical development and to predict order parameters, in the form of key molecular markers, for lineage decisions in this progression. These techniques demonstrate the efficacy of leveraging sparse patterns in high dimensional distributions to gain statistical insights and to connect quantitative measurements to the molecular drivers of the underlying dynamics.

Keywords

Biology

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
