Publication:

Sort-seq under the hood: implications of design choices on large-scale characterization of sequence-function relations

Date

2016

Publisher

BioMed Central
The Harvard community has made this article openly available.

Citation

Peterman, Neil, and Erel Levine. 2016. “Sort-seq under the hood: implications of design choices on large-scale characterization of sequence-function relations.” BMC Genomics 17 (1): 206. doi:10.1186/s12864-016-2533-5. http://dx.doi.org/10.1186/s12864-016-2533-5.

Abstract

Background: Sort-seq is an effective approach for simultaneous activity measurements in a large-scale library, combining flow cytometry, deep sequencing, and statistical inference. Such assays enable the characterization of functional landscapes at unprecedented scale for a wide-reaching array of biological molecules and functionalities in vivo. Applications of sort-seq range from footprinting to establishing quantitative models of biological systems and rational design of synthetic genetic elements. Nearly as diverse are implementations of this technique, reflecting key design choices with extensive impact on the scope and accuracy of the results. Yet how to make these choices remains unclear. Here we investigate the effects of alternative sort-seq designs and inference methods on the information output using mathematical formulation and simulations.

Results: We identify key intrinsic properties of any system of interest with practical implications for sort-seq assays, depending on the experimental goals. The fluorescence range and cell-to-cell variability specify the number of sorted populations needed for quantitative measurements that are precise and unbiased. These factors also indicate cases where an enrichment-based approach that uses a single sorted population can offer satisfactory results. These predictions of our model are corroborated by re-analysis of published data. We explore implications of these results for quantitative modeling and library design.

Conclusions: Sort-seq assays can be streamlined by reducing the number of sorted populations, saving considerable resources. Simple preliminary experiments can guide optimal experiment design, minimizing cost while maintaining the maximal information output and avoiding latent biases. These insights can facilitate future applications of this highly adaptable technique.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-2533-5) contains supplementary material, which is available to authorized users.

Keywords

Sequence-function relations, Systems biology, High-throughput sequencing, Fluorescence-activated cell sorting

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
