Publication:
Bayesian Nonparametric Inference of Population Size Changes from Sequential Genealogies

Date

2015

Publisher

Genetics Society of America

Citation

Palacios, Julia A., John Wakeley, and Sohini Ramachandran. 2015. “Bayesian Nonparametric Inference of Population Size Changes from Sequential Genealogies.” Genetics 201 (1): 281-304. doi:10.1534/genetics.115.177980. http://dx.doi.org/10.1534/genetics.115.177980.

Abstract

Sophisticated inferential tools coupled with the coalescent model have recently emerged for estimating past population sizes from genomic data. Recent methods that model recombination require small sample sizes, make constraining assumptions about population size changes, and do not report measures of uncertainty for estimates. Here, we develop a Gaussian process-based Bayesian nonparametric method coupled with a sequentially Markov coalescent model that allows accurate inference of population sizes over time from a set of genealogies. In contrast to current methods, our approach considers a broad class of recombination events, including those that do not change local genealogies. We show that our method outperforms recent likelihood-based methods that rely on discretization of the parameter space. We illustrate the application of our method to multiple demographic histories, including population bottlenecks and exponential growth. In simulation, our Bayesian approach produces point estimates four times more accurate than maximum-likelihood estimation (based on the sum of absolute differences between the truth and the estimated values). Further, our method’s credible intervals for population size as a function of time cover 90% of true values across multiple demographic scenarios, enabling formal hypothesis testing about population size differences over time. Using genealogies estimated with ARGweaver, we apply our method to European and Yoruban samples from the 1000 Genomes Project and confirm key known aspects of population size history over the past 150,000 years.
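The abstract evaluates estimates in two ways: the sum of absolute differences between true and estimated population sizes, and the fraction of true values covered by the credible intervals. The following is a minimal sketch of those two summaries, not the authors' code. The function names, the shared time grid, and the toy numbers are all assumptions made for illustration.

```python
def sum_abs_error(truth, estimate):
    """Sum of absolute differences between true and estimated sizes.

    Both sequences are assumed to be evaluated on the same time grid.
    """
    if len(truth) != len(estimate):
        raise ValueError("truth and estimate must have the same length")
    return sum(abs(t - e) for t, e in zip(truth, estimate))


def interval_coverage(truth, lower, upper):
    """Fraction of true values that fall inside [lower, upper] pointwise."""
    if not (len(truth) == len(lower) == len(upper)):
        raise ValueError("truth, lower and upper must have the same length")
    inside = sum(1 for t, lo, hi in zip(truth, lower, upper) if lo <= t <= hi)
    return inside / len(truth)


# Hypothetical toy values on a five-point time grid (not data from the paper).
truth = [1.0, 1.0, 0.5, 0.5, 1.0]      # true relative population sizes
estimate = [1.1, 0.9, 0.6, 0.5, 1.2]   # point estimates
lower = [0.8, 0.7, 0.55, 0.3, 0.9]     # lower credible bounds
upper = [1.4, 1.2, 0.9, 0.8, 1.5]      # upper credible bounds

print("sum of absolute differences:", sum_abs_error(truth, estimate))
print("interval coverage:", interval_coverage(truth, lower, upper))
```

Under this reading, "four times more accurate" corresponds to a sum of absolute differences one quarter as large, and 90% coverage corresponds to `interval_coverage` returning 0.9.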

Keywords

Population and Evolutionary Genetics, Markov process, genomics, sequentially Markov coalescent, point process, Gaussian process

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.
