Publication:

Quantifying and Mitigating the Effect of Preferential Sampling on Phylodynamic Inference

Date

2016

Publisher

Public Library of Science

Citation

Karcher, Michael D., Julia A. Palacios, Trevor Bedford, Marc A. Suchard, and Vladimir N. Minin. 2016. “Quantifying and Mitigating the Effect of Preferential Sampling on Phylodynamic Inference.” PLoS Computational Biology 12 (3): e1004789. doi:10.1371/journal.pcbi.1004789. http://dx.doi.org/10.1371/journal.pcbi.1004789.

Abstract

Phylodynamics seeks to estimate effective population size fluctuations from molecular sequences of individuals sampled from a population of interest. One way to accomplish this task is to formulate an observed sequence data likelihood that exploits a coalescent model for the sampled individuals’ genealogy, and then either to integrate over all possible genealogies via Monte Carlo or, less efficiently, to condition on one genealogy estimated from the sequence data. However, when analyzing sequences sampled serially through time, current methods implicitly assume either that sampling times are fixed deterministically by the data collection protocol or that their distribution does not depend on the size of the population. Through simulation, we first show that, when sampling times do probabilistically depend on effective population size, estimation methods may be systematically biased. To correct for this deficiency, we propose a new model that explicitly accounts for preferential sampling by modeling the sampling times as an inhomogeneous Poisson process dependent on effective population size. We demonstrate that in the presence of preferential sampling our new model not only reduces bias, but also improves estimation precision. Finally, we compare the performance of the currently used phylodynamic methods with our proposed model through clinically relevant, seasonal human influenza examples.
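
As an illustration of what preferential sampling means in practice, the following minimal Python sketch draws sampling times from an inhomogeneous Poisson process whose intensity depends on effective population size. Several pieces are assumptions made for this example and are not stated in the abstract: the log-linear link lambda(t) = exp(beta0) * Ne(t)^beta1, the sinusoidal trajectory seasonal_ne, all parameter values, and the use of the standard thinning algorithm for simulation. It is not the authors' implementation.

```python
import math
import random


def seasonal_ne(t, mean=100.0, amplitude=0.8, period=1.0):
    """Hypothetical effective population size trajectory with seasonal peaks."""
    return mean * (1.0 + amplitude * math.sin(2.0 * math.pi * t / period))


def sampling_intensity(t, beta0, beta1, ne=seasonal_ne):
    """Assumed link for this sketch: lambda(t) = exp(beta0) * Ne(t) ** beta1."""
    return math.exp(beta0) * ne(t) ** beta1


def simulate_sampling_times(t_start, t_end, beta0, beta1, lam_max, rng,
                            ne=seasonal_ne):
    """Draw sampling times on [t_start, t_end] by thinning.

    Candidate times come from a homogeneous Poisson process with rate
    lam_max; each candidate is kept with probability lambda(t) / lam_max.
    lam_max must bound the intensity from above on the whole interval.
    """
    times = []
    t = t_start
    while True:
        t += rng.expovariate(lam_max)  # waiting time to the next candidate
        if t > t_end:
            break
        lam = sampling_intensity(t, beta0, beta1, ne)
        if lam > lam_max:
            raise ValueError("lam_max does not bound the intensity")
        if rng.random() < lam / lam_max:
            times.append(t)
    return times


# Example: beta1 = 1 makes sampling effort proportional to Ne(t).
rng = random.Random(1)
beta0, beta1 = -2.0, 1.0
ne_peak = 100.0 * (1.0 + 0.8)  # maximum of seasonal_ne with its defaults
lam_max = 1.01 * math.exp(beta0) * ne_peak ** beta1  # small safety margin
times = simulate_sampling_times(0.0, 20.0, beta0, beta1, lam_max, rng)

# Share of samples collected while Ne(t) is above its mean value.
share_high = sum(seasonal_ne(t) > 100.0 for t in times) / len(times)
```

With beta1 = 0 the intensity is constant and sampling times carry no information about Ne(t); with beta1 > 0 samples cluster in high-population periods, which is the dependence the abstract describes as a source of bias when it is ignored.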

Keywords

Biology and Life Sciences, Evolutionary Biology, Population Genetics, Effective Population Size, Genetics, Population Biology, Population Metrics, Population Size, Medicine and Health Sciences, Infectious Diseases, Viral Diseases, Influenza, Simulation and Modeling, Database and Informatics Methods, Biological Databases, Sequence Databases, Molecular Biology, Molecular Biology Techniques, Sequencing Techniques, Sequence Analysis, Mathematical and Statistical Techniques, Statistical Methods, Monte Carlo Method, Physical Sciences, Mathematics, Statistics (Mathematics), Population Dynamics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
