Publication:
Addressing Missing Data in Viral Genetic Linkage Analysis Through Multiple Imputation and Subsampling-Based Likelihood Optimization

Date

2015-06-26

Citation

Erion, Gabriel Gandhi. 2015. Addressing Missing Data in Viral Genetic Linkage Analysis Through Multiple Imputation and Subsampling-Based Likelihood Optimization. Bachelor's thesis, Harvard College.

Abstract

This thesis addresses the intersection of two important areas: genetic linkage analysis in epidemiology and missing-data methods in statistics. Genetic linkage analysis is a promising method in viral epidemiology that learns about transmission patterns by studying clusters of similar gene sequences. For example, similar sequences found in a pair of geographically distinct communities may imply disease transmission between the two locations. However, this analysis is sensitive to missing data, which can introduce substantial bias. This thesis presents a multiple-imputation approach that corrects for much, though not all, of the bias in genetic linkage analysis. It also introduces a novel resampling-based approach that generates a weighted distribution of complete datasets and is even more effective than imputation at reducing bias. This work highlights the importance of missing data in genetic linkage studies and presents ways to provide more accurate epidemiological information by correcting for it. The new resampling-based approach presented in this thesis is also general enough to be applied to many types of missing-data problems involving complex datasets; such broader applications are a promising avenue for future research.
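
As a rough illustration of why missing sequences matter (this sketch is not taken from the thesis), suppose each sequence is sampled independently with probability p, an assumption made here only for simplicity. A linked pair is then observed only when both of its members are sampled, so the expected share of linked pairs that survives is about p squared. The function names and the toy cluster labels below are hypothetical.

```python
import random
from itertools import combinations


def count_linked_pairs(cluster_ids):
    """Count pairs of sequences that carry the same cluster label."""
    return sum(1 for a, b in combinations(cluster_ids, 2) if a == b)


def observed_pair_fraction(cluster_ids, p_observed, n_reps=2000, seed=0):
    """Average share of linked pairs still visible after random subsampling.

    Assumes every sequence is observed independently with probability
    p_observed (missing completely at random); real sampling may differ.
    """
    rng = random.Random(seed)
    full_pairs = count_linked_pairs(cluster_ids)
    total_observed = 0
    for _ in range(n_reps):
        # Keep each sequence with probability p_observed.
        kept = [c for c in cluster_ids if rng.random() < p_observed]
        total_observed += count_linked_pairs(kept)
    return total_observed / (n_reps * full_pairs)


# Toy data: three clusters of 10, 6 and 4 sequences (hypothetical labels).
clusters = [0] * 10 + [1] * 6 + [2] * 4
fraction = observed_pair_fraction(clusters, p_observed=0.5)
print(f"share of linked pairs observed: {fraction:.3f} (p squared = 0.25)")
```

Under this simple assumption, sampling half of the sequences leaves only about a quarter of the linked pairs visible, so naive counts of linkage understate the true amount unless the missing data are corrected for.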

Keywords

Biology, Biostatistics, Statistics, Health Sciences, Epidemiology

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
