Publication: Transformation and Multivariate Methods for Improving Power in Genome-Wide Association Studies
Date
2019-05-17
Citation
McCaw, Zachary Ryan. 2019. Transformation and Multivariate Methods for Improving Power in Genome-Wide Association Studies. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.
Abstract
This dissertation proposes methods for improving power while maintaining valid inference in genome-wide association studies (GWAS) of common, complex, quantitative traits. Chapter 1 proposes association tests that incorporate the rank-based inverse normal transformation (INT). In the direct approach, the phenotype itself is transformed, whereas in the indirect approach, phenotypic residuals are transformed. Since neither test is uniformly most powerful, these approaches are combined into an adaptive INT-based omnibus test (O-INT). Chapter 2 proposes Surrogate Phenotype Regression Analysis (Spray) for leveraging information from a surrogate outcome to improve inference on a partially missing target outcome. The target and surrogate outcomes are jointly modeled within a bivariate normal regression framework. Estimation in the presence of bilateral missingness is performed using the expectation conditional maximization algorithm, and a Wald test for the total effect of genotype on the target outcome is derived. Chapter 3 proposes Synthetic Surrogate Analysis (SSA) for extending Spray to the setting of multiple candidate surrogates. Rather than directly modeling the target outcome together with multiple surrogates, the candidate surrogates are instead combined into a univariate summary measure, termed the synthetic surrogate, which is jointly analyzed with the target outcome using the bivariate framework. Combining multiple surrogates to predict the target outcome improves power while maintaining computational tractability. All proposed estimation and inference procedures have been implemented in publicly available R packages.
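As a rough illustration of the rank-based inverse normal transformation mentioned for Chapter 1, the sketch below replaces each observation with the standard normal quantile of its offset-adjusted fractional rank, Φ⁻¹((rank − k) / (n − 2k + 1)). It is a minimal sketch rather than the dissertation's implementation: the function names, the offset k = 3/8 (the Blom offset, one common convention), and the averaging of tied ranks are all assumptions made here for illustration, and the example is written in Python rather than R.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Average of the 1-based positions start+1 .. end+1.
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def rank_inverse_normal(values, offset=0.375):
    """Rank-based INT: map offset-adjusted fractional ranks to normal quantiles.

    The default offset of 3/8 is an assumption (Blom offset), not a value
    taken from the abstract.
    """
    n = len(values)
    std_normal = NormalDist()
    return [
        std_normal.inv_cdf((rank - offset) / (n - 2 * offset + 1))
        for rank in average_ranks(values)
    ]


# Hypothetical right-skewed measurements, used only to exercise the function.
skewed = [0.2, 15.0, 1.1, 0.7, 3.4]
transformed = rank_inverse_normal(skewed)
```

In terms of the abstract's distinction, the direct approach would apply a function like `rank_inverse_normal` to the phenotype before regressing on genotype and covariates, while the indirect approach would apply it to the residuals from a regression of the phenotype on covariates. Because the transformation depends only on ranks, it preserves the ordering of the observations and is insensitive to how extreme the largest values are.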
Keywords
Genome-wide Association Studies, Inverse Normal Transformation, Missing Data, Multivariate Analysis, Power, Statistical Genetics, Type I Error
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service