Statistical Methods for Multivariate and Complex Phenotypes



Title: Statistical Methods for Multivariate and Complex Phenotypes
Author: Agniel, Denis Madison
Citation: Agniel, Denis Madison. 2014. Statistical Methods for Multivariate and Complex Phenotypes. Doctoral dissertation, Harvard University.
Abstract: Many important scientific questions cannot be studied properly using a single measurement as a response. For example, many phenotypes of interest in recent clinical research are difficult to characterize because of their inherent complexity. It may be difficult to determine the presence or absence of disease from a single measurement, or even a few measurements, or the phenotype may be defined only by a series of symptoms. Similarly, a set of related phenotypes or measurements may be studied together in order to detect a shared etiology. In this work, we propose methods for studying complex phenotypes of these types, where the phenotype may be characterized either longitudinally or by a diverse set of continuous, discrete, or not fully observed components.
In chapter 1, we seek to identify predictors that are related to multiple components of diverse outcomes. Specifically, we take up the question of identifying a multiple regulator: a genetic marker that is associated with multiple biomarkers for autoimmune disease. To do this, we propose sparse multiple regulation testing (SMRT), both to estimate the relationship between a set of predictors and diverse outcomes and to provide a testing framework for identifying which predictors are associated with multiple elements of the outcomes while controlling error rates.
In chapter 2, we seek to identify risk profiles, or risk scores, for diverse outcomes, where a risk profile is a linear combination of predictors. The risk profiles are chosen to be highly correlated with latent traits underlying the outcomes. To do this, we propose semiparametric canonical correlation analysis (sCCA), an updated version of classical canonical correlation analysis.
In chapter 3, the scientific question of interest pertains directly to the progression of disease over time. We provide a testing framework for detecting the association between a set of genetic markers and the progression of disease in the context of a genome-wide association study (GWAS). To test for this association while allowing for highly nonlinear longitudinal progression of disease, we propose functional principal variance component (FPVC) testing.
Terms of Use: This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at