Publication:
Statistical Methods for Precise Phenotypes in Alzheimer's Disease Research

Date

2018-09-25

Citation

Goodman, Matthew Orlando. 2018. Statistical Methods for Precise Phenotypes in Alzheimer's Disease Research. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Abstract

Longitudinal cohort studies in Alzheimer’s disease (AD) often measure a variety of quantitative features that each provide a window on the disease process and may be combined to form a more complete picture. These phenotypes, such as time-to-diagnosis of AD, longitudinal measures of cognition, additional functional scores, along with pathology outcomes and biomarker measurements, present opportunities and challenges for statistical analysis. We propose statistical methods that are tailored to such challenging outcome settings, explore the statistical properties of these proposed methods, and make comparisons to existing methods where available.

In Chapter 1, we discuss a SNP-set test for a count outcome with informative excess zeros, such as occurs when observing the burden of neuritic plaques in Alzheimer’s brain pathology studies, where those who show none contribute to our understanding of neurodegenerative disease. The outcome may be characterized by a mixture distribution with one component being the ‘structural zero’ and the other component being a Poisson distribution. We propose a novel variance components score test of genetic association between a set of genetic markers and a zero-inflated count outcome from a mixture distribution. This test shares advantageous properties with SNP-set tests that have been previously devised for standard continuous or binary outcomes, such as the Sequence Kernel Association Test (SKAT). In particular, our method has superior statistical power compared to competing methods, especially when there is correlation within the group of markers, and when the SNPs are associated with both the mixing proportion and the rate of the Poisson distribution. We apply the method to Alzheimer’s data from the Rush University Religious Orders Study and Memory and Aging Project (ROSMAP), where, as proof of principle, we find highly significant associations with the APOE gene, in both the ‘structural zero’ and ‘count’ parameters, when applied to a zero-inflated neuritic plaques count outcome.

In Chapter 2, we consider chronic degenerative diseases with age-related etiology, such as Alzheimer’s disease, where it may be useful to test for age-specific genetic associations. In this paper we propose an age-specific test of association between a set of genetic or other baseline markers and a collection of multiple outcomes, including a post-landmark time-to-event, e.g. time-to-diagnosis, and a pre-landmark set of longitudinal trajectories, e.g. cognitive testing. This testing setting arises naturally in a landmark prediction setting with age-specific risk. Here the question of interest is to predict risk of an outcome occurring within τ years of a landmark point, where longitudinal outcomes and baseline data have been collected prior to the landmark. We propose age-specific and outcome-specific tests as well as a global test of association for the marker set with multivariate outcome across ages. As a demonstration of our method we analyze data from the ROSMAP study of aging and discover a statistically significant association of the APOE gene with both 5-year AD-free survival and cognitive trajectory scores.

In Chapter 3, we propose a data-driven method to characterize severity and risk using data from multivariate longitudinal outcomes, in tandem with a time-to-event outcome, to improve understanding of etiology and prognosis in chronic degenerative neurological disorders and cognitive aging. The method will find application to studies that assess multiple outcomes over time, such as cognitive assessments, functional assessments, imaging and other biomarkers, as well as time to clinical diagnosis. Our approach is to construct a time-specific shared-coefficients model that unifies the effect of baseline predictors on a survival outcome and multivariate longitudinal outcomes. We build on previous work on time-varying coefficients models and sufficient dimension reduction, extending and combining existing methods to provide a unified approach in our outcome setting. This allows us to extract a data-driven bivariate severity measure, consisting of a linear combination of the longitudinal phenotypes, along with the time-to-event outcome itself, which together capture the disease process, and are by construction predictable by a simultaneously extracted data-driven time-varying risk score.
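
For readers unfamiliar with the zero-inflated count outcome described for Chapter 1, the following is a minimal illustrative sketch of a two-component mixture in which an observation is a ‘structural zero’ with some probability and otherwise a Poisson count. It is not the dissertation's variance components score test, and it involves no genetic markers. The function names (`zip_pmf`, `sample_poisson`, `sample_zip`) and the parameter values (`pi = 0.3`, `lam = 2.0`) are assumptions chosen for illustration only and do not come from the dissertation or from ROSMAP data.

```python
import math
import random


def zip_pmf(k, pi, lam):
    """P(Y = k) for a zero-inflated Poisson mixture.

    pi  : probability of a structural zero (hypothetical name)
    lam : rate of the Poisson component (hypothetical name)
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero can come from either mixture component.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_zip(n, pi, lam, seed=0):
    """Draw n counts: structural zero with probability pi, else Poisson(lam)."""
    rng = random.Random(seed)
    return [0 if rng.random() < pi else sample_poisson(lam, rng) for _ in range(n)]


# Illustrative values only, not estimates from any study.
counts = sample_zip(10_000, pi=0.3, lam=2.0)
observed_zero_fraction = counts.count(0) / len(counts)
expected_zero_fraction = zip_pmf(0, pi=0.3, lam=2.0)
sample_mean = sum(counts) / len(counts)
```

In this sketch the expected share of zeros is `pi + (1 - pi) * exp(-lam)`, which exceeds the share a plain Poisson distribution with the same rate would produce. That excess is the sense in which the abstract speaks of a count outcome with excess zeros, and the two parameters correspond to the ‘structural zero’ and ‘count’ parts of the mixture.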

Keywords

Biostatistics, Methods, Alzheimer's

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
