Publication:

Statistical Methods for Analyzing DNA Methylation Data and Subpopulation Analysis of Continuous, Binary and Count Data for Clinical Trials

Date

2015-01-13

Citation

Yip, Wai-Ki. 2015. Statistical Methods for Analyzing DNA Methylation Data and Subpopulation Analysis of Continuous, Binary and Count Data for Clinical Trials. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Abstract

DNA methylation may represent an important contributor to the missing heritability described in complex trait genetics. However, technology to measure DNA methylation has outpaced statistical methods for its analysis. Novel methodologies are required to accommodate this growing volume of DNA methylation data. In this dissertation, I propose two novel methods to analyze DNA methylation data: (1) a new statistic based on spatial location information of DNA methylation sites to detect differentially methylated regions in the genome in case-control studies; and (2) a principal component approach for the detection of unknown substructure in DNA methylation data. For each method, I review existing approaches and demonstrate the efficacy of my proposed method using simulation and data application.

Medical research is increasingly focused on personalizing the care of patients. A better understanding of the interaction between treatment and patient-specific prognostic factors will enable practitioners to expand the availability of tailored therapies, improving patient outcomes. The Subpopulation Treatment Effect Pattern Plot (STEPP) approach was developed to allow researchers to investigate the heterogeneity of treatment effects on survival outcomes across increasing values of a continuously measured covariate, such as a biomarker measurement. I extend the STEPP approach to continuous, binary and count outcomes, which can be easily modeled with generalized linear models (GLMs). The statistical significance of any observed heterogeneity of treatment effect is assessed using permutation tests. The method is implemented in the R software package (stepp) and is available in R version 3.1.1. The efficacy of my STEPP extension is demonstrated by using simulation and data application.
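
As a rough illustration of the STEPP idea described above, the following Python sketch forms overlapping subpopulations along a covariate, estimates a treatment effect in each, and assesses heterogeneity with a permutation test. It is not the stepp package's API, and every function and parameter name here (`sliding_windows`, `risk_difference`, `sup_statistic`, `permutation_pvalue`, `r2`, `r1`) is hypothetical. Two simplifying assumptions are made: the per-subpopulation effect for a binary outcome is a plain difference in response proportions rather than a fitted GLM, and the permutation scheme shuffles covariate values within each treatment arm, which is one common formulation and may differ from the dissertation's.

```python
import random


def sliding_windows(n, r2, r1):
    """Return (start, end) index pairs of overlapping subpopulations.

    n  : number of patients, already ordered by the covariate
    r2 : patients per subpopulation
    r1 : patients shared by neighbouring subpopulations (r1 < r2)
    """
    if not 0 <= r1 < r2 <= n:
        raise ValueError("need 0 <= r1 < r2 <= n")
    step = r2 - r1
    windows = [(s, s + r2) for s in range(0, n - r2 + 1, step)]
    # Stretch the last window so the highest covariate values are not dropped.
    windows[-1] = (windows[-1][0], n)
    return windows


def risk_difference(treat, outcome):
    """Response proportion, treated minus control; None if an arm is empty."""
    treated = [y for a, y in zip(treat, outcome) if a == 1]
    control = [y for a, y in zip(treat, outcome) if a == 0]
    if not treated or not control:
        return None
    return sum(treated) / len(treated) - sum(control) / len(control)


def sup_statistic(cov, treat, outcome, r2, r1):
    """Largest absolute gap between a subpopulation effect and the overall effect.

    Assumes both treatment arms are present in the full sample.
    """
    order = sorted(range(len(cov)), key=lambda i: cov[i])
    overall = risk_difference(treat, outcome)
    gaps = []
    for start, end in sliding_windows(len(cov), r2, r1):
        idx = order[start:end]
        effect = risk_difference([treat[i] for i in idx],
                                 [outcome[i] for i in idx])
        if effect is not None:  # skip windows that lack one arm
            gaps.append(abs(effect - overall))
    return max(gaps, default=0.0)


def permutation_pvalue(cov, treat, outcome, r2, r1, n_perm=500, seed=0):
    """Permutation p-value for heterogeneity of treatment effect.

    Assumption: covariate values are shuffled within each treatment arm,
    which keeps the overall effect fixed while breaking any
    treatment-by-covariate interaction.
    """
    rng = random.Random(seed)
    observed = sup_statistic(cov, treat, outcome, r2, r1)
    arms = {a: [i for i, t in enumerate(treat) if t == a] for a in (0, 1)}
    hits = 0
    for _ in range(n_perm):
        permuted = list(cov)
        for members in arms.values():
            values = [cov[i] for i in members]
            rng.shuffle(values)
            for i, v in zip(members, values):
                permuted[i] = v
        if sup_statistic(permuted, treat, outcome, r2, r1) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Simulated data (not from the dissertation): treatment raises the
    # response probability only when the covariate exceeds 0.5.
    rng = random.Random(1)
    n = 300
    cov = [rng.random() for _ in range(n)]
    treat = [rng.randint(0, 1) for _ in range(n)]
    outcome = [int(rng.random() < 0.3 + 0.4 * a * (x > 0.5))
               for x, a in zip(cov, treat)]
    print(permutation_pvalue(cov, treat, outcome, r2=80, r1=60, n_perm=200))
```

For continuous or count outcomes, the same skeleton would apply with `risk_difference` swapped for a mean difference or a rate ratio, or for a coefficient from a GLM fitted within each subpopulation.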

Keywords

Biology, Biostatistics; Biology, Bioinformatics; Biology, Genetics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
