Survival Analysis with High-Dimensional Covariates, with Applications to Cancer Genomics

Title: Survival Analysis with High-Dimensional Covariates, with Applications to Cancer Genomics
Author: Zhao, Sihai
Citation: Zhao, Sihai. 2012. Survival Analysis with High-Dimensional Covariates, with Applications to Cancer Genomics. Doctoral dissertation, Harvard University.
Abstract: Recent technological advances have given cancer researchers the ability to gather vast amounts of genetic and genomic data from individual patients. These data offer tantalizing possibilities in areas such as basic cancer biology, tailored therapies, and personalized risk prediction. At the same time, they have introduced many analytical difficulties that cannot be properly addressed with current statistical procedures, because the number of genomic covariates in these datasets is often larger than the sample size. In this dissertation we study methods for addressing this so-called high-dimensional issue when genomic data are used to analyze time-to-event outcomes, which are common in clinical cancer studies. In Chapter 1, we propose a regularization method for sparse estimation based on estimating equations. Our method can be used even when the number of covariates exceeds the number of samples, and can be implemented using well-studied algorithms from the non-linear constrained optimization literature. Furthermore, for certain estimating equations and certain regularizers, including the lasso and group lasso, we prove a finite-sample probability bound on the accuracy of our estimator. However, it is well known that these types of regularization methods can achieve better performance if a quick and simple procedure is first used to reduce the number of covariates. In Chapter 2, we propose and theoretically justify a principled method for reducing dimensionality in the analysis of censored data by selecting only the important covariates. Our procedure involves a tuning parameter that has a simple interpretation as the desired false positive rate of this selection. Similar types of model-based screening methods have also been proposed, but only for a few specific models. Model-free screening methods have also recently been studied, but can have lower power to detect important covariates.
In Chapter 3, we propose a screening procedure that can be used with any model that can be fit using estimating equations, and provide unified results on its finite-sample screening performance. We thus generalize many recently proposed model-based and model-free screening procedures. We also propose an iterative version of our method and show that it is closely related to a recently studied boosting method for estimating equations.
Terms of Use: This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:9385643