Show simple item record

dc.contributor.advisor: Li, Yi
dc.contributor.author: Zhao, Sihai
dc.date.accessioned: 2012-08-09T15:05:11Z
dc.date.issued: 2012-08-09
dc.date.submitted: 2012
dc.identifier.citation: Zhao, Sihai. 2012. Survival Analysis with High-Dimensional Covariates, with Applications to Cancer Genomics. Doctoral dissertation, Harvard University. [en_US]
dc.identifier.other: http://dissertations.umi.com/gsas.harvard:10245 [en]
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:9385643
dc.description.abstract: Recent technological advances have given cancer researchers the ability to gather vast amounts of genetic and genomic data from individual patients. These data offer tantalizing possibilities in areas such as basic cancer biology, tailored therapies, and personalized risk prediction. At the same time, they have introduced many analytical difficulties that cannot be properly addressed with current statistical procedures, because the number of genomic covariates in these datasets is often larger than the sample size. In this dissertation we study methods for addressing this so-called high-dimensional issue when genomic data are used to analyze time-to-event outcomes, which are common in clinical cancer studies. In Chapter 1, we propose a regularization method for sparse estimation for estimating equations. Our method can be used even when the number of covariates exceeds the number of samples, and can be implemented using well-studied algorithms from the non-linear constrained optimization literature. Furthermore, for certain estimating equations and certain regularizers, including the lasso and group lasso, we prove a finite-sample probability bound on the accuracy of our estimator. However, it is well known that these types of regularization methods can achieve better performance if a quick and simple procedure is first used to reduce the number of covariates. In Chapter 2, we propose and theoretically justify a principled method for reducing dimensionality in the analysis of censored data by selecting only the important covariates. Our procedure involves a tuning parameter that has a simple interpretation as the desired false positive rate of this selection. Similar types of model-based screening methods have also been proposed, but only for a few specific models. Model-free screening methods have also recently been studied, but can have lower power to detect important covariates. In Chapter 3, we propose a screening procedure that can be used with any model that can be fit using estimating equations, and provide unified results on its finite-sample screening performance. We thus generalize many recently proposed model-based and model-free screening procedures. We also propose an iterative version of our method and show that it is closely related to a recently studied boosting method for estimating equations. [en_US]
dc.language.iso: en_US [en_US]
dash.license: LAA
dc.subject: biostatistics [en_US]
dc.title: Survival Analysis with High-Dimensional Covariates, with Applications to Cancer Genomics [en_US]
dc.type: Thesis or Dissertation [en_US]
dc.date.available: 2012-08-09T15:05:11Z
thesis.degree.date: 2012 [en_US]
thesis.degree.discipline: Biostatistics [en_US]
thesis.degree.grantor: Harvard University [en_US]
thesis.degree.level: doctoral [en_US]
thesis.degree.name: Ph.D. [en_US]

