Overcoming the Curse: Statistical Theory and Methods in High Dimensions
Access Status: Full text of the requested work is not available in DASH at this time ("dark deposit").
Citation: Wang, Wenshuo. 2021. Overcoming the Curse: Statistical Theory and Methods in High Dimensions. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.
Abstract: Battling the curse of dimensionality is a common theme in modern statistics. This dissertation contributes to this effort by surveying theory and methods in high-dimensional statistics developed by the author and collaborators, specifically in the areas of regression and sampling. In regression analysis, statisticians try to relate a response variable Y to a set of potential explanatory variables X = (X1, ..., Xp), and start by trying to identify the variables that contribute to this relationship. In statistical terms, this goal can be posed as trying to identify the Xj’s upon which Y is conditionally dependent. Sometimes it is of value to test this for each j simultaneously, a task more commonly known as variable selection. Classic techniques fail to accomplish these tasks when the number of features is too large. To address this challenge, the conditional randomization test and model-X knockoffs are two recently proposed methods that respectively perform conditional independence testing and variable selection using knowledge of X’s distribution. Chapter 1 proposes the first generic knockoff sampling algorithm, which is the point of departure for implementing knockoffs, and investigates its optimality in terms of time complexity. Chapter 2 examines the power of the conditional randomization test and knockoffs in the high-dimensional regime in which the ratio of the dimension p to the sample size n converges to a positive constant. In sampling, statisticians try to draw samples from a target distribution. It is quite demanding to obtain representative samples from a high-dimensional distribution while keeping the sample size at a reasonable level. Sequential Monte Carlo and sequential quasi-Monte Carlo methods are powerful computational tools on this front. Chapter 3 studies these methods and derives new convergence rates in terms of the number of samples n.
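As a rough illustration of the conditional randomization test mentioned in the abstract, the Python sketch below tests whether Y depends on one variable Xj given the remaining variables, assuming the conditional distribution of Xj given the rest is known and can be sampled. It is not taken from the dissertation: the names `crt_p_value`, `abs_covariance`, and `independent_gaussian_sampler` are hypothetical, the absolute-covariance statistic is just one simple choice, and the toy sampler assumes Xj is standard Gaussian and independent of the other variables.

```python
import random


def abs_covariance(x_j, x_rest, y):
    """Hypothetical test statistic: absolute sample covariance of x_j and y.

    x_rest is unused here; it is accepted so that richer statistics can use it.
    """
    n = len(y)
    mean_x = sum(x_j) / n
    mean_y = sum(y) / n
    return abs(sum((a - mean_x) * (b - mean_y) for a, b in zip(x_j, y))) / n


def independent_gaussian_sampler(x_rest, rng):
    """Toy conditional sampler.

    ASSUMPTION: X_j ~ N(0, 1) independently of the other variables, so the
    conditional draw ignores the values in x_rest and only uses its length.
    """
    return [rng.gauss(0.0, 1.0) for _ in x_rest]


def crt_p_value(x_j, x_rest, y, sampler, statistic, num_resamples=200, seed=0):
    """Monte Carlo p-value for H0: Y is independent of X_j given the rest.

    Redraws X_j from its (assumed known) conditional distribution, recomputes
    the statistic each time, and counts how often the redrawn value is at
    least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = statistic(x_j, x_rest, y)
    exceed = 0
    for _ in range(num_resamples):
        x_j_new = sampler(x_rest, rng)
        if statistic(x_j_new, x_rest, y) >= observed:
            exceed += 1
    # The +1 terms count the observed data as one of the draws.
    return (1 + exceed) / (1 + num_resamples)


def simulate(effect, n=200, seed=1):
    """Toy data: Y = effect * X_j + X_other + noise, all Gaussian."""
    rng = random.Random(seed)
    x_j = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x_rest = [[rng.gauss(0.0, 1.0)] for _ in range(n)]
    y = [effect * a + row[0] + rng.gauss(0.0, 1.0) for a, row in zip(x_j, x_rest)]
    return x_j, x_rest, y


if __name__ == "__main__":
    for effect in (2.0, 0.0):
        x_j, x_rest, y = simulate(effect)
        p = crt_p_value(x_j, x_rest, y, independent_gaussian_sampler, abs_covariance)
        print(f"effect={effect}: p-value={p:.3f}")
```

In this toy setup a strong effect of Xj on Y should produce a small p-value, while under the null (effect 0) the p-value is approximately uniform on (0, 1]. Any statistic can be substituted for `abs_covariance`; in this sketch the validity of the test rests only on sampling Xj correctly given the other variables.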
Citable link to this page: https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37368273