Publication:
Overcoming the Curse: Statistical Theory and Methods in High Dimensions

Date

2021-05-07

Citation

Wang, Wenshuo. 2021. Overcoming the Curse: Statistical Theory and Methods in High Dimensions. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Battling the curse of dimensionality is a common theme in modern statistics. This dissertation contributes to this effort by surveying theory and methods in high-dimensional statistics developed by the author and collaborators, specifically in the areas of regression and sampling. In regression analysis, statisticians try to relate a response variable Y to a set of potential explanatory variables X = (X1, ..., Xp), and start by trying to identify the variables that contribute to this relationship. In statistical terms, this goal can be posed as identifying the Xj’s on which Y is conditionally dependent given the remaining variables. Sometimes it is of value to test this for every j simultaneously, a task more commonly known as variable selection. Classic techniques fail to accomplish these tasks when the number of features is too large. To address this challenge, the conditional randomization test and model-X knockoffs are two recently proposed methods that respectively perform conditional independence testing and variable selection using knowledge of X’s distribution. Chapter 1 proposes the first generic knockoff sampling algorithm, which is the starting point for implementing knockoffs, and investigates its optimality in terms of time complexity. Chapter 2 examines the power of the conditional randomization test and knockoffs in the high-dimensional regime in which the ratio of the dimension p to the sample size n converges to a positive constant. In sampling, statisticians try to draw samples from a target distribution. It is quite demanding to obtain representative samples from a high-dimensional distribution while keeping the sample size at a reasonable level. Sequential Monte Carlo and sequential quasi-Monte Carlo methods are powerful computational tools on this front. Chapter 3 studies these methods and derives new convergence rates in terms of the number of samples n.
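
To make the idea of testing conditional independence "using knowledge of X’s distribution" concrete, the following is a minimal sketch of a conditional randomization test, not the dissertation's own implementation. It assumes, purely for illustration, that the rows of X are drawn from a mean-zero Gaussian with a known covariance matrix, that p is at least 2, and that the test statistic is the absolute inner product between Xj and the least-squares residual of Y on the other variables. The function name crt_pvalue and all of its parameters are hypothetical.

```python
import numpy as np


def crt_pvalue(X, y, j, Sigma, n_resamples=200, seed=0):
    """Conditional randomization test p-value for H0: Y indep. of X_j given X_{-j}.

    Assumption (illustrative only): rows of X are N(0, Sigma) with Sigma known.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    rest = np.delete(np.arange(p), j)  # indices of all variables except j
    X_rest = X[:, rest]

    # Gaussian conditional law of X_j given X_{-j}: linear mean, constant variance.
    coef = np.linalg.solve(Sigma[np.ix_(rest, rest)], Sigma[rest, j])
    cond_mean = X_rest @ coef
    cond_sd = np.sqrt(Sigma[j, j] - Sigma[rest, j] @ coef)

    # Illustrative statistic: |<x_j, residual of y after regressing on X_{-j}>|.
    beta, *_ = np.linalg.lstsq(X_rest, y, rcond=None)
    resid = y - X_rest @ beta
    t_obs = abs(X[:, j] @ resid)

    # Redraw X_j from its conditional law; under H0 these copies are
    # exchangeable with the observed column, so the rank p-value is valid.
    draws = cond_mean[:, None] + cond_sd * rng.standard_normal((n, n_resamples))
    t_null = np.abs(resid @ draws)

    return (1 + np.sum(t_null >= t_obs)) / (n_resamples + 1)
```

The p-value is valid for any choice of statistic because the resampled columns are drawn from the true conditional distribution of Xj; only the power depends on the statistic chosen here.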

Keywords

Statistics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
