Publication: Inference on Nonparametric Targets and Discrete Structures
Abstract
Many modern applications seek to understand the relationships among variables. For example, scientists want to infer the dependence between an outcome variable and a covariate in the presence of a (possibly high-dimensional) confounding variable. In the context of graphs and networks, it is also of interest to learn the underlying discrete structures. This dissertation focuses on designing uncertainty assessment methodologies for nonparametric targets and discrete graph structures to reveal complex patterns in the underlying data-generating distributions. Chapter 1 focuses on the variable importance problem: it proposes a new approach called floodgate and applies it to the minimum mean squared error (mMSE) gap, an interpretable and sensitive model-free measure of variable importance. Floodgate can leverage any working regression function chosen by the user to construct asymptotic lower confidence bounds; the chapter also discusses its adaptivity and robustness. Chapter 2 delivers a regression inference framework: it uses the mMSE gap with respect to a closed linear subspace or a convex cone to define a diverse range of inferential targets, and it uses the floodgate idea to conduct inference in a unified way. To demonstrate the generality and flexibility of floodgate, it presents the computational details of implementing floodgate for multiple statistical problems, including nonlinearity, interactions, deviation from shape constraints, and many others. Chapter 3 studies hubs, a particular type of discrete structure. It proposes the StarTrek filter to select hub nodes in networks and establishes false discovery rate (FDR) control guarantees in high-dimensional models. As core techniques for such FDR control problems, novel probabilistic results, namely Cramér-type Gaussian comparison bounds, are developed in this chapter.