Publication:

Learning Low-Dimensional Structures in High Dimensions


Date

2025-04-23

Citation

Song, Yanke. 2025. Learning Low-Dimensional Structures in High Dimensions. Doctoral Dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Although researchers now work with increasingly high-dimensional datasets, lower-dimensional structures often exist, and learning them effectively plays a crucial role. This dissertation contributes to this effort by presenting theories and methodologies, developed by the author and collaborators, for learning low-dimensional structures in high dimensions. The first two chapters focus on learning one-dimensional functionals in the proportional asymptotics setting, where the number of observations grows proportionally with the number of features. The theories developed in this asymptotic regime reveal surprising new phenomena and exhibit remarkable accuracy in finite samples of high dimension, where classical theories fail. Chapter 1 investigates heritability estimation under this framework. We study the joint distribution of debiased Lasso and Ridge estimators under proportional asymptotics, and consequently propose a heritability estimator that features an adaptive tuning procedure. Empirical studies confirm its efficiency and robustness compared to previous methods. Chapter 2 focuses on transfer learning in linear models. Within the proportional asymptotics framework, we precisely characterize the generalization error of min-norm interpolators under both design and model shifts, and confirm the finite-sample accuracy via simulations. This analysis identifies regimes where transfer learning is beneficial and provides relevant guidelines for practitioners. Chapter 3 shifts focus to unsupervised learning of high-dimensional Poisson process arrival data. We propose an end-to-end pipeline that not only reconstructs the underlying (inhomogeneous) rate functions of Poisson processes, but also learns low-dimensional representations that are valuable for downstream learning tasks.

Keywords

High-Dimensional Statistics, Poisson Processes, Signal-to-Noise Ratio, Transfer Learning, Statistics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
