Publication:

Statistical Advances Under Dependence: Multiple Studies, Stein Variational Methods, and Limit Theorems

Date

2025-06-05

The Harvard community has made this article openly available.

Citation

Liu, Tianle. 2025. Statistical Advances Under Dependence: Multiple Studies, Stein Variational Methods, and Limit Theorems. Doctoral Dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

This dissertation presents contributions to statistical inference and probabilistic modeling, with a focus on uncertainty quantification, variational inference, and dependence structures in limit theorems.

The first part introduces a robust method for integrating evidence across multiple dependent studies. Recently, there has been a surge of interest in hypothesis testing methods that combine dependent studies without explicitly assessing their dependence. Among these, the Cauchy combination test (CCT) stands out for its approximate validity and power, leveraging a heavy-tail approximation that is insensitive to dependence. However, CCT is highly sensitive to large p-values, and inverting it to construct confidence regions can yield regions that lack compactness, convexity, or connectivity. This part proposes a “heavily right” strategy that excludes the left half of the Cauchy distribution in the combination rule, retaining CCT's resilience to dependence while resolving its sensitivity to large p-values. Moreover, both the Half-Cauchy combination and the harmonic mean approach guarantee bounded and convex confidence regions, distinguishing them as the only known combination tests with all of these desirable properties. Efficient and accurate algorithms are introduced for implementing both methods. Additionally, we develop a divide-and-combine strategy for constructing confidence regions for high-dimensional mean estimation using the Half-Cauchy method, and empirically illustrate its advantages over the Hotelling T^2 approach. To demonstrate the practical utility of the Half-Cauchy approach, we apply it to network meta-analysis, constructing simultaneous confidence intervals for treatment effect comparisons across multiple clinical trials.
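As a rough illustration of the sensitivity to large p-values described above, the sketch below contrasts the standard CCT with a half-Cauchy transform of the p-values. The function names, the equal default weights, and the example p-values are assumptions made for illustration. Only the combination statistic is computed for the half-Cauchy rule; the dissertation's algorithms for calibrating it into a combined p-value are not reproduced here.

```python
import math


def cauchy_combination_pvalue(pvalues, weights=None):
    """Standard CCT: T = sum_i w_i * tan(pi * (0.5 - p_i)), p = 0.5 - arctan(T) / pi."""
    n = len(pvalues)
    weights = weights or [1.0 / n] * n  # equal weights are an assumption
    t = sum(w * math.tan(math.pi * (0.5 - p)) for w, p in zip(weights, pvalues))
    return 0.5 - math.atan(t) / math.pi


def half_cauchy_statistic(pvalues, weights=None):
    """Weighted sum of half-Cauchy upper quantiles, tan(pi * (1 - p_i) / 2).

    Each term is non-negative, so a p-value near 1 contributes roughly zero
    instead of a large negative value.
    """
    n = len(pvalues)
    weights = weights or [1.0 / n] * n  # equal weights are an assumption
    return sum(w * math.tan(math.pi * (1.0 - p) / 2.0) for w, p in zip(weights, pvalues))


# Hypothetical input: one very small p-value and one p-value very close to 1.
example = [1e-4, 0.9999]
cct_p = cauchy_combination_pvalue(example)  # the two Cauchy terms nearly cancel
hc_stat = half_cauchy_statistic(example)    # the small p-value still dominates
print(cct_p, hc_stat)
```

In this example the two Cauchy-transformed terms have nearly equal magnitude and opposite sign, so the CCT p-value lands close to 0.5 even though one study reports strong evidence. The half-Cauchy statistic stays large because the p-value near 1 adds almost nothing to the sum.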

The second part examines Stein Variational Gradient Descent (SVGD) for Gaussian variational inference. SVGD is a nonparametric, particle-based, deterministic sampling algorithm. Despite its wide usage, understanding the theoretical properties of SVGD has remained a challenging problem due to the complicated dependency structures between particles in their updates. Notably, for sampling from a Gaussian target, the SVGD dynamics with a bilinear kernel will remain Gaussian when the initializer is Gaussian. Inspired by this fact, we undertake a detailed theoretical study of Gaussian-SVGD, i.e., SVGD projected to the family of Gaussian distributions via the bilinear kernel, or equivalently Gaussian variational inference (GVI) with SVGD. We present a complete picture by considering both the mean-field PDE and discrete particle systems. When the target is strongly log-concave, the mean-field Gaussian-SVGD dynamics is proven to converge linearly to the Gaussian distribution closest to the target in KL divergence. In the finite-particle setting, if the target is Gaussian, there is both uniform-in-time convergence to the mean-field limit and linear convergence in time to the equilibrium. In the general case, we propose a density-based and a particle-based implementation of Gaussian-SVGD, and show that several recent algorithms for GVI, proposed from different perspectives, emerge as special cases of our unifying framework. Interestingly, one of the new particle-based instances from this framework empirically outperforms existing approaches. Our results make concrete contributions towards obtaining a deeper understanding of both SVGD and GVI.
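The finite-particle behavior for a Gaussian target can be sketched in one dimension. The code below is a minimal illustration, not the dissertation's implementation: it assumes the bilinear kernel takes the form k(x, y) = x*y + 1, uses the standard SVGD update, and picks a hypothetical target N(1, 2), step size, and initial particles.

```python
def svgd_step(particles, mu, sigma2, step):
    """One SVGD update for the 1-D target N(mu, sigma2).

    Assumed kernel: k(x, y) = x * y + 1, so d/dx_j k(x_j, x_i) = x_i.
    """
    n = len(particles)
    # Score of the Gaussian target evaluated at each particle.
    scores = [-(x - mu) / sigma2 for x in particles]
    updated = []
    for xi in particles:
        # phi(x_i) = (1/n) * sum_j [ k(x_j, x_i) * score(x_j) + d/dx_j k(x_j, x_i) ]
        phi = sum((xj * xi + 1.0) * sj + xi for xj, sj in zip(particles, scores)) / n
        updated.append(xi + step * phi)
    return updated


# Hypothetical setup: five particles, target mean 1.0 and variance 2.0.
particles = [-1.0, -0.5, 0.0, 0.5, 1.0]
for _ in range(2000):
    particles = svgd_step(particles, mu=1.0, sigma2=2.0, step=0.05)

mean = sum(particles) / len(particles)
var = sum((x - mean) ** 2 for x in particles) / len(particles)  # 1/n variance
print(mean, var)
```

With this kernel the update is an affine map of each particle whose coefficients depend only on the empirical mean and second moment, which is why the particle system stays within a location-scale family. In this run the empirical mean and the 1/n variance settle at the target values.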

The third and fourth parts of the dissertation investigate classical probability problems concerning Wasserstein-p bounds in central limit theorems under dependence. The central limit theorem is one of the most fundamental results in probability and has been successfully extended to locally dependent data and strongly mixing random fields. In this work, we establish the rate of convergence in the central limit theorem in terms of transport distances. Specifically, for arbitrary p >= 1 we obtain an upper bound on the Wasserstein-p distance between the law of the scaled sum and the limiting normal distribution for (i) locally dependent random variables and (ii) strongly mixing stationary random fields. Our proofs extend Stein's dependency neighborhood method to the Wasserstein distance of a general order p >= 1 and provide new tools for studying the deviation behavior of dependent random variables. Moreover, as an application, we demonstrate how our results can be used to obtain tail bounds for the empirical average under weak dependence that are asymptotically tight and decrease polynomially fast.
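For reference, the transport distance in question has the following standard definition. The notation is an assumption, since the abstract fixes none: $\mu$ and $\nu$ are probability measures on $\mathbb{R}$ with finite $p$-th moments, $\Gamma(\mu,\nu)$ is the set of their couplings, and $W_n$ denotes the scaled sum of $X_1,\dots,X_n$ with $\sigma_n^2$ its variance.

```latex
\[
  \mathcal{W}_p(\mu,\nu)
  \;=\;
  \Bigl(\,\inf_{\gamma \in \Gamma(\mu,\nu)}
    \mathbb{E}_{(X,Y)\sim\gamma}\,\lvert X - Y\rvert^{p}\Bigr)^{1/p},
  \qquad p \ge 1,
\]
\[
  W_n \;=\; \frac{1}{\sigma_n}\sum_{i=1}^{n} X_i,
  \qquad
  \text{quantity bounded: }\;
  \mathcal{W}_p\bigl(\mathcal{L}(W_n),\,\mathcal{N}(0,1)\bigr).
\]
```

Since $\mathcal{W}_p$ is nondecreasing in $p$, a bound for large $p$ is stronger than the classical $p = 1$ case.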

Collectively, these works provide new insights into statistical inference under dependence, advancing both methodological and theoretical understanding.

Keywords

Statistics, Computer science, Applied mathematics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
