Two Numerical Methods for Approximating High-Dimensional Posterior Distributions

Quinn, Jameson Arnold

Citation

Quinn, Jameson Arnold. 2020. Two Numerical Methods for Approximating High-Dimensional Posterior Distributions. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Abstract

The three chapters of this dissertation are largely self-contained, though Chapter 3 builds on the ideas and work of Chapter 2. The underlying similarities and connections are discussed in the foreword, but the content of each chapter may be summarized separately.
Chapter 1: Online data assimilation in time series models over a large spatial extent is an important problem in both geosciences and robotics. Such models are intrinsically high-dimensional, rendering naive particle filter algorithms ineffective. I present a novel particle-based algorithm for online approximation of the filtering problem on such models, using the fact that each locus affects only nearby loci at the next time step. The algorithm constructs hybrid particles at time $t$ using an MCMC procedure that combines values obtained by progressing various particles from time $t-1$, with custom-built proposal and acceptance probabilities. I show simulation results that suggest the error of this algorithm is uniform in both space and time, with lower bias, though higher variance, than a previously proposed algorithm. Since this variance may be fixable with more computing power, this tradeoff is promising.
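As a rough illustration of why naive particle filters break down in high dimensions, the sketch below computes the effective sample size (ESS) of importance weights in a toy model. It is not the algorithm of Chapter 1. The model (standard normal prior particles, a unit-variance Gaussian observation at zero in every coordinate), the function name `effective_sample_size`, and all parameter values are assumptions chosen for illustration only.

```python
import math
import random


def effective_sample_size(dim, n_particles=1000, seed=0):
    """ESS of naive importance weights in a toy dim-dimensional Gaussian model.

    Assumed toy setup (not from the dissertation): particles are drawn from
    a standard normal prior, and the observation is y = 0 in every coordinate
    with unit-variance Gaussian noise.
    """
    rng = random.Random(seed)
    log_w = []
    for _ in range(n_particles):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Log-likelihood of y = 0 given x, up to an additive constant.
        log_w.append(-0.5 * sum(v * v for v in x))
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    # ESS = (sum of weights)^2 / (sum of squared weights).
    return total * total / sum(v * v for v in w)


if __name__ == "__main__":
    for dim in (1, 10, 100):
        print(dim, round(effective_sample_size(dim), 1))
```

In this toy setting the ESS stays close to the number of particles in one dimension and collapses toward a handful of particles as the dimension grows, which is the kind of weight degeneracy that motivates localized methods.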
Chapter 2: Variational inference is a way to estimate posterior distributions, especially in cases where MCMC is difficult, such as models with many latent variables. Existing techniques such as mean-field methods can fail to account for posterior correlations, leading to downward bias in estimates of posterior variance. We present a novel technique, Laplace Family Variational Inference, for creating posterior estimates with more realistic posterior correlation structures. We show that this technique outperforms Gaussian mean-field variational inference in two models: a simple two-variable model and a model based on a multi-site study. We give results of the latter model on real data for an educational intervention, Early College High Schools.
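The downward bias of mean-field variance estimates can be seen in a small worked example. The sketch below is not Laplace Family Variational Inference and does not use the models of Chapter 2. It assumes a bivariate Gaussian target and relies on the standard result that the fully factorized Gaussian minimizing $\mathrm{KL}(q \,\|\, p)$ gives each coordinate a variance equal to the reciprocal of the corresponding diagonal entry of the target's precision matrix. The function name and the example values are hypothetical.

```python
import math


def mean_field_variances(var1, var2, rho):
    """Variances of the KL(q||p)-optimal factorized Gaussian q for a
    bivariate Gaussian target p with marginal variances var1, var2 and
    correlation rho (illustrative helper, not from the dissertation)."""
    cov = rho * math.sqrt(var1 * var2)
    det = var1 * var2 - cov * cov
    # Diagonal entries of the precision matrix (inverse covariance).
    prec11 = var2 / det
    prec22 = var1 / det
    # Each mean-field factor has variance 1 / precision_ii.
    return 1.0 / prec11, 1.0 / prec22


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        v1, v2 = mean_field_variances(1.0, 1.0, rho)
        print(f"rho={rho}: true variance 1.0, mean-field variance {v1:.2f}")
```

For unit marginal variances the mean-field variance works out to $1 - \rho^2$, so the approximation matches the truth only when the coordinates are uncorrelated and shrinks toward zero as the correlation strengthens.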
Chapter 3: Ecological inference -- inferring individual-level quantities from group-level data -- appears in many contexts, but is particularly key to demonstrating violations of the US Voting Rights Act. In this setting, the standard approach to solving the ecological inference problem is King's EI. We extend the EI framework in two ways. First, we give a flexible Bayesian model of voting behavior that can be easily customized for different scenarios. Second, we show how to use the techniques from Chapter 2, along with some observation-dependent reparametrizations, to perform variational inference on our model. We demonstrate this on simulated data based on actual racial voting patterns in the 2016 presidential election in North Carolina. We show that this technique is comparable in accuracy to existing methods. Our model, however, easily permits extensions that would allow for increased power and/or address open questions in ecological inference.

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA

Citable link to this page

https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37365135

Collections

FAS Theses and Dissertations [6136]