Publication:

Practical Semiparametric Inference With Bayesian Nonparametric Ensembles


Date

2019-05-13

Citation

Liu, Jeremiah Zhe. 2019. Practical Semiparametric Inference With Bayesian Nonparametric Ensembles. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Abstract

Set in the practical situation where the data-generating process is unknown and multiple imperfect candidate models are available, this thesis studies how to construct an approximation model that optimally captures the relevant aspect of the data for the purpose of conducting sound inference. We consider three types of inference objectives: hypothesis testing (Chapter 2), spatiotemporal prediction, i.e., estimating the conditional mean (Chapter 3), and uncertainty quantification, i.e., estimating the distribution function (Chapter 4). We focus on regression models for continuous outcomes. Specifically, we propose the Bayesian Nonparametric Ensemble (BNE), a general modeling approach that combines the a priori information encoded in the candidate models using ensemble methods, and then addresses the systematic bias in the candidate models using Bayesian nonparametric machinery. As a result, BNE specifies a large model space that is centered around the ensemble of candidate models. Through both theoretical investigation and extensive numerical studies, we show that the proposed approach achieves a valid and powerful test for nonlinear effects (Chapter 2), improves predictive performance (Chapter 3), and provides calibrated quantification of its varying degree of model uncertainty over the feature space (Chapter 4). The proposed method is applied to the detection of nutrition-environment interaction effects on early-stage neurodevelopment in Bangladeshi children, and to the integration of multiple spatial prediction models for PM2.5 levels in Eastern Massachusetts, USA.
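The two-stage idea described in the abstract (combine candidate models, then correct their shared bias nonparametrically) can be illustrated with a toy sketch. This is not the thesis's model specification: the equal-weight average, the two candidate models, the squared-exponential kernel, and all hyperparameter values below are assumptions chosen only for illustration, and the Gaussian process on the ensemble residual is a simplified stand-in for the Bayesian nonparametric machinery the abstract refers to.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, variance=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale ** 2)

def fit_residual_gp(x_train, resid, x_test, noise_var=0.01):
    """GP posterior mean and variance of the ensemble residual at x_test."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_st = rbf_kernel(x_test, x_train)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, resid))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    var = rbf_kernel(x_test, x_test).diagonal() - np.sum(v ** 2, axis=0)
    return mean, var

# Synthetic data (assumption): a nonlinear truth observed with noise.
rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(-2.0, 2.0, size=40))
y_train = np.sin(2.0 * x_train) + 0.1 * rng.normal(size=40)

# Two deliberately imperfect candidate models (hypothetical).
candidates = [lambda x: 0.5 * x, lambda x: np.zeros_like(x)]

def ensemble(x):
    """Stage 1: equal-weight average of the candidate predictions."""
    return np.mean([f(x) for f in candidates], axis=0)

# Stage 2: model the systematic bias of the ensemble with a GP on its residual.
x_test = np.linspace(-3.0, 3.0, 61)
resid_mean, resid_var = fit_residual_gp(
    x_train, y_train - ensemble(x_train), x_test
)
prediction = ensemble(x_test) + resid_mean
```

In this sketch the corrected prediction stays centered on the ensemble: far from the training data the residual mean shrinks toward zero and `resid_var` grows, which is one simple way to picture uncertainty that varies over the feature space.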

Keywords

Bayesian Nonparametrics, Ensemble Learning, Robustness, Hypothesis Testing, Spatio-temporal Modeling

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.
