Statistical Methods for Analyzing Complex Spatial and Missing Data

Date

2015-12-04

Citation

Antonelli, Joseph. 2016. Statistical Methods for Analyzing Complex Spatial and Missing Data. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Abstract

In chapter 1, we develop a novel two-dimensional wavelet decomposition that separates spatial surfaces into different frequencies without imposing any restrictions on the form of the spatial surface. We illustrate the effectiveness of the proposed decomposition on satellite-based PM2.5 data, which are available on a 1 km by 1 km grid across Massachusetts. We then apply the decomposition to study how different frequencies of the PM2.5 surface adversely impact birth weights in Massachusetts.

In chapter 2, we study the impact of monitor locations on two-stage health effect studies in air pollution epidemiology. Typically in these studies, estimates of air pollution exposure are obtained from a first-stage model that uses monitoring data, and a second-stage outcome model is then fit using this estimated exposure. Monitoring sites are usually not placed at random, and their locations can drastically affect inference in health effect studies. We take an in-depth look at the specific case where the locations of monitors depend on the locations of the subjects in the second-stage model and show that inference can be greatly improved in this setting relative to completely random allocation of monitors.
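
The generic two-stage workflow described above can be sketched as follows. This is a minimal illustration on simulated data, not the dissertation's method: the single covariate z, the linear exposure model, the sample sizes, and all numeric values are assumptions made for the example, and monitors are placed at random here rather than preferentially.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: one spatial covariate z drives the true exposure.
n_monitors, n_subjects = 50, 2000
z_mon = rng.normal(size=n_monitors)   # covariate at monitor sites
z_sub = rng.normal(size=n_subjects)   # covariate at subject locations


def true_exposure(z, rng):
    """Assumed data-generating exposure model (illustrative only)."""
    return 10.0 + 2.0 * z + rng.normal(scale=0.5, size=z.shape)


x_mon = true_exposure(z_mon, rng)     # exposure observed at monitors
x_sub = true_exposure(z_sub, rng)     # exposure NOT observed at subjects
beta_true = -0.3                      # assumed health effect of exposure
y_sub = 5.0 + beta_true * x_sub + rng.normal(scale=1.0, size=n_subjects)

# Stage 1: fit an exposure model on monitoring data, predict at subjects.
A_mon = np.column_stack([np.ones(n_monitors), z_mon])
gamma_hat, *_ = np.linalg.lstsq(A_mon, x_mon, rcond=None)
x_hat = np.column_stack([np.ones(n_subjects), z_sub]) @ gamma_hat

# Stage 2: fit the outcome model using the estimated exposure.
A_sub = np.column_stack([np.ones(n_subjects), x_hat])
coef, *_ = np.linalg.lstsq(A_sub, y_sub, rcond=None)
beta_hat = coef[1]

print(f"true effect {beta_true:.3f}, two-stage estimate {beta_hat:.3f}")
```

Changing how z_mon is drawn, for example by sampling monitor covariates near those of the subjects, is one simple way to experiment with the effect of monitor placement in a toy setting like this one.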

In chapter 3, we introduce a Bayesian data augmentation method to control for confounding in large administrative databases when additional data on confounders are available from a validation study. Large administrative databases are becoming increasingly available, and they have the power to address many questions that we could not otherwise answer. Most of these databases, while large, do not have sufficient information on confounders to validly estimate causal effects. However, in many cases a smaller validation data set is available with a richer set of confounders. We propose a method that uses information from the validation data to impute missing confounders in the main data and to select only those confounders that are necessary for confounding adjustment. We illustrate the effectiveness of our method in a simulation study and analyze the effect of surgical resection on 30-day survival in brain tumor patients from Medicare.

Keywords

Biology, Biostatistics, Statistics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service