Publication: Causal Inference with Complex Exposures in Observational Studies
Abstract
Many existing causal inference methods address the ideal situation of estimating deterministic causal effects of a single binary exposure assuming non-interference. In real-world observational studies, data harmonized from various sources can be large, heterogeneous, and imperfectly measured, creating numerous methodological challenges for statistical analysis. This dissertation proposes approaches to address some of the causal inference challenges that arise with complex exposure regimes. These exposure regimes include exposures measured with error, continuous exposures, and time-varying exposures.
In Chapter 1, we propose a new approach for estimating causal effects when the exposure is measured with error and confounding adjustment is performed via a generalized propensity score (GPS). Using validation data, we propose a regression calibration (RC)-based adjustment for a continuous error-prone exposure, combined with GPS to adjust for confounding (RC-GPS). The outcome analysis is conducted after transforming the corrected continuous exposure into a categorical exposure. We consider confounding adjustment in the context of GPS subclassification, inverse probability of treatment weighting, and matching. In simulations with varying degrees of exposure error and confounding bias, RC-GPS eliminates bias due to exposure error and confounding, in contrast to standard approaches that rely on the error-prone exposure. We apply RC-GPS to a rich data platform to estimate the causal effect of long-term exposure to fine particles (PM2.5) on mortality in New England for the period from 2000 to 2012.
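As a rough illustration of the regression calibration step described above, the following minimal sketch fits a linear calibration model on a validation subset and then categorizes the corrected exposure. It is not the dissertation's implementation: the linear calibration model, the simulated data, the tertile cut points, and all function and variable names are assumptions made for this example, and the GPS step that would follow is omitted.

```python
import numpy as np

def regression_calibration(w_val, x_val, c_val, w_main, c_main):
    """Fit E[X | W, C] by least squares on validation data, then predict
    the calibrated exposure in the main study (hypothetical helper)."""
    # Design matrix: intercept, error-prone exposure W, covariate(s) C.
    design_val = np.column_stack([np.ones(len(w_val)), w_val, c_val])
    coef, *_ = np.linalg.lstsq(design_val, x_val, rcond=None)
    design_main = np.column_stack([np.ones(len(w_main)), w_main, c_main])
    return design_main @ coef

def categorize(exposure, cutpoints):
    """Map a continuous exposure to integer categories 0..len(cutpoints)."""
    return np.digitize(exposure, cutpoints)

# Simulated data (assumption): true exposure X, error-prone W = X + noise.
rng = np.random.default_rng(0)
n_val, n_main = 500, 2000
c = rng.normal(size=n_val + n_main)
x = 1.0 + 0.5 * c + rng.normal(size=n_val + n_main)
w = x + rng.normal(size=n_val + n_main)

val, main = slice(0, n_val), slice(n_val, None)
x_hat = regression_calibration(w[val], x[val], c[val], w[main], c[main])

# Tertile categories of the calibrated exposure (arbitrary choice of cut points).
x_cat = categorize(x_hat, np.quantile(x_hat, [1 / 3, 2 / 3]))

# Compare how far the raw and calibrated exposures are from the truth.
mse_naive = np.mean((w[main] - x[main]) ** 2)
mse_rc = np.mean((x_hat - x[main]) ** 2)
print(f"MSE using W: {mse_naive:.3f}, MSE after calibration: {mse_rc:.3f}")
```

In this sketch the categorical exposure `x_cat` would then serve as the treatment variable in a GPS model used for subclassification, weighting, or matching.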
In Chapter 2, we propose an innovative approach for GPS caliper matching in settings with continuous exposures. We first introduce an identifying assumption, called local weak unconfoundedness, that is less stringent than what is currently proposed in the literature. Under this assumption and mild smoothness conditions, we provide theoretical guarantees that our proposed matching estimators attain consistency and asymptotic normality. Importantly, we introduce new measures of covariate balance under the matching framework. In simulations, our proposed matching estimator outperforms existing methods in terms of bias reduction and root mean squared error under model misspecification and/or in the presence of extreme values of the estimated GPS, and overall achieves excellent covariate balance. We use the largest-to-date Medicare claims data for the entire US from 2000 to 2016 to construct a continuous causal exposure-response curve for long-term exposure to fine particles (PM2.5) on mortality. We observe aggravated harmful effects at exposure levels below the national standards, and suggest that lowering current environmental standards is indispensable for improving public health.
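To make the idea of caliper matching on a GPS with a continuous exposure more concrete, here is a simplified sketch. It is an illustration under stated assumptions, not the estimator developed in the dissertation: the GPS is taken from a normal linear model, matching is nearest-neighbor on the GPS alone among units whose observed exposure lies within a caliper `delta` of the target level, and the function names, simulated data, and caliper value are all hypothetical.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def fit_gps_model(w, c):
    """Assumed GPS model: W | C ~ Normal(b0 + b1 * C, sd^2), fit by least squares."""
    design = np.column_stack([np.ones(len(w)), c])
    coef, *_ = np.linalg.lstsq(design, w, rcond=None)
    mean = design @ coef
    sd = np.std(w - mean, ddof=design.shape[1])
    return mean, sd

def caliper_match_mean(w_target, w, y, mean, sd, delta):
    """Estimate the mean outcome at exposure level w_target.

    For every unit j, find the unit i with |w_i - w_target| <= delta whose
    observed GPS e(w_i, c_i) is closest to e(w_target, c_j), then average
    the matched outcomes (matching with replacement).
    """
    gps_observed = normal_pdf(w, mean, sd)        # e(w_i, c_i)
    gps_target = normal_pdf(w_target, mean, sd)   # e(w_target, c_j)
    candidates = np.flatnonzero(np.abs(w - w_target) <= delta)
    if candidates.size == 0:
        raise ValueError("no observed exposures fall inside the caliper")
    dist = np.abs(gps_target[:, None] - gps_observed[candidates][None, :])
    matched = candidates[np.argmin(dist, axis=1)]
    return y[matched].mean()

# Simulated confounded data (assumption): the true response curve is E[Y(w)] = w.
rng = np.random.default_rng(1)
n = 2000
c = rng.normal(size=n)
w = c + rng.normal(size=n)
y = w + 2.0 * c + rng.normal(size=n)

mean, sd = fit_gps_model(w, c)
w_target, delta = 1.0, 0.2
matched_estimate = caliper_match_mean(w_target, w, y, mean, sd, delta)
naive_estimate = y[np.abs(w - w_target) <= delta].mean()
print(f"naive: {naive_estimate:.2f}, matched: {matched_estimate:.2f}, truth: 1.00")
```

Repeating the call to `caliper_match_mean` over a grid of `w_target` values would trace out an estimated exposure-response curve. A distance that also penalizes the gap between the observed exposure and the target level is a natural refinement that this sketch leaves out.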
In Chapter 3, we turn to the problem of estimating the causal effect of a time-varying exposure on a health outcome. We introduce a new causal inference framework for time series data aimed at assessing the effectiveness of heat alerts in reducing mortality and hospitalization risk. In the context of time series data, the overlap assumption (each unit must have a positive probability of receiving the treatment) is often violated because issuing a heat alert on a given day in a given county is a rare event. To overcome this challenge, first we introduce a new class of causal estimands under a stochastic intervention (i.e., increasing the odds of issuing a heat alert) for a single time series. We develop the theory to show that these causal estimands can be identified and estimated under a weaker version of the overlap assumption. Second, we propose nonparametric estimators based on time-varying propensity scores and derive point-wise confidence bands for these estimators. Third, we extend this framework to multiple time series. Furthermore, via simulations, we show that the proposed estimator performs well with respect to integrated bias and root mean squared error. Finally, we apply our proposed method to estimate the causal effects of increasing the odds of issuing heat alerts on deaths and hospitalizations among Medicare enrollees in 2817 U.S. counties for the period 2006-2016.
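One standard way to formalize "increasing the odds" of treatment is an incremental propensity score shift, in which the odds of treatment at each time point are multiplied by a factor delta. Whether the dissertation uses exactly this form is an assumption here. The sketch below, with hypothetical function names and made-up propensity values, shows the shifted propensity and the corresponding density-ratio weight, and why such weights stay bounded even when alerts are rare.

```python
import numpy as np

def shift_propensity(pi, delta):
    """Propensity after multiplying the odds of treatment by delta."""
    pi = np.asarray(pi, dtype=float)
    return delta * pi / (delta * pi + 1.0 - pi)

def odds(p):
    return p / (1.0 - p)

def shift_weight(a, pi, delta):
    """Ratio of shifted to observed treatment probability for outcome a in {0, 1}."""
    a = np.asarray(a, dtype=float)
    pi = np.asarray(pi, dtype=float)
    return (delta * a + 1.0 - a) / (delta * pi + 1.0 - pi)

# Made-up daily alert propensities, including a day on which an alert is impossible.
pi = np.array([0.0, 0.01, 0.05, 0.30])
delta = 2.0
q = shift_propensity(pi, delta)

# The odds are doubled wherever they are defined; a zero propensity stays zero.
print(odds(q[1:]) / odds(pi[1:]))

# Weights lie between 1 / delta and delta, unlike 1 / pi, which explodes as pi -> 0.
print(shift_weight(np.ones_like(pi), pi, delta))
```

Because a day with zero probability of an alert remains at zero under the shift, an intervention of this kind does not require every day to have a positive chance of treatment, which is the intuition behind a weaker overlap condition.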