Publication: Breaking the MAR Paradigm: Estimation, Bounding, and Sensitivity When Data Are Missing Not at Random
Date
2020-05-14
Citation
Ocampo, Alex. 2020. Breaking the MAR Paradigm: Estimation, Bounding, and Sensitivity When Data Are Missing Not at Random. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.
Abstract
Statistical methods for unobserved, or missing, data often rely on an assumption defined over 40 years ago, namely that the data are missing at random (MAR). Simply put, data are MAR when the probability that a value is missing depends only on other observed values, not on the missing value itself. Armed with this assumption, statisticians can use methods that leverage the available data, such as multiple imputation, likelihood approaches, and inverse probability weighting, to obtain consistent and efficient estimates. If the MAR assumption is unrealistic, then only a few tools are available. The consequences of departures from MAR can be quantified by conducting a sensitivity analysis; however, there is no consensus on how best to carry out such an analysis, and many current approaches are tedious, technical, and time-consuming.
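As a rough illustration of the MAR idea and of inverse probability weighting mentioned above, the following sketch simulates an outcome whose chance of being observed depends only on a fully observed binary covariate. Everything in it is an assumption made for illustration and is not taken from the dissertation: the function names, the data-generating model, and the response probabilities (treated as known here, although in practice they would have to be estimated).

```python
import random


def simulate_mar(n=20000, seed=0):
    """Simulate (x, y, observed, p_obs) rows under a hypothetical MAR mechanism.

    The probability that y is observed depends only on the covariate x,
    never on y itself, which is what makes the mechanism MAR.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        y = 10.0 * x + rng.gauss(0.0, 1.0)   # true mean of y is 5
        p_obs = 0.3 if x == 1 else 0.9       # assumed known response probability
        observed = rng.random() < p_obs
        rows.append((x, y, observed, p_obs))
    return rows


def complete_case_mean(rows):
    """Mean of y over observed rows only; biased here because x drives both y and missingness."""
    ys = [y for _, y, obs, _ in rows if obs]
    return sum(ys) / len(ys)


def ipw_mean(rows):
    """Normalized inverse-probability-weighted mean of y over observed rows."""
    num = sum(y / p for _, y, obs, p in rows if obs)
    den = sum(1.0 / p for _, _, obs, p in rows if obs)
    return num / den


rows = simulate_mar()
print("complete-case mean:", complete_case_mean(rows))
print("IPW mean:", ipw_mean(rows))
```

In this setup the complete-case mean drifts well below the true mean of 5 because rows with x = 1 are under-represented among the observed values, while the weighted mean stays close to it. If missingness depended on y itself, the same weights would no longer correct the bias.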
This dissertation aims to provide a path forward in certain situations when the MAR assumption does not hold. Chapter 1 provides conditions for identifying treatment effects when a continuous outcome is missing not at random. Identification is possible by reframing the estimand of a trimmed means estimator when the missing outcome comes from the "poor" tail of its treatment distribution. Chapter 2 proposes an efficient approach to estimate upper and lower bounds of parameters that account for the uncertainty in the missing data. The bounding approach utilizes the influence function of the statistical functional at hand to identify the best and worst possible imputations of an incomplete data set. Chapter 3 argues that a ratio-type estimator may make the MAR assumption more plausible. The approach is used to estimate vaccination coverage rates in the 77 communes of Benin, where only some communes have survey data from random samples but all have administrative data available. The chapter additionally demonstrates how departures from MAR can be explored intuitively in the Bayesian framework through the introduction of sensitivity parameters.
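To make the idea of best and worst possible imputations concrete, here is a minimal sketch for the simplest case only, the sample mean. The influence function of the mean is y minus the mean, which increases with y, so the extreme imputations place every missing value at one end of an assumed range. This is not the dissertation's algorithm; the function name `mean_bounds` and the range arguments `lower` and `upper` are hypothetical and chosen for illustration.

```python
def mean_bounds(values, lower, upper):
    """Worst-case bounds on the mean of a partially observed sample.

    values: list of floats, with None marking a missing entry.
    lower, upper: assumed range in which every missing value must lie.
    Returns (lower_bound, upper_bound) on the full-sample mean.
    """
    n = len(values)
    observed = [v for v in values if v is not None]
    n_missing = n - len(observed)
    total = sum(observed)
    # Impute every missing value at the low end, then at the high end.
    return (total + n_missing * lower) / n, (total + n_missing * upper) / n


low, high = mean_bounds([2.0, None, 4.0, None], lower=0.0, upper=10.0)
print(low, high)  # 1.5 6.5
```

For statistics whose influence function is not monotone in the outcome, the extreme imputations are no longer simply the endpoints of the range, which is where a more general search over imputations becomes necessary.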
Keywords
Missing Data, Missing Not at Random, Influence Function, Sensitivity Analysis
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service