New Methods for Causal Inference in Randomized Experiments and Observational Studies

Date

2022-05-12

The Harvard community has made this article openly available.

Citation

Chattopadhyay, Ambarish. 2022. New Methods for Causal Inference in Randomized Experiments and Observational Studies. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

A fundamental goal of numerous studies across social and health sciences is to estimate the causal effect of an intervention or treatment on an outcome in a well-defined target population. In view of this goal, the dissertation provides novel frameworks and methodologies to design and analyze both randomized experiments and observational studies for causal inference. A summary of each chapter is presented below, and their mutual connections are explored in the Introduction.

Chapter 1. We revisit, formalize, and extend the Finite Selection Model (FSM) as a general tool for experimental design with multiple treatment groups. The original version of the FSM was proposed by Carl N. Morris for designing RAND's Health Insurance Experiment (HIE; Morris 1979; Newhouse et al. 1993). The idea behind the FSM is that treatment groups take turns selecting units in a fair and random selection order. At each of its turns, using a prespecified common criterion, a treatment chooses the single unit from the ever-shrinking pool of remaining available units that maximally improves the combined quality of its resulting group of units. Leveraging the idea of D-optimality, we propose and evaluate a new selection criterion for treatments in the FSM. The FSM with the D-optimal selection function has no tuning parameters, is affine invariant, achieves near-exact mean-balance on a class of covariate transformations, and retrieves several classical designs such as randomized block and matched-pair designs. For a range of cases with multiple treatment groups, we propose algorithms to generate a fair and random selection order of treatments. We demonstrate the FSM's performance in terms of balance and efficiency in a simulation study and a case study based on the HIE data.
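
The turn-taking idea in Chapter 1 can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the abstract does not give the exact D-optimal selection function or the selection-order algorithms. The sketch assumes that each group picks the unit maximizing the determinant of its information matrix (intercept plus covariates), that a small ridge term (an illustrative device introduced here, not part of the FSM) keeps early determinants nonzero, and that the selection order is a fresh random permutation of the groups in every round. The names `det`, `add_unit`, and `fsm_assign` are hypothetical.

```python
import random


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def add_unit(info, z):
    """Return info + z z^T, the information matrix after adding one unit."""
    k = len(z)
    return [[info[i][j] + z[i] * z[j] for j in range(k)] for i in range(k)]


def fsm_assign(covariates, n_groups, seed=0, ridge=1e-3):
    """Assign units to groups by turn-taking D-optimal-style selection (sketch)."""
    rng = random.Random(seed)
    rows = [[1.0] + list(x) for x in covariates]  # prepend an intercept
    k = len(rows[0])
    # Assumption: a ridge start keeps determinants positive on early turns.
    info = [[[ridge if i == j else 0.0 for j in range(k)] for i in range(k)]
            for _ in range(n_groups)]
    groups = [[] for _ in range(n_groups)]
    pool = list(range(len(rows)))
    while pool:
        order = list(range(n_groups))
        rng.shuffle(order)  # assumed selection order: new permutation each round
        for g in order:
            if not pool:
                break
            # The group whose turn it is takes the unit that most increases
            # the determinant of its own information matrix.
            best = max(pool, key=lambda i: det(add_unit(info[g], rows[i])))
            info[g] = add_unit(info[g], rows[best])
            groups[g].append(best)
            pool.remove(best)
    return groups


# Toy usage with simulated covariates (illustrative only).
toy_rng = random.Random(1)
toy_units = [(toy_rng.gauss(0, 1), toy_rng.gauss(0, 1)) for _ in range(12)]
toy_groups = fsm_assign(toy_units, n_groups=3)
```

Each unit index ends up in exactly one group, and with 12 units and 3 groups every group receives 4 units. Covariate balance across the resulting groups can then be inspected by comparing group means.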

Chapter 2. A basic principle in the design of observational studies is to approximate the randomized experiment that would have been conducted under controlled circumstances. In practice, linear regression models are commonly used to analyze observational data and estimate causal effects. How do linear regression adjustments in observational studies emulate key features of randomized experiments, such as covariate balance, self-weighted sampling, and study representativeness? We provide answers to this and related questions by analyzing the implied (individual-level data) weights of linear regression. We derive new closed-form expressions of the weights and examine their properties in both finite-sample and asymptotic regimes. The implied weights of general regression problems are shown to be equivalently obtained by solving a convex optimization problem. With this equivalence, we propose novel design-based regression diagnostics for causal inference. As special cases, we analyze the implied weights in common settings such as multi-valued treatments, regression adjustment after matching, and two-stage least squares with instrumental variables.
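
To make the notion of implied weights in Chapter 2 concrete, here is a minimal sketch for the simplest case: ordinary least squares of an outcome on an intercept, a binary treatment indicator, and one covariate. It relies on the standard Frisch-Waugh-Lovell result and is not necessarily the form of the closed-form expressions derived in the dissertation. The names `z` (treatment), `x` (covariate), `y` (outcome), `implied_weights`, and `regression_effect` are hypothetical, and the toy data are made up for illustration.

```python
def implied_weights(z, x):
    """Implied unit-level weights behind the OLS coefficient on z
    in the regression of an outcome on (1, z, x), with one covariate x."""
    n = len(z)
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxz / sxx
    # Residual of z after regressing it on (1, x).
    resid = [zi - (z_bar + slope * (xi - x_bar)) for zi, xi in zip(z, x)]
    rss = sum(r * r for r in resid)
    # Treated units get r / rss, control units get -r / rss.
    # Weights can be negative; within each group they sum to one.
    return [(r if zi == 1 else -r) / rss for r, zi in zip(resid, z)]


def regression_effect(z, x, y):
    """Coefficient on z, written as a difference of weighted outcome means."""
    w = implied_weights(z, x)
    treated = sum(wi * yi for wi, yi, zi in zip(w, y, z) if zi == 1)
    control = sum(wi * yi for wi, yi, zi in zip(w, y, z) if zi == 0)
    return treated - control


# Toy data (made up for illustration).
z = [1, 1, 1, 0, 0, 0, 0, 0]
x = [2.0, 3.0, 5.0, 1.0, 2.0, 2.5, 4.0, 6.0]
y = [4.1, 5.2, 7.9, 1.0, 2.2, 2.4, 4.1, 6.3]
w = implied_weights(z, x)
tau_hat = regression_effect(z, x, y)
```

In this special case the weights sum to one within each treatment group and equalize the weighted covariate means of the two groups exactly, which illustrates the sense in which regression adjustment can be examined like a weighting design.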

Chapter 3. Weighting methods are often used to generalize and transport estimates of causal effects from a study sample to a target population. Traditional methods construct the weights by separately modeling the treatment assignment and the study selection probabilities and then multiplying functions (e.g., inverses) of the estimated probabilities. These estimated multiplicative weights may not produce adequate covariate balance and can be highly variable, resulting in biased and/or unstable estimators, particularly when there is limited covariate overlap across populations or treatment groups. To address these limitations, we propose a weighting approach for both randomized and observational studies that weights each treatment group directly in 'one go' towards the target population. We present a general framework for generalization and transportation by characterizing the study and target populations in terms of generic probability distributions. Under this framework, we justify this one-step weighting approach. The one-step weighting estimator for the target average treatment effect is shown to be consistent, asymptotically Normal, doubly robust, and semiparametrically efficient. We demonstrate the performance of this approach using a simulation study and a randomized case study.
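
The 'one go' idea in Chapter 3 can be sketched under strong simplifying assumptions. The abstract does not specify the optimization problem, so the sketch below is a stand-in rather than the dissertation's estimator: it uses a single covariate, requires exact balance on its mean, and picks the weights with the smallest sum of squares, which has a closed form. It does not enforce nonnegative weights. The function names and the toy data are hypothetical.

```python
def one_step_weights(x, target_mean):
    """Weights for one treatment group that sum to one, reproduce the
    target covariate mean exactly, and have minimal sum of squares.
    Single-covariate closed form; weights may be negative."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1.0 / n + (xi - x_bar) * (target_mean - x_bar) / sxx for xi in x]


def weighted_mean(w, v):
    """Weighted sum of v; a weighted mean when the weights sum to one."""
    return sum(wi * vi for wi, vi in zip(w, v))


def target_ate(x_treated, y_treated, x_control, y_control, target_mean):
    """Weight each group directly to the target, then difference the means."""
    w_t = one_step_weights(x_treated, target_mean)
    w_c = one_step_weights(x_control, target_mean)
    return weighted_mean(w_t, y_treated) - weighted_mean(w_c, y_control)


# Toy data (made up for illustration); the target covariate mean is 4.0.
x_treated = [1.0, 2.0, 3.0, 5.0]
y_treated = [4.4, 6.1, 7.6, 10.3]
x_control = [0.5, 2.5, 4.0, 6.0, 7.0]
y_control = [1.6, 3.4, 5.1, 7.2, 7.9]
estimate = target_ate(x_treated, y_treated, x_control, y_control, 4.0)
```

Each group is weighted once, straight to the target covariate mean, instead of multiplying separately estimated treatment and selection weights. When the outcome is linear in the covariate within each group, exact mean balance is enough for this toy estimator to recover the target effect.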

Keywords

Causal Inference, Linear Regression, Observational Studies, Randomized Experiments, Statistics, Weighting

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service