Publication: Design-Based Causal Inference: Applications to Social Sciences and Industry
Abstract
In today's data-driven world, social scientists and industry data scientists increasingly rely on large-scale, complex experiments (e.g., adaptive sequential experiments and experiments with high-dimensional treatments) and on novel observational causal inference practices, such as combining Difference-in-Difference with matching. Despite this new landscape, existing statistical theory has not kept pace with the practical demands of modern applications. In this thesis, I aim to resolve this mismatch by providing rigorous yet assumption-lean causal inference methods that are directly aligned with these new settings in industry and social science applications.
Chapter 1: Randomization Tests with High-dimensional Treatments. Many experiments, such as conjoint experiments, have high-dimensional treatments. In this setting, standard causal inference methods, which assume a binary treatment, may not be adequate. For example, a simple difference-in-means, although widely used in political science, is often under-powered to detect treatment effects that arise through complex high-dimensional interactions. In this chapter, we introduce conditional randomization tests that let practitioners use powerful machine learning algorithms to detect a (potentially) high-dimensional treatment effect, while requiring no modelling assumptions for the validity of the test.
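As a rough illustration of the general idea behind randomization tests, the sketch below runs a plain Fisher-style permutation test of the sharp null of no treatment effect with a multi-level treatment. It is not the conditional randomization test developed in the chapter: the function names, the between-group-variance statistic, and the simulated data are all assumptions made for this example, and permuting the labels is only appropriate under a completely randomized design. In practice the statistic could be replaced by any function of the data, including one built from a machine learning model.

```python
import numpy as np

def between_group_variance(y, z):
    """Hypothetical test statistic: variance of the outcome means across treatment levels."""
    levels = np.unique(z)
    means = np.array([y[z == k].mean() for k in levels])
    return means.var()

def randomization_test(y, z, stat, n_draws=2000, seed=0):
    """p-value for the sharp null of no effect, assuming complete randomization."""
    rng = np.random.default_rng(seed)
    observed = stat(y, z)
    # Re-draw the assignment by permuting labels; outcomes stay fixed under the sharp null.
    draws = np.array([stat(y, rng.permutation(z)) for _ in range(n_draws)])
    # The +1 terms count the observed assignment itself, keeping the p-value valid.
    return (1 + np.sum(draws >= observed)) / (1 + n_draws)

# Simulated example (assumed data): four treatment levels, only level 3 shifts the outcome.
rng = np.random.default_rng(1)
z = np.repeat(np.arange(4), 50)
y = rng.standard_normal(200) + 3.0 * (z == 3)
p_value = randomization_test(y, z, between_group_variance, n_draws=500)
```

The validity of such a test comes from the known assignment mechanism rather than from a model of the outcome, which is why the choice of statistic affects power but not the type-1 error.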
Chapter 2: Anytime-valid Causal Inference Through the Design-Based Approach. In most A/B tests, observations arrive over time, and managers want to test the results as new data become available in order to mitigate risk. For example, Netflix ran an A/B test on its sign-up page with 30,000 potential subscribers for over a month; the treatment had a strong negative effect, and Netflix lost approximately 3,000 potential subscribers, or approximately 1 million U.S. dollars of lifetime value, from a single experiment. This issue is exacerbated for companies running numerous experiments on their customers, as prolonged exposure to harmful treatments leads to retention drops and customer dissatisfaction. However, peeking at the data and repeatedly testing with, for example, a t-test leads to uncontrolled type-1 error due to multiple testing. In this chapter, we propose design-based confidence sequences, i.e., sequences of confidence intervals with uniform type-1 error guarantees, that formally allow peeking while remaining statistically valid without any modeling assumptions or regularity assumptions on the outcome.
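The peeking problem described above can be seen in a small simulation. The sketch below is only an illustration of the problem, not of the confidence sequences proposed in the chapter: it assumes A/A data (no true effect) drawn from a standard normal distribution with known variance, and the function name, sample sizes, and the 1.96 threshold are choices made for this example.

```python
import numpy as np

def false_positive_rates(n_sims=2000, n_obs=500, first_look=10, z_crit=1.96, seed=0):
    """Compare type-1 error when testing after every observation vs. once at the end."""
    rng = np.random.default_rng(seed)
    # A/A data: simulated per-unit differences with mean 0 and known variance 1.
    x = rng.standard_normal((n_sims, n_obs))
    t = np.arange(1, n_obs + 1)
    # z-statistic of the running mean at every sample size t.
    z = np.cumsum(x, axis=1) / np.sqrt(t)
    reject = np.abs(z) > z_crit
    # Peeking: reject if the test is ever significant from first_look onwards.
    peeking = reject[:, first_look - 1:].any(axis=1).mean()
    # Fixed horizon: a single test at the final sample size.
    fixed = reject[:, -1].mean()
    return peeking, fixed

peeking_rate, fixed_rate = false_positive_rates()
```

Under these assumptions the fixed-horizon test rejects at roughly the nominal 5% rate, while stopping at the first significant peek rejects far more often, which is the inflation that anytime-valid methods are designed to prevent.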
Chapter 3: Difference-in-Difference and Matching. Applied researchers in both the social sciences and industry commonly perform matching prior to a Difference-in-Difference (DiD) analysis to bolster the plausibility of the "parallel-trends" assumption in observational causal inference settings. For example, analysts regularly use DiD to evaluate the impact of a change when the treatment is difficult to randomize, e.g., the effect of a new law or company policy on revenue. Despite this common practice, it remains unclear when matching prior to a DiD analysis actually reduces bias. In this chapter, we aim to mathematically characterize and quantify when this practice is justified, and we give practical guidelines on when to match and when not to.
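For readers unfamiliar with DiD, the basic two-group, two-period estimator can be sketched as follows. This is a minimal illustration with made-up numbers and a hypothetical function name; it does not include the matching step or the bias analysis studied in the chapter, and the estimate is only interpretable as a causal effect if the parallel-trends assumption holds.

```python
import numpy as np

def did_estimate(pre_treated, post_treated, pre_control, post_control):
    """Basic 2x2 DiD: change in the treated group minus change in the control group."""
    treated_change = np.mean(post_treated) - np.mean(pre_treated)
    control_change = np.mean(post_control) - np.mean(pre_control)
    return treated_change - control_change

# Made-up outcomes: the treated group rises by 5 on average, the control group by 1.
estimate = did_estimate(
    pre_treated=[10.0, 12.0],
    post_treated=[15.0, 17.0],
    pre_control=[8.0, 10.0],
    post_control=[9.0, 11.0],
)
# estimate is 4.0: the treated change (5) net of the control change (1)
```

Matching would enter before this step, by choosing which control units are compared with the treated units.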