Publication:
New Asymptotic Results on Randomization Inference in Experiments

Date

2018-05-14

Citation

Li, Xinran. 2018. New Asymptotic Results on Randomization Inference in Experiments. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Abstract

This manuscript consists of three self-contained chapters on randomization inference in experiments (Neyman, 1923; Fisher, 1935; Rubin, 1978). The first chapter studies the asymptotic properties of rerandomization (Morgan and Rubin, 2012) in treatment-control experiments, and the second chapter extends the discussion to $2^K$ factorial experiments. The third chapter studies peer effects when the stable unit treatment value assumption (Rubin, 1980) fails.

Chapter 1. Although complete randomization ensures covariate balance on average, the chance of observing significant differences between the treatment and control covariate distributions increases with the number of covariates. Rerandomization discards randomizations that do not satisfy a predetermined covariate balance criterion, generally resulting in better covariate balance and more precise estimates of causal effects. Previous work has derived finite-sample theory for rerandomization under the assumptions of equal treatment group sizes, Gaussian covariate and outcome distributions, or additive causal effects, but not for the general sampling distribution of the difference-in-means estimator for the average causal effect. We develop asymptotic theory for rerandomization without these assumptions, which reveals a non-Gaussian asymptotic distribution for this estimator, specifically a linear combination of a Gaussian random variable and a truncated Gaussian random variable. This distribution arises because rerandomization affects only the projection of potential outcomes onto the covariate space and does not affect the corresponding orthogonal residuals. We also demonstrate that, compared to complete randomization, rerandomization reduces the asymptotic sampling variances and quantile ranges of the difference-in-means estimator.
Moreover, our work allows the construction of accurate large-sample confidence intervals for the average causal effect, thereby revealing further advantages of rerandomization over complete randomization.

Chapter 2. With many pretreatment covariates and treatment factors, classical factorial experiments often fail to balance covariates across multiple factorial effects simultaneously. Therefore, it is intuitive to restrict the randomization of the treatment factors to satisfy certain covariate balance criteria, possibly conforming to the tiers of factorial effects and covariates based on their relative importance. This is rerandomization in factorial experiments. We study the asymptotic properties of this experimental design under the randomization inference framework without imposing any distributional or modeling assumptions on the covariates and outcomes. We derive the joint asymptotic sampling distribution of the usual estimators of the factorial effects, and show that it is symmetric, unimodal, and more "concentrated" at the true factorial effects under rerandomization than under classical factorial experiments. This advantage of rerandomization is quantified by the mathematical notions of "central convex unimodality" and "peakedness" of the joint asymptotic sampling distribution, which also serve as theoretical bases for constructing conservative large-sample confidence sets for the factorial effects.

Chapter 3. Many previous causal inference studies assume no interference among units, that is, the potential outcomes of a unit do not depend on the treatments of other units. This no-interference assumption, however, becomes unreasonable when units are partitioned into groups and interact with other units within their groups. In a motivating application from Peking University, students are admitted through either the college entrance exam (also known as Gaokao) or recommendation (often based on Olympiads in various subjects).
Right after entering college, students are randomly assigned to different dorms, each of which hosts four students. Because students within the same dorm live together and interact with each other extensively, it is very likely that peer effects exist and the no-interference assumption is violated. More importantly, understanding peer effects among students gives useful guidance for future roommate assignment to improve the overall performance of students. Methodologically, we define peer effects in terms of potential outcomes, and propose a randomization-based inference framework to study peer effects in general settings with arbitrary numbers of peers and arbitrary numbers of peer types. Our inferential procedure does not require any parametric modeling assumptions on the outcome distributions. Additionally, our analysis of the data set from Peking University gives useful practical guidance for policy makers.
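
As a rough illustration of the rerandomization design summarized under Chapter 1, the sketch below redraws a complete randomization until a covariate balance criterion is met. It assumes the criterion is a threshold on the Mahalanobis distance between treatment and control covariate means, which appears among the keywords and is the criterion used by Morgan and Rubin (2012). The function names, the threshold value, and the simulated covariates are hypothetical and are not taken from the dissertation.

```python
import numpy as np


def mahalanobis_imbalance(X, z):
    """Mahalanobis distance between treated and control covariate means.

    X is an (n, K) covariate matrix and z is a 0/1 assignment vector.
    The scaling n1 * n0 / n standardizes the mean difference by its
    covariance under complete randomization.
    """
    n = len(z)
    n1 = int(z.sum())
    n0 = n - n1
    diff = X[z == 1].mean(axis=0) - X[z == 0].mean(axis=0)
    S = np.atleast_2d(np.cov(X, rowvar=False))  # K x K sample covariance
    return (n1 * n0 / n) * diff @ np.linalg.solve(S, diff)


def rerandomize(X, n_treated, threshold, rng, max_draws=10_000):
    """Redraw complete randomizations until the imbalance is <= threshold."""
    base = np.zeros(X.shape[0], dtype=int)
    base[:n_treated] = 1
    for _ in range(max_draws):
        z = rng.permutation(base)  # one complete randomization
        if mahalanobis_imbalance(X, z) <= threshold:
            return z  # accepted assignment
    raise RuntimeError("no acceptable assignment found within max_draws")


# Hypothetical usage with simulated covariates (not data from the thesis).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
z = rerandomize(X, n_treated=20, threshold=1.0, rng=rng)
```

A smaller threshold enforces tighter covariate balance but rejects more candidate assignments, so the loop runs longer on average.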

Keywords

Causal inference, Covariate balance, Geometry of rerandomization, Mahalanobis distance, Quantile range, Tiers of covariates, Tiers of factorial effects, Design-based inference, Interference, Optimal treatment assignment, Spillover effect

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
