Publication: Robust Statistical Methods for Cluster Randomized Trials
Abstract
In a cluster randomized trial (CRT), groups of people are randomly assigned to receive different interventions. CRTs are useful when interventions are more naturally or feasibly applied at the cluster level or when we wish to quantify population-level effects. Existing parametric and semiparametric statistical methods for analyzing CRTs can perform poorly when distributional assumptions are violated, when the number of clusters is small, or when some outcomes are missing. This dissertation develops robust statistical methods to address these challenges. First, we focus on randomization-based inference, an alternative approach for analyzing CRTs that is distribution-free and does not require a large number of clusters to be valid. Although it is well known that a confidence interval (CI) can be obtained by inverting a randomization test, doing so requires testing a non-zero null hypothesis, which is challenging with non-continuous and survival outcomes. In Chapter 1, we introduce a general method for calculating randomization-based CIs in parallel CRTs using regression models with fixed offset terms. This approach accommodates various outcome types and employs an efficient algorithm to speed up CI computation. In Chapter 2, we extend this methodology to a stepped wedge trial (SWT), a CRT design in which clusters cross over from control to intervention at different time points. Here, we also compare our method with existing randomization-based approaches for SWTs based on cluster-period summary statistics, and demonstrate how to account for design features in a randomization-based analysis and why it is important to do so. In Chapter 3, we investigate multiply robust (MR) estimation in CRTs with missing outcomes. Unlike imputation, inverse probability weighting, or doubly robust estimation, MR estimation allows multiple candidate models (e.g., for the propensity scores and covariate-adjusted means) to be specified, with consistency guaranteed if any one of these models is correct. We establish MR estimators for the marginal mean, scale, and correlation parameters (e.g., the intracluster correlation coefficient) in a CRT via weighted generalized estimating equations.
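The idea of obtaining a CI by inverting a randomization test can be illustrated with a minimal sketch. This is not the dissertation's algorithm: it assumes a continuous outcome summarized as cluster-level means, a constant additive treatment effect, a parallel design with a fixed number of treated clusters, and a plain grid search in place of the efficient search the abstract mentions. The function names, the simulated data, and the grid settings are all hypothetical. For each candidate effect `theta0`, the hypothesized effect is subtracted from the treated clusters (playing the role of a fixed offset in a linear model), and the adjusted outcomes are then tested against the null of no effect by enumerating every possible assignment.

```python
from itertools import combinations

import numpy as np


def diff_in_means(values, treated):
    """Difference in mean cluster-level outcome, treated minus control."""
    return values[treated].mean() - values[~treated].mean()


def randomization_p_value(cluster_means, treated, theta0):
    """Two-sided randomization p-value for H0: additive effect equals theta0."""
    n = len(cluster_means)
    n_treated = int(treated.sum())
    # Offset step: remove the hypothesized effect from the treated clusters,
    # so that under H0 the adjusted outcomes do not depend on the assignment.
    adjusted = cluster_means - theta0 * treated
    observed = abs(diff_in_means(adjusted, treated))
    extreme = 0
    total = 0
    # Enumerate every assignment with the same number of treated clusters.
    for idx in combinations(range(n), n_treated):
        z = np.zeros(n, dtype=bool)
        z[list(idx)] = True
        # Small tolerance guards against floating-point ties.
        extreme += abs(diff_in_means(adjusted, z)) >= observed - 1e-12
        total += 1
    return extreme / total


def randomization_ci(cluster_means, treated, alpha=0.05,
                     half_width=10.0, n_grid=401):
    """CI as the range of grid values of theta0 that are not rejected."""
    estimate = diff_in_means(cluster_means, treated)
    # Odd n_grid puts the point estimate itself on the grid.
    grid = estimate + np.linspace(-half_width, half_width, n_grid)
    accepted = [t for t in grid
                if randomization_p_value(cluster_means, treated, t) > alpha]
    return estimate, min(accepted), max(accepted)


# Hypothetical data: 8 clusters, 4 treated, simulated effect of 2.
rng = np.random.default_rng(0)
treated = np.array([True] * 4 + [False] * 4)
cluster_means = rng.normal(0.0, 1.0, size=8) + 2.0 * treated

estimate, lower, upper = randomization_ci(cluster_means, treated)
print(f"estimate={estimate:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

With 8 clusters there are only 70 possible assignments, so full enumeration is cheap; with more clusters one would typically sample assignments at random instead. For non-continuous or survival outcomes, subtracting the effect from the outcome is no longer meaningful, which is the difficulty that the fixed-offset regression approach in Chapter 1 is described as addressing.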