Publication: Three Essays on Making Causal Inferences with Test Scores
Abstract
In education research, test scores are a common object of analysis. Across studies, test scores can be an important outcome, a highly predictive covariate, or a means of assigning treatment. However, test scores measure an underlying proficiency that we cannot observe directly, and so they contain error. This measurement error has implications for how we use test scores in research. In this dissertation, I combine psychometrics and causal inference to develop three methods for doing education research with test scores. In the first study, I combine Classical Test Theory and simulation to develop a generalized method for adjusting test score distributions where there was a policy to selectively retest or rescore initially failing students. Using this method, I show that adjusting for retesting on a North Carolina accountability exam reduces the estimate of mean growth across testing occasions from 0.17 standard deviations to near zero. I also reexamine an investigation of "score scrubbing" on the New York Regents exams and demonstrate that rescoring can inflate perceived scrubbing rates by a factor of three, from 12% to 36%. The second and third studies contribute to the literature on regression discontinuity designs (RDDs). In the second study, I create and evaluate two methods for estimating cross-site treatment effect variation in multi-site RDDs, one based on random-effects meta-analysis and the other based on the fixed-intercepts, random-coefficients model. I use these models to evaluate Massachusetts's "Education Proficiency Plan" policy and find enough treatment effect variance in three cohorts for the treatment effect to have been negative in more than a third of high schools. In the third study, I apply a psychometric latent variable framework to regression discontinuity design and derive the amount of bias induced by analyzing a regression discontinuity design using a local randomization framework.