Publication:
Three Essays on Making Causal Inferences with Test Scores

Date

2021-05-12

Citation

Litschwartz, Sophie Lilit. 2021. Three Essays on Making Causal Inferences with Test Scores. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

In education research, test scores are a common object of analysis. Across studies, test scores can be an important outcome, a highly predictive covariate, or a means of assigning treatment. However, test scores measure an underlying proficiency that we cannot observe directly, and so they contain error. This measurement error has implications for how we use test scores in research. In this dissertation, I combine psychometrics and causal inference to develop three methods for doing education research with test scores. In the first study, I combine Classical Test Theory and simulation to develop a generalized method for adjusting test score distributions when a policy selectively retests or rescores initially failing students. Using this method, I show that adjusting for retesting on a North Carolina accountability exam reduces the estimate of mean growth across testing occasions from 0.17 standard deviations to near zero. I also reexamine an investigation of “score scrubbing” on the New York Regents exams and demonstrate that rescoring can inflate perceived scrubbing rates by a factor of three, from 12% to 36%. The second and third studies contribute to the literature on regression discontinuity design (RDD). In the second study, I create and evaluate two methods for estimating cross-site treatment effect variation in multi-site RDDs, one based on random-effects meta-analysis and the other based on the fixed intercepts, random coefficients model. I use these models to evaluate Massachusetts’s “Education Proficiency Plan” policy and find, in three cohorts, enough treatment effect variance to imply that the treatment effect was negative in more than a third of high schools. In the third study, I apply a psychometric latent variable framework to regression discontinuity design and derive the amount of bias induced by analyzing a regression discontinuity design using a local randomization framework.

Keywords

Statistics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
