Publication:

Measurement in K-12 Policy Analysis

Date

2025-05-15

The Harvard community has made this article openly available.

Citation

An, Lily. 2025. Measurement in K-12 Policy Analysis. Doctoral Dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

This dissertation consists of three papers that consider the construction, role, and use of educational measures in educational policies and evaluation methods. Educational measures, such as student test scores, are widely used to evaluate the effectiveness of educational programs or policies.

The first paper investigates the properties of school quality scores in state educational accountability systems under the Every Student Succeeds Act (2015). I use multilevel modeling and factor analysis to simulate an accountability system based on a state’s existing student- and school-level data. I find that this system exhibits high classification accuracy of the state’s lowest performing schools, particularly for elementary schools. I also test how classification accuracy varies due to common policy decisions and show that these design choices have differential effects by school level. This finding challenges uniform accountability approaches across school levels and suggests the need for level-specific policy decisions in designing these complex systems.

The second paper explores the use of a nonparametric surface response estimation method, Gaussian process regression (GPR), in educational two-dimensional regression discontinuity designs. Regression discontinuity designs are used to estimate the effectiveness of policies or programs, which in education are commonly provided to students based on their scores on multiple tests, such as math and reading. GPR allows one to target an estimand of a boundary average treatment effect as well as understand treatment effect heterogeneity in student outcomes. In simulation, GPR exhibits stronger statistical properties compared to existing methods, and it improves the analysis of an empirical example of a state’s English Language Learner reclassification policy based on two test scores.

The third paper analyzes how state policy documents discuss the use of student sociodemographic variables in constructing teacher value-added model (VAM) scores. VAMs are used to evaluate educators through comparisons between expected and observed student test scores, conditioning on students’ prior achievement and other information. States agree that, to increase fairness to educators, the role of student background in academic performance should be statistically accounted for when evaluating teachers. Despite this agreement, I find that states work against their stated goals by tending to exclude race from these calculations, a tendency I examine using tenets of quantitative critical theory.

Keywords

accountability, classification accuracy, Gaussian process, QuantCrit, regression discontinuity designs, value-added models, Education policy, Educational evaluation, Educational tests & measurements

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
