Publication:

Combining Human and Automated Scoring Methods in Experimental Assessments of Writing: A Case Study Tutorial

Date

2023-11-08

Publisher

American Educational Research Association (AERA)

Citation

Mozer, R., Miratrix, L., Relyea, J. E., & Kim, J. S. (2024). Combining Human and Automated Scoring Methods in Experimental Assessments of Writing: A Case Study Tutorial. Journal of Educational and Behavioral Statistics, 49(5), 780-816.

Abstract

In a randomized trial that collects text as an outcome, traditional approaches for assessing treatment impact require that each document first be manually coded for constructs of interest by human raters. An impact analysis can then be conducted to compare treatment and control groups, using the hand-coded scores as a measured outcome. This process is both time- and labor-intensive, which creates a persistent barrier for large-scale assessments of text. Furthermore, enriching one’s understanding of a found impact on text outcomes via secondary analyses can be difficult without additional scoring efforts. The purpose of this article is to provide a pipeline for using machine-based text analytic and data mining tools to augment traditional text-based impact analysis by analyzing impacts across an array of automatically generated text features. In this way, we can explore what an overall impact signifies in terms of how the text has evolved due to treatment. Through a case study based on a recent field trial in education, we show that machine learning can indeed enrich experimental evaluations of text by providing a more comprehensive and fine-grained picture of the mechanisms that lead to stronger argumentative writing in a first- and second-grade content literacy intervention. Relying exclusively on human scoring, by contrast, is a lost opportunity. Overall, the workflow and analytical strategy we describe can serve as a template for researchers interested in performing their own experimental evaluations of text.
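As a rough illustration of the general idea in the abstract (estimating treatment impacts across automatically generated text features), the sketch below computes a few simple features per document and a difference in means between groups. It is not the authors' pipeline: the three features, the function names `text_features` and `feature_impacts`, and the unpooled standard error are all assumptions chosen for brevity. A real analysis would use a much richer feature set and adjust for multiple comparisons.

```python
from math import sqrt
from statistics import mean, stdev


def text_features(doc: str) -> dict[str, float]:
    """Compute a few illustrative (hypothetical) features for one document."""
    words = doc.lower().split()
    n = len(words)
    return {
        "word_count": float(n),
        # Share of distinct words: a crude proxy for lexical diversity.
        "type_token_ratio": len(set(words)) / n if n else 0.0,
        "mean_word_length": mean(len(w) for w in words) if n else 0.0,
    }


def feature_impacts(treated: list[str], control: list[str]) -> dict[str, dict[str, float]]:
    """Estimate a difference in means (treated minus control) for each feature.

    Assumes at least two documents per group so that a sample standard
    deviation exists. The standard error uses unpooled group variances.
    """
    t_feats = [text_features(d) for d in treated]
    c_feats = [text_features(d) for d in control]
    results = {}
    for name in t_feats[0]:
        t = [f[name] for f in t_feats]
        c = [f[name] for f in c_feats]
        diff = mean(t) - mean(c)
        se = sqrt(stdev(t) ** 2 / len(t) + stdev(c) ** 2 / len(c))
        results[name] = {"diff": diff, "se": se}
    return results
```

Calling `feature_impacts(treated_docs, control_docs)` on two lists of essays returns one estimated impact and standard error per feature, which could then be inspected alongside the impact on human-coded scores.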

Keywords

Social Sciences (miscellaneous), Education

Terms of Use

This article is made available under the terms and conditions applicable to Open Access Policy Articles (OAP), as set forth at Terms of Service
