Publication:

Large Language Models for Automated Evaluation of Radiology Reports with Fine-Grained Scoring

Date

2025-03-14

Citation

Huang, Alyssa. 2024. Large Language Models for Automated Evaluation of Radiology Reports with Fine-Grained Scoring. Bachelors Thesis, Harvard University Engineering and Applied Sciences.

Abstract

The current gold standard for evaluating generated chest X-ray (CXR) reports is radiologist annotation. However, this process can be extremely time-consuming, especially when there are many reports to evaluate. In this work, we present FineRadScore, a Large Language Model (LLM)-based automated evaluation metric for generated CXR reports. Given a candidate report and a ground-truth report, FineRadScore gives the minimum number of line-by-line corrections required to go from the candidate to the ground truth. Additionally, FineRadScore assigns a severity rating to each correction and generates a comment explaining why the correction was needed. We demonstrate that FineRadScore generates corrections that align with radiologists and reflects an understanding of how clinically meaningful each error is. We also demonstrate that, when used to judge the quality of a report as a whole, it aligns with radiologists at a level similar to current state-of-the-art automated CXR evaluation metrics. Finally, we analyze FineRadScore's shortcomings to pave the way for future work.
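
To make the shape of the output described above concrete, here is a minimal sketch of how line-level corrections with severity ratings and comments might be represented and then collapsed into a report-level summary. Everything in it is an assumption made for illustration: the `Correction` schema, the integer severity scale, the `summarize` function, and the choice of aggregates do not come from the thesis, and no LLM call is shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Correction:
    """One line-level correction (hypothetical schema, not from the thesis)."""
    line_index: int      # which line of the candidate report is corrected
    corrected_line: str  # replacement text that matches the ground truth
    severity: int        # assumed integer scale; higher means more clinically serious
    comment: str         # why the correction was needed


def summarize(corrections: List[Correction]) -> dict:
    """Collapse line-level corrections into report-level numbers (assumed aggregation)."""
    severities = [c.severity for c in corrections]
    return {
        "num_corrections": len(corrections),
        "total_severity": sum(severities),
        "max_severity": max(severities, default=0),
    }


# Invented example data, for illustration only.
example = [
    Correction(0, "No pleural effusion.", 3, "Candidate reported an effusion that is absent."),
    Correction(2, "Heart size is normal.", 1, "Minor wording difference in cardiac size."),
]
report_summary = summarize(example)
```

Keeping the per-line corrections separate from the summary mirrors the two uses the abstract mentions: fine-grained feedback on individual errors, and a single sense of the quality of the report as a whole.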

Keywords

Computer science, Linguistics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
