Negation’s Not Solved: Generalizability Versus Optimizability in Clinical Natural Language Processing

Wu, Stephen; Miller, Timothy; Masanz, James; Coarr, Matt; Halgrim, Scott; Carrell, David; Clark, Cheryl

Author

Wu, Stephen

Miller, Timothy (Harvard)

Masanz, James

Coarr, Matt

Halgrim, Scott

Carrell, David

Clark, Cheryl

Published Version

https://doi.org/10.1371/journal.pone.0112774

Citation

Wu, Stephen, Timothy Miller, James Masanz, Matt Coarr, Scott Halgrim, David Carrell, and Cheryl Clark. 2014. “Negation’s Not Solved: Generalizability Versus Optimizability in Clinical Natural Language Processing.” PLoS ONE 9 (11): e112774. doi:10.1371/journal.pone.0112774. http://dx.doi.org/10.1371/journal.pone.0112774.

Abstract

A review of published work in clinical natural language processing (NLP) may suggest that the negation detection task has been “solved.” This work proposes that an optimizable solution does not equal a generalizable solution. We introduce a new machine learning-based Polarity Module for detecting negation in clinical text, and extensively compare its performance across domains. Using four manually annotated corpora of clinical text, we show that negation detection performance suffers when there is no in-domain development (for manual methods) or training data (for machine learning-based methods). Various factors (e.g., annotation guidelines, named entity characteristics, the amount of data, and lexical and syntactic context) play a role in making generalizability difficult, but none completely explains the phenomenon. Furthermore, generalizability remains challenging because it is unclear whether to use a single source for accurate data, combine all sources into a single model, or apply domain adaptation methods. The most reliable means to improve negation detection is to manually annotate in-domain training data (or, perhaps, manually modify rules); this is a strategy for optimizing performance, rather than generalizing it. These results suggest a direction for future work in domain-adaptive and task-adaptive methods for clinical NLP.

Other Sources

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4231086/pdf/

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA

Citable link to this page

http://nrs.harvard.edu/urn-3:HUL.InstRepos:13454759

Collections

HMS Scholarly Articles
