Show simple item record

dc.contributor.author: Wu, Stephen [en_US]
dc.contributor.author: Miller, Timothy [en_US]
dc.contributor.author: Masanz, James [en_US]
dc.contributor.author: Coarr, Matt [en_US]
dc.contributor.author: Halgrim, Scott [en_US]
dc.contributor.author: Carrell, David [en_US]
dc.contributor.author: Clark, Cheryl [en_US]
dc.date.accessioned: 2014-12-02T21:28:39Z
dc.date.issued: 2014 [en_US]
dc.identifier.citation: Wu, Stephen, Timothy Miller, James Masanz, Matt Coarr, Scott Halgrim, David Carrell, and Cheryl Clark. 2014. “Negation’s Not Solved: Generalizability Versus Optimizability in Clinical Natural Language Processing.” PLoS ONE 9 (11): e112774. doi:10.1371/journal.pone.0112774. http://dx.doi.org/10.1371/journal.pone.0112774. [en]
dc.identifier.issn: 1932-6203 [en]
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:13454759
dc.description.abstract: A review of published work in clinical natural language processing (NLP) may suggest that the negation detection task has been “solved.” This work proposes that an optimizable solution does not equal a generalizable solution. We introduce a new machine learning-based Polarity Module for detecting negation in clinical text, and extensively compare its performance across domains. Using four manually annotated corpora of clinical text, we show that negation detection performance suffers when there is no in-domain development (for manual methods) or training data (for machine learning-based methods). Various factors (e.g., annotation guidelines, named entity characteristics, the amount of data, and lexical and syntactic context) play a role in making generalizability difficult, but none completely explains the phenomenon. Furthermore, generalizability remains challenging because it is unclear whether to use a single source for accurate data, combine all sources into a single model, or apply domain adaptation methods. The most reliable means to improve negation detection is to manually annotate in-domain training data (or, perhaps, manually modify rules); this is a strategy for optimizing performance, rather than generalizing it. These results suggest a direction for future work in domain-adaptive and task-adaptive methods for clinical NLP. [en]
dc.language.iso: en_US [en]
dc.publisher: Public Library of Science [en]
dc.relation.isversionof: doi:10.1371/journal.pone.0112774 [en]
dc.relation.hasversion: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4231086/pdf/ [en]
dash.license: LAA [en_US]
dc.subject: Computer and Information Sciences [en]
dc.subject: Information Technology [en]
dc.subject: Natural Language Processing [en]
dc.subject: Database and Informatics Methods [en]
dc.subject: Health Informatics [en]
dc.subject: Social Sciences [en]
dc.subject: Linguistics [en]
dc.subject: Languages [en]
dc.subject: Natural Language [en]
dc.title: Negation’s Not Solved: Generalizability Versus Optimizability in Clinical Natural Language Processing [en]
dc.type: Journal Article [en_US]
dc.description.version: Version of Record [en]
dc.relation.journal: PLoS ONE [en]
dash.depositing.author: Miller, Timothy [en_US]
dc.date.available: 2014-12-02T21:28:39Z
dc.identifier.doi: 10.1371/journal.pone.0112774 [*]
dash.contributor.affiliated: Miller, Timothy

