Publication:
Natural Language Processing for Health System Messages: Deep Transfer Learning Approach to Aspect-Based Sentiment Analysis of COVID-19 Content

Date

2022-02-07

The Harvard community has made this article openly available.

Citation

Sun, Mary Daotung. 2021. Natural Language Processing for Health System Messages: Deep Transfer Learning Approach to Aspect-Based Sentiment Analysis of COVID-19 Content. Master's thesis, Harvard University Division of Continuing Education.

Abstract

Recent efforts in clinical natural language processing research have focused on mining social media data and other user-generated data relevant to COVID-19. An increasing number of machine learning tools have been developed for annotation, topic modeling, and various classification tasks, including the now widely used Bidirectional Encoder Representations from Transformers (BERT) language model. In this study, we improve the domain specificity of COVID-Twitter-BERT for secure health system messages through supervised fine-tuning on a large corpus of patient-originated messages. Following sentiment annotation with a validated rule-based heuristic tool, an auxiliary sentence approach was implemented to conduct aspect-based sentiment analysis. We find that existing machine learning tools optimized on social media data demonstrate moderate-to-good generalizability to secure patient messages. Furthermore, significant performance improvements, to a weighted-average F1 score of 0.881, were realized upon implementation of the fine-tuning and auxiliary sentence approaches. We also observe temporal trends in positive, negative, and compound (overall) sentiment measured on over 3.2 million patient messages related to COVID-19. We believe that secure health system messages are an increasingly valuable source of clinical text data and can be effectively mined using emerging deep transfer learning techniques.
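
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a weighted-average F1 score is commonly computed: the F1 score of each class is weighted by that class's share of the true labels. The function name `weighted_f1`, the label names, and the toy data are illustrative assumptions; they are not taken from the thesis and do not reproduce its evaluation code or its 0.881 result.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Weighted-average F1: each class's F1 weighted by its support in y_true.

    Hypothetical helper for illustration only; not the thesis's evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0

    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = (2 * tp / denom) if denom else 0.0
        score += (n / total) * f1  # weight by class frequency

    return score


# Toy sentiment labels (assumed for illustration, not real data).
y_true = ["pos", "pos", "neg", "neu"]
y_pred = ["pos", "neg", "neg", "neu"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Because each class is weighted by its frequency, this average is dominated by the most common sentiment classes, which is worth keeping in mind when class distributions are imbalanced.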

Keywords

Clinical informatics, COVID-19, Health system messages, Large-scale data mining, Natural language processing, Sentiment analysis, Computer science, Artificial intelligence, Medicine

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
