Publication:

Big Data and Disease: Using Twitter to Model the 2014 Outbreak of Chikungunya Fever in Puerto Rico

Date

2015-06-26

Citation

Chen, Wesley King. 2015. Big Data and Disease: Using Twitter to Model the 2014 Outbreak of Chikungunya Fever in Puerto Rico. Bachelor's thesis, Harvard College.

Abstract

Big data has enabled an entirely new approach to solving and understanding problems. With the popularity of social media, data is created by individuals. We believe that embedded in the big data of social media platforms such as Twitter is documentation of self-reported illness. By analyzing keywords in tweets geo-tagged to Puerto Rico, we seek to model the outbreak of Chikungunya fever, and find initial correlations of around 0.86. Collected tweets were then divided into categories, which were treated as independent variables for Lasso regression. Although we train on imperfect suspected case numbers for the outbreak from the Pan American Health Organization (PAHO), we analyze the coefficients to understand the social implications of both social media disease reporting and disease awareness in Puerto Rico. We see distinct phases of Twitter volume before, during, and after the initial outbreak. News and government tweets decrease during subsequent outbreaks, while self-reporting tweets show a corresponding relative increase. Especially when applied to epidemiology, big data isn't about finding the perfect answer but about discovering the underlying story. This thesis tells the story of a Chikungunya outbreak in Puerto Rico through the eyes of Twitter.
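
As a rough illustration of the two techniques the abstract names (correlating tweet volume with reported cases, then fitting a Lasso regression on tweet categories), the sketch below may help. It is not the thesis's code. The category names `self_report` and `news`, the penalty value, and all numbers are made-up assumptions chosen so the arithmetic is easy to follow, and the coordinate-descent solver is a generic textbook Lasso, not necessarily the solver the author used.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - b0 - X b||^2 + lam*||b||_1 by coordinate descent.

    X is a list of rows (one row per week, one column per tweet category).
    Returns (intercept, coefficients).
    """
    n, p = len(X), len(X[0])
    # Center columns and target so the intercept can be recovered afterwards.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                others = sum(Xc[i][k] * beta[k] for k in range(p) if k != j)
                rho += Xc[i][j] * (yc[i] - others)
                z += Xc[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    intercept = y_mean - sum(col_means[j] * beta[j] for j in range(p))
    return intercept, beta

# Toy weekly counts (hypothetical, NOT data from the thesis).
# Columns: [self_report, news]
X = [[3, 5], [1, 5], [3, 3], [1, 3]]
# Toy weekly suspected case counts (hypothetical).
y = [14.2, 6.2, 13.8, 5.8]

r = pearson([row[0] for row in X], y)      # single-category correlation
intercept, beta = lasso_cd(X, y, lam=0.5)  # multi-category Lasso fit
print("correlation:", r)
print("intercept:", intercept, "coefficients:", beta)
```

In this toy setup the weakly related `news` column is shrunk to exactly zero while the `self_report` coefficient stays large. That sparsity is what makes Lasso coefficients convenient to read as a statement about which tweet categories track the case counts.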

Keywords

Health Sciences, Epidemiology, Applied Mechanics, Computer Science

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
