Big Data and Disease: Using Twitter to Model the 2014 Outbreak of Chikungunya Fever in Puerto Rico
Citation: Chen, Wesley King. 2015. Big Data and Disease: Using Twitter to Model the 2014 Outbreak of Chikungunya Fever in Puerto Rico. Bachelor's thesis, Harvard College.
Abstract: Big data has enabled an entirely new approach to solving and understanding problems. With the popularity of social media, data is created by individuals. We believe that embedded in the big data of social media, like Twitter, is the documentation of self-reporting illness. Through analysis of keywords in tweets geo-tagged to Puerto Rico, we seek to model the outbreak of Chikungunya fever, with initial correlations of around 0.86. Collected tweets were then divided into categories and treated as independent variables for Lasso regression. Although we train on imperfect suspected numbers for the outbreak from the Pan-American Health Organization (PAHO), we analyze the coefficients to understand the social implications behind both social media disease reporting and awareness in Puerto Rico. We see different phases of Twitter volumes pre-, during and post-initial outbreak. News and government tweets decrease during subsequent outbreaks when we see a corresponding relative increase of self-reporting tweets. Especially when applied to epidemiology, big data isn't about finding the perfect answer, but instead, about discovering the underlying story. This thesis is about the story of a Chikungunya outbreak in Puerto Rico from the eyes of Twitter.
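Illustrative sketch (not taken from the thesis): the abstract describes treating per-category tweet counts as independent variables in a Lasso regression against reported case numbers, and then reading the coefficients. The following minimal Python example shows what that kind of fit could look like. Everything in it is an assumption made for illustration: the data are synthetic random numbers rather than Twitter or PAHO data, the category names `self_report`, `news`, and `government` are hypothetical labels, the penalty `alpha=5.0` is arbitrary, and a small hand-written coordinate-descent solver stands in for whatever software the author actually used.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def center(values):
    """Subtract the mean so the fit needs no intercept term."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def lasso_fit(columns, target, alpha, n_iters=200):
    """Coordinate descent for (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1."""
    n = len(target)
    names = list(columns)
    w = {name: 0.0 for name in names}
    for _ in range(n_iters):
        for j in names:
            # Correlation of feature j with the residual that ignores feature j.
            rho = 0.0
            for i in range(n):
                partial = sum(columns[k][i] * w[k] for k in names if k != j)
                rho += columns[j][i] * (target[i] - partial)
            z = sum(x * x for x in columns[j])
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    a, b = center(a), center(b)
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
    return num / den

# SYNTHETIC weekly tweet counts per hypothetical category (not real data).
random.seed(0)
n_weeks = 40
raw = {
    name: [random.uniform(0, 50) for _ in range(n_weeks)]
    for name in ("self_report", "news", "government")
}
# SYNTHETIC case counts: driven by two categories plus noise, by construction.
cases = [
    2.0 * s + 0.5 * m + random.gauss(0, 1)
    for s, m in zip(raw["self_report"], raw["news"])
]

features = {name: center(col) for name, col in raw.items()}
weights = lasso_fit(features, center(cases), alpha=5.0)
fitted = [sum(weights[k] * features[k][i] for k in features) for i in range(n_weeks)]
r = pearson(fitted, cases)
print(weights, round(r, 3))
```

In this toy setup the L1 penalty shrinks the coefficient of the category that was constructed to be irrelevant toward zero, which is the property that makes Lasso coefficients convenient to interpret. The numbers printed here say nothing about the thesis's actual results.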
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:17417577
- FAS Theses and Dissertations