Defuse the News: Predicting Misinformation and Bias in News on Social Networks via Content-Blind Learning
Abstract
As the locus of social discourse shifts to the Internet, information travels faster than ever before. Consequently, the risks to truth are high when misinformation propagates through social networks billions of individuals strong. For humans, the task of identifying fake news at scale is both imperative and impossible. This work integrates machine learning with network theory to build models of information veracity, political bias, and subject matter based on the topology of Twitter retweet networks that form around news articles. Crucially, these models have no access to linguistic, temporal, or user-identifying information and instead operate solely over the shape of the networks surrounding news articles, a novel specification termed content-blindness. This thesis is the first application of predictive analytics to the largest collection of fake news stories and associated social networks ever assembled. As such, this work represents the state of the art for fake news identification in this domain. Using graph kernel algorithms to extract information from Twitter network structures, these models predict information veracity at 84% accuracy, political bias at 93%, and subject matter at 76%. This research also suggests that models that adhere to content-blindness enjoy decisive advantages over those that take linguistic content into account. While content-aware models can be fooled by a single fake news writer, topological methods are robust to interference by malicious producers of false content; attacking a content-blind model requires changing a large network's topology, an undertaking that inherently demands concerted action. Last, these models can be used, not just forensically - after the fake news has run its course - but in real time, to make accurate predictions about information as it spreads.
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page
http://nrs.harvard.edu/urn-3:HUL.InstRepos:38811538
Collections
- FAS Theses and Dissertations [5370]