Publication: A method for identifying predictive markers of mental illness in social media data
Abstract
Undiagnosed mental illness poses a significant health risk. In-person screenings to identify individuals at risk of mental illness are expensive, time-consuming, and often inaccurate. This report presents an array of computational methods for identifying predictive markers of mental illness, specifically depression and post-traumatic stress disorder, by scanning and interpreting text and images posted on social media. Separate analyses of Twitter and Instagram data are presented. Predictive features were extracted from social media posts (N_Twitter = 279,951; N_Instagram = 43,950) using a variety of techniques, including color analysis, face detection, semantic analysis, and natural language processing. The resulting models successfully discriminated between depressed and healthy content and compared favorably to general practitioners’ average success rates in diagnosing depression. Results held even when the analysis was restricted to content posted before the first depression diagnosis. For Twitter data, state-space temporal analysis suggests that the onset of depression may be detectable several months prior to diagnosis. These methods offer a data-driven, predictive approach for early screening and detection of mental illness.
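The abstract names color analysis as one of the feature-extraction techniques but does not specify how it was done. As a minimal sketch only, assuming a photo has already been decoded into a sequence of RGB pixel tuples, per-image averages of hue, saturation, and brightness could be computed as follows. The function name `mean_hsv` and the choice of HSV averages as features are illustrative assumptions, not the report's confirmed method.

```python
import colorsys
import math


def mean_hsv(pixels):
    """Return (mean hue, mean saturation, mean value), each in [0, 1].

    `pixels` is an iterable of (r, g, b) tuples with channels in 0..255.
    Hypothetical helper for illustration; not taken from the report.
    """
    sin_sum = cos_sum = sat_sum = val_sum = 0.0
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Hue is an angle on the color wheel, so accumulate its sine and
        # cosine and average on the circle instead of arithmetically.
        angle = 2.0 * math.pi * h
        sin_sum += math.sin(angle)
        cos_sum += math.cos(angle)
        sat_sum += s
        val_sum += v
        count += 1
    if count == 0:
        raise ValueError("no pixels given")
    mean_angle = math.atan2(sin_sum / count, cos_sum / count) % (2.0 * math.pi)
    return mean_angle / (2.0 * math.pi), sat_sum / count, val_sum / count
```

Calling `mean_hsv` on each photo would yield three numbers per image that could sit alongside other features (for example, a face count or text-derived scores) as inputs to a classifier. Averaging hue on the circle avoids the artifact where two reddish hues near 0 and 1 average to cyan.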