Publication:

Unsupervised Learning Methods in Digital Phenotyping

Date

2021-05-12

Citation

Liu, Gang. 2021. Unsupervised Learning Methods in Digital Phenotyping. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Digital phenotyping is defined as the “moment-by-moment quantification of the individual level human phenotype in situ using data from personal digital devices”. The passive data collected by smartphones, including GPS, accelerometer and communication logs, can provide insights into users’ behaviors that could be related to various diseases. Research findings have demonstrated robust associations between behavioral risk factors derived from sensor data and health outcomes, including obesity, diabetes, various cardiovascular diseases, mental health and mortality. With the goal of improving patients’ symptoms, communication and clinical outcomes, this thesis addresses challenging problems that arise when integrating health information from smartphones into clinical practice in free-living settings. The first chapter addresses the missing data problem in GPS data, which is caused by a sampling strategy constrained by limited battery capacity. We developed an algorithm that simulates an individual’s trajectory based on previously observed GPS location traces, without reliance on external data. The method uses a sparse online Gaussian process, spherical geometry and bidirectional imputation to reduce the computational cost and improve the accuracy relative to existing methods. The second chapter describes how to quantify the uncertainty of accelerometer-based estimates, such as step counts, that arises from different sources of missingness. We propose an online Bayesian learning method that models the step count estimates as random variables from a zero-inflated negative binomial distribution. The method updates the posterior distribution of each parameter on the fly, and provides a credible interval for each time window as well as for each day based on the posterior predictive distribution. The third chapter is about detecting aberrant human behaviors in real time from all kinds of passive data collected by smartphones. We propose an online anomaly detection method using Hotelling’s T-squared test, where the test statistic is a weighted average, with more weight on the between-individual component when little data is available for the individual and more weight on the within-individual component when the data are adequate.
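
As a rough illustration of the weighted test statistic described for the third chapter, the sketch below combines a between-individual and a within-individual Hotelling-style squared distance. It is not the dissertation's implementation: the function names, the weight form w = n / (n + k), the constant k, and the ridge term are all assumptions made for this example, since the abstract does not specify how the weights are chosen.

```python
import numpy as np

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)' cov^{-1} (x - mean)."""
    diff = x - mean
    return float(diff @ np.linalg.solve(cov, diff))

def weighted_t2(x, history, pop_mean, pop_cov, k=10.0, ridge=1e-6):
    """Blend between- and within-individual statistics for one new observation.

    x        : new feature vector for the individual (length p)
    history  : the individual's earlier observations, shape (n, p); may be empty
    pop_mean, pop_cov : mean and covariance estimated from other individuals
    k, ridge : hypothetical tuning constants (assumptions, not from the source)
    Returns (statistic, weight placed on the within-individual component).
    """
    x = np.asarray(x, dtype=float)
    p = x.shape[0]
    history = np.asarray(history, dtype=float).reshape(-1, p)
    n = history.shape[0]

    # Between-individual component: distance from the population centre.
    t2_between = mahalanobis_sq(x, pop_mean, pop_cov)

    # With too few observations the individual's covariance is not estimable,
    # so all weight stays on the between-individual component.
    if n <= p:
        return t2_between, 0.0

    # Assumed weight: grows towards 1 as the individual's data accumulate.
    w = n / (n + k)
    ind_mean = history.mean(axis=0)
    ind_cov = np.atleast_2d(np.cov(history, rowvar=False)) + ridge * np.eye(p)

    # Within-individual component: distance from the individual's own centre.
    t2_within = mahalanobis_sq(x, ind_mean, ind_cov)
    return w * t2_within + (1.0 - w) * t2_between, w
```

In an online setting, a function like this would be called once per new observation, with the observation appended to `history` afterwards; the threshold used to flag an anomaly would need to be calibrated separately and is not shown here.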

Keywords

accelerometer, anomaly detection, digital phenotyping, GPS, missing data, smartphone, Biostatistics, Electrical engineering

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
