Publication:
Augmenting Infectious Disease Surveillance: Bridging Traditional and Novel Data

Date

2024-05-31

Citation

Kogan, Nicole. 2024. Augmenting Infectious Disease Surveillance: Bridging Traditional and Novel Data. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Existing data for monitoring infectious diseases have historically been plagued by delayed availability, poor representativity, and inadequate capture of morbidity and mortality. Digital epidemiology, designed to track infectious disease transmission using Big Data not originally conceived for that purpose, furnishes a complementary suite of methods to mitigate some of these challenges. Here we present three analytical approaches that harmonize traditional and digital data to improve the real-time characterization of emerging infectious diseases. In Chapter 1, we evaluate the utility of multimodal data – namely COVID-19-related searches on Google, Twitter, and UpToDate as well as smart thermometer measurements, smartphone-derived human mobility patterns, and epidemic model predictions – as components of an early warning system for COVID-19’s 2020 transmission within the United States (US). Notably, we find that increases in activity of these digital proxies anticipate increases in confirmed cases and deaths by two to three weeks. We extend this finding by proposing and validating a multiproxy model that yields probabilistic estimates for the timing of a COVID-19 outbreak’s impending growth and decay. In Chapter 2, we propose an uncertainty quantification approach using serosurveillance and postmortem surveillance data to estimate the magnitude of COVID-19 underascertainment in African nations, where significant discrepancies between ground-truth infections and reported cases have been documented. We observe concordance in the range of COVID-19 reporting rate estimates – from one in two to one in thousands of infections – across these two data sources, suggesting that a key reason for low cumulative incidence in Africa has been significant underdetection and underreporting. We also explore the potential for Google and Twitter data to recapitulate local-in-time trends within national epidemic curves and, in some instances, serve as leading indicators of COVID-19 outbreaks. Finally, in Chapter 3, we implement several classification models to investigate the relative contributions of ascertainment, importation, and transmission to spatiotemporal patterns of infectious disease seeding and subsequent spread, using the 2022 US mpox epidemic as a case study. We find that real-time data sources (i.e., air travel and Google) can predict the evolution of viral spread. In addition, through a post hoc analysis, we find that state-specific timings of index mpox cases mirror chronologies observed in non-mpox (i.e., COVID-19 and H1N1) pandemics and, therefore, that historical data sources may prove equally useful in ascertaining where and when an epidemic is liable to spread.

Keywords

Digital epidemiology, Epidemic, Infectious disease, Outbreak, Surveillance, Epidemiology

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.
