Person:
Yang, Shihao

Last Name

Yang

First Name

Shihao

Name

Yang, Shihao

Search Results

Now showing 1 - 2 of 2
  • Publication
    Advances in using Internet searches to track dengue
    (Public Library of Science, 2017) Yang, Shihao; Kou, Samuel C.; Lu, Fred; Brownstein, John; Brooke, Nicholas; Santillana, Mauricio
    Dengue is a mosquito-borne disease that threatens over half of the world’s population. Despite being endemic to more than 100 countries, government-led efforts and tools for timely identification and tracking of new infections are still lacking in many affected areas. Multiple methodologies that leverage the use of Internet-based data sources have been proposed as a way to complement dengue surveillance efforts. Among these, dengue-related Google search trends have been shown to correlate with dengue activity. We extend a methodological framework, initially proposed and validated for flu surveillance, to produce near real-time estimates of dengue cases in five countries/states: Mexico, Brazil, Thailand, Singapore and Taiwan. Our result shows that our modeling framework can be used to improve the tracking of dengue activity in multiple locations around the world.
  • Publication
    Using electronic health records and Internet search information for accurate influenza forecasting
    (BioMed Central, 2017) Yang, Shihao; Santillana, Mauricio; Brownstein, John; Gray, Josh; Richardson, Stewart; Kou, S. C.
    Background: Accurate influenza activity forecasting helps public health officials prepare and allocate resources for unusual influenza activity. Traditional flu surveillance systems, such as the Centers for Disease Control and Prevention’s (CDC) influenza-like illness reports, lag behind real time by one to two weeks, whereas information contained in cloud-based electronic health records (EHR) and in Internet users’ search activity is typically available in near real time. We present a method that combines the information from these two data sources with historical flu activity to produce national flu forecasts for the United States up to four weeks ahead of the publication of CDC’s flu reports.
    Methods: We extend ARGO, a method originally designed to track flu using Google searches, to combine information from EHR and Internet searches with historical flu activity. Our regularized multivariate regression model dynamically selects the most appropriate variables for flu prediction every week. The model is assessed for the flu seasons within the time period 2013–2016 using multiple metrics including root mean squared error (RMSE).
    Results: Our method reduces the RMSE of the publicly available alternative (Healthmap flutrends) method by 33%, 20%, 17% and 21% for the four time horizons (real-time, one, two, and three weeks ahead), respectively. Such accuracy improvements are statistically significant at the 5% level. Our real-time estimates correctly identified the peak timing and magnitude of the studied flu seasons.
    Conclusions: Our method significantly reduces the prediction error when compared to historical publicly available Internet-based prediction systems, demonstrating that (1) the method to combine data sources is as important as data quality; (2) effectively extracting information from a cloud-based EHR and Internet search activity leads to accurate flu forecasts.
Electronic supplementary material: The online version of this article (doi:10.1186/s12879-017-2424-7) contains supplementary material, which is available to authorized users.
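
For readers unfamiliar with the metric, the RMSE comparison reported in the Results above can be illustrated with a minimal sketch. The function names and the toy numbers below are hypothetical and are not taken from the paper; the sketch only shows how an RMSE and a percentage reduction relative to a baseline might be computed.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_reduction_percent(model_rmse, baseline_rmse):
    """Percentage by which model_rmse is lower than baseline_rmse."""
    if baseline_rmse <= 0:
        raise ValueError("baseline RMSE must be positive")
    return 100.0 * (1.0 - model_rmse / baseline_rmse)


# Hypothetical weekly flu activity values (illustrative only, not real data).
observed = [1.0, 2.0, 3.0, 4.0]
model_estimates = [1.0, 2.0, 3.0, 5.0]     # assumed output of the new model
baseline_estimates = [1.0, 2.0, 3.0, 6.0]  # assumed output of the baseline

model_error = rmse(model_estimates, observed)
baseline_error = rmse(baseline_estimates, observed)
reduction = rmse_reduction_percent(model_error, baseline_error)
```

A positive reduction means the model's errors are smaller than the baseline's under this metric; the paper's own evaluation also uses other metrics and significance tests that this sketch does not cover.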