New Statistical Methods for Integrative Data Analysis and Applications in Biology, Epidemiology and Finance
Abstract
This dissertation consists of four self-contained chapters that introduce and discuss statistical methods for integrative data analysis, with applications in biology, epidemiology, and finance. Chapters 1 and 2 focus on aggregating multi-source, high-throughput biological data to improve functional annotation and prediction. In Chapter 1, we introduce an integrative algorithm based on Bayesian hidden Markov tree models that incorporates genes' phylogenetic profiles and their inferred evolutionary histories for gene clustering and functional prediction. In Chapter 2, we aggregate the genetic and pharmacological profiling data in the Cancer Cell Line Encyclopedia to predict the mechanisms and targets of cancer-treating drugs. In Chapter 3, we move to integrating large-scale digital data with spatio-temporal epidemic data, and show how to improve the robustness and accuracy of localized influenza tracking by effectively combining Internet search data with traditional disease surveillance data. In Chapter 4, we take a more general view, linking multi-dimensional data with a non-parametric Bayesian copula model to predict the irregular covariance structure between stock price and index data during the financial crisis.
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page
http://nrs.harvard.edu/urn-3:HUL.InstRepos:39947181
Collections
- FAS Theses and Dissertations