Robust Predictions With Observational Data

Yuan, William

dc.contributor.advisor	Kohane, Isaac S.
dc.contributor.author	Yuan, William
dc.date.accessioned	2020-10-16T14:01:09Z
dash.embargo.terms	2022-05-01
dc.date.created	2020-05
dc.date.issued	2020-05-06
dc.date.submitted	2020
dc.identifier.citation	Yuan, William. 2020. Robust Predictions With Observational Data. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.
dc.identifier.uri	https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37365789
dc.description.abstract	Data science, as currently practiced, is an awkward fit for studying biology or medicine, which currently exist in a state where causal mechanisms to explain many of our observations are often unavailable. While mechanistic deductions are possible in narrow, well-defined areas (signaling pathways, binding and protein folding, etc.), a deterministic, internally consistent model of human physiology is still far off. Consequently, the field has developed to serve two purposes simultaneously: to construct such a framework, and to help patients in the present with the incomplete information that we have access to. Modern data scientists and researchers utilize massive datasets to attempt to extract insights from a highly complex, largely mysterious system. Given the implications that research recommendations can have on physician behavior, and the acknowledged missingness in our understanding, ensuring the reliability and validity of our methods is of paramount importance. The rise of statistical learning and large datasets has led to significant optimism regarding the ability of such models to influence or even make predictions about patient outcomes. However, constructing inductions that can fit into the otherwise deductive medical and scientific frameworks can be a fraught process. I examine how such work can be framed so as to make the resultant predictive models “useful” to both clinicians and scientists, and suggest methods for this that can exist within existing research frameworks. In particular, I examine three cases in detail. First, I describe, for the first time, the basis and implications of temporal bias, a flaw present in a ubiquitous study design that prevents reliable predictions of the future. Next, I describe knowledge parasitism, a phenomenon where machine learning models piggyback off of the decisions and expertise of clinicians, making their predictions consequently less likely to extend beyond what a clinician may already suspect. Finally, I describe the tendency for propensity matching to “launder” bias in surgical studies, acting to conceal overlooked biases and introduce new biases, reducing the confidence and applicability of the findings.
dc.description.sponsorship	Medical Sciences
dc.format.mimetype	application/pdf
dc.language.iso	en
dash.license	LAA
dc.subject	prediction, machine learning, risk stratification, bias
dc.title	Robust Predictions With Observational Data
dc.type	Thesis or Dissertation
dash.depositing.author	Yuan, William
dash.embargo.until	2022-05-01
dc.date.available	2020-10-16T14:01:09Z
thesis.degree.date	2020
thesis.degree.grantor	Graduate School of Arts & Sciences
thesis.degree.level	Doctoral
thesis.degree.name	Doctor of Philosophy
dc.type.material	text
thesis.degree.department	Medical Sciences
dash.author.email	wyuan95@gmail.com

Files in this item

Name:: YUAN-DISSERTATION-2020.pdf
Size:: 5.360Mb
Format:: PDF


This item appears in the following Collection(s)

FAS Theses and Dissertations [6136]
