Bayesian Methods for Multi-Outcome Analysis and a Study of Gender Bias in Medical Articles
Access Status: Full text of the requested work is not available in DASH at this time ("dark deposit").
Thomas, Emma Grace
Citation: Thomas, Emma Grace. 2020. Bayesian Methods for Multi-Outcome Analysis and a Study of Gender Bias in Medical Articles. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.
Abstract: In Chapter 1, we present Multi-Outcome Regression with Tree-structured Shrinkage (MOReTreeS), a novel framework for Bayesian multi-response regression when the outcomes are related according to a known tree or hierarchy [Thomas et al., 2020]. Examining the impact of an exposure on a large number of health outcomes is statistically challenging. Analyzing each outcome separately ignores relationships among pathologies, leads to multiple testing problems, and introduces bias when only the strongest associations are reported. MOReTreeS addresses these limitations using a tree-structured prior for the regression coefficients of each outcome that enables: (1) borrowing of strength across related outcomes; (2) data-driven discovery of groups of outcomes that are similarly affected by the exposure; and (3) estimation of a single, interpretable effect for each outcome group. Through simulations, we find that MOReTreeS can reduce the root mean squared error of the effect estimates compared to standard regression methods while still providing substantial dimension reduction, thereby enhancing interpretability of the estimates. We apply MOReTreeS to study the effect of short-term exposure to fine particulate matter (PM2.5) on risk of hospitalization due to 432 cardiovascular diseases categorized by the hierarchical International Classification of Diseases, Revision 9.
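To make the idea of tree-structured shrinkage concrete, the following is a minimal sketch and not the dissertation's actual model specification, which the abstract does not give. It assumes one common construction: each node of the outcome tree carries an increment, an outcome's coefficient is the sum of increments on its path to the root, and increments shrunk exactly to zero cause outcomes to share a coefficient. The tree, node names, and numbers are all hypothetical toy values.

```python
from collections import defaultdict

# Hypothetical toy hierarchy: child -> parent (the root has parent None).
PARENT = {
    "root": None,
    "g1": "root",
    "g2": "root",
    "a": "g1",
    "b": "g1",
    "c": "g2",
    "d": "g2",
}

# Hypothetical node increments. A zero stands for a node that has been
# shrunk away, so it adds nothing beyond what its ancestors contribute.
INCREMENT = {
    "root": 0.010,
    "g1": 0.0,
    "g2": -0.015,
    "a": 0.005,
    "b": 0.0,
    "c": 0.0,
    "d": 0.0,
}


def leaf_coefficient(leaf):
    """Sum the increments on the path from a leaf up to the root."""
    total, node = 0.0, leaf
    while node is not None:
        total += INCREMENT[node]
        node = PARENT[node]
    return total


def group_leaves():
    """Group leaves that share the same set of nonzero nodes on their path."""
    parents = set(PARENT.values())
    leaves = [n for n in PARENT if n not in parents]
    groups = defaultdict(list)
    for leaf in leaves:
        active, node = [], leaf
        while node is not None:
            if INCREMENT[node] != 0.0:
                active.append(node)
            node = PARENT[node]
        groups[frozenset(active)].append(leaf)
    return sorted(sorted(g) for g in groups.values())
```

In this toy setup, leaves `c` and `d` fall into one group with a single shared coefficient, while `a` and `b` each stand alone; this mirrors, in spirit only, the grouping behaviour the abstract describes.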
In Chapter 2, we apply MOReTreeS to a more in-depth analysis of PM2.5 and hospitalizations due to cardiovascular and respiratory disease in Medicare beneficiaries. Our objectives were (1) to discover groups of related causes of hospitalization such that PM2.5 associations are similar within but different between groups; (2) to introduce MOReTreeS to an audience of epidemiologists; and (3) to introduce an R package, moretrees, for fitting MOReTreeS models to matched case-control and case-crossover data. We conducted a time-stratified case-crossover study of hospitalizations from 2000 through 2014 among Medicare beneficiaries aged 65+ living in the contiguous United States. Cause of hospitalization was defined using hierarchical Clinical Classification Software (CCS) codes. Daily PM2.5, temperature, and relative humidity data were linked to Medicare data via residential ZIP codes. Statistical models used a MOReTreeS prior and conditional logistic likelihood with nonlinear control for temperature and humidity. Our dataset included 6,007,293 hospitalizations for 57 cardiovascular causes and 8,690,837 hospitalizations for 32 respiratory causes. MOReTreeS grouped 51 of 57 cardiovascular diseases into one group with a positive PM2.5 association. Compared to this group, heart failure exhibited a stronger positive association. Negative associations were observed for certain aneurysms and intracranial hemorrhage. Of the 32 respiratory outcomes, 31 were grouped together and were positively associated with PM2.5. Influenza exhibited a negative association.
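As an illustration of the time-stratified case-crossover design, the sketch below selects referent (control) days for a given hospitalization date. The abstract does not state which strata were used; the code assumes the commonly used choice of matching on year, month, and day of week. The function name `referent_days` is hypothetical.

```python
from datetime import date, timedelta


def referent_days(case_day):
    """Return the other days in the same year and month that fall on the
    same weekday as the case day (assumed stratification, see lead-in)."""
    days = []
    d = case_day.replace(day=1)          # first day of the case month
    while d.month == case_day.month:     # stop once we roll into the next month
        if d.weekday() == case_day.weekday() and d != case_day:
            days.append(d)
        d += timedelta(days=1)
    return days
```

Each hospitalization then contributes one matched set, the case day plus its referent days, to the conditional logistic likelihood, with exposure on the case day compared against exposure on the referent days.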
In Chapter 3, we present an analysis of gender bias in authorship of invited commentaries in medical journals [Thomas et al., 2019]. In peer-reviewed medical journals, authoring an invited commentary on an original article is a recognition of expertise. Women author fewer invited publications than men. However, it is unknown whether this disparity is due to gender differences in characteristics that drive invitations, such as field of expertise, seniority, and scientific output. We aimed to estimate the odds ratio (OR) of authoring an invited commentary for women compared to men with similar expertise, seniority, and publication metrics. We used a matched case-control study design of all articles published from 2013 through 2017 in an English-language medical journal, or medical articles published in a multidisciplinary journal. Cases were defined as corresponding authors of invited commentaries in a given journal during the study period. Controls were matched to cases on scientific expertise by calculating a similarity index for abstracts published 2013 through 2017 using natural language processing. Genderize.io was used to predict gender from author first name and country of origin. Invited commentaries were defined as publications that cite another publication within the same journal volume and issue. The OR for gender was estimated adjusting for field of expertise, publication output, citation impact, and years since first publication (years active), with an interaction between gender and years active. The final dataset included 43,235 cases across 2,549 journals. For researchers who had been active for the median number of years (16), the OR for invited commentary authorship was 0.79 for women compared to men with similar scientific expertise, number of publications, and citation impact (95% confidence interval [CI]: 0.77 to 0.81). For every one-decile increase in years active, this OR decreased further by a factor of 0.97 (95% CI: 0.96 to 0.98).
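The interaction between gender and years active can be read as simple arithmetic on the two reported point estimates. The sketch below assumes that the per-decile factor applies multiplicatively and uniformly on either side of the median; it uses point estimates only and ignores the confidence intervals. The function name `odds_ratio` is hypothetical.

```python
OR_AT_MEDIAN = 0.79        # reported OR (women vs. men) at median years active
FACTOR_PER_DECILE = 0.97   # reported multiplicative change per one-decile increase


def odds_ratio(deciles_above_median):
    """Point-estimate OR at a given number of deciles above the median,
    assuming the reported factor compounds multiplicatively."""
    return OR_AT_MEDIAN * FACTOR_PER_DECILE ** deciles_above_median
```

Under these assumptions, a researcher two deciles above the median in years active would have an implied OR of 0.79 × 0.97² ≈ 0.74.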
Citable link to this page: https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37365707
- FAS Theses and Dissertations