A Tale of Two Multi-Phase Inference Applications
Citation: McKeough, Kathryn. 2020. A Tale of Two Multi-Phase Inference Applications. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.
Abstract: Multi-phase inference refers to any sequential procedure in which the results of one phase, or some realization of its output, are fed into another phase. Multi-phase models are becoming more prevalent in applied statistical analyses as data sets grow larger and more complicated. They offer a solution for complex statistical problems where modeling all parameters jointly has its limitations. We explore two applications, one in sports analytics and one in astronomy, in which we choose multi-phase models to analyze our data.
Part I - Predicting Athlete Performance: It is often the goal of sports analysts, coaches, and fans to predict athlete performance over time. Methods such as Elo, Glicko, and Plackett-Luce-based ratings measure athlete skill from competition results over time but have no predictive strength on their own. Growth curves are often applied in the context of sports to predict future ability, but these curves are too simple to account for complex career trajectories. We propose a non-linear, mixed-effects growth curve to model the ratings as a function of time and other athlete-specific covariates. The mixture of growth curves allows for flexibility in the estimated shape of career trajectories between athletes as well as between sports. We use the fitted growth curves to make predictions of an athlete's career trajectory in two ways. The first is a model of how athlete performance progresses over time in a multi-competitor scenario, built as an extension of the Plackett-Luce model. The second is a method that applies the growth curve as a second step to existing rating systems for multi-competitor and head-to-head sports. We show that this method can be applied to different sports by using examples from men's slalom and women's luge, respectively.
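As a point of reference for the Plackett-Luce model mentioned above, the following is a minimal sketch of the standard (static) Plackett-Luce log-likelihood for a single multi-competitor event. It is not the dissertation's time-varying extension; the function name, the dictionary of log-skills, and the athlete labels are illustrative assumptions.

```python
import math


def plackett_luce_log_likelihood(log_skills, finishing_order):
    """Log-probability of one finishing order under a static Plackett-Luce model.

    log_skills: dict mapping athlete label -> log-skill (hypothetical inputs).
    finishing_order: list of athlete labels, winner first.
    """
    total = 0.0
    for position, athlete in enumerate(finishing_order):
        # Athletes still competing for this position (not yet placed).
        remaining = finishing_order[position:]
        # Stable log-sum-exp of the remaining athletes' log-skills.
        max_skill = max(log_skills[a] for a in remaining)
        log_norm = max_skill + math.log(
            sum(math.exp(log_skills[a] - max_skill) for a in remaining)
        )
        # Probability that this athlete is chosen first among those remaining.
        total += log_skills[athlete] - log_norm
    return total


# Illustrative values only: three athletes with equal skill.
equal_skills = {"A": 0.0, "B": 0.0, "C": 0.0}
log_prob = plackett_luce_log_likelihood(equal_skills, ["A", "B", "C"])
```

With equal skills, every one of the six possible orderings of three athletes is equally likely, so `math.exp(log_prob)` is 1/6. In a multi-phase setting, skills estimated this way could then be passed to a second-phase growth curve.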
Part II - Defining Regions that Contain Complex Astronomical Structure: Astronomers are interested in delineating boundaries of extended sources in noisy images. Examples include finding the outline of a jet in a distant quasar or observing the morphology of a supernova remnant over time. Analyzing the morphology of these objects is particularly challenging for X-ray images of high-redshift sources, where there are a limited number of high-energy photon counts. Low-counts Image Reconstruction and Analysis (LIRA), a Bayesian multi-scale image reconstruction method, has been tremendously successful in analyzing low-count images and extracting noisy structure. However, we do not always have supplementary information to predetermine the region of interest (ROI), and its size and shape can significantly affect the estimated flux or luminosity. To group similar pixels, we impose a multi-phase model that uses the output of LIRA to build a distribution for the shape of the ROI. We adopt the Ising model as a prior on assigning the pixels to either the background or the ROI. This Bayesian post-processing step informs the final boundary. We apply this method to observed data as well as simulations to show that it is capable of picking out meaningful ROIs.
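To illustrate what an Ising prior on pixel assignments looks like, the following is a minimal sketch of the unnormalized Ising log-prior on a small grid of labels. It assumes a 4-neighbor lattice, labels coded as +1 (ROI) and -1 (background), and a single coupling parameter; these choices and all names are illustrative assumptions, not the dissertation's actual specification.

```python
def ising_log_prior(labels, coupling):
    """Unnormalized log-prior of a binary label grid under an Ising model.

    labels: list of rows, each entry +1 (ROI) or -1 (background).
    coupling: non-negative strength rewarding neighboring pixels that agree.
    """
    n_rows = len(labels)
    n_cols = len(labels[0])
    interaction = 0
    for r in range(n_rows):
        for c in range(n_cols):
            # Count each horizontal neighbor pair once.
            if c + 1 < n_cols:
                interaction += labels[r][c] * labels[r][c + 1]
            # Count each vertical neighbor pair once.
            if r + 1 < n_rows:
                interaction += labels[r][c] * labels[r + 1][c]
    return coupling * interaction


# Illustrative 2x2 grids: a uniform region versus a checkerboard.
smooth = [[1, 1], [1, 1]]
speckled = [[1, -1], [-1, 1]]
smooth_score = ising_log_prior(smooth, coupling=0.5)
speckled_score = ising_log_prior(speckled, coupling=0.5)
```

A positive coupling gives spatially coherent label maps a higher prior weight than speckled ones, which is why such a prior favors contiguous regions when pixels are assigned to the background or the ROI.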
Citable link to this page: https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37365871
- FAS Theses and Dissertations