Publication: Methods of Imputation and Data Merging to Predict Supportive Housing Outcomes for Homeless Families in San Francisco
Date
2020-06-17
Citation
Nakada, Madeleine Rose. 2020. Methods of Imputation and Data Merging to Predict Supportive Housing Outcomes for Homeless Families in San Francisco. Bachelor's thesis, Harvard College.
Abstract
Machine learning models have been applied to data on homeless households to investigate a range of problems, from predicting length of stay at homeless shelters to predicting outcomes for homeless individuals to help cities and shelters allocate resources. However, the majority of these models rely on data collected from a number of agencies to build a comprehensive dataset of homeless individuals within a municipality. Fitting a model to predict outcomes for households on data from a single provider presents a number of challenges, including a smaller dataset and a lack of secondary data sources to deal with missing data. Nonetheless, such a model can be useful to these service providers, as it can be used to model how households respond to the provider’s specific services rather than generalizing over all supportive housing providers within a city, who may cater to different demographics and have different levels of engagement with [...]. In this thesis, I investigate methods for building a dataset that can be used to predict outcomes for households that engage with Compass Family Services, an agency that provides housing and housing stipends to homeless households in San Francisco. The results show that while relatively high prediction accuracy can be attained using simple imputation methods, these accuracies rely on variables in the dataset that describe households’ enrollment in other programs. Since these programs filter who can enroll in them, these variables are likely correlated with other agencies’ beliefs that the household will have a successful exit. I discuss the pros and cons of including these variables, as well as further applications of the predictive model to chronic homelessness.
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service