Publication:
Methods of Imputation and Data Merging to Predict Supportive Housing Outcomes for Homeless Families in San Francisco

Date

2020-06-17

The Harvard community has made this article openly available. Please share how this access benefits you.

Citation

Nakada, Madeleine Rose. 2020. Methods of Imputation and Data Merging to Predict Supportive Housing Outcomes for Homeless Families in San Francisco. Bachelor's thesis, Harvard College.

Abstract

Machine learning models have been applied to data on homeless households to investigate a range of problems, from predicting length of stay at homeless shelters to predicting outcomes for homeless individuals in order to help cities and shelters allocate resources. However, the majority of these models rely on data collected from a number of agencies to build a comprehensive dataset of homeless individuals within a municipality. Fitting a model to predict outcomes for households on data from a single provider presents a number of challenges, including a smaller dataset and a lack of secondary data sources to deal with missing data. Nonetheless, such a model can be useful to these service providers, as it can be used to model how households respond to the provider’s specific services rather than generalizing over all supportive housing providers within a city, which may cater to different demographics and have different levels of engagement. In this thesis, I investigate methods for building a dataset which can be used to predict outcomes for households that engage with Compass Family Services, an agency which provides housing and housing stipends to homeless households in San Francisco. The results show that while relatively high prediction accuracy can be attained using simple imputation methods, these accuracies rely on variables in the dataset that describe households’ enrollment in other programs. Since these programs filter who can enroll in them, these variables are likely correlated with other agencies’ beliefs that the household will have a successful exit. I discuss the pros and cons of including these variables, as well as further applications of the predictive model to chronic homelessness.

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
