Publication:
Geographic Clustering for Neighborhood Boundaries: A Spatial Analysis of Chicago Using Public Data

No Thumbnail Available

Date

2019-10-25

Published Version

Published Version

Journal Title

Journal ISSN

Volume Title

Publisher

The Harvard community has made this article openly available. Please share how this access benefits you.

Research Projects

Organizational Units

Journal Issue

Citation

Kuppersmith, Joshua Benjamin. 2019. Geographic Clustering for Neighborhood Boundaries: A Spatial Analysis of Chicago Using Public Data. Bachelor's thesis, Harvard College.

Research Data

Abstract

Open data initiatives in cities around the world have enabled new efforts to understand and improve urban areas through data analysis. In order to develop actionable insights to improve cities, it is important to isolate differences between geographic areas throughout the city. Neighborhoods are typically used as a unit for spatial separation, where each neighbor- hood is internally similar, and different from outside areas. As such, neighborhood analysis is key to developing an understanding of complex urban dynamics, yet current neighborhood boundaries do not always adequately reflect similar areas of cities. This thesis proposes a new clustering algorithm to automatically generate neighborhoods with highly similar in- ternal data profiles. Using a grid-model of a city, this new method of clustering, called Geographic K-Means, incorporates data accumulated within grid cells and builds clumps of neighboring cells with similar data trends. This method is optimized using hyper-parameter tuning to improve an Earth Mover’s Distance-based measure of within-neighborhood homo- geneity. The optimization uses regularization to enforce smooth neighborhood boundaries, helping us find an optimal balance between data similarity and realistic contiguous neigh- borhoods. In order to build and test this algorithm, we used Chicago as a case study due to its abundance of data. By generating new Chicago neighborhood boundaries, and increasing within-neighborhood crime homogeneity, we are able to see the relationship between crime and neighborhoods, and better detect sharp boundaries between areas of the city.

Description

Other Available Sources

Keywords

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service

Endorsement

Review

Supplemented By

Referenced By

Related Stories