Geographic Clustering for Neighborhood Boundaries: A Spatial Analysis of Chicago Using Public Data
Kuppersmith, Joshua Benjamin
Citation: Kuppersmith, Joshua Benjamin. 2019. Geographic Clustering for Neighborhood Boundaries: A Spatial Analysis of Chicago Using Public Data. Bachelor's thesis, Harvard College.
Abstract: Open data initiatives in cities around the world have enabled new efforts to understand and improve urban areas through data analysis. In order to develop actionable insights to improve cities, it is important to isolate differences between geographic areas throughout the city. Neighborhoods are typically used as a unit for spatial separation, where each neighborhood is internally similar, and different from outside areas. As such, neighborhood analysis is key to developing an understanding of complex urban dynamics, yet current neighborhood boundaries do not always adequately reflect similar areas of cities. This thesis proposes a new clustering algorithm to automatically generate neighborhoods with highly similar internal data profiles. Using a grid model of a city, this new method of clustering, called Geographic K-Means, incorporates data accumulated within grid cells and builds clumps of neighboring cells with similar data trends. This method is optimized using hyper-parameter tuning to improve an Earth Mover's Distance-based measure of within-neighborhood homogeneity. The optimization uses regularization to enforce smooth neighborhood boundaries, helping us find an optimal balance between data similarity and realistic contiguous neighborhoods. In order to build and test this algorithm, we used Chicago as a case study due to its abundance of data. By generating new Chicago neighborhood boundaries, and increasing within-neighborhood crime homogeneity, we are able to see the relationship between crime and neighborhoods, and better detect sharp boundaries between areas of the city.
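The abstract does not give the algorithm's details, so the following is only an illustrative sketch of the general idea it describes, not the thesis's implementation. It assumes each grid cell carries a small histogram (for example, counts per crime category), measures data similarity with a 1-D Earth Mover's Distance over unit-spaced bins, and adds a spatial penalty weighted by a hypothetical parameter `lam` to favor compact clusters. The function names, the cost function, and the farthest-point seeding are all assumptions made for this example.

```python
import math

def emd_1d(p, q):
    """EMD between two histograms over the same ordered, unit-spaced bins.
    Assumes both histograms have a positive total; each is normalized to sum to 1."""
    sp, sq = sum(p), sum(q)
    cum = total = 0.0
    for a, b in zip(p, q):
        cum += a / sp - b / sq      # running difference of the two CDFs
        total += abs(cum)
    return total

def cost(cell, center, lam):
    """Assumed cost: data dissimilarity plus lam times grid distance."""
    (r, c, hist), (cr, cc, chist) = cell, center
    return emd_1d(hist, chist) + lam * math.hypot(r - cr, c - cc)

def geographic_kmeans_sketch(cells, k, lam=0.1, n_iter=20):
    """cells: list of (row, col, histogram). Returns one cluster label per cell."""
    # Deterministic farthest-point seeding (an assumption, chosen for repeatability).
    centers = [cells[0]]
    while len(centers) < k:
        centers.append(max(cells, key=lambda x: min(cost(x, m, lam) for m in centers)))
    labels = []
    for _ in range(n_iter):
        # Assignment step: each cell joins the cheapest center.
        new_labels = [min(range(k), key=lambda j: cost(x, centers[j], lam)) for x in cells]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each center becomes the mean position and mean histogram of its members.
        for j in range(k):
            members = [x for x, l in zip(cells, labels) if l == j]
            if members:
                n = len(members)
                centers[j] = (
                    sum(m[0] for m in members) / n,
                    sum(m[1] for m in members) / n,
                    [sum(m[2][b] for m in members) / n for b in range(len(members[0][2]))],
                )
    return labels

# Toy 6x6 grid: the left three columns share one profile, the right three another.
grid = [(r, c, [8, 1, 1] if c < 3 else [1, 1, 8]) for r in range(6) for c in range(6)]
labels = geographic_kmeans_sketch(grid, k=2)
```

On this toy grid the two clusters coincide with the left and right halves. Raising `lam` in this sketch trades data similarity for spatial compactness, which is the kind of balance the abstract attributes to regularization; note that a distance penalty alone does not guarantee contiguous clusters.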
Citable link to this page: https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37364628
- FAS Theses and Dissertations