Person:
Gomez, Andres

Last Name

Gomez

First Name

Andres

Name

Gomez, Andres

Search Results

  • Publication
    Explaining the Prevalence, Scaling and Variance of Urban Phenomena
    (Center for International Development at Harvard University, 2016-12) Gomez, Andres; Patterson-Lomba, Oscar; Hausmann, Ricardo
    The prevalence of many urban phenomena changes systematically with population size [1]. We propose a theory that unifies models of economic complexity [2, 3] and cultural evolution [4] to derive urban scaling. The theory accounts for the difference in scaling exponents and average prevalence across phenomena, as well as the difference in the variance within phenomena across cities of similar size. The central ideas are that a number of necessary complementary factors must be simultaneously present for a phenomenon to occur, and that the diversity of factors is logarithmically related to population size. The model reveals that phenomena that require more factors will be less prevalent, scale more superlinearly and show larger variance across cities of similar size. The theory applies to data on education, employment, innovation, disease and crime, and it entails the ability to predict the prevalence of a phenomenon across cities, given information about the prevalence in a single city.
  • Publication
    A New Algorithm to Efficiently Match U.S. Census Records and Balance Representativity with Match Quality
    (Growth Lab, 2024-12) Protzer, Eric; Orazbayev, Sultan; Gomez, Andres; Hartog, Matte; Neffke, Frank
    We introduce a record linkage algorithm that allows one to (1) efficiently match hundreds of millions of records based not just on demographic characteristics but also on name similarity, (2) make statistical choices regarding the trade-off between match quality and representativity, and (3) automatically generate a ground truth of true and false matches, suitable for training purposes, based on networked family relationships. Given the recent availability of hundreds of millions of digitized census records, this algorithm significantly reduces computational costs to researchers while allowing them to tailor their matching design to the research question at hand (e.g. prioritizing external validity over match quality). Applied to U.S. Census records from 1850 to 1940, the algorithm produces two sets of matches, one designed for representativity and one designed to maximize the number of matched individuals. At the same level of accuracy as commonly used methods, the algorithm tends to have a higher level of representativity and a larger pool of matches. The algorithm also allows one to match harder-to-match groups with less bias (e.g. women, whose names tend to change over time due to marriage).