Why Has Regional Convergence in the U.S. Stopped?

The past thirty years have seen a dramatic decrease in the rate of income convergence across U.S. states. This decline coincides with a similarly substantial decrease in population flows to wealthy states. We develop a model where labor mobility plays a central role in convergence and can quantitatively account for its disappearance. We then link this decline in directional migration to a large increase in housing prices and housing regulation in high-income areas. The model predicts that these housing market changes generate (1) a divergence in the skill-specific economic returns to living in rich places, (2) a decline in low-skilled migration to rich places and continued low-skilled migration to places with high income net of housing costs, (3) a decline in the rate of human capital convergence and (4) continued income convergence among places with unconstrained housing supply. Using Census data, we find support for the first three hypotheses. To test the fourth hypothesis, we develop a new state-level panel measure of housing supply regulations. Using this measure as an instrument for housing prices, we document the central role of housing prices and building restrictions in the end of income convergence.


Introduction
The century-long convergence in per-capita income across U.S. states from 1880 to 1980 is one of the most striking relationships in macroeconomics. 1 During this period, incomes across states converged, on average, at a rate of 1.8% a year. 2 The R-squared for this relationship is 0.95, which means that the same slope can essentially be recovered by comparing any subset of states. This negative relationship between initial income and subsequent growth held within most sub-periods as well. Per-capita income in Connecticut was 4.37 times larger than per capita income in Mississippi in 1940. By 1960 that ratio had fallen to 2.28, and it fell again to 1.76 by 1980. During the last thirty years this relationship has weakened considerably, as documented at the metro-area level by Berry and Glaeser (2005). 3 The top two panels of Figure 2 plot the relationship between income growth and initial income for the periods 1940-1960 and 1990-2010. 4 While both periods show a negative correlation, the slope of the convergence relationship fell by more than 50%, and the R-squared fell from 0.90 to 0.16. The income gap between Mississippi and Connecticut declined for over a century; over the past thirty years the income gap has remained constant. The residents of Connecticut have average incomes 1.77 times the average in Mississippi today.
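The Connecticut and Mississippi figures quoted above can be turned into a rough pace of gap closure. The sketch below is only an illustration of that arithmetic: it uses the ratios stated in the text, treats "today" as 2010 (an assumption), and measures the average annual decline in the log income gap for this one pair of states, which is not the same object as the regression-based convergence rate.

```python
import math

# Connecticut-to-Mississippi per-capita income ratios quoted in the text.
# Dating the "today" ratio to 2010 is an assumption made for illustration.
ratios = {1940: 4.37, 1960: 2.28, 1980: 1.76, 2010: 1.77}

def annual_gap_closure(ratios, start, end):
    """Average annual decline in the log income gap between two years."""
    gap_start = math.log(ratios[start])
    gap_end = math.log(ratios[end])
    return (gap_start - gap_end) / (end - start)

for start, end in [(1940, 1960), (1960, 1980), (1980, 2010)]:
    rate = annual_gap_closure(ratios, start, end)
    print(f"{start}-{end}: log gap changed by {100 * rate:.2f} log points per year")
```

A positive value means the gap narrowed; a value near zero for the last window corresponds to the statement that the gap has remained constant.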
The change in this relationship can be seen in the bottom panel of Figure 2. In this panel, we plot the convergence relationship (change in log income on initial log income) for rolling twenty-year windows.
The slopes of the twenty year windows 1940-1960 and 1990-2010 are highlighted. This figure shows that the convergence relationship was quite strong through 1980, with a convergence rate of 2.1% per year.
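The rolling-window exercise described here can be sketched as follows. This is a minimal illustration, not the paper's estimation code: the function names are hypothetical, the slope is annualized by dividing growth by the window length (one possible convention), and the input is synthetic data for four made-up states.

```python
import numpy as np

def convergence_slope(log_income_start, log_income_end, years):
    """Annualized OLS slope of the change in log income on initial log income."""
    start = np.asarray(log_income_start, dtype=float)
    growth = (np.asarray(log_income_end, dtype=float) - start) / years
    slope, _intercept = np.polyfit(start, growth, 1)
    return slope

def rolling_slopes(log_income_by_year, window=20):
    """Slope for every start year whose end year (start + window) is available."""
    return {
        start: convergence_slope(
            log_income_by_year[start], log_income_by_year[start + window], window
        )
        for start in sorted(log_income_by_year)
        if start + window in log_income_by_year
    }

# Synthetic log incomes for four hypothetical states, built so that each state
# closes 40% of its log gap to a common level of 10 over twenty years.
initial = np.array([9.0, 9.5, 10.0, 10.5])
panel = {1940: initial, 1960: initial + 0.4 * (10.0 - initial)}
slopes = rolling_slopes(panel)
print(slopes)
```

A negative slope indicates convergence: states with higher initial income grow more slowly.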
In the last thirty years, this pattern has largely disappeared. From 1981 to 2010, the annual convergence rate was less than 1%, and in many of the years leading up to the "Great Recession," there was essentially no convergence at all. 5
During the period of strong convergence prior to 1980, population flowed from poor to rich states, and changes in population were also well-predicted by initial income. Figure 3 plots the relationship between the twenty year changes in log population and initial log income per-capita for the period 1940-1960. Over those decades, a doubling in initial income per capita was associated with a 1.6 percentage point higher annual growth rate in population. Four states (Arizona, California, Florida, and Nevada) had high average incomes and grew extremely quickly. However, the positive relationship between income and population growth is not driven by this group alone. It remains large and significant in their absence, as other rich states such as Maryland, Connecticut, and Delaware all experienced faster than average growth.
The top panel of Figure 3 plots the same relationship for the period 1990-2010. As is evident in the figure, population is no longer flowing to richer states. In the bottom panel of Figure 3, we plot the extent of differential population growth (change in log population on initial log income) for rolling twenty-year windows. The slopes of the twenty year windows from 1940-1960 and 1990-2010 are highlighted, as in the convergence relationship figure. This figure shows that directed net migration was quite strong in the period prior to 1980, when a doubling of income was associated with a 1.46 percentage point higher annual growth rate in population. In the last thirty years, this pattern has largely disappeared. From 1981-2010, the average slope was just 0.29. As an example, from 2000 to 2010 Utah's population grew by 24% whereas Massachusetts's population grew by just 3%. This occurred even though per capita incomes in Massachusetts were 55% higher in 2000.
In this paper, we argue that labor mobility explains income convergence and its disappearance. Past literature has deemed the observed level of migration quantitatively unable to account for the observed level of convergence and instead focused on the role of capital, racial discrimination, or sectoral reallocations. 6 We build on an older tradition of work by economic historians (Easterlin (1958) and Williamson (1965)) as formalized by Braun (1993), in which directed migration drives convergence. By adding elastic labor supply and human capital to Braun's model, we are better able to measure the impact of migration on wages. With this improved measurement, we show that the observed levels of migration can indeed account for the lion's share of the observed convergence.
5 Table 2 shows these coefficients with standard errors on a decennial basis. The strong rate of convergence in the past as well as the decline today appear not to be driven by measurement error. The standard deviation of log income across states fell and then held steady (Appendix Table 1). This measure demonstrates that the estimated decline in convergence rates is not due to a reduction in the variance of initial incomes relative to a stationary shock process. When we use the Census measure of state income to instrument for BEA income, or vice-versa, we find similar results. The decline also occurs at sub-state geographies, using data from Haines (2010) and U.S. Census Bureau (2012).
6 See Barro and Sala-i-Martin (1992), Caselli and Coleman (2001), Michaels et al. (2012), and Hsieh et al. (2012). The rate of convergence predicted by the standard neoclassical model is $\Delta \ln y = \alpha \, \Delta \ln pop$. A conventional value for $\alpha$ of 0.33 would mean that the observed population changes in Figure 3 could account for less than one-third of the observed convergence rate in Figure 2.
Over the past fifty years, the difference in housing prices between rich and poor areas has become increasingly large relative to the differences in income (Van Nieuwerburgh and Weill (2010), Glaeser et al. (2005b)). In 1960, a 1 log-point increase in income in the cross section was associated with an approximately 1 log-point increase in housing prices. In 2010, this slope has more than doubled. Housing prices now capitalize a greater fraction of the income differentials across places.
This changing price-income relationship has important implications for income convergence in our model.
Low-skilled workers are especially sensitive to changes in housing prices. We show that in recent years (1) the real returns to migration to productive places have fallen dramatically for low-skilled workers but have remained high for high-skilled workers, (2) high-skilled workers continue to move to areas with high nominal income, and (3) low-skilled workers are now moving to areas with low nominal income but high real income net of housing costs. As a result, there are lower net population flows to productive places and a divergence in skill levels that slows income convergence.
These three predictions are generated assuming an exogenous increase in housing prices in productive areas. To better understand the origins of housing price increases, we construct a new instrument for housing prices. Our measure is a scaled count of the number of decisions for each state that mention "land use", as tracked through an online database of state appeals court records. Changes in the regulation measure are strongly predictive of changes in housing prices, and we validate this measure using existing cross-sectional survey data. To the best of our knowledge, this is the first panel measure of land use regulations in the United States and the first direct statistical evidence linking regulation increases to price increases. Using this measure, we can connect the entire causal chain in a well-identified manner. High land use regulations weaken the link between high incomes and new housing permits. Income differences across places are capitalized into prices. These price increases weaken human capital convergence and prevent net migration to rich areas with high regulation. Income convergence persists among places unconstrained by these regulations, but is diminished in areas with supply-constraints.
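The regulation measure described above is a scaled count of court decisions mentioning "land use". A minimal sketch of such a count is given below. The record layout, the function name, and the choice to scale by the total number of decisions in each state and year are all assumptions made for illustration; the text does not specify the scaling or the database format.

```python
from collections import Counter

def land_use_index(cases, phrase="land use"):
    """Share of decisions per (state, year) whose text mentions the phrase.

    Scaling by the total number of decisions is an assumed normalization.
    """
    totals, hits = Counter(), Counter()
    for state, year, text in cases:
        key = (state, year)
        totals[key] += 1
        if phrase in text.lower():
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}

# Hypothetical records: (state, year, text of decision).
cases = [
    ("CA", 1980, "Appeal concerning a land use permit denial."),
    ("CA", 1980, "Contract dispute between two suppliers."),
    ("TX", 1980, "Dispute over a commercial lease."),
    ("TX", 1980, "Tort claim arising from a traffic accident."),
]
index = land_use_index(cases)
print(index)
```

Changes in such an index over time within a state would then serve as the panel variation in regulation.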
Understanding convergence and its disappearance is important for macro, labor, and urban economists.
The existing literature in macroeconomics typically attributes convergence among U.S. states to capital accumulation or technological diffusion, but this paper argues for labor mobility as an important mechanism in equilibrating productivity differentials. Our results touch on two key issues in labor economics: we show substantial labor migration in response to economic incentives, and our results are consistent with a fairly large elasticity of labor demand. Urban economists have done substantial work analyzing the consequences of recent housing price increases; we add to this literature by constructing the first panel measure of land use regulations, documenting systematically for the first time the increase in building regulation, and highlighting a previously unrecognized consequence of these price increases.
These changing trends in regional income, mobility, and housing markets are important for policy-makers as well. Economists have long emphasized the importance of labor mobility within a currency union as a stabilizing mechanism (Mundell (1961)) and cited the relatively high degree of directed migration in the U.S.
as an important factor in the nation's success (Blanchard and Katz, 1992). In Europe, where net migration from poor to rich countries has been small in overall scope, there is now a crisis connected to diverging productivity levels within a currency union.
The remainder of the paper proceeds as follows. Section 2 develops a formal model to explore these issues.
In Section 3, we test the model's empirical predictions. Section 4 introduces a new instrument for housing prices, and Section 5 calibrates the role of migration in convergence and compares our supply constraints story to alternative explanations such as capital convergence and demand-side changes in human capital spillovers. Section 6 concludes with a summary of our findings and a discussion of economic implications.

2 Model of Regional Migration, Housing Prices, and Convergence
We develop a model to explore how regional migration and housing markets affect income convergence. In our model, local economies differ in productivity and initial skill level. The high-productivity "North" pays workers their marginal product, and initially its workers are relatively high-skilled. Every worker lives on one plot of land, the supply of which is perfectly elastic. We develop and solve a general equilibrium model for this local economy.
Next, we consider the interregional allocation of labor. Once we allow migration, labor inflows into the productive North drive down the average skill-level and the average wage. These declines lead to regional convergence. We show a theorem for how an increase in housing prices differentially deters low-skilled workers from migrating North. If the elasticity of land supply falls, housing prices rise, and migration flows become smaller and biased towards high-skilled workers. These changes lead to the end of convergence.
Our model makes two important departures from the standard framework in which workers are indifferent across places (Roback (1982)) and output is determined by a neo-classical production function utilizing capital and labor. Both of these departures closely echo the work of Braun (1993), who solves a dynamic model with costly labor migration from unproductive to productive places. In his model, migrants generate congestion in public goods that drive down regional incomes. 7 We add heterogeneous skill types and model congestion through the land sector, both of which closely fit the empirical facts for the U.S. experience with convergence.
Finally, while we use this model for calibration, our reduced form analysis and IV estimates do not rely upon the functional forms developed below. Our interpretation of the data relies on two crucial features of our model:
1. Downward-sloping labor demand at the state level, meaning that labor inflows push down wages. This assumption is consistent with many papers on regional or sectoral labor flows (e.g. Acemoglu et al. (2004), Iyer et al. (2011), Hornbeck (2012), Boustan et al. (2010), and Borjas (2003)). Of particular note is the work of Cortes (2008), who finds that immigration reduces the price of immigrant-intensive services, which is inconsistent with both full factor adjustment and perfectly elastic product demand.
2. Non-homotheticity of land demand, meaning that low-skill workers spend a disproportionate share of income on land. Note that households may upgrade their land as they see fit, and the observed price of housing will reflect both land and structure costs. Three pieces of evidence support this assumption that the income elasticity of land is less than one. First, Glaeser and Gyourko (2005) and Notowidigdo (2011) document inflows of low-skill workers after a negative productivity shock and argue that these workers are disproportionately attracted by low housing prices. Second, Glaeser et al. (2008) show convincing evidence that the income elasticity of plot size is far below 1. Third, in Appendix Figure 1, we plot the relationship between housing share of consumption and total consumption expenditure in the CEX. A quadratic fit is quite strong and clearly shows the share declining with total consumption. Time-series data from productivity shocks, cross-sectional data on land area, and cross-sectional data on housing expenditure are consistent with a fixed per-unit cost for land paid by every household.
7 Blanchard and Katz (1992) develop a reduced-form model in which slow-moving migration absorbs regional productivity shocks and there is downward-sloping labor demand.
The wage rate $w$ is the cost of one unit of effective labor, $\ell_k$ indexes effort, $p$ is the price of land, and $\pi$ are lump-sum federal transfers of profits. With a statewide wage of $w$, a worker with numeraire skill ($\psi = 1$) will supply $\ell_1 = w^{\varepsilon}$ units of labor. In comparison, a worker with skill $\psi_k$ will have an effective labor supply of $\psi_k^{1+\varepsilon} \ell_1$.
Because we assume that every worker inelastically demands one plot of land, $H^{Demand}(p) = \sum_k \mu_k$. In the theory appendix, we develop an extension of the model where consumers have endogenous, non-homothetic demand for housing that is increasing in income.

State Decisions: Labor Demand and Land Supply
State-level production is given by $Y = A L^{1-\alpha}$. We view this functional form as a reduced form representation of a more complicated production process.
The first term $A$ can encompass capital differences, natural advantages, institutional strengths, different sectoral compositions, amenities, and agglomeration benefits, with the key assumption that $A$ does not change over time. The second term $L^{1-\alpha}$ gives decreasing returns to scale in labor, which is helpful for the exposition, but is not necessary for our results. 8 In each skill group $k$, every member supplies $\ell_k$. Different types of effective labor are perfect substitutes.
Using the labor supply equation above, where $\psi_k \ell_k = \psi_k^{1+\varepsilon} \ell_1$, we can write the labor demand equation as effective labor $\gamma$ times the amount of labor supplied by a worker with numeraire skill. This yields $L = \gamma \ell_1$. Effective labor gets its marginal product $w = (1-\alpha) A (\gamma \ell_1)^{-\alpha}$. Solve for labor demand: $\gamma \ell_1 = \left( (1-\alpha) A / w \right)^{1/\alpha}$. State-level production and land profits are returned to consumers as a nationwide lump-sum rebate $\pi$.

Market-Clearing
An equilibrium within a state is a set of prices {p, w} and allocations such that individuals choose their labor supply optimally, workers earn their marginal product, and the land and labor markets clear.
Labor: Before, we derived labor supply and demand schedules as a function of individual supply and firm production. Now, we impose the labor market-clearing condition for the numeraire worker.
Land: Using the land market-clearing condition, we set:
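The two market-clearing conditions can be illustrated numerically. The sketch below is not the authors' code. It assumes the labor supply and marginal-product expressions given in the preceding subsections, and it further assumes a land supply curve of the form H = p**beta (suggested by the later limit of N**(1/beta), but not stated explicitly in the surviving text), with one plot demanded per resident. The function name and parameter values are hypothetical.

```python
def equilibrium(A, alpha, eps, gamma, N, beta):
    """Wage and land price for one state under the assumed functional forms.

    Assumptions:
      labor supply of the numeraire worker: l1 = w**eps
      labor demand: gamma * l1 = ((1 - alpha) * A / w) ** (1 / alpha)
      land supply (assumed form): H = p**beta, one plot per resident
    """
    # Equate labor supply and demand for the numeraire worker, solve for w.
    w = ((1 - alpha) * A) ** (1 / (1 + alpha * eps)) * gamma ** (-alpha / (1 + alpha * eps))
    # Land market clearing: N = p**beta.
    p = N ** (1 / beta)
    return w, p

# Illustrative parameter values, not a calibration.
A, alpha, eps, gamma, N, beta = 2.0, 0.33, 0.5, 3.0, 4.0, 2.0
w, p = equilibrium(A, alpha, eps, gamma, N, beta)
print(w, p)
```

Under these assumptions, a larger stock of effective labor lowers the wage, and a larger population raises the land price by more when the supply elasticity beta is small.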

Indirect Utility and Comparative Statics
Now we can express the indirect utility for a worker of skill $\psi_k$ in a state $s$ as a function of three state-specific parameters: productivity $A_s$, effective labor $\gamma_s$, and population $N_s$. One comparative static of interest is a population inflow that does not change the skill-mix of the state. A population inflow pushes down wages and per capita income because of downward-sloping labor demand.
Another comparative static of interest is a change in the state's average skill level, holding constant the total population. An increase in the state's skill-level will mechanically raise per capita income.
These two elasticities, $\varepsilon^{\text{per capita inc}}_{\text{pop}}$ and $\varepsilon^{\text{per capita inc}}_{\text{avg skill level}}$, summarize the impact of migration on income per capita and we return to them in the context of the calibration in Section 5.
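The signs of these two elasticities can be checked numerically. The sketch below is a reconstruction under assumptions, not the paper's derivation: it takes the labor supply and marginal-product expressions stated earlier, writes effective labor as average skill times population, and measures per capita income as labor income per resident. The closed forms in the comments follow from those assumptions only.

```python
import math

def per_capita_income(A, alpha, eps, avg_skill, N):
    """Labor income per resident when effective labor is gamma = avg_skill * N."""
    gamma = avg_skill * N
    w = ((1 - alpha) * A) ** (1 / (1 + alpha * eps)) * gamma ** (-alpha / (1 + alpha * eps))
    return w ** (1 + eps) * gamma / N  # wage times effective labor supplied, per head

def elasticity(f, x, h=1e-6):
    """Numerical elasticity d log f / d log x via a central difference."""
    return (math.log(f(x * (1 + h))) - math.log(f(x * (1 - h)))) / (
        math.log(1 + h) - math.log(1 - h)
    )

A, alpha, eps = 1.0, 0.33, 0.5  # illustrative values, not a calibration
e_pop = elasticity(lambda n: per_capita_income(A, alpha, eps, 1.0, n), 2.0)
e_skill = elasticity(lambda g: per_capita_income(A, alpha, eps, g, 2.0), 1.0)

# Under the stated assumptions these equal
#   -alpha * (1 + eps) / (1 + alpha * eps)  and  (1 - alpha) / (1 + alpha * eps).
print(e_pop, e_skill)
```

The population elasticity is negative and the skill elasticity is positive, matching the direction of the two comparative statics described above.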

Cross-state migration
We simplify the model to consider migration between a productive "North" and a place with fixed flow utility "Reservationville". The flow utility in Reservationville for skill $\psi_k$ is equal to $\psi_k^{1+\varepsilon} \Omega + 1$. Workers draw stochastic moving costs with realization $x$ and then decide whether to move. 10 Using the expression for indirect utility in equation (1), the agents' rule is to move if inequality (4) holds. The cutoff $x_k^*$ is implicitly defined when (4) holds with equality.

Proposition 1
If (1) income in North is sufficiently high ($v_{North} > \Omega + 1$), (2) land supply in the North is perfectly elastic ($\beta \to \infty$), and (3) average effective labor per capita is higher in the North than in Reservationville ($\gamma_{North}/N_{North} > \gamma_{\Omega}/N_{\Omega}$), then migration generates per capita income convergence.
Proof. Because $\lim_{\beta \to \infty} N^{1/\beta} = 1$, all terms in equation (4) are proportional to $\psi_k^{1+\varepsilon}$, so $x_k^* = x^* \ \forall k$. In this case, the share of people leaving Reservationville for North is $F(x^*)$ for all skill types, where $F$ is the CDF for the distribution of moving costs. Under assumptions (1) and (2) all groups find it worthwhile to move North, and under assumption (3) these flows are of a lower average skill than the average skill level in the North. Using equations (2) and (3), we see that $\varepsilon^{\text{per capita inc}}_{\text{pop}} < 0$ and $\varepsilon^{\text{per capita inc}}_{\text{avg skill level}} > 0$. Thus, per capita incomes fall in the North.
10 The moving cost $x_k$ is scaled by $\psi_k^{1+\varepsilon}$ such that it is proportional to flow utility. Conceptually, this assumption makes sense if the primary costs of moving are the search costs for a new job and the lost social networks for finding jobs in one's home state, both of which are proportional to time, not dollars.
The evidence presented in the introduction is consistent with Proposition 1: population flows from poor to rich places ("directed migration") coincided with income convergence, and the end of income convergence coincided with the end of directed migration.

Migration by Skill Type
Define the gain from moving when land supply is completely elastic as $\Delta(k)_{North}$, which nets out the Reservationville term $\Omega$. A potential migrant from group $k$ adheres to the following decision rule:

Proposition 2
If (1) the North is attractive in nominal income terms ($\Delta(k)_{North} > 0$) and (2) $N_{North} > 1$, so that a decrease in $\beta$ raises prices in the state ($\partial p_N / \partial \beta < 0$), then we can characterize migration flows to the North as a function of the land supply elasticity $\beta$ and the normalized gain from moving $\tilde{\Delta}(k)_{North} = \log(\Delta(k)_{North} + 1) / \log(N_{North})$.

[Table: Land Supply Elasticity | Migration Flows]
This proposition shows how different values of the land supply elasticity affect migration by skill type.
When the land supply elasticity is high and land prices are low, all skill types migrate to the North because of its higher productivity. When the land supply elasticity falls, land prices rise. As land prices rise, since land is a larger share of low-skilled consumption, low-skilled types are differentially discouraged from moving North. 11 Indeed, for some values of land prices, the low-skilled types move out while the high-skilled types continue to migrate North. Proposition 2 generates three testable predictions associated with a decrease in $\beta$ in high nominal income areas:
1. The returns to migration should fall differentially for low-skill workers.
2. Low-skilled workers should move less to areas with increased prices, but continue to move to areas with high real income.
3. A decline in the rate of human capital convergence between rich and poor states.
Combining Proposition 1 and Proposition 2 yields a fourth testable prediction:
4. Convergence should fall more in areas with increased prices.
In the remainder of the paper, we test the above predictions in the data and calibrate the role that migration played in convergence.
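The normalized gain from moving defined in Proposition 2 can be computed directly from its formula. The sketch below only evaluates that expression for two hypothetical groups; the numerical gains are made up, and the code does not attempt to reproduce the proposition's full migration rule, which is not recoverable from the text here.

```python
import math

def normalized_gain(delta, n_north):
    """Normalized gain from moving: log(delta + 1) / log(n_north).

    Requires delta > 0 and n_north > 1, the two conditions of Proposition 2.
    """
    if delta <= 0 or n_north <= 1:
        raise ValueError("Proposition 2 assumes delta > 0 and n_north > 1")
    return math.log(delta + 1) / math.log(n_north)

# Hypothetical nominal gains for a low-skilled and a high-skilled group.
low = normalized_gain(0.5, 3.0)
high = normalized_gain(2.0, 3.0)
print(low, high)
```

Because the expression is increasing in the nominal gain, a group with a larger gain from moving has a larger normalized gain, which is in line with the account above in which low-skilled types are the first to be discouraged as land prices rise.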

3 Reduced Forms: Housing Prices, Returns to Migration, Migration Flows, and Human Capital Convergence
In the model laid out in Section 2, a land price increase in the productive areas changed the returns to migration differentially across skill groups. The model predicted a larger fall in the returns for low-skilled workers than for high-skilled ones. Further, the model predicted that this change in returns would shift the economy from one in which there was directed net migration and human capital convergence, to one in which directed migration ceases and the interstate labor markets clear through skill-sorting. 12
In the last fifty years, there has been a shift in the relationship between state prices and state incomes. The slope of this relationship more than doubles over this period. In 1960, housing prices were 1 log point higher than average in a state with 1 log point higher than average income. In 2010, housing prices are 2 log points higher in a state whose income is 1 log point higher than average. The difference in prices between the rich areas and the poor areas has grown wider relative to the differences in income. The change in this relationship is not specific to the year 2010 or to the recent recession. In the bottom panel of Figure 4, we plot this relationship in the decennial census with the years 1960 and 2010 highlighted. Although there is a spike in 1990 due to the housing bubble at the time, there is a clear upward trend in the slope.
Given this changing slope, the model predicts that there should be differential changes in the returns and migration patterns of skill groups. We test these predictions in the following three subsections.

Returns to Living in Rich Areas (Prediction 1)
12 A number of papers have noted a recent decline in gross interstate migration rates (Kaplan and Schulhofer-Wohl (2012), and Molloy et al. (2011)). Appendix Figure 2 computes gross migration rates since 1960 using data from the Census/ACS (Ruggles et al. (2010)), the CPS (King et al. (2010)) and IRS files (U.S. Department of the Treasury (2010)). As is apparent in the figure, gross-migration rates remain an order of magnitude larger than net migration rates despite recent trends. Therefore the overall decline in gross migration does not explain the decline in net migration documented in this study.
13 Although our model was written in terms of land prices, our empirical results use housing prices for reasons related to data availability. Albouy and Ehrlich (2012) demonstrate the tight link between these two prices.
We now test Prediction 1 from the model, which states that the housing price trends documented above would induce a differential change in the real returns to living in productive areas. We test for changing returns by examining the relationship between unconditional mean income in a state and skill-group income net of housing prices (Ruggles et al. (2010)). The baseline results of this specification are shown in Figure 5. With $i$ indexing households and $s$ indexing state of residence, we show the regression:
$Y_{is} - P_{is} = \alpha + \beta_{unskill} Y_s + \beta_{skill} Y_s \times S_{is} + \gamma X_{is} + \varepsilon_{is}$
where $Y_{is}$ is household wage income, $P_{is}$ is a measure of housing costs defined as 12 times the monthly rent or 5% of house value for homeowners, $S_{is}$ is the share of the household that is high-skilled, and $Y_s$ is the mean nominal wage income in the state. 14 The left-hand side of the equation is a household-level measure of real income. This measure is regressed upon the state-level mean household income and the interaction of nominal income with the share of high-skilled workers in the household. In 1940, both lines have a strong positive slope, which shows that both groups gain from living in richer areas. In 2010, the slope for low-skilled workers is flatter. They no longer benefit as much when living in the richer states.
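The regression of real income on state mean income and its interaction with household skill share can be sketched as below. This is an illustrative least-squares fit on synthetic data, not the paper's estimation: the household covariates are omitted for brevity, the function names are hypothetical, and the numbers are generated from known coefficients so that the fit can be checked.

```python
import numpy as np

def housing_cost(monthly_rent=None, house_value=None):
    """Annual housing cost: 12 x monthly rent, or 5% of house value for owners."""
    if house_value is not None:
        return 0.05 * house_value
    return 12 * monthly_rent

def returns_by_skill(real_income, state_income, skill_share):
    """OLS of real income on state mean income and its interaction with skill share."""
    X = np.column_stack(
        [np.ones_like(state_income), state_income, state_income * skill_share]
    )
    coef, *_ = np.linalg.lstsq(X, real_income, rcond=None)
    return {"alpha": coef[0], "beta_unskill": coef[1], "beta_skill": coef[2]}

# Synthetic households generated from known coefficients (no noise, no covariates).
state_income = np.array([30.0, 30.0, 40.0, 40.0, 50.0, 50.0])
skill_share = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 0.5])
real_income = 2.0 + 0.4 * state_income + 0.6 * state_income * skill_share
fit = returns_by_skill(real_income, state_income, skill_share)
print(fit)
```

In this setup the first slope is the return to state income for a household with no high-skilled members, and the second slope is the additional return for a fully high-skilled household.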
Appendix Table 3 reports these results in regression form and shows the evolution of $\beta_{unskill}$ and $\beta_{skill}$ over time. These coefficients measure the returns to an agent of low-skilled or high-skilled types to living in a state that is one dollar richer. For example, $\beta_{unskill}$ is 0.88 in 1940. This finding shows that after controlling for demographics and skill types, real income was $0.88 higher in states with $1.00 higher nominal income. $\beta_{unskill}$ declines rapidly over time, though. By 2010, the real income of low-skilled households is only $0.36 higher for each $1 increase in nominal income. States with high nominal income no longer appear to offer significantly higher real income to low-skilled households.
The coefficients on $\beta_{skill}$ show a different pattern. Initially negative or close to zero, this term in 1940 and 1960 suggests that the higher real incomes earned in states with higher nominal income were equally or progressively shared across skill groups. Over time, though, $\beta_{skill}$ becomes increasingly important, and is 0.61 by 2010. In other words, real income is four times more responsive to nominal income differences by state for high-skilled households than for low-skilled households. While the differences in nominal income across states correlate only modestly with real income differences for low-skilled workers today, they are highly correlated with the returns to high-skilled households.
These findings can be interpreted easily in light of Prediction 1. As predicted by the model, rising housing prices disproportionately reduce the value of living in productive areas for low-skilled workers. These high housing prices further induce skill-sorting, which makes unconditional mean income less representative of the differentials for low-skilled households. The net effect is that the returns to living in high income areas for low-skilled households have fallen dramatically, even as they have remained stable or grown for high-skilled households. 15

Migration Choices by Skill (Prediction 2)
14 Income net of housing cost is a household-level variable, while education is an individual-level variable. We conduct our analysis at the household level, measuring household skill using labor force participants ages 25-65. A person is defined as high-skilled if he or she has 12+ years of education in 1940, and 16+ years thereafter. The household covariates $X_{is}$ are the size of the household, the fraction of household members in the labor force who are white, the fraction who are black, the fraction who are male, and a quadratic in the average age of the adult household members in the workforce.
15 Appendix Table 3 also reports the results of two robustness checks. First, to reduce the bias arising from the endogeneity of state of residence, we also provide instrumental variables estimates using the mean income level of the household workers' state of birth as an instrument. To be precise, we estimate $Y_{is} - P_{is} = \alpha + \beta_{unskill} \hat{Y}_s + \beta_{skill} \hat{Y}_s \times S_{is} + \gamma X_{is} + \varepsilon_{is}$, using $Y_{s,birth}$ and $Y_{s,birth} \times S_{is}$ as instruments for the two endogenous variables $\hat{Y}_s$ and $\hat{Y}_s \times S_{is}$. Second, to address concerns that secular changes in skill premia and skill mix composition may have induced a mechanical change in the correlation between left-hand side $Y_{is}$ and right-hand side $Y_s$, we demonstrate that housing costs have differentially changed the returns to living in high nominal income places for low-skilled workers using group-specific incomes.
Next, we test Prediction 2 from the model, which states that low-skilled workers should move less to areas with increased prices, but continue to move to areas with high real income. This tests whether the changing returns to migration shown above can account for the observed change in migration patterns. 16 In Figure 6, we plot the five-year net migration rates by log nominal income for State Economic Areas (467 subregions) in 1940.
As is evident from the graphs, both high-skilled and low-skilled adults moved to places with higher nominal income. The same relationship holds true if we plot migration rates for high-skilled and low-skilled workers against their real income. 17 In Figure 7, we plot the same relationship for migration PUMAs (1,020 subregions) in 2000. Although in this figure high-skilled adults are still moving to high unconditional nominal income locations, low-skilled adults are actually weakly migrating away from these locations. This finding sharply contrasts with the results from the earlier period in which there was directed migration for both groups to high nominal income areas. It is an apparent puzzle that low-skilled households would be moving away from high-income areas.
However, this seeming contradiction disappears when we adjust income to reflect the group-specific means net of housing prices. Thus, the apparent puzzle in the first panel is not due to a failure of the underlying economics, but rather is due to the fact that unconditional nominal income differences no longer reflect differences in returns for low-skilled workers. Increasing housing prices in high nominal income areas have made these areas prohibitively costly for low-skilled workers, consistent with Prediction 2 of the model.

Human Capital Flows (Prediction 3)
We now examine the effect of migration flows on human capital levels. We present evidence that the transition from directed migration to skill sorting appears to have subtanstially weakened human capital convergence due to migration. We follow the growth-accounting literature (e.g. (Denison, 1962), Goldin and Katz (2001)) in using a Mincer-style return to schooling to estimate a human capital index in the IPUMS Census files. Let k index human capital levels, and let i index individuals. Within a year t, we estimate the returns to human capital using the specification where Inc ik is an indivudal's annual income, and X ik includes other demographic covariates. 18 To give the reader a sense of differences in human capital over time, Appendix Table 2 reports income and wage premia by year and education group. Define predicted income as Inc k = exp(α k ) and Share ks as the share of people in human capital group k living in state s. A state-level index is The specifications used below involve some choices about how to parameterize housing costs and which migrants to study. We report a wide variety of robustness checks in Appendix Tables 4a and 4b. 17 Migration and education are person-level variables, while income net of housing cost is a household-level variable. We conduct our analysis at the individual level, merging on area-by-skill measures of real income. To construct area-by-skill measures, we define households as high-skilled if the adult labor force participants in said household are high-skilled, and as low-skilled if none of them are high-skilled. For ease of analysis, we drop mixed households, which account for roughly 15% of the sample in both 1940 and 2000. Then, we compute Y − P for each household and take the mean by area-skill group.
18 Skill level k is defined as the interaction of completed schooling levels (0 or NA, Elementary, Middle, Some HS, HS, Some College, College+), an age dummy (25-44 or 45-64), a dummy for black and a dummy for Hispanic. $X_{ik}$ includes other demographic covariates (a dummy for gender and a dummy for foreign born). Admittedly, this framework does not allow for selection among migrants along other unobservables. This model of a fixed national return to schooling is conservative, though, given the substantial literature showing that the South had inferior schooling quality conditional on years attained (e.g. Card and Krueger (1992)). Thus this measure, if anything, likely underestimates the human capital dispersion across states.
Under the assumption of a fixed national return to schooling, the human capital index provides an estimate of state-level income as a function of a state's skill mix. Our research design exploits the fact that the Census asks people about both their state of residence and their state of birth. The state of birth question provides us with a no-migration counterfactual for the state-level distribution of human capital. We can then compute the change in the human capital index due to migration as $\Delta HC_s = HC_{s,\text{residence}} - HC_{s,\text{birth}}$. 19 Next, we take the baseline measure of what human capital would have been in the absence of migration ($HC_{s,\text{birth}}$) and examine its correlation with how much migration changed the skill composition of the state ($\Delta HC_s$). Specifically, we regress $\Delta HC_s = \alpha + \beta HC_{s,\text{birth}} + \varepsilon_s$. Results are reported in Figure 8 and Table 3 row (1).
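The construction of the migration-induced change in human capital can be sketched in a few lines. This is only an illustration under stated assumptions: the index is taken to be the share-weighted sum of predicted group incomes, the function names are hypothetical, and the three-state, two-group data are invented toy values rather than Census figures.

```python
import numpy as np

def human_capital_index(pred_income, shares):
    """Share-weighted predicted income per state.

    pred_income: (groups,) predicted income Inc_k for each skill group.
    shares: (states, groups) population shares; each row sums to 1.
    """
    return shares @ pred_income

def convergence_slope(hc_birth, delta_hc):
    """OLS of delta_hc on hc_birth; returns (beta, alpha)."""
    beta, alpha = np.polyfit(hc_birth, delta_hc, 1)
    return beta, alpha

# Toy data (hypothetical): 3 states, 2 skill groups.
pred_income = np.array([20_000.0, 40_000.0])
share_birth = np.array([[0.8, 0.2], [0.5, 0.5], [0.2, 0.8]])      # by state of birth
share_residence = np.array([[0.7, 0.3], [0.5, 0.5], [0.3, 0.7]])  # by state of residence

hc_birth = human_capital_index(pred_income, share_birth)
hc_residence = human_capital_index(pred_income, share_residence)
delta_hc = hc_residence - hc_birth  # change attributable to migration

beta, alpha = convergence_slope(hc_birth, delta_hc)
# A negative beta means migration narrowed human capital gaps in the toy data.
```

In this toy example, migration raises the index in the low-skill state and lowers it in the high-skill state, so the estimated slope is negative, which is the pattern the text describes as human capital convergence.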

A Panel Instrument for Housing Prices
The previous section posited an exogenous increase in housing prices in productive areas. In this section, we develop a new instrument for housing prices and test Prediction 4 of the model, which states that convergence should slow the most in places with housing supply constraints. In fact, by splitting the sample into high and low regulation areas, we are able to test for our entire causal chain by showing that housing supply constraints reduce permits for new construction, raise prices, lower net migration, slow human capital convergence and slow income convergence.
Housing prices are, of course, partially determined by demand and thus cannot be used to directly identify a causal change in migration or convergence over time. The desire to identify the causal chain from prices to convergence motivates our search for a supply-side instrument. In our model, housing supply is governed by both a housing supply elasticity parameter β and the density of the city. We think that aggregate changes in density in the U.S. are unlikely to explain shifts in housing supply. 20 In contrast, empirical work has shown tight links between prices and measures of land use regulation in the cross-section. 21 Here, we introduce a new panel measure of housing supply regulations based on state appeals court records. This measure is, to the best of our knowledge, the first panel of housing supply regulations in the United States, and we validate it with existing cross-sectional regulation measures. It offers the first direct statistical evidence that regulation increases covary directly with price increases.
19 We limit the sample to people aged 25 and above so that we can measure completed education. To focus on migration flows for 20-year windows, we analyze the population aged 25-44. To the extent that people migrate before age 25 (or their parents move them somewhere else), we may pick up migration flows from more than twenty years ago.
20 In 1940, log(persons/square mile) had a mean of 3.36 and a standard deviation of 1.41 at the county level. By 2010, the mean density was 3.78, an increase of 1/3 of one standard deviation. Because the heterogeneity in densities across places greatly exceeds the aggregate change in density, it seems unlikely that substantial parts of America have reached a maximum building density.
21 Examples include Glaeser et al. (2005a), Katz and Rosen (1987), Pollakowski and Wachter (1990), and Quigley and Raphael (2005).

Instrument Construction and Reliability
Our measure of land use regulations is based upon the number of state appellate court cases containing the phrase "land use" over time. The phrase "land use" appears 42 times in the seminal Mount Laurel decision issued by the New Jersey Supreme Court in 1975. Municipalities use a wide variety of tactics for restricting new construction, but these rules are often controversial and any such rule, regardless of its exact institutional origin, will likely be tested in court. This makes court decisions an omnibus measure which captures many different channels of restrictions on new construction. We searched the state appellate court records for each state-year using an online legal database. 22 Our regulation measure is constructed to capture the stock of restrictions rather than the flow of new laws. We begin our measure in 1940, and initialize it by using case counts from 1920-1940. Appendix Table 5 shows selected values for each state. One immediate result from constructing this measure is that land use restrictions have become increasingly common over the past fifty years. Figure 9 displays the national regulation measure over time, which rises strongly after 1960 and grows by a factor of four by 2010. 23
We test the reliability of this measure against two national cross-sectional land-use surveys. The first survey, from the American Institute of Planners in 1975, asked 48 questions of planning officials in each state, 21 of which were tied to land use restrictions (The American Institute of Planners (1976)). To build a summary measure, we add up the total number of yes answers to the 21 questions for each state.
As can be seen in Figure 9, the 1975 values of our measure are strongly correlated with this measure.
Thus, the case-count measure was a meaningful proxy thirty-five years ago. Similarly, we compare the regulation measure to the 2005 Wharton Residential Land Use Regulation Index (WRLURI). 24 Finally, we compute residual Wharton regulation by fitting $WRLURI_{s,2005} = \alpha + \beta_1 API_{s,1975} + \varepsilon_s$ and then calculating $\widetilde{WRLURI}_{s,2005} = WRLURI_{s,2005} - \hat{\beta}_1 API_{s,1975}$. We regress this on our regulation measure and again find a strong positive correlation. This is further evidence that the measure truly captures the panel variation in regulations.
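The residualization step can be illustrated as follows. This is a minimal sketch, not the procedure actually run for the paper: the function name and the four-state arrays are hypothetical, and the fit is a plain univariate OLS.

```python
import numpy as np

def residual_wharton(wrluri_2005, api_1975):
    """Fit WRLURI = a + b1 * API + e by OLS, then subtract b1 * API.

    Returns the residualized index and the estimated slope b1.
    """
    b1, a = np.polyfit(api_1975, wrluri_2005, 1)
    return wrluri_2005 - b1 * api_1975, b1

# Hypothetical values for four states (not actual survey data).
api_1975 = np.array([2.0, 5.0, 8.0, 11.0])      # count of "yes" answers
wrluri_2005 = np.array([-1.0, 0.1, 0.4, 1.5])   # Wharton index

resid, b1 = residual_wharton(wrluri_2005, api_1975)

# By construction, the residual is uncorrelated with the 1975 survey measure,
# so any remaining correlation with a panel measure reflects post-1975 variation.
corr_with_api = float(np.corrcoef(resid, api_1975)[0, 1])
```

The design point is that whatever the residual still shares with a regulation measure cannot be explained by the 1975 cross-section alone.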
The instrument is a good predictor of housing prices. We examine the effect of the instrument on housing prices, with state and year fixed effects. In this setting, we find that changes in the instrument are highly correlated with changes in prices, as shown in Appendix Figure 4.

Using the Instrument to Test the Model (Prediction 4)
Having established that our regulation measure is a good instrument for housing supply elasticity, we test its direct effect on the convergence relationship. Before turning towards regressions, we first explore the effect of land-use regulations on convergence graphically. Figure 10 shows differential convergence patterns among the high and low elasticity states. The convergence relationship within the low regulation, high elasticity states remains strong throughout the period. Conceptually, we can think of this group of states as reflecting Proposition 1, with within-group reallocations of people from low-income states to high-income states. In contrast, the convergence coefficients among states with low elasticities or tight regulations display a pronounced weakening over time. The patterns within this group reflect the skill-sorting of Proposition 2, where only high-skilled workers want to move to the high-income states. 25
We now turn towards regressions and explore the effect of regulations on the convergence mechanisms described above. For clarity, we divide our state-year observations into high and low regulation bins based on a fixed cutoff. This division allows us to easily interpret how tight regulations alter the relationships between income and the dependent variables. We set the cutoff equal to the median value of the regulation index in 2005. In total, 12.5% of states fall into the high regulation bin in 1970, and by 1990 this bin contains 40% of the states.
Our specifications are of the following form: The coefficients of interest, β and β high reg , measure the effect of lagged income in low and high regulation state-years and are reported in Table 4.
The first left-hand side variable considered with this specification in Column 1 is housing permits issued relative to the housing stock. Absent land use restrictions, places with higher income will face greater demand for houses and will permit at a faster rate. Accordingly, the uninteracted coefficient on income is 1.87, indicating that the permit rate increases by 1/5th of a standard deviation annually for each ten log points of income. The interaction term β high reg measures the change in the relationship between permits and income for high regulation states. Because the estimated coefficient here is -2.42, permitting is actually negatively related to income among the high regulation state-years.
Column 2 of Table 4 uses the same specification, with log housing prices as the dependent variable.
Whereas regulations weakened (or even reversed) the relationship between income and permitting, we expect regulations to steepen the relationship between income and prices. The uninteracted coefficient displayed in column 2 (1.03) recovers the 1-to-1 relationship between log income and log prices found in the first panel of Figure 4. The interaction term indicates that the relationship between income and prices in high regulation state-years is 1.7, meaning that the dispersion in prices between high and low income states is wider in the high regulation regime. More of the income differences are capitalized into prices in the high regulation regime, just as employment shocks are capitalized into prices in high regulation regimes in Saks (2008).
Column 3 explores what effect regulations have on the link between income and population growth. In our model, states with high income per capita will draw migrants when the elasticity of housing supply is large. The uninteracted coefficient is 2.0, meaning that places that were 10 log points richer averaged population growth that was 0.2 percentage points higher per year over the next twenty years. This is similar to the coefficients found in the strong convergence period in Figure 3. When the elasticity of supply is low, housing prices make moving prohibitively costly and directed migration ceases. The interaction coefficient is large and negative (-2.8). This means that population growth is not predicted by income in the high regulation regime (if anything, they are negatively related), matching the predictions of the model.
In our model, Proposition 2 showed that, when housing supply was perfectly elastic, migration would be skill-neutral and undo any initial human capital advantage held by productive places. We find in Column 4 that the uninteracted coefficient is sizeable and negative, reflecting convergence in human capital in the unconstrained regime. The proposition also showed that as the supply elasticity fell, migration became increasingly skill-biased and its contribution to human capital convergence diminished. The interaction coefficient is sizeable and positive, indicating, as expected, that human capital convergence ceases among high regulation observations.
Finally, column 5 brings this analysis full circle by directly looking at the effect of high regulations on the convergence relationship. Without the interaction terms, equation (7) reduces to a pooled version of the familiar convergence regressions displayed in Figure 2. As expected, the uninteracted coefficient (-2.3 annual rate) captures the strong convergence relationship that exists absent land use restrictions. This coefficient is similar in magnitude to the ones used in the calibrations in section 3. However, the interaction coefficient is large and positive (3.2). This finding indicates that the degree of convergence among states in periods of high regulation is virtually non-existent.
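To make the interaction terms concrete, the implied slope on lagged income in high regulation state-years is the sum of the uninteracted coefficient and the interaction coefficient. The short calculation below uses only the pairs of coefficients quoted in the text for Columns 1, 3 and 5; the dictionary layout is purely illustrative.

```python
# (uninteracted coefficient, high-regulation interaction) as quoted in the text.
reported = {
    "permits (Column 1)": (1.87, -2.42),
    "population growth (Column 3)": (2.0, -2.8),
    "income growth (Column 5)": (-2.3, 3.2),
}

# Implied slope in the high regulation regime = base + interaction.
implied_high_reg = {
    name: round(base + interaction, 2)
    for name, (base, interaction) in reported.items()
}

for name, slope in implied_high_reg.items():
    print(f"{name}: implied high-regulation slope = {slope}")
```

The sign flips in all three cases: permitting and population growth become weakly negatively related to income, and the convergence coefficient moves from negative to slightly positive.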
One concern with this result is that the regulations measure may simply be picking up fixed time-trends or fixed cross-sectional differences across states. To account for this potential error, we conduct three sets of Monte Carlo placebo experiments. In the first set, we randomly reassign the values of the regulation measure across both states and years. In the second, we randomly reassign the regulation measure values across years within a state. Doing so preserves the fixed cross-sectional features of the instrument, but removes its time-series properties. Finally, we randomly reassign the regulation measure across states within a year. Doing so preserves the time-series properties of the instrument, but removes the cross-sectional features. We conduct 100 randomizations for each experiment, and rerun the regression from Column 5 of Table 4. In other words, we test the interaction term in the convergence regression by using this Monte Carlo-generated regulation measure. We plot the cumulative distribution function of estimated coefficients along with the true estimate in Appendix Figure 4. This graph shows that our identification is driven about equally by the cross-sectional variation in regulations and the time series increase in regulations.
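The three reassignment schemes can be sketched as permutations of a state-by-year array. This is an illustrative sketch only: the function names are hypothetical, and `interaction_coef` is a simplified stand-in for the Column 5 regression (a pooled OLS with a high-regulation dummy and its interaction with lagged income), not the exact specification used in Table 4.

```python
import numpy as np

def shuffle_all(reg, rng):
    """Reassign values across both states and years."""
    return rng.permutation(reg.ravel()).reshape(reg.shape)

def shuffle_within_state(reg, rng):
    """Permute years within each state (rows): keeps the cross-section, breaks the time series."""
    return np.stack([rng.permutation(row) for row in reg])

def shuffle_within_year(reg, rng):
    """Permute states within each year (columns): keeps the time series, breaks the cross-section."""
    return np.stack([rng.permutation(col) for col in reg.T]).T

def interaction_coef(growth, lag_income, reg, cutoff):
    """Coefficient on lag_income x 1[reg > cutoff] in a pooled OLS (simplified stand-in)."""
    high = (reg > cutoff).astype(float).ravel()
    x = lag_income.ravel()
    design = np.column_stack([np.ones_like(x), high, x, x * high])
    coef, *_ = np.linalg.lstsq(design, growth.ravel(), rcond=None)
    return coef[3]

def placebo_distribution(growth, lag_income, reg, cutoff, shuffle, n_draws=100, seed=0):
    """Interaction coefficients from n_draws placebo reassignments of the regulation measure."""
    rng = np.random.default_rng(seed)
    return np.array([
        interaction_coef(growth, lag_income, shuffle(reg, rng), cutoff)
        for _ in range(n_draws)
    ])
```

Comparing the estimate from the actual regulation measure with the empirical distribution returned by `placebo_distribution` for each scheme mirrors the cumulative distribution comparison described above.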
These results demonstrate that high regulations restrict supply, increase the capitalization of income differences into prices, reduce directed migration, reduce human capital convergence, and slow the convergence in income. In short, the results test and confirm the mechanisms at work in Propositions 1 and 2.

Calibration and Alternate Explanations
The supply constraints account of the end of directed migration and convergence developed above departs from the conventional interpretation of regional convergence, which focuses on faster capital accumulation in poor states. It also departs from the conventional interpretation of the recent divergence in regional education levels, which focuses on demand-side changes in human capital spillovers. The capital convergence story does a good job of explaining the regional facts up to 1980, and the demand-side story can explain many of the post-1980 regional facts; our model and empirics are able to explain both sets of facts inside a single model.
In mathematical terms, given the neo-classical production function $Y = AK^{\alpha}L^{1-\alpha}$, our supply-side account focuses entirely on changes in L rather than changes in K (capital convergence) or A (labor demand-side changes). In this section, we show that changes in L were sufficiently large to account for changes in convergence, holding K and A fixed. Then, we revisit the role of capital, the role of demand-side changes, and the possibility that U.S. states had already converged by 1980. Finally, we consider whether another shock whose timing and location coincided with the changes in regulation can explain our results.

Calibration: The Role of Changes in Labor
The model developed in Section 2 demonstrates two channels through which migration can affect income convergence. Now, we show that our model delivers a parsimonious, intuitive equation for the effect of migration on convergence, and that migration can explain a quantitatively important fraction of observed income convergence in the past. In terms of the model, a change in log per capita income y can be decomposed into changes in effective labor per capita and changes in population. The model delivers simple formulas for the effect of changes in effective labor per capita and population on per capita income ($\varepsilon^{\text{per capita inc}}_{\text{avg skill level}}$ and $\varepsilon^{\text{per capita inc}}_{\text{pop}}$ from equations (2) and (3)). Note that for comparison, a neo-classical model that had exogenous labor supply and homogeneous labor would have had the smaller impact of $d \log y = -\alpha \, d \log(\sum_k \mu_k)$. Endogenous labor supply and human capital amplify the effects of migration. Converting to regression coefficients, we obtain equation (7), in which the population channel enters as $\varepsilon^{\text{per capita inc}}_{\text{pop}} \, \eta_{\text{Directed Migration}}$.
Population Channel: We estimate population flows using two different net migration series. 26 We estimate all flows in terms of 20-year windows, corresponding to the 20-year windows used for convergence in Figure 2.
26 The first method combines state-level vital statistics on births and deaths with Census data to compute net migration as the change in population net of births and deaths. The second method uses national birth and survival ratios along with Census data to project state population in the absence of migration. Specifically, for each age-sex group k this method calculates a national birth and survival rate $\omega_k^{t,t+10}$ from t to t + 10. It then projects the migration-less state population in year t + 10 as $\widehat{Pop}_{s,t+10} = \sum_k \omega_k^{t,t+10} Pop_{k,s,t}$. Net migration under this method is equal to $NetMig^{\text{Survive}}_{t,t+10} = Pop_{s,t+10} - \widehat{Pop}_{s,t+10}$. All data come from Ferrie (2003), except birth-death method estimates from 1930-1940, which are from Fishback et al. (2006).
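The survival-ratio method for net migration can be sketched as below. This is a simplified illustration under stated assumptions: each national ratio is assumed to map a group's population in year t directly into its projected contribution to population in t + 10, the function name is hypothetical, and the numbers are toy values.

```python
import numpy as np

def net_migration_survival(pop_t, pop_t10, national_ratio):
    """Net migration as observed minus projected (migration-less) population.

    pop_t: (states, groups) population by age-sex group in year t.
    pop_t10: (states,) observed total population in year t + 10.
    national_ratio: (groups,) national birth-and-survival ratio per group.
    """
    projected = pop_t @ national_ratio  # population expected without migration
    return pop_t10 - projected

# Toy example (hypothetical): 2 states, 2 age-sex groups.
pop_t = np.array([[100.0, 200.0], [50.0, 50.0]])
national_ratio = np.array([1.1, 0.9])
pop_t10 = np.array([300.0, 95.0])

net_mig = net_migration_survival(pop_t, pop_t10, national_ratio)
# Positive values indicate net in-migration, negative values net out-migration.
```

Because the ratios are national, any gap between a state's observed and projected population is attributed to migration rather than to state-specific fertility or mortality, which is the main caveat of this method.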
To estimate the effect of initial income on net migration, we regress $\log(NetMig_{t-20,t-10} + NetMig_{t-10,t} + Pop_{t-20}) - \log Pop_{t-20} = \alpha + \eta \log y_{t-20,s} + \varepsilon_s$. The results are shown in Table 2. For the 20-year periods ending in 1950, 1960, 1970 and 1980, there was substantial net migration towards higher-income states. Averaging the coefficients from this period, we find that a doubling of initial income raised the annual net migration rate to a state by 1.85 percentage points.
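The directed-migration regression can be illustrated with a short sketch. The function names are hypothetical and the inputs are assumed to be state-level arrays aligned by state; the regression is a plain univariate OLS with no weights or additional controls, which may differ from the specification behind Table 2.

```python
import numpy as np

def migration_growth(net_mig_first, net_mig_second, pop_start):
    """Log population growth attributable to net migration over a 20-year window."""
    return np.log(net_mig_first + net_mig_second + pop_start) - np.log(pop_start)

def directed_migration_eta(log_income_start, net_mig_first, net_mig_second, pop_start):
    """OLS slope (eta) and intercept (alpha) of migration growth on initial log income."""
    y = migration_growth(net_mig_first, net_mig_second, pop_start)
    eta, alpha = np.polyfit(log_income_start, y, 1)
    return eta, alpha
```

A positive `eta` corresponds to net migration directed towards states with higher initial income.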
Human Capital Channel: We estimate human capital flows by examining the change in human capital due to migration relative to baseline levels and its relation to initial income. Specifically, we regress the migration-induced change in human capital on initial income. Details on the construction of this human capital measure are in the data appendix. The results are shown in Table 3, row (2). Human capital flows from rich states to poor states reached their peak in the 20-year period ending in 1980 or 1990, depending on which measure we use. This convergence-driving flow fell substantially and today ranges from one-third of the peak level to zero, depending on choice of measure.
Until now, we have worked with income as a measure of human capital. Because in our model labor supply is determined endogenously, in principle wage rates are a better analog for skill type. If we assume that reported annual hours is a measure of labor supply, we can use a Mincerian regression to compute an hourly wage rate $\widehat{wage}_k$. In the model with endogenous labor supply, $\widehat{inc}_k = \widehat{wage}_k^{\,1+\varepsilon}$. The results are shown in Table 3, rows (3) through (5). Higher labor supply elasticities exacerbate the impact of existing human capital differences on state incomes. Therefore, convergence in human capital levels implies a greater degree of convergence in state incomes. Additionally, we can multiply the national hourly wage for each skill group by the observed state-level hours distribution, setting $\widehat{inc}_{ks} = \widehat{wage}_k \, hours_{ks}$. Here, we find a sharp reversal in human capital flows, with strong convergence for the period 1960 to 1980, divergence in the period 1970 to 1990, and about zero net human capital flows today. 27
Now we can use the empirical estimates derived above to estimate our key formula (equation 7). Table 5 shows the results of this calibration. Recall that $\varepsilon^{\text{per capita inc}}_{\text{avg skill level}} = \frac{1-\alpha}{1+\alpha\varepsilon}$ and $\varepsilon^{\text{per capita inc}}_{\text{pop}} = -\frac{\alpha(1+\varepsilon)}{1+\alpha\varepsilon}$. We use the standard neo-classical value for α of 0.33.
We consider three different cases for the elasticity of labor supply: static labor supply, $\varepsilon_{\text{LaborSupply}} = 0.6$, and $\varepsilon_{\text{LaborSupply}} = 2.6$. 28 Changing the value of the labor supply elasticity has three different effects on the calibration. First, more responsive labor supply means that wages will change more in response to a population inflow, thereby amplifying $\varepsilon^{\text{per capita inc}}_{\text{pop}}$. Second, more responsive labor supply means that when a low-skilled person moves to a high-productivity place he or she supplies more labor. This effect dampens the human capital convergence channel $\varepsilon^{\text{per capita inc}}_{\text{avg skill level}}$. Finally, more responsive labor supply makes historic human capital differences play a larger role in historic income differences. Thus skill group changes have a larger impact on measured human capital.
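The first two effects can be checked numerically from the two elasticity formulas. The sketch below evaluates them at α = 0.33 for the three cases considered; representing static labor supply by ε = 0 is an assumption made for this illustration, and the function name is hypothetical.

```python
def income_elasticities(alpha, eps):
    """Elasticities of per capita income with respect to average skill and population.

    alpha: capital share in the production function.
    eps: labor supply elasticity (eps = 0 is used here for static labor supply).
    """
    avg_skill = (1 - alpha) / (1 + alpha * eps)
    pop = -alpha * (1 + eps) / (1 + alpha * eps)
    return avg_skill, pop

ALPHA = 0.33
for eps in (0.0, 0.6, 2.6):
    avg_skill, pop = income_elasticities(ALPHA, eps)
    print(f"eps={eps}: avg skill elasticity={avg_skill:.3f}, population elasticity={pop:.3f}")
```

As ε rises, the population elasticity grows in magnitude while the average-skill elasticity shrinks, matching the amplifying and dampening effects described above.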
27 Table 3 also reports robustness checks showing that the broad trends persist even when we omit foreign-born migration or black migration. Finally, the time-series patterns for human capital levels by state of residence (which include endogenous human capital accumulation decisions outside of our model) look somewhat different, but nonetheless feature a convergence in human capital levels in the past, and no convergence today.
28 We consider the following balanced growth path preferences, $\arg\max_{\ell_k} \ln(c) - \frac{\ell_k^{1+1/\varepsilon}}{1+1/\varepsilon}$, under the assumption that β → ∞. Log utility implies static labor supply. Chetty (2012) characterizes $\varepsilon_{\text{LaborSupply}} = 0.6$ as the consensus estimate for the sum of extensive and intensive labor supply elasticities based on a meta-analysis of approximately thirty studies with quasi-experimental variation in tax rates. Borjas (2003)'s estimate of the effect of an increase in labor supply on total income is $\varepsilon_{\text{LaborSupply}} = 2.6$.
Taken together, we find that for the period until 1980, migration can explain 40-75% of convergence and changes in migration can explain all of the changes in convergence, as shown in Table 5. 29

The Role of Capital
Past work, most notably Barro and Sala-i-Martin (1992), has explored whether faster capital accumulation in poor states can explain regional convergence from 1880 to 1980 in a calibrated model. Empirical measures of the state-level capital stock are quite difficult to obtain, but three pieces of evidence suggest a more nuanced analysis of the role of capital in U.S. convergence is needed. 30 First, we analyzed a panel state-level measure of capital using the Census of Wealth, which was constructed using local tax records and conducted on a roughly decennial basis until 1922 (Kuznets et al. (1964)). Because property taxes were the primary source of revenue for state and local governments at this time, we consider this measure to be fairly accurate. The data show mixed evidence for the capital-driven convergence hypothesis, as depicted in Appendix Figure 5. From 1880 to 1900 capital grew most quickly in wealthy states (which contradicts the neo-classical model), and from 1900 to 1920, capital grew most quickly in poor states (which is consistent with the neo-classical model). Income per capita was converging in both periods, and consistent with the explanation put forward in this paper, there was substantial net migration towards wealthy states in both periods as well.
Second, we plot a time series of regional bank lending rates in Appendix Figure 5. If the return on capital is falling in the amount of the capital stock, we should see high interest rates in poor areas and low interest rates in rich areas. In fact, we see high interest rates in the rich West, which is inconsistent with the neoclassical model. Additionally, these regional interest rates largely converged by the end of World War II. Under the neo-classical model, this finding would indicate that capital-labor ratios had converged as well.
Third, in some cases, capital may be a substitute rather than a complement for labor. Recent work by Hornbeck and Naidu (2012) suggests that an outflow of cheap black labor due to a flood in Mississippi in 1927 led to additional capital investment in the agricultural sector in subsequent years. Along similar lines, Lewis (2011) finds that increased low-skilled labor supply in the 1980s and 1990s was a substitute for capital investment in the manufacturing sector. In this case, capital investment would amplify the impact of changes in the labor supply rather than attenuate them.
After assessing the evidence on an international level, Jones and Romer (2010) wrote in a recent review article that "While the textbook transition dynamics-driven by diminishing returns to capital accumulation-are elegant and easy to explain, they are most likely not especially relevant to catch-up growth in practice." More research and more data are needed to seriously study the role of physical capital in U.S. income convergence; the existing data are insufficient to draw strong conclusions about its role.

The Role of Human Capital Spillovers and Steady State Differences
The supply constraints story developed in this paper to explain the end of convergence differs from a demand-side explanation based on the idea that agglomeration economies among skilled workers have
Both hypotheses are consistent with less income convergence. For example, the rise of Silicon Valley likely undermined regional convergence within California. However these two hypotheses generate conflicting predictions regarding migration.
A demand-side theory could take one of two forms: either a general increase in productivity in high BA areas that raises in-migration by all skill groups, or a skill-specific shock benefitting only high-skill workers and raising only their in-migration rate. In contrast, the negative supply shock hypothesis discussed here predicts sharply falling in-migration by low-skill workers and a steady or slightly declining in-migration for high-skill workers.
Indeed, we find in the data (Figure 7) that workers with less than a BA are leaving high nominal income areas, and that high nominal income areas are not growing disproportionately (Figure 3). Although information-economy cities such as San Francisco, Boston and New York offer high nominal wages to all workers (typically in the top quintile nationally), after adjusting for housing costs all three cities offer below average returns to low-skill workers (typically in the bottom decile).
In Appendix Table 6, we examine the flows of low-skilled and high-skilled workers in 1980 and 2010 to 'high skilled' states as measured in 1980. This period and independent variable were chosen to be consistent with the literature on agglomeration economies. We examine two measures of net migration: the first uses total change from people born in-state to people living in-state (similar to Figure 8), and the second examines the choice of destination conditional on the decision to leave one's birth state. Unlike directed migration to high-income places, there is still directed migration to high BA places under this measure, albeit at a far slower rate than in the past. There has been a remarkable shift in the composition of this migration.
Both data series find a large decrease in the in-migration rate of low-skilled workers to high BA states from 1980 to 2010, and a small decline in the in-migration rate of high-skilled workers to high BA states. These comparative statics are more consistent with a shock to housing supply than with a skill-biased shock to labor demand. 31
Finally, differences across states in incomes are much smaller today than they were in the past. Perhaps differences in incomes across states today reflect steady-state differences due to permanent amenity differences. While possible, two pieces of evidence are inconsistent with this suggestion. First, a close examination of Figure 2 shows that from 1940 to 1960 there was within-group convergence among the rich states as well as among the poor states. The income differences between Connecticut and Illinois or Mississippi and Tennessee in 1940 are much smaller than the differences between Connecticut and Mississippi in 1990, and yet we saw substantial within-group convergence from 1940 to 1960 and much less from 1990 to 2010. Second, our analysis with the regulation instrument (e.g. Figure 10) shows substantial within-group convergence in the low regulation group, suggesting that existing income differences are sufficiently large and transitory as to make convergence possible.

Omitted Variable Bias
There are few data to support stories about changes in capital convergence, and the data on net migration seem more consistent with housing supply changes than with demand-side changes in agglomeration.
Nevertheless, it is possible that the change in regulations coincided in space and time with another change that directly affected convergence patterns. For example, perhaps some states had a shock benefitting skilled workers and these workers have a preference for land use regulations. Another less-likely story is that states that disliked immigrant and minority workers implemented land use restrictions (to facilitate residential segregation) and had high rates of hate crimes (which deterred immigrant and minority migration).
To investigate the plausibility of these alternate hypotheses, we once again make use of the rich time-series variation in our regulation measure. Although regulation was low across the board in 1955, there is still cross-sectional variation in our instrument for that year. This variation is predictive of subsequent increases in regulation, and the correlation between the instrument in 1955 and 2005 is 0.45. Dividing the states at the median in 1955, we classify the states above the median in that year as having a latent tendency to regulate.
In Table 6, we explore the effect of this latent tendency to regulate on income convergence before and after 1980. The table demonstrates that these states (like states with high regulation today) displayed similar convergence behavior before 1980. In the period after 1980, once these latent tendencies had been activated in the form of high regulations, these states experience a sizeable drop in their degree of income convergence.
We conduct a similar exercise in the last two columns by splitting states at the median based upon the geographic availability of developable land using data from Saiz (2010). 32 Again, the table demonstrates that states with low geographic land availability did not display different convergence behavior before 1980.
In the period with tight building restrictions after 1980, however, these states also experience a reduction in their rates of income convergence. The similarity of these patterns is striking given that the correlation between our two predetermined elasticity measures (the high latent regulation and low geographic land availability dummies) is only 0.33. This evidence raises the bar for alternative explanations. In the two examples given above, regulations were generated as the by-product of alternate shocks. However, Table 6 shows that an alternate hypothesis must explain not only why regulations are correlated with an unrelated convergence-ending shock, but also why this new shock is correlated with states' geography and historical legal structures. Moreover, these explanations have to address the fact that neither feature influenced convergence rates prior to the period of high land use regulation. Although it is possible to generate such an explanation, articulating such a story is sufficiently complicated that we feel these facts strongly suggest the simpler explanation developed here.

Conclusion
For more than 100 years, per-capita incomes across U.S. states were strongly converging and population flowed from poor to wealthy areas. In this paper, we argue that these two phenomena are related. By increasing the available labor in a region, migration drove down wages, reduced labor supply and induced convergence in human capital levels. Using a simple model we find that, unlike much of the prior literature, migration can account for 40-75% of the observed convergence and all of the observed change in convergence.
Over the past thirty years, both the flow of population to productive areas and income convergence have slowed considerably. We show that the end of directed population flows, and therefore the end of income convergence, can be explained by a change in the relationship between income and housing prices. Although cross-sectional housing prices have always been higher in richer states, housing prices now capitalize a far greater proportion of the income differences across states.
32 We population-weight metro areas using 1960 counts to derive state-wide averages.
In our model, a reduction in the elasticity of housing supply in rich areas shifts the economy from one in which labor markets clear through net migration to one in which labor markets clear through skill-sorting.
As prices rise the returns to living in productive areas fall for low-skilled households, and their migration patterns diverge from the migration patterns of the high-skilled households. We find patterns consistent with these predictions in the data.
To identify the effect of these price movements, we introduce a new panel instrument for housing supply. Prior work has noted that land-use regulations have become increasingly stringent over time, but panel measures of regulation were unavailable. We create a proxy for these measures based on the frequency of land-use cases in state appellate court records. First, we find that tighter regulations raise the extent to which income differences are capitalized into housing prices. Second, tighter regulations impede population flows to rich areas and weaken convergence in human capital. Finally, we find that tight regulations weaken convergence in per capita income. Indeed, though there has been a dramatic decline in income convergence nationally, places that remain unconstrained by land use regulation continue to converge at similar rates.
These findings have important implications not only for the literature on land-use and regional convergence, but also for the literature on inequality and segregation. A simple back of the envelope calculation shown in Appendix Table 7 finds that cross-state convergence accounted for approximately 30% of the drop in hourly wage inequality from 1940 to 1980 and that had convergence continued apace through 2010, the increase in hourly wage inequality from 1980 to 2010 would have been approximately 10% smaller. The U.S. is increasingly characterized by segregation along economic dimensions, with limited access for most workers to America's most productive cities. We hope that this paper will highlight the role land-use restrictions play in supporting this segregation.

Downward-Sloping Product Demand, Population Flows, and Convergence
In Section 2, we developed a model where downward-sloping labor demand came from the assumption of a production function $Y = AL^{1-\alpha}$ that had decreasing returns to scale in labor. Here we show that downward-sloping labor demand can also come from a production function with constant returns to scale ($Y = AL$), combined with elastic product demand and monopolistic competition.

Individual Decisions: Labor Supply and Product Demand
Individuals $i$ in the region "home" consume a basket of differentiated goods $\{x_j\}$, one from each region $j \in [0, 1]$. Individuals choose consumption and labor supply, taking the local price for labor $w$ and the national prices for products $\{p_j\}$ as exogenous. Equation (8) holds for all markets $j \in [0, 1]$. We now apply the standard Dixit-Stiglitz solution techniques to derive the demand for any individual good $j$ in terms of its own price $p_j$, household income $w_i l_i$ and the aggregate price index $P$. The first-order conditions pin down the ratio of an individual's consumption of any two goods. Recall that $l_i$ is actually $l_i^*(w)$ from equation (8), which governs labor supply. We now substitute in for labor supply from above to write an individual's demand for good $x_j$ in terms of prices and $\xi_i$, where $\xi_i$ is a scaling of household marginal utility.

Firm Decisions: Product Supply and Labor Demand
We assume that each region has a single firm $j$, which takes the national demand curve and local wages as exogenous. As before, we suppress the notation for the location of the home firm throughout. Firms produce using the constant returns to scale production function $q_j = A L_j$. The firm serves the national market but hires labor locally ($L_j$) at wage $w_j$, and it solves a maximization problem over $p_j$, $l_j$, and $q_j$. Having derived the optimal prices, we can determine output by substituting the price FOC back into equation (10) for consumer demand. We can integrate over all the individuals $i$ to calculate an aggregate demand curve for good $j$. Inverting the production function $q = AL$ gives a company's labor demand as a function of wages and downward-sloping demand for its good.

Labor Market Equilibrium
Recall that labor supply is given by the individual labor supply decision (equation (8)) times the share of individuals µ j in the regional market.
Now we can equate labor supply from equation (11) and demand from equation (12) to solve for the market-clearing wage. Recall equation (9), that consumer $i$'s demand for good $j$ is $x_{ij} = p_j^{-\sigma} w l_i / P^{1-\sigma}$. Plugging the demand equation into the marginal utility expression shows that $\xi$ is a function of prices which are exogenous from the perspective of the home region, meaning that it cancels from both sides of the labor-market clearing condition. This means we can solve for the market-clearing wage $w$ in terms of exogenous parameters.

Market-clearing
With the market-clearing wage, we can go back to the individual labor supply condition (equation (8)) to solve for per capita income

Comparative Static
We are interested in the impact of a population change in the home region on local per-capita incomes, or mathematically, $\partial (w^* l^*)/\partial \mu$. $A$, $P$, $\sigma$, $\xi$ and $\varepsilon$ are exogenous parameters or functions of nation-wide variables. From equation (13) we obtain an elasticity of per capita income with respect to population, where $0 < \mu < 1$, $\varepsilon > 0$, and $\sigma > 1$. This elasticity is the natural counterpart to $\varepsilon^{\text{per cap income}}_{\text{population}} = -\alpha \frac{1+\varepsilon}{1+\alpha\varepsilon}$ in equation (2). We can interpret this elasticity intuitively. When the labor supply elasticity is high, inflows have a bigger impact on income because a small increase in labor supply greatly bids down the price of labor. When a monopolistic region faces a less elastic demand curve ($\sigma \sim 1$), it will not increase production much in response to a migration-induced decrease in the cost of labor. As a result, incomes will fall to a greater degree if the demand curve is more inelastic ($\sigma$ is lower). In this way, monopolistically competitive markets can provide a microfoundation for the result of downward-sloping labor demand.
For a standard calibration, in which $\varepsilon = 0.6$ and $\sigma = 1.6$,$^{33}$ we have $\varepsilon^{\text{per cap income}}_{\text{population}} = -0.73$. Note that this value is larger in absolute value than the elasticities reported in Table 4, and implies that net migration prior to 1980 can account for two-thirds of the observed convergence in that period. An extended version of this model that allows for home-bias or non-traded goods could be constructed, but this example demonstrates that our results are predicated upon downward-sloping labor demand, not decreasing returns to scale.
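As a rough numerical check on this calibration, the sketch below evaluates equation (2) alongside a candidate monopolistic-competition counterpart. The closed form $-(1+\varepsilon)/(\sigma+\varepsilon)$ is an assumption of this sketch (equation (2) with $\alpha$ replaced by $1/\sigma$), not a formula quoted from the derivation; it is used here because it reproduces the reported value of $-0.73$ at $\varepsilon = 0.6$ and $\sigma = 1.6$. Function names are illustrative.

```python
def elasticity_decreasing_returns(alpha: float, eps: float) -> float:
    """Equation (2): elasticity of per capita income w.r.t. population
    under the production function Y = A * L**(1 - alpha)."""
    return -alpha * (1 + eps) / (1 + alpha * eps)


def elasticity_monopolistic(sigma: float, eps: float) -> float:
    """ASSUMED counterpart under Y = A * L with demand elasticity sigma:
    equation (2) with alpha replaced by 1 / sigma."""
    return -(1 + eps) / (sigma + eps)


eps, sigma = 0.6, 1.6
value = elasticity_monopolistic(sigma, eps)
print(round(value, 2))  # -0.73

# The two forms coincide when alpha = 1 / sigma.
same = abs(elasticity_decreasing_returns(1 / sigma, eps) - value) < 1e-12

# Comparative statics described in the text: the income response is larger
# in absolute value when labor supply is more elastic or demand is less elastic.
more_elastic_labor = elasticity_monopolistic(sigma, 2.6)
less_elastic_demand = elasticity_monopolistic(1.1, eps)
```

Under this assumed form, raising $\varepsilon$ or lowering $\sigma$ toward 1 makes the elasticity more negative, which matches the intuition given above.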

Endogenous Housing Demand
In Section 2, we derived a model in which all individuals had the same demand for one plot of land, which meant that housing demand was completely inelastic. That assumption is convenient for exposition, but not necessary for our results. Here we derive the two propositions in the text while allowing for endogenous, non-homothetic housing demand.

Individual Decisions: Labor Supply and Housing Demand
We assume quasi Stone-Geary preferences, with a minimum housing requirement of $H$ per worker. Workers solve a utility maximization problem whose first-order conditions imply ratios between labor, consumption, and housing. We can use these ratios along with the household budget constraint to derive expressions for household allocations in terms of prices and parameters. We now simplify the analysis by assuming = 1, which lets us apply the quadratic formula to the previous equation.

Indirect Utility
Given these allocations, household utility can be expressed in terms of parameters and prices. Note that if $H = 0$ and $p = 1$, then $\Lambda = (1 - \beta) w \psi$. As a result, indirect utility takes a simple form. An individual's indirect utility, and therefore the attractiveness of migration, increases in the city wage $w$. Further, if we now assume that the utility in Reservationville is proportional to $\ln(\psi)$, and that moving costs are additive, migration is once again independent of skill type. Because we did not change the firm side of the economy, equation (14) is sufficient for establishing the results derived in Proposition 1.

Comparative Statics for Housing Prices and Returns to Migration
To compute the effect of housing prices on migration incentives, we take the derivative of indirect utility with respect to housing prices.
Equation (15) demonstrates that higher housing prices make moving to the North less attractive for all types.
We explore how this effect varies by skill type by taking the cross-partial of this derivative with respect to an individual's skill type $\psi$. Equations (15) and (16) establish that higher prices reduce utility, and hence the incentive to migrate. This mechanism has less of an impact on the utility of higher skilled agents, which replicates Proposition 2. Intuitively, the result emerges because the minimum housing requirement causes an individual's housing share of expenditure to fall with income. Therefore, higher housing prices induce a larger utility loss for poorer agents.
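The intuition can be illustrated with a deliberately simplified sketch. It assumes plain Stone-Geary utility $U = (1-\beta)\ln c + \beta \ln(h - H)$ with exogenous income and no labor supply choice, which is simpler than the quasi Stone-Geary problem used in this appendix; the parameter values and function names are hypothetical.

```python
import math


def allocations(income, price, h_min, beta):
    """Optimal (consumption, housing) under the ASSUMED utility
    U = (1 - beta) * ln(c) + beta * ln(h - h_min), budget c + price * h = income."""
    surplus = income - price * h_min  # income left after minimum housing
    if surplus <= 0:
        raise ValueError("income must exceed the cost of minimum housing")
    consumption = (1 - beta) * surplus
    housing = h_min + beta * surplus / price
    return consumption, housing


def housing_share(income, price, h_min, beta):
    """Share of income spent on housing at the optimum."""
    _, housing = allocations(income, price, h_min, beta)
    return price * housing / income


def indirect_utility(income, price, h_min, beta):
    c, h = allocations(income, price, h_min, beta)
    return (1 - beta) * math.log(c) + beta * math.log(h - h_min)


def utility_loss(income, p_old, p_new, h_min, beta):
    """Utility lost when the housing price rises from p_old to p_new."""
    return (indirect_utility(income, p_old, h_min, beta)
            - indirect_utility(income, p_new, h_min, beta))


BETA, H_MIN = 0.3, 5.0  # hypothetical parameter values
share_low = housing_share(20.0, 1.0, H_MIN, BETA)
share_high = housing_share(100.0, 1.0, H_MIN, BETA)
loss_low = utility_loss(20.0, 1.0, 2.0, H_MIN, BETA)
loss_high = utility_loss(100.0, 1.0, 2.0, H_MIN, BETA)
```

In this example the lower-income household spends a larger share of income on housing and loses more utility when the housing price doubles, which is the pattern the cross-partial describes.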

Proof of Proposition 2
We again assume that $x \geq 0$, $\Delta(k)_{North} > 0$ for all $k$, and that $N_N > 1$. Thus, $(N_N^{1/\beta} - 1)$ is continuous and strictly monotonic in $\beta$, and $\lim_{\beta \to 0} (N_N^{1/\beta} - 1) = \infty$. By Bolzano's theorem, a strictly monotonic and continuous function will intersect a fixed value at most once on the half-closed interval bounded by 0. The indirect utility excluding housing in the North, $\Delta(k)_{North}$, is by definition continuous and strictly monotonic in $\psi$. As a result, there is a unique $\beta_k^*$ for which the left-hand side of equation (5) equals zero for each skill type, and this unique value $\beta_k^*$ is decreasing in $k$. For $\beta > \beta_k^*$, the left-hand side of equation (5) is strictly positive, implying in-migration for skill type $k$. For $\beta < \beta_k^*$, the left-hand side of equation (5) is strictly negative, implying out-migration for skill type $k$. The first case, where $\beta \to \infty$, is proven in Proposition 1.

Human Capital Convergence
Calibrating the extent of human capital convergence across multiple years for all workers is more involved than the reduced forms for people aged 25-44 that are discussed in the text. For each year $t$, we estimate the returns to human capital using a regression specification. For all human capital analyses (Table 3 and Appendix Table 2) as well as inequality analysis (Appendix Table 7), we winsorize income or wages at the 1st and 99th percentile. Skill level $k$ is defined empirically as the interaction of completed schooling levels (0 or NA, Elementary, Middle, Some HS, HS, Some College, College+), an age dummy (25-44 or 45-64), a dummy for black and a dummy for Hispanic. We also run the same regression with $\log(\text{WageIncome}_{ik} / \text{AnnualHours}_{ik})$ as the dependent variable. We made every effort to code annual hours consistently across years. Because hours last week are reported in intervals in 1960 and 1970, we code each observation using the sample mean of hours within that interval from other years. In 2000 and 2010, the hours data come from the usual number of hours per week. Note that $s^{25-44}_{st,nonblack} + s^{45-64}_{st,nonblack} < 1$, which is different from the baseline setting. We follow the same procedure to analyze the role of foreign-born migration. We also conduct robustness checks where we characterize the overall change in human capital (including within-state accumulation) as $\log(HC_t) - \log(HC_{t-20})$.
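Two of the steps above can be sketched in a few lines: winsorizing at the 1st and 99th percentiles, and aggregating estimated skill premia into a state-level index of the form $\sum_k \exp(\alpha_k) \times Share_{ks}$ (the form given in the figure notes). The nearest-rank percentile rule, the function names, and the numbers are assumptions made for illustration; they are not the paper's code or data.

```python
import math


def nearest_rank(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (an assumed convention)."""
    n = len(sorted_values)
    rank = math.ceil(pct * n / 100)
    return sorted_values[min(n - 1, max(0, rank - 1))]


def winsorize(values, lower_pct=1.0, upper_pct=99.0):
    """Clip values to the [lower_pct, upper_pct] percentile range."""
    ordered = sorted(values)
    lo = nearest_rank(ordered, lower_pct)
    hi = nearest_rank(ordered, upper_pct)
    return [min(max(v, lo), hi) for v in values]


def human_capital_index(log_premia, shares):
    """Sum over skill groups k of exp(alpha_k) * share_k for one state."""
    return sum(math.exp(a) * s for a, s in zip(log_premia, shares))


# Hypothetical incomes with one extreme outlier.
incomes = list(range(1, 100)) + [10_000]
clipped = winsorize(incomes)

# Two hypothetical skill groups with premia exp(alpha) = 1 and 2.
alphas = [math.log(1.0), math.log(2.0)]
hc_state_a = human_capital_index(alphas, [0.5, 0.5])
hc_state_b = human_capital_index(alphas, [0.2, 0.8])
```

Holding the premia fixed and varying only the shares, as here, isolates differences in skill composition across states.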

Labor Market Area Analysis
For the analysis in Appendix Table 1, Panel C, we construct a panel of income and population at the Labor Market Area (LMA) level. LMAs are linked by intercounty commuting flows and partition the United States (Tolbert and Sizer, 1996). LMA population is constructed simply by adding the population of constituent counties. LMA income is estimated as the population-weighted average of county-level income. The income series uses median family income from 1950-2000 from Haines (2010) and USACounties (2012). In 1940 and 2010, the series is unavailable. In 1940, we use pay per manufacturing worker from Haines (2010). Pay per manufacturing worker had a correlation of 0.77 with median family income in 1950, a year when both series were available. In 2010, we use median household income from USACounties (2012), which had a correlation of 0.98 with median family income in 2000, a year when both series were available.
For the analysis in Appendix Table 6, we use 1980 and 2000 because 1970 doesn't have geographically coded migration, 1960 lacks any sub-state geographies, and 1950 and 2010 record migration relative to 1 year ago. We construct a consistent geographic coding at the LMA level. Sometimes, multiple LMAs will be associated with a single PUMA or county group. In that case, we use population weights to probabilistically assign observations to LMAs. For example, suppose a person lived in a PUMA which has 75% of its population in LMA A1 and 25% in LMA A2. She moved to a PUMA that was equally divided between LMA B1 and LMA B2. We form four migration records: 3/8 weight A1 -> B1, 3/8 weight A1 -> B2, 1/8 weight A2 -> B1, and 1/8 weight A2 -> B2, and multiply each of these weights by the already-given person weights. We then use current LMA of residence and LMA of residence 5 years ago to compute net migration counts for people with and without BAs. Finally, we compute a net migration rate for each skill group relative to the total LMA population 5 years ago.
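The weighting in this example can be sketched as follows. The function name and data layout are hypothetical; the sketch simply multiplies origin and destination population shares and the person weight, using exact fractions so the weights match the 3/8 and 1/8 values in the text.

```python
from fractions import Fraction
from itertools import product


def split_migration_record(origin_shares, dest_shares, person_weight=1):
    """Split one migrant into weighted origin-LMA -> destination-LMA records.

    origin_shares / dest_shares map LMA labels to the share of the PUMA's
    population living in that LMA (each mapping sums to 1).
    """
    records = {}
    for (origin, o_share), (dest, d_share) in product(
        origin_shares.items(), dest_shares.items()
    ):
        records[(origin, dest)] = o_share * d_share * person_weight
    return records


# The example from the text: 75% / 25% at origin, 50% / 50% at destination.
origin = {"A1": Fraction(3, 4), "A2": Fraction(1, 4)}
dest = {"B1": Fraction(1, 2), "B2": Fraction(1, 2)}
records = split_migration_record(origin, dest)

for (o, d), weight in records.items():
    print(f"{o} -> {d}: {weight}")
```

Because the shares on each side sum to one, the four record weights sum to the original person weight, so the split does not change total migration counts.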

Convergence Rates Over Time
Notes: The y-axis in the first two panels is the annual growth rate of income per capita. The third panel plots coefficients from 20-year rolling windows. The larger dark red and light purple dots correspond to the coefficients from the first two panels.

Convergence and Directed Migration Rates Over Time
Notes: The y-axis in the first two panels is the annual growth rate of log population. The third panel plots coefficients from 20-year rolling windows for population changes and income changes. The larger dark red and light purple dots correspond to the coefficients from the first two panels.

Timeseries of Coefs
Notes: The first two panels regress median housing value on income per capita at the state level. The third panel plots coefficients from 20-year rolling windows. The larger dark red and light purple dots correspond to the coefficients from the first two panels.

Notes: This figure plots the relationship between unconditional state average household income and average state skill-specific income net of housing costs. Each panel stratifies households with at least one labor force participant aged 25-65 into 20 quantiles on state-wide average household wage income. It then computes the mean household wage income net of housing cost for high- and low-skilled households within each quantile after controlling for household demographics (see text for details). Housing costs are defined as 5% of house value for homeowners and 12X monthly rent for renters. High-skilled households in 2010 are defined as households in which all adult workers have 4+ years of college and low-skilled households are defined as households in which no adult worker has this level of education. High-skilled households in 1940 are defined as households in which all adult workers have 12+ years of education and low-skilled households are defined as households in which no adult worker has this level of education. The roughly 15% of mixed skill-type households in each year are dropped from the construction of the dependent variable, but not from the computation of unconditional state average income.

Real Income and Net Migration
Notes: These panels plot net migration as a fraction of the population for 467 state economic areas (SEAs) in the 1940 IPUMS Census extract. Each panel stratifies the SEAs into 20 quantiles by income, weighting each SEA by its population, and then computes the mean net migration as a fraction of the total initial population within each quantile. The two panels on the top plot net migration as a function of the mean log household wage income in the SEA for non-migrating households with at least one labor force participant aged 25-65, for individuals with less than 12 years of education (left) and those with 12+ years (right). The two panels on the bottom plot the migration rates for these skill groups against the log skill-group mean value of household wage income net of housing costs for households with one labor force participant aged 25-65. Housing costs are defined as 5% of house value for homeowners and 12X monthly rent for renters. All population counts are for people aged 25-65.

Real Income and Net Migration
Notes: These panels plot net migration as a fraction of the population for 1,020 migration PUMAs in the 2000 IPUMS 5% Census extract. Each panel stratifies the PUMAs into 20 quantiles by income, weighting each PUMA by its population, and then computes the mean net migration as a fraction of the total initial population within each quantile. The two panels on the top plot migration rates as a function of mean log household wage income in the PUMA for non-migrating households with at least one labor force participant aged 25-65, for individuals with less than 4 years of college (left) and with 4+ years (right). The two panels on the bottom plot the migration rates for these skill groups against the skill-group median value of household wage income net of housing costs for households with one labor force participant aged 25-65. Housing costs are defined as 5% of house value for homeowners and 12X monthly rent for renters. All population counts are for people aged 25-65.

Extent of Human Capital Convergence due to Migration Over Time
Notes: Human capital index is estimated by regressing $\log Inc_{ik} = \alpha_k + X_{ik}\beta + \varepsilon_{ik}$, and then constructing $\text{Human Capital}_s = \sum_k \exp(\alpha_k) \times Share_{ks}$. We separately estimate the human capital index by state of residence and by state of birth, to develop a no-migration counterfactual. The top panels show figures from a regression of $HumanCap_{s,res} - HumanCap_{s,birth} = \alpha + \beta HumanCap_{s,birth} + \varepsilon_s$ in 1960 and 2010. The skill premium ($\{\alpha_k\}$) is estimated in the 1980 Census, and these estimates are applied to skill shares in different Census years. The sample is aged 25-44 to capture human capital convergence from the prior 20 years, and because this group has much higher migration rates than people ages 45-64. See the data appendix for details. The bottom panel plots a timeseries of coefficients. The larger dark red and light purple dots correspond to the coefficients from the first two panels.

The bottom panel depicts the coefficients from running $\Delta Inc_{s,t} = \alpha_t + \beta Inc_{s,t-20} + \varepsilon_{s,t}$ over rolling twenty year windows. The regressions are estimated separately for two equally sized groups of states, split along their measure of land use regulations from the legal database. The groups are split at the median every year, so the composition of the groups changes over time.

1990Q1, 1991Q2, 1992Q3, 1993Q4. Housing consumption is defined following the NBER convention. The data and documentation are available at: http://www.nber.org/data/ces_cbo.html. The CPI is used to adjust income and consumption values to 2012 dollars. Individuals are grouped into equal-weighted bins that are fitted with a quadratic. Individuals with less than $10,000 in total consumption are excluded.

Notes: The top panels reconstruct Figure 2, relabeling states according to their housing supply elasticity in Saiz (2010). We weight the time-invariant MSA-level measures from Saiz by population to produce state-level estimates and impute a value for Arkansas based on neighboring states. Blue states (lower case initials) have above median elasticities and purple states (upper case initials) have below median elasticities.
The bottom panel depicts the coefficients β from running: ∆Inc s,t = α t + βInc s,t−20 + ε s,t over rolling twenty year windows. The regressions are estimated separately for two equally sized groups of states.
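A minimal sketch of the rolling-window regression described in these notes, using a plain bivariate OLS slope. With a single cross-section per window, $\alpha_t$ is just the intercept. The panel below is synthetic and the function names are hypothetical; to mirror the split-sample figures, the function would be called once per group of states.

```python
def ols_slope(x, y):
    """Slope coefficient from a bivariate OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def rolling_convergence(log_income_by_year, window=20):
    """For each end year t with data at t - window, regress the change
    Inc_t - Inc_{t-window} on Inc_{t-window} and return the slope beta."""
    coefs = {}
    for end_year, later in log_income_by_year.items():
        start_year = end_year - window
        if start_year in log_income_by_year:
            initial = log_income_by_year[start_year]
            change = [b - a for a, b in zip(initial, later)]
            coefs[end_year] = ols_slope(initial, change)
    return coefs


# Synthetic log incomes for three hypothetical states (not real data).
panel = {
    1940: [1.0, 2.0, 3.0],
    1960: [1.3, 2.1, 2.9],  # change = 0.5 - 0.2 * initial income
    1980: [1.3, 2.1, 2.9],  # no further change, so no convergence
}
coefs = rolling_convergence(panel)
```

In this synthetic panel the first window has a negative slope (convergence) and the second has a slope of zero, which is the kind of flattening the figures document.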

APPENDIX FIGURE 5
Capital Convergence, 1880-1920, and Interest Rates, 1880-2002
Notes: The first panel shows capital convergence (or lack thereof) between 1880 and 1920 using estimates from Kuznets et al. (1964). The original source for the capital estimates is the Census of Wealth.
Data series were assembled by Landon-Lane and Rockoff (2007) (2003) and Fishback et al. (2006)

Notes: Robust standard errors are shown below coefficients. Birth-death method uses state-level vital statistics data to calculate net migration as $ObservedPop_t - (Pop_{t-10} + Births_{t,t-10} - Deaths_{t,t-10})$. Survival ratio method computes counterfactual population by applying national mortality tables by age, sex, and race to the age-sex-race Census counts from 10 years prior. The dependent variable for the last two rows is $\log(\text{net migration}_{t,t-20} + pop_{t-20}) - \log(pop_{t-20})$. Both series end in 1990.

(2) Replicates specification (1), with y variable as $\log(\Delta HumanCapital + HumanCapital_{t-20}) - \log(HumanCapital_{t-20})$ and x variable as Log Inc Per Cap$_{t-20}$.
(3, 4, 5, 6) Replicates row 2, with alternative labor supply assumptions. For rows 3-5, $HC_{it} = \sum_j wage_{jt}^{1+\varepsilon} \times share_{ijt}$, where wage is calculated as annual wage income/annual hours and $\varepsilon$ is parameterized to reflect labor supply differences from the model (Section 2.1.1). Row 6 uses $HC_{it} = \sum_j wage_{jt} \times share_{ijt} \times hours_{ijt}$, where $hours_{ijt}$ reflects mean hours worked by skill group $j$ in state $i$.
(7, 8) Replicates specification (2), assuming no change in human capital levels for blacks and foreign-born respectively.
(9) $\log(HumanCapital_t) - \log(HumanCapital_{t-20})$. Unlike the measure developed above, which focuses exclusively on changes due to migration, this measure includes changes in human capital accumulation by nonmovers.

Notes: The table reports the coefficients $\beta_1$ and $\beta_2$ from regressions of the form: $\Delta \ln y_{it} = \alpha_t + \alpha_t I(reg > x) + \beta_1 \ln y_{it-1} + \beta_2 \ln y_{it-1} \times I(reg > x) + \varepsilon_{it}$. The construction of the land use regulation measure is described in the text, and the 'high regulation' cutoff value is set to the measure's 2005 median value. The dependent variables are new housing permits from the Census Bureau, the median log housing price from the IPUMS Census extracts, population change, the log change in human capital due to migration (as explained in section 5.1), and the log change in per-capita income. Standard errors clustered by state. *** p<0.01, ** p<0.05, * p<0.1

Notes: This table estimates the role of migration in income convergence. Migration drives convergence through population flows from poor to rich states (measured in Table 1) and human capital flows from rich to poor states (measured in Table 2). The effect of these changes on income per capita is calibrated using the model (equation 5) and different assumptions on the elasticity of labor. We consider three scenarios: (1) balanced growth preferences, (2) a labor supply elasticity of 0.6, and (3) a labor supply elasticity of 2.6.
Notes: The table reports the coefficients $\beta_1$ and $\beta_2$ from regressions of the form $\Delta \ln y_{it,t-25} = \alpha_1 + \alpha_2 I(reg > x) + \beta_1 \ln y_{it-25} + \beta_2 \ln y_{it-25} \times I(reg > x) + \varepsilon_i$ for the periods 1955-1980 and 1980-2005. In all columns the states are divided into equally-sized high and low elasticity groups. The first two columns divide states based on the value of their regulation instrument in 2005. The second two columns divide states based on the value of their regulation instrument in 1955 ("latent regulation"). The final two columns divide states based on the population-weighted land availability constructed from Saiz (2010).

Measurement Error in Convergence and Labor Market Area-Level Analysis
Notes: Panel A. This panel reports the standard deviation of log income per capita across states. This corresponds to the σ-convergence concept in Barro and Sala-i-Martin (1992). Panel B. Table 2 calculated convergence coefficients using data on personal income from the BEA. That specification is biased in the presence of classical measurement error. We address the bias issue by instrumenting for the BEA measure using an alternative Census measure and vice versa. The Census measure is log wage income per capita for all earners, except in 1950 where it is only household heads. The first stage F-statistics range from 189 to 739. Classical measurement error is not an issue in these IV regressions, and the convergence coefficients display a similar time-series pattern. Panel C. This panel replicates the "OLS Census" specification from this table and the "Δ Log Pop" specification from

Notes: Returns to education are expressed relative to workers with 12 years of education and a high school degree. Income sample is all people ages 25-64, except 1950, when only incomes of household heads were recorded. Earnings sample is all people with positive wage income. Income and earnings are winsorized at 1st and 99th percentile to minimize the influence of outliers. a. Returns to education in terms of annual wage income are calculated using a Mincerian regression. The specification generally follows Delong, Goldin and Katz (2003), with dummies for black, Hispanic, and foreign-born, and a quartic in experience interacted with female. b. Mincerian regression uses annual wage income divided by annual hours.

Returns to Living in a High Income State by Skill
Income Net of Housing Costs
Notes: All standard errors are clustered by state. *** p<0.01, ** p<0.05, * p<0.1
Panel A. This panel reports the coefficients $\beta_1$ and $\beta_2$ from the regression $Y_i - P_i = \alpha + \gamma Skill_i + \beta_1 Y + \beta_2 Y \times Skill_i + \delta X_i + \varepsilon_i$, where $Y_i$ and $P_i$ measure household wage income and housing costs respectively, $Y$ measures average state income and $X_i$ are household covariates. Household $Skill_i$ is the fraction of household adults in the workforce who are skilled, defined as 12+ years of education in 1940 and 16+ years thereafter. Household covariates are the size of the household, the fraction of adult workers who are black, white, and male, and a quadratic in the average age of adult household workers. Housing costs $P_i$ are defined as 5% of house value for homeowners or 12 times monthly rent for renters.
Panel B. The IV regressions replicate panel A, but instrument for average state income and its interaction with household skill using the average income of the state of birth of adult household workers. The first stage F-statistics in these regressions exceed 80. 1950 is omitted since income data are available only for household heads.
Panel C. This panel reports the coefficients $\beta_1$ and $\beta_2$ from the regression $\log(P_i) = \alpha + \gamma Skill_i + \beta_1 \log(Y) + \beta_2 \log(Y) \times Skill_i + \delta X_i + \varepsilon$.
APPENDIX  Figure 6. The second column shows the effect of doubling the housing costs described in the text to control for non-housing price differences across places. This reduces the number of SEAs for which log income net of housing can be calculated to 455 in Panel A and 462 in Panel B. The third column excludes intra-state migrants in calculating net-migration rates. The fourth column excludes non-white migrants in calculating net-migration rates. The final measure calculates migrants as the number of residents residing outside their state of birth (N=465 in Panel B). Additional details are presented in the text. Standard errors clustered by state. *** p<0.01, ** p<0.05, * p<0.1

Log Nominal Income
Log Group-Specific Income Net of Housing
APPENDIX
The table regresses 5 year net-migration rates on average income (top rows) and skill-specific income net of housing (bottom rows) for Public Use Micro-data Areas. The baseline case reproduces the results in Figure 7. The second column shows the effect of doubling the housing costs described in the text to control for non-housing price differences across places. The third column excludes intrastate migrants in calculating net-migration rates. The fourth column excludes non-white migrants in calculating net-migration rates. The final measure calculates migrants as the number of residents residing outside their state of birth. Additional details are presented in the text. Standard errors clustered by state. *** p<0.01, ** p<0.05, * p<0.1

Notes: This table examines differences by skill group and over time in migration to high BA states. Panel A measures net migration of 25-44 year olds relative to state of birth as a share of the state's total population. There is one observation per state, and robust SE are in parentheses. This measure is attractive because it captures both the decision to migrate and the choice of destination, but it is sensitive to differential trends in domestic BA production in the presence of non-economic migration. Panel B corrects for this issue and focuses on choice of destination among those who choose to migrate within the 48 continental states. Each observation is a state of origin by state of destination pair. We examine whether people who migrate are disproportionately attracted to states with high share BA. We normalize each observation by subtracting the ratio of the population of the destination state to the population of all states (dropping the population of the state of origin). Observations are weighted by the total number of migrants from the origin state, and the standard errors are clustered by destination. Share BA is calculated using people ages 25-65.
Low-skill is defined as having less than a BA. High skill is defined as having a BA or higher. *** p<0.01, ** p<0.05, * p<0.1