Export Versus FDI with Heterogeneous Firms

* Helpman: Department of Economics, Harvard University, Cambridge, MA 02138, Tel Aviv University, and CIAR (e-mail: ehelpman@harvard.edu); Melitz: Department of Economics, Harvard University, Cambridge, MA 02138, National Bureau of Economic Research, and Centre for Economic Policy Research (e-mail: mmelitz@harvard.edu); Yeaple: Department of Economics, University of Pennsylvania, 3718 Locust Walk, Philadelphia, PA 19104, and National Bureau of Economic Research (e-mail: snyeapl2@ssc.upenn.edu). The statistical analysis of firm-level data on U.S. Multinational Corporations reported in this study was conducted at the International Investment Division, U.S. Bureau of Economic Analysis, under an arrangement that maintained legal confidentiality requirements. Views expressed are those of the authors and do not necessarily reflect those of the Bureau of Economic Analysis. Elhanan Helpman thanks the NSF for financial support. We also thank Daron Acemoglu, Roberto Rigobon, Yona Rubinstein, and Dani Tsiddon for comments on an earlier draft, and Man-Keung Tang for excellent research assistance.
¹ See Wilfred J. Ethier (1986), Ignatius Horstmann and James R. Markusen (1987), and Ethier and Markusen (1996) for models that incorporate the licensing alternative.
² We therefore exclude "vertical" motives for FDI that involve fragmentation of production across countries. See Helpman (1984, 1985), Markusen (2002, Ch. 9), and Gordon H. Hanson et al. (2002) for treatments of this form of FDI.
³ See, for example, Horstmann and Markusen (1992), S. Lael Brainard (1993), and Markusen and Anthony J. Venables (2000).
⁴ See also Andrew B. Bernard et al. (2003) for an alternative theoretical model and Yeaple (2003a) for a model based on worker-skill heterogeneity. James R. Tybout (2003) surveys the recent micro-level evidence on trade that has motivated these theoretical models.
⁵ This result is loosely connected to the documented empirical pattern that foreign-owned affiliates are more productive than domestically owned producers. See Mark E. Doms and J. Bradford Jensen (1998) for the United States and Sourafel Girma et al. (2002) for the United Kingdom.

Multinational sales have grown at high rates over the last two decades, outpacing the remarkable expansion of trade in manufactures. Consequently, the trade literature has sought to incorporate the mode of foreign market access into the "new" trade theory. This literature recognizes that firms can serve foreign buyers through a variety of channels: they can export their products to foreign customers, serve them through foreign subsidiaries, or license foreign firms to produce their products.
Our work focuses on the firm's choice between exports and "horizontal" foreign direct investment (FDI). Horizontal FDI refers to an investment in a foreign production facility that is designed to serve customers in the foreign market.¹,² Firms invest abroad when the gains from avoiding trade costs outweigh the costs of maintaining capacity in multiple markets. This is known as the proximity-concentration trade-off.³ We introduce heterogeneous firms into a simple multicountry, multisector model, in which firms face a proximity-concentration trade-off. Every firm decides whether to serve a foreign market, and whether to do so through exports or local subsidiary sales. These modes of market access have different relative costs: exporting involves lower fixed costs while FDI involves lower variable costs.
Our model highlights the important role of within-sector firm productivity differences in explaining the structure of international trade and investment. First, only the most productive firms engage in foreign activities. This result mirrors other findings on firm heterogeneity and trade; in particular, the results reported in Melitz (2003).⁴ Second, of those firms that serve foreign markets, only the most productive engage in FDI.⁵ Third, FDI sales relative to exports are larger in sectors with more firm heterogeneity.
Using U.S. exports and affiliate sales data that cover 52 manufacturing sectors and 38 countries, we show that cross-sectoral differences in firm heterogeneity predict the composition of trade and investment in the manner suggested by our model. We construct several measures of firm heterogeneity, using different data sources, and show that our results are robust across all these measures. In addition, we confirm the predictions of the proximity-concentration trade-off. That is, firms tend to substitute FDI sales for exports when transport costs are large and plant-level returns to scale are small. Moreover, the magnitude of the impact of our heterogeneity variables is comparable to the magnitude of the impact of the proximity-concentration trade-off variables. We conclude that intra-industry firm heterogeneity plays an important role in explaining international trade and investment.
As mentioned above, our model predicts that the least productive firms serve only the domestic market, that relatively more productive firms export, and that the most productive firms engage in FDI. We provide some evidence supporting this sorting pattern. We compute labor productivity (log of output per worker) for all firms in the COMPUSTAT database in 1996. We then regress this productivity measure on dummies for multinational firms (MNEs) and non-MNE exporters, controlling for capital intensity and 4-digit industry effects. Table 1 reports the resulting estimates for the productivity advantage of MNEs and non-MNE exporters over the remaining firms.⁶ These results confirm previous findings of a significant productivity advantage of firms engaged in international commerce. In addition, they highlight a new prediction of our model: MNEs are substantially more productive than non-MNE exporters; the estimated 15-percent productivity advantage of multinationals over exporters is significant beyond the 99-percent level.
The remainder of this paper is composed of four sections. In Section I, we elaborate the model and we map the theoretical results into an empirical strategy. In Section II, we describe the data. We report and interpret the empirical findings in Section III, and we provide concluding comments in the closing section.

I. Theoretical Framework
There are $N$ countries that use labor to produce goods in $H + 1$ sectors. One sector produces a homogeneous product with one unit of labor per unit output, while $H$ sectors produce differentiated products. An exogenous fraction $\beta_h$ of income is spent on differentiated products of sector $h$, and the remaining fraction $1 - \sum_h \beta_h$ on the homogeneous good, which is our numeraire. Country $i$ is endowed with $L^i$ units of labor and its wage rate is $w^i$.
Now consider a particular sector $h$ that produces differentiated products. For the time being we drop the index $h$, with the implicit understanding that all sectoral variables refer to sector $h$. To enter the industry in country $i$, a firm bears the fixed costs of entry $f_E$, measured in labor units. An entrant then draws a labor-per-unit-output coefficient $a$ from a distribution $G(a)$. Upon observing this draw, a firm may decide to exit and not produce. If it chooses to produce, however, it bears additional fixed overhead labor costs $f_D$. There are no other fixed costs when the firm sells only in the home country. If the firm chooses to export, however, it bears additional fixed costs $f_X$ per foreign market. On the other hand, if it chooses to serve a foreign market via foreign direct investment (FDI), it bears additional fixed costs $f_I$ in every foreign market. We think about $f_X$ as the costs of forming a distribution and servicing network in a foreign country (similar costs for the home market are included in $f_D$). The fixed costs $f_I$ include these distribution and servicing network costs, as well as the costs of forming a subsidiary in a foreign country and the duplicate overhead production costs embodied in $f_D$. The difference between $f_I$ and $f_X$ thus indexes plant-level returns to scale for the sector.⁷ Goods that are exported from country $i$ to country $j$ are subjected to melting-iceberg transport costs $\tau^{ij} > 1$. Namely, $\tau^{ij}$ units have to be shipped from country $i$ to country $j$ for one unit to arrive. After entry, producers engage in monopolistic competition.
⁶ Our controls include the log of capital (book value net of depreciation) per worker, this variable squared, and 4-digit industry fixed effects. Controlling for material usage intensity does not change the results.
⁷ Part of the cost difference $f_I - f_X$ may also reflect some of the entry costs represented by $f_E$, such as the initial cost of building another production facility.
Preferences across varieties of product $h$ have the standard CES form, with an elasticity of substitution $\varepsilon = 1/(1 - \alpha) > 1$. These preferences generate a demand function $A^i p^{-\varepsilon}$ in country $i$ for every brand of the product, where the demand level $A^i$ is exogenous from the point of view of the individual supplier.⁸ In this case, the brand of a monopolistic producer with labor coefficient $a$ is offered for sale at the price $p = w^i a/\alpha$, where $1/\alpha$ represents the markup factor. As a result, the effective consumer price is $w^i a/\alpha$ for domestically produced goods (supplied either by a domestic producer or by a foreign affiliate with labor coefficient $a$) and is $\tau^{ji} w^j a/\alpha$ for imported products from an exporter from country $j$ with labor coefficient $a$.
A firm from country $i$ that remains in the industry will always serve its domestic market through domestic production. It may also serve a foreign market $j$. If so, it will choose to access this foreign market via exports or affiliate production (FDI). This choice is driven by the proximity-concentration trade-off: relative to exports, FDI saves transport costs, but duplicates production facilities and therefore requires higher fixed costs.⁹ In equilibrium, no firm engages in both activities for the same foreign market.¹⁰ We assume
$$\left(\frac{w^j}{w^i}\right)^{\varepsilon-1} f_I > (\tau^{ij})^{\varepsilon-1} f_X > f_D. \tag{1}$$
These conditions will be clarified in the following analysis.
For expositional simplicity, assume for the time being unit wages in every country ($w^i = 1$).¹¹ Then, operating profits from serving the domestic market are $\pi_D^i = a^{1-\varepsilon} B^i - f_D$ for a firm with a labor-output coefficient $a$, where $B^i = (1 - \alpha) A^i/\alpha^{1-\varepsilon}$.¹² On the other hand, the additional operating profits from exporting to country $j$ are $\pi_X^{ij} = (\tau^{ij} a)^{1-\varepsilon} B^j - f_X$, and the additional operating profits from FDI in country $j$ are $\pi_I^j = a^{1-\varepsilon} B^j - f_I$. These profit functions are depicted in Figure 1 for the case of equal demand levels $B^i = B^j$.¹³ In this figure, $a^{1-\varepsilon}$ is represented on the horizontal axis. Since $\varepsilon > 1$, this variable increases monotonically with labor productivity $1/a$, and can be used as a productivity index. All three profit functions are increasing (and linear): more productive firms are more profitable in all three activities. The profit functions $\pi_D^i$ and $\pi_I^j$ are parallel, because we assumed $B^i = B^j$. However, profits from FDI are lower, as the fixed costs of FDI, $f_I$, are higher than the fixed costs of domestic production, $f_D$. The profit function $\pi_X^{ij}$ is flatter than both $\pi_D^i$ and $\pi_I^j$ due to the trade costs $\tau^{ij}$: its slope is $(\tau^{ij})^{1-\varepsilon} B^j$, which is smaller than $B^j$ because $\tau^{ij} > 1$ and $\varepsilon > 1$. Together with the first inequality in (1), these relationships imply that exports are more profitable than FDI for low-productivity firms and less profitable for high-productivity firms, with $(a_I^{ij})^{1-\varepsilon} > (a_X^{ij})^{1-\varepsilon}$, which ensures that some firms export to country $j$. In addition, the second inequality in (1) implies $(a_X^{ij})^{1-\varepsilon} > (a_D^i)^{1-\varepsilon}$, which ensures that some firms serve only the domestic market.
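As a quick numerical check of the pricing rule and the profit expression above, the sketch below evaluates variable profits on a grid of prices. The parameter values and the function name `variable_profit` are illustrative assumptions, wages are set to one, and fixed costs are left out so that only variable profits are compared.

```python
def variable_profit(p, a, A, eps):
    """(price - unit cost) * demand, with a unit wage so that unit cost equals a."""
    return (p - a) * A * p ** (-eps)


eps = 4.0                 # elasticity of substitution (assumed value)
alpha = 1.0 - 1.0 / eps   # implied by eps = 1 / (1 - alpha)
a, A = 0.5, 10.0          # labor coefficient and demand level (assumed values)

p_star = a / alpha                             # markup pricing rule p = a / alpha
B = (1.0 - alpha) * A / alpha ** (1.0 - eps)   # demand shifter B as defined in the text

# Search a fine price grid (0.501 to 2 times p_star) for the profit-maximizing price.
grid = [p_star * (0.5 + 0.001 * s) for s in range(1, 1501)]
best_p = max(grid, key=lambda p: variable_profit(p, a, A, eps))

print("grid maximizer:", best_p, "markup rule:", p_star)
print("profit at p_star:", variable_profit(p_star, a, A, eps),
      "a^(1-eps) * B:", a ** (1.0 - eps) * B)
```

Under these assumptions the grid maximizer coincides with the markup price up to the grid step, and the maximized variable profit matches $a^{1-\varepsilon} B$, which is why the profit functions are linear in the productivity index.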
The least productive firms expect negative operating profits and therefore exit the industry. This fate befalls all firms with productivity levels below $(a_D^i)^{1-\varepsilon}$, which is the cutoff at which operating profits from domestic sales equal zero. Firms with productivity levels between $(a_D^i)^{1-\varepsilon}$ and $(a_X^{ij})^{1-\varepsilon}$ have positive operating profits from sales in the domestic market, but expect to lose money from exports and FDI. They choose to serve the domestic market but not to serve country $j$. The cutoff $(a_X^{ij})^{1-\varepsilon}$ is the productivity level at which exporters just break even. Higher-productivity firms can export profitably. But those with productivity above $(a_I^{ij})^{1-\varepsilon}$ gain more from FDI. For this reason, firms with productivity levels between $(a_X^{ij})^{1-\varepsilon}$ and $(a_I^{ij})^{1-\varepsilon}$ export while those with higher productivity levels build subsidiaries in country $j$, which they use as platforms for servicing country $j$'s market. It is evident from the figure that the cutoff coefficients $(a_D^i)^{1-\varepsilon}$, $(a_X^{ij})^{1-\varepsilon}$, and $(a_I^{ij})^{1-\varepsilon}$ are determined by
$$(a_D^i)^{1-\varepsilon} B^i = f_D, \quad \forall i, \tag{2}$$
$$(\tau^{ij} a_X^{ij})^{1-\varepsilon} B^j = f_X, \quad \forall j \neq i, \tag{3}$$
$$\left[1 - (\tau^{ij})^{1-\varepsilon}\right] (a_I^{ij})^{1-\varepsilon} B^j = f_I - f_X, \quad \forall j \neq i. \tag{4}$$
Free entry ensures equality between the expected operating profits of a potential entrant and the entry costs $f_E$. The form of this condition is reported in our working paper (Helpman et al., 2003). The free entry condition together with (2)-(4) provide implicit solutions for the cutoff coefficients $a_D^i$, $a_X^{ij}$, $a_I^{ij}$, and the demand levels $B^i$ in every country. These solutions do not depend on the country-size variables $L^i$ as long as productivity-adjusted wages $w^i$ remain equalized (the numeraire outside good is produced everywhere and freely traded). Moreover, we can also allow cross-country variations in the fixed-cost coefficients, as long as these variations do not lead some countries to stop producing the outside good. These generalizations are useful for empirical applications.
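To make the sorting pattern concrete, the sketch below solves the three break-even conditions implied by the profit functions above for the productivity index $a^{1-\varepsilon}$: zero domestic profits, zero export profits, and equal profits from exporting and FDI. Unit wages are assumed, the demand levels are taken as given rather than solved from free entry, and all parameter values and function names are illustrative assumptions.

```python
def cutoffs(B_i, B_j, tau, f_D, f_X, f_I, eps):
    """Cutoff values of the productivity index a^(1-eps) for the three activities."""
    theta_D = f_D / B_i                                     # domestic profits are zero
    theta_X = f_X * tau ** (eps - 1) / B_j                  # export profits are zero
    theta_I = (f_I - f_X) / ((1 - tau ** (1 - eps)) * B_j)  # FDI and export profits are equal
    return theta_D, theta_X, theta_I


def mode_of_access(theta, cut):
    """Classify a firm with productivity index theta = a^(1-eps)."""
    theta_D, theta_X, theta_I = cut
    if theta < theta_D:
        return "exit"
    if theta < theta_X:
        return "domestic only"
    if theta < theta_I:
        return "export"
    return "FDI"


# Assumed parameters; they satisfy f_I > tau^(eps-1) * f_X > f_D (5 > 2.197 > 1).
cut = cutoffs(B_i=1.0, B_j=1.0, tau=1.3, f_D=1.0, f_X=1.0, f_I=5.0, eps=4.0)
print("cutoffs:", cut)
for theta in (0.5, 1.5, 3.0, 10.0):
    print(theta, "->", mode_of_access(theta, cut))
```

With these values the cutoffs are ordered domestic, export, FDI, so the four example firms are assigned to exit, domestic sales only, exporting, and FDI in that order.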
We report in our working paper general-equilibrium results for a special case in which countries only differ in size and trade costs per product are symmetric (implying $\tau^{ij} = \tau$ for $j \neq i$). These restrictions apply within each sector, so there can be arbitrary variations across sectors. Under these circumstances, (2)-(4) and free entry imply that, as long as countries do not differ too much in size, wages are the same everywhere, all countries share the same cutoffs $a_D^i = a_D$, $a_X^{ij} = a_X$, $a_I^{ij} = a_I$, and the same demand levels $B^i = B$. Larger countries attract a disproportionately larger number of entrants (relative to country size) and a larger number of sellers (hence, more product variety). We also show that larger markets are disproportionately served by domestically owned firms, i.e., the market share of domestically owned firms is larger in the home market of a larger country.

A. Exports Versus FDI Sales
We now consider the relative magnitude of exports and local FDI sales for a pair of countries $i$ and $j$. Let $s_X^{ij}$ be the market share in country $j$ of country $i$'s exporters and let $s_I^{ij}$ be the market share in country $j$ of affiliates of country $i$'s multinationals. The relative size of these market shares is then
$$\frac{s_X^{ij}}{s_I^{ij}} = \tau^{1-\varepsilon}\left[\frac{V(a_X)}{V(a_I)} - 1\right] \tag{5}$$
in the symmetric case, where
$$V(a) = \int_0^a y^{1-\varepsilon}\, dG(y). \tag{6}$$
Given our symmetry assumptions, this ratio is independent of $i$ and $j$. That is, every country has the same relative sales of exporters and affiliates in every other country. This ratio rises with the exporting cutoff coefficient $a_X$ and declines with the FDI cutoff coefficient $a_I$. The cutoffs, in turn, are determined by the system of equilibrium conditions. A rise in the export costs $f_X$ or $\tau$, or a decrease in the FDI costs $f_I$, all have similar impacts on the $a_X$ and $a_I$ cutoffs: they induce an increase in $a_I$ and a decrease in $a_X$. The relative sales of exporters thus decline in all these cases. Recall that $f_I$ encompasses both the country-level fixed costs embodied in $f_X$ and the duplicate plant overhead costs represented by $f_D$. It is therefore natural to consider the effects of equivalent increases in $f_I$ and $f_X$ (representing higher country-level costs), and the effects of equivalent decreases in $f_I$ and $f_D$ (representing lower overhead plant costs, and hence smaller returns to scale). Again, we show that the $a_I$ and $a_X$ cutoffs move in the same directions as before, entailing a decrease in relative export sales.
These are sensible comparative statics predicting the cross-sectoral variation in relative export sales. We expect the relative sales of exporters to be lower in sectors with higher transport costs or higher fixed country-level costs (even when the latter costs are also borne by multinational affiliates). We also expect them to be lower in sectors where plant-level returns to scale are relatively weak. These results show how the firm-level proximity-concentration trade-off results can be extended to sectors with heterogeneous firms that select different modes of foreign market access.
We now shift the focus to the role of firm-level heterogeneity in explaining the cross-sectoral variation in relative export sales. Note from (5) that the function $V$ directly impacts the relative sales (holding the cutoff levels fixed). Recall also that firm sales and variable profits are proportional to $a^{1-\varepsilon}$ in every market. $V(a)$ therefore captures (up to a multiplicative constant) the distribution of sales and variable profits among firms that make the same export or FDI decisions. It also captures the distribution of domestic sales and variable profits among all surviving firms. We think of $V(a)$ as summarizing firm-level heterogeneity in a sector. It is exogenously determined by the distribution of unit costs $G(a)$ and the elasticity of substitution $\varepsilon$.
In order to index differences in firm-level heterogeneity across sectors, we parametrize $G(a)$. We use the Pareto distribution as a benchmark. When labor productivity $1/a$ is drawn from a Pareto distribution with the shape parameter $k$, the distribution of firm domestic sales, indexed by $V(a)$, is also Pareto, with the shape parameter $k - (\varepsilon - 1)$.¹⁴ The shape parameter of the Pareto distribution offers a natural and convenient index of dispersion, which characterizes heterogeneity. Given our assumptions, the domestic sales of all firms with sales above any given cutoff are distributed Pareto with the same shape parameter $k - (\varepsilon - 1)$. A higher dispersion of firm productivity draws (lower $k$) or a higher elasticity of substitution $\varepsilon$ raises the dispersion of firm domestic sales and variable profits. We now investigate the impact of such changes in heterogeneity on the relative sales of exporters.
The Pareto distribution implies that $V(a_1)/V(a_2)$ equals $(a_1/a_2)^{k-(\varepsilon-1)}$ for every $a_1$ and $a_2$ in the support of the distribution of $a$. Relative export sales in (5) can then be written as¹⁵
$$\frac{s_X}{s_I} = \tau^{1-\varepsilon}\left[\left(\frac{a_X}{a_I}\right)^{k-(\varepsilon-1)} - 1\right]. \tag{7}$$
It follows that relative export sales decrease with decreases in $k$ and increases in $\varepsilon$.¹⁶ Thus, we expect sectors with higher levels of dispersion in firm domestic sales (generated either by higher dispersion levels of firm productivity or by a higher elasticity of substitution) to have lower levels of relative export sales. This is a major implication of the model, which we test below.
¹⁴ The cumulative distribution function of a Pareto random variable $X$ with the shape parameter $k$ is given by $F(x) = 1 - (b/x)^k$ for $x \geq b > 0$, where $b$ is a scale parameter that bounds the support $[b, +\infty)$ from below. $\log X$ is then distributed exponentially with a standard deviation equal to $1/k$. Any truncation from below of $X$ is also distributed Pareto with the same shape parameter $k$. $X$ has a finite variance if and only if $k > 2$. We therefore assume that $k > \varepsilon + 1$, which ensures that both the distribution of productivity draws and the distribution of firm sales have finite variances.
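The dependence of relative export sales on dispersion can be illustrated numerically. The sketch below assumes the symmetric, unit-wage case and takes relative export sales to be $\tau^{1-\varepsilon}[(a_X/a_I)^{k-(\varepsilon-1)} - 1]$, which is what the Pareto property stated above implies when export sales are proportional to $(\tau a)^{1-\varepsilon}$ and affiliate sales to $a^{1-\varepsilon}$. The cutoff ratio is our own derivation from the export break-even condition and the export-versus-FDI indifference condition; the parameter values and function names are assumptions.

```python
def cutoff_ratio(tau, f_X, f_I, eps):
    """a_X / a_I, from (a_X / a_I)^(eps-1) = (f_I - f_X) / (f_X * (tau^(eps-1) - 1))."""
    return ((f_I - f_X) / (f_X * (tau ** (eps - 1) - 1))) ** (1 / (eps - 1))


def relative_export_sales(tau, f_X, f_I, eps, k):
    """Exporters' sales relative to affiliates' sales when 1/a is Pareto with shape k."""
    if k <= eps + 1:
        raise ValueError("the finite-variance assumption requires k > eps + 1")
    ratio = cutoff_ratio(tau, f_X, f_I, eps)
    return tau ** (1 - eps) * (ratio ** (k - (eps - 1)) - 1)


params = dict(tau=1.3, f_X=1.0, f_I=5.0, eps=4.0)  # assumed values
for k in (6.0, 7.0, 8.0):
    # Lower k means more dispersed productivity draws.
    print("k =", k, "relative export sales =", relative_export_sales(k=k, **params))
```

Holding the cost parameters fixed, a lower shape parameter $k$ (more dispersion) yields lower relative export sales in this sketch, and so does a higher transport cost, in line with the comparative statics in the text.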

B. Testable Implications
We focus our empirical work on the model's predictions concerning the determinants of the cross-sector and cross-country variation in relative export sales. This empirical analysis requires us to relax the symmetry assumptions imposed in the previous subsection and to allow for cross-country variation in wages, transport costs, and technology.
Consider the decisions of U.S. firms in sector $h$ to serve country $j$ via export sales versus affiliate sales. The equilibrium cutoff levels must satisfy:
$$\left(w_U \tau^h_{Uj} a_X^{hj}\right)^{1-\varepsilon^h} B^{hj} = f_X^j, \tag{8}$$
$$\left[(w_j)^{1-\varepsilon^h} - \left(w_U \tau^h_{Uj}\right)^{1-\varepsilon^h}\right] \left(a_I^{hj}\right)^{1-\varepsilon^h} B^{hj} = f_I^{hj} - f_X^j, \tag{9}$$
where $w_U$ and $w_j$ are the wage levels in the United States and country $j$, $\tau^h_{Uj}$ is the trade cost (transport and tariff) from the United States to country $j$ in sector $h$, $\varepsilon^h$ is the elasticity of substitution across varieties in sector $h$ (common to all countries), $B^{hj}$ indexes the demand level for sector $h$ in country $j$, and $f_I^{hj}$ and $f_X^j$ represent the fixed costs of doing FDI in and exporting to country $j$. These conditions replace (3) and (4). Note that $f_I^{hj}$ is also indexed by sector $h$, since it includes plant setup and overhead production costs. On the other hand, the fixed exporting costs are common across sectors; they index particular characteristics of doing business in country $j$ for U.S. firms. These costs would also be incurred by U.S. firms setting up affiliates in country $j$, so the difference $f_I^{hj} - f_X^j$ represents the overhead and setup production costs. Let $f_P^h \equiv f_I^{hj} - f_X^j$ reference these costs. Equations (8) and (9) then imply:
$$\left(\frac{a_X^{hj}}{a_I^{hj}}\right)^{\varepsilon^h-1} = \frac{f_P^h}{f_X^j}\,\frac{1}{\left(\omega_j \tau^h_{Uj}\right)^{\varepsilon^h-1} - 1}, \tag{10}$$
where $\omega_j \equiv w_U/w_j$ indexes the U.S. wage relative to country $j$.¹⁷ We index the level of U.S. firm heterogeneity across sectors using the Pareto benchmark. We assume that the productivity draws for U.S. firms in sector $h$ are distributed Pareto with shape $k^h_U$, and therefore that the distribution of U.S. domestic sales indexed by $V^h_U(a)$ is also Pareto with shape $k^h_U - (\varepsilon^h - 1)$. The sales of U.S. exporters to country $j$ relative to the U.S. affiliate sales in country $j$ can then be written as
$$\frac{s_X^{hj}}{s_I^{hj}} = \left(\omega_j \tau^h_{Uj}\right)^{1-\varepsilon^h}\left[\left(\frac{a_X^{hj}}{a_I^{hj}}\right)^{k^h_U-(\varepsilon^h-1)} - 1\right]. \tag{11}$$
Comparing (7) and (11) confirms that all our previously derived comparative statics remain valid in a cross section of both sectors and nonsymmetric countries: the proximity-concentration forces predict lower U.S. relative export sales for country-sector pairs with high transport costs $\tau^h_{Uj}$, countries with high fixed costs $f_X^j$, and sectors with low plant-level returns to scale $f_P^h$. As was previously the case, the extent of firm-level heterogeneity remains an important determinant of relative export sales. Sectors with higher productivity dispersion levels (lower $k^h_U$) or higher elasticities of substitution have lower relative export sales. We cannot separately measure $k^h_U$ and $\varepsilon^h$. However, we can measure their difference $k^h_U - (\varepsilon^h - 1)$ under the Pareto assumption, because $1/[k^h_U - (\varepsilon^h - 1)]$ then indexes the measurable dispersion of firm size in sector $h$, and provides a convenient overall measure of differences in firm-level heterogeneity across sectors.

II. Data
To test our model, we need data that vary across sectors and countries. The required data fall into three categories: data on the composition of international trade, variables that represent the proximity-concentration trade-off, and indices of firm-level heterogeneity. We describe in this section our choice of these data. Unless otherwise noted, all of the data are for 1994.

A. The Composition of International Commerce
The biggest constraint on any analysis that considers the trade-off between exports and FDI sales is the dearth of internationally comparable measures of the extent of FDI across both industries and countries. Because the United States is one of a handful of countries that collect data on multinational affiliate sales, disaggregated by destination and industry, our study focuses on the composition of U.S. trade.
In the United States, the Bureau of Economic Analysis (BEA) collects census-type data on FDI. In its Benchmark Surveys, conducted every five years, the BEA collects affiliate-level data on a wide range of enterprise-level variables, including total affiliate sales. Affiliates are classified by their main line of business and assigned to one of 52 manufacturing sectors.¹⁸ To make our FDI data comparable to the data for exports, we aggregated the firm-level multinational sales data to the level of the industry. The export data are more familiar and have been taken from Robert C. Feenstra (1997). The data have been concorded from 4-digit SITC industrial classifications into the BEA industry classifications.
Finally, we consider two separate samples of countries, which can be characterized as narrow and wide. The narrow sample consists of the 27 countries originally studied by Brainard (1997), while the wide sample includes 11 additional, smaller countries, which are typically less developed.¹⁹ The benefit of the wider sample is that it includes a larger and more diverse set of countries, while the drawback is that these countries are more likely to have fewer strictly positive levels of FDI, creating a concern about censoring.

B. Proximity-Concentration Variables
Our theoretical model predicts exports relative to FDI sales as a function of the costs of each activity: unit costs of exporting, fixed costs of exporting, and fixed costs of investment abroad. However, these costs are not easily quantified.
First consider unit costs of foreign trade. These costs can be due either to the costs of moving goods across borders, such as transport and insurance, or to barriers to trade, such as tariffs. We proxy for them with the variables FREIGHT and TARIFF, which are ad valorem measures of freight and insurance costs, and trade taxes. FREIGHT is computed as the ratio of CIF imports into the United States to FOB imports, using the data in Feenstra (1997).
TARIFF is calculated at the BEA industry/country level from more finely disaggregated data. It is the unweighted average of tariffs across subindustries within the BEA industry. Trade taxes are taken from Yeaple (2003b), where the data are described in more detail.
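The construction of the two unit-cost proxies can be sketched in a few lines. The function names and the input values below are hypothetical and serve only to show the arithmetic; they are not taken from the Feenstra (1997) or Yeaple (2003b) data.

```python
from statistics import mean


def freight(cif_imports, fob_imports):
    """Ad valorem freight and insurance factor: CIF import value over FOB import value."""
    return cif_imports / fob_imports


def tariff(subindustry_tariffs):
    """Unweighted average of subindustry tariff rates within one BEA industry."""
    return mean(subindustry_tariffs)


# Hypothetical inputs for one industry/country cell.
print(freight(cif_imports=105.0, fob_imports=100.0))  # a factor above 1 reflects shipping costs
print(tariff([0.02, 0.04, 0.06]))
```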
While the unit costs of shipping goods are reasonably straightforward to measure, the same cannot be said for the fixed costs associated with exporting and FDI. In principle, these costs could vary by industry and country; but such measures do not exist in practice. To make progress, we maintain our assumption of a country-specific fixed cost that applies to both exports and FDI sales. As this cost is unobserved, country-specific, and common to all industries, we subsume its measure into a country fixed effect. We therefore assume that any remaining cost associated with FDI stems from the cost of maintaining additional capacity. Given our model, we cannot use a measure for a firm that is somehow "representative" of the sector. Thus, standard measures of plant-level fixed costs, such as the number of production workers at a plant of median size, are not appropriate. We seek a measure of such costs that is independent of firm size or productivity. We follow the model in choosing the number of nonproduction workers per establishment as reported in the 1997 Census of Manufacturing.²⁰ We calculate the average number of nonproduction workers at the North American Industry Classification System (NAICS) level.²¹ Then, we compute the measure of plant-level fixed cost, FP, for every BEA sector as the average of these numbers within the BEA sector, weighted by the NAICS-level sales in the sector.
¹⁸ See Table 1 in our working paper, Helpman et al. (2003), for this classification.
¹⁹ The 27 countries in the narrow sample are Argentina, Australia, Austria, Belgium, Brazil, Canada, Chile, Denmark, France, Germany, Hong Kong, Ireland, Italy, Japan, Mexico, Netherlands, New Zealand, Norway, Philippines, Singapore, South Korea, Spain, Sweden, Switzerland, Taiwan, United Kingdom, and Venezuela, while the 11 additional countries are Colombia, Finland, Greece, Indonesia, Israel, Malaysia, Peru, Portugal, South Africa, Thailand, and Turkey.
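A minimal sketch of the FP calculation follows: nonproduction workers per establishment are computed for each NAICS industry and then averaged within a BEA sector using NAICS-level sales as weights. The function name, the tuple layout, and the numbers are assumptions made for illustration.

```python
def plant_fixed_cost(naics_rows):
    """Sales-weighted average of nonproduction workers per establishment.

    naics_rows: list of (nonproduction_workers, establishments, sales) tuples,
    one per NAICS industry belonging to the same BEA sector.
    """
    total_sales = sum(sales for _, _, sales in naics_rows)
    weighted = sum((workers / establishments) * sales
                   for workers, establishments, sales in naics_rows)
    return weighted / total_sales


# Hypothetical BEA sector made up of two NAICS industries.
rows = [(200, 10, 100.0),   # 20 nonproduction workers per establishment
        (900, 30, 300.0)]   # 30 nonproduction workers per establishment
print(plant_fixed_cost(rows))
```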

C. Measures of Dispersion
The most novel feature of our model is the relationship between the degree of intra-industry firm heterogeneity and the prevalence of subsidiary sales relative to export sales. To test this hypothesis, we require data that quantify the extent of this heterogeneity across industries. As we cannot directly measure the dispersion of intra-industry productivity levels, we rely on guidance from the model to construct alternative measures of within-industry heterogeneity. According to our model, the dispersion of firm size within a sector captures the joint effect of the dispersion of firm productivity and the elasticity of substitution, which magnifies the effect of productivity differences across firms. Since the size distribution of firms is observable, we use its dispersion as a measure of firm-level heterogeneity.
To quantify this dispersion measure across industries, we assume that the stochastic process that determines firm productivity levels is Pareto, with the shape of the distribution varying across industries. This assumption is convenient, because it suggests two conceptually equivalent ways to measure dispersion. The first is to regress the logarithm of an individual firm's rank within the distribution on the logarithm of the firm's size. It can be shown that the estimated coefficient of such a regression is $k - (\varepsilon - 1)$, which is exactly the measure of dispersion that appears in the reduced form of the model.²² The second method is to compute the standard deviation of the logarithm of firm sales, which, given our distributional assumption, is computationally equivalent to the slope of the conditional expectation of log rank on log size.²³
While our distributional assumption yields a precise methodology for computing dispersion, the choice of data is more problematic. We require disaggregated data on the distribution of sales across firms. Unfortunately, we do not have access to such data on U.S. firms. As a result, we rely on two alternative sources.
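The two dispersion measures described above can be sketched on artificial data. The sample below is not firm data: it consists of evenly spaced quantiles of a Pareto distribution with an assumed shape parameter of 2 for sales (standing in for $k - (\varepsilon - 1)$), so that both estimators can be compared with a known value. All function names are hypothetical.

```python
import math
from statistics import pstdev


def pareto_quantile_sample(shape, n, scale=1.0):
    """Deterministic sample: the (i - 0.5)/n quantiles of a Pareto(shape) distribution."""
    return [scale * (1 - (i - 0.5) / n) ** (-1 / shape) for i in range(1, n + 1)]


def rank_size_slope(sales):
    """OLS slope from regressing log rank (1 = largest firm) on log sales."""
    ordered = sorted(sales, reverse=True)
    xs = [math.log(s) for s in ordered]
    ys = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def log_sales_sd(sales):
    """Standard deviation of log sales; under Pareto it is close to 1 / shape."""
    return pstdev([math.log(s) for s in sales])


sales = pareto_quantile_sample(shape=2.0, n=2000)
print("shape from rank-size regression:", -rank_size_slope(sales))
print("shape from 1 / sd(log sales):   ", 1 / log_sales_sd(sales))
```

The regression slope is negative because larger firms have lower rank numbers, so the shape estimate is its absolute value. A larger standard deviation of log sales corresponds to a smaller shape parameter, that is, to more dispersion.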
First, we use the publicly available data from the 1997 U.S. Census of Manufacturing.However, these data are aggregated into ten different size categories, precluding the estimation of size dispersion measures using regression techniques.Nevertheless, we can compute the standard deviation of log sales by making an additional assumption: we assume that all establishments falling within the same size category have log sales equal to the mean for this category.We then treat each of the size categories in the many subindustries of the BEA industry classi cation as separate observations.Adopting this method, we calculate the standard deviation of log sales using the number of rms in each size category as weights.
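The grouped-data calculation can be sketched as a weighted standard deviation in which every firm in a size category is assigned that category's log-sales value. The function name and the two-category example are assumptions; how the category value is obtained from the published totals is left to the caller.

```python
import math


def binned_log_sales_sd(categories):
    """Weighted standard deviation of log sales from grouped data.

    categories: list of (number_of_firms, log_sales_value) pairs, one per size
    category; the firm counts serve as weights.
    """
    n = sum(count for count, _ in categories)
    mean = sum(count * value for count, value in categories) / n
    variance = sum(count * (value - mean) ** 2 for count, value in categories) / n
    return math.sqrt(variance)


# Hypothetical industry with two equally populated size categories.
print(binned_log_sales_sd([(50, 1.0), (50, 3.0)]))
```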
Second, Bureau van Dijck Electronic Publishing has recently made available a large data set of European firms.24 This database, named Amadeus, includes information on the consolidated sales, the national identity, and the main line of business of a large number of European firms. There are roughly 260,000 firms in this sample.
We compute each of our two measures of dispersion for every industry in two subsets of these data: all Western European firms and French firms only. We compute our firm dispersion measures using French firms only for two reasons. First, using data for multiple countries raises the issue of industrial composition. Within every BEA industry there are many subindustries for which countries might produce different mixes. France's industrial structure is very similar to that of the United States, however, and so might share most of the same distributional aspects of firm characteristics. Second, French firms are highly overrepresented in the sample relative to all other Western European countries.25 Our dispersion measures are based on a sample of 55,339 large Western European firms, and a subset of 15,148 French firms.26

The regression-based measures of dispersion provide a natural way of evaluating the cross-sectional variation in this variable relative to the measurement errors induced by the fitting of the Pareto distributions. Figure 2, which has been constructed from the sample of Western European firms, plots firm rank against firm sales in four sectors on the same log-log scale. In every plot the dispersion measure is represented by the slope of the regression line while its goodness of fit is represented by the deviation from linearity. Figure 3 quantifies this comparison by showing the 95-percent confidence intervals around the coefficients of dispersion, estimated as the slopes of the regression lines in these sectors. Evidently, these slopes are precisely estimated in all the sectors, with the exception of five outliers that we discuss below.27

There are four measures of dispersion calculated from the Amadeus data set and one measure calculated from the U.S. data. The correlations among these measures are reported in Table 2 (along with our measure of plant-level fixed costs, FP, and the industries' capital-labor ratio, KL, and R&D intensity, RD). The table shows that all four measures from Amadeus are highly correlated with one another, as one might expect. The table also shows that the U.S.-based measure of dispersion is positively correlated with the measures of dispersion calculated from the European data, except that this correlation is not as high as the correlations among the four measures of dispersion that were calculated from the European data. There are at least two reasons why this might be so. First, the method of calculation is very different: the European measures are computed from actual firm-level data while the American measure is calculated from semiaggregated establishment-level data. Given the differences in methods of calculation, one might argue that the correlations are surprisingly high. Second, there exists an aggregation problem. If the composition of output varies across countries according to comparative advantage, then within each BEA industry the product mix of goods produced in the United States may differ from the mix produced in Europe. For this reason the European and American dispersion measures cannot be perfectly aligned.

25 Due to national differences in reporting requirements, no information on U.K. firms is available, and only an extremely limited number of German firms appear in the sample.

26 Because small firms are underrepresented throughout the Amadeus database, we first drop firms with sales below a cutoff of U.S. $2.5 million per year. Note that, under the assumption of a Pareto size distribution, our measures of dispersion are invariant to the choice of lower bound cutoff. We computed the dispersion measures using several different cutoffs. Any cutoff above U.S. $2.5 million yields a size distribution that is closely approximated by a Pareto distribution, and a dispersion measure that varies very little with the cutoff.

27 As all 52 manufacturing sectors could not fit on one graph, only one of the seven food processing sectors (201, meat products) is represented. The coefficients and confidence intervals for the other six food processing sectors are very similar to the one represented.
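The invariance of the dispersion measures to the lower-bound cutoff, noted in footnote 26, follows from a standard property of the Pareto distribution. The short derivation below uses notation introduced here for illustration only: $X$ is firm size, $a$ is the shape parameter, $b$ is the original lower bound, and $c \ge b$ is the cutoff below which firms are dropped.

```latex
% Illustrative notation (a, b, c, X) is not taken from the paper.
\Pr(X > x) = \left(\frac{b}{x}\right)^{a}, \quad x \ge b
\qquad\Longrightarrow\qquad
\Pr(X > x \mid X > c)
  = \frac{(b/x)^{a}}{(b/c)^{a}}
  = \left(\frac{c}{x}\right)^{a}, \quad x \ge c \ge b .
```

The truncated sample is therefore again Pareto with the same shape $a$ and a new lower bound $c$. The slope of log rank on log size depends only on $a$, and $\log(X/c)$ is exponentially distributed with rate $a$, so the standard deviation of log sales equals $1/a$ whatever the cutoff.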

III. Specifications and Results
Our aim is to estimate a linearized version of (11) that relates the logarithm of relative sales to our measure of firm-size dispersion, the logarithm of our proxy for plant fixed costs, the logarithms of transport and tariff costs, and a set of country dummies that we use to control for the differences in $f_X$ and $v$ across countries. Of course, this linearization precludes any structural interpretation of the estimated parameters. Our goal is limited to testing whether the central tendencies in the data are consistent with the partial derivatives implied by (11), and to assessing the economic significance of the magnitudes associated with the estimated coefficients.
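One way to write the linearized specification described above is sketched below. This is an illustrative reading of the verbal description, not an equation reproduced from the paper. The subscripts ($i$ for industry, $j$ for country), the coefficient symbols, the label DISPERSION, and the assumption that freight and tariff costs vary by both industry and country are introduced here for exposition; FP, FREIGHT, and TARIFF are the variable names used in the text.

```latex
% Hypothetical notation: i indexes industries, j indexes countries.
\log\!\left(\frac{\text{exports}_{ij}}{\text{FDI sales}_{ij}}\right)
  = \beta_{1}\,\mathrm{DISPERSION}_{i}
  + \beta_{2}\,\log \mathrm{FP}_{i}
  + \beta_{3}\,\log \mathrm{FREIGHT}_{ij}
  + \beta_{4}\,\log \mathrm{TARIFF}_{ij}
  + \delta_{j}
  + u_{ij}
```

Here $\delta_{j}$ stands for the country dummies and $u_{ij}$ for the residual. The variants discussed below add further industry controls or change the assumptions on $u_{ij}$.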
We consider several variants of the basic specification in order to raise the level of confidence in the results. Given the critical importance of the size distribution of firms, we report results corresponding to each one of the five measures of dispersion in firm size. We also report results for both samples of countries: narrow and wide. Finally, we explore the sensitivity of the results to alternative assumptions that incorporate other determinants of relative sales not captured by equation (11).
We begin the analysis by considering a specification that controls for sectoral capital and R&D intensities.28 The results across specifications for our two samples and five measures of dispersion are shown in Table 3. The columns correspond to different measures of dispersion, beginning with the U.S. standard deviation of log sales, proceeding to the European and French-only standard deviation measures, and ending with the estimated distribution parameters for the European and the French-only sample, respectively. Country fixed effects are not reported.
First consider the narrow sample of relatively large countries, studied by Brainard (1997). The coefficients on FREIGHT and TARIFF are negative and statistically significant in each one of the five specifications. These results are consistent with Brainard (1997). In addition, the coefficient of FP is positive and significant. We therefore confirm the predictions of the proximity-concentration trade-off: firms substitute FDI sales for exports when the costs of international trade are relatively high and the returns to scale are relatively small.
Next consider the effects of dispersion. The estimated coefficients on the various dispersion measures are all negative and statistically significant. Industries in which firm size is highly dispersed are associated with more FDI sales relative to exports, precisely as the model predicts. None of these results changes significantly when the set of countries is expanded to include the 11 smaller countries (the wide country sample).29

Although all measures of dispersion yield coefficients that are statistically significant, the choice of dispersion measure has a noticeable impact on the results. The measures that were derived by fitting a Pareto distribution to the distribution of firm size yield substantially lower coefficients and higher standard errors than the nonparametric dispersion measures, i.e., the standard deviations of log sales. This pattern is driven, in large part, by five sectors that exhibit the largest differences between the measurement of dispersion by means of the shape of a Pareto distribution and by means of the standard deviation, for both Western European and French firms.30 These sectors have the lowest number of firms in the data, and they yield, without exception, the poorest fits to the Pareto distribution, as measured by R-squared. We believe that in these cases the nonparametric measures (the standard deviations) better describe the levels of dispersion within the sectors. Dropping these five outliers from the sample and reestimating the equations, we find that the two different ways of measuring dispersion yield much more similar results. After dropping the outliers, all the dispersion measures yield negative coefficients that are significant beyond the 99-percent confidence level.
To get a sense of the economic significance of the estimated coefficients on our dispersion measures, we have calculated standardized coefficients, also known as "beta" coefficients, for all the independent variables. They are reported in Table 4 for the narrow sample, along with the sample means and standard deviations. A beta coefficient is defined as the product of the estimated coefficient and the standard deviation of its corresponding independent variable, divided by the standard deviation of the dependent variable. It converts the regression coefficients into units of sample standard deviations.31 These beta coefficients suggest that each one of the five measures of dispersion has an impact comparable to that of each one of the standard proximity-concentration variables.32 For instance, a one standard deviation decline in an industry's freight costs raises the logarithm of the ratio of exports to FDI sales by 27 percent of a standard deviation; and a one standard deviation decline in the dispersion measures induces comparable changes in the dependent variable, with an average impact of 26 percent across the dispersion measures. The impact of tariffs is lower while the impact of returns to scale is higher. Taken as a whole, these results suggest that firm-level heterogeneity adds an important dimension to the observed trade-off between exports and FDI sales.

These results strongly support the theoretical model's predicted link between firm-level heterogeneity and the ratio of exports to FDI sales. Nevertheless, these results have to be interpreted with caution, because they may also reflect, at least to some degree, variations in industry characteristics that are not captured by our parsimonious model. This problem is partly taken care of by our controls for cross-industry variations in capital and R&D intensities. Note that both these variables represent characteristics of an industry's technology that are not captured by our model.33 Furthermore, as shown in Table 2, these measures of technology are correlated with all the different dispersion measures, although the correlations with the U.S.-data-based dispersion measure are rather weak.34 Table 3 suggests that R&D intensity is not a useful predictor of exports relative to FDI sales, while capital intensity is; more capital-intensive sectors export less relative to FDI sales. These results are interesting, but our theoretical model offers no guidance concerning their interpretation.35

Of course, differences in capital intensity may not be the only other source of variation across sectors that affects exports relative to FDI sales. In order to address the possibility that some other unmeasured characteristics of sectors fall into this category, we estimate the previous specification (with the capital and R&D intensity controls) adding random industry effects. A benefit of this estimation strategy is that it allows for efficient estimation in the presence of common components in the residuals that might be induced by unmeasured industry characteristics. To validate this specification, we need to assume that these unmeasured industry characteristics are uncorrelated with our right-hand-side variables. This is a strong assumption. We feel, however, that it is most likely to hold for our dispersion measures, which are the focus of the empirical analysis.36

The results are reported in Table 5. As could be predicted, the standard errors have increased. But so have the point estimates of the impact of dispersion on exports relative to FDI sales. Importantly, however, the coefficients for all the dispersion measures remain highly significant. On the other hand, the magnitudes of the coefficients on FREIGHT and TARIFF are greatly reduced, and the coefficients on TARIFF are no longer significant. These results support our earlier conclusion that the economic significance of firm heterogeneity compares favorably with the significance of the standard proximity-concentration trade-off variables in explaining the export to FDI sales ratio.
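The standardized ("beta") coefficients used above to compare economic significance across variables follow directly from their definition: the estimated coefficient times the standard deviation of the regressor, divided by the standard deviation of the dependent variable. The sketch below applies that definition to synthetic data; the function name, the data, and the coefficient values are hypothetical and are not the estimates reported in Table 4.

```python
import numpy as np


def beta_coefficient(coef, x, y):
    """Standardized coefficient: coef * sd(x) / sd(y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(coef * np.std(x, ddof=1) / np.std(y, ddof=1))


# Hypothetical data: one regressor with a negative effect on the outcome.
rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=0.5, size=500)
y = -0.8 * x + rng.normal(scale=1.0, size=500)

slope, _intercept = np.polyfit(x, y, 1)     # OLS slope of y on x
beta = beta_coefficient(slope, x, y)
print(f"slope: {slope:.3f}, beta coefficient: {beta:.3f}")
```

In a regression with a single regressor the beta coefficient equals the sample correlation between the two variables, which gives a convenient check on the calculation. With several regressors, as in the specifications reported here, each beta coefficient uses the coefficient from the multiple regression.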
Another robustness check addresses the potential interdependence of the residuals across countries, which may exist even after we control for country fixed effects. This type of interdependence pattern could be created by the ability of affiliates to re-export a portion of their production to a third country. In this case, a firm's decision to operate an affiliate in one country, say Belgium, would not be independent from its decision to locate affiliates in other neighboring European countries. In the Appendix to our working paper, Helpman et al. (2003), we show that the predicted link between firm-level heterogeneity within sectors and exports relative to FDI sales is theoretically consistent with an extended version of the model that explicitly allows for re-exports by affiliates. However, the pattern of interdependence may be particularly strong among the overrepresented and highly integrated economies of Western Europe.
To address this concern, we treated all the Western European countries as a single aggregate unit and reestimated our specification with the industry controls (capital and R&D intensities) and industry random effects. We found that all the dispersion measures remain highly significant. As could be predicted, the point estimates on the dispersion measures were slightly lower, which reflects the fact that the smaller developing countries now receive a greater weight in the sample.37

37 See Table 8 of our working paper for these estimates.

Our final robustness check addresses sources of endogeneity bias in the dispersion measures, including measurement error. To address these concerns, we instrument the U.S. dispersion measure using all four European dispersion measures. We also use a different method to control for the potential correlation of the residuals within sectors by adjusting the standard errors for clustering (within sectors).38 These specifications are reported in Table 6 for all previously discussed country samples (narrow, wide, and aggregated Europe). Instrumenting the U.S. dispersion measure significantly increases the magnitude of both the estimated coefficient and its standard error. However, as in all the previous specifications, the effect of dispersion on relative exports and FDI sales remains statistically significant.
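The logic of instrumenting one noisy dispersion measure with other, independently measured ones can be illustrated with a small two-stage least squares sketch on synthetic data. This is not the authors' estimation code: it has no country dummies, no other controls, and no clustered standard errors, and it returns point estimates only. All names and parameter values are hypothetical. The setup assumes classical measurement error, so that ordinary least squares on the noisy measure is biased toward zero while the instrumented estimate is not.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def two_stage_least_squares(y, x, Z):
    """2SLS slope on x, using the columns of Z as instruments (constant included)."""
    const = np.ones((len(y), 1))
    first_stage = np.hstack([const, Z])
    x_hat = first_stage @ ols(first_stage, x)          # fitted values of x
    second_stage = np.hstack([const, x_hat[:, None]])
    return float(ols(second_stage, y)[1])


# Hypothetical data: a latent dispersion variable observed with error.
rng = np.random.default_rng(0)
n = 5_000
true_dispersion = rng.normal(size=n)
x_noisy = true_dispersion + rng.normal(size=n)               # mismeasured regressor
Z = true_dispersion[:, None] + rng.normal(size=(n, 4))       # four noisy instruments
beta_true = -1.0
y = beta_true * true_dispersion + 0.5 * rng.normal(size=n)

ols_slope = float(ols(np.hstack([np.ones((n, 1)), x_noisy[:, None]]), y)[1])
iv_slope = two_stage_least_squares(y, x_noisy, Z)
print(f"OLS slope: {ols_slope:.3f}, 2SLS slope: {iv_slope:.3f}")
```

In this setup the instrumented coefficient is larger in magnitude than the OLS coefficient, which matches the direction of the change reported above when the U.S. dispersion measure is instrumented. The sketch says nothing about whether measurement error is the actual source of that change in the data.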
Finally, we briefly report a number of other robustness checks. One potential complication arises from the fact that firms engage in intrafirm trade in intermediate inputs. This trade does not appear in our model, but is of sufficient size in a number of industries to be of concern.
We found that netting out the value of these imports from our FDI sales data had no appreciable impact on the dispersion coefficients, although it had a small impact on the size of the FREIGHT and TARIFF coefficients. In other specifications, we included the four-firm concentration ratio as a control, in order to assess whether our measures of firm heterogeneity offer information in excess of this crude measure of concentration. We found that controlling for concentration reduces the point estimates of the coefficients on the dispersion measures, but that this decline is rather small.
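For reference, the four-firm concentration ratio used as a control above is conventionally the share of industry sales accounted for by the four largest firms. The sketch below computes it from a list of firm sales; the function name and the example figures are hypothetical, and the paper does not state how its concentration data were constructed.

```python
def four_firm_concentration_ratio(sales):
    """Share of total industry sales held by the four largest firms."""
    total = sum(sales)
    if total <= 0:
        raise ValueError("total sales must be positive")
    top_four = sorted(sales, reverse=True)[:4]
    return sum(top_four) / total


# Hypothetical industry with six firms.
example_sales = [40, 30, 10, 10, 5, 5]
print(four_firm_concentration_ratio(example_sales))
```

The ratio depends only on the top four firms, so two industries with very different size distributions below the fourth firm can share the same value. That is one sense in which it is a cruder summary than a dispersion measure computed from the whole distribution.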

IV. Conclusion
We have developed in this paper a model of international trade and investment in which firms can choose to serve their domestic market, to export, or to engage in FDI in order to serve foreign markets. Every industry is populated by heterogeneous firms, which differ in productivity levels. As a result, firms sort according to productivity into different organizational forms. The least productive firms leave the industry, because, if they stay, their operating profits will be negative no matter how they organize. Other low-productivity firms choose to serve only the domestic market. The remaining firms serve the domestic market as well as foreign markets. Their mode of operation in foreign markets differs, however. The most productive firms in this group choose to invest in foreign markets while the less productive firms choose to export. This sorting pattern is confirmed by previous empirical work and by our own estimates.

38 Under our assumptions on the source of this potential correlation in the residuals (unmeasured sectoral characteristics), the previously reported random-effects coefficients are the efficient estimators.
Our model embodies standard elements of the proximity-concentration trade-off in the theory of horizontal foreign direct investment. As a result, it predicts that foreign markets are served more by exports relative to FDI sales when trade frictions are lower or economies of scale are higher. To these factors, our model adds a role for the within-sector heterogeneity of productivity levels. This heterogeneity induces a size distribution of firms, which affects the ratio of exports to FDI sales.
Using data on exports and FDI sales of U.S. firms in 38 countries and 52 industries, we estimated the effects of trade frictions, economies of scale, and within-industry dispersion of firm size, on exports versus FDI sales. The results support the theoretical predictions. In particular, they show a robust cross-sectoral relationship between the degree of dispersion in firm size and the tendency of firms to substitute FDI sales for exports. The size of this effect is of the same order of magnitude as trade frictions. We therefore conclude that we have identified a new element, namely within-sectoral heterogeneity, that plays an important role in the structure of foreign trade and investment.

FIGURE 2. EMPIRICAL DISTRIBUTION OF FIRM SALES

FIGURE 3. REGRESSION FIT TO THE PARETO DISTRIBUTION

TABLE 1 - PRODUCTIVITY ADVANTAGE OF MULTINATIONALS

TABLE 2 - CORRELATIONS BETWEEN ALTERNATIVE MEASURES OF DISPERSION

TABLE 5 - EXPORTS VERSUS FDI: RANDOM EFFECTS
Notes: T-statistics are in parentheses. Constant and country dummies are suppressed.

TABLE 6 - EXPORTS VERSUS FDI: ADDITIONAL ROBUSTNESS RESULTS (Clustered standard errors and IV specifications)
Notes: In the IV specifications, the U.S. dispersion measure is instrumented using all four European dispersion measures. All T-statistics (in parentheses) are computed from standard errors that are heteroskedasticity consistent and adjusted for clustering by industry. Constant and country dummies are suppressed.