Is the U.S. Aggregate Production Function Cobb-Douglas? New Estimates of the Elasticity of Substitution

I present new estimates of the elasticity of substitution between capital and labor using data from the private sector of the U.S. economy for the period 1948-1998. I first adopt Berndt's (1976) specification, which assumes that technological change is Hicks neutral. Consistent with his results, I estimate elasticities of substitution that are not significantly different from one. I next show, however, that restricting the analysis to Hicks-neutral technological change necessarily biases the estimates of the elasticity towards one. When I modify the econometric specification to allow for biased technical change, I obtain significantly lower estimates of the elasticity of substitution. I conclude that the U.S. economy is not well described by a Cobb-Douglas aggregate production function. I present estimates based on both classical regression analysis and time series analysis. In the process, I deal with issues related to the nonsphericality of the disturbances, the endogeneity of the regressors, and the nonstationarity of the series involved in the estimation.


Introduction
The elasticity of substitution between capital and labor is a central parameter in economic theory. Models investigating the sources of economic growth and the determinants of the aggregate distribution of income have been found to deliver substantially different implications depending on the particular value of the elasticity of substitution.
Perhaps to a larger extent than in other fields, the elasticity of substitution between capital and labor is central in growth theory, both traditional and new. On the one hand, in the framework of the Neoclassical growth model, the sustainability of long-run growth in the absence of technological change depends crucially on whether the elasticity of substitution is greater than or smaller than one. 1 On the other hand, the recent literature on induced technical change has developed models that deliver very different implications depending on the particular value of the elasticity of substitution. For instance, Acemoglu (2003) builds on the assumption of a lower-than-one elasticity of substitution to construct a model that rationalizes the coexistence of both purposeful labor- and capital-augmenting technological change in the transitional dynamics of an economy that converges to a balanced growth path in which technical change is purely labor-augmenting. Furthermore, as pointed out by Hsieh (2000), the value of the elasticity of substitution is also relevant to the empirical debate on the sources of economic growth (cf. Mankiw, Romer and Weil, 1992). 2
In the field of public finance, the value of the elasticity of substitution constitutes an important determinant of the response of investment behavior to tax policy. The late 1960s witnessed a lively debate between those who perceived fiscal policy as an effective tool for influencing investment behavior (e.g., Hall and Jorgenson, 1967) and those who recognized only minor benefits from tax incentives (e.g., Eisner and Nadiri, 1968). Their debate revolved around the issue of whether the elasticity of substitution between capital and labor was significantly below one, with a lower elasticity being associated with a lower response of investment to tax benefits. 3
Soon after the explicit derivation of the Constant Elasticity of Substitution (CES) production function by Arrow et al. (1961), a wealth of articles appeared trying to estimate this elasticity for the U.S. manufacturing sector. 4 Cross-sectional studies at the two-digit level tended to find elasticities insignificantly different from one (e.g., Dhrymes and Zarembka, 1970). Lucas (1969), however, discussed several biases inherent in the use of cross-sectional data in the estimation of the elasticity. He suggested the use of time series data instead. Time series studies generally provided much lower estimates of the elasticity. Lucas (1969) himself estimated the elasticity of substitution to be somewhere between 0.3 and 0.5, while Maddala (1965), Coen (1969) and Eisner and Nadiri (1968) also computed estimates significantly below one.
1 If the elasticity of substitution is greater than one, the marginal product of capital remains bounded away from zero as the capital stock goes to infinity. Under certain parameter restrictions, this violation of the Inada condition can yield long-run endogenous growth even in the absence of technological progress (cf. Barro and Sala-i-Martin, 1995, p. 44).
2 Following Mankiw, Romer and Weil (1992), most studies trying to disentangle the relative role of technological change and factor accumulation in explaining cross-country income differences have assumed the elasticity of substitution to be equal to one. Hsieh (2000) shows that relaxing this assumption and allowing for biased technical change may substantially alter the results of these studies.
In a widely cited contribution, Berndt (1976) illustrated how the use of higher quality data translated into considerably higher time-series estimates of the elasticity, thus leading to a reconciliation of the time-series and cross-sectional studies. In particular, a careful construction of the series involved in the estimation led him to obtain time series estimates for the period 1929-1968 insignificantly different from one. It has become customary in the literature to cite Berndt's paper as providing evidence in favor of the assumption of a Cobb-Douglas functional form for the aggregate production function (e.g., Judd, 1987; Trostel, 1993).
In this paper, I will start by following closely the approach suggested by Berndt (1976), which assumes that technological change is Hicks neutral. Using time-series data from the private sector of the U.S. economy for the period 1948-1998, the first result of this paper will be a confirmation of Berndt's finding of a unit elasticity of substitution between capital and labor when technological change is assumed to be Hicks neutral. The second and more substantive contribution of this paper will consist in demonstrating that in the presence of non-neutral technological change, Berndt's approach leads to estimates of the elasticity that are necessarily biased towards one. When the econometric specification is modified to allow for biased technical change, I generally obtain significantly lower estimates of the elasticity of substitution.
The source of the bias is rather simple to illustrate. Suppose that U.S. aggregate output can be represented by a production function of the form
$$Y_t = A_t F(K_t, L_t),$$
characterized by constant returns to scale in the two inputs, capital and labor. The parameter $A_t$ is an index of technological efficiency, which is neutral in Hicks' sense, i.e., in the sense that it has no effect on the ratio of marginal products for a given capital-labor ratio. Profit maximization by firms in a competitive framework delivers two optimality conditions equalizing factor prices with their marginal products. Combining these conditions delivers
$$\frac{r_t K_t}{r_t K_t + w_t L_t} = \frac{f'(k_t)\,k_t}{f(k_t)},$$
where $f(k)$ is output per unit of labor, $k$ is the capital-labor ratio, and $r$ and $w$ are the rental prices of capital and labor, respectively. As is well known, in the United States, the value of the left-hand side of this expression has been remarkably stable throughout the post-WWII period, while the capital-labor ratio has steadily increased. It follows that the only production function this equation can be consistent with is a Cobb-Douglas production function. 5 In words, when technological change is Hicks neutral and the capital-labor ratio grows through time, the only aggregate production function consistent with constant factor shares is one featuring a unit elasticity of substitution between capital and labor. As I will discuss in section 2, the approach of Berndt (1976) consists of running log-linear specifications closely related to the expression above. In light of this discussion, his finding of a unit elasticity of substitution should not be surprising. The main problem with Berndt's approach is that when technological change is allowed to affect the ratio of marginal products, the Cobb-Douglas production function ceases to be the only one consistent with stable factor shares. In particular, a well-known theorem in growth theory states that, in the presence of exponential labor-augmenting technological change, any well-behaved aggregate production function is consistent with a balanced growth path in which factor shares are constant.
3 See Chirinko (2002) for more details.
4 For a thorough review of this literature see Nerlove (1967) and Berndt (1976, 1991).
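The argument above can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes a CES technology with Hicks-neutral efficiency, for which the capital share $f'(k)k/f(k)$ has a closed form, and evaluates it at several capital-labor ratios. The function name `ces_capital_share` and the distribution parameter value 0.3 are hypothetical choices, not taken from the paper.

```python
def ces_capital_share(k, sigma, delta=0.3):
    """Capital share f'(k) k / f(k) for a CES technology with Hicks-neutral A_t.

    delta is a hypothetical distribution parameter chosen for illustration.
    """
    if abs(sigma - 1.0) < 1e-12:
        # Cobb-Douglas limit: the share equals delta for every k.
        return delta
    rho = (sigma - 1.0) / sigma
    capital_term = delta * k ** rho
    return capital_term / (capital_term + (1.0 - delta))


if __name__ == "__main__":
    for sigma in (0.5, 1.0, 1.5):
        shares = [round(ces_capital_share(k, sigma), 3) for k in (1.0, 2.0, 4.0)]
        print(f"sigma={sigma}: capital share at k=1,2,4 -> {shares}")
```

With an elasticity below one the share falls as $k$ rises, with an elasticity above one it rises, and only the unit-elasticity case keeps it constant, which is why a stable share combined with a growing capital-labor ratio pushes a Hicks-neutral specification towards an estimate of one.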
In a similar vein, Diamond, McFadden and Rodriguez (1978) formally proved the impossibility of identifying the separate roles of factor substitution and biased technological change in generating a given time series of factor shares and capital-labor ratios. The literature has generally circumvented this impossibility result by imposing some type of structure on the form of technological change. As I will discuss in section 5, when technological efficiency grows exponentially these two effects can be separated and the elasticity of substitution can be recovered from the available data. Furthermore, my empirical results below suggest that allowing for biased technological change leads to estimates of the elasticity of substitution that are, in general, significantly lower than one. I conclude from my results that the U.S. aggregate production function does not appear to be Cobb-Douglas.
This is not the first paper to estimate the elasticity of substitution while taking into account the presence of biased technological change. Among others, David and van de Klundert (1965) and Kalt (1978) ran regressions analogous to those in section 5 below, and estimated elasticities equal to 0.32 and 0.76, respectively. This paper adds to this literature in at least three respects. First, by explicitly discussing and correcting the bias inherent in the assumption of Hicks-neutral technological change, I am able to reconcile the traditional low estimates of Lucas (1969) and others with the widely cited ones of Berndt (1976). 6 Second, by focusing on a more recent period, I am able to benefit from the higher-quality data made available by the work of Herman (2000), Krusell et al. (2000), and Jorgenson and Ho (2000). Finally, my empirical analysis incorporates recent developments in the econometric analysis of time series that permit a better treatment of the nonstationary nature of the series involved in the estimation. 7
The rest of the paper is organized as follows. In section 2, I follow Berndt (1976) in deriving six alternative specifications for the estimation of the elasticity of substitution under the assumption of Hicks-neutral technological change. Section 3 discusses the data used in the estimations. Section 4 presents estimates of the elasticity based on both classical econometric techniques and modern time series analysis. Section 5 discusses the crucial misspecification in Berndt's (1976) contribution and presents estimates that correct for it by allowing for biased technical change. Section 6 concludes.
5 Solving the differential equation $f'(k_t)k_t/f(k_t) = \alpha$ yields $y_t = C k_t^{\alpha}$, where $C$ is a constant of integration.
6 Kalt (1978), for instance, incorrectly dismissed Berndt's results by claiming that he had estimated the elasticity "without regard to technological change" (p. 762).
7 Following the dual cost function approach pioneered by Nerlove (1963) and Diewert (1971), a separate branch of the literature has provided estimates of the elasticity based on first-order conditions derived from cost minimization rather than profit maximization. For instance, Berndt and Christensen (1973) fitted a translog cost function to the U.S. manufacturing sector for the period 1929-68, obtaining elasticities of substitution between capital equipment and labor and between capital structures and labor slightly higher than one. Nevertheless, their estimates should be treated with caution because, like Berndt (1976), the authors failed to deal properly with technological change.

Model Specification
I begin by assuming that aggregate production in the U.S. private sector can be represented by a constant returns to scale production function characterized by a constant elasticity of substitution between the two factors, capital and labor. Arrow et al. (1961) showed that the assumption of a constant elasticity of substitution implied the following functional form for the production function:
$$Y_t = A_t \left[\delta K_t^{\frac{\sigma-1}{\sigma}} + (1-\delta) L_t^{\frac{\sigma-1}{\sigma}}\right]^{\frac{\sigma}{\sigma-1}},$$
where $Y_t$ is real output, $K_t$ is the flow of services from the real capital stock, $L_t$ is the flow of services from production and nonproduction workers, $A_t$ is a Hicks-neutral technological shifter, $\delta$ is a distribution parameter, and the constant $\sigma$ is the elasticity of substitution between capital and labor. 8 Following Berndt (1976), it is useful to define the aggregate input function $F_t \equiv Y_t/A_t$, which given the assumption of Hicks-neutral technological change is independent of $A_t$. Profit maximization by firms in a competitive framework implies two first-order conditions, equating real factor prices to the real value of their marginal products.
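As a sketch of the intermediate step, the first-order condition for capital can be taken to logarithms as follows. The symbols $\delta$, $\sigma$, $F_t$, $R_t$ and $P_t$ follow the definitions in the text; writing the CES form with the exponent $(\sigma-1)/\sigma$ is the standard parameterization and is assumed here rather than quoted from the paper. The condition for labor is analogous, with $1-\delta$, $L_t$ and $W_t$ in place of $\delta$, $K_t$ and $R_t$.

```latex
% Sketch under the stated assumptions: capital first-order condition in logs.
\begin{align*}
F_t &= \left[\delta K_t^{\frac{\sigma-1}{\sigma}}
        + (1-\delta) L_t^{\frac{\sigma-1}{\sigma}}\right]^{\frac{\sigma}{\sigma-1}} \\
\frac{\partial F_t}{\partial K_t}
    &= \delta \left(\frac{F_t}{K_t}\right)^{1/\sigma} = \frac{R_t}{P_t} \\
\log\frac{F_t}{K_t}
    &= -\sigma \log\delta + \sigma \log\frac{R_t}{P_t}
\end{align*}
```

The intercept $-\sigma\log\delta$ is the kind of constant "that depends on $\delta$" referred to in the text, and the slope on the log real rental price is the elasticity of substitution itself.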
These conditions can be rewritten and expanded with an error term to obtain:
$$\log(F_t/K_t) = \alpha_1 + \sigma \log(R_t/P_t) + \varepsilon_{1,t} \tag{1}$$
$$\log(F_t/L_t) = \alpha_2 + \sigma \log(W_t/P_t) + \varepsilon_{2,t} \tag{2}$$
where $R_t$, $W_t$, and $P_t$ are the prices of capital services, labor services, and aggregate input $F_t$, respectively, and $\alpha_1$ and $\alpha_2$ are constants that depend on $\delta$. 9 A third alternative specification can be obtained by subtracting (1) from (2):
$$\log(K_t/L_t) = \alpha_3 + \sigma \log(W_t/R_t) + \varepsilon_{3,t} \tag{3}$$
Following Berndt (1976), one can also rearrange equations (1) through (3) to obtain the following three reverse regressions:
$$\log(R_t/P_t) = \alpha_4 + \frac{1}{\sigma} \log(F_t/K_t) + \varepsilon_{4,t} \tag{4}$$
$$\log(W_t/P_t) = \alpha_5 + \frac{1}{\sigma} \log(F_t/L_t) + \varepsilon_{5,t} \tag{5}$$
$$\log(W_t/R_t) = \alpha_6 + \frac{1}{\sigma} \log(K_t/L_t) + \varepsilon_{6,t} \tag{6}$$
I hereafter denote the estimates of $\sigma$ based on equations (1) through (6) by $\hat\sigma_i$, $i = 1, \ldots, 6$. 10 As pointed out by Berndt (1976), in this bivariate setting, the following equalities will necessarily hold for the OLS estimates:
$$\hat\sigma_1 = R_1^2\,\hat\sigma_4, \qquad \hat\sigma_2 = R_2^2\,\hat\sigma_5, \qquad \hat\sigma_3 = R_3^2\,\hat\sigma_6, \tag{7}$$
where $R_i^2$ refers to the R-square in equation $i$. These equalities in turn imply the inequalities $\hat\sigma_1 \le \hat\sigma_4$, $\hat\sigma_2 \le \hat\sigma_5$ and $\hat\sigma_3 \le \hat\sigma_6$. More importantly, it follows from (7) that the larger the R-square in the OLS regressions, the closer the standard and reverse estimates will be. It should be emphasized, however, that these results hold only for the OLS estimates. 11 On the other hand, nothing can be predicted on statistical grounds about the relative size of the estimates $\hat\sigma_1$, $\hat\sigma_2$ and $\hat\sigma_3$, although previous studies led Berndt (1976) to point out that estimates based on the marginal product of labor equation (2) seem to yield higher estimates of the elasticity of substitution than estimates based on the marginal product of capital equation (1). One could therefore expect the estimates to satisfy $\hat\sigma_2 > \hat\sigma_1$. 12
8 The elasticity of substitution between capital and labor is defined as $\sigma = d \log(K/L) / d \log(F_L/F_K)$, where $F_K$ and $F_L$ are the marginal products of capital and labor, respectively.
9 A simple way to justify these disturbance terms is to appeal to optimization errors on the part of firms (cf. Berndt, 1991, p. 454).
10 In the presence of imperfect competition in the product market, the markup becomes an omitted variable in equations (1), (2), (4), and (5), but equations (3) and (6) remain valid. On the other hand, in the presence of imperfect competition in the factor markets, even equations (3) and (6) may produce biased estimates if the wedge between marginal products and factor prices is different for different factors.
11 As pointed out by a referee, the error terms $\varepsilon_{i,t}$, $i = 1, \ldots, 6$, are likely to be correlated across equations. I have experimented with running equations (1) through (3) and (4) through (6) as a seemingly unrelated regression (SUR) system. Because little efficiency is gained by doing so, I only present single-equation estimates, which are easier to compare with previous studies.
12 One point that was not explicitly described in Berndt (1976) is the derivation of the standard errors for $\hat\sigma_4$, $\hat\sigma_5$ and $\hat\sigma_6$. By a direct application of the Delta Method, the estimated variance of these elasticities can be computed as follows: $\widehat{\mathrm{Var}}(\hat\sigma_i) = \hat\sigma_i^4\,\widehat{\mathrm{Var}}(\widehat{1/\sigma_i})$, $i = 4, 5, 6$, where $\widehat{1/\sigma_i}$ is the estimated slope coefficient in equation $i$.

Data Construction and Sources
Estimation of equations (1) through (6) requires data on the flow of labor services $L_t$, the nominal price of these labor services $W_t$, the flow of capital services $K_t$, the rental price of capital $R_t$, and the aggregate input index $F_t$, as well as its associated price $P_t$. To illustrate the effect of data quality on the estimates of the elasticity, I experiment with different methods in the construction of these variables.
I initially assume that labor services are proportional to employment and proxy the flow of these services by total private employment, defined as the sum of the number of employees in private domestic industries and the number of self-employed workers. 13 For the regressions including the public sector (data configuration A below), the total number of government employees was added to the labor input measure. Jorgenson has argued repeatedly that total employment is not an appropriate measure of the flow of labor services because it ignores significant differences in the quality of the labor services provided by different workers. Jorgenson and collaborators have also provided quality-adjusted measures of labor services in several contributions by combining individual data from the Censuses of Population and from the Current Population Survey. Their measure of labor input reflects characteristics of individual workers, such as age, sex and education, as well as class of employment and industry.
In particular, their measure is a weighted sum of the supply of the different types or categories of labor input, where the weights are the share of overall labor compensation captured by a particular type. I consider here the most recent series reported in Jorgenson and Ho (2000), which considers 168 different categories of workers.
I take the nominal price of labor services to equal the total compensation of employees divided by $L_t$. Compensation of employees was obtained from the National Income and Product Accounts (NIPA) and includes wage and salary accruals, as well as supplements to wages and salaries (e.g., employer contributions for social insurance). Following the approach in Krueger (1999), I next correct this wage measure by adding two-thirds of proprietors' income to the overall compensation of employees. 14
As is standard in the literature, I assume that the flow of capital services is proportional to the U.S. capital stock. 15 The nominal capital stock data was obtained from Herman (2000) and is defined as the sum of nonresidential private fixed assets and government assets, the latter being left out when only the private sector is considered. The real capital stock $K_t$ is simply defined as the nominal capital stock divided by the price of capital. 16 I first construct the price of capital using nonresidential private investment deflators obtained from the NIPA. The NIPA deflator for capital equipment has been criticized for not adjusting for the increasing quality of capital goods, thereby systematically overstating the price of capital equipment. Krusell et al. (2000) have constructed an alternative deflator for equipment, building on previous work by Gordon (1990). They also suggest the use of the implicit price deflators for nondurable consumption and services when deflating the nominal stock of capital structures. I employ their price indices to construct an alternative deflator for private nonresidential fixed assets using Törnqvist's discrete approximation to the continuous Divisia index. 17 In particular, letting $P^E_t$ be the adjusted price of equipment and $P^S_t$ the price of structures, the price of capital $P^K_t$ is constructed as follows:
$$\log\frac{P^K_t}{P^K_{t-1}} = s^E_t \log\frac{P^E_t}{P^E_{t-1}} + \left(1 - s^E_t\right) \log\frac{P^S_t}{P^S_{t-1}},$$
where $s^E_t$ is the arithmetic mean of the expenditure shares in capital equipment in the two periods, i.e., $s^E_t = \tfrac{1}{2}\left(\omega^E_t + \omega^E_{t-1}\right)$, with $\omega^E_t$ denoting the expenditure share of capital equipment in period $t$.
Capital income is defined as the sum of corporate profits, net interest, and rental income of persons, and is taken from the NIPA. The rental price of capital services $R_t$ is computed as the ratio of total capital income to the real capital stock $K_t$. 18 As shown by Hulten (1986), if capital is the sole quasi-fixed input in production and there is perfect competition, this approach yields unbiased estimates of the unobserved shadow rental rate of capital, whereas the alternative Hall and Jorgenson (1967) formulae produce biased estimates. 19
Finally, we are left with the construction of the aggregate input index $F_t$ and its price $P_t$. Unfortunately, there is no clear counterpart for these variables in the data. Given that $F_t = Y_t/A_t$, one alternative would be to construct $F_t$ by deflating value added $Y_t$ by some index of Hicks-neutral technical efficiency. Berndt (1976) instead suggested constructing a measure of $F_t$ based on the available data on capital and labor services. In particular, he computed the price of aggregate input $P_t$ as a Törnqvist price index of the rental prices of capital ($R_t$) and labor ($W_t$). The aggregate input index $F_t$ is then constructed as
$$F_t = \frac{R_t K_t + W_t L_t}{P_t}.$$
In the first part of the paper and in order to facilitate comparison with his results, I will follow the approach in Berndt (1976). In section 5, I will show that this approach is infeasible in the presence of biased technical change, and will discuss alternative specifications that make use of time series on value added.
Table 1 summarizes the five different data configurations for which I computed estimates of the elasticity. I started with configuration A, which (i) includes the public sector, (ii) does not add any fraction of proprietors' income to total compensation of employees, (iii) uses the NIPA deflators to construct the value of capital services and their price, and (iv) uses employment as a measure of labor input services. The public sector is excluded in data procedure B, while proprietors' income is added in C. Columns D and E incorporate sequentially the quality-adjusted price of capital indices of Krusell et al. (2000) and the quality-adjusted labor input series of Jorgenson and Ho (2000). I interpret data configurations A through E as employing successively more refined data.
13 These series were obtained from the Bureau of Economic Analysis website.
14 Gollin (2002) suggests treating all proprietors' income as labor income. This alternative adjustment turns out to have only a marginal effect on the estimates (details available upon request).
15 An interesting literature (Burnside et al., 1995; Basu, 1996) casts some doubts on this assumption by emphasizing the importance of variations in capital utilization for explaining the procyclical nature of productivity. An explicit correction for factor utilization is beyond the scope of this paper.
16 As a robustness test, I employed the perpetual inventory method to construct an alternative measure of the real private capital stock using investment data from NIPA and depreciation data from Fraumeni (1997). The resulting capital stocks were remarkably similar to those obtained by Herman (2000).
17 The deflator for structures is also constructed as a Törnqvist index using the NIPA implicit price deflators for nondurable consumption and services.
18 Assuming instead that the rental price of capital is proportional to the price of capital $P^K_t$ leads to very similar results.
19 I am grateful to an anonymous referee for pointing this out.
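The Törnqvist construction used in this section for the price of capital (and, following Berndt, for the price of aggregate input) can be sketched as below. This is a minimal illustration with two components, not the author's code: the function name `tornqvist_index`, the argument names, and the toy numbers in the usage block are all hypothetical.

```python
import math


def tornqvist_index(prices_a, prices_b, shares_a, base=1.0):
    """Chain a Tornqvist price index over two components (a and b).

    shares_a[t] is the expenditure share of component a in period t; the
    share of component b is 1 - shares_a[t]. All inputs are hypothetical.
    """
    index = [base]
    for t in range(1, len(prices_a)):
        # Weight: arithmetic mean of the period t-1 and period t shares.
        s_a = 0.5 * (shares_a[t] + shares_a[t - 1])
        log_growth = (s_a * math.log(prices_a[t] / prices_a[t - 1])
                      + (1.0 - s_a) * math.log(prices_b[t] / prices_b[t - 1]))
        index.append(index[-1] * math.exp(log_growth))
    return index


if __name__ == "__main__":
    equipment = [1.00, 0.90, 0.81]    # made-up equipment prices
    structures = [1.00, 1.10, 1.21]   # made-up structures prices
    shares = [0.5, 0.5, 0.5]          # made-up equipment expenditure shares
    print(tornqvist_index(equipment, structures, shares))
```

The index is built in log-changes and then chained, so the weights can move from period to period; this is what makes it a discrete approximation to a Divisia index rather than a fixed-weight index.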

Estimates under Hicks-Neutral Technological Change
In this section, I present estimates of the elasticity of substitution between capital and labor based on both classical regression analysis and modern time series analysis. I start by reporting simple Ordinary Least Squares estimates of equations (1) through (6) for different data configurations. I later refine these estimates by dealing with issues related to autocorrelation of the disturbances, endogeneity of the regressors, and nonstationarity of the series.
Table 2 presents OLS estimates of equations (1) through (6) for the different data configurations described in the previous section. The results are striking in that all the estimates of the elasticity are remarkably close to one. Furthermore, the R-square in the regressions tends to increase with the quality and precision of the data, implying that the standard and reciprocal specifications of each first-order condition yield increasingly similar estimates. Moreover, the more refined the data procedure, the closer the estimates become across first-order conditions. For instance, with the least preferred data configuration (column I), the estimates range from 0.924 to 0.962, whereas with the most preferred data procedure (column V) this range collapses to the interval [1.002, 1.022]. A comparison of the standard errors of the estimates with those obtained by Berndt (and reported in column VIII of Table 2) reveals that my estimates are four to five times more precise than his. Despite this fact, the null hypothesis of a unit elasticity of substitution cannot be rejected at the 5% significance level for any of the six specifications. 20 Table 2 also reports the Durbin-Watson statistic for each estimation. The highest Durbin-Watson statistic in columns I through V is 0.625, indicating a clear rejection of the null hypothesis of no serial autocorrelation in the residuals. 21
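The link between the standard and reverse OLS estimates, and the Delta Method standard error for the reverse estimate, can be checked with a short sketch. It is an illustration only: `ols` and `standard_and_reverse` are hypothetical helper names, and the data in the usage block are made up rather than drawn from Table 2.

```python
import math


def ols(x, y):
    """Bivariate OLS of y on a constant and x.

    Returns (slope, standard error of the slope, R-squared).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    ssr = syy - slope * sxy          # residual sum of squares
    se = math.sqrt(ssr / (n - 2) / sxx)
    return slope, se, 1.0 - ssr / syy


def standard_and_reverse(x, y):
    """Elasticity estimates from the standard and the reverse regression."""
    sigma_std, se_std, r2 = ols(x, y)      # y = a + sigma * x
    b_rev, se_b, _ = ols(y, x)             # x = a' + (1/sigma) * y
    sigma_rev = 1.0 / b_rev
    # Delta method for g(b) = 1/b: Var(g) is approximately Var(b) / b**4.
    se_rev = se_b / b_rev ** 2
    return sigma_std, se_std, sigma_rev, se_rev, r2


if __name__ == "__main__":
    log_price = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]           # made-up regressor
    log_ratio = [0.02, 0.09, 0.22, 0.28, 0.41, 0.52]     # made-up regressand
    print(standard_and_reverse(log_price, log_ratio))
```

Because the standard estimate equals the R-square times the reverse estimate, the two can only differ noticeably when the fit is poor, which is consistent with the narrowing range of estimates as the data are refined.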

Feasible Generalized Least Squares Estimation
The OLS Durbin-Watson statistics indicate the existence of serial correlation in the residuals but are not informative about the specific autocorrelation structure. A natural candidate is a standard AR(1) process, i.e., $\varepsilon_t = \rho\,\varepsilon_{t-1} + u_t$, with $u_t$ being white noise. Richer ARMA processes could potentially provide a better fit of the residuals, but this would leave us with fewer observations for the estimation of the parameters of interest. In order to study the plausibility of the assumption of an AR(1) process, I ran the regression $\hat\varepsilon_t = \rho\,\hat\varepsilon_{t-1} + u_t$, where $\hat\varepsilon_t$ is the vector of OLS residuals in column V. 22 Ljung-Box tests at up to five lags were performed for each of the six specifications, leading to no rejections of the null hypothesis of the estimated residuals $\hat u_t$ being white noise. These results favor the use of an AR(1) process to parameterize the structure of the disturbances in equations (1) through (6).
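The diagnostic described above (fit an AR(1) to the OLS residuals, then test the implied innovations for remaining autocorrelation) can be sketched as follows. The helper names are hypothetical and the example series is artificial; the Ljung-Box statistic is computed with its standard formula, $Q = n(n+2)\sum_{k=1}^{h} r_k^2/(n-k)$.

```python
def ar1_coefficient(resid):
    """OLS slope from regressing resid[t] on resid[t-1] (no constant)."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(resid[t - 1] ** 2 for t in range(1, len(resid)))
    return num / den


def ar1_innovations(resid, rho):
    """Implied innovations u[t] = resid[t] - rho * resid[t-1]."""
    return [resid[t] - rho * resid[t - 1] for t in range(1, len(resid))]


def ljung_box_q(series, max_lag):
    """Ljung-Box Q statistic for autocorrelation up to max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k.
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        q += r_k ** 2 / (n - k)
    return n * (n + 2) * q


if __name__ == "__main__":
    resid = [0.5, 0.45, 0.3, 0.35, 0.1, -0.05, -0.2, -0.1, -0.3, -0.25]  # made up
    rho_hat = ar1_coefficient(resid)
    print(rho_hat, ljung_box_q(ar1_innovations(resid, rho_hat), max_lag=3))
```

In practice Q is compared with a chi-square critical value (about 11.07 at the 5% level with five degrees of freedom); when the series being tested are residuals from a fitted AR model, the degrees of freedom are commonly reduced by the number of estimated autoregressive parameters.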
Column VI of Table 2 then presents FGLS estimates of the elasticity obtained by applying the two-step Prais-Winsten procedure to the preferred data configuration.
20 In Berndt (1976), the hypothesis is rejected in three of the six cases.
21 For each of the six specifications in column V, I also performed a Ljung-Box test for autocorrelation. The null hypothesis of no autocorrelation up to order $k$ was rejected in all six regressions for all $k \le 30$.
22 Hereafter, I limit the analysis to the most refined data configuration E.
23 Iterating the coefficients to convergence had only a minor effect on the results. Since nothing is gained asymptotically by iterating the process, I only report the two-step estimates, which are more comparable to the Generalized IV estimates reported in Table 2 and discussed below.

Generalized IV Estimation
Berndt's (1976) approach of estimating equations (1) through (3) and their reverse specifications clearly exposes the existence of an endogeneity problem. Equations (1) through (6) were derived from the first-order conditions of profit maximization and, hence, they can be readily interpreted as the aggregate private sector demand for capital and labor services. From the theory of simultaneous equations models, it is well known that these demand equations will not be identified unless the estimation makes use of some set of exogenous variables that shift the supply of capital and labor (cf. Hausman, 1983). Consequently, the OLS and FGLS estimates presented above are likely to be biased, with the direction of the bias being uncertain. Berndt (1976) acknowledged the same simultaneous equation bias and proposed a simple two-stage least squares (2SLS) procedure to resolve it. In his first-stage regressions, Berndt (1976) introduced a rather large number of instruments. 24 The use of such a large set of instruments can be detrimental in at least two respects. On the one hand, if any of these instruments is in fact endogenous, 2SLS estimates will be inconsistent. On the other hand, if some of the instruments are only weakly correlated with the regressors, even if the exogeneity requirement is met, small sample biases will arise (cf. Bound, Jaeger and Baker, 1995). For these reasons, I instead focus on a smaller set of instruments.
In particular, I take the following three variables to be exogenous to the model but correlated with the regressors: (1) U.S. population, (2) wages in the government sector, and (3) real capital stock owned by the government. 25 I interpret these variables as different types of supply shifters. It is clear that the size of the U.S. population is likely to have a significant effect on the supply of both capital and labor services. Government wages are also likely to affect the supply of labor in the private sector, with government capital formation having an analogous effect on the supply of capital in the private sector. The exogeneity of these instruments also seems plausible. As is standard in macroeconomics, I take the fertility choice to be exogenous to the model, while government variables are assumed not to respond (at least contemporaneously) to market prices and quantities.
24 See Berndt and Christensen (1973) or Antràs (2003) for a complete list.
25 Wages in the government sector are computed as labor income accruing to government employees divided by their total number, and deflated by the aggregate input price index $P_t$. To construct the real stock of capital owned by the government I divide the nominal figures of Herman (2000) by the price of capital index $P^K_t$.
Having discussed the choice of instruments, I next turn to the selection of an appropriate estimation technique. One alternative would be to run equations (1) through (6) using a standard 2SLS procedure. Nevertheless, there is no reason to believe that instrumenting would solve the autocorrelation problem discussed above. I choose instead to implement a generalized instrumental variable (GIV) procedure developed by Fair (1970), which I summarize in Appendix A. Column VII of Table 2 presents estimates of the elasticity of substitution obtained by applying this technique to our preferred data configuration E. The estimates are contained in the interval (0.989, 1.017). The standard errors are again higher than the OLS ones and the null hypothesis of a unit elasticity of substitution cannot be rejected for any of the six estimates. A quick comparison of columns VI and VII also reveals that, relative to the FGLS estimates, the GIV estimates are slightly higher in equations (1), (2), and (3), but slightly lower in equations (4), (5), and (6). I interpret this as an indication that the instruments are not only dealing with the simultaneous equation bias, but might also be correcting for a latent errors-in-variables bias (remember that equations (4), (5), and (6) deliver an estimate of $1/\sigma_i$). 26
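To make the role of a supply shifter concrete, the sketch below compares OLS with a plain, just-identified instrumental variable estimator on simulated data. It is not Fair's (1970) GIV procedure, which additionally handles the AR(1) disturbances and is described in an appendix not reproduced here; the data-generating process, the function names, and all parameter values are assumptions made purely for illustration.

```python
import random


def cov(a, b):
    """Sample covariance (divisor n) of two equally long lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def ols_slope(x, y):
    """OLS slope of y on a constant and x."""
    return cov(x, y) / cov(x, x)


def iv_slope(x, y, z):
    """Just-identified IV slope of y on x using a single instrument z."""
    return cov(z, y) / cov(z, x)


def simulate(n=2000, sigma=1.0, seed=0):
    """Hypothetical system in which the regressor is correlated with the error."""
    rng = random.Random(seed)
    x, y, z = [], [], []
    for _ in range(n):
        shifter = rng.gauss(0.0, 1.0)      # exogenous supply shifter (instrument)
        shock = rng.gauss(0.0, 1.0)        # demand disturbance
        xi = shifter + 0.5 * shock         # regressor responds to the disturbance
        x.append(xi)
        y.append(sigma * xi + shock)
        z.append(shifter)
    return x, y, z


if __name__ == "__main__":
    x, y, z = simulate()
    print("OLS:", ols_slope(x, y), "IV:", iv_slope(x, y, z))
```

In this artificial system the OLS slope is pulled away from the true value because the regressor moves with the disturbance, while the instrument, which only shifts supply, recovers it. In the paper's application the direction of the OLS bias is described as uncertain, so the sign of the bias in this toy example should not be read as a prediction.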

Time Series Estimation
Up to this point, I have followed closely the approach proposed by Berndt (1976), the major variation being in the explicit treatment of serial correlation in the disturbances. I now turn to a whole set of different issues that arise when considering the nonstationary nature of the series involved in the estimation. Figure 1 graphs the six series that form the basis of our estimates, where the logarithm of the variables has been normalized to equal 0 in 1948. Two facts emerge from the figure. First, the graph uncovers potential nonstationarities in each one of the series: the logarithms of $W_t/P_t$, $F_t/L_t$, $W_t/R_t$ and $K_t/L_t$ all clearly trend upwards, while those of $R_t/P_t$ and $F_t/K_t$ show a downward trend. Second, the two variables in each of the specifications (1) through (6) follow similar trends. This suggests that the correlations captured in the regressions above might all be of the so-called spurious type (cf. Granger and Newbold, 1974, and Phillips, 1986). The very high R-squares and the low Durbin-Watson statistics obtained in the OLS estimation certainly point in that direction. I next turn to a formal investigation of this possibility.
[Figure 1: log(F/K), log(R/P), log(F/L), log(W/P), log(K/L), and log(W/R), plotted annually from 1948 and normalized to 0 in 1948.]
26 Conditional on the process followed by the disturbances being AR(1) and under the null hypothesis of exogeneity of the regressors, FGLS provides consistent and asymptotically efficient estimates of the elasticity of substitution, whereas Fair's GIV estimates are also consistent but inefficient. Under the alternative hypothesis, the FGLS estimates are inconsistent, while the GIV ones remain consistent. Using a Hausman (1978) specification test, I tested for the null hypothesis of exogeneity of the regressors and for all six equations the null was not rejected at significance levels well above 5%.
Table 3 reports a summary of the unit root tests I performed on each of the series.
The first row presents the results of a simple Dickey-Fuller test of a unit root in the series against the alternative hypothesis of trend-stationarity. It is clear from Table 3 that for none of the six series does the test reject the hypothesis of a unit root. The next two rows extend this simple test to allow for serial correlation by adding higher-order autoregressive terms to the test. I performed this so-called Augmented Dickey-Fuller test with one and two lags, and the null hypothesis of a unit root was again not rejected for any of the series. Finally, I also implemented a Phillips-Perron test at truncation lags 2, 3 and 4 reaching again the same conclusion. In the bottom panel of Table  3, I report the results of the same tests performed on each of the six series expressed in first differences. In this case, the results indicate a rejection of the null hypothesis of the series being integrated of order two.
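The mechanics of the (Augmented) Dickey-Fuller regression described above can be sketched as follows. This is a minimal illustration using only NumPy on simulated data; the function name `dickey_fuller_t` and the simulated series are hypothetical and are not the series or software used in the paper. The statistic it returns does not follow a standard t distribution under the null, so it has to be compared with Dickey-Fuller critical values (roughly -3.4 to -3.5 at the 5% level when a constant and trend are included).

```python
import numpy as np

def dickey_fuller_t(y, lags=0):
    """t-statistic on y_{t-1} in the regression
    dy_t = a + b*t + rho*y_{t-1} + sum_j c_j*dy_{t-j} + e_t.
    The unit-root null corresponds to rho = 0; lags > 0 gives the augmented test."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                          # dy[i] = y[i+1] - y[i]
    n = len(dy) - lags                       # usable observations
    cols = [np.ones(n), np.arange(n, dtype=float), y[lags:-1]]
    for j in range(1, lags + 1):             # lagged differences (augmentation terms)
        cols.append(dy[lags - j:len(dy) - j])
    X = np.column_stack(cols)
    target = dy[lags:]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    s2 = resid @ resid / (n - X.shape[1])    # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[2] / np.sqrt(cov[2, 2])

# Hypothetical simulated series, for illustration only
rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.normal(size=300))   # contains a unit root
white_noise = rng.normal(size=300)              # stationary
stat_rw = dickey_fuller_t(random_walk, lags=1)
stat_wn = dickey_fuller_t(white_noise, lags=1)
print(stat_rw, stat_wn)
```

Applying the same function to the first difference of a series (`np.diff(series)`) corresponds to the test of integration of order two reported in the bottom panel of Table 3.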
I therefore conclude that all six series are nonstationary and integrated of order one, which implies that the OLS, FGLS and GIV estimates computed above are all potentially subject to a spurious regression bias. In fact, as shown by Phillips (1986), in this situation OLS estimates will not be consistent unless a linear combination of the dependent and independent variables is stationary, that is, unless the two variables entering each regression are cointegrated.27 Table 4 presents the results from two cointegration tests. The top panel considers Engle and Granger's (1987) residual-based Augmented Dickey-Fuller test, which hinges on testing the stationarity of the residuals from the OLS regressions (1) through (6). As pointed out by Engle and Granger (1987), the critical values of standard unit root tests are not appropriate when applied to the OLS residuals because they lead to too many rejections of the null hypothesis of no cointegration. MacKinnon (1991) has linked the appropriate critical values to the sample size and to a set of parameters that vary only with the specification of the cointegration equation, the number of variables and the significance level. These critical values are reported in the last column. As is apparent from the table, the results are not conclusive. When the estimation includes one lagged first difference of the residuals, the null hypothesis of nonstationarity of the residuals is rejected for all six specifications, implying that the series in the estimation seem to be cointegrated. Nevertheless, when a second lagged first difference is added, the results are overturned and the tests suggest instead that the series in the estimation may in fact not be cointegrated.
Table 4, bottom panel, column headers: Max-Lambda test (r = 0 vs r = 1; r ≤ 1 vs r = 2), Trace test (r = 0 vs r = 1; r ≤ 1 vs r = 2), number of lags.
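The two-step logic of the residual-based test can be sketched as follows: run the levels regression by OLS, then run an augmented Dickey-Fuller regression (without constant) on its residuals. This is a minimal NumPy illustration on simulated data; the function names and the data-generating process are hypothetical. The resulting statistic must be compared with MacKinnon-type critical values rather than standard Dickey-Fuller ones, and those critical values are not computed here.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def engle_granger_stat(y, x, lags=1):
    """Step 1: regress y on a constant and x in levels.
    Step 2: ADF regression on the residuals u_t (no constant):
    du_t = rho*u_{t-1} + sum_j c_j*du_{t-j} + e_t.
    Returns the levels slope and the t-statistic on rho."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, u = ols(X, y)
    du = np.diff(u)
    n = len(du) - lags
    cols = [u[lags:-1]] + [du[lags - j:len(du) - j] for j in range(1, lags + 1)]
    Z = np.column_stack(cols)
    target = du[lags:]
    g, e = ols(Z, target)
    s2 = e @ e / (n - Z.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(Z.T @ Z)[0, 0])
    return beta[1], g[0] / se

# Hypothetical simulated data: one cointegrated pair, one pair of independent random walks
rng = np.random.default_rng(1)
x = np.cumsum(rng.normal(size=300))
y_coint = 0.5 + 0.8 * x + rng.normal(scale=0.5, size=300)   # stationary deviation from 0.8*x
y_indep = np.cumsum(rng.normal(size=300))                   # unrelated to x
slope_c, stat_c = engle_granger_stat(y_coint, x)
slope_s, stat_s = engle_granger_stat(y_indep, x)
print(slope_c, stat_c, stat_s)
```

The `lags` argument plays the role of the number of lagged first differences of the residuals, which is the choice that drives the conflicting conclusions discussed above.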
The bottom panel of Table 4 implements the maximum likelihood cointegration test suggested by Johansen and Juselius (1990), which tests the null hypothesis of the existence of r cointegrating vectors against the alternative of the existence of r + 1 cointegrating vectors. Implementing the test requires specifying a particular model for the cointegration equation as well as choosing the number of lags of the first difference of the variables to be included in the estimation. In light of equations (1) through (6), I choose a model with a constant and no trend and compute the statistics with one and two lagged first differences of the data. The results in the bottom panel of Table 4 indicate that, when the estimation includes one lag, the null hypothesis of no cointegration is clearly rejected for all specifications. As with the Engle and Granger (1987) tests, however, when the number of lags is increased to two, the null hypothesis of zero cointegrating vectors cannot be rejected for any of the specifications.
Overall, these mixed results of the cointegration tests indicate that the OLS estimates in Table 2 should be interpreted with caution because of a potential spurious regression bias. As discussed by Hamilton (1994, p. 562), a natural cure for spurious regressions would be to difference the data before estimating the equations. The disadvantage of this approach is that important long-run information would be lost and, in particular, the interpretation of our estimates of $\sigma_i$ would become much less transparent. Interestingly, the existence of a unit root in the OLS residuals implies that the FGLS and GIV estimates in Table 2 are asymptotically equivalent to the estimates that would be obtained with the differenced data (cf. Blough, 1992, and the discussion in Hamilton, 1994, p. 562). This implies that the estimates in columns VI and VII of Table 2 are consistent, but it also indicates that these estimates might be neglecting important long-run information in the data.
27 As discussed below, this is not necessarily true for our FGLS and GIV estimates.

Estimates under Biased Technological Change
The results in the previous section provide some evidence in favor of a Cobb-Douglas specification of the U.S. aggregate production function. In particular, the GIV estimates on the preferred data configuration lead to no rejections of the null hypothesis of a unit elasticity of substitution between capital and labor. Nevertheless, an explicit treatment of the nonstationary nature of the data suggested that the results should be treated with caution.
In this section, I will cast further doubt on the validity of the estimates obtained under the assumption of Hicks-neutral technological change. I will first show that, in the presence of biased technical change, Berndt's (1976) estimating equations are misspecified in a critical way, biasing the estimates towards results that support a Cobb-Douglas view of the U.S. economy. I will next offer some solutions to this misspecification problem and will examine how the estimates are affected by these modifications.

The Source of the Bias
Consider again the Arrow et al. (1961) CES production function, now expanded to allow for non-neutral technological change (equation (8)), where $A^K_t$ is an index of capital-augmenting efficiency and $A^L_t$ is an index of labor-augmenting efficiency. It is straightforward to show that, given equation (8), there is no total-factor productivity index $A_t$ such that $F_t = Y_t/A_t$ is only a function of the capital-labor ratio, and not of $A^K_t$, $A^L_t$, or their ratio. In other words, under biased technological change, it becomes impossible to remove factor efficiency from the aggregate input index $F_t$. This implies that, under these circumstances, the estimating equations (1), (2), (4), and (5), which include $F_t$ and its associated price, are all misspecified in the sense that they suffer from an omitted-variable bias. Furthermore, taking the first-order conditions with respect to capital and labor, one can obtain the expression in equation (9), from which equations (3) and (6) are derived. Equation (9) clearly illustrates that as long as $A^K_t \neq A^L_t$ (i.e., as long as technological change is non-neutral) equations (3) and (6) also suffer from an omitted-variable bias. To get a better understanding of the direction of the bias, it is useful to subtract $\log(K_t/L_t)$ from both sides of equation (9), which yields equation (10). The left-hand side of equation (10) is simply the logarithm of the labor share in total output divided by the capital share. As discussed in the introduction, this variable has been remarkably stable over the period 1948-1998. On the other hand, the capital-labor ratio $K_t/L_t$ on the right-hand side of (10) has steadily increased during the same period. Consequently, whenever the bias in technological change is ignored, i.e., whenever the ratio $A^K_t/A^L_t$ is not included in the regression, the estimate of $(1-\sigma)/\sigma$ will necessarily be close to zero, implying that the estimate of $\sigma$ will necessarily be close to one.
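As a sketch of the algebra behind equations (8) through (10), assume the standard factor-augmenting CES form with a distribution parameter written here as $\delta$ (the symbol is a notational assumption, and the layout below may differ from the author's). The second line uses the first-order conditions that set the wage-rental ratio equal to the ratio of marginal products; the last line subtracts $\log(K_t/L_t)$ from both sides of the third.

```latex
\begin{align*}
Y_t &= \left[\delta\,(A^K_t K_t)^{\frac{\sigma-1}{\sigma}}
      + (1-\delta)\,(A^L_t L_t)^{\frac{\sigma-1}{\sigma}}\right]^{\frac{\sigma}{\sigma-1}} \\[4pt]
\frac{W_t}{R_t} &= \frac{\partial Y_t/\partial L_t}{\partial Y_t/\partial K_t}
   = \frac{1-\delta}{\delta}
     \left(\frac{A^L_t}{A^K_t}\right)^{\frac{\sigma-1}{\sigma}}
     \left(\frac{K_t}{L_t}\right)^{\frac{1}{\sigma}} \\[4pt]
\log\frac{W_t}{R_t} &= \log\frac{1-\delta}{\delta}
   + \frac{1}{\sigma}\log\frac{K_t}{L_t}
   + \frac{1-\sigma}{\sigma}\log\frac{A^K_t}{A^L_t} \\[4pt]
\log\frac{W_t L_t}{R_t K_t} &= \log\frac{1-\delta}{\delta}
   + \frac{1-\sigma}{\sigma}\log\frac{K_t}{L_t}
   + \frac{1-\sigma}{\sigma}\log\frac{A^K_t}{A^L_t}
\end{align*}
```

In the last line, a roughly constant left-hand side together with a trending $K_t/L_t$ forces the estimated coefficient $(1-\sigma)/\sigma$ towards zero once the term in $A^K_t/A^L_t$ is dropped, which is the direction of bias described in the text.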
At first glance this does not seem surprising: if a Cobb-Douglas function describes aggregate production in the U.S. private sector well, the labor share should be approximately constant. The problem is that, in the presence of biased technological change, the argument does not run both ways. As is clear from equation (10), if $K_t/L_t$ and $A^L_t/A^K_t$ grow at the same rate, then steady factor shares can be consistent with any well-behaved production function, and certainly with aggregate production functions with non-unit elasticities of substitution.
A notable example is a version of the neoclassical growth model with labor-augmenting technological change (cf. Barro and Sala-i-Martin, 1995, Chapter 2). With rather weak conditions on the aggregate production function, the model delivers a balanced growth path in which the capital-labor ratio grows at the same rate as the index of labor-augmenting efficiency. Another example is provided by Acemoglu (2003), who develops a model in which the incentives to innovate depend on the size of the bill paid to each factor. In his model an increase in the labor share encourages labor-augmenting technological change, which in turn increases the capital-labor ratio and the wage-rental ratio. The fact that capital and labor are assumed to be gross complements ($\sigma < 1$) ensures that the increase in $K/L$ is higher than the increase in $w/r$, thereby bringing the labor share back to its steady-state equilibrium value. Acemoglu's (2003) model nicely illustrates the misspecification inherent in restricting technological change to be neutral in the regressions above. By not controlling for the bias in technological change, one is implicitly ascribing the full variation in the capital-labor ratio to factor substitution, when in fact part of the variation is explained by technological change.
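The direction of the bias discussed in this section can be illustrated with a small simulation. The sketch below is hypothetical: it generates a wage-rental ratio from a factor-augmenting CES first-order condition with a true elasticity of 0.6, lets $K/L$ and $A^L/A^K$ share the same trend growth rate, and then compares the regression of $\log(W/R)$ on $\log(K/L)$ with and without a time trend. All parameter values and variable names are assumptions chosen for illustration.

```python
import numpy as np

sigma_true = 0.6                 # assumed true elasticity of substitution
g = 0.02                         # assumed common trend growth of K/L and A^L/A^K
t = np.arange(51, dtype=float)   # 51 annual observations
rng = np.random.default_rng(0)

log_kl = g * t + 0.03 * np.sin(t / 2.0)   # capital-labor ratio: trend plus a cycle
log_al_ak = g * t                         # log(A^L/A^K): biased technical change
noise = rng.normal(scale=0.002, size=t.size)

# First-order condition: log(W/R) = (1/sigma) log(K/L) + ((sigma-1)/sigma) log(A^L/A^K)
log_wr = (1 / sigma_true) * log_kl + ((sigma_true - 1) / sigma_true) * log_al_ak + noise

def slope_on_first_regressor(y, *regressors):
    """OLS of y on a constant and the given regressors; returns the first slope."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

# The slope on log(K/L) estimates 1/sigma, so sigma_hat is its reciprocal
sigma_no_trend = 1.0 / slope_on_first_regressor(log_wr, log_kl)
sigma_with_trend = 1.0 / slope_on_first_regressor(log_wr, log_kl, t)
print(sigma_no_trend, sigma_with_trend)
```

In this setup the factor-share ratio is trendless by construction, so the regression that omits the trend returns an elasticity close to one even though the true value is 0.6, while the regression that includes the trend recovers a value near 0.6.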

Model Specification and Additional Data
As discussed above, in the presence of biased technological change, it is impossible to construct an index of aggregate input $F_t$ that is independent of the efficiency indices $A^K_t$ and $A^L_t$. Consequently, in order to consistently estimate the elasticity of substitution, it becomes necessary to devise a method to control for these indices, which in turn requires the imposition of some type of structure on the form of technological change (cf. Diamond, McFadden and Rodriguez, 1978). I follow the bulk of the literature in assuming that $A^K_t$ and $A^L_t$ grow at constant rates $\lambda_K$ and $\lambda_L$.28 Under this assumption, the first-order conditions for profit maximization applied to the production function (7) can be manipulated to obtain six specifications, (1') through (6'), analogous to equations (1) through (6) in section 2.29 There are two important differences between the specifications in (1') through (6') and those in (1) through (6). First, because of the impossibility of constructing an aggregate input index $F_t$, this index is replaced by real output $Y_t$. Second, all six specifications now include a time trend. As is clear from the equations, the exclusion of the time trend in the regressions above would in general lead to inconsistent estimates of the elasticity of substitution. The only exception is the Cobb-Douglas case ($\sigma = 1$), in which the bias would effectively be zero.30 Estimation of equations (1') through (6') requires data on real output $Y_t$ and its associated price $P^Y_t$. A natural candidate to proxy for $Y_t$ is real GDP in the U.S. private sector. As argued by Berndt (1976), using value added to measure $Y_t$ is in general problematic. As he points out, most studies consider only a subset of capital inputs, namely equipment and structures, thus ignoring land, inventories, and working capital. Because in this paper I have also focused on capital equipment and structures, the use of value added as a proxy for $Y_t$ is arguably inappropriate.
Nevertheless, in the presence of biased technological change, Berndt's (1976) approach of constructing the aggregate input index is infeasible, because the index $F_t$ will necessarily depend on $A^K_t$ and $A^L_t$, which are unobservable. Furthermore, as discussed above, ignoring the adjustment for $A^K_t$ and $A^L_t$ is not an option, since this leads to estimates of the elasticity that are necessarily biased towards one. The infeasibility of using Berndt's (1976) aggregate input approach inclines me to use series on value added. In particular, I use GDP in the U.S. private sector to proxy for $Y_t$, and the corresponding GDP deflator to proxy for $P^Y_t$.
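For reference, the following sketches how trend-augmented specifications of the kind labeled (1') through (6') follow from the first-order conditions of a factor-augmenting CES function when $A^K_t = A^K_0 e^{\lambda_K t}$ and $A^L_t = A^L_0 e^{\lambda_L t}$. The intercepts $\alpha_i$ are written generically and disturbance terms are omitted; this is a reconstruction under those assumptions and may differ in layout or notation from the author's equations.

```latex
\begin{align*}
(1')\quad \log(Y_t/K_t) &= \alpha_1 + \sigma\,\log(R_t/P^Y_t) + (1-\sigma)\,\lambda_K\, t \\
(2')\quad \log(Y_t/L_t) &= \alpha_2 + \sigma\,\log(W_t/P^Y_t) + (1-\sigma)\,\lambda_L\, t \\
(3')\quad \log(K_t/L_t) &= \alpha_3 + \sigma\,\log(W_t/R_t) + (1-\sigma)\,(\lambda_L-\lambda_K)\, t \\
(4')\quad \log(R_t/P^Y_t) &= \alpha_4 + \tfrac{1}{\sigma}\,\log(Y_t/K_t) - \tfrac{1-\sigma}{\sigma}\,\lambda_K\, t \\
(5')\quad \log(W_t/P^Y_t) &= \alpha_5 + \tfrac{1}{\sigma}\,\log(Y_t/L_t) - \tfrac{1-\sigma}{\sigma}\,\lambda_L\, t \\
(6')\quad \log(W_t/R_t) &= \alpha_6 + \tfrac{1}{\sigma}\,\log(K_t/L_t) - \tfrac{1-\sigma}{\sigma}\,(\lambda_L-\lambda_K)\, t
\end{align*}
```

In every line the trend coefficient is proportional to $1-\sigma$, so it vanishes only in the Cobb-Douglas case, which is why omitting the trend is harmless only when $\sigma = 1$.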

Estimation Results
Column I in Table 5 presents OLS estimates of equations (1') through (6'). In order to assess the effect of controlling for biased technological change, these estimates should be compared with those in column V of Table 2. As is clear from the results, the point estimates of the elasticity drop to values that are, in general, well below one and that range from 0.551 to 0.948. Furthermore, the standard errors of the estimates are quite low, implying that the null hypothesis of a unit elasticity is rejected at the 1% significance level for five of the six specifications, while it is rejected at the 10% level for the remaining equation (see the t-stats in Table 5). Interestingly, the estimates are also consistent with the empirical regularity discussed in Berndt (1976), by which the estimates of the elasticity based on the marginal product of labor equations tend to be higher than the estimates based on the marginal product of capital equations. As in the regressions in section 4, the Durbin-Watson statistics indicate the existence of serial correlation in the residuals.31 I again performed Ljung-Box tests, and the null hypothesis of the residuals following an AR(1) process was again not rejected. Column II in Table 5 presents FGLS estimates that apply the Prais-Winsten procedure. The results are qualitatively similar to the OLS ones, with the lowest estimates becoming even lower and the highest estimate reaching a value slightly higher than one. Finally, in column III, I present estimates based on Fair's (1970) GIV technique. Instrumentation seems to help a great deal in bringing the estimates from different specifications closer together. In particular, the GIV estimates range from 0.681 to 0.891, and even the highest point estimate, $\sigma_5$, is significantly lower than one at the 5% significance level.
30 Notice also that the constant terms are not only a function of $\sigma$, but also depend on the distribution parameter and on the initial efficiency levels $A^K_0$ and $A^L_0$ (see Klump and De La Grandville, 2000, for more on this).
Overall, of the eighteen estimates in columns I, II, and III, seventeen are below one, with fifteen of these estimates being significantly below one at the 5% level. Furthermore, the typical elasticity lies in the range 0.6 to 0.9.
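The Prais-Winsten correction used for the FGLS column can be sketched as follows. This is a minimal iterated implementation in NumPy under the assumption of AR(1) disturbances, applied to simulated data; the function name, the simulated regressor, and all parameter values are hypothetical and do not reproduce the estimates in Table 5.

```python
import numpy as np

def prais_winsten(y, X, n_iter=10):
    """Iterated Prais-Winsten FGLS for y = X b + u with u_t = rho*u_{t-1} + e_t.
    Returns the coefficient vector and the estimated rho."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)    # start from OLS
    rho = 0.0
    for _ in range(n_iter):
        u = y - X @ beta
        rho = (u[1:] @ u[:-1]) / (u[:-1] @ u[:-1])  # AR(1) coefficient of the residuals
        rho = float(np.clip(rho, -0.99, 0.99))      # keep the transformation well defined
        w = np.sqrt(1.0 - rho ** 2)
        # Quasi-difference the data; the first observation is rescaled, not dropped
        y_star = np.concatenate([[w * y[0]], y[1:] - rho * y[:-1]])
        X_star = np.vstack([w * X[:1], X[1:] - rho * X[:-1]])
        beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
    return beta, rho

# Hypothetical simulated regression with a trend and AR(1) errors
rng = np.random.default_rng(0)
n = 200
t = np.arange(n, dtype=float)
x = 0.02 * t + rng.normal(scale=0.3, size=n)
eps = rng.normal(scale=0.05, size=n)
u = np.zeros(n)
for i in range(1, n):
    u[i] = 0.8 * u[i - 1] + eps[i]
y = 1.0 + 0.7 * x + u

X = np.column_stack([np.ones(n), x])
beta_hat, rho_hat = prais_winsten(y, X)
print(beta_hat, rho_hat)
```

Keeping the rescaled first observation is what distinguishes Prais-Winsten from the Cochrane-Orcutt variant, which matters in samples as short as the 51 annual observations used here.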
The high R-squares and low Durbin-Watson statistics obtained under OLS suggest that, as in section 4, the results might suffer from a spurious regression bias. Section 4 uncovered significant nonstationarities in the data used in the regressions with Hicks-neutral technological change. I repeated the unit root tests for the series involved in the estimation of equations (1') through (6') and found very similar results. The null hypothesis of the series being integrated of order one was not rejected in any of the six cases, while the null hypothesis of their first difference being integrated of order one was clearly rejected in all cases. These results call for a cointegration test to assess whether or not the regressions above are spurious. The top panel of Table 6 presents the results of Engle and Granger's (1987) residual-based test. The residuals are obtained from a model that includes a time trend, and the specification of the test includes one, two or three lags of the first difference of the residuals. The statistics reported in Table 6 should be compared with the critical values computed à la MacKinnon (1991), which are also adjusted to take into account the time trend in the equations. As in section 4, the results are not conclusive. When the estimation includes one lag, the tests generally reject the null hypothesis of no cointegration, but the results are not robust to adding a second lagged first difference of the residuals. Similarly, in the bottom panel of Table 6 I report the Johansen and Juselius (1990) test, which likewise delivers different conclusions depending on the number of lags specified in the estimation. Both types of tests suggest, however, that the null hypothesis of no cointegration is more easily rejected in the case of the marginal product of labor equations (2') and (5').32 The mixed results of the cointegration tests complicate the evaluation of the consistency of the OLS estimates presented in Table 5.
On the one hand, if we are willing to reject the null hypothesis of no cointegration (i.e., if we believe the tests should be specified with one lag), OLS estimates are not only consistent but also superconsistent, in the sense that they converge to their true value at a higher speed than in the absence of nonstationarity in the series (cf. Phillips and Durlauf, 1986, and Stock, 1987). Furthermore, in that case, OLS estimates are consistent even in the presence of autocorrelation in the disturbances and endogeneity of the regressors. Phillips and Durlauf (1986) showed, however, that OLS estimates of cointegrated relationships have nonstandard asymptotic distributions, thus invalidating standard inference techniques. Furthermore, small-sample biases, which are likely to be important in our regressions, suggest the need to use alternative, superior estimates. Saikkonen (1991) suggests a simple modification to the OLS procedure, which delivers both consistent and efficient estimates that have asymptotically standard distributions.33 On the other hand, if we interpret the results in Table 6 as indicating that the null hypothesis of no cointegration cannot be rejected (i.e., if we believe the tests should be specified with two lags), then OLS estimates suffer from a spurious regression bias, and are therefore inconsistent. As argued before, a natural cure for this bias is to difference the data before estimating the equations, the disadvantage being that, by doing so, valuable long-run information may be lost. An alternative approach is to include lagged values of both the dependent and independent variables in the regression. This procedure leads to consistent estimates of the elasticity and to t-tests of the hypothesis $\sigma_i = 1$ that are asymptotically N(0, 1).34
Rather than taking a strong stance on whether the variables in the regressions are in fact cointegrated or not, I next report the results of applying Saikkonen's (1991) procedure for estimating cointegrating vectors, as well as standard OLS estimates that include lagged values of both variables to correct for spurious regression bias in the absence of cointegration.35 Column IV of Table 5 presents the results of the implementation of Saikkonen's (1991) procedure for l = 1 and p = 1. The details of the estimation procedure are relegated to Appendix B. For each of the six regressions, the table reports the estimate of the elasticity of substitution and the modified t-statistic, which should be compared with the associated critical value from a standard normal distribution.36 The results indicate that the inclusion of the leads and lags does not have much of an effect on the point estimates of the elasticity, which are very similar to those obtained under the GIV procedure in column III of Table 5. The modified t-statistics are somewhat lower than the standard ones, but they still lead to a rejection of the null hypothesis of a unit elasticity (at reasonable significance levels) in four of the six cases.
32 The statistics in the Engle and Granger (1987) test are always higher for these two equations. Furthermore, even with two lagged first differences in the estimation, the max-lambda statistic in the Johansen and Juselius test is very close to its critical value for this pair of variables.
33 See King, Plosser, Stock and Watson (1991) for an application of a similar technique.
34 Conversely, F tests of hypotheses that a set of estimates are jointly significant have nonstandard limiting distributions (cf. Hamilton, 1994, p. 562).
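The idea of augmenting the levels regression with leads and lags of the differenced regressor can be sketched as follows. This is a generic dynamic-OLS style regression in the spirit of Saikkonen (1991), written in NumPy and run on simulated data; it is not the paper's exact procedure, whose details (including the meaning of l and p and the modified t-statistic) are in Appendix B and are not reproduced here. The function name, the parameter `k`, and the data-generating process are hypothetical.

```python
import numpy as np

def leads_lags_ols(y, x, k=1, trend=False):
    """Regress y_t on a constant, x_t, and dx_{t-k}, ..., dx_{t+k}
    (optionally a linear trend). Returns the coefficient on x_t."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)                       # dx[s-1] holds x_s - x_{s-1}
    T = len(y)
    idx = np.arange(k + 1, T - k)         # dates for which all leads and lags exist
    cols = [np.ones(len(idx)), x[idx]]
    for j in range(-k, k + 1):
        cols.append(dx[idx + j - 1])      # first difference of x dated t + j
    if trend:
        cols.append(idx.astype(float))
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y[idx], rcond=None)
    return beta[1]

# Hypothetical cointegrated pair in which the error is correlated with dx_t
rng = np.random.default_rng(2)
n = 300
v = rng.normal(size=n)                    # innovations to the regressor
x = np.cumsum(v)
u = 0.6 * v + rng.normal(scale=0.5, size=n)   # endogenous, stationary error
y = 1.0 + 0.7 * x + u

slope_dols = leads_lags_ols(y, x, k=1)
print(slope_dols)
```

The added differences absorb the correlation between the regressor's innovations and the equation error, which is the source of the nonstandard distribution of plain OLS in cointegrated regressions.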
Finally, in column V of Table 5, I report OLS estimates extended to include lagged values of both the dependent variable $y_t$ and the independent variable $x_t$, as well as a time trend. This has a larger impact on the estimates, suggesting that spurious regression biases might be important. The estimates of the elasticity obtained from equations (1'), (3'), (4'), and (6') are substantially lower than the ones obtained in the first four columns of Table 5, and provide evidence that the elasticity of substitution might well be lower than 0.5. These values are consistent with the findings of David and van de Klundert (1965), Eisner and Nadiri (1968), and Lucas (1969). On the other hand, the estimates of the elasticity obtained from equations (2') and (5') indicate that the elasticity is much higher, and might even be larger than one. Remember, however, that the approach of adding lags of both variables in the model is only appropriate under the null hypothesis of no cointegration of the variables. The fact that this hypothesis was relatively easier to reject for the variables in equations (2') and (5') suggests that these high values of the elasticity should not be taken at face value.

Estimates of the Bias in Technological Change
As is apparent from equations (1') through (6'), with our estimates of the elasticity of substitution at hand, we can also obtain estimates of the parameters $\lambda_K$ and $\lambda_L$, that is, estimates of the growth rates of capital- and labor-augmenting technological change. This was already recognized by David and van de Klundert (1965), who ran an equation analogous to (2') for the U.S. private sector in the period 1899-1960 and found $\lambda_L = 0.019$, or a growth rate of labor efficiency of 1.9 percent a year. They also ran equations analogous to (3'), both with and without lags, thus obtaining values for $\lambda_L - \lambda_K$ of 0.72% and 0.86% per year, respectively. As shown in Table 6, using my own estimates from equation (2') under Saikkonen's procedure (i.e., column IV) yields an estimate of $\lambda_L$ of 1.85% per year, a figure remarkably close to David and van de Klundert's. Furthermore, when computing $\lambda_L$ with the estimates of the reverse equation (5'), I obtain a value of 1.90%, which matches their figure to the second decimal. On the other hand, as is clear from Table 6, my findings suggest that the bias in technological change, $\lambda_L - \lambda_K$, is much larger than the one they estimated. In particular, I find that labor-augmenting efficiency grew about 3% faster than capital-augmenting efficiency. In fact, my estimates suggest that capital efficiency shows a downward trend. To understand this result, remember that $A^L_t$ and $A^K_t$ are actually indices of unmeasured quality or unmeasured efficiency. A possible interpretation of the findings in Table 7 is that the Krusell et al. (2000) price of capital deflator does a better job of incorporating quality improvements than does the Jorgenson and Ho (2000) quality-adjusted labor input index. As discussed above, the imposition of a particular structure on the form of technological change is dictated by the need to identify the elasticity of substitution. Given this constraint, the choice of constant exponential growth rates of factor efficiency seems a natural one. Alternatively, one could consider a specification that allowed for a stochastic component in technological efficiency. In particular, consider the case in which $A^K_t = A^K_0 e^{\lambda_K t + \varepsilon^K_t}$ and $A^L_t = A^L_0 e^{\lambda_L t + \varepsilon^L_t}$.
35 The Saikkonen (1991) procedure also adds lags to the estimated equation (see Appendix B). Although the introduction of these lags is here justified on statistical grounds, these lags could also be rationalized by appealing to adjustment costs in capital and labor (see Lucas, 1969).
36 This modified t-statistic corresponds to the statistic defined in Appendix B.
It is straightforward to check that the stochastic components $\varepsilon^K_t$ and $\varepsilon^L_t$ would enter as disturbance terms in the first-order conditions that form the basis of the empirical specifications. In section 2, these error terms were justified by appealing to optimization errors on the part of firms. Technological shocks provide a second plausible explanation for these terms. Notice, however, that omitted efficiency shocks are much more likely to be correlated with factor demands than optimization errors are. The reason is that although $\varepsilon^K_t$ and $\varepsilon^L_t$ are unobserved by the econometrician, they may be in the information set of firms, which will take them into account in choosing input demands. This suggests that, to the extent that firms respond quickly to productivity shocks, my estimates of the elasticity might be biased. Remember, however, that the generalized instrumental variable (GIV) estimates in Table 5 (column III) are remarkably close to those obtained under Saikkonen's (1991) method. To the extent that the instruments used in the GIV estimation (U.S. population, wages in the government sector, and real capital stock owned by the government) are unresponsive to technology shocks but are correlated with the variables in equations (1') through (6'), the results in Table 5 indicate that ruling out stochastic components in $A^K_t$ and $A^L_t$ does not have a sizeable effect on the estimates of the elasticity.37
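The step of backing out the growth rate of labor-augmenting efficiency from an equation of the (2') type, described earlier in this section, can be sketched numerically. The example assumes a specification of the form log(Y/L) = a + σ log(W/P) + (1 − σ) λ_L t, so that λ_L equals the trend coefficient divided by one minus the estimated elasticity. The data are simulated and all names and parameter values are hypothetical; they are not the paper's series or estimates.

```python
import numpy as np

sigma_true, lam_l_true = 0.7, 0.019   # assumed values used to simulate the data
t = np.arange(51, dtype=float)        # 51 annual observations
rng = np.random.default_rng(0)

# Hypothetical real-wage series: a trend plus a cyclical component
log_wp = 0.02 * t + 0.05 * np.sin(t / 3.0)

# Assumed form: log(Y/L) = a + sigma*log(W/P) + (1 - sigma)*lambda_L*t + error
log_yl = (0.1 + sigma_true * log_wp
          + (1 - sigma_true) * lam_l_true * t
          + rng.normal(scale=0.001, size=t.size))

X = np.column_stack([np.ones_like(t), log_wp, t])
coef, *_ = np.linalg.lstsq(X, log_yl, rcond=None)

sigma_hat = coef[1]                       # elasticity of substitution
lam_l_hat = coef[2] / (1.0 - sigma_hat)   # implied growth rate of labor efficiency
print(sigma_hat, lam_l_hat)
```

Because the implied growth rate divides by one minus the elasticity, it becomes very imprecise as the estimated elasticity approaches one, which is one reason such estimates are informative only when the elasticity is clearly below unity.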

Conclusion
This paper has argued that a Cobb-Douglas specification of the U.S. aggregate production function may be misleading. My estimates suggest that, controlling for biased technological change, the elasticity of substitution between capital and labor is likely to be considerably below one, and may even be lower than 0.5. This contrasts with the results of Berndt (1976), who reported estimates of the elasticity insignificantly different from one under the assumption of Hicks-neutral technological change. I have shown, however, that ignoring the bias in technological change puts the data in a straitjacket that naturally leads to an acceptance of the null hypothesis of a unit elasticity of substitution between capital and labor. I illustrated this source of bias by showing that, in my sample, ignoring biased technological change also leads to estimates of the elasticity insignificantly different from one.
37 I have also experimented with an alternative specification of $A^K_t$ and $A^L_t$ that allows for different growth rates of factor bias in different subperiods. This amounts to including dummy variables for different subperiods, e.g., 1948-1960, 1961-1972, 1973-1984, 1985-1998. This leads to slightly lower estimates of the elasticity for most specifications and estimation techniques, but the results are very similar to those reported in Table 5.
The results are relevant for the debates on the sources of economic growth, as well as for the debate on the effects of tax policy on investment. Hsieh (2000) shows that when the elasticity of substitution between capital and labor is lower than one, standard growth accounting exercises tend to understate the role of productivity growth as a determinant of economic growth.
Similarly, as pointed out by Eisner and Nadiri (1968), low values of the elasticity of substitution imply effects of tax policy on investment behavior that are significantly lower than the ones advocated by Hall and Jorgenson (1967), who considered only the Cobb-Douglas case.
Although the analysis has been conducted with only U.S. data, the findings of this paper lend support to a recent literature that has pushed the view that certain cross-country stylized patterns are not reconcilable with aggregate output being represented by an aggregate production function featuring a unit elasticity of substitution (e.g., Acemoglu, 2002, Caselli and Coleman, 2003, and Jones, 2003). Consistent with my findings, Gollin (2002) shows that even after adjusting labor income to include self-employment income, employee compensation as a share of GDP differs substantially across countries. Furthermore, other studies have documented large movements of the labor share in OECD countries and have related these movements to the capital-labor ratio (e.g., Blanchard, 1997, and Bentolila and Saint-Paul, 2003). My estimates suggest that even for a country with a relatively stable labor share, the United States, the evidence seems to reject a Cobb-Douglas specification of the aggregate production function.