Essays on Time-Varying Discount Rates

A dissertation presented

by

Ian Louis Dew-Becker

to

The Department of Economics

in partial fulfillment of the requirements
for the degree of
Doctor of Philosophy
in the subject of
Economics

Harvard University
Cambridge, Massachusetts

May 2012

©2012 — Ian Louis Dew-Becker All rights reserved.

Dissertation Advisor: Professor John Y. Campbell

Ian Louis Dew-Becker

Essays on Time-Varying Discount Rates

ABSTRACT
This dissertation consists of three essays that explore the interaction between various discount rates and the macroeconomy.

The first essay studies the cross-section of discount rates, specifically, the term structure of interest rates. When physical capital is discounted like a bond with a similar duration, a high term spread is associated with low average duration for investment. I document a strong negative correlation between the term spread and the duration of investment, implying an important role for the cost of capital in determining the composition of aggregate investment. The results are robust to including a variety of controls. Consumer durable goods purchases display similar behavior.

The second essay develops a new utility specification that incorporates Campbell–Cochrane–type habits into the Epstein–Zin class of preferences. It is a model in which risk premia change over time. In a simple calibration of a real business cycle model with EZ-habit preferences, the model generates a strongly countercyclical equity premium, substantial equity return predictability, and a stable riskless interest rate, as in the data. Moreover, conditional on the average level of risk aversion, time-variation in risk aversion increases the volatility and mean return of equities. On the real side, the model matches the short and long-term variances of output, consumption, and investment growth. As an additional empirical test, I measure implied risk aversion and find that it has an R² of over 50 percent for 5-year stock returns in post-war data.

The third essay develops a New-Keynesian model in which households have Epstein–Zin preferences with time-varying risk aversion and the central bank has a time-varying inflation target. The model matches the dynamics of nominal bond prices in the US economy well: the fitting errors for individual bond yields are roughly as large as those obtained from a non-structural three-factor model, and two thirds smaller than in models with constant risk aversion or a constant inflation target.

CONTENTS

Abstract
Acknowledgments

1. Investment and the Cost of Capital in the Cross-Section: The Term Spread Predicts the Duration of Investment
   1.1 Introduction
   1.2 Data
   1.3 Results
   1.4 Model
   1.5 Alternative explanations
   1.6 Consumer durables
   1.7 The firm-level mechanism
   1.8 Conclusion

2. A model of time-varying risk premia with habits and production
   2.1 Introduction
   2.2 The model
   2.3 Calibration and simulation
   2.4 Empirical return forecasting
   2.5 Extensions
   2.6 Conclusion

3. Bond pricing with a time-varying price of risk in an estimated medium-scale Bayesian DSGE model
   3.1 Introduction
   3.2 Household preferences
   3.3 Aggregate supply
   3.4 Model solution
   3.5 Empirics
   3.6 Asset pricing
   3.7 The real economy
   3.8 Conclusion

Appendix

A. Appendix to Chapter 1
   A.1 The approximation for average duration
   A.2 Further robustness tests

B. Appendix to Chapter 2
   B.1 The certainty equivalent
   B.2 Derivation of the SDF
   B.3 The log-linear model with production
   B.4 Details of return forecasting

C. Appendix to Chapter 3
   C.1 Results from the text
   C.2 Approximation method
   C.3 Estimation

ACKNOWLEDGMENTS
I completed this dissertation with the support of many people. John Campbell was endlessly patient, and amazingly diligent about reading and commenting on any work I sent him. He went above and beyond any reasonable expectation. I also benefited from numerous comments and conversations with my other advisors, Emmanuel Farhi, Effi Benmelech, and also David Laibson, Robin Greenwood, and Jim Stock, among others. Countless conversations with fellow students made this possible, including in particular Jason Beeler, Stefano Giglio, Kelly Shue, and Eric Zwick. Economists too numerous to name also generously gave me constant feedback on my work, which is reflected here. But none of the above should be implicated in any errors or omissions.


1. INVESTMENT AND THE COST OF CAPITAL IN THE CROSS-SECTION: THE TERM SPREAD PREDICTS THE DURATION OF INVESTMENT

1.1 Introduction
This paper studies the cross-section of investment. While there is an enormous amount of work studying the aggregate level of investment and the determinants of firm-level investment, there is essentially no analysis of the determinants of investment in different types of assets. This paper begins that task by analyzing the distribution of investment across assets according to their depreciation rates. I show that when interest rates for long-duration assets are higher than those for short-duration assets, aggregate investment shifts relatively towards high-depreciation assets.

The response of investment to the cost of capital is a key mechanism in macroeconomics and finance. It is central to production-based asset pricing theories (e.g. Cochrane 1991, 1996); a primary feedback mechanism in standard general-equilibrium models; one of the key drivers for the response of the economy to monetary policy shocks; the source of the classical crowding-out effect of government spending; and an important determinant of the size of distortions from taxes. This paper considers a novel method for uncovering an empirical relationship between investment and the cost of capital.

There is a long literature that studies the effect of the cost of capital on investment. Simple methods have, in general, failed to find important effects.[1] Bernanke and Gertler (1995) find that nonresidential investment seems to respond only weakly to shocks to the Federal funds rate.[2] The discount rate is also a determinant of Tobin's Q, but estimates of the impact of Q on investment tend to be small (Summers, 1981; see Eberly, Rebelo, and Vincent, 2009, for a recent review).

[1] See Chirinko, 1993, for an extensive review.
[2] However, in recent unpublished papers, Gilchrist and Zakrajsek (2008) and Guiso et al. (2002) find a relationship in micro data between investment and interest rates.

The primary contribution of this paper is to show that interest rates affect what assets firms invest in at the aggregate level. Furthermore, the effect is relevant at the cyclical frequency: it is neither centered around discrete (and somewhat rare) policy changes, such as tax changes, nor dependent on very long-term effects.[3]

The basic idea here is to forecast the cross-section of investment using the cross-section of interest rates, instead of forecasting the level of investment with the level of interest rates. Long-term assets are discounted with long-term interest rates, and short-term assets with short rates. When long rates are higher than short rates (that is, when the term spread is high), a cost-of-capital effect implies investment should shift towards short-duration assets. This negative relationship between the term spread and the duration of investment holds strongly in the data: it explains roughly one third of the cross-sectional variation in investment by duration, and the effect holds both within and across industries.

Standard regressions of aggregate investment on the level of interest rates have the fundamental identification problem that periods of high interest rates may also be periods when investment demand is high, so the correlation between investment and interest rates could be zero or even positive. By studying the cross-section, I abstract from aggregate shocks, hopefully reducing this endogeneity problem. The strong empirical results suggest that in fact endogeneity is less of an issue in the cross-section.

The data is simple to construct. I obtain nominal investment by asset and year from the Bureau of Economic Analysis.
I study an index of average duration defined as the average of the assets' economic life-spans, weighted by their share in aggregate investment in each year.[4] Figure 1.1 shows that this index of average duration is highly negatively correlated with the spread between interest rates on ten- and one-year nominal Treasury bonds (note that the axis for average duration is reversed for the sake of clarity). When interest rates are relatively high for long-duration assets, investment shifts towards short-duration assets, creating a strong negative correlation between average duration and the term spread.

[3] Caballero (1994) and Schaller (2006) use cointegration methods to show that in the long run the cost of capital is meaningfully related to the size of the capital stock (and hence the level of investment). A number of researchers have also focused on high-frequency changes in taxes which produce large movements in the cost of capital and investment (Hassett and Hubbard, 2002, provide an extensive review).
[4] Specifically, "lifespan" is measured as a Macaulay duration using data on economic depreciation rates.

A negative raw correlation between investment and interest rates suggests that the cost of capital has an important role in determining the cross-sectional distribution of investment, but there are alternative mechanisms that could produce this result. I therefore build a simple Q-theory model to help elucidate the possible sources of bias in the basic result in figure 1.1 and try to account for them in subsequent regressions. I control for the level of productivity and expected productivity growth in a variety of ways and find that they do not eliminate the basic effect. More importantly, I find that the term spread–average duration relationship is not driven by changes in demand across industries. When the term spread is high, investment shifts towards low-duration assets within individual industries, in addition to shifting from industries that use more long-duration assets to ones that use more short-duration assets.

In addition to contributing to the literature on the determinants of the level of investment, this paper is related to the recent literature on production-based asset pricing with projects that have differing characteristics (e.g. Berk, Green, and Naik, 1999, and Gomes, Kogan, and Yogo, 2009). While those papers show that variation in the types of capital owned by firms can lead to differences in their stock prices, I find that variation in the cross-section of asset prices can affect the types of investment that firms undertake.

The findings here are also relevant to understanding the relationship between interest rates and debt issues. The final section of the paper provides novel evidence that firms match the maturity of their debt issues to their physical investment, consistent with previous evidence in the finance literature (e.g. Stohs and Mauer, 1996).
My findings suggest that the timing of debt issues to the term spread documented by Baker, Greenwood, and Wurgler (2003) could be explained by the dynamics of physical investment and the fact that firms match the maturity of their debt to their assets.

The remainder of the paper is organized as follows. Section 1.2 describes the data and section 1.3 reports the main result. Next, I outline in section 1.4 a simple model that justifies the regression of the average duration of investment on the term spread. Section 1.5 controls for a number of possible biases suggested by the investment model and shows that the term spread is the single most powerful predictor of the average duration of investment. In section 1.6 I show that the relationship between duration and the term spread also appears in purchases of consumer durable goods. Section 1.7 examines the relationship between the type of debt that firms sell and the duration of their assets. I find a positive relationship (consistent with maturity-matching theories), which gives added support for the idea that long-term interest rates are the relevant cost of capital for long-duration assets and short rates for short-term assets. Finally, section 1.8 concludes.

Figure 1.1: The average duration of investment versus the term spread
[Figure: time series of the Term Spread and Average Duration; the Duration axis is reversed.]
Note: The term spread is the gap between the 10 and 1-year treasury yields averaged over the previous year. Both variables are HP-detrended. The axis for average duration is reversed. Grey bars indicate NBER-dated recessions.

1.2 Data
To study the relationship between investment and the cost of capital in the cross-section, we need a relevant measure of the cost of capital that differs across assets. The duration of assets is a natural source of variation because it is easy to quantify for both physical assets (through their depreciation rates) and bonds (through maturities).

Of course, the cost of capital depends on more than simply the level of interest rates. The equity premium is large and variable (e.g. Lettau and Ludvigson, 2001). The advantage of focusing on interest rates here is that we can directly observe the cost of capital for assets of different durations. While there have been studies of the term structure of equity (Lettau and Wachter, 2007), there is no simple way to actually measure the term structure of expected returns on equity, let alone the variation in the slope of that term structure.

I obtain data on Treasury yields measured at year-end from the Federal Reserve. Treasury data has the advantage of including bonds with a large variety of maturities over a long period of time. However, firms do not in general borrow at the Treasury yield. I therefore also study the spread between yields on 3-month commercial paper and the Moody's seasoned Baa corporate bond yield (from Global Financial Data and the Federal Reserve, respectively). The Moody's index is meant to measure bonds with remaining maturities near 30 years. The main results focus on the Treasury yield spread.[5]

[5] Ideally, we would measure the true cost of capital for each asset, including the cost of capital for equity, in particular. While there is research on the term structure for equity (Lettau and Wachter, 2007), it is not obvious how to construct an equity cost of capital for each asset simply by looking at its depreciation rate.

A potential concern is that the relevant discount rate for investment is the real interest rate, not the nominal rate. One method for obtaining the real interest rate would be to subtract an inflation forecast from the nominal rate. In general, random-walk inflation forecasts are competitive with more sophisticated methods (Atkeson and Ohanian, 2001). With a random-walk forecast, the same expected inflation rate is subtracted from the yield at every maturity, so the nominal term spread and the spread obtained after subtracting expected inflation will be identical, which suggests that there is little to be gained by forecasting inflation here.[6]

Another option is to look at yields on inflation-protected bonds. The time series of inflation-protected bonds in the United States is relatively short, but inflation-protected bonds have been sold in the United Kingdom since the 1980s. Figure 1.2 plots the 10/5 year term spread in the UK for both nominal and inflation-protected bonds since 1985. Over the sample, the two series move together closely, even through the financial crisis. Their variances differ, but their correlation is over 70 percent. This result suggests that by studying the nominal term spread, we will obtain results that are similar to what we would obtain with the unobservable real term spread.

Data on capital stocks and investment come from the Bureau of Economic Analysis's (BEA) fixed asset tables. The main results focus on aggregate investment by asset, but the BEA also reports data at the asset×industry level. Data on depreciation rates is from Fraumeni (1997), the source for current depreciation rates used by the BEA.[7] The BEA uses geometric (declining balance) depreciation for nearly all assets.[8] Depreciation rates are estimated primarily from data on service lives and sales of vintage assets. Given the resale value of an asset for each age along with a service life, one can estimate an approximate geometric depreciation rate.[9] These depreciation rates are closer to economic depreciation than the straight-line method used for accounting purposes by many firms (Hulten and Wykoff, 1981).
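The last step described above, backing out an approximate geometric depreciation rate from resale values by age, can be illustrated with a minimal sketch. It assumes resale value declines as (1 − δ)^age and fits a least-squares line to log resale values. The function name and the resale values are hypothetical, and this is not the procedure actually used by Fraumeni (1997) or the BEA, which also draws on service lives.

```python
import math

def geometric_depreciation_rate(ages, resale_values):
    """Estimate delta from log(value) = a + age * log(1 - delta) by least squares.

    Illustrative assumption: resale value declines geometrically with age.
    """
    logs = [math.log(v) for v in resale_values]
    n = len(ages)
    mean_age = sum(ages) / n
    mean_log = sum(logs) / n
    cov = sum((a - mean_age) * (l - mean_log) for a, l in zip(ages, logs))
    var = sum((a - mean_age) ** 2 for a in ages)
    slope = cov / var               # estimate of log(1 - delta)
    return 1.0 - math.exp(slope)

# Hypothetical resale values (per dollar of original cost) for an asset
# that loses 15 percent of its remaining value each year.
ages = [0, 1, 2, 3, 4, 5]
values = [0.85 ** a for a in ages]
delta_hat = geometric_depreciation_rate(ages, values)
```

With noisy resale data the fitted slope would only approximate log(1 − δ), which is one reason to treat such estimates as approximate.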
I use 36 asset classes from the BEA tables, excluding household and government assets and educational, health, and religion-related structures. The majority of the analysis focuses on equipment investment. The investment literature generally finds that models have substantial trouble explaining structures investment (Oliner, Rudebusch, and Sichel, 1995). This may be partly caused by the fact that nonresidential building projects take fourteen months to complete on average (Edge, 2000).[10] The main results below go through when structures are included, but the relationships are far less clear. I therefore leave the analysis of structures to future work so as not to distract from a complete analysis of equipment investment.

[6] Furthermore, we would need to estimate a 10-year inflation forecast, which would be difficult even if inflation were relatively easy to forecast at short horizons. Another option would be to use survey data on inflation forecasts, but this would substantially limit the available time series.
[7] Her depreciation rates closely match depreciation obtained by simply dividing BEA-reported depreciation by the capital stock.
[8] Missiles and nuclear fuel rods, for example, are modeled with straight-line depreciation.
[9] The BEA's current estimates are a combination of data from a variety of studies on resale values reviewed in Fraumeni, 1997.

Figure 1.2: United Kingdom 10/5 year term spreads, 1985–2010
[Figure: time series of the Nominal and Inflation-adjusted term spreads, 1/85 to 1/10.]
Note: gap between yields on 10 and 5-year nominal and inflation-protected bonds

For each asset class, the BEA reports total stocks (on a current-cost basis) and investment for the private nonresidential economy. The asset classes accounting for the most nominal investment in 2007 were software (16 percent of total investment), petroleum and natural gas exploration and wells (8 percent), communication equipment (7 percent), and computers and peripheral equipment (6 percent). Except for oil and gas, these assets all have high depreciation rates, and substantial investment is necessary just to keep the stocks at constant levels.

For an asset with geometric depreciation rate $\delta_i$, if we assume that productivity is constant and there is a fixed discount rate $r^*$, Macaulay's duration, $D_i$, will be

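\[
D_i = \frac{\sum_{j=1}^{\infty} j \, (1-\delta_i)^{j-1} (1+r^*)^{-j}}{\sum_{j=1}^{\infty} (1-\delta_i)^{j-1} (1+r^*)^{-j}} = \frac{1+r^*}{r^*+\delta_i} \tag{1.1}
\]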
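The closed form for Macaulay's duration under geometric depreciation can be checked numerically. The sketch below weights each period j by the present value of that period's cash flow, (1 − δ)^(j−1)/(1 + r*)^j, normalizes by the total present value, and compares the result with (1 + r*)/(r* + δ). The function names and the truncation horizon are illustrative choices, not part of the original analysis.

```python
def macaulay_duration_closed_form(delta, r=0.03):
    """Closed form (1 + r) / (r + delta)."""
    return (1.0 + r) / (r + delta)

def macaulay_duration_numeric(delta, r=0.03, horizon=5000):
    """Present-value-weighted average timing of cash flows, truncated at `horizon`."""
    present_value = 0.0
    time_weighted = 0.0
    for j in range(1, horizon + 1):
        # Cash flow at date j is (1 - delta)^(j - 1), discounted at rate r.
        pv_j = (1.0 - delta) ** (j - 1) / (1.0 + r) ** j
        present_value += pv_j
        time_weighted += j * pv_j
    return time_weighted / present_value

# Compare the two for a slow-, medium-, and fast-depreciating asset.
for delta in (0.02, 0.14, 0.40):
    exact = macaulay_duration_closed_form(delta)
    approx = macaulay_duration_numeric(delta)
    print(delta, round(exact, 4), round(approx, 4))
```

The normalization by present value matters: without it, the infinite sum equals (1 + r*)/(r* + δ)², not (1 + r*)/(r* + δ).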

When measuring durations I fix $r^* = 0.03$.[11] Table 1.1 lists the assets used in this study along with their depreciation rates and durations. Software, computers, and office and accounting equipment have the highest depreciation rates, all above 20 percent per year. Types of heavy industrial machinery tend to have lower depreciation rates, as low as 5 percent.

[10] See Edge (2000) for an empirical model of residential and nonresidential structures investment that takes into account building lags.
[11] Allowing for a constant rate of productivity growth would be the equivalent of choosing a lower value of $r^*$. The results are not sensitive to the choice of $r^*$.

Table 1.1: Assets, depreciation rates, and durations

Asset                                                            Depreciation rate     Duration
                                                                 (fraction per year)   (years)
Information processing equipment and software
  Computers and peripheral equipment                             0.25                   3.73
  Software                                                       0.40                   2.37
  Communication equipment                                        0.14                   6.11
  Medical equipment and instruments                              0.14                   6.24
  Nonmedical instruments                                         0.14                   6.24
  Photocopy and related equipment                                0.18                   4.90
  Office and accounting equipment                                0.31                   3.01
Industrial equipment
  Fabricated metal products                                      0.09                   8.46
  Engines and turbines                                           0.05                  12.62
  Metalworking machinery                                         0.12                   6.75
  Special industry machinery, n.e.c.                             0.10                   7.74
  General industrial, including materials handling, equipment    0.11                   7.51
  Electrical transmission, distribution, and industrial apparatus 0.05                 12.88
Transportation equipment
  Trucks, buses, and truck trailers                              0.15                   5.72
  Autos                                                          0.22                   4.12
  Aircraft                                                       0.12                   7.10
  Ships and boats                                                0.06                  11.31
  Railroad equipment                                             0.06                  11.59
Other equipment
  Furniture and fixtures                                         0.14                   6.15
  Agricultural machinery                                         0.12                   6.87
  Construction machinery                                         0.16                   5.51
  Mining and oilfield machinery                                  0.15                   5.72
  Service industry machinery                                     0.16                   5.42
  Electrical equipment, n.e.c.                                   0.18                   4.83
  Other nonresidential equipment                                 0.15                   5.81
Structures
  Office, including medical buildings                            0.02                  18.83
  Commercial                                                     0.02                  19.73
  Manufacturing                                                  0.03                  16.89
  Electric                                                       0.02                  19.18
  Other power                                                    0.02                  19.18
  Communication                                                  0.02                  19.18
  Petroleum and natural gas                                      0.08                   9.80
  Mining                                                         0.05                  13.73
  Other buildings                                                0.02                  19.62
  Railroads                                                      0.03                  17.91
  Farm                                                           0.02                  19.11

Note: Depreciation rates are obtained from the BEA. Duration is measured as 1.03/(0.03+δ).

Finally, for the purpose of summarizing the cross-section of investment, I define an index measuring the average duration of investment,

\[
\bar{D}_t \equiv \frac{\sum_i I_{it} D_i}{\sum_i I_{it}} \tag{1.2}
\]

$\bar{D}_t$ is simply a weighted average of the durations of the assets, where the weights are the assets' shares in aggregate nominal investment. When investment shifts relatively towards short-duration assets, e.g. computers or software, $\bar{D}_t$ falls. Furthermore, $\bar{D}_t$ is constructed so that it is not mechanically related to the level of investment. There is no particular reason why there need be a positive or negative relationship between the level of investment (or the state of the business cycle) and $\bar{D}_t$.[12]

Figure 1.3 plots $\bar{D}_t$ for 1948–2008. As might be expected, average duration has been falling over time. The fastest rate of decline appears in the late 1980s, and the series flattens out after 1994, actually rising substantially between 2006 and 2008. We should not expect transitory changes in the term spread to explain the long-term changes in the duration of investment. Long-run changes are driven by technological shifts, e.g. the introduction of computers, software, and other electronic equipment. Instead, the term spread will explain the year-to-year variation in $\bar{D}_t$.[13]

Table 1.1 shows that computers have a depreciation rate of 25 percent, and software 40 percent. Their combined share of nominal investment rises from 7.4 percent in 1978 to 32.7 percent in 2008. Figure 1.3 also includes a version of $\bar{D}_t$ that excludes investment in computers and software, and we can see that if not for computers and software, there would be no decline in the average duration of investment over time. Over the sample, though, the correlation of the first differences of the two versions of $\bar{D}_t$ is over 90 percent.

For the main regressions, I detrend all of the variables using the Hodrick-Prescott (HP) filter with a smoothing parameter of 25. I obtain similar results when I use a polynomial trend or take first differences (see table 1.2 for results in first differences).

[12] Rather than using an index of average duration, which involves a discount rate, we could also simply use an index of the average depreciation rate of investment. All of the results below go through with this alternative measure.
[13] Tevlin and Whelan (2003) give a more extensive discussion of the recent decrease in the duration of the capital stock.
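As a small worked example of the average-duration index, the sketch below computes an investment-share-weighted average of asset durations. The two durations are taken from Table 1.1, while the investment figures are hypothetical and chosen only to show how the index moves when investment shifts towards a short-duration asset.

```python
def average_duration(investment, durations):
    """Investment-share-weighted average of asset durations for one year."""
    total = sum(investment.values())
    return sum(investment[asset] * durations[asset] for asset in investment) / total

# Durations (years) from Table 1.1.
durations = {"computers": 3.73, "engines_and_turbines": 12.62}

# Hypothetical nominal investment in two different years.
base_year = {"computers": 60.0, "engines_and_turbines": 40.0}
shifted_year = {"computers": 70.0, "engines_and_turbines": 30.0}

d_base = average_duration(base_year, durations)
d_shifted = average_duration(shifted_year, durations)

# Doubling every asset's investment leaves the index unchanged, so the index
# depends on the composition of investment rather than its level.
doubled = {asset: 2.0 * value for asset, value in base_year.items()}
d_doubled = average_duration(doubled, durations)
```

Here the index falls when investment shifts towards computers, and it is unaffected by a proportional change in all investment.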

Figure 1.3: Average Duration of Investment, 1948–2008
[Figure: average duration in years, 1948–2008; series: All assets; No computers or software.]
Note: average duration is duration summed across all assets, weighted by nominal investment shares. Investment is obtained from the BEA fixed asset tables.

1.3 Results
Figure 1.1 plots HP-detrended $\bar{D}_t$ and the 10/1 year term spread at the end of the previous year (with the axis for $\bar{D}_t$ reversed). The negative relationship is immediately apparent. The term spread and average duration have a correlation of -54 percent. Gray bars indicate NBER-dated recessions. In most recessions, the term spread rises due to the Fed cutting interest rates, and the duration of investment falls. Duration is often high just prior to recessions, e.g. 1970, 1990, 2001, and 2007, when the yield curve is inverted. Looking more closely, we can see that over time the term spread has become more volatile while $\bar{D}_t$ has become somewhat less volatile. This is a common finding: the real economy has become less volatile (the Great Moderation), while Federal Reserve policy has become more aggressive, causing higher volatility in interest rates.

Table 1.2 reports results of regressions of $\bar{D}_t$ on the first lag of the term spread. All of the variables in table 1.2 are standardized to have unit variance so that the regression coefficients indicate how a one standard deviation increase in the independent variables affects $\bar{D}_t$ in terms of its own standard deviation. The units of $\bar{D}_t$ have no deep economic meaning on their own.

As expected, in the first column we find a highly significant negative coefficient on the term spread and an $R^2$ of 0.30. This is a high value; Oliner, Rudebusch, and Sichel (1995), when forecasting the level of aggregate investment using models with as many as 11 lags of quarterly data, obtain at best an $R^2$ of 0.34. With a single variable, I am able to get an $R^2$ nearly as high for $\bar{D}_t$. Column two uses the term spread on corporate bonds instead of Treasuries and finds a nearly identical coefficient and $R^2$.

The third column of table 1.2 controls for the lagged level of $\bar{D}_t$. The coefficient on lagged duration is only marginally significant and the coefficient on the term spread is essentially unchanged. Column 4 shows that leading values of the term spread have no explanatory power for average duration, which is consistent with the theory that firms are responding to the cost of capital, rather than there being some underlying variable that causes the term spread and $\bar{D}_t$ to generally move together.

Finally, the fifth column runs the basic regression using investment in all assets instead of equipment alone. The results still go through. The symmetrical regression using only structures investment is unenlightening because there is not enough variation in duration within structures to provide reasonable statistical power.

Table 1.2: Regressions of the average duration of investment

                                                                              First differences
                     (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
Assets:              Equip.     Equip.     Equip.     Equip.     All        Equip.     Within     Between

Term spread(t-1)     -0.56***              -0.49***   -0.58***   -0.31***   -0.41***   -0.28***   -0.14***
                     [0.11]                [0.12]     [0.14]     [0.12]     [0.11]     [0.06]     [0.06]
Corporate TS(t-1)               -0.52***
                                [0.09]
Duration(t-1)                              0.20*
                                           [0.11]
Term spread(t)                                        0.06
                                                      [0.09]
Term spread(t+1)                                      0.00
                                                      [0.08]

N                    59         59         47         58         59         58         58         58
R2                   0.30       0.25       0.37       0.31       0.10       0.22       0.21       0.12

Note: * indicates significance at the 10 percent level, ** 5 percent level, *** 1 percent level. Annual data, 1950–2008, where available. The dependent variable is the average duration of investment. Investment and depreciation rates are obtained from BEA. The term spread is the 10-year minus the 1-year treasury yield at the end of the calendar year. The corporate term spread is the spread between the Moody's AAA corporate 30 year index and the St. Louis Fed's 3 month commercial paper yield. Columns 6 through 8 give results from first differenced regressions. Column 7 uses the effect of within-industry reallocation on average duration as the dependent variable. Column 8 is defined analogously using cross-industry reallocation. All variables are detrended with the HP filter with a smoothing parameter of 25 and standardized to have unit variance. Newey-West standard errors with a 3-year window are reported in brackets.

To test for a break in the relationship between $\bar{D}_t$ and the term spread, I use the sup-F test (also known as the Quandt likelihood ratio test). We might expect that the break in this relationship would have appeared following the Great Moderation, when monetary policy became more aggressive and the economy less volatile. The F-test for a break, though, is maximized in 1958. Looking at figure 1.1, it is clear that after 1958 the volatility of $\bar{D}_t$ fell and the volatility of the term spread rose. The F-statistic for a break is never above the critical value reported in Andrews (1993) except in 1958 and 1959. The highest value outside those two years is 4.56 in 1992, below the 10 percent critical value of 5.00. There is thus evidence for a structural break, but not where we might have thought. For the period since 1960, we cannot reject the hypothesis that the relationship between $\bar{D}_t$ and the term spread has been stable.
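The standardized regressions discussed in this section have a convenient interpretation that a short sketch can make concrete: when both variables are scaled to unit variance and there is a single regressor, the OLS slope equals the sample correlation and the $R^2$ equals its square. The series below are hypothetical and the function names are illustrative. The sketch uses plain OLS and does not reproduce the HP detrending or the Newey-West standard errors used in table 1.2.

```python
import math

def standardize(series):
    """Rescale a series to zero mean and unit variance."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return [(v - mean) / std for v in series]

def standardized_slope_and_r2(y, x):
    """OLS of standardized y on standardized x (the intercept is zero by construction)."""
    zy, zx = standardize(y), standardize(x)
    beta = sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)
    ssr = sum((b - beta * a) ** 2 for a, b in zip(zx, zy))
    sst = sum(b * b for b in zy)
    return beta, 1.0 - ssr / sst

# Hypothetical detrended annual series, for illustration only.
term_spread = [-1.0, 0.5, 1.5, -0.5, 2.0, 0.0, 1.0]
avg_duration = [0.0, 0.3, -0.1, -0.4, 0.2, -0.2, 0.1]

# Regress duration in year t on the term spread in year t-1.
beta, r2 = standardized_slope_and_r2(avg_duration[1:], term_spread[:-1])
```

By the same arithmetic, a correlation of -0.54 between two series corresponds to a univariate $R^2$ of about 0.29.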

1.4 Model
With the basic result in hand, it is useful to build a simple and stylized model to help understand where this correlation might come from. It is tempting to immediately jump to the conclusion that there is variation in the cost of capital (i.e. shocks to the supply of investment goods), which drives the result in ﬁgure 1.1. The model helps identify what other factors might induce a similar correlation. I consider a standard inﬁnite-horizon setup with a few simpliﬁcations for analytic tractability. Firms face a linear production function in each type of capital, where the current level of productivity for asset i is Bit . That is, revenue is equal to

$$\sum_i B_{it} K_{it} \tag{1.3}$$

where $K_{it}$ is the stock of asset i at date t. Note that this revenue function ignores complementarities between types of assets. In general, if a decline in the term spread is expected to shift investment towards long-duration assets, complementarity across assets will attenuate this effect (in the limit of a Leontief production function, firms would never vary the composition of the capital stock). I follow Baxter and Crucini (1993) and Jermann (1998) in specifying the update process for capital as

$$K_{i,t+1} = (1-\delta_i)\,\phi_i(I_{i,t-1}) + \phi_i(I_{it}) \tag{1.4}$$

For analytic simplicity, the update process assumes that capital operates for only two periods. Each asset depreciates by the factor $(1-\delta_i)$ between its first and second period of operation, and is subsequently obsolete. $\phi_i$ incorporates adjustment costs in investment, so that a unit of investment may create less than one unit of capital. $\phi_i$ takes the form

$$\phi_i(I_{it}) = \frac{\eta_{1i}}{1-1/\gamma}\, I_{it}^{\,1-1/\gamma} + \eta_{2i} \tag{1.5}$$

$\phi_i$ has the useful property that the elasticity of investment with respect to Tobin's Q will equal the constant $\gamma$.¹⁴ The parameters $\eta_{1i}$ and $\eta_{2i}$ determine the level of investment and the size of the adjustment costs paid and are allowed to vary across assets.¹⁵ Denoting the discount rate between dates t and t + 1 as $r_{t+1}$, the firm maximizes the discounted value of its revenue net of investment costs,

$$\Pi_t = \max_{\{I_{it}\}} \sum_{j=0}^{\infty} \sum_i \exp\left(-\sum_{k=1}^{j} r_{t+k}\right) E_t\left[B_{i,t+j} K_{i,t+j} - I_{i,t+j}\right] \tag{1.6}$$

where Et denotes the expectation operator conditional on information available at date t. All proﬁts are discounted at the riskless rate rt . For productivity growth, I assume that different assets may have different current levels of productivity, but expected productivity growth in the future is the same for all assets,
14 This functional form has the drawback that it is not necessarily consistent with negative investment. However, asset-level investment is always positive in the data, so this is not a practical concern here.

15 These two parameters allow us to choose a steady-state level of investment $x_i$ where $Q_i = 1$, $\phi_i(x_i) = x_i$, and $\phi_i'(x_i) = 1$. That is, they allow us to choose a point where the firm pays no adjustment costs overall and on the margin.

$E_t \log\left(B_{i,t+j}/B_{i,t+j-1}\right) = \mu_{t+j}$. Taking the first-order condition for $I_{it}$ gives

$$1 = \left[\exp(\mu_{t+1} - r_{t+1})\, B_{it} + \exp(\mu_{t+1} + \mu_{t+2} - r_{t+1} - r_{t+2})\, B_{it}\,(1-\delta_i)\right] \phi_i'(I_{it}) \tag{1.7}$$
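To make the first-order condition concrete, the sketch below solves it in closed form for two assets that differ only in their depreciation rates and shows how the composition of investment moves with the term spread. It assumes that $\phi_i'$ multiplies both revenue terms in (1.7) and that $\phi_i'(I) = \eta_{1i} I^{-1/\gamma}$, as implied by (1.5). All parameter values and the function names are hypothetical choices for illustration, not calibrations from the text.

```python
import math

def phi_prime(inv, eta1, gamma):
    """Marginal capital created per unit of investment: eta1 * I**(-1/gamma)."""
    return eta1 * inv ** (-1.0 / gamma)

def optimal_investment(B, delta, mu1, mu2, r1, r2, eta1=1.0, gamma=2.0):
    """Solve 1 = PV * phi'(I) for I, where PV is the bracketed term in (1.7)."""
    pv = B * (math.exp(mu1 - r1)
              + math.exp(mu1 + mu2 - r1 - r2) * (1.0 - delta))
    return (eta1 * pv) ** gamma

def long_share(term_spread, total_rate=0.10, mu=0.02):
    """Investment share of the low-depreciation (long-duration) asset.

    The total discount rate r1 + r2 is held fixed while the spread r2 - r1
    varies. Depreciation rates of 0.05 and 0.50 are illustrative assumptions.
    """
    r1 = (total_rate - term_spread) / 2.0
    r2 = (total_rate + term_spread) / 2.0
    i_long = optimal_investment(1.0, 0.05, mu, mu, r1, r2)
    i_short = optimal_investment(1.0, 0.50, mu, mu, r1, r2)
    return i_long / (i_long + i_short)
```

Under these assumptions, investment equals $(\eta_{1i} \cdot PV)^{\gamma}$, so its elasticity with respect to the present-value term is $\gamma$, and `long_share` falls as the term spread rises with the total discount rate held fixed.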

The appendix shows that, using a first-order approximation, we can derive an approximate expression for the index of average duration, $\bar{D}_t$,

$$\bar{D}_t \approx d_0 + \gamma N^{-1} \sum_i \widehat{\log}(B_{it})\, D_i + F(\mu_{t+2} - r_{t+2}) \tag{1.8}$$

where $d_0$ is a constant, N is the number of assets, $\widehat{\log}(B_{it}) \equiv \log(B_{it}) - N^{-1}\sum_k \log B_{kt}$ is the deviation of the log productivity of asset i from the period average, and F is a strictly increasing function.¹⁶ We can rewrite the term $\mu_{t+2} - r_{t+2}$ as

$$\begin{aligned}
\mu_{t+2} - r_{t+2} ={}& \tfrac{1}{2}\Big[\underbrace{(\mu_{t+1} + \mu_{t+2})}_{\text{Total prod. growth}} - \underbrace{(r_{t+1} + r_{t+2})}_{\text{Total discount rate}}\Big] \\
&+ \tfrac{1}{2}\Big[\underbrace{(\mu_{t+2} - \mu_{t+1})}_{\text{Productivity spread}} - \underbrace{(r_{t+2} - r_{t+1})}_{\text{Term spread}}\Big]
\end{aligned} \tag{1.9}$$

The first line is total productivity growth and the total discount rate between periods t and t + 2. The term $(r_{t+2} - r_{t+1})$ is the relevant concept of the term spread, and I refer to $(\mu_{t+2} - \mu_{t+1})$ as the productivity growth spread.

The previous section considered a simple regression of $\bar{D}_t$ on the term spread, $(r_{t+2} - r_{t+1})$. Holding all else equal (including total expected productivity growth and the total discount rate), equation (1.8) confirms the simple intuition that this relationship should be negative. Equation (1.8) shows, however, that there are at least four potential omitted variables in this regression: total expected productivity growth and discount rates between t and t + 2 ($\mu_{t+1} + \mu_{t+2}$ and $r_{t+1} + r_{t+2}$); the productivity growth spread, $(\mu_{t+2} - \mu_{t+1})$; and the levels of idiosyncratic productivity, $N^{-1}\sum_i \widehat{\log}(B_{it})\, D_i$.
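16 Specifically, $F(x) = -\dfrac{\exp(x)}{1 + \exp(x)(1-\bar{\delta})}\, N^{-1} \sum_i \hat{\delta}_i D_i$, where $\bar{\delta} \equiv N^{-1}\sum_i \delta_i$ and $\hat{\delta}_i \equiv \delta_i - \bar{\delta}$. Note that since $D_i$ is decreasing in $\delta_i$, F is strictly increasing.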

First, holding the term spread and the productivity spread fixed, an increase in productivity growth $(\mu_{t+1} + \mu_{t+2})$ or a decrease in discount rates $(r_{t+1} + r_{t+2})$ will tilt the distribution of investment towards long-duration assets. This effect is the primary feature of duration: long-duration assets gain more value from a decline in interest rates or an increase in expected productivity growth than do short-duration assets. To the extent that the term spread is correlated with long-term average productivity growth and interest rates, then, a regression of the average duration of investment on the term spread will be biased. Specifically, we could spuriously find a negative relationship between the term spread and $\bar{D}_t$ if expected long-term productivity growth is low in periods when the term spread is high. The term spread is countercyclical, so this would correspond to a situation in which expected long-term productivity growth $(\mu_{t+1} + \mu_{t+2})$ is low during recessions. I address these effects by controlling for the level of aggregate investment and various other indicators of the state of the business cycle.

The second source of bias is that the productivity spread $(\mu_{t+2} - \mu_{t+1})$ could be correlated with the term spread. In particular, if productivity growth is expected to slow down in the same periods that the term spread is high, we would find a spurious negative relationship between average duration and the term spread. In this case, recessions would have to be periods in which productivity growth is expected to decelerate in the future, which seems unlikely given that recessions are periods when growth is already slow in the first place (by definition).

Finally, the levels of productivity across assets could be related to duration, affecting $\bar{D}_t$ through the $N^{-1}\sum_i \widehat{\log}(B_{it})\, D_i$ term of equation (1.8), which can be thought of as the covariance between duration and productivity across assets. If this covariance changes over time and is systematically related to the level of the term spread, then omitting it from the regression would bias the coefficient on the term spread. Over long horizons, investment and productivity shift substantially across different assets. The most notable of these changes is the long-run decline in prices and increase


in investment in computers and software (Tevlin and Whelan, 2003).¹⁷ The model would interpret this phenomenon as an increase in $B_{it}$ for low-duration assets, which drives $\bar{D}_t$ downward. A simple way to control for those movements is to detrend $\bar{D}_t$.

Short-run movements in idiosyncratic productivity are more difficult to account for, though. If changes in the term spread are correlated with shifts in productivity that favor certain assets, then the regression of $\bar{D}_t$ on the term spread will be biased. In the empirical analysis below, I discuss and control for some specific mechanisms, most importantly industry demand shifts, that could drive high-frequency movements in the $N^{-1}\sum_i \widehat{\log}(B_{it})\, D_i$ term of equation (1.8).

Instead of running a regression of average duration on the term spread, it would be nice to estimate a more fundamental parameter, such as the coefficient on marginal Q, which tells us about the size of adjustment costs in investment. One way to do that would be to calculate Tobin's Q for each asset individually, as in Abel and Blanchard (1986), using the full term structure of interest rates. The problem is that we do not directly measure the marginal product of any individual asset at any point in time. Moreover, we do not measure anything like the true discount rate for each asset. Rather, the term spread in this paper is measured using Treasury yields and is taken as an indicator of differences in discount rates across assets. A deeper problem is that Abel and Blanchard's method would also require forecasting inflation at very long horizons, when the literature generally finds that inflation is difficult to forecast even at quarterly and annual horizons (e.g. Atkeson and Ohanian, 2001).¹⁸
17 See also Caballero, 1994, and Schaller, 2006, for studies of the relationship between investment and the cost of capital in the long-run.

18 Euler equation estimation is also an option. In a pair of papers, Oliner, Rudebusch, and Sichel (1995, 1996) study the effectiveness and internal consistency of Euler equation models for investment. They obtain parameter estimates that are somewhat difficult to reconcile with economic theory, find that supposedly "structural" parameters are unstable over time, and find that the models have little forecasting power. There are also legitimate concerns about the validity and relevance of the instruments used in these models (especially when extended to asset-level data). I attempted to estimate an Euler equation using the panel of data on asset-level investment. Between two-stage least squares, LIML, and GMM methods, there were substantial differences in results, indicating that the model is misspecified or that there are problems with the instruments. I also replicated some of the troubling results found by Oliner, Rudebusch, and Sichel. Furthermore, Euler equations are clearly difficult to estimate even with quarterly data, and I only have annual data on asset-level investment. The Euler-equation method is also more restrictive than the methods used in this paper because it is difficult or impossible to incorporate all of the controls that I consider. Euler equations are useful for estimating specific parameters in tightly theorized models. The regressions used here are meant to test a broader range of possible explanations for the correlation between average duration and the term spread and to measure the explanatory power of the term spread. I therefore leave the Euler equation analysis of this panel dataset for future work.

What the regression of average duration on the term spread is useful for is testing whether the term spread drives investment in the direction that we would expect and how much explanatory power the term spread has for the cross-section of investment. A high $R^2$ in a regression of average duration on the term spread is evidence that the cross-section of interest rates is an important determinant of the cross-sectional distribution of investment.

1.5 Alternative explanations
The working hypothesis is that the negative relationship between average duration and the term spread is a simple cost-of-capital effect. The model in the previous section shows that there are a number of other factors that could cause us to find the correlation we observe in figure 1.1. This section considers a range of possible alternative explanations. I find that the correlation is driven to some extent by these other factors, but that the cost of capital retains a substantial amount of explanatory power and is generally the most powerful variable for explaining average duration.

1.5.1 Correlations by asset and industry
One possible explanation for the correlation between the term spread and $\bar{D}$ is that demand for the products of different industries depends on the term spread. For example, suppose that when the term spread is high consumers demand fewer durable goods (the term spread tends to be countercyclical, while durables purchases are procyclical; Yogo, 2006). If durable goods industries tend to use relatively more long-duration capital than services providers (for example, a car manufacturer may use more heavy machinery than a barber shop), then we would see investment shift towards low-duration assets. In the terms of the model, this is a story about the covariance term $N^{-1}\sum_i D_i \left[\log B_{it} - N^{-1}\sum_k \log B_{kt}\right]$. The correlation between $\bar{D}$ and the term spread then would be driven by consumer demand (and hence the variation in the marginal product across assets) instead of the cost of capital. We can

test this hypothesis by decomposing $\bar{D}$ into components driven by within-industry reallocation and changes in the composition of investment across industries.

As noted above, the BEA not only reports data on aggregate investment; it also gives levels of investment at the asset×industry level. Denoting the first difference of $\bar{D}_t$ as $\Delta \bar{D}_t$, we can decompose $\Delta \bar{D}_t$ following van Ark and Inklaar (2006) using the industry-level data as

$$\Delta \bar{D}_t = \sum_j \frac{1}{2}\left(\frac{I_{j,t}}{\bar{I}_t} - \frac{I_{j,t-1}}{\bar{I}_{t-1}}\right)\left(\bar{D}_{j,t} + \bar{D}_{j,t-1}\right) + \sum_j \frac{1}{2}\left(\frac{I_{j,t}}{\bar{I}_t} + \frac{I_{j,t-1}}{\bar{I}_{t-1}}\right)\left(\bar{D}_{j,t} - \bar{D}_{j,t-1}\right) \tag{1.10}$$

where $\bar{D}_{j,t} \equiv \sum_i \frac{I_{j,i,t}}{I_{j,t}} D_i$ is the average duration of industry j at time t. The first part of equation (1.10) can be thought of as a cross-industry reallocation effect. It sums the changes in the industry investment shares, weighting by the industries' average durations at dates t and t − 1. The second term is the within-industry reallocation term. It represents the effects of industries changing their mix of investment among different assets. I refer to the two effects as the between- and within-industry effects, respectively.

The final three columns of table 1.2 report results from first-differenced regressions of $\Delta \bar{D}$ and its decomposition (1.10) on the change in the term spread. $\Delta \bar{D}$ and $\Delta TS$ are standardized to have unit variance as in the remainder of the table. The three columns report results from regressions with different dependent variables. The first of these columns uses $\Delta \bar{D}$. The coefficient on the term spread is similar to, though somewhat smaller than, the coefficient in column 1. In other words, the relationship between $\bar{D}$ and the term spread is somewhat weaker in high-frequency data, which is perhaps not surprising considering the effects of planning, ordering, and building lags. The coefficients in columns 8 and 9 by definition sum to the coefficient in column 7. The within-industry coefficient is twice the size of the between-industry coefficient; in other words, two thirds of the aggregate effect comes from reallocation within industries. The hypothesis that industry demand is correlated with the term spread seems to be true, but it explains only a minority of the variation in average duration over time.
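The decomposition in (1.10) can be checked numerically. The sketch below is a minimal illustration, assuming investment is stored as an industries-by-assets array and that aggregate investment is the sum over industries. The function name `decompose` and the array layout are hypothetical, and this is not the code used to produce table 1.2.

```python
import numpy as np

def decompose(inv_prev, inv_curr, durations):
    """Split the change in average duration into between and within parts.

    inv_prev, inv_curr: arrays of shape (industries, assets) holding
    investment at dates t-1 and t. durations: array of shape (assets,).
    Returns (between, within, total change in average duration).
    """
    def shares_and_durations(inv):
        industry_total = inv.sum(axis=1)
        share = industry_total / industry_total.sum()   # industry share of aggregate
        dbar_j = (inv @ durations) / industry_total     # industry average duration
        return share, dbar_j

    s0, d0 = shares_and_durations(inv_prev)
    s1, d1 = shares_and_durations(inv_curr)
    between = float(np.sum(0.5 * (s1 - s0) * (d1 + d0)))
    within = float(np.sum(0.5 * (s1 + s0) * (d1 - d0)))
    total = float(s1 @ d1 - s0 @ d0)
    return between, within, total
```

For example, if one industry doubles its investment in every asset while the other is unchanged, the within-industry term is zero and the entire change in average duration is attributed to the between-industry term.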


To analyze how the relationship in figure 1.1 and table 1.2 differs across assets, I run a regression of each asset's share of aggregate investment on the term spread.¹⁹ Specifically, for each asset I run the regression

$$\frac{I_{it}}{\sum_k I_{kt}} = \alpha_i + \beta_i\, TS_t + \varepsilon_{it} \tag{1.11}$$

It is straightforward to show that if $\beta_i$ is negatively related to each asset's duration, then there will be a negative relationship between the term spread and $\bar{D}_t$. This is a way of asking whether the relationship we observe at the aggregate level is pervasive across assets, or is driven by a few outlier assets.

Figure 1.4 plots the coefficients $\beta_i$ against duration. The black boxes are for equipment, grey diamonds structures. Regression lines are included for the sample of all assets and for equipment only. The correlations between $\beta_i$ and $D_i$ are -0.42 and -0.31 for equipment only and all assets, respectively. Looking across equipment, the relationship between the composition of investment and the term spread is broadly based, not driven by a few outliers. The plot includes labels for the assets that make up the largest part of investment over the last 15 years. Numbers in parentheses represent their percentage shares over that period. Within equipment, auto purchases as a share of total investment are far more positively correlated with the term spread than any other asset, though they represent a relatively small part of aggregate investment. Software is the single largest component of investment and it is well above the best fit line. Communication equipment and computers are next in the rankings and are somewhat closer to the regression line.

Structures do not match the results for equipment very well. While the shares of structures are generally negatively related to the term spread, they are not as negative as we would think from just looking at equipment. Electric-power plants, in particular, are a large positive outlier. As noted above, the fact that building lags average over a year (a figure that does not include the time required for planning) is likely to distort the regressions for structures.

19 To control for long-term changes in the composition of investment I first detrend the dependent variable and the term spread using the HP filter with a smoothing parameter of 25 as above.

Figure 1.4: Coefficients from regressions of investment shares on the term spread
[Scatter plot. Horizontal axis: Duration. Vertical axis: Regression Coefficient. Fitted lines: full sample best fit; equipment only best fit. Labeled assets (investment share, percent): Cars (3.4), Software (14.9), Trucks and buses (6.9), Medical equipment (3.7), Electric power plants (2.0), Furniture and fixtures (3.3), Computers (7.4), Construction machinery (2.1), Communication equipment (8.2), Railroad equipment (0.6), Petroleum and natural gas (4.0), Office/medical buildings (4.3), Commercial buildings (5.3), Metalworking machinery (2.6).]
Note: Coefficients from regressions of each asset's share of aggregate investment on the term spread. Both variables are detrended with the HP filter with a smoothing parameter of 25. Numbers in parentheses are investment shares over 1993–2008.
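The link between the asset-level coefficients in (1.11) and the aggregate regression can be illustrated with a short sketch. Because the HP filter and OLS are both linear, the slope from regressing detrended average duration on the detrended term spread equals the duration-weighted sum of the $\beta_i$, and the $\beta_i$ sum to zero because shares sum to one. The implementation below is an assumption-laden illustration: the HP filter is written directly from its penalized least squares definition, and the function names are hypothetical.

```python
import numpy as np

def hp_detrend(y, lam=25.0):
    """Cyclical component of y from the Hodrick-Prescott filter."""
    T = len(y)
    D = np.zeros((T - 2, T))
    for t in range(T - 2):
        D[t, t:t + 3] = [1.0, -2.0, 1.0]            # second-difference operator
    trend = np.linalg.solve(np.eye(T) + lam * D.T @ D, y)
    return y - trend

def slope(y, x):
    """OLS slope from a regression of y on a constant and x."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

def share_betas(inv, ts, lam=25.0):
    """Slope of each asset's detrended investment share on the detrended term spread.

    inv: array of shape (T, N) with investment by asset.
    ts: array of shape (T,) with the term spread.
    """
    shares = inv / inv.sum(axis=1, keepdims=True)
    ts_cycle = hp_detrend(ts, lam)
    return np.array([slope(hp_detrend(shares[:, i], lam), ts_cycle)
                     for i in range(shares.shape[1])])
```

Given a vector of durations `D`, the aggregate slope is then `share_betas(inv, ts) @ D`, which is negative when assets with high duration tend to have low $\beta_i$.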

1.5.2 The business cycle, volatility, and other explanations
Table 1.3 explores a number of other mechanisms that could cause the observed correlation beyond changes in demand across industries. Columns 1 and 2 control for the business cycle with the lagged detrended unemployment rate and level of output. In both cases the coefficient on the term spread is smaller but still statistically and economically significant. This is perhaps not surprising: even if the term spread does represent a true cost-of-capital effect, it is also a proxy for the business cycle. Controlling for other business cycle indicators will probably lower its coefficient. Including the current value and longer lags of unemployment and output does not change the results of the regressions.

Another obvious question is whether there is a mechanical relationship between average duration and the level of investment. Suppose a firm has equal stocks of two assets, one with a depreciation rate of 1 percent, the other 10 percent. In a maintenance phase with no net capital growth, there will be 10 times as much investment in the high-depreciation asset as in the low-depreciation asset. However, in an expansion phase, assuming both assets are expanded equally, investment will shift towards being equally balanced between the two assets. If the term spread is correlated with the level of investment, it might also then be correlated with average duration. Column 3 tests that hypothesis by including detrended aggregate equipment investment. Puzzlingly, unlike the example just given, when investment is high, duration actually tends to be low. However, the coefficient on the term spread is still large and significant. The term spread thus has explanatory power beyond its indication of either the business cycle or the overall level of investment. Column 4 shows that if we include all three aggregate indicators (unemployment, GDP, and investment), the coefficient on the term spread is the largest in absolute value, and has the highest t-statistic, of any of the variables (implying that the marginal $R^2$ of the term spread is higher than that of any of the business-cycle indicators).

Abel et al. (1996), among many others, study the effects of irreversibility on investment. With irreversibility, when idiosyncratic uncertainty is high, firms may be less willing to


Table 1.3: Robustness tests

Term Spread(t-1)

(2) -0.33 *** [0.11]

(3) -0.65 *** [0.11]

(5) -0.41 *** [0.09]

(6) -0.36 *** [0.10]

(7) -0.37 *** [0.10]

Unemployment(t-1) 0.38 *** [0.10] -0.23 ** [0.11] 0.37 *** [0.12] 0.31 ** [0.14]

(1) -0.27 ** [0.12] -0.45 *** [0.14]

GDP(t-1)

0.38 *** [0.09]

Investment(t)

(4) -0.44 *** [0.08] -0.01 [0.19] 0.40 ** [0.17] -0.29 *** [0.09]

SD_profits(t+1)

SD_returns(t+1)

-0.01 [0.09] -0.31 *** [0.08] -0.24 *** [0.08]

58 0.45 59 0.45 59 0.35 58 0.52

Bank tightness(t)

Value spread(t) 44 0.65 37 0.62

N R2

-0.22 *** [0.07] 59 0.49

Note: See table 1.2. The dependent variable is the detrended average duration of equipment investment. The value spread is the gap between log book/market (B/M) for the top and bottom 30 percent of firms ranked by B/M, among the smaller 50 percent of firms, measured at the beginning of the year. SD_profits and SD_returns are the cross-sectional standard deviations of quarterly firm profit growth and stock returns, controlling for a time trend and 3-digit industry dummies. The unemployment rate is the national rate obtained from the BLS. GDP is real GDP from the BEA. Bank tightness is the Fed's Survey of Senior Loan Officers index (from Lown and Morgan, 2006). Investment is aggregate real nonresidential equipment investment. All variables are detrended with the HP filter with a smoothing parameter of 25 (except for the value spread, for which it is 100), and standardized to have unit variance.

invest in long-duration assets. Intuitively, if it is more difficult to sell a long-duration asset (e.g. a large wind turbine) because it is more costly to disassemble than a short-duration investment, then there is option value to delaying investment which is increasing in uncertainty.²⁰ Campbell et al. (2001) and Bloom (2009) find that when the volatility of returns on the aggregate stock market is high, so is idiosyncratic firm volatility. If the term spread is partially driven by aggregate volatility (a finding of Bloom, 2009, and implied by many term structure models, e.g. Longstaff and Schwartz, 1992), and volatility drives $\bar{D}$, then we would find a spurious correlation between the term spread and $\bar{D}$.

I use two measures of cross-sectional volatility that are also used in Bloom (2009): the period-by-period cross-sectional standard deviations of firm quarterly profit growth and stock returns, including controls for 3-digit SIC industries.²¹ Column 5 of table 1.3 reports results of a regression of $\bar{D}$ on the volatility indexes. Both measures of volatility are positively correlated with the next year's term spread, which is consistent with Bloom's (2009) results. He finds that volatility shocks lead to economic contractions and reductions in the short rate. Table 1.3 shows that conditional on the term spread and the state of the business cycle, high stock return volatility (though not profit growth volatility) in the following year is associated with low-duration investment. This is consistent with the hypothesis that long-duration investment involves a bigger commitment for firms than short-duration investment. That is, the hypothesis that high volatility interacts with fixed costs of adjustment to decrease investment seems to apply more strongly to long-term than to short-term assets. Note, though, that even when controlling for volatility, the term spread remains significant and has a large coefficient.

Another alternative hypothesis is that the term spread does not reflect the cost of capital but is simply an indicator of the stance of monetary policy. When the Federal Reserve contracts the money supply, this may inhibit bank lending, as in Kashyap and Stein (2000). If banks are more likely to finance projects of a certain duration (either high or low), then the term spread might simply be correlated with movements in $\bar{D}$ because it is correlated
20 House and Shapiro (2008) discuss the relationship between real option-type effects and asset duration.

21 The original data was retrieved from Compustat and CRSP. I obtained the data used here from Nick Bloom's website.

with bank lending standards. One way to test this hypothesis is to try to directly measure bank lending standards. The Federal Reserve has administered a Survey of Senior Loan Officers since 1967 (with a gap between 1983 and 1989) that asks banks about the level of their lending standards.²² Column 6 includes the tightness index from this survey in the regression. The coefficient on the term spread remains significant. When bank lending standards are relatively tight (a high value of the index), average duration is low. This is perhaps surprising, since banks are usually thought of as financing short-duration projects, while firms go to credit markets for longer-term financing. One possible explanation is that lending standards tend to be high when other factors are driving firms towards short-duration investment. In particular, standards might be high in times of high uncertainty.

The appendix includes further robustness tests. When all of the controls are included simultaneously, the term spread is the only significant variable and it has more explanatory power than any of the other variables individually.

Lettau and Wachter (2007) argue that the differences in returns between high and low book/market (B/M) stocks can be explained by differences in the duration of their cash flows (see also Hansen, Heaton, and Li, 2008). A high value spread is associated with a high valuation for growth stocks, or long-duration assets, which implies investment in long-duration assets should be high. Since stock prices represent claims on capital, whereas Treasury bonds are claims on currency, we might expect that the value spread would have more predictive power than the term spread. Column 7 of table 1.3 reports the results of a regression including the value spread. I measure the value spread here as the ratio of the book to market ratios for the top and bottom third of stocks sorted by book to market (as reported on Kenneth French's website).²³ The coefficient is significantly negative: the opposite of what the duration theory of the value spread would predict. One possible explanation for this result is that firms with growth stocks (e.g. technology firms) tend to have lower-duration assets, so when their values are high average duration
22 I obtain data from Lown and Morgan, 2006.

23 Specifically, French reports value spreads for small and large stocks, split at the median of market capitalization. I average these two value spreads. Furthermore, I detrend the value spread using the HP filter with a smoothing parameter of 100.

falls. To many readers, that may have been the obvious result all along. Nevertheless, it runs against Lettau and Wachter’s theory.

1.6 Consumer durables
If the term spread truly represents a cost-of-capital effect, then we would expect household purchases of durable goods to respond to it in a manner similar to nonresidential investment. Households face some of the same choices as firms when deciding what types of durable goods to purchase. In particular, long-lasting durable goods may have financing arrangements with longer terms than those of shorter-duration assets.²⁴ Denoting the duration of durable good type i as $C_i$ and purchases at date t as $P_{it}$, I define the average duration of consumer durables purchases as

$$\bar{C}_t \equiv \frac{\sum_i C_i P_{it}}{\sum_i P_{it}} \tag{1.12}$$

Table 1.4 lists the assets available from the BEA, along with their depreciation rates and durations. The two assets with the lowest depreciation rates are luggage and furniture at 13 percent. Computer software and motor vehicle parts have the highest rates at 76 and 90 percent, respectively. The assets are mostly clustered in a small range of depreciation rates, though: three fourths have depreciation rates between 16 and 25 percent.

Figure 1.5 plots HP-detrended $\bar{C}_t$ against the detrended term spread. As in figure 1.1, the axis for $\bar{C}_t$ is reversed so that a negative correlation in the data is an easier-to-read positive correlation in the figure. For most of the sample, there is a strong negative correlation, just as we observe for nonresidential investment. In a regression similar to those in table 1.2, of the average duration of consumer durables purchases on the lagged term spread, the coefficient is -0.31 with a p-value of 0.008. There is thus a significant relationship over the full sample, though the correlation is somewhat weaker than what we observe for nonresidential investment. The correlation is clearest between 1965 and 1991. For nonresidential investment the correlation is more consistent over time, which explains why the QLR test in section 1.3 indicated a break point
24 Attanasio, Goldberg, and Kyriazidou (2008) show that auto loan terms tend to be between three and ﬁve years, while home loans may be as long as 30 years.

Table 1.4: Consumer durables, depreciation rates, and durations

Asset                                       Depreciation rate (δ)   Duration (years)
Motor vehicles and parts
  Autos                                             0.28                 3.27
  Light trucks                                      0.25                 3.70
  Motor vehicle parts & accessories                 0.90                 1.11
Furnishings and household equipment
  Furniture                                         0.13                 6.63
  Clocks, lamps, lighting fix & other               0.18                 4.90
  Carpets and other floor coverings                 0.18                 4.91
  Window coverings                                  0.18                 4.89
  Household appliances                              0.16                 5.37
  Glassware, tableware, & household uten            0.18                 4.90
  Tools & equipment for house & garden              0.18                 4.90
Recreational goods and vehicles
  Video & audio equipment                           0.20                 4.45
  Photographic equipment                            0.18                 4.92
  Personal computers and peripheral equip           0.44                 2.21
  Computer software & accessories                   0.76                 1.31
  Calcs, typewrtrs, & oth info proc equip           0.18                 4.91
  Sporting equip, supplies, guns, & ammo            0.18                 4.91
  Motorcycles                                       0.18                 4.91
  Bicycles & accessories                            0.18                 4.92
  Pleasure boats                                    0.18                 4.90
  Pleasure aircraft                                 0.18                 4.91
  Other recreational vehicles                       0.26                 3.52
  Recreational books                                0.18                 4.91
  Musical instruments                               0.20                 4.46
Other durable goods
  Jewelry & watches                                 0.16                 5.36
  Therapeutic appliances & equip                    0.32                 2.95
  Educational books                                 0.18                 4.91
  Luggage & similar personal items                  0.13                 6.63
  Telephone & facsimile equipment                   0.18                 4.88

Note: Depreciation rates are obtained from the BEA. Duration is measured as 1.03/(0.03+δ).
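The duration formula in the note to table 1.4 and the purchase-weighted average in (1.12) are simple enough to sketch directly. In the snippet below, treating the constant 0.03 as a parameter `r` is an assumption made for readability, and both function names are hypothetical. Because the table reports rounded depreciation rates, applying the formula to the rounded values will not reproduce every tabulated duration exactly.

```python
def duration(delta, r=0.03):
    """Duration in years for depreciation rate delta: (1 + r) / (r + delta)."""
    return (1.0 + r) / (r + delta)

def average_duration(purchases, deltas):
    """Purchase-weighted average duration across goods, as in equation (1.12)."""
    total = sum(purchases)
    weighted = sum(p * duration(d) for p, d in zip(purchases, deltas))
    return weighted / total
```

For instance, `duration(0.90)` gives about 1.11 years, matching the entry for motor vehicle parts, and shifting purchases towards goods with lower depreciation rates raises `average_duration`.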

only in the very beginning of the sample.

The relationship between $\bar{C}_t$ and the term spread seems to abruptly break down after 1991. If we run a QLR test as before, we can reject the hypothesis of no break at the 1 percent level. The F-statistic is maximized in 1991, only one year different from the local maximum that is obtained in the F-statistic for nonresidential investment.²⁵ The fact that these two break tests are maximized around the same time suggests that the breakdown in the consumer durables plot is not due to a factor that is specific to consumers.

Still, one possible consumer-specific explanation is that there was some sort of change in consumer credit markets around 1991. Perhaps easier access to credit cards made consumers less dependent on long-term financing for some durables purchases, which made them less sensitive to long-term credit conditions. The Flow of Funds accounts measure total credit card balances and household net worth. The ratio of consumer credit debt to net worth rises from 1.0 to 3.8 percent between 1945 and 1965, but then stays flat. While there were certainly changes in consumer credit markets following 1965, the total quantity of credit has remained in this sense stable.

1.7 The ﬁrm-level mechanism
To augment the analysis above, this section studies two aspects of ﬁrm-level investment. I begin by asking whether ﬁrms that invest in long-duration assets also tend to sell long-term debt. Next, I look at whether industries with larger cash holdings are more sensitive to the term spread. The data answer both these questions in the afﬁrmative, but when we include a full set of industry and year dummies the results go away, possibly because of insufﬁcient statistical power. The ﬁrst result indicates that when ﬁrms go to debt markets, the interest rate that they face depends on the duration of the investment they plan on undertaking. The second result shows that ﬁrms that are more likely to have to go to debt markets seem to vary the composition of their investment more strongly in response to interest rates.
25 Note, again, that the local maximum for nonresidential investment is not statistically significant.

Figure 1.5: Average duration of consumer durable purchases versus the lagged term spread
[Time-series plot, 1951–2006. Series: Term Spread; Average Duration (axis reversed).]
Note: Average duration of consumer durables is defined analogously to that for durable equipment. Both lines represent HP-detrended values. The axis for average duration is reversed. Grey bars indicate NBER-dated recessions.

1.7.1 The maturity of assets and debt

The link between the term spread and the cost of capital will be clearest to managers if investment in long-duration assets is financed with long-duration debt. If, for example, firms always borrow at the same maturity and simply roll over their debt, then they might pay attention only to the interest rate for the maturity at which they borrow, instead of the full term structure.

Baker, Greenwood, and Wurgler (BGW, 2003) find that firms time the debt market when they sell bonds. In particular, when the term spread is high firms sell short-term debt. BGW argue that firms do this because when the term spread is high, the prices of short-term bonds are expected to fall in the future. Firms are selling expensive or overpriced debt, which BGW claim represents arbitrage. But if it is true that firms try to match the maturity of their debt to the maturity of their investments, then the results in the previous sections could explain the BGW result. Matching the maturity of debt to assets reduces potential deadweight losses from bankruptcy (see, e.g., Stohs and Mauer, 1996). Graham and Harvey (2002) report evidence from surveys that maturity matching is the single most important determinant of debt maturity choice.26

Section 1.3 showed that when short-term yields are low, firms invest in short-duration assets. If the maturity-matching hypothesis is correct, then those firms should also sell short-duration debt. That matches the Baker et al. result: low short yields are associated with short-duration investment, which is associated with sales of short-duration debt. BGW claim that firms are arbitraging debt markets; I claim they are managing risk through maturity matching. The key to completing the argument is showing that firms actually do try to match the duration of their debt to that of their assets. In this section I provide evidence in support of this proposition.27
26 See also Barclay and Smith (1995) and Guedes and Opler (1996), among many others.

27 Baker et al. tried measuring the duration of assets with a strategy similar to mine. However, rather than using industry depreciation reported by the BEA, they used the amount of depreciation reported to the IRS by individual firms. Presumably these data were substantially noisier than the BEA data, which caused them to find inconclusive results. Moreover, accounting depreciation is in general not the same as economic depreciation. The majority of firms use straight-line depreciation, rather than the declining-balance method found to better match the resale value of assets (Hulten and Wykoff, 1981).


I obtain data from two sources. Data on capital stocks come from the BEA's detailed fixed asset tables as before.28 I continue to measure average duration within industry j as

$$\bar{D}_{jt} = \frac{\sum_i D_i I_{ijt}}{\sum_i I_{ijt}} \qquad (1.13)$$

where i indexes assets, j indexes industries, and I is investment.

I obtain data on corporate debt from Compustat. Following Baker et al. (2003) and Greenwood et al. (2009), the long-term share in a given industry and year is the sum of all outstanding long-term debt reported by firms in that industry divided by all long and short-term debt.29 I estimate issuance of long-term debt as the change in the level of long-term debt, and short-term issuance as simply the level of short-term debt (since short-term debt has, by definition, a maturity of less than one year). The long-term issuance share is then just the ratio of long-term issuance to total issuance.30

An important issue here is that Compustat only covers publicly traded firms, whereas the BEA's fixed-asset data cover all firms. To the extent that private firms have limited access to long-term credit markets, this will bias the level of the long-term share upwards.31 It is less clear, though, that selection should cause us to spuriously find that high-depreciation industries have a low long-term share. The selection would need to occur in such a way that firms in high-depreciation industries are more likely to go public but are no more likely to have access to long-term credit markets.

Table 1.5 reports regressions of the long-term level and issuance shares on industry average duration. The first two columns use the level share, the second two the issue share. Columns 2 and 4 include industry fixed effects. Each regression includes year dummies and the standard errors are corrected for clustering within industries. Columns 1 and 3

28 The BEA has its own industry classification, which is slightly different from NAICS. I use industries that roughly correspond to a 2-digit NAICS classification, but I combine some industries to ensure that I have sufficient firm observations to get good financial data. I end up with 22 industries.

29 I measure long-term debt as the sum of items 9 (long-term borrowing) and 44 (long-term debt about to retire), and short-term debt as item 9 plus item 34 (current liabilities) minus long-term debt.

30 I drop firm observations if the level of long-term debt drops by more than one half (as this amount of retirement is implausible). Industry-year observations are dropped if they have a negative level of long-term debt issuance.

31 For example, Titman and Wessels (1988) find that small firms are less likely to use long-term debt financing.


show that there is a significant negative relationship between the depreciation rate of assets in an industry and both the level and the issuance of long-term debt. However, columns 2 and 4 show that when we include industry dummies the effect goes away. That is, there is no evidence that when an industry shifts towards higher-depreciation assets, it also changes the composition of its debt. One reason I do not find within-industry effects in table 1.5 could be that the data are not sufficiently precise. The median number of firms used to create the industry×year observations is only 148, and the 25th percentile is 30. Moreover, the measure of the duration of debt is extremely rough. Firms could easily be changing the maturity of their long-term issues, rather than substituting between long and short-term issues.32
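The two industry-level measures described in this subsection, the investment-weighted average duration in equation (1.13) and the long-term issuance share, can be sketched as follows. The function names, asset categories, and numbers are hypothetical and chosen only for illustration; the durations D_i are taken as given, as defined earlier in the chapter.

```python
def average_duration(investment, duration):
    """Investment-weighted average duration, as in equation (1.13).

    investment: dict mapping asset -> investment I_ijt for one industry-year
    duration:   dict mapping asset -> duration D_i
    """
    total = sum(investment.values())
    return sum(duration[a] * inv for a, inv in investment.items()) / total

def long_term_issuance_share(lt_debt_now, lt_debt_prev, st_debt_now):
    """Ratio of long-term issuance to total issuance.

    Long-term issuance is the change in the level of long-term debt;
    short-term issuance is the level of short-term debt. Returns None when
    long-term issuance is negative, mirroring the rule that such
    industry-year observations are dropped.
    """
    lt_issue = lt_debt_now - lt_debt_prev
    if lt_issue < 0:
        return None
    return lt_issue / (lt_issue + st_debt_now)

# Hypothetical numbers, for illustration only.
durations = {"computers": 3.0, "machinery": 10.0, "structures": 40.0}
invest = {"computers": 50.0, "machinery": 30.0, "structures": 20.0}
d_bar = average_duration(invest, durations)           # (150 + 300 + 800) / 100
share = long_term_issuance_share(120.0, 100.0, 60.0)  # 20 / (20 + 60)
```

Shifting investment toward the short-lived asset lowers `d_bar`, which is the kind of compositional movement the regressions in table 1.5 relate to the debt shares.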

1.7.2 Cash reserves and investment

If firms match the maturities of assets and debt, as suggested by table 1.5, then when they borrow to finance investment, shifts in the term spread directly feed into their cost of capital, and presumably their investment decisions. However, when firms finance investment internally, we might think they simply use a rule-of-thumb method for the cost of capital, ignoring the term structure of interest rates (e.g. Graham and Harvey, 2002). I study this question by looking at how investment differs across industries with different cash holdings. I study the following regression

$$\bar{D}_{j,t} = \alpha_j + \beta_1 TS_{t-1} + \beta_2 CH_{j,t} + \beta_3 TS_{t-1} \times CH_{j,t} + \varepsilon_{j,t} \qquad (1.14)$$

where, as before, $TS_t$ is the term spread and $CH_{j,t}$ is a measure of cash holdings in industry j at time t. The coefficient $\beta_3$ measures the effect of cash holdings on the response of an industry's average duration of investment to the term spread. Under the hypothesis that firms that can finance investment internally respond less to the term spread, we should observe a positive value for $\beta_3$. I use two measures of an industry's ability to finance investment internally: its total
32 While there are data with more detail on the duration of corporate debt, they do not have a long enough time series to be useful for finding the aggregate effects that I am looking for here.

Table 1.5: Regressions of the long-term corporate level and issues shares

                    (1)         (2)         (3)         (4)
                    Levels      Levels      Issues      Issues
Duration            0.013***    -0.008      0.021**     -0.008
                    [0.005]     [0.006]     [0.11]      [0.012]
Fixed Effects?      No          Yes         No          Yes
N                   1,040       1,040       1,023       1,023

Note: The long term level share is the share of total corporate debt accounted for by long term (>1 year maturity) debt. The issues share is the share of issues accounted for by long term debt. Duration is the average duration rate of the industry's capital stock. All regressions include year dummies. Standard errors reported in brackets are corrected for clustering within industries. Annual data for 22 industries, 1950–2008.


current cash holdings and its cash flows (income before extraordinary items), both scaled by current property, plant, and equipment (PPE).33 One of the measures for $CH_{j,t}$ is a flow, while the other is a stock. If industries differ in the amount of cash that they prefer to hold at any given time, then cash flows might be more relevant for their ability to finance investment internally. On the other hand, cash reserves could be saved precisely for use in future investment, and hence represent a source of funds for investment.34 I test both possibilities.

As above, I obtain data from Compustat and detrend $\bar{D}_{j,t}$ and $TS_t$ with the HP filter. I subtract industry means from the measure of cash holdings, $CH_{j,t}$, so that the coefficient on the interaction term, $\beta_3$, represents the change in the response of $\bar{D}_{j,t}$ to the term spread depending on the difference between the industry's current cash holdings and the sample average for that industry.

Columns 1 and 2 of table 1.6 report estimates of equation (1.14). Column 1 shows that cash flows seem to have no significant relationship with the response of $\bar{D}_{j,t}$ to the term spread. In column 2 we see, though, that cash holdings have a strong effect. For an industry at its sample-average level of cash holdings, $\bar{D}$ falls by 0.15 standard deviations for every one-standard-deviation increase in the term spread. For an industry with cash holdings one standard deviation above their mean, this response shrinks to only 0.08 standard deviations. This result fits with the hypothesis that when firms are forced to finance investment in credit markets they are more sensitive to the cost of capital.

Columns 3 and 4 are the same as columns 1 and 2 except that I generate a variable $\widetilde{CH}_{j,t}$ as the residual from regressing $CH_{j,t}$ on a set of year and industry dummies. $\widetilde{CH}_{j,t}$ thus measures the deviation of an industry's cash holdings relative to both its sample average and the average for the year. This controls for general trends in cash holdings over time. The coefficients on the interaction terms are no longer significant. As before, this may simply be a power issue. Note that the standard error on the interaction term is substantially larger in column
33 The results are unchanged if we use free cash flow instead of income.

34 See Opler et al. (1999) for an analysis of the determinants of cash holdings and their use for investment, and the large literature following Fazzari, Hubbard, and Petersen (1988) on the relationship between investment and cash flows.


Table 1.6: Interaction of the term spread with industry cash holdings

                        (1)         (2)         (3)         (4)
Term spread (t-1)       -0.15***    -0.15***    -0.15***    -0.15***
                        [0.03]      [0.02]      [0.03]      [0.02]
Cash flows (t-1)        0.03                    0.08**
                        [0.02]                  [0.03]
Cash holdings (t-1)                 0.03*                   0.03
                                    [0.02]                  [0.02]
TS(t-1)xCF(t-1)         -0.02                   0.07
                        [0.03]                  [0.06]
TS(t-1)xCH(t-1)                     0.07**                  -0.03
                                    [0.03]                  [0.06]
Industry dummies        Yes         Yes         Yes         Yes
Year dummies            No          No          Yes         Yes
R2                      0.08        0.09        0.08        0.07
N                       770         806         770         806

Note: Regressions of the average duration of investment by industry on the term spread interacted with measures of cash holdings. Cash data are obtained from Compustat. TS is the term spread. CF is cash flows. CH is cash holdings. All variables are measured at the end of the year. All regressions include industry dummies. Standard errors are clustered by industry.


4 than column 2. We in fact cannot reject that the two coefﬁcients are equal.
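The way the interaction coefficient in equation (1.14) changes the sensitivity to the term spread can be made concrete with a short worked example. It assumes, as the discussion of column 2 implies, that all variables are expressed in standard-deviation units and that cash holdings are demeaned; the function name is hypothetical.

```python
def duration_response(beta1, beta3, cash_dev):
    """Marginal effect of the lagged term spread on average duration in (1.14).

    cash_dev is the industry's demeaned cash holdings, in standard deviations.
    """
    return beta1 + beta3 * cash_dev

# Values reported for column 2: beta1 = -0.15, beta3 = 0.07.
at_mean = duration_response(-0.15, 0.07, 0.0)       # industry at its average
one_sd_above = duration_response(-0.15, 0.07, 1.0)  # cash one s.d. above mean
```

With these values the response is -0.15 at average cash holdings and about -0.08 one standard deviation above, which is the comparison drawn in the text.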

1.8 Conclusion
This paper shows that there is a strong relationship in aggregate data between investment and the cost of capital. I find that the term spread can explain a third of the variation in the cross section of investment. While this relationship does not quantify the magnitude of internal adjustment costs facing firms, it does show that the cost of capital is a major factor driving the variation in the type of investment that firms undertake. The composition of investment changes meaningfully over the business cycle, and a substantial portion of these changes can be explained by the term spread alone.

The results are robust to including a variety of controls, including multiple indicators of the state of the business cycle. None of the controls eliminates the coefficient on the term spread. Moreover, when we include all of the controls at once, the term spread is the only variable that remains significant. Of all of the variables I study, the term spread is the most robust and powerful explanatory variable for the distribution of investment. The dimension of investment studied here has not been examined before. The results extend also to consumer durables purchases: households tend to buy less-durable durables when the yield curve is steep.

Cochrane (2011) gives an extensive review of the literature on return predictability and variation in the price of risk, arguing that shifts in discount rates are part of "the central organizing question of asset pricing research." As Treasury bonds have (nominally) riskless payoffs, shifts in the term spread are purely driven by discount rates. The finding that the term spread determines the composition of investment thus connects to Cochrane's organizing question by showing its relevance for the aggregate economy, and not just financial markets.

There are many other cross-sectional sources of variation in the cost of capital beyond differences in asset lives. Tax policies, e.g. R&D tax credits and bonus depreciation, distort the cost of capital, as will changes in the price of risk. The finding here that shifts in the term structure of interest rates affect the composition of investment suggests that tax policy can succeed in


distorting investment choices. Similarly, to the extent that the price of risk varies over time, an interesting question is whether a high price of risk causes businesses to shift relatively towards low-risk/low-reward projects.


2. A MODEL OF TIME-VARYING RISK PREMIA WITH HABITS AND PRODUCTION

2.1 Introduction
Stock prices are more volatile than can be explained by movements in expected dividends. Moreover, excess returns on the aggregate stock market are predictable over time. The two phenomena are connected: changes in the discount rates applied to future dividends can induce excess volatility in asset prices. This paper develops a new preference specification with time-varying risk aversion that generates realistically predictable and volatile stock returns. When combined with a production framework, the model can match the short and long-run volatilities of output, consumption, and investment growth and at the same time generate a high and volatile price of risk. Simulated stock-return forecasting regressions are consistent with empirical results, and the model also delivers a new method for forecasting stock returns. The structural estimate of risk aversion has an R² of over 50 percent for 5-year stock returns in the post-war period.

The standard model of time-varying risk aversion is the habit specification of Campbell and Cochrane (1999).1 In their model, when an agent's consumption falls close to her habit, her risk aversion rises. Using aggregate consumption data, they find that their implied risk aversion measure can explain a large proportion of the movements in the price-dividend ratio on the stock market. Campbell and Cochrane study an endowment economy, though, so they never test whether their utility function generates a realistic consumption process in equilibrium. In fact, Lettau and Uhlig (2000) and Rudebusch and Swanson (2008) find that Campbell–Cochrane preferences imply that consumers smooth
1 A partial selection of other early papers studying habit formation is Abel (1990), Constantinides (1990), Boldrin, Christiano, and Fisher (2001) and Jermann (1998). For other papers that study return predictability in a production setting, see Gourio (2010), Campanale, Castro, and Clementi (2010), and Guvenen (2009), though note that the latter two papers do not match the degree of predictability observed in the data.


consumption growth extremely and implausibly strongly following technology shocks in standard general-equilibrium models.

This paper embeds the intuition behind Campbell and Cochrane (1999), that persistent external habits can induce time-varying risk aversion, into the framework developed by Kreps and Porteus (1978), Epstein and Zin (1989), and Weil (1989). The Epstein–Zin specification allows us to model risk aversion and intertemporal substitution separately, while the Campbell–Cochrane intuition motivates time-variation in risk aversion. In particular, consumers are modeled as having a time-varying external habit, which is a benchmark to which they compare their own lifetime utility. When lifetime utility is farther above the benchmark, risk aversion over proportional shocks to future welfare is lower. By explicitly separating variation in risk aversion from intertemporal substitution, the Epstein–Zin framework eliminates the problems that arise when standard Campbell–Cochrane preferences are used in a production setting. I refer to the new preference specification as the EZ-habit model for its combination of these two frameworks.2

The simple real business cycle (RBC) model with fixed labor supply provides a transparent laboratory in which to study the effects of time-variation in risk aversion on the macroeconomy in general equilibrium. I find that the dynamics of real variables and real interest rates under the EZ-habit specification are highly similar to a model with Epstein–Zin utility and constant relative risk aversion.3 The model can match both the short and long-run variances of output, investment, and consumption. Since consumption and wealth are cointegrated under balanced growth, their long-run variances must be the same. But empirically, the short-run (quarterly) variance of consumption growth is much smaller than the variance of changes in wealth.
To match both the long and short-run moments, a model must have either mean-reversion in wealth or strong persistence in consumption growth. A number of recent asset-pricing papers (e.g. Bansal and Yaron, 2004, Kaltenbrunner and Lochstoer, 2010) have gone the route of choosing very strong persistence for consumption growth. In Kaltenbrunner and Lochstoer's (2010) analysis of asset prices in the RBC model, for example, innovations to the permanent component of consumption have a standard deviation of 8 percent per year, which is at odds with the data. The EZ-habit model, on the other hand, implies that consumption is roughly a random walk (the short and long-run variances are nearly equal) but wealth is mean-reverting: declines in risk aversion raise current asset prices and lower expected returns. Whereas other papers in the production-based asset-pricing literature do not check the fit of their models to the long-run variance of consumption and output, I show that the EZ-habit model can match this moment along with the short-run variances.

In addition to matching macro moments, the EZ-habit model improves the fit of the RBC model to financial moments. Previous habit-based models designed to generate high or volatile risk premia tended to have implausibly volatile interest rates, a flaw not found here.4 The reasonable behavior of interest rates is an important innovation of this paper; the EZ-habit model is able to have stable interest rates but still generate substantial asset price volatility because it has variation in discount rates on risky assets that is driven by variation in risk aversion.5 Movements in discount rates imply that asset returns should be predictable, and extensive tests show that the degree of predictability in the model is similar to what is observed in the data. Variation in risk aversion not only raises the volatility of asset returns; I find that it also makes the equity premium roughly 1/3 larger on average than it would be otherwise. Countercyclical movements in risk aversion thus increase both the quantity and price of risk in financial markets: good times seem even better and bad times worse.

2 Melino and Yang (2003) study a utility specification that is highly similar to mine in reduced form. However, they do not discuss the inclusion of a habit, and they do not insert the preferences into a production setting.

3 For other recent studies of asset pricing in production economies, see Danthine, Donaldson and Mehra (1992), Rouwenhorst (1995), Tallarini (2000), and Cochrane (2005).
There are numerous empirical methods of forecasting stock returns, but the majority of them are not based on equilibrium theories. For example, regressions of stock returns on price-dividend ratios are motivated simply by an identity that links the price-dividend ratio to future returns and dividend growth. Under the EZ-habit model, though, it turns
4 See Jermann (1998); Boldrin, Christiano, and Fisher (2001); Campanale, Castro, and Clementi (2010); and Miao and Wang (2010).

5 See LeRoy and Porter (1981), Shiller (1981), and Grossman and Shiller (1981) for early studies of excess volatility in asset prices and the relationship between return predictability and volatility.


out to be possible to directly measure risk aversion. As is standard in the habit literature, I assume that positive innovations to household welfare reduce risk aversion. So if we can measure welfare, we can also measure risk aversion. Under Epstein–Zin preferences with a constant elasticity of intertemporal substitution, welfare is a function of current household wealth and consumption. And this result holds generally; it is not dependent on the RBC model I analyze. Using data on consumption and wealth, I construct an empirical estimate of risk aversion and show that it is a strong forecaster of aggregate stock returns: it outperforms the price-dividend ratio, Lettau and Ludvigson’s (2001) measure of the consumption-wealth ratio, and Campbell and Cochrane’s (1999) excess consumption ratio. This result differentiates my paper from models of time-varying disaster risk because it does not rely on an unobservable latent process to drive risk premia.6 The model also can match forecasting results for consumption growth. Lettau and Ludvigson (2001) ﬁnd little ability to forecast consumption growth using their measure of the consumption-wealth ratio. Campbell and Shiller (1988) obtain similar results for the stock market. As in the empirical data, it is essentially impossible to forecast consumption growth in the EZ-habit model using the consumption-wealth ratio, but forecasts of risk premia are highly effective. An alternative way to forecast consumption growth is with interest rates. Hall (1988) and Campbell and Mankiw (1989), in trying to estimate the elasticity of intertemporal substitution (EIS), essentially ask whether consumption growth can be forecasted with interest rates. They ﬁnd little forecasting power, suggesting the representative household has a small or even zero EIS. In this paper, the EIS is set to 1.5, but I still replicate the regression results from Hall (1988) and Campbell and Mankiw (1989). 
The EZ-habit model explains the failure of those regressions through a time-varying precautionary-saving effect. When risk aversion is high, households want to save more to protect themselves against future shocks, which drives interest rates downward. This effect biases standard Euler-equation estimation based on models with constant relative risk aversion. After testing the model’s ﬁt to macro and asset pricing moments and the predictions for
6 See Gourio (2010) and Wachter (2010) for recent models with time-varying disaster risk.


the EIS regressions and return forecasting, I consider two extensions to the model. First, I examine the effect of time-varying risk aversion on labor supply. Following positive technology shocks, risk aversion falls, raising consumption (through a decline in precautionary saving demand). This effect also lowers the response of labor supply to technology shocks. Intuitively, intratemporal optimization means that when households are willing to spend more money to raise consumption, they are also willing to sacriﬁce in terms of opportunity costs to raise leisure. Endogenous labor supply has little effect on risk premia in the economy, though. The reason is simply that under Epstein–Zin preferences with a high elasticity of intertemporal substitution, the volatility of the stochastic discount factor is driven mainly by the permanent component of consumption; so even if households smooth consumption growth by varying labor supply, the total amount of risk in the economy is essentially unchanged. The second extension is a log-linearization of the model using methods similar to Campbell (1994) and Lettau (2003). Unlike standard perturbation methods, the log-linearization used here does not impose certainty equivalence, so we can obtain expressions that take into account potentially time-varying risk premia even in the ﬁrst-order approximation. I am able to derive explicit expressions for the Sharpe ratio in the model as a function of current risk aversion and the underlying parameters of the model and ﬁnd that the results are highly similar to those from accurate numerical solutions. Much of the previous production-based asset-pricing literature has focused on simulations to study the implications of various models, so this paper introduces an important methodological contribution in extending and simplifying the analytic results of Campbell (1994) and Lettau (2003). 
Further, in the case where risk aversion is constant, I give an analytic characterization of how endogenous consumption smoothing generates long-run risks in a production setting (Bansal and Yaron, 2004; Kaltenbrunner and Lochstoer, 2010). The log-linearization thus provides an analytic explanation for results that were previously supported only with simulation-based evidence. The log-linear solution returns a stochastic discount factor (SDF) that takes on the essentially affine form that is widely used in the empirical asset-pricing literature. This is possibly the first paper to derive an essentially affine SDF with a time-varying price of risk

from a production-based model. It thus connects the standard modeling framework in macroeconomics with one of the most widely used asset-pricing specifications in empirical finance. The paper is organized as follows. Section 2.2 discusses the preference specification and lays out the economic environment. Section 2.3 calibrates a production economy and compares its behavior to the data. Section 2.4 tests the empirical implications of the model for return forecasting, and section 2.5 studies extensions to the basic framework. Section 2.6 concludes.

2.2 The model
2.2.1 Household preferences

For households with a constant elasticity of intertemporal substitution (EIS), Epstein–Zin (1989) utility can be expressed as

$$V_t = \left\{ (1 - \exp(-\beta))\, C_t^{1-\rho} + \exp(-\beta) \left[ G_t^{-1}\left( E_t\left[ G_t(V_{t+1}) \right] \right) \right]^{1-\rho} \right\}^{1/(1-\rho)} \qquad (2.1)$$

for some function $G_t$, where $C_t$ is household consumption and $E_t$ is the expectation operator conditional on information available at date t.7 The term $G_t^{-1}(E_t[G_t(V_{t+1})])$ is a certainty equivalent. When there is no uncertainty about $V_{t+1}$, $G_t^{-1}(E_t[G_t(V_{t+1})]) = V_{t+1}$.

The usual choice for $G_t$ (going back to Weil, 1989, and Epstein and Zin, 1991) is power utility,

$$G_t^{Power}(V_{t+1}) = V_{t+1}^{1-\alpha} \qquad (2.2)$$

Epstein and Zin (1989) show that the coefﬁcient of relative risk aversion for a household with preferences of the form (2.1) is equal to the coefﬁcient of relative risk aversion for Gt , while the EIS is equal to 1/ρ.
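The certainty equivalent inside recursion (2.1) can be illustrated numerically for the power-utility case (2.2). The sketch below is not the dissertation's solution method: it evaluates a single step of the recursion for a hypothetical discrete distribution of next-period value, and the function names and numbers are assumptions made for illustration.

```python
import math

def certainty_equivalent(values, probs, alpha):
    """G^{-1}(E[G(V)]) for power utility G(V) = V**(1 - alpha), alpha != 1."""
    expected_g = sum(p * v ** (1.0 - alpha) for v, p in zip(values, probs))
    return expected_g ** (1.0 / (1.0 - alpha))

def ez_value(c, values, probs, beta, rho, alpha):
    """One step of recursion (2.1), given consumption c and the distribution
    of next-period value. Requires rho != 1."""
    ce = certainty_equivalent(values, probs, alpha)
    disc = math.exp(-beta)  # discount factor exp(-beta), as in (2.1)
    inner = (1.0 - disc) * c ** (1.0 - rho) + disc * ce ** (1.0 - rho)
    return inner ** (1.0 / (1.0 - rho))

# With no uncertainty the certainty equivalent is next-period value itself.
ce_no_risk = certainty_equivalent([2.0], [1.0], alpha=5.0)

# A 50/50 gamble over V = 1 or V = 3 (mean 2): higher alpha, lower CE.
ce_alpha2 = certainty_equivalent([1.0, 3.0], [0.5, 0.5], alpha=2.0)
ce_alpha5 = certainty_equivalent([1.0, 3.0], [0.5, 0.5], alpha=5.0)

# EIS of 1.5 corresponds to rho = 1/1.5.
v_today = ez_value(2.0, [2.0], [1.0], beta=0.01, rho=1 / 1.5, alpha=5.0)
```

Changing `alpha` moves the certainty equivalent without touching `rho`, which is the separation of risk aversion from intertemporal substitution that the text attributes to this class of preferences.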
7 The preferences can be further generalized to study alternative time aggregators, instead of the constant elasticity of substitution form.


Now consider a habit-formation utility function for G,

$$G_t^{Habit}(V_{t+1}; H_t) = (V_{t+1} - H_t)^{1-\alpha} \qquad (2.3)$$

Value functions involving $G_t^{Habit}$ are related to those using $G_t^{Power}$ in the same way that usual habit specifications, e.g. Constantinides (1990), are related to time-separable power utility. Rather than caring only about the absolute level of their continuation value, households with $G_t^{Habit}$ care about the spread between tomorrow's value and a benchmark $H_t$. Since the utility function adds a habit to Epstein–Zin, I refer to it as the EZ-habit specification.8 I refer to the version of $V_t$ using $G_t^{Power}$ for the certainty equivalent as canonical Epstein–Zin in deference to its popularity in the literature.
The coefficient of relative risk aversion for $G_t^{Habit}$ is equal to $\alpha \frac{V_{t+1}}{V_{t+1} - H_t}$. As the spread between value and habit rises, the coefficient of relative risk aversion falls. Intuitively, when the continuation value falls close to its benchmark, proportional shocks to $V_{t+1}$ loom much larger than when the household has a cushion between its continuation value and $H_t$. In principle it is possible to analyze a model with $G_t^{Habit}$, but it has three important drawbacks. First, if the support of the shocks to $V_{t+1}$ is sufficiently wide, there is a nonzero probability that $V_{t+1}$ will fall below $H_t$, leaving the certainty equivalent undefined.9 Second, because $G_t^{Habit}$ is not log-linear in $V_{t+1}$, obtaining simple analytic results with it is difficult or impossible. Third, because $G_t^{Habit}$ is not log-linear, standard arguments for the existence of a representative agent do not apply.10
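The expression for relative risk aversion under the habit specification follows from the usual definition, $-V G''(V)/G'(V)$, applied to (2.3). A short derivation, treating $H$ as fixed and dropping time subscripts for readability (a notational simplification made here):

```latex
% Relative risk aversion of G(V) = (V - H)^{1-\alpha}, with H held fixed.
\begin{align*}
G'(V)  &= (1-\alpha)\,(V-H)^{-\alpha}, \\
G''(V) &= -\alpha\,(1-\alpha)\,(V-H)^{-\alpha-1}, \\
-\frac{V\,G''(V)}{G'(V)} &= \alpha\,\frac{V}{V-H}.
\end{align*}
% Special cases: H = 0 recovers \alpha (power utility); V = 2H gives 2\alpha.
```

As $V$ approaches $H$ from above the ratio grows without bound, which is the sense in which proportional shocks loom larger when the cushion between value and habit is small.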
8 Other papers, for example Rudebusch and Swanson (2010) and Yang (2008), incorporate consumption habits into Epstein–Zin preferences. That is, the $C_t^{1-\rho}$ term is replaced by $(C_t - X_t)^{1-\rho}$, where $X_t$ is the habit. Rudebusch and Swanson (2008) show that in general equilibrium this does not lead to a time-varying Sharpe ratio because households endogenously smooth consumption to reduce their overall risk exposure. That said, the specification in Rudebusch and Swanson (2008) is meant to generate smooth consumption growth rather than a high risk premium. In principle, there is no reason that this type of habit formation could not be added to the EZ-habit model to help generate smoother consumption (e.g. to help explain the excess smoothness puzzle of Campbell and Deaton, 1989). Dew-Becker (2011) studies preferences with both time-varying risk aversion and consumption habits in a medium-scale DSGE model.

9 This issue also arises in other habit specifications. When models are solved with standard perturbation methods, the problem is simply ignored. I use a more precise global numerical solution technique that forces me to grapple with the problem.

10 A representative agent may exist, but their preferences need not actually look like the preferences of any

For the remainder of the paper I therefore replace $G_t^{Habit}$ with the alternative

$$G_t^{TV}(V_{t+1}) = V_{t+1}^{1-\alpha_t}, \qquad \alpha_t = \alpha \frac{V_t}{V_t - H_t} \qquad (2.4)$$

$\left(G_t^{TV}\right)^{-1}\left(E_t\, G_t^{TV}(V_{t+1})\right)$ (where TV stands for time-varying) is a second-order approximation to $\left(G_t^{Habit}\right)^{-1}\left(E_t\, G_t^{Habit}(V_{t+1})\right)$ around the non-stochastic version of the model.11 Moreover, the appendix shows that in the continuous-time limit (i.e. under stochastic differential utility), preferences with $G_t^{TV}$ are exactly equivalent to preferences using $G_t^{Habit}$.12 $G^{TV}$ is locally equivalent to $G^{Habit}$ in terms of risk preferences, but it solves the problems of integrability inside the certainty equivalent and the existence of a representative agent.

As in Campbell and Cochrane (1999), I assume that households take the excess value ratio, $\frac{V_t}{V_t - H_t}$, and hence the coefficient of relative risk aversion, $\alpha_t$, as external to their own decisions. The final step, then, is to specify a dynamic process for risk aversion. I assume a simple log-linear process, which we will find to be highly tractable,

$$\alpha_{t+1} = \phi \alpha_t + (1 - \phi)\bar{\alpha} + \lambda \left( \Delta v^A_{t+1} - E_t \Delta v^A_{t+1} \right) \qquad (2.5)$$

where $v_t^A$ is the log of $V_t$ for the representative agent. Intuitively, when value unexpectedly rises, it moves away from the habit and risk aversion falls, so $\lambda < 0$. Movements in the habit, and hence risk aversion, depend on aggregate value so that they are not affected by an individual household's decisions. The AR(1) specification for risk aversion is approximately equivalent to a specification where $\log H_t$ is a geometrically weighted
particular agent. Ideally, if every agent has identical preferences, the representative agent will also have those preferences.
11

More precisely, the second-order approximation also assumes no growth. Adding a constant growth rate
(1+ µ )V

µ to V would change the result to αt = α (1+µ)V −tH . The remainder of the analysis is identical. t t
12 Melino and Yang (2003) study a utility function with the same form as G TV , but they take α as a latent t variable and give no theoretical motivation for its variation. This paper is original for proposing inserting habits into the certainty-equivalent part of Epstein–Zin preferences to motivate movements in αt .

46

A moving average of past values of vt .13
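As a rough illustration of the risk-aversion process in (2.5), the sketch below simulates the AR(1) recursion. It assumes, purely for illustration, that the value-growth surprise is i.i.d. normal with a fixed standard deviation `sigma_v`; in the model that surprise is an equilibrium object. The values of `lam` and `sigma_v` are hypothetical and are chosen only so that the stationary standard deviation equals the 6.2 reported in the calibration; the function names are likewise illustrative.

```python
import numpy as np


def simulate_risk_aversion(n, phi, alpha_bar, lam, sigma_v, seed=0):
    """Simulate alpha_{t+1} = phi*alpha_t + (1-phi)*alpha_bar + lam*surprise_{t+1}.

    Assumption (not from the text): the surprise in log value growth is
    i.i.d. N(0, sigma_v^2).
    """
    rng = np.random.default_rng(seed)
    surprise = sigma_v * rng.standard_normal(n)
    alpha = np.empty(n)
    alpha[0] = alpha_bar  # start at the unconditional mean
    for t in range(n - 1):
        alpha[t + 1] = phi * alpha[t] + (1 - phi) * alpha_bar + lam * surprise[t + 1]
    return alpha


def stationary_std(phi, lam, sigma_v):
    """Unconditional standard deviation of the AR(1) process above."""
    return abs(lam) * sigma_v / np.sqrt(1 - phi**2)


# Hypothetical inputs: lam < 0 as in the text; sigma_v backed out from stdev 6.2.
phi, alpha_bar, lam = 0.94, 14.0, -1.0
sigma_v = 6.2 * np.sqrt(1 - phi**2)
alpha = simulate_risk_aversion(200_000, phi, alpha_bar, lam, sigma_v)
print(alpha.mean(), alpha.std(), (alpha < 0).mean())
```

The closed-form standard deviation makes clear that, for a given persistence, only the product of |λ| and the volatility of the surprise matters for the volatility of risk aversion in this simplified setting.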

The appendix shows how to derive the marginal rate of intertemporal substitution (the stochastic discount factor, or SDF) for the general form of Epstein–Zin preferences in (2.1). In the case of $G^{TV}$, we end up with the expression,

$$M_{t+1} \equiv \frac{\partial V_t/\partial C_{t+1}}{\partial V_t/\partial C_t} = \exp(-\beta)\, \frac{V_{t+1}^{\rho-\alpha_t}}{\left(E_t V_{t+1}^{1-\alpha_t}\right)^{\frac{\rho-\alpha_t}{1-\alpha_t}}}\, \frac{C_{t+1}^{-\rho}}{C_t^{-\rho}} \tag{2.6}$$

with the only difference from the SDF under canonical Epstein–Zin preferences being the subscript on $\alpha_t$. The SDF is a critical piece of the model since its volatility determines the price of risk in the economy.14 As usual, changes in expected consumption growth or volatility will affect the SDF through their effects on $V_{t+1}$. Changes in $\alpha_{t+1}$ (or $H_{t+1}$) will also affect the SDF in the same way. Specifically, when the habit rises and households are more risk averse, they penalize consumption uncertainty more, driving $V_{t+1}$ down. High risk-aversion states thus have high Arrow–Debreu prices. It is also straightforward to derive the standard result that
$$W_t = \frac{V_t^{1-\rho} C_t^{\rho}}{1 - \exp(-\beta)} \tag{2.7}$$

where $W_t$ is the equilibrium price of a claim on the household's consumption stream, which I refer to as the aggregate wealth portfolio. This formula holds regardless of whether risk aversion varies over time. Intuitively, the market price of the consumption stream is equal to the utility value that a household places on it, $V_t$, divided by the marginal utility of consumption, $V_t^{\rho} C_t^{-\rho} (1 - \exp(-\beta))$. This leads to the familiar result from Epstein and Zin (1991),

$$M_{t+1} = \exp(-\beta)^{\frac{1-\alpha_t}{1-\rho}} \left(\frac{C_{t+1}}{C_t}\right)^{-\rho \frac{1-\alpha_t}{1-\rho}} R_{w,t+1}^{\frac{\rho-\alpha_t}{1-\rho}} \tag{2.8}$$

where $R_{w,t+1}$ is the return on the wealth portfolio.

13 It is straightforward to derive the actual process that $H_t$ must follow in order for risk aversion to follow the process in (2.5).

14 Hansen and Jagannathan (1991) show that the maximum Sharpe ratio (expected excess return divided by standard deviation) attained by any asset in the economy is equal to the standard deviation of the SDF divided by its mean.
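To make the return-based form of the SDF in (2.8) and the Hansen–Jagannathan bound concrete, the following sketch evaluates both numerically. It is an illustration only: the function names are hypothetical, the numerical inputs are arbitrary, and `beta` is treated here as a rate of time preference so that the discount factor is exp(-beta).

```python
import numpy as np


def ez_sdf(cons_growth, r_wealth, beta, rho, alpha_t):
    """SDF in the form of (2.8): gross consumption growth and gross wealth return in.

    theta = (1 - alpha_t) / (1 - rho), so theta - 1 = (rho - alpha_t) / (1 - rho).
    """
    theta = (1.0 - alpha_t) / (1.0 - rho)
    return (np.exp(-beta * theta)
            * np.asarray(cons_growth) ** (-rho * theta)
            * np.asarray(r_wealth) ** (theta - 1.0))


def hj_bound(m):
    """Hansen-Jagannathan bound: std(M) / mean(M) over a sample of SDF draws."""
    m = np.asarray(m, dtype=float)
    return m.std() / m.mean()


# Arbitrary illustrative draws (not calibrated to the model).
rng = np.random.default_rng(0)
g = np.exp(0.005 + 0.005 * rng.standard_normal(10_000))   # consumption growth
rw = np.exp(0.01 + 0.02 * rng.standard_normal(10_000))    # wealth return
m = ez_sdf(g, rw, beta=0.0025, rho=2 / 3, alpha_t=14.0)
print(hj_bound(m))
```

A useful sanity check is the case `alpha_t == rho`, where the wealth return drops out and the expression collapses to the time-separable power-utility SDF, exp(-beta) times consumption growth raised to the power -rho.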

2.2.2 Discussion

The model is motivated as an extension of habit-based preferences. Rather than consumers having a habit level of consumption that they target, I assume they have a habit level of value. Since equation (2.7) shows that there is a direct link between value and wealth, we could also think of the model as saying that households have a benchmark level of wealth. The house-money effect of Thaler and Johnson (1990) has a somewhat similar intuition. They ﬁnd that when subjects in lab experiments have recently gained money in betting games, they play more aggressively.15 Abel (1990) interprets habits in consumption as a "keeping up with the Joneses" effect. That intuition extends to the EZ-habit model. What households try to keep up with in this model, though, is fundamentally different. For example, consider a college senior who is trying to decide between following her friends into consulting or getting a law degree. With the J.D., she knows that in the short run her consumption will be lower than that of her friends, but in the long run she will likely be better off. In a model with an external consumption habit, three years of consumption below that of her friends looks painful. But when the habit appears as a function of value, the student is comfortable giving up consumption in the short run as long as she knows she will do well compared to her friends in the long run. Since the habit appears only in the risk aggregator, an agent with EZ-habit preferences is willing to substitute consumption over time in a way that an agent with standard habit-forming preferences is not. For the same reason, the EZ-habit model is not inconsistent with the mixed evidence on the effects of classic consumption habits at the micro level (e.g. Dynan, 2000, and Ravina, 2007). There are a number of papers that use investment choices to measure variation in risk aversion. Carroll (2002) ﬁnds that households with higher wealth tend to tilt their investment portfolios towards more risky assets. 
Brunnermeier and Nagel (2008), though, argue that there is little evidence that changes in wealth affect portfolio choices in household data. Rather, they find that inertia is the dominant characteristic of household portfolio choice. Calvet, Campbell, and Sodini (2009), after controlling for the inertia studied by Brunnermeier and Nagel, find a strong and significant relationship between innovations to wealth and the riskiness of a household's portfolio.16 Furthermore, they show that weakness in the instruments for wealth shocks can cause a researcher to erroneously find that wealth does not affect risk-taking. Calvet and Sodini (2010) show that higher past income, controlling for current wealth and genetic differences in risk attitudes, is also negatively related to the share of household portfolios invested in risky assets. On net, with the notable exception of Brunnermeier and Nagel (2008), the empirical literature supports the idea that increases in wealth reduce risk aversion.

15 Barberis, Huang, and Santos (2001) embed the house-money effect in a full asset-pricing model. See Gertner (1993) and Post et al. (2008) for evidence on the house-money effect from game shows.

2.2.3 Production

Aggregate output is a function of the capital stock, $K_t$, and productivity, $A_t$,

$$Y_t = A_t^{1-\gamma} K_t^{\gamma} \tag{2.9}$$

In section 2.5.3 I add endogenous labor supply and show that it does not substantially change the dynamics of the model. The production function (2.9) can be thought of as Cobb–Douglas with labor supply held fixed at unity. The aggregate resource constraint is

$$K_{t+1} = (1-\delta) K_t + Y_t - C_t$$

where $\delta$ is the depreciation rate of capital.

16 See also Tanaka, Camerer, and Nguyen (2010), who find that income, both its raw level and instrumented for with exogenous shocks, has a negative impact on loss aversion, and Guiso, Sapienza, and Zingales (2011), who find that following the financial crisis of 2008, households both reduced the risky shares of their portfolios and became more averse to gambles in survey questions.
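One way to read the remark that (2.9) is Cobb–Douglas with labor fixed at unity is sketched below. The labor input $N_t$ and the labor-augmenting placement of $A_t$ are assumptions made here for illustration (they are suggested by the exponent $1-\gamma$ on $A_t$ but are not spelled out in this section); section 2.5.3 gives the actual specification with endogenous labor.

```latex
% Assumed general form with labor N_t and labor-augmenting productivity A_t:
Y_t = K_t^{\gamma} \left(A_t N_t\right)^{1-\gamma}
% Holding labor fixed at unity recovers equation (2.9):
N_t = 1 \quad \Longrightarrow \quad Y_t = A_t^{1-\gamma} K_t^{\gamma}
```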

For the benchmark calibration, productivity follows a random walk in logs,17

$$\log A_{t+1} = \log A_t + \mu + \sigma_a \varepsilon_{t+1}, \qquad \varepsilon_{t+1} \sim N(0,1) \tag{2.10}$$

The drawback of using random-walk technology is that it is difficult to generate the degree of volatility for output and investment that is observed in the data.18 I therefore also consider a dual-shock version of the model that can match both the short and long-run variances of output,

$$A_t = \bar{A}_t X_t \tag{2.11}$$
$$\log \bar{A}_{t+1} = \log \bar{A}_t + \mu + \sigma_a \varepsilon_{t+1} \tag{2.12}$$
$$\log X_{t+1} = \phi_x \log X_t + \sigma_x \varepsilon_{x,t+1} \tag{2.13}$$
$$\varepsilon_{t+1}, \varepsilon_{x,t+1} \sim \text{i.i.d. } N(0,1) \tag{2.14}$$

$\bar{A}_t$ here is the permanent component of output, while $X_t$ can be interpreted as a simple method of trying to capture forces that drive short-run fluctuations in output and consumption, e.g. shocks to monetary policy or energy prices. I refer to the version of the model with random-walk technology as the benchmark model, while the model with permanent and temporary technology shocks is the dual-shock model.

17 An alternative is a trend-stationary process for productivity. Alvarez and Jermann (2005) argue that permanent shocks to the level of productivity (more generally, to the level of state prices) are necessary to explain asset-pricing facts. Also, in models with Epstein–Zin preferences, because the SDF depends not only on current consumption but also on the level of the value function itself, an I(1) process for productivity tends to increase the volatility of the SDF compared to models with trend-stationary productivity, which helps explain the equity premium. Kaltenbrunner and Lochstoer (2010) find that in order to match the empirical equity premium in a model with trend-stationary productivity, their model needs an implausibly small EIS (0.05). With difference-stationary productivity they are able to choose a more reasonable value (1.5).

18 In particular, without a mean-reverting component, it is impossible for the model to replicate the result from Cochrane (1994) that the long-run variance of output is smaller than the unconditional variance. In the RBC model, output does not overshoot its long-run trend following a permanent increase in technology: it does not have a mean-reverting component to its dynamics. Because they rely only on permanent shocks in the RBC model, Kaltenbrunner and Lochstoer (2010) have to set the annual standard deviation of technology shocks to an implausibly high 8.2 percent per year to match the unconditional standard deviation of output growth.
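The dual-shock technology process in (2.11) through (2.14) can be simulated in a few lines. The sketch below is illustrative rather than the code used for the dissertation: the function names are hypothetical, and the parameter values are the quarterly values listed in Table 2.1. Setting `sigma_x = 0` recovers the benchmark random-walk process (2.10).

```python
import numpy as np


def simulate_technology(n, mu, sigma_a, phi_x, sigma_x, seed=0):
    """Simulate log A_t = log Abar_t + log X_t following (2.11)-(2.14)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)      # permanent shocks
    eps_x = rng.standard_normal(n)    # temporary shocks
    log_a_bar = np.zeros(n)
    log_x = np.zeros(n)
    for t in range(n - 1):
        log_a_bar[t + 1] = log_a_bar[t] + mu + sigma_a * eps[t + 1]
        log_x[t + 1] = phi_x * log_x[t] + sigma_x * eps_x[t + 1]
    return log_a_bar + log_x, log_a_bar, log_x


def temporary_std(phi_x, sigma_x):
    """Unconditional standard deviation of the AR(1) component log X_t."""
    return sigma_x / np.sqrt(1 - phi_x**2)


log_a, log_a_bar, log_x = simulate_technology(
    100_000, mu=0.005, sigma_a=0.0088, phi_x=0.9, sigma_x=0.012
)
print(temporary_std(0.9, 0.012), log_x.std())
```

With these values the closed-form standard deviation of the temporary component is about 2.75 percent, in line with the 2.7 percent figure quoted in the calibration discussion.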

2.3 Calibration and simulation
I solve the model with projection methods, which entails fitting a polynomial approximation to the decision rule and searching for coefficients so that the equilibrium conditions hold exactly at certain specified points in the state space.19 The Euler equation errors in the simulations imply households misprice a claim on capital by uniformly less than 1/100th of 1 basis point (i.e. one part in one million) over the range of the state space that the simulations visit, and the median simulated error is an order of magnitude smaller.

The model is parameterized to match quarterly data. Table 2.1 lists the parameter values and the target moments. Many of the parameters, e.g. the exponent on capital in the production function, take standard values. I discuss here the parameters that are unique to this paper or do not have standard and agreed-upon values.

I set $\rho = 2/3$ as in Bansal and Yaron (2004), for an EIS of 1.5. Bansal and Yaron note that an EIS greater than 1 is necessary for increases in volatility to lower asset prices (specifically, the wealth-consumption ratio) in an endowment economy. In a production economy this result does not hold exactly (because consumption is endogenous), but it is approximately true. Similarly, an EIS greater than 1 ensures that increases in risk aversion increase the expected return on the wealth portfolio and lower its current price.20 Many studies attempting to estimate the EIS have obtained values much smaller than 1 (Hall, 1988; Campbell and Mankiw, 1989). An important test of the model will be whether it can match that result even though the calibrated EIS is larger than 1.

I choose the variance of permanent innovations to technology to match the long-run variance of consumption growth in the data. Since technology and consumption are cointegrated in the model, the long-run variance of consumption growth is equal to the variance of the permanent technology shocks. I estimate the empirical long-run variance (i.e. the spectral density at frequency zero) of consumption growth with a third-order univariate AR model (where the lag length was selected with the Bayesian information criterion) and obtain a value of $0.0088^2$. That is, the quarterly innovations to the permanent component of consumption have a standard deviation of 0.88 percent.21

For the dual-shock model, I select the parameters $\sigma_x$ and $\phi_x$ to match the short-run volatility of consumption and output growth. The parameters imply that the temporary component of technology has an unconditional standard deviation of 2.7 percent.22

The persistence of risk aversion, $\phi$, is set to match the empirical persistence of the price-dividend ratio for the aggregate stock market, as in Campbell and Cochrane (1999). The mean and volatility of risk aversion ($\bar{\alpha}$ and, implicitly, $\lambda$) are chosen to match the average Sharpe ratio for the stock market in the post-war sample and the degree of predictability observed using the price-dividend ratio to forecast stock returns. Mean risk aversion is 14 and the standard deviation is set to 6.2.23

Table 2.1: Calibration

Parameter    Value     Target
γ            0.33      Capital income share
β            0.9975    2% annual real risk-free rate
δ            0.02      8% annual depreciation (BEA data)
µ            0.005     2% annual output growth
φ            0.94      Persistence of price/dividend ratio
ρ            0.67      A priori (see text)
σa           0.0088    Long-run standard deviation of consumption growth
mean(αt)     14        Mean Sharpe ratio (0.32 annualized)
stdev(αt)    6.2       Stock return predictability
σx           0.012     Variance of output growth
φx           0.9       Variance of output growth

Note: Parameters used for the structural models. In table 2.2, the EZ-CRRA model (column 2) sets stdev(α) = 0; the benchmark EZ-habit model (column 3) sets σx = 0.

19 See Caldara et al. (2009) for a good description of the method as applied to models with recursive utility. When solving the RBC model with Epstein–Zin preferences, they find that projection methods are orders of magnitude more accurate than the perturbation methods used in the majority of the macro literature.

20 Intuitively, an increase in risk aversion or volatility has two effects: it lowers the risk-free rate and raises the excess return on the wealth portfolio. Which of these effects dominates depends on the EIS.
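The long-run variance estimate described above can be sketched as follows. This is a minimal illustration, not the author's code: it fits an AR(p) model by ordinary least squares and uses the standard formula for the long-run variance of an AR process, the innovation variance divided by the squared value of one minus the sum of the AR coefficients. The function name and the simulated test series are hypothetical; no consumption data are used, and the lag length is fixed at three rather than chosen by the BIC.

```python
import numpy as np


def ar_long_run_variance(x, p=3):
    """Long-run variance of x implied by an OLS-fitted AR(p) model.

    Returns sigma2 / (1 - sum(phi))^2, i.e. 2*pi times the AR spectral
    density at frequency zero.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    y = x[p:]
    # Column k-1 holds the k-th lag, aligned with y.
    lags = np.column_stack([x[p - k: len(x) - k] for k in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(lags, y, rcond=None)
    resid = y - lags @ coef
    sigma2 = resid @ resid / (len(y) - p)
    return sigma2 / (1.0 - coef.sum()) ** 2


# Check on a simulated AR(1) with coefficient 0.5 and unit innovation variance,
# whose true long-run variance is 1 / (1 - 0.5)^2 = 4.
rng = np.random.default_rng(0)
e = rng.standard_normal(200_000)
series = np.zeros_like(e)
for t in range(1, len(e)):
    series[t] = 0.5 * series[t - 1] + e[t]
print(ar_long_run_variance(series, p=3))
```

Applied to quarterly consumption growth, the square root of this quantity would be the counterpart of the 0.88 percent figure reported in the text.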

2.3.1 Comparisons across models

Table 2.2 reports basic moments from the three models. The first column gives the moments from the data while the second column gives results from the canonical Epstein–Zin model with constant relative risk aversion (EZ-CRRA). Columns 3 and 4 give results for the EZ-habit model under the benchmark calibration and with temporary technology shocks added.

The first row simply shows that all three models are calibrated to match the long-run variance of consumption exactly, which, under balanced growth, means they also match the long-run variances of output and investment growth. Rows 2 through 4 give the standard deviations of quarterly output, consumption, and investment growth. Both the EZ-CRRA and single-shock EZ-habit models have volatilities for output and investment growth that are well below the empirical values. The dual-shock model rectifies this problem, matching both the short-run and long-run variances well. Both versions of the EZ-habit model match the empirical variance of consumption growth.

Row 5 reports the correlation between the risk-free rate and the next period's consumption growth. Empirically, the real risk-free rate is measured as the 3-month nominal interest rate minus an inflation forecast.24 In the EZ-CRRA model, the risk-free rate has a substantial amount of forecasting power for consumption growth, while in the data interest rates and consumption growth seem essentially unrelated. The two EZ-habit calibrations come much closer to matching that fact. Rows 1 through 5 show that the EZ-habit model can capture the basic unconditional moments of output, consumption, and investment.

Rows 6 through 12 of table 2.2 summarize the financial side of the model. We can begin by looking at a measure of the price of risk. The Sharpe ratio on an asset is the ratio of its expected excess return over the risk-free rate divided by its standard deviation, so it measures the risk–return tradeoff. Hansen and Jagannathan (1991) show that the maximum Sharpe ratio obtained by any asset in the economy is equal to the standard deviation of the SDF divided by its mean. Recall that all three calibrations have the same average coefficient of relative risk aversion. The Hansen–Jagannathan bound and the mean Sharpe ratio for the consumption claim are roughly 1/3 higher in the two EZ-habit models than the EZ-CRRA case. The reason for this is that the household's value, $V_t$, a component of the SDF (equation 2.6), is more volatile in the EZ-habit models. In all the models, a technology shock permanently raises expected consumption and hence $V_t$. In the EZ-habit case, the coefficient of relative risk aversion also falls. Households become less averse to future uncertainty, so $V_t$ rises even more. Countercyclical variation in risk aversion thus makes good times even better and bad times even worse, raising the volatility of the SDF. This effect allows the model to explain the equity premium (or at least the Sharpe ratio on equities) with a lower coefficient of relative risk aversion than we would need in the EZ-CRRA model.

Table 2.2: Comparison of preference specifications

                                       (1)      (2)        (3)        (4)
Row  Moment                            Data     EZ-CRRA    EZ-habit   Dual-shock
     Real moments:
1    Long-run SD(dC,dY,dI) (%)         0.88     0.88       0.88       0.88
2    StdDev(dY) (%)                    0.99     0.59       0.59       1.03
3    StdDev(dC) (%)                    0.46     0.28       0.47       0.56
4    StdDev(dI) (%)                    2.65     1.11       0.83       2.37
5    corr(dC(t),Rf(t-1))               -0.09    0.28       0.07       0.08
     Financial moments:
6    Mean SR (annualized)              0.32     0.22       0.32       0.32
7    Std. dev. SR                      0.22     0.12       0.22       0.22
8    p-value SR std. dev.              N/A      0.16       0.50       0.42
9    Mean Rw (annualized %)            6.78     1.04       4.17       4.15
10   StdDev(Rw) (annualized %)         21.19    4.71       13.30      12.98
11   Mean Rf (annualized %)            0.91     2.20       2.04       1.94
12   StdDev(Rf) (annualized %)         1.16     0.21       0.25       0.26

Note: Column 2 gives results under Epstein–Zin preferences with constant relative risk aversion, column 3 uses EZ-habit preferences and random-walk technology, and column 4 EZ-habit preferences with the dual-shock specification. All models are calibrated as in table 2.1. All variables are measured using quarterly values. dI is investment growth, dY output growth, and dC consumption growth. Rf is the risk-free rate (measured empirically as the nominal 3-month yield minus an inflation forecast), and Rw is the annualized return on a levered consumption claim (with a leverage ratio of 2.74). The long-run SD is the square root of the spectral density at frequency zero multiplied by 2π. SR is the annualized Sharpe ratio; the standard deviation of the Sharpe ratio is measured by the standard deviation of the fitted values in forecasts of one-quarter-ahead returns in 228-quarter simulated samples divided by the unconditional standard deviation of returns. Row 7 reports the median of the standard deviations of the Sharpe ratio in the simulated samples. Row 8 reports the fraction of simulated samples in which the standard deviation of the Sharpe ratio is as large as in the data.

21 By choosing a smaller value for the long-run variance than the long-run risks literature, I only make the task of matching the equity premium harder. I also make the model consistent with the point estimate of the long-run variance of consumption, rather than choosing a value in the upper end of the confidence interval. Empirically, I measure consumption as real per-capita nondurable and service consumption from the BEA.

22 Smets and Wouters (2007) estimate that the 1-quarter autocorrelation of stationary technology shocks is 0.95. On the other hand, the 1-quarter autocorrelation of detrended real GDP is 0.85. I take $\phi_x = 0.90$ as the midpoint between these two values.

23 When $\alpha_t < 0$, I still use the standard Euler equation even though the household's optimization problem is convex. In the simulations, $\alpha_t < 0$ only 1.5 percent of the time. Treating households as if they are risk-neutral in periods when $\alpha_t < 0$ (i.e. censoring $\alpha_t$ at zero) has no discernible effect on the results.

24 Expected inflation is measured as a forecast of quarterly inflation based on lagged levels of inflation and the nominal risk-free interest rate.

To test whether the models can match the degree of predictability for stock returns that is observed in the data, I regress simulated quarterly excess returns on the consumption claim on its lagged price-dividend ratio. I then estimate the standard deviation of the conditional Sharpe ratio as the standard deviation of the fitted returns divided by the unconditional standard deviation of returns (i.e. assuming a constant volatility). Row 7 reports the median standard deviation from 5,000 simulations of 228 quarters of data, while row 8 reports the proportion of the simulations that have a standard deviation as high as observed empirically (0.22).25 In column 2, we can see that there is actually a nontrivial amount of implied predictability on average in the EZ-CRRA model due to small-sample overfitting, but only 16 percent of the simulations match the variability observed in the data. For the EZ-habit model, the predictability observed in the data is calibrated to be exactly the median value in the simulations.

Rows 9 and 10 report the mean and standard deviation of the excess return on a levered consumption claim in the model. For comparability to past results, I follow Abel (1999) and Gourio (2010) in assuming a leverage ratio of 2.74 (i.e. the asset pays a dividend of $C_t^{2.74}$). The two EZ-habit models are able to generate means and volatilities for returns that are far closer to the equity return observed in the data than the EZ-CRRA model can. Part of the reason for this success is that consumption growth, and hence dividend growth, is more volatile in the EZ-habit models than in the EZ-CRRA case, and part of the reason is that discount rates are more volatile. Following a positive technology shock, not only do dividends rise, but discount rates fall, thus making the returns on the wealth portfolio and the levered consumption claim more volatile.26

Rows 11 and 12 show that the means and standard deviations of the real risk-free rate in the three models are all reasonably close to the data. The volatility of interest rates is similar across all three models, and somewhat lower than in the data. The real risk-free rate is measured empirically as the nominal 3-month Treasury yield minus a forecast of inflation. Errors in the inflation forecast will make the estimated real risk-free rate more volatile than the true real risk-free rate, which explains some of the divergence between the empirical and simulated volatilities. A common problem in early attempts to generate a high equity premium (e.g. Constantinides, 1990; Boldrin, Christiano, and Fisher, 2001; and Jermann, 1998) is a highly volatile risk-free rate. The EZ-habit specification replaces movements in discount rates coming from the risk-free rate with movements coming from risk premia.

To summarize, table 2.2 shows that the EZ-habit model can match a broad array of features of the economy: the short and long-run variances of output growth, the relative volatilities of investment and consumption growth, and the mean and standard deviation of the Sharpe ratio on equities. The model also helps generate a larger premium on a levered consumption claim, closing roughly half the gap in the equity premium between the EZ-CRRA model and the data. Finally, the behavior of the risk-free rate is reasonably similar to the data, unlike previous general-equilibrium attempts at generating a high and volatile Sharpe ratio.

25 The fitted Sharpe ratio is measured empirically by forecasting the CRSP value-weighted aggregate excess return with the aggregate price/dividend ratio.

26 LeRoy and Porter (1981) and Shiller (1981) argue that dividends do not seem sufficiently volatile to explain the volatility of stock prices. Grossman and Shiller (1981) suggest that variation in discount rates can explain this puzzle.
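The measure of the volatility of the conditional Sharpe ratio used in rows 7 and 8 of table 2.2 can be sketched as below: regress next-quarter excess returns on the lagged predictor, then divide the standard deviation of the fitted values by the unconditional standard deviation of returns. The function name and the simulated inputs are hypothetical; any annualization applied in the table is not reproduced here.

```python
import numpy as np


def sharpe_ratio_std(excess_ret, predictor):
    """Std of fitted one-period-ahead returns over the unconditional std of returns.

    excess_ret[t+1] is regressed on a constant and predictor[t].
    Returns (ratio, slope).
    """
    r = np.asarray(excess_ret, dtype=float)
    z = np.asarray(predictor, dtype=float)
    y = r[1:]
    X = np.column_stack([np.ones(len(y)), z[:-1]])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    return fitted.std() / y.std(), coef[1]


# Hypothetical data: a persistent predictor that forecasts returns negatively.
rng = np.random.default_rng(0)
n = 228
z = np.zeros(n)
for t in range(1, n):
    z[t] = 0.94 * z[t - 1] + rng.standard_normal()
r = np.zeros(n)
r[1:] = -0.01 * z[:-1] + 0.08 * rng.standard_normal(n - 1)
ratio, slope = sharpe_ratio_std(r, z)
print(ratio, slope)
```

Because the regression includes a constant, this ratio equals the square root of the regression $R^2$, which ties the Sharpe ratio volatility in table 2.2 directly to the predictability results that follow.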

2.3.2 Predictability in the simulated model

2.3.2.1 The magnitude of return predictability

Figure 2.1 plots $R^2$s from univariate regressions of excess aggregate stock returns over various horizons on the log price-dividend ratio on the CRSP value-weighted portfolio (e.g. Campbell and Shiller, 1988, among many others), Lettau and Ludvigson's (2001) measure of the consumption-wealth ratio, cay, Campbell and Cochrane's (1999) excess consumption ratio, and an estimate of risk aversion derived from the EZ-habit model in section 2.4. For the four different variables used in the empirical sample, the $R^2$s generally rise as the forecast horizon grows, and estimated risk aversion outperforms cay, excess consumption, and the price-dividend ratio.

The gray line labeled "Simulated mean" gives the mean $R^2$ from 5,000 regressions of excess returns on a consumption claim on the price-dividend ratio (equivalently, the wealth-consumption ratio) over 228-quarter spans in the benchmark simulation of the single-shock model (the same length as the empirical sample). The upper gray line gives the 95th percentile of the simulations. As in the data, the simulated $R^2$s rise as the horizon lengthens. The model compares favorably with the price-dividend and excess-consumption ratios, with the simulated mean tracking the empirical values closely (the median follows almost the same path). The empirical $R^2$s for cay are at or below the 95th percentile in the simulations. The only variable that the simulations cannot match is estimated risk aversion, but raising the volatility of risk aversion in the calibration would solve this problem.

Figure 2.1: Simulated and empirical $R^2$s

[Figure: $R^2$ (vertical axis, 0 to 0.7) plotted against the forecast horizon in quarters (horizontal axis, 1 to 36). Recoverable series labels: Estimated risk aversion, cay.]

Note: $R^2$s from univariate regressions of stock returns on various predictors. The forecast horizon is reported in quarters. Data for cay is obtained from Sydney Ludvigson's website; price/dividend data comes from CRSP; the Campbell–Cochrane excess consumption ratio is computed using their parameter values and consumption data from the BEA. The gray lines give the mean and 95th percentiles in the simulation of the EZ-habit model.

The $R^2$s generated here are substantially higher than those obtained in production models such as Campanale, Castro, and Clementi's (2010) model of time-varying first-order risk aversion and Guvenen (2009) and De Graeve et al.'s (2010) studies of limited participation. The population $R^2$s are also essentially identical to those found by Wachter (2010) and Gourio (2010) in endowment-economy and production-based models, respectively, with time-varying disaster risk.

The top panel of table 2.3 reports the percentage of simulated samples in which the simulated $R^2$ is as high as we observe in the data for cay and the price-dividend ratio (results for excess consumption are similar to the price-dividend ratio), and where a high price-dividend ratio forecasts low returns. The table reports values for horizons of one quarter and one through five years. The EZ-CRRA model matches empirical $R^2$s for cay less than 5 percent of the time at horizons shorter than 16 quarters, but can match the $R^2$s for the price-dividend ratio 15 to 25 percent of the time.
The habit model substantially raises the likelihood of the simulations matching the data, by a factor of three or more at every horizon, and it never matches less than 5 percent of the time except for cay at the one-quarter horizon.

As an alternative to the $R^2$, I also consider the test statistic suggested by Kiefer, Vogelsang, and Bunzel (KVB, 2000) based on Newey–West standard errors with the lag window equal to the sample size. At various horizons, I calculate the t-statistic on the coefficient in a regression of stock returns on the price-dividend ratio in the simulated samples. The bottom panel of table 2.3 repeats the analysis from the top half, but with the KVB test statistics. In every case, the habit model matches the empirical t-statistics at least five percent of the time. The EZ-CRRA model again has trouble matching the results for cay, and only replicates the statistics for the price-dividend ratio in 5 to 20 percent of the samples, compared to 20 to 50 percent of the samples for the habit model.

Table 2.3: Proportion of simulated samples that match empirical statistics

Forecasting R2
                       EZ-CRRA          EZ-habit
Horizon (quarters)     cay     P/D      cay     P/D
1                      0.00    0.14     0.05    0.52
4                      0.01    0.13     0.11    0.52
8                      0.01    0.13     0.12    0.51
12                     0.02    0.16     0.17    0.56
16                     0.04    0.20     0.26    0.62
20                     0.10    0.25     0.45    0.69

KVB t-statistics
                       EZ-CRRA          EZ-habit
Horizon (quarters)     cay     P/D      cay     P/D
1                      0.10    0.05     0.38    0.22
4                      0.04    0.11     0.17    0.35
8                      0.02    0.11     0.08    0.35
12                     0.03    0.15     0.12    0.39
16                     0.06    0.14     0.22    0.38
20                     0.06    0.20     0.19    0.47

Note: The top panel reports, for each predictor, the proportion of simulated 228-quarter samples in which the R2 in a return-forecasting regression is at least as large as observed in the data and where the predictive relationship has the correct sign. The bottom panel reports the proportion of simulated samples that generate Kiefer–Vogelsang–Bunzel t-statistics for each variable that are as large as in the data and have the same sign. The EZ-habit model is the benchmark (single-shock) model. cay is the consumption/wealth ratio from Lettau and Ludvigson (2001). P/D is the price/dividend ratio from CRSP. All simulated regressions use as the predictor the wealth/consumption ratio (equivalently, the price/dividend ratio on a claim to aggregate consumption) and the dependent variable is the excess return on a consumption claim. The regressions are run at horizons listed in the left-hand column. Bold numbers are less than 0.05, bold italics less than 0.01.
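The long-horizon regressions behind figure 2.1 and the top panel of table 2.3 can be sketched as follows. The code is illustrative only: it regresses cumulative excess returns over a given horizon on the predictor's value at the start of the horizon, and then counts the share of simulated samples in which the $R^2$ is at least as large as an empirical benchmark and the slope has the expected sign. The function names are hypothetical, and the benchmark `r2_data` must be supplied by the user; no empirical values are hard-coded.

```python
import numpy as np


def long_horizon_r2(excess_ret, predictor, horizon):
    """R^2 and slope from regressing r[t+1]+...+r[t+horizon] on predictor[t]."""
    r = np.asarray(excess_ret, dtype=float)
    z = np.asarray(predictor, dtype=float)
    csum = np.concatenate([[0.0], np.cumsum(r)])  # csum[j] = r[0] + ... + r[j-1]
    n = len(r) - horizon
    y = csum[horizon + 1: horizon + 1 + n] - csum[1: 1 + n]
    X = np.column_stack([np.ones(n), z[:n]])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - resid.var() / y.var(), coef[1]


def share_matching(sim_samples, horizon, r2_data, sign=-1):
    """Share of (returns, predictor) samples with R^2 >= r2_data and the given slope sign."""
    hits = 0
    for r, z in sim_samples:
        r2, slope = long_horizon_r2(r, z, horizon)
        hits += int(bool(r2 >= r2_data and np.sign(slope) == sign))
    return hits / len(sim_samples)


# Tiny deterministic example: returns are exactly minus the lagged predictor.
z_demo = np.arange(10.0)
r_demo = np.concatenate([[0.0], -z_demo[:-1]])
print(long_horizon_r2(r_demo, z_demo, horizon=1))
```

In practice `sim_samples` would hold many 228-quarter draws from the solved model, and the overlapping observations at long horizons are the reason the text also reports the KVB statistics, which this sketch does not implement.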

2.3.2.2 Other return predictors

Table 2.4 reports the simulated correlation between ﬁve-year excess returns on the aggregate wealth portfolio and a variety of return predictors. The ﬁrst row gives the correlation for actual risk aversion, which we would expect would be highest of all of the variables. The second row shows that the predictive power of the price-dividend ratio is nearly as high as that of αt in the benchmark model, but somewhat lower in the dual-shock calibration (though still not as much lower as in the data). Fama and Schwert (1977) and Campbell (1987) ﬁnd that short term interest rates negatively predict future stock returns.27 In table 2.4, I do not replicate the result that the real interest rate negatively forecasts returns, but the risk-free rate minus its 4-quarter moving average (denoted RREL as in Campbell, 1987), does weakly negatively forecast returns. Table 2.4 shows that the correlation of the ﬁve-year excess stock return with the real risk-free ˆ rate and RREL is substantially negative and nearly as large as that of α. In the model, two effects cause interest rates to forecast stock returns. First, positive technology shocks raise interest rates and lower risk aversion. Second, even if risk aversion were driven by shocks unrelated to technology, interest rates might still forecast stock returns since a decline in risk aversion lowers the precautionary saving effect, raising interest rates. Intuitively, there is a ﬂight-to-quality effect in interest rates, linking them to expected stock returns. Table 2.4 reports the mean and standard deviation of the real term spread for the EZhabit model. For the sake of simplicity, I follow the literature in modeling long-term debt as an asset that has a constant probability of paying its principal of one unit of the consumption good and retiring.28 If the bond does not retire and pay out, the holder retains
27 Campbell (1991) subtracts the 12-month moving average of the nominal risk-free rate from itself as a way to detrend the short-term interest rate, since the nominal rate may be nonstationary if there are changes in trend inflation. Detrending in that way should be unnecessary in the model since interest rates are stationary, but I still check this variable.
28 See, e.g., Rudebusch and Swanson (2008) and Miao and Wang (2010).


Table 2.4: Various return predictors

                                Data    EZ-CRRA   EZ-habit   Dual-shock
Five-year excess stock return correlations:
  αt                            0.69     0.05      0.31       0.27
  Real interest rate            0.09    -0.05     -0.30      -0.23
  RREL                         -0.09    -0.03     -0.22      -0.17
  Term spread                   0.20     0.05      0.29       0.10
  P/D                          -0.39    -0.05     -0.30      -0.23
Term spread summary statistics:
  Mean                          1.43    -0.14     -0.23      -0.26
  Std. dev.                     1.17     0.11      0.07       0.14

Note: The bottom two rows are the mean and standard deviation of the spread between the yield on a ten-year equivalent bond and a one-quarter riskless bond (measured in the data using nominal Treasuries). The remaining rows give correlations between various variables and five-year stock returns. For the simulations, population values are reported. RREL is the gap between the risk-free rate and its four-quarter moving average; P/D is the price/dividend ratio. In the simulations, P/D is measured as the wealth/consumption ratio. α̂ is the value of risk aversion in the model, and estimated in the data as in section 4.


the bond for another period. I assume that the quarterly probability of payout is $1 - 0.9^{1/4}$ so that the expected maturity of the bond is roughly ten years. The term spread is the yield to maturity on this bond minus the one-quarter riskless yield. The term spread in the model is on average negative, whereas the nominal Treasury yield curve is almost always upward-sloping in the data. The reason we have a negative term spread in the model is that in good times the marginal product of capital, and hence the risk-free rate, is above average. So in good times, short-term bonds have low prices, and hence they are a hedge and have a negative risk premium. Fama and French (1989) show that the term spread forecasts stock returns. Interestingly, table 2.4 shows that even though the term spread is negative on average in the model, it still positively predicts future stock returns as in the data. This essentially comes through an expectations-hypothesis effect. In periods when the risk premium is low, the risk-free rate is high and expected to fall. To the extent that long-term yields are just averages of expected future short yields, long yields will rise less than short yields. So in periods when risk aversion is low, the term spread falls, and the term spread thus positively predicts stock returns. In the model, the equity premium is nearly a constant multiple of $\alpha_t$.29 The variables that forecast returns in table 2.4 are all correlated with $\alpha_t$, but imperfectly. For example, the price-dividend ratio also depends on expected consumption growth and interest rates. The fourth column of table 2.4 shows that the dual-shock model can qualitatively, if not quantitatively, match the empirical result in column 1 that estimated risk aversion is a more powerful forecaster of excess stock returns than any of the other variables, since it is uncontaminated by factors like expected consumption growth.
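As a quick check on the bond calibration, the following minimal sketch computes the expected life of a bond that retires each quarter with probability 1 − 0.9^(1/4). The function name is hypothetical, and the only assumption is that the retirement date is geometrically distributed, as implied by a constant per-period payout probability.

```python
def expected_maturity_quarters(annual_survival=0.9):
    """Expected life, in quarters, of a bond that retires each quarter
    with constant probability p = 1 - annual_survival ** (1/4)."""
    p = 1.0 - annual_survival ** 0.25
    # With a constant hazard p, the retirement date is geometric with mean 1/p.
    return 1.0 / p


quarters = expected_maturity_quarters()
years = quarters / 4.0
print(f"quarterly payout probability: {1.0 - 0.9 ** 0.25:.4f}")
print(f"expected maturity: {quarters:.1f} quarters ({years:.1f} years)")
```

In quarterly terms the expected maturity comes out slightly below ten years, because a 10 percent annual retirement rate is being split into four compounding quarterly draws.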

2.3.2.3 Consumption growth predictability

The aggregate price-dividend and wealth-consumption ratios may be driven by either movements in expected dividend (consumption) growth or movements in discount rates. For the aggregate stock market, Campbell and Shiller (1988) and Cochrane (2008) ﬁnd that
29 This result is exact in the log-linear approximation.


the price-dividend ratio has at best weak forecasting power for dividend growth. Similarly, Lettau and Ludvigson (2001) find that the wealth-consumption ratio has little forecasting power for consumption growth. Figure 2.2 shows that the EZ-habit model is consistent with those results. First, to get a general sense of the dynamic properties of consumption growth, the top panel of figure 2.2 plots the model-implied autocorrelations of consumption growth against their empirical counterparts. The shaded region is the 95-percent confidence interval for the empirical estimates using the Newey–West method with a lag window of 12 quarters. In the model, the autocorrelations are near zero at all horizons. The data suggest that the first three autocorrelations are positive, which the model does not match. At longer lag lengths, though, there is no evidence for persistence in consumption growth, consistent with the model. The bottom panel of figure 2.2 simulates 228-quarter samples as in figure 1 and calculates the correlation of consumption growth between dates t and t + k with the consumption-wealth ratio at date t. While many of the simulated correlations are far from zero, the mean sample correlation between the wealth-consumption ratio and future consumption growth is nearly zero. The figure also plots empirical correlations between cay and future consumption growth at various horizons, and they are similar to the simulated mean. Figure 2.2 thus shows that the EZ-habit model not only matches the short and long-run variances of consumption growth, but also replicates relevant features of the dynamics of consumption.

2.3.3 Impulse response functions

Figure 2.3 plots impulse response functions (IRFs) in the EZ-CRRA and benchmark EZ-habit models for four variables: consumption, household value, the risk-free rate, and the Sharpe ratio on the consumption claim. The lines give log deviations from steady-state, except for the risk-free rate, for which I report the absolute change in annualized percentage points. The shock is a unit standard deviation (88 basis-point) permanent increase in the level of technology, which will lead to an identical long-run increase in consumption, capital, and output.


Figure 2.2: Consumption predictability

[Figure omitted. Top panel: "Autocorrelation of consumption growth" (horizontal axis: quarters, 1 to 36), showing the empirical estimate, its 95% confidence interval, and the model-implied values. Bottom panel: "Correlation of long-horizon consumption growth with W/C" (horizontal axis: forecast horizon in quarters, 1 to 36), showing the empirical estimate and the simulated mean, 5th percentile, and 95th percentile.]

Note: The top panel reports the empirical autocorrelation function for consumption. The gray shaded region is the 95% confidence interval using Newey–West standard errors with a lag window of 12 quarters. The bottom panel reports the correlation of consumption growth between periods t and t+x with the wealth-consumption ratio (cay) at date t, where x is reported in quarters on the horizontal axis. The solid lines give the mean and 5th and 95th percentiles of the same correlation in simulated 228-quarter samples.


Figure 2.3: Impulse response functions

[Figure omitted. Four panels: Value, Sharpe ratio, Risk-free rate (levels), and Consumption. Horizontal axes run from 1 to 41 quarters, with separate lines for EZ-CRRA and EZ-habit.]

Note: Impulse responses for the EZ-CRRA and EZ-habit models. The shock is a positive unit-standard-deviation increase in technology. The dotted lines are for EZ-CRRA, solid are for EZ-habit. All functions are reported as fractions of the variables' means except for the risk-free rate, for which the response is in annualized percentage points. Value is lifetime utility; the Sharpe ratio is for an asset that pays aggregate consumption as its dividend.

The top-left panel shows the response of household value. For the EZ-CRRA model, value immediately jumps to a point just below its new steady state, and then slowly rises as households accumulate capital. For the EZ-habit model, though, value actually overshoots its new steady state. The reason is that the positive shock to productivity drives risk aversion down. When households are less risk-averse, they place a higher value on their future consumption stream because they penalize uncertainty less strongly. This effect helps increase the volatility of the SDF (equation 2.6), raising the Hansen–Jagannathan bound. The top-right panel shows that on the impact of a shock, the Sharpe ratio in the EZ-habit model falls by 12.5 percent (as a fraction of its mean), and then gradually rises again, with a half-life of 12 quarters. The bottom-left panel shows the dynamics of the risk-free rate. The initial response is essentially identical for the two models. The reason for this is that the risk premium on an unlevered claim on capital is very small in the model, so the return on capital is roughly equal to the risk-free rate. Since the size of the capital stock is essentially fixed in the short run, an increase in productivity directly increases the return on capital and hence the risk-free rate. The final panel of figure 2.3 shows the response of consumption in the two models. The EZ-habit model shows a larger initial response of consumption, with lower expected consumption growth going forward. To see why this is, we can write consumption growth in equilibrium as

$$E_t \Delta c_{t+1} = \bar{c} + \rho^{-1} r_{f,t+1} + \alpha_t \times \mathrm{vol} \qquad (2.15)$$

where $r_{f,t+1}$ is the risk-free interest rate between dates t and t + 1, vol represents a measure of the total volatility in the model, and $\bar{c}$ is a constant. $\alpha_t \times \mathrm{vol}$ represents the precautionary saving effect and is a function of the current level of risk aversion and the variances of the shocks in the model. The standard interpretation in an endowment economy is that conditional on consumption growth, a strong precautionary saving motive leads to a low risk-free rate. In a production setting, though, it is the risk-free rate that is held roughly fixed since it is tied to the marginal product of capital, which is hard to change quickly through investment. Conditional on the risk-free rate, then, a small precautionary saving


motive leads to lower expected consumption growth (more consumption today, saving less for tomorrow). In the EZ-habit model, a positive technology shock lowers risk aversion, and hence consumption rises more than in the canonical EZ case. This effect also serves to increase the volatility of the SDF, just as the higher response of value does. Given the results in ﬁgure 2.3, it is straightforward to see what would happen in this economy if there were a pure shock to the coefﬁcient of relative risk aversion. Since the risk-free rate is tied to the marginal product of capital, it would not move on the impact of a shock. The only effect on real variables of a pure decline in risk aversion then is that households would want a smaller buffer stock of savings, so they would raise consumption and lower investment: shocks to risk aversion look like simple demand shocks.

2.3.4 Estimating the EIS from interest rate regressions

The value of the elasticity of intertemporal substitution is controversial. Regressions based on aggregate consumption and asset returns often find a very small EIS (Hall, 1988; Campbell and Mankiw, 1989). Campbell (2003) reviews the literature and estimates the EIS using a variety of specifications and data from a broad range of countries, finding values generally less than 0.5 and often less than 0.2.30 This result is in conflict with the calibration used here and in other recent production-based asset pricing studies (Kaltenbrunner and Lochstoer, 2010; Gourio, 2010), which assume that the EIS is greater than 1. The question is whether the EZ-habit model generates small EIS estimates in regressions similar to those estimated in Campbell (2003). The standard aggregate EIS regressions start from a model in which the risk-free rate takes the form

$$r_{f,t+1} = b_0 + \rho E_t \Delta c_{t+1} \qquad (2.16)$$

where $r_{f,t+1}$ is the riskless interest rate between periods t and t + 1. $b_0$ is a parameter depending on the discount rate and underlying volatility in the model (which are taken
30 Vissing-Jorgensen (2002) finds an EIS less than unity in micro data. On the other hand, Vissing-Jorgensen and Attanasio (2003) and Gruber (2006) obtain larger estimates using micro data, both above unity. Gruber (2006) is particularly well-identified, using variation in the capital income tax rate as the source of exogenous differences in the after-tax interest rate earned by households.


to be constant). This relationship is straightforward to derive in an endowment economy with homoskedastic consumption growth and where households have a constant EIS and coefficient of relative risk aversion. It is also obtained in a log-linearization of the standard RBC model with homoskedastic technology shocks. In principle, the EIS can be estimated from a regression of interest rates on consumption growth or vice versa. However, since the reduced-form relationship between consumption growth and interest rates is nearly zero in the EZ-habit model, in some of the simulations regressing $r_{f,t+1}$ on $E_t \Delta c_{t+1}$ produces explosive estimates for $\rho^{-1}$ (since we have to invert the coefficient estimate). Moreover, consumption in the EZ-habit model nearly follows a random walk, so it is essentially unpredictable and there are serious weak-instruments problems in an IV regression of interest rates on consumption growth. I therefore focus on the regression of consumption growth on interest rates,

$$E_t \Delta c_{t+1} = b_0 + \rho^{-1} r_{f,t+1} \qquad (2.17)$$

In the simulations of the model in section 2.3, we have the ability to directly measure $E_t \Delta c_{t+1}$. The first row of table 2.5 reports the population estimate of $\rho^{-1}$ in regression (2.17) under the EZ-CRRA and EZ-habit models. In the EZ-CRRA case, the regression identifies $\rho^{-1}$ exactly. On the other hand, the estimate of $\rho^{-1}$ is biased substantially downwards in the EZ-habit specifications. The bias comes from the fact that the time-varying precautionary saving effect (equation 2.15) is omitted from the regression. Since precautionary saving is correlated with both expected consumption growth and interest rates, omitting it biases the simple IS-curve regression usually used to identify the EIS. An alternative way to see the source of the bias is to go back to the IRFs in figure 2.3. In both models the risk-free rate rises by the same amount following a shock. In the EZ-habit specification, though, because of the decline in precautionary saving, expected consumption growth is lower following a shock than in the EZ-CRRA case. That means that the estimate of $\rho^{-1}$ will fall.31
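The omitted-variable argument can be illustrated with a small simulation. The sketch below is not the model's calibration: the shock scales, the intercept, and the loading of the precautionary term on the technology shock are all hypothetical numbers chosen only so that a positive shock raises the interest rate while lowering the precautionary saving term. Expected consumption growth is then built from the form of equation (2.15) with a true EIS of 1.5, and regressed on the interest rate with and without the precautionary term as a control.

```python
import random


def _demean(x):
    m = sum(x) / len(x)
    return [xi - m for xi in x]


def _dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def slope_one_regressor(y, x):
    """OLS slope from regressing y on a constant and x."""
    y, x = _demean(y), _demean(x)
    return _dot(x, y) / _dot(x, x)


def slopes_two_regressors(y, x1, x2):
    """OLS slopes from regressing y on a constant, x1, and x2."""
    y, x1, x2 = _demean(y), _demean(x1), _demean(x2)
    s11, s22, s12 = _dot(x1, x1), _dot(x2, x2), _dot(x1, x2)
    s1y, s2y = _dot(x1, y), _dot(x2, y)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


random.seed(0)
TRUE_EIS = 1.5  # rho^{-1}
rates, precaution, expected_growth = [], [], []
for _ in range(20000):
    z = random.gauss(0.0, 1.0)  # technology shock (hypothetical scale)
    u = random.gauss(0.0, 1.0)  # independent risk-aversion noise (hypothetical)
    r = 0.01 * z                # a positive shock raises the risk-free rate...
    p = -0.009 * z + 0.002 * u  # ...and lowers alpha_t * vol
    rates.append(r)
    precaution.append(p)
    expected_growth.append(0.005 + TRUE_EIS * r + p)

naive = slope_one_regressor(expected_growth, rates)
controlled, _ = slopes_two_regressors(expected_growth, rates, precaution)
print(f"EIS estimate without control: {naive:.2f}")
print(f"EIS estimate with control:    {controlled:.2f}")
```

Under these assumed loadings the uncontrolled slope converges to 1.5 − 0.9 = 0.6, while adding the precautionary term recovers the true value. The size of the bias is entirely a function of the hypothetical loading and should not be read as a replication of table 2.5.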
31 In Bansal and Yaron (2004), time-variation in the volatility of shocks in principle causes EIS regressions to be biased. However, Beeler and Campbell (2010) show that their calibration generates almost no actual bias— the median sample EIS estimates are well above 1. This paper thus represents an improvement in being able


Table 2.5: Regressions estimating the elasticity of intertemporal substitution

Row  Model:                                   Data    EZ-CRRA        EZ-habit         Dual-shock
1    Population, infeasible (Et[∆ct+1])       N/A     1.50           0.64             0.78
2    Population                               N/A     1.50           0.56             0.71
3    Small sample                             0.14    1.16           0.03             0.35
4      [2.5%, 97.5%]                          N/A     [0.03, 1.79]   [-1.98, 1.02]    [-1.32, 1.34]
5    Population, RRA control                  N/A     N/A            1.50             1.50
6    Small sample, RRA control                0.18    N/A            0.07             0.94
7      [2.5%, 97.5%]                          N/A     N/A            [-3.08, 3.11]    [-1.07, 2.78]

Note: Values reported are the coefficient from regressions of consumption growth or expected consumption growth on the risk-free rate. The dependent variable in row 1 is expected consumption growth (computed numerically in the simulations); all other rows use actual consumption growth. The small-sample regressions are based on 228 quarters of data, and median coefficient estimates are reported; 2.5 and 97.5 percentiles are reported in brackets. The RRA control is actual risk aversion in the simulations and estimated risk aversion (section 4) in the empirical regressions.


The regression in the first row of table 2.5 is in some sense ideal, but it is not the regression that we are actually able to run in the data since $E_t \Delta c_{t+1}$ is unobservable.32 Rows 2 through 4 report results for estimates of $\rho^{-1}$ from regressions of actual consumption growth, $\Delta c_{t+1}$, on the risk-free rate, $r_{f,t+1}$. Row 2 gives the population estimates, while rows 3 and 4 give the median and 95-percent range of the estimates from 228-quarter simulations. With constant relative risk aversion, the population regression in row 2 estimates the EIS exactly. The median estimate from the small-sample regressions in row 3 is 1.16. The 95-percent range is wide, and it only just barely contains the estimate from the data. So it is in principle possible for the EZ-CRRA model to generate an estimate of the EIS as small as what we observe in the data, but the probability is small (less than 10 percent). In the EZ-habit models, the bias is far larger. The population estimate in the single-shock case is 0.56, and the median sample estimate is 0.03. For the dual-shock model, the estimates are only slightly better: 0.71 in population and a median of 0.35 in small samples. The reason for this slight improvement is that the temporary shocks have little effect on risk aversion, so they induce variation in consumption and the risk-free rate that is closer to the usual EZ-CRRA case. The upper end of the 95-percent range for the simulated estimates is well below the true value of the EIS. The first four rows of table 2.5 show that in general, regressions of consumption growth on interest rates are not a very good way to estimate the EIS, and in the EZ-habit model they are biased and inconsistent. It is worth noting, though, that if we could observe $\alpha_t$, we could completely eliminate the bias in the EIS regressions.
Empirically, this suggests that regressions designed to estimate the EIS could be improved by including a control for risk aversion, such as the price-dividend ratio on the stock market. The ﬁnal three rows of table 2.5 try to estimate the EIS including a control for risk aversion. In the data, I use a measure of risk aversion derived from the EZ-habit model below in section
to generate a substantial bias in aggregate regressions without large movements in the conditional volatility of consumption.
32 In principle, the real risk-free rate, $r_{f,t+1}$, is also unobservable in the data. As above, I form $r_{f,t+1}$ as the difference between the nominal 3-month interest rate and a forecast of inflation based on lagged inflation and nominal interest rates. Errors in the estimate of the true real risk-free rate would bias the estimate of $\rho^{-1}$ towards zero. Instrumental-variables methods can theoretically eliminate this bias.


2.4, denoted $\hat{\alpha}_t$. The empirical estimate of the EIS is essentially unchanged from when $\hat{\alpha}_t$ is not included. In population, when $\alpha_t$ is included in the simulated regressions, the EIS is estimated exactly. In small-sample regressions, though, the estimate of the EIS in the model is still biased downward. In the single-shock model, the median estimate is 0.07, and in the dual-shock model 0.94. Row 7 shows, though, that the 2.5 percentile of the small-sample estimates is -3.08 in the dual shock model, while the 97.5 percentile is 3.11. So even though the median estimate in the dual-shock model is not enormously biased, the empirical value of 0.18 is well within the simulated range. In the end, while controlling for risk aversion should, in principle, allow us to estimate the EIS consistently, in small samples the regressions still do not seem to provide useful estimates because of weak-identification problems. As an alternative to these regressions, the EIS could be consistently estimated if we had an instrument for the risk-free rate that was uncorrelated with risk aversion. Standard aggregate instruments like lagged interest rates and consumption growth (e.g. Campbell, 2003) will certainly not be valid instruments under the EZ-habit model since the precautionary saving effect is persistent. Household-level instruments would work better if my model is correct in treating risk aversion as being driven by aggregate factors.33 The EZ-habit model thus has the ability to explain the divergence between micro and macro estimates of the EIS if the micro instruments are valid and the macro instruments invalid.

2.4 Empirical return forecasting
This section first shows how the model suggests we can directly estimate the coefficient of relative risk aversion in the data, and then demonstrates that the estimate is a powerful predictor of stock returns. Second, I present novel evidence that technology growth forecasts stock returns, just as it does in the production model. The results differentiate the EZ-habit model from models with time-varying disaster risk. Gourio (2010) predicts that when
33 Gruber (2005) discusses precisely these issues and tries to resolve them by using household-speciﬁc variation in tax rates as an instrument for consumption growth. Dynan (1993) explicitly controls for the precautionary savings effect at the household level by predicting the conditional volatility of consumption, but she does not deal with the possibility that risk aversion varies over time.


there are changes in the probability of a large disaster occurring, price-dividend ratios will forecast returns, which is also true in the EZ-habit model. The EZ-habit model also predicts, though, that technology and estimated risk aversion will forecast stock returns, and that estimated risk aversion will be the single most powerful forecaster of returns, which would not be true in the time-varying disaster model or models based on other forms of time-varying volatility (e.g. Bansal and Yaron, 2004, Bloom, 2009, or Fernandez-Villaverde et al., 2011).

2.4.1 Estimating risk aversion

If risk aversion follows the AR(1) process given in (2.5), then we can measure current risk aversion if we simply observe the history of aggregate value, $v^A_t$. For a given value of the EIS and observed data on wealth and consumption, it is possible to calculate $v^A_t$ by rearranging equation (2.7):

$$v^A_t = \frac{1}{1-\rho} w^A_t - \frac{\rho}{1-\rho} c^A_t + \frac{1}{1-\rho} \log\left(1 - \exp(-\beta)\right) \qquad (2.18)$$

If we can measure household wealth and consumption, then we can measure value. We then simply plug the estimates of $v^A_t$ into equation (2.5) to obtain estimates of $\alpha_t$.

Lettau and Ludvigson (2001) study a cointegrating relationship between consumption and aggregate wealth. Their method is valid in my model since consumption and wealth are cointegrated under balanced growth. While their analysis was designed to estimate the consumption-wealth ratio, it also delivers, as a byproduct, a measure of aggregate wealth (since we can always add consumption to the wealth-consumption ratio to obtain wealth). The estimate of wealth derived using their method is a combination of asset wealth data obtained from the ﬂow of funds accounts plus an estimate of human wealth. They treat labor income as the dividend from the stock of human wealth. Assuming the price-dividend ratio for human wealth is stationary, we can use labor income as a proxy for human wealth. Denoting asset wealth as at and labor income as yt , the appendix shows


that we then have a cointegrating relationship,

$$c_t = \zeta\omega a_t + \zeta(1-\omega) y_t + \xi_t \qquad (2.19)$$

where $\zeta$ and $\omega$ are parameters and $\xi_t$ is a stationary error term. Lettau and Ludvigson (2001) refer to the residual $\xi_t$ as cay. This variable essentially represents an estimate of the consumption-wealth ratio. Since I want to estimate wealth, I define

$$ay_t \equiv \omega a_t + (1-\omega) y_t \qquad (2.20)$$

which, under the assumptions above, will be a statistically unbiased estimate of total wealth, but will include error due to the fact that we do not assume we directly measure human wealth.34
With our measure of wealth $ay_t$, we estimate $v^A_t$ as

$$\hat{v}^A_t = \frac{1}{1-\rho} ay_t - \frac{\rho}{1-\rho} c_t \qquad (2.21)$$

where we ignore constants, and a circumflex indicates an estimated variable. Note that since the parameters of the cointegrating relationship for $c_t$, $a_t$, and $y_t$ are estimated superconsistently, we do not have to modify any standard errors in the subsequent analysis to take into account the fact that $ay_t$ is a generated regressor (which is why it does not receive a circumflex). That said, to the extent that there is measurement error in the consumption or wealth data, $\hat{v}^A_t$ will inherit that same error. When we use $\hat{v}^A_t$ to forecast market returns, this measurement error should only weaken the results. For measurement error to generate a spurious predictive relationship, it would have to be correlated with other predictors of returns.35
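To make the role of the EIS in the value estimate concrete, here is a minimal sketch of the weighting. It assumes the weights take the form 1/(1 − ρ) on log wealth and −ρ/(1 − ρ) on log consumption, with ρ the inverse of the EIS, which is what the Epstein–Zin relation between wealth, consumption, and value implies once constants are dropped. The function names and the sample inputs are hypothetical.

```python
def value_weights(eis):
    """Weights on log wealth and log consumption in estimated log value,
    assuming v = w / (1 - rho) - rho * c / (1 - rho) with rho = 1 / eis."""
    rho = 1.0 / eis
    if abs(1.0 - rho) < 1e-12:
        raise ValueError("weights are undefined when the EIS equals 1")
    return 1.0 / (1.0 - rho), -rho / (1.0 - rho)


def estimated_value(ay, c, eis=1.5):
    """Estimated log value series from log wealth (ay) and log consumption (c)."""
    w_wealth, w_cons = value_weights(eis)
    return [w_wealth * a + w_cons * ci for a, ci in zip(ay, c)]


for eis in (0.5, 1.1, 1.5, 3.0):
    w_wealth, w_cons = value_weights(eis)
    print(f"EIS={eis:>4}: weight on wealth={w_wealth:7.2f}, "
          f"weight on consumption={w_cons:7.2f}")
```

The two weights always sum to one, so the estimate inherits the common trend in wealth and consumption. As the EIS approaches 1 from above, both weights grow large in absolute value, which is one way to see why the estimated series becomes very volatile for values of the EIS close to 1.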
34 In simulations with variable labor supply, the price/dividend ratio on human wealth does vary over time, but the variation is relatively small: risk aversion calculated using the method here (assuming a constant price-dividend ratio on human wealth) is over 95 percent correlated with actual risk aversion.

35 One obvious source of measurement error is that human capital is not a perfect estimator of the value of human wealth. Suppose risk aversion rises above average and lowers the price-dividend ratio on human wealth below average. Labor income will then be overestimating human wealth (compared to its average). High levels of wealth drive our measure of $\hat{\alpha}_t$ downward, so this measurement error should bias the results against correctly forecasting returns (high risk aversion in the data leads to low risk aversion in our estimates).


This definition of $\hat{v}^A_t$ is similar to Lettau and Ludvigson's $cay_t$, except they have equal weights on $c_t$ and $ay_t$, whereas equation (2.21) uses a combination where the weights depend on the EIS. Also, $cay_t$ is stationary by construction, whereas $\hat{v}^A_t$ is growing over time (it is cointegrated with consumption and wealth). In equation (2.21) a high EIS (low $\rho$) raises the weight on consumption relative to asset wealth. If the EIS is less than 1 ($\rho > 1$), the weight on wealth, $ay_t$, is actually negative, and the weight on consumption greater than 1. Bansal and Yaron (2004) and Kaltenbrunner and Lochstoer (2010) both find that an EIS of 1.5 allows their models to fit asset pricing facts, so I use the same value. This value is also consistent with the micro evidence of Vissing-Jorgensen and Attanasio (2003). The results reported below are broadly similar as long as the EIS is greater than 1.1 (at that level and below, $\hat{v}^A_t$ becomes very volatile). The appendix reports a sensitivity analysis for various values of the EIS.

Figure 2.4 plots $\hat{v}^A_t$ both in its raw levels and with a linear trend taken out. As we would expect, $\hat{v}^A_t$ follows a strong upward trend. There seem to be both low and high frequency components to detrended $\hat{v}^A_t$. In particular, there are long-run swings with peaks in the early 1970s and 2000s and a trough around 1994, generally consistent with movements in the aggregate price-dividend ratio and variation in average output growth.36 At the same time, there are business-cycle frequency movements, e.g. the troughs in 1973, 2001, and 2008.

I construct an estimate of $\alpha_t$, $\hat{\alpha}_t$, using the update process for risk aversion, equation (2.5), and the data on $\hat{v}^A_t$. In particular, we have
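$$\hat{\alpha}_{t+1} = \phi\hat{\alpha}_t + (1-\phi)\bar{\alpha} + \lambda\left(\Delta\hat{v}^A_{t+1} - E_t\Delta v^A_{t+1}\right) \qquad (2.22)$$

As above, I assume that $\phi = 0.96$. $E_t\Delta v^A_{t+1}$ is estimated simply as the sample average of $\Delta\hat{v}^A_t$.37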
In the simulations not reported here, though, this effect seems to be small.
36 Note, though, that linear detrending will tend to make the series look as if it is mean-reverting even if it follows a random walk. The linear trend is only used to make the graph legible; none of the results involve it.
37 In principle, it is possible to forecast $\Delta v^A_{t+1}$, but the amount of predictability in $\Delta v^A_{t+1}$ is sufficiently small that the results are nearly identical to assuming that $v^A_{t+1}$ simply follows a random walk.


Table 2.6: Value, raw and linearly detrended

[Figure omitted. Left axis: log household value (lifetime utility). Right axis: log household value, linearly detrended. Horizontal axis: 1952 to 2007.]

Note: Household value (lifetime utility) is measured using data on wealth and consumption from Sydney Ludvigson's website. The thin line is the absolute level of value (left-hand axis); the thick line is value linearly detrended. Both variables are measured in logs. Grey bars are NBER-dated recessions.

The parameter $\lambda$ governs the volatility of $\alpha_t$, but it has only a multiplicative effect on $\hat{\alpha}_t$. That is, two estimates of $\hat{\alpha}_t$ will be perfectly correlated with each other, regardless of what values are chosen for $\lambda$. The same argument applies for $\bar{\alpha}$. As long as we are simply trying to forecast stock returns using a linear regression, we can ignore any additive or multiplicative shifts in $\hat{\alpha}_t$. Therefore, I set $\bar{\alpha} = 0$ and choose $\lambda$ so that $\hat{\alpha}_t$ has unit variance, normalizations that will have no effect on the regression-based measures of forecasting power (and I choose a negative value of $\lambda$ to match the habit-formation motivation of the model). In the first period of the sample I assume $\hat{\alpha} = \bar{\alpha}$. An important feature of this method of forecasting is that it is based only on the preference specification. None of the assumptions we made about the production side of the economy are required for this method to be valid. We simply take advantage of the relationship between household value and changes in risk aversion and the relationship under Epstein–Zin preferences between household value and wealth.
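The construction of estimated risk aversion can be sketched as a simple recursion. The code below assumes the AR(1) update with persistence 0.96, a long-run mean of zero, expected value growth proxied by its sample average, and the first observation set equal to the long-run mean, as described above. The input series is a hypothetical simulated random walk with drift rather than actual data, and the function names are illustrative.

```python
import random
from statistics import mean, pstdev


def estimate_risk_aversion(v_hat, phi=0.96, lam=-1.0, alpha_bar=0.0):
    """Apply the AR(1) update to a series of estimated log value.

    Innovations are changes in v_hat minus their sample average; the
    first observation is set to the long-run mean alpha_bar.
    """
    dv = [b - a for a, b in zip(v_hat, v_hat[1:])]
    expected_dv = mean(dv)
    alpha = [alpha_bar]
    for d in dv:
        alpha.append(phi * alpha[-1] + (1.0 - phi) * alpha_bar
                     + lam * (d - expected_dv))
    return alpha


def standardize(x):
    """Rescale a series to zero mean and unit variance."""
    m, s = mean(x), pstdev(x)
    return [(xi - m) / s for xi in x]


# Hypothetical value series: 228 quarters of a random walk with drift.
random.seed(0)
v_hat = [10.0]
for _ in range(227):
    v_hat.append(v_hat[-1] + 0.005 + random.gauss(0.0, 0.02))

alpha_a = standardize(estimate_risk_aversion(v_hat, lam=-1.0))
alpha_b = standardize(estimate_risk_aversion(v_hat, lam=-5.0))
print(max(abs(p - q) for p, q in zip(alpha_a, alpha_b)))
```

The two standardized series coincide up to floating-point error, which illustrates why the magnitude of λ is irrelevant for regression-based forecasting. Only its sign matters: with a negative λ, faster-than-average growth in value lowers estimated risk aversion.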

2.4.2 Forecasting market returns

The next question is to what extent the model-implied variation in expected returns is related to actual returns. Figure 2.5 plots $\hat{\alpha}_t$ and 5-year excess returns on the stock market (the value-weighted excess return from Kenneth French). The strong correlation between the two series (0.68) is immediately apparent. There are both high and low-frequency movements in $\hat{\alpha}_t$ associated with changes in growth in value. In the periods when value is growing quickly, e.g. the late 1990s, risk aversion falls. At the same time, there are higher-frequency movements, such as the temporary increases in estimated risk aversion around the recessions in 1991 and 2001.

Figure 2.1 plots $R^2$s from regressions of future stock returns on $\hat{\alpha}_t$, $cay_t$, the price-dividend ratio (P/D), and the excess consumption ratio from Campbell and Cochrane (1999). Each line gives the $R^2$ from a univariate regression. The x-axis gives the horizon for the return in quarters. The nth point is the $R^2$ from a regression of $\sum_{j=1}^{n} r_{t+j}$ on the predictor at time t. The regressions are all run on quarterly data from 1952 to 1999 (to
The appendix shows that the results are robust to different choices for φ.


Figure 2.4: Estimated risk aversion and 5-year excess stock returns

[Figure omitted. Series: estimated risk aversion (normalized) and 5-year excess stock returns, plotted on separate vertical axes. Horizontal axis: 1952 to 2007.]

Note: Excess stock returns are for the CRSP value-weighted index minus the risk-free rate, from Ken French's website. Returns are forward-looking five-year averages. Risk aversion is estimated from data on aggregate wealth and consumption and is normalized to have zero mean and unit variance.


ensure we have data for the 40-quarter regression). Each regression uses the same sample for the predictors.

At every horizon, $\hat{\alpha}$ is dominant. At the five-year horizon, the $R^2$ for estimated risk aversion peaks at more than twice that of the other variables. The $R^2$s are impressively high for just a single variable: at the 5-year horizon, $\hat{\alpha}$ explains 50 percent of the post-war variation in stock returns. Furthermore, in horse-race regressions (reported in the appendix), $\hat{\alpha}$ dominates cay at all horizons.

An important consideration in long-horizon forecasting regressions is that the residuals are highly persistent. Kiefer, Vogelsang and Bunzel (2000) and Kiefer and Vogelsang (2005) show that by using Newey–West standard errors with a very long lag window, we can obtain test statistics with better size properties than techniques that use a fixed (and usually short) lag window. I choose a lag window equal to half the sample size and use the critical values reported in Kiefer and Vogelsang (2005).38 For cay, every regression except for those with horizons greater than 30 quarters is significant at the 5 percent level. For $\hat{\alpha}$, the largest p-value is 0.0008. The price-dividend ratio is significant at the 5 percent level for forecasts of 14 quarters or longer. In other words, these regressions all imply that we have substantial ability to forecast stock returns in the post-war period, and $\hat{\alpha}$ is the strongest of the predictors. Out-of-sample tests with both asymptotic and bootstrapped critical values give similar results (appendix D.3). Appendix D examines the sensitivity of the results in this section to the various parameters we had to calibrate (e.g. the EIS and the persistence of habits). The basic results hold across a broad range of parameter sets.
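The long-horizon regressions behind these R² values can be sketched as follows. This is a minimal illustration that computes only the R² of a univariate regression of cumulated future returns on a time-t predictor; it does not implement the Newey–West or Kiefer–Vogelsang inference discussed above. The simulated predictor and return series, and the coefficients that generate them, are hypothetical.

```python
import random


def long_horizon_r2(returns, predictor, horizon):
    """R^2 from regressing sum_{j=1..horizon} returns[t+j] on predictor[t]."""
    n_obs = len(returns) - horizon
    y = [sum(returns[t + 1 : t + 1 + horizon]) for t in range(n_obs)]
    x = [predictor[t] for t in range(n_obs)]
    my, mx = sum(y) / n_obs, sum(x) / n_obs
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # In a univariate regression, R^2 equals the squared correlation.
    return sxy * sxy / (sxx * syy)


# Hypothetical data: a persistent predictor and returns that load on its lag.
random.seed(1)
T = 192  # quarters, roughly 1952 to 1999
x = [0.0]
for _ in range(T - 1):
    x.append(0.96 * x[-1] + random.gauss(0.0, 1.0))
r = [0.0] + [0.1 * x[t] + random.gauss(0.0, 1.0) for t in range(T - 1)]

for n in (1, 4, 20, 40):
    print(f"horizon {n:2d} quarters: R^2 = {long_horizon_r2(r, x, n):.3f}")
```

With a persistent predictor, the R² typically rises with the horizon for a while because the predictable component of returns cumulates faster than the noise. The overlapping observations are also why the residuals are persistent and why the choice of standard errors matters.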

2.4.3 Forecasts from estimates of technology

The method of estimating the level of risk aversion studied above does not rely on any assumptions about the structure of production in the economy, being derived purely from the preference speciﬁcation. However, in the production model, changes in value are
38 Kiefer and Vogelsang (2002) note that there is a size-power tradeoff. When the lag window is increased, the size of the test statistics gets closer to their nominal size, but there is a loss of power. I choose a lag window of half the sample to balance these considerations. The results are basically identical when using a lag window equal to the sample size as in Kiefer, Vogelsang, and Bunzel (2000), though the price-earnings ratio is significant in more of the regressions.


closely related to changes in productivity. If we can measure innovations to technology, then risk aversion should follow an AR(1) process where the innovations are equal to the shocks to the stochastic trend in technology. There is a large literature that tries to estimate aggregate technology shocks. I consider two methods here. The first builds on Solow (1957) and uses restrictions from a constant-returns production function:

$$a_t = y_t - \gamma k_t - (1 - \gamma) l_t \tag{2.23}$$

$a_t$ measures technology if the economy has a Cobb–Douglas production function, with $l$ denoting log labor supply. I also consider a simpler metric, labor productivity, $lp_t = y_t - l_t$. Labor productivity does not take into account the effects of capital accumulation and simply models technology as the average product of labor. Capital can be difficult to measure, whereas the number of hours supplied in the economy is a fairly concrete quantity (though the quality of those hours is difficult to account for).39

To extract the stochastic trend from the two productivity series, I estimate univariate ARMA models for each variable. The Bayesian information criterion implies that TFP growth is best fit with an MA(2), while labor productivity growth should be treated as i.i.d. $\varepsilon^{TFP}_t$ is defined as the residual in the MA(2), while $\varepsilon^{LP}_t$ is simply equal to labor productivity growth. That is, $\varepsilon^{TFP}_t$ and $\varepsilon^{LP}_t$ are innovations to the Beveridge–Nelson (1981) trends in productivity. Section 2.5.1 shows that, at least in the case where log technology follows a random walk, risk aversion follows an AR(1) process of the form

$$\alpha_t = (1 - \theta) \bar{\alpha} + \theta \alpha_{t-1} + \varepsilon^X_t \tag{2.24}$$

where $\varepsilon^X_t$ denotes a measure of technology growth. We then have two measures of $\alpha_t$,
39 Furthermore, labor productivity determines the tradeoff that households face between consumption and leisure. If the capital stock rises because foreigners want to invest more in the US, household welfare will increase even if TFP does not. Similarly, a tax increase that reduced desired saving could lower welfare and labor productivity, without having any effect on TFP. And welfare is the relevant input in estimating $\hat{\alpha}_t$.


which I denote $\hat{\alpha}^{TFP}$ and $\hat{\alpha}^{LP}$, using $\varepsilon^{TFP}_t$ and $\varepsilon^{LP}_t$, respectively.40 The two measures turn out to be highly correlated (93 percent).

Figure 2.6 plots five-year excess returns against $\hat{\alpha}^{TFP}$ and $\hat{\alpha}^{LP}$. The two series are both clearly highly correlated with future excess returns. The p-values in regressions of quarterly excess returns on $\hat{\alpha}^{TFP}$ and $\hat{\alpha}^{LP}$ are 0.032 and 0.026, respectively (using Kiefer, Vogelsang, and Bunzel, 2000, t-type statistics to account for autocorrelation). The relationship between the three series is most clear around the turning points. Productivity growth begins slowing down around 1970, driving risk aversion upwards. Forward-looking stock returns reach their trough at roughly the same point. Productivity growth rises again starting in the mid-1990s, which is exactly when stock returns begin falling again.

The two measures of risk aversion in figure 2.6 clearly do not have the explanatory power of the variables studied above. The wealth and consumption data used above have a forward-looking component that is not present in instantaneous measures of technology. On the other hand, my measure of risk aversion is highly correlated with the consumption-wealth ratio. In almost any model with time-varying discount rates, the consumption-wealth ratio will forecast stock returns. It is not the case, though, that any model will predict that measures of technology should forecast stock returns. Figure 2.6 thus provides evidence in favor of the model presented here over other explanations of time-varying expected returns.
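The construction of the productivity-based risk-aversion series can be sketched as follows. This is a minimal illustration on simulated data, not the dissertation's code: the function names, the persistence value `theta=0.95`, and the simulated growth series are all hypothetical. Following the note to the accompanying figure, the innovation fed into the AR(1) is assumed to be the negative of the productivity innovation, and the resulting series is normalized to zero mean and unit variance.

```python
import numpy as np


def solow_residual(y, k, l, gamma):
    """Log technology from equation (2.23): a_t = y_t - gamma*k_t - (1-gamma)*l_t."""
    return y - gamma * k - (1.0 - gamma) * l


def risk_aversion_from_innovations(eps, theta, alpha_bar=0.0, alpha_init=None):
    """Iterate alpha_t = (1-theta)*alpha_bar + theta*alpha_{t-1} + eps_t, as in (2.24)."""
    eps = np.asarray(eps, dtype=float)
    alpha = np.empty(len(eps))
    prev = alpha_bar if alpha_init is None else alpha_init
    for t, e in enumerate(eps):
        prev = (1.0 - theta) * alpha_bar + theta * prev + e
        alpha[t] = prev
    return alpha


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Hypothetical i.i.d. labor productivity growth, standing in for the BLS series.
    lp_growth = 0.005 + 0.008 * rng.standard_normal(200)
    # Assumption: risk aversion rises when productivity growth is below its mean.
    eps_lp = -(lp_growth - lp_growth.mean())
    alpha_lp = risk_aversion_from_innovations(eps_lp, theta=0.95)
    alpha_lp = (alpha_lp - alpha_lp.mean()) / alpha_lp.std()
    print(alpha_lp[:5])
```

Demeaning the growth series only shifts the level of the resulting series, which a regression constant absorbs, so the forecasting results do not depend on that step.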

2.5 Extensions
2.5.1 Log-linearization

This section studies a log-linear version of the production economy from above. I use the solution to build a better understanding of the basic results reported in the previous sections. In particular, I derive analytic approximations for the consumption function, the risk-free rate and the conditional Sharpe ratio for the wealth portfolio. I also derive
40 Note that $\hat{\alpha}^{TFP}$ includes some forward-looking information since its construction requires the estimation of an MA(2) on the full sample. $\hat{\alpha}^{LP}$ does not suffer from this flaw. It is true that in both cases we have to estimate mean productivity growth, but shifts in the estimated mean simply correspond to shifts in the mean of $\hat{\alpha}_t$; they have no effect on its dynamics. In regressions of returns on $\hat{\alpha}_t$, the constant will thus always absorb shifts in $\bar{\alpha}$, so the estimation of the mean of productivity growth is irrelevant for forecasting returns.


Figure 2.5: Stock returns and estimates of risk aversion from productivity growth

[Figure: time-series plot, 1955 to 2005. Series: risk aversion from labor productivity, risk aversion from total factor productivity, and five-year stock returns. Axes: estimated risk aversion (normalized) and five-year stock returns.]

Note: Total factor productivity is the quarterly Solow residual obtained from John Fernald's website. Labor productivity is output per hour in the non-farm private business sector from the BLS. Risk aversion is an AR(1) with innovations equal to the (negative) innovations to the Beveridge–Nelson trend in productivity.


an essentially afﬁne model of the term structure with a time-varying price of risk. This result connects the standard production theory in the macro literature to one of the most commonly used empirical asset-pricing frameworks.

2.5.1.1 Approximation method and solution

I derive a log-linear consumption function as the solution to a model that represents a log-linearization of the environment derived above with permanent technology shocks only. Specifically, if the capital accumulation equation, the return on capital, and the return on the wealth portfolio are log-linearized, then we are able to obtain an exact formula for the consumption function under EZ-habit utility (with canonical Epstein–Zin and power utility as special cases). The methods build on Campbell (1994) and Lettau (2003). Unlike the usual techniques in the macro literature, the solution is not based on certainty-equivalent or higher-order approximations to expectations.41 Rather, I take advantage of log-normality to calculate expectations exactly. That feature of the method is critical for accurately capturing risk premia and precautionary saving effects. It is straightforward to show that the approximation technique delivers policy functions that are identical to those obtained from perturbation up to the first order (depending on where the derivatives are taken). The difference is that because I take advantage of formulas for log-normal expectations, a term involving risk aversion appears in the solution, meaning that the approximation captures the time-varying precautionary saving effect that is central to driving the difference in the consumption response between the EZ-habit model and the RBC model with canonical Epstein–Zin preferences.42

The approximation method involves log-linear approximations to three components of the model: the budget constraint, the return on the wealth portfolio, and the return on
41 The usual technique is the perturbation method of Judd (1999). See Woodford (2003) for a representative application and Rudebusch and Swanson (2009) for extensions to higher-order approximations.
42 In perturbation, the equilibrium equations are not only approximated with respect to the endogenous and exogenous variables, but also the volatility of the technology shock, $\sigma_a$. The first-order perturbation solution therefore does not include any terms involving interactions of state variables with $\sigma_a$. The approximation used here includes a term for $\alpha_t \sigma_a^2$. The solutions are otherwise identical.


capital,

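$$k_{t+1} \approx \lambda_0 + \lambda_k k_t + \lambda_a a_t + \lambda_c c_t \tag{2.25}$$

$$r_{w,t+1} \approx E_t r_{w,t+1} + \Delta E_{t+1} \sum_{j=0}^{\infty} \theta^j \Delta c_{t+j+1} - \Delta E_{t+1} \sum_{j=1}^{\infty} \theta^j r_{w,t+j+1} \tag{2.26}$$

$$r_{k,t+1} \approx r_0 + r_k (k_{t+1} - a_{t+1}) \tag{2.27}$$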

where $\{\lambda_0, \lambda_k, \lambda_a, \lambda_c, \theta, r_0, r_k\}$ are linearization coefficients that I solve for in the appendix and lower-case letters denote logs.43

Define $\tilde{c}_t \equiv c_t - a_t$ and $\tilde{k}_t \equiv k_t - a_t$. With the three log-linear approximations (2.25), (2.26), and (2.27), the appendix shows that we can obtain the following result:

Proposition 2.1. Given the log-linear budget constraint, the log-linear return on the consumption claim, and the production function, the optimal consumption plan takes the form

$$\tilde{c}_t = \eta_{c0} + \eta_{ck} \tilde{k}_t + \eta_{ca} \alpha_t \tag{2.28}$$

and the return on the wealth portfolio can be written as

$$r_{w,t+1} = \eta_{w0} + \eta_{wa} \alpha_t + \rho E_t \Delta c_{t+1} + \kappa_r \varepsilon_{t+1} \tag{2.29}$$

where the coefficients $\{\eta_{c0}, \eta_{ck}, \eta_{ca}, \eta_{w0}, \eta_{wa}, \kappa_r\}$ are solved for in the appendix. Furthermore, the log price-dividend ratio of the consumption claim is linear in risk aversion and scaled capital.

The first important implication of this result is that even though risk aversion is time-varying, both consumption growth and returns on the wealth portfolio are homoskedastic. We did not assume the existence of a log-linear policy for consumption or a homoskedastic wealth return. Variation in risk aversion induces variation in expected returns on risky assets, but not in their volatility. This same result is obtained in the numerical solution above.
43 (2.25) is a log-linearization of the resource constraint, $K_{t+1} = (1 - \delta) K_t + A_t^{1-\gamma} K_t^{\gamma} - C_t$; (2.26) is the Campbell–Shiller (1988) approximation for the return on the wealth portfolio; (2.27) is a linearization of $R_{k,t+1} = \gamma K_{t+1}^{\gamma-1} A_{t+1}^{1-\gamma} + 1 - \delta$.


Remark 2.1. $\eta_{ck}$ does not depend on the level or volatility of the coefficient of relative risk aversion (i.e. on $\bar{\alpha}$ or $\lambda$).

The finding that $\eta_{ck}$ is not affected by the time-variation in the coefficient of relative risk aversion helps us build intuition as to why the IRF for consumption changes when risk aversion varies. Consumption responds to a technology shock more strongly in the EZ-habit model than in the RBC model purely because the coefficient of relative risk aversion falls in response to positive technology shocks. If the economy experienced a hypothetical shock to the size of the capital stock holding the coefficient of relative risk aversion fixed, the behavior of consumption and saving would be identical under EZ-habit and EZ-CRRA preferences.

2.5.1.2 The risk-free rate and excess returns on the wealth portfolio

Denote the conditional standard deviation of a variable $x$ as $\sigma(x)$. We have the following formulas for the risk-free rate and Sharpe ratio:

Proposition 2.2. In the log-linearized model, the risk-free rate follows

$$r_{f,t+1} = \eta_{f0} + \rho E_t \Delta c_{t+1} - \eta_{fa} \alpha_t \tag{2.30}$$

and the Sharpe ratio of the consumption claim is

$$\frac{E_t r_{w,t+1} - r_{f,t+1} + \frac{1}{2} \sigma^2(r_w)}{\sigma(r_w)} = \rho \sigma(\Delta c) + (\alpha_t - \rho) \sigma(\Delta v) \tag{2.31}$$

As usual, expected consumption growth affects the risk-free rate in proportion to the inverse of the EIS. There is an additional term $\eta_{fa} \alpha_t$ reflecting the time-varying precautionary saving motive. When risk aversion is high, precautionary saving demand is high, driving the risk-free rate downwards, all else equal. It is immediately clear that a simple regression of consumption growth on interest rates will not identify the EIS, $1/\rho$, unless the instruments used for the interest rate are uncorrelated with current risk aversion or risk aversion is controlled for.

Note also that the Sharpe ratio is strictly increasing in $\alpha_t$. The terms $\sigma(\Delta c)$ and $\sigma(\Delta v)$ are the standard deviations of growth in household consumption and value, respectively (both of which are constant in equilibrium). Tallarini (2000) and Lettau (2003) show that in an RBC model with power utility, an increase in risk aversion need not increase the Sharpe ratio because consumers can endogenously smooth consumption.44 In this setting, though, risk aversion unambiguously increases the Sharpe ratio. The reason is that the preference for consumption smoothing comes from the EIS. An increase in risk aversion does not cause consumers to smooth consumption endogenously, and so the only effect is to raise the Sharpe ratio on the wealth portfolio.

An obvious question is what the term $\sigma(\Delta v)$ actually is. The appendix derives the following results:

Proposition 2.3. In equilibrium, the coefficient of relative risk aversion $\alpha_t$ follows the process,

$$\alpha_t = \phi \alpha_{t-1} + (1 - \phi) \bar{\alpha} + \sigma_{aa} \varepsilon_t \tag{2.32}$$

where σaa depends on the parameters λ, θ, φ, and σa . The standard deviation of innovations to value is

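$$\sigma(\Delta v) = \frac{-1 + \sqrt{1 + 2 \frac{\theta}{1 - \theta\phi} \sigma_{aa} \sigma_a}}{\frac{\theta}{1 - \theta\phi} \sigma_{aa}} \tag{2.33}$$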

$\sigma(\Delta v)$ is increasing in $\sigma_a$ for $\sigma_a > 0$ and $\sigma_{aa} < 0$.

Result 2.3 first shows that the coefficient of relative risk aversion follows an AR(1) process with innovations that are perfectly correlated with technology shocks. This result is a consequence of household value being a log-linear function of the level of technology, so that $v_{t+1} - E_t v_{t+1}$ is a linear function of the technology shock. Result 2.3 also shows that the standard deviation of innovations to household value is constant. Furthermore, in the benchmark case where technology shocks drive risk aversion down, an increase in their volatility raises the volatility of innovations to value, as we would expect.
44 Lettau and Uhlig (2004) and Rudebusch and Swanson (2008) obtain similar results for Campbell–Cochrane preferences.


In the case where $\sigma_{aa} = 0$, which corresponds to power utility, we obtain a surprisingly simple formula:

Remark 2.2. For the RBC model where technology follows a random walk and consumers have constant relative risk aversion, the Sharpe ratio on a consumption claim is approximately

$$SR_t \approx \alpha \sigma_a (1 - \eta_{ck}) + (\alpha - \rho) \sigma_a \eta_{ck} \tag{2.34}$$

$$\approx \alpha \sigma_a - \rho \sigma_a \eta_{ck} \tag{2.35}$$

This formula is similar to the formula obtained by Bansal and Yaron (2004) for the Sharpe ratio in the presence of long-run risks in an endowment economy. σa represents long-run shocks to consumption growth, since consumption eventually catches up to a technology shock. Of that total response, σa (1 − ηck ) comes in the ﬁrst period, with σa ηck in subsequent periods.45 We can thus think of the ﬁrst component of the Sharpe ratio, ασa (1 − ηck ), as Bansal and Yaron’s short-run risk term, and (α − ρ) σa ηck as long run risks. Kaltenbrunner and Lochstoer (2010) also show that production models generate long-run risk endogenously, but this simple formula for the Sharpe ratio has not been obtained elsewhere. The second line shows that we can isolate the ηck term. If the EIS is large (ρ is small) then the endogenous response of consumption in the model is unimportant and the Sharpe ratio is determined simply by the volatility of technology shocks and the coefﬁcient of relative risk aversion.
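The decomposition in Remark 2.2 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and the parameter values (risk aversion of 10, ρ of 0.5, σ_a of 0.01, η_ck of 0.5) are hypothetical and are not the calibration used in the text. It also evaluates the general formula (2.31) under the substitution σ(Δc) = σ_a(1 − η_ck) and σ(Δv) = σ_a, which is the substitution needed for (2.31) to reduce to (2.35) in the constant-risk-aversion case.

```python
def sharpe_ratio_crra(alpha, rho, sigma_a, eta_ck):
    """Short-run and long-run components of the Sharpe ratio in (2.34), and their sum."""
    short_run = alpha * sigma_a * (1.0 - eta_ck)
    long_run = (alpha - rho) * sigma_a * eta_ck
    return short_run, long_run, short_run + long_run


def sharpe_ratio_general(alpha_t, rho, sd_dc, sd_dv):
    """Right-hand side of (2.31): rho*sigma(dc) + (alpha_t - rho)*sigma(dv)."""
    return rho * sd_dc + (alpha_t - rho) * sd_dv


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to make the arithmetic easy to follow.
    alpha, rho, sigma_a, eta_ck = 10.0, 0.5, 0.01, 0.5
    short_run, long_run, total = sharpe_ratio_crra(alpha, rho, sigma_a, eta_ck)
    collapsed = alpha * sigma_a - rho * sigma_a * eta_ck          # equation (2.35)
    general = sharpe_ratio_general(alpha, rho, sigma_a * (1.0 - eta_ck), sigma_a)
    print(short_run, long_run, total, collapsed, general)
```

With a small ρ (a large EIS) the second term in (2.35) is negligible, which is the sense in which the endogenous consumption response stops mattering for the Sharpe ratio.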

2.5.2 Affine bond pricing

It turns out that the log-linear solution to the model allows us to connect the standard macro framework to the bond-pricing literature through the following result:
45 Recall that $\eta_{ck}$ is the coefficient on scaled capital in the consumption function. A unit increase in the technology shock $\varepsilon_{t+1}$ raises consumption by $\sigma_a$; the associated decline in scaled capital of $\sigma_a$ lowers consumption by $\eta_{ck} \sigma_a$.


Proposition 2.4. The log stochastic discount factor can be expressed as

$$m_{t+1} = -r_{f,t+1} - \frac{1}{2} (\omega_0 + \omega_1 \alpha_t)^2 \sigma^2 + (\omega_0 + \omega_1 \alpha_t) \varepsilon_{t+1} \tag{2.36}$$
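As a consistency check on (2.36), the block below verifies that this SDF prices the one-period riskless bond correctly and reads off the implied price of risk. It assumes, which the proposition itself does not state, that $\varepsilon_{t+1}$ is conditionally normal with mean zero and variance $\sigma^2$. The symbol $\lambda_t$ is shorthand introduced here for $\omega_0 + \omega_1 \alpha_t$ and is not notation from the text.

```latex
% Assumption: \varepsilon_{t+1} | t  ~  N(0, \sigma^2); \lambda_t is shorthand introduced here.
\begin{align*}
\lambda_t &\equiv \omega_0 + \omega_1 \alpha_t \\
E_t\!\left[ e^{m_{t+1}} \right]
  &= e^{-r_{f,t+1} - \frac{1}{2} \lambda_t^2 \sigma^2}\, E_t\!\left[ e^{\lambda_t \varepsilon_{t+1}} \right]
   = e^{-r_{f,t+1} - \frac{1}{2} \lambda_t^2 \sigma^2}\, e^{\frac{1}{2} \lambda_t^2 \sigma^2}
   = e^{-r_{f,t+1}} \\
\mathrm{Var}_t\!\left( m_{t+1} \right) &= \lambda_t^2 \sigma^2 \\
\frac{\sigma_t\!\left( M_{t+1} \right)}{E_t\!\left( M_{t+1} \right)}
  &= \sqrt{ e^{\lambda_t^2 \sigma^2} - 1 } \;\approx\; |\lambda_t|\, \sigma
\end{align*}
```

The last line is the conditional Hansen–Jagannathan bound under these assumptions. It varies with $\alpha_t$ through $\lambda_t$, which is what makes the price of risk time-varying.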

The SDF takes the tractable essentially affine form studied in much of the recent bond-pricing literature (see Duffee, 2002, and Piazzesi, 2010, for a recent review). We have a production-based general-equilibrium affine model of the term structure with an endogenously varying price of risk. Not only is the one-period risk-free rate affine in the state variables, but so are the prices and yields for all longer-term zero-coupon bonds. The fact that the SDF is affine is convenient because it means that the model could be estimated using the Kalman filter, either through Bayesian or frequentist methods. I am not aware of an affine model of the term structure with a time-varying price of risk being derived in a production setting previously.

2.5.3 Labor supply

I model labor supply as in Gourio (2010) and van Binsbergen et al. (2010). The household’s value function takes the form
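$$V_t = \left\{ \left(1 - \exp(-\beta)\right) \left[ C_t^{v} (1 - N_t)^{1-v} \right]^{1-\rho} + \exp(-\beta) \left( E_t V_{t+1}^{1-\alpha_t} \right)^{\frac{1-\rho}{1-\alpha_t}} \right\}^{\frac{1}{1-\rho}} \tag{2.37}$$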

where $N_t$ represents market labor and $\alpha_t$ follows the same process as above. The household's labor supply condition is

$$\frac{1}{1 - N_t} \frac{1 - v}{v} = \frac{\omega_t}{C_t} \tag{2.38}$$

where $\omega_t$ is the wage. Note that risk aversion and habits only affect labor supply to the extent that they affect consumption. When consumption rises, labor falls, all else equal. Since positive permanent technology shocks drive consumption up farther in the EZ-habit model compared to the EZ-CRRA case, labor supply will rise by less in the EZ-habit model. Following a temporary technology shock, there is little change in risk aversion, so labor supply in that case

will look similar in the EZ-habit and EZ-CRRA models. This result is not specific to the Cobb–Douglas utility specification studied here. In general, preferences consistent with balanced growth will specify labor supply as some function $H(\omega_t / C_t)$ (see King, Plosser, and Rebelo, 1988). Since $\omega_t / C_t$ is stationary with balanced growth, labor supply will be too. If $H$ is monotonically increasing, regardless of its functional form, the increase in consumption induced by a decline in risk aversion will also lower labor supply.

To see how habits affect labor supply here, figure 2.7 plots the response of employment to a shock to technology in the EZ-habit model versus a model with constant relative risk aversion.46 With EZ-CRRA we have the usual RBC result that the increase in technology increases $\omega_t / C_t$, thus raising employment. Employment then slowly falls back down to its steady state. In the simple RBC model, it is possible to make labor supply fall following a shock by varying the parameters, but it always monotonically returns to steady state. In the EZ-habit specification, though, the response of labor supply has a hump shape. On the impact of the shock, employment barely increases at all, and it then rises slowly thereafter. This behavior actually matches the response of employment to technology shocks in the literature following Gali (1999).47 In particular, all of those papers, though they use different methods, and though they find different initial responses of employment to technology, find a pronounced hump shape. Basu, Fernald, and Kimball (2006) argue that this could be explained by a New Keynesian model. Figure 2.7 shows that variation in risk aversion could also explain that behavior.48

Boldrin, Christiano, and Fisher (2001) and Jaccard (2011) note that in the RBC model with power utility and additive habits, variable labor supply undermines the ability of the RBC model to generate a volatile SDF.
Intuitively, households can use labor supply to smooth consumption growth. Under power utility, the volatility of consumption growth
46 All of the parameters are identical to the main text, and v is set to 0.33 as in Gourio (2010).
47 See, e.g., Christiano, Eichenbaum, and Vigfusson (2004), Francis and Ramey (2005), and Basu, Fernald, and Kimball (2006).
48 Note, though, that those papers also find hump-shaped responses for output and consumption, which I do not obtain.


Figure 2.6: Response of employment to a technology shock

[Figure: response of employment in percent (vertical axis, 0 to 0.3) over quarters 1 to 37 (horizontal axis). Series: EZ-CRRA and EZ-habit.]

Note: Response of labor supply to a unit-standard-deviation permanent increase in the level of technology.

is what determines the volatility of the SDF. Under Epstein–Zin utility, though, the ability to smooth consumption shocks does not reduce the volatility of the SDF since the SDF loads almost purely on the permanent component of consumption, as we saw in the previous subsection.49 The EZ-habit model thus does not suffer from the drawback of previous habit-based models that freely variable labor supply could substantially reduce the Hansen–Jagannathan bound.

2.6 Conclusion
This paper presents a model of time-varying risk aversion. It simultaneously matches the basic behavior of macroeconomic and financial aggregates. The EZ-habit model gives a framework in which consumption, output, and investment growth are all realistically volatile in both the short and long run, consumption growth is nearly a random walk, and risk premia are high and volatile.

More generally, this paper provides a general framework for modeling time-varying discount rates that can be used with other macro models. As pointed out by Cochrane (2011), asset-pricing research has recently focused on understanding variation in the price of risk over time. This paper gives a way of analyzing time-varying risk prices in the standard macro framework. I show that for the RBC model, the effect of an increase in risk aversion on consumption and investment looks similar to a decline in the household's rate of time preference in the sense that it temporarily increases investment and reduces consumption.

An obvious next step is to study the EZ-habit preferences in a richer setting. Dew-Becker (2011) estimates a standard medium-scale DSGE model with sticky prices and wages, but with the added feature that risk aversion varies over time, as here. Complementing the results in this paper on equity pricing, Dew-Becker (2011) shows that the EZ-habit model, when augmented with a model of inflation, can match the behavior of the nominal term structure well, generating a strongly upward-sloping term structure of
49 I confirm this result numerically; the Hansen–Jagannathan bound is essentially identical with and without labor supply.


nominal interest rates and a volatile term premium.


3. BOND PRICING WITH A TIME-VARYING PRICE OF RISK IN AN ESTIMATED MEDIUM-SCALE BAYESIAN DSGE MODEL

3.1 Introduction
Non-structural models are widely used in both macroeconomics and the study of the term structure of interest rates. Recently, Smets and Wouters (2003) have shown that a structural New Keynesian model can match the dynamics of the macroeconomy as well as or better than a benchmark non-structural VAR. This paper extends that work by showing that a suitably augmented version of their model can also match the dynamics of the term structure of interest rates as well as a standard non-structural model. In addition, including information from the term structure has substantial effects on the estimated sources of variation in the real economy. Bekaert, Cho, and Moreno (2010) show that a log-linearized macro model naturally also delivers closed-form expressions for bond prices. Their approximation method, however, is not able to describe risk premia, and even if it could, the model assumes that risk premia are constant. This paper builds on their work by using an approximation method that allows for positive and time-varying risk premia. I then estimate the model using Bayesian methods, and show that it ﬁts interest rates with errors that are similar to those generated by a non-structural three-factor model. The errors in ﬁtting annualized yields on bonds with maturities ranging from 1 quarter to 10 years have a standard deviation of 8 basis points. For the production side of the economy, I take the model described in Justiniano, Primiceri, and Tambalotti (JPT; 2010) and combine it with a preference speciﬁcation that endogenously generates the essentially afﬁne stochastic discount factor of Duffee (2002). Households are assumed to have Epstein–Zin preferences with time-varying risk aversion as in


Melino and Yang (2003) and Dew-Becker (2011a), which induces a time-varying price of risk. I also allow the central bank to have a time-varying inﬂation target, movements in which shift the entire term structure, inducing a so-called level factor in interest rates. The steady-state term spread in the model simply represents the average risk premium on long-term bonds. The steady-state term spread is estimated to be 152 basis points, similar in magnitude to the 207-basis-point average observed in the sample. To understand why that risk premium would be large, we ﬁrst need to understand what drives the variance of the pricing kernel. When the representative household has Epstein–Zin preferences with a coefﬁcient of relative risk aversion that is substantially larger than the inverse of its EIS (preferring an early resolution of uncertainty), state prices are almost entirely driven by innovations to the household’s lifetime utility, i.e. the value placed on its entire future stream of consumption and leisure. With a high EIS, transitory changes in consumption have a small effect on lifetime utility. Permanent technology shocks, though, will have large effects. Shifts in risk aversion also affect lifetime utility because they affect how much the household penalizes future uncertainty. Even though there are nine shocks in the economy, only two of them turn out to be relevant for the pricing kernel—labor-neutral technology and risk aversion. Since all of the other shocks (e.g. monetary policy, markups, government spending) are purely transitory, they have little effect on permanent income or welfare (because the household is estimated to have a relatively high EIS of 1.33), and thus they do not have a strong effect on state prices. Following a positive innovation to the level of technology, nominal interest rates are estimated to fall, making long-term bonds risky and inducing a positive slope in the term structure. 
This result is common to a variety of New-Keynesian models, e.g. JPT, Smets and Wouters (2004), and Christiano, Trabandt, and Walentin (2011). In this paper, the reason is that the central bank's inflation target falls following positive technology shocks. Intuitively, a positive supply shock lowers inflationary pressure, which the central bank takes as an opportunity to drive inflation lower for an extended period. The fact that the negative correlation between technology shocks and interest rates is obtained in numerous other models that assume a constant inflation target suggests that this is in fact a well-identified feature of the data.

Variation in risk aversion also makes an important contribution to the model's ability to fit the term structure of interest rates, though. Standard statistical tests easily reject a model with constant risk aversion in favor of one with time-varying risk aversion. The pricing errors for bonds are smaller by a factor of three when risk aversion is allowed to vary over time. Movements in risk aversion account for a large fraction of the variance of the term spread, particularly outside of recessions.

While the variance decompositions imply that the pricing kernel is driven entirely by the labor-neutral technology and risk aversion shocks, I find that those two shocks have only minor effects on the dynamics of the real economy in the short run. Risk aversion explains less than 5 percent of the variance of output, consumption, investment, and hours worked at business-cycle frequencies. The variable that is most responsive to the technology shock is hours worked, and the technology shock still explains only 25 percent of its variance. The variance decompositions also differ substantially from the results found by JPT. Whereas JPT find that investment technology shocks are an important driver of the business cycle, I find that they explain little other than investment, and monetary policy and markup shocks play much larger roles. This finding suggests that including information about bond prices in estimation has important effects on estimation results.

In addition to matching the behavior of the term structure, the estimated parameters imply reasonable behavior for equity prices. The steady-state annualized Hansen–Jagannathan bound is estimated to be 0.47, which is consistent with the observed Sharpe ratio for the stock market in the data sample, even though data on equity returns is not included in the estimation.
Furthermore, the estimated degree of variation in risk aversion is similar to (though somewhat higher than) the value used in Dew-Becker (2011a), who calibrates a general-equilibrium model that can match both the average Sharpe ratio on equities and empirical stock return forecasting regressions. At business-cycle frequencies, estimated risk aversion displays similar behavior to Cochrane and Piazzesi's (2005) tent-shaped bond return forecasting factor (and they are both strongly correlated with the term spread).

This paper is related to a small but growing literature on bond pricing in production

economies. Bekaert, Cho, and Moreno (2010) and Doh (2011) estimate New-Keynesian macro models, but they do not focus on the size and volatility of the term premium, whereas that is the feature of the term structure that this paper concentrates on. Rudebusch and Swanson (2011) generate a large and volatile term premium in a calibrated model. This paper moves beyond them by considering a substantially more complex model and showing that it can be dynamically estimated through standard Bayesian methods using the Kalman ﬁlter. Models of the business cycle have strong implications for the term structure of interest rates, so adding that information can have strong effects on estimation results. For example, I ﬁnd that when the model is estimated without bond price information, the shock to investment technology is estimated to account for a large fraction of the variance of short-term interest rates and the term spread. But when the term spread is included as part of the information set, the effects of investment technology shocks are much smaller. The implications of the model for wage-setting also change when interest rates are added: I estimate a substantially larger Frisch elasticity than JPT do, coming in closer agreement with micro evidence. The remainder of the paper is organized as follows. Section 3.2 describes household preferences and derives the pricing kernel. Section 3.3 describes the remainder of the economy including the production process, price setting, and monetary and ﬁscal policy. Next, section 3.4 explains how the model is solved. If we used perturbation methods, a third-order approximation would be necessary to capture time-variation in risk premia. The estimation of the model turns out to be sufﬁciently difﬁcult, however (due to numerous local extrema in the likelihood function, a common feature of models of the term structure), that the use of a nonlinear ﬁlter for calculating the model’s marginal likelihood is infeasible. 
I therefore use the essentially affine solution method described in Dew-Becker (2011b). The method approximates the pricing kernel separately from the remainder of the model, allowing it to take the essentially affine form with a time-varying price of risk described in Duffee (2002). The essentially affine method is equivalent to a first-order perturbation local to the non-stochastic steady state, but it includes corrections for volatility that allow it to substantially outperform perturbation in stochastic simulations. The key feature of the essentially affine method is that risk premia may vary over time and affect

real variables, not just asset prices. Section 3.5 describes the Bayesian methods used to estimate the model. Sections 3.6 and 3.7 examine the implications of the estimates for asset prices and the dynamics of the real economy, respectively. Finally, section 3.8 concludes.

3.2 Household preferences
3.2.1 Objective function and budget constraint

I assume the household has recursive preferences over consumption and leisure,

$$V_t = \left\{ (1-\beta B_t)\, U(C_t, \bar{C}_{t-1}, N_t, Z_t) + \beta B_t \left( E_t V_{t+1}^{1-\alpha_t} \right)^{\frac{1-\rho}{1-\alpha_t}} \right\}^{\frac{1}{1-\rho}} \tag{3.1}$$

where $C_t$ is consumption, $\bar{C}_t$ is aggregate consumption, $N_t$ is the number of hours worked outside the home, and $E_t$ denotes the expectation operator conditional on information available at date $t$. The term $\bar{C}_{t-1}$ allows the period utility function to potentially include external habit formation. The level of technology, $Z_t$, may also affect household utility in order to ensure balanced growth (as in Rudebusch and Swanson, 2010). $B_t$ is an exogenous shock to the household’s rate of time preference.

The choice of exactly how to specify this preference shock is not trivial. The goal is to generate variation in consumption demand conditional on the level of interest rates. However, because $B_t$ enters the value function, it may also affect the level of $V_t$, and hence asset prices. The specification (3.1) has the feature that if period utility, $U(C_t, \bar{C}_{t-1}, N_t, Z_t)$, is constant over time, then a change in $B_t$ will have no effect on $V_t$. So in some sense it purely affects the relative preference for consumption today versus in the future, as opposed to also affecting the household’s overall level of welfare.¹ This specification thus imposes the restriction that intertemporal preference shocks are per se unpriced (in the sense that if they have no effect on consumption or leisure, they have no effect on the pricing kernel) since they have no direct effect on the level of welfare.
¹ Variance decompositions for the estimated model reported below confirm that shocks to $B_t$ have essentially no effect on the pricing kernel.
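The neutrality property of the preference shock in (3.1) can be checked numerically. The following Python sketch evaluates one step of the recursion; the function names and the parameter values (β = 0.99, ρ = 0.5, α = 5, and the two-state distribution) are illustrative assumptions, not the calibration or code used in this chapter.

```python
import numpy as np

def certainty_equivalent(v_next, probs, alpha):
    """(E_t V_{t+1}^{1-alpha})^{1/(1-alpha)} over a discrete distribution."""
    v_next = np.asarray(v_next, dtype=float)
    probs = np.asarray(probs, dtype=float)
    return float(np.sum(probs * v_next ** (1.0 - alpha)) ** (1.0 / (1.0 - alpha)))

def value(u, b, ce, beta=0.99, rho=0.5):
    """One step of recursion (3.1) given period utility u, shock b,
    and the certainty equivalent ce of next period's value."""
    return ((1.0 - beta * b) * u + beta * b * ce ** (1.0 - rho)) ** (1.0 / (1.0 - rho))

# Constant period utility: V^(1-rho) = u in every period, so the
# certainty equivalent of next period's value is u^(1/(1-rho)).
u, rho = 2.0, 0.5
ce_const = u ** (1.0 / (1.0 - rho))
v_low_b = value(u, 0.95, ce_const, rho=rho)
v_high_b = value(u, 1.00, ce_const, rho=rho)   # same value: B_t is neutral here

# Risky continuation value: the certainty equivalent lies below the mean of 1.5.
ce_risky = certainty_equivalent([1.0, 2.0], [0.5, 0.5], alpha=5.0)
```

With constant period utility the two values coincide, which is the sense in which the shock only shifts the relative weight on the present and the future.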

The household’s coefficient of relative risk aversion, $\alpha_t$, is allowed to vary over time. Dew-Becker (2011a) motivates variation in $\alpha_t$ by adding a time-varying benchmark to the standard Epstein–Zin certainty equivalent, $E_t (V_{t+1} - H_t)^{1-\alpha}$. When $V_{t+1}$ is close to $H_t$, the household’s effective risk aversion over shocks to $V_{t+1}$ rises. The formulation (3.1) has the advantage that it is log-linear and we do not have to worry about the possibility that $V_{t+1}$ falls below $H_t$. In Dew-Becker (2011a), movements in $\alpha_t$ are connected to movements in the household’s welfare. I loosen that constraint here and allow for independent shocks to risk aversion (equivalently, independent shocks to the habit). Melino and Yang (2003) study a similar specification, but without the emphasis on the habit. Unlike the intertemporal preference shock, since $\alpha_t$ directly affects the level of welfare, shocks to $\alpha_t$ will be per se priced: even if they have no effect on consumption or leisure, they will still affect the pricing kernel through their impact on the level of welfare.

The household’s budget constraint is

$$P_t C_t + P_t I_t + H_t + D_t = (1+i_t) H_{t-1} + W_t N_t + \Pi_t + R_{k,t} u_t \bar{K}_{t-1} - P_t a(u_t) \bar{K}_{t-1} + D_{t-1} \tag{3.2}$$

where $P_t$ is the price of the consumption good, $I_t$ is the expenditure on physical investment, $H_t$ is holdings of one-period nominal bonds, $D_t$ is cash holdings, $i_t$ is the nominally riskless interest rate, $W_t$ is the wage, and $\Pi_t$ represents profits and other lump-sum transfers paid to the household. $R_{k,t}$ is the rental rate on capital, $\bar{K}$ the quantity of capital the household owns, and $u_t$ the fraction it chooses to rent (with associated costs $a(u_t)$). The dynamics of investment and capital accumulation will be discussed in more detail below. For now it is sufficient to note that the household sells labor and capital and allocates the proceeds between consumption and saving.
For the sake of simplicity, I study the so-called cashless economy described in Woodford (2003). The monetary authority is able to control the interest rate because money enters the household’s utility function, but the effect of money on total utility is sufficiently small that we can ignore it when writing $V$ (i.e. we take the limit where the relative importance of money goes to zero). I do not discuss money any further and from now on drop it from the household’s budget constraint.

3.2.2 The stochastic discount factor

In general, the stochastic discount factor under recursive preferences involves transformations of the household’s value function. It is often practically difficult to directly solve for the value function. As usual with Epstein–Zin preferences, it is possible in this setting to obtain an expression for the stochastic discount factor (SDF) involving consumption growth and an asset return. However, the asset whose return enters the SDF is no longer the household’s total wealth portfolio: it is now an asset that pays a dividend depending on the period utility function $U$ and the marginal utility of consumption.

The intertemporal marginal rate of substitution of consumption between neighboring dates is

$$M_{t+1} \equiv \frac{\partial V_t/\partial C_{t+1}}{\partial V_t/\partial C_t} = \beta B_t \frac{1-\beta B_{t+1}}{1-\beta B_t} \frac{U_{C,t+1}}{U_{C,t}} \frac{V_{t+1}^{\rho-\alpha_t}}{\left( E_t V_{t+1}^{1-\alpha_t} \right)^{\frac{\rho-\alpha_t}{1-\alpha_t}}} \tag{3.3}$$

where $U_{C,t} \equiv \partial U_t/\partial C_t$ is the marginal (period) utility of consumption. $M_{t+1}$ denotes the SDF between dates $t$ and $t+1$. In the case where $U_t = C_t^{1-\rho}$ and $B_t$ is constant, $M_{t+1}$ reduces to the usual formula for the SDF when utility depends only on consumption (e.g. Epstein and Zin, 1991). If the (period) marginal utility of consumption depends on labor, then the SDF will be distorted in the usual ways through $U_{C,t+1}/U_{C,t}$. Even if $U_C$ only depends on consumption, though (i.e. if period utility is separable between consumption and leisure), variation in labor will still affect the SDF through $V_{t+1}$: with recursive preferences, it is not generally possible to separate labor supply decisions from asset prices, unlike the case where preferences are separable between consumption and labor and over time.
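The special case mentioned above can be verified directly. The Python sketch below evaluates (3.3) state by state for a discrete distribution and compares it with the familiar Epstein–Zin expression when $U_t = C_t^{1-\rho}$ and $B_t = 1$. The function `sdf`, the two-state distribution, and all numerical values are assumptions made for illustration only.

```python
import numpy as np

def sdf(b_t, b_next, uc_t, uc_next, v_next, probs, alpha_t, beta=0.99, rho=0.5):
    """Equation (3.3), evaluated for each next-period state."""
    v_next = np.asarray(v_next, dtype=float)
    probs = np.asarray(probs, dtype=float)
    # Certainty equivalent (E_t V_{t+1}^{1-alpha_t})^{1/(1-alpha_t)}
    ce = np.sum(probs * v_next ** (1.0 - alpha_t)) ** (1.0 / (1.0 - alpha_t))
    time_pref = beta * b_t * (1.0 - beta * np.asarray(b_next)) / (1.0 - beta * b_t)
    return time_pref * (np.asarray(uc_next) / uc_t) * (v_next / ce) ** (rho - alpha_t)

beta, rho, alpha = 0.99, 0.5, 5.0
c_t, c_next = 1.0, np.array([0.98, 1.03])   # hypothetical consumption levels
v_next = np.array([0.9, 1.1])               # hypothetical continuation values
probs = np.array([0.5, 0.5])

def uc(c):
    """Marginal utility when U = C^(1-rho)."""
    return (1.0 - rho) * c ** (-rho)

m_general = sdf(1.0, 1.0, uc(c_t), uc(c_next), v_next, probs, alpha, beta, rho)

# Familiar Epstein-Zin form: beta * (C'/C)^(-rho) * (V'/CE)^(rho - alpha)
ce = np.sum(probs * v_next ** (1.0 - alpha)) ** (1.0 / (1.0 - alpha))
m_standard = beta * (c_next / c_t) ** (-rho) * (v_next / ce) ** (rho - alpha)
```

The two arrays agree because the time-preference ratio collapses to β and the marginal-utility ratio collapses to consumption growth raised to −ρ.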

3.2.2.1 Substituting in an asset return

Now consider an asset that pays $U_t U_{C,t}^{-1}$ as its dividend in each period. In the usual analysis of Epstein–Zin preferences, one substitutes the return on an asset that pays consumption as its dividend into the SDF. In the present case, dividing period utility, $U_t$, by the marginal utility of consumption intuitively converts $U_t$ from utility units into consumption units.

We now derive the price of a claim to $U_t U_{C,t}^{-1}$. Denote the cum-dividend price of this asset as $W_{U,t}$. The appendix confirms the guess that

$$W_{U,t} = V_t^{1-\rho} B_t^{-1} U_{C,t}^{-1} / (1-\beta B_t) \tag{3.4}$$

and that we can substitute the return on this asset into the SDF to obtain

$$M_{t+1} = \beta^{\frac{1-\alpha_t}{1-\rho}} \left( \frac{U_{C,t+1}}{U_{C,t}} \frac{1-\beta B_{t+1}}{1-\beta B_t} B_t \right)^{\frac{1-\alpha_t}{1-\rho}} R_{U,t+1}^{\frac{\rho-\alpha_t}{1-\rho}} \tag{3.5}$$

where

$$R_{U,t+1} \equiv \frac{W_{U,t+1}}{W_{U,t} - U_t U_{C,t}^{-1}} \tag{3.6}$$

3.2.3 Period utility

The period utility function, $U(C_t, \bar{C}_{t-1}, N_t, Z_t)$, is motivated as a reduced form of a model of household production as in Rudebusch and Swanson (2010). Suppose households have power utility over both market goods and goods produced at home,

$$U_t = \frac{\left( C_t^{\eta} \bar{C}_{t-1}^{1-\eta} \right)^{1-\rho}}{1-\rho} + \varphi_1 \frac{C_{H,t}^{1-\rho}}{1-\rho} \tag{3.7}$$

where $C_{H,t}$ is consumption of the home good. Households do not derive utility directly from leisure, but rather from what they are able to produce in their non-market-work time (as in Campbell and Ludvigson, 2001). The home production function is $Z_t N_{H,t}^{\alpha_H}$, for hours worked at home $N_{H,t}$ and a coefficient $0 < \alpha_H < 1$. The level of labor-neutral technology in the economy is assumed to be equal (up to a constant of proportionality) in the home and market production sectors.²

² Note that in the household sector, an exogenous shift in $Z_t$, all else equal, raises output one-for-one, whereas below we will see that in the market sector it will raise output less than proportionally. The reason is that in the market sector, an increase in $Z_t$ also leads to an identical increase in the size of the capital stock. So, ultimately, the marginal product of labor in both sectors is proportional to $Z_t$. One way to rationalize this slight elision would be if the household accumulates durable goods at home that aid household production. That feature of the model is left out for simplicity.

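The period utility function can then be written as

$$U_t \equiv \frac{\left( C_t^{\eta} \bar{C}_{t-1}^{1-\eta} \right)^{1-\rho}}{1-\rho} + Z_t^{1-\rho} \varphi_1 \frac{\left( \bar{H} - N_t \right)^{\alpha_H (1-\rho)}}{1-\rho} \tag{3.8}$$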

where $N_{H,t} = \bar{H} - N_t$. $\bar{H}$ denotes the maximum number of hours that the household can work, either at home or in the market, and $N_t$ is market labor. If sleep is part of home production, then $\bar{H}$ might equal 8760 hours for annual data. More generally, though, $\bar{H}$ might be smaller. As a practical matter, $\bar{H}$ affects both the elasticity of utility with respect to market labor and the Frisch elasticity. The three parameters $\varphi_1$, $\bar{H}$, and $\alpha_H$ jointly determine three primary features of household behavior: hours worked, the Frisch elasticity, and the elasticity of utility with respect to market labor.

The first term in (3.8) gives the utility that comes from consumption. The household has power utility over a Cobb–Douglas aggregate of current and (aggregate) past consumption. This formulation differs from the standard recent implementation in the macro literature in that I assume a multiplicative instead of additive habit. Campbell and Cochrane (1999) show that an additive habit can induce time-varying risk aversion, whereas the multiplicative habit will have no effect on risk aversion; that way, variation in risk preferences is driven purely by $\alpha_t$. The key feature of the additive habit is simply that the marginal utility of current consumption is increasing in last period’s consumption, which induces consumers to try to smooth consumption growth, as observed in the data. To obtain that result in this setting (assuming $0 < \eta < 1$), we need $\rho < 1$.
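The condition $\rho < 1$ can be illustrated with a short Python sketch of (3.8). The default parameter values below (η = 0.7, ρ = 0.5, and so on) are placeholders chosen for the example, not estimates from this chapter, and the function names are hypothetical.

```python
def period_utility(c, c_lag, n, z, eta=0.7, rho=0.5, phi1=1.0, h_bar=1.0, alpha_h=0.5):
    """Period utility (3.8): consumption aggregate plus home production."""
    cons = (c ** eta * c_lag ** (1.0 - eta)) ** (1.0 - rho) / (1.0 - rho)
    home = z ** (1.0 - rho) * phi1 * (h_bar - n) ** (alpha_h * (1.0 - rho)) / (1.0 - rho)
    return cons + home

def marginal_utility_c(c, c_lag, eta=0.7, rho=0.5):
    """Analytic dU/dC implied by the first term of (3.8)."""
    return eta * c ** (eta * (1.0 - rho) - 1.0) * c_lag ** ((1.0 - eta) * (1.0 - rho))

# Central finite-difference check of the analytic derivative at C = C_lag = 1.
h = 1e-6
fd = (period_utility(1.0 + h, 1.0, 0.3, 1.0) - period_utility(1.0 - h, 1.0, 0.3, 1.0)) / (2 * h)

# Effect of higher lagged aggregate consumption on marginal utility.
rises_when_rho_below_one = marginal_utility_c(1.0, 1.1, rho=0.5) > marginal_utility_c(1.0, 1.0, rho=0.5)
falls_when_rho_above_one = marginal_utility_c(1.0, 1.1, rho=2.0) < marginal_utility_c(1.0, 1.0, rho=2.0)
```

The exponent on lagged consumption in the marginal utility is $(1-\eta)(1-\rho)$, so its sign, and hence the direction of the habit effect, is governed by whether ρ is below or above one.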

3.3 Aggregate supply
For the supply side of the model, I follow exactly the setup in Justiniano, Primiceri, and Tambalotti (JPT; 2010). JPT is a standard medium-scale New-Keynesian model. It has 7 fundamental shocks: price and wage markups, labor-augmenting technical change, investment-specific productivity, monetary policy, discount rates ($B_t$), and government spending. In JPT’s formulation, the monetary authority’s inflation target is constant. I allow it to vary to help match the movements in the long end of the yield curve. Other than that and the preference specification, my model is identical to theirs.

The model is also highly similar to Smets and Wouters (SW; 2003). The critical difference between the present setup and SW is that technology is difference-stationary rather than trend-stationary, where the former is standard in the production-based asset pricing literature.³ The difference-stationarity assumption helps generate large risk premia: when technology is trend-stationary, there is very little overall risk in the economy, so households must have an implausibly high coefficient of relative risk aversion in order to generate realistic asset prices.⁴

Since the model is standard and laid out in JPT and the main contribution of this paper is the preference specification and bond pricing, the remainder of this section gives a relatively short description of the production setup. The appendix gives a full derivation of the model, and the reader is referred to JPT for a more detailed analysis. My description follows theirs closely.

3.3.1 Producers of physical goods

Final-good producers are competitive in both input and output markets and have a CES production function,

$$Y_t = \left[ \int_0^1 Y_t(i)^{\frac{1}{1+\lambda_{p,t}}} \, di \right]^{1+\lambda_{p,t}} \tag{3.9}$$

where $i$ indexes the types of intermediate goods, $Y_t$ is output of the final good, which can be used for either consumption or investment, $Y_t(i)$ is the use of intermediate of type $i$, and the elasticity of substitution across the intermediates, which determines markups in the intermediate-goods sector, varies over time. Intermediate-good producers are monopolists for their own goods with production function

$$Y_t(i) = \max\left\{ K_t(i)^{\gamma} Z_t^{1-\gamma} N_t(i)^{1-\gamma} - Z_t \bar{F},\; 0 \right\} \tag{3.10}$$

³ A difference-stationary process has first-differences that follow a stationary process, so it is integrated of order one. A trend-stationary process, on the other hand, is a process that has random stationary deviations around a non-stochastic trend (where the trend is generally unmodeled and taken as exogenous).

⁴ Below, I estimate average risk aversion to be 18.7 (ignoring the correction from Swanson, 2011). Rudebusch and Swanson (2011), who use stationary technology (with a slightly different preference specification), choose an analogous parameter to be 149.

where $\bar{F}$ is a fixed cost of production that ensures that profits are zero in steady state. $K_t(i)$ and $N_t(i)$ are intermediate-good producer $i$’s purchases of capital and labor services, and $Z_t$ is the level of labor-augmenting technology.

3.3.2 Price setting

We assume Calvo pricing. In every period, a fraction $1 - \xi_p$ of intermediate good producers can change their prices, while the remainder index their prices following the rule,

$$P_t(i) = P_{t-1}(i)\, \pi_{t-1}^{\iota_p} \pi^{1-\iota_p} \tag{3.11}$$

where $P_t(i)$ is the price of good $i$ in terms of the numeraire, $\pi_t \equiv P_t/P_{t-1}$ is aggregate inflation, and

$$P_t = \left[ \int_0^1 P_t(i)^{\lambda_{p,t}^{-1}} \, di \right]^{\lambda_{p,t}} \tag{3.12}$$

is the aggregate price index (equal to the marginal cost of a unit of the final good). $\pi$ is the steady-state inflation rate, and the parameter $\iota_p$ determines the degree of indexation to lagged versus average inflation.

The firms that can choose their prices freely in a given period set them to maximize the present discounted value of profits over the period before they are allowed to choose a new price,

$$E_t \sum_{s=0}^{\infty} \xi_p^s M_{t,t+s} \left[ P_t(i) \left( \prod_{k=1}^{s} \pi_{t+k-1}^{\iota_p} \pi^{1-\iota_p} \right) Y_{t+s}(i) - W_{t+s} N_{t+s}(i) - R_{k,t+s} K_{t+s} \right] \tag{3.13}$$

where $M_{t,t+s} \equiv \prod_{j=1}^{s} M_{t+j}$, $W_{t+s}$ is the wage rate, and $R_{k,t+s}$ is the rental rate for capital.

3.3.3 Employment agencies and wage setting

Each household is a monopolistic supplier of specialized labor, $N_t(j)$. Competitive employment agencies aggregate labor supply into a homogeneous labor input (just as the final good producers aggregate intermediate goods) with the production function,

$$N_t = \left[ \int_0^1 N_t(j)^{(1+\lambda_{w,t})^{-1}} \, dj \right]^{1+\lambda_{w,t}} \tag{3.14}$$

where, as with prices, $\lambda_{w,t}$ determines the elasticity of demand and hence markups in the labor market. $\lambda_{w,t}$ acts as a labor-supply shock. Since the employment agencies are competitive, the price of a unit of the homogeneous labor input is

$$W_t = \left[ \int_0^1 W_t(j)^{\lambda_{w,t}^{-1}} \, dj \right]^{\lambda_{w,t}} \tag{3.15}$$

The labor demand function is then

$$N_t(j) = \left( \frac{W_t(j)}{W_t} \right)^{-\frac{1+\lambda_{w,t}}{\lambda_{w,t}}} N_t \tag{3.16}$$

As with prices, wages can only be changed intermittently, with probability $(1 - \xi_w)$. If a household cannot change its wage, it indexes according to the rule

$$W_t(j) = W_{t-1}(j) \left( \pi_{t-1} \frac{Z_{t-1}}{Z_{t-2}} \right)^{\iota_w} \left( \pi \exp(\gamma) \right)^{1-\iota_w} \tag{3.17}$$

where γ is the average growth rate of technology. The household will choose its wage in a manner similar to how the intermediate-good ﬁrms set prices: it maximizes expected utility over the period that the wage will remain unchanged.

3.3.4 Capital and investment

Intermediate-good firms rent capital from the households at rate $R_{k,t}$. Households own a stock of capital $\bar{K}_t$ and choose a utilization rate $u_t$. The effective quantity of capital rented to firms in period $t$ is

$$K_t = u_t \bar{K}_t \tag{3.18}$$

The household pays a cost of utilization $a(u_t)$ per unit of capital, with $u = 1$ in steady state, $a(1) = 0$ and $\chi \equiv a''(1)/a'(1)$.⁵ Households accumulate capital according to the rule,

$$\bar{K}_t = (1-\delta) \bar{K}_{t-1} + \mu_t \left( 1 - S\!\left( \frac{I_t}{I_{t-1}} \right) \right) I_t \tag{3.19}$$

where $\delta$ is the depreciation rate and the function $S$ incorporates adjustment costs in the rate of investment. In steady state, $S = S' = 0$ and $S'' > 0$. $\mu_t$ is a shock to the cost of investment at date $t$.
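As a concrete illustration of (3.19), the Python sketch below uses a quadratic adjustment-cost function. The text only restricts $S$ and its derivatives in steady state, so the quadratic form, the abstraction from trend growth (steady-state investment growth of one), and the values of δ and the curvature parameter `s2` are all assumptions of this example.

```python
def adjustment_cost(growth, s2=2.0):
    """Assumed quadratic S: S(1) = S'(1) = 0 and S''(1) = s2 > 0.
    Trend growth is ignored, so steady-state investment growth equals 1."""
    return 0.5 * s2 * (growth - 1.0) ** 2

def next_capital(k_prev, inv, inv_prev, mu=1.0, delta=0.025, s2=2.0):
    """Capital accumulation rule (3.19)."""
    return (1.0 - delta) * k_prev + mu * (1.0 - adjustment_cost(inv / inv_prev, s2)) * inv

# Steady state: investment just replaces depreciation, so capital is unchanged.
k_steady = next_capital(10.0, 0.25, 0.25)

# A jump in investment is partly lost to adjustment costs.
k_jump = next_capital(10.0, 0.30, 0.25)

# A favorable investment shock (mu > 1) turns the same spending into more capital.
k_shock = next_capital(10.0, 0.25, 0.25, mu=1.1)
```

In this sketch a 20 percent jump in investment loses 4 percent of the new investment to adjustment costs, while the shock $\mu_t$ scales how efficiently investment spending becomes installed capital.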

3.3.5 Government policy

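The central bank follows a Taylor rule taking the form

$$R_t = R \left( \frac{R_{t-1}}{R} \right)^{\rho_R} \left[ \pi_t^* \left( \frac{\pi_t}{\pi_t^*} \right)^{\phi_\pi} \left( \frac{X_t}{X_t^*} \right)^{\phi_X} \right]^{1-\rho_R} \left[ \frac{X_t/X_{t-1}}{X_t^*/X_{t-1}^*} \right]^{\phi_{dX}} \eta_{mp,t} \tag{3.20}$$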

where $R_t$ is the gross nominal interest rate, $R$ is its steady-state value, $X_t$ is total output, $X_t^*$ is the level of output that would prevail if prices had always been flexible, and $\pi_t^*$ is the inflation target at date $t$. The central bank is allowed to respond to both the level and change in the output gap. This flexibility helps ensure the model can match the dynamics of short-term interest rates, which is critical for capturing the dynamics of the term structure. $\eta_{mp,t}$ is an exogenous monetary policy shock.

$\pi_t^*$ is a time-varying inflation target, which can potentially help match the high inflation and long-term interest rates seen in the early part of the sample. More generally, $\pi_t^*$ induces a level factor in the term structure.

The government finances public spending by selling single-period bonds. Government expenditures, $G_t$, are a time-varying fraction of total output,

$$G_t = \left( 1 - \frac{1}{g_t} \right) Y_t \tag{3.21}$$

where $g_t$ follows an exogenous process defined below. Households receive no utility from government expenditures. As long as the share of output consumed by the government is stationary, that assumption will have minimal effects on asset prices.

⁵ As usual, in the log-linear approximation, the conditions on the first and second derivatives in steady state are sufficient to describe the dynamics of the model.

3.3.6 Market clearing

The aggregate resource constraint is

$$C_t + I_t + G_t + a(u_t) \bar{K}_{t-1} = Y_t \tag{3.22}$$

3.3.7 Exogenous processes

The price and wage markup shocks follow ARMA(1,1) processes,

$$\log(1+\lambda_{p,t}) = (1-\rho_p) \log(1+\lambda_p) + \rho_p \log(1+\lambda_{p,t-1}) + \varepsilon_{p,t} - \theta_p \varepsilon_{p,t-1} \tag{3.23}$$

$$\log(1+\lambda_{w,t}) = (1-\rho_w) \log(1+\lambda_w) + \rho_w \log(1+\lambda_{w,t-1}) + \varepsilon_{w,t} - \theta_w \varepsilon_{w,t-1} \tag{3.24}$$

where $\varepsilon_{p,t} \sim N(0, \sigma_p^2)$ and $\varepsilon_{w,t} \sim N(0, \sigma_w^2)$. The ARMA(1,1) form potentially helps match both the high and low-frequency features of inflation.

Productivity has a unit root and its growth rate follows an AR(1) process,

$$\Delta z_t = (1-\rho_z) \bar{z} + \rho_z \Delta z_{t-1} + \varepsilon_{z,t} \tag{3.25}$$

where $\varepsilon_{z,t} \sim N(0, \sigma_z^2)$. The AR(1) setup potentially allows the model to incorporate the long-run risks studied by Bansal and Yaron (2004).

The level of investment-specific productivity is assumed to be a stationary AR(1) process,

$$\log \mu_t = \rho_\mu \log \mu_{t-1} + \varepsilon_{\mu,t} \tag{3.26}$$

where $\varepsilon_{\mu,t} \sim N(0, \sigma_\mu^2)$. Note that $\mu_t$ simply determines the efficiency of the transformation of the final output good into the investment good, so investment still benefits from the unit-root innovations to $Z_t$.

The government’s share of output, the monetary policy shock, and the time-preference shock follow AR(1) processes,

$$\log g_t = (1-\rho_g) \log g + \rho_g \log g_{t-1} + \varepsilon_{g,t} \tag{3.27}$$

$$\eta_{mp,t} = \rho_{mp} \eta_{mp,t-1} + \varepsilon_{mp,t} \tag{3.28}$$

$$\log B_t = \rho_b \log B_{t-1} + \varepsilon_{b,t} \tag{3.29}$$

where $\varepsilon_{g,t} \sim N(0, \sigma_g^2)$, $\varepsilon_{mp,t} \sim N(0, \sigma_{mp}^2)$, and $\varepsilon_{b,t} \sim N(0, \sigma_b^2)$.
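The processes above are straightforward to simulate. The Python sketch below implements the ARMA(1,1) recursion of (3.23)–(3.24) and treats the AR(1) processes as the special case θ = 0. The initial conditions, the persistence and moving-average values, and the innovation standard deviation are hypothetical choices for illustration, not the estimates reported later.

```python
import numpy as np

def arma11(eps, rho, theta, mean=0.0):
    """x_t = (1 - rho)*mean + rho*x_{t-1} + eps_t - theta*eps_{t-1}, as in (3.23)-(3.24).
    Starts from x_{-1} = mean and eps_{-1} = 0 (an assumption of this sketch)."""
    x = np.empty(len(eps))
    x_prev, e_prev = mean, 0.0
    for t, e in enumerate(eps):
        x[t] = (1.0 - rho) * mean + rho * x_prev + e - theta * e_prev
        x_prev, e_prev = x[t], e
    return x

def ar1(eps, rho, mean=0.0):
    """AR(1) processes such as (3.25)-(3.29) are the theta = 0 case."""
    return arma11(eps, rho, 0.0, mean)

rng = np.random.default_rng(0)
shocks = 0.01 * rng.standard_normal(200)        # hypothetical innovation s.d.
log_markup = arma11(shocks, rho=0.9, theta=0.7, mean=np.log(1.2))
dz = ar1(shocks, rho=0.3, mean=0.005)           # technology growth as in (3.25)
log_z = np.cumsum(dz)                           # unit root in the level of technology
```

When ρ and θ are close, the autoregressive and moving-average parts nearly cancel at high frequencies while a small persistent component remains, which is one way to see how the ARMA(1,1) form can speak to both high- and low-frequency movements.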

The two exogenous processes that are added to JPT’s original model are the inﬂation target and risk aversion. As in Dew-Becker (2011a), I allow the innovations to risk aversion to be correlated with ε z,t . Intuitively, this means that risk aversion depends on innovations to the permanent component of consumption. There are also exogenous innovations to risk aversion. We thus have

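$$\alpha_t = \rho_\alpha \alpha_{t-1} + (1-\rho_\alpha) \bar{\alpha} + \theta_{\alpha,z} \varepsilon_{z,t} + \varepsilon_{\alpha,t} \tag{3.30}$$

with $\varepsilon_{\alpha,t} \sim N(0, \hat{\sigma}_\alpha^2)$.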

While a number of recent papers have studied models with time-varying inﬂation targets (e.g. Gurkaynak, Sack, and Swanson, 2005; Doh, 2010), there is little understanding of what actually drives the inﬂation target. Because the inﬂation target has a very strong impact on long-term bond prices, the relationship between the inﬂation target and the other innovations is a key determinant of the prices of long-term bonds. I therefore consider a loose speciﬁcation where innovations to the inﬂation target may be correlated with all of the other fundamental shocks (excluding risk aversion). We thus have
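$$\log \pi_t^* = (1-\rho_\pi) \log \pi + \rho_\pi \log \pi_{t-1}^* + \theta_{\pi^*,g} \varepsilon_{g,t} + \theta_{\pi^*,z} \varepsilon_{z,t} + \theta_{\pi^*,p} \varepsilon_{p,t} + \theta_{\pi^*,w} \varepsilon_{w,t} + \theta_{\pi^*,b} \varepsilon_{b,t} + \theta_{\pi^*,\mu} \varepsilon_{\mu,t} + \theta_{\pi^*,mp} \varepsilon_{mp,t} + \varepsilon_{\pi^*,t} \tag{3.31}$$

with $\varepsilon_{\pi^*,t} \sim N(0, \hat{\sigma}_{\pi^*}^2)$. All of the shocks $\varepsilon$ are also assumed to be independent.

The $\theta$ parameters in equations (3.30) and (3.31) are somewhat difficult to interpret and choose priors for. I therefore transform these parameters so that they can be interpreted as variance shares. Define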
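$$\sigma_\alpha^2 \equiv \theta_{\alpha,z}^2 \sigma_z^2 + \hat{\sigma}_\alpha^2 \tag{3.32}$$

$$\sigma_{\pi^*}^2 \equiv \theta_{\pi^*,g}^2 \sigma_g^2 + \theta_{\pi^*,z}^2 \sigma_z^2 + \theta_{\pi^*,p}^2 \sigma_p^2 + \theta_{\pi^*,w}^2 \sigma_w^2 + \theta_{\pi^*,b}^2 \sigma_b^2 + \theta_{\pi^*,\mu}^2 \sigma_\mu^2 + \theta_{\pi^*,mp}^2 \sigma_{mp}^2 + \hat{\sigma}_{\pi^*}^2 \tag{3.33}$$

$\sigma_\alpha^2$ and $\sigma_{\pi^*}^2$ are the variances of the total innovations to risk aversion and the inflation target, respectively. Next, define

$$\sigma_{\alpha,z} \equiv \operatorname{sign}(\theta_{\alpha,z}) \frac{\theta_{\alpha,z}^2 \sigma_z^2}{\sigma_\alpha^2} \tag{3.34}$$

$\sigma_{\alpha,z}$ is the share of the total variance of the innovations to risk aversion that is accounted for by labor-neutral technology shocks. The sign of $\sigma_{\alpha,z}$ determines whether the effect of technology shocks on risk aversion is positive or negative. Similarly, for $\pi_t^*$ we can define

$$\sigma_{X,z} = \operatorname{sign}(\theta_{\pi^*,X}) \frac{\theta_{\pi^*,X}^2 \sigma_X^2}{\sigma_{\pi^*}^2} \tag{3.35}$$

for $X \in \{g, z, p, w, b, \mu, mp\}$. The parameters $\sigma_{\alpha,z}$, $\sigma_{X,z}$, $\sigma_\alpha^2$ and $\sigma_{\pi^*}^2$ map uniquely into the original parameters $\theta_{\alpha,z}$, $\theta_{\pi^*,X}$, $\hat{\sigma}_\alpha^2$, and $\hat{\sigma}_{\pi^*}^2$ and are more easily interpreted as they represent variances and signed variance shares.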
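The one-to-one mapping between the two parameterizations can be written out explicitly. The Python sketch below converts loadings and standard deviations into a total innovation variance and signed variance shares, following (3.32)–(3.35), and then inverts the mapping. The function names and the numerical inputs are hypothetical.

```python
import numpy as np

def to_shares(theta, sigma, sigma_hat):
    """Map loadings theta, shock s.d.s sigma, and own-innovation s.d. sigma_hat
    into the total variance and signed variance shares, as in (3.32)-(3.35)."""
    theta = np.atleast_1d(theta).astype(float)
    sigma = np.atleast_1d(sigma).astype(float)
    contrib = theta ** 2 * sigma ** 2
    total_var = contrib.sum() + sigma_hat ** 2
    return total_var, np.sign(theta) * contrib / total_var

def from_shares(total_var, shares, sigma):
    """Invert to_shares: recover theta and sigma_hat from variances and signed shares."""
    shares = np.atleast_1d(shares).astype(float)
    sigma = np.atleast_1d(sigma).astype(float)
    theta = np.sign(shares) * np.sqrt(np.abs(shares) * total_var) / sigma
    sigma_hat = np.sqrt(total_var * (1.0 - np.abs(shares).sum()))
    return theta, sigma_hat

# Two hypothetical shocks loading on the inflation target.
theta0 = np.array([0.5, -0.2])
sigma0 = np.array([0.01, 0.02])
total_var, shares = to_shares(theta0, sigma0, sigma_hat=0.003)
theta1, sigma_hat1 = from_shares(total_var, shares, sigma0)
```

Because the absolute shares must sum to at most one, the transformed parameters live on a bounded set, which is what makes uniform priors over them natural.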

3.4 Model solution
The standard method for approximating models of the form studied here is perturbation. The drawback of perturbation methods for our purposes is that if we want time-variation in risk aversion to have any effect on the dynamics of the model, we need to take a third-order approximation to the model. Since the solution would be non-linear, we would have to use the particle filter or some other nonlinear method in order to calculate the marginal likelihood of the model. I have found, though, that it is in general very difficult to find the peak of the likelihood function for this model, and it would be infeasible with a method as slow as the particle filter. This is a common problem in models of the term structure (e.g. Ang and Piazzesi, 2003; Hamilton and Wu, 2011).

I therefore use the essentially affine approximation method described in Dew-Becker (2011b). The essentially affine method delivers an approximation to the equilibrium dynamics of the model that is linear in the state variables but still allows time-varying risk aversion to affect the behavior of the endogenous variables. Dew-Becker (2011b) describes the method in detail and shows that Euler equation errors in simulated models are competitive with third-order perturbations. Local to the non-stochastic steady-state, the essentially affine approximation is as accurate as a first-order perturbation (in a Taylor sense), and hence less accurate than higher-order perturbations. However, in a stochastic setting, it performs well.

This section gives a short overview of the method, and the appendix provides further details. Denote the vector of the variables in the model (including the exogenous processes) as $X_t$ and the vector of fundamental shocks as $\varepsilon_t \equiv (\varepsilon_{mp,t}, \varepsilon_{z,t}, \varepsilon_{b,t}, \varepsilon_{\mu,t}, \varepsilon_{g,t}, \varepsilon_{p,t}, \varepsilon_{w,t}, \varepsilon_{\alpha,t}, \varepsilon_{\pi^*,t})$. The equations determining the equilibrium of the model take the form

$$0 = G(X_t, X_{t+1}, \sigma \varepsilon_{t+1}) \tag{3.36}$$

where the expectation operator may appear in the function $G$. There is one equation for each variable. $\sigma$ is a parameter controlling the variance of the shocks. We will approximate around the point $\sigma = 0$, with the non-stochastic steady-state defined as the point $\bar{X}$ such that

$$0 = G(\bar{X}, \bar{X}, 0)$$

The equations $G$ can be divided into two types: those that do not involve taking expectations over the SDF and those that do.

$$G(X_t, X_{t+1}, \sigma \varepsilon_{t+1}) = \begin{bmatrix} D(X_t, X_{t+1}, \sigma \varepsilon_{t+1}) \\ E_t\left[ M(X_t, X_{t+1}, \sigma \varepsilon_{t+1})\, F(X_t, X_{t+1}, \sigma \varepsilon_{t+1}) \right] \end{bmatrix} \tag{3.37}$$

where $D$ and $F$ are vector-valued functions and $M$ is the (scalar-valued) stochastic discount factor.⁶ For the equations that do not involve the SDF, I use standard perturbation methods and simply take a log-linear approximation. We approximate $D$ as

$$0 = \log\left( D(\exp(x_t), \exp(x_{t+1}), \sigma \varepsilon_{t+1}) + 1 \right) \tag{3.38}$$

$$0 \approx d_0 + d_x \hat{x}_t + d_{x'} \hat{x}_{t+1} + d_\varepsilon \sigma \varepsilon_{t+1} \tag{3.39}$$

where the terms $d_0$, $d_x$, $d_{x'}$, and $d_\varepsilon$ are coefficients from a Taylor approximation and

$$x_t \equiv \log X_t, \qquad \hat{x}_t \equiv \log X_t - \log \bar{X}$$
$D$ will include equations such as budget constraints. The second set of equations is dynamic and involves expectations. In many economic models, including the present one, equations involving expectations take the form

$$1 = E_t\left[ M(X_t, X_{t+1}, \sigma \varepsilon_{t+1})\, F(X_t, X_{t+1}, \sigma \varepsilon_{t+1}) \right] \tag{3.40}$$

where $M(X_t, X_{t+1}, \sigma \varepsilon_{t+1})$ is the stochastic discount factor induced by the household’s intertemporal optimization condition. The key source of non-linearity in the model is the time-variation in risk aversion, which induces heteroskedasticity in the SDF. It is therefore natural to deal with $M$ and $F$ separately to isolate the relevant non-linearity. I now show that if we log-linearize $F$, we can transform (3.40) into a linear condition that can be solved alongside the remaining equations. $M(X_t, X_{t+1}, \sigma \varepsilon_{t+1})$ will not even be log-linear in the state variables, but we will be able to state the equilibrium conditions as a set of linear expectational difference equations.

⁶ Note that this formulation does not actually restrict $F$. Specifically, suppose there were a set of equilibrium conditions $1 = E_t h(X_t, X_{t+1}, \sigma \varepsilon_{t+1})$, i.e. conditions that do not involve the SDF. We could simply say that $F(X_t, X_{t+1}, \sigma \varepsilon_{t+1}) \equiv h(X_t, X_{t+1}, \sigma \varepsilon_{t+1}) / M(X_t, X_{t+1}, \sigma \varepsilon_{t+1})$.

First, guess that the equilibrium dynamics of the model take the form

$$\hat{x}_{t+1} = C + \Phi \hat{x}_t + \Psi \varepsilon_{t+1} \tag{3.41}$$

We confirm in the end that the solution is actually in this form.

The next step then is to take log-linear approximations to $M$ and $F$ separately. Log-linearizing $F$ is straightforward, and we obtain,

$$\log F(x_t, x_{t+1}, \sigma \varepsilon_{t+1}) \approx f_0 + f_x \hat{x}_t + f_{x'} \hat{x}_{t+1} + f_\varepsilon \sigma \varepsilon_{t+1} \tag{3.42}$$

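For $M$, in the case of the preferences laid out in section 3.2, the appendix shows that it is possible to derive a first-order accurate expression of the form

$$m_{t+1}^{(1)} = m_0 + m_x \hat{x}_t + (\kappa_0 + \alpha_t \kappa_1) \sigma \varepsilon_{t+1} - \frac{1}{2} \sigma^2 \alpha_t^2 \kappa_1 \Sigma \kappa_1' \tag{3.43}$$

where $\Sigma$ is the variance matrix of $\varepsilon_t$. The superscript (1) indicates that $m_{t+1}^{(1)}$ is first-order accurate for the true SDF. (3.43) is the essentially affine form from Duffee (2002). Taking the expectation of the approximated Euler equation yields,

$$0 = \log E_t \exp\left( m_0 + m_x \hat{x}_t + (\kappa_0 + \alpha_t \kappa_1) \sigma \varepsilon_{t+1} - \frac{1}{2} \alpha_t^2 \sigma^2 \kappa_1 \Sigma \kappa_1' + f_0 + f_x \hat{x}_t + f_{x'} \hat{x}_{t+1} + f_\varepsilon \sigma \varepsilon_{t+1} \right)$$

$$0 = m_0 + m_x \hat{x}_t + f_0 + f_x \hat{x}_t + f_{x'} (C + \Phi \hat{x}_t) + \frac{1}{2} \sigma^2 (f_{x'} + f_\varepsilon) \Psi \Sigma \Psi' (f_{x'} + f_\varepsilon)' + \alpha_t \sigma^2 \kappa_1 \Sigma \Psi' (f_{x'} + f_\varepsilon)'$$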

Since every equation in the system is now linear in the variables of the model, we can solve the system for the parameters $\Phi$ and $\Psi$ from (3.41). Specifically, we solve the following system,

$$0 = d_0 + d_x \hat{x}_t + d_{x'} \hat{x}_{t+1} + d_\varepsilon \sigma \varepsilon_{t+1} \tag{3.44}$$

$$0 = m_0 + m_x \hat{x}_t + f_0 + f_x \hat{x}_t + f_{x'} (C + \Phi \hat{x}_t) + \frac{1}{2} \sigma^2 (f_{x'} + f_\varepsilon) \Psi \Sigma \Psi' (f_{x'} + f_\varepsilon)' + \alpha_t \sigma^2 \kappa_1 \Sigma \Psi' (f_{x'} + f_\varepsilon)' \tag{3.45}$$

at the point $\sigma = 1$. The reason that the essentially affine SDF is useful is that the expectation in (3.45) will be linear in the state variables, so we have a simple linear system to solve. This system can be solved through, for example, Sims’ (2001) Gensys algorithm.

Dew-Becker (2011b) shows that the transition function for the model obtained through the essentially affine method is first-order accurate for the true transition function and first-order equivalent to a first-order perturbation. Clearly, though, the approximation includes higher-order terms that account for movements in risk aversion. $\alpha_t$ will affect not only asset prices but also the dynamics of real variables. Dew-Becker (2011b) calibrates a simple version of the RBC model with time-varying risk aversion and finds that the essentially affine approximation has accuracy between that of second and third-order perturbations. Standard results derived in the appendix also deliver real and nominal zero-coupon bond prices.
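The linearity of the pricing condition in $\alpha_t$ rests on a standard property of Gaussian shocks: $\log E \exp(b'\varepsilon) = \frac{1}{2} b' \Sigma b$ for $\varepsilon \sim N(0, \Sigma)$. The Python sketch below illustrates the cancellation in a stripped-down setting. It assumes $\sigma = 1$, sets $\kappa_0 = 0$, and collapses the payoff's total exposure to the shocks into a single hypothetical vector `g`; the numerical values are arbitrary.

```python
import numpy as np

def log_expectation(alpha, kappa1, g, Sigma):
    """log E exp(lam'eps - 0.5*lam'Sigma lam + g'eps) for eps ~ N(0, Sigma),
    with price of risk lam = alpha * kappa1 (kappa0 = 0 assumed here).
    Uses log E exp(b'eps) = 0.5 * b'Sigma b for Gaussian eps."""
    lam = alpha * kappa1
    b = lam + g
    return 0.5 * b @ Sigma @ b - 0.5 * lam @ Sigma @ lam

Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])   # hypothetical shock covariance
kappa1 = np.array([-0.5, 0.2])                   # hypothetical SDF loadings
g = np.array([1.0, 0.3])                         # hypothetical payoff exposure

alphas = np.array([2.0, 10.0, 18.0])
vals = np.array([log_expectation(a, kappa1, g, Sigma) for a in alphas])

# The alpha^2 terms cancel, leaving a Jensen term plus a risk premium linear in alpha.
closed_form = 0.5 * g @ Sigma @ g + alphas * (kappa1 @ Sigma @ g)
```

The quadratic correction in the SDF is exactly what removes the $\alpha_t^2$ term from the expectation, so risk aversion enters the remaining condition only through a covariance term that is linear in $\alpha_t$.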

3.5 Empirics
I estimate the model using standard Bayesian methods. The observable data are the same as in JPT, but with bond prices added. Both real variables and bond prices are linear functions of the underlying state variables contained in the vector $x_t$, so we can write the model in state-space form and evaluate the likelihood using the Kalman filter. I proceed by finding the posterior mode and running a Monte Carlo chain from that point to sample from the full posterior distribution. The appendix describes the details of the estimation.
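For readers unfamiliar with the likelihood calculation, the following Python sketch shows a textbook Kalman filter for a linear Gaussian state-space model of the kind described here, with the transition taken from a law of motion like (3.41) and $Q = \Psi \Sigma \Psi'$. It is a generic illustration under those assumptions, not the estimation code used for this chapter; the function name, argument layout, and the toy inputs are hypothetical.

```python
import numpy as np

def kalman_loglik(y, c, Phi, Q, a, H, R, x0, P0):
    """Gaussian log-likelihood of y (T x n_y) for the state-space model
        x_{t+1} = c + Phi x_t + w_{t+1},  w ~ N(0, Q)
        y_t     = a + H x_t + v_t,        v ~ N(0, R)
    x0, P0: mean and covariance of the state before the first observation."""
    x, P = np.asarray(x0, float), np.asarray(P0, float)
    loglik = 0.0
    for y_t in np.asarray(y, float):
        err = y_t - (a + H @ x)                  # one-step-ahead forecast error
        S = H @ P @ H.T + R                      # forecast-error covariance
        S_inv_err = np.linalg.solve(S, err)
        loglik -= 0.5 * (len(err) * np.log(2 * np.pi)
                         + np.linalg.slogdet(S)[1] + err @ S_inv_err)
        K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
        x_filt = x + K @ err                     # update with the observation
        P_filt = P - K @ H @ P
        x = c + Phi @ x_filt                     # predict the next state
        P = Phi @ P_filt @ Phi.T + Q
    return loglik

# Toy check: with Phi = 0 each observation is an independent N(0, 2) draw.
one = np.eye(1)
y_toy = np.array([[0.5], [-1.0]])
ll_toy = kalman_loglik(y_toy, np.zeros(1), np.zeros((1, 1)), one,
                       np.zeros(1), one, one, np.zeros(1), one)
```

Setting the relevant diagonal entries of `R` to zero corresponds to treating a series as observed without error, while positive entries allow for measurement error.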


3.5.1 Data

The sample is 1983q1 to 2004q4. I do not include the financial crisis in the sample because the zero lower bound on nominal interest rates becomes binding, a phenomenon that the model is not designed to capture. The sample begins in 1983 in order to ensure that monetary policy is consistent over the estimation period.

The observable variables are real GDP, consumption, and investment growth, hours worked per capita, wage and price inflation, and yields on three-month, 1, 2, 3, 5, and 10-year Treasury bonds. The 1 through 5-year yields are obtained from the Fama–Bliss CRSP files, the 10-year yield from Gurkaynak, Sack, and Wright (2006), and the three-month yield from the Fama risk-free rate CRSP file. The bond yields and inflation rates are always reported in annualized percentage points, unless otherwise noted.

The real variables are all obtained from the BEA and the BLS. Consumption is defined as expenditures on nondurables and services, while investment is the sum of residential and non-residential fixed investment and consumer durables expenditures. Real wages are calculated as nominal compensation per hour in the non-farm business sector (from the BLS) divided by the GDP deflator. The change in the log GDP deflator is the measure of inflation. Hours worked per capita in the non-farm business sector are obtained from Francis and Ramey (2009) as updated on Valerie Ramey’s website. None of the variables are detrended.

Figure 3.1 plots the data used in the estimation (with the exception of the intermediate-term bond yields). Output, consumption, and investment growth all look stationary over the sample and relatively homoskedastic. Hours worked per capita has a strong upward trend in this sample. Interest rates decline significantly over the sample, even though inflation only declines marginally. The short-term interest rate is substantially more volatile than the long-term rate, and the term spread is clearly countercyclical.
The model has 9 fundamental shocks, but we have 13 observable variables. I follow JPT in assuming that the 6 macro variables plus the short-term interest rate are observed without error. I also assume that the 10-year bond yield is measured without error, which will help identify the inﬂation target. For the remaining bonds, I assume that the yields have i.i.d. measurement errors with identical standard deviations. The standard deviation

113

Figure 3.1: Data series for estimation

2.5 2 1.5 1 0.5 0 -0.51983 -1 -1.5 10 8 6 4 2 0 -21983 -4 -6 -8 3 2.5 2 1.5 1 0.5 0 -0.51983 -1 -1.5 14 12 10 8 6 4 2 0 1983 1988

Real GDP growth

2 1.5 1 0.5

Real consumption growth

1988

1993

1998

2003

0 1983 -0.5 -1 1988 1993 1998 2003

Real investment growth

685 680 675 670

Hours worked per capita

1988

1993

1998

2003

665 660 655 1983 1988 1993 1998 2003

Real wage growth

6 5 4 3 2

Annualized inflation

1988

1993

1998

2003

1 0 1983 1988 1993 1998 2003

1-quarter interest rate, annualized

14 12 10 8 6 4 2 0

10-year interest rate, annualized

1993

1998

2003

1983

1988

1993

1998

2003

Note: No variables are detrended. GDP, consumption, and investment are obtained from the BEA. Compensation per hour, and inflation are obtained from the BLS. Hours worked is obtained from Valerie Ramey's website. The one-quarter yield is the Fama risk-free rate. The ten-year yield is from Gurkaynak, Sack, and Wright (2006).


of these measurement errors is another parameter that will be estimated. The assumption of zero measurement error for the long and short ends of the yield curve forces the model to focus on matching the term spread, while leaving some ﬂexibility in matching curvature.
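The measurement-error assumptions described above can be summarized as a diagonal covariance matrix for the 13 observables. The following is only a minimal sketch: the variable names and their ordering are hypothetical labels chosen for illustration, and the value passed for the standard deviation is simply the posterior mode reported in table 3.1, not something computed here.

```python
import numpy as np

# Hypothetical labels and ordering for the 13 observables (illustration only).
OBSERVABLES = [
    "gdp_growth", "consumption_growth", "investment_growth",
    "hours", "wage_inflation", "price_inflation",
    "y_1q", "y_1y", "y_2y", "y_3y", "y_4y", "y_5y", "y_10y",
]
# Yields assumed to carry i.i.d. measurement error with a common std. dev.
NOISY_YIELDS = {"y_1y", "y_2y", "y_3y", "y_4y", "y_5y"}


def measurement_error_cov(sigma_yields):
    """Diagonal covariance: zero for series observed exactly, sigma^2 otherwise."""
    diag = [sigma_yields ** 2 if name in NOISY_YIELDS else 0.0
            for name in OBSERVABLES]
    return np.diag(diag)


# 8.20 basis points is the posterior mode for the yield errors in table 3.1.
R = measurement_error_cov(8.20)
```

Setting the 1-quarter and 10-year entries to zero is what forces the filter to match the term spread exactly, while the five intermediate yields absorb any remaining curvature errors.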

3.5.2 Priors

Table 3.1 lists the parameters and priors. For all of the parameters that I share with JPT, I choose the same priors. The remaining parameters are listed in the bottom section of the table. Many of them have uniform priors since I do not have strong a priori views about, for example, the fraction of the variance of the Federal Reserve's inflation target that is driven by shocks to government spending. For the volatility of risk aversion, I choose a beta distribution over the ratio of the unconditional standard deviation of risk aversion to its mean. Because the beta distribution has support on [0, 1], this means that average risk aversion is forced to be at least one standard deviation above zero. This prior could potentially be tightened to enforce a stronger restriction. As a practical matter, the data tend to push for a high volatility for risk aversion, and average risk aversion in the estimation simply rises high enough to accommodate the unconditional standard deviation. I constrain the inflation target to follow nearly a random walk, with persistence $\rho_{\pi^*} = 0.99$, consistent with the idea that the target is highly persistent. The assumption that $\rho_{\pi^*} < 1$ ensures that inflation is stationary so that there is a steady state around which we can approximate. The priors over the shares of the variances of the inflation target and risk aversion coming from the other shocks are uniform.

3.5.3 Posterior modes

Table 3.1 lists the posterior modes for the parameters along with the 5th and 95th percentiles of the posterior distribution. Many of the posterior modes are reasonably close to the corresponding prior means. I focus mainly on those parameters that differ from the prior or are unique to this model. The prior for the variance of the innovations to the inﬂation target favors a reasonably low standard deviation, but the posterior seems to want a highly volatile target—the


Table 3.1: Priors and posterior modes

No.  Parameter       Description                      Prior dist.  Prior mean  Prior s.d.  Mode    5%      95%     JPT
1    α               Capital share                    Normal       0.3         0.05        0.13    0.10    0.15    0.17
2    ιp              Price indexation                 Beta         0.5         0.15        0.39    0.21    0.56    0.24
3    ιw              Wage indexation                  Beta         0.5         0.15        0.77    0.66    0.88    0.11
4    100γ            Mean technology growth           Normal       0.5         0.25        0.48    0.43    0.52    0.48
5    η               Habit parameter                  Beta         0.5         0.1         0.52    0.32    0.71    0.78
6    λp              Mean price markup                Normal       0.15        0.05        0.10    0.02    0.18    0.23
7    λw              Mean wage markup                 Normal       0.15        0.05        0.15    0.07    0.24    0.15
8    LSS             Mean log hours per capita        Normal       6.7         0.2         6.75    6.71    6.79    N/A
9    πSS             Mean quarterly inflation         Normal       0.5         0.1         -0.20   -0.90   0.74    0.71
10   100(β^-1 - 1)   Discount factor                  Gamma        0.25        0.1         0.28    0.21    0.47    0.13
11   ν               Inverse Frisch elasticity        Gamma        2           0.75        1.83    1.16    2.87    3.79
12   ξp              Price adjustment frequency       Beta         0.66        0.1         0.67    0.60    0.71    0.84
13   ξw              Wage adjustment frequency        Beta         0.66        0.1         0.67    0.57    0.72    0.7
14   χ               Capital utilization costs        Gamma        5           1           5.08    3.23    7.19    5.3
15   S               Investment adjustment costs      Gamma        4           1           4.96    3.16    6.71    2.85
16   fπ              Taylor rule inflation            Normal       1.7         0.3         1.89    1.51    2.51    2.09
17   fy              Taylor rule output gap           Gamma        0.125       0.04        0.08    0.04    0.16    0.07
18   f∆y             Taylor rule output gap growth    Normal       0.125       0.05        0.25    0.21    0.30    0.24
19   ρR              Interest rate smoothing          Beta         0.6         0.2         0.96    0.92    0.97    0.82
20   ρz              Technology shock AR              Beta         0.6         0.2         0.20    0.09    0.31    0.23
21   ρg              Government spending AR           Beta         0.6         0.2         0.99    0.99    0.99    0.99
22   ρµ              Investment technology AR         Beta         0.6         0.2         0.50    0.34    0.67    0.72
23   ρλp             Price markup AR                  Beta         0.6         0.2         0.95    0.92    0.97    0.94
24   ρλw             Wage markup AR                   Beta         0.6         0.2         0.99    0.99    0.99    0.97
25   ρb              Consumption demand shock AR      Beta         0.6         0.2         0.77    0.70    0.80    0.67
26   ρmp             Monetary policy AR               Beta         0.4         0.2         0.19    0.09    0.30    0.14
27   θp              Price markup MA                  Beta         0.5         0.2         0.16    0.02    0.39    0.77
28   θw              Wage markup MA                   Beta         0.5         0.2         0.98    0.97    0.98    0.91
29   σR              MP shock vol.                    IG(1)        0.1         1           0.14    0.12    0.17    0.22
30   σz              Neutral tech. shock vol.         IG(1)        0.5         1           0.80    0.69    0.97    0.88
31   σg              Gov't spending vol.              IG(1)        0.5         1           0.29    0.25    0.35    0.35
32   σµ              Investment tech. vol.            IG(1)        0.5         1           6.21    3.76    9.39    6.03
33   σλp             Price markup vol.                IG(1)        0.1         1           0.12    0.10    0.18    0.14
34   σλw             Wage markup vol.                 IG(1)        0.1         1           0.35    0.30    0.43    0.2
35   σb              Demand shock vol.                IG(1)        1           1           0.41    0.29    0.63    0.04
36   σπ*             Inflation target vol.            IG(1)        0.1         0.1         0.33    0.28    0.41    N/A
37   σπ*,mp          MP var. shr. in pi*              U[-1,1]      0           0.58        0.39    0.19    0.48    N/A
38   σπ*,z           z var. shr. in pi*               U[-1,1]      0           0.58        -0.16   -0.26   -0.12   N/A
39   σπ*,g           g var. shr. in pi*               U[-1,1]      0           0.58        0.00    0.00    0.01    N/A
40   σπ*,µ           mu var. shr. in pi*              U[-1,1]      0           0.58        0.01    0.00    0.04    N/A
41   σπ*,λp          Price shock var. shr. in pi*     U[-1,1]      0           0.58        -0.04   -0.10   -0.01   N/A
42   σπ*,λw          Wage shock var. shr. in pi*      U[-1,1]      0           0.58        -0.06   -0.12   -0.01   N/A
43   σπ*,b           b var. shr. in pi*               U[-1,1]      0           0.58        0.28    0.18    0.40    N/A
44   σα/αSS          RRA volatility/RRA mean          Beta         0.5         0.2         0.95    0.83    0.99    N/A
45   σα,z            z var. shr. in RRA               U[-1,1]      0           0.58        0.00    -0.03   0.05    N/A
46   ρα              RRA persistence                  U[-1,1]      0.5         0.29        0.77    0.72    0.83    N/A
47   ρ               Inverse EIS                      U[-1,1]      0.5         0.29        0.76    0.51    0.90    1
48   αSS             Mean risk aversion               Normal       15          5           18.70   13.42   26.57   1
49   σyields         Bond measurement errors (bp)     IG(1)        13          33          8.20    7.66    8.96    N/A

Note: Priors, posterior mode, and percentiles of the posterior distribution from the benchmark model. The far-right column reports the parameters from JPT, where applicable.


estimated standard deviation of the innovations to the annualized inflation target is 1.3 percent. This helps the model capture the observed volatility of the level factor in bond yields, but it is implausibly high. The shock to the level of labor-neutral technology has an important effect on the inflation target, accounting for 16 percent of the variance of its innovations. Following a positive innovation to technology, the central bank is estimated to lower its inflation target, consistent with the idea that following beneficial supply shocks that drive inflation downward, the central bank takes the opportunity to drive inflation lower persistently (e.g. Gurkaynak, Sack, and Swanson, 2005). This mechanism will turn out to be critical to the main results.

The labor-neutral technology shock has a standard deviation of 0.80 and an autocorrelation of 0.20 (table 3.1). The permanent component of the technology process (the Beveridge–Nelson trend) thus has a standard deviation of 1.00, which is similar to the values often calibrated in the production-based asset pricing literature (e.g. Gourio, 2010, and Dew-Becker, 2011a). The estimated long-run variance of technology growth is far smaller than the value calibrated in the long-run risks literature (e.g. Bansal and Yaron, 2004, and Kaltenbrunner and Lochstoer, 2010), but it is consistent with estimates obtained in JPT and SW and with simple univariate estimates from consumption and output data.

The standard deviation of the investment technology shock is far larger than the prior, which is also consistent with JPT, showing that innovations that are isolated to the investment sector may play a large role in fluctuations. Alternatively, it could mean that the model simply matches investment poorly.

The estimates imply that there is essentially zero correlation between innovations to technology and innovations to risk aversion.
This runs against the theory from Dew-Becker (2011a), and implies that the price of risk in bond markets is driven by some factor other than permanent innovations to household consumption. Intuitively, part of the source of this result is that the price of risk is measured to be high in recessions, but over the 1983–2004 period, productivity growth has been only weakly procyclical, and in fact rose substantially in the 2000 recession.

As in JPT, the government spending shock is estimated to follow nearly a unit root, explaining the trend in the consumption-output ratio over the sample. The wage markup shock also follows nearly a unit root, which helps capture the strong trend in hours worked per capita seen in figure 3.1.

In general, my parameter estimates are reassuringly similar to the values from JPT reported in the far-right column of table 3.1, even though I use a different sample period (post-1983 versus post-war) and extra data on bond yields. The main place where my estimates seem to differ from JPT is in price and wage determination. My estimates imply that wages are strongly indexed to inflation, whereas JPT estimate little indexation. The wage-markup shock is also substantially more volatile under my estimates than in JPT. Interestingly, when bond prices are dropped from the estimation, I obtain values for wage indexation and the volatility of the wage markup shock that are much closer to JPT. This suggests that wage dynamics are important for matching the term structure. I also estimate a smaller inverse Frisch elasticity and a lower average price markup. In both cases, my estimates bring the model closer to the priors and micro estimates.
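The Beveridge–Nelson calculation for the technology process discussed earlier in this subsection can be written out explicitly. The sketch below assumes, as an illustration, that technology growth follows an AR(1), $z_t = \rho_z z_{t-1} + \sigma_z \varepsilon_t$, and uses the posterior modes $\sigma_z = 0.80$ and $\rho_z = 0.20$ from table 3.1; the symbol $\varepsilon_t$ for the unit-variance innovation is notation introduced here.

\[
\lim_{j \to \infty} \frac{\partial \sum_{k=0}^{j} z_{t+k}}{\partial \varepsilon_t}
= \sigma_z \sum_{k=0}^{\infty} \rho_z^{k}
= \frac{\sigma_z}{1-\rho_z}
= \frac{0.80}{1-0.20}
= 1.00
\]

Under this assumption, a one-standard-deviation innovation eventually moves the level of technology by 1.00, which is the standard deviation of innovations to the permanent component.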

3.6 Asset pricing
This section studies the asset-pricing implications of the model. I ﬁrst analyze the ﬁt of the model to the term structure and show that it is competitive with a non-structural model. Next, I decompose the variance of the SDF to understand the source of the positive term premium in the model. I then analyze the prices of other assets, including the aggregate capital stock and a claim to aggregate proﬁts.

3.6.1 Bond prices

3.6.1.1 Fitted yields

Figure 3.2 plots the deviations of the fitted yields from their actual values for the five yields that are assumed to be measured with error (reported in annualized basis points). The estimated standard deviation of these fitting errors is 8 basis points, which is economically small compared to the overall variation of the yields, which is on the order of hundreds of basis points. The errors are all centered around zero, meaning that the model can capture the shape of the term structure on average. The volatility of the errors looks somewhat higher for the 1 and 5-year yields and in the earlier part of the sample. There is clearly some autocorrelation in the errors; the fitted value for the 3-year yield is consistently too high in the first half of the sample, and the 4-year fitted yield is consistently too low in the second half, for example. There is also some cross-correlation in the errors; the first principal component explains 37 percent of the total variance of the errors (twice what it would if the errors were orthogonal). These are thus clearly not classical (i.i.d.) measurement errors, but their small mean and volatility show that the model does a reasonable job of fitting the data, and they are not disturbingly far from white noise.

While there are nine unobservable shock processes that can help us match the data, the model is asked to fit 13 data series, so obtaining a good fit for the bond yields is not trivial. Loosely, we have 6 macro variables that identify 6 shock processes, plus three extra processes (the monetary policy shock, the inflation target, and risk aversion) that can be used to fit the bond yields. The degrees of freedom here are thus comparable to a non-structural bond-pricing model with three unobservable factors, but we also have numerous constraints on dynamics and risk prices. Table 3.2 lists the standard deviation of the yield errors in basis points obtained from regressing the bond yields on their first three principal components. The standard deviations are all between 4 and 8 basis points. I force the structural model to match the 1-quarter and 10-year yields exactly, and the remaining yields have errors with standard deviations of 8 basis points.
The fit of the model to the yields is thus comparable to a non-structural model with three unobservable factors that have completely unrestricted dynamics. The third and fourth rows of table 3.2 report the measurement errors in constrained models that assume constant relative risk aversion and a constant inflation target (the other parameters are reestimated). In both cases, the measurement errors have standard deviations roughly three times larger than in the benchmark model, giving a measure of the improvement in fit generated by the benchmark model. Figure 3.2 and table 3.2 show that the model is able to provide a very close fit to bond yields in the data. The quality of the fit is essentially identical to that of a purely non-structural model.

Figure 3.2: Bond yield errors

[Figure 3.2: five time-series panels, one each for the 1-year, 2-year, 3-year, 4-year, and 5-year yields. Vertical axes: yield errors (basis points), from -30 to 40. Horizontal axes: 1980 to 2000.]

Note: each axis plots the measurement errors in basis points for one of the bond yields. Errors are measured from the Kalman-filtered estimates at the posterior mode.

Table 3.2: Fitting Errors

                   1-quarter  1-year  2-year  3-year  4-year  5-year  10-year
PCA                     4.54    7.32    6.29    4.26    6.20    7.98     6.53
Benchmark model            0    8.12    8.12    8.12    8.12    8.12        0
Constant RRA               0   23.30   23.30   23.30   23.30   23.30        0
Constant π*                0   23.87   23.87   23.87   23.87   23.87        0

Note: Fitting errors measured in annualized basis points. The model-based estimates use the posterior modal estimate for the standard deviation. The 1-quarter and 10-year errors are constrained to equal zero in the structural model. The errors from PCA are the standard deviations of the residuals from regressions of the bond yields on their first three principal components.
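The non-structural benchmark in table 3.2 (residual standard deviations from regressing yields on their first three principal components) can be sketched as follows. This is an illustration on synthetic data, not a replication: the function name, the simulated three-factor yields, and the 5 basis point noise level are all assumptions made for the example.

```python
import numpy as np


def pca_fit_errors(yields, n_factors=3):
    """Std. dev. of residuals from regressing yields on their first principal components."""
    demeaned = yields - yields.mean(axis=0)
    # SVD of the demeaned data; u * s gives the principal-component scores.
    u, s, vt = np.linalg.svd(demeaned, full_matrices=False)
    scores = u[:, :n_factors] * s[:n_factors]
    # Because the scores are orthogonal, the OLS loadings are the rows of vt.
    fitted = scores @ vt[:n_factors]
    residuals = demeaned - fitted
    return residuals.std(axis=0, ddof=1)


# Synthetic example: 88 quarters (1983q1 to 2004q4) and 7 maturities, in basis points.
rng = np.random.default_rng(0)
n_quarters, n_maturities = 88, 7
factors = rng.standard_normal((n_quarters, 3))
loadings = rng.standard_normal((3, n_maturities))
noise_bp = 5.0  # assumed idiosyncratic noise, for illustration only
synthetic_yields = (100.0 * factors @ loadings
                    + noise_bp * rng.standard_normal((n_quarters, n_maturities)))
errors_bp = pca_fit_errors(synthetic_yields)
```

In this setup the three principal components absorb the common factors, so the residual standard deviations are of the same order as the idiosyncratic noise, which is the sense in which the PCA row provides a lower bar for any three-factor model.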

3.6.1.2 Steady-state yields

Another way to evaluate the fit of the model is to ask whether the steady state of the model matches the average term structure in the data. Looking at the steady state keeps the Kalman filter from using large deviations in the unobservable state variables to fit the term structure. Figure 3.3 plots the average term structure in the sample along with its model-implied steady state. The solid black line gives the steady-state term structure in the model, renormalized so that the ten-year yield matches the empirical ten-year yield.7 To capture the uncertainty in the empirical term structure, the grey area gives the 95-percent confidence intervals for the means of the empirical yields relative to the ten-year yield (i.e. the confidence intervals for the spreads; the intervals are calculated using the Newey–West method with a 6-quarter lag window). Figure 3.3 shows that the model matches the spread between the 10 and 2-year yields, but it does not match the curvature of the term structure below two years. However, all of the model-implied yields are within the 95-percent confidence intervals. One potential explanation for the discrepancy at the very short end of the yield curve is that there is a small liquidity premium that the model is not incorporating.

We will see that two features of the model are critical for generating the large steady-state term premium: first, following a positive shock to technology, the Fed's inflation target falls; second, variation in risk aversion raises the premia on risky assets. To see the prima facie evidence that these two effects are key, figure 3.3 includes two lines giving the steady-state term structure in constrained models. The first line assumes that innovations to the inflation target are uncorrelated with the permanent technology shock, while the second line assumes that risk aversion is fixed. Neither line reestimates the other parameters, so they simply isolate the effects of those two features of the model.
The line exiting the top of the chart is for the model when shocks to technology are assumed to have no impact on the inﬂation target. We then obtain the usual result that the term structure is downward-sloping, and the steady-state term spread is -261 basis points.
7 I use this normalization because the estimated inflation target is above zero through most of the sample. The unconditional variance of the inflation target is sufficiently high that its average level is not well identified.


Figure 3.3: Steady-state nominal bond yields

[Figure 3.3: yield curves plotted against maturity. Horizontal axis: maturity (quarters), 0 to 36. Vertical axis: annualized nominal yield (percentage points), 4 to 8. Series: Steady-state yields (spread=1.52); Steady-state yields, constant risk aversion (spread=0.63); Steady-state yields, π* indep. of z (spread=-2.61); Average empirical yields (spread=2.07); Empirical 95% confidence band.]

Note: The solid black line gives the yield curve at the model's steady state; the grey lines are for the model with constant risk aversion and for the model where the inflation target is unaffected by shocks to labor-neutral technology. The other parameters are not reestimated. All the lines are normalized to match the 10-year yield exactly, so the plot measures steady-state spreads. Boxes are average sample yields. The grey area is the 95% confidence band for the average yields relative to the 10-year yield, calculated using the Newey–West method with 6 lags.

Time-varying risk aversion also turns out to be important, though. When risk aversion is fixed, the term structure is still upward-sloping, but the spread is quantitatively small: only 63 basis points in steady state, compared to 207 in the data and 152 in the benchmark model.
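The confidence bands for the average spreads in figure 3.3 rely on Newey–West standard errors with a 6-quarter lag window. A minimal sketch of that calculation for the mean of a single series is given below; the function name, the use of a normal critical value of 1.96, and the simulated AR(1) spread are assumptions made for illustration and are not taken from the estimation code.

```python
import numpy as np


def newey_west_mean_ci(x, lags=6, z=1.96):
    """Confidence interval for the mean of x using a Bartlett-kernel long-run variance."""
    x = np.asarray(x, dtype=float)
    n = x.size
    dev = x - x.mean()
    long_run_var = dev @ dev / n  # lag-0 autocovariance
    for k in range(1, lags + 1):
        weight = 1.0 - k / (lags + 1.0)      # Bartlett weight
        gamma_k = dev[k:] @ dev[:-k] / n     # lag-k autocovariance
        long_run_var += 2.0 * weight * gamma_k
    se = np.sqrt(long_run_var / n)
    return x.mean() - z * se, x.mean() + z * se


# Simulated persistent spread (percentage points), 88 quarters, for illustration.
rng = np.random.default_rng(0)
shocks = rng.standard_normal(88)
spread = np.empty(88)
spread[0] = shocks[0]
for t in range(1, 88):
    spread[t] = 0.8 * spread[t - 1] + shocks[t]
spread += 2.0
ci_low, ci_high = newey_west_mean_ci(spread, lags=6)
```

With `lags=0` the interval collapses to the classical one based on the sample variance; positive serial correlation in the spread widens it, which is why the adjustment matters for slowly moving yield spreads.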

3.6.1.3 Term premia

The size of the steady-state term spread in the model can be interpreted as the average term premium—it is the excess return (in logs) that an investor earns in expectation by buying a long-term bond and holding it to maturity instead of buying short-term bonds and rolling them over for the same amount of time. An important feature of this model is that risk aversion varies over time, which should make the term premium also vary over time. The top panel of ﬁgure 3.4 plots the expected annualized excess return on holding a ten-year nominal bond (over a one-quarter bond) from the benchmark model against the expected excess return from a regression of bond returns on the Cochrane–Piazzesi (CP) factor. Cochrane and Piazzesi (2005) argue that a tent-shaped factor in forward yields summarizes the price of risk in the term structure, so their factor can be viewed as a simple non-structural benchmark for return forecasting. The structural forecast is highly correlated (34 percent) with the ﬁtted value using the CP factor, and its standard deviation is roughly 20 percent larger. The two series rise by similar amounts in the two recessions in the sample, but the benchmark model also implies that the term premium rose in 1988 and 1999, whereas the CP factor is stable in those episodes. The bottom panel of ﬁgure 3.4 plots the term premium against the term spread. The term premium is deﬁned as the spread between the 10-year yield and the average of the expected 1-quarter yields over the life of a 10-year bond. The variance of the term premium is non-trivial in comparison to the term spread. In the two recessions in the sample, the increases in the term spread are substantially larger than the movements in the term premium, but the term premium does rise in both episodes. Interestingly, the movements


Figure 3.4: Expected returns and the term premium

[Figure 3.4: two time-series panels, 1983 to 2003. Top panel: Annualized expected return on 10-year nominal bond; series: Benchmark model, Cochrane–Piazzesi. Bottom panel: 10-year/1-quarter term spread and term premium; series: Term spread, Term premium.]

Note: Top panel gives expected excess returns on a 10-year bond over the following quarter, annualized. Values for the Cochrane–Piazzesi factor are from a linear regression. The term premium in the bottom panel is defined as the gap between the 10-year nominal yield and the mean of expected 1-quarter yields over the following 10 years.


in the term spread outside of the two recessions seem almost entirely driven by movements in the term premium. In particular, the rises in the term spread in 1984, 1985, 1987, 1996, and 1999 are all associated with increases in the term premium of equal magnitudes. On the other hand, the inversions of the yield curve in 1989 and 2000, both just prior to recessions, are associated with only minor declines in risk aversion, and the subsequent rises in the term spread with similarly small rises in risk aversion.
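The definition of the term premium used in this subsection can be stated compactly. The notation below is introduced here for illustration: $y_t^{(n)}$ denotes the annualized yield at date $t$ on a nominal bond with $n$ quarters to maturity, and $TP_t$ denotes the term premium on the 10-year (40-quarter) bond.

\[
TP_t = y_t^{(40)} - \frac{1}{40} \sum_{j=0}^{39} E_t\!\left[ y_{t+j}^{(1)} \right],
\qquad
y_t^{(40)} - y_t^{(1)} = TP_t + \left( \frac{1}{40} \sum_{j=0}^{39} E_t\!\left[ y_{t+j}^{(1)} \right] - y_t^{(1)} \right)
\]

The second expression splits the term spread into the term premium and the expected change in short rates, which is the decomposition behind the comparison of the two series in the bottom panel of figure 3.4.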

3.6.1.4 Variation in interest rates

To show how the bond yields respond to the various shocks, figure 3.5 plots responses of a level, slope, and curvature factor to the 9 fundamental shocks. Following Bekaert, Cho, and Moreno (2010), the level factor is defined as the average of the 1-quarter and 5 and 10-year yields; the slope factor is the 10-year/1-quarter term spread; and curvature is the sum of the 5 and 1-year yields minus twice the 3-year yield. The shocks are orthogonalized in the sense that the interactions between the inflation target and risk aversion and the other shocks are switched off. So figure 3.5 shows, for example, the effect of a pure increase in the level of technology, holding the inflation target fixed. The shocks are all unit standard deviations.

For the level factor, a number of shocks, the monetary policy and time preference shocks in particular, have important effects at high frequencies. The low-frequency movements, as we would expect, are mainly driven by shifts in the inflation target, while risk aversion also plays a role. Somewhat surprisingly, positive monetary policy shocks, which raise the short-term interest rate above its Taylor-rule value, are actually associated with declines in the level factor. The reason is that these shocks drive down expected inflation. So a positive monetary policy shock drives the real interest rate up, but nominal interest rates actually fall. The response to the time-preference shock is more intuitive: an increase in $B_t$ is analogous to an increase in patience, so interest rates fall.

The determinants of the slope factor are similar to those for the level factor: monetary policy and time-preference matter at high frequencies, while risk aversion determines the dynamics at lower frequencies. An increase in risk aversion increases the term spread,


Figure 3.5: Responses of term structure factors to orthogonalized shocks

[Figure 3.5: a grid of impulse-response panels with three rows and nine columns. Rows: Level factor (vertical scale -0.4 to 0.4); Slope factor (vertical scale -0.5 to 0.8); Curvature factor (vertical scale -0.25 to 0.25). Columns (shocks): Monetary Pol.; Neutral Tech.; Gov't Spending; Investment Tech.; Prices; Wages; Time Preference; Inflation Target; Risk aversion. Horizontal axes: quarters 1 to 31.]

Note: responses of each of the term structure factors to the orthogonal structural shocks. Specifically, risk aversion and the inflation target are only affected by their own shocks, not the shocks to the other exogenous processes. The level factor is the average of the 1-quarter, 5-year, and 10-year yields. The slope factor is the gap between the 10-year and 1-quarter yields. Curvature is the sum of the 5-year and 1-year yields minus twice the 3-year yield. The shocks are all unit standard deviations. All scales in each row are identical and are measured in annualized percentage points.

which ﬁts with results on bond return forecasting (Campbell and Shiller, 1988) and the fact that the term spread forecasts high equity returns (Fama and French, 1989).
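The three term-structure factors used in this subsection are simple linear combinations of yields. The sketch below restates those definitions in code; the function name, the dictionary keys, and the example yields are hypothetical and chosen only for illustration.

```python
def term_structure_factors(yields):
    """Level, slope, and curvature from annualized yields in percentage points.

    `yields` maps the labels "1q", "1y", "3y", "5y", "10y" to yields.
    """
    level = (yields["1q"] + yields["5y"] + yields["10y"]) / 3.0
    slope = yields["10y"] - yields["1q"]
    curvature = yields["5y"] + yields["1y"] - 2.0 * yields["3y"]
    return level, slope, curvature


# Made-up upward-sloping, concave yield curve for illustration.
example = {"1q": 4.0, "1y": 4.5, "3y": 5.5, "5y": 6.0, "10y": 6.5}
level, slope, curvature = term_structure_factors(example)
```

For a concave curve like the example, the 3-year yield lies above the average of the 1 and 5-year yields, so curvature as defined here is negative.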

3.6.2 Determinants of asset prices

3.6.2.1 The variance of the SDF

An asset's expected excess return over the real riskless interest rate is determined by its covariance with the stochastic discount factor. One of the more interesting outputs of a model as rich as this one is the variance decomposition for the SDF. Table 3.3 reports a variance decomposition for the SDF at the one-quarter horizon. The variance of the SDF is essentially entirely driven by the neutral technology and risk aversion shocks. The bar chart in the bottom panel of table 3.3 decomposes the variance of the SDF into components coming from the neutral technology shock, the risk aversion shock, and the remaining shocks combined. The lines at the top of each bar give the 2.5 and 97.5 percentiles of the posterior distribution. The 97.5 percentile for the variance share in the SDF for non-technology and non-risk-aversion shocks is less than 2 percent.

At first glance this result might be somewhat surprising, but it is in fact a deep characteristic of models with Epstein–Zin preferences with a high EIS and high risk aversion. One way to see the source of this finding is to simply look at the household's SDF,

\[
M_{t+1} = \beta B_t \, \frac{1-\beta B_{t+1}}{1-\beta B_t} \, \frac{U_{C,t+1}}{U_{C,t}} \, \frac{V_{t+1}^{\rho-\alpha_t}}{\left( E_t\!\left[ V_{t+1}^{1-\alpha_t} \right] \right)^{\frac{\rho-\alpha_t}{1-\alpha_t}}} \qquad (3.46)
\]

For a household with a large EIS, the variance of $U_{C,t+1}/U_{C,t}$ is generally small (at least with standard preferences). In the case where the household does not have a habit ($\eta = 0$), this term is equal to $(C_{t+1}/C_t)^{-\rho}$. If the household has an EIS greater than 1, then $\rho$ is less than 1 and the variance of $U_{C,t+1}/U_{C,t}$ will be less than the variance of log consumption growth. A one-percent permanent decline in consumption will raise this term by the factor $1.01^{\rho}$. The majority of the variance of the SDF is driven by the term $V_{t+1}^{\rho-\alpha_t} / \left( E_t\!\left[ V_{t+1}^{1-\alpha_t} \right] \right)^{\frac{\rho-\alpha_t}{1-\alpha_t}}$. Here, a one-percent permanent decline in consumption will make this term (approximately) equal to

Table 3.3: One-quarter ahead variance decompositions

Columns: (1) SDF; (2) Utility return; (3) Cons. return; (4) Nom. lev. cons. ret.; (5) Real lev. cons. ret.; (6) 10-year bond return; (7) Output growth; (8) Consumption growth; (9) Investment growth.

                            (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)
Variance decompositions
1   Monetary policy         0.00    0.00    0.00    0.00    0.00    0.34    0.00    0.00    0.00
2   Neutral tech.           0.57    0.00    0.00    0.01    0.15    0.14    0.00    0.03    0.03
3   Gov't spending          0.00    0.00    0.00    0.00    0.00    0.00    0.09    0.07    0.00
4   Investment tech.        0.00    0.00    0.00    0.00    0.00    0.00    0.31    0.01    0.92
5   Price markup            0.00    0.00    0.00    0.00    0.00    0.06    0.04    0.02    0.04
6   Wage markup             0.00    0.00    0.00    0.00    0.01    0.03    0.00    0.01    0.01
7   Time preference         0.01    0.94    0.94    0.91    0.36    0.17    0.03    0.04    0.00
8   Inflation target        0.00    0.05    0.05    0.07    0.02    0.03    0.52    0.83    0.00
9   Risk aversion           0.41    0.00    0.01    0.01    0.46    0.23    0.00    0.00    0.00
Moments:
10  Standard deviation      0.47    4.06    4.06    8.67    6.54    19.65   1.19    1.07    1.91
11  Correl. w/ SDF          1.00    -0.14   -0.17   -0.09   -0.80   -0.53   0.01    0.04    -0.12
12  Expected return         N/A     0.27    0.33    0.36    2.47    4.90    N/A     N/A     N/A

Variance decompositions and 95% credible intervals

[Bar chart: for each of SDF, Utility return, Cons. return, Levered cons. ret., LR cons. ret., 10-year bond return, Output growth, Consumption growth, and Investment growth, the shares of forecast-error variance due to Neutral technology, Risk aversion, and All other shocks. Vertical axis: 0.00 to 1.00.]

Note: decompositions of one-quarter ahead forecast error. Levered returns for capital and consumption claims assume that the investor finances half the purchase price of the given claim with a 10-year nominally riskless bond. The 10-year return is the one-quarter return from holding a 10-year nominally riskless bond. The moments in rows 10–12 are annualized. The black bars in the figure give the 95 percent credible region based on random draws from the posterior density.

$1.01^{\alpha_t-\rho}$. When $\alpha_t \gg \rho$, as estimated here and in most models with Epstein–Zin preferences, it is therefore the variation in $V_{t+1}$ that determines the behavior of the SDF.

So what determines movements in $V_{t+1}$? One way to think about it is to split movements in consumption into permanent and temporary components. A purely temporary shock to consumption will have a relatively small impact on $V_{t+1}$ because the household is not very averse to shifting consumption over time. A permanent shock to consumption, on the other hand, will tend to shift $V_{t+1}$ by an equal amount. Risk aversion shocks also affect $V_{t+1}$. The reason is simply that increases in risk aversion directly drive $V_{t+1}$ down because households are more averse to the future uncertainty that they face. Since it is the shocks to $V_{t+1}$ that determine the movements in $M_{t+1}$, we should thus not be surprised that it is mainly the neutral technology and risk aversion shocks that drive the variance of the SDF.
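To give a sense of magnitudes for the two components of the SDF discussed above, the following sketch evaluates the factors $1.01^{\rho}$ and $1.01^{\alpha_t-\rho}$ at the posterior modes from table 3.1, with risk aversion set to its mean. Evaluating at the mean of risk aversion is an assumption made for this illustration, and the variable names are hypothetical.

```python
rho = 0.76        # inverse EIS, posterior mode in table 3.1
alpha_ss = 18.70  # mean risk aversion, posterior mode in table 3.1

# Response of each SDF component to a one-percent permanent decline in consumption.
marginal_utility_term = 1.01 ** rho
continuation_value_term = 1.01 ** (alpha_ss - rho)

# Express both as percentage increases in the SDF.
marginal_utility_pct = 100.0 * (marginal_utility_term - 1.0)
continuation_value_pct = 100.0 * (continuation_value_term - 1.0)
ratio = continuation_value_pct / marginal_utility_pct
```

At these values the marginal-utility term moves by a bit less than 1 percent while the continuation-value term moves by roughly 20 percent, which illustrates why shocks that shift $V_{t+1}$ dominate the variance of the SDF.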

3.6.2.2 Impulse responses

Since the technology and risk aversion shocks are the key to understanding the pricing kernel, it is natural to ask how they affect the economy. Figure 3.6 plots impulse responses to an increase in labor-neutral technology and a decrease in risk aversion. Dotted lines give 95-percent credible intervals (the range between the 2.5 and 97.5 percentiles in the posterior distribution). These impulse responses are different from those in ﬁgure 3.5 because they do not turn off the interactions between the inﬂation target and risk aversion and the other shocks. The idea is that we want to see what happens on average following these two shocks, since that behavior is what is relevant for understanding the correlations with the SDF. Following the technology shock, inﬂation falls, while output and real interest rates rise: a standard positive supply shock. The declines in inﬂation and nominal interest rates are especially pronounced and persistent. This behavior is the result of the fact that the estimates imply that the inﬂation target falls following an increase in technology. However, note that inﬂation falls more than the inﬂation target does, so the effect is not entirely driven by the inﬂation target. A similar result is obtained in JPT and SW, even though they do not have time-varying


Figure 3.6: Responses to technology and risk-aversion shocks

[Figure 3.6: six impulse-response panels over horizons of 0 to 20 quarters. Panels: Nominal risk-free rate; Real risk-free rate; Inflation; Inflation target; Output; Output gap. Series in each panel: Increase in labor-neutral technology; Decline in risk aversion; 95-percent credible interval (dotted).]

Note: responses in percentage points to a unit standard deviation positive shock to labor-neutral technology and a negative shock to risk aversion. Interest rate, inflation, and inflation target are annualized. Dotted lines give 2.5 and 97.5 percentiles from the posterior distribution

131

inﬂation targets. In all of these models, because prices are sticky, when there is a positive supply shock, rather than cutting prices, ﬁrms simply produce the same quantity as previously, thereby reducing employment and hence demand. Positive technology shocks are thus associated with small increases in output and large declines in the output gap (deﬁned as the difference between output and the level it would take if prices were ﬂexible). There is also a small empirical literature that provides more reduced-form evidence on effects of this sort using direct measures of technology (e.g. Basu, Fernald, and Kimball, 2006). The slow response of output to a technology shock is particularly notable. The impulse response function in ﬁgure 3.6 suggests that output and perhaps also consumption growth should be predictable and positively autocorrelated, though other shocks could obscure those relationships. Figure 3.7 therefore plots the empirical and model-predicted autocorrelation functions for consumption and output growth. For output, the model implies that the 1-quarter autocorrelation should be 0.50, while it is only 0.33 in the data. But 0.50 is well within the 95 percent conﬁdence interval for the empirical value. In fact, nearly the entire autocorrelation function for output in the model is captured within the 95 percent conﬁdence interval in the data. The model also implies strong one-quarter autocorrelation in consumption growth, and the autocorrelation is in fact higher than the upper end of the 95 percent conﬁdence interval. However, this autocorrelation dies out very rapidly, and at lags longer than one quarter the autocorrelation of consumption growth in the model matches the behavior in the data well. The model implies little or no serial correlation in consumption and output growth at horizons longer than two quarters. Figure 3.6 also reports impulse responses for a decline in risk aversion. Inﬂation, interest rates, and the output gap all rise. 
This shock therefore takes the form of a classic demand shock. The effects are far smaller than those for the technology shock. Risk aversion has two main channels through which it affects the real economy. First, to the extent that physical investment is risky, a decline in risk aversion makes households more willing to purchase physical capital. Second, a decline in precautionary saving demand makes households want to consume more for any given level of interest rates. For this model,

Figure 3.7: Empirical and model-implied autocorrelations

[Panels: output growth, consumption growth. Series: empirical, model, and the empirical 95-percent confidence band, at lags 1 to 36.]

Notes: Empirical and model-implied autocorrelation functions. Gray regions are 95-percent confidence intervals.

the increase in consumption demand dominates, which is why the shock is slightly expansionary.

3.6.2.3 Other asset prices

Variance decompositions. After the SDF, table 3.3 reports variance decompositions for the returns on a number of assets. The bottom panel of table 3.3 reports the fraction of the variance of the one-quarter innovation to each return coming from the neutral technology shock, the risk aversion shock, and all other shocks combined. Column 2 reports the variance decomposition for the return on the utility portfolio. 87 percent of its variance comes from the time-preference shock. The reason is simply that the utility claim has a relatively long duration, like that of a consol with a coupon that grows at the average rate of the economy, so shifts in real interest rates have a large effect on its price. The time-preference shock mainly affects real interest rates, so it drives the variance of the utility claim. Row 11 shows that the correlation of the utility return with the SDF is -0.14. More interestingly, the third and fourth columns of table 3.3 report variance decompositions for a claim on aggregate profits and the same claim levered two to one on short-term nominal debt.⁸ Once again, little of the variance of the return is driven by the technology or risk aversion shocks. While the neutral technology shock does play a role, it is relatively small; the vast majority of the variation is driven by shifts in the rate of time preference due to its effects on real interest rates. Assuming that equity is levered on nominal riskless debt does not change this result because the returns on nominal bonds are generally unaffected by the time-preference shock. However, column 5 shows that if equity claims are levered on real debt, then they become highly correlated with the pricing kernel, and the model can actually generate a large equity premium (though one that is still too small by half). The reason is that when the firm is levered on real bonds, the effects of time-preference shocks on the unlevered dividend claim and real bond prices cancel each other out.
8 The profits are those earned by the intermediate-good producers, $Y_t - W_t N_t$.

Zero-coupon claims. Lettau and Wachter (2007, 2011) argue that the value premium can be explained if value stocks have relatively short durations and the term structure of zero-coupon consumption (or dividend) claims is downward-sloping. Figure 3.8 plots the steady-state term structures for zero-coupon nominal bonds, inflation-indexed bonds, and consumption claims (with the yields normalized to zero at the 1-quarter horizon). For the zero-coupon consumption claims, figure 3.8 plots the steady-state values of $-\log\left(P^C_{j,t} / E_t C_{t+j}\right)/j$, where $P^C_{j,t}$ is the price on date $t$ of an asset that pays one unit of consumption on date $t+j$. This quantity is the average per-period discount rate applied to assets that pay a unit of consumption at date $t+j$.

To understand the results in figure 3.8, first note that the real term structure is downward sloping for the usual reason: a positive technology shock (which drives the majority of the variance of the SDF) drives the marginal product of capital upward and raises real interest rates. So the prices of real bonds are low in good times, which induces a downward-sloping real term structure. This effect will also apply to the consumption claims since they are real claims and are discounted (partly) with real interest rates. However, there is a second effect: positive technology shocks raise the expected level of consumption in the future, which means that the consumption claims have a high payoff in good times. In the present setting, the latter effect is slightly stronger, inducing the small upward slope in the term structure for consumption claims.

3.6.2.4 The Hansen–Jagannathan bound

Figure 3.9 plots the ﬁtted quarterly Hansen–Jagannathan (HJ) bound against the nominal term spread. The estimated steady-state level of the HJ bound – 0.24 – is almost identical to the observed Sharpe ratio on the aggregate stock market in this sample of 0.26. This is notable given that equities were not included in the estimation and nominal bonds do not achieve the HJ bound. Perhaps more importantly, though, the price of risk in this model is highly volatile. The estimated standard deviation of the Hansen–Jagannathan bound is 95 percent of its mean. The level of variability here is somewhat higher than but still similar to that used in Dew-Becker (2011a) to match the degree of predictability observed

Figure 3.8: Steady-state term structures

[Series: nominal bonds, real bonds, and consumption claims, at horizons of 1 to 36 quarters.]

Notes: term structures for nominal and real zero-coupon bonds and zero-coupon consumption claims. Horizons are measured in quarters.

for aggregate stock returns in the post-war sample.
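
For reference, the Hansen–Jagannathan bound says that the absolute Sharpe ratio of any excess return priced by the SDF $M$ cannot exceed $\sigma(M)/E(M)$. The following is a minimal sketch of that inequality in a hypothetical three-state economy; the probabilities, SDF values, and payoff are illustrative assumptions and are not taken from the estimated model.

```python
import math

def hj_bound(probs, sdf):
    """Hansen-Jagannathan bound sigma(M)/E(M) for a discrete-state SDF."""
    mean_m = sum(p * m for p, m in zip(probs, sdf))
    var_m = sum(p * (m - mean_m) ** 2 for p, m in zip(probs, sdf))
    return math.sqrt(var_m) / mean_m

def sharpe_ratio(probs, excess):
    """Mean divided by standard deviation of an excess return."""
    mean_r = sum(p * r for p, r in zip(probs, excess))
    var_r = sum(p * (r - mean_r) ** 2 for p, r in zip(probs, excess))
    return mean_r / math.sqrt(var_r)

# Hypothetical three-state economy (illustrative numbers only).
probs = [0.25, 0.50, 0.25]
sdf = [1.30, 0.98, 0.70]        # marginal utility is highest in state 1
payoff = [-0.10, 0.06, 0.04]    # pays poorly when marginal utility is high

# Shift the payoff by a constant so that it has zero price: E[M * Re] = 0.
mean_m = sum(p * m for p, m in zip(probs, sdf))
priced = sum(p * m * x for p, m, x in zip(probs, sdf, payoff))
excess = [x - priced / mean_m for x in payoff]

bound = hj_bound(probs, sdf)
sharpe = sharpe_ratio(probs, excess)
```

The bound is attained only by excess returns that are perfectly correlated with the SDF, which is why an asset class such as nominal bonds can earn a Sharpe ratio below it.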

3.7 The real economy
Up to now, the analysis has focused mainly on asset pricing. But the model gives a rich description of the real side of the economy. While I leave a deeper analysis of New Keynesian models to papers focused on those models for their own sake, the interaction of the real side of the economy with asset prices is important to this paper. Figure 3.10 gives a variance decomposition for the variables used in the estimation. The figure decomposes the variance of each variable at frequencies of 6 to 32 quarters into components coming from each of the structural shocks.⁹ Except for investment growth, for which the investment-specific shock is completely dominant, none of the variables examined in figure 3.10 is dominated by any particular shock. Notably, the shock to risk aversion has almost no effect on the variance of any of the real variables at business-cycle frequencies. Its largest effect is on consumption growth, for which the variance share is still only 4 percent. The far-right bar, though, shows that risk aversion has a large effect on the term spread, as we saw in figure 3.5; it explains roughly 1/3 of the variance of the term spread at business-cycle frequencies. The variance decomposition reported in the top panel of figure 3.10 is rather different from that reported by JPT. They found that the investment shock was an important determinant of not only investment, but also output and consumption growth at business-cycle frequencies. Their model differs from mine in three ways: risk aversion and the inflation target are constant, the preference specification is slightly different (log utility and additive habits), and the data sample covers the entire post-war period and only includes the one-quarter nominal interest rate. The bottom panel of figure 3.10 removes a number of those differences. It drops bond yields (except the short rate) from the estimation and assumes
9 The variance decomposition is calculated using a spectral decomposition of the state-space form of the model. Specifically, since the structural shocks are orthogonal, the spectral density of the endogenous variables is equal to the sum of the densities obtained when each shock is turned on individually. Calculating variance shares over certain frequencies then simply requires integrating the density over those frequencies. I numerically integrate by calculating the spectral density at 100 increments between wavelengths of 6 and 32 quarters.

Figure 3.9: The Hansen–Jagannathan bound and the term spread

[Series: Hansen–Jagannathan bound and term spread, plotted over time on separate vertical axes.]

Figure 3.10: Variance decompositions at business-cycle frequencies

[Top panel: benchmark model. Shocks: risk aversion, inflation target, time preference, wage markup, price markup, investment technology, government spending, neutral technology, monetary policy. Bottom panel: JPT model. Shocks: time preference, wage markup, price markup, investment technology, government spending, neutral technology, monetary policy. Vertical axis: share of variance, 0% to 100%.]

Note: each section of a bar represents a fraction of the variance of one of the series at frequencies between 6 and 32 quarters. The top panel gives results for the benchmark model. The bottom panel reports results where the inflation target and risk aversion are fixed and bond prices are dropped from the estimation. That latter model is identical to JPT except that it uses post-1983 data and uses Epstein–Zin preferences.

the inﬂation target and coefﬁcient of relative risk aversion are constant. The model is then reestimated. While I do not replicate JPT’s results exactly, I do also ﬁnd that the investment technology shock is important for more variables than just investment itself. It accounts for 30 percent of the variation in output growth and 20 percent of consumption growth. Moreover, it accounts for 41 and 48 percent of the variance of the one-quarter nominal interest rate and the term spread, respectively. It is this latter result that explains the divergence between the two models. Since long-term bond prices are included in the estimation of the benchmark model, that model is forced to match the relationship between investment and the term spread. The difference in the variance decompositions suggests that JPT gets this relationship wrong. What is interesting, though, is that when the model is forced to match long-term bond yields, that also changes the decomposition for other variables. Table 3.3 gives one-quarter-ahead variance decompositions for output, consumption, and investment growth. This decomposition is useful for understanding whether any of these variables would be powerful asset pricing factors. Speciﬁcally, in a world where consumption followed a random walk and households had Epstein–Zin preferences with constant relative risk aversion, consumption growth would be perfectly correlated with the SDF, so it would price assets in the economy. What table 3.3 shows is that consumption, output, and investment growth are all only weakly correlated with the SDF at the one-quarter horizon—they all have correlations less than 16 percent. So asset pricing with only consumption growth will not work well in this economy. Many asset-pricing studies with Epstein–Zin preferences include both consumption growth and the return on the stock market as pricing factors. 
If we believe that the stock market is a claim on aggregate capital, then table 3.3 shows that it will do little to help with asset pricing as it is also only weakly correlated with the SDF.
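
The band-limited variance shares used in this section (described in footnote 9) can be sketched as follows. This is a toy illustration and not the dissertation's code: it assumes a single observable equal to the sum of two AR(1) components driven by orthogonal shocks, with made-up persistence and volatility values, and it uses a trapezoid rule on an equally spaced frequency grid as one possible reading of "100 increments".

```python
import cmath
import math

def ar1_spectrum(phi, sigma, omega):
    """Spectral density at frequency omega of x_t = phi * x_{t-1} + sigma * e_t."""
    transfer = 1.0 / (1.0 - phi * cmath.exp(-1j * omega))
    return sigma ** 2 * abs(transfer) ** 2 / (2.0 * math.pi)

def band_variance(phi, sigma, short_period=6.0, long_period=32.0, n=100):
    """Integrate the spectrum over wavelengths between 6 and 32 quarters."""
    lo = 2.0 * math.pi / long_period
    hi = 2.0 * math.pi / short_period
    step = (hi - lo) / (n - 1)
    values = [ar1_spectrum(phi, sigma, lo + k * step) for k in range(n)]
    # Trapezoid rule on the equally spaced grid.
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))

def variance_shares(shocks):
    """Share of band variance due to each orthogonal shock."""
    parts = {name: band_variance(phi, sigma) for name, (phi, sigma) in shocks.items()}
    total = sum(parts.values())
    return {name: value / total for name, value in parts.items()}

# Hypothetical (persistence, volatility) pairs for two shocks.
shocks = {"technology": (0.9, 1.0), "risk aversion": (0.5, 0.3)}
shares = variance_shares(shocks)
```

Because the shocks are orthogonal, the component spectra add up to the spectrum of the observable, so the shares sum to one by construction.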

3.7.1 Model comparison

The primary difference between the model studied here and JPT is the addition of time-varying risk aversion and the time-varying inflation target. An important question, then, is the extent to which those two factors improve the fit of the model to the data. Clearly, allowing a time-varying inflation target will help increase the volatility of long-term bond yields, and time-varying risk aversion will induce time-varying risk premia. But it is possible that the data can be well explained with a constant risk premium, or perhaps there is no need to have a time-varying inflation target. Table 3.4 considers two alternatives to the benchmark model: a version where the inflation target is constant, and a version where the coefficient of relative risk aversion is constant. It lists two statistics for each model. First, it gives the standard likelihood ratio used in MLE, i.e. based on $\log f(y \mid \theta, M)$, the likelihood of the data conditional on the model and parameters, where $y$ represents the data and $\theta$ the parameter vector. $M$ denotes the model choice, i.e. the full model or one of the two restricted versions. $\log f(y \mid \theta, M)$ is closely related to the one-step-ahead forecast error of the model. The likelihood ratio test favors the benchmark over each alternative by a wide margin. One way to see the source of this rejection is to note that in table 3.2, the measurement errors for bond yields, essentially a residual variance, are three times higher in the alternative models than in the benchmark. This difference alone is more than sufficient to explain the magnitude of the likelihood ratios. The second statistic is the Bayes factor, which is based on the marginal likelihood conditional only on the model, $\log f(y \mid M)$ (in the economics literature, see Fernandez-Villaverde and Rubio-Ramirez, 2004). I calculate $\log f(y \mid M)$ using the Monte Carlo chain as in Fernandez-Villaverde and Rubio-Ramirez (2004).¹⁰ The difference in the Bayes factors listed in the bottom row of table 3.4 is similar to the values obtained under the usual likelihood ratio test. In order to accept the model with constant risk aversion or a constant inflation target, the ratio of the prior probability of either of those models to that of the benchmark model would have to be greater than $\exp(140)$. In a statistical sense, then, both the time-varying inflation target and time-varying risk aversion substantially improve the fit of the model. The last exercise I perform is to compare forecasting power between the structural
10 See also Gelfand and Dey (1994) and Geweke (1999).

Table 3.4: Model comparison statistics

                    Constant RRA    Constant π*
Likelihood ratio    185.6           183.8
p-value             5.0E-41         1.6E-35
Bayes factor        141.50          142.50

Note: Row 1 gives the marginal likelihood of the data given the model and parameters. Row 2 gives the p-value for the frequentist LR test. Row 3 is the Bayes factor, the marginal probability of the data conditional on the model (i.e. integrated over the parameter space).
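
The prior-odds reading of the Bayes factor can be written out explicitly. The identity below is standard; the labels $M_0$ (benchmark) and $M_1$ (a restricted model) are introduced here only for illustration, and the reading assumes that the Bayes-factor entries in table 3.4 are differences in log marginal likelihoods relative to the benchmark.

```latex
\[
\frac{P(M_1 \mid y)}{P(M_0 \mid y)}
  = \frac{f(y \mid M_1)}{f(y \mid M_0)} \times \frac{P(M_1)}{P(M_0)}
\]
\[
\log f(y \mid M_0) - \log f(y \mid M_1) \approx 140
\quad \Longrightarrow \quad
\frac{P(M_1 \mid y)}{P(M_0 \mid y)} > 1
\ \text{ only if } \
\frac{P(M_1)}{P(M_0)} > \exp(140)
\]
```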


model and a VAR. Rather than include all 13 of the endogenous variables, I use the 6 macro variables and the first three principal components of the bond yields (i.e. I estimate a FAVAR). I estimate a VAR(2) with the restriction that the macro variables and the interest rate factors do not interact with each other (though their innovations may be correlated). This assumption partly helps to limit the number of free parameters in the VAR, but it also means that we can compare the forecasting performance of the structural model to the benchmark models from the macro and asset-pricing literatures. Figure 3.11 plots root mean squared errors for the forecasts from the VAR(2) and the structural model. All of the variables are forecasted in levels, rather than differences. Because the structural model is so difficult to estimate, I only consider in-sample forecasting performance. For the macro variables, the structural model has somewhat weaker forecasting power than the VAR. The place where the divergence is notable is for inflation. At long horizons, the structural model essentially says that the level factor in interest rates should equal expected inflation since they are both driven by the Fed’s inflation target. Over this sample, however, the level factor fell slowly, while inflation fell rapidly in the early 1980s and then was low and stable. Forecasting long-term inflation with the level factor in this sample was thus unsuccessful compared to a simple random-walk forecast for inflation (which is basically what the VAR implies). For the bond yields, the structural model forecasts well. At long horizons, the RMSE generated by the structural model is smaller than that of the VAR by 10–50 percent, with the largest improvements coming for long-term bonds.
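
The forecast comparison above rests on in-sample RMSEs of level forecasts at several horizons. The sketch below shows that calculation only; it is not the FAVAR or the structural model. The series and the two forecasting rules (a random walk and an AR(1) with known parameters) are hypothetical stand-ins.

```python
import math

def rmse_by_horizon(series, forecast, horizons):
    """In-sample RMSE of h-step-ahead level forecasts.

    forecast(y_t, h) returns the date-t forecast of y_{t+h}.
    """
    result = {}
    for h in horizons:
        errors = [series[t + h] - forecast(series[t], h)
                  for t in range(len(series) - h)]
        result[h] = math.sqrt(sum(e * e for e in errors) / len(errors))
    return result

def random_walk(y, h):
    """Random-walk forecast: the future level equals today's level."""
    return y

def make_ar1(phi, mean):
    """AR(1) forecast that decays toward the mean at rate phi per period."""
    def forecast(y, h):
        return mean + phi ** h * (y - mean)
    return forecast

# Hypothetical series: smooth decay toward a long-run level of 2.0.
series = [2.0 + 3.0 * 0.8 ** t for t in range(40)]
horizons = [1, 4, 8]

rw_rmse = rmse_by_horizon(series, random_walk, horizons)
ar_rmse = rmse_by_horizon(series, make_ar1(0.8, 2.0), horizons)
```

In this constructed example the AR(1) rule matches the data-generating process exactly, so its RMSE is zero up to rounding, while the random walk's errors grow with the horizon.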

3.8 Conclusion
This paper studies bond pricing in a medium-scale New-Keynesian model with a time-varying price of risk. I show that the model can generate a large and volatile term premium. The term premium is driven by the combination of two factors: a negative response of interest rates to positive technology shocks and variation in risk aversion. Removing either of these effects eliminates the model’s ability to match the magnitude of the term premium.

Figure 3.11: Forecast RMSE

[Panels: output, consumption, investment, hours worked, real wage, inflation, 1-quarter yield, 3-year yield, 10-year yield. Series: structural model and VAR(2).]

Note: Root mean squared error from forecasts of the endogenous variables. All variables are forecasted in levels. The horizontal axis gives the forecast horizon in quarters.

While shocks to risk aversion and technology determine average asset returns, they have only weak effects on real variables at business-cycle frequencies. The covariance of asset returns with real variables over the business cycle is therefore unimportant for determining average returns. It is true that the Federal Reserve tends to cut interest rates in recessions, but the model shows that most recessions are not high-marginal-utility states of the world. So the usual intuition that the Taylor rule should lead to a downward-sloping yield curve is inaccurate since it does not take into account the difference between the shocks that have high risk prices and the shocks that drive the business cycle. Furthermore, while risk aversion is estimated to be highly volatile and to be an important determinant of the dynamics of the term spread, it has almost no effect on the real economy. This model thus suggests that there is a separation between the price of risk in financial markets and the real economy.


APPENDIX


A. APPENDIX TO CHAPTER 1

A.1 The approximation for average duration
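Now using the investment function from the text, we have

\[
I_{it} = \exp\left(\gamma \log(\eta_{1i}) + \gamma(\mu_{t+1} - r_{t+1}) + \gamma \log B_{it} + \log\left(1 + \exp(\mu_{t+2} - r_{t+2})(1 - \delta_i)\right)\right) \tag{A.1}
\]
\[
= \exp\left(\gamma(\mu_{t+1} - r_{t+1})\right) \exp\left(\gamma \log(\eta_{1i}) + \gamma \log B_{it} + \log\left(1 + \exp(\mu_{t+2} - r_{t+2})(1 - \delta_i)\right)\right) \tag{A.2}
\]

Using the definition of $\bar{D}_t$ and the investment function, we have

\[
\bar{D}_t = \frac{\sum_i \exp\left(\gamma \log \eta_{1i} + \gamma \log B_{it} + \log\left(1 + \exp(\mu_{t+2} - r_{t+2})(1 - \delta_i)\right)\right) D(\delta_i)}{\sum_i \exp\left(\gamma \log \eta_{1i} + \gamma \log B_{it} + \log\left(1 + \exp(\mu_{t+2} - r_{t+2})(1 - \delta_i)\right)\right)} \tag{A.3}
\]

Now we take a linear approximation to

\[
\exp\left(\gamma \log \eta_{1i} + \gamma \log B_{it} + \log\left(1 + \exp(\mu_{t+2} - r_{t+2})(1 - \delta_i)\right)\right) \tag{A.4}
\]

around the point $\left(\overline{\log \eta_1}, \overline{\log B_t}, \bar{\delta}\right)$, where $\overline{\log \eta_1} \equiv N^{-1} \sum_i \log \eta_{1i}$, $\overline{\log B_t} \equiv N^{-1} \sum_i \log B_{it}$, and $\bar{\delta} \equiv N^{-1} \sum_i \delta_i$. Also define

\[
\widehat{\log(\eta_{1i})} \equiv \log(\eta_{1i}) - \overline{\log \eta_1}, \qquad
\widehat{\log(B_{it})} \equiv \log(B_{it}) - \overline{\log B_t}, \qquad
\hat{\delta}_i \equiv \delta_i - \bar{\delta}
\]

We have

\[
\exp\left(\gamma \log \eta_{1i} + \gamma \log B_{it} + \log\left(1 + \exp(\mu_{t+2} - r_{t+2})(1 - \delta_i)\right)\right)
\approx \exp\left(\gamma \overline{\log \eta_1} + \gamma \overline{\log B_t} + \log\left(1 + \exp(\mu_{t+2} - r_{t+2})(1 - \bar{\delta})\right)\right)
\times \left[1 + \gamma \widehat{\log(\eta_{1i})} + \gamma \widehat{\log(B_{it})} - \hat{\delta}_i \frac{\exp(\mu_{t+2} - r_{t+2})}{1 + \exp(\mu_{t+2} - r_{t+2})(1 - \bar{\delta})}\right] \tag{A.5}
\]

Simple algebra and the observation that

\[
\sum_i \left[1 + \gamma \widehat{\log(\eta_{1i})} + \gamma \widehat{\log(B_{it})} - \hat{\delta}_i \frac{\exp(\mu_{t+2} - r_{t+2})}{1 + \exp(\mu_{t+2} - r_{t+2})(1 - \bar{\delta})}\right] = N \tag{A.6}
\]

yields the result

\[
\bar{D}_t \approx N^{-1} \sum_i \left[1 + \gamma \widehat{\log(\eta_{1i})} + \gamma \widehat{\log(B_{it})} - \hat{\delta}_i \frac{\exp(\mu_{t+2} - r_{t+2})}{1 + \exp(\mu_{t+2} - r_{t+2})(1 - \bar{\delta})}\right] D_i \tag{A.7}
\]
\[
= N^{-1} \sum_i \left[1 + \gamma \widehat{\log(\eta_{1i})} + \gamma \widehat{\log(B_{it})}\right] D_i - \frac{\exp(\mu_{t+2} - r_{t+2})}{1 + \exp(\mu_{t+2} - r_{t+2})(1 - \bar{\delta})} N^{-1} \sum_i \hat{\delta}_i D_i \tag{A.8}
\]

The term $N^{-1} \sum_i \left[1 + \gamma \widehat{\log(\eta_{1i})}\right] D_i$ is constant over time, and we thus have the desired result from the text.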

A.2 Further robustness tests
This section describes table A.1, which has extra robustness tests for the regressions of $\bar{D}$ on the term spread and other controls. Column 1 includes up to three lags of the term spread. The second lag enters significantly, and with a coefficient slightly larger than the first lag. This sort of lagged response to price changes is commonly found in the literature. It is generally interpreted as being due to planning and delivery lags. The second column includes every one of the other controls simultaneously, instead of individually as in tables 2 and 3. The result is that the coefficients on the various other controls all become marginally significant at best, while the term spread is still highly significant. There is thus

Figure A.1: Further robustness tests

Term Spread(t-1) Term Spread(t-2) Term Spread(t-3) Unemployment(t-1) GDP(t) GDP(t-1) Investment(t) Investment(t-1) SD_returns(t+1) SD_returns(t) SD_returns(t-1) N R2
Note: See tables 2 and 3.

(1) -0.23 ** [0.10] -0.24 *** [0.08] 0.11 [0.09]

0.36 *** [0.10]

(2) -0.40 [0.06] -0.27 [0.08] 0.12 [0.07] 0.18 [0.19] -0.05 [0.12] 0.55 [0.26] -0.08 [0.17] -0.12 [0.21] -0.23 [0.09]

*** ***

(3) -0.46 *** [0.08]

(4) -0.57 *** [0.09]

*

**

0.02 [0.20] 0.02 [0.13] -0.24 [0.22] -0.24 [0.16] 0.06 [0.19] -0.19 ** [0.09]

59 0.48

47 0.74

47 0.69

-0.28 *** [0.07] 0.15 ** [0.07] -0.08 [0.05] 45 0.56

something different about the term spread from all of the other business cycle, volatility, and investment controls. Column 3 is identical to column 2 except it only uses one lag of the term spread, and the coefficient is still significant at the one percent level. Finally, column 4 includes the current and lagged values of volatility instead of just the leading value. The leading value has a negative sign, indicating that high future volatility lowers duration today (consistent with a model of irreversible investment). The current value has a positive sign. One way to reconcile this is if the volatility variable should actually enter as a first difference: an increase in volatility lowers average duration, instead of a high value by itself. The lagged level of volatility is uncorrelated with $\bar{D}_t$.


B. APPENDIX TO CHAPTER 2

B.1 The certainty equivalent
This section looks at the relationship between the certainty equivalents using $G^{habit}$ and $G^{TV}$. I first show that the two certainty equivalents are equal up to a second order approximation around the non-stochastic version of the model. Next, I show that in the continuous-time limit, the preferences associated with the two certainty equivalents are identical.

B.1.1 Second-order approximation

This section approximates the certainty equivalent $G^{-1}(E_t(G(V_{t+1})))$, where $V_{t+1} = V_t \times (1 + \sigma \varepsilon_{t+1})$, around the point $\sigma = 0$. We assume that $E_t \varepsilon_{t+1} = 0$ and $E_t \varepsilon_{t+1}^2 = 1$. Now consider the derivative of $G^{-1}(E_t(G(V_{t+1})))$ with respect to $\sigma$,

\[
\frac{d}{d\sigma} G^{-1}(E_t(G(V_{t+1}))) = \frac{\frac{d}{d\sigma} E_t(G(V_{t+1}))}{G'\left(G^{-1}(E_t(G(V_{t+1})))\right)} \tag{B.1}
\]

We have

\[
\frac{d}{d\sigma} E_t(G(V_{t+1})) = \int G'(V_t(1 + \sigma \varepsilon_{t+1})) \, \varepsilon_{t+1} V_t \, dF(\varepsilon_{t+1}) \tag{B.2}
\]

where $F$ is the cdf of $\varepsilon_{t+1}$. Evaluated at $\sigma = 0$, $\frac{d}{d\sigma} E_t(G(V_{t+1})) = 0$, and therefore

\[
\frac{d}{d\sigma} G^{-1}(E_t(G(V_{t+1}))) = 0 \tag{B.3}
\]

So all certainty equivalents taking this form are identical up to the first order in approximations around $\sigma = 0$.

Next, consider the second derivative,

\[
\frac{d^2}{d\sigma^2} G^{-1}(E_t(G(V_{t+1}))) = \frac{G'\left(G^{-1}(E_t(G(V_{t+1})))\right) \frac{d^2}{d\sigma^2} E_t(G(V_{t+1})) - \frac{d}{d\sigma} E_t(G(V_{t+1})) \, \frac{d}{d\sigma} G'\left(G^{-1}(E_t(G(V_{t+1})))\right)}{\left[G'\left(G^{-1}(E_t(G(V_{t+1})))\right)\right]^2} \tag{B.4}
\]

Since $\frac{d}{d\sigma} E_t(G(V_{t+1}))$ is equal to zero at $\sigma = 0$, we can ignore the second term in the numerator. The second derivative of the expectation is

\[
\frac{d^2}{d\sigma^2} E_t(G(V_{t+1})) = \int G''(V_t(1 + \sigma \varepsilon_{t+1})) \, \varepsilon_{t+1}^2 V_t^2 \, dF(\varepsilon_{t+1}) \tag{B.5}
\]

At $\sigma = 0$, $\left.\frac{d^2}{d\sigma^2} E_t(G(V_{t+1}))\right|_{\sigma=0} = G''(V_t) V_t^2$. We also have $\left.G'\left(G^{-1}(E_t(G(V_{t+1})))\right)\right|_{\sigma=0} = G'(V_t)$, and hence

\[
\left.\frac{d^2}{d\sigma^2} G^{-1}(E_t(G(V_{t+1})))\right|_{\sigma=0} = \frac{G''(V_t) V_t^2}{G'(V_t)}
\]

So any two choices of $G$, say $G_1$ and $G_2$, are equivalent up to the second order if

\[
\frac{G_1''(V_t)}{G_1'(V_t)} = \frac{G_2''(V_t)}{G_2'(V_t)}
\]

for any $V_t$. That relationship holds for $G^{habit}$ and $G^{TV}$.
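
The second-order result can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the power case $G(V) = V^{1-\alpha}$, for which $G''(V)/G'(V) = -\alpha/V$, and a hypothetical two-point shock ($\varepsilon = \pm 1$ with equal probability, so that its mean is zero and its variance is one). The parameter values are arbitrary.

```python
def certainty_equivalent(v, sigma, alpha):
    """Exact G^{-1}(E[G(V(1 + sigma*eps))]) for G(V) = V**(1 - alpha),
    with eps equal to +1 or -1 with probability 1/2 each."""
    def g(x):
        return x ** (1.0 - alpha)
    def g_inv(y):
        return y ** (1.0 / (1.0 - alpha))
    expected = 0.5 * g(v * (1.0 + sigma)) + 0.5 * g(v * (1.0 - sigma))
    return g_inv(expected)

def second_order_ce(v, sigma, alpha):
    """Second-order expansion around sigma = 0:
    V + 0.5 * sigma**2 * [G''(V) / G'(V)] * V**2, with G''/G' = -alpha / V."""
    curvature = -alpha / v
    return v + 0.5 * sigma ** 2 * curvature * v ** 2

# Hypothetical values for illustration.
v, sigma, alpha = 1.0, 0.02, 5.0
exact = certainty_equivalent(v, sigma, alpha)
approx = second_order_ce(v, sigma, alpha)
```

With a symmetric shock the odd-order terms vanish, so the gap between `exact` and `approx` shrinks roughly with the fourth power of `sigma`.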

B.1.2 Continuous time

Duffie and Epstein (1992) show how to extend Epstein–Zin preferences to continuous time. They derive a utility function following the process

\[
dV_t = \mu_t \, dt + \sigma_t \, dB_t \tag{B.6}
\]
\[
= \left[-f(c_t, V_t) - \frac{1}{2} A(V_t) \sigma_t \sigma_t'\right] dt + \sigma_t \, dB_t \tag{B.7}
\]

for a Wiener process $B_t$. As in the main text, suppose the household’s certainty equivalent under discrete-time Epstein–Zin preferences is $G^{-1}(E_t(G(V_{t+1})))$. Duffie and Epstein (1992) show that the analogous choice of $A$, obtained as a limiting case as the length of time periods approaches zero, is $A(V_t) = G''(V_t)/G'(V_t)$.

In the case where $G^{power}(V_t) = V_t^{1-\alpha}$, we have

\[
A^{power}(V_t) = \frac{G^{power\prime\prime}(V_t)}{G^{power\prime}(V_t)} = \frac{-\alpha}{V_t} \tag{B.8}
\]

and for $G^{habit} = (V_t - H_t)^{1-\alpha}$,

\[
A^{habit}(V_t) = \frac{-\alpha}{V_t - H_t} \tag{B.9}
\]

For $G^{TV} = V_t^{1-\alpha_t}$,

\[
A^{TV}(V_t) = \frac{-\alpha_t}{V_t} \tag{B.10}
\]

So $G^{TV}$ and $G^{habit}$ are identical if $\alpha_t = \alpha \frac{V_t}{V_t - H_t}$, which is what is used in the text.

For all three choices of the certainty equivalent $G$, we can use the standard choice for $f$,

\[
f(c_t, V_t) = \frac{\beta}{1-\rho} \, \frac{c_t^{1-\rho} - V_t^{1-\rho}}{V_t^{-\rho}}
\]

$\rho$ then determines the elasticity of intertemporal substitution, while $A$ determines risk aversion.
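
The equivalence between (B.9) and (B.10) can be verified with finite differences. This is a sketch under stated assumptions: the values of $\alpha$, $H_t$, and $V_t$ are hypothetical, and $\alpha_t$ is held fixed at the evaluation point, in line with the treatment of $\alpha_t$ as exogenous to the household.

```python
def curvature(g, v, h=1e-4):
    """Finite-difference estimate of A(v) = G''(v) / G'(v)."""
    first = (g(v + h) - g(v - h)) / (2.0 * h)
    second = (g(v + h) - 2.0 * g(v) + g(v - h)) / h ** 2
    return second / first

# Hypothetical values for illustration.
alpha, habit, v = 3.0, 0.4, 1.0
alpha_t = alpha * v / (v - habit)   # the mapping used in the text

def g_habit(x):
    """Habit aggregator (V - H)**(1 - alpha)."""
    return (x - habit) ** (1.0 - alpha)

def g_tv(x):
    """Time-varying risk aversion aggregator V**(1 - alpha_t)."""
    return x ** (1.0 - alpha_t)

a_habit = curvature(g_habit, v)     # analytic value: -alpha / (v - habit)
a_tv = curvature(g_tv, v)           # analytic value: -alpha_t / v
```

Both estimates should agree with each other and with the analytic expressions in (B.9) and (B.10), up to finite-difference error.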

B.2 Derivation of the SDF
We can obtain the stochastic discount factor (SDF) by calculating the intertemporal marginal rate of substitution. We calculate two derivatives. First,

\[
\frac{\partial V_t}{\partial C_t} = V_t^{\rho} (1 - \exp(-\beta)) C_t^{-\rho} \tag{B.11}
\]

Next, we differentiate $V_t$ with respect to $C_{t+1}(w)$, where $w$ denotes one state of the world, and $\pi_w$ is the probability of that state,

\[
\frac{\partial V_t}{\partial C_{t+1}(w)} = \pi_w V_t^{\rho} \exp(-\beta) R_t^{-\rho} \, G_t^{(-1)\prime}\left(E_t[G_t(V_{t+1}(w))]\right)
\times G_t'(V_{t+1}(w)) \, V_{t+1}(w)^{\rho} (1 - \exp(-\beta)) C_{t+1}(w)^{-\rho} \tag{B.12}
\]

where $G_t'$ is the derivative of $G_t$ and $G_t^{(-1)\prime}$ the derivative of $G_t^{-1}$. $R_t \equiv G_t^{-1}(E_t G_t(V_{t+1}))$.

The subscripts on $G_t$ refer to the fact that $G_t$ depends on the potentially-time-varying parameter $\alpha_t$. The assumption that $\alpha_t$ is exogenous to the household is necessary for this formula for the derivative to be correct (in the same way that external habits lead to a more tractable formula for the SDF than do internal habits). The SDF can be derived from a consumer’s first order conditions for optimization as

\[
M_{t+1}(w) = \frac{1}{\pi_w} \frac{\partial V_t / \partial C_{t+1}(w)}{\partial V_t / \partial C_t}
\]

We then have

\[
M_{t+1}(w) = \exp(-\beta) \frac{G_t'(V_{t+1}(w)) \, V_{t+1}(w)^{\rho} \, C_{t+1}(w)^{-\rho}}{G_t'(R_t) \, R_t^{\rho} \, C_t^{-\rho}} \tag{B.13}
\]

where the last line follows from the fact that $G_t^{(-1)\prime}(G_t(x)) = 1/G_t'(x)$.

In the case of $G_t(V) = V^{1-\alpha_t}$, the SDF becomes

\[
M_{t+1} = \exp(-\beta) \frac{V_{t+1}(w)^{\rho - \alpha_t} \, C_{t+1}(w)^{-\rho}}{R_t^{\rho - \alpha_t} \, C_t^{-\rho}} \tag{B.14}
\]
B.2.1

Substituting in an asset return

Consider an asset that pays Ct as its dividend. We guess that its cum-dividend price is Wt = Vt
1− ρ

Ct (1 − exp (− β))−1 . This guess can be conﬁrmed by simply inserting it into
ρ

the household’s Euler equation. The return on the consumption claim is Vt+1 Wt+1 = = Wt − Ct exp (− β) Rt (Vt+1 )1−ρ
1− ρ

Rw,t+1

Ct+1 Ct

ρ

(B.15)

Which yields Vt+1 t (w) Rt
ρ−αt ρ−α

= ( Rw,t+1 exp (− β))

ρ−αt 1− ρ

Ct+1 Ct

−ρ

ρ−αt 1− ρ

(B.16)

We can then insert this into the SDF to yield
1− α t 1− ρ

Mt+1 = exp (− β)

Ct+1 Ct

−ρ

1− α t 1− ρ

1− Rw,tρ 1 +

ρ−αt

(B.17)
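
The substitution can be checked numerically. The sketch below codes (B.14), (B.15), and (B.17) as reconstructed above and compares the two forms of the SDF; the function names and the input values are hypothetical, and the formulas require $\rho \neq 1$.

```python
import math

def sdf_value_form(beta, rho, alpha_t, v_next, cert_equiv, cons_growth):
    """SDF written with continuation values, as in (B.14)."""
    return (math.exp(-beta)
            * (v_next / cert_equiv) ** (rho - alpha_t)
            * cons_growth ** (-rho))

def wealth_return(beta, rho, v_next, cert_equiv, cons_growth):
    """Gross return on the consumption claim, as in (B.15)."""
    return (v_next ** (1.0 - rho)
            / (math.exp(-beta) * cert_equiv ** (1.0 - rho))
            * cons_growth ** rho)

def sdf_return_form(beta, rho, alpha_t, cons_growth, r_wealth):
    """SDF written with the wealth return, as in (B.17)."""
    theta = (1.0 - alpha_t) / (1.0 - rho)
    return (math.exp(-beta * theta)
            * cons_growth ** (-rho * theta)
            * r_wealth ** ((rho - alpha_t) / (1.0 - rho)))

# Hypothetical inputs for illustration.
beta, rho, alpha_t = 0.01, 2.0, 8.0
v_next, cert_equiv, cons_growth = 1.05, 1.02, 1.01

r_w = wealth_return(beta, rho, v_next, cert_equiv, cons_growth)
m_value = sdf_value_form(beta, rho, alpha_t, v_next, cert_equiv, cons_growth)
m_return = sdf_return_form(beta, rho, alpha_t, cons_growth, r_w)
```

When `alpha_t` equals `rho`, the exponent on the wealth return is zero and the return form collapses to the familiar power-utility SDF.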

B.3 The log-linear model with production
B.3.1 Steady state

In the nonstochastic steady state, the interest rate earned by all assets, $r$, is equal to

\[
r = \beta + \rho \mu \tag{B.18}
\]

Standard manipulations show that the steady-state ratio of capital to technology is then

\[
\bar{K} = \left(\frac{\exp(\beta + \rho \mu) - 1 + \delta}{\gamma}\right)^{1/(\gamma - 1)} \tag{B.19}
\]

We can obtain the steady-state consumption-output ratio by using the budget constraint,

\[
\bar{C} = \bar{Y} + (1 - \delta) \bar{K} - \exp(\mu) \bar{K} \tag{B.20}
\]
\[
1 - \frac{\bar{C}}{\bar{Y}} = -(1 - \delta - \exp(\mu)) \frac{\bar{K}}{\bar{Y}} \tag{B.21}
\]

B.3.2 The budget constraint

The approximation I use for the budget constraint is identical to Campbell (1994). The budget constraint is $K_{t+1} = A_t^{1-\gamma} K_t^{\gamma} - C_t + (1 - \delta) K_t$. I look for a log-linear approximation taking the form $k_{t+1} = \lambda_0 + \lambda_k k_t + \lambda_a a_t + \lambda_c c_t$, where the $\lambda$ terms are coefficients from the approximation. The budget constraint can be rewritten as

\[
\log\left[\exp(\Delta k_{t+1}) - (1 - \delta)\right] = y_t - k_t + \log\left(1 - \exp(c_t - y_t)\right) \tag{B.22}
\]

Taking a log-linear approximation to the left-hand side around the point $\Delta k_{t+1} = \mu$, we have

\[
\log\left[\exp(\Delta k_{t+1}) - (1 - \delta)\right] \approx \log\left[\exp(\mu) - (1 - \delta)\right] + \frac{\exp(\mu)}{\exp(\mu) - (1 - \delta)} (\Delta k_{t+1} - \mu) \tag{B.23}
\]

To approximate the right-hand side of (B.22), we approximate $\log(1 - \exp(c_t - y_t))$ around the steady state $cy$,

\[
\log\left(1 - \exp(c_t - y_t)\right) \approx \log\left(1 - \exp(cy)\right) + \frac{-\exp(cy)}{1 - \exp(cy)} (c_t - y_t - cy) \tag{B.24}
\]


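This implies

\[
\frac{\exp(\mu)}{\exp(\mu) - (1 - \delta)} (\Delta k_{t+1} - \mu) \approx \log(1 - C/Y) + y_t - k_t + \frac{-\exp(cy)}{1 - \exp(cy)} (c_t - y_t - cy) \tag{B.25}
\]

Now we can find the coefficients in the linear approximation to the budget constraint. The constant term is

\[
\lambda_0 = \frac{\exp(\mu) - (1 - \delta)}{\exp(\mu)} \left[\log\left[\exp(\mu) - (1 - \delta)\right] + \log(1 - C/Y) - \frac{-\exp(cy)}{1 - \exp(cy)} \, cy\right] \tag{B.26}
\]

The coefficients on $k$, $a$, and $c$ are then

\[
\lambda_k = \frac{\exp(\mu) - (1 - \delta)}{\exp(\mu)} \left[\gamma - 1 - \frac{-\exp(cy)}{1 - \exp(cy)} \gamma\right] + 1 \tag{B.27}
\]
\[
\lambda_a = \frac{\exp(\mu) - (1 - \delta)}{\exp(\mu)} (1 - \gamma) - \frac{\exp(\mu) - (1 - \delta)}{\exp(\mu)} \, \frac{-\exp(cy)}{1 - \exp(cy)} (1 - \gamma) \tag{B.28}
\]
\[
\lambda_c = \frac{\exp(\mu) - (1 - \delta)}{\exp(\mu)} \, \frac{-\exp(cy)}{1 - \exp(cy)} \tag{B.29}
\]

Now note that $\lambda_k + \lambda_a + \lambda_c = 1$, so we have

\[
k_{t+1} = \lambda_0 + \lambda_k k_t + \lambda_a a_t + (1 - \lambda_k - \lambda_a) c_t \tag{B.30}
\]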
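
The steady state and the slope coefficients of the log-linear budget constraint can be computed directly. The sketch below is an illustration under stated assumptions: the parameter values are hypothetical, output per unit of technology is taken to be $\bar{K}^{\gamma}$ (which presumes the production function $Y_t = A_t^{1-\gamma} K_t^{\gamma}$ implied by the capital-return formula), and only (B.18), (B.19), (B.21), and (B.27) to (B.29) are coded.

```python
import math

def steady_state(beta, rho, mu, delta, gamma):
    """Steady-state interest rate, capital/technology ratio, and C/Y."""
    r = beta + rho * mu                                                  # (B.18)
    k_bar = ((math.exp(r) - 1.0 + delta) / gamma) ** (1.0 / (gamma - 1.0))  # (B.19)
    y_bar = k_bar ** gamma        # assumed: output per unit of technology
    c_over_y = 1.0 + (1.0 - delta - math.exp(mu)) * k_bar / y_bar       # (B.21)
    return r, k_bar, c_over_y

def budget_coefficients(mu, delta, gamma, c_over_y):
    """Coefficients on k, a, and c in the log-linear budget constraint."""
    scale = (math.exp(mu) - (1.0 - delta)) / math.exp(mu)
    chi = -c_over_y / (1.0 - c_over_y)   # -exp(cy) / (1 - exp(cy))
    lam_k = scale * (gamma - 1.0 - chi * gamma) + 1.0                    # (B.27)
    lam_a = scale * (1.0 - gamma) - scale * chi * (1.0 - gamma)          # (B.28)
    lam_c = scale * chi                                                  # (B.29)
    return lam_k, lam_a, lam_c

# Hypothetical quarterly parameter values for illustration.
beta, rho, mu, delta, gamma = 0.01, 2.0, 0.005, 0.025, 0.33

r, k_bar, c_over_y = steady_state(beta, rho, mu, delta, gamma)
lam_k, lam_a, lam_c = budget_coefficients(mu, delta, gamma, c_over_y)
```

Two checks follow from the text: the gross return on capital at `k_bar` equals `exp(r)`, and the three coefficients sum to one, as used in (B.30).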

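B.3.3 Capital return

To approximate the return on capital, we say

\[
r_{k,t+1} = \log\left(\gamma \exp\left((1 - \gamma)(a_{t+1} - k_{t+1})\right) + 1 - \delta\right) \tag{B.31}
\]
\[
\approx \log\left(\gamma \exp\left((\gamma - 1) \bar{k}\right) + 1 - \delta\right) + \frac{(\gamma - 1) \gamma \exp\left((\gamma - 1) \bar{k}\right)}{\gamma \exp\left((\gamma - 1) \bar{k}\right) + 1 - \delta} \left(k_{t+1} - a_{t+1} - \bar{k}\right) \tag{B.32}
\]
\[
r_{k,t+1} \approx r + r_{kk} \left(\tilde{k}_{t+1} - \bar{k}\right) \tag{B.33}
\]

where $r_{kk} \equiv (\gamma - 1)(\exp(r) - 1 + \delta)/\exp(r)$.

B.3.4 Risk aversion

I guess that the innovation to the value function can be written as $\kappa_v \varepsilon_{t+1}$, so that

\[
\alpha_{t+1} = \phi \alpha_t + (1 - \phi) \bar{\alpha} + \lambda \kappa_v \varepsilon_{t+1} \tag{B.34}
\]

I confirm this guess below.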

B.3.5

Consumption dynamics

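Writing $\tilde{k}_t \equiv k_t - a_t$ and $\tilde{c}_t \equiv c_t - a_t$, we have

$$\tilde{k}_{t+1} = \lambda_0 + \lambda_k \tilde{k}_t + \lambda_c \tilde{c}_t - \sigma_a\varepsilon_{a,t+1} - \mu \tag{B.35}$$

Now we guess that the consumption function is $\tilde{c}_t = \eta_{c0} + \eta_{ck}\tilde{k}_t + \eta_{c\alpha}\alpha_t$ (note here that I use $\lambda$ terms for the budget constraint, which are terms depending only on the underlying parameters of the model; the $\eta$ terms are coefficients from the optimal consumption rule). Then we have

$$\tilde{k}_{t+1} = \lambda_0 + \lambda_k\tilde{k}_t + \lambda_c\left(\eta_{c0} + \eta_{ck}\tilde{k}_t + \eta_{c\alpha}\alpha_t\right) - \sigma_a\varepsilon_{a,t+1} - \mu \tag{B.36}$$

$$= \eta_{k0} + \eta_{kk}\tilde{k}_t + \eta_{k\alpha}\alpha_t - \sigma_a\varepsilon_{a,t+1} \tag{B.37}$$

$$\eta_{k0} \equiv \lambda_0 + \lambda_c\eta_{c0} - \mu$$
$$\eta_{kk} \equiv \lambda_k + \lambda_c\eta_{ck}$$
$$\eta_{k\alpha} \equiv \lambda_c\eta_{c\alpha}$$

This equation specifies the dynamics of capital conditional on the underlying parameters of the model and the two unknown coefficients determining the dynamics of consumption. For consumption growth, we say $\Delta c_{t+1} = \eta_{d0} + \eta_{dk}\tilde{k}_t + \eta_{da}\alpha_t + \kappa_d\varepsilon_{t+1}$, where

$$\eta_{d0} \equiv \eta_{ck}\eta_{k0} - \eta_{ca}(\phi_\alpha - 1)\bar{\alpha} + \mu$$
$$\kappa_d \equiv \sigma_a(1 - \eta_{ck}) + \eta_{ca}\lambda\kappa_v$$
$$\eta_{dk} \equiv \eta_{ck}(\eta_{kk} - 1)$$
$$\eta_{da} \equiv \eta_{ck}\eta_{k\alpha} + \eta_{ca}(\phi_\alpha - 1)$$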


The remainder of the appendix conﬁrms that our guesses for the form of the consumption and value functions are correct.

B.3.6

Wealth return

In the presence of balanced growth, the long-run response of consumption to an innovation of $\sigma_a\varepsilon_{t+1}$ to technology must be exactly $\sigma_a\varepsilon_{t+1}$. This is equivalent to saying that

$$\Delta E_{t+1}\sum_{j=0}^{\infty}\Delta c_{t+j+1} = \sigma_a\varepsilon_{t+1} \tag{B.38}$$

In the case where $\theta$ approaches 1 (the steady-state dividend/price ratio approaches zero) or the consumption response only takes one period, $\Delta E_{t+1}\sum_{j=0}^{\infty}\Delta c_{t+j+1} = \Delta E_{t+1}\sum_{j=0}^{\infty}\theta^j\Delta c_{t+j+1}$. We therefore have the approximation,

$$r_{w,t+1} = E_t r_{w,t+1} + \sigma_a\varepsilon_{t+1} - \Delta E_{t+1}\sum_{j=1}^{\infty}\theta^j r_{w,t+j+1} \tag{B.39}$$

This extra approximation is not strictly necessary, and the model is straightforward to solve without it. However, it substantially simplifies many of the formulas and makes them more transparent. The results reported below on the accuracy of the log-linear solution apply to the solution using this approximation. Now note that

$$\Delta E_{t+1}\sum_{j=1}^{\infty}\theta^j r_{w,t+j+1} = \Delta E_{t+1}\sum_{j=1}^{\infty}\theta^j\left(\alpha_{t+j}\eta_{wa} + \rho E_{t+j}\Delta c_{t+j}\right) \tag{B.40}$$

$$= \eta_{wa}\frac{\theta}{1-\theta\phi}\sigma_{aa} + \rho\left(\eta_{ck}\sigma_a - \eta_{ca}\sigma_{aa}\right) \tag{B.41}$$

The second term follows from the approximation

$$\Delta E_{t+1}\sum_{j=1}^{\infty}\theta^j\rho E_{t+j}\Delta c_{t+j} \approx \Delta E_{t+1}\sum_{j=1}^{\infty}\rho E_{t+j}\Delta c_{t+j}$$

The right hand side of this equation is simply $\rho$ multiplied by the total amount of consumption growth expected following period $t+1$. Since we know that in the long run, the consumption-technology ratio is stationary, we just need to know how much consumption declines relative to technology at period $t+1$. That decline is exactly $\eta_{ck}\sigma_a\varepsilon_{t+1} - \eta_{ca}\sigma_{aa}\varepsilon_{t+1}$ (since capital falls by $\sigma_a$ and $\alpha$ falls by $-\sigma_{aa}$). We then have

$$\kappa_r = \sigma_a - \eta_{wa}\frac{\theta}{1-\theta\phi}\sigma_{aa} - \rho\left(\eta_{ck}\sigma_a - \eta_{ca}\sigma_{aa}\right) \tag{B.42}$$

The return is thus

$$r_{t+1} = \eta_{w0} + \rho E_t\Delta c_{t+1} + \eta_{wa}\alpha_t + \sigma_a\varepsilon_{t+1} + \left(-\eta_{wa}\frac{\theta}{1-\theta\phi}\sigma_{aa} + \frac{\theta}{1-\theta\eta_{kk}}\eta_{dk}\sigma_a\right)\varepsilon_{t+1} \tag{B.43}$$

B.3.7

The Euler equation for wealth

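The asset pricing equation gives us

$$1 = E_t\left[\exp(-\beta)^{\frac{1-\alpha_t}{1-\rho}}\left(\frac{C_{t+1}}{C_t}\right)^{-\rho\frac{1-\alpha_t}{1-\rho}} R_{w,t+1}^{\frac{1-\alpha_t}{1-\rho}}\right] \tag{B.44}$$

The log of the term inside the expectation is

$$\log\left[\exp(-\beta)^{\frac{1-\alpha_t}{1-\rho}}\left(\frac{C_{t+1}}{C_t}\right)^{-\rho\frac{1-\alpha_t}{1-\rho}} R_{w,t+1}^{\frac{1-\alpha_t}{1-\rho}}\right] = -\frac{1-\alpha_t}{1-\rho}\beta - \rho\frac{1-\alpha_t}{1-\rho}\Delta c_{t+1} + \frac{1-\alpha_t}{1-\rho}\left(\eta_{w0} + \rho E_t\Delta c_{t+1} + \eta_{wa}\alpha_t + \kappa_r\varepsilon_{t+1}\right) \tag{B.45}$$

$$= -\frac{1-\alpha_t}{1-\rho}\beta - \rho\frac{1-\alpha_t}{1-\rho}\kappa_d\varepsilon_{t+1} + \frac{1-\alpha_t}{1-\rho}\left(\eta_{w0} + \eta_{wa}\alpha_t + \kappa_r\varepsilon_{t+1}\right) \tag{B.46}$$

Now taking the log of the expectation and dividing by $\frac{1-\alpha_t}{1-\rho}$ gives

$$0 = -\beta + \eta_{w0} + \eta_{wa}\alpha_t + \frac{1}{2}\frac{1-\alpha_t}{1-\rho}\left(-\rho\kappa_d + \kappa_r\right)^2 \tag{B.47}$$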


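which implies

$$\eta_{w0} = \beta - \frac{1}{2}\frac{1}{1-\rho}\left(-\rho\kappa_d + \kappa_r\right)^2 \tag{B.48}$$

$$\eta_{wa} = \frac{1}{2}\frac{1}{1-\rho}\left(-\rho\kappa_d + \kappa_r\right)^2 \tag{B.49}$$

$$\kappa_r = \sigma_a - \eta_{wa}\frac{\theta}{1-\theta\phi}\sigma_{aa} - \rho\left(\eta_{ck}\sigma_a - \eta_{ca}\sigma_{aa}\right) \tag{B.50}$$

B.3.8 The Euler equation for capital

The stochastic discount factor follows

$$m_{t+1} = -\beta - \rho E\Delta c_{t+1} - \rho\frac{1-\alpha_t}{1-\rho}\kappa_d\varepsilon_{t+1} + \frac{\rho-\alpha_t}{1-\rho}\left[\frac{1}{2}\frac{\alpha_t - 1}{1-\rho}\left(-\rho\kappa_d + \kappa_r\right)^2 + \kappa_r\varepsilon_{t+1}\right] \tag{B.51}$$

For the capital return we have

$$1 = E_t\exp\left(m_{t+1} + r + r_{kk}\left(\eta_{k0} + \eta_{kk}\tilde{k}_t + \eta_{k\alpha}\alpha_t - \sigma_a\varepsilon_{a,t+1} - \bar{k}\right)\right) \tag{B.52}$$

which implies

$$0 = -\beta - \rho E\Delta c_{t+1} + \frac{1}{2}\frac{(1-\alpha_t)}{(1-\rho)}\left(-\rho\kappa_d + \kappa_r\right)^2 + r + r_{kk}\left(\eta_{k0} + \eta_{kk}\tilde{k}_t + \eta_{k\alpha}\alpha_t - \bar{k}\right) + \frac{1}{2}\left(r_{kk}\sigma_a + \kappa_r\right)^2 - \left(r_{kk}\sigma_a + \kappa_r\right)\frac{1-\alpha_t}{1-\rho}\left(-\rho\kappa_d + \kappa_r\right) \tag{B.53}$$

Note that all of the nonlinearities disappear (i.e. the $\alpha_t^2$ terms), and this equation is linear in the state variables. We can thus solve through the method of undetermined coefficients as usual. For the coefficients on capital,

$$0 = -\rho\left(\lambda_k\eta_{ck} + \lambda_c\eta_{ck}^2 - \eta_{ck}\right) + r_{kk}\left(\lambda_k + \lambda_c\eta_{ck}\right) \tag{B.54}$$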


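This is quadratic in $\eta_{ck}$, and we have

$$\eta_{ck} = \frac{\rho(1-\lambda_k) + r_{kk}\lambda_c \pm \sqrt{\left(\rho(1-\lambda_k) + r_{kk}\lambda_c\right)^2 + 4\rho\lambda_c r_{kk}\lambda_k}}{2\rho\lambda_c} \tag{B.55}$$

Now note that $\lambda_c < 0$, $\lambda_k > 0$, and $r_{kk} < 0$. This implies that

$$\sqrt{\left(\rho(1-\lambda_k) + r_{kk}\lambda_c\right)^2 + 4\rho\lambda_c r_{kk}\lambda_k} > \rho(1-\lambda_k) + r_{kk}\lambda_c \tag{B.56}$$

and hence $\eta_{ck}$ has a positive and a negative root. The root where $\eta_{ck} < 0$ violates the transversality condition (high capital implies low consumption), so we choose the root with $\eta_{ck} > 0$. Note that the formula for $\eta_{ck}$ does not involve $\bar{\alpha}$ or $\sigma_{aa}$, which confirms remark 1. For the coefficients on $\alpha_t$, we have

$$0 = -\rho\eta_{da} - \frac{1}{2}\frac{\left(-\rho\kappa_d + \kappa_r\right)^2}{1-\rho} + r_{kk}\eta_{ka} + \left(r_{kk}\sigma_a + \kappa_r\right)\frac{\left(-\rho\kappa_d + \kappa_r\right)}{1-\rho} \tag{B.57}$$

B.3.9 Other parameters

To solve for $(-\rho\kappa_d + \kappa_r)$, simply combine the equations for $\eta_{wa}$ and $\kappa_r$, yielding

$$-\rho\kappa_d + \kappa_r = \frac{-1 + \sqrt{1 + 2\frac{\theta}{1-\theta\phi}\sigma_{aa}\sigma_a}}{\frac{1}{1-\rho}\frac{\theta}{1-\theta\phi}\sigma_{aa}} \tag{B.58}$$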

We choose the root for this equation that has the property that it approaches zero as σa approaches zero. That is, we know that when the shocks have zero variance, all assets have the same return, and so ηwa = 0.
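The two root selections above, the positive root for $\eta_{ck}$ in (B.55) and the root of (B.58) that vanishes as $\sigma_a \to 0$, can be illustrated with a short numerical sketch. The function names and all parameter values below are hypothetical and are not taken from the calibration in the text.

```python
import math

def eta_ck_positive_root(rho, lam_k, lam_c, r_kk):
    """Solve (B.54) for eta_ck via the quadratic formula (B.55), keep the positive root."""
    b = rho * (1.0 - lam_k) + r_kk * lam_c
    disc = b * b + 4.0 * rho * lam_c * r_kk * lam_k
    roots = [(b + s * math.sqrt(disc)) / (2.0 * rho * lam_c) for s in (1.0, -1.0)]
    positive = [x for x in roots if x > 0.0]
    if len(positive) != 1:
        raise ValueError("expected exactly one positive root")
    return positive[0]

def sdf_loading(rho, theta, phi, sigma_a, sigma_aa):
    """The quantity (-rho*kappa_d + kappa_r) from (B.58); this root goes to zero with sigma_a."""
    big_theta = theta / (1.0 - theta * phi)
    disc = 1.0 + 2.0 * big_theta * sigma_aa * sigma_a
    return (-1.0 + math.sqrt(disc)) / (big_theta * sigma_aa / (1.0 - rho))

# Hypothetical inputs chosen to satisfy lam_c < 0, lam_k > 0, r_kk < 0.
rho = 2.0 / 3.0
eta_ck = eta_ck_positive_root(rho, lam_k=1.01, lam_c=-0.09, r_kk=-0.02)
x = sdf_loading(rho, theta=0.99, phi=0.96, sigma_a=0.01, sigma_aa=0.5)
```

Under the sign restrictions stated in the text the discriminant exceeds the squared linear term, so the two roots of (B.55) have opposite signs and the filter returns a single value.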


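B.3.10 Excess returns and the risk-free rate (result 2)

To calculate excess returns, we can simply calculate the covariance of the wealth return with the SDF. The Sharpe ratio of the wealth portfolio is

$$\frac{E_t r_{w,t+1} - r_{f,t+1} + \frac{1}{2}\kappa_r^2}{\kappa_r} = \frac{\alpha_t - \rho}{1-\rho}\left(-\rho\kappa_d + \kappa_r\right) + \rho\kappa_d \tag{B.59}$$

$$= (\alpha_t - \rho)\frac{-1 + \sqrt{1 + 2\frac{\theta}{1-\theta\phi}\sigma_{aa}\sigma_a}}{\frac{\theta}{1-\theta\phi}\sigma_{aa}} + \rho\kappa_d \tag{B.60}$$

This also immediately gives a formula for the risk-free rate

$$E_t r_{w,t+1} - r_{f,t+1} + \frac{1}{2}\kappa_r^2 = \frac{\alpha_t - \rho}{1-\rho}\left(-\rho\kappa_d + \kappa_r\right)\kappa_r + \rho\kappa_d\kappa_r \tag{B.61}$$

$$r_{f,t+1} = \eta_{w0} + \rho E_t\Delta c_{t+1} + \eta_{wa}\alpha_t + \frac{1}{2}\kappa_r^2 - \frac{\alpha_t - \rho}{1-\rho}\left(-\rho\kappa_d + \kappa_r\right)\kappa_r - \rho\kappa_d\kappa_r \tag{B.62}$$

B.3.11 The wealth-consumption ratio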

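The Campbell–Shiller approximation for the wealth-consumption ratio is

$$w_t - c_t = \frac{z}{1-\theta} + E_t\sum_{j=0}^{\infty}\theta^j\left(\Delta c_{t+j+1} - r_{t+j+1}\right) \tag{B.63}$$

for a constant $z$ depending on the average consumption-wealth ratio (i.e. related to $\theta$), $z \equiv -\log\theta - (1-\theta)\log\left(\frac{1}{\theta} - 1\right)$. Now

$$E_t\sum_{j=0}^{\infty}\theta^j\Delta c_{t+j+1} = -\eta_{ck}\left(k_t - \bar{k}\right) - \eta_{ca}\left(\alpha_t - \bar{\alpha}\right) \tag{B.64}$$

under the approximation $\theta = 1$ from above, and

$$E_t\sum_{j=0}^{\infty}\theta^j r_{w,t+j+1} = E_t\sum_{j=0}^{\infty}\theta^j\left(\alpha_{t+j}\eta_{wa} + \rho E_{t+j}\Delta c_{t+j+1}\right) \tag{B.65}$$

So

$$w_t - c_t = \frac{z}{1-\theta} + (1-\rho)\left[-\eta_{ck}\left(k_t - \bar{k}\right) - \eta_{ca}\left(\alpha_t - \bar{\alpha}\right)\right] - \eta_{wa}\frac{\theta}{1-\theta\phi}\left(\alpha_t - \bar{\alpha}\right) \tag{B.66}$$

B.3.12 The value function and risk aversion (result 4)

At any time, household value is

$$W_t = V_t^{1-\rho} C_t^{\rho} / \left(1 - \exp(-\beta)\right) \tag{B.67}$$

$$v_t = \frac{\log\left(1 - \exp(-\beta)\right)}{1-\rho} + \frac{(w_t - c_t)}{1-\rho} + c_t \tag{B.68}$$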

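The innovation to the value function, $v_{t+1} - E_t v_{t+1}$, is equal to the sum of the innovations to $\frac{(w_{t+1} - c_{t+1})}{1-\rho}$ and $c_{t+1}$, which are

$$\Delta E_{t+1}\frac{(w_{t+1} - c_{t+1})}{1-\rho} + \Delta E_{t+1}c_{t+1} = \sigma_a\varepsilon_{t+1} - \frac{\eta_{wa}}{1-\rho}\frac{\theta}{1-\theta\phi}\sigma_{aa}\varepsilon_{t+1} \tag{B.69}$$

$$\kappa_v\varepsilon_{t+1} = \sigma_a\varepsilon_{t+1} - \frac{\theta}{1-\theta\phi}\frac{1}{2}\left(\frac{1}{1-\rho}\right)^2\left(-\rho\kappa_d + \kappa_r\right)^2\sigma_{aa}\varepsilon_{t+1} \tag{B.70}$$

Using the formula from above that defines $(-\rho\kappa_d + \kappa_r)$, we have $\kappa_v = \frac{(-\rho\kappa_d + \kappa_r)}{1-\rho}$ and

$$\sigma_{aa} = \lambda\kappa_v = \lambda\,\frac{-1 + \sqrt{1 + 2\frac{\theta}{1-\theta\phi}\sigma_{aa}\sigma_a}}{\frac{\theta}{1-\theta\phi}\sigma_{aa}} \tag{B.71}$$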

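B.3.13 Affine bond pricing (result 5)

$$m_{t+1} = -\beta - \rho E\Delta c_{t+1} - \rho\frac{1-\alpha_t}{1-\rho}\kappa_d\varepsilon_{t+1} + \frac{\rho-\alpha_t}{1-\rho}\left[\frac{1}{2}\frac{\alpha_t - 1}{1-\rho}\left(-\rho\kappa_d + \kappa_r\right)^2 + \kappa_r\varepsilon_{t+1}\right] \tag{B.72}$$

$$\mathrm{var}(m_{t+1}) = \left[\frac{1-\alpha_t}{1-\rho}\left(-\rho\kappa_d + \kappa_r\right) - \kappa_r\right]^2\sigma^2 \tag{B.73}$$

$$= \left[\left(\frac{1-\alpha_t}{1-\rho}\right)^2\left(-\rho\kappa_d + \kappa_r\right)^2 + \kappa_r^2 - 2\frac{1-\alpha_t}{1-\rho}\left(-\rho\kappa_d + \kappa_r\right)\kappa_r\right]\sigma^2 \tag{B.74}$$

And hence

$$r_{f,t+1} = -\left[E_t m_{t+1} + \frac{1}{2}\mathrm{var}(m_{t+1})\right] \tag{B.75}$$

$$= -\left[-\beta - \rho\left(\eta_{d0} + \eta_{dk}\tilde{k}_t + \eta_{da}\alpha_t\right) + \frac{1}{2}\frac{(1-\alpha_t)}{(1-\rho)}\left(-\rho\kappa_d + \kappa_r\right)^2 + \frac{1}{2}\kappa_r^2 - \frac{1-\alpha_t}{1-\rho}\left(-\rho\kappa_d + \kappa_r\right)\kappa_r\right] \tag{B.76}$$

It is straightforward to show that

$$m_{t+1} = -r_{f,t+1} + \left[\frac{1-\alpha_t}{1-\rho}\left(-\rho\kappa_d + \kappa_r\right) - \kappa_r\right]\varepsilon_{t+1} - \frac{1}{2}\left[\frac{1-\alpha_t}{1-\rho}\left(-\rho\kappa_d + \kappa_r\right) - \kappa_r\right]^2\sigma^2 \tag{B.77}$$

$$= -r_{f,t+1} - \frac{1}{2}\left(\omega_0 + \omega_1\alpha_t\right)^2\sigma^2 + \left(\omega_0 + \omega_1\alpha_t\right)\varepsilon_{t+1} \tag{B.78}$$

So the SDF takes the essentially affine form with

$$\omega_0 = \frac{(-\rho\kappa_d + \kappa_r)}{1-\rho} - \kappa_r, \qquad \omega_1 = \frac{-(-\rho\kappa_d + \kappa_r)}{1-\rho}$$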
B.3.14

Accuracy of the approximation

Table B.1 reports simple statistics summarizing the relationship between the projection solution and the log-linear approximation to the model. The ﬁrst column lists the mean difference between the solutions, the second column the standard deviation of the gap, and the third column the standard deviation of the gap scaled by the standard deviation of the variable in the projection solution. I report deviations for log capital, log consumption growth, the coefﬁcient of relative risk aversion, and the Sharpe ratio of the wealth portfolio. For the simulations, both models start with the same initial levels of capital and risk aversion and use the same technology shocks. I then simulate the models for 20,000 periods.


Figure B.1: Comparison of results from simulations of projection and log-linear model solutions

Differences      Mean         Std. dev.    Scaled std. dev.
Capital          -8.72E-04    0.00073      0.020
Cons. Growth      3.16E-08    0.00013      0.028
RRA               1.11E-02    0.75344      0.116
Sharpe Ratio     -2.15E-03    0.00892      0.123

Note: Comparison of the projection and log-linear solutions. The two simulations use the same shocks but different policy functions. The first column is the mean difference between the simulations, the second column the standard deviation, and the third column the standard deviation of the difference scaled by the standard deviation of the variable in the projection solution. RRA is relative risk aversion.
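The three summary statistics described in the note can be computed from two simulated paths as in the following sketch. The function name is hypothetical, the direction of the difference (log-linear minus projection) is an assumption, and the two short series are made-up numbers used only to exercise the code.

```python
import numpy as np

def gap_statistics(projection, loglinear):
    """Mean gap, std. dev. of the gap, and the std. dev. scaled by the
    std. dev. of the variable in the projection solution."""
    projection = np.asarray(projection, dtype=float)
    loglinear = np.asarray(loglinear, dtype=float)
    gap = loglinear - projection  # assumed sign convention
    gap_std = gap.std(ddof=1)
    return gap.mean(), gap_std, gap_std / projection.std(ddof=1)

# Made-up series: the second path is the first plus a constant offset.
proj_path = np.arange(10, dtype=float)
loglin_path = proj_path + 0.1
mean_gap, std_gap, scaled_std_gap = gap_statistics(proj_path, loglin_path)
```

A constant offset produces a nonzero mean gap and a zero standard deviation, which separates the first column of the table from the other two.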


Table B.1 shows that for capital and consumption growth, the log-linear approximation is nearly identical to the projection solution. The mean differences are essentially zero, and the standard deviations of the errors are both less than 3 percent of the standard deviations of the variables themselves. For risk aversion and the Sharpe ratio, the log-linear approximation is essentially identical to the projection solution on average, but the standard deviation of the differences is now roughly 12 percent of the standard deviation of the variables themselves.

An alternative method of checking the accuracy of the approximation is to look at Euler equation errors. Figure A.1 plots histograms of the log10 Euler equation errors, $\log_{10}\left|E_t\left[M_{t+1}\left(\alpha K_{t+1}^{\alpha-1} + 1 - \delta\right)\right] - 1\right|$, under the projection solution and the log-linear solution at each date in the simulation.

B.4 Details of return forecasting

B.4.1 The method from Lettau and Ludvigson (2001)

If consumption and wealth are cointegrated, then we have the relationship

$$c_t = \zeta w_t + \xi_t \tag{B.79}$$

where $\zeta$ is a parameter, and $\xi_t$ is a mean-zero, stationary, and not necessarily i.i.d. error term. If we observed wealth, $\zeta$ and $\xi_t$ could be directly estimated. We do not observe wealth, though, especially the human component. Lettau and Ludvigson (2001) therefore use the approximation

$$w_t = \omega a_t + (1-\omega)\,hu_t \tag{B.80}$$

where $a_t$ is asset wealth and $hu_t$ human wealth. This equation simply says that log aggregate wealth is equal to a weighted sum of log asset and human wealth. Since the level of aggregate wealth is equal to the sum of the levels of asset and human wealth, the approximation is valid as long as the shares of asset and human wealth in aggregate wealth are stationary and not too variable. The fact that labor's share of income has been stationary in the post-war US data makes this assumption reasonable.

Figure B.2: Log10 Euler equation error densities

[Figure: estimated densities of the log10 Euler equation errors under the projection and log-linearization solutions.]

Note: Densities of Euler equation errors under the two solution methods. The log errors are defined as log10(|E[Mt+1Rk,t+1]-1|). Densities are estimated using a kernel smoother on simulated data. In both cases, the model used is the benchmark single-shock model with EZ-habit preferences and constant labor supply.

Finally, we assume that labor income, $y_t$, can be viewed as the dividend from human wealth and that the dividend/price ratio for human wealth is stationary. That is,

$$y_t = g + hu_t - \mu_t \tag{B.81}$$

where $g$ is a parameter and $\mu_t$ is a mean-zero stationary error term. This implies that

$$w_t = \omega a_t + (1-\omega)\,y_t + (1-\omega)\,g + \mu_t \tag{B.82}$$

$$c_t = \zeta\omega a_t + \zeta(1-\omega)\,y_t + \zeta(1-\omega)\,g + \zeta\mu_t + \xi_t \tag{B.83}$$

Since $\xi_t + \zeta\mu_t$ is mean-zero and stationary, regardless of any correlation between $\xi_t$ and $\mu_t$, the variables $c_t$, $a_t$, and $y_t$ are jointly cointegrated. The parameters $\zeta$, $\omega$, and $g$ can be estimated through standard methods for cointegrated models. As Lettau and Ludvigson point out, the estimation of these parameters is superconsistent, converging linearly with sample size, so these parameters can be taken as known with certainty in any subsequent analyses (in particular, stock return forecasts). I follow Lettau and Ludvigson in referring to the cointegrating residual, $\zeta\mu_t + \xi_t = c_t - \zeta\omega a_t - \zeta(1-\omega)\,y_t - \zeta(1-\omega)\,g$, as $cay_t$, and I refer to $\omega a_t + (1-\omega)\,y_t$ as $ay_t$. $ay_t$ is an estimate of total wealth derived from data on consumption, asset wealth, and labor income, taking advantage of an assumed cointegrating relationship between the three variables. I estimate the parameters using standard maximum likelihood methods.
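To make the mapping from the cointegrating regression (B.83) to $\zeta$, $\omega$, $cay_t$, and $ay_t$ concrete, here is a minimal sketch. It uses ordinary least squares purely for illustration, which is a swapped-in simplification and not the maximum likelihood procedure described in the text; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def estimate_cay(c, a, y):
    """Regress c on a, y, and a constant, then back out zeta, omega, cay, ay.

    The slope on a estimates zeta*omega and the slope on y estimates
    zeta*(1 - omega), so zeta is their sum.
    """
    X = np.column_stack([a, y, np.ones_like(a)])
    coef = np.linalg.lstsq(X, c, rcond=None)[0]
    b_a, b_y = coef[0], coef[1]
    zeta = b_a + b_y
    omega = b_a / zeta
    cay = c - X @ coef                      # cointegrating residual
    ay = omega * a + (1.0 - omega) * y      # wealth proxy
    return zeta, omega, cay, ay

# Synthetic data: two random walks and an exactly linear consumption series.
rng = np.random.default_rng(0)
a = np.cumsum(rng.normal(size=200))
y = np.cumsum(rng.normal(size=200))
c = 0.3 * a + 0.6 * y + 0.1
zeta, omega, cay, ay = estimate_cay(c, a, y)
```

With an exact linear relationship the residual is numerically zero and the implied parameters are $\zeta = 0.9$ and $\omega = 1/3$; with real data the residual is the forecasting variable.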

B.4.2 Sensitivity analysis for return forecasting

The results in section 2.4.2 depend on choices for two parameters: the EIS and the persistence of risk aversion. Tables B.2 and B.3 report the ratio of the $R^2$ for excess value to cay for 1, 5, 10, and 20-quarter returns across a variety of choices for the EIS and the persistence of risk aversion.

Table B.2 varies the EIS between 0.75 and 10. The numbers in bold represent points where cay outperforms $\hat{\alpha}$. When the EIS is greater than 1, cay only ever outperforms at the 1-quarter horizon, and then only if the EIS is set to 10. With an EIS less than 1, though, cay always has an $R^2$ substantially larger than that of $\hat{\alpha}$. Moreover, the sign on $\hat{\alpha}$ in the return regressions flips. Intuitively, this is because in the construction of $\hat{v}$, when the EIS is less than 1, the weight on aggregate wealth is negative. The theory would predict that high risk aversion is associated with low returns, but with the EIS less than 1, $\hat{\alpha}$ and future returns are actually positively correlated.

Table B.3 presents $R^2$ ratios for the same set of regressions, but now varying the persistence of risk aversion. Across a fairly wide range of autocorrelations, $\hat{\alpha}$ outperforms cay at most horizons. The best performance is found with an annual autocorrelation of 0.9, which corresponds to $\phi = 0.974$. Even with an autocorrelation as low as 0.65 ($\phi = 0.9$), though, $\hat{\alpha}$ performs nearly as well as cay. As with the EIS, the place where cay is most likely to outperform is with 1-quarter returns. Table B.4 lists $R^2$s for cay, PE, and $\hat{\alpha}$ for pre and post-1980 samples.

B.4.3 Out-of-sample forecasting regressions

An alternative to the in-sample regressions studied in the main text is out-of-sample tests of forecasting power. I consider the mean squared forecast error (MSFE) based tests analyzed in Clark and McCracken (2001, 2005) and Clark and West (2007). Suppose we want to test whether a single variable, $x_t$, forecasts stock returns, $r_t$, against the null that $r_t$ is i.i.d. (the methods used here apply to any null model that is nested; i.e. they are appropriate for asking whether $x_t$ has marginal forecasting power when added to some other model). The forecast horizon can be any length. Therefore, denote $r_{t,t+j} \equiv \sum_{\tau=t}^{t+j} r_\tau$.

We compare the residuals from the null model, $e_{1t,t+j} \equiv r_{t,t+j} - \hat{\beta}_{0,t}$ (for an estimated constant mean $\hat{\beta}_{0,t}$ using data prior to date $t$) to the residuals from the alternative model, $e_{2t,t+j} \equiv r_{t,t+j} - \hat{\beta}_{0,t} - \hat{\beta}_{1,t}x_{t+j-1}$ (where $\hat{\beta}_{1,t}$ is a constant regression coefficient estimated on the data from $\tau = 0$ to $\tau = t-1$). The samples for the regressions are begun after the first 20 percent of the sample. The measure of the difference in MSFE is

$$f_{t,t+j} \equiv e_{1t,t+j}^2 - e_{2t,t+j}^2 + \left(e_{1t,t+j} - e_{2t,t+j}\right)^2 \tag{B.84}$$
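The construction of $f_{t,t+j}$ in (B.84) can be sketched for the simplest case of a one-period horizon. This is an illustration under stated assumptions, not the code used for the results: the function name is hypothetical, the alternative model re-estimates its own intercept by least squares, and the example data are synthetic.

```python
import numpy as np

def clark_west_terms(r, x, start_frac=0.2):
    """Recursive one-step-ahead version of the adjusted MSFE difference (B.84).

    At each date t after the first start_frac of the sample, both models are
    estimated on data through t-1 and used to forecast r[t] from x[t-1].
    """
    r = np.asarray(r, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(r)
    start = max(int(start_frac * n), 3)
    f = []
    for t in range(start, n):
        y_hist = r[1:t]          # returns observed before t
        x_hist = x[0:t - 1]      # lagged predictor paired with y_hist
        design = np.column_stack([np.ones(t - 1), x_hist])
        beta = np.linalg.lstsq(design, y_hist, rcond=None)[0]
        e1 = r[t] - y_hist.mean()                    # null: constant mean
        e2 = r[t] - (beta[0] + beta[1] * x[t - 1])   # alternative: adds x
        f.append(e1 ** 2 - e2 ** 2 + (e1 - e2) ** 2)
    return np.array(f)

# Synthetic example in which the lagged predictor determines returns exactly.
rng = np.random.default_rng(1)
x_demo = rng.normal(size=100)
r_demo = np.concatenate([[0.0], 2.0 * x_demo[:-1]])
f_demo = clark_west_terms(r_demo, x_demo)
```

When the predictor is informative the terms are positive on average; under the null the adjustment term offsets the noise introduced by estimating the extra coefficient.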


Figure B.3: Relative $R^2$s for varying EIS

Span           EIS=0.1   0.25   0.75   1.25   1.5    2      10
1 quarter      0.39      0.26   0.29   1.07   1.10   1.08   0.96
5 quarters     0.44      0.28   0.29   1.16   1.21   1.21   1.09
10 quarters    0.61      0.42   0.31   1.41   1.49   1.49   1.37
20 quarters    1.28      1.02   0.21   1.92   2.08   2.15   2.09

Note: This table lists the ratio of the $R^2$ for a univariate regression of long-horizon returns on estimated risk aversion to the $R^2$ for cay. Values less than 1 are in bold. The span in quarters is listed in the left hand column. The top row gives the EIS. The EIS is used to calculate household value and risk aversion.


Figure B.4: Relative $R^2$s for varying persistence of risk aversion

Span           Autocorr.=0.95   0.9    0.85   0.8    0.75   0.7
1 quarter      0.86             1.27   1.10   0.90   0.75   0.64
5 quarters     1.03             1.44   1.21   0.97   0.78   0.64
10 quarters    1.17             1.73   1.49   1.20   0.99   0.82
20 quarters    1.24             2.24   2.08   1.76   1.48   1.27

Note: This table lists the ratio of the $R^2$ for a univariate regression of long-horizon returns on estimated risk aversion to the $R^2$ for cay. Values less than 1 are in bold. The span in quarters is listed in the left hand column. The top row gives the annual autocorrelation of risk aversion.


Figure B.5: $R^2$s from pre and post-1980 univariate return forecasting regressions

pre-1980     Estim. RRA   cay    P/D
1q           0.10         0.10   0.03
5q           0.28         0.22   0.25
10q          0.27         0.16   0.27
20q          0.38         0.13   0.39

post-1980    Estim. RRA   cay    P/D
1q           0.03         0.03   0.03
5q           0.18         0.15   0.08
10q          0.48         0.36   0.19
20q          0.56         0.33   0.29

Note: $R^2$s from univariate regressions of long-horizon stock returns on estimated risk aversion, cay, and the price/dividend ratio. The highest value for each horizon and sample is listed in bold.


Under the null, the MSFE for the $e_1$ model tends to be smaller than the MSFE for the $e_2$ model because the $e_2$ model has added noise due to the extraneous predictor. Intuitively, model $e_1$ correctly imposes the constraint that $\beta_1 = 0$ under the null. The term $\left(e_{1t,t+j} - e_{2t,t+j}\right)^2$ is essentially a correction for this effect.

When the forecast horizon is more than a single observation, $f_{t,t+j}$ is serially correlated. To correct for this, we divide by a consistent estimate of its long-run variance (spectral density at frequency zero). Following Clark and West (2007), I use the Newey–West measure with a lag window of $1.5 \times j$. Denote this measure of the long-run variance as $S_{f_{t,t+j}}$. The long-run variance corrects for the fact that the forecast errors from overlapping samples will be serially correlated. Clark and McCracken tabulate the critical values of the statistic

$$(T-j)\sum_{t=1}^{T-j} f_{t,t+j} \big/ S_{f_{t,t+j}}.$$

In the main text, $\hat{\alpha}$ is calculated using full-sample information. In particular, we need to calculate the cointegrating relationship between consumption, labor income, and financial wealth. We also need to know the average growth rate of value. For the out-of-sample forecasts, all of those parameters are estimated using only backward-looking information. The only possible source of look-ahead bias here would be data revisions. The top panel of figure B.2 plots the values of the statistics using $\hat{\alpha}_t$ as the predictor against a null of a constant expected equity return for horizons from 1 to 20 quarters. We can easily reject the null at the 5 percent level at all horizons and at the 1% level for 2–13 quarter horizons.
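The Newey–West long-run variance with a lag window tied to the horizon, as described above, can be sketched as follows. The function names are hypothetical; rounding $1.5 \times j$ up to an integer and forming a t-type statistic from the mean of $f$ are assumptions made for illustration, and the exact normalization used for the tabulated critical values may differ.

```python
import numpy as np

def newey_west_lrv(f, lags):
    """Bartlett-kernel (Newey-West) estimate of the long-run variance of f."""
    d = np.asarray(f, dtype=float)
    d = d - d.mean()
    n = len(d)
    lrv = d @ d / n
    for k in range(1, lags + 1):
        weight = 1.0 - k / (lags + 1.0)
        lrv += 2.0 * weight * (d[k:] @ d[:-k]) / n
    return lrv

def msfe_t_statistic(f, horizon):
    """t-type statistic: sqrt(n) * mean(f) / sqrt(long-run variance).

    The lag window is ceil(1.5 * horizon), an assumed rounding rule.
    """
    f = np.asarray(f, dtype=float)
    lags = int(np.ceil(1.5 * horizon))
    return np.sqrt(len(f)) * f.mean() / np.sqrt(newey_west_lrv(f, lags))

# Small hand-checkable series.
lrv_no_lags = newey_west_lrv([1.0, 2.0, 3.0, 4.0], lags=0)
lrv_one_lag = newey_west_lrv([1.0, 2.0, 3.0, 4.0], lags=1)
```

With zero lags the estimate reduces to the ordinary (population) variance; each additional lag adds a down-weighted autocovariance, which is what accounts for the overlap in multi-period forecast errors.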

B.4.3.1

Bootstrapping

Figure B.6: Out-of-sample test statistics

[Figure: two panels plotting test statistics against the forecast horizon (quarters, 1 to 20). Top panel: the test statistic together with the asymptotic and bootstrapped 1% and 5% critical values. Bottom panel: test statistics for estimated risk aversion above cay and for cay above estimated risk aversion, together with the asymptotic 1% and 5% critical values.]

Note: Out-of-sample test statistics from Clark and McCracken (2001, 2005) based on the reduction in out-of-sample RMSE. Estimated risk aversion depends on the cointegrating model used to estimate cay. The top panel tests whether estimated risk aversion has marginal forecasting power against a null of a constant-mean model for returns. The cointegrating vector is reestimated in each period using only backward-looking information. The bottom panel tests adding estimated risk aversion to a null model including a constant and cay and vice versa.

A major concern with predictive regressions is that asymptotic distribution theory is often a poor guide to small-sample behavior. A simple way to deal with that concern is to use a bootstrap to construct confidence intervals for the test statistics. I construct bootstrap samples in the following way. I select bootstrap samples of stock returns and growth rates of consumption, asset wealth, and labor income. I then construct level series for consumption, wealth, and income, and calculate $\hat{\alpha}$ using purely backward-looking information as above. Finally, I construct the test statistic from above for each bootstrapped sample at each horizon from 1 to 20 quarters. I bootstrap 10,000 samples of data. The top panel of figure B.2 plots the 95th and 99th percentiles of the bootstrapped test statistics, and the out-of-sample forecasting power is still significant at the 5 percent level.
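The bootstrap steps described above (resample growth rates, rebuild levels, recompute the statistic, take percentiles) can be sketched as below. The text does not say whether observations are resampled individually or in blocks; this sketch assumes i.i.d. resampling of whole rows, and the function name, the `statistic` callback, and the example data are all hypothetical.

```python
import numpy as np

def bootstrap_critical_values(growth, statistic, n_boot=1000, seed=0):
    """Return the 95th and 99th percentiles of a statistic across bootstrap samples.

    growth: (T, k) array of returns and growth rates, one row per date.
    statistic: callable taking (resampled growth rates, cumulated levels).
    """
    rng = np.random.default_rng(seed)
    growth = np.asarray(growth, dtype=float)
    n_obs = growth.shape[0]
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n_obs, size=n_obs)   # assumed i.i.d. row resampling
        sample = growth[idx]
        levels = np.cumsum(sample, axis=0)         # rebuild level series
        stats[b] = statistic(sample, levels)
    return np.percentile(stats, [95, 99])

# Placeholder statistic (mean of the first column) on synthetic growth rates.
demo_growth = np.random.default_rng(2).normal(size=(120, 4))
cv95, cv99 = bootstrap_critical_values(
    demo_growth, lambda sample, levels: sample[:, 0].mean(), n_boot=500
)
```

In an application the callback would recompute estimated risk aversion from the rebuilt levels and return the out-of-sample test statistic at a given horizon.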

B.4.3.2 $\hat{\alpha}$ versus cay

We can also use the out-of-sample test to ask whether estimated risk aversion forecasts stock returns better than cay. The null model is now one where stock returns depend on a constant and the lagged value of cay, and the encompassing alternative adds the lagged value of $\hat{\alpha}$. The bottom panel of figure A2 plots the test statistics. At every horizon, we can reject the null that $\hat{\alpha}$ does not improve the forecast using cay at the 5 percent level, and we can reject the null at the 1 percent level at every horizon longer than 1 quarter. Figure B.2 also plots the statistic for a test of whether cay has any marginal predictive power above that of $\hat{\alpha}$. At horizons shorter than 8 quarters, we cannot reject the null that it does not. At longer horizons, though, there is evidence that both variables contain important information for forecasting stock returns.


C. APPENDIX TO CHAPTER 3

C.1 Results from the text
C.1.1 Price of a utility claim and the SDF

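The utility claim pays $U_t U_{C,t}^{-1}$ as its dividend. We confirm that its cum-dividend price is $W_{U,t} = V_t^{1-\rho} B_t^{-1} U_{C,t}^{-1}/(1-\beta)$ by simply inserting this guess into the Euler equation,

$$W_{U,t} = U_t U_{C,t}^{-1} + E_t\left[W_{U,t+1}M_{t+1}\right] \tag{C.1}$$

$$\frac{V_t^{1-\rho}B_t^{-1}U_{C,t}^{-1}}{1-\beta} = U_t U_{C,t}^{-1} + E_t\left[\frac{V_{t+1}^{1-\rho}B_{t+1}^{-1}U_{C,t+1}^{-1}}{1-\beta}\,\beta\,\frac{B_{t+1}U_{C,t+1}}{B_t U_{C,t}}\,\frac{V_{t+1}^{\rho-\alpha_t}}{\left(E_t V_{t+1}^{1-\alpha_t}\right)^{\frac{\rho-\alpha_t}{1-\alpha_t}}}\right] \tag{C.2}$$

$$V_t^{1-\rho} = (1-\beta)B_t U_t + \beta E_t\left[V_{t+1}^{1-\rho}\,\frac{V_{t+1}^{\rho-\alpha_t}}{\left(E_t V_{t+1}^{1-\alpha_t}\right)^{\frac{\rho-\alpha_t}{1-\alpha_t}}}\right] \tag{C.3}$$

$$V_t^{1-\rho} = (1-\beta)B_t U_t + \beta\left(E_t V_{t+1}^{1-\alpha_t}\right)^{\frac{1-\rho}{1-\alpha_t}} \tag{C.4}$$

The last line shows that the guess for the price of the utility claim was in fact correct, since the Euler equation is guaranteed to hold.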


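Next, we consider the return on the utility claim. We have

$$R_{U,t+1} = \frac{V_{t+1}^{1-\rho}B_{t+1}^{-1}U_{C,t+1}^{-1}/(1-\beta)}{V_t^{1-\rho}B_t^{-1}U_{C,t}^{-1}/(1-\beta) - U_t U_{C,t}^{-1}} \tag{C.5}$$

$$= \frac{V_{t+1}^{1-\rho}B_{t+1}^{-1}U_{C,t+1}^{-1}}{B_t^{-1}U_{C,t}^{-1}\left(V_t^{1-\rho} - (1-\beta)B_t U_t\right)} \tag{C.6}$$

$$= \frac{V_{t+1}^{1-\rho}B_{t+1}^{-1}U_{C,t+1}^{-1}}{\beta\left(E_t V_{t+1}^{1-\alpha_t}\right)^{\frac{1-\rho}{1-\alpha_t}}B_t^{-1}U_{C,t}^{-1}} \tag{C.7}$$

$$\frac{V_{t+1}^{\rho-\alpha_t}}{\left(E_t V_{t+1}^{1-\alpha_t}\right)^{\frac{\rho-\alpha_t}{1-\alpha_t}}} = R_{U,t+1}^{\frac{\rho-\alpha_t}{1-\rho}}\,\beta^{\frac{\rho-\alpha_t}{1-\rho}}\left(\frac{U_{C,t+1}}{U_{C,t}}\right)^{\frac{\rho-\alpha_t}{1-\rho}}\left(\frac{B_{t+1}}{B_t}\right)^{\frac{\rho-\alpha_t}{1-\rho}} \tag{C.8}$$

Now substitute the return into the SDF:

$$M_{t+1} = \beta\,\frac{B_{t+1}U_{C,t+1}}{B_t U_{C,t}}\,R_{U,t+1}^{\frac{\rho-\alpha_t}{1-\rho}}\,\beta^{\frac{\rho-\alpha_t}{1-\rho}}\left(\frac{U_{C,t+1}}{U_{C,t}}\right)^{\frac{\rho-\alpha_t}{1-\rho}}\left(\frac{B_{t+1}}{B_t}\right)^{\frac{\rho-\alpha_t}{1-\rho}} \tag{C.9}$$

$$= \beta^{\frac{1-\alpha_t}{1-\rho}}\left(\frac{U_{C,t+1}B_{t+1}}{U_{C,t}B_t}\right)^{\frac{1-\alpha_t}{1-\rho}}R_{U,t+1}^{\frac{\rho-\alpha_t}{1-\rho}} \tag{C.10}$$

This is the formula from the text.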

C.1.2 First-order condition for wage setting

0=

∗ Et

∗ = Et

k =0 ∞

∑

∞

k ξw

1 + θw ψt+k θw

Wt+k ( j) Wt+k

θ − 1+ww θ

Nt+k − λt+k Nt+k ( j) Wt+ j ( j)

(C.11)

k =0

k ∑ ξw

dVt dVt Wt+k ( j) + dNt+k ( j) dCt+k Pt

(1 + θ w ) − θ w

dVt Wt+k ( j) Nt+k ( j) dCt+k Pt (C.12)

∗ = Et ∗ = Et

k =0 ∞ k =0

k ∑ ξw k ∑ ξw

∞

dVt dVt Wt+k ( j) Nt+k ( j) (1 + θ w ) + dNt+k ( j) dCt+k Pt dVt /dCt+k dVt /dCt dVt /dNt+k ( j) W ( j) Nt+k ( j) (1 + θ w ) + t + k dVt /dCt+k Pt

(C.13) (C.14)


C.2 Approximation method

This section proceeds as follows. First, I fix notation for a general set of equilibrium conditions. Next, I describe the specifics of how to solve for the model's equilibrium dynamics. Third, I show that the essentially affine approximation method delivers bond-pricing formulas in the essentially affine class of Duffee (2002).

C.2.1 Equilibrium conditions

Denote the vector of the variables in the model (including the exogenous processes) as $X_t$. In addition to the various variables described above that track the state of the economy and the shocks, $X_t$ will include the price/dividend ratio of the utility portfolio, the return on the utility portfolio, and the marginal utility of consumption. The vector of fundamental shocks in the model is denoted $\varepsilon_t \equiv \left[\varepsilon_{mp,t}, \varepsilon_{z,t}, \varepsilon_{b,t}, \varepsilon_{\mu,t}, \varepsilon_{g,t}, \varepsilon_{p,t}, \varepsilon_{w,t}, \varepsilon_{\alpha,t}, \varepsilon_{\pi^*,t}\right]$. The equations determining the equilibrium of the model take the form

$$0 = G\left(X_t, X_{t+1}, \sigma\varepsilon_{t+1}\right) \tag{C.15}$$

where the expectation operator may appear in the function $G$. There is one equation for each variable in the model. The new parameter $\sigma$ is the usual parameter used in perturbation approximations that controls the variance of the shocks. In the true model, $\sigma = 1$, but we will consider an approximation around the point $\sigma = 0$. The equations $G$ can be divided into two types: those that do not involve taking expectations over the SDF and those that do.

$$G\left(X_t, X_{t+1}, \varepsilon_{t+1}\right) = \begin{bmatrix} D\left(X_t, X_{t+1}, \sigma\varepsilon_{t+1}\right) \\ E_t\left[M\left(X_t, X_{t+1}, \sigma\varepsilon_{t+1}\right)F\left(X_t, X_{t+1}, \sigma\varepsilon_{t+1}\right)\right] \end{bmatrix} \tag{C.16}$$

where $D$ and $F$ are vector-valued functions and $M$ is the (scalar-valued) stochastic discount factor. Note that this formulation does not actually restrict $F$. Specifically, suppose there were a set of equilibrium conditions $1 = E_t h\left(x_t, x_{t+1}, \sigma\varepsilon_{t+1}\right)$, i.e. conditions that do not involve the SDF. We could simply say that $F\left(x_t, x_{t+1}, \sigma\varepsilon_{t+1}\right) \equiv h\left(x_t, x_{t+1}, \sigma\varepsilon_{t+1}\right)/M\left(x_t, x_{t+1}, \sigma\varepsilon_{t+1}\right)$.


Note, though, that in this model, all expectational equations involve the SDF. For the equations that do not involve the SDF, I use standard perturbation methods and simply take a log-linear approximation. We approximate $D$ as

$$0 = \log\left(D\left(\exp(x_t), \exp(x_{t+1}), \sigma\varepsilon_{t+1}\right) + 1\right) \tag{C.17}$$

$$0 \approx d_0 + d_x\hat{x}_t + d_{x'}\hat{x}_{t+1} + d_\varepsilon\sigma\varepsilon_{t+1} \tag{C.18}$$

where the terms $d_0$, $d_x$, $d_{x'}$, and $d_\varepsilon$ are coefficients from a Taylor-series approximation and

$$x_t \equiv \log X_t, \qquad \hat{x}_t \equiv \log X_t - \log\bar{X}$$

C.2.2 Linearizing the Euler equations

I now show that if we log-linearize the function $F$, we can transform (3.40) into a linear condition that can be solved alongside the remaining equations. $M_{t+1}$ will not even be log-linear in the state variables, but we will be able to state the equilibrium conditions as a set of linear expectational difference equations. First, guess that the approximated equilibrium dynamics of the model take the form

$$\hat{x}_{t+1} = C + \Phi\hat{x}_t + \Psi\sigma\varepsilon_{t+1} \tag{C.19}$$

where $\hat{x}_t \equiv \log(X_t) - \log(\bar{X})$. We confirm in the end that the solution to the approximated equilibrium conditions actually takes this form. Next, define the matrices $\Gamma_{UC}$, $\Gamma_b$, and $\Gamma_r$ as matrices that select individual elements of $x_t$ such that

$$\hat{u}_{C,t} = \Gamma_{UC}\hat{x}_t, \qquad \hat{b}_t = \Gamma_b\hat{x}_t, \qquad \hat{r}_{U,t} = \Gamma_r\hat{x}_t \tag{C.20}$$

where lower-case letters with circumflexes denote log deviations from non-stochastic steady-state values. That is, $\hat{u}_{C,t} \equiv \log U_{C,t} - \log\bar{U}_C$, etc. For the sake of arithmetical convenience, also define an auxiliary variable $\zeta_t \equiv \frac{1-\alpha_t}{1-\rho}$.


C.2.2.1

The essentially afﬁne SDF

U=

η ¯ 1− η Ct Ct−1

1− ρ

1−ρ

+ Zt

1− ρ

ϕ1

α (1− ρ ) ¯ ( H − Nt ) H 1−ρ

UC = ηct

η (1−ρ)−1 (1−η )(1−ρ) η (1−ρ)−1 (1−η )(1−ρ) Zt Zt−1 c t −1

= ηct
The SDF is

η (1−ρ)−1 (1−η )(1−ρ) −(1−η )(1−ρ) (1−η )(1−ρ) −ρ Zt Zt Zt−1 c t −1

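$$M_{t+1} = \beta^{\frac{1-\alpha_t}{1-\rho}}\left(\frac{U_{C,t+1}}{U_{C,t}}\,\frac{B_t\left(1-\beta B_{t+1}\right)}{1-\beta B_t}\right)^{\frac{1-\alpha_t}{1-\rho}}R_{U,t+1}^{\frac{\rho-\alpha_t}{1-\rho}} \tag{C.21}$$

$M_{t+1}$ is completely log-linear in the endogenous variables $U_{C,t}$ and $R_{U,t+1}$, but it is not log-linear in $B_t$ and $B_{t+1}$. If not for these terms, we could use the exact formula for $M_{t+1}$ in what follows. Since those terms are non-linear, I use the approximations,

$$\log\frac{1-\beta\exp(b_{t+1})}{\exp(-b_t)-\beta} \approx \frac{1}{1-\beta}b_t - \frac{\beta}{1-\beta}b_{t+1} \tag{C.22}$$

$$m^{(1)}_{t+1} \equiv \bar{m} + \zeta_t\left(\Delta\hat{u}_{C,t+1} + \frac{1}{1-\beta}b_t - \frac{\beta}{1-\beta}b_{t+1}\right) + \left(\zeta_t - 1\right)\hat{r}_{U,t+1} \tag{C.23}$$

$m^{(1)}_{t+1}$ is first-order accurate for $m_{t+1}$, hence the superscript. In the continuous time limit, this formula becomes exact. The Euler equation for the return on the utility portfolio is

$$1 = E_t\exp\left(\zeta_t\left(\Delta\hat{u}_{C,t+1} + \frac{1}{1-\beta}b_t - \frac{\beta}{1-\beta}b_{t+1}\right) + \zeta_t\hat{r}_{U,t+1}\right) \tag{C.24}$$

Taking logs of both sides and taking advantage of log-normality gives

$$0 = E_t\hat{u}_{C,t+1} - \hat{u}_{C,t} + \frac{1}{1-\beta}b_t - \frac{\beta}{1-\beta}E_t b_{t+1} + E_t\hat{r}_{U,t+1} + \frac{1}{2}\zeta_t\sigma^2\left(\Gamma_{UC} - \frac{\beta}{1-\beta}\Gamma_b + \Gamma_r\right)\Psi\Psi'\left(\Gamma_{UC} - \frac{\beta}{1-\beta}\Gamma_b + \Gamma_r\right)' \tag{C.25}$$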
179

Moreover, this implies that the approximated SDF can be written as mt+1 = ζ t ΓUC −
(1)

β Γ + Γr 1−β b

¯ Ψσε t+1 − Γr (Φxt + Ψσε t+1 ) − r ΓUC − β Γ + Γr 1−β b (C.26)

β 1 2 − ζ t σ2 ΓUC − Γ + Γr ΨΨ 2 1−β b

which is the essentially afﬁne form from the text. The reason that this form is useful is that any time the SDF is essentially afﬁne, we can obtain an exact expression for Et exp (mt+1 + f 0 + f x xt + f x xt+1 ). It also means that we can price any asset whose payoffs are linear in the endogenous variables, including real and nominal bonds.

C.2.2.2

Approximation to F

Next, we take a ﬁrst-order Taylor approximation to log F such that log F ( xt , xt+1 , ε t+1 ) ≈ f 0 + f x xt + f x xt+1 , giving
(1)

ˆ 1 = Et exp mt+1 + f x xt + f x xt+1 Taking logs and evaluating the expectation yields 1 ˆ 0 = − Et rU,t+1 + f x xt + f x Et xt+1 + σ2 (−Γr + f x ) ΨΨ (−Γr + f x ) 2 β 2 + ζ t σ ΓUC − Γ + Γr ΨΨ (−Γr + f x ) 1−β b

(C.27)

(C.28)

(C.28) is the equation that we ultimately place into the system to be solved. It is completely linear in both xt and ζ t (equivalently, αt ). C.2.2.3 Solution

Since every equation in the system is now linear in the variables of the model, we can solve the system for the parameters $\Phi$ and $\Psi$ from (C.19). Specifically, we solve the following system,

$$0 = -E_t\hat{r}_{U,t+1} + f_x x_t + f_{x'}E_t x_{t+1} + \frac{1}{2}\sigma^2\left(-\Gamma_r + f_{x'}\right)\Psi\Psi'\left(-\Gamma_r + f_{x'}\right)' + \zeta_t\sigma^2\left(\Gamma_{UC} - \frac{\beta}{1-\beta}\Gamma_b + \Gamma_r\right)\Psi\Psi'\left(-\Gamma_r + f_{x'}\right)' \tag{C.29}$$

$$0 = d_0 + d_x x_t + d_{x'}x_{t+1} + d_\varepsilon\sigma\varepsilon_{t+1} \tag{C.30}$$

where $\sigma = 1$ in the stochastic equilibrium that we approximate. This system can be solved using, for example, Sims' (2001) Gensys algorithm. The last wrinkle here is that we cannot simply insert (C.28) into the set of equations to be solved since it involves the matrix $\Psi$, which is one of the unknown structures we are solving for. I deal with this with a simple fixed-point iteration: I begin with the equations that we obtain from perturbation,

$$0 = -\Gamma_r E_t x_{t+1} + f_x x_t + f_{x'}E_t x_{t+1}$$

$$0 = d_0 + d_x\hat{x}_t + d_{x'}\hat{x}_{t+1} + d_\varepsilon\sigma\varepsilon_{t+1}$$

which will deliver an initial value of $\Psi$, denoted $\Psi^{(1)}$. We then use $\Psi^{(1)}$ to change the equilibrium condition to take the form

$$0 = -\Gamma_r\Phi x_t - \bar{r} + f_x x_t + f_{x'}\Phi x_t + \frac{1}{2}\sigma^2\left(-\Gamma_r + f_{x'}\right)\Psi^{(1)}\Psi^{(1)\prime}\left(-\Gamma_r + f_{x'}\right)' + \zeta_t\sigma^2\left(\Gamma_{UC} - \frac{\beta}{1-\beta}\Gamma_b + \Gamma_r\right)\Psi^{(1)}\Psi^{(1)\prime}\left(-\Gamma_r + f_{x'}\right)' \tag{C.31}$$

$$0 = d_0 + d_x\hat{x}_t + d_{x'}\hat{x}_{t+1} + d_\varepsilon\sigma\varepsilon_{t+1} \tag{C.32}$$

which delivers a value $\Psi^{(2)}$. Then simply iterate to convergence. I treat parameter sets for which the iteration diverges as inadmissible, setting the marginal likelihood to zero.
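The fixed-point iteration on $\Psi$ can be sketched generically. The callback `solve_given_psi` below is a hypothetical stand-in for solving the linear system (C.31)-(C.32) for a given $\Psi^{(n)}$ (for example with a linear rational expectations solver); the tolerance, iteration cap, and toy maps are illustrative assumptions.

```python
import numpy as np

def iterate_psi(solve_given_psi, psi0, tol=1e-10, max_iter=500):
    """Iterate psi -> solve_given_psi(psi) until successive values agree.

    Returns (psi, converged). A False flag corresponds to treating the
    parameter set as inadmissible.
    """
    psi = np.asarray(psi0, dtype=float)
    for _ in range(max_iter):
        psi_new = np.asarray(solve_given_psi(psi), dtype=float)
        if not np.all(np.isfinite(psi_new)):
            return psi_new, False
        if np.max(np.abs(psi_new - psi)) < tol:
            return psi_new, True
        psi = psi_new
    return psi, False

# Toy stand-ins for the model solve: a contraction and an explosive map.
psi_fixed, converged = iterate_psi(lambda p: 0.5 * p + 1.0, np.zeros((2, 2)))
_, diverged_flag = iterate_psi(lambda p: 2.0 * p + 1.0, np.zeros((2, 2)), max_iter=50)
```

The contraction converges to its fixed point (every entry equal to 2), while the explosive map exhausts the iteration cap and is flagged as not converged.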


C.2.3

Bond pricing

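To solve for bond prices, we guess that bond prices are log-linear in the vector of state variables, so that

$$p_{n,t} = A_n + B_n x_t \tag{C.33}$$

where $p_{n,t}$ is the price of a zero-coupon bond that matures on date $t+n$ and pays 1 unit of consumption. We can also write the price of a nominal bond as $p^{\$}_{n,t}$. Using the formula for the SDF from above, we have

$$\exp\left(p^{\$}_{n,t}\right) = E_t\exp\left(\begin{array}{l}\zeta_t\left(\Gamma_{UC} - \frac{\beta}{1-\beta}\Gamma_b + \Gamma_r\right)\Psi\varepsilon_{t+1} - \Gamma_r\left(\Phi x_t + \Psi\varepsilon_{t+1}\right) - \bar{r} \\ +\, A_{n-1} + \left(B_{n-1} - \Gamma_\pi\right)\left(\Phi x_t + \Psi\varepsilon_{t+1}\right) \\ -\, \frac{1}{2}\zeta_t^2\sigma^2\left(\Gamma_{UC} - \frac{\beta}{1-\beta}\Gamma_b + \Gamma_r\right)\Psi\Psi'\left(\Gamma_{UC} - \frac{\beta}{1-\beta}\Gamma_b + \Gamma_r\right)'\end{array}\right) \tag{C.34}$$

$$p^{\$}_{n,t} = -\bar{r} - \Gamma_r\Phi x_t + A_{n-1} + \left(B_{n-1} - \Gamma_\pi\right)\Phi x_t + \frac{1}{2}\sigma^2\left(B_{n-1} - \Gamma_\pi - \Gamma_r\right)\Psi\Psi'\left(B_{n-1} - \Gamma_\pi - \Gamma_r\right)' \tag{C.35}$$

$$\quad +\, \sigma^2\left(\Gamma_{UC} - \frac{\beta}{1-\beta}\Gamma_b + \Gamma_r\right)\Psi\Psi'\left(B_{n-1} - \Gamma_\pi - \Gamma_r\right)'\Gamma_\zeta x_t \tag{C.36}$$

Matching coefficients gives

$$A_n = -\bar{r} + A_{n-1} + \frac{1}{2}\sigma^2\left(B_{n-1} - \Gamma_\pi - \Gamma_r\right)\Psi\Psi'\left(B_{n-1} - \Gamma_\pi - \Gamma_r\right)' \tag{C.37}$$

$$B_n = -\Gamma_r\Phi + \left(B_{n-1} - \Gamma_\pi\right)\Phi + \left(\Gamma_{UC} - \frac{\beta}{1-\beta}\Gamma_b + \Gamma_r\right)\Psi\Psi'\left(B_{n-1} - \Gamma_\pi - \Gamma_r\right)'\Gamma_\zeta \tag{C.38}$$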
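The recursions for $A_n$ and $B_n$ in (C.37)-(C.38) are straightforward to iterate. The sketch below is an illustration only: the function and argument names are hypothetical, `G_sdf` stands for the row vector $\Gamma_{UC} - \frac{\beta}{1-\beta}\Gamma_b + \Gamma_r$, the starting values $A_0 = 0$, $B_0 = 0$ are an assumption (a bond maturing today has log price zero), and $\sigma^2$ is kept in both recursions, which coincides with (C.38) when $\sigma = 1$.

```python
import numpy as np

def bond_loadings(n_max, r_bar, Phi, Psi, G_r, G_pi, G_sdf, G_zeta, sigma=1.0):
    """Iterate the affine bond-pricing recursions for maturities 1..n_max.

    G_r, G_pi, G_sdf, G_zeta are 1-D selector/loading vectors of length k.
    """
    k = Phi.shape[0]
    A = np.zeros(n_max + 1)
    B = np.zeros((n_max + 1, k))
    Omega = Psi @ Psi.T
    for n in range(1, n_max + 1):
        h = B[n - 1] - G_pi - G_r
        A[n] = -r_bar + A[n - 1] + 0.5 * sigma ** 2 * (h @ Omega @ h)
        B[n] = (-G_r @ Phi + (B[n - 1] - G_pi) @ Phi
                + sigma ** 2 * (G_sdf @ Omega @ h) * G_zeta)
    return A, B

# One-state toy example with no shocks, so the risk terms vanish.
A_demo, B_demo = bond_loadings(
    2, r_bar=0.01, Phi=np.array([[0.5]]), Psi=np.zeros((1, 1)),
    G_r=np.array([1.0]), G_pi=np.zeros(1),
    G_sdf=np.array([1.0]), G_zeta=np.array([1.0]),
)
```

With no shocks the constant simply accumulates $-\bar{r}$ each period and the loading follows the expected path of the return, which gives an easy hand check on the recursion.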

C.3 Estimation
Much of the analysis discusses the behavior of the model around the posterior mode (i.e., the peak of the posterior distribution; it is also a maximum-likelihood estimate penalized by the prior distribution). I also start the Metropolis–Hastings chain from that point. To search for the posterior mode, I begin by running a genetic algorithm on a population of 60 points drawn from the prior distribution. The genetic algorithm searches the parameter space by mixing parameter sets in the population and also allowing random mutations. After 30 iterations of the genetic algorithm, I take the point in the population with the highest posterior density and use it as the starting point for Chris Sims’ CSMINWEL algorithm, a derivative-based hill-climbing algorithm designed for DSGE models. When CSMINWEL gets stuck, I also try the standard simplex algorithm. I ran this combined search 2500 times (each search takes roughly an hour, so access to a large computing cluster was essential). The point that I am calling the posterior mode was found to be the peak in fewer than 100 of the searches. In other words, it is extremely difficult to find the peak of the posterior likelihood. In general, I found that it was far easier to find the posterior mode when risk aversion or the inflation target was held fixed, and easier still when bond prices were also dropped from the estimation. Even though the priors help to add curvature to the posterior likelihood surface, I still find many local maxima, a problem that plagues the bond-pricing literature. Furthermore, since the model is so highly constrained, there is no straightforward way to use the more recent estimation algorithms for affine term structure models proposed by, for example, Hamilton and Wu (2012) and Joslin, Singleton, and Zhu (2011).

I simulate the posterior distribution using the adaptive random-walk Metropolis–Hastings algorithm of Haario, Saksman, and Tamminen (2001). I initialize the chain at the posterior mode. For the proposal distribution, I begin with a normal distribution whose variance matrix is equal to that of the prior multiplied by $2.38^2/d$, the optimal scaling factor of Gelman, Roberts, and Gilks (1996), where $d = 49$ is the dimension of the parameter vector. After 10,000 iterations of the algorithm, I update the variance matrix of the proposal distribution to be equal to the observed variance matrix for the first 10,000 iterations of the chain. Subsequently, the variance is updated on each iteration using the sample variance of the chain up to the current iteration.¹ I achieve relatively rapid mixing this way. The full chain has 1,000,000 draws, but it mixes well even by 100,000 draws.
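The adaptive random-walk Metropolis–Hastings scheme described above can be sketched as follows. This is a minimal illustration, not the code used for the estimation. It assumes that the adapted covariance is also scaled by 2.38²/d and receives a small diagonal jitter `eps` (as in Haario, Saksman, and Tamminen's formulation), that `init_cov` plays the role of the prior variance matrix, and that `n_fixed` (at least 2) is the number of initial iterations with a fixed proposal (10,000 in the text). The function names and the pure-Python Cholesky routine are hypothetical helpers.

```python
import math
import random


def cholesky(S):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    d = len(S)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def adaptive_metropolis(log_post, x0, init_cov, n_draws, n_fixed, rng, eps=1e-8):
    """Random-walk Metropolis with a proposal covariance learned from the chain.

    Returns the list of draws and the overall acceptance rate.
    """
    d = len(x0)
    scale = 2.38 ** 2 / d                      # Gelman-Roberts-Gilks scaling
    L = cholesky([[scale * init_cov[i][j] for j in range(d)] for i in range(d)])
    x, lp = list(x0), log_post(x0)             # x0 should have finite log posterior
    mean = [0.0] * d                           # running mean of the chain
    scatter = [[0.0] * d for _ in range(d)]    # running sum of outer products
    chain, accepted = [], 0
    for t in range(1, n_draws + 1):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        prop = [x[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        lp_prop = log_post(prop)
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = prop, lp_prop
            accepted += 1
        chain.append(list(x))
        # Welford-style update of the chain's mean and scatter matrix.
        delta = [x[i] - mean[i] for i in range(d)]
        mean = [mean[i] + delta[i] / t for i in range(d)]
        for i in range(d):
            for j in range(d):
                scatter[i][j] += delta[i] * (x[j] - mean[j])
        # From iteration n_fixed onward, propose with the chain's own covariance.
        if t >= n_fixed:
            cov = [[scale * scatter[i][j] / (t - 1) + (scale * eps if i == j else 0.0)
                    for j in range(d)] for i in range(d)]
            L = cholesky(cov)
    return chain, accepted / n_draws
```

A call such as `adaptive_metropolis(lambda x: -0.5 * sum(v * v for v in x), [0.0, 0.0], [[4.0, 0.0], [0.0, 4.0]], 20000, 1000, random.Random(0))` runs the sampler on a standard bivariate normal target. Refactoring the covariance on every iteration costs order d³ operations, so for a parameter vector of dimension 49 a practical implementation would likely update the factor less often or use a rank-one update.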

¹ A more common method in the DSGE literature is to use the Hessian of the posterior around the posterior mode to determine the variance of the proposal distribution. I have difficulty calculating the Hessian due to numerical instability.

BIBLIOGRAPHY

Abel, A. B. (1990): “Asset Prices under Habit Formation and Catching up with the Joneses,” The American Economic Review, Papers and Proceedings, 80(2), 38–42.

Abel, A. B. (1999): “Risk Premia and Term Premia in General Equilibrium,” Journal of Monetary Economics, 43(1), 3–33.

Abel, A. B., and O. J. Blanchard (1986): “The Present Value of Profits and Cyclical Movements in Investment,” Econometrica, 54(2), 249–273.

Abel, A. B., A. K. Dixit, J. C. Eberly, and R. S. Pindyck (1996): “Options, the Value of Capital, and Investment,” Quarterly Journal of Economics, 111(3), 753–777.

Alvarez, F., and U. J. Jermann (2005): “Using Asset Prices to Measure the Persistence of the Marginal Utility of Wealth,” Econometrica, 73(6), 1977–2016.

Andrews, D. W. (1993): “Tests for Parameter Instability and Structural Change with Unknown Change Point,” Econometrica, 61(4), 821–856.

Ang, A., and M. Piazzesi (2003): “No-Arbitrage Vector Autoregression of Term Structure Dynamics with Macroeconomic and Latent Variables,” Journal of Monetary Economics, 50, 745–787.

Atkeson, A., and L. E. Ohanian (2001): “Are Phillips Curves Useful for Forecasting Inflation?,” Federal Reserve Bank of Minneapolis Quarterly Review, 25(1), 2–11.

Attanasio, O. P., P. K. Goldberg, and E. Kyriazidou (2008): “Credit Constraints in the Market for Consumer Durables: Evidence from Micro Data on Car Loans,” International Economic Review, 49(2), 401–436.

Baker, M., R. Greenwood, and J. Wurgler (2003): “The Maturity of Debt Issues and Predictable Variation in Bond Returns,” Journal of Financial Economics, 70(2), 261–291.

Bansal, R., and A. Yaron (2004): “Risks for the Long Run: A Potential Resolution of Asset Pricing Puzzles,” Journal of Finance, 59(4), 1481–1509.

Barberis, N., M. Huang, and T. Santos (2001): “Prospect Theory and Asset Prices,” Quarterly Journal of Economics, 116(1), 1–53.

Barclay, M. J., and C. W. Smith, Jr. (1995): “The Maturity Structure of Corporate Debt,” Journal of Finance, 50(2), 609–631.

Basu, S., J. G. Fernald, and M. S. Kimball (2006): “Are Technology Improvements Contractionary?,” American Economic Review, 96(5), 1418–1448.

Baxter, M., and M. J. Crucini (1993): “Explaining Saving–Investment Correlations,” American Economic Review, 83(3), 416–436.

Beeler, J., and J. Y. Campbell (2011): “The Long-Run Risks Model and Aggregate Asset Prices: An Empirical Assessment,” Working paper.

Bekaert, G., S. Cho, and A. Moreno (2010): “New Keynesian Macroeconomics and the Term Structure,” Journal of Money, Credit, and Banking, 42(1), 33–62.

Berk, J. B., R. C. Green, and V. Naik (1999): “Optimal Investment, Growth Options, and Security Returns,” Journal of Finance, 54(5), 1553–1607.

Bernanke, B. S., and M. Gertler (1995): “Inside the Black Box: The Credit Channel of Monetary Policy Transmission,” Journal of Economic Perspectives, 9(4), 27–48.

Beveridge, S., and C. R. Nelson (1981): “A New Approach to Decomposition of Economic Time Series into Permanent and Transitory Components with Particular Attention to Measurement of the ‘Business Cycle’,” Journal of Monetary Economics, 7(2), 151–174.

Bloom, N. (2009): “The Impact of Uncertainty Shocks,” Econometrica, 77(3), 623–685.

Boldrin, M., L. J. Christiano, and J. D. M. Fisher (2001): “Habit Persistence, Asset Returns, and the Business Cycle,” American Economic Review, 91(1), 149–166.

Brunnermeier, M., and S. Nagel (2008): “Do Wealth Fluctuations Generate Time-Varying Risk Aversion? Micro-Evidence on Individuals’ Asset Allocation,” The American Economic Review, 98(3), 713–736.

Caballero, R. J. (1994): “Small Sample Bias and Adjustment Costs,” Review of Economics and Statistics, 76(1), 52–58.

Caldara, D., J. Fernandez-Villaverde, J. F. Rubio-Ramirez, and Y. Wen (2009): “Computing DSGE Models with Recursive Preferences,” Working paper.

Calvet, L. E., J. Y. Campbell, and P. Sodini (2009): “Fight Or Flight? Portfolio Rebalancing by Individual Investors,” The Quarterly Journal of Economics, 124(1), 301–348.

Calvet, L. E., and P. Sodini (2010): “Twin Picks: Disentangling the Determinants of Risk-Taking in Household Portfolios,” Working Paper.

Campanale, C., R. Castro, and G. L. Clementi (2010): “Asset Pricing in a Production Economy with Chew-Dekel Preferences,” Review of Economic Dynamics, 13(2), 379–402.

Campbell, J. Y. (1987): “Stock Returns and the Term Structure,” Journal of Financial Economics, 18(2), 373–399.

Campbell, J. Y. (1994): “Inspecting the Mechanism: An Analytical Approach to the Stochastic Growth Model,” Journal of Monetary Economics, 33(3), 463–506.

Campbell, J. Y. (2003): “Consumption-Based Asset Pricing,” in Handbook of the Economics of Finance, vol. 1, pp. 803–887. Elsevier Science.

Campbell, J. Y., and J. H. Cochrane (1999): “By Force of Habit: A Consumption-Based Explanation of Aggregate Stock Market Behavior,” Journal of Political Economy, 107(2), 205–251.

Campbell, J. Y., and A. Deaton (1989): “Why is Consumption so Smooth?,” Review of Economic Studies, 56(3), 357–373.

Campbell, J. Y., M. Lettau, B. G. Malkiel, and Y. Xu (2001): “Have Individual Stocks Become More Volatile? An Empirical Exploration of Idiosyncratic Risk,” Journal of Finance, 56(1), 1–43.

Campbell, J. Y., and N. G. Mankiw (1989): “Consumption, Income and Interest Rates: Reinterpreting the Time Series Evidence,” in NBER Macroeconomics Annual, ed. by O. J. Blanchard and S. Fischer.

Campbell, J. Y., and R. J. Shiller (1988): “The Dividend-Price Ratio and Expectations of Future Dividends and Discount Factors,” Review of Financial Studies, 1(3), 195–228.

Campbell, J. Y., and R. J. Shiller (1991): “Yield Spreads and Interest Rate Movements: A Bird’s Eye View,” Review of Economic Studies, 58(3), 495–514.

Carroll, C. D. (2002): “Portfolios of the Rich,” in Household Portfolios, ed. by L. Guiso, M. Haliassos, and T. Jappelli. MIT Press.

Chirinko, R. S. (1993): “Business Fixed Investment Spending: Modeling Strategies, Empirical Results, and Policy Implications,” Journal of Economic Literature, 31(4), 1875–1911.

Christiano, L. J., M. Eichenbaum, and R. Vigfusson (2004): “The Response of Hours to a Technology Shock: Evidence Based on Direct Measures of Technology,” Journal of the European Economic Association, 2(2–3), 381–395.

Christiano, L. J., M. Trabandt, and K. Walentin (2011): “DSGE Models for Monetary Policy Analysis,” in Handbook of Monetary Economics. North-Holland.

Clark, T. E., and M. W. McCracken (2001): “Tests of Equal Forecast Accuracy and Encompassing for Nested Models,” Journal of Econometrics, 105(1), 85–110.

Clark, T. E., and M. W. McCracken (2005): “Evaluating Direct Multistep Forecasts,” Econometric Reviews, 24(4), 369–404.

Clark, T. E., and K. D. West (2007): “Approximately Normal Tests for Equal Predictive Accuracy in Nested Models,” Journal of Econometrics, 138(1), 291–311.

Cochrane, J. H. (1991): “Production-Based Asset Pricing and the Link Between Stock Returns and Economic Fluctuations,” Journal of Finance, 46(1), 209–237.

Cochrane, J. H. (1996): “A Cross-Sectional Test of an Investment-Based Asset Pricing Model,” Journal of Political Economy, 104(3), 572–621.

Cochrane, J. H. (2005): “Financial Markets and the Real Economy,” NBER Working paper.

Cochrane, J. H. (2008): “The Dog That Did Not Bark: A Defense of Return Predictability,” Review of Financial Studies, 21(4), 1533–1575.

Cochrane, J. H. (2011): “Discount Rates,” Journal of Finance, 66(4), 1047–1108, AFA Presidential Address.

Cochrane, J. H., and M. Piazzesi (2005): “Bond Risk Premia,” American Economic Review, 95(1), 138–160.

Constantinides, G. M. (1990): “Habit Formation: A Resolution of the Equity Premium Puzzle,” The Journal of Political Economy, 98(3), 519–543.

Danthine, J.-P., J. B. Donaldson, and R. Mehra (1992): “The Equity Premium and the Allocation of Income Risk,” Journal of Economic Dynamics and Control, 16(3–4), 509–532.

Dew-Becker, I. (2011a): “Bond Pricing With a Time-varying Price Of Risk in an Estimated Medium-Scale Bayesian DSGE Model,” Working paper.

Dew-Becker, I. (2011b): “A Model of Time-Varying Risk Premia with Habits and Production,” Working paper.

Dew-Becker, I. (2012): “Essentially affine approximations for economic models,” Working paper.

Doh, T. (2011): “Yield Curve in an Estimated Nonlinear Macro Model,” Journal of Economic Dynamics & Control.

Duffee, G. R. (2002): “Term Premia and Interest Rate Forecasts in Affine Models,” Journal of Finance, 57(1), 405–443.

Dynan, K. E. (1993): “How Prudent are Consumers?,” Journal of Political Economy, 101(6), 1104–1113.

Dynan, K. E. (2000): “Habit Formation in Consumer Preferences: Evidence from Panel Data,” American Economic Review, 90(3), 391–406.

Eberly, J., S. Rebelo, and N. Vincent (2009): “Investment and Value: A Neoclassical Benchmark,” NBER Working Paper.

Edge, R. M. (2000): “The Effect of Monetary Policy on Residential and Structures Investment Under Differential Project Planning and Completion Times,” Federal Reserve Board of Governors International Finance Discussion Papers No. 671.

Epstein, L. G., and S. E. Zin (1989): “Substitution, Risk Aversion, and the Temporal Behavior of Consumption and Asset Returns: A Theoretical Framework,” Econometrica, 57(4), 937–969.

Epstein, L. G., and S. E. Zin (1991): “Substitution, Risk Aversion, and the Temporal Behavior of Consumption and Asset Returns: An Empirical Analysis,” The Journal of Political Economy, 99(2), 263–286.

Fama, E. F., and K. R. French (1989): “Business Conditions and Expected Returns on Stocks and Bonds,” Journal of Financial Economics, 25(1), 23–49.

Fama, E. F., and G. W. Schwert (1977): “Human Capital and Capital Market Equilibrium,” Journal of Financial Economics, 4(1), 95–125.

Fazzari, S. M., R. G. Hubbard, and B. C. Petersen (1988): “Financing Constraints and Corporate Investment,” Brookings Papers on Economic Activity, 1988(1), 141–206.

Fernandez-Villaverde, J., P. Guerron, J. F. Rubio-Ramirez, and M. Uribe (2011): “Risk Matters: The Real Effects of Volatility Shocks,” American Economic Review.

Fernandez-Villaverde, J., and J. Rubio-Ramirez (2004): “Comparing dynamic equilibrium models to data: a Bayesian approach,” Journal of Econometrics, 123(1), 153–187.

Francis, N., and V. A. Ramey (2005): “Is the Technology-Driven Real Business Cycle Hypothesis Dead? Shocks and Aggregate Fluctuations Revisited,” Journal of Monetary Economics, 52(8), 1379–1399.

Francis, N., and V. A. Ramey (2009): “Measures of Per Capita Hours and Their Implications for the Technology-Hours Debate,” Journal of Money, Credit, and Banking, 41(6), 1071–1097.

Fraumeni, B. (1997): “The Measurement of Depreciation in the U.S. National Income and Product Accounts,” Survey of Current Business, pp. 7–23.

Gali, J. (1999): “Technology, Employment, and the Business Cycle: Do Technology Shocks Explain Aggregate Fluctuations?,” American Economic Review, 89(1), 249–271.

Gelfand, A. E., and D. K. Dey (1994): “Bayesian Model Choice: Asymptotics and Exact Calculations,” Journal of the Royal Statistical Society. Series B, 56(3), 501–514.

Gelman, A., G. Roberts, and W. Gilks (1996): “Efficient Metropolis Jumping Rules,” in Bayesian Statistics, ed. by J. Bernardo, J. Berger, A. Dawid, and A. Smith, vol. 5. Oxford University Press.

Gertner, R. (1993): “Game Shows and Economic Behavior: Risk-Taking on ‘Card Sharks’,” Quarterly Journal of Economics, 108(2), 507–521.

Geweke, J. (1999): “Using simulation methods for Bayesian econometric models: inference, development, and communication,” Econometric Reviews, 18(1), 1–73.

Gilchrist, S., and E. Zakrajsek (2007): “Investment and the Cost of Capital: New Evidence from the Corporate Bond Market,” Working Paper.

Gomes, J. F., L. Kogan, and M. Yogo (2009): “Durability of Output and Expected Stock Returns,” Journal of Political Economy, 117(5), 941–986.

Gourio, F. (2010): “Disaster Risk and Business Cycles,” Working Paper.

Graeve, F. D., M. Dossche, M. Emiris, H. Sneessens, and R. Wouters (2010): “Risk premiums and macroeconomic dynamics in a heterogeneous agent model,” Journal of Economic Dynamics and Control, 34(9), 1680–1699.

Graham, J. R., and C. R. Harvey (2002): “How do CFOs Make Capital Budgeting and Capital Structure Decisions?,” Journal of Applied Corporate Finance, 15(1), 8–23.

Greenwood, R., S. Hanson, and J. Stein (2009): “A Gap-Filling Theory of Corporate Debt Maturity Choice,” Working Paper.

Grossman, S. J., and R. J. Shiller (1981): “The Determinants of the Variability of Stock Market Prices,” American Economic Review, Papers and Proceedings, 71(2), 222–227.

Gruber, J. (2006): “A Tax-Based Estimate of the Elasticity of Intertemporal Substitution,” NBER Working Paper 11945.

Guedes, J., and T. Opler (1996): “The Determinants of the Maturity of Corporate Debt Issues,” The Journal of Finance, 51(5), 1809–1833.

Guiso, L., A. K. Kashyap, F. Panetta, and D. Terlizzese (2002): “How Interest Sensitive is Investment? Very (when the data are well measured),” Working Paper.

Guiso, L., P. Sapienza, and L. Zingales (2011): “Time-Varying Risk Aversion.”

Gurkaynak, R. S., B. Sack, and E. Swanson (2005): “The Sensitivity of Long-Term Interest Rates to Economic News: Evidence and Implications for Macroeconomic Models,” American Economic Review, 95(1), 425–436.

Guvenen, F. (2009): “A Parsimonious Macroeconomic Model for Asset Pricing,” Econometrica, 77(6), 1711–1750.

Haario, H., E. Saksman, and J. Tamminen (2001): “An Adaptive Metropolis Algorithm,” Bernoulli, 7(2), 223–242.

Hall, R. E. (1988): “Intertemporal Substitution in Consumption,” Journal of Political Economy, 96(2), 339–357.

Hamilton, J. D., and C. Wu (2012): “Identification and Estimation of Gaussian Affine-Term-Structure Models,” Journal of Econometrics, Working Paper.

Hansen, L. P., J. C. Heaton, and N. Li (2008): “Consumption Strikes Back? Measuring Long-Run Risk,” Journal of Political Economy, 116(2), 260–302.

Hansen, L. P., and R. Jagannathan (1991): “Implications of Security Market Data for Models of Dynamic Economies,” Journal of Political Economy, 99(2), 225–262.

Hassett, K. A., and R. G. Hubbard (2002): “Tax Policy and Business Investment,” in Handbook of Public Economics, pp. 1293–1343. Elsevier Science B.V.

House, C. L., and M. D. Shapiro (2008): “Temporary Investment Tax Incentives: Theory with Evidence from Bonus Depreciation,” American Economic Review, 98(3), 737–768.

Hulten, C. R., and F. C. Wykoff (1981): “The Measurement of Economic Depreciation,” in Depreciation, Inflation, and the Taxation of Income from Capital, ed. by C. R. Hulten, pp. 81–125. The Urban Institute Press, Washington, D.C.

Jaccard, I. (2007): “Asset Returns and Labor Supply in a Production Economy With Habit Memory,” ECB Working Paper.

Jermann, U. J. (1998): “Asset Pricing in Production Economies,” Journal of Monetary Economics, 41(2), 257–275.

Joslin, S., K. J. Singleton, and H. Zhu (2011): “A New Perspective on Gaussian Dynamic Term Structure Models,” Review of Financial Studies, 24(3), 926–970.

Judd, K. L. (1999): Numerical Methods for Economists. MIT Press, Cambridge, MA.

Justiniano, A., G. Primiceri, and A. Tambalotti (2010): “Investment Shocks and Business Cycles,” Journal of Monetary Economics, 57(2), 132–145.

Kaltenbrunner, G., and L. A. Lochstoer (Forthcoming): “Long-Run Risk Through Consumption Smoothing,” The Review of Financial Studies.

Kashyap, A. K., and J. C. Stein (2000): “What Do a Million Observations on Banks Say About the Transmission of Monetary Policy?,” American Economic Review, 90(3), 407–428.

Kiefer, N. M., and T. J. Vogelsang (2005): “A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests,” Econometric Theory, 21(6), 1130–1164.

Kiefer, N. M., T. J. Vogelsang, and H. Bunzel (2000): “Simple Robust Testing of Regression Hypotheses,” Econometrica, 68(3), 695–714.

King, R. G., C. I. Plosser, and S. T. Rebelo (1988): “Production, Growth and Business Cycles: I. The Basic Neoclassical Model,” Journal of Monetary Economics, 21(2–3), 195–232.

Kreps, D. M., and E. L. Porteus (1978): “Temporal Resolution of Uncertainty and Dynamic Choice Theory,” Econometrica, 46(1), 185–200.

LeRoy, S. F., and R. D. Porter (1981): “The Present-Value Relation: Tests Based on Implied Variance Bounds,” Econometrica, 49(3), 555–574.

Lettau, M. (2003): “Inspecting The Mechanism: Closed-Form Solutions for Asset Prices in Real Business Cycle Models,” The Economic Journal, 113(489), 550–575.

Lettau, M., and S. Ludvigson (2001): “Consumption, Aggregate Wealth, and Expected Stock Returns,” Journal of Finance, 56(3), 815–849.

Lettau, M., and H. Uhlig (2000): “Can Habit Formation be Reconciled with Business Cycle Facts?,” Review of Economic Dynamics, 3(1), 79–99.

Lettau, M., and J. A. Wachter (2007): “Why Is Long-Horizon Equity Less Risky? A Duration-Based Explanation of the Value Premium,” Journal of Finance, 62(1), 55–92.

Longstaff, F. A., and E. S. Schwartz (1992): “Interest Rate Volatility and the Term Structure: A Two-Factor General Equilibrium Model,” Journal of Finance, 47(4), 1259–1282.

Lown, C. S., and D. P. Morgan (2006): “The Credit Cycle and the Business Cycle: New Findings Using the Loan Officer Opinion Survey,” Journal of Money, Credit, and Banking, 38(6), 1575–1597.

Melino, A., and A. X. Yang (2003): “State-Dependent Preferences can Explain the Equity Premium Puzzle,” Review of Economic Dynamics, 6(4), 806–830.

Miao, J., and P. Wang (2010): “Credit Risk and Business Cycles,” Working Paper.

Oliner, S. D., G. D. Rudebusch, and D. Sichel (1995): “New and Old Models of Business Investment: A Comparison of Forecasting Performance,” Journal of Money, Credit, and Banking, 27(3), 806–826.

Oliner, S. D., G. D. Rudebusch, and D. Sichel (1996): “The Lucas Critique Revisited: Assessing the Stability of Empirical Euler Equations for Investment,” Journal of Econometrics, 70(1), 291–316.

Opler, T., L. Pinkowitz, R. Stulz, and R. Williamson (1999): “The Determinants and Implications of Corporate Cash Holding,” Journal of Financial Economics, 52(1), 3–46.

Piazzesi, M. (2010): “Affine Term Structure Models,” in Handbook of Financial Econometrics. Elsevier.

Post, T., M. J. van den Assem, G. Baltussen, and R. H. Thaler (2008): “Deal or No Deal? Decision Making under Risk in a Large-Payoff Game Show,” American Economic Review, 98(1), 38–71.

Ravina, E. (2007): “Habit Formation and Keeping Up with the Joneses: Evidence from Micro Data,” Mimeo.

Rouwenhorst, K. G. (1995): “Asset Pricing Implications of Equilibrium Business Cycle Models,” in Frontiers of Business Cycle Research, ed. by T. F. Cooley, chap. 10, pp. 294–330. Princeton University Press.

Rudebusch, G. D., and E. T. Swanson (2008): “Examining the Bond Premium Puzzle with a DSGE Model,” Journal of Monetary Economics, 55(Supplement 1), S111–S126.

Rudebusch, G. D., and E. T. Swanson (Forthcoming): “The Bond Premium in a DSGE Model with Long-Run Real and Nominal Risks,” American Economic Journal: Macroeconomics, Federal Reserve Bank of San Francisco Working Paper 2008-31.

Schaller, H. (2006): “Estimating the Long-Run User Cost Elasticity,” Journal of Monetary Economics, 53(4), 725–736.

Shiller, R. J. (1981): “Do Stock Prices Move Too Much to be Justified by Subsequent Changes in Dividends?,” American Economic Review, 71(3), 421–436.

Solow, R. M. (1957): “Technical Change and the Aggregate Production Function,” Review of Economics and Statistics, 39(3), 312–320.

Stohs, M. H., and D. C. Mauer (1996): “The Determinants of Corporate Debt Maturity Structure,” The Journal of Business, 69(3), 279–312.

Summers, L. H. (1981): “Taxation and Corporate Investment: A q-Theory Approach,” Brookings Papers on Economic Activity, 1981(1), 67–140.

Swanson, E. T. (2011): “Risk Aversion and the Labor Margin in Dynamic Equilibrium Models,” American Economic Review.

Tallarini, T. D. (2000): “Risk-Sensitive Real Business Cycles,” Journal of Monetary Economics, 45(3), 507–532.

Tanaka, T., C. F. Camerer, and Q. Nguyen (2010): “Risk and Time Preferences: Linking Experimental and Household Survey Data from Vietnam,” American Economic Review, 100(1), 557–571.

Tevlin, S., and K. Whelan (2003): “Explaining the Investment Boom of the 1990s,” Journal of Money, Credit, and Banking, 35(1), 1–22.

Titman, S., and R. Wessels (1988): “The Determinants of Capital Structure Choice,” Journal of Finance, 43(1), 1–19.

van Ark, B., and R. Inklaar (2006): “Catching up or getting stuck? Europe’s troubles to exploit ICT’s productivity potential,” GGDC Research Memorandum GD-79.

van Binsbergen, J. H., J. Fernandez-Villaverde, R. S. Koijen, and J. F. Rubio-Ramirez (2011): “The Term Structure of Interest Rates in a DSGE Model with Recursive Preferences,” Working paper.

Vissing-Jorgensen, A., and O. P. Attanasio (2003): “Stock-Market Participation, Intertemporal Substitution, and Risk-Aversion,” The American Economic Review, 93(2), 383–391.

Vissing-Jorgensen, A. (2002): “Limited Asset Market Participation and the Elasticity of Intertemporal Substitution,” Journal of Political Economy, 110(4), 825–853.

Wachter, J. A. (2010): “Can Time-Varying Risk of Rare Disasters Explain Aggregate Stock Market Volatility?,” Working Paper.

Weil, P. (1989): “The Equity Premium Puzzle and the Risk-Free Rate Puzzle,” Journal of Monetary Economics, 24(3), 401–421.

Woodford, M. (2003): Interest and Prices. Princeton University Press, Princeton, NJ.

Yang, W. (2008): “Intertemporal Substitution and Equity Premium: A Perspective With Habit in Epstein-Zin Preferences,” Working Paper.