Carbon Taxes, Path Dependency, and Directed Technical Change: Evidence from the Auto Industry

Can directed technical change be used to combat climate change? We construct new firm-level panel data on auto industry innovation distinguishing between “dirty” (internal combustion engine) and “clean” (e.g., electric, hybrid, and hydrogen) patents across 80 countries over several decades. We show that firms tend to innovate more in clean (and less in dirty) technologies when they face higher tax-inclusive fuel prices. Furthermore, there is path dependence in the type of innovation (clean/dirty) both from aggregate spillovers and from the firm’s own innovation history. We simulate the increases in carbon taxes needed to allow clean technologies to overtake dirty technologies.


I. Introduction
There is a wide scientific consensus that greenhouse gas emissions from human activities, in particular carbon dioxide (CO2), are responsible for the current observed warming of the planet. Automobiles are major contributors to these emissions: according to the International Energy Agency, in 2009 road transport accounted for 4.88 gigatons of CO2, which represented 16.5 percent of global CO2 emissions (transport as a whole was responsible for 22.1 percent). In this paper we look at technological innovations in the auto industry and examine whether government intervention can affect the direction of this innovation. More specifically, we construct a new panel data set on auto innovations to examine whether firms redirect technical change away from dirty (polluting) technologies and toward cleaner technologies in response to increases in fuel prices (our proxy for a carbon tax) in the context of path-dependent innovation. We associate "dirty" innovation with internal combustion engine patents and "clean" innovation with electric, hybrid, and hydrogen vehicle patents, but we carefully discuss issues around this definition and consider various alternatives. 1
Our main data are drawn from the European Patent Office's (EPO) World Patent Statistical Database (PATSTAT). These data cover close to the population of all worldwide patents since the mid-1960s. Our outcome measure focuses on high-value "triadic" patents, which are those that have been taken out in all three of the world's major patent offices: the EPO, the Japan Patent Office (JPO), and the US Patent and Trademark Office (USPTO). Our database also reports the name of patent applicants, which in turn allows us to match clean and dirty patents with distinct patent holders, each of which has its own history of clean versus dirty patenting. Finally, we know the geographical location of the inventors listed on the patent, so we can examine location-based knowledge spillovers.
We report three important empirical findings. First, higher fuel prices induce firms to redirect technical change away from dirty innovation and toward clean innovation. Second, a firm's propensity to innovate in clean technologies appears to be stimulated by its own past history of clean innovations (and vice versa for dirty technologies). In other words, there is path dependence in the direction of technical change: firms that have innovated a lot in dirty technologies in the past will find it more profitable to innovate in dirty technologies in the future. 2 Our third finding is that a firm's direction of innovation is affected by local knowledge spillovers. We measure this using the geographical location of its inventors. More specifically, a firm is more likely to innovate in clean technologies if its inventors are located in countries where other firms have been undertaking more clean innovations (and vice versa for dirty technologies). This provides an additional channel that reinforces path dependency.
Our paper relates to several strands in the literature. First, our work is linked to the literature on climate change, initiated by Nordhaus (1994). 3 We contribute to this literature by focusing on the role of innovation in mitigating global warming and by looking at how various policies can induce more clean innovation in the auto industry.
We also connect with work on directed technical change, in particular, Acemoglu (1998, 2002), which itself was inspired by early contributions by Hicks (1932) and Habakkuk (1962). 4 We contribute to this literature by providing empirical evidence on the role of carbon prices in directing technical change. Earlier work by Popp (2002) is closely related to our paper. That paper uses aggregate US patent data from 1970-94 to study the effect of energy prices on energy-efficient innovations. Popp finds a significant impact from both energy prices and past knowledge stocks on the direction of innovation. However, since he uses aggregate data, a concern is that his regressions also capture macroeconomic shocks correlated with both innovation and the energy price. 5 The novelty of our approach is that we use international firm-level panel data and exploit differences in a firm's exposure to different markets to build firm-specific fuel prices, which allows us to provide microeconomic evidence of directed technical change. Acemoglu et al. (2016, in this issue) calibrate a microeconomic model of directed technical change to derive quantitative estimates of the optimal climate change policy. The focus of our work is more empirical, but we use our results to perform a related exercise: we simulate the aggregate evolution of future clean and dirty knowledge stocks and analyze how this evolution would be affected by changes in carbon taxes.
Finally, we draw on the extensive literature in industrial organization that estimates the demand for vehicles (energy-efficient and otherwise) as a function of fuel prices and other factors. 6 We go beyond this work by looking at the rate and direction of innovation.
The paper is organized as follows. Section II develops a simple model to guide our empirical analysis, and Section III presents the econometric methodology. The data are presented in Section IV with some descriptive statistics. Section V reports the results and discusses their robustness and some extensions. We perform the simulation exercise in Section VI. Section VII presents conclusions. Appendix A provides details on the theoretical model, appendix B on the econometric model, and appendix C on the data. All appendices are available online.
2 As shown in Acemoglu et al. (2012), this path dependency feature, when combined with the environmental externality (whereby firms do not factor in the loss in aggregate productivity or consumer utility induced by environmental degradation), will induce a laissez-faire economy to produce and innovate too much in dirty technologies compared to the social optimum. This in turn calls for government intervention to "redirect" technical change.
3 Nordhaus (1994) developed a dynamic Ramsey-based model of climate change (the dynamic integrated climate-economy [DICE] model), which added equations linking production to emissions. Subsequent contributions have notably examined the implications of risk and discounting for the optimal design of environmental policy. In particular, see Stern (2006), Nordhaus (2007), Weitzman (2007, 2009), Dasgupta (2008), Mendelsohn et al. (2008), von Below and Persson (2008), and Yohe, Tol, and Anthoff (2009). Recently, Golosov et al. (2014) have extended this literature by solving for the optimal policy in a full dynamic stochastic general equilibrium framework.
4 The theoretical literature on directed technical change is well developed. For applications to climate change, see, e.g., Messner (1997), Grübler and Messner (1998), Goulder and Schneider (1999), Nordhaus (2002), van der Zwaan et al. (2002), Buonanno, Carraro, and Galeotti (2003), Smulders and de Nooij (2003), Sue Wing (2003), Manne and Richels (2004), Gerlagh (2008), Gerlagh, Kverndokk, and Rosendahl (2009), and Gans (2012). In contrast, empirical work on directed technical change is scarcer; but see Acemoglu and Linn (2004) for evidence in the pharmaceutical industry, Acemoglu and Finkelstein (2008) in the health care industry, or, more recently, Hanlon (2015) for historical evidence in the textile industry.
5 Further evidence of directed technical change in the context of energy saving can be found in the study by Newell, Jaffe, and Stavins (1999), who focus on the air conditioning industry, or by Crabb and Johnson (2010), who also look at energy-efficient automotive technology. Haščič et al. (2009) investigate the role of regulations and fuel price on automotive emission control technologies. Hassler, Krussell, and Olovsson (2012) find evidence for a trend increase in energy-saving technologies following oil price shocks. They measure the energy-saving bias of technology as a residual, which is attractive as it sidesteps the need to classify patents into distinct classes. On the other hand, our technology variables are more directly related to the innovation we want to measure.
6 For example, using around 86 million transactions, Alcott and Wozny (2014) find that fuel prices reduce the demand for autos, but by less than an equivalent increase in the vehicle price. They argue that this is a behavioral bias causing consumers to undervalue fuel price changes. Readers are referred to this paper for an extensive review of the literature on fuel prices and the demand for autos. Busse, Knittel, and Zettelmeyer (2013) use similar data in a more reduced-form approach but, by contrast, find a much larger impact of fuel price on auto demand. Although the magnitude of the fuel price effect on demand differs between studies, it is generally accepted that there is an important effect of fuel prices on vehicle demand.

II. Theoretical Predictions
In this section we develop theoretical predictions that will guide our empirical analysis. Full details are in appendix A. We consider a one-period model of an economy in which consumers derive utility from an outside good and from motor vehicle services. To abstract from income effects, utility is quasi-linear with respect to the outside good $C_0$ (chosen as the numeraire).
To consume motor vehicle services, consumers need to buy cars and fuel (call this a "dirty car bundle") or cars and electricity (call this a "clean car bundle"). Utility is then given by [utility equation], where the consumption of variety i of clean cars together with the corresponding clean energy (electricity) is $Y_{ci} = \min(y_{ci}, \xi_{ci} e_{ci})$, and the consumption of variety i of dirty cars together with the corresponding dirty energy (fuel) is $Y_{di} = \min(y_{di}, \xi_{di} e_{di})$. The term $e_{zi}$ is the amount of energy consumed for variety i of a type z car, where z = c, d (that is, clean or dirty); $\varepsilon$ is the elasticity of substitution between clean and dirty cars; $\sigma$ is the elasticity of substitution among varieties within each type of car; and $\beta$ is the elasticity of consumption of motor vehicle services with respect to its index price (this parameter measures the degree of substitutability between motor vehicle services and the outside good). Finally, $\xi_{ci}$ (respectively, $\xi_{di}$) is the energy efficiency of clean (respectively, dirty) cars. We impose the following parameter restrictions: $1 < \varepsilon \le \sigma$, so that clean cars are more substitutable with each other than with dirty cars; and $\varepsilon > \beta$, that is, the elasticity of substitution between clean and dirty cars is larger than the price elasticity for motor vehicle services (which implies that the elasticity of substitution between clean and dirty cars is larger than that between motor vehicle services and the outside good).
Varieties of cars are produced by monopolists. Each monopolist owns a given number of varieties in clean or dirty cars of mass zero. The monopoly producer of variety i of a type z car produces $A_{zi}$ cars using one unit of outside good as an input, and the energy requirement for that variety is $\xi_{zi}$. Therefore, $\xi_{zi}$ captures energy-augmenting technologies, while $A_{zi}$ captures technologies that augment the other inputs (labor, for instance) for a car of type z.
Prior to production, monopolists can spend R&D resources to increase the level of their technologies (we assume that the cost function is quadratic in the amount of technological improvement). We refer to increases in $A_{di}$ as "dirty" innovations: such an innovation reduces the price of dirty cars and increases the demand for fossil fuel, generating more emissions. Increases in $\xi_{di}$ are "grey" innovations; they reduce the amount of emissions per unit of "dirty car bundles," but they also increase the demand for dirty cars (through a "rebound" effect), so that the impact on emissions is ambiguous. Increases in $\xi_{ci}$ or $A_{ci}$ are clean innovations; they lead to a substitution from dirty car consumption to clean car consumption, leading to a decrease in emissions. 7
The model is solved in appendix A. We show that for typical parameter values we can derive some key predictions.
Prediction 1. An increase in the price of the fossil fuel increases innovation in clean technologies, decreases innovation in dirty technologies, and has an ambiguous impact on innovation in grey technologies.
Prediction 2. Firms with an initially higher level of clean technologies will tend to innovate more in clean technologies. Similarly, those with higher initial levels of dirty technologies will tend to innovate more in dirty technologies.
Here, we provide only the intuition for these results. First, on the impact of an increase in fuel price on clean innovations (prediction 1), a higher fuel price makes the dirty bundle more expensive; and since clean and dirty cars are substitutes, this encourages the consumption of clean cars. Since the market share of clean cars is now larger, the return to innovation in clean cars is also larger. For dirty cars, a higher fuel price reduces the market share and therefore profits, discouraging both dirty and grey innovation. However, it also increases the returns from grey innovation, as saving on fuel reduces the price of a car bundle more when fuel prices are high. The total impact on grey innovations is therefore ambiguous (it is more likely to be negative when the price elasticity of cars is larger and when clean and dirty cars are closer substitutes). 8 Second, on path dependence within firms (prediction 2), a higher level of dirty technologies implies a larger market share, increasing the incentives to innovate in dirty technologies. Against this, however, more dirty technologies imply that there are lower marginal benefits to making investments that increase productivity and reduce the prices of a dirty car bundle further. The net effect is positive when the elasticity of substitution is sufficiently large (so that the market size effect is large). The same applies to grey and clean technologies.
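The fuel-saving channel in this intuition can be made explicit with a short derivation. This is a sketch under stated assumptions, not the appendix A solution: write $\xi_{di}$ for the energy efficiency of dirty variety i, and let $p_{di}$ (the car price) and $p_f$ (the fuel price) be notation introduced here for illustration. If one unit of the dirty bundle pairs one car with $1/\xi_{di}$ units of fuel, the bundle price $P_{di}$ and its sensitivity to efficiency are:

```latex
P_{di} = p_{di} + \frac{p_f}{\xi_{di}},
\qquad
\frac{\partial P_{di}}{\partial \xi_{di}} = -\frac{p_f}{\xi_{di}^{2}},
\qquad
\frac{\partial}{\partial p_f}\left|\frac{\partial P_{di}}{\partial \xi_{di}}\right| = \frac{1}{\xi_{di}^{2}} > 0 .
```

The last expression says that a given efficiency gain lowers the bundle price by more when fuel is more expensive, which is the force pushing grey innovation up; the offsetting force is the loss of market share for dirty cars as $P_{di}$ rises with $p_f$.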
These predictions are also generated by other models in the literature. Acemoglu et al. (2012) and Gans (2012) study models in which innovation can augment a clean or a dirty energy technology and show that a carbon tax (equivalent here to a higher fuel price) increases innovation in clean energy-augmenting technologies (to the detriment of dirty energy-augmenting technologies) provided that the two inputs are substitutes. This is similar to the trade-off between clean and dirty innovations in our model. Smulders and de Nooij (2003) and Hassler et al. (2012) consider models in which innovation can augment either (fossil fuel) energy or other inputs that are complementary to it. An increase in the price of energy redirects innovation toward energy-augmenting technology, but since the total amount of innovation may decrease, the net impact on energy-augmenting innovation is ambiguous (this is similar to what happens to grey innovations in our model).
Our model departs from these models, however, in three main respects. First, we simultaneously consider clean, dirty, and grey technologies when looking at path dependence. Second, we allow for firm heterogeneity. Both aspects are directly relevant to our empirical analysis since it is based on firm-level data, and we identify the role of path dependence from the difference in innovation efforts by firms with differing technology levels. Third, we allow for an externality whereby local aggregate knowledge in a given technology exogenously contributes to a firm's own knowledge stock. This directly delivers the third prediction, which we take to the data.
Prediction 3. Firms innovate more in clean technologies when the aggregate level of clean technologies is higher in neighboring varieties (and similarly for dirty technologies).
8 In app. A, we further show that the impact of an increase in fuel price on innovation is not the same for all varieties if their productivity levels differ. Indeed, the fuel price increase affects relatively less the varieties that have a high level of grey over dirty technologies; therefore, these varieties can increase their market share at the expense of other dirty cars. This has the effect of increasing both dirty and grey innovations. By contrast, both types of innovations are further reduced for varieties with a low ratio of grey to dirty technology levels.

General Approach
Consider the following Poisson specification for the determination of firm innovation in clean technologies: 9 [equation (1)], where $PAT_{C,it}$ is the number of patents applied for in clean technologies by firm i in year t; $A_{C,it}$ is the firm's knowledge stock relevant for clean innovation, which depends on both its own stocks of past clean and dirty innovation and the aggregate spillovers from other firms (discussed below); $u_{C,it}$ is an error term; $\exp(\cdot)$ is the exponential operator; and $FP_{it}$ is the fuel price. We lag prices and knowledge stocks to reflect delayed response and to mitigate contemporaneous feedback effects. 10 In the robustness section we show that this functional form is reasonable by comparing it with alternative dynamic representations using other lag structures and the Popp (2002) approach.
The fuel price has independent variation across time and countries primarily because of country-specific taxes, and we show the robustness of our results to using just fuel taxes instead of (tax-inclusive) fuel prices. The profile of car sales across countries differs between auto firms. For example, General Motors has some "home bias" toward the US market, whereas Toyota has a home bias toward the Japanese market (i.e., they sell more in these countries than one would expect from country and firm observables alone). Thus, different firms are likely to be differently exposed to tax changes in different countries, and the fuel price has a firm-specific component. This firm-specific difference in market shares across countries could arise because of product differentiation and heterogeneous tastes or perhaps because of government policies to promote domestic firms. To take this heterogeneity into account, we use the firm's history of patent filing to assess the relative importance of the various markets the firm is operating in and construct firm-specific weights on fuel prices for the corresponding market. Simply put, an unexpected increase in US fuel taxes will have a more salient impact on car makers with a bigger market share in the United States than on those with a smaller market share. We discuss this in more detail in Section IV.
We parameterize the firm's total knowledge stock as [equation (2)]. The firm's knowledge will likely depend on its own history of innovation, and we denote this as $K_{C,it}$ (firm's own stock of clean innovation) and $K_{D,it}$ (firm's own stock of dirty innovation). 11 In addition to building on their own past innovations, firms will also "stand on the shoulders of giants," so we allow their knowledge stock to depend on spillovers from other firms in both clean ($SPILL_{C,it}$) and dirty technologies ($SPILL_{D,it}$). We use stocks of economywide patents to construct these country-specific spillover measures. Drawing on the evidence that knowledge has a geographically local component (e.g., Jaffe, Trajtenberg, and Henderson 1993), we use the firm's distribution of inventors across countries to weight the country spillover stocks. In other words, if the firm has many inventors in the United States, regardless of whether the headquarters of the firm is in Tokyo or Detroit, then the knowledge stock in the United States is given a higher weight (see Section IV).
There are of course other factors that may influence innovation in addition to fuel prices and the past history of innovation. These include government R&D subsidies for clean innovation, regulations over emissions, and the size and income level of the countries a firm is expecting to sell to (proxied by GDP and GDP per capita). We denote these potentially observable variables by the vector $w_{C,it}$. We also allow for unobservable factors by introducing a firm fixed effect ($\eta_{C,i}$), a full set of time dummies ($T_{C,t}$), and an error term ($u_{C,it}$, assumed to be uncorrelated with the right-hand-side variables).
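The firm-specific fuel price, the spillover measures, and the own knowledge stocks all come from simple aggregation steps, which can be sketched as follows. This is an illustration under stated assumptions rather than the paper's construction: averaging in logs, the 20 percent depreciation rate, the firm labels, and every number below are hypothetical, and the function names are invented here.

```python
import math


def firm_weighted_log_average(weights, country_values):
    """Weighted average of log country-level values for one firm.

    `weights` maps country -> non-negative weight (e.g. the firm's share of
    patent filings or inventors in that country); they are normalized to sum
    to one. Averaging in logs is an assumption made for this sketch.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum((w / total) * math.log(country_values[c]) for c, w in weights.items())


def perpetual_inventory(flows, depreciation):
    """Stock recursion K_t = (1 - depreciation) * K_{t-1} + flow_t, with K = 0 before the first period."""
    stock = 0.0
    stocks = []
    for flow in flows:
        stock = (1.0 - depreciation) * stock + flow
        stocks.append(stock)
    return stocks


# Hypothetical tax-inclusive fuel prices by country (arbitrary units).
fuel_price = {"US": 0.6, "JP": 1.2, "DE": 1.5}

# Hypothetical market-exposure weights for two firms.
firm_a_weights = {"US": 70, "JP": 5, "DE": 25}   # mostly exposed to the US
firm_b_weights = {"US": 30, "JP": 60, "DE": 10}  # mostly exposed to Japan

firm_a_log_fp = firm_weighted_log_average(firm_a_weights, fuel_price)
firm_b_log_fp = firm_weighted_log_average(firm_b_weights, fuel_price)

# A 10 percent rise in the US price moves firm A's price index more than firm B's.
higher_us = dict(fuel_price, US=fuel_price["US"] * 1.1)
firm_a_change = firm_weighted_log_average(firm_a_weights, higher_us) - firm_a_log_fp
firm_b_change = firm_weighted_log_average(firm_b_weights, higher_us) - firm_b_log_fp

# Own clean knowledge stock from a hypothetical series of yearly patent counts.
clean_stock = perpetual_inventory([0, 2, 1, 4], depreciation=0.2)

print(firm_a_log_fp, firm_b_log_fp)
print(firm_a_change, firm_b_change)
print(clean_stock)
```

The same weighted-average function applies to spillovers by passing inventor-location weights and country-level patent stocks instead of market weights and fuel prices. Holding the weights fixed over time, as done here, means that variation in the firm-level index comes only from changes in the country-level series.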
Adding these extra terms and substituting equation (2) into (1) gives us our main empirical equation for clean innovation: [equation (3)]. Symmetrically, we can derive an equation for dirty innovation: [equation (4)].
Section II yielded predictions on the signs of the coefficients in these two equations. If higher fuel prices induce more clean than dirty innovation, then the marginal effect of the fuel price must be larger on clean innovation than on dirty innovation: $\beta_{C,P} > \beta_{D,P}$, and we would further expect that $\beta_{C,P} > 0$ and $\beta_{D,P} < 0$. 12 Next, for there to be path dependence in the direction of innovation, it should be the case that (ceteris paribus) firms that are exposed to more dirty spillovers become more prone to conduct dirty innovation in the future: that is, $\beta_{D,2} > 0$ and $\beta_{D,2} > \beta_{C,2}$. In the clean innovation equation we have $\beta_{C,1} > 0$ and $\beta_{C,1} > \beta_{D,1}$. Furthermore, path dependence should involve similar effects working through a firm's own accumulated knowledge: $\beta_{D,4} > 0$ and $\beta_{D,4} > \beta_{C,4}$ ($\beta_{C,3} > 0$ and $\beta_{C,3} > \beta_{D,3}$). Also, we would expect that the positive effect of dirty spillovers and dirty knowledge stocks on dirty innovation would be larger than the effects of clean spillovers and clean knowledge stocks: $\beta_{D,2} > \beta_{D,1}$ and $\beta_{D,4} > \beta_{D,3}$. The reverse predictions should all apply for the clean equation: $\beta_{C,2} < \beta_{C,1}$ and $\beta_{C,4} < \beta_{C,3}$.
11 We construct stocks using the perpetual inventory method but show robustness to using patent flows and to considering alternative assumptions over knowledge depreciation rates. Some firms have zero lagged knowledge stock in some periods, so we also add in three dummy indicator variables for when lagged clean stock is zero, lagged dirty stock is zero, or both are zero.
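Because the sign predictions are stated in terms of coefficient indices, it helps to see the two estimating equations written out. The following is a sketch of one form consistent with the variable definitions and index usage in the text, not a verbatim copy of the paper's equations: the logarithms, the one-period lag subscripts, the coefficient vectors $\beta_{C,w}$ and $\beta_{D,w}$, and the multiplicative placement of the fixed effect are assumptions made here for illustration.

```latex
\begin{align*}
PAT_{C,it} &= \exp\!\big(\beta_{C,P}\ln FP_{i,t-1}
  + \beta_{C,1}\ln SPILL_{C,i,t-1} + \beta_{C,2}\ln SPILL_{D,i,t-1} \\
  &\qquad\quad + \beta_{C,3}\ln K_{C,i,t-1} + \beta_{C,4}\ln K_{D,i,t-1}
  + \beta_{C,w}' w_{C,it} + T_{C,t}\big)\,\eta_{C,i} + u_{C,it}, \\
PAT_{D,it} &= \exp\!\big(\beta_{D,P}\ln FP_{i,t-1}
  + \beta_{D,1}\ln SPILL_{C,i,t-1} + \beta_{D,2}\ln SPILL_{D,i,t-1} \\
  &\qquad\quad + \beta_{D,3}\ln K_{C,i,t-1} + \beta_{D,4}\ln K_{D,i,t-1}
  + \beta_{D,w}' w_{D,it} + T_{D,t}\big)\,\eta_{D,i} + u_{D,it}.
\end{align*}
```

Under this indexing, coefficients 1 and 2 attach to clean and dirty spillovers and coefficients 3 and 4 to the firm's own clean and dirty stocks, which is the mapping the sign predictions rely on.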

Dynamic Count Data Models with Fixed Effects
To estimate equations (3) and (4) we use [equation], where $z \in \{C, D\}$ and $x_{it}$ is the vector of regressors. We compare a number of econometric techniques to account for firm-level fixed effects $\eta_{zi}$ in these Poisson models. Our baseline is an econometric model we label CFX, the control function fixed-effect estimator. It builds on the presample mean scaling estimator proposed in Blundell, Griffith, and Van Reenen (1999) (see also Blundell, Griffith, and Van Reenen 1995; Blundell, Griffith, and Windmeijer 2002). Blundell et al. suggest conditioning on the presample average of the dependent variable to proxy out the fixed effect. Like the Blundell et al. (BGVR) estimator, the CFX uses a control function approach to deal with the fixed effect; but rather than using information from the presample period in the control function, we simultaneously estimate the main regression equation and a second equation allowing us to identify the control function from future data (similar to the idea of taking orthogonal deviations in the linear panel data literature; see Arellano 2003). The full details on this are provided in appendix B, but in a nutshell, we use CFX to deal with a potential concern with the BGVR approach, namely, that it requires a long presample history of realizations of the dependent variable. However, in our data, patenting (particularly clean patenting) is concentrated toward the end of our sample period.
Below, we provide results using both the CFX and BGVR methods as well as two other common approaches. First, we use the Hausman, Hall, and Griliches (1984) method (the count data equivalent to the within-groups estimator) even though this requires strict exogeneity, which is inconsistent with models including functions of the lagged dependent variable as we have in equations (3) and (4). Second, we implement some simple linear within-groups models, adding an arbitrary constant to the dependent variable before taking logarithms.
We show that all these approaches deliver similar qualitative results, although the CFX provides the best overall fit to the data.
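The last of these approaches, the linear within-groups model on a log-transformed count, is simple enough to sketch directly. The example below only shows the data transformation (not the regression itself); the constant of 1, the firm labels, and the patent counts are hypothetical choices made for this illustration, and the function names are invented here.

```python
import math
from collections import defaultdict


def log_with_constant(counts, constant=1.0):
    """Return log(count + constant); the constant keeps zero counts defined."""
    return [math.log(c + constant) for c in counts]


def within_transform(firm_ids, values):
    """Subtract each firm's own mean from its observations (within-groups demeaning)."""
    sums = defaultdict(float)
    n_obs = defaultdict(int)
    for firm, value in zip(firm_ids, values):
        sums[firm] += value
        n_obs[firm] += 1
    return [value - sums[firm] / n_obs[firm] for firm, value in zip(firm_ids, values)]


# Hypothetical firm-year patent counts: firm "a" patents little, firm "b" a lot.
firms = ["a", "a", "a", "b", "b"]
patents = [0, 1, 3, 10, 12]

demeaned = within_transform(firms, log_with_constant(patents))
print(demeaned)
```

After demeaning, the permanent difference in patenting levels between the two firms is gone and only within-firm variation over time remains, which is what a fixed-effects regression uses. The same demeaning would be applied to each regressor before running least squares.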

Main Data Set
In this section, we briefly present our data (additional details can be found in app. C). Our main data are drawn from the World Patent Statistical Database (PATSTAT) maintained by the EPO. 13 Patent documents are categorized using the International Patent Classification (IPC) and national classification systems. We extract all the patents relating to clean and dirty technologies in the automotive industry. Clean patents are those whose technology class is specifically related to electric, hybrid, and hydrogen vehicles. Our selection of relevant IPC codes for clean technologies relies heavily on previous work by the OECD (see http://www.oecd.org/environment/innovation; Haščič et al. 2009; Vollebergh 2010). Clearly, there is a debate as to how clean both electric cars and hydrogen cars really are (Graff Zivin, Kotchen, and Mansur 2014). This will depend, by and large, on how electricity and hydrogen are being generated. However, we note that in most plausible long-run scenarios, electricity will be generated by renewable sources and hydrogen will be generated using electrolysis. Consequently, electric and hydrogen cars would be clean. Assessing the speed of such a transition for a full optimal environmental policy is beyond the scope of this paper but is an important topic for future research.
The precise description of the IPC codes used to identify relevant clean patents can be found in panel A of table 1. Some typical IPC classification codes included in the clean category are B60L11 (electric propulsion with power supplied within the vehicle) and B60K6 (arrangement or mounting of hybrid propulsion systems comprising electric motors and internal combustion engines). US patent 6456041 is an example of a clean patent from our data set: 14 it describes a power supply system for an electric vehicle. It was first filed by Yamaha Motor in Japan in 1998 and was then filed at the EPO and at the USPTO in 1999. The front page and technical diagram of the patent are shown in appendix figure A1.
Dirty includes patents with an IPC code that is related to the internal combustion engine. These can be found in various subcategories of the F02 group, for example, F02B (combustion engines in general), F02F (cylinders, pistons, or casings for combustion engines), or F02N (starting of combustion engines). The full list of IPC codes used to identify dirty patents is in panel B of table 1. Each of these groups includes several dozen subclasses, and an example of the full list of subclasses for the F02F group is shown in appendix figure A2. The dirty category typically includes patents covering the various parts that make up an internal combustion engine. For example, EPO patent 0967381 protects a cylinder head of an internal combustion engine, and USPTO patent 5844336 protects a starter for an internal combustion engine.
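The classification step can be sketched as a lookup on IPC codes. The code lists below contain only the example codes named in the text, not the full lists in table 1, and the function names, the `"B60L 11/18"` input format, and the choice to return both labels when a patent carries codes of both kinds are assumptions made for this illustration.

```python
# Example codes taken from the text; the paper's full lists are in its table 1.
CLEAN_MAIN_GROUPS = {"B60L11", "B60K6"}
DIRTY_SUBCLASSES = {"F02B", "F02F", "F02N"}


def main_group(ipc_code):
    """Reduce a code such as 'B60L 11/18' to subclass plus main group ('B60L11')."""
    return ipc_code.replace(" ", "").upper().split("/")[0]


def classify_patent(ipc_codes):
    """Return the set of labels ('clean', 'dirty') implied by a patent's IPC codes."""
    labels = set()
    for code in ipc_codes:
        group = main_group(code)
        if group in CLEAN_MAIN_GROUPS:
            labels.add("clean")
        if group[:4] in DIRTY_SUBCLASSES:
            labels.add("dirty")
    return labels


# Hypothetical patents, each with a list of IPC codes.
print(classify_patent(["B60L 11/18"]))               # electric propulsion
print(classify_patent(["F02F 1/24"]))                # cylinder parts
print(classify_patent(["B60K 6/20", "F02B 3/06"]))   # codes of both kinds
print(classify_patent(["A01B 1/00"]))                # neither
```

Matching clean codes on the exact main group rather than on a string prefix avoids accidentally capturing a different main group whose number merely starts with the same digit.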
An important feature of the dirty category is that some patents included in this group aim at improving the fuel efficiency of internal combustion engines, making the dirty technology less dirty. We refer to these fuel efficiency patents as "grey" patents. In our baseline results, grey patents are included in the dirty category, but we also disaggregate the category to estimate models separately for grey and "pure dirty" innovations (as well as splitting up the knowledge stocks along these lines on the right-hand side of the regressions). To select IPC codes for grey technologies, we use recent work at the EPO related to the new climate change mitigation patent classification (see Veefkind et al. 2012). We complement this with information from interviews with engineers working in the automobile industry. 15 The list of these IPC codes is shown in panel C of table 1. An example of a grey patent is EPO patent 0979940, which protects a method and device for controlling fuel injection into an internal combustion engine. Electronic fuel injection technologies constantly monitor and control the amount of fuel burnt in the engine, with a view to reducing the amount of fuel unnecessarily burnt, thus optimizing fuel consumption. Appendix figure A3 has the front page and technical diagram of this patent.
Alongside the grey fuel efficiency innovations, there are many purely dirty patents, such as EPO patent 0402091, which covers a four-cycle 12-cylinder engine (see app. fig. A4). Fuel consumption is proportional to the number and the volume of cylinders: the average car sold in Europe has four cylinders, whereas it has six in the United States. A 12-cylinder engine is much more powerful than a four- or six-cylinder engine, but this comes at the cost of increased fuel consumption. Twelve-cylinder engines are used by many carmakers for their top-range models, including Aston Martin, Audi, BMW, and Rolls Royce. These cars typically run about 15 miles per gallon, while the average new car sold in the United States in 2011 obtained 33.8 miles per gallon. 16
To measure innovation, we use a count of patents by application/filing date. The advantages and limitations of patenting as a measure of innovation have been extensively discussed. 17 For our purposes, there are three advantages of using patents. First, they are available at a highly technologically disaggregated level. We can distinguish innovations in the auto industry according to specific technologies, whereas R&D investment cannot be easily disaggregated. Second, R&D is not reported for small and medium-sized firms in Europe nor for privately listed firms in the United States (they are exempt from the accounting requirement to report R&D). Third, the auto sector is an innovation-intensive sector, where patents are perceived as an effective means of protection against imitation, something that is not true in all sectors (Cohen, Nelson, and Walsh 2000). 18 In our view, these considerations make patents a reasonably good indicator of innovative activity in the auto sector.
Patents do suffer from a number of limitations. They are not the only way to protect innovations, although a large fraction of the most economically significant innovations appear to have been patented (Dernis, Guellec, and van Pottelsberghe 2001). Another problem is that patent values are highly heterogeneous, with most patents having a very low valuation. Finally, the number of patents that are granted for a given innovation varies significantly across patent offices, with concerns over increasing laxity in recent years, particularly at the USPTO (e.g., Jaffe and Lerner 2004). To mitigate these problems, we focus on "triadic" patents as our main outcome measure, 19 which are those patents that have been taken out in all three of the world's major patent offices in the United States, Europe, and Japan (USPTO, EPO, and JPO). 20
Focusing on triadic patents has a number of advantages. First, triadic patents provide us with a common measure of innovation worldwide, which is robust to administrative idiosyncrasies of the various patent offices. For example, if the same invention is covered by one patent in the United States and by two patents in Japan, all of which are part of the same triadic patent family, we will count it as one single invention. Second, triadic patents cover only the most valuable inventions, which explains why they have been used so extensively to capture high-quality patents. 21 Third, triadic patents typically protect inventions that have a potential worldwide application; thus these patents are relatively independent of the countries in which they are filed.
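The counting rule described above, one invention per patent family that is present at all three offices, can be sketched in a few lines. The record layout (a family identifier paired with an office code), the office labels "EP", "JP", and "US", and the sample data are hypothetical; the sketch also ignores the distinction between applications and grants at the USPTO.

```python
from collections import defaultdict

# Hypothetical office labels standing in for the EPO, JPO, and USPTO.
TRIADIC_OFFICES = frozenset({"EP", "JP", "US"})


def triadic_families(applications):
    """Return the ids of families with at least one filing at each of the three offices.

    `applications` is an iterable of (family_id, office) pairs; several filings
    of one family at the same office collapse into a single office entry.
    """
    offices_by_family = defaultdict(set)
    for family_id, office in applications:
        offices_by_family[family_id].add(office)
    return {
        family_id
        for family_id, offices in offices_by_family.items()
        if TRIADIC_OFFICES <= offices
    }


# Hypothetical filings: fam1 has one US, two JP, and one EP filing.
applications = [
    ("fam1", "US"), ("fam1", "JP"), ("fam1", "JP"), ("fam1", "EP"),
    ("fam2", "US"), ("fam2", "EP"),
    ("fam3", "JP"), ("fam3", "US"), ("fam3", "EP"), ("fam3", "KR"),
]

triadic = triadic_families(applications)
print(sorted(triadic), len(triadic))
```

Grouping by family before counting is what makes the measure robust to office-specific practice: the two Japanese filings in the first family still contribute a single invention.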
Our data set includes 6,419 clean and 18,652 dirty triadic patents. 22 Since the EPO was created in 1978, our triadic patent data start only in that year. The last year of fully comprehensive triadic data is 2005, so this is our end year. 23 Our basic data set consists of all those applicants (both firms and individuals) that applied for at least one of these clean or dirty auto patents. We identify 3,423 distinct patent holders, which breaks down into 2,427 companies and 996 individuals. For every patent holder we subsequently identify all the patents they filed. We also extract other pieces of information based on this sample, which we use to construct weights for prices and spillovers. For example, we identify all the other patents filed by holders of at least one clean or dirty triadic patent, which represents a total of 1,505,719 patent applications.
19 To identify triadic patents we use the INPADOC data set in PATSTAT. For details on the construction of patent families, see Martinez (2010).
20 Following standard practice we use all patents filed at the EPO, JPO, and USPTO. The USPTO published ungranted patent applications only after 2001 (when it changed policy in line with the other major patent offices). For consistency we thus consider only triadic patents granted by the USPTO both before and after 2001. For the official definition of triadic patents and how triadic patent families are constructed, see Kahn (2004) and Martinez (2010).
21 It has been empirically demonstrated that the number of countries in which a patent is filed is correlated with other indicators of patent value. See, e.g., Grupp (1996, 1998), Lanjouw, Pakes, and Putnam (1998), Dernis et al. (2001), Harhoff, Scherer, and Vopel (2003), Dernis and Khan (2004), and Guellec and van Pottelsberghe (2004).
22 In total, the PATSTAT data set includes 213,668 clean and 762,708 dirty patent applications across all 80 patent offices. Thus by using triadic patents we focus on the high end of the distribution of patent quality.
23 The number of triadic patents in all technologies (i.e., including patents that are neither clean nor dirty) starts falling in 2006 because of time lags between application and grant date at the USPTO.

Tax-Inclusive Fuel Prices
To estimate the impact of a carbon tax on innovation in clean and dirty technologies, we use information on country-level fuel prices ($FP_{ct}$) and fuel taxes. Data on tax-inclusive fuel prices are available from the International Energy Agency (IEA) for 25 major countries from 1978 onward. 24 We construct a time-varying country-level fuel price defined as the average of diesel and gasoline prices. 25 The average fuel price across countries for our regression sample period 1986-2005 is shown in panel a of figure 1. Although this source of variation will be absorbed by the time dummies in our econometric specifications, it gives a sense of the overall evolution of prices. Fuel prices fell from the mid to late 1980s and then rose, peaking just before
Fuel prices are available only at the country-year level, whereas our dependent variable has firm-level variation that we would like to exploit. A related issue is that the auto market is global, and government policies abroad might be at least as important for a firm's innovation decisions as policies in the country where the company's headquarters are located. We allow fuel prices to have a different effect across firms by noting that some geographical markets matter more than others for reasons that are idiosyncratic to an auto firm. First, auto manufacturers have different styles of vehicles reflecting their heterogeneous capabilities and branding that are differentially popular depending on local tastes (e.g., Berry, Levinsohn, and Pakes 1995; Goldberg 1995; Verboven 1999). Second, there is typically some home bias toward "national champion" auto manufacturers in government policies and national tastes. For example, the 2008 auto bailouts in Detroit were paid for by US taxpayers, whereas the bailout of Peugeot has been shouldered by the French. The upshot of this is that auto firms display heterogeneous current and expected market shares across nations, and their R&D decisions will be more influenced by prices and policies in some countries than in others.
24 The IEA reports some incomplete data for an additional 13 countries. We explore the robustness of our main results to the precise range of countries considered. We find that our results emerge even if we restrict ourselves to only the 10 largest economies.
25 Diesel and gasoline are differentially taxed in many countries, which could provide an interesting additional source of variation. However, this would also require distinguishing innovations between these categories. This is not easily possible as internal combustion engine patent classes do not explicitly separate between diesel and other types of engines. Our interviews with engineers working in the automobile industry revealed that patent class F02B1 (engines characterized by fuel-air mixture compression) corresponds in practice mostly to gasoline engines, while patent class F02B3 (engines characterized by air compression and subsequent fuel addition) mostly corresponds to diesel engines. However, these are only two subclasses out of over 200 used in the paper to classify dirty patents. Consequently, we would not be able to classify the majority of patents into diesel or gasoline engines, in particular because many engine parts, such as pistons and cylinders (see, e.g., F02B55, internal combustion aspects of rotary pistons), are used indifferently in both types of engines.
To operationalize this idea, we construct a fuel price variable for each firm as a weighted average of fuel prices across countries based on a proxy of where the firm expects its future market to be. Our price index for firm i at time t is defined as
$$\ln FP_{it} = \sum_{c} w^{FP}_{ic0} \ln FP_{ct}, \tag{6}$$
where $FP_{ct}$ is the tax-inclusive fuel price in country c discussed above and $w^{FP}_{ic0}$ is a firm-specific weight (this is time invariant and uses information only prior to the regression sample period). The weight is determined by the importance of country c as a market outlet for firm i, so we define $w^{FP}_{ic0}$ as the fraction of firm i's patents taken out in country c. The rationale for doing this is that a firm will seek intellectual property protection in jurisdictions where it believes it will need to sell in the future (even if it licenses the technology, the value of a license will depend on whether it has obtained intellectual property protection in relevant markets). For every patent applied for, we know that the patenting firm has paid the cost of legal protection in a discrete number of countries. For example, a firm may choose to enforce its rights in all EU countries or in only a subset of EU countries, say Germany and the United Kingdom. Similarly, the firm may decide to apply for patent protection in the United States but not in smaller markets. Assuming that the country distribution of a firm's patent portfolio is a good indicator of the firm's expectation of where its markets will be in the future, we can use this distribution to construct a firm-specific fuel price, $FP_{it}$, whose value is computed as the weighted mean of the ln(fuel prices) in the relevant markets, with weights $w^{FP}_{ic0}$ equal to the shares of the corresponding countries in the firm's patent portfolio. For example, if a firm had filed 30 patents, 20 in the United States and 10 in Germany, the price changes in the United States would get a weight of two-thirds and the German price changes a weight of one-third.
In addition, to account for the greater importance of larger countries, we further weight by each country's average GDP.
We calculate the weights using the patent portfolio of each company averaged over the 1965-85 "presample" period, whereas we run regressions over the period 1986-2005. This is to make sure that the weights are weakly exogenous as patent location could be influenced by shocks to innovation. We choose 1985 as the cutoff to ensure that there is enough presample time to construct the weights. We perform robustness tests using different presample periods to check that nothing is driven by the precise year of cutoff (e.g., use 1965-90 as the presample period and estimate the regressions from 1991 onward).
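As a minimal sketch of the firm-specific fuel price, the following computes presample patent-share weights and the weighted mean of log fuel prices for the 30-patent example in the text (20 US filings, 10 German filings). The function names and the price levels are hypothetical, and the additional GDP weighting is omitted because its exact form is not spelled out here.

```python
import math


def patent_share_weights(presample_patents_by_country):
    """Share of a firm's presample patents taken out in each country."""
    total = sum(presample_patents_by_country.values())
    return {c: n / total for c, n in presample_patents_by_country.items()}


def firm_log_fuel_price(weights, fuel_price_by_country):
    """Weighted mean of ln(fuel price) across the firm's markets."""
    return sum(w * math.log(fuel_price_by_country[c]) for c, w in weights.items())


# The example from the text: 30 patents, 20 in the United States, 10 in Germany.
weights = patent_share_weights({"US": 20, "DE": 10})

# Hypothetical tax-inclusive price levels for one year (illustrative only).
log_fp = firm_log_fuel_price(weights, {"US": 0.50, "DE": 1.20})
```

Because the weights are fixed at their presample values, all time variation in the firm-level price comes from country-level price changes rather than from shifts in where the firm patents.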
Why do we not use an alternative weighting scheme that simply reflects where firms currently sell their products (e.g., as in Bloom, Schankerman, and Van Reenen 2013)? First, we believe that the information on where firms choose to take patent protection is a potentially better measure because it reflects their expectations of where their future markets will be. Second, there is a data constraint: although sales distributions by geographic area are available for larger firms, they are not available for smaller firms, and there are many patents from these smaller firms. We show our weights compared to sales weights for some of the largest car firms in appendix table A1: Toyota, Volkswagen, Ford, Honda, and Peugeot. The correlation is generally high, suggesting that the weights we choose do a reasonable job at reflecting market shares. 26

The Firm's Own Lagged Patent Stocks and Spillovers
Firm patent stocks are calculated in a straightforward manner using the patent flows ($PAT_{z,it}$) described above. Following Cockburn and Griliches (1988) and Peri (2005), the patent stock is calculated using the perpetual inventory method:
$$K_{z,it} = PAT_{z,it} + (1 - \delta) K_{z,it-1},$$
where $z \in \{C, D\}$. We take $\delta$, the depreciation of R&D capital, to be 20 percent, as is often assumed in the literature, but we check the robustness of our results to other plausible values.
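The perpetual inventory method adds each year's patent flow to the depreciated stock carried over from the previous year. A minimal sketch follows, assuming a zero initial stock (an assumption of this illustration, not something stated in the text) and the 20 percent depreciation rate mentioned above; the function name and the flows are hypothetical.

```python
def patent_stock(flows, delta=0.20, initial_stock=0.0):
    """Perpetual inventory stock: K_t = PAT_t + (1 - delta) * K_{t-1}.

    `flows` is a sequence of annual patent counts. The zero initial stock
    is an assumption made for this illustration.
    """
    stocks = []
    k = initial_stock
    for pat in flows:
        k = pat + (1.0 - delta) * k
        stocks.append(k)
    return stocks


# Illustrative flows: one patent, then none, then two.
stocks = patent_stock([1, 0, 2])
# Year 1: 1.0; year 2: 0.8 * 1.0 = 0.8; year 3: 2 + 0.8 * 0.8 = 2.64
```

A higher depreciation rate makes the stock track recent patenting more closely, which is why robustness to other plausible values of the rate is worth checking.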
To construct aggregate spillovers for a firm, we use information on the geographical location of the various inventors in that firm. Patent statistics allow us to locate an inventor geographically regardless of nationality of the firm's headquarters or the location of the office where the patent was filed (e.g., the patents of Toyota's scientists working in US research labs are part of this US spillover pool). Implicit in our approach is the view that the geographical location of an inventor is likely to be a key determinant of knowledge spillovers rather than the jurisdiction over which the patent is taken out (which matters more as a signal of where the market for sales is likely to be). Many papers have documented the importance of the geographical component of knowledge spillovers in patents and other indicators (e.g., Trajtenberg 1993, 2005; Griffith, Lee, and Van Reenen 2011). To construct a firm-specific spillover pool, we use an empirical strategy analogous to that for the fuel price. The spillover weight $w^{S}_{ic0}$ is the share of all firm i's inventors (i.e., where the inventor worked) in country c between 1965 and 1985. This weight is distinct from $w^{FP}_{ic0}$ in equation (6) as it is based on the location of inventors who are more likely to benefit from research conducted locally. Importantly, the distribution of the patent portfolio across countries and the distribution of inventors vary considerably across firms. This is illustrated for the United States in appendix figure A5.
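To make the inventor-location weighting concrete, the sketch below builds a country-level pool as the sum of other firms' patent stocks, each weighted by that firm's inventor share in the country, and then averages the country pools using the focal firm's own inventor shares. The firms, shares, stocks, and function names are hypothetical illustrations.

```python
def country_spillover_pool(firm, country, inventor_shares, patent_stocks):
    """Sum of other firms' stocks, weighted by their inventor share in `country`."""
    return sum(
        inventor_shares[j].get(country, 0.0) * patent_stocks[j]
        for j in patent_stocks
        if j != firm  # the firm's own stock is excluded from its spillover pool
    )


def firm_spillover(firm, inventor_shares, patent_stocks):
    """Country pools averaged with the firm's own inventor-location weights."""
    return sum(
        w * country_spillover_pool(firm, c, inventor_shares, patent_stocks)
        for c, w in inventor_shares[firm].items()
    )


# Hypothetical presample inventor shares by country and current patent stocks.
inventor_shares = {
    "A": {"US": 0.5, "JP": 0.5},
    "B": {"US": 1.0},
    "C": {"JP": 0.25, "US": 0.75},
}
patent_stocks = {"A": 10.0, "B": 4.0, "C": 8.0}

spill_A = firm_spillover("A", inventor_shares, patent_stocks)
# US pool for A: 1.0 * 4 + 0.75 * 8 = 10; JP pool for A: 0.25 * 8 = 2
# spill_A = 0.5 * 10 + 0.5 * 2 = 6
```

Two firms with identical patent portfolios can therefore face different spillover pools if their inventors are located in different countries.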
The spillover for firm i is
$$SPILL_{z,it} = \sum_{c} w^{S}_{ic0}\, SPILL_{z,ct},$$
where $SPILL_{z,ct}$ is the spillover pool in country c at time t. This is defined as
$$SPILL_{z,ct} = \sum_{j \neq i} w^{S}_{jc0}\, K_{z,jt}.$$
The spillover pool of a country is the sum of all other firms' patent stocks with a weight that depends on how many inventors the other firm has in that country. 27
As noted above, a common problem with patent data is that the value of patents is highly heterogeneous. We mitigate this problem by conditioning on triadic patents, which screen out the very low-value patents. But we also perform two other checks. First, we weight patents by the number of future citations. Second, we use "biadic" patents filed at the EPO and at the USPTO, following Cockburn and Henderson (1994), who argued that patents were important if they had been applied in at least two of the three major economic regions. Our results are robust to these two variants.
27 An alternative approach would be to define the country-level spillover as $SPILL_{z,ct} = \sum_{j} K_{z,jct}$, where $K_{z,jct} = PAT_{z,jct} + (1 - \delta) K_{z,jct-1}$ and $PAT_{z,jct}$ is the number of patents filed by inventors of company j located in country c at year t. Empirically, these two methods give very similar results.

Descriptive Statistics
Figure 4 shows that aggregate triadic clean and dirty patents have been rising over time. Dirty patents increased steadily between 1978 and 1988, fell temporarily, and then rose again between 1992 and 2000, but they have been decreasing during the last 5 years of our data set. The number of clean patents was low for a decade until 1992, then began rising particularly after 1995 (at an average annual growth rate of 23 percent), peaking at 724 in 2002 alone, before falling back slightly. Consequently, while the number of clean patents represented only 10 percent of the number of dirty patents filed annually during the 1980s, they represented 60 percent by 2005. Descriptive statistics for our data set used in the regressions are shown in table 2. In any given year, the average number of dirty patents per firm is 0.23 and the average number of clean patents is 0.08. Appendix C discusses more descriptive statistics showing more of the cross-country distribution of patent filing and citation patterns that are consistent with spillovers being much stronger within the two categories (clean or dirty) than between them.
We look at the top 10 patentors in clean technologies (table A4) and dirty technologies (table A5) between 1978 and 2005. Japanese and German companies predominate, although most top companies' portfolios include both clean and dirty (the only exception is Samsung SDI, a battery specialist). Recall that this is based on triadic patents, and US companies tend to file disproportionately more patents in the United States than in Europe and Japan. Tables A6-A9 report top clean and dirty patentors at the EPO and at the USPTO separately. General Motors is the third-largest patentor of clean technologies at the USPTO, whereas it is not even in the top 10 at the EPO. 28

Main Results
Our main results are shown in table 3. Columns 1-3 use the number of clean patents (a flow) in a firm as the dependent variable and columns 4-6 use the flow of dirty patents. All estimates include firm fixed effects using the CFX approach (described in Sec. III and in more detail in app. B), year dummies, and GDP per capita. Column 1 shows that the coefficient on the (tax-inclusive) fuel price is positive and significant. The elasticity of 0.97 implies that a 10 percent higher fuel price is associated with about 10 percent more clean patents. The coefficients on spillovers and lagged patent stocks take signs consistent with the path dependency hypothesis.
Note.-Standard errors are clustered at the firm level. Estimation is by the CFX method. All regressions include controls for GDP per capita, year dummies, fixed effects, and three dummies for no clean knowledge, no dirty knowledge, and no dirty or clean knowledge (in the previous year). Fuel price is the tax-inclusive fuel price faced. R&D subsidies are public R&D expenditures in energy-efficient transportation. Emissions regulations are maximum levels of tailpipe emissions for pollutants from new automobiles. * Significant at 10 percent. ** Significant at 5 percent. *** Significant at 1 percent.
28 While it is clear that there are a number of big companies active in both clean and dirty automotive patenting, the Herfindahl index for patenting over 1978-2005 for clean innovations is 0.023, and it is 0.038 for dirty innovations, implying low concentration. The top 10 patent holders in clean account for 35.6 percent of patents over 1978-2005, whereas the corresponding figure is 46.6 percent for dirty.
journal of political economy
Firms that are more exposed to larger stocks of clean innovation by other firms (clean spillovers, $SPILL_{C,it-1}$) are significantly more likely to produce clean patents, whereas those benefiting more from dirty spillovers ($SPILL_{D,it-1}$) are significantly less likely to innovate in clean technologies. An increase in the lagged clean spillover stock by 10 percent is associated with an increase in a firm's clean innovation by 2.7 percent. By contrast, an increase in the exposure to dirty spillovers by 10 percent reduces clean innovation by 1.7 percent.
In addition to path dependency at the economy level through spillovers, there is also path dependency at the firm level. Column 1 of table 3 suggests that firms that have innovated in clean technologies in the past ($K_{C,it-1}$) are much more likely to continue to innovate in clean technologies in the future, with a significant elasticity of 0.306. Interestingly, a firm's own history of dirty innovation ($K_{D,it-1}$) is also associated with more clean innovation with an elasticity of 0.139. This coefficient is, however, much smaller than the corresponding coefficient on past dirty innovation stocks in the dirty innovation equation (col. 4), which is four times as large (0.557). In other words, firms with a history of dirty innovation are more likely to innovate in the future in either clean or dirty (compared to those with little innovation), but this effect is much stronger for dirty innovations than for clean innovations, leading to path dependence. Moreover, note that in column 1 the coefficient on a firm's past dirty innovation stock on future clean innovation (0.139) is much smaller than the effect of past clean innovations on future clean innovation (0.306). 29 Columns 2 and 3 of table 3 include a measure of R&D subsidies for clean technologies and a control for emission regulations. R&D subsidies are from the IEA's Energy Technology Research Database, and the emissions regulations index is from Dechezleprêtre, Perkins, and Neumayer (2012) with details in appendix C. In contrast to the proxy for carbon taxes (fuel prices), neither of these additional policy variables is statistically significant, and the coefficients on the other variables do not change much. The absence of an R&D subsidy effect is surprising, and we explain why below when discussing table 4.
Columns 4-6 of table 3 repeat the specification in the first three columns but use dirty patents as the dependent variable instead of clean patents. The coefficient on fuel prices is negative and significant in all columns. In column 4 a 10 percent increase in fuel prices is associated with about a 6 percent decrease in dirty innovation. The estimates on spillovers and knowledge stocks are symmetric to those in the clean equation. Exposure to dirty spillovers fosters future dirty innovation, whereas clean spillovers reduce dirty patenting. The coefficients suggest that a firm's own history of dirty patenting has a positive association with future dirty patenting but that there is no association between the firm's past clean patenting and its future dirty patents.
In summary, table 3 offers considerable support for our model. First, higher fuel prices significantly encourage clean innovation and significantly discourage dirty innovation. Second, there is path dependency in the direction of technical change: countries and firms that have a history of relatively more clean (respectively, dirty) innovation are more likely to innovate in clean (respectively, dirty) technologies in the future.
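To see how the reported elasticities translate into percentage responses, the sketch below treats each coefficient as a constant elasticity in a log-log relation, which is an assumption of this illustration, and computes the implied change in patenting for a 10 percent change in the regressor. The function name is hypothetical; the elasticities 0.97 and 0.306 are those quoted in the text for table 3, column 1.

```python
def pct_change_in_patents(elasticity, pct_change_in_x):
    """Percentage response implied by a constant-elasticity relation.

    If ln(y) = elasticity * ln(x) + const, then scaling x by (1 + g)
    scales y by (1 + g) ** elasticity.
    """
    growth = 1.0 + pct_change_in_x / 100.0
    return (growth ** elasticity - 1.0) * 100.0


# Fuel price elasticity of clean patenting quoted in the text.
clean_response = pct_change_in_patents(0.97, 10.0)

# Elasticity of clean patenting to the firm's own lagged clean stock.
own_clean_stock_response = pct_change_in_patents(0.306, 10.0)
```

For small changes the response is close to the elasticity times the percentage change, which is why an elasticity of 0.97 is read as "about 10 percent more clean patents" for a 10 percent higher fuel price.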

Grey Innovations
As noted above, our dirty category includes innovations relating to improvements in the energy efficiency of internal combustion engines. We labeled these "grey" innovations and consider disaggregating the dirty category into these grey and purely dirty innovations. As noted in Section II, the effect of fuel prices is more ambiguous in this middle grey category. On the one hand, there are incentives to substitute research away from purely dirty into grey innovation when the fuel price rises. On the other hand, there is also an incentive to switch away from the internal combustion engines completely (including grey) toward alternative clean vehicles. Table 4 presents the results and shows that, as expected, the coefficient on the fuel price for grey innovation in column 2 (0.282) lies between the coefficients on clean (positive at 0.848 in col. 1) and purely dirty (very negative at -0.832 in col. 3). This is consistent with fuel prices having a positive effect on energy-efficient innovation, although smaller and insignificant when compared to the effect of fuel prices on purely clean innovations. Another interesting feature of the results is that the coefficient on R&D subsidies is positive and significant in the grey innovation equation whereas it continues to be insignificant in the clean and purely dirty equations. This is consistent with the fact that the majority of these government subsidies are for energy efficiency (see app. C) rather than for more radical clean technologies.
Since we have also disaggregated the spillover stocks and the firm's own past innovation stocks into the three categories, now we have six variables reflecting path dependency on the right-hand side of the regression. The coefficients on these variables take a broadly sensible pattern, but precision has fallen as there are likely to be some collinearity issues with a large number of highly correlated variables.
Given how demanding this specification is, we find the overall results from table 4 encouraging and consistent with the theory.

Magnitude of the Fuel Price Effect on Innovation
In quantitative terms, how do our estimates compare to others in the literature? Popp (2002) reports short-run energy price elasticities for the impact of prices on the aggregate number of clean patents as a share of all patents (we look at long-run price effects in Sec. VI below). We can compute this elasticity ($E_{C,P}$) from our regression model as 30
$$E_{C,P} = \beta_{C,P}(1 - S_C) - \beta_{D,P} S_D,$$
where $S_C$ and $S_D$ are the share of clean and dirty patenting in economywide patents (i.e., clean, dirty, and all others) and $\beta_{C,P}$ and $\beta_{D,P}$ are coefficients on ln(price) from our clean and dirty innovation equations, respectively. Compared to all patents in the economy, innovation in the car industry is rather small. In our sample period, only 0.9 percent of all patents are clean auto patents and 2.5 percent are dirty auto patents. Hence, since $S_C \approx 0$ and $S_D \approx 0$, $\beta_{C,P}$ provides a good approximation of the elasticity.
Note.-Standard errors are clustered at the firm level. Estimation is by the CFX method. This table disaggregates the dirty patents into those that are "grey" (related to fuel efficiency) and those that are not ("purely dirty"). We construct all spillovers and own past stocks on the basis of this disaggregation and include on the right-hand side (hence two extra terms compared to table 3). We estimate two dirty equations: one in which grey innovations are the dependent variable (in col. 2) and one for the purely dirty (in col. 3). All regressions include controls for GDP per capita, year dummies, fixed effects, and four dummies for no own innovations in (i) clean, (ii) grey, (iii) dirty, and (iv) no clean, grey, or purely dirty in the previous year. Fuel price is the tax-inclusive fuel price faced. R&D subsidies are public R&D expenditures in energy-efficient transportation. * Significant at 10 percent. ** Significant at 5 percent. *** Significant at 1 percent.
For example, using the estimates in table 3, column 1, the elasticity would be 0.970 under our approximation and 0.981 using the exact formula above.
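The comparison between the approximation and the exact share elasticity can be sketched as follows. The sketch assumes the share elasticity takes the form $\beta_{C,P}(1 - S_C) - \beta_{D,P} S_D$, which follows from differentiating the log of the clean patent share when patents outside the auto categories do not respond to fuel prices. The function name is hypothetical. The dirty coefficient of -0.6 is a rounded value taken from the statement that a 10 percent price increase lowers dirty innovation by about 6 percent, so the result is illustrative and is not expected to reproduce the reported exact figure.

```python
def share_elasticity(beta_clean, beta_dirty, share_clean, share_dirty):
    """Elasticity of the clean share of all patents with respect to fuel price.

    Assumes patents outside the clean and dirty auto categories do not
    respond to fuel prices (an assumption of this sketch).
    """
    return beta_clean * (1.0 - share_clean) - beta_dirty * share_dirty


beta_clean = 0.970   # table 3, column 1, as quoted in the text
beta_dirty = -0.6    # rounded illustrative value, not the table entry
share_clean = 0.009  # clean auto patents as a share of all patents
share_dirty = 0.025  # dirty auto patents as a share of all patents

approximation = beta_clean
exact = share_elasticity(beta_clean, beta_dirty, share_clean, share_dirty)
gap = abs(exact - approximation)
```

Because both shares are close to zero, the two correction terms are small and the clean price coefficient alone is a close stand-in for the share elasticity.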
Popp (2002) looks at clean innovation in power generation technologies, whereas we are focused on innovation in the auto sector. Crabb and Johnson (2010) implement the same specification as Popp, but on the US auto sector, finding an elasticity of around 0.4 (compared to Popp's 0.06 for clean innovations across all power generation technologies). Both Popp and Crabb and Johnson include what we have dubbed grey innovation in their definition of clean. Thus to derive a comparable elasticity, we report a weighted average of the price coefficient for clean ($\beta_{C,P}$) and the price coefficient for grey ($\beta_{G,P}$) derived from our estimates reported in table 4, where we split the dirty category into "grey" and "purely nongrey dirty." The elasticity becomes (again abstracting away from the small effect on aggregate innovation)
$$\frac{PAT_C\, \beta_{C,P} + PAT_G\, \beta_{G,P}}{PAT_C + PAT_G},$$
where $PAT_C$ and $PAT_G$ are the aggregate number of clean (our definition) and grey innovations at a particular point in time. As can be seen from figure 5, this elasticity ranges from 0.4 to 0.6 and so is similar in magnitude to Crabb and Johnson's estimates. The increase over time occurs because the share of purely clean innovation relative to grey innovations has been increasing over time.
30 The elasticity $E_{C,P} = \partial \ln S_C / \partial \ln FP$, where $S_C = PAT_C / (PAT_C + PAT_D + PAT_O)$ and total patents $PAT_Z = \sum_i \exp(x_{it} \beta_Z)\, \eta_{Zi}$ for $Z \in \{C, D, O\}$ and where O represents "other," i.e., nonclean or dirty patents. Consequently, if other patents do not respond to the fuel price, differentiation gives $E_{C,P} = \beta_{C,P}(1 - S_C) - \beta_{D,P} S_D$.
Table 5 considers the alternative econometric approaches for dynamic count data models with firm fixed effects discussed in Section III. First, we follow Hausman et al. (1984) in column 1 for clean patents and column 3 for dirty patents. The signs of coefficients are generally the same as in our baseline model of table 3, but the marginal effect of fuel price is much greater in absolute magnitude for dirty innovation and smaller (and insignificant) for clean.
Indeed, the magnitude of the estimated elasticity for dirty patents seems unreasonably large (-2.496). We suspect that the assumption of strict exogeneity underlying the Hausman et al. (HHG) method is problematic in our context, as we have a highly dynamic specification. Columns 2 and 4 implement the Blundell et al. (1995, 1999) estimator. The pattern of the spillover effects and dynamics remains similar to those of the baseline regression, and we still obtain a positive and significant effect of fuel prices on clean innovation and a negative and significant effect on dirty innovation. The fuel price coefficients are comparable to those in the baseline case. 31 The final two columns of table 5 use relative patenting $\ln(1 + PAT_{C,it}) - \ln(1 + PAT_{D,it})$ as the dependent variable in an ordinary least squares regression with firm dummies (i.e., the linear within-groups estimator). Column 5 shows that there is a significant and positive effect of fuel prices on relative innovation. Column 6 shows that this result is robust to including a full set of country by year fixed effects to absorb any potential country-specific time-varying policy variables. 32
Could the results somehow be driven by firms that were not patenting prior to 1986? Table 6 repeats the baseline regressions for our three count data models (BGVR, HHG, and CFX) restricting the sample to firms with at least one patent before 1986. This leads to only small changes in the coefficients and no change in the overall qualitative patterns.
FIG. 5.-Aggregate price elasticities (clean plus grey share) with respect to ln(fuel price) over time implied by our firm-level estimates. The detailed methodology is explained in the text.
Note.-Standard errors are clustered at the firm level. Regressions are the same specifications as in table 3, i.e., col. 3 for clean and col. 6 for dirty. Fuel price is the tax-inclusive fuel price faced by the firm. The dependent variable is the flow of clean patents in cols. 1 and 2, the flow of dirty patents in cols. 3 and 4, and the log-ratio of clean to dirty patents in cols. 5 and 6. Different columns control for fixed effects in different ways: HHG is the Hausman et al. (1984) method, BGVR is the Blundell et al. (1999) method, and the last two columns are simply within groups (i.e., adding a dummy variable for every firm). * Significant at 10 percent. ** Significant at 5 percent. *** Significant at 1 percent.
31 However, notice that we find larger values for the effects of clean knowledge stocks on clean patenting and dirty knowledge stocks on dirty patenting than in both the baseline CFX and the HHG specifications. This could mean that the BGVR approach is not fully controlling for all the fixed effects by relying on presample patenting only.

Electricity Prices
Most clean car technologies depend on electricity. 33 We can therefore hypothesize that electricity prices have the opposite effect from fossil fuel prices on the direction of technical change. In table 7 we find that, as expected, electricity prices have a negative effect on clean innovation (col. 1) and a positive effect on dirty innovation (col. 4), although the coefficients are less precisely determined than those on the fuel price. Looking simultaneously at fuel and electricity prices can also be seen as a further robustness check for our main results.
One concern might be that our results on fossil fuels are driven by unobserved factors such as a general concern for climate change or other climate-related regulation that we do not control for. However, for most such unobserved factors we would expect that they have a similar effect on both fossil fuel and electricity prices, whereas the coefficients take opposite signs in the regressions. Columns 2 and 4 use the relative fuel to electricity price as the coefficients in columns 1 and 3 are opposite and have a similar magnitude. The coefficients on the relative price look very similar to our baseline estimates.
Note.-Standard errors are clustered at the firm level. This is a subsample of the data in table 3 in which we condition on firms having at least one patent in the presample period. All regressions include controls for GDP per capita, fixed effects, year dummies, and three dummies for no clean knowledge, no dirty knowledge, and no dirty or clean knowledge (in the previous year). Fuel price is the tax-inclusive fuel price faced by the firm. HHG is the Hausman et al. (1984) method, BGVR is the Blundell et al. (1999) method, and CFX is the control function fixed-effect method. * Significant at 10 percent. ** Significant at 5 percent. *** Significant at 1 percent.
32 The country here is based on the headquarters whereas the previous country variables such as fuel price were based on weighted averages using patent weights. It was computationally infeasible to include the full set of country by time dummies in the nonlinear count data models.
33 Hydrogen for hydrogen cars can be produced via electrolysis of water. It can also be derived from natural gas in a process called steam reforming. However, steam reforming still leads to CO2 emissions. Consequently, many experts suggest that in the long run, most hydrogen would be derived from electrolysis using electricity from renewable sources.

Other Extensions and Robustness Tests
Oil prices are broadly global, so most of the country-specific variation over time in fuel prices comes from differential taxation. Consequently, table 8 substitutes fuel taxes for fuel prices, showing again a similar pattern of results. One difference is that the point estimates of the fuel tax response are smaller in absolute terms for both types of innovation. This is to be expected as demand is driven by the final price the consumer pays rather than the fuel tax itself.
Note.-Standard errors are clustered at the firm level. Estimation is by the CFX (control function fixed-effect) method described in Sec. III. All regressions include controls for GDP per capita, year dummies, and three dummies for no clean knowledge, no dirty knowledge, and no dirty or clean knowledge in the previous year. * Significant at 10 percent. ** Significant at 5 percent. *** Significant at 1 percent.
Choosing 1986 as the first year for the regression sample is somewhat arbitrary, so we experimented with changing the cutoff year to check robustness. For example, we used 1990 instead and ran the regressions for 1991-2005 using data from 1965-90 to construct the weights. The results in table 9 are quite comparable to our baseline, although standard errors are a little larger as we would expect from using a smaller sample for the regressions. Table 10 reports alternative dynamic specifications for fuel prices. Columns 1-5 are for clean innovation and use fuel prices dated in the current year in column 1, lagged 1 year in our baseline of column 2, lagged 2 years in column 3, and lagged 3 years in column 4. In column 5 we construct a geometrically weighted average of past fuel price levels as proposed by Popp (2002). 34 We repeat these specifications in columns 6-10 but use dirty patents instead. With all these approaches we find price coefficients that are very similar to our earlier estimates, with a positive elasticity of clean patents with respect to fuel price of around unity and a negative elasticity of dirty patents of around -0.6. 35
We conducted many other robustness tests. First, our outcome variable is triadic patents, those filed at all three main patent offices in the world (USPTO, EPO, and JPO). A concern is that this screens out too many of the lower-value patents. To address this, we ran our regressions using biadic rather than triadic patents; that is, we included all patents in the construction of the innovation and knowledge stock variables that are filed at the EPO and the USPTO but not necessarily the JPO.
Table A10 shows that the results are robust to this experiment. Second, we constructed the patent stock variables, including the spillover variables, using citation-weighted counts from all worldwide patents (table A11). This led to qualitatively similar results; for example, the fuel price response is larger for clean patents than for dirty patents. 36 Third, we experimented with a wide range of other country-specific variables and report that the results are robust to these additional covariates. For example, in table A12 we included total GDP in addition to GDP per capita. The coefficient on GDP is insignificant, and the basic pattern of our results is robust to this extra control. Fourth, we were concerned that the results could be driven by high price volatility in the smaller countries in our data, so we reconstructed the weights for the fuel price on the basis of subsamples of the largest countries in GDP terms. Table A13 shows that the results are robust when just using the larger countries in our sample. Fifth, as discussed in Section IV.B, it may be that it is not correct to classify hybrid cars as clean innovation, so we experimented with dropping them from our definition of clean technologies. The results are robust to this change (table A14). 37 Finally, we wanted to make sure that our results were not driven by firms that rarely patent, so we dropped the least innovative firms, which collectively accounted for only 5 percent of aggregate patents. The results were robust to this test.

Note.-Standard errors are clustered at the firm level. All regressions include controls for GDP per capita, year dummies, and three dummies for no clean knowledge, no dirty knowledge, and no dirty or clean knowledge in the previous year. Fuel price is the tax-inclusive fuel price faced by the firm (using presample patent portfolios as weights). HHG is the Hausman et al. (1984) method, BGVR is the Blundell et al. (1999) method, and CFX is the control function fixed-effect method. * Significant at 10 percent. ** Significant at 5 percent. *** Significant at 1 percent.

captures the speed at which agents adjust their expectations on the basis of the gap between the predicted and the realized values. For comparison purposes we use the same adjustment factor of λ = 0.83 as in Popp's paper.
35 We tried to pin down more precisely the dynamic response structure by including multiple lags of price simultaneously, but autocorrelation in prices made it difficult as all coefficients tended to be zero, as in Popp (2002).
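The geometrically weighted price average used in column 5 of table 10 can be illustrated with a minimal sketch. The recursion below is one common adaptive-expectations form and is an assumption here, not necessarily the exact formula used in Popp (2002) or in this paper; the function name and the initialization at the first observed price are also hypothetical. Only the adjustment factor of 0.83 comes from the text.

```python
def adaptive_expected_price(prices, lam=0.83):
    """Return adaptive-expectations prices for a time-ordered price series.

    Assumed recursion (illustrative, not taken from the paper):
        expected[t] = expected[t-1] + lam * (prices[t] - expected[t-1])
    Unrolled, this is a geometrically weighted average of current and past
    prices, with weight lam * (1 - lam)**k on the price k periods back.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    expected = []
    for price in prices:
        if not expected:
            # Assumption: start the expectation at the first observed price.
            current = float(price)
        else:
            # Close a share lam of the gap between realized and expected price.
            current = expected[-1] + lam * (price - expected[-1])
        expected.append(current)
    return expected
```

With `lam` close to 1 the expected price tracks the current price closely, so under this form a value of 0.83 puts most of the weight on the most recent observations.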

VI. Simulation Results
To obtain a better sense of the aggregate magnitude of the results, we report a number of counterfactual experiments. We explore the implications of our econometric models for the evolution of future clean and dirty knowledge stocks and how this is affected by an increase in the fuel price (generated, e.g., by an international carbon tax). We recursively compute values of expected patenting under different policy scenarios, use those to update the knowledge stock variables (including the spillover variables), and feed these into the next iteration. Hence, we split the right-hand-side variables $x_{it}$ into variables that are functions of the lagged knowledge stock ($k_{it}$) and other variables such as the fuel price ($p_{it}$), so that $x_{it} = [k_{it}, p_{it}]$. Each iteration $T$ periods after year $t$ then produces $\widehat{PAT}_{C,it+T}$ and $\widehat{PAT}_{D,it+T}$, vectors of predicted patent flows for firms in the sample, from the updated knowledge stocks and from $p^{CF}_{it+T}$, potentially counterfactual values of the policy and other control variables. Our empirical results imply that there is path dependence in the type of innovation pursued, through both internal firm-level knowledge stock effects and external countrywide spillovers. In this section we explore how important this path dependence is in quantitative terms by studying the evolution of both clean and dirty knowledge stocks implied by our fitted models into the future. We do this for every firm in the data set and then aggregate across the world economy in each period.

36 If anything, the results are generally stronger with elasticities that are larger in magnitude.
37 We also reran table 4 reclassifying all hybrids as grey innovations. The resulting point estimate on price in the clean equation is somewhat lower (0.565 instead of 0.848) and loses significance. However, the coefficient on price in the grey equation drops even more, so that the gap in elasticities between clean and grey becomes slightly larger. We attribute these changes to the somewhat reduced power of this specification and conclude that hybrid technologies are not the main drivers of the clean advantage in our main specifications.
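The recursive procedure described in this section can be sketched in a few lines. This is a simplified single-firm illustration under stated assumptions, not the paper's implementation: the exponential-mean functional form, the use of log(1 + stock), the depreciation rate `delta`, the omission of spillover stocks and GDP, and all coefficient names are hypothetical choices made for the example.

```python
import math


def predicted_flow(beta, log_price, k_clean, k_dirty):
    """Exponential-mean (Poisson-style) prediction of one patent flow.

    beta is a dict of hypothetical coefficients with keys
    'const', 'price', 'own_clean', 'own_dirty'.
    """
    index = (beta["const"]
             + beta["price"] * log_price
             + beta["own_clean"] * math.log1p(k_clean)
             + beta["own_dirty"] * math.log1p(k_dirty))
    return math.exp(index)


def simulate_stocks(k_clean, k_dirty, log_price, years,
                    beta_clean, beta_dirty, delta=0.2):
    """Project clean and dirty knowledge stocks forward recursively.

    Each period: predict patent flows from the lagged stocks and the
    (possibly counterfactual) log fuel price, then update the stocks
    with an assumed perpetual-inventory rule.
    """
    path = [(k_clean, k_dirty)]
    for _ in range(years):
        pat_clean = predicted_flow(beta_clean, log_price, k_clean, k_dirty)
        pat_dirty = predicted_flow(beta_dirty, log_price, k_clean, k_dirty)
        # Depreciate the old stock and add the newly predicted flow.
        k_clean = (1.0 - delta) * k_clean + pat_clean
        k_dirty = (1.0 - delta) * k_dirty + pat_dirty
        path.append((k_clean, k_dirty))
    return path


def first_crossover(path, start_year):
    """Return the first year in which the clean stock reaches the dirty stock."""
    for offset, (k_clean, k_dirty) in enumerate(path):
        if k_clean >= k_dirty:
            return start_year + offset
    return None
```

Running `simulate_stocks` once with the baseline log price and once with a higher counterfactual log price, and passing each path to `first_crossover`, mimics the comparison of crossover years across policy scenarios. In the paper the same logic is applied to every firm, with spillover stocks rebuilt from the aggregated predictions at each step.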
More specifically, we are looking for conditions under which the clean knowledge stock for the aggregate economy exceeds the dirty knowledge stock. In line with Acemoglu et al. (2012), this would be a requirement for clean technologies to be able to compete with dirty ones, even without policy intervention. Our projections should be considered a rough exploration of the importance of carbon taxes and path dependency rather than precise forecasts of future innovation. 38 We focus on the period up to 2030 with 2020 as a focal point. This is somewhat arbitrary but is in line with scenarios of the IEA, 39 suggesting that globally fossil fuel use must peak by 2020 to avoid highly risky climate change. It is also consistent with the European Commission's 2020 targets. 40
We first check the within-sample performance of the model by implementing simulation runs providing recursively generated knowledge stocks over the regression sample period (1986-2005) in appendix figure A6. 41 Clean and dirty patent stocks are reported on the y-axis. Comparing predicted aggregate patents to the actual values suggests that our preferred CFX model does a reasonably good job at tracking the aggregate changes in clean and dirty patenting (panel a). The alternative BGVR and HHG estimates are not too bad but do much less well in later years (panels b and c).
Figure 6 reports simulations based on the regressions from table 6, columns 1 and 4, for years through to 2030. In panel a we report the baseline case keeping fuel prices (and time dummies) at their 2005 values. 42 The regressions imply a strong enough path dependency for dirty and clean knowledge stocks to remain far apart for a considerable period of time. Clean innovation catches up with dirty innovation only well after 2030. This catch-up occurs because of the delayed reaction to fuel price hikes leading up to 2005 and because of GDP per capita growth, which tends to favor clean innovation relative to dirty innovation.

38 Technically, the tipping point at which the market starts innovating more in clean technologies than in dirty technologies without policy intervention occurs when the clean technology is more productive than the dirty technology. Our knowledge stock variables for clean and dirty innovation are natural proxies for the relative productivity of clean vs. dirty technologies.
39 See http://blogs.ft.com/energy-source/2009/11/10/fossil-fuel-use-must-peak-by-2020-warns-iea/#axzz1tQmZyLoy.
40 See http://ec.europa.eu/news/economy/100303_en.htm.
41 For the simulations we restrict the sample to the firms for which we have presample information. In this way we do not have to make further assumptions as to how changes in the spillover and policy variables would affect firms for which these variables are essentially missing.
To what extent can carbon taxes speed up this convergence process? We examine the effects of a permanent worldwide increase in fuel prices in 2006 (and fixed at this level thereafter) of 10 percent, 20 percent, 30 percent, 40 percent, and 50 percent in panels b-f, respectively. In panel b we see that the gap between clean and dirty becomes smaller with a fuel price increase of 10 percent both because there is more clean innovation and because there is less dirty innovation. However, parity is achieved between clean and dirty only after 2030. It would take an increase of 40 percent in fuel prices in order to achieve parity in 2020 according to our model (panel e). This is a pretty large increase, comparable with the increase that took place in the 1990s in figure 1.
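As a rough back-of-the-envelope check on these magnitudes, the on-impact effect of a price increase on patent flows can be computed from the rounded elasticities reported earlier (around unity for clean patents and around -0.6 for dirty patents). The constant-elasticity form and the function name are assumptions for illustration; the calculation ignores the knowledge stock dynamics that drive the full simulations.

```python
def impact_multiplier(price_increase, elasticity):
    """Proportional change in patent flow under a constant-elasticity response.

    price_increase is a fraction (0.4 means a 40 percent increase).
    Assumes flow is proportional to price**elasticity, which is an
    illustrative simplification rather than the paper's full model.
    """
    return (1.0 + price_increase) ** elasticity


# Rounded elasticities taken from the text; the scenario is the 40 percent increase.
clean_multiplier = impact_multiplier(0.4, 1.0)
dirty_multiplier = impact_multiplier(0.4, -0.6)
relative_shift = clean_multiplier / dirty_multiplier
```

Under these assumptions a 40 percent price increase raises the clean patent flow by 40 percent and lowers the dirty flow by roughly 18 percent on impact, so the ratio of clean to dirty flows rises by a factor of about 1.7 before any path dependence effects accumulate.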
One criticism of the simulation is that we would expect such a large increase in fuel prices to have a negative effect on GDP per capita because of deadweight costs of taxation, adjustment costs, and so on. This, in turn, could slow down the growth of clean innovation (e.g., Gans 2012). To obtain some insight into the magnitude of these effects, figure 7 considers the 40 percent fuel tax hike scenario coupled with a negative effect on GDP per capita growth. Panel a reproduces the baseline case in which there is no effect on GDP (as in fig. 6, panel e). Panel b considers a fall in the growth rate of 0.25 percentage point (e.g., from 1.5 percent to 1.25 percent per year). This postpones the crossover year because income growth has a stronger positive effect on clean innovation than on dirty innovation in our estimates. But the effect is rather small, moving the crossover year by only 2 years, from 2020 to 2022. Larger tax-driven falls in GDP per capita growth postpone things further, but it would take a full 1 percentage point a year fall in the growth rate to postpone the crossover year beyond 2030. We view it as very unlikely that fuel taxes would knock a percentage point off annual growth for 15 years or more, and this also ignores the damaging effects of global warming itself on economic growth over the medium run. We therefore take some comfort from figure 7 that incorporating output effects would not dramatically change the conclusions from figure 6.

Fig. 7.-Simulations over time of the effect of a 40 percent increase in fuel prices allowing for a negative effect of the carbon tax on GDP per capita growth. a, Baseline case, no effect of carbon tax on GDP per capita growth. b, Tax reduces GDP per capita growth by 0.25 percentage points. c, Tax reduces GDP per capita growth by 0.50 percentage points. d, Tax reduces GDP per capita growth by 0.75 percentage points. e, Tax reduces GDP per capita growth by 1.0 percentage points. These graphs show the simulated evolution of the aggregate clean and dirty knowledge stocks between 2005 and 2030 after a fuel price increase of 40 percent using the model in table 6, columns 1 and 4. We consider a negative effect of the carbon tax on per capita GDP growth ranging from zero in the baseline case (panel a replicates panel e of fig. 6) to 1 percentage point (panel e).
In figure 8 we explore the importance of path dependence for the simulations. First, we repeat the baseline specifications allowing for all dynamic adjustments in the cases of no fuel price change (panel a) and of a 40 percent increase (panel b). In panels c and d we repeat this exercise while fixing all innovation stock variables, that is, both spillovers and own knowledge stocks, at their 2005 levels. As a consequence, both clean and dirty innovation, and thus the growth rate of knowledge stocks, fall markedly as firms no longer benefit from standing on the shoulders of either their own or others' past innovation success. Also note that in panel c, where we keep prices fixed, the gap between clean and dirty is now narrower than in the equivalent panel a. Despite this, the 40 percent increase in fuel prices in panel d is much less effective than in panel b, where the dynamic effects from knowledge stocks are switched on. This illustrates that path dependency is a double-edged sword, as pointed out by Acemoglu et al. (2012). In the absence of effective policies, it creates a kind of lock-in for dirty innovation. But if effective policies such as a carbon tax or R&D subsidy are introduced, path dependency can help reinforce the growth of clean innovation as the economy accumulates clean knowledge more rapidly. Hence, if we switch off the two path dependency channels, innovation trends become less responsive to tax policy.

Fig. 8.-In panels a and b, knowledge stocks and spillover stocks are recursively updated using the estimates from table 6, columns 1 and 4. In panels c and d, we switch off the effects of past innovation stocks by the firm itself and of spillovers. In all figures we assume a 1.5 percent growth rate of per capita GDP.

VII. Conclusion
In this paper we have combined several patent data sets to analyze directed technical change in the auto sector, which is a key industry of concern for climate change. We use patenting data from 3,412 firms and individuals between 1965 and 2005 across 80 patent offices. We exploit the fact that tax-inclusive fuel prices (our proxy for a carbon tax) evolve differentially over time across countries in our data set and that firms are differentially exposed to these price changes because of their heterogeneous market positions in different geographic markets. Consistent with our theoretical predictions, we find that clean innovation is stimulated by increases in fuel prices whereas dirty innovation is depressed. Our second key result is that there is strong evidence for "path dependency" in the sense that firms more exposed to clean innovation from other firms are more likely to direct their research energies to clean innovation in the future (a directed knowledge spillover effect). Similarly, firms with a history of dirty innovation are more likely to focus on dirty innovation in the future. The fact that such path dependency holds for clean (as well as dirty) innovation highlights the desirability of acting sooner to shift incentives for climate change innovation. Since the stock of dirty innovation is greater than that of clean, the path dependency effect will tend to lock economies into high carbon emissions, even after the introduction of a mild carbon tax or R&D subsidies for clean technologies. This may make the case for stronger action now, which could be relaxed in the future as the economy's stock of knowledge shifts in a cleaner direction. Increases in carbon prices can bring about a change in direction. For example, our baseline results suggest that a 40 percent increase in fuel prices relative to their 2005 level would allow clean innovation stocks to overtake dirty stocks after 15 years.
Our analysis could be extended in several directions. First, we could analyze output effects beyond the macro adjustments in the simulations of table 6 to examine the firm-level effects. This would, however, require a large extension using data on sales. Second, we could use our framework to simulate other policies, such as country-specific changes in carbon taxes (or R&D subsidies), to see how this would affect the innovation profiles in specific countries rather than just globally. Third, the same basic approach could be taken to look at sectors other than automobiles, such as the energy sector as in Acemoglu et al. (2016). Finally, we could use microdata to estimate the relative efficiency of R&D investments in clean versus dirty innovation and also the elasticity of substitution between the two types of production technologies. As argued in Acemoglu et al. (2012), these parameters play as important a role as the discount rate in characterizing the optimal environmental policy. We acknowledge that a limitation of our analysis is that we assume that non-combustion engine cars are needed for radically reducing carbon emissions in transport. It may be that innovation in grey technologies will be sufficient, although we view this as unlikely. To close the model, one would further need to measure the emissions impact of each type of innovation (clean, grey, or purely dirty) and include a simultaneous analysis of emissions in electricity production. All these and other extensions of our analysis in this paper are left for future research.