What to Do about Missing Values in Time-Series Cross-Section Data

Honaker, James; King, Gary

dc.contributor.author	Honaker, James
dc.contributor.author	King, Gary
dc.date.accessioned	2010-05-17T20:33:23Z
dc.date.issued	2010
dc.identifier.citation	Honaker, James and Gary King. 2010. What to do about missing values in time-series cross-section data. American Journal of Political Science 54(2): 561-581.	en_US
dc.identifier.issn	0092-5853	en_US
dc.identifier.uri	http://nrs.harvard.edu/urn-3:HUL.InstRepos:4100248
dc.description.abstract	Applications of modern methods for analyzing data with missing values, based primarily on multiple imputation, have in the last half-decade become common in American politics and political behavior. Scholars in this subset of political science have thus increasingly avoided the biases and inefficiencies caused by ad hoc methods like listwise deletion and best guess imputation. However, researchers in much of comparative politics and international relations, and others with similar data, have been unable to do the same because the best available imputation methods work poorly with the time-series cross-section data structures common in these fields. We attempt to rectify this situation with three related developments. First, we build a multiple imputation model that allows smooth time trends, shifts across cross-sectional units, and correlations over time and space, resulting in far more accurate imputations. Second, we enable analysts to incorporate knowledge from area studies experts via priors on individual missing cell values, rather than on difficult-to-interpret model parameters. Third, because these tasks could not be accomplished within existing imputation algorithms, in that they cannot handle as many variables as needed even in the simpler cross-sectional data for which they were designed, we also develop a new algorithm that substantially expands the range of computationally feasible data types and sizes for which multiple imputation can be used. These developments also make it possible to implement the methods introduced here in freely available open source software that is considerably more reliable than existing algorithms.	en_US
dc.description.sponsorship	Government	en_US
dc.language.iso	en_US	en_US
dc.publisher	Wiley-Blackwell	en_US
dc.relation.isversionof	http://dx.doi.org/10.1111/j.1540-5907.2010.00447.x	en_US
dc.relation.hasversion	http://gking.harvard.edu/files/pr.pdf	en_US
dash.license	LAA
dc.title	What to Do about Missing Values in Time-Series Cross-Section Data	en_US
dc.type	Journal Article	en_US
dc.description.version	Version of Record	en_US
dc.relation.journal	American Journal of Political Science	en_US
dash.depositing.author	King, Gary
dc.date.available	2010-05-17T20:33:23Z
dc.data.uri	http://hdl.handle.net/1902.1/14316	en_US
dc.identifier.doi	10.1111/j.1540-5907.2010.00447.x	*
dash.identifier.orcid	0000-0002-5327-7631	*
dash.contributor.affiliated	King, Gary

Files in this item

Name: Honaker_MissingValues.pdf
Size: 924.8Kb
Format: PDF

This item appears in the following Collection(s)

FAS Scholarly Articles [18292]