
dc.contributor.author: Hopkins, Daniel J.
dc.contributor.author: King, Gary
dc.date.accessioned: 2011-09-06T19:01:32Z
dc.date.issued: 2010
dc.identifier.citation: Hopkins, Daniel J. and Gary King. 2010. A method of automated nonparametric content analysis for social science. American Journal of Political Science 54(1): 229-247. [en_US]
dc.identifier.issn: 0092-5853 [en_US]
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:5125261
dc.description.abstract: The increasing availability of digitized text presents enormous opportunities for social scientists. Yet hand coding many blogs, speeches, government records, newspapers, or other sources of unstructured text is infeasible. Although computer scientists have methods for automated content analysis, most are optimized to classify individual documents, whereas social scientists instead want generalizations about the population of documents, such as the proportion in a given category. Unfortunately, even a method with a high percent of individual documents correctly classified can be hugely biased when estimating category proportions. By directly optimizing for this social science goal, we develop a method that gives approximately unbiased estimates of category proportions even when the optimal classifier performs poorly. We illustrate with diverse data sets, including the daily expressed opinions of thousands of people about the U.S. presidency. We also make available software that implements our methods and large corpora of text for further analysis. [en_US]
dc.description.sponsorship: Government [en_US]
dc.language.iso: en_US [en_US]
dc.publisher: Wiley-Blackwell [en_US]
dc.relation.isversionof: doi:10.1111/j.1540-5907.2009.00428.x [en_US]
dc.relation.hasversion: http://j.mp/1M2zFGN [en_US]
dash.license: LAA
dc.title: A Method of Automated Nonparametric Content Analysis for Social Science [en_US]
dc.type: Journal Article [en_US]
dc.description.version: Version of Record [en_US]
dc.relation.journal: American Journal of Political Science [en_US]
dash.depositing.author: King, Gary
dc.date.available: 2011-09-06T19:01:32Z
dc.data.uri: http://hdl.handle.net/1902.1/12898
dc.identifier.doi: 10.1111/j.1540-5907.2009.00428.x [*]
dash.identifier.orcid: 0000-0002-5327-7631 [*]
dash.contributor.affiliated: King, Gary


