dc.contributor.author: Nelken, Rani
dc.contributor.author: Shieber, Stuart M.
dc.date.accessioned: 2012-07-16T14:11:16Z
dc.date.issued: 2007
dc.identifier.citation: Nelken, Rani and Stuart M. Shieber. Lexical chaining and word-sense-disambiguation. Technical Report TR-06-07, School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, 2007. [en_US]
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:9136730
dc.description.abstract: Lexical chaining algorithms attempt to find sequences of words in a document that are closely related semantically. Such chains have been argued to provide a good indication of the topics covered by the document without requiring a deeper analysis of the text, and have been proposed for many NLP tasks. Different underlying lexical semantic relations based on WordNet have been used for this task. Since links in WordNet connect synsets rather than words, open word-sense disambiguation becomes a necessary part of any chaining algorithm, even if the intended application is not disambiguation. Previous chaining algorithms have combined the tasks of disambiguation and chaining by choosing those word senses that maximize chain connectivity, a strategy which yields poor disambiguation accuracy in practice. We present a novel probabilistic algorithm for finding lexical chains. Our algorithm explicitly balances the requirement of maximizing chain connectivity against the choice of probable word senses. The algorithm achieves better disambiguation results than all previous ones, but under its optimal settings it shifts this balance entirely in favor of probable senses, essentially ignoring the chains. This model points to an inherent conflict between chaining and word-sense disambiguation. By establishing an upper bound on the disambiguation potential of lexical chains, we show that chaining is theoretically highly unlikely to achieve accurate disambiguation. Moreover, by defining a novel intrinsic evaluation criterion for lexical chains, we show that poor disambiguation accuracy also implies poor chain accuracy. Our results have crucial implications for chaining algorithms. At the very least, they show that disentangling disambiguation from chaining significantly improves chaining accuracy. The hardness of all-words disambiguation, however, implies that finding accurate lexical chains is harder than suggested by the literature. [en_US]
dc.description.sponsorship: Engineering and Applied Sciences [en_US]
dc.language.iso: en_US [en_US]
dash.license: LAA
dc.title: Lexical Chaining and Word-Sense-Disambiguation [en_US]
dc.type: Research Paper or Report [en_US]
dc.description.version: Version of Record [en_US]
dash.depositing.author: Shieber, Stuart M.
dc.date.available: 2012-07-16T14:11:16Z
dash.identifier.orcid: 0000-0002-7733-8195 [*]
dash.contributor.affiliated: Shieber, Stuart