Lexical Chaining and Word-Sense-Disambiguation

Nelken, Rani; Shieber, Stuart M.

dc.contributor.author	Nelken, Rani
dc.contributor.author	Shieber, Stuart M.
dc.date.accessioned	2012-07-16T14:11:16Z
dc.date.issued	2007
dc.identifier.citation	Nelken, Rani and Stuart M. Shieber. Lexical chaining and word-sense-disambiguation. Technical Report TR-06-07, School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, 2007.	en_US
dc.identifier.uri	http://nrs.harvard.edu/urn-3:HUL.InstRepos:9136730
dc.description.abstract	Lexical chaining algorithms attempt to find sequences of words in a document that are closely related semantically. Such chains have been argued to provide a good indication of the topics covered by the document without requiring a deeper analysis of the text, and have been proposed for many NLP tasks. Different underlying lexical semantic relations based on WordNet have been used for this task. Since links in WordNet connect synsets rather than words, open word-sense disambiguation becomes a necessary part of any chaining algorithm, even if the intended application is not disambiguation. Previous chaining algorithms have combined the tasks of disambiguation and chaining by choosing those word senses that maximize chain connectivity, a strategy which yields poor disambiguation accuracy in practice. We present a novel probabilistic algorithm for finding lexical chains. Our algorithm explicitly balances the requirements of maximizing chain connectivity with the choice of probable word senses. The algorithm achieves better disambiguation results than all previous ones, but under its optimal settings shifts this balance totally in favor of probable senses, essentially ignoring the chains. This model points to an inherent conflict between chaining and word-sense disambiguation. By establishing an upper bound on the disambiguation potential of lexical chains, we show that chaining is theoretically highly unlikely to achieve accurate disambiguation. Moreover, by defining a novel intrinsic evaluation criterion for lexical chains, we show that poor disambiguation accuracy also implies poor chain accuracy. Our results have crucial implications for chaining algorithms. At the very least, they show that disentangling disambiguation from chaining significantly improves chaining accuracy. The hardness of all-words disambiguation, however, implies that finding accurate lexical chains is harder than suggested by the literature.	en_US
dc.description.sponsorship	Engineering and Applied Sciences	en_US
dc.language.iso	en_US	en_US
dash.license	LAA
dc.title	Lexical Chaining and Word-Sense-Disambiguation	en_US
dc.type	Research Paper or Report	en_US
dc.description.version	Version of Record	en_US
dash.depositing.author	Shieber, Stuart M.
dc.date.available	2012-07-16T14:11:16Z
dash.identifier.orcid	0000-0002-7733-8195	*
dash.contributor.affiliated	Shieber, Stuart

Files in this item

Name: tr-06-07.pdf
Size: 170.1Kb
Format: PDF

This item appears in the following Collection(s)

FAS Scholarly Articles [18292]
