Publication:
Reference Specification in Multilingual Document Production

Date

2005

Citation

Nickerson, Jill Suzanne. 2005. Reference Specification in Multilingual Document Production. Harvard Computer Science Group Technical Report TR-21-05.

Abstract

To produce documents in multiple languages automatically requires a language-independent representation of the documents' meaning. For a person to build this language-independent representation by communicating in natural language with a computer system, the problem of reference must be addressed. This problem, inherent in natural language, presents itself not only in the specification of the language-independent representation, but also in the generation of documents with the meaning contained in this representation. This thesis presents methods to make both the specification of entities in the user interface and the generation of expressions to refer to these entities in documents more natural and provides empirical evidence demonstrating the efficacy of these methods. More specifically, this thesis describes the development of three types of reference mechanisms: a statistical model that uses domain and lexical knowledge to organize new options in the interface; techniques for controlling coreference specification that take advantage of discourse structure and genre features; and automatically learned models for generating expressions to refer to new and already mentioned entities in a particular domain. The evaluation of these reference mechanisms establishes that specifying new entities using an interface formed by computational linguistic processing reduces the amount of time required to refer to entities in the interface; exploiting discourse structure and genre features is more helpful than traditional knowledge-editing interfaces for referring to entities in the interface that are already contained in the knowledge representation; and using learned linguistic information to generate referring expressions in documents leads to expressions that more closely match the decisions of people.

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
