Publication: Complexity, Parsing, and Factorization of Tree-Local Multi-Component Tree-Adjoining Grammar
Open/View Files
Date
2010
Published Version
Journal Title
Journal ISSN
Volume Title
Publisher
Massachusetts Institute of Technology Press (MIT Press)
The Harvard community has made this article openly available. Please share how this access benefits you.
Citation
Nesson, Rebecca, Stuart M. Shieber, and Giorgio Satta. 2010. Complexity, parsing, and factorization of tree-local multi-component tree-adjoining grammar. Computational Linguistics 36(3): 443-480.
Research Data
Abstract
Tree-Local Multi-Component Tree-Adjoining Grammar (TL-MCTAG) is an appealing formalism for natural language representation because it arguably allows the encapsulation of the appropriate domain of locality within its elementary structures. Its multicomponent structure allows modeling of lexical items that may ultimately have elements far apart in a sentence, such as quantifiers and Wh-words. When used as the base formalism for a synchronous grammar, its flexibility allows it to express both the close relationships and the divergent structure necessary to capture the links between the syntax and semantics of a single language or the syntax of two different languages. Its limited expressivity provides constraints on movement and, we posit, may have generated additional popularity based on a misconception about its parsing complexity. Although TL-MCTAG was shown to be equivalent in expressivity to TAG when it was first introduced (Weir 1988), the complexity of TL-MCTAG is still not well-understood. This paper offers a thorough examination of the problem of TL-MCTAG recognition, showing that even highly restricted forms of TL-MCTAG are NP-complete to recognize. However, in spite of the provable difficulty of the recognition problem, we offer several algorithms that can substantially improve processing efficiency. First, we present a parsing algorithm that improves on the baseline parsing method and runs in polynomial time when both the fan-out and rank of the input grammar are bounded. Second, we offer an optimal, efficient algorithm for factorizing a grammar to produce a strongly-equivalent TL-MCTAG grammar with the rank of the grammar minimized.
Description
Other Available Sources
Keywords
Terms of Use
This article is made available under the terms and conditions applicable to Open Access Policy Articles (OAP), as set forth at Terms of Service