Structured Neural Models for Coreference and Generation

Date

2018-05-13

Abstract

Natural Language Processing (NLP) has recently entered a period marked by impressive empirical performance on a wide variety of natural language tasks, with much of this success due to the use of deep learning techniques. Deep learning has, in particular, offered a simple approach for learning expressive, global models of linguistic data. While incredibly powerful, this style of modeling poses a problem for NLP tasks that require structured prediction, that is, the prediction of outputs with combinatorial structure such as sequences, trees, and graphs. Indeed, whereas the standard approach to tackling structured prediction problems in NLP involves predicting the structure with the highest score under the learned model, it is often intractable to find the highest-scoring structure under the sort of global model common in deep learning approaches to NLP. In this thesis, we argue that search-based structured prediction, where a model is trained to search incrementally for a structure, is a particularly natural choice for doing structured prediction with deep models. Specifically, we argue that recurrent neural networks make it simple and convenient to compactly represent the history of incremental predictions made during search, which allows for the learning of powerful search-based structured predictors. We first investigate this approach to deep structured prediction in the context of the NLP task of coreference resolution. In particular, we first discuss a baseline neural coreference resolver, which achieved state-of-the-art performance when it was introduced, and we then show that a search-based, structured approach improves on even this baseline. We then discuss an approach to training the celebrated sequence-to-sequence model as a search-based structured predictor, and we show that this leads to improvements on word ordering, dependency parsing, and machine translation tasks.
In the final chapter of this thesis, we discuss the structured prediction problem of long-form text generation, and database-to-text generation in particular, which is not well handled by the techniques introduced in the preceding chapters. We introduce a new dataset for studying the challenges posed by this structured prediction problem, suggest new automatic approaches to evaluating performance on this problem, and use these automatic approaches to analyze the performance of various state-of-the-art generation models.

Keywords

Computer Science

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
