Publication:
Comma restoration using constituency information

Date

2003

Publisher

Association for Computational Linguistics

Citation

Stuart M. Shieber and Xiaopeng Tao. Comma restoration using constituency information. In Proceedings of the 2003 Human Language Technology Conference and Conference of the North American Chapter of the Association for Computational Linguistics, pages 221-227, Edmonton, AB, Canada, 2003.

Abstract

Automatic restoration of punctuation from unpunctuated text has application in improving the fluency and applicability of speech recognition systems. We explore the possibility that syntactic information can be used to improve the performance of an HMM-based system for restoring punctuation (specifically, commas) in text. Our best methods reduce sentence error rate substantially, by some 20%, with an additional 8% reduction possible given improvements in extraction of the requisite syntactic information.

Terms of Use

This article is made available under the terms and conditions applicable to Open Access Policy Articles (OAP), as set forth at Terms of Service
