Arabic diacritization using weighted finite-state transducers

Title: Arabic diacritization using weighted finite-state transducers
Author: Shieber, Stuart (ORCID 0000-0002-7733-8195); Nelken, Rani

Note: Order does not necessarily reflect citation order of authors.

Citation: Rani Nelken and Stuart M. Shieber. Arabic diacritization using weighted finite-state transducers. In Proceedings of the 2005 ACL Workshop on Computational Approaches to Semitic Languages, pages 79-86, Ann Arbor, Michigan, June 2005.
Abstract: Arabic is usually written without short vowels and additional diacritics, which are nevertheless important for several applications. We present a novel algorithm for restoring these symbols, using a cascade of probabilistic finite-state transducers trained on the Arabic treebank, integrating a word-based language model, a letter-based language model, and an extremely simple morphological model. This combination of probabilistic methods and simple linguistic information yields high levels of accuracy.
Published Version: http://www.aclweb.org/anthology-new/W/W05/W05-0711.pdf
Terms of Use: This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:2252610
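Illustration: The abstract describes restoring diacritics by combining a word-based model with a letter-based model. The Python sketch below is only a toy illustration of that general idea, not the authors' transducer cascade. It assumes Buckwalter-style transliteration, uses a context-free lookup in place of a language model, and every count and table entry in it (WORD_COUNTS, LETTER_BEST) is hypothetical rather than taken from the Arabic treebank. Removing diacritics is deterministic. Restoring them means choosing the lowest-cost diacritized form whose stripped version matches the input, and falling back to per-letter guesses for unseen words.

```python
import math

# Buckwalter-style diacritic symbols: short vowels a/u/i, sukun o,
# shadda ~, and tanwin F/K/N. (Assumed transliteration scheme.)
DIACRITICS = set("auio~FKN")


def strip_diacritics(word: str) -> str:
    """Deterministic direction: delete every diacritic symbol."""
    return "".join(ch for ch in word if ch not in DIACRITICS)


# Hypothetical toy counts of diacritized word forms (illustrative only).
WORD_COUNTS = {"kataba": 6, "kutub": 3, "kutiba": 1, "qalam": 4}

# Hypothetical toy letter model: the diacritic guessed after each consonant.
LETTER_BEST = {"d": "a", "r": "a", "s": "a"}


def word_model(bare: str) -> list[tuple[str, float]]:
    """Return (form, cost) pairs for known forms that strip to `bare`.

    Cost is the negative log of the form's relative frequency among
    the matching forms, so a lower cost means a more likely form.
    """
    matches = {w: c for w, c in WORD_COUNTS.items()
               if strip_diacritics(w) == bare}
    total = sum(matches.values())
    return [(w, -math.log(c / total)) for w, c in matches.items()]


def letter_model(bare: str) -> str:
    """Fallback for unseen words: attach the guessed diacritic per letter."""
    return "".join(ch + LETTER_BEST.get(ch, "") for ch in bare)


def restore(bare: str) -> str:
    """Prefer the cheapest word-level candidate; otherwise back off."""
    candidates = word_model(bare)
    if candidates:
        return min(candidates, key=lambda pair: pair[1])[0]
    return letter_model(bare)


print(restore("ktb"))  # word-level choice among kataba / kutub / kutiba
print(restore("drs"))  # unseen word, letter-level fallback
```

In a weighted finite-state setting, each of these steps would instead be encoded as a transducer and the steps combined by composition, with the best path selecting the output. The dictionary lookup here stands in for that machinery only to keep the example short.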