Publication:

An Iterative Dual Pathway Structure for Speech-to-Text Transcription

Date

2011

Publisher

Association for the Advancement of Artificial Intelligence

Citation

Liem, Beatrice, Haoqi Zhang, and Yiling Chen. Forthcoming. An iterative dual pathway structure for speech-to-text transcription. In Human Computation: Papers from the AAAI Workshop (WS-11-11). San Francisco, CA, August 2011, ed. Luis von Ahn and Panagiotis Ipeirotis. Association for the Advancement of Artificial Intelligence.

Abstract

In this paper, we develop a new human computation algorithm for speech-to-text transcription that can potentially achieve the high accuracy of professional transcription using only microtasks deployed via an online task market or a game. The algorithm partitions audio clips into short 10-second segments for independent processing and joins adjacent outputs to produce the full transcription. Each segment is sent through an iterative dual pathway structure that allows participants in either path to iteratively refine the transcriptions of others in their path while being rewarded based on transcriptions in the other path, eliminating the need to check transcripts in a separate process. Initial experiments with local subjects show that produced transcripts are on average 96.6% accurate.
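The flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Only the 10-second segment length, the two paths, the iterative refinement within a path, and the cross-path reward come from the abstract. The function names, the use of word-level similarity as the reward, the fixed number of rounds, and the modelling of workers as plain callables are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

SEGMENT_SECONDS = 10.0  # segment length stated in the abstract


def segment_bounds(duration, seg_len=SEGMENT_SECONDS):
    """Split a clip of `duration` seconds into consecutive (start, end) windows."""
    duration = float(duration)
    bounds = []
    start = 0.0
    while start < duration:
        bounds.append((start, min(start + seg_len, duration)))
        start += seg_len
    return bounds


def agreement(transcript_a, transcript_b):
    """Hypothetical reward signal: word-level similarity between the two paths."""
    words_a, words_b = transcript_a.split(), transcript_b.split()
    return SequenceMatcher(None, words_a, words_b).ratio()


def refine_segment(path_a, path_b):
    """Run one segment through both paths.

    Each worker is a callable that takes the previous transcript from its own
    path and returns a refined one. After every round, the two paths are
    scored against each other (assumed rule), so no separate checking step
    is needed. The number of rounds is fixed by the number of workers
    (assumption; the abstract gives no stopping criterion).
    """
    latest_a, latest_b = "", ""
    rewards = []
    for worker_a, worker_b in zip(path_a, path_b):
        latest_a = worker_a(latest_a)
        latest_b = worker_b(latest_b)
        rewards.append(agreement(latest_a, latest_b))
    return latest_a, latest_b, rewards


def join_segments(transcripts):
    """Join per-segment transcripts in order to form the full transcription."""
    return " ".join(t.strip() for t in transcripts if t.strip())
```

In this sketch a 25-second clip yields three windows, the last one shorter than 10 seconds, and the reward for a round rises as the two paths converge on the same wording.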

Terms of Use

This article is made available under the terms and conditions applicable to Open Access Policy Articles (OAP), as set forth at Terms of Service
