An iterative dual pathway structure for speech-to-text transcription

Beatrice Liem*, Haoqi Zhang, Yiling Chen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Scopus citations

Abstract

In this paper, we develop a new human computation algorithm for speech-to-text transcription that can potentially achieve the high accuracy of professional transcription using only microtasks deployed via an online task market or a game. The algorithm partitions audio clips into short 10-second segments for independent processing and joins adjacent outputs to produce the full transcription. Each segment is sent through an iterative dual pathway structure that allows participants in either path to iteratively refine the transcriptions of others in their path while being rewarded based on transcriptions in the other path, eliminating the need to check transcripts in a separate process. Initial experiments with local subjects show that produced transcripts are on average 96.6% accurate.
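
The segment-and-join flow and the cross-path reward described in the abstract can be illustrated with a minimal Python sketch. Only the 10-second segment length and the idea of scoring one path against the other come from the abstract. The function names, the word-level similarity measure used as the reward, and the plain concatenation used for joining are assumptions made for illustration, not the authors' actual method.

```python
from difflib import SequenceMatcher

SEGMENT_SECONDS = 10.0  # segment length stated in the abstract


def segment_bounds(duration, length=SEGMENT_SECONDS):
    """Return (start, end) times that cover a clip in fixed-length pieces."""
    bounds = []
    start = 0.0
    while start < duration:
        bounds.append((start, min(start + length, duration)))
        start += length
    return bounds


def similarity(a, b):
    """Word-level similarity in [0, 1] (assumed stand-in for the reward rule)."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def reward(own_path_transcript, other_path_transcript):
    """Score a transcript against the current transcript from the other path."""
    return similarity(own_path_transcript, other_path_transcript)


def join_segments(transcripts):
    """Concatenate per-segment transcripts in order (assumed joining rule)."""
    return " ".join(t.strip() for t in transcripts if t.strip())
```

For example, `segment_bounds(25.0)` yields three segments, the last one shorter than 10 seconds, and `reward` returns 1.0 only when the two paths agree word for word. Because each path is scored against the other, agreement itself serves as the accuracy check, which is why no separate verification step is needed in this sketch.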

Original language: English (US)
Title of host publication: Human Computation - Papers from the 2011 AAAI Workshop, Technical Report
Pages: 37-42
Number of pages: 6
State: Published - Nov 2 2011
Event: 2011 AAAI Workshop - San Francisco, CA, United States
Duration: Aug 8 2011 → Aug 8 2011

Publication series

Name: AAAI Workshop - Technical Report
Volume: WS-11-11

Conference

Conference: 2011 AAAI Workshop
Country: United States
City: San Francisco, CA
Period: 8/8/11 → 8/8/11

ASJC Scopus subject areas

  • Engineering (all)

Cite this

Liem, B., Zhang, H., & Chen, Y. (2011). An iterative dual pathway structure for speech-to-text transcription. In Human Computation - Papers from the 2011 AAAI Workshop, Technical Report (pp. 37-42). (AAAI Workshop - Technical Report; Vol. WS-11-11).