Description
This is a modified version of a subset of the Device and Produced Speech (DAPS) dataset. The original dataset can be found here.

This dataset contains text-aligned audio of the first script of the "clean" partition of the DAPS dataset for all 20 speakers. Phoneme and word alignments are provided as JSON files. We segment the audio and alignments into single sentences. For each sentence, we additionally provide the raw text in a txt file. Audio is provided as 44.1 kHz WAV files.

If you use this work as part of an academic publication, please cite the paper corresponding to the original dataset: Gautham J. Mysore, “Can We Automatically Transform Speech Recorded on Common Consumer Devices in Real-World Environments into Professional Production Quality Speech? - A Dataset, Insights, and Challenges”, in the IEEE Signal Processing Letters, Vol. 22, No. 8, August 2015.
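As an illustration of the per-sentence layout described above (one WAV file, one JSON alignment file, and one txt file per sentence), the following is a minimal Python sketch for reading a single sentence using only the standard library. The file naming scheme (`<stem>.wav`, `<stem>.json`, `<stem>.txt` sharing one stem) and the function name `load_sentence` are assumptions made for this example and are not taken from the dataset documentation. The structure of the alignment JSON is not described here, so the sketch returns the parsed JSON unchanged rather than guessing its fields.

```python
import json
import wave
from pathlib import Path


def load_sentence(stem):
    """Load audio info, alignment, and text for one sentence.

    `stem` is a path without extension. The naming scheme
    (<stem>.wav, <stem>.json, <stem>.txt) is an assumption.
    """
    wav_path = Path(f"{stem}.wav")
    json_path = Path(f"{stem}.json")
    txt_path = Path(f"{stem}.txt")

    # Read basic audio properties and the raw PCM bytes from the WAV file.
    with wave.open(str(wav_path), "rb") as wav:
        sample_rate = wav.getframerate()
        num_frames = wav.getnframes()
        pcm_bytes = wav.readframes(num_frames)

    # The alignment schema is not documented in this description,
    # so the parsed JSON is returned as-is.
    with open(json_path, encoding="utf-8") as f:
        alignment = json.load(f)

    # Raw sentence text, with surrounding whitespace removed.
    text = txt_path.read_text(encoding="utf-8").strip()

    return {
        "sample_rate": sample_rate,
        "duration_seconds": num_frames / sample_rate,
        "pcm_bytes": pcm_bytes,
        "alignment": alignment,
        "text": text,
    }
```

Checking that `sample_rate` equals 44100 after loading is a simple way to confirm that a file matches the 44.1 kHz format stated above.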
Date made available | May 24 2021 |
---|---|
Publisher | ZENODO |