Skip to content

SPECOM 2018 (Oral): Dataset for "Towards Improving The Intelligibility of Black Box Speech Synthesizers In Noise"

Notifications You must be signed in to change notification settings

TManzini/ImprovingBlackBoxSpeechSynthesizers

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 

Repository files navigation

Towards Improving The Intelligibility of Black Box Speech Synthesizers In Noise

This repository contains the dataset that was collected in conjunction with the publication "Towards Improving The Intelligibility of Black Box Speech Synthesizers In Noise". The dataset itself contains 1440 transcriptions of syntheic speech synthesizers under synthetic noise conditions to measure the errors that listeners make.

The dataset was collected with s from mechanical turk and the Carnegie Mellon University TestVox test kit.

Dataset Shape

The dataset is contained in the file "IBBSS_dataset.csv" here in the repository.

Each transcription contains the following information.

  • A unique ID that represents the listener that transcribed this example.
  • A number which indicates which sound file this transcription this was for, for a given listener. (0 is the first sound file a listener heard, 1 is the second, etc.)
  • The transcription that the listener entered. (An empty string if the listener indicates the listener did not understand any content from the synthesizer)
  • The ground truth text which was provided to the synethizer.
  • The standard deviation of the random noise that was added to the synthesized speech.
  • The frequency that was used as a threshold for a low pass filter that was applied to the synthesized speech.
  • The frequency that was used as a threshold for a high pass filter that was applied to the synthesized speech.
  • The name of the synthesizer that was used (Flite, E-Speak, Google).
  • The file type that was used for the experiment.

Citation

If you use this dataset in your research please use the following citation.

@inproceedings{manzini2018towards,
  title={Towards Improving Intelligibility of Black-Box Speech Synthesizers in Noise},
  author={Manzini, Thomas and Black, Alan},
  booktitle={International Conference on Speech and Computer},
  pages={367--376},
  year={2018},
  organization={Springer}
}

About

SPECOM 2018 (Oral): Dataset for "Towards Improving The Intelligibility of Black Box Speech Synthesizers In Noise"

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published