Add Support for laughter annotation in Fine-Tuning with a special token [Feature request] #3760
Labels: feature request, wontfix
Hello Coqui-AI team,
I would like to request a feature that would make it easier to train models on non-verbal sounds, such as laughter, in transcribed conversations. Specifically, I suggest supporting a special token or keyword, such as [laugh], that can be used in transcripts during fine-tuning to mark instances of laughter in the audio data.
For instance, if a person laughs in an audio file, the transcription would include the special token [laugh] at the point where the laughter occurs. When the model is fine-tuned on such data, it could learn to associate the token with laughter and reproduce it in the synthesized speech.
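To make the idea concrete, here is a minimal sketch of how a transcript containing [laugh] might be tokenized so that the marker stays a single symbol instead of being split into characters. This is plain Python and not the Coqui TTS API; the names `SPECIAL_TOKENS`, `tokenize`, and `build_vocab` are hypothetical, and character-level tokenization is only an assumption for illustration.

```python
import re

# Hypothetical list of special tokens; "[laugh]" is the marker proposed above.
SPECIAL_TOKENS = ["[laugh]"]

# Match a special token as one unit, otherwise fall back to a single character.
_PATTERN = re.compile(
    "|".join(re.escape(tok) for tok in SPECIAL_TOKENS) + r"|.",
    re.DOTALL,
)


def tokenize(text):
    """Split text into characters, keeping each special token intact."""
    return _PATTERN.findall(text)


def build_vocab(transcripts):
    """Assign an integer id to every symbol seen, special tokens first."""
    vocab = {tok: i for i, tok in enumerate(SPECIAL_TOKENS)}
    for line in transcripts:
        for sym in tokenize(line):
            vocab.setdefault(sym, len(vocab))
    return vocab


# Example transcript line (made up for illustration).
transcripts = ["That was great [laugh] really."]
vocab = build_vocab(transcripts)
ids = [vocab[sym] for sym in tokenize(transcripts[0])]
```

With this approach the laughter marker gets its own vocabulary entry, so a model fine-tuned on such transcripts would see one dedicated input symbol aligned with the laughter in the audio, rather than the letters l, a, u, g, h.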
Thank you for considering this request. You are doing a fantastic job.