Add Support for laughter annotation in Fine-Tuning with a special token [Feature request] #3760

JaviCru · 2024-05-27T10:30:08Z

Hello Coqui-AI team,

I would like to request a new feature that would greatly enhance the training process for capturing non-verbal sounds, such as laughter, in transcribed conversations. Specifically, my suggestion is to implement a special token or keyword, such as [laugh], that can be used during fine-tuning to denote instances of laughter in the audio data.

For instance, if a person laughs in an audio file, the transcription could include the special token [laugh] at the appropriate point. This way, when the model is fine-tuned, it learns to recognize and reproduce laughter in the synthesized speech.
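
A minimal sketch of the idea, assuming a character-level text front end. The `SPECIAL_TOKENS` list and the `tokenize` and `build_vocab` helpers are hypothetical names for illustration, not part of the Coqui TTS API. The sketch only shows how `[laugh]` could be kept as one token with its own ID instead of being split into characters.

```python
import re

# Hypothetical special tokens; not part of any existing Coqui TTS vocabulary.
SPECIAL_TOKENS = ["[laugh]"]

# Match any special token as a whole; the capturing group makes re.split keep the matches.
_SPECIAL_RE = re.compile("(" + "|".join(re.escape(t) for t in SPECIAL_TOKENS) + ")")


def tokenize(text):
    """Split text into characters, keeping each special token as one unit."""
    tokens = []
    for piece in _SPECIAL_RE.split(text):
        if piece in SPECIAL_TOKENS:
            tokens.append(piece)
        else:
            tokens.extend(piece)  # plain text falls back to per-character tokens
    return tokens


def build_vocab(transcripts):
    """Assign an integer ID to every token seen, special tokens first."""
    vocab = {tok: i for i, tok in enumerate(SPECIAL_TOKENS)}
    for text in transcripts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab


# Example transcript with laughter marked at the point where it occurs.
transcripts = ["That is so funny [laugh] stop it."]
vocab = build_vocab(transcripts)
ids = [vocab[t] for t in tokenize(transcripts[0])]
```

With this scheme the laughter marker occupies a single position in the input sequence, so a fine-tuned model would have one dedicated embedding to associate with the laughter audio.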

Thank you for considering this request. You are doing a fantastic job.

stale · 2024-06-26T16:48:35Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

JaviCru added the "feature request" label (feature requests for making TTS better) on May 27, 2024

stale bot added the "wontfix" label (this will not be worked on, but feel free to help) on Jun 26, 2024

stale bot closed this as completed Jul 18, 2024
