Polish model #6

abb128 · 2023-03-27T19:42:01Z

An initial Polish model has been trained on the 160 hours of Mozilla Common Voice, using an 80/10/10 train/test/dev split by speaker. It reaches 4.51% WER on unseen speakers. Some of those speakers may not actually be unseen, because Common Voice allows anonymous submissions and doesn't link them to each other, but I'm not certain.

It is available here: https://april.sapples.net/april-polish-dev-2_pl.april
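For readers unfamiliar with the two terms used above, here is a minimal sketch of a speaker-disjoint split and of word error rate (WER). It is an illustration only, not the code used to train or evaluate this model. It assumes each clip is a dict with a `client_id` key (the speaker column in Common Voice TSV files), and it splits by speaker count rather than by hours, which may differ from what was done here. The function names `speaker_split` and `wer` are hypothetical.

```python
import random
from collections import defaultdict


def speaker_split(rows, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split clips into train/test/dev so that no speaker appears in two sets."""
    by_speaker = defaultdict(list)
    for row in rows:
        by_speaker[row["client_id"]].append(row)

    speakers = sorted(by_speaker)  # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(speakers)

    n_train = int(len(speakers) * ratios[0])
    n_test = int(len(speakers) * ratios[1])
    groups = {
        "train": speakers[:n_train],
        "test": speakers[n_train:n_train + n_test],
        "dev": speakers[n_train + n_test:],  # the remainder goes to dev
    }
    return {
        name: [clip for spk in ids for clip in by_speaker[spk]]
        for name, ids in groups.items()
    }


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Note that a corpus-level WER is normally computed as total errors divided by total reference words across all utterances, not as the mean of per-utterance values. The caveat about anonymous submissions still applies: if one person contributes under several `client_id` values, a split like this cannot tell.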

phodina · 2023-05-20T05:16:00Z

Could you link an article or a list of steps on how to train the model for different languages? I assume you got the dataset from here.

What would be the next step if I'd like to train it for Czech or Slovak?

abb128 · 2023-05-22T14:17:13Z

@phodina From my testing, training on Common Voice data actually didn't work that well for Internet content with talking, because the speech is too high-quality and clear. The dataset consists entirely of read speech, since it is collected by having people read written sentences out loud, and this produces a somewhat different kind of speech from natural talking or conversation.

I may write an article with some findings and instructions later, but for now: I trained the model using this recipe, with some modifications to use Common Voice instead of LibriSpeech, and finally used this to export the checkpoint to a .april file.
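The actual modifications are not shown in this thread. As a rough idea of the kind of data-loading change involved in moving from LibriSpeech to Common Voice, here is a minimal sketch that reads a Common Voice TSV file into per-clip records. It assumes the standard Common Voice layout (a tab-separated file with `client_id`, `path`, and `sentence` columns, and audio under a clips directory). The function name `load_common_voice` and the output keys are hypothetical and do not come from the recipe.

```python
import csv
from pathlib import Path


def load_common_voice(tsv_path, clips_dir):
    """Yield one record per clip from a Common Voice TSV file."""
    clips_dir = Path(clips_dir)
    with open(tsv_path, newline="", encoding="utf-8") as f:
        # Sentences can contain quote characters, so treat quotes as plain text.
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            text = row["sentence"].strip()
            if not text:
                continue  # skip clips without a transcript
            yield {
                "speaker": row["client_id"],
                "audio": clips_dir / row["path"],
                "text": text,
            }
```

Any text normalisation, tokenisation, or feature extraction the recipe expects would still have to be added on top of this.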

phodina · 2023-06-04T08:27:08Z

Hi @abb128, thanks for the explanation. I'll look at the recipe you suggested!

dreamcat4 · 2023-06-21T11:30:46Z

Hello,
I could not follow what was being explained here, but I would like to know if there is a good pathway to convert from Mozilla Common Voice --> LSTM --> .april model.

The language I want is Greek. However, a general workflow that works for any other language would be very helpful.

[EDIT]

Also: how can I be notified when new .april models get added, so that I know to come back and check again?

Doomsdayrs · 2024-02-29T05:52:08Z

> I may write an article with some findings and instructions later, but for now I trained the model using this recipe with some modifications to use common voice instead of LibriSpeech

What modifications did you make? Can you provide a patch file?
