count_phonemized in words_mismatch should use given separator, not spaces to split words #169

iamanigeeit · 2024-06-26T05:35:55Z

It's quite common to use spaces to separate the phonemes for speech synthesis.

But this leads to word mismatch problems because count_phonemized splits on whitespace.

>>> from phonemizer.backend import BACKENDS
>>> from phonemizer.separator import Separator
>>> G2P = BACKENDS['espeak'](language='en-us', words_mismatch='warn')
>>> SEP = Separator(word='|', phone=' ')
>>> G2P.phonemize(['try'], separator=SEP)[0]
WARNING:phonemizer:words count mismatch on line 1 (expected 1 words but get 4)
WARNING:phonemizer:words count mismatch on 100.0% of the lines (1/1)
't ɹ aɪ |'
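
The count of 4 in the warning can be reproduced without phonemizer at all. This is only a minimal sketch in plain Python that takes the output string from the session above and splits it both ways; the variable names are made up for illustration and are not part of the library.

```python
# Output string copied from the session above (word separator '|', phone separator ' ').
phonemized = 't ɹ aɪ |'

# Splitting on whitespace treats every phone, plus the trailing '|', as a word.
by_space = phonemized.strip().split()
print(len(by_space))  # 4

# Splitting on the word separator and dropping empty pieces gives a single word.
by_word_sep = [w for w in phonemized.strip().split('|') if w]
print(len(by_word_sep))  # 1
```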

It seems to be a common issue, e.g. #154 and lifeiteng/vall-e#50

I have fixed this (per below) but let me know if you need a PR for it.

Fix in words_mismatch.py:

    @classmethod
    def _count_words(cls, text, wordsep=None):
        """Return the number of words contained in each line of `text`"""
        return [
            len([w for w in line.strip().split(wordsep) if w])
            for line in text]

    def count_phonemized(self, text, wordsep=None):
        """Stores the number of words in each output line"""
        self._count_phn = self._count_words(text, wordsep)
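
To check the counting logic in isolation, here is a standalone sketch of the same list comprehension as a plain function, outside the class. The function name `count_words` and the second sample line are hypothetical, chosen only to show a two-word case; they are not taken from phonemizer or from real espeak output.

```python
def count_words(lines, wordsep=None):
    """Count the non-empty pieces of each line after splitting on `wordsep`.

    With wordsep=None, str.split falls back to splitting on whitespace,
    which is the behaviour that triggers the mismatch warning.
    """
    return [
        len([w for w in line.strip().split(wordsep) if w])
        for line in lines]


# Hypothetical phonemized lines using word='|' and phone=' ' as separators.
lines = ['t ɹ aɪ |', 'h ə l oʊ |w ɜː l d |']

print(count_words(lines, wordsep='|'))  # [1, 2]
print(count_words(lines))               # [4, 9]
```

Passing `separator.word` through, as in the espeak.py change, is what selects the first behaviour.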

Fix in espeak.py:

    def _phonemize_postprocess(self, phonemized, punctuation_marks, separator):
        text = phonemized[0]
        switches = phonemized[1]

        self._words_mismatch.count_phonemized(text, separator.word)
        self._lang_switch.warning(switches)

        phonemized = super()._phonemize_postprocess(text, punctuation_marks, separator)
        return self._words_mismatch.process(phonemized)

Fix in base.py:

    def phonemize(self, text, separator=default_separator,
                  strip=False, njobs=1):
        ...
        return self._phonemize_postprocess(phonemized, punctuation_marks, separator)

    def _phonemize_postprocess(self, phonemized, punctuation_marks, separator):
        ...

Note: this still raises warnings when unexpected line splits occur, such as caps in the middle of a word ("GameStop") or non-word characters before punctuation ("he said--, no"). But it should suffice for most cases, and the input text should be normalized properly anyway.

mmmaat · 2024-06-28T17:22:59Z

Thanks for pointing out that bug! Does the fix in the issue_169 branch solve your problem?

iamanigeeit · 2024-07-01T21:43:07Z

Thanks for adding test cases! I haven't tested it, as I simply changed my own code.

mmmaat pushed a commit that referenced this issue Jun 28, 2024

fixing issue #169

bae418c

mmmaat added a commit that referenced this issue Jul 2, 2024

fixed issue #169

b52ccc3
