
[Bug] on large text with device=cpu, tts_to_file will throw RuntimeError bad allocation #3800

Open
ArmoredExplorer opened this issue Jun 27, 2024 · 3 comments
Labels
bug Something isn't working

Comments


ArmoredExplorer commented Jun 27, 2024

Describe the bug

When running text-to-speech with an English model, TTS runs out of memory when it tries to write the .wav file. I'm running on CPU only; my machine has ~14 GB of available RAM.

I ran the code on around 20 pages of text. Everything worked up to tts.tts_to_file, but that call threw RuntimeError: bad allocation. During inference the model was successfully swapping chunks in and out of memory, but when it tried to write the file it apparently ran out of memory.

It works fine on a few paragraphs.

To Reproduce

from TTS.api import TTS

# set device
device = "cpu"

txt_20_pages = "copyrighted text, substitute with 500*20 words"

# Init TTS with the target model name
tts = TTS(model_name="tts_models/de/thorsten/tacotron2-DDC", progress_bar=False).to(device)
# Run TTS
tts.tts_to_file(text=txt_20_pages, file_path="long_voice.wav")

Expected behavior

The .wav file is written successfully.

Logs

Traceback (most recent call last):
  File "C:\Users\Zapi\Documents\spe2.py", line 284, in <module>
    tts.tts_to_file(text=txt2, file_path="gard_book_ich1.wav")
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\TTS\api.py", line 334, in tts_to_file
    wav = self.tts(
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\TTS\api.py", line 276, in tts
    wav = self.synthesizer.tts(
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\TTS\utils\synthesizer.py", line 398, in tts
    outputs = synthesis(
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\TTS\tts\utils\synthesis.py", line 221, in synthesis
    outputs = run_model_torch(
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\TTS\tts\utils\synthesis.py", line 53, in run_model_torch
    outputs = _func(
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\TTS\tts\models\vits.py", line 1161, in inference
    o = self.waveform_decoder((z * y_mask)[:, :, : self.max_inference_len], g=g)
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\TTS\vocoder\models\hifigan_generator.py", line 254, in forward
    o = self.ups[i](o)
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Zapi\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\conv.py", line 797, in forward
    return F.conv_transpose1d(
RuntimeError: bad allocation

Environment

{
    "CUDA": {
        "GPU": [],
        "available": false,
        "version": null
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.3.1+cpu",
        "TTS": "0.22.0",
        "numpy": "1.22.0"
    },
    "System": {
        "OS": "Windows",
        "architecture": [
            "64bit",
            "WindowsPE"
        ],
        "processor": "AMD64 Family 25 Model 1 Stepping 1, AuthenticAMD",
        "python": "3.9.7",
        "version": "10.0.20348"
    }
}

Additional context

This is happening on a machine with 16 GB of RAM, so it may not reproduce if you test with more RAM. Capping the available memory in a VM should make it reproducible.
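
To reproduce without a dedicated VM, one option is to cap the process's address space before loading the model. This is a minimal sketch, assuming Linux (resource.RLIMIT_AS is not available on Windows) and an arbitrary 4 GB cap, reusing txt_20_pages from the snippet above:

import resource
from TTS.api import TTS

# Cap the address space at ~4 GB (assumed value) so the allocation
# failure triggers even on machines with plenty of physical RAM.
limit_bytes = 4 * 1024**3
resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

tts = TTS(model_name="tts_models/de/thorsten/tacotron2-DDC", progress_bar=False).to("cpu")
tts.tts_to_file(text=txt_20_pages, file_path="long_voice.wav")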

ArmoredExplorer added the bug label on Jun 27, 2024
stale bot commented Aug 2, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

stale bot added the wontfix label on Aug 2, 2024
ArmoredExplorer (Author) commented:

I'm wondering what is actually going wrong here. The failure seems to happen inside the torch package.

stale bot removed the wontfix label on Aug 2, 2024
ArmDaniel commented:

Hey @ArmoredExplorer! I ran into exactly the same issue, also on a machine with 16 GB of RAM. The workaround I found was to split the initial text into very small chunks (max 100 characters) and feed them to the model sequentially, as sketched below. It is very slow, but it is the best I could do. You could also process the chunks in batches with multiprocessing, but that needs extra care.
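
For reference, a minimal sketch of that chunking workaround (not an official API): it assumes tts.tts() returns the raw waveform as a list of samples (as in TTS 0.22.0), uses numpy and soundfile to join the chunks into one file, and chunk_text is a hypothetical helper using the 100-character limit suggested above.

import numpy as np
import soundfile as sf
from TTS.api import TTS

def chunk_text(text, max_chars=100):
    # Greedy split on whitespace: no chunk exceeds max_chars and
    # words are never cut in half.
    chunks, current = [], ""
    for word in text.split():
        if current and len(current) + 1 + len(word) > max_chars:
            chunks.append(current)
            current = word
        else:
            current = f"{current} {word}".strip()
    if current:
        chunks.append(current)
    return chunks

tts = TTS(model_name="tts_models/de/thorsten/tacotron2-DDC", progress_bar=False).to("cpu")
long_text = "copyrighted text, substitute with 500*20 words"

# Synthesize one small chunk at a time so peak memory stays low, then
# write a single .wav from the concatenated waveforms.
pieces = [np.asarray(tts.tts(text=chunk)) for chunk in chunk_text(long_text)]
sf.write("long_voice.wav", np.concatenate(pieces), tts.synthesizer.output_sample_rate)

Depending on the model, inserting a short silence between chunks may smooth the joins.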
