Add multi gpus capacity for transcription #114

chainyo · 2023-06-22T18:33:49Z

This PR:

Removes the bad async design and the blocking operations.
Enables multi-GPU transcription.

WARNING: NEEDS TO BE DONE

We should handle the multi-GPU case where we have to load a specific whisper model for extra languages. I suggest introducing some kind of index so that the model is loaded and unloaded on a single GPU only, not on all the GPUs.
NeMo is the bottleneck at the moment, and it should also use an index to load multiple diarization models.
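As a rough illustration of the index idea above, here is a minimal sketch that keeps one extra-language model per GPU index and swaps it only on the GPU that needs it. `PerGPUModelIndex` and `fake_loader` are hypothetical names invented for this example, and the loader is a placeholder rather than a real whisper or NeMo call, so this is not the code from the PR.

```python
from typing import Callable, Dict, Tuple


class PerGPUModelIndex:
    """Track which extra-language model is loaded on which GPU (hypothetical helper)."""

    def __init__(self, loader: Callable[[str, int], object]) -> None:
        # loader(language, gpu_index) stands in for the real model-loading call.
        self._loader = loader
        self._models: Dict[int, Tuple[str, object]] = {}

    def get(self, language: str, gpu_index: int) -> object:
        """Return the model for `language` on `gpu_index`, loading it if needed."""
        current = self._models.get(gpu_index)
        if current is not None and current[0] == language:
            return current[1]
        # Replace only this GPU's entry; other GPUs keep their models.
        self.unload(gpu_index)
        model = self._loader(language, gpu_index)
        self._models[gpu_index] = (language, model)
        return model

    def unload(self, gpu_index: int) -> None:
        """Drop the model held for one GPU, leaving the others untouched."""
        self._models.pop(gpu_index, None)

    def loaded(self) -> Dict[int, str]:
        """Map each GPU index to the language currently loaded on it."""
        return {idx: lang for idx, (lang, _) in self._models.items()}


def fake_loader(language: str, gpu_index: int) -> str:
    # Placeholder for a real loader; returns a label instead of a model.
    return f"{language}-model@cuda:{gpu_index}"


index = PerGPUModelIndex(fake_loader)
index.get("fr", gpu_index=0)
index.get("de", gpu_index=1)
index.get("es", gpu_index=0)  # swaps the model on GPU 0 only
```

After the last call, GPU 1 still holds its "de" model while GPU 0 has moved from "fr" to "es". The same shape could hold diarization models, assuming each one can be pinned to a single device.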

chainyo · 2023-06-26T15:47:45Z

I implemented a GPU index for each service based on the GPUService class that handles the acquisition and the release of the GPU (to avoid 2 jobs trying to use the same GPU at the exact same moment).
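To make the acquire/release idea concrete, here is a minimal sketch built on an `asyncio.Queue` of free device indices. It is an assumption about how such a service could look, not the actual `GPUService` implementation; `GPUPool` and `run_jobs` are hypothetical names, and `asyncio.sleep` stands in for the GPU work.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, List


class GPUPool:
    """Hand out GPU indices so that no two jobs hold the same one (hypothetical)."""

    def __init__(self, device_indices: List[int]) -> None:
        self._free: asyncio.Queue = asyncio.Queue()
        for idx in device_indices:
            self._free.put_nowait(idx)

    @asynccontextmanager
    async def acquire(self) -> AsyncIterator[int]:
        # Wait until an index is free, then reserve it for the caller.
        idx = await self._free.get()
        try:
            yield idx
        finally:
            # Always return the index, even if the job raised.
            self._free.put_nowait(idx)


async def run_jobs(n_jobs: int, device_indices: List[int]) -> int:
    """Run n_jobs fake jobs and return the peak number sharing one GPU."""
    pool = GPUPool(device_indices)
    in_use = {idx: 0 for idx in device_indices}
    peak = 0

    async def job() -> None:
        nonlocal peak
        async with pool.acquire() as idx:
            in_use[idx] += 1
            peak = max(peak, in_use[idx])
            await asyncio.sleep(0.01)  # stand-in for GPU work
            in_use[idx] -= 1

    await asyncio.gather(*(job() for _ in range(n_jobs)))
    return peak


peak_sharing = asyncio.run(run_jobs(n_jobs=8, device_indices=[0, 1]))
```

With eight jobs and two devices, jobs wait on the queue until an index is returned, so `peak_sharing` stays at 1. Releasing in a `finally` block means a failed job cannot leave a GPU permanently reserved.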

I'm currently facing a weird issue with ctranslate2 inference: `RuntimeError: CUDA: invalid arguments`

add multi gpus handling for transcription

9766cd5

chainyo added the api (Everything related to the API implementation) and transcription (Everything related to the transcription part) labels Jun 22, 2023

chainyo requested a review from aleksandr-smechov June 22, 2023 18:33

chainyo self-assigned this Jun 22, 2023

chainyo marked this pull request as draft June 22, 2023 18:33

chainyo added 4 commits June 23, 2023 10:34

add model index for transcription and diarization

97aaadf

add gpu_index for alignment models

f157d74

fix diarization gpu indexing

1747561

multi gpu setup, with transcription errors

b03d824

Aleks and others added 19 commits June 29, 2023 01:58

Updated error payload for svix in cortex endpoint

6c3b6b7

Extra languages performed poorly, commenting out tests that require "…

8952960

…he" for this param

Merge pull request #119 from Wordcab/118-better-support-for-svix-base…

6cc874b

…d-error-reporting Updated error payload for svix in cortex endpoint

update the download_audio function to avoid extension

025f809

Merge pull request #122 from Wordcab/121-truncate-file-name-too-long

a690bac

update the download_audio function to avoid extension problems

add audio_duration key in response + fix dual_channel bug

f9244c5

fix tests and endpoint returns

71a62a5

Merge pull request #127 from Wordcab/125-add-audio_duration-param

0778a8f

Add `audio_duration` key in API response

Add a catch for empty audio (#128)

e7df355

* add a catch for empty audio file * rename utterances -> response for coherence * fix quality

Add vocab feature (#124)

7fe0a7b

* add vocab feature * fix youtube endpoint * update the prompt sentence * add vocab feature * fix youtube endpoint * update the prompt sentence

Upgraded base docker image to nvidia/cuda:11.7.1-cudnn8-runtime-ubunt…

55582f2

…u20.04

Merge pull request #133 from Wordcab/132-upgrade-deprecated-nvidia-image

b691afe

Upgraded docker image to nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu20.04

add multi gpus handling for transcription

c92b44f

add model index for transcription and diarization

f72d0ff

add gpu_index for alignment models

81b2ded

fix diarization gpu indexing

6b44dd0

multi gpu setup, with transcription errors

2a61021

Merge branch '4gpus' of github.com:Wordcab/wordcab-transcribe into 4gpus

c42cc27

lower batch_size

f92d5f7

chainyo added 2 commits July 3, 2023 15:45

fix alignment device index

92c9234

revert transcribe service to no mapping

4feba13

aleksandr-smechov approved these changes Jul 3, 2023

chainyo added 2 commits July 5, 2023 12:59

update gpu service queue manager

1652c78

fix Exception returns for endpoints

58abd9d

aleksandr-smechov approved these changes Jul 5, 2023

fixed dual_channel

e3af87f

chainyo marked this pull request as ready for review July 5, 2023 14:23

chainyo added 2 commits July 5, 2023 14:35

fix flake and darglint

959eb9c

run black linter

fc7fbe6

aleksandr-smechov approved these changes Jul 5, 2023

fix nemo config tests

fb6f7a8

aleksandr-smechov approved these changes Jul 5, 2023

fix typo

12442d1

chainyo merged commit 2a17376 into main Jul 5, 2023
1 check failed

chainyo deleted the 4gpus branch July 5, 2023 15:30

chainyo mentioned this pull request Jul 6, 2023

Error in diarization: CUDA out of memory #129

Closed
