Upcoming in next release! (this week) #46

Closed
3 tasks
Vaibhavs10 opened this issue Nov 14, 2023 · 17 comments

Comments

@Vaibhavs10
Owner

  • Speaker Diarization with Pyannote 🤯
  • Fast CPU support 💻
  • Streaming ⚡
@thomasmol

Awesome! I have a pyannote implementation here https://github.com/thomasmol/replicate-whisper-diarization if you want to take a look.
Also really curious how streaming would work!

@omarsiddiqi224

There is this resource as well to help with diarization: https://github.com/MahmoudAshraf97/whisper-diarization

@zeke

zeke commented Nov 17, 2023

Looking forward to diarization 🙏🏼

@JuergenFleiss

Hi, is testing CPU possible already? Would be greatly interested to compare it to faster-whisper. Also, what would the limitations be? Batching should work? Flash attention? Your speeds sound really promising.

@souvikqb

souvikqb commented Nov 22, 2023

  • Speaker Diarization with Pyannote 🤯
  • Fast CPU support 💻
  • Streaming ⚡

Hey 👋 any update on these releases @Vaibhavs10 ? They would open up a lot of possibilities

@acul3

acul3 commented Nov 27, 2023

Looking forward to this

Btw @Vaibhavs10, have you considered adding VAD (voice activity detection) to the pipeline?

In my testing, VAD reduces hallucination, especially with audio that has a lot of silence and noise.
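
For readers unfamiliar with the idea, here is a minimal sketch of what a VAD pre-filter does, assuming mono audio as a list of floats in [-1, 1]. It uses a plain RMS energy threshold, which is far cruder than model-based VADs, and the function names, frame length, and threshold are all hypothetical choices for illustration, not part of this project.

```python
import math


def frame_rms(samples, frame_len):
    """Return the RMS energy of each consecutive, non-overlapping frame."""
    energies = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return energies


def speech_segments(samples, frame_len=160, threshold=0.02):
    """Return (start, end) sample indices for runs of frames above the threshold."""
    segments = []
    seg_start = None
    for i, energy in enumerate(frame_rms(samples, frame_len)):
        if energy > threshold and seg_start is None:
            seg_start = i * frame_len  # a speech run begins at this frame
        elif energy <= threshold and seg_start is not None:
            segments.append((seg_start, i * frame_len))  # the run ended
            seg_start = None
    if seg_start is not None:
        segments.append((seg_start, len(samples)))  # run lasts to end of audio
    return segments


def drop_silence(samples, frame_len=160, threshold=0.02):
    """Keep only the samples that fall inside detected speech segments."""
    kept = []
    for start, end in speech_segments(samples, frame_len, threshold):
        kept.extend(samples[start:end])
    return kept
```

Only the kept segments would then be passed to the transcription model, so it never sees long stretches of silence or low-level noise.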

Thanks

@BBC-Esq

BBC-Esq commented Nov 29, 2023

Might I suggest using the NeMo toolkit instead? It seems to avoid pyannote's requirement of a Hugging Face key or whatnot to access their model. omarsiddiqi224 is the one who posted a link to a repository that relies on it instead of pyannote.

@bluusun

bluusun commented Nov 29, 2023

How can the speaker diarization be used? Where does it show? Thanks for adding this!

@TomExMachina

Does anyone have a streaming script or snippet they can share ahead of the release? If you do I will help iterate on it.

@souvikqb

Does anyone have a streaming script or snippet they can share ahead of the release? If you do I will help iterate on it.

I had originally asked this question on Distill Whisper, here's a potential script - huggingface/distil-whisper#4 (comment)

Link to my issue - huggingface/distil-whisper#41 (comment)

@Vaibhavs10
Owner Author

Hey @souvikqb @TomExMachina - Re: Streaming: A community member made this: https://gist.github.com/Oceanswave/32da596e8bb10c928f6c69c889c3c130 (It works quite well)

@Vaibhavs10
Owner Author

Hey @bluusun - Currently the API is a bit spaghetti; however, if you pass the parameter --hf_token <HF token>, it should automatically diarise.

@kadirnar recently made a PR to make this clearer: #83 (we'll make a release tomorrow or on Saturday along with some more goodies 🤞 )
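
As a rough illustration of where diarization typically "shows", the sketch below labels each transcript chunk with the speaker whose turn overlaps it most. This is a generic technique and not necessarily how this project implements it; the dictionary keys (`timestamp`, `start`, `end`, `speaker`) and function names are assumptions made up for the example.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(chunks, turns):
    """Attach to each transcript chunk the speaker whose turn overlaps it most.

    chunks: list of {"text": str, "timestamp": (start, end)}   (hypothetical shape)
    turns:  list of {"speaker": str, "start": float, "end": float}  (hypothetical shape)
    """
    labeled = []
    for chunk in chunks:
        c_start, c_end = chunk["timestamp"]
        best_speaker, best_overlap = None, 0.0
        for turn in turns:
            ov = overlap(c_start, c_end, turn["start"], turn["end"])
            if ov > best_overlap:
                best_speaker, best_overlap = turn["speaker"], ov
        # speaker stays None when no turn overlaps the chunk at all
        labeled.append({**chunk, "speaker": best_speaker})
    return labeled
```

With this shape, the output is the usual transcript with one extra `speaker` field per chunk.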

@Vaibhavs10
Owner Author

@BBC-Esq - I'm opening a new issue to discuss this: #85. I think adding support for Nvidia NeMo might make sense and give people the option to choose different backends too.

@Vaibhavs10
Owner Author

(Closing this issue since the release already happened; we need another patch (to fix the current API) before considering the next steps.)

@Tortoise17

@Vaibhavs10 Is CPU usage now possible with the new release?

@Tortoise17

@Vaibhavs10 Fast CPU support was also mentioned as coming in this release, in addition to diarization and streaming.

@Vaibhavs10
Owner Author

Let's discuss that in a separate issue. (I'll open one)
