Add vocab feature #124

Merged
merged 12 commits into from
Jun 30, 2023

Conversation

chainyo
Contributor

@chainyo chainyo commented Jun 29, 2023

Add a simple way for users to pass extra vocabulary in the request payload. These words are concatenated into a single string and provided to the model as a prompt during inference.
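
As a rough illustration of the concatenation step described above, here is a minimal sketch. The helper name `build_vocab_prompt`, the payload keys `url` and `vocab`, and the `", "` separator are assumptions made for the example, not the names used in this PR.

```python
from typing import Iterable


def build_vocab_prompt(vocab: Iterable[str], separator: str = ", ") -> str:
    """Join user-supplied vocabulary into one prompt string.

    Hypothetical helper: trims whitespace, drops empty entries, and removes
    case-insensitive duplicates while keeping the first spelling seen.
    """
    cleaned = []
    seen = set()
    for term in vocab:
        term = term.strip()
        key = term.lower()
        if term and key not in seen:
            seen.add(key)
            cleaned.append(term)
    return separator.join(cleaned)


# Hypothetical request payload; only the "vocab" list matters here.
payload = {
    "url": "https://youtu.be/v2X51AVgl3o",
    "vocab": ["GitHub", "Python", "open-source", " README.md", "Visual Studio Code"],
}

prompt = build_vocab_prompt(payload["vocab"])
print(prompt)  # GitHub, Python, open-source, README.md, Visual Studio Code
```

The resulting string would then be handed to whatever prompt parameter the transcription model exposes.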

@chainyo chainyo added the api (Everything related to the API implementation) and transcription (Everything related to the transcription part) labels Jun 29, 2023
@chainyo chainyo linked an issue Jun 29, 2023 that may be closed by this pull request
Contributor

@aleksandr-smechov aleksandr-smechov left a comment

Did you test this with a few keywords? How were the results?

@chainyo
Contributor Author

chainyo commented Jun 29, 2023

> Did you test this with a few keywords? How were the results?

For example, on this YouTube video: https://youtu.be/v2X51AVgl3o

I added some vocab: ["GitHub", "Python", "open-source", "README.md", "Visual Studio Code"]

Here are the results with the vocab on (+) and off (-)

+ Right now, open-source contributions are being used as the new resume.
- Right now, open source contributions are being used as the new resume.

+ In this video, we will be discussing what is open-source contributions and how do you actually do that.
- In this video, we will be discussing what is open source contributions and how do you actually do that.

+ The next place where you can find these projects is GitHub.
- The next place where you can find these projects is GitHub.

+ For example, if you're really good at Python programming language and want to contribute.
- For example, if you are really good at Python programming language and want to contribute.

+ Now open this folder in your visual studio code and open the readme.md file.
- Now open this folder in your Visual Studio code and open the readme.md file.

It's not perfect: for example, neither README.md nor Visual Studio Code was handled correctly.

We could add an extra post-processing step.

@aleksandr-smechov
Contributor

aleksandr-smechov commented Jun 29, 2023

And do fuzzy match? That could go wrong in unforeseen ways. What about if you try prepending the terms with "Make sure these words are spelled correctly: "

@aleksandr-smechov aleksandr-smechov added the enhancement (New feature or request) label Jun 29, 2023
@chainyo
Contributor Author

chainyo commented Jun 30, 2023

> And do fuzzy match? That could go wrong in unforeseen ways. What about if you try prepending the terms with "Make sure these words are spelled correctly: "

It changes strictly nothing on the test sample I use.

@aleksandr-smechov
Contributor

OK, let's stick to the initial method of splitting by comma, since I believe the prompt is limited to a certain number of tokens.
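
One way to respect the token limit mentioned above is to stop adding terms once a budget is reached. The sketch below is only an approximation: `fit_vocab_to_budget` is a hypothetical helper, the budget value is made up, and counting whitespace-separated words is a crude stand-in for the model's real tokenizer, which would give different counts.

```python
from typing import List


def fit_vocab_to_budget(vocab: List[str], max_tokens: int, separator: str = ", ") -> str:
    """Keep terms in their original order until a rough token budget is used up.

    Assumption: each whitespace-separated word costs one token, plus one
    token per term for the separator. A real tokenizer would differ.
    """
    kept = []
    used = 0
    for term in vocab:
        cost = len(term.split()) + 1  # words in the term + separator
        if used + cost > max_tokens:
            break  # earlier terms take priority; drop the rest
        kept.append(term)
        used += cost
    return separator.join(kept)


vocab = ["GitHub", "Python", "open-source", "README.md", "Visual Studio Code"]
short_prompt = fit_vocab_to_budget(vocab, max_tokens=6)
print(short_prompt)  # GitHub, Python, open-source
```

Breaking at the first term that does not fit means the caller controls priority simply by ordering the list.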

@aleksandr-smechov
Contributor

There could be a simpler way to post-process than fuzzy matching: just look for the exact custom vocab terms in the output after lowercasing (and then after replacing symbols with spaces). For example:

VS Code in the custom vocab dictionary becomes vs code, and Open-Source becomes open-source; wherever one matches, you replace it with the original vocab item. That way you can lowercase the output and find potential matches without altering the number of characters in the original. Then you can replace symbols with spaces, so Open-Source in the custom vocab becomes open source, which you can search for in the lowercased output, again without altering the overall character length.

wdyt?
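
The exact-match idea above could be sketched roughly as follows. This is not code from the PR: `restore_vocab` and `_normalize` are hypothetical names, and the sketch assumes lowercasing does not change the text length (it skips a pass when that does not hold).

```python
import re
from typing import List


def _normalize(text: str, strip_symbols: bool) -> str:
    """Lowercase the text; optionally swap every symbol for a single space."""
    lowered = text.lower()
    if strip_symbols:
        lowered = re.sub(r"[^\w\s]|_", " ", lowered)
    return lowered


def restore_vocab(text: str, vocab: List[str]) -> str:
    """Replace case/symbol variants of vocab terms with their original spelling."""
    chars = list(text)
    for term in vocab:
        term = term.strip()
        # Pass 1: lowercase only. Pass 2: lowercase and symbols -> spaces.
        for strip_symbols in (False, True):
            needle = _normalize(term, strip_symbols)
            haystack = _normalize("".join(chars), strip_symbols)
            # Offsets are only valid if normalization preserved lengths.
            if not needle.strip() or len(needle) != len(term) or len(haystack) != len(chars):
                continue
            pattern = r"(?<!\w)" + re.escape(needle) + r"(?!\w)"
            for match in re.finditer(pattern, haystack):
                # Same-length replacement, so later offsets stay correct.
                chars[match.start():match.end()] = term
    return "".join(chars)


raw = "Now open this folder in your visual studio code and open the readme.md file."
fixed = restore_vocab(raw, ["README.md", "Visual Studio Code", "open-source"])
print(fixed)
# Now open this folder in your Visual Studio Code and open the README.md file.
```

Because every replacement has the same length as the span it overwrites, matches found in the normalized copy map directly onto the original string. The word-boundary lookarounds are an extra guard, not part of the original suggestion, to avoid rewriting a term that only appears inside a longer word.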

@chainyo chainyo merged commit 7fe0a7b into main Jun 30, 2023
@chainyo chainyo deleted the 123-use-prompting-for-custom-vocabulary branch June 30, 2023 15:55
chainyo pushed a commit that referenced this pull request Jul 5, 2023
* add multi gpus handling for transcription

* add model index for transcription and diarization

* add gpu_index for alignment models

* fix diarization gpu indexing

* multi gpu setup, with transcription errors

* Updated error payload for svix in cortex endpoint

* Extra languages performed poorly, commenting out tests that require "he" for this param

* update the download_audio function to avoid extension

* add audio_duration key in response + fix dual_channel bug

* fix tests and endpoint returns

* Add a catch for empty audio (#128)

* add a catch for empty audio file

* rename utterances -> response for coherence

* fix quality

* Add vocab feature (#124)

* add vocab feature

* fix youtube endpoint

* update the prompt sentence

* add vocab feature

* fix youtube endpoint

* update the prompt sentence

* Upgraded base docker image to nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu20.04

* add multi gpus handling for transcription

* add model index for transcription and diarization

* add gpu_index for alignment models

* fix diarization gpu indexing

* multi gpu setup, with transcription errors

* lower batch_size

* fix alignment device index

* revert transcribe service to no mapping

* update gpu service queue manager

* fix Exception returns for endpoints

* fixed dual_channel

* fix flake and darglint

* run black linter

* fix nemo config tests

* fix typo

---------

Co-authored-by: Aleks <aleks@wordcab.com>
Co-authored-by: Aleksandr Smechov <35517862+aleksandr-smechov@users.noreply.github.com>
Labels
api Everything related to the API implementation enhancement New feature or request transcription Everything related to the transcription part
Development

Successfully merging this pull request may close these issues.

Use prompting for custom vocabulary
2 participants