Tokenization endpoint #1649

Open
Tracked by #1126
benniekiss opened this issue Jan 26, 2024 · 1 comment
Labels
area/api area/backends enhancement New feature or request roadmap up for grabs Tickets that no-one is currently working on

Comments

@benniekiss

Is your feature request related to a problem? Please describe.

Many generative models are limited to a maximum number of tokens. In some workflows, prompts are generated dynamically to use as much of the context window as possible, which requires tokenizing the responses first to ensure that they will fit in the context.

Currently, this requires a local tokenizer, which prevents a workflow that relies entirely on the API.
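
To make the workflow concrete, here is a minimal sketch of the context-packing step. It is an illustration only: `pack_context` and `whitespace_token_count` are hypothetical names, and the whitespace counter is a stand-in for a real tokenizer (local today, possibly a remote endpoint later). Summing per-chunk counts is also an approximation, since a subword tokenizer may produce a slightly different count for the joined text.

```python
from typing import Callable, List


def pack_context(
    chunks: List[str],
    count_tokens: Callable[[str], int],
    max_tokens: int,
) -> List[str]:
    """Keep the most recent chunks whose combined token count fits max_tokens.

    Chunks are assumed to be ordered oldest to newest.
    """
    kept: List[str] = []
    used = 0
    # Walk from newest to oldest so the most recent context is kept first.
    for chunk in reversed(chunks):
        cost = count_tokens(chunk)
        if used + cost > max_tokens:
            break
        kept.append(chunk)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


def whitespace_token_count(text: str) -> int:
    # Stand-in for the example only; real models use subword tokenizers.
    return len(text.split())
```

With a tokenization endpoint, `count_tokens` could be a thin wrapper around an HTTP call instead of a locally installed tokenizer, and the rest of the logic would stay the same.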

Describe the solution you'd like

Backends like transformers and llama.cpp both offer tokenization methods that tokenize text without generating a response. Exposing these methods through a tokenization API endpoint would remove the need for local processing.
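
As a rough sketch of what such an endpoint might do on the server side, the handler below takes a JSON request body and returns the tokens as JSON. Everything here is an assumption made for illustration: the `content`, `tokens`, and `count` field names, the `handle_tokenize` function, and the toy tokenizer are not part of any existing API. In a real backend, the `tokenize` callable would wrap the backend's own method, for example a transformers tokenizer's `encode`.

```python
import json
from typing import Callable, List


def handle_tokenize(body: bytes, tokenize: Callable[[str], List[int]]) -> bytes:
    """Map a JSON request {"content": "..."} to a JSON response with token IDs.

    The field names are hypothetical; no generation step is involved.
    """
    request = json.loads(body)
    text = request["content"]
    tokens = tokenize(text)
    response = {"tokens": tokens, "count": len(tokens)}
    return json.dumps(response).encode("utf-8")


def toy_tokenize(text: str) -> List[int]:
    # Placeholder tokenizer: one "token ID" per word, equal to the word length.
    return [len(word) for word in text.split()]
```

Returning the count alongside the token IDs would let clients that only need to check a length skip processing the full list.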

Describe alternatives you've considered

Additional context

@benniekiss benniekiss added the enhancement New feature or request label Jan 26, 2024
@mudler mudler added area/api area/backends roadmap up for grabs Tickets that no-one is currently working on labels Jan 26, 2024
@mudler
Owner

mudler commented Jan 26, 2024

Good point. It should indeed be relatively easy to expose.
