Rate limiting options #2

asg017 · 2024-06-04T20:28:42Z

Different providers have different limits, so it might be hard to coordinate. Will also need settings to disable/alter limits for self-hosted options.

simonw · 2024-07-25T20:50:04Z

Some providers even return HTTP headers that inform the client of the current rate limit remaining and when it would be reset, so it would be possible to automatically throttle requests to those providers (sleep automatically until the next "reset" point).

A call to the OpenAI embeddings API for example returns this:

x-ratelimit-limit-requests: 5000
x-ratelimit-limit-tokens: 5000000
x-ratelimit-remaining-requests: 4999
x-ratelimit-remaining-tokens: 4999990
x-ratelimit-reset-requests: 12ms
x-ratelimit-reset-tokens: 0s
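
As a rough sketch of the automatic throttling idea, the headers above could be turned into a sleep duration along these lines. This is only an illustration, not code from this project: the function names (`parse_reset`, `seconds_to_wait`, `throttle`) are hypothetical, the headers are assumed to arrive as a dict with lowercase keys, and the reset values are assumed to be durations built from `ms`, `s`, `m` and `h` parts (the sample above only shows `12ms` and `0s`).

```python
import re
import time

# One "<number><unit>" piece of a duration such as "12ms", "0s" or "1m30s".
# Assumption: only the units ms, s, m and h appear.
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h)")
_WHOLE_DURATION = re.compile(r"(?:\d+(?:\.\d+)?(?:ms|s|m|h))+")
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_reset(value: str) -> float:
    """Convert a reset duration like "12ms" or "1m30s" to seconds."""
    value = value.strip()
    if not _WHOLE_DURATION.fullmatch(value):
        raise ValueError(f"unrecognised duration: {value!r}")
    parts = _DURATION_PART.findall(value)
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)


def seconds_to_wait(headers: dict) -> float:
    """Return how long to pause before the next request (0.0 if no pause is needed)."""
    wait = 0.0
    for kind in ("requests", "tokens"):
        remaining = headers.get(f"x-ratelimit-remaining-{kind}")
        reset = headers.get(f"x-ratelimit-reset-{kind}")
        if remaining is None or reset is None:
            continue  # this provider did not send that pair of headers
        if int(remaining) <= 0:
            # Budget exhausted: wait for whichever limit resets last.
            wait = max(wait, parse_reset(reset))
    return wait


def throttle(headers: dict) -> float:
    """Sleep until the next reset point if a limit is exhausted; return the time slept."""
    wait = seconds_to_wait(headers)
    if wait > 0:
        time.sleep(wait)
    return wait
```

With the sample response above, both `remaining` values are still positive, so `seconds_to_wait` returns `0.0` and no sleep happens. Providers that send no such headers simply fall through to `0.0`, which would leave room for a separate, configurable limit for them.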

asg017 · 2024-07-25T20:53:45Z

Oooh interesting, that would be way better than trying to wrap some Rust rate limiter around every client. Will scour through all these clients and see which ones support that besides OpenAI.
