Bug with distributed rate limit syncing #335

Closed
GUI opened this issue Apr 28, 2016 · 1 comment
Comments

@GUI
Member

GUI commented Apr 28, 2016

We have a process where rate limit information gets synced between our multiple servers. This lets us store the rate limit counters locally in memory on each server (for performance reasons), while still keeping the counts correct across a cluster of separate machines (since a single user's traffic might be spread across the individual servers).
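To sketch the general shape of this (illustrative only, with made-up dict and function names, not the actual api-umbrella code): each server counts hits in a local OpenResty shared dict, and a background timer job periodically pulls the counts other servers have written to the distributed store back into that local dict:

```lua
-- Illustrative sketch only. Assumes a shared dict declared in nginx.conf,
-- e.g.: lua_shared_dict rate_limit_counters 10m;
local counters = ngx.shared.rate_limit_counters

-- Fast path: count a hit entirely in this server's memory.
local function increment_local(key, window_ttl)
  local new_count = counters:incr(key, 1)
  if not new_count then
    -- First hit for this key on this server: create it with the rate limit
    -- window as its TTL.
    counters:set(key, 1, window_ttl)
    new_count = 1
  end
  return new_count
end

-- Background job (run via ngx.timer): pull records other servers have
-- written to the distributed database into the local dict. fetch_records
-- is a stand-in for querying that database.
local function pull_distributed_counts(fetch_records)
  for _, record in ipairs(fetch_records()) do
    -- This TTL calculation is where a negative value can sneak in if the
    -- distributed record's expiration wasn't stored correctly.
    local ttl = record.expires_at - ngx.now()
    counters:set(record.key, record.count, ttl)
  end
end
```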

I recently noticed quite a few errors like this being thrown from this sync process:

2016-04-28T16:11:15.46838 2016/04/28 16:11:15 [error] 3550#0: [lua] interval_lock.lua:41: timeout_exec(): timeout exec pcall failed: ...pi-umbrella/proxy/jobs/distributed_rate_limit_puller.lua:52: bad "exptime" argument, context: ngx.timer

After digging around, I found that this case can crop up for longer-duration rate limits (for example, on APIs that have per-day limits). The culprit was that our distributed rate limit records didn't have the correct TTL settings, which led to a negative TTL calculation when it came time to populate the local in-memory copy of the rate limit information.

This wasn't a fatal error, but it could have led to some odd rate limit counts for these longer-duration rate limits, depending on which server you hit.
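To make the failure mode concrete (again with illustrative names): ngx.shared.DICT:set() raises a Lua error when given a negative exptime, rather than returning nil plus an error string, which is why it surfaced through the pcall in interval_lock.lua:

```lua
local counters = ngx.shared.rate_limit_counters  -- illustrative dict name

-- A per-day record whose stored expiration has already passed:
local expires_at = ngx.now() - 30    -- 30 seconds in the past
local ttl = expires_at - ngx.now()   -- => roughly -30

local ok, err = pcall(counters.set, counters, "some_api_key:daily", 42, ttl)
if not ok then
  -- err reads something like: bad "exptime" argument
  ngx.log(ngx.ERR, "failed to populate local rate limit counter: ", err)
end
```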

@GUI
Member Author

GUI commented Apr 28, 2016

This is fixed by NREL/api-umbrella@7ff8625

We now properly set the TTL when pushing data into the distributed rate limit database, which should fix this. We also do a better job of catching and logging this error if it ever happens again (hopefully it won't).
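Roughly, the read-side half of that looks like the following (illustrative names only, not the code from the commit above): derive the TTL from the record's expiration, skip anything already expired, and log a failed shared dict write instead of letting it kill the timer. The push-side half is simply making sure each record written to the distributed database carries an expiration derived from its rate limit duration.

```lua
-- Illustrative sketch, not the actual fix.
local function set_local_count(counters, key, count, expires_at)
  local ttl = expires_at - ngx.now()
  if ttl <= 0 then
    -- Record already expired; don't pass a negative exptime to the dict.
    ngx.log(ngx.WARN, "skipping expired rate limit record: ", key)
    return
  end

  local ok, err = pcall(counters.set, counters, key, count, ttl)
  if not ok then
    ngx.log(ngx.ERR, "failed to store local rate limit count for ", key, ": ", err)
  end
end
```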
