rqscheduler can constantly attempt to register itself. #62
Conversation
Allow rqscheduler to keep attempting to register itself periodically.
Would introducing a … Similar to the RQ worker, running … This means you can schedule N hosts to run …
@selwin What would the behavior be with …? FWIW, resque-scheduler (the analog of rq-scheduler in the Ruby world) allows you to run multiple schedulers and handles failover automatically: https://github.com/resque/resque-scheduler#redundancy-and-fail-over
I think this is a good approach, and it is very similar to @SirScott's patch: all processes continually try to acquire a 'master' lock until one succeeds, and TTLs allow failover to happen when a scheduler process dies unexpectedly.
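The lock-with-TTL pattern described above can be sketched in a few lines. This is an illustrative sketch, not rq-scheduler's actual implementation: the key name, function names, and the `FakeRedis` stub are all hypothetical. A real deployment would use redis-py's `set(key, value, nx=True, ex=ttl)`, which has the same semantics the stub models here.

```python
import time
import uuid

class FakeRedis:
    """Minimal in-memory stand-in for redis-py's SET NX EX semantics,
    so this sketch runs without a live Redis server."""
    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, nx=False, ex=None):
        now = time.monotonic()
        entry = self._data.get(key)
        if entry is not None and entry[1] is not None and entry[1] <= now:
            entry = None  # the previous holder's TTL expired
        if nx and entry is not None:
            return None   # key still held: acquisition fails
        expires_at = now + ex if ex is not None else None
        self._data[key] = (value, expires_at)
        return True

def try_acquire_master_lock(conn, key, holder_id, ttl):
    """Attempt to become the active scheduler; returns True on success.
    The TTL means the lock frees itself if the holder dies without
    cleaning up, so a standby process can take over."""
    return bool(conn.set(key, holder_id, nx=True, ex=ttl))

conn = FakeRedis()
me, rival = str(uuid.uuid4()), str(uuid.uuid4())
assert try_acquire_master_lock(conn, "rq:scheduler:lock", me, ttl=30)
# A second scheduler cannot acquire while the lock is held:
assert not try_acquire_master_lock(conn, "rq:scheduler:lock", rival, ttl=30)
```

Because every candidate retries in a loop, whichever process wins the `SET NX` race becomes the scheduler, and the others remain warm standbys.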
Sorry, I forgot to reply to this issue. To be honest, I don't think … Yes, two schedulers running at the same time would still create an error. But what I like about the … approach is … However, I've also been thinking about the approach @SirScott suggested and am not opposed to it. Can we have a more descriptive name than …?
Could it be that you just enqueue jobs that do the schedule poll? That way the loop is shared across the rqworker cluster.
@SirScott thanks for writing this PR, please see my comment here: #70 (comment)
Allow rqscheduler to keep attempting to register itself periodically.
My use case is that I have N identical hosts, all running rqworker and rqscheduler processes under the watchful eye of supervisord. As supervisord doesn't support an 'unlimited' value for the 'startretries' program configuration option, the rqscheduler process will eventually be moved into supervisord's failed state. Thus, if the host that was successfully running rqscheduler goes down, none of the other existing hosts can automatically take its place.
Similarly, when performing rolling updates, the new hosts come online and try to launch rqscheduler. This fails because an old host is already running it. If the rollout is slow, the new hosts may have put rqscheduler into a failed state before the old host is rolled out, resulting in a deployment where rqscheduler is not running at all.
This patch allows rqscheduler itself to keep retrying its registration process. Backwards compatibility is preserved by aborting by default.
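The retry behaviour the patch describes can be sketched as follows. The function and parameter names here are illustrative, not rq-scheduler's actual API: with `retry=False` the old behaviour is kept (abort if another scheduler is registered), while `retry=True` keeps retrying periodically until registration succeeds, e.g. once the old host's registration key expires.

```python
import time

def register_scheduler(register_fn, retry=False, retry_interval=5.0, sleep=time.sleep):
    """Keep attempting to register as the active scheduler.

    register_fn: callable returning True once no other scheduler
    holds the registration key. retry=False reproduces the original
    abort-on-conflict behaviour; retry=True loops until success.
    """
    while True:
        if register_fn():
            return True
        if not retry:
            raise RuntimeError("there is already an active rqscheduler")
        sleep(retry_interval)  # wait, then try registration again

# Example: registration fails twice (another scheduler is alive),
# then succeeds once its registration expires.
attempts = {"n": 0}
def fake_register():
    attempts["n"] += 1
    return attempts["n"] >= 3

assert register_scheduler(fake_register, retry=True, sleep=lambda _: None)
assert attempts["n"] == 3
```

With this loop in place, supervisord's finite `startretries` no longer matters: the process stays alive and polls on its own, so any standby host can take over when the active scheduler disappears.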