rqscheduler can constantly attempt to register itself. #62

ScottSturdivant · 2014-11-03T18:20:45Z

Allow rqscheduler to keep attempting to register itself periodically.

My use case is that I have N identical hosts all running rqworker and rqscheduler processes under the watchful eye of supervisor. As supervisord doesn't support an 'unlimited' value for the 'startretries' program configuration option, eventually the rqscheduler process will be moved into supervisor's failed state. Thus, if the host that was successfully running rqscheduler goes down, none of the other existing hosts can automatically take its place.

Similarly, when performing rolling updates, the new hosts come online and try to launch rqscheduler. This fails because an old host is already executing it. If the rollout is slow, it's possible that the new hosts will again have put rqscheduler into a failed state before the old host is rolled out. This will result in a new deployment where rqscheduler is not running.

This patch will allow rqscheduler itself to keep retrying its registration process. Backwards compatibility is preserved by aborting by default.
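To make the intended behaviour concrete, here is a minimal sketch of a retry loop around registration. It is not the code in this patch: `register_with_retry`, `AlreadyRegistered`, and the `retry` and `interval` parameters are hypothetical names chosen for illustration, and `register` stands in for whatever call claims the scheduler slot.

```python
import time


class AlreadyRegistered(Exception):
    """Hypothetical error raised when another scheduler is already active."""


def register_with_retry(register, retry=False, interval=60.0, sleep=time.sleep):
    """Call register() until it succeeds and return the number of attempts.

    With retry=False (the default) the first failure propagates, which
    mirrors the existing abort-on-conflict behaviour.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            register()
            return attempts
        except AlreadyRegistered:
            if not retry:
                raise
            # Another scheduler holds the slot; wait and try again.
            sleep(interval)
```

Because the default is to re-raise, existing deployments would see no change unless they opt in to retrying.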


selwin · 2014-11-23T05:48:20Z

Would introducing a --burst option be better for this use case?

Similar to RQ worker, running rqscheduler --burst would schedule all jobs that need to be scheduled and quit on completion.

This means you can use cron to run rqscheduler --burst every minute on N hosts, and it will effectively retry indefinitely.

lost-theory · 2014-11-25T19:07:44Z

@selwin What would the behavior be with --burst when two schedulers run at the same time? Seems like you'd have the same problem, one process would throw an error. And it would introduce a dependency on cron (one of the reasons people use a system like rq & rq-scheduler is to get away from cron 😄).

FWIW resque-scheduler (the analog of rq-scheduler in the ruby world) allows you to run multiple schedulers and it handles failover automatically:

https://github.com/resque/resque-scheduler#redundancy-and-fail-over

You may want to have resque-scheduler running on multiple machines for redundancy. Electing a master and failover is built in and default. Simply run resque-scheduler on as many machines as you want pointing to the same redis instance and schedule. The scheduler processes will use redis to elect a master process and detect failover when the master dies. Precautions are taken to prevent jobs from potentially being queued twice during failover even when the clocks of the scheduler machines are slightly out of sync (or load affects scheduled job firing time). If you want the gory details, look at Resque::Scheduler::Locking.

I think this is a good approach, and is very similar to @SirScott's patch (all processes continually try to acquire a 'master' lock until one succeeds, and TTLs allow failover to happen when a scheduler process dies unexpectedly).
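A rough sketch of that lock-with-TTL idea, under stated assumptions: `InMemoryStore` is a hypothetical stand-in for Redis so the example runs on its own (against a real Redis the acquire step would typically be a SET with the NX and EX options), and `SchedulerCandidate`, the key name, and the TTL value are illustrative only. This is not how resque-scheduler or rq-scheduler implement it.

```python
import time
import uuid


class InMemoryStore:
    """Hypothetical stand-in for a Redis-like store with expiring keys."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def _purge(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] <= self._clock():
            del self._data[key]

    def set_if_absent(self, key, value, ttl):
        """Store value only if key is missing or expired."""
        self._purge(key)
        if key in self._data:
            return False
        self._data[key] = (value, self._clock() + ttl)
        return True

    def refresh(self, key, value, ttl):
        """Extend the TTL only if key is still held by value."""
        self._purge(key)
        entry = self._data.get(key)
        if entry is None or entry[0] != value:
            return False
        self._data[key] = (value, self._clock() + ttl)
        return True


class SchedulerCandidate:
    """One scheduler process competing for the 'master' lock."""

    def __init__(self, store, key="scheduler:master", ttl=30.0):
        self.store = store
        self.key = key
        self.ttl = ttl
        self.token = uuid.uuid4().hex  # identifies this process

    def tick(self):
        """Return True if this process holds the lock after this tick."""
        if self.store.refresh(self.key, self.token, self.ttl):
            return True
        return self.store.set_if_absent(self.key, self.token, self.ttl)
```

Each process would call `tick()` periodically, at an interval shorter than the TTL, and only enqueue jobs while it returns True. If the master dies it stops refreshing, the key expires, and the next candidate to tick takes over.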

selwin · 2015-02-18T01:54:46Z

Sorry, I forgot to reply to this issue.

To be honest, I don't think of rq-scheduler as a direct replacement for cron, as I think cron is one of the most battle-tested utilities out there. rq-scheduler is meant to be something that lets you schedule jobs programmatically.

Yes, two schedulers running at the same time would still create an error. But what I like about the --burst approach is that you won't get two active scheduler processes at the same time (the one that errors out would just die).

However, I've also been thinking about the approach @SirScott suggested and am not opposed to it. Can we have a more descriptive name than --retry though?

jmmills · 2015-02-20T01:53:44Z

Could it be that you just enqueue jobs that do the schedule poll? That way the loop is shared across the rqworker cluster?

selwin · 2015-08-19T23:46:24Z

@SirScott thanks for writing this PR, please see my comment here: #70 (comment)

rqscheduler can constantly attempt to register itself.

f18a144

Allow rqscheduler to keep attempting to register itself periodically.

selwin mentioned this pull request Nov 24, 2014

Need better documentation how to start Scheduler #51

Open

lost-theory mentioned this pull request Feb 17, 2015

Architecture question: Multiple rqscheduler processes? #70

Open

jmmills mentioned this pull request Aug 18, 2015

Heroku / dokku and "ValueError: There's already an active RQ scheduler" #93

Closed

selwin closed this Aug 19, 2015

sandlerben added a commit to hack4impact-upenn/idle-free-philly that referenced this pull request Jan 13, 2016

Added workaround for bug (rq/rq-scheduler#62)

ce9c7fc

sandlerben added a commit to hack4impact-upenn/idle-free-philly that referenced this pull request Jan 13, 2016

Added workaround for bug (rq/rq-scheduler#62)

7ce0bb0

sandlerben added a commit to hack4impact-upenn/idle-free-philly that referenced this pull request Jan 21, 2016

Added workaround for bug (rq/rq-scheduler#62)

592cd32

sandlerben added a commit to hack4impact/flask-base that referenced this pull request Jan 29, 2016

Added workaround for bug (rq/rq-scheduler#62)

3b5ef0f

sandlerben added a commit to hack4impact/flask-base that referenced this pull request Jan 29, 2016

Added workaround for bug (rq/rq-scheduler#62)

97e47a4

sandlerben added a commit to hack4impact-upenn/maps4all that referenced this pull request Nov 14, 2016

Added workaround for bug (rq/rq-scheduler#62)

89364f7

rrelaxx pushed a commit to rrelaxx/xrm-ui that referenced this pull request Aug 4, 2022

Added workaround for bug (rq/rq-scheduler#62)

e1ced7f
