
Architecture question: Multiple rqscheduler processes? #70

Open
jmmills opened this issue Feb 17, 2015 · 6 comments

Comments

@jmmills

jmmills commented Feb 17, 2015

I was wondering if this project has some built-in ability to deal with multiple scheduler processes?

In the case of RQ workers distributed across multiple hosts with clustered Redis, one would have a pretty fault tolerant system, for example if a whole machine goes down - jobs still get processed.

If, however, some of these jobs are scheduled jobs and the particular host running the scheduler goes down... scheduled jobs would then not get run until another scheduler was started (jobs are recovered at least).

But what about running multiple schedulers, for a highly available system?

@lost-theory
Contributor

There is a similar issue open here: #62

The way I solved this was by making my own rqscheduler script (I already need to subclass rq_scheduler.Scheduler for a few things) that tries to register itself in a loop, like this:

import time

def main():
    # FooScheduler is my subclass of rq_scheduler.Scheduler
    sched = FooScheduler(connection=get_redis_client(), interval=SCHEDULER_INTERVAL_SECONDS)
    while True:
        try:
            sched.run()
            break
        except ValueError as exc:
            if str(exc) == "There's already an active RQ scheduler":
                sched.log.debug(
                    "An RQ scheduler instance is already running. Retrying in %d seconds.",
                    SCHEDULER_INTERVAL_SECONDS,
                )
                time.sleep(SCHEDULER_INTERVAL_SECONDS)
            else:
                raise

if __name__ == "__main__":
    main()

Then I run my rqscheduler script instead of the built-in one. This way I can do rolling restarts of my rq-scheduler processes; the new one automatically waits until the old one dies.

It would be nice to see this behavior built into rq-scheduler, as I explained in this comment: #62 (comment)

@selwin
Collaborator

selwin commented Feb 20, 2015

Yes, let's build this into rq-scheduler. See my comment here: #62 (comment)

@selwin
Collaborator

selwin commented Aug 19, 2015

@darkpixel has an interesting suggestion: allow multiple schedulers to run, but each scheduler has to acquire a lock before scheduling jobs. I think this is a good solution for people who want to run multiple scheduler processes for reliability purposes.

If someone can make a pull request for this, I would be happy to accept this :)
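A rough sketch of that suggestion using redis-py (the key name, TTL, and function name here are hypothetical illustrations, not rq-scheduler's actual implementation): each scheduler attempts an atomic SET with the NX and EX options, and only the winner goes on to schedule jobs.

```python
import os
import socket

# Hypothetical key name and TTL, for illustration only.
SCHEDULER_LOCK_KEY = "rq:scheduler:lock"
LOCK_TTL_SECONDS = 30

def try_acquire_scheduler_lock(connection):
    """Try to become the active scheduler.

    SET with nx=True only succeeds if the key does not exist yet, and
    ex=LOCK_TTL_SECONDS makes the lock self-expiring, so a crashed
    holder cannot block the standby schedulers forever. Returns True
    if this process now holds the lock.
    """
    lock_value = "%s:%d" % (socket.gethostname(), os.getpid())
    # redis-py: set() returns True on success, None if the key already exists.
    return bool(connection.set(
        SCHEDULER_LOCK_KEY, lock_value, nx=True, ex=LOCK_TTL_SECONDS))
```

A scheduler that gets False back would sleep for its polling interval and retry, which gives exactly the standby behavior described earlier in the thread.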

@jmmills
Author

jmmills commented Aug 20, 2015

If a Redis lock is used, how do we ensure that a crashed scheduler doesn't leave a stale lock behind? Maybe a simple keep-alive via Redis pub/sub?


@selwin
Collaborator

selwin commented Aug 20, 2015

We can use "redis.expire(lock_key, 30)" so that if the scheduler crashes, the lock will still be expired by Redis :)


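That expiry idea amounts to a heartbeat: the active scheduler re-arms the key's TTL once per cycle, and if it crashes the refreshes stop and Redis drops the lock on its own. A minimal sketch with redis-py (the key name and TTL are hypothetical, not rq-scheduler's actual values):

```python
# Hypothetical key name and TTL, for illustration only.
SCHEDULER_LOCK_KEY = "rq:scheduler:lock"
LOCK_TTL_SECONDS = 30

def refresh_scheduler_lock(connection):
    """Re-arm the lock's expiry while this scheduler is alive.

    Called by the lock holder once per scheduling cycle. If the
    process dies, the refreshes stop and Redis expires the key after
    LOCK_TTL_SECONDS, so a standby scheduler can take over.
    """
    # redis-py: expire() returns True if the key exists, False otherwise.
    return bool(connection.expire(SCHEDULER_LOCK_KEY, LOCK_TTL_SECONDS))
```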

@jmmills
Author

jmmills commented Aug 20, 2015

Ah, that works. A dead man's switch.


cheungpat added a commit to cheungpat/rq-scheduler that referenced this issue Dec 12, 2015
Schedulers have to acquire a lock before they can schedule jobs. The lock automatically expires in case a scheduler is terminated unexpectedly. Schedulers that cannot acquire the lock will sleep for the duration of the polling interval before retrying.

refs rq#70
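The acquire-or-sleep loop that commit describes could look roughly like this (all names are hypothetical; `try_acquire` stands in for whatever lock attempt the scheduler makes, e.g. a Redis SETNX-style call):

```python
import time

POLL_INTERVAL_SECONDS = 60  # stand-in for the scheduler's polling interval

def run_when_lock_acquired(scheduler, connection, try_acquire):
    """Standby loop: sleep until the lock is free, then run.

    try_acquire(connection) should return True once this process holds
    the scheduler lock; until then we sleep for one polling interval
    per attempt, as the commit message describes.
    """
    while not try_acquire(connection):
        time.sleep(POLL_INTERVAL_SECONDS)
    scheduler.run()
```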
aquamatt added a commit to UKTV/django_rq that referenced this issue Mar 10, 2017