HA - Automatic Failover #67

eveiga · 2013-02-13T17:37:05Z

Hi! First of all, thanks for the proxy, it has been really helpful :)

I need a decent solution for automatic failover and have already found that twemproxy doesn't support it. Any thoughts or ideas?

I was thinking of an external process that would leverage redis-sentinel and, on a master-switch event, update the IP address in nutcracker.conf and restart the service.

manjuraj · 2013-02-13T17:56:56Z

Glad you liked it @eveiga

I believe using the external process the way you described makes sense. In fact, you can have two twemproxy processes running: one routing traffic to all the masters and the other to all the slaves. On a failover event, you switch from one twemproxy to the other.

bmatheny · 2013-02-13T21:49:02Z

@manjuraj @eveiga that's what we do for memcache when there are events like a total failure (external process). Works quite well.

eveiga · 2013-02-13T21:57:43Z

@manjuraj I've already thought about that solution. Can I use the slaves cluster to perform read operations? Or won't the hashes match the ones for the master cluster?

@bmatheny are you using my suggestion or manjuraj's?

bmatheny · 2013-02-13T22:51:59Z

@eveiga the one you recommended. When the topology needs to change the config is updated by an external process and twem gets restarted.

eveiga · 2013-02-13T23:17:11Z

@bmatheny sorry for the boring questions :) Don't you experience a window of downtime during that restart? If so, how do you cope with it?

BTW, are you using a pool of twemproxy instances with just slaves for reading?

eveiga · 2013-02-13T23:20:36Z

Hmm, I forgot you are using it with memcache, so I don't know if the last question fits your use case!

bmatheny · 2013-02-13T23:30:30Z

We do see a short burst of errors. The error type is detected by the app and retried, so we generally don't 'lose' writes, and reads will fall back to the DB.
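The retry-and-fall-back behaviour described above could look roughly like the sketch below. It is not bmatheny's actual code: `CacheUnavailable`, `with_retry`, and `read_through` are hypothetical names, and the exception stands in for whatever error the client library raises while the proxy restarts.

```python
import time


class CacheUnavailable(Exception):
    """Hypothetical error a cache client raises while the proxy restarts."""


def with_retry(op, attempts=3, delay=0.05, sleep=time.sleep):
    """Run op(), retrying on CacheUnavailable; re-raise after the last attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return op()
        except CacheUnavailable:
            if attempt == attempts - 1:
                raise
            sleep(delay)  # give the proxy a moment to come back


def read_through(key, cache_get, db_get):
    """Read from the cache; fall back to the database if it is unavailable."""
    try:
        return cache_get(key)
    except CacheUnavailable:
        return db_get(key)
```

Writes would go through `with_retry`, so a short restart window does not lose them, while reads use `read_through` and hit the database during the window.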

eveiga · 2013-02-14T10:42:24Z

@bmatheny Thanks for the tips, I'll go with that solution!

matschaffer · 2013-02-26T13:34:42Z

@eveiga thanks for the redis-sentinel reminder. So far it looks like this will work well.

Has anyone built the bits to update twemproxy when redis-sentinel finishes a failover?

@manjuraj would you recommend anything more graceful than simply rewriting the twemproxy config and restarting it?

eveiga · 2013-02-26T13:59:46Z

@matschaffer Yes, I've developed a simple service that attaches a handler to the "master-switch" event emitted by redis-sentinel, updates twemproxy.conf with the new info and restarts the service. So far so good with the tests; I'll put it in production soon.
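For readers who want a feel for the config-rewrite step, here is a minimal sketch. It is not the node.js service described above. It assumes the handler receives the payload Sentinel publishes on its `+switch-master` channel (`<master-name> <old-ip> <old-port> <new-ip> <new-port>`) and that the twemproxy config lists servers as unquoted `- host:port:weight name` entries. The function names are hypothetical, and writing the file and restarting twemproxy are left to the caller.

```python
def parse_switch_master(payload):
    """Parse a Sentinel +switch-master payload into old and new addresses."""
    parts = payload.split()
    if len(parts) != 5:
        raise ValueError(f"unexpected +switch-master payload: {payload!r}")
    name, old_ip, old_port, new_ip, new_port = parts
    return {
        "name": name,
        "old": f"{old_ip}:{old_port}",
        "new": f"{new_ip}:{new_port}",
    }


def rewrite_servers(config_text, old, new):
    """Swap old host:port for new in matching `- host:port:weight` entries.

    Only unquoted server list entries are touched; other lines, such as
    `listen:`, are copied through unchanged.
    """
    out = []
    for line in config_text.splitlines(keepends=True):
        if line.lstrip().startswith("- " + old + ":"):
            line = line.replace(old, new, 1)
        out.append(line)
    return "".join(out)
```

A handler would call `parse_switch_master` on each message, pass the result's `old` and `new` values to `rewrite_servers`, write the returned text back to the config file, and then restart the proxy.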

matschaffer · 2013-02-26T14:40:49Z

@eveiga any chance of sharing what you've come up with?

eveiga · 2013-02-26T14:46:51Z

No problem. It's in node.js and fairly tightly coupled to our setup, still want it?

matschaffer · 2013-02-26T14:49:46Z

Sure! Even just a gist is great. Always nicer to have some collaboration. :)

eveiga · 2013-02-26T15:01:26Z

https://gist.github.com/eveiga/5039007

As I said, it's tightly coupled to our setup (init script paths, mails, etc.) and could be a lot more configurable, but it can give you a starting point.

Suggestions are welcome!

manjuraj · 2013-02-26T16:05:23Z

If you guys can make this generic enough, we can check it into the scripts/ folder of twemproxy.

matschaffer · 2013-02-28T19:02:30Z

@eveiga how's yours panning out? Over here it seems to work if I'm careful about the startup order. But if the agent comes up before the sentinel, the agent seems to deadlock after a certain number of retries. Have you run into that, or are you controlling start order more carefully?

matschaffer · 2013-02-28T19:03:59Z

@eveiga btw, I have this up at https://github.com/matschaffer/redis_twemproxy_agent as something I can pack with npm and get some rough testing around. I took out the email notifier though since we'll probably want to notify via other means.

eveiga · 2013-03-01T17:59:36Z

Hey @matschaffer, I've assumed that the sentinel was already running, but indeed we should have some kind of reaction to a failed startup. Thanks for packing this in a new repo, I'll take a look at it during the weekend and try to contribute!

matschaffer · 2013-03-01T19:05:53Z

No problem! After further testing I'm not sure that's the case (with the startup order issue). Not sure what caused the lack of reconfiguration on my first test but I haven't been able to replicate it. My latest commit logs a lot to stdout in hopes that I can tell what's up if it happens again.

matschaffer · 2013-03-14T21:28:45Z

@eveiga how's this working for you? For me it was working great until I added a second sentinel. Seems like a single sentinel may or may not broadcast the failover messages. Still investigating though.

matschaffer · 2013-03-15T14:42:21Z

After some investigation it looks like it's not just the multiple sentinels but rather multiple masters failing at the same time. The agent doesn't seem to reliably get all the switch-master messages :(

matschaffer · 2013-03-15T15:22:30Z

Swapping node-sentinel for direct use of node-redis seems to help. Gonna do another test now.

eveiga · 2013-03-19T18:49:11Z

Hey @matschaffer! Sorry for the absence, I'm back on this! Thanks for the bumps on it, I'll take a look and update the production code.

eveiga · 2013-03-19T18:53:53Z

BTW: I never had more than one sentinel, so I've never run into your problem.

idning · 2014-03-21T11:01:14Z

hi, all, try https://github.com/idning/redis-mgr please

nidhhoggr · 2017-01-07T17:30:57Z

If anyone is interested I started a C implementation of https://github.com/matschaffer/redis_twemproxy_agent at https://github.com/nidhhoggr/twemproxy_sentinel

virendarkmr · 2018-05-17T10:05:16Z

Hi, I am stuck with the same issue. I have 2 different redis clusters, each with master, slave, slave, and sentinel is handling failover. The redis twemproxy agent works fine when I give a single sentinel IP in cli.js.
How can I handle failover for two clusters?

douglaslps · 2018-10-08T16:33:30Z

> hi, all, try https://github.com/idning/redis-mgr please

What happened with that? I'm getting page not found.

matschaffer mentioned this issue Mar 15, 2013

Not catching all switch-master messages JvrBaena/node-sentinel#1

Closed
