Connecting to Azure Redis Cluster #1074
Comments
The above way of connecting to a Redis cluster looks legitimate. I will try to improve the error messages. The failure might be caused by ACL. |
I'm trying to connect to a Redis6 cluster, but it doesn't support ACL. Is there another way around this? |
Thanks for the extra logging! Very helpful. The actual error I'm getting is I'm not actually passing a username, only a |
It might be a bug in the cluster-mode client. I'll look into the differences in implementation and behavior between stand-alone mode and cluster mode.
Lines 460 to 477 in 13c7a8e
redis-rb/lib/redis/cluster/option.rb Lines 63 to 69 in 13c7a8e
redis-rb/test/cluster_client_options_test.rb Lines 9 to 51 in 13c7a8e
|
I was looking at the cluster_client_options_test.rb file on my own.
These are the two types that make sense. The former (without ':') returns `Redis client could not fetch cluster information: NOAUTH Authentication required. (Redis::Cluster::InitialSetupError)' The latter returns I'm not aware of a username I'd even attempt to pair with the password. Someone having a similar issue with the JS client suggested that |
I haven't been able to get the gem to build locally so I can play around with some of the tests myself, but I'll keep trying that |
Trying to pass in a hash instead of a format string just results in a Connection timed out. (Trying to pass the same hash as a single instance does connect fine)
|
I reproduced it on my local machine. It seems that special characters are escaped twice. This is a client bug in cluster mode. I'll fix it later.
Success in stand-alone mode with a url option:
Success in cluster mode with a password option as plain text:
Failure in cluster mode with a URI string:
Server configuration:
$ diff -u makefile /tmp/redis-rb-makefile
--- makefile 2022-02-20 19:02:23.660431802 +0900
+++ /tmp/redis-rb-makefile 2022-02-20 18:16:35.986257109 +0900
@@ -18,6 +18,7 @@
CLUSTER_PID_PATHS := $(addprefix ${TMP}/redis,$(addsuffix .pid,${CLUSTER_PORTS}))
CLUSTER_CONF_PATHS := $(addprefix ${TMP}/nodes,$(addsuffix .conf,${CLUSTER_PORTS}))
CLUSTER_ADDRS := $(addprefix 127.0.0.1:,${CLUSTER_PORTS})
+PASSWORD := !&<123-abc>
define kill-redis
(ls $1 > /dev/null 2>&1 && kill $$(cat $1) && rm -f $1) || true
@@ -43,21 +44,24 @@
start: ${BINARY}
@${BINARY}\
- --daemonize yes\
- --pidfile ${PID_PATH}\
- --port ${PORT}\
- --unixsocket ${SOCKET_PATH}
+ --daemonize yes\
+ --pidfile ${PID_PATH}\
+ --port ${PORT}\
+ --unixsocket ${SOCKET_PATH}\
+ --requirepass '${PASSWORD}'
stop_slave:
@$(call kill-redis,${SLAVE_PID_PATH})
start_slave: ${BINARY}
@${BINARY}\
- --daemonize yes\
- --pidfile ${SLAVE_PID_PATH}\
- --port ${SLAVE_PORT}\
- --unixsocket ${SLAVE_SOCKET_PATH}\
- --slaveof 127.0.0.1 ${PORT}
+ --daemonize yes\
+ --pidfile ${SLAVE_PID_PATH}\
+ --port ${SLAVE_PORT}\
+ --unixsocket ${SLAVE_SOCKET_PATH}\
+ --slaveof 127.0.0.1 ${PORT}\
+ --requirepass '${PASSWORD}'\
+ --masterauth '${PASSWORD}'
stop_sentinel:
@$(call kill-redis,${SENTINEL_PID_PATHS})
@@ -72,6 +76,7 @@
echo 'sentinel down-after-milliseconds ${HA_GROUP_NAME} 5000' >> $$conf;\
echo 'sentinel failover-timeout ${HA_GROUP_NAME} 30000' >> $$conf;\
echo 'sentinel parallel-syncs ${HA_GROUP_NAME} 1' >> $$conf;\
+ echo 'sentinel auth-pass ${HA_GROUP_NAME} ${PASSWORD}' >> $$conf;\
${BINARY} $$conf\
--daemonize yes\
--pidfile ${TMP}/redis$$port.pid\
@@ -105,7 +110,9 @@
--cluster-node-timeout 5000\
--pidfile ${TMP}/redis$$port.pid\
--port $$port\
- --unixsocket ${TMP}/redis$$port.sock;\
+ --unixsocket ${TMP}/redis$$port.sock\
+ --requirepass '${PASSWORD}'\
+ --masterauth '${PASSWORD}';\
done
create_cluster:
$ diff -u test/support/cluster/orchestrator.rb /tmp/redis-rb-cluster-helper.rb
--- test/support/cluster/orchestrator.rb 2022-02-20 19:02:23.660431802 +0900
+++ /tmp/redis-rb-cluster-helper.rb 2022-02-20 18:16:55.590696731 +0900
@@ -11,6 +11,7 @@
@clients = node_addrs.map do |addr|
Redis.new(url: addr,
timeout: timeout,
+ password: '!&<123-abc>',
reconnect_attempts: 10,
reconnect_delay: 1.5,
reconnect_delay_max: 10.0) |
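For readers following along, here is a minimal sketch of the double-escaping problem described in the comment above. It uses only Ruby's standard `uri` library and is not redis-rb's actual code path. The password is the sample one from the makefile diff; the host, port, and helper names are assumptions made for illustration.

```ruby
require 'uri'

SAMPLE_PASSWORD = '!&<123-abc>' # sample password from the makefile diff

# Percent-encode a value so it can be embedded in a URL (hypothetical helper).
def escape(value)
  URI.encode_www_form_component(value)
end

# Extract the password from a URL and unescape it exactly once (hypothetical helper).
def password_from_url(url)
  URI.decode_www_form_component(URI.parse(url).password)
end

once = escape(SAMPLE_PASSWORD) # escaped by the caller, as expected
twice = escape(once)           # escaped again by mistake: '%' becomes '%25'

# Host and port below are placeholders for a local cluster node.
puts password_from_url("redis://:#{once}@127.0.0.1:7000") == SAMPLE_PASSWORD
puts password_from_url("redis://:#{twice}@127.0.0.1:7000") == SAMPLE_PASSWORD
```

With a single escape the password round-trips; with a double escape the value that reaches AUTH still contains percent sequences, which would be consistent with the WRONGPASS error reported later in the thread.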
Thanks for the quick fix. After merging the latest master I no longer get the WRONGPASS, but I'm getting a new error now. If I try with the format string:
And if I try with
Is it possible there's a different issue underneath? I can connect with the same credentials in non-clustered mode |
I'll look into the issue later. There might be another bug. |
I tried to reproduce the SSL/TLS connection error in cluster mode on my local machine using mutual self-signed certs, but I couldn't. The former error is still under investigation.
Documents:
Cert files:
Server configuration:
$ diff -u /tmp/redis-rb-makefile.bk makefile
--- /tmp/redis-rb-makefile.bk 2022-02-23 16:24:08.536745615 +0900
+++ makefile 2022-02-23 17:23:22.028594220 +0900
@@ -104,7 +104,13 @@
--cluster-config-file ${TMP}/nodes$$port.conf\
--cluster-node-timeout 5000\
--pidfile ${TMP}/redis$$port.pid\
- --port $$port\
+ --port 0\
+ --tls-port $$port\
+ --tls-cert-file $(CURDIR)/test/support/ssl/trusted-cert.crt\
+ --tls-key-file $(CURDIR)/test/support/ssl/trusted-cert.key\
+ --tls-ca-cert-file $(CURDIR)/test/support/ssl/trusted-ca.crt\
+ --tls-cluster yes\
+ --logfile /tmp/redis.log\
--unixsocket ${TMP}/redis$$port.sock;\
done
$ diff -u /tmp/redis-rb-cluster-helper.rb.bk test/support/cluster/orchestrator.rb
--- /tmp/redis-rb-cluster-helper.rb.bk 2022-02-23 18:03:21.288383625 +0900
+++ test/support/cluster/orchestrator.rb 2022-02-23 18:03:23.656437261 +0900
@@ -1,6 +1,7 @@
# frozen_string_literal: true
require 'redis'
+require 'openssl'
class ClusterOrchestrator
SLOT_SIZE = 16_384
@@ -11,6 +12,12 @@
@clients = node_addrs.map do |addr|
Redis.new(url: addr,
timeout: timeout,
+ ssl: true,
+ ssl_params: {
+ ca_file: File.join(__dir__, '..', 'ssl', 'trusted-ca.crt'),
+ cert: OpenSSL::X509::Certificate.new(File.read(File.join(__dir__, '..', 'ssl', 'trusted-cert.crt'))),
+ key: OpenSSL::PKey::RSA.new(File.read(File.join(__dir__, '..', 'ssl', 'trusted-cert.key'))),
+ },
reconnect_attempts: 10,
reconnect_delay: 1.5,
reconnect_delay_max: 10.0) |
Ah, forget my comment about the latter. We can specify the option as follows instead:
Redis.new(cluster: [{ host: '127.0.0.1', port: 7000 }], password: 'mysecret', ssl: true) |
Perhaps:
Redis.new(cluster: [{ host: 'my-redis.example.com', port: 6379 }], password: 'mysecret', ssl: true, ssl_params: { verify_hostname: true })
redis-rb/lib/redis/connection/ruby.rb Lines 254 to 255 in 399ebde
redis-rb/lib/redis/connection/ruby.rb Lines 279 to 284 in 9446688
|
Tried a few things. Results: Connection timed out (Redis::TimeoutError)
certificate verify failed (unspecified certificate verification error) (OpenSSL::SSL::SSLError)
Note: for this I used 6380, since 6379 is blocked on my Azure instance (it expects non-SSL connections). If I try with 6379 I get a (Redis::TimeoutError) (Redis::Cluster::InitialSetupError). I have the minimum TLS version set to 1.0 in Azure in case the client wasn't using 1.1/1.2 yet. |
I've tried to check with AWS ElastiCache. It works.
Since you said stand-alone mode succeeds, could you share the response of the following command, with sensitive text masked?
Redis.new(url: 'rediss://:yoursecret@yourhost:yourport').cluster(:nodes).split("\n") |
So this is interesting.
However, trying Since the first case worked, is it possible to build a cluster client manually with the list of nodes? |
I'd say that building a cluster client manually is the hard way. I think the following directives in the server configuration may be related. They were added by AWS folks and have been available since Redis
In reply to the CLUSTER NODES command, which do the Azure Redis servers return: IP addresses or host names? AWS ElastiCache returns host names. Maybe we should use CLUSTER SLOTS instead of CLUSTER NODES. |
I got IP addresses instead of host names. |
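A minimal sketch of how the node addresses in a CLUSTER NODES reply can be checked for IP addresses versus host names, in plain Ruby. The sample reply is made up (node IDs, the IP address, and the bus ports are placeholders; only the 1500n client ports echo what is reported later in this thread), and the helper names are hypothetical rather than part of redis-rb.

```ruby
require 'ipaddr'

# Made-up CLUSTER NODES reply; the second field is <host>:<port>@<bus-port>.
SAMPLE_REPLY = <<~NODES
  07c37dfeb235213a872192d90877d0cd55635b91 10.0.0.5:15001@16001 master - 0 1426238316232 2 connected 5461-10922
  e7d1eecce10fd6bb5eb35b9f99a514335d9ba9ca 10.0.0.5:15000@16000 myself,master - 0 0 1 connected 0-5460
NODES

# Extract the client-facing host and port of every node in the reply.
def node_addresses(reply)
  reply.split("\n").map do |line|
    host_port = line.split[1].split('@').first
    host, _, port = host_port.rpartition(':')
    { host: host, port: Integer(port) }
  end
end

# True when the string parses as an IPv4 or IPv6 address.
def ip_address?(host)
  IPAddr.new(host)
  true
rescue IPAddr::Error
  false
end

node_addresses(SAMPLE_REPLY).each do |node|
  kind = ip_address?(node[:host]) ? 'IP address' : 'host name'
  puts "#{node[:host]}:#{node[:port]} (#{kind})"
end
```

When every node reports an IP address, a client that dials those addresses over TLS has no host name left to match against the certificate, which is the situation discussed in the following comments.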
We've found a root cause, but the problem appears to be on the Azure Redis service side. I'll read up on the documentation and look for solutions later. |
I found a document, but it was for redis-cli. Does redis-cli work with the Azure Redis cluster over SSL/TLS? Please check the redirection behavior with a GET command.
sequenceDiagram
participant Client
participant Server Shard 1
participant Server Shard 2
participant Server Shard 3
Client->>+Server Shard 1: CLUSTER SLOTS
Server Shard 1-->>-Client: nodes and slots data
Note over Client,Server Shard 1: host names needed if using SSL/TLS
Client->>+Server Shard 1: GET key1
Server Shard 1-->>-Client: value1
Client->>+Server Shard 2: GET key2
Server Shard 2-->>-Client: value2
Client->>+Server Shard 3: GET key3
Server Shard 3-->>-Client: value3
Client->>+Server Shard 3: GET key1
Server Shard 3-->>-Client: MOVED Server Shard 1
Note over Client,Server Shard 3: Client needs to redirect to correct node
Client->>+Server Shard 2: MGET key2 key3
Server Shard 2-->>-Client: CROSSSLOT
Note over Client,Server Shard 2: Cannot command across shards
|
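To make the MOVED and cross-slot steps in the diagram concrete: per the Redis Cluster specification, each key maps to one of 16384 hash slots (CRC16 of the key, modulo 16384), and a multi-key command is only accepted when all of its keys share a slot. Below is a minimal sketch in plain Ruby. It is not redis-rb's implementation, and the method names are made up for illustration.

```ruby
SLOT_SIZE = 16_384

# CRC16 (XMODEM variant: polynomial 0x1021, initial value 0), computed bit by bit.
def crc16(data)
  crc = 0
  data.each_byte do |byte|
    crc ^= byte << 8
    8.times do
      crc = (crc & 0x8000).zero? ? (crc << 1) : ((crc << 1) ^ 0x1021)
      crc &= 0xFFFF
    end
  end
  crc
end

# Hash slot of a key. If the key contains a non-empty {tag}, only the tag is hashed.
def key_slot(key)
  open_idx = key.index('{')
  if open_idx
    close_idx = key.index('}', open_idx + 1)
    key = key[(open_idx + 1)...close_idx] if close_idx && close_idx > open_idx + 1
  end
  crc16(key) % SLOT_SIZE
end

# True when a multi-key command such as MGET could be served by a single slot.
def same_slot?(*keys)
  keys.map { |key| key_slot(key) }.uniq.size == 1
end

puts same_slot?('{user1}:name', '{user1}:email') # keys sharing a hash tag share a slot
```

The Redis Cluster specification gives 0x31C3 as the CRC16 of "123456789", which is a handy check for any implementation of `crc16`.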
I'm sorry for my misunderstanding. It seems that the SSL certificate is issued for an IP address, not a common name. As expected, there might be some bugs in the cluster client of redis-rb. I will keep trying to find the cause. https://datatracker.ietf.org/doc/html/rfc5280#section-4.2.1.6 |
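The identity check in question compares the name the client dialed against the certificate's subjectAltName entries (RFC 5280, section 4.2.1.6). Here is a minimal sketch of how Ruby's OpenSSL bindings treat DNS and IP entries. It uses throwaway self-signed certificates; the host name, the IP address, and the helper name are placeholders, and nothing here reflects what Azure's real certificate contains.

```ruby
require 'openssl'

# Build a throwaway self-signed certificate with the given subjectAltName value.
def self_signed_cert(subject_alt_name)
  key = OpenSSL::PKey::RSA.new(2048)
  cert = OpenSSL::X509::Certificate.new
  cert.version = 2
  cert.serial = 1
  cert.subject = OpenSSL::X509::Name.parse('/CN=foo-endpoint.example.com')
  cert.issuer = cert.subject
  cert.public_key = key.public_key
  cert.not_before = Time.now - 60
  cert.not_after = Time.now + 3600
  factory = OpenSSL::X509::ExtensionFactory.new(cert, cert)
  cert.add_extension(factory.create_extension('subjectAltName', subject_alt_name, false))
  cert.sign(key, OpenSSL::Digest.new('SHA256'))
  cert
end

dns_only = self_signed_cert('DNS:foo-endpoint.example.com')
dns_and_ip = self_signed_cert('DNS:foo-endpoint.example.com,IP:10.0.0.5')

# A DNS-only certificate matches the host name but not the node's IP address.
puts OpenSSL::SSL.verify_certificate_identity(dns_only, 'foo-endpoint.example.com')
puts OpenSSL::SSL.verify_certificate_identity(dns_only, '10.0.0.5')
# The IP address only matches when it is listed as an IP entry in the SAN.
puts OpenSSL::SSL.verify_certificate_identity(dns_and_ip, '10.0.0.5')
```

So a client that connects to the addresses returned by CLUSTER NODES can only pass this check if those exact addresses, whether names or IPs, appear in the certificate.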
Some documents say: https://docs.microsoft.com/en-us/azure/azure-cache-for-redis/cache-how-to-premium-clustering
But:
It seems that the endpoint does not proxy requests, so clients need to support the cluster protocol. The redis-cli behavior above is evidence that backs this up. |
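Supporting the cluster protocol means, among other things, following MOVED and ASK redirections. A minimal sketch of parsing such an error reply in plain Ruby follows; the `Redirection` struct and `parse_redirection` helper are hypothetical names, and the sample messages are made up in the format the Redis Cluster specification describes (`MOVED <slot> <host>:<port>`).

```ruby
# Parsed form of a cluster redirection error (hypothetical type for illustration).
Redirection = Struct.new(:kind, :slot, :host, :port)

# Return a Redirection for MOVED/ASK error messages, or nil for anything else.
def parse_redirection(message)
  kind, slot, addr = message.split
  return nil unless %w[MOVED ASK].include?(kind) && slot && addr

  host, _, port = addr.rpartition(':')
  Redirection.new(kind, Integer(slot), host, Integer(port))
end

redirect = parse_redirection('MOVED 3999 10.0.0.5:15001')
puts "retry on #{redirect.host}:#{redirect.port} for slot #{redirect.slot}" if redirect
```

Note that the redirect target is whatever address the server reports, so over TLS the same host name question arises for redirected connections as for the initial node discovery.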
Please try to connect to Azure Redis cluster with SSL/TLS by using https://github.com/redis/redis-rb#hiredis
|
I found that, unfortunately, SSL/TLS support with hiredis is currently insufficient in redis-rb. Forget about that. I'm sorry. |
Could you try to connect to the Azure Redis cluster with SSL/TLS by patching the code as follows?
$ diff -u /tmp/redis-rb-conn-rb.rb.bk lib/redis/connection/ruby.rb
--- /tmp/redis-rb-conn-rb.rb.bk 2022-02-26 14:00:25.862435640 +0900
+++ lib/redis/connection/ruby.rb 2022-02-26 14:00:52.355047769 +0900
@@ -252,7 +252,7 @@
ctx.set_params(ssl_params || {})
ssl_sock = new(tcp_sock, ctx)
- ssl_sock.hostname = host
+ #ssl_sock.hostname = host
begin
# Initiate the socket connection in the background. If it doesn't fail
The above patch comments out the following line.
redis-rb/lib/redis/connection/ruby.rb Line 255 in 610c783
I assume SNI might fail when a certificate for an IP address is used. |
It seems that other libraries have the same issue. We cannot know how the Azure Redis cluster works internally. In general, a Redis Cluster client uses the node addresses fetched from the server, but that approach tends to fail with SSL/TLS on Azure Redis.
case PHP:
There is a workaround that always uses the host name of the endpoint. It may indeed work, but it is somewhat ad hoc and serves a minor use case. It seems that Azure Redis has a single IP address and multiple ports. Internally, there might be proxy servers such as stunnel or something similar. The proxy server doesn't support redirection; it only performs SSL/TLS termination.
graph TB
client(Cluster Client)
subgraph Azure Redis Cache
subgraph Endpoint
endpoint(Active)
endpoint_sb(Standby)
end
subgraph Cluster
node0(Node0)
node1(Node1)
node2(Node2)
end
end
endpoint-.-endpoint_sb
node0-.-node1-.-node2-.-node0
client--rediss://vip:15000-->endpoint--redis://real:6379-->node0
On the other hand, I'd say that AWS ElastiCache is friendlier to typical clients.
graph TB
client(Cluster Client)
subgraph AWS ElastiCache
node0(Node0)
node1(Node1)
node2(Node2)
end
node0-.-node1-.-node2-.-node0
client--rediss://node0:6379-->node0
case Java:
https://github.com/lettuce-io/lettuce-core/releases/tag/4.2.0.Final
I think this is a last resort, but it may also work:
Redis.new(cluster: [{ host: 'foo.example.com', port: 6379 }], password: 'bar', ssl: true, ssl_params: { verify_mode: OpenSSL::SSL::VERIFY_NONE })
Since your endpoint appears to be public, I checked the behavior with SSL/TLS options on my local machine. It seems that our client works fine if we disable hostname verification.
The certificate may not list IP addresses in its SAN, so we probably cannot verify it correctly against IP addresses.
Since you received a timeout error, the Azure Redis cluster nodes might be returning private or plain-text port numbers in reply to the CLUSTER command. Are your cluster nodes in Azure Redis returning |
YES! Are there any adverse security implications for this? Doesn't setting VERIFY_NONE open us up to MITM attacks? Should I open a ticket with the Azure Redis Cluster support team to try and address this? The cluster nodes/slots commands are returning ports in the 1500n range. |
For cluster mode with SSL/TLS, the Azure Redis architecture expects the client to connect using the FQDN of a single endpoint, whereas redis-rb expects the servers either to reply with FQDNs to CLUSTER commands or to present a certificate that can be verified against IP addresses. I'm not familiar with the security implications, but disabling certificate verification might be the only option at the moment. It might be a good idea to raise the issue with the Azure Cache for Redis support team. |
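On the security question, it may help to see what the two `ssl_params` settings mentioned in this thread actually change. The patch context quoted earlier shows redis-rb passing `ssl_params` to `OpenSSL::SSL::SSLContext#set_params`, so the sketch below inspects the resulting contexts directly. The `build_context` helper is a hypothetical name, and whether the milder setting is enough for Azure is an assumption that would need testing.

```ruby
require 'openssl'

# Build an SSL context the same way the quoted patch context does: set_params(ssl_params).
def build_context(ssl_params)
  ctx = OpenSSL::SSL::SSLContext.new
  ctx.set_params(ssl_params)
  ctx
end

def describe(label, ctx)
  chain = ctx.verify_mode == OpenSSL::SSL::VERIFY_PEER ? 'verified' : 'NOT verified'
  name = ctx.verify_hostname ? 'checked' : 'not checked'
  puts "#{label}: certificate chain #{chain}, host name flag #{name}"
end

# Defaults: chain verification and host name checking are both enabled.
describe('defaults', build_context({}))
# Keeps chain verification but skips the host name comparison.
describe('verify_hostname: false', build_context(verify_hostname: false))
# Disables peer verification entirely (the last-resort option above).
describe('VERIFY_NONE', build_context(verify_mode: OpenSSL::SSL::VERIFY_NONE))
```

With `VERIFY_NONE` the client accepts any certificate, so an on-path attacker can impersonate the server. With `verify_hostname: false` the certificate must still chain to a trusted CA, but it may have been issued for any name, which narrows the exposure without removing it.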
I opened a ticket with the Azure Redis team and this was their response:
Since we have the hostname from when we initially configure the client, would that work? |
Yes, it does. However, Azure Redis servers currently return IP addresses in reply to CLUSTER NODES. The ability to reply with FQDNs is only available in AWS ElastiCache and Redis
redis-rb/lib/redis/connection/ruby.rb Line 255 in 610c783
|
Since Azure uses the same FQDN for all the nodes (just different ports), can we bring in the config? Would that require a huge refactor? |
I’m not sure about the scale of refactoring, but I think an additional option may be needed such that |
Yeah that would make sense |
Could you try to test the following version of the client? @ftlc
gem 'redis', git: 'https://github.com/supercaracal/redis-rb.git', branch: 'support-azure-cache-for-redis-with-cluster-mode-and-ssl-tls'
Redis.new(cluster: %w[rediss://foo-endpoint.example.com:6379], fixed_hostname: 'foo-endpoint.example.com') |
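A rough sketch of the idea behind a `fixed_hostname` option, as discussed in the preceding comments: keep the ports reported by the cluster but always dial the configured endpoint name. This is plain Ruby for illustration only and is not the code on the branch above; the `apply_fixed_hostname` helper and the sample node list are assumptions.

```ruby
# Replace the host of every discovered node with the fixed endpoint name,
# keeping the per-node port. Returns the list unchanged when no name is given.
def apply_fixed_hostname(nodes, fixed_hostname)
  return nodes if fixed_hostname.nil? || fixed_hostname.empty?

  nodes.map { |node| node.merge(host: fixed_hostname) }
end

# Made-up discovery result: one IP address, several ports.
discovered = [
  { host: '10.0.0.5', port: 15000 },
  { host: '10.0.0.5', port: 15001 },
  { host: '10.0.0.5', port: 15002 }
]

apply_fixed_hostname(discovered, 'foo-endpoint.example.com').each do |node|
  puts "rediss://#{node[:host]}:#{node[:port]}"
end
```

Because every connection then uses the endpoint's FQDN, the certificate's host name can be verified normally instead of being compared against an IP address.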
Tried with this client. Following config:
Got a timeout like before, so it looks like peer verification may still have failed.
|
OH! But using a format string did work!
|
Thank you for your testing. The timeout error in the former is weird. |
Thanks for pushing updates so fast! |
Ah, the former may work if we specify options like this:
connection_url = { host: host, password: key, port: port }
client = Redis.new(cluster: [connection_url], fixed_hostname: host, ssl: true) The |
Oh yup, moving SSL out of the cluster did it. It works now! Thanks so much! Once everything is merged I'll work on deploying it and make sure there are no issues with high traffic, but it looks perfect now. |
@supercaracal Hey, do you know when this might be merged? I'm trying to decide if I should plan integrating this work in the upcoming sprint |
@byroot I understand you are occupied, but I would appreciate it if you could review the following pull request. |
@supercaracal apologies, I didn't see it was ready for review. Never hesitate to ping me for these things. I'll have a look right now. |
Thanks for merging it in! When is 4.7 scheduled? For the bigger services we'll probably wait for the official release instead of pointing our gemfile directly at master |
Hey folks,
We have a rails app we're trying to connect to azure redis. I have a test cluster provisioned, and can connect as a single instance, but trying to connect in clustered mode gives an error:
Redis Client could not connect to any cluster nodes
Here's a minimal config:
The documentation stated that I can pass in one instance like this and it'll discover the rest of the nodes through the CLUSTER NODES command.
If I switch from cluster to url I can connect to a redis node, but of course half the writes fail with a MOVED message.
I was able to find one SO post from somebody having a similar issue in 2019 that was never resolved, but otherwise no documentation that's specific to redis-rb and Azure Redis.