memcached cache: switch to AWS elasticache-java-cluster-client and add TLS support (#14827)

This PR switches the Memcached client library to the AWS ElastiCache Cluster Client: https://github.com/awslabs/aws-elasticache-cluster-client-memcached-for-java

This lets us encrypt data in transit:
Amazon ElastiCache for Memcached now supports encryption of data in transit

For clusters running the Memcached engine, ElastiCache supports Auto Discovery—the ability for client programs to automatically identify all of the nodes in a cache cluster, and to initiate and maintain connections to all of these nodes.
Benefits of Auto Discovery - Amazon ElastiCache
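
As a rough illustration of how the two client modes introduced by this change treat the host list, here is a minimal sketch. `CacheHostsSketch` and its methods are hypothetical helpers, not part of Druid or the AWS client. It assumes that in dynamic mode only the first `host:port` entry (the configuration endpoint) matters, mirroring how the commit picks the first entry of `druid.cache.hosts` for TLS hostname verification.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of Druid or the AWS client.
final class CacheHostsSketch
{
  enum Mode { STATIC, DYNAMIC }

  // Accepts the same two values the commit accepts for druid.cache.clientMode.
  static Mode parseMode(String clientMode)
  {
    if ("static".equals(clientMode)) {
      return Mode.STATIC;
    }
    if ("dynamic".equals(clientMode)) {
      return Mode.DYNAMIC;
    }
    throw new IllegalArgumentException("clientMode must be 'static' or 'dynamic': " + clientMode);
  }

  // Splits a comma-separated host:port list, dropping empty entries.
  static List<String> splitHosts(String hosts)
  {
    return Arrays.stream(hosts.split(","))
                 .map(String::trim)
                 .filter(s -> !s.isEmpty())
                 .collect(Collectors.toList());
  }

  // Static mode: every listed node. Dynamic mode: only the first entry,
  // assumed here to be the configuration endpoint.
  static List<String> endpointsToConfigure(String hosts, String clientMode)
  {
    List<String> all = splitHosts(hosts);
    if (all.isEmpty()) {
      throw new IllegalArgumentException("at least one host:port entry is required");
    }
    return parseMode(clientMode) == Mode.DYNAMIC ? all.subList(0, 1) : all;
  }
}
```

With `static`, every node has to be listed explicitly; with `dynamic`, the remaining nodes are expected to be found through Auto Discovery.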

AWS forked spymemcached 2.12.1 and has since incorporated all the patches from 2.12.2 and 2.12.3 into its 1.2.0 release, so the AWS client can now be treated as an equivalent drop-in replacement.

GitHub - awslabs/aws-elasticache-cluster-client-memcached-for-java: Amazon ElastiCache Cluster Client for Java - enhanced library to connect to ElastiCache clusters.
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/elasticache/AmazonElastiCacheClient.html#AmazonElastiCacheClient--

How to enable TLS with ElastiCache

On the server side:
https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/in-transit-encryption-mc.html#in-transit-encryption-enable-existing-mc

On the client side:
GitHub - awslabs/aws-elasticache-cluster-client-memcached-for-java: Amazon ElastiCache Cluster Client for Java - enhanced library to connect to ElastiCache clusters.
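
On the client side, the change builds an `SSLContext` from the JVM's default trust store and passes it to the connection factory builder via `setSSLContext`. A minimal sketch of that first step, using only JDK classes, might look like the following. `DefaultTrustSslContextSketch` is a hypothetical name, and wrapping the checked exceptions in `IllegalStateException` is a choice made for this example rather than what the commit does.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;
import java.security.GeneralSecurityException;
import java.security.KeyStore;

// Hypothetical helper for illustration only.
final class DefaultTrustSslContextSketch
{
  static SSLContext build()
  {
    try {
      // A null KeyStore makes the factory fall back to the JVM's default trust store.
      TrustManagerFactory tmf =
          TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
      tmf.init((KeyStore) null);

      // No client key managers and the default SecureRandom.
      SSLContext sslContext = SSLContext.getInstance("TLS");
      sslContext.init(null, tmf.getTrustManagers(), null);
      return sslContext;
    }
    catch (GeneralSecurityException e) {
      // Covers NoSuchAlgorithmException, KeyStoreException and KeyManagementException.
      throw new IllegalStateException("Unable to build default SSLContext", e);
    }
  }
}
```

Because the trust managers come from the default trust store, this only works if the server certificate chains to a CA the JVM already trusts.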
pagrawal10 authored Oct 2, 2023
1 parent 2785e06 commit d038237
Showing 8 changed files with 209 additions and 39 deletions.
23 changes: 13 additions & 10 deletions docs/configuration/index.md
@@ -2112,16 +2112,19 @@ In addition to the normal cache metrics, the caffeine cache implementation also

Uses memcached as cache backend. This allows all processes to share the same cache.

|Property|Description|Default|
|--------|-----------|-------|
|`druid.cache.expiration`|Memcached [expiration time](https://code.google.com/p/memcached/wiki/NewCommands#Standard_Protocol).|2592000 (30 days)|
|`druid.cache.timeout`|Maximum time in milliseconds to wait for a response from Memcached.|500|
|`druid.cache.hosts`|Comma separated list of Memcached hosts `<host:port>`.|none|
|`druid.cache.maxObjectSize`|Maximum object size in bytes for a Memcached object.|52428800 (50 MiB)|
|`druid.cache.memcachedPrefix`|Key prefix for all keys in Memcached.|druid|
|`druid.cache.numConnections`|Number of memcached connections to use.|1|
|`druid.cache.protocol`|Memcached communication protocol. Can be binary or text.|binary|
|`druid.cache.locator`|Memcached locator. Can be consistent or array_mod.|consistent|
| Property | Description | Default |
|-------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------|
| `druid.cache.expiration` | Memcached [expiration time](https://code.google.com/p/memcached/wiki/NewCommands#Standard_Protocol). | 2592000 (30 days) |
| `druid.cache.timeout` | Maximum time in milliseconds to wait for a response from Memcached. | 500 |
| `druid.cache.hosts` | Comma separated list of Memcached hosts `<host:port>`. Need to specify all nodes when `druid.cache.clientMode` is set to static. Dynamic mode [automatically identifies nodes in your cluster](https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/AutoDiscovery.html) so just specifying the configuration endpoint and port is fine. | none |
| `druid.cache.maxObjectSize` | Maximum object size in bytes for a Memcached object. | 52428800 (50 MiB) |
| `druid.cache.memcachedPrefix` | Key prefix for all keys in Memcached. | druid |
| `druid.cache.numConnections` | Number of memcached connections to use. | 1 |
| `druid.cache.protocol` | Memcached communication protocol. Can be binary or text. | binary |
| `druid.cache.locator` | Memcached locator. Can be consistent or array_mod. | consistent |
| `druid.cache.enableTls` | Enable TLS based connection for Memcached client. Boolean | false |
| `druid.cache.clientMode` | Client Mode. Static mode requires the user to specify individual cluster nodes. Dynamic mode uses [AutoDiscovery](https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/AutoDiscovery.HowAutoDiscoveryWorks.html) feature of AWS Memcached. String. ["static"](https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/AutoDiscovery.Manual.html) or ["dynamic"](https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/AutoDiscovery.Using.ModifyApp.Java.html) | static |
| `druid.cache.skipTlsHostnameVerification` | Skip TLS Hostname Verification. Boolean. | true |

#### Hybrid

6 changes: 3 additions & 3 deletions licenses.yaml
@@ -1658,13 +1658,13 @@ libraries:

---

-name: Spymemcached
+name: aws-elasticache-cluster-client-memcached-for-java
license_category: binary
module: java-core
license_name: Apache License version 2.0
-version: 2.12.3
+version: 1.2.0
libraries:
-- net.spy: spymemcached
+- com.amazonaws: elasticache-java-cluster-client

---

6 changes: 3 additions & 3 deletions pom.xml
@@ -773,9 +773,9 @@
<version>3.3.6</version>
</dependency>
<dependency>
-<groupId>net.spy</groupId>
-<artifactId>spymemcached</artifactId>
-<version>2.12.3</version>
+<groupId>com.amazonaws</groupId>
+<artifactId>elasticache-java-cluster-client</artifactId>
+<version>1.2.0</version>
</dependency>
<dependency>
<groupId>org.antlr</groupId>
4 changes: 2 additions & 2 deletions server/pom.xml
@@ -134,8 +134,8 @@
<artifactId>tesla-aether</artifactId>
</dependency>
<dependency>
-<groupId>net.spy</groupId>
-<artifactId>spymemcached</artifactId>
+<groupId>com.amazonaws</groupId>
+<artifactId>elasticache-java-cluster-client</artifactId>
</dependency>
<dependency>
<groupId>org.lz4</groupId>
@@ -30,6 +30,7 @@
import com.google.common.hash.HashFunction;
import com.google.common.hash.Hashing;
import net.spy.memcached.AddrUtil;
+import net.spy.memcached.ClientMode;
import net.spy.memcached.ConnectionFactory;
import net.spy.memcached.ConnectionFactoryBuilder;
import net.spy.memcached.FailureMode;
@@ -52,10 +53,16 @@
import org.apache.druid.java.util.metrics.AbstractMonitor;

import javax.annotation.Nullable;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.TrustManagerFactory;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
+import java.security.KeyManagementException;
+import java.security.KeyStore;
+import java.security.KeyStoreException;
+import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
@@ -339,25 +346,8 @@ public void updateHistogram(String name, int amount)
}
};

-final ConnectionFactory connectionFactory = new MemcachedCustomConnectionFactoryBuilder()
-// 1000 repetitions gives us good distribution with murmur3_128
-// (approx < 5% difference in counts across nodes, with 5 cache nodes)
-.setKetamaNodeRepetitions(1000)
-.setHashAlg(MURMUR3_128)
-.setProtocol(ConnectionFactoryBuilder.Protocol.valueOf(StringUtils.toUpperCase(config.getProtocol())))
-.setLocatorType(ConnectionFactoryBuilder.Locator.valueOf(StringUtils.toUpperCase(config.getLocator())))
-.setDaemon(true)
-.setFailureMode(FailureMode.Cancel)
-.setTranscoder(transcoder)
-.setShouldOptimize(true)
-.setOpQueueMaxBlockTime(config.getTimeout())
-.setOpTimeout(config.getTimeout())
-.setReadBufferSize(config.getReadBufferSize())
-.setOpQueueFactory(opQueueFactory)
-.setMetricCollector(metricCollector)
-.setEnableMetrics(MetricType.DEBUG) // Not as scary as it sounds
-.build();
-
+final ConnectionFactory connectionFactory = createConnectionFactory(config, transcoder,
+opQueueFactory, metricCollector);
final List<InetSocketAddress> hosts = AddrUtil.getAddresses(config.getHosts());


@@ -389,11 +379,57 @@ public MemcachedClientIF get()

return new MemcachedCache(clientSupplier, config, monitor);
}
-catch (IOException e) {
+catch (IOException | NoSuchAlgorithmException e) {
throw new RuntimeException(e);
}
+catch (KeyStoreException e) {
+throw new RuntimeException(e);
+}
+catch (KeyManagementException e) {
+throw new RuntimeException(e);
+}
}

+public static ConnectionFactory createConnectionFactory(final MemcachedCacheConfig config, final LZ4Transcoder transcoder, final OperationQueueFactory opQueueFactory, final MetricCollector metricCollector) throws KeyManagementException, KeyStoreException, NoSuchAlgorithmException
+{
+MemcachedCustomConnectionFactoryBuilder connectionFactoryBuilder = (MemcachedCustomConnectionFactoryBuilder) new MemcachedCustomConnectionFactoryBuilder()
+// 1000 repetitions gives us good distribution with murmur3_128
+// (approx < 5% difference in counts across nodes, with 5 cache nodes)
+.setKetamaNodeRepetitions(1000)
+.setHashAlg(MURMUR3_128)
+.setProtocol(ConnectionFactoryBuilder.Protocol.valueOf(StringUtils.toUpperCase(config.getProtocol())))
+.setLocatorType(ConnectionFactoryBuilder.Locator.valueOf(StringUtils.toUpperCase(config.getLocator())))
+.setDaemon(true)
+.setFailureMode(FailureMode.Cancel)
+.setTranscoder(transcoder)
+.setShouldOptimize(true)
+.setOpQueueMaxBlockTime(config.getTimeout())
+.setOpTimeout(config.getTimeout())
+.setReadBufferSize(config.getReadBufferSize())
+.setOpQueueFactory(opQueueFactory)
+.setMetricCollector(metricCollector)
+.setEnableMetrics(MetricType.DEBUG); // Not as scary as it sounds
+if (config.enableTls()) {
+// Build SSLContext
+TrustManagerFactory tmf = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
+tmf.init((KeyStore) null);
+SSLContext sslContext = SSLContext.getInstance("TLS");
+sslContext.init(null, tmf.getTrustManagers(), null);
+// Create the client in TLS mode
+connectionFactoryBuilder.setSSLContext(sslContext);
+}
+if ("dynamic".equals(config.getClientMode())) {
+connectionFactoryBuilder.setClientMode(ClientMode.Dynamic);
+connectionFactoryBuilder.setHostnameForTlsVerification(config.getHosts().split(",")[0]);
+} else if ("static".equals(config.getClientMode())) {
+connectionFactoryBuilder.setClientMode(ClientMode.Static);
+} else {
+throw new RuntimeException("Invalid value provided for `druid.cache.clientMode`. Value must be 'static' or 'dynamic'.");
+}
+connectionFactoryBuilder.setSkipTlsHostnameVerification(config.skipTlsHostnameVerification());
+return connectionFactoryBuilder.build();
+}

private final int timeout;
private final int expiration;
private final String memcachedPrefix;
@@ -63,6 +63,15 @@ public class MemcachedCacheConfig
@JsonProperty
private String locator = "consistent";

+@JsonProperty
+private boolean enableTls = false;
+
+@JsonProperty
+private String clientMode = "static";
+
+@JsonProperty
+private boolean skipTlsHostnameVerification = true;

public int getExpiration()
{
return expiration;
@@ -112,4 +121,19 @@ public String getLocator()
{
return locator;
}

+public boolean enableTls()
+{
+return enableTls;
+}
+
+public String getClientMode()
+{
+return clientMode;
+}
+
+public boolean skipTlsHostnameVerification()
+{
+return skipTlsHostnameVerification;
+}
}
@@ -20,6 +20,7 @@
package org.apache.druid.client.cache;

import net.spy.memcached.ArrayModNodeLocator;
+import net.spy.memcached.ClientMode;
import net.spy.memcached.ConnectionFactory;
import net.spy.memcached.ConnectionFactoryBuilder;
import net.spy.memcached.ConnectionObserver;
@@ -37,6 +38,7 @@
import net.spy.memcached.transcoders.Transcoder;
import net.spy.memcached.util.DefaultKetamaNodeLocatorConfiguration;

+import javax.net.ssl.SSLContext;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.BlockingQueue;
@@ -56,7 +58,7 @@ public MemcachedCustomConnectionFactoryBuilder setKetamaNodeRepetitions(int repe
@Override
public ConnectionFactory build()
{
-return new DefaultConnectionFactory()
+return new DefaultConnectionFactory(clientMode)
{
@Override
public NodeLocator createLocator(List<MemcachedNode> nodes)
@@ -213,6 +215,45 @@ public long getAuthWaitTime()
{
return authWaitTime;
}

+@Override
+public SSLContext getSSLContext()
+{
+return sslContext == null ? super.getSSLContext() : sslContext;
+}
+
+@Override
+public String getHostnameForTlsVerification()
+{
+return hostnameForTlsVerification == null ? super.getHostnameForTlsVerification() : hostnameForTlsVerification;
+}
+@Override
+public ClientMode getClientMode()
+{
+return clientMode == null ? super.getClientMode() : clientMode;
+}
+
+@Override
+public boolean skipTlsHostnameVerification()
+{
+return skipTlsHostnameVerification;
+}
+
+@Override
+public String toString()
+{
+// MURMUR_128 cannot be cast to DefaultHashAlgorithm
+return "Failure Mode: " + getFailureMode().name() + ", Hash Algorithm: "
++ getHashAlg() + " Max Reconnect Delay: "
++ getMaxReconnectDelay() + ", Max Op Timeout: " + getOperationTimeout()
++ ", Op Queue Length: " + getOpQueueLen() + ", Op Max Queue Block Time"
++ getOpQueueMaxBlockTime() + ", Max Timeout Exception Threshold: "
++ getTimeoutExceptionThreshold() + ", Read Buffer Size: "
++ getReadBufSize() + ", Transcoder: " + getDefaultTranscoder()
++ ", Operation Factory: " + getOperationFactory() + " isDaemon: "
++ isDaemon() + ", Optimized: " + shouldOptimize() + ", Using Nagle: "
++ useNagleAlgorithm() + ", KeepAlive: " + getKeepAlive() + ", SSLContext: " + getSSLContext().getProtocol() + ", ConnectionFactory: " + getName();
+}
};
}
}