Cross cluster search connection fails when the remote cluster has APM enabled and API key based connection is used #112552

Closed
rhr323 opened this issue Sep 5, 2024 · 1 comment · Fixed by #112649
Labels: >bug, :Security/Security (Security issues without another label), Team:Security (Meta label for security team)

rhr323 commented Sep 5, 2024

Elasticsearch Version

8.14.2

Installed Plugins

No response

Java Version

22.0.1

OS Version

Linux ip-172-22-221-60 5.15.0-1035-aws #39-Ubuntu SMP Wed Apr 19 13:51:21 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

Problem Description

When attempting to use Cross-Cluster Search (CCS) with API keys between two Elasticsearch clusters, the connection fails. The following exception is logged on the local cluster (the querying cluster):

Failure from _resolve/cluster lookup against cluster qa-overview: org.elasticsearch.transport.RemoteTransportException: [instance-0000000071][172.17.0.6:23217][cluster:internal/remote_cluster/handshake] 
Caused by: java.lang.IllegalArgumentException: Transport request header [tracestate] is not allowed for cross cluster requests through the dedicated remote cluster server port

Connecting from the same query cluster to another, otherwise similar remote cluster works without issues. The main observed difference is that the failing remote cluster has APM enabled (for example, it has a telemetry.api_key entry in its keystore), which may be what triggers the connection failure.

Steps to Reproduce

  1. Set up two Elasticsearch clusters:
    • A query cluster that initiates the cross-cluster search connection.
    • A remote cluster that the query cluster connects to.
  2. Enable APM on the remote cluster.
  3. Configure the query cluster to use Cross-Cluster Search (CCS) with API keys to connect to the remote cluster.
  4. Observe that the connection is not established and that the connection error is logged on the query cluster.

Logs (if relevant)

[instance-0000000001] Failure from _resolve/cluster lookup against cluster qa-overview: org.elasticsearch.transport.RemoteTransportException: [instance-0000000071][172.17.0.6:23217][cluster:internal/remote_cluster/handshake]
Caused by: java.lang.IllegalArgumentException: Transport request header [tracestate] is not allowed for cross cluster requests through the dedicated remote cluster server port
    at org.elasticsearch.xpack.security.transport.CrossClusterAccessServerTransportFilter.validateHeaders(CrossClusterAccessServerTransportFilter.java:102) ~[?:?]
    at org.elasticsearch.xpack.security.transport.CrossClusterAccessServerTransportFilter.authenticate(CrossClusterAccessServerTransportFilter.java:87) ~[?:?]
    at org.elasticsearch.xpack.security.transport.ServerTransportFilter.inbound(ServerTransportFilter.java:105) ~[?:?]
    at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:642) ~[?:?]
    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75) ~[elasticsearch-8.15.0.jar:?]
    at org.elasticsearch.transport.InboundHandler.doHandleRequest(InboundHandler.java:288) ~[elasticsearch-8.15.0.jar:?]
    at org.elasticsearch.transport.InboundHandler.handleRequest(InboundHandler.java:273) ~[elasticsearch-8.15.0.jar:?]
    at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:115) ~[elasticsearch-8.15.0.jar:?]
    at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:96) ~[elasticsearch-8.15.0.jar:?]
    at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:821) ~[elasticsearch-8.15.0.jar:?]
    at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:124) ~[elasticsearch-8.15.0.jar:?]
    at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:96) ~[elasticsearch-8.15.0.jar:?]
    at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:61) ~[elasticsearch-8.15.0.jar:?]
    at org.elasticsearch.transport.netty4.Netty4MessageInboundHandler.channelRead(Netty4MessageInboundHandler.java:48) ~[?:?]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[?:?]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?]
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) ~[?:?]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[?:?]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?]
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1475) ~[?:?]
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1338) ~[?:?]
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1387) ~[?:?]
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529) ~[?:?]
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468) ~[?:?]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290) ~[?:?]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[?:?]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[?:?]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) ~[?:?]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[?:?]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652) ~[?:?]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) ~[?:?]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[?:?]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
    at java.lang.Thread.run(Thread.java:1570) ~[?:?]

@rhr323 added the >bug and needs:triage labels Sep 5, 2024
@jakelandis added the :Security/Security label and removed the needs:triage label Sep 5, 2024
@elasticsearchmachine added the Team:Security label Sep 5, 2024
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-security (Team:Security)

@n1v0lg n1v0lg self-assigned this Sep 5, 2024
n1v0lg added a commit to n1v0lg/elasticsearch that referenced this issue Sep 10, 2024
The [`tracestate`
header](https://www.elastic.co/guide/en/apm/agent/rum-js/current/distributed-tracing-guide.html#enable-tracestate)
is an HTTP header used for distributed tracing; it's a valid header to
persist in cross cluster requests and should therefore be allowlisted in
the remote server port header check.

Note: due to implementation details, `tracestate` may currently be set on
the fulfilling cluster itself (instead of arriving across the wire) _before_
the header check runs. Rejecting the header can therefore cause cluster
connections to fail
(elastic#112552).

This PR allowlists the header to allow tracing with RCS 2.0.

As a separate follow-up, we may also change the behavior around
sending the header from the query cluster to the fulfilling cluster
(which we don't do today). This is pending further discussion.

Closes: elastic#112552
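
The allowlist check described in the commit message can be pictured roughly as follows. This is a minimal sketch, not the actual Elasticsearch code: the class name `HeaderAllowlistSketch`, the helper `isAccepted`, and the header `x-example-credentials` are hypothetical and chosen only for illustration; only the `tracestate` header name and the error message text come from this issue.

```java
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; names here are hypothetical, not Elasticsearch APIs.
final class HeaderAllowlistSketch {

    // Rejects any request header whose name is not in the allowlist,
    // mirroring the error message reported in this issue.
    static void validateHeaders(Map<String, String> headers, Set<String> allowed) {
        for (String name : headers.keySet()) {
            if (!allowed.contains(name)) {
                throw new IllegalArgumentException(
                    "Transport request header [" + name + "] is not allowed for cross cluster requests "
                        + "through the dedicated remote cluster server port");
            }
        }
    }

    // Convenience wrapper: true if every header passes the allowlist check.
    static boolean isAccepted(Map<String, String> headers, Set<String> allowed) {
        try {
            validateHeaders(headers, allowed);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical allowlists before and after adding `tracestate`.
        Set<String> before = Set.of("x-example-credentials");
        Set<String> after = Set.of("x-example-credentials", "tracestate");

        // A request that carries a tracing header in addition to credentials.
        Map<String, String> headers = Map.of(
            "x-example-credentials", "secret",
            "tracestate", "es=s:1");

        System.out.println(isAccepted(headers, before)); // prints "false"
        System.out.println(isAccepted(headers, after));  // prints "true"
    }
}
```

Under these assumptions, a request that carries `tracestate` is rejected until the header name is added to the allowlist, which matches the failure and the fix described above.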
elasticsearchmachine pushed a commit that referenced this issue Sep 11, 2024

Closes: #112552

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>