
Cross Cluster Search: make remote clusters optional #27182

Merged: 15 commits merged into master from enhancement/ccs_skip_disconnected on Nov 21, 2017

Conversation

@javanna (Member) commented Oct 30, 2017

Today Cross Cluster Search requires at least one node in each remote cluster to be up when the cross cluster search is run. Otherwise the whole search request fails even though some of the data (local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on, when the query is actually executed, in case e.g. remote shards are missing; those do not fail the whole request but rather yield partial results, and the _shards section in the response indicates that.

This commit introduces a boolean setting per cluster called search.remote.$cluster_alias.skip_if_disconnected, set to false by default, which allows certain clusters to be skipped if they are down when a cross cluster search request tries to reach them. By default all clusters are mandatory.
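For illustration, a minimal sketch of how the setting could be toggled through the cluster settings API, assuming a remote cluster registered under the alias cluster_one:

```
PUT _cluster/settings
{
  "persistent": {
    "search.remote.cluster_one.skip_if_disconnected": true
  }
}
```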

Scroll requests honor this setting too when they are first initiated (the first search request with the scroll parameter), but subsequent scroll rounds (the _search/scroll endpoint) will fail if any of the remote clusters has gone down in the meantime.

The search API response now contains a new _clusters section, similar to the _shards section, that gets returned whenever one or more clusters were disconnected and got skipped:

"_clusters" : {
    "total" : 3,
    "successful" : 2,
    "skipped" : 1
}

This section won't be part of the response if no clusters have been skipped.

The per-cluster skip_unavailable setting value has also been added to the output of the _remote/info API.
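For reference, a sketch of what the _remote/info output could look like with the new field; the alias and values are illustrative and other fields are abridged:

```
GET _remote/info

{
  "cluster_one" : {
    "seeds" : [ "127.0.0.1:9300" ],
    "connected" : true,
    "num_nodes_connected" : 1,
    "initial_connect_timeout" : "30s",
    "skip_unavailable" : true
  }
}
```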

This PR is marked "work in progress"; here is what's left to do:

  • there are halfway scenarios that need to be considered when it comes to reconnecting to remote clusters that were down: the cluster could be responding but still recovering the index, in which case it may return IndexNotFoundException (depending on the indices options provided with the original CCS request), which may be surprising. The current approach ignores any failure from a remote cluster that has skip_unavailable set to true. Maybe the setting should rather be called ignore_failures though, as it is not just about not failing when a remote cluster is disconnected but rather about ignoring any failure coming from the remote cluster.
  • yaml tests should probably be converted to java tests; the complication is properly simulating disconnected nodes (though we have unit tests for all of that)
  • is it enough to return the number of skipped clusters, or should we go the extra mile and return which clusters were skipped?
  • settings validation: you can submit skip_unavailable for a remote cluster that is not registered (with no seeds); Allow affix settings to specify dependencies (#27161) should help with this
  • it would be nice to clean up the skip_unavailable setting once its corresponding remote cluster is removed (i.e. its seeds are set to null); otherwise it stays in the settings although it has no effect.

Closes #26118

@javanna javanna added the :Search/Search, >enhancement, v6.1.0, v7.0.0 and WIP labels Oct 30, 2017
@javanna javanna requested a review from s1monw October 30, 2017 22:26
@javanna javanna force-pushed the enhancement/ccs_skip_disconnected branch from 077ce68 to 878161a October 31, 2017 10:45
@s1monw (Contributor) commented Nov 1, 2017

> is it enough to return the number of skipped clusters, or should we go the extra mile and return which clusters were skipped?

I am not sure, since we also don't do this if we skip shards? I wonder what the use case would be. I think we can go and return warning headers for disconnected clusters that are skipped?

> settings validation: you can submit skip_if_disconnected for a remote cluster that is not registered (with no seeds) #27161 should help with this

this should be taken care of soon.

> would be nice to clean up the skip_if_disconnected setting once its corresponding remote is removed (as its seeds are set to null), otherwise it stays part of settings although it has no effect.

the change in #27161 should make sure it's never left behind.

> there are halfway scenarios that need to be considered when it comes to reconnecting to remote clusters that were down: the cluster could be responding but it's recovering the index, in which case it may return IndexNotFoundException (depending on the indices options provided with the original CCS) which may be surprising. The current approach ignores any failures from the remote cluster that have skip_if_disconnected marked to true. Maybe the setting should rather be called ignore_failures though, as it is not just about not failing when a remote cluster is disconnected but rather ignoring any failure coming from the remote cluster.

First, I think the reason for the disconnection isn't important in this scenario. We want to read what is possible to read. Down the road we might replace the extra roundtrip with a partial reduction on the remote cluster, so we really only care whether we can get the info or not. Yet, I am ok with something like skip_not_available or so.

@s1monw (Contributor) left a comment

Looks good IMO. I will do another round once you remove the WIP

} else if (Clusters.SKIPPED_FIELD.match(currentFieldName)) {
skipped = parser.intValue();
} else {
parser.skipChildren();
Contributor

should we be more strict and just fail if this happens?

Member Author

this parsing method is used in the high-level REST client; we are lenient there to guarantee forward compatibility, meaning that if one day we add a new field under _clusters, we don't break while parsing it but rather ignore it and everything is fine. This is tested in SearchResponseTests#testFromXContentWithRandomFields

Contributor

got it

parser.skipChildren();
}
} else {
parser.skipChildren();
Contributor

see above

@@ -322,6 +360,9 @@ public void readFrom(StreamInput in) throws IOException {
shardFailures[i] = readShardSearchFailure(in);
}
}
if (!in.getVersion().before(Version.V_6_1_0)) {
Contributor

please use onOrAfter instead, and I think it should be 7.0.0 for now?
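A minimal sketch of the suggested direction, with the version constant and the Clusters deserialization left as assumptions:

```
// onOrAfter states the intent directly and avoids the double negation of !before()
if (in.getVersion().onOrAfter(Version.V_7_0_0)) {
    // illustrative: read back the three _clusters counters written by the sender
    clusters = new Clusters(in.readVInt(), in.readVInt(), in.readVInt());
}
```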

@@ -340,7 +381,9 @@ public void writeTo(StreamOutput out) throws IOException {
for (ShardSearchFailure shardSearchFailure : shardFailures) {
shardSearchFailure.writeTo(out);
}

if (!out.getVersion().before(Version.V_6_1_0)) {
Contributor

same comment as above

private final int skipped;

Clusters(int total, int successful, int skipped) {
this.total = total;
Contributor

can you also check that they are never negative
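A possible shape for such a check, sketched with assertions (an explicit IllegalArgumentException would work just as well):

```
Clusters(int total, int successful, int skipped) {
    // reject bogus values up front instead of rendering a nonsensical _clusters section
    assert total >= 0 && successful >= 0 && skipped >= 0 : "_clusters counters must not be negative";
    this.total = total;
    this.successful = successful;
    this.skipped = skipped;
}
```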

/**
* Updates the skipIfDisconnected flag that can be dynamically set for each remote cluster
*/
synchronized void updateSkipIfDisconnected(boolean skipIfDisconnected) {
Contributor

it's volatile, no need for synchronized then
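In other words, roughly the following should be enough; the field name is assumed from the setter:

```
// a volatile field makes the single write immediately visible to readers,
// so no monitor is needed for a plain assignment
private volatile boolean skipIfDisconnected;

void updateSkipIfDisconnected(boolean skipIfDisconnected) {
    this.skipIfDisconnected = skipIfDisconnected;
}
```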

@@ -62,6 +64,11 @@ public RemoteConnectionInfo(StreamInput input) throws IOException {
initialConnectionTimeout = new TimeValue(input);
numNodesConnected = input.readVInt();
clusterAlias = input.readString();
if (input.getVersion().before(Version.V_6_1_0)) {
Contributor

please use onOrAfter, it has better semantics IMO


Furthermore, this commit makes sure that we try to reconnect to the remote clusters even when they are skipped.

Closes elastic#26118
@javanna javanna force-pushed the enhancement/ccs_skip_disconnected branch from 878161a to 2a4bac1 November 1, 2017 13:07
…ardless of its content

Rather than outputting the _clusters section only when something is abnormal, we now print it out every time a Cross Cluster Search is executed. This is more consistent, as CCS will always return that section while ordinary searches will never have it.

This also improves the high-level REST client output, as the Clusters object can be null, and it will be null only when the _clusters section wasn't returned, which reflects the response better.
apply plugin: 'elasticsearch.rest-test'
apply plugin: 'elasticsearch.test-with-dependencies'

dependencies {
Member Author

I wanted to add this new test to the existing multi-cluster-search qa module, but that one has two clusters and runs yaml tests against both of them. Whenever you add a new IT test to that module it gets run multiple times (once against the remote cluster, once against the CCS cluster). Also, this new module uses the REST high-level client too. Maybe we should look into merging the two as a follow-up though; I don't like having too many qa modules, and they slow down the build.

Contributor

do we have to have a java test for this, or can we just use multi-cluster-search for it? I mean, we can add a 3rd stage where we shut down one cluster and then run our tests? To me it seems like we can pull it all into yaml land? I can help with that if you want. I am also ok with making it a followup.

Member Author

I am not sure about this. I personally prefer this java part that shuts down a transport as part of the test over gradle magic that shuts clusters down. I spent some time trying to adapt multi-cluster-search, but it was complicated in different ways. And this test also has the advantage of using the high-level client when CCS is used, which we don't test otherwise. Can we address this later?

@javanna javanna removed the WIP label Nov 2, 2017
@javanna (Member Author) commented Nov 2, 2017

@s1monw I removed the WIP label, this is ready for another round of review

@s1monw (Contributor) left a comment

LGTM I left some suggestions

@@ -71,15 +74,18 @@

private ShardSearchFailure[] shardFailures;

private Clusters clusters;
Contributor

can be final?

Member Author

I'm afraid not; SearchResponse implements Streamable rather than Writeable, as it is an ActionResponse.
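For context, a rough sketch of the difference; the Clusters deserialization is illustrative:

```
// Streamable style (what SearchResponse uses): state is set on an existing instance,
// so the field cannot be final
public void readFrom(StreamInput in) throws IOException {
    clusters = new Clusters(in.readVInt(), in.readVInt(), in.readVInt());
}

// Writeable style: state is read in a constructor, so the field could be final
public SearchResponse(StreamInput in) throws IOException {
    this.clusters = new Clusters(in.readVInt(), in.readVInt(), in.readVInt());
}
```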

Contributor

fair enough

private long tookInMillis;

public SearchResponse() {
}

public SearchResponse(SearchResponseSections internalResponse, String scrollId, int totalShards, int successfulShards,
int skippedShards, long tookInMillis, ShardSearchFailure[] shardFailures) {
int skippedShards, long tookInMillis, ShardSearchFailure[] shardFailures, @Nullable Clusters clusters) {
Contributor

instead of allowing null, should we make it required and just not render it if total == successful? I don't like null invariants

Member Author

that's what I had before, but I think in this specific case null makes sense. Rendering only when total != successful becomes confusing, as the same response is also used for scroll responses, where rendering this section has no value. Also, you wouldn't see the section unless something went wrong. I think it would be better to print the section out for all CCS requests, and never when going only local. null allows us to distinguish between responses that involved CCS and responses that come from local nodes only. Would you prefer to have a placeholder instead of null, called let's say EMPTY, which indicates that the section has no meaningful value and should not be printed out?

Contributor

placeholder would be fine IMO
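A sketch of the kind of placeholder being discussed; this is a direction, not necessarily the final code:

```
// sentinel for purely local (non-CCS) responses: carries no information and is never rendered
static final Clusters EMPTY = new Clusters(0, 0, 0);

@Override
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
    if (this != EMPTY) {
        builder.startObject("_clusters");
        builder.field("total", total);
        builder.field("successful", successful);
        builder.field("skipped", skipped);
        builder.endObject();
    }
    return builder;
}
```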

@@ -215,6 +217,20 @@ protected void doExecute(Task task, SearchRequest searchRequest, ActionListener<
}
}

static SearchResponse.Clusters buildClusters(OriginalIndices localIndices, Map<String, OriginalIndices> remoteIndices,
Map<String, ClusterSearchShardsResponse> searchShardsResponses) {
int localClusters = localIndices.indices().length == 0 ? 0 : 1;
Contributor

Math.min(localIndices.indices().length, 1); maybe?

Member Author

oh boy yes

PUT _cluster/settings
{
"persistent": {
"search": {
Contributor

would it be easier to say search.remote.cluster_one.skip_unavailable: true?

Member Author

yep


@s1monw (Contributor) commented Nov 17, 2017

@javanna what's the status on this?

@javanna (Member Author) commented Nov 17, 2017

@s1monw I replied to your comments. I'm also making the docs snippets work; I had some issues there.

@s1monw (Contributor) commented Nov 20, 2017

@javanna wanna push this?

@javanna javanna force-pushed the enhancement/ccs_skip_disconnected branch from 6060c23 to 304f2d0 November 21, 2017 00:57
@javanna javanna merged commit 29450de into elastic:master Nov 21, 2017
javanna added a commit that referenced this pull request Nov 21, 2017
javanna added a commit that referenced this pull request Nov 21, 2017
also fixed the remote.info yaml test to clean up the registered remote cluster once the test is completed.

Relates to #27182
jasontedor added a commit to olcbean/elasticsearch that referenced this pull request Nov 21, 2017
* master: (41 commits)
  [Test] Fix AggregationsTests#testFromXContentWithRandomFields
  [DOC] Fix mathematical representation on interval (range) (elastic#27450)
  Update version check for CCS optional remote clusters
  Bump BWC version to 6.1.0 for elastic#27469
  Adapt rest test BWC version after backport
  Fix dynamic mapping update generation. (elastic#27467)
  Use the primary_term field to identify parent documents (elastic#27469)
  Move composite aggregation to core (elastic#27474)
  Fix test BWC version after backport
  Protect shard splitting from illegal target shards (elastic#27468)
  Cross Cluster Search: make remote clusters optional (elastic#27182)
  [Docs] Fix broken bulleted lists (elastic#27470)
  Move resync request serialization assertion
  Fix resync request serialization
  Fix issue where pages aren't released (elastic#27459)
  Add YAML REST tests for filters bucket agg (elastic#27128)
  Remove tcp profile from low level nio channel (elastic#27441)
  [TEST] Fix `GeoShapeQueryTests#testPointsOnly` failure
  Transition transport apis to use void listeners (elastic#27440)
  AwaitsFix GeoShapeQueryTests#testPointsOnly elastic#27454
  ...
@luxiaoxun commented

@javanna
How do you decide that a remote cluster is unavailable?
Do we have settings like a "connection timeout" or "heartbeat detection" to decide that a remote cluster is unavailable?
I am asking because when I enable the firewall between the two clusters, the search response is 'java.io.IOException: listener timeout after waiting for [30000] ms', and only after almost 15 minutes does one cluster find out that the other cannot be reached. Only then does the search skip the "unavailable" one.

@javanna (Member Author) commented Oct 10, 2018

hi @luxiaoxun we currently catch any exception coming back from the remote cluster (it could be any type of timeout, or just an error that was returned), which we recently realized is too broad. We are working on a change to reflect that "unavailable" means we cannot connect to a remote cluster, so that skip_unavailable will no longer make us ignore any error returned from the remote clusters.

I am curious though about the timeout exception you are getting; would you be able to share a stack trace and expand a bit on the behaviour you are seeing, so we can evaluate improvements? It would be better to post it on our Discuss forum and ping me there. Thanks!
