From 0ec053f2d6f833d533090c5910688a2ccd2bce83 Mon Sep 17 00:00:00 2001
From: Clinton Gormley
Date: Thu, 23 Nov 2017 17:07:50 +0100
Subject: [PATCH] Improve docs for split API in 6.1/6.x (#27504)

---
 docs/reference/indices/split-index.asciidoc | 38 +++++++++++----------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/docs/reference/indices/split-index.asciidoc b/docs/reference/indices/split-index.asciidoc
index 467c09baa2432..f7dd7404b04e6 100644
--- a/docs/reference/indices/split-index.asciidoc
+++ b/docs/reference/indices/split-index.asciidoc
@@ -1,23 +1,25 @@
 [[indices-split-index]]
 == Split Index
 
-number_of_routing_shards
-
-The split index API allows you to split an existing index into a new index
-with multiple of it's primary shards. Similarly to the <>
-where the number of primary shards in the shrunk index must be a factor of the source index.
-The `_split` API requires the source index to be created with a specific number of routing shards
-in order to be split in the future. (Note: this requirement might be remove in future releases)
-The number of routing shards specify the hashing space that is used internally to distribute documents
-across shards, in oder to have a consistent hashing that is compatible with the method elasticsearch
-uses today.
-For example an index with `8` primary shards and a `index.number_of_routing_shards` of `32`
-can be split into `16` and `32` primary shards. An index with `1` primary shard
-and `index.number_of_routing_shards` of `64` can be split into `2`, `4`, `8`, `16`, `32` or `64`.
-The same works for non power of two routing shards ie. an index with `1` primary shard and
-`index.number_of_routing_shards` set to `15` can be split into `3` and `15` or alternatively`5` and `15`.
-The number of shards in the split index must always be a factor of `index.number_of_routing_shards`
-in the source index. Before splitting, a (primary) copy of every shard in the index must be active in the cluster.
+The split index API allows you to split an existing index into a new index,
+where each original primary shard is split into two or more primary shards in
+the new index.
+
+IMPORTANT: The `_split` API requires the source index to be created with a
+specific `number_of_routing_shards` in order to be split in the future. This
+requirement has been removed in Elasticsearch 7.0.
+
+The number of times the index can be split (and the number of shards that each
+original shard can be split into) is determined by the
+`index.number_of_routing_shards` setting. The number of routing shards
+specifies the hashing space that is used internally to distribute documents
+across shards with consistent hashing. For instance, a 5 shard index with
+`number_of_routing_shards` set to `30` (`5 x 2 x 3`) could be split by a
+factor of `2` or `3`. In other words, it could be split as follows:
+
+* `5` -> `10` -> `30` (split by 2, then by 3)
+* `5` -> `15` -> `30` (split by 3, then by 2)
+* `5` -> `30` (split by 6)
 
 Splitting works as follows:
 
@@ -29,7 +31,7 @@ Splitting works as follows:
   into the new index, which is a much more time consuming process.)
 
 * Once the low level files are created all documents will be `hashed` again to delete
-  documents that belong in a different shard.
+  documents that belong to a different shard.
 
 * Finally, it recovers the target index as though it were a closed index which
   had just been re-opened.
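
Note on the `5` -> `10` / `15` / `30` example in the new text: the rule it illustrates can be checked with a few lines of Python. This is only a sketch of the arithmetic described in the docs (a target shard count must be a multiple of the source shard count and a factor of `index.number_of_routing_shards`); the function name `valid_split_targets` is hypothetical and is not part of Elasticsearch or any client library.

```python
from typing import List


def valid_split_targets(num_shards: int, routing_shards: int) -> List[int]:
    """Hypothetical helper: list the shard counts an index could be split into.

    Assumes the rule stated in the docs text: every original shard is split
    into two or more shards (so the target is a multiple of num_shards), and
    the target must be a factor of number_of_routing_shards.
    """
    if num_shards < 1 or routing_shards < 1:
        raise ValueError("shard counts must be positive")
    # Walk the multiples of num_shards, starting at 2x, up to routing_shards,
    # and keep those that divide routing_shards evenly.
    return [
        target
        for target in range(2 * num_shards, routing_shards + 1, num_shards)
        if routing_shards % target == 0
    ]


if __name__ == "__main__":
    print(valid_split_targets(5, 30))
    print(valid_split_targets(8, 32))
    print(valid_split_targets(1, 15))
```

The three calls correspond to the examples in the old and new wording (5 shards with 30 routing shards, 8 with 32, and 1 with 15), so the sketch can be used to confirm that both versions of the text describe the same constraint.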