Closed index replica allocation #41784
When an index is closed, we expect primary and replicas to be identical. This commit improves the gateway replica shard allocator to consider shards with identical sequence numbers sync'ed for closed indices. This ensures that we will pick a fast recovery regardless of whether synced flush was performed prior to closing an index. Relates elastic#41400 and elastic#33888
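As a rough illustration of the idea above, the check could look something like the following. This is a minimal sketch, not the code in this PR: `SeqNoSyncSketch`, `CommitSeqNoStats` and `consideredSynced` are hypothetical names, and it assumes the max sequence number and local checkpoint of each copy's last commit are available to the allocator.

```java
// Hypothetical sketch; none of these names are Elasticsearch classes.
final class SeqNoSyncSketch {

    /** Sequence-number stats assumed to be read from a shard copy's last commit. */
    static final class CommitSeqNoStats {
        final long maxSeqNo;
        final long localCheckpoint;

        CommitSeqNoStats(long maxSeqNo, long localCheckpoint) {
            this.maxSeqNo = maxSeqNo;
            this.localCheckpoint = localCheckpoint;
        }

        /** True when the commit contains every operation up to maxSeqNo (no gaps). */
        boolean isComplete() {
            return maxSeqNo == localCheckpoint;
        }
    }

    /**
     * Treats two copies of a closed index shard as synced when both commits are
     * gap-free and end at the same sequence number.
     */
    static boolean consideredSynced(boolean indexClosed,
                                    CommitSeqNoStats primary,
                                    CommitSeqNoStats replica) {
        if (!indexClosed) {
            // Assumption: open indices keep relying on the sync id from synced flush.
            return false;
        }
        return primary.isComplete()
            && replica.isComplete()
            && primary.maxSeqNo == replica.maxSeqNo;
    }
}
```

With this shape, a closed index whose copies all stopped at the same sequence number would qualify for a fast recovery even if no synced flush had been run before closing.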
Pinging @elastic/es-distributed
Added integration test validating that fast recovery is made for closed indices when multiple shard copies can be chosen from. Fixed InternalTestCluster to allow doing operations inside onStopped() when using restartXXXNode(). Relates elastic#41400 and elastic#33888
to assume closed indices are synced.
ci/1 failed with unrelated failure, reported here: #41794
…x_replica_allocation
@henningandersen I have merged #41400.
…x_replica_allocation
GatewayIndexIT relies on getInstance returning closed node inside onStopped.
@elasticmachine run elasticsearch-ci/1
It looks like I’m mostly wondering if we should generalize the logic a bit more, and not rely on the max seq no / local checkpoint of the last commit, but explicitly enrich the
If we then have the condition that |
^^
This is a first step away from sync-ids. We now check if replica and primary are identical using sequence numbers when determining where to allocate a replica shard. If an index is no longer indexed into, issuing a regular flush will now be enough to ensure a no-op recovery is done. This has the nice side-effect of ensuring that closed indices and frozen indices choose existing shard copies with identical data over file-overlap comparison, increasing the chance that we end up doing a no-op recovery (only no-op and file-based recoveries are supported by closed indices). Relates elastic#41400 and elastic#33888 Supersedes elastic#41784
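The preference described above (identical data first, file overlap second) can be sketched as a simple ranking over candidate copies. This is only an illustration under assumptions: `ReplicaCandidateSketch`, `Candidate` and `pick` are hypothetical names, and the real allocator takes more inputs into account than the two fields used here.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; none of these names are Elasticsearch classes.
final class ReplicaCandidateSketch {

    /** An existing shard copy on some node that a replica could be allocated to. */
    static final class Candidate {
        final String nodeId;
        final boolean identicalSeqNos; // copy matches the primary by sequence numbers
        final long reusableBytes;      // bytes of segment files shared with the primary

        Candidate(String nodeId, boolean identicalSeqNos, long reusableBytes) {
            this.nodeId = nodeId;
            this.identicalSeqNos = identicalSeqNos;
            this.reusableBytes = reusableBytes;
        }
    }

    /** Picks the copy most likely to give a no-op recovery, if any candidate exists. */
    static Optional<Candidate> pick(List<Candidate> candidates) {
        Candidate best = null;
        for (Candidate candidate : candidates) {
            if (best == null || isBetter(candidate, best)) {
                best = candidate;
            }
        }
        return Optional.ofNullable(best);
    }

    private static boolean isBetter(Candidate a, Candidate b) {
        if (a.identicalSeqNos != b.identicalSeqNos) {
            // A sequence-number match wins regardless of file overlap.
            return a.identicalSeqNos;
        }
        // Otherwise fall back to comparing how many bytes can be reused.
        return a.reusableBytes > b.reusableBytes;
    }
}
```

In this sketch a copy with matching sequence numbers is chosen even when another node shares more file bytes with the primary, since only the former can recover without copying files.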
@henningandersen Should we close this PR?
Thanks @dnhatn, yes this can be closed now.
Fixed InternalTestCluster to allow doing operations inside onStopped() when using restartXXXNode(). Relates #41400 and #33888
Please notice the todo on the explain API.