Reindex resolve indices early #49850

henningandersen
Contributor

Resolve indices before starting to reindex. This ensures that the list
of indices does not change when failing over (TBD). The one exception to
this is aliases, which we still need to access through the alias.

In addition, the indices resolved from each index pattern are sorted by
creation date; otherwise the listed order is preserved. This ensures that
once we reindex one index at a time, we will get reasonable time locality
for time-based indices.
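
The ordering rule described above could be sketched roughly as follows. This is a minimal illustration, not the `resolveIndexPatterns` code from this PR: the class `IndexOrderSketch`, its methods, and the `Map<String, Long>` of index name to creation time are all hypothetical stand-ins for the real index metadata. Names without a wildcard (including aliases) are kept exactly as listed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch; not the resolveIndexPatterns implementation from this PR.
final class IndexOrderSketch {

    // Turns a simple glob such as "logs-*" into a regex; only '*' is treated as a wildcard.
    static Pattern globToPattern(String glob) {
        String regex = Arrays.stream(glob.split("\\*", -1))
            .map(Pattern::quote)
            .collect(Collectors.joining(".*"));
        return Pattern.compile(regex);
    }

    // existing maps index name -> creation time in epoch millis (an assumed stand-in for index metadata).
    static List<String> resolve(List<String> expressions, Map<String, Long> existing) {
        Set<String> ordered = new LinkedHashSet<>(); // keeps first-seen order, drops duplicates
        for (String expression : expressions) {
            if (expression.contains("*")) {
                Pattern pattern = globToPattern(expression);
                // Matches of one pattern are sorted by creation time, ties broken by name.
                existing.entrySet().stream()
                    .filter(e -> pattern.matcher(e.getKey()).matches())
                    .sorted(Map.Entry.<String, Long>comparingByValue()
                        .thenComparing(Map.Entry.comparingByKey()))
                    .map(Map.Entry::getKey)
                    .forEach(ordered::add);
            } else {
                // Concrete names and aliases are kept as listed, in their listed position.
                ordered.add(expression);
            }
        }
        return new ArrayList<>(ordered);
    }
}
```

For example, resolving `other, logs-*` against indices `logs-1`, `logs-2`, `logs-3` created in that order would keep `other` first and then list the `logs-*` matches oldest first.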

The resolved list of indices will also be used to search one index (or
index group) at a time, improving search performance (since we use sort)
and allowing more fine-grained checkpointing and progress tracking (TBD).
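
To make the per-index checkpoint idea concrete, here is a minimal sketch. The description marks this part as TBD, so everything here is an assumption: `PerIndexProgressSketch`, `reindexFrom`, and the `copyIndex` callback are hypothetical names, and the checkpoint is simply a position in the resolved list.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of per-index progress tracking; names are illustrative only.
final class PerIndexProgressSketch {

    // Copies indices starting at position `checkpoint` and returns the position of the
    // first index that has not been fully copied (resolved.size() when all are done).
    // copyIndex stands in for "search and reindex this one index"; it returns false on failure.
    static int reindexFrom(List<String> resolved, int checkpoint, Predicate<String> copyIndex) {
        int next = checkpoint;
        while (next < resolved.size() && copyIndex.test(resolved.get(next))) {
            next++; // this index is complete; a real task would persist the new checkpoint here
        }
        return next;
    }
}
```

Because the list is resolved once up front, a task that fails over could call `reindexFrom` again with the last stored checkpoint and continue with the same indices in the same order.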

Relates #42612

@henningandersen added the >non-issue and :Distributed/Reindex labels on Dec 5, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/Reindex)

@henningandersen
Contributor Author

ci/1 test failure fixed here: #49855

Contributor

@Tim-Brooks Tim-Brooks left a comment

One question mostly

@@ -87,7 +103,7 @@ protected void doExecute(Task task, StartReindexTaskAction.Request request, Acti

// In the current implementation, we only need to store task results if we do not wait for completion
boolean storeTaskResult = request.getWaitForCompletion() == false;
- ReindexTaskParams job = new ReindexTaskParams(storeTaskResult, included);
+ ReindexTaskParams job = new ReindexTaskParams(storeTaskResult, resolveIndexPatterns(request.getReindexRequest()), included);
Contributor

Do we want to be putting this in the cluster state? I don't have a sense for how large this gets, but I assume it could go in the index?

Contributor Author

Thanks, this is clearly wrong, fixed in 33d8fd4.

@henningandersen
Contributor Author

Marking this as WIP; I intend to close it later, based on our discussion the other day. The conclusion there was that we should try to improve this inside search rather than try to improve reindex performance by processing one index at a time.

@rjernst added the Team:Distributed label on May 4, 2020
@colings86 closed this on May 27, 2020
Labels
:Distributed/Reindex, >non-issue, Team:Distributed, WIP
5 participants