Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Index Deletion during Snapshot Finalization #50202

Merged
merged 6 commits into from
Dec 16, 2019

Conversation

original-brownbear
Copy link
Member

@original-brownbear original-brownbear commented Dec 14, 2019

With #45689 making it so that index metadata is written
after all shards have been snapshotted we can't delete indices
that are part of the upcoming snapshot finalization any longer
and it is not sufficient to check if all shards of an index have been
snapshotted before deciding that it is safe to delete it.
This change forbids deleting any index that is in the process of being
snapshot to avoid issues during snapshot finalization.

Relates #50200 (doesn't fully fix yet because we're not fixing the partial=true snapshot case here

With elastic#45689 making it so that index metadata is written
after all shards have been snapshotted we can't delete indices
that are part of the upcoming snapshot finalization any longer
and it is not sufficient to check if all shards of an index have been
snapshotted before deciding that it is safe to delete it.
This change forbids deleting any index that is in the process of being
snapshot to avoid issues during snapshot finalization.

Closes elastic#50200
@original-brownbear original-brownbear added >bug :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.6.0 v7.5.2 labels Dec 14, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

indices.add(index);
}
}
for (IndexId index : entry.indices()) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Admittedly this makes it a little "harder" to delete an index, but I don't see it as much of an issue relative to the complication it saves. If we don't do it this way, we'd have to add another step to write index metadata per index (once all the shards for that index have sucessfull been snapshotted) to the state machine which doesn't seem worth it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I don't think this weakens anything meaningful, since the order in which indices are snapshotted isn't specified. The existing behaviour seems overly heroic.

DaveCTurner
DaveCTurner previously approved these changes Dec 16, 2019
Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

indices.add(index);
}
}
for (IndexId index : entry.indices()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I don't think this weakens anything meaningful, since the order in which indices are snapshotted isn't specified. The existing behaviour seems overly heroic.

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not do this. The reason this functionality is in place (allowing deletes during partial snapshots) is that it allows background snapshots while not interfering with user-level actions such as deletes. Note that we discussed this at the time here: #16321

@original-brownbear
Copy link
Member Author

@ywelsch makes sense. Are you ok with this behavior for non-partial snapshots though?
If so, I would just fix the finalization of partial snapshots in such a way that missing metadata in the end is just handled as shard failures for the deleted indices. WDYT?

@ywelsch
Copy link
Contributor

ywelsch commented Dec 16, 2019

Are you ok with this behavior for non-partial snapshots though?

yes, I think that's ok, at least to get a quick fix out.

@DaveCTurner DaveCTurner dismissed their stale review December 16, 2019 08:33

Ugh I missed the vital check for partiality.

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll address partial snapshots in a follow-up. LGTM

@original-brownbear
Copy link
Member Author

original-brownbear commented Dec 16, 2019

@ywelsch Thanks! 7.5.1 here now right?

@ywelsch
Copy link
Contributor

ywelsch commented Dec 16, 2019

Let's backport to 7.5 branch for now, and have a separate discussion whether it makes it to 7.5.1

@original-brownbear original-brownbear merged commit aecbb2f into elastic:master Dec 16, 2019
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Dec 16, 2019
With elastic#45689 making it so that index metadata is written
after all shards have been snapshotted we can't delete indices
that are part of the upcoming snapshot finalization any longer
and it is not sufficient to check if all shards of an index have been
snapshotted before deciding that it is safe to delete it.
This change forbids deleting any index that is in the process of being
snapshot to avoid issues during snapshot finalization.

Relates elastic#50200 (doesn't fully fix yet because we're not fixing the `partial=true`
snapshot case here
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Dec 16, 2019
With elastic#45689 making it so that index metadata is written
after all shards have been snapshotted we can't delete indices
that are part of the upcoming snapshot finalization any longer
and it is not sufficient to check if all shards of an index have been
snapshotted before deciding that it is safe to delete it.
This change forbids deleting any index that is in the process of being
snapshot to avoid issues during snapshot finalization.

Relates elastic#50200 (doesn't fully fix yet because we're not fixing the `partial=true`
snapshot case here
original-brownbear added a commit that referenced this pull request Dec 16, 2019
With #45689 making it so that index metadata is written
after all shards have been snapshotted we can't delete indices
that are part of the upcoming snapshot finalization any longer
and it is not sufficient to check if all shards of an index have been
snapshotted before deciding that it is safe to delete it.
This change forbids deleting any index that is in the process of being
snapshot to avoid issues during snapshot finalization.

Relates #50200 (doesn't fully fix yet because we're not fixing the `partial=true`
snapshot case here
original-brownbear added a commit that referenced this pull request Dec 16, 2019
* Fix Index Deletion during Snapshot Finalization (#50202)

With #45689 making it so that index metadata is written
after all shards have been snapshotted we can't delete indices
that are part of the upcoming snapshot finalization any longer
and it is not sufficient to check if all shards of an index have been
snapshotted before deciding that it is safe to delete it.
This change forbids deleting any index that is in the process of being
snapshot to avoid issues during snapshot finalization.

Relates #50200 (doesn't fully fix yet because we're not fixing the `partial=true`
snapshot case here
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Dec 16, 2019
We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to elastic#50202
Closes elastic#50200
@jasontedor jasontedor added v7.5.1 and removed v7.5.2 labels Dec 16, 2019
original-brownbear added a commit that referenced this pull request Dec 17, 2019
* Fix Index Deletion During Partial Snapshot Create

We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to #50202
Closes #50200
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Dec 17, 2019
* Fix Index Deletion During Partial Snapshot Create

We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to elastic#50202
Closes elastic#50200
original-brownbear added a commit that referenced this pull request Dec 17, 2019
We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to #50202
Closes #50200
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this pull request Jan 23, 2020
With elastic#45689 making it so that index metadata is written
after all shards have been snapshotted we can't delete indices
that are part of the upcoming snapshot finalization any longer
and it is not sufficient to check if all shards of an index have been
snapshotted before deciding that it is safe to delete it.
This change forbids deleting any index that is in the process of being
snapshot to avoid issues during snapshot finalization.

Relates elastic#50200 (doesn't fully fix yet because we're not fixing the `partial=true`
snapshot case here
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this pull request Jan 23, 2020
* Fix Index Deletion During Partial Snapshot Create

We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to elastic#50202
Closes elastic#50200
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v7.5.1 v7.6.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants