
Extend full upgrade to any version upgrade of non-HA ES #5408

Merged
merged 5 commits on Feb 24, 2022

Conversation

pebrc
Collaborator

@pebrc pebrc commented Feb 23, 2022

Elasticsearch documentation implies that two-node clusters cannot be upgraded in a rolling fashion, because doing so would remove more than 50% of the master-eligible nodes during the upgrade. A stricter validation of this invariant, introduced in 7.14, was causing the failures we have seen in our e2e test pipelines once we had fixed the faulty non-HA upgrade test. This affects any 7.x version, but as of 7.14 it is a hard error and the nodes will not boot up.
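The majority constraint behind this can be sketched with a few lines of Go (illustrative only; `quorum` is not a function from this codebase): a master election needs a strict majority of the master-eligible nodes, so a two-node cluster cannot lose a single node, while a three-node cluster can.

```go
package main

import "fmt"

// quorum returns the minimum number of master-eligible nodes that must stay
// available for the cluster to keep electing a master: a strict majority.
func quorum(masterNodes int) int {
	return masterNodes/2 + 1
}

func main() {
	for _, n := range []int{2, 3} {
		q := quorum(n)
		// A rolling upgrade takes one node down at a time, leaving n-1.
		fmt.Printf("%d masters: quorum=%d, %d left after removing one, quorum held: %v\n",
			n, q, n-1, n-1 >= q)
	}
}
```

With two masters the quorum is also two, so removing either node during a rolling restart breaks the voting majority; that is why the full-restart path below is needed.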

In light of this information, this PR extends the full-restart upgrade mode to any version upgrade of non-HA Elasticsearch clusters. It also adjusts the setting of initial master nodes during the 6.x-to-7.x transition to include two-node clusters. This gives us a uniform way of handling those upgrades.

There is still a case where an upgrade can get stuck that this PR does not cover: a user combining a version upgrade with an upscale/downscale operation, e.g. by bumping the Elasticsearch version and renaming the nodeSet at the same time. This leads to something that looks like a rolling upgrade, as we remove the old nodes and add new nodes one by one. The upgrade can still get stuck in that scenario, and I don't see any easy way to fix it (other than through documentation).

One potential way of how the upgrade might get stuck is as follows:

  • we create a new node running the new version, but the image needs to be pulled or the startup is otherwise delayed
  • we delete one of the existing nodes (in accordance with the change budget)
  • the old node terminates before the new node joins the cluster; if the old node happened to be the master at the time, the cluster now breaks down

I think this scenario can be avoided by setting a change budget with maxUnavailable: 0 or by avoiding the combination of nodeSet addition/removal with a version upgrade.
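For reference, the change-budget workaround mentioned above would look roughly like this in an Elasticsearch manifest (a minimal sketch; field names follow the ECK `Elasticsearch` CRD, and the cluster name and version are placeholders):

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: example
spec:
  version: 7.17.0
  updateStrategy:
    changeBudget:
      maxSurge: 1
      # never take an old node down before its replacement has joined
      maxUnavailable: 0
  nodeSets:
  - name: default
    count: 3
```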

@pebrc pebrc added the >enhancement Enhancement of existing functionality label Feb 23, 2022
@barkbay
Contributor

barkbay commented Feb 24, 2022

the old node terminates before the new node joins the cluster and if the old node happened to be the master at the time the cluster now breaks down

I think I'm missing something: assuming the upscaled+renamed nodeSet is configured with 3 replicas, we would still have 2 nodes running, which I think would be enough to trigger a new election?
If the upscaled+renamed nodeSet is configured with 2 replicas, it would then be restarted because isNonHACluster would return true. Both situations seem to be covered here?

@pebrc
Collaborator Author

pebrc commented Feb 24, 2022

If the upscaled+renamed nodeSet is configured with 2 replicas, it would then be restarted because isNonHACluster would return true. Both situations seem to be covered here?

I think the problem is that once you rename a nodeSet, we are no longer executing the upgrade path of the reconciliation loop; instead we have two independent operations that are only coordinated through the maxUnavailable change budget: a downscale of the old nodeSet and an upscale of the new one. Neither the downscale nor the upscale logic benefits from the changes in this PR, so we are again in a situation where things can go wrong.
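For context, the check named in the discussion can be sketched roughly like this (the name `isNonHACluster` is taken from the conversation above, but the signature here is illustrative; the real function inspects the cluster spec, not a plain count). The issue described in this comment is that the independent downscale/upscale paths never go through this upgrade-specific logic:

```go
package main

import "fmt"

// isNonHACluster sketches the idea discussed above: with fewer than three
// master-eligible nodes, a rolling restart cannot preserve a voting majority,
// so the operator must fall back to a full restart instead.
func isNonHACluster(masterEligibleNodes int) bool {
	return masterEligibleNodes < 3
}

func main() {
	fmt.Println(isNonHACluster(2)) // two-node cluster: full-restart path
	fmt.Println(isNonHACluster(3)) // three masters: rolling upgrade is safe
}
```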

Contributor

@barkbay barkbay left a comment


LGTM from a code perspective.
I'm running TestVersionUpgrade* e2e tests locally, will ✅ once it's done.

@@ -75,7 +76,7 @@ func shouldSetInitialMasterNodes(es esv1.Elasticsearch, k8sClient k8s.Client, no
return true, nil
}
// - we're upgrading (effectively restarting) a single zen1 master to zen2
Contributor


nit: I think this comment should be updated

@naemono naemono added the v2.1.0 label Feb 24, 2022
@barkbay
Contributor

barkbay commented Feb 24, 2022

TestVersionUpgradeTwoNodes68xTo7x and TestVersionUpgradeTwoNodesToLatest7x failed:

=== RUN   TestVersionUpgradeTwoNodes68xTo7x/Stopping_to_pod_count_for_change_budget:_expect_to_stay_within_the_change_budget
    checks_budget.go:105: 
        	Error Trace:	checks_budget.go:105
        	            				watcher.go:83
        	Error:      	change budget violated
        	Test:       	TestVersionUpgradeTwoNodes68xTo7x/Stopping_to_pod_count_for_change_budget:_expect_to_stay_within_the_change_budget
        	Messages:   	ready pod count 0 when allowed min was 1
=== RUN   TestVersionUpgradeTwoNodesToLatest7x/Stopping_to_pod_count_for_change_budget:_expect_to_stay_within_the_change_budget
    checks_budget.go:105: 
        	Error Trace:	checks_budget.go:105
        	            				watcher.go:83
        	Error:      	change budget violated
        	Test:       	TestVersionUpgradeTwoNodesToLatest7x/Stopping_to_pod_count_for_change_budget:_expect_to_stay_within_the_change_budget
        	Messages:   	ready pod count 0 when allowed min was 1

I think we have to use the WithChangeBudget(2, 2) "trick":

diff --git a/test/e2e/es/version_upgrade_test.go b/test/e2e/es/version_upgrade_test.go
index 113fac858..63b9e8705 100644
--- a/test/e2e/es/version_upgrade_test.go
+++ b/test/e2e/es/version_upgrade_test.go
@@ -55,6 +55,10 @@ func TestVersionUpgradeTwoNodes68xTo7x(t *testing.T) {
        // due to minimum_master_nodes=2, the cluster is unavailable while the first master is upgraded
        initial := elasticsearch.NewBuilder("test-version-up-2-68x-to-7x").
                WithVersion(srcVersion).
+               // 7x non-HA upgrades cannot honour a change budget with maxUnavailable 1 because we are rolling both nodes at once
+               // setting the change budget accordingly allows this test to pass as otherwise the changeBudgetWatcher step would
+               // fail as it would detect a change budget violation.
+               WithChangeBudget(2, 2).
                WithESMasterDataNodes(2, elasticsearch.DefaultResources)
 
        mutated := initial.WithNoESTopology().
@@ -173,6 +177,10 @@ func TestVersionUpgradeTwoNodesToLatest7x(t *testing.T) {
 
        initial := elasticsearch.NewBuilder("test-version-up-2-to-7x").
                WithVersion(srcVersion).
+               // 7x non-HA upgrades cannot honour a change budget with maxUnavailable 1 because we are rolling both nodes at once
+               // setting the change budget accordingly allows this test to pass as otherwise the changeBudgetWatcher step would
+               // fail as it would detect a change budget violation.
+               WithChangeBudget(2, 2).
                WithESMasterDataNodes(2, elasticsearch.DefaultResources)
 
        mutated := initial.WithNoESTopology().

@pebrc
Collaborator Author

pebrc commented Feb 24, 2022

I think we have to use the WithChangeBudget(2, 2) "trick"

Good catch. I thought I tested that but apparently not 😞

Contributor

@barkbay barkbay left a comment


LGTM !

@pebrc pebrc merged commit bc4e28a into elastic:main Feb 24, 2022
fantapsody pushed a commit to fantapsody/cloud-on-k8s that referenced this pull request Feb 7, 2023