Add mix-version testing for sending snapshot cases #14807

Closed
ahrtr opened this issue Nov 18, 2022 · 6 comments

Comments

@ahrtr
Member

ahrtr commented Nov 18, 2022

What would you like to be added?

We already have mix-version tests (e.g., a cluster that has both 3.5 and 3.6 nodes), but they do not cover the cases of sending a snapshot. Specifically, we need to cover the following two cases:

  1. 3.5 node sends a snapshot to 3.6 node;
  2. 3.6 node sends a snapshot to 3.5 node;

Note that etcd doesn't support adding a new member of an older version to a cluster with a higher cluster version. For example, if the etcd cluster version is 3.6.x, a new 3.5.x member can't join the cluster. Please refer to cluster_util.go#L222-L230. So we can't implement case 2 by dynamically adding a node with an older version; please refer to #14707. Instead, we can use a network partition scenario to trigger snapshots.
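
To make the joining rule above concrete, here is a minimal Go sketch. It is not the code in cluster_util.go: the `version` type and the `canJoin` helper are hypothetical names introduced only for illustration, patch levels are ignored, and any other compatibility checks etcd performs are not modeled.

```go
package main

import "fmt"

// version is a simplified major.minor pair (hypothetical type for this sketch).
type version struct {
	Major, Minor int
}

// canJoin encodes only the rule described in this issue: a member whose
// major.minor version is lower than the cluster version cannot join.
// Hypothetical helper; etcd's real check has additional conditions.
func canJoin(memberVer, clusterVer version) bool {
	if memberVer.Major != clusterVer.Major {
		return memberVer.Major > clusterVer.Major
	}
	return memberVer.Minor >= clusterVer.Minor
}

func main() {
	// A 3.5 member trying to join a cluster whose cluster version is 3.6.
	fmt.Println(canJoin(version{3, 5}, version{3, 6})) // false
	// A 3.6 member joining a cluster whose cluster version is still 3.5.
	fmt.Println(canJoin(version{3, 6}, version{3, 5})) // true
}
```

This is why case 2 cannot be set up by adding a 3.5 member to a 3.6 cluster: under this rule the join is rejected before any snapshot could be sent.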

Why is this needed?

We need to make sure there are no issues when upgrading a multi-node etcd cluster. Also refer to #14592.

@ahrtr
Member Author

ahrtr commented Nov 18, 2022

Reference: cluster.go#L1343

@halegreen
Contributor

@ahrtr Hi, I want to take this. However, I have only seen the injectPartition method in integration test cases, not in e2e test cases. So I plan to modify iptables to drop traffic on the peer port, in order to mock a network partition for the e2e cluster. Would that be OK?

@ahrtr
Member Author

ahrtr commented Dec 4, 2022

Thanks @halegreen for working on this. Programmatically modifying iptables does not seem like a good solution, for these reasons:

  1. It might have an impact on other test cases;
  2. It isn't good to update any system-wide setting, even if you programmatically roll back the change afterwards. Note that we may manually run any e2e test case locally, so it may affect developers' local environments.

A simpler & safer solution is to stop a member, play traffic and start it again. @ptabor @serathius any thoughts?

@halegreen
Contributor

halegreen commented Dec 17, 2022

Hi @ahrtr, regarding "stop a member, play traffic and start it again": I was wondering, will this action always lead to a network partition?

@ahrtr
Member Author

ahrtr commented Dec 17, 2022

> Hi @ahrtr, regarding "stop a member, play traffic and start it again": I was wondering, will this action always lead to a network partition?

It's acceptable as long as a snapshot can be sent from the leader to the follower when the follower is started again.
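
As a rough illustration of why stopping a member and writing traffic can be enough, the sketch below models the general Raft behavior: once the leader has compacted away the entries a lagging follower still needs, it has to send a snapshot instead of log entries. The `leaderLog` type, the `write` and `needsSnapshot` helpers, and all the numbers are assumptions made for this example; they are not etcd's implementation, and in a real test the amount of traffic needed would depend on settings such as `--snapshot-count`.

```go
package main

import "fmt"

// leaderLog models only the range of entries the leader still holds
// after compaction (hypothetical type for this sketch).
type leaderLog struct {
	firstIndex uint64 // oldest entry still available on the leader
	lastIndex  uint64 // newest entry on the leader
}

// write appends n entries, then compacts so that at most keep entries remain.
func (l *leaderLog) write(n, keep uint64) {
	l.lastIndex += n
	if l.lastIndex-l.firstIndex+1 > keep {
		l.firstIndex = l.lastIndex - keep + 1
	}
}

// needsSnapshot reports whether the next entry the follower needs has
// already been compacted away on the leader.
func needsSnapshot(l leaderLog, followerLastIndex uint64) bool {
	return followerLastIndex+1 < l.firstIndex
}

func main() {
	leader := leaderLog{firstIndex: 1, lastIndex: 10}
	followerLastIndex := uint64(10) // the follower is stopped at this point

	leader.write(5, 100) // a little traffic: nothing is compacted yet
	fmt.Println(needsSnapshot(leader, followerLastIndex)) // false

	leader.write(500, 100) // enough traffic to compact past the follower
	fmt.Println(needsSnapshot(leader, followerLastIndex)) // true
}
```

Under these assumptions, the test does not need a real network partition; it only needs the stopped follower to fall far enough behind that the leader can no longer catch it up from its log.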

@stale

stale bot commented Mar 18, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Mar 18, 2023
@stale stale bot closed this as completed May 22, 2023