Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update #1

Merged
merged 10 commits into from
Sep 12, 2023
Merged

update #1

merged 10 commits into from
Sep 12, 2023

Conversation

ts170710
Copy link
Owner

No description provided.

hzhou8 and others added 10 commits September 7, 2023 08:53
ct_lb is only for backward compatibility, so update the test to align to
the latest behavior with ct_lb_mark.

Signed-off-by: Han Zhou <hzhou@ovn.org>
Acked-by: Ales Musil <amusil@redhat.com>
Add option ignore_chassis_features for northd to bypass the support
status of features on each chassis and to directly implement the latest
features.

This is particularly useful for users who follow the suggested upgrade
order that upgrades ovn-controllers before ovn-northd, and want to
safeguard the operation of northd from being adversely affected by a
mismatched configuration of a chassis. This not only avoids feature
degradation but can also avoid risks of any broken feature in backward
compatible mode. An example of such impact is discussed at [0].

[0] https://mail.openvswitch.org/pipermail/ovs-dev/2023-September/407756.html

Signed-off-by: Han Zhou <hzhou@ovn.org>
Acked-by: Ales Musil <amusil@redhat.com>
skip_snat and force_snat were added in ct_lb/ct_lb_mark action by commit
c1d6b8a to set skip_snat and force_snat bits in ct-lable/mark. It is
not related to the ct_lb_related feature. This patch removes the check.

Without this fix, the skip_snat/force_snat features are broken when
northd is running in "backward compatible" mode, when there is any
ovn-controller running an older version that doesn't support
"ct_lb_related". northd would not add the skip_snat/force_snat in the
ct_lb action in such case, and the relevant bits won't be set in CT
(even on nodes running ovn-controller that supports the
skip_snat/force_snat in ct_lb action), but those CT bits are required to
be matched in another logical flows so that the relevant register flags
can be set (the behavior introduced by the commit ce46a1b):

match=(ct.est && !ct.rel && !ct.new && ct_mark.natted && ct_mark.force_snat == 1), action=(flags.force_snat_for_lb = 1; next;)
match=(ct.est && !ct.rel && !ct.new && ct_mark.natted && ct_mark.skip_snat == 1), action=(flags.skip_snat_for_lb = 1; next;)

Because of this, the skip_snat/force_snat doesn't work in "backward
compatible" mode. With this fix, the feature continues to work between
"backward compatible" mode northd and all nodes that has the ct_lb
skip_snat/force_snat support, while nodes running older version of
ovn-controller would report syntax error on such logical flows. While
this may not sound perfect, but if users follow the suggested upgrade
order so that ovn-controllers are upgraded before ovn-northd, there is
no problem. The biggest benefit of this fix is that when there is a bad
node that fails upgrading ovn-controller, the skip_snat/force_snat
features are not broken.

Alternatively, we could fix the problem by reverting commit ce46a1b.
However, there were already several fixes and refactors for the related
code on top of that, it is not straightforward. The code would become
more complex and the value of the backward compatibility for a
northd-first upgrade order is not obvious for this feature which was
introduced 2 release ago. In addition, there are other places when the
ct_lb skip_snat/force_snat are used without checking any chassis feature
support, such as in build_lb_affinity_lr_flows(). So, based on all the
above considerations, simply removing the feature compatiblity check
seems to be a more reasonable fix.

Fixes: c1d6b8a ("northd: Store skip_snat and force_snat in ct_label/mark")
Fixes: ce46a1b ("northd: Use generic ct.est flows for LR LBs")
Signed-off-by: Han Zhou <hzhou@ovn.org>
Acked-by: Ales Musil <amusil@redhat.com>
Fix the missing test case update of a previous commit.

Fixes: 97c1b17 ("northd: Don't check ct_lb_related feature for skip_snat/force_snat.")
Signed-off-by: Han Zhou <hzhou@ovn.org>
Similar to the commit [1], a new sub-engine node "sync_to_sb_lb"
is added with-in the "sync_to_sb" to sync the SB load balancers.
Its main input nodes are "northd" (to access the "lbs" hmap built
by this node) and "sb_load_balancer" to access the SB load balancer.
"sync_to_sb_lb" doesn't add any handlers and falls back to full
recompute all the time.

[1] - ccafcc2("northd I-P: Add a new engine node 'en_sync_to_sb' to sync SB tables.")

Acked-by: Han Zhou <hzhou@ovn.org>
Reviewed-by: Ales Musil <amusil@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
This patch separates out the 'lbs' and 'lb_groups' from the 'northd' engine
node data into a new engine node 'lb_data'.  This new node becomes
an input to the 'northd' node.

This makes handling the NB load balancer and load balancer group changes
easier.

Prior to this patch, 'struct ovn_northd_lb' and 'struct ovn_lb_group'
were maintaing the logical switch and logical router association
with a load balancer.  This data is now maintained by 'northd' engine
node.  New structs 'struct ovn_lb_datapaths' and 'struct
ovn_lb_group_datapaths' is added for this purpose.

Acked-by: Han Zhou <hzhou@ovn.org>
Reviewed-by: Ales Musil <amusil@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
Any changes to load balancers and load balancer groups
are handled incrementally in the newly added 'lb_data'
engine node.  'lb_data' is input to 'northd' node
and the handler - northd_lb_data_handler in 'northd'
node handles the changes.

If a load balancer or load balancer group is associated to
a logical switch or router then 'northd' node falls back
to full recompute.  Upcoming patch will handle this scenario.

Acked-by: Han Zhou <hzhou@ovn.org>
Reviewed-by: Ales Musil <amusil@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
…changes.

This will help in handling other column changes of a logical switch.

Acked-by: Han Zhou <hzhou@ovn.org>
Reviewed-by: Ales Musil <amusil@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
Single test needs to run 4 times to check all
the permutations, this is taking a lot of CI time
and job space.

Remove the parallel permutation and leave parallelization
enabled for all tests, as this use case is more complex.
Only exception are 10 tests in ovn-northd.at that still run
parallelization permutations. This should be enough to cover
the code with parallelization disabled.

This allows us to greatly reduce the number of test cases
(almost by half) and we can also remove 6 jobs from the CI
pipeline. The time reduction is very noticeable going down
from ~30 min to ~20 min.

Signed-off-by: Ales Musil <amusil@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
…tion.

Commit 506f7d4 ("northd: rely on new actions for ecmp-symmetric
routing") relied on commit_ecmp_nh() to learn the mapping between a
traffic session's 5-tuple and the MAC of the next-hop to be used for
that session.  This logical action translates to OVS learn() action.

While in theory this is the most correct way of tracking the mapping
between ECMP symmetric reply sessions and their next hop's MAC address,
it also creates additional load in ovs-vswitchd (to learn the new flows)
and introduces latency and affects traffic throughput.

An alternative is to instead re-commit the 5-tuple (along with the
next-hop MAC and port information) to conntrack every time traffic
received from the next-hop side is forwarded by the OVN router.

Testing shows that in a scenario with 4 next-hops and ECMP symmetric
replies enabled with traffic running for 600 seconds latency, throughput
and ovs-vswitchd CPU usage are significantly better with this change:

- Before:
  - ovs-vswitchd ~1200% CPU (distributed across 17 revalidator threads)
  - Sent: 638.72MiB, 1.06MiB/s
  - Recv: 7.17GiB, 12.24MiB/s

- After:
  - ovs-vswitchd ~7% CPU (distributed across 17 revalidator threads)
  - Sent: 892.69MiB, 1.49MiB/s
  - Recv: 8.63GiB, 14.72MiB/s

The only downside of not using learn() flows is that OVN cannot
determine on its own when a next-hop goes away (without using BFD).
This scenario however can probably be handled by the CMS which has more
knowledge about the rest of the network (outside OVN), e.g.,
ovn-kubernetes currently flushes conntrack entries created for ECMP
symmetric reply sessions when it detects that the next-hop went away.

If a next-hops changes MAC address that's handled by OVN gracefully and
the conntrack entry corresponding to that session gets updated
accordingly.

NOTE: we don't remove the logical actions' implementation as
ovn-controller needs to be able to translate logical flows generated by
older versions of ovn-northd.

Fixes: 506f7d4 ("northd: rely on new actions for ecmp-symmetric routing")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Reviewed-by: Ales Musil <amusil@redhat.com>
@ts170710 ts170710 merged commit 433502d into ts170710:main Sep 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants