merge #3

Merged
merged 133 commits into from
Dec 27, 2023

Conversation

ts170710
Owner

No description provided.

numansiddique and others added 30 commits September 13, 2023 21:59
'lb_data' engine node now also handles logical switch changes.
Its data maintains ls to lb related information. i.e if a
logical switch sw0 has lb1, lb2 and lb3 associated then
it stores this info in its data.  And when a new load balancer
lb4 is associated to it, it stores this information in its
tracked data so that 'northd' engine node can handle it
accordingly.  Tracked data will have information like:
  changed ls -> {sw0 : {associated_lbs: [lb4]}}

The first handler 'northd_lb_data_handler_pre_od' is called before the
'northd_nb_logical_switch_handler' handler and it just creates or
deletes the lb_datapaths hmap for the tracked lbs.

The northd handler 'northd_lb_data_handler' updates the
ovn_lb_datapaths's 'nb_ls_map' bitmap accordingly.

Eg.  If the lb_data has the below tracked data:

tracked_data = {'crupdated_lbs': [lb1, lb2],
                'deleted_lbs': [lb3],
                'crupdated_lb_groups': [lbg1, lbg2],
                'crupdated_ls_lbs': [{ls: sw0, assoc_lbs: [lb1]},
                                     {ls: sw1, assoc_lbs: [lb1, lb2]}]}

The handler northd_lb_data_handler() creates the
ovn_lb_datapaths object for lb1 and lb2 and deletes lb3 from
the ovn_lb_datapaths hmap.  It does the same for the created or updated lb
groups lbg1 and lbg2 in the ovn_lbgrp_datapaths map.  It also updates the
nb_ls_bitmap of lb1 for sw0 and sw1 and nb_ls_bitmap of lb2 for sw1.
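
As a rough illustration of the handling described above, here is a minimal
Python sketch. It is not the northd C code: the dict-based "hmap", the integer
bitmaps, the switch indexes and the function name apply_tracked_lb_data are
all assumptions made for this example.

```python
def apply_tracked_lb_data(lb_datapaths, ls_index, tracked):
    """Apply tracked lb_data to a toy model of the lb datapaths map.

    lb_datapaths maps an LB name to an int used as a bitmap of switch
    indexes (hypothetical stand-in for the per-LB switch bitmap).
    """
    # Create an entry for every created or updated load balancer.
    for lb in tracked.get("crupdated_lbs", []):
        lb_datapaths.setdefault(lb, 0)
    # Drop deleted load balancers from the map.
    for lb in tracked.get("deleted_lbs", []):
        lb_datapaths.pop(lb, None)
    # Set the bit of each switch in the bitmap of its associated LBs.
    for entry in tracked.get("crupdated_ls_lbs", []):
        bit = 1 << ls_index[entry["ls"]]
        for lb in entry["assoc_lbs"]:
            lb_datapaths[lb] = lb_datapaths.get(lb, 0) | bit
    return lb_datapaths


# Assumed switch indexes: sw0 -> bit 0, sw1 -> bit 1.
ls_index = {"sw0": 0, "sw1": 1}
lb_datapaths = {"lb3": 0b01}
tracked = {
    "crupdated_lbs": ["lb1", "lb2"],
    "deleted_lbs": ["lb3"],
    "crupdated_ls_lbs": [
        {"ls": "sw0", "assoc_lbs": ["lb1"]},
        {"ls": "sw1", "assoc_lbs": ["lb1", "lb2"]},
    ],
}
apply_tracked_lb_data(lb_datapaths, ls_index, tracked)
```

In this model lb1 ends up with the bits of sw0 and sw1 set, lb2 with the bit
of sw1 only, and lb3 is gone. Load balancer groups are left out for brevity.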

Reviewed-by: Ales Musil <amusil@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
For a given load balancer group 'A', northd engine data maintains
a bitmap of datapaths associated to this lb group.  So when lb group 'A'
gets associated to a logical switch 's1', the bitmap index of 's1' is set
in its bitmap.

In order to handle the load balancer group changes incrementally for a
logical switch, we need to set and clear the bitmap bits accordingly.
And this patch does it.
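
The set/clear logic can be pictured with the following Python sketch. The
function name, the integer bitmaps and the datapath index are hypothetical and
only model the idea; the real code operates on C bitmaps.

```python
def update_lb_group_bitmaps(group_bitmaps, ls_bit, old_groups, new_groups):
    """Update per-group switch bitmaps when a switch's LB groups change."""
    mask = 1 << ls_bit
    # Groups no longer associated with the switch: clear its bit.
    for group in old_groups - new_groups:
        group_bitmaps[group] = group_bitmaps.get(group, 0) & ~mask
    # Newly associated groups: set its bit.
    for group in new_groups - old_groups:
        group_bitmaps[group] = group_bitmaps.get(group, 0) | mask
    return group_bitmaps


# Switch 's1' is assumed to have bitmap index 1; it moves from group A to B.
group_bitmaps = {"A": 0b010, "B": 0b100}
update_lb_group_bitmaps(group_bitmaps, 1, {"A"}, {"B"})
```

Only the bits of the groups that actually changed are touched, which is what
makes the update incremental.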

Reviewed-by: Ales Musil <amusil@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
A new engine node 'sync_to_sb_pb' is added within 'sync_to_sb'
node to sync NAT column of Port bindings table.  This separation
is required in order to add load balancer group I-P handling
in 'northd' engine node (which is handled in the next commit).

'sync_to_sb_pb' engine node can be later expanded to sync other
Port binding columns if required.

Reviewed-by: Ales Musil <amusil@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
When a logical router gets updated due to load balancer or load balancer
groups changes, it is now incrementally handled first in 'lb_data'
engine node similar to how logical switch changes are handled.  The
tracking data of 'lb_data' is updated similarly so that northd engine
handler - northd_handle_lb_data_changes() handles it.

A new handler northd_handle_lr_changes() is added in the 'northd' engine
node for logical router changes.  This handler returns true if only
load balancer or load balancer group columns are changed.  It returns
false for any other changes.

northd_handle_lb_data_changes() also sets the logical router
od's lb_ips accordingly.

Below are the scale testing results done with these patches applied
using ovn-heater.  The test ran the scenario  -
ocp-500-density-heavy.yml [1].

With these patches applied (with load balancer I-P handling in northd
engine node) the results are:

-------------------------------------------------------------------------------------------------------------------------------------------------------
			Min (s)		Median (s)	90%ile (s)	99%ile (s)	Max (s)		Mean (s)	Total (s)	Count	Failed
-------------------------------------------------------------------------------------------------------------------------------------------------------
Iteration Total		0.138730	2.168997	3.224783	3.320061	3.326713	1.616405	202.050672	125	0
Namespace.add_ports	0.005276	0.005608	0.006604	0.009053	0.018615	0.005901	0.737612	125	0
WorkerNode.bind_port	0.034812	0.045776	0.053103	0.057902	0.060541	0.045659	11.414781	250	0
WorkerNode.ping_port	0.005281	0.006927	2.071924	3.186326	3.197238	0.743860	185.964955	250	0
-------------------------------------------------------------------------------------------------------------------------------------------------------

The results with the present main are:

-------------------------------------------------------------------------------------------------------------------------------------------------------
                        Min (s)	        Median (s)	90%ile (s)	99%ile (s)	Max (s)	        Mean (s)	Total (s)	Count	Failed
-------------------------------------------------------------------------------------------------------------------------------------------------------
Iteration Total	        3.233795	4.364926	5.400982	6.412803	7.409757	4.792270	599.033790	125	0
Namespace.add_ports	0.005230	0.006564	0.007379	0.019060	0.037490	0.007223	0.902930	125	0
WorkerNode.bind_port	0.033864	0.044052	0.049608	0.054849	0.056196	0.044005	11.001231	250	0
WorkerNode.ping_port	0.005334	2.060477	5.222422	6.267332	7.284001	2.323020	580.754964	250	0
-------------------------------------------------------------------------------------------------------------------------------------------------------

A few observations:

 - The total time taken has come down significantly from 599 seconds to 202.
 - 99%ile with these patches is 3.32 seconds compared to 6.41 seconds for
   main.
 - 90%ile with these patches is 3.22 seconds compared to 5.40 seconds for
   main.
 - CPU utilization of northd during the test with these patches
   is between 100% and 300%, which is almost the same as main.
   The main difference is that, with these patches, the test duration is
   shorter and hence the overall CPU utilization is lower.
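
For reference, the ratios behind these observations can be recomputed from the
"Iteration Total" rows of the two tables. The short Python snippet below only
does that arithmetic; the variable names are ours.

```python
# Figures copied from the "Iteration Total" rows of the two tables.
patched = {"total": 202.050672, "p90": 3.224783, "p99": 3.320061}
main = {"total": 599.033790, "p90": 5.400982, "p99": 6.412803}

# How many times larger the value on main is compared to the patched run.
ratios = {key: main[key] / patched[key] for key in patched}

for key, ratio in ratios.items():
    print(f"{key}: main is {ratio:.2f}x the patched value")
```

The total time on main is a little under three times that of the patched run.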

[1] - https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml

Reviewed-by: Ales Musil <amusil@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
The release message indicates that the address is
no longer in use. Simply reply with status code success
without any special handling as we do not store the
information about address being in use.

Reported-at: https://bugzilla.redhat.com/2237855
Signed-off-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
If the flow size is bigger than UINT16_MAX it doesn't
fit into an OpenFlow message. Programming such a flow will
fail, which results in a disconnect of ofctrl. After reconnecting
we program everything from scratch; if the long flow still
remains, the cycle continues. This makes the node almost
unusable as there will be massive traffic disruptions.

To prevent that, check if the flow is within the allowed size.
If not, log the flow that would trigger this problem and do not program
it. This isn't a self-healing process, but the chances of this happening
are very slim. Also, any flow that is bigger than the allowed size is an OVN
bug, and it should be fixed.
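
A minimal Python sketch of such a guard is shown below. It is a model only:
the real check lives in the C ofctrl code, and the function name
can_program_flow and the byte-string stand-in for an encoded flow are
assumptions made for illustration.

```python
import logging

UINT16_MAX = 0xFFFF  # largest length an OpenFlow message header can carry

log = logging.getLogger("ofctrl-sketch")


def can_program_flow(encoded_flow: bytes) -> bool:
    """Return True if the encoded flow fits into one OpenFlow message."""
    if len(encoded_flow) > UINT16_MAX:
        # Log and skip instead of sending a message that would be rejected
        # and trigger the disconnect/reprogram cycle.
        log.error("flow too large (%d bytes), not programming it",
                  len(encoded_flow))
        return False
    return True


fits = can_program_flow(b"\x00" * UINT16_MAX)
too_big = can_program_flow(b"\x00" * (UINT16_MAX + 1))
```

Skipping the oversized flow keeps the connection alive for all other flows.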

Reported-at: https://bugzilla.redhat.com/1955167
Signed-off-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
During ofctrl_add_or_append_flow we are able to combine
two flows with same match but different conjunctions.
However, the function didn't check if the conjunctions already
exist in the installed flow, which could result in conjunction
duplication and the flow would grow infinitely e.g.
actions=conjunction(1,1/2), conjunction(1,1/2)

Make sure that we add only conjunctions that are not present
in the already existing flow.
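
The deduplication idea can be sketched in Python as follows. Conjunctions are
modelled as (id, clause, n_clauses) tuples and the function name is
hypothetical; this is not the ofctrl implementation.

```python
def append_conjunctions(existing, new):
    """Append only the conjunctions that the flow does not contain yet."""
    merged = list(existing)
    seen = set(existing)
    for conj in new:
        if conj not in seen:
            merged.append(conj)
            seen.add(conj)
    return merged


# conjunction(1,1/2) is modelled as (1, 1, 2).
installed = [(1, 1, 2)]
# Re-adding the same conjunction must not grow the action list.
after_duplicate = append_conjunctions(installed, [(1, 1, 2)])
# A different conjunction is still appended.
after_new = append_conjunctions(installed, [(1, 1, 2), (2, 1, 2)])
```

With this check, repeatedly appending the same flow leaves the action list
unchanged instead of growing it on every call.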

Reported-at: https://bugzilla.redhat.com/2175928
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
The flows to allow DHCP responses from ovn-controller were missing
if a logical VIF port had DHCPv4/v6 options set and was handled
incrementally.

Fixes: 8bbd678 ("northd: Incremental processing of VIF additions in 'lflow' node.")
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
Check if parent_name is properly set in the build_gateway_get_l2_hdr_size
routine if tag_request is set to 0, since parent_name is mandatory for
a dynamically allocated VLAN ID.

Reported-at: https://issues.redhat.com/browse/FDP-38
Fixes: b68753a ("northd: dynamically compute l2 hdr len for check_pkt_larger action")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
_DUMP_DB_TABLES macro should instead dump "port_group" table.

Fixes: 5b6a7ad ("northd: Add incremental processing for NB port groups.")
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
Previously, if multiple routers in the same az were connected to the same
transit switch then ovn-ic would only propagate the routes of one of
these routers to the ic-sb.
This commit fixes this behaviour and allows multiple routers in a single
az to use route advertisements.

Co-authored-by: Maxim Korezkij <maxim.korezkij@mail.schwarz>
Signed-off-by: Maxim Korezkij <maxim.korezkij@mail.schwarz>
Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz>
Acked-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
When connecting multiple logical routers to a transit switch per az,
the routers in the same az previously would not learn each other's
routes, while the routers in the other azs would learn all of them.

This is confusing and would require each user to have additional
logic that configures static routing within each az.

Acked-by: Mark Michelson <mmichels@redhat.com>
Acked-by: Ales Musil <amusil@redhat.com>
Co-Authored-By: Maxim Korezkij <maxim.korezkij@mail.schwarz>
Signed-off-by: Maxim Korezkij <maxim.korezkij@mail.schwarz>
Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
QoS was not configured in OVS db when db was read only: the configuration
was just ignored and not done later when OVS db became writable.
It was sometimes set later, if/when a recompute happened.
This is now fixed: when OVS db is read only, the ports on which qos
must be applied are stored and qos will be applied when OVS db becomes writable.
To avoid race conditions between delayed qos and new qos changes (e.g. a qos
configuration delayed in one loop as ovs is ro, followed in next loop, when ovs
becomes rw, by another qos on the same port), all qos changes are done at the
same time.
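
The delayed application can be modelled with a small Python sketch. The class
and attribute names are hypothetical and the OVS database is reduced to a
dict; the point is only that pending and new changes are merged and applied
together once the database is writable.

```python
class DelayedQos:
    """Toy model: remember QoS changes while the DB is read-only."""

    def __init__(self):
        self.pending = {}   # port -> QoS config waiting to be written
        self.applied = {}   # port -> QoS config "written" to the DB

    def run(self, changes, db_writable):
        # Merge first, so a newer change for a port replaces a delayed one.
        self.pending.update(changes)
        if not db_writable:
            return False
        # Apply delayed and new changes in one go to avoid races.
        self.applied.update(self.pending)
        self.pending.clear()
        return True


qos = DelayedQos()
qos.run({"p1": {"rate": 100}}, db_writable=False)   # loop 1: DB is read-only
qos.run({"p1": {"rate": 200}}, db_writable=True)    # loop 2: DB is writable
```

After the second loop only the latest configuration for p1 has been applied
and nothing is left pending.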

This issue was identified by some random failures in system test
"egress qos".

Reported-at: https://bugzilla.redhat.com/2234349
Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
Acked-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
To avoid any warning spam in the northd.log remove
the "hosting-chassis" status only if it was previously
specified.

Fixes: 19164b0 ("Expose distributed gateway port information in NB DB")
Reported-at: https://issues.redhat.com/browse/FDP-54
Signed-off-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
The daemon life cycle spans over the whole test case life time, which
significantly speeds up test cases that rely on fmt_pkt for packet byte
representations.

The speed-up comes from the fact that we no longer start a python
environment with all scapy modules imported on any fmt_pkt invocation;
but only on the first call to fmt_pkt.

The daemon is not started for test cases that don't trigger fmt_pkt.
(The server is started lazily as part of fmt_pkt call.)
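
The lazy start can be pictured with the Python sketch below. It is an
in-process model with hypothetical names (LazyServer, start_fake_scapy), not
the test-suite helper: the expensive start-up happens on the first request
only.

```python
class LazyServer:
    """Starts the expensive backend on first use only."""

    def __init__(self, start_backend):
        self._start_backend = start_backend
        self._backend = None
        self.starts = 0

    def request(self, payload):
        if self._backend is None:
            # First call: pay the start-up cost once.
            self._backend = self._start_backend()
            self.starts += 1
        return self._backend(payload)


def start_fake_scapy():
    # Placeholder for the costly step of importing all scapy modules.
    return lambda expr: f"bytes-for:{expr}"


server = LazyServer(start_fake_scapy)
first = server.request("Ether()/IP()")
second = server.request("Ether()/IPv6()")
```

A test case that never calls request() never starts the backend at all.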

For example, without the daemon, all tests that use fmt_pkt took the
following on my vagrant box:

real    17m23.092s
user    26m27.935s
sys     5m25.486s

With the daemon, the same set of tests run in:

real    2m16.741s
user    2m40.155s
sys     0m47.514s

We may want to make the daemon global, so that it's invoked once per
test suite run. But I haven't figured out, yet, how to run a trap to
clean up the daemon and its socket and pid files on suite exit (and not
on test case exit.)

Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>

unixctl impl
Acked-by: Ales Musil <amusil@redhat.com>

Signed-off-by: Mark Michelson <mmichels@redhat.com>
ovn-org/ovn-kubernetes#3830 introduced a new
binary that is required to run a kind cluster.

Eventually it would be good to modify the ovn-kubernetes Dockerfile in
OVN to rely on the one from ovn-kubernetes repo.

Submitted-at: #219
Signed-off-by: Patryk Diak <pdiak@redhat.com>
Acked-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Minor (non-functional) fixes to commit 4023d6a.
1. Fix typo of function name: collect_lb_groups_for_ha_chassis_groups
2. Update the comments of the collect_lb_groups_for_ha_chassis_groups
   function to avoid confusion.
3. Rename tmp_ha_chassis to tmp_ha_ref_chassis, because HA chassis is
   more like the chassis belonging to a HA chassis group.

Fixes: 4023d6a ("northd: Fix recompute of referenced chassis in HA chassis groups.")
Signed-off-by: Han Zhou <hzhou@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
Every LR datapath references to some LR group, but not every LR group
has HA chassis groups. This patch avoids collecting LR groups and
reference chassis for the LR groups without HA chassis groups.

In addition, this patch also refactors the function
build_ha_chassis_group_ref_chassis to avoid the unnecessary SB
ha_chassis_group lookup by name, because the only field used is the
name.

Signed-off-by: Han Zhou <hzhou@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
This fixes make clean removing the daemon file because of the mechanics
of pycov make target.

Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Remove some debugging code in unit-tests

Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
Use conditional monitor for the FDB for local datapaths.

Fixes: 6ec3b12 ("MAC learning: Add a new FDB table in southbound db.")
Signed-off-by: Han Zhou <hzhou@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>
When client sends RS we reply with RA to the source address,
however the client is allowed to send RS with unspecified
source address ("::"). In that case we would send the RA to
that empty address which is not correct.

According to RFC 2461 [0] the server is allowed to reply to the
source address directly only if the address isn't unspecified:

   A router MAY choose to unicast the
   response directly to the soliciting host's address (if the
   solicitation's source address is not the unspecified address)

Make sure we replace the source with the all-nodes address when it
is unspecified.
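
The address selection can be sketched in Python with the standard ipaddress
module. The function name is ours, and ff02::1 is the link-local all-nodes
multicast address; the actual fix is implemented in OVN's C code.

```python
import ipaddress

# Link-local all-nodes multicast address.
ALL_NODES = ipaddress.IPv6Address("ff02::1")


def ra_destination(rs_source: str) -> ipaddress.IPv6Address:
    """Pick the RA destination for a Router Solicitation source address."""
    src = ipaddress.IPv6Address(rs_source)
    # "::" cannot be replied to directly, so fall back to all-nodes.
    return ALL_NODES if src.is_unspecified else src


to_unspecified = ra_destination("::")
to_host = ra_destination("fe80::1")
```

A solicitation from "::" is answered on the all-nodes address, while any other
source still gets a unicast reply.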

[0] https://www.ietf.org/rfc/rfc2461.txt
Reported-at: https://issues.redhat.com/browse/FDP-43
Signed-off-by: Ales Musil <amusil@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
Multiple ovn-ic tests change routes in az1 and check whether the routes are properly
updated in az2. Some of those tests add some ovn-nbctl --wait=sb sync in az2.
However, if ovn-ic processes are slow for whatever reasons, such tests might
have failed.

Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
The test was sometimes failing if an arp packet was received
after installing priority=0 flow, and before installing
the flow matching arp.

Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
The test was checking if a lport was properly released, but in
some race conditions the logging message is slightly different.

Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
- dhcpv4 : 1 HV, 2 LS
- Remote port binding
- MLD snoop/querier/relay
- basic connectivity with multiple requested-chassis

Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
The test was setting ACL rules, then sending packets, then changing the ACL rules, then sending packets.
Then it checked whether those packets were properly received/dropped at the end.
It should check whether those packets are properly received/dropped before updating the ACL rules
for the second test phase, as otherwise there is no guarantee that the packets are fully handled when
we update the ACL rules.

Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
trim_zeros was redefined multiple times in ovn.at, and was
undefined for one test.

Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
putnopvut and others added 29 commits November 29, 2023 12:54
I have no idea why, but this is the only patch in this series where the
execution time is *improved* by switching to fmt_pkt.

Execution time: 8.321s
Execution time on "main" branch: 8.576s

Signed-off-by: Mark Michelson <mmichels@redhat.com>
This test is slower than it used to be since some of the test_ip calls
that were backgrounded (with '&') cannot do this anymore. This is
because there is a race condition with starting the scapy server when
the calls are backgrounded.

Execution time: 36.471s
Execution time on "main" branch: 13.913s

Signed-off-by: Mark Michelson <mmichels@redhat.com>
This test is much slower with fmt_pkt compared to main. It's likely due
to the high number of fmt_pkt calls made during the test.

Execution time: 17.484s
Execution time on "main" branch: 4.975s

Signed-off-by: Mark Michelson <mmichels@redhat.com>
Execution time: 3.413s
Execution time on "main" branch: 2.675s

Signed-off-by: Mark Michelson <mmichels@redhat.com>
Execution time: 3.329s
Execution time on "main" branch: 2.794s

Signed-off-by: Mark Michelson <mmichels@redhat.com>
... 2 peer LRs, static routes.

Execution time: 4.114s
Execution time on "main" branch: 3.428s

Signed-off-by: Mark Michelson <mmichels@redhat.com>
Execution time: 4.308s
Execution time on "main" branch: 3.170s

Signed-off-by: Mark Michelson <mmichels@redhat.com>
Execution time: 4.100s
Execution time on "main" branch: 3.083s

Signed-off-by: Mark Michelson <mmichels@redhat.com>
Execution time: 3.791s
Execution time on "main" branch: 2.897s

Signed-off-by: Mark Michelson <mmichels@redhat.com>
The group_table and meter_table are initialized in ovn-controller with n_ids = 0.
They are then re-initialized in ofctrl, with the proper number of n_ids, in state S_CLEAR_FLOWS.
However, nothing prevented flows that use groups from being added (by adding logical ports)
before ofctrl reached this state. This was causing some wrong flows (missing group in the flow).

With this patch, as soon as the feature rconn is available, i.e. before adding flows, those tables
are properly re-initialized.

This issue is usually not visible in CI for main and recent branches since
"Wait for new ovn-controllers to connect to Southbound." was introduced, as that delayed the moment when
the test started to add ports.
The issue was causing the following test to fail:
"ECMP static routes -- ovn-northd -- parallelization=yes -- ovn_monitor_all=yes".

Fixes: 1d6d953 ("controller: Don't artificially limit group and meter IDs to 16bit.")
Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
While either check works, it's better to use a more explicit check.

Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
The daemon will now log to scapy.log file.

Log messages include stats on request processing time as well as any
errors that may happen during processing.

If you'd like to see even more logs (e.g. for debugging purposes), just
pass --verbose to scapy-server.

Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
When running fmt_pkt in parallel, before the patch, there could be
several attempts to start scapy-server. Using flock on a test case
specific file should guarantee that only one of the subshells will
actually be able to start a daemon, which is designed to be a singleton.

Now that start_scapy_server doesn't necessarily start a server (only the
first lucky subshell will do), wait for server socket file before
proceeding with ovs-appctl call.
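
The locking pattern can be sketched in Python with fcntl.flock. This is a
non-blocking variant written for illustration, not the shell helper from the
test suite; the function names and lock file path are assumptions, and the
code is POSIX-only.

```python
import fcntl
import os
import time


def try_acquire_start_lock(lock_path):
    """Return an fd holding the lock, or None if another caller holds it."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else is (or was first at) starting the singleton daemon.
        os.close(fd)
        return None
    return fd


def wait_for_path(path, timeout=5.0, interval=0.05):
    """Poll until path exists (e.g. the server socket) or timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(interval)
    return os.path.exists(path)
```

Only the caller that gets the lock starts the daemon; every caller, winner or
not, then waits for the socket file before sending requests.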

Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
Introduce specific flows for E/W ICMPv{4,6} need frag packets
generated by the tunnel module to be delivered back to the
source port, if packets to be tunnelled do not fit path MTU.
This patch enables PMTUD for East/West Geneve traffic.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2241711
Tested-by: Jaime Ruiz <jcaamano@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
We're sending TCP RST not RST-ACK as the comment suggests.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2023-November/409288.html
Reported-by: Mark Michelson <mmichels@redhat.com>
Fixes: a35725a ("pinctrl: send RST instead of RST_ACK bit for lb hc")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Ales Musil <amusil@redhat.com>
Add missing wait condition and at the same time
move the wait to the last ovn-nbctl command
of the chain, as there is no need to wait
if there are more commands afterward.

Signed-off-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
When multiple OVSDB updates have been received since the last northd run,
it's possible that the IDL tracks changes for database entities that
were added _and also_ removed since the last time the northd processing
engine ran.  In some cases, those may appear as being simultaneously
"new" and "deleted".

Currently, the only tables for which this can be a problem are
NB.Load_Balancer and NB.Load_Balancer_Group.

Skip these "transient records" to avoid adding soon to be deleted rows
to the tracked_lb_data->crupdated_lbs records.  These are used to build
'northd' I-P engine state in northd_handle_lb_data_changes().
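
A toy Python version of the filtering is shown below. Tracked rows are
modelled as dicts with "new" and "deleted" flags and the function name is
hypothetical; the real code queries the IDL for these properties.

```python
def split_tracked_rows(rows):
    """Split tracked rows into created/updated and deleted names."""
    crupdated, deleted = [], []
    for row in rows:
        if row["new"] and row["deleted"]:
            # Transient record: added and removed between two engine runs.
            continue
        if row["deleted"]:
            deleted.append(row["name"])
        else:
            crupdated.append(row["name"])
    return crupdated, deleted


tracked_rows = [
    {"name": "lb1", "new": True, "deleted": False},
    {"name": "lb2", "new": True, "deleted": True},    # transient, skipped
    {"name": "lb3", "new": False, "deleted": True},
]
crupdated_lbs, deleted_lbs = split_tracked_rows(tracked_rows)
```

The transient lb2 appears in neither list, so no state is ever built for it.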

We also avoid crashing if "unexpected" deletes are reported by the IDL.
This is likely due to a bug in the IDL [0] but it's easy to avoid on the
northd side.

This commit also adds a test case which _might_ detect the issue when
run under valgrind.  The test case can't always detect the problem
because a prerequisite for a Load Balancer to be "transient" is that the
IDL processes the update that removes it from the NB Load_Balancer table
and from the Load_Balancer_Group row that was referring to it in the
following order: Load_Balancer_Group table update first and then
Load_Balancer deletion.  The order is controlled by the way
'struct shash' hashes records (table names in this case) and that's
arch and/or compiler dependent.

[0] https://issues.redhat.com/browse/FDP-193

Fixes: a24eed5 ("northd: Add initial I-P for load balancer and load balancer groups")
Reported-at: https://issues.redhat.com/browse/FDP-181
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Numan Siddique <numans@ovn.org>
The userspace lacks the supplementary protocol state machine for SCTP,
resulting in the absence of 'protoinfo' fields. Nevertheless, this SCTP
test doesn't need this feature, making the check for it unnecessary.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
This patch moves the SCTP test from the kernel only, to the general OVN
system tests, enabling its execution in both the system-userspace and
system-dpdk test scenarios.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
The goal is to test northd performance so there's no point to start
hypervisors.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Acked-by: Numan Siddique <numans@ovn.org>
Trigger a full recompute after each DB build run and record the
results.

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Acked-by: Numan Siddique <numans@ovn.org>
Make sure the capture is up before continuing the test.
A similar fix was committed earlier via 6c4ffe5
("system-test: Use OVS_WAIT_UNTIL for tcpdump start instead
of sleep") but other tests were added in the meantime.

Suggested-by: Xavier Simonart <xsimonar@redhat.com>
Fixes: 086744a ("northd: Use LB port_str in northd.")
Reported-at: https://issues.redhat.com/browse/FDP-192
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
In case there's a test failure while daemons are stopped ensure that we
send a SIGCONT on exit so that they properly clean up.

Fixes: 30952c2 ("binding: fixed ovn-installed not properly removed (recomputes)")
Fixes: 0794a6e ("qos: fix potential double deletion of ovs idl row")
Fixes: feb9184 ("northd: Skip transient IDL records.")
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Xavier Simonart <xsimonar@redhat.com>
When a logical switch port was deleted and added back quickly, it could
happen that the lsp was never reported up.

Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
Acked-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
If a router has a SNAT rule whose external IP isn't an LRP address,
an ARP request from another router for this external IP is
dropped: the external IP uses the same MAC address as the LRP, so
the request cannot be forwarded to MC_FLOOD.

Fixes: 32f5ebb ("ovn-northd: Limit ARP/ND broadcast domain whenever possible.")
Reported-at: #209

Signed-off-by: Daniel Ding <danieldin186@gmail.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
This reverts commit 450e41e.

If a packet has to be tunnelled to another node and if the physical
interface used for tunnelling has lower MTU than the packet or
if there is a route exception with a lower MTU, then the geneve
kernel module generates an ICMP need frag packet.  This packet
was getting dropped since the metadata had to be swapped.
The commit did exactly that and fixed the issue.
But it has 2 issues -
  1. It introduced a regression for the scenario when an ICMP need frag
     packet generated outside of OVN has to be tunnelled and delivered
     to the destination VM/pod.  These ICMP need frag packets are now
     dropped.
  2. If the logical switches have ACLs or load balancers configured then
     these icmp need frag packets are dropped as they are not sent to
     the correct zone.

It's better to revert until we find a proper solution for the original
issue.

Reported-at: https://issues.redhat.com/browse/FDP-216

Fixes: 450e41e ("ovn: add geneve PMTUD support")
Signed-off-by: Numan Siddique <numans@ovn.org>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
@ts170710 ts170710 merged commit 27d9424 into ts170710:main Dec 27, 2023