merge #3

ts170710 · 2023-12-27T08:10:32Z

No description provided.

'lb_data' engine node now also handles logical switch changes. Its data maintains ls to lb related information, i.e., if a logical switch sw0 has lb1, lb2 and lb3 associated then it stores this info in its data. And when a new load balancer lb4 is associated to it, it stores this information in its tracked data so that 'northd' engine node can handle it accordingly. Tracked data will have information like:
changed ls -> {sw0: {associated_lbs: [lb4]}}
The first handler 'northd_lb_data_handler_pre_od' is called before the 'northd_nb_logical_switch_handler' handler and it just creates or deletes the lb_datapaths hmap for the tracked lbs. The northd handler 'northd_lb_data_handler' updates the ovn_lb_datapaths's 'nb_ls_map' bitmap accordingly. E.g., if the lb_data has the below tracked data:
tracked_data = {'crupdated_lbs': [lb1, lb2], 'deleted_lbs': [lb3], 'crupdated_lb_groups': [lbg1, lbg2], 'crupdated_ls_lbs': [{ls: sw0, assoc_lbs: [lb1]}, {ls: sw1, assoc_lbs: [lb1, lb2]}]}
the handler northd_lb_data_handler() creates the ovn_lb_datapaths object for lb1 and lb2 and deletes lb3 from the ovn_lb_datapaths hmap. It does the same for the created or updated lb groups lbg1 and lbg2 in the ovn_lbgrp_datapaths map. It also updates the nb_ls_map of lb1 for sw0 and sw1 and the nb_ls_map of lb2 for sw1.
Reviewed-by: Ales Musil <amusil@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org>
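A minimal Python sketch of how the tracked data in this example could be applied. It is not OVN's C implementation: 'nb_ls_map' is modelled as a set of switch names rather than a bitmap, lb groups are left out, and the function name `apply_tracked_lb_data` and the dict layout are hypothetical.

```python
def apply_tracked_lb_data(lb_datapaths, tracked):
    """Apply one batch of tracked lb_data changes to a name -> state map."""
    # Create (or keep) an entry for every created/updated load balancer.
    for lb in tracked.get("crupdated_lbs", []):
        lb_datapaths.setdefault(lb, {"nb_ls_map": set()})
    # Drop the entries of deleted load balancers.
    for lb in tracked.get("deleted_lbs", []):
        lb_datapaths.pop(lb, None)
    # Record each associated switch in the load balancer's map.
    for entry in tracked.get("crupdated_ls_lbs", []):
        for lb in entry["assoc_lbs"]:
            state = lb_datapaths.setdefault(lb, {"nb_ls_map": set()})
            state["nb_ls_map"].add(entry["ls"])
    return lb_datapaths


tracked_data = {
    "crupdated_lbs": ["lb1", "lb2"],
    "deleted_lbs": ["lb3"],
    "crupdated_ls_lbs": [
        {"ls": "sw0", "assoc_lbs": ["lb1"]},
        {"ls": "sw1", "assoc_lbs": ["lb1", "lb2"]},
    ],
}
# lb3 exists before the update and is removed by it.
state = apply_tracked_lb_data({"lb3": {"nb_ls_map": {"sw9"}}}, tracked_data)
```

After the call, lb1 maps to sw0 and sw1, lb2 maps to sw1, and lb3 is gone, which mirrors the outcome the commit message describes.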

For a given load balancer group 'A', northd engine data maintains a bitmap of datapaths associated to this lb group. So when lb group 'A' gets associated to a logical switch 's1', the bitmap index of 's1' is set in its bitmap. In order to handle the load balancer group changes incrementally for a logical switch, we need to set and clear the bitmap bits accordingly. And this patch does it. Reviewed-by: Ales Musil <amusil@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org>
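The set and clear operations can be sketched as follows, assuming a Python integer stands in for the C bitmap and that each logical switch has a stable index. The class and method names are hypothetical.

```python
class LbGroupDatapaths:
    """Tracks which logical switches use a load balancer group."""

    def __init__(self):
        # Bit i is set when the switch with index i uses the group.
        self.nb_ls_map = 0

    def associate(self, ls_index):
        self.nb_ls_map |= 1 << ls_index

    def disassociate(self, ls_index):
        self.nb_ls_map &= ~(1 << ls_index)

    def is_associated(self, ls_index):
        return bool(self.nb_ls_map & (1 << ls_index))


group_a = LbGroupDatapaths()
group_a.associate(1)       # switch 's1', assumed to have index 1
group_a.associate(4)       # another switch, assumed index 4
group_a.disassociate(4)    # the group is removed from that switch again
```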

A new engine node 'sync_to_sb_pb' is added within 'sync_to_sb' node to sync NAT column of Port bindings table. This separation is required in order to add load balancer group I-P handling in 'northd' engine node (which is handled in the next commit). 'sync_to_sb_pb' engine node can be later expanded to sync other Port binding columns if required. Reviewed-by: Ales Musil <amusil@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org>

When a logical router gets updated due to load balancer or load balancer groups changes, it is now incrementally handled first in 'lb_data' engine node similar to how logical switch changes are handled. The tracking data of 'lb_data' is updated similarly so that northd engine handler - northd_handle_lb_data_changes() handles it. A new handler northd_handle_lr_changes() is added in the 'northd' engine node for logical router changes. This handler returns true if only load balancer or load balancer group columns are changed. It returns false for any other changes. northd_handle_lb_data_changes() also sets the logical router od's lb_ips accordingly.
Below are the scale testing results done with these patches applied using ovn-heater. The test ran the scenario - ocp-500-density-heavy.yml [1].
With these patches applied (with load balancer I-P handling in northd engine node) the results are:
-----------------------------------------------------------------------------------------------------------------------
                      Min (s)     Median (s)  90%ile (s)  99%ile (s)  Max (s)     Mean (s)    Total (s)   Count  Failed
-----------------------------------------------------------------------------------------------------------------------
Iteration Total       0.138730    2.168997    3.224783    3.320061    3.326713    1.616405    202.050672  125    0
Namespace.add_ports   0.005276    0.005608    0.006604    0.009053    0.018615    0.005901    0.737612    125    0
WorkerNode.bind_port  0.034812    0.045776    0.053103    0.057902    0.060541    0.045659    11.414781   250    0
WorkerNode.ping_port  0.005281    0.006927    2.071924    3.186326    3.197238    0.743860    185.964955  250    0
-----------------------------------------------------------------------------------------------------------------------
The results with the present main are:
-----------------------------------------------------------------------------------------------------------------------
                      Min (s)     Median (s)  90%ile (s)  99%ile (s)  Max (s)     Mean (s)    Total (s)   Count  Failed
-----------------------------------------------------------------------------------------------------------------------
Iteration Total       3.233795    4.364926    5.400982    6.412803    7.409757    4.792270    599.033790  125    0
Namespace.add_ports   0.005230    0.006564    0.007379    0.019060    0.037490    0.007223    0.902930    125    0
WorkerNode.bind_port  0.033864    0.044052    0.049608    0.054849    0.056196    0.044005    11.001231   250    0
WorkerNode.ping_port  0.005334    2.060477    5.222422    6.267332    7.284001    2.323020    580.754964  250    0
-----------------------------------------------------------------------------------------------------------------------
Few observations:
- The total time taken has come down significantly from 599 seconds to 202.
- 99%ile with these patches is 3.32 seconds compared to 6.4 seconds for the main.
- 90%ile with these patches is 3.2 seconds compared to 5.4 seconds for the main.
- CPU utilization of northd during the test with these patches is between 100% and 300%, which is almost the same as main. Main difference being that, with these patches the test duration is less and hence overall less CPU utilization.
[1] - https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml
Reviewed-by: Ales Musil <amusil@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org>
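As a quick check on the first observation, the ratio of the totals can be computed directly. This is only arithmetic on the "Total (s)" figures reported above; the helper name `speedup` is made up for illustration.

```python
# Totals in seconds, copied from the "Total (s)" column of the results above.
totals = {
    "Iteration Total": {"main": 599.033790, "patched": 202.050672},
    "WorkerNode.ping_port": {"main": 580.754964, "patched": 185.964955},
}


def speedup(main_s, patched_s):
    """Return how many times shorter the patched total is than main's."""
    return main_s / patched_s


for name, t in totals.items():
    print(f"{name}: {speedup(t['main'], t['patched']):.2f}x")
```

Both ratios come out at roughly three, consistent with the drop from 599 to 202 seconds.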

The release message indicates that the address is no longer in use. Simply reply with status code success without any special handling as we do not store the information about address being in use. Reported-at: https://bugzilla.redhat.com/2237855 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Numan Siddique <numans@ovn.org>

If the flow size is bigger than UINT16_MAX it doesn't fit into an OpenFlow message. Programming of such a flow will fail, which results in a disconnect of ofctrl. After reconnect we program everything from scratch; if the long flow still remains, the cycle continues. This causes the node to be almost unusable as there will be massive traffic disruptions. To prevent that, check if the flow is within the allowed size. If not, log the flow that would trigger this problem and do not program it. This isn't a self-healing process, but the chances of this happening are very slim. Also, any flow that is bigger than the allowed size is an OVN bug, and it should be fixed. Reported-at: https://bugzilla.redhat.com/1955167 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com>
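The guard can be illustrated with a short Python sketch. It assumes each flow is available as a description plus its encoded bytes; that shape, the logger name and `filter_installable` are hypothetical, and the real check lives in ovn-controller's C code.

```python
import logging

UINT16_MAX = 0xFFFF  # the size limit named in the commit message

log = logging.getLogger("ofctrl-sketch")


def filter_installable(flows):
    """Return only the flows whose encoded size is within UINT16_MAX.

    `flows` is a list of (description, encoded_bytes) pairs.
    """
    installable = []
    for description, encoded in flows:
        if len(encoded) > UINT16_MAX:
            # Log and skip rather than send a message that would fail
            # and trigger the disconnect/reprogram cycle.
            log.error("flow too large (%d bytes), not programming: %s",
                      len(encoded), description)
            continue
        installable.append((description, encoded))
    return installable


flows = [("small", b"\x00" * 128), ("huge", b"\x00" * (UINT16_MAX + 1))]
kept = filter_installable(flows)
```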

During ofctrl_add_or_append_flow we are able to combine two flows with same match but different conjunctions. However, the function didn't check if the conjunctions already exist in the installed flow, which could result in conjunction duplication and the flow would grow infinitely e.g. actions=conjunction(1,1/2), conjunction(1,1/2) Make sure that we add only conjunctions that are not present in the already existing flow. Reported-at: https://bugzilla.redhat.com/2175928 Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com>
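A sketch of the deduplication idea, assuming each conjunction action is represented as an (id, clause, n_clauses) tuple so that conjunction(1,1/2) becomes (1, 1, 2). The function name is hypothetical and the real code operates on encoded OpenFlow actions.

```python
def append_conjunctions(existing, new):
    """Append conjunctions from `new` that `existing` lacks, keeping order."""
    seen = set(existing)
    for conj in new:
        if conj not in seen:
            existing.append(conj)
            seen.add(conj)
    return existing


installed = [(1, 1, 2)]                      # conjunction(1,1/2)
append_conjunctions(installed, [(1, 1, 2),   # already present, skipped
                                (2, 1, 2)])  # new, appended
```

Because duplicates are skipped, appending the same flow repeatedly leaves the action list unchanged instead of growing it.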

The flows to allow DHCP response from ovn-controller were missing if a logical VIF port had dhcp v4/v6 options set and were handled incrementally. Fixes: 8bbd678 ("northd: Incremental processing of VIF additions in 'lflow' node.") Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Numan Siddique <numans@ovn.org>

Check if parent_name is properly set in build_gateway_get_l2_hdr_size routine if tag_request is set to 0, since parent_name is mandatory for dynamically allocated VLANID. Reported-at: https://issues.redhat.com/browse/FDP-38 Fixes: b68753a ("northd: dynamically compute l2 hdr len for check_pkt_larger action") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Numan Siddique <numans@ovn.org>

_DUMP_DB_TABLES macro should instead dump "port_group" table. Fixes: 5b6a7ad ("northd: Add incremental processing for NB port groups.") Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Numan Siddique <numans@ovn.org>

Previously, if multiple routers in the same az were connected to the same transit switch, ovn-ic would only propagate the routes of one of these routers to the ic-sb. This commit fixes this behaviour and allows multiple routers in a single az to use route advertisements. Co-authored-by: Maxim Korezkij <maxim.korezkij@mail.schwarz> Signed-off-by: Maxim Korezkij <maxim.korezkij@mail.schwarz> Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz> Acked-by: Ales Musil <amusil@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com>

Previously, when multiple logical routers per az were connected to a transit switch, the routers in the same az would not learn each other's routes, while the routers in the other azs would learn all of them. This is confusing and would require each user to have additional logic that configures static routing within each az. Acked-by: Mark Michelson <mmichels@redhat.com> Acked-by: Ales Musil <amusil@redhat.com> Co-Authored-By: Maxim Korezkij <maxim.korezkij@mail.schwarz> Signed-off-by: Maxim Korezkij <maxim.korezkij@mail.schwarz> Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz> Signed-off-by: Mark Michelson <mmichels@redhat.com>

QoS was not configured in OVS db when db was read only: the configuration was just ignored and not done later when OVS db became writable. It was sometimes set later, if/when a recompute happened. This is now fixed: when OVS db is read only, the ports on which qos must be applied are stored and qos will be applied when OVS db becomes writable. To avoid race conditions between delayed qos and new qos changes (e.g. a qos configuration delayed in one loop as ovs is ro, followed in next loop, when ovs becomes rw, by another qos on the same port), all qos changes are done at the same time. This issue was identified by some random failures in system test "egress qos". Reported-at: https://bugzilla.redhat.com/2234349 Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Acked-by: Ales Musil <amusil@redhat.com> Signed-off-by: Numan Siddique <numans@ovn.org>
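The deferral can be sketched in Python as below, assuming QoS requests arrive as a port-name to settings mapping on each main-loop iteration. The class, its fields and the `max_rate` key are hypothetical; the sketch only illustrates storing ports while the database is read-only and applying delayed and new changes together.

```python
class QosApplier:
    """Defers QoS updates while the database is read-only."""

    def __init__(self):
        self.pending = {}   # port name -> latest requested QoS settings
        self.applied = {}   # port name -> settings written to the database

    def run(self, requested, db_writable):
        # A newer request for the same port overrides a delayed one.
        self.pending.update(requested)
        if not db_writable:
            return
        # Apply delayed and new changes together in a single pass.
        self.applied.update(self.pending)
        self.pending.clear()


applier = QosApplier()
applier.run({"port1": {"max_rate": 100}}, db_writable=False)  # delayed
applier.run({"port1": {"max_rate": 200}}, db_writable=True)   # applied
```

Merging everything into one pending map before writing is what avoids the race between a delayed change and a newer change on the same port.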

Acked-by: Numan Siddique <numans@ovn.org> Signed-off-by: Mark Michelson <mmichels@redhat.com>

To avoid any warning spam in the northd.log remove the "hosting-chassis" status only if it was previously specified. Fixes: 19164b0 ("Expose distributed gateway port information in NB DB") Reported-at: https://issues.redhat.com/browse/FDP-54 Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

The daemon life cycle spans over the whole test case life time, which significantly speeds up test cases that rely on fmt_pkt for packet byte representations. The speed-up comes from the fact that we no longer start a python environment with all scapy modules imported on any fmt_pkt invocation; but only on the first call to fmt_pkt. The daemon is not started for test cases that don't trigger fmt_pkt. (The server is started lazily as part of fmt_pkt call.) For example, without the daemon, all tests that use fmt_pkt took the following on my vagrant box: real 17m23.092s user 26m27.935s sys 5m25.486s With the daemon, the same set of tests run in: real 2m16.741s user 2m40.155s sys 0m47.514s We may want to make the daemon global, so that it's invoked once per test suite run. But I haven't figured out, yet, how to run a trap to clean up the daemon and its socket and pid files on suite exit (and not on test case exit.) Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> unixctl impl Acked-by: Ales Musil <amusil@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com>

ovn-org/ovn-kubernetes#3830 introduced a new binary that is required to run a kind cluster. Eventually it would be good to modify the ovn-kubernetes Dockerfile in OVN to rely on the one from ovn-kubernetes repo. Submitted-at: #219 Signed-off-by: Patryk Diak <pdiak@redhat.com> Acked-by: Ales Musil <amusil@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

Minor (non-functional) fixes to commit 4023d6a. 1. Fix typo of function name: collect_lb_groups_for_ha_chassis_groups 2. Update the comments of the collect_lb_groups_for_ha_chassis_groups function to avoid confusion. 3. Rename tmp_ha_chassis to tmp_ha_ref_chassis, because HA chassis is more like the chassis belonging to a HA chassis group. Fixes: 4023d6a ("northd: Fix recompute of referenced chassis in HA chassis groups.") Signed-off-by: Han Zhou <hzhou@ovn.org> Acked-by: Mark Michelson <mmichels@redhat.com>

Every LR datapath references some LR group, but not every LR group has HA chassis groups. This patch avoids collecting LR groups and reference chassis for the LR groups without HA chassis groups. In addition, this patch also refactors the function build_ha_chassis_group_ref_chassis to avoid the unnecessary SB ha_chassis_group lookup by name, because the only field used is the name. Signed-off-by: Han Zhou <hzhou@ovn.org> Acked-by: Mark Michelson <mmichels@redhat.com>

This fixes make clean removing the daemon file because of the mechanics of pycov make target. Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

Remove some debugging code in unit-tests Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com>

Use conditional monitor for the FDB for local datapaths. Fixes: 6ec3b12 ("MAC learning: Add a new FDB table in southbound db.") Signed-off-by: Han Zhou <hzhou@ovn.org> Acked-by: Dumitru Ceara <dceara@redhat.com>

When a client sends an RS we reply with an RA to the source address; however, the client is allowed to send an RS with the unspecified source address ("::"). In that case we would send the RA to that empty address, which is not correct. According to RFC 2461 [0] the server is allowed to reply to the source address directly only if the address isn't unspecified: A router MAY choose to unicast the response directly to the soliciting host's address (if the solicitation's source address is not the unspecified address) Make sure we use the all-nodes address instead of the source when it is unspecified. [0] https://www.ietf.org/rfc/rfc2461.txt Reported-at: https://issues.redhat.com/browse/FDP-43 Signed-off-by: Ales Musil <amusil@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com>
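The destination choice can be shown with Python's standard `ipaddress` module. The function name `ra_destination` is hypothetical; ff02::1 is the link-local all-nodes multicast address.

```python
import ipaddress

ALL_NODES = ipaddress.IPv6Address("ff02::1")  # link-local all-nodes multicast


def ra_destination(rs_source):
    """Pick the destination of a Router Advertisement answering an RS."""
    src = ipaddress.IPv6Address(rs_source)
    # The unspecified address "::" cannot receive a unicast reply.
    return ALL_NODES if src.is_unspecified else src
```

For example, `ra_destination("::")` yields the all-nodes address, while `ra_destination("fe80::1")` returns the solicitation's own source.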

Multiple ovn-ic tests change routes in az1 and check whether the routes are properly updated in az2. Some of those tests add some ovn-nbctl --wait=sb sync in az2. However, if ovn-ic processes are slow for whatever reasons, such tests might have failed. Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

The test was sometimes failing if an arp packet was received after installing priority=0 flow, and before installing the flow matching arp. Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

The test was checking if a lport was properly released, but in some race conditions the logging message is slightly different. Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

- dhcpv4 : 1 HV, 2 LS - Remote port binding - MLD snoop/querier/relay - basic connectivity with multiple requested-chassis Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

The test was setting ACL rules, then sending packets, then changing ACL rules, then sending packets. Then it checked whether those packets were properly received/dropped at the end. It should check whether those packets are properly received/dropped before updating ACL rules for the second test phase, as otherwise there is no guarantee that packets are fully handled when we update the ACL rules. Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

trim_zeros was redefined multiple times in ovn.at, and was undefined for one test. Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

I have no idea why, but this is the only patch in this series where the execution time is *improved* by switching to fmt_pkt. Execution time: 8.321s Execution time on "main" branch: 8.576s Signed-off-by: Mark Michelson <mmichels@redhat.com>

This test is slower than it used to be since some of the test_ip calls that were backgrounded (with '&') cannot do this anymore. This is because there is a race condition with starting the scapy server when the calls are backgrounded. Execution time: 36.471s Execution time on "main" branch: 13.913s Signed-off-by: Mark Michelson <mmichels@redhat.com>

This test is much slower with fmt_pkt compared to main. It's likely due to the high number of fmt_pkt calls made during the test. Execution time: 17.484s Execution time on "main" branch: 4.975s Signed-off-by: Mark Michelson <mmichels@redhat.com>

Execution time: 3.413s Execution time on "main" branch: 2.675s Signed-off-by: Mark Michelson <mmichels@redhat.com>

Execution time: 3.329s Execution time on "main" branch: 2.794s Signed-off-by: Mark Michelson <mmichels@redhat.com>

... 2 peer LRs, static routes. Execution time: 4.114s Execution time on "main" branch: 3.428s Signed-off-by: Mark Michelson <mmichels@redhat.com>

Execution time: 4.308s Execution time on "main" branch: 3.170s Signed-off-by: Mark Michelson <mmichels@redhat.com>

Execution time: 4.100s Execution time on "main" branch: 3.083s Signed-off-by: Mark Michelson <mmichels@redhat.com>

Execution time: 3.791s Execution time on "main" branch: 2.897s Signed-off-by: Mark Michelson <mmichels@redhat.com>

The group_table and meter_table are initialized in ovn-controller, with n_ids = 0. Then they are re-initialized in ofctrl, with the proper number of n_ids, in state S_CLEAR_FLOWS. However, nothing prevented flows that use groups from being added (by adding logical ports) before ofctrl reached this state. This was causing some wrong flows (missing group in the flow). With this patch, as soon as the feature rconn is available, i.e. before adding flows, those tables are properly re-initialized. This issue is usually not visible in main and recent branches' CI since "Wait for new ovn-controllers to connect to Southbound.", as this delayed the moment when the test started to add ports. This was causing the following test to fail: "ECMP static routes -- ovn-northd -- parallelization=yes -- ovn_monitor_all=yes". Fixes: 1d6d953 ("controller: Don't artificially limit group and meter IDs to 16bit.") Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com>

While either check works, it's better to use a more explicit check. Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com>

The daemon will now log to scapy.log file. Log messages include stats on request processing time as well as any errors that may happen during processing. If you'd like to see even more logs (e.g. for debugging purposes), just pass --verbose to scapy-server. Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com>

When running fmt_pkt in parallel, before the patch, there could be several attempts to start scapy-server. Using flock on a test case specific file should guarantee that only one of the subshells will actually be able to start a daemon, which is designed to be a singleton. Now that start_scapy_server doesn't necessarily start a server (only the first lucky subshell will do), wait for server socket file before proceeding with ovs-appctl call. Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com>
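The test suite does this in shell with flock; the same singleton idea can be sketched in Python with `fcntl.flock` (Unix only). The helper `start_once`, the lock file name and the callback are hypothetical stand-ins for start_scapy_server.

```python
import fcntl
import os
import tempfile


def start_once(lock_path, start_server):
    """Run start_server() only if this caller wins a non-blocking file lock.

    Returns the open lock file on success (keep it open to hold the lock),
    or None when another caller already holds the lock.
    """
    lock_file = open(lock_path, "w")
    try:
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # Somebody else is already starting (or has started) the server.
        lock_file.close()
        return None
    start_server()
    return lock_file


started = []
lock_path = os.path.join(tempfile.mkdtemp(), "scapy.lock")
first = start_once(lock_path, lambda: started.append("first"))
second = start_once(lock_path, lambda: started.append("second"))
```

Only the first caller runs the start callback. Callers that lose the race still have to wait for the server socket file before talking to the daemon, as the commit message notes.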

Introduce specific flows for E/W ICMPv{4,6} need frag packets generated by the tunnel module to be delivered back to the source port, if packets to be tunnelled do not fit path MTU. This patch enables PMTUD for East/West Geneve traffic. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2241711 Tested-by: Jaime Ruiz <jcaamano@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Numan Siddique <numans@ovn.org>

We're sending TCP RST not RST-ACK as the comment suggests. Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2023-November/409288.html Reported-by: Mark Michelson <mmichels@redhat.com> Fixes: a35725a ("pinctrl: send RST instead of RST_ACK bit for lb hc") Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Ales Musil <amusil@redhat.com>

Add missing wait condition and at the same time move the wait to the last ovn-nbctl command of the chain, as there is no need to wait if there are more commands afterward. Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

When multiple OVSDB updates have been received since the last northd run it's possible that the IDL tracks changes for database entities that were added _and also_ removed since the last time the northd processing engine ran. In some cases, those may appear as being simultaneously "new" and "deleted". Currently, the only tables for which this can be a problem are NB.Load_Balancer and NB.Load_Balancer_Group. Skip these "transient records" to avoid adding soon to be deleted rows to the tracked_lb_data->crupdated_lbs records. These are used to build 'northd' I-P engine state in northd_handle_lb_data_changes(). We also avoid crashing if "unexpected" deletes are reported by the IDL. This is likely due to a bug in the IDL [0] but it's easy to avoid on the northd side. This commit also adds a test case which _might_ detect the issue when run under valgrind. The test case can't always detect the problem because a prerequisite for a Load Balancer to be "transient" is that the IDL processes the update that removes it from the NB Load_Balancer table and from the Load_Balancer_Group row that was referring to it in the following order: Load_Balancer_Group table update first and then Load_Balancer deletion. The order is controlled by the way 'struct shash' hashes records (table names in this case) and that's arch and/or compiler dependent. [0] https://issues.redhat.com/browse/FDP-193 Fixes: a24eed5 ("northd: Add initial I-P for load balancer and load balancer groups") Reported-at: https://issues.redhat.com/browse/FDP-181 Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Numan Siddique <numans@ovn.org>
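A sketch of the filtering step, assuming each tracked row exposes 'is_new' and 'is_deleted' flags. These dict keys and the function name are hypothetical stand-ins for what the IDL change tracking reports.

```python
def collect_crupdated(tracked_rows):
    """Return names of rows that should be treated as created or updated."""
    crupdated = []
    for row in tracked_rows:
        if row["is_new"] and row["is_deleted"]:
            continue  # transient: added and removed between two engine runs
        if row["is_deleted"]:
            continue  # plain deletions are handled separately
        crupdated.append(row["name"])
    return crupdated


rows = [
    {"name": "lb1", "is_new": True, "is_deleted": False},
    {"name": "lb2", "is_new": True, "is_deleted": True},   # transient
    {"name": "lb3", "is_new": False, "is_deleted": True},
]
```

Here only lb1 would end up in the created-or-updated list; lb2 never reaches the engine state at all.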

The userspace lacks the supplementary protocol state machine for SCTP, resulting in the absence of 'protoinfo' fields. Nevertheless, this SCTP test doesn't need this feature, making the check for it unnecessary. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Simon Horman <horms@ovn.org> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

This patch moves the SCTP test from the kernel only, to the general OVN system tests, enabling its execution in both the system-userspace and system-dpdk test scenarios. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Simon Horman <horms@ovn.org> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

The goal is to test northd performance so there's no point to start hypervisors. Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Acked-by: Numan Siddique <numans@ovn.org>

Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Acked-by: Numan Siddique <numans@ovn.org>

Trigger a full recompute after each DB build run and record the results. Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Acked-by: Numan Siddique <numans@ovn.org>

Make sure the capture is up before continuing the test. A similar fix was committed earlier via 6c4ffe5 ("system-test: Use OVS_WAIT_UNTIL for tcpdump start instead fo sleep") but other tests were added in the meantime. Suggested-by: Xavier Simonart <xsimonar@redhat.com> Fixes: 086744a ("northd: Use LB port_str in northd.") Reported-at: https://issues.redhat.com/browse/FDP-192 Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com>

In case there's a test failure while daemons are stopped ensure that we send a SIGCONT on exit so that they properly clean up. Fixes: 30952c2 ("binding: fixed ovn-installed not properly removed (recomputes)") Fixes: 0794a6e ("qos: fix potential double deletion of ovs idl row") Fixes: feb9184 ("northd: Skip transient IDL records.") Suggested-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Xavier Simonart <xsimonar@redhat.com>

When a logical switch port was deleted and added back quickly, it could happen that the lsp was never reported up Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Acked-by: Ales Musil <amusil@redhat.com> Signed-off-by: Numan Siddique <numans@ovn.org>

If a router has a SNAT rule whose external IP isn't an LRP address, an ARP request from another router for this external IP is dropped: the external IP uses the same MAC address as the LRP, so the request cannot be forwarded to MC_FLOOD. Fixes: 32f5ebb ("ovn-northd: Limit ARP/ND broadcast domain whenever possible.") Reported-at: #209 Signed-off-by: Daniel Ding <danieldin186@gmail.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

Signed-off-by: Dumitru Ceara <dceara@redhat.com>

This reverts commit 450e41e. If a packet has to be tunnelled to another node and if the physical interface used for tunnelling has lower MTU than the packet or if there is a route exception with a lower MTU, then the geneve kernel module generates an ICMP need frag packet. This packet was getting dropped since the metadata had to be swapped. The commit did exactly that and fixed the issue. But it has 2 issues - 1. It introduced a regression for the scenario when an ICMP need frag packet generated outside of OVN has to be tunnelled and delivered to the destination VM/pod. These ICMP need frag packets are now dropped. 2. If the logical switches have ACLs or load balancers configured then these ICMP need frag packets are dropped as they are not sent to the correct zone. It's better to revert until we find a proper solution for the original issue. Reported-at: https://issues.redhat.com/browse/FDP-216 Fixes: 450e41e ("ovn: add geneve PMTUD support") Signed-off-by: Numan Siddique <numans@ovn.org> Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Numan Siddique <numans@ovn.org>

numansiddique and others added 30 commits September 13, 2023 21:59

ovn-northd: Fix unknown table "port_group_set" warning.

e53a3ac

_DUMP_DB_TABLES macro should instead dump "port_group" table. Fixes: 5b6a7ad ("northd: Add incremental processing for NB port groups.") Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Numan Siddique <numans@ovn.org>

Set release date for 23.09.0.

9fe6c7b

Acked-by: Numan Siddique <numans@ovn.org> Signed-off-by: Mark Michelson <mmichels@redhat.com>

Rename scapy-server into scapy-server.py

01252a2

This fixes make clean removing the daemon file because of the mechanics of pycov make target. Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

test: get rid of debugging code

686caaf

Remove some debugging code in unit-tests Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com>

ovn-controller: Add monitor condition for FDB.

2b1d8e1

Use conditional monitor for the FDB for local datapaths. Fixes: 6ec3b12 ("MAC learning: Add a new FDB table in southbound db.") Signed-off-by: Han Zhou <hzhou@ovn.org> Acked-by: Dumitru Ceara <dceara@redhat.com>

tests: skip test "MAC binding aging" if scapy not available.

4f7359e

Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

tests: move trim_zeros() to ovn-macros

f8df558

trim_zeros was redefined multiple times in ovn.at, and was undefined for one test. Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

putnopvut and others added 29 commits November 29, 2023 12:54

tests: Use fmt_pkt in 1 HV, 1 LS, 2 lport/LS, 1 LR.

7eaf013

Execution time: 3.413s Execution time on "main" branch: 2.675s Signed-off-by: Mark Michelson <mmichels@redhat.com>

tests: Use fmt_pkt in 1 HV, 2 LSs, 1 lport/LS, 1 LR.

1ef5eb7

Execution time: 3.329s Execution time on "main" branch: 2.794s Signed-off-by: Mark Michelson <mmichels@redhat.com>

tests: Use fmt_pkt in 2 HVs, 3 LS, 1 lport/LS, ...

20617e1

... 2 peer LRs, static routes. Execution time: 4.114s Execution time on "main" branch: 3.428s Signed-off-by: Mark Michelson <mmichels@redhat.com>

tests: Use fmt_pkt in 2 HVs, 3 LRs connected via LS, static routes.

046e11f

Execution time: 4.308s Execution time on "main" branch: 3.170s Signed-off-by: Mark Michelson <mmichels@redhat.com>

tests: Use fmt_pkt in 2 HVs, 2 LRs connected via LS, gateway router.

da78391

Execution time: 4.100s Execution time on "main" branch: 3.083s Signed-off-by: Mark Michelson <mmichels@redhat.com>

tests: Use fmt_pkt in icmp_reply: 1 HVs, 2 LSs, 1 lport/LS, 1 LR.

2638d1e

Execution time: 3.791s Execution time on "main" branch: 2.897s Signed-off-by: Mark Michelson <mmichels@redhat.com>

fmt_pkt: don't subshell when calling ovs-appctl

74e7ba1

Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com>

fmt_pkt: use -S check to wait for scapy sock file

cd3dd36

While either check works, it's better to use a more explicit check. Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com>

perf-northd.at: Don't start ovn-controllers.

207f414

The goal is to test northd performance so there's no point to start hypervisors. Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Acked-by: Numan Siddique <numans@ovn.org>

perf-northd.at: Parse and display more stopwatch data.

9e3cf5f

Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Acked-by: Numan Siddique <numans@ovn.org>

perf-northd.at: Add ovn-northd recompute statistics.

147a126

Trigger a full recompute after each DB build run and record the results. Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Mark Michelson <mmichels@redhat.com> Acked-by: Numan Siddique <numans@ovn.org>

AUTHORS: Add Daniel Ding.

22dcf5a

Signed-off-by: Dumitru Ceara <dceara@redhat.com>

ts170710 merged commit 27d9424 into ts170710:main Dec 27, 2023
