Add Voqs to Virtual Switch (#1061) #1161

Closed
wants to merge 54 commits into from

Conversation

skbarista
Contributor

  • Add Voqs to Virtual Switch
  • Add UnitTest.

Pterosaur and others added 30 commits April 29, 2022 18:38
* vslib: add support for read-only port capabilities

Signed-off-by: Dante Su <dante.su@broadcom.com>

* vslib: Drop LT capability query

Signed-off-by: Dante Su <dante.su@broadcom.com>
Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>

The motivation for this change is described in the proposal sonic-net/SONiC#935 and in the SAI proposal opencomputeproject/SAI#1404.
NOTE: Requires updating SAI once opencomputeproject/SAI#1404 is in.
1. Set up the pipeline without manual effort when checking out a new release branch.
2. Use the correct branch when downloading artifacts or checking out related repos.
3. Clear downloaded artifacts to avoid using outdated dependencies.
4. Use the commonlib pipeline to download libnl3 and libyang instead of the vs image build, to increase the success rate.
5. Add a weekly build to keep artifacts available.
…1057)

Since sonic-db-cli depends on libswsscommon, we could not simply only purge libswsscommon, so we purge both together.
Fix sonic-net/sonic-buildimage#10850.

An ASIC may have different port types, and therefore different port stats capabilities.
Instead of using a cached supported port counter ID list for all ports, it gets supported
port counter list per port.
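
As a rough illustration of the per-port idea, the sketch below caches the supported counter list separately for each port instead of once for all ports. The class name, the `probe` callback and the counter names are hypothetical and are not taken from the sairedis code.

```python
from typing import Callable, Dict, FrozenSet, Sequence


class PerPortCounterCapabilities:
    """Caches the supported counter IDs separately for each port."""

    def __init__(self, probe: Callable[[str, str], bool], counter_ids: Sequence[str]):
        # probe(port, counter_id) -> True if that port supports the counter.
        # Hypothetical callback standing in for a per-port capability query.
        self._probe = probe
        self._counter_ids = list(counter_ids)
        self._supported: Dict[str, FrozenSet[str]] = {}

    def supported(self, port: str) -> FrozenSet[str]:
        # Probe each port once, then reuse that port's own result.
        if port not in self._supported:
            self._supported[port] = frozenset(
                c for c in self._counter_ids if self._probe(port, c)
            )
        return self._supported[port]
```

Two ports of different types then get different lists, and a repeated lookup for the same port does not probe again.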
The previous regex could match only one device, so the original MACsec devices could not be cleaned up by config reload.
…ndorSai.cpp (sonic-net#1064)

Introduce a check in configure.ac for whether sai_query_api_version exists.
If it does not, do not call this function in syncd.

Signed-off-by: Guohan Lu <lguohan@gmail.com>
)

It is required to have -e in .SHELLFLAGS to fail on such errors.

Also, fixed install paths for python bindings and disabled python2 bindings when DEB_BUILD_PROFILES=nopython2 is passed.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
In azp run, the following failure always happens at the stage of `make check` of building syncd. 

```
Making check in syncd
make[2]: Entering directory '/__w/1/s/syncd'
make check-TESTS
make[3]: Entering directory '/__w/1/s/syncd'
tests: tests.cpp:843: void test_watchdog_timer_clock_rollback(): Assertion `settimeofday(&currentTime, NULL) == 0' failed.
/bin/bash: line 5: 13004 Aborted (core dumped) ${dir}$tst
FAIL: tests
```
The execution of `settimeofday(&currentTime, NULL)` fails in the slave docker with errno **EPERM**, because the CAP_SYS_TIME capability is dropped in docker. Running the container with `--privileged` gives it extended privileges, so the call succeeds.

This failure has existed for a long time in the azp build and was not exposed until sonic-net#1050.
…net#1068)

The fix sonic-net#1067 is not enough. If the docker user is non-root, set the CAP_SYS_TIME capability so that settimeofday succeeds in the syncd test and test_watchdog_timer_clock_rollback can run.

Co-authored-by: junhuazhai <junhuazhai@contoso.com>
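
One way to see whether a process would hit this EPERM is to look at its effective capability mask. The sketch below is a minimal, Linux-only illustration that parses the `CapEff` line of `/proc/<pid>/status`; the helper names are hypothetical and not part of the syncd tests. CAP_SYS_TIME is bit 25 in `<linux/capability.h>`.

```python
CAP_SYS_TIME = 25  # bit index from <linux/capability.h>


def has_cap_sys_time(status_text: str) -> bool:
    """Return True if the CapEff mask in /proc/<pid>/status text includes CAP_SYS_TIME."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split(":", 1)[1].strip(), 16)
            return bool(mask & (1 << CAP_SYS_TIME))
    raise ValueError("no CapEff line found")


def current_process_has_cap_sys_time() -> bool:
    # Linux only: reads the calling process's own status file.
    with open("/proc/self/status") as f:
        return has_cap_sys_time(f.read())
```

A test that needs `settimeofday` could use such a check to report a missing capability clearly instead of aborting on an assertion.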
…et#1049)

The command to build a check program was:

```
gcc -lsai -I./SAI/inc -I./SAI/experimental -I./SAI/meta conftest.cpp -o conftest
```

This, however, does not work on gcc 10.2, where the linker wants the
source file first and then the shared library:

```
gcc -I./SAI/inc -I./SAI/experimental -I./SAI/meta conftest.cpp -lsai -o conftest
```

However, CXX_FLAGS comes before the source file, so --no-as-needed was
added as a fix for this issue to make the check link against libsai in
any order passed to the linker.

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
Co-authored-by: liuh-80 <azureuser@liuh-dev-vm-02.5fg3zjdzj2xezlx1yazx5oxkzd.hx.internal.cloudapp.net>
Previously, log rotation was performed only after the request, on the
first log line recorded afterwards. This caused delays and issues with
syslog logrotate: when the HUP signal was sent, the actual log rotation
was not performed and the handle to sairedis.rec was still open,
preventing logrotate from happening.
Signed-off-by: Venkat Garigipati <venkatg@cisco.com>
Signed-off-by: Ze Gan <ganze718@gmail.com>
Update the recording file if start recording was not performed previously
Upgrade SAI submodule to support build saithrift with python 3.9 in bullseye

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
…t#1075)

SAI_OBJECT_TYPE_TUNNEL has fewer attributes in 201811 than in 202012. These new attributes are CREATE only and can't be added using only the SET operation. Hence the old object needs to be removed and a new object needs to be added.
Moreover, the existing sequence (make before break) causes SAI errors when removing the VXLAN tunnel (as part of CPA teardown).

Adding SAI_OBJECT_TYPE_TUNNEL to break before make to avoid creating a new object before removing existing ones.
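
The difference between the two orderings can be sketched as follows. This is only an illustration of the ordering decision; the function name and the string-based operations are hypothetical and do not mirror the syncd comparison logic.

```python
from typing import List, Set, Tuple

Op = Tuple[str, str]  # (operation, object id)


def plan_replacement(object_type: str, old_id: str, new_id: str,
                     break_before_make: Set[str]) -> List[Op]:
    """Order the remove/create pair used to replace one object."""
    remove, create = ("remove", old_id), ("create", new_id)
    if object_type in break_before_make:
        # Break before make: the old object is gone before the new one exists.
        return [remove, create]
    # Make before break: the new object is created first.
    return [create, remove]
```

With `SAI_OBJECT_TYPE_TUNNEL` in the `break_before_make` set, the old tunnel is removed first; other object types keep the create-then-remove order.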
HLD: sonic-net/SONiC#1020

**Why I did it?**
The FlexCounter class represents a flex counter group, which supports querying multiple statistic/attribute types such as port counter, port debug counter, queue counter, queue attribute and so on. For each statistic/attribute type, it defines several member functions:

- setXXXCounterList: e.g. setPortCounterList, setPortDebugCounterList
- removeXXX: e.g. removePort, removeQueue
- collectXXXCounters: e.g. collectPortCounters, collectQueueCounters
- collectXXXAttr: e.g. collectQueueAttrs, collectPriorityGroupAttrs
- and so on

For different statistic/attribute types, these functions have very similar logic. This PR moved similar logic to single place to avoid redundant code.

**How I test it?**

Almost full unit test coverage of newly added code.
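
The refactoring idea (one shared implementation of set/remove/collect, parameterized per statistic type) could look roughly like the sketch below. The class name and the `read_stats` callback are hypothetical; the actual C++ design in the PR may differ.

```python
from typing import Callable, Dict, List, Sequence


class CounterContext:
    """Shared set/remove/collect logic for one statistic or attribute type."""

    def __init__(self, name: str,
                 read_stats: Callable[[str, List[str]], List[int]]):
        self.name = name
        # Hypothetical per-type reader: (object id, counter ids) -> values.
        self._read_stats = read_stats
        self._counter_ids: Dict[str, List[str]] = {}

    def set_counter_list(self, oid: str, counter_ids: Sequence[str]) -> None:
        self._counter_ids[oid] = list(counter_ids)

    def remove(self, oid: str) -> None:
        self._counter_ids.pop(oid, None)

    def collect(self) -> Dict[str, Dict[str, int]]:
        # Same loop for every type; only the reader differs.
        return {
            oid: dict(zip(ids, self._read_stats(oid, ids)))
            for oid, ids in self._counter_ids.items()
        }
```

A port context and a queue context would then be two instances with different readers rather than two sets of near-identical member functions.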
…cket (sonic-net#1080)

Why I did it
When MDIO devices (external PHYs) are connected on MDIO bus from NPU, the MDIO access is through SAI switch mdio read/write APIs. The syncd calling the SAI APIs needs to act as an IPC server so that the gbsyncd programming the MDIO devices can use the APIs by the IPC mechanism.

How I did it
The MdioIpcServer class is added to start a new thread, create a unix socket, listen on the socket, accept connections and read/reply to IPC messages. The corresponding functions for MDIO clause 45 and clause 22 access are also added to the VendorSai class.

How to verify it
We can use socat to simulate the IPC client, e.g.
`docker exec -it syncd socat - UNIX-CONNECT:/var/run/sswsyncd/mdio-ipc.srv`
- to read an MDIO clause 45 register at an address and an offset: `mdio <address> <reg offset>`
- to write an MDIO clause 45 register at an address and an offset with a value: `mdio <address> <reg offset> <value>`
- to read an MDIO clause 22 register at an address and an offset: `mdio-cl22 <address> <reg offset>`
- to write an MDIO clause 22 register at an address and an offset with a value: `mdio-cl22 <address> <reg offset> <value>`

Signed-off-by: Jiahua Wang <jiahua.wang@broadcom.com>
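
The command lines listed under "How to verify it" suggest a simple line-oriented text format. The sketch below parses such a line into a request; it is an illustration only. The `MdioRequest` type is hypothetical, the numeric format (C-style literals such as `0x10`) is an assumption, and the reply format is not shown because the description above does not specify it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MdioRequest:
    clause: int            # 45 for "mdio", 22 for "mdio-cl22"
    address: int
    reg_offset: int
    value: Optional[int]   # None means read, otherwise write


_CLAUSE_BY_COMMAND = {"mdio": 45, "mdio-cl22": 22}


def parse_mdio_command(line: str) -> MdioRequest:
    """Parse 'mdio <address> <reg offset> [<value>]' or the mdio-cl22 variant."""
    parts = line.split()
    if len(parts) not in (3, 4) or parts[0] not in _CLAUSE_BY_COMMAND:
        raise ValueError(f"unrecognized MDIO command: {line!r}")
    # Assumption: numbers are C-style literals (0x.. hex or plain decimal).
    numbers = [int(p, 0) for p in parts[1:]]
    value = numbers[2] if len(numbers) == 3 else None
    return MdioRequest(_CLAUSE_BY_COMMAND[parts[0]], numbers[0], numbers[1], value)
```

Three fields mean a read and four mean a write, matching the four command shapes listed above.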
…ily (sonic-net#1089)

* Added check if SDE profile configured

* put $PROFILE_DEFAULT into quotes, removed one extra space before $PROFILE_DEFAULT
Transfer organization from Azure to sonic-net
Note: the build may fail due to SAI header dependency. Vendor SAI implementation shall include this PR: opencomputeproject/SAI#1352

HLD: https://github.com/sonic-net/SONiC/blob/master/doc/bulk_counter/bulk_counter.md

**Why I did this?**

PR https://github.com/opencomputeproject/SAI/pull/1352/files introduced new SAI APIs that support bulk stats:

- sai_bulk_object_get_stats
- sai_bulk_object_clear_stats

SONiC flex counter infrastructure shall utilize the bulk stats API to gain better performance. This document discusses how to integrate these two new APIs to SONiC.

**What I did?**

1. Support using bulk stats APIs based on object type. E.g. for a counter group that queries queue and pg stats, where queue stats support bulk while pg stats do not, queue stats shall use the bulk API and pg stats shall use the non-bulk API
2. Automatically fall back to old way if bulk stats APIs are not supported
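
The two points above can be sketched as a collector that tries the bulk path per object type and remembers when it is unavailable. Everything here is hypothetical scaffolding: `BulkNotSupported`, `bulk_get` and `single_get` stand in for the real SAI calls and their status codes, which are not modeled.

```python
from typing import Callable, Dict, List, Sequence


class BulkNotSupported(Exception):
    """Raised by the (hypothetical) bulk reader when bulk stats are unavailable."""


class StatsCollector:
    def __init__(self,
                 bulk_get: Callable[[str, Sequence[str], Sequence[str]], List[List[int]]],
                 single_get: Callable[[str, str, Sequence[str]], List[int]]):
        self._bulk_get = bulk_get
        self._single_get = single_get
        self._bulk_ok: Dict[str, bool] = {}  # decided per object type on first use

    def collect(self, object_type: str, oids: Sequence[str],
                counter_ids: Sequence[str]) -> Dict[str, List[int]]:
        if self._bulk_ok.get(object_type, True):
            try:
                rows = self._bulk_get(object_type, oids, counter_ids)
                self._bulk_ok[object_type] = True
                return dict(zip(oids, rows))
            except BulkNotSupported:
                # Remember the result so later polls skip the bulk attempt.
                self._bulk_ok[object_type] = False
        # Fallback: the old one-object-at-a-time path.
        return {oid: self._single_get(object_type, oid, counter_ids) for oid in oids}
```

In the queue/pg example, queue polls keep using the bulk reader while pg polls fall back once and then stay on the per-object path.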

**How I test this**

Almost full unit test coverage
Manual test
* [asan] suppress the static variable leaks

This is to suppress ASAN false positives for static variables.
For example, the ServiceMethodTable::m_slots is sometimes reported as
leaked.

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>

* [asan] add missing SWSS_LOG_ENTER()

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
In vs test, run 20 tests at a time instead of running all the tests together.
This may help in fixing the test issue.
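
Splitting a test list into groups of 20 can be sketched like this. The helper name is hypothetical, and how the pipeline actually invokes each group is not shown in the commit message.

```python
from typing import Iterator, List, Sequence


def batches(tests: Sequence[str], size: int = 20) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` tests, preserving order."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(tests), size):
        yield list(tests[start:start + size])
```

Each yielded group would then be passed to one test-runner invocation.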
richardyu-ms and others added 16 commits September 16, 2022 20:32
Upgrade submodule SAI head to latest commit 566d4a8

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
…t-boot (sonic-net#1100)

Fast-reboot utilizes the warm-reboot infrastructure to improve its performance, but it should ignore warm-boot logic when syncd starts in fast-boot.
It also shouldn't use a temporary view between init and apply.
* Add Voqs to Virtual Switch
* Add UnitTest.
* Fix libyang missing in lgtm validation issue
The sairedis changes related to the SAI here: opencomputeproject/SAI#1365
This change is to support SAI_PORT_ATTR_PORT_SERDES_ID on the SAI for vs gearbox
Fix sonic-net#1131. The communication channel between orchagent and gbsyncd needs to be different from
the channel "NOTIFICATIONS" between orchagent and syncd.
* Make changes to building and packaging sairedis

This commit includes the following changes:
1. Use Debian build profiles instead of custom build targets to build
different configurations of sairedis. Build profiles were designed for
this purpose. This also makes the debian/rules file a bit cleaner.
2. Rely on the debug packages being automatically created, instead of us
explicitly specifying it in debian/control.
3. Add actual support for excluding Python 2 binding build.
4. Make sure the compile flags used for building Python 2 and Python 3
are actually correct.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Update pipeline file

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Fix argument for profile

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Exclude libsai package from the list of dependencies

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>

* Allow multiple syslog artifacts during tests

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
…)" (sonic-net#1141)

This may be causing or exposing some issues during tests.

This reverts commit 7b39fc4.
* Fixing issue #11621
* The cleanup code for stale RIF counters has been moved to syncd. The cleanup was originally added as part of the fix for issue #2193.
* However, it could create a race condition between orchagent removing RIF rate counters from the DB and the Lua script fetching them.
* As a fix, all such cleanup has been moved to syncd.

Signed-off-by: Suman Kumar <suman.kumar@broadcom.com>
* [SAI submodule update] Enable support for SAI v1.11.0

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* update SAI for saithrift fix

* upgrade to latest sai 1.11

add cases

fix a code issue

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

* refactor code

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
While the Recorder is writing to the sairedis log file, it's possible for a log rotate to occur at exactly the right time so that file stream used for writing (a std::ofstream) is re-opened in the middle of the write operation (since writing and log rotate are handled by separate threads). Since standard library objects are not thread safe, this can cause some pointers used during the write operation to be overwritten, leading to a segmentation fault when the write operation proceeds.

To prevent this from occurring, acquire a lock for the file stream for any methods that change the ofstream (including opening, closing, and writing to it with the << operator). Also use recordLine for all writes to the file stream to avoid deadlock.
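
The locking scheme described above can be illustrated with a small sketch. It uses `threading.Lock` in place of a C++ mutex, and the class and method names only loosely follow the description; it is not the sairedis Recorder. Routing every write through one method matters because the lock here is non-reentrant: a method that already holds it must not call another method that takes it again.

```python
import threading


class Recorder:
    def __init__(self, path: str):
        self._path = path
        self._lock = threading.Lock()  # non-reentrant; guards self._file
        self._file = open(path, "a")

    def record_line(self, line: str) -> None:
        # The only method that writes to the stream.
        with self._lock:
            self._file.write(line + "\n")
            self._file.flush()

    def reopen(self) -> None:
        # Called from the log-rotate thread; swaps the stream under the lock
        # so no write can be in progress while the handle changes.
        with self._lock:
            self._file.close()
            self._file = open(self._path, "a")

    def close(self) -> None:
        with self._lock:
            self._file.close()
```

With this arrangement a rotate request can arrive at any time without a writer ever seeing a half-replaced stream.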

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
Signed-off-by: richardyu-ms <richard.yu@microsoft.com>

Signed-off-by: richardyu-ms <richard.yu@microsoft.com>
* validation support for SAI_ATTR_VALUE_TYPE_JSON
* add sairedis-lib and vslib tests for generic programmable
@yxieca
Contributor

yxieca commented Dec 1, 2022

@skbarista can you help check the check failures?

@skbarista
Contributor Author

/azpw run Azure.sonic-sairedis

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-sairedis

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@skbarista
Contributor Author

@yxieca the failures are not related to my changes. I see other pr #1159 also facing similar failures. I have restarted the tests

@yxieca
Contributor

yxieca commented Dec 1, 2022

> @yxieca the failures are not related to my changes. I see other pr #1159 also facing similar failures. I have restarted the tests

@skbarista LGTM is pointing out a build error, please take a look.

@skbarista
Contributor Author

skbarista commented Dec 1, 2022

"[2022-12-01 19:06:10] [build-stderr] In file included from defaultvalueprovider.cpp:10:
[2022-12-01 19:06:10] [build-stderr] defaultvalueprovider.h:8:10: fatal error: libyang/libyang.h: No such file or directory
[2022-12-01 19:06:10] [build-stderr] 8 | #include <libyang/libyang.h>
[2022-12-01 19:06:10] [build-stderr] | ^~~~~~~~~~~~~~~~~~~
[2022-12-01 19:06:10] [build-stderr] compilation terminated.
[2022-12-01 19:06:10] [build-stderr] make[3]: *** [Makefile:809: libswsscommon_la-defaultvalueprovider.lo] Error 1
[2022-12-01 19:06:10] [build-stderr] make[3]: *** Waiting for unfinished jobs....
[2022-12-01 19:06:14] [build-stdout] make[3]: Leaving directory '/opt/src/sonic-swss-common/common'
[2022-12-01 19:06:14] [build-stderr] make[2]: *** [Makefile:440: all-recursive] Error 1
[2022-12-01 19:06:14] [build-stdout] make[2]: Leaving directory '/opt/src/sonic-swss-common'
[2022-12-01 19:06:14] [build-stdout] make[1]: Leaving directory '/opt/src/sonic-swss-common'
[2022-12-01 19:06:14] [build-stderr] make[1]: *** [Makefile:372: all] Error 2
[2022-12-01 19:06:14] [build-stderr] dh_auto_build: error: make -j4 returned exit code 2
[2022-12-01 19:06:14] [build-stderr] make: *** [debian/rules:33: build] Error 25
[2022-12-01 19:06:14] [build-stderr] dpkg-buildpackage: error: debian/rules build subprocess returned exit status 2
[2022-12-01 19:06:14] [ERROR] Spawned process exited abnormally (code 2; tried to run: [/opt/work/lgtm-workspace/lgtm/extract.s"

The error is above. Looks to be an environment issue. sonic-sairedis build with the latest code passed. @yxieca

*[Cisco] enable SAI bulk API config from syncd to pass a set of Rid and review code-flow performance on cisco SAI/SDK layer. until route-check.py(sonic-utility) will not dump the associated ERR message.
@skbarista skbarista force-pushed the voq-counters-202205 branch 2 times, most recently from c2ca202 to 47503f1 Compare December 2, 2022 19:49
@skbarista skbarista closed this Dec 2, 2022
@skbarista skbarista deleted the voq-counters-202205 branch December 2, 2022 19:50
@skbarista
Contributor Author

skbarista commented Dec 2, 2022

I messed up something when I tried to merge upstream into my branch. So closed this pr and created a new pr #1162
