-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Buffer Queue Not Changed to Use Queue Data from Config #19624
Comments
It is found that GNMI db_client does support multi-asic handling. This issue is on the client side. Need to handle on the gnmi client to indicate which namespace. The fix will be on sonic-mgmt side. Close this one. There's really one issue still exist for COUNTERS_QUEUE_NAME_MAP on COUNTERS_DB. From the design if create_only_config_db_buffers is set to true, the COUNTERS_QUEUE_NAME_MAP should get the data from config. Otherwise it's get from SAI maximum numbers of buffer queue support. With current observation, no matter create_only_config_db_buffers is set to true or not, COUNTERS_QUEUE_NAME_MAP always contains maximum number of queues (16). This makes the telemetry test test_telemetry_queue_buffer_cnt fails even with the multi-asic support code added into sonic-mgmt. |
@qiluo-msft could you check @wumiaont finding here on the flag create_only_config_db_buffers. |
@wumiaont Do you intend to close this issue? |
I would like to use this one to track the COUNTERS_QUEUE_NAME_MAP issue mentioned in the comments. I will modify this Issue description to have more accurate info. |
@qiluo-msft Please help to narrow down the issue to see what could be not right here. |
Further analysis found that FlexCounterOrch need support for multi-asic/multi-linecard scenario. For example, in a multi-linecard and multi-asic scenario. The keys of BUFFER_QUEUE looks like this: [ixre-egl-board40|asic0|Ethernet24|5-6]. In a single card single asic scenario this buffer queue keys is like this: Ethernet24|5-6. There could be various different combinations such as single card multi asic | multi cards single asic etc. The class needs to handle those situation properly. More of a concern is not just FlexCounterOrch. Looks to me Orchagent needs to support those above different scenarios from an infrastructure level. |
Does not see the code logic is correct for PortsOrch::generateQueueMapPerPort(const Port& port, FlexCounterQueueStates& queuesState, bool voq). The port is coming from generateQueueMap() which is walking through all ports. queuesState contains configured data from BUFFER_QUEUE. But it seems this queuesState is not get used properly at all so no matter what you configured on BUFFER_QUEUE it would not take effect. @qiluo-msft |
let me take a look at it |
@wumiaont If you look at the code in BufferOrch::processQueue, we added the following code under ~voq check after the discussion with MSFT, Nokia and Arista.(sonic-net/sonic-swss#3050) So when the Port BUFFER_QUEUE config is added or removed, you will not see change in FLEX_COUNTER_DB and COUNTERS_DB. -Sakthi |
In this case test_telemetry_queue_buffer test won't apply when the switch type is voq. I will add check in test_telemetry_queue_buffer to check switch type and skip test if type is 'voq'. |
@vmittal-msft The code flow I am checking is this: FlexCounterOrch::doTask --> getQueueConfigurations() (This is the only place to check if create_only_config_db_buffers is set or not, then get queueState info accordingly) ---> PortsOrch::generateQueueMap() ----> PortsOrch::generateQueueMapPerPort(const Port& port, FlexCounterQueueStates& queuesState, bool voq)
I do not see create_only_config_db_buffers could take effect based on the above code logic. My question to you is that is this a bug or it's designed this way? #19624 was created for this issue. I closed this one yesterday as I am told that in voq scenario the queue name map should always have full queue mapping it will never get from config. I am double checking with you to get it confirmed. Also want to know for non voq switch_type case should this flag be supported or not? Please advice. Thanks. |
Reopen this one as we currently skip test because of this PR. When this is closed, the test seems to not be skipped and result in failure. We need answer for the above concern. |
Description
Issue is found when running Telemetry test case of test_telemetry_queue_buffer_cnt. This test sets "create_only_config_db_buffers" to true in config db, to create only relevant counters. Then remove one of the buffer queues. After that it Uses gnmi to query COUNTERS_QUEUE_NAME_MAP for Ethernet0 compare number of queue counters on Ethernet0. It is expected that it will less than previous count.
There were multi-asic support issues with the sonic-mgmt for test_telemetry_queue_buffer_cnt. After issues have been resolved on sonic-mgmt side it found that COUNTERS_QUEUE_NAME_MAP from COUNTERS_DB under proper namespace still uses maximum number of buffer queue SAI support (16) even when create_only_config_db_buffersis set to true in CONFIG file. This causes test failed.
For example: With multi-asic chassis: sonic-db-cli -n asic0 COUNTERS_DB hgetall "COUNTERS_QUEUE_NAME_MAP"
Steps to reproduce the issue:
On Multi-Asic chassis, set create_only_config_db_buffers in config to be true.
"DEVICE_METADATA": {
"localhost": {
"create_only_config_db_buffers ": "true",
................................
}
}
Reload the config. (sudo config reload -y)
After reload check COUNTERS_QUEUE_NAME_MAP
sonic-db-cli -n asic0 COUNTER_DB hgetall COUNTERS_QUEUE_NAME_MAP.
You will see Ethernet0:0 to Ethernet0:15 exists. From config under BUFFER_QUEUE it's defined only from 0-6.
Describe the results you received:
test_telemetry_queue_buffer_cnt failed without able to detect COUNTERS_QUEUE_NAME_MAP changes with config change.
Describe the results you expected:
test_telemetry_queue_buffer_cnt successes
Output of
show version
:SONiC Software Version: SONiC.HEAD.770825-nokia-master-470b90a084
SONiC OS Version: 12
Distribution: Debian 12.6
Kernel: 6.1.0-11-2-amd64
Build commit: 470b90a084
Build date: Tue Jul 16 17:31:07 UTC 2024
Built by: gitlab-runner@sonic-bld2
-->
The text was updated successfully, but these errors were encountered: