Signed-off-by: Raphael Tryster <raphaelt@nvidia.com>
What I did
Implemented a solution for Lab 5 of the SONiC training, based on one of the solutions provided in the Wiki, with some changes. It reads the tx error counter from the counters DB instead of calling SAI, passes the statistics data by reference so that the difference between successive counter readings is computed correctly, and adds Python code to handle some not-found conditions.
Why I did it
I started from a prepared solution, studied it with gdb, and modified it based on what I observed there. So much of the way the code is written in C++ is unfamiliar to me that I expected writing from scratch to take an order of magnitude more than the time allotted, and I judged that I would benefit more from studying and modifying existing code.
How I verified it
Configure polling period, e.g.
sudo config tx_error_stat_poll_period 30
Configure error threshold, e.g.
sudo config interface tx_error_threshold set Ethernet0 10
Disable automatic refresh of counters, so that counter values can be injected into DB and stay there:
counterpoll port disable
Look up the OID of the port being tested, e.g.
redis-cli -n 2 hgetall "COUNTERS_PORT_NAME_MAP" | grep -A1 "Ethernet0"
Ethernet0
oid:0x10000000008d4
Inject tx errors to that port and verify that DB was updated:
redis-cli -n 2 hset "COUNTERS:oid:0x10000000008d4" "SAI_PORT_STAT_IF_OUT_ERRORS" "20"
(integer) 0
redis-cli -n 2 hget "COUNTERS:oid:0x10000000008d4" "SAI_PORT_STAT_IF_OUT_ERRORS"
"20"
Show status within the poll period:
show interfaces tx_error
Port status statistics
Ethernet0 error 20
Show status after the poll period:
show interfaces tx_error
Port status statistics
Ethernet0 ok 20
Repeat sequences of the above commands to verify that status becomes error when the delta exceeds the threshold, and ok when it doesn't.
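For reference, the behaviour these steps exercise can be sketched as follows. This is a minimal illustration in Python, not the code from this PR: the class name TxErrorMonitor, the poll method, and the starting baseline of 0 are assumptions made for the example, and "exceeds" is taken to mean strictly greater than the threshold.

```python
class TxErrorMonitor:
    """Tracks one port's tx error counter between polls (hypothetical helper)."""

    def __init__(self, threshold, baseline=0):
        # threshold: maximum tolerated increase in tx errors per poll period
        # baseline: counter value assumed at the previous poll (assumption: 0)
        self.threshold = threshold
        self.last_count = baseline

    def poll(self, current_count):
        # Compare the increase since the previous poll with the threshold,
        # then remember the current reading for the next poll.
        delta = current_count - self.last_count
        self.last_count = current_count
        status = "error" if delta > self.threshold else "ok"
        return status, current_count


# Mirrors the steps above: threshold 10, counter injected as 20.
monitor = TxErrorMonitor(threshold=10)
first = monitor.poll(20)   # delta of 20 exceeds the threshold
second = monitor.poll(20)  # counter unchanged, so delta is 0
print(first, second)
```

The stored reading has to persist between polls. If each poll worked on a copy of the statistics, the delta would always be measured against a stale baseline, which is why the change passes the statistics data by reference.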
Details if related