[warm-reboot]: Neighbor entry is not restored after warm-reboot #3108

Closed
volodymyrsamotiy opened this issue Jul 2, 2019 · 2 comments

volodymyrsamotiy commented Jul 2, 2019

Description

A valid, reachable neighbor entry is not restored after warm-reboot.
Before the warm-reboot command is issued, the neighbor is REACHABLE in Linux and is also programmed in HW.

root@sonic:/home/admin# ip neigh show to 30.1.10.101
30.1.10.101 dev Vlan3001 lladdr 24:8a:07:9c:86:02 REACHABLE

After the warm-reboot procedure, when the switch boots up, this neighbor is not present in HW and is marked as FAILED in Linux.

root@sonic:/home/admin# warm-reboot
...
root@sonic:/home/admin# ip neigh show to 30.1.10.101
30.1.10.101 dev Vlan3001 lladdr 24:8a:07:9c:86:02 FAILED
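
The before/after comparison above can be automated. The following is a minimal sketch, assuming the simple `ip neigh show` line format seen in this report (address, keyword/value pairs, then the state). The helper names `parse_ip_neigh_line` and `lost_after_reboot` are hypothetical and not part of SONiC.

```python
from typing import Dict

# Sample lines copied from the outputs in this report.
BEFORE = "30.1.10.101 dev Vlan3001 lladdr 24:8a:07:9c:86:02 REACHABLE"
AFTER = "30.1.10.101 dev Vlan3001 lladdr 24:8a:07:9c:86:02 FAILED"


def parse_ip_neigh_line(line: str) -> Dict[str, str]:
    """Parse one line of `ip neigh show` output (simple form only)."""
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError(f"unexpected neighbor line: {line!r}")
    # First token is the address, last token is the NUD state.
    entry = {"ip": tokens[0], "state": tokens[-1]}
    # Tokens in between are treated as keyword/value pairs, e.g. "dev Vlan3001".
    middle = tokens[1:-1]
    for key, value in zip(middle[0::2], middle[1::2]):
        entry[key] = value
    return entry


def lost_after_reboot(before: str, after: str) -> bool:
    """True if the same neighbor went from REACHABLE to FAILED."""
    b = parse_ip_neigh_line(before)
    a = parse_ip_neigh_line(after)
    return (
        b["ip"] == a["ip"]
        and b["state"] == "REACHABLE"
        and a["state"] == "FAILED"
    )


print(lost_after_reboot(BEFORE, AFTER))  # True
```

In practice the two strings would come from running the `ip neigh show to <address>` command before and after the warm-reboot.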

It looks like the neighbor was removed for some reason during reconciliation after warm-reboot.
Below is the log snippet containing the "delete" messages.

INFO swss#supervisord: restore_neighbors restore_neighbors service is started
INFO swss#supervisord: restore_neighbors restore_neighbor service is done for system warmreboot
NOTICE swss#neighsyncd: :- isNeighRestoreDone: neighbor table restore to kernelis done
INFO swss#supervisord: neighsyncd Listens to neigh messages...
NOTICE swss#neighsyncd: :- insertToMap: NEIGH_TABLE, delete key: Vlan3001:30.1.10.101,
NOTICE swss#neighsyncd: :- reconcile: NEIGH_TABLE STALE/DELETE, key: Vlan3001:30.1.10.101, neigh:24:8a:07:9c:86:02, family:IPv4, cache-state:DELETE,
NOTICE swss#orchagent: :- removeNeighbor: Removed next hop 30.1.10.101 on Vlan3001
NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor 24:8a:07:9c:86:02 onVlan3001
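
To find such deletions in a longer syslog, the reconcile messages can be filtered with a regular expression. This is a rough sketch that assumes the exact message layout shown in the snippet above. The function name `find_reconcile_deletes` is hypothetical, and the pattern may need adjusting for other SONiC versions.

```python
import re
from typing import Dict, List

# Pattern modeled on the neighsyncd "reconcile" line from this report.
RECONCILE_RE = re.compile(
    r"neighsyncd: :- reconcile: NEIGH_TABLE STALE/DELETE, "
    r"key: (?P<key>[^,]+), neigh:(?P<mac>[^,]+), "
    r"family:(?P<family>[^,]+), cache-state:(?P<state>[^,]+),"
)

# Two lines copied from the log snippet in this report.
SAMPLE_LOG = [
    "NOTICE swss#neighsyncd: :- insertToMap: NEIGH_TABLE, delete key: Vlan3001:30.1.10.101,",
    "NOTICE swss#neighsyncd: :- reconcile: NEIGH_TABLE STALE/DELETE, "
    "key: Vlan3001:30.1.10.101, neigh:24:8a:07:9c:86:02, family:IPv4, cache-state:DELETE,",
]


def find_reconcile_deletes(lines: List[str]) -> List[Dict[str, str]]:
    """Return the fields of every reconcile message whose cache-state is DELETE."""
    hits = []
    for line in lines:
        match = RECONCILE_RE.search(line)
        if match and match.group("state") == "DELETE":
            hits.append(match.groupdict())
    return hits


for hit in find_reconcile_deletes(SAMPLE_LOG):
    print(hit["key"], hit["mac"])
```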

Steps to reproduce the issue:

  1. Configure the neighbor entry using the configuration below.
{
    "VLAN": {
        "Vlan3001": {
            "vlanid": 3001
        }
    },
    "VLAN_MEMBER": {
        "Vlan3001|Ethernet8": {
            "tagging_mode": "tagged"
        }
    },
    "VLAN_INTERFACE": {
        "Vlan3001": {},
        "Vlan3001|30.1.10.1/24": {}
    },
    "NEIGH": {
        "Vlan3001|30.1.10.101": {
            "family": "IPv4"
        }
    }
}
  2. Verify that the neighbor is reachable.
  3. Execute the warm-reboot command.
  4. Wait until the switch finishes restoration after warm-reboot.
  5. Observe that the neighbor is not reachable.
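
Before running the steps above, it can be useful to confirm that the configured neighbor address falls inside a subnet assigned to the same VLAN interface, so that a misconfiguration is ruled out as the cause. The sketch below uses only the Python standard library and the `VLAN_INTERFACE` and `NEIGH` tables from the configuration in step 1. The function name `neighbors_outside_subnets` is hypothetical, and the check itself is an assumption about what a sensible configuration looks like, not a SONiC validation rule.

```python
import ipaddress
import json
from typing import List

# The VLAN_INTERFACE and NEIGH tables from the configuration in this report.
CONFIG = json.loads("""
{
    "VLAN_INTERFACE": {
        "Vlan3001": {},
        "Vlan3001|30.1.10.1/24": {}
    },
    "NEIGH": {
        "Vlan3001|30.1.10.101": {
            "family": "IPv4"
        }
    }
}
""")


def neighbors_outside_subnets(config: dict) -> List[str]:
    """Return NEIGH keys whose IP is not in any subnet of the same interface."""
    subnets = {}
    for key in config.get("VLAN_INTERFACE", {}):
        if "|" not in key:
            continue  # bare interface entry without an address
        ifname, prefix = key.split("|", 1)
        network = ipaddress.ip_interface(prefix).network
        subnets.setdefault(ifname, []).append(network)

    outside = []
    for key in config.get("NEIGH", {}):
        ifname, ip = key.split("|", 1)
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in subnets.get(ifname, [])):
            outside.append(key)
    return outside


print(neighbors_outside_subnets(CONFIG))  # []
```

An empty list means every configured neighbor is on-link for its interface, which is the case for the configuration in this report.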

Describe the results you received:
The reachable neighbor entry is not present after warm-reboot.

Describe the results you expected:
The neighbor entry should be restored after warm-reboot.

Additional information you deem important (e.g. issue happens only occasionally):

root@sonic:/home/admin# show version

SONiC Software Version: SONiC.HEAD.19-8c3fdfd0
Distribution: Debian 9.9
Kernel: 4.9.0-9-2-amd64
Build commit: 8c3fdfd0
Build date: Sat Jun 29 07:28:14 UTC 2019
Built by: johnar@jenkins-worker-4

Platform: x86_64-mlnx_msn2700-r0
HwSKU: ACS-MSN2700
ASIC: mellanox
Serial Number: MT1822K07823
Uptime: 12:25:18 up  1:13,  2 users,  load average: 3.40, 3.51, 3.60

Docker images:
REPOSITORY                 TAG                 IMAGE ID            SIZE
docker-syncd-mlnx          HEAD.19-8c3fdfd0    64d5cb77da05        369MB
docker-syncd-mlnx          latest              64d5cb77da05        369MB
docker-lldp-sv2            HEAD.19-8c3fdfd0    f8486c5aeb69        299MB
docker-lldp-sv2            latest              f8486c5aeb69        299MB
docker-dhcp-relay          HEAD.19-8c3fdfd0    0e2d0fa51c81        288MB
docker-dhcp-relay          latest              0e2d0fa51c81        288MB
docker-database            HEAD.19-8c3fdfd0    2fc55bdfa038        280MB
docker-database            latest              2fc55bdfa038        280MB
docker-snmp-sv2            HEAD.19-8c3fdfd0    797c740bae2c        313MB
docker-snmp-sv2            latest              797c740bae2c        313MB
docker-orchagent           HEAD.19-8c3fdfd0    68d9ed9b22a4        319MB
docker-orchagent           latest              68d9ed9b22a4        319MB
docker-teamd               HEAD.19-8c3fdfd0    2b78121ef284        301MB
docker-teamd               latest              2b78121ef284        301MB
docker-sonic-telemetry     HEAD.19-8c3fdfd0    6b14465f032d        302MB
docker-sonic-telemetry     latest              6b14465f032d        302MB
docker-router-advertiser   HEAD.19-8c3fdfd0    2619418c8ab1        280MB
docker-router-advertiser   latest              2619418c8ab1        280MB
docker-platform-monitor    HEAD.19-8c3fdfd0    b698c6a6ea2e        394MB
docker-platform-monitor    latest              b698c6a6ea2e        394MB
docker-fpm-frr             HEAD.19-8c3fdfd0    88fcddc62877        319MB
docker-fpm-frr             latest              88fcddc62877        319MB


yxieca commented Sep 12, 2019

@prsunny your recent change addressed this issue, right?


prsunny commented Sep 12, 2019

Yes, this can be closed. Fixed by sonic-net/sonic-swss#1040

@prsunny prsunny closed this as completed Sep 12, 2019
mssonicbld added a commit that referenced this issue Mar 28, 2024
…atically (#18240)

#### Why I did it
src/sonic-utilities
```
* bdc57206 - (HEAD -> master, origin/master, origin/HEAD) Revert "Fix for Switch Port Modes and VLAN CLI Enhancement (#3108)" (#3246) (89 minutes ago) [jingwenxie]
* e35452b7 - Modify "show interface transceiver status" CLI to show SW cmis state (#3238) (2 days ago) [mihirpat1]
* 04a33e1f - Add "state" field in CONFIG_DB a toggle of the fabric port monitor feature (#2932) (2 days ago) [jfeng-arista]
* 3c489ba5 - Enhance route-check for multi-asic platforms (#3216) (5 days ago) [Deepak Singhal]
* c149e48b - [chassis] Add chassis support for CLI "config qos reload" (#3233) (6 days ago) [wenyiz2021]
* d8541add - Update port2alias (#3217) (8 days ago) [abdosi]
* d4688a8f - [graceful reboot] Add the pre_reboot_hook script execution, add the watchdog arm before the reboot (#3203) (8 days ago) [Vadym Hlushko]
* 125f36f3 - [ipintutil]Handle exception in show ip interfaces command (#3182) (10 days ago) [Sudharsan Dhamal Gopalarathnam]
* 9d532017 - [chassis][show-runningconfig] Fix the show runningconfiguration all issue on the Supervisor (#3194) (2 weeks ago) [Marty Y. Lok]
* 1a9261ce - [Techsupport]Handle SAI kv pair if present in sai common profile (#3196) (2 weeks ago) [Sudharsan Dhamal Gopalarathnam]
* 7466dc4a - Skip the validation of action in acl-loader if capability table in STATE_DB is empty (#3199) (2 weeks ago) [bingwang-ms]
* b879b658 - [Bug] Fix fw_setenv illegel character issue (#3201) (3 weeks ago) [xumia]
* 0b41a560 - [config] Add YANG alerting for override (#3188) (3 weeks ago) [jingwenxie]
* 24683b0c - [show] multi-asic show running test residue (#3198) (3 weeks ago) [jingwenxie]
* 995a797a - CLI to skip polling for periodic information for a port in DomInfoUpdateTask thread (#3187) (3 weeks ago) [mihirpat1]
* 9aa9eaa5 - [config] Add Table hard dependency check (#3159) (3 weeks ago) [jingwenxie]
* 5f0ffcca - [fast/warm-reboot] Put ERR message in syslog when a failure is seen (#3186) (4 weeks ago) [Vaibhav Hemant Dixit]
* 92220dcf - Fix for Switch Port Modes and VLAN CLI Enhancement (#3108) (4 weeks ago) [Saba Akram]
```
#### How I did it
#### How to verify it
#### Description for the changelog