[SubnetDecap] Add subnet decap HLD #1657

Merged 2 commits on Jun 4, 2024

Contributor

@lolyu commented Apr 7, 2024

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
@zhangyanzhao
Collaborator

### 6.6 CLI

TBD

Collaborator

Please capture the CLI for better clarity on how this feature is enabled.


In Azure, Netscan probes network paths and devices by sending IPinIP traffic. The IPinIP packet crafted by the Netscan sender has an outer DIP equal to the destination device's Loopback address and an inner DIP equal to the IP address of the Netscan sender. When the IPinIP packet is routed to and received by the destination device, it is decapsulated and the inner packet is routed back to the Netscan sender.
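To make the packet layout concrete, here is a minimal sketch of such a probe built with only the Python standard library. It is not Netscan's implementation: the function names, the example addresses, the inner source address, and the inner protocol number (253, reserved for experimentation) are all assumptions made for illustration. Only the outer and inner DIP choices come from the description above.

```python
import socket
import struct

IPPROTO_IPIP = 4     # IANA protocol number for IPv4-in-IPv4
INNER_PROTO = 253    # assumption: experimental protocol number as a placeholder


def checksum(data: bytes) -> int:
    """Internet checksum (one's complement sum of 16-bit words)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src: str, dst: str, proto: int, payload_len: int) -> bytes:
    """Build a 20-byte IPv4 header without options."""
    fields = [
        (4 << 4) | 5,           # version 4, header length 5 words
        0,                      # TOS
        20 + payload_len,       # total length
        0,                      # identification
        0,                      # flags and fragment offset
        64,                     # TTL
        proto,
        0,                      # checksum placeholder
        socket.inet_aton(src),
        socket.inet_aton(dst),
    ]
    fields[7] = checksum(struct.pack("!BBHHHBBH4s4s", *fields))
    return struct.pack("!BBHHHBBH4s4s", *fields)


def build_probe(sender_ip: str, target_ip: str, payload: bytes = b"probe") -> bytes:
    """Outer DIP is the probed device; inner DIP is the sender itself."""
    # Assumption: the inner source is also the sender; the HLD text does not say.
    inner = ipv4_header(sender_ip, sender_ip, INNER_PROTO, len(payload)) + payload
    outer = ipv4_header(sender_ip, target_ip, IPPROTO_IPIP, len(inner))
    return outer + inner


def decap(packet: bytes) -> bytes:
    """Strip the outer header, as the destination device would."""
    if packet[9] != IPPROTO_IPIP:
        raise ValueError("not an IPinIP packet")
    return packet[(packet[0] & 0x0F) * 4:]


def dst_ip(packet: bytes) -> str:
    """Read the destination address of the outermost IPv4 header."""
    return socket.inet_ntoa(packet[16:20])
```

Calling `decap(build_probe("10.0.0.1", "10.1.0.32"))` yields an inner packet whose destination is the sender, which is why the decapsulated packet finds its way back.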
As of today, Netscan uses this IP-decap based probing to detect route blackholes in the Azure network. The limitation is that Netscan can probe only the network switches; it cannot detect route blackholes toward host nodes, in particular VLAN subnet IPs. Because the host nodes lack native IP-decap functionality, it is more appropriate to implement the IP-decap functionality on the T0 SONiC device, since SONiC already supports IPinIP decapsulation. The T0 SONiC device then responds to Netscan probes on behalf of the host nodes by decapsulating IPinIP probe packets whose outer DIP is any VLAN subnet IP.
In this design, subnet decap is introduced so that SONiC can generate decap rules for the VLAN subnet. IPinIP packets from Netscan whose outer DIP is a VLAN subnet IP can then be decapsulated and forwarded back to the Netscan sender, which makes Netscan aware of any route blackholes toward those destinations.
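The matching behaviour described here can be sketched as a simple lookup. This is only an illustration of the decision, not the SONiC data path: `should_decap`, the example Loopback address, and the example VLAN prefix are hypothetical, and real decap rules are programmed into hardware rather than evaluated in Python.

```python
import ipaddress
from typing import Iterable


def should_decap(outer_dip: str, loopback: str, vlan_subnets: Iterable[str]) -> bool:
    """Return True if an IPinIP packet with this outer DIP would be decapsulated.

    The Loopback match models the existing behaviour; the VLAN subnet match
    models the rule that subnet decap adds.
    """
    dip = ipaddress.ip_address(outer_dip)
    if dip == ipaddress.ip_address(loopback):
        return True
    # An address never matches a network of the other IP version.
    return any(dip in ipaddress.ip_network(subnet) for subnet in vlan_subnets)


# Hypothetical T0 device with one VLAN subnet.
LOOPBACK = "10.1.0.32"
VLAN_SUBNETS = ["192.168.0.0/24"]

print(should_decap("10.1.0.32", LOOPBACK, VLAN_SUBNETS))     # Loopback probe
print(should_decap("192.168.0.10", LOOPBACK, VLAN_SUBNETS))  # host behind the VLAN
print(should_decap("172.16.0.1", LOOPBACK, VLAN_SUBNETS))    # unrelated address
```

With subnet decap, a probe addressed to a host in the VLAN subnet is answered by the T0 device even though the host itself cannot decapsulate it.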

Collaborator

@prvattem Apr 9, 2024

Could you please clarify how the route blackholes are identified/detected?

Contributor Author

Added.

```
### SUBNET_DECAP
; Stores subnet based decapsulation configurations
key = SUBNET_DECAP|subnet_type ; SUBNET_DECAP|vlan
```

Collaborator

@prvattem Apr 9, 2024

What is the significance of subnet_type here?
With subnet_type being vlan, does it help in programming the HW table in any way?

Instead, could we have the VLAN ID as part of the key itself?

### TUNNEL_DECAP_TABLE
; Stores a list of decap tunnels
key = TUNNEL_DECAP_TABLE:tunnel_name ; tunnel name as key
tunnel_type = "IPINIP" ; tunnel type

Collaborator

Are you planning to have any tunnel_type values here other than IPINIP?

Contributor Author

No, we only support IPINIP tunnels in this HLD.

```
### TUNNEL_DECAP_TABLE
; Stores a list of decap tunnels
key = TUNNEL_DECAP_TABLE:tunnel_name ; tunnel name as key
```

Collaborator

How is the tunnel_name configured?
Please capture the CLI.

Contributor Author

The tunnel name is statically templated; see the workflow.

```
### SUBNET_DECAP
; Stores subnet based decapsulation configurations
key = SUBNET_DECAP|subnet_type ; SUBNET_DECAP|vlan
```

Contributor

Are other subnet types or interfaces supported, for example an Ethernet L3 interface?

Contributor Author

Updated the DB schema: the subnet type is now an attribute, which can be either vlan or vip.


### 6.4 VLAN Subnet Decap

#### 6.4.1 VLAN Subnet Decap Rule Generation

Contributor

Since swssconfig is a one-shot script, how is a change of the VLAN interface address handled after swssconfig has run?

Contributor Author

That is not supported/covered by this HLD.

#### 6.2.3 STATE_DB

```
### TUNNEL_DECAP_TABLE
```

Contributor

Why add a new state DB table? Its role is not indicated in the following workflow diagram.

Contributor Author

Updated the workflow diagram to indicate the state DB.

@zhangyanzhao
Collaborator

@lolyu can you please add the code PRs to this HLD PR? Thanks.

@zhangyanzhao
Collaborator

Code PR is not ready; moving to backlog for a future release.

@lolyu
Contributor Author

lolyu commented May 10, 2024

> @lolyu can you please add the code PRs to this HLD PR? Thanks.

Updated. Could you please move the feature to in progress?

@zhangyanzhao
Collaborator

@bingwang-ms @lolyu can you please help to address the comments? Thanks.

1. Update DB schema
2. Update workflows with new DB schema
3. Add warm-reboot and CLI sections

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
@lolyu
Contributor Author

lolyu commented May 24, 2024

Hi @prvattem @philo-micas, the PR is updated. Could you please help review it again?

Thanks!

@bingwang-ms merged commit eba0a3d into sonic-net:master Jun 4, 2024
1 check passed
@zhangyanzhao
Collaborator

Targeting the 202405 release. @bingwang-ms will take care of it.
