Skip to content

Commit

Permalink
Merge remote-tracking branch 'origin/master'
Browse files Browse the repository at this point in the history
  • Loading branch information
Mauue committed Feb 24, 2022
2 parents b61eda6 + 50122b7 commit 6488ad2
Show file tree
Hide file tree
Showing 2 changed files with 50 additions and 23 deletions.
5 changes: 5 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
*~
*.swp
*.un~



.DS_Store

68 changes: 45 additions & 23 deletions doc/distributed-data-plane-verification/coral.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,21 +3,20 @@
#### Rev 0.1

# Table of Contents
todo
TBD in Markdown with links

# Revision

| Rev | Date | Author | Change Description |
| :---: | :---: | :----- | :----------------- |
| | | | |
| ---- | ---------- | -----------| ------------------|
| v0.1 | 02/24/2022 | Qiao Xiang, Chenyang Huang, Ridi Wen, Jiwu Shu@Xiamen University, China| Initial version |

# About this Manual
This document describes the design details of distributed data plane verification feature.

TODO

# Scope
This document describes the high level design details about how distributed data plane verification works.
This document describes the high-level design of the distributed data plane verification feature.




# Definitions/Abbreviations
###### Table 1: Abbreviations
Expand All @@ -28,36 +27,59 @@ This document describes the high level design details about how distributed data
| LEC | Local Equivalence Class |
| DPV | Data plane verification |

# 1 Motivation
# Overview

Network errors such as forwarding loops, undesired blackholes and waypoint violations are the outcome of various
issues (e.g., software bugs, faulty hardware, protocol misconfigurations, and oversight negligence). They can happen in
all types of networks (e.g., enterprise networks, wide area networks and data center networks), and have both disastrous
financial and social consequences. Data plane verification (DPV) is important for finding such network errors.
Current DPV tools employ a centralized architecture, where a server collects the data planes of all devices and verifies them.
Despite substantial efforts on accelerating DPV, this centralized architecture is inherently unscalable.
Despite substantial efforts on accelerating DPV, this centralized architecture is inherently unscalable because (1) it requires a highly available management network, which is hard build itself; and (2) the server becomes the performance bottleneck and a single point of failure.

In this HLD, to tackle the scalability challenge of DPV, we propose a distributed
data plane verification feature, which circumvents the scalability bottleneck
of centralized design and performs data plane checking on commodity network
devices. Our key insight is that DPV can be transformed into a counting problem
on a directed acyclic graph, which can be naturally decomposed into lightweight
tasks executed at network devices, enabling scalability. To be concrete, this
feature provides:

* A declarative requirement specification language that allows operators to
flexibly specify common requirements studied by existing DPV tools (e.g.,
reachability, blackhole free, waypoint and isolation), and more advanced, yet
understudied requirements (e.g., multicast, anycast, no-redundant-delivery and
all-shortest-path availability).
* A verification planner that takes the operator specified requirement as input,
and systematically decomposes the global verification into lightweight,
on-device computation tasks.
* A distributed verification (DV) protocol that specifies how on-device
verifiers communicate task results efficiently to collaboratively verify the
requirements.

In order to tackle the scalability challenge of DPV, distributed data plane verification is designed, a generic, distributed,
on-device DPV framework, which circumvents the scalability bottleneck of centralized design. The key insight is as follows. A
directed acyclic graph (DAG), which represents all valid paths in
the network, is called DVNet. The problem of DPV can be transformed into a counting problem in DVNet; the latter can then be decomposed into small tasks at nodes on the DVNet, which can be distributively executed at corresponding network devices, enabling scalability.

The picture below demonstrates the architechture and workflow of distributed data plane verification.

The picture below demonstrates the architecture and workflow of distributed data plane verification.
<!---![system](img/system-diagram.jpg)-->

![architecture](img/architecture.png)

Firstly, DVNet is generated based on specified verification requirement and actual network topology. Then, the counting problem is distributed to individual switches. On each switch, counting result is computed depending on received verification messages and delivered to corresponding upstream node on DVNet. Finally, the source switch would be able to determine whether there is an error on data plane according to received verification messages.
The demo of distributed data plane verification can be found at [distributeddpvdemo.tech](distributeddpvdemo.tech).

A series of demos of the proposed feature can be found at [distributeddpvdemo.tech](distributeddpvdemo.tech). All demos are conducted on a small testbed of commodity switches installed with SONiC or ONL.

# Requirements
* The ddpv container needs to have access to the device data plane (i.e., FIB and ACL) stored in the database container.
* The ddpv container at neighboring switches needs to use sockets to exchange verification messages.
* The ddpv container will be developed in Java, and will need a Java Runtime Environment (JRE).
* New CLI commands need to be added to allow the ddpv container to receive the counting tasks from the verification planner and show related information, e.g., verification results, counting numbers, and status.


# 2 Overview

## 2.1 Functionality Overview
# 2 Functionality Overview
1. Distributed data plane verification allows user to verify a wide range of requirements, e.g., reachability, isolation, loop-freeness, black hole freeness, waypoint reachability and all shortest-path availability requirement in Azure RCDC [1].
2. Distributed data plane verification is able to verify data plane in the scenario of both burst update and incremental update.
## 2.2 Requirements Overview
1. In order to compute Local Equivalence Class (LEC) table, distributed data plane verification needs to have access to FIB stored in database container.
2. Sockets can be built to receive and deliver verification messages containing counting result.
3. Java Runtime Environment (JRE) is needed in operation system.
4. New CLI commands need to be added to specify data plane verification requirements and show related information, e.g., verification results, counting numbers, and status.


# 3 Functionality
## 3.1 Functionality Description
Expand Down Expand Up @@ -200,4 +222,4 @@ The process is similar to green start. Finally, update is complete until each de
Ashish Bhargava, Paul-Andre C Bissonnette, Shane Foster, Andrew
Helwer, Mark Kasten, Ivan Lee, et al. 2019. Validating Datacenters
at Scale. In Proceedings of the ACM Special Interest Group on Data
Communication. 200–213.
Communication. 200–213.

0 comments on commit 6488ad2

Please sign in to comment.