diff --git a/.gitignore b/.gitignore index 66d2a676053..1b6ea320271 100644 --- a/.gitignore +++ b/.gitignore @@ -1,5 +1,10 @@ *~ *.swp *.un~ + + + + + .DS_Store .vscode/ diff --git a/doc/distributed-data-plane-verification/design.md b/doc/distributed-data-plane-verification/design.md new file mode 100644 index 00000000000..7f06f5f667d --- /dev/null +++ b/doc/distributed-data-plane-verification/design.md @@ -0,0 +1,397 @@ +# Distributed Data Plane Verification +# High Level Design Document +#### Rev 0.2 + +# Table of Contents +TBD in Markdown with links + +# Revision + +| Rev | Date | Author | Change Description | +| ---- | ---------- | -----------| ------------------| +| v0.1 | 02/24/2022 | Qiao Xiang, Chenyang Huang, Ridi Wen, Yuxin Wang, Jiwu Shu@Xiamen University, China| Initial version | +| v0.2 | 09/09/2022 | Qiao Xiang, Chenyang Huang, Ridi Wen, Yuxin Wang, Jiwu Shu@Xiamen University, China| Add details | + + +# Scope +This document describes the high-level design of the distributed data plane verification feature. + + + + +# Definitions/Abbreviations +###### Table 1: Abbreviations +| Abbreviation | Full form | +| ------------ | --------------------------- | +| FIB | Forwarding Information Base | +| CLI | Command Line Interface | +| LEC | Local Equivalence Class | +| DPV | Data plane verification | + +# Overview + + + + + + +In this HLD, we propose a distributed +data plane verification feature, which allows devices equipped with SONiC to verify their data planes efficiently and enable fast in-network verification. Instead of doing verification at the centralized server, we offload verification tasks to SONiC switches and thus provide good scalibility. Not only can a scalable DPV +tool quickly find network errors in large networks, it can +also support novel services such as fast rollback and switching +among multiple data planes [1], and data plane +verification across administrative domains [2]. + +Our key insight is that DPV can be transformed into a counting problem on a directed acyclic graph called DVNet, which can be naturally decomposed into lightweight +tasks executed at network devices, enabling scalability. + +To be concrete, this +feature provides: + + +* A declarative requirement specification language that allows operators to + flexibly specify common requirements studied by existing DPV tools (e.g., +reachability, blackhole free, waypoint and isolation), and more advanced, yet +understudied requirements (e.g., multicast, anycast, no-redundant-delivery and +all-shortest-path availability). +* A verification planner that takes the operator specified requirement as input, + and systematically decomposes the global verification into lightweight, +on-device computation tasks. +* A distributed verification (DV) protocol that specifies how on-device + verifiers communicate task results efficiently to collaboratively verify the +requirements. + + + +The picture below demonstrates the architecture and workflow of distributed data plane verification. + + +![system](img/system-diagram-v5.jpg) + + + +A series of demos of the proposed feature can be found at [distributeddpvdemo.tech](DDPV-Demos). All demos are conducted on a small testbed of commodity switches installed with SONiC or ONL. A technical report of the detailed design of DDPV can be found at [https://arxiv.org/abs/2205.07808](arXiv). + +# Requirements +* The ddpv container needs to have access to the device data plane (i.e., FIB and ACL) stored in the database container. +* The ddpv container at neighboring switches needs to use sockets to exchange verification messages. +* The ddpv container will be developed in Java, and will need a Java Runtime Environment (JRE). +* New CLI commands need to be added to allow the ddpv container to receive the counting tasks from the verification planner and show related information, e.g., verification results, counting numbers, and status. +* For devices without an on-device verifier, off-device instance (e.g., VM) can be assigned that plays as a verifier to collect the data plane from the +device and exchange messages with other verifiers based on DVNet. +* A computing device (e.g., a server) is needed to serve as a planner, which is responsible for managing devices,inputting topologies, distrbuting tasks and getting verification results. This is similar as the process of configuring routing protocols. + + + + +# Functionality +## Functionality Description +Distributed data plane verification detects a wide range of network errors (e.g., switch operating system errors) by checking the actual +data plane on the network device, so that the operator can detect the network error in time, take relevant +measures to correct the error, and reduce the loss as much as possible. Distributed data plane verification can efficiently validate a +wide range of network forwarding requirements of operators in different scenarios. If the current network +status does not meet the operator's network forwarding requirements, then prompt the operator network error. +Distributed data plane verification generates a directed acyclic graph called DVnet based on the network topology and requirements, +and performs a reverse counting process on DVnet, finally determines whether the network is wrong. + + + + + + +# Design + +The ddpv container runs two daemon processes: lecbuilderd and vagentd. We give a +brief view of the key classes and the main functionalities of each process. + +## lecbuilderd +The core of lecbuilderd is the LECBuilder class. + +- LECBuilder + + This class is responsible for converting the data plane (e.g., FIB and ACL) of the residing device to local equivalence classes (LECs). LECBuilder needs to have **READ** access to the Application-DB table[5] (APPL_DB) which stores FIB and ACL, and seats within the redis-engine of the database container. + Given a device X, a local equivalence class is a set of packets whose actions are identical at X. + LECBuilder stores the LECs of its residing device using a data structure called binary decision diagram (BDD). + The major methods in LECBuilder are: + + - `buildLEC()`: read the database container to get the data plane of the device, + and build the LECs. + - `updateLEC()`: get the updates of data plane from the database container and update the LECs incrementally. + + + +## vagentd +The vagentd process uses a dispatch-worker thread pool design. + +### Dispatcher + + Each SONiC device is connected to a planner who sends verification tasks through sockets using TCP protocol with port 5108. The Dispatcher class receives the computation task configurations from the planner and spawns +Worker threads correspondingly. It also establishes socket connections with neighboring devices using TCP protocol with port 5109, dispatches received messages to corresponding Worker threads and sends the messages from Worker threads to corresponding neighbor devices. Both port 5108 and port 5109 are unused ports according to IETF documentation. + +The main methods in Dispatcher include: + + - `receiveInstruction()`: receive instructions on computation tasks from the planner, and spawn corresponding Worker threads. This method is only invoked when the system starts or the planner updates the computation tasks based on operators' instructions. + - `receiveMessage()`: receive the verification messages from neighboring devices and dispatch them to the corresponding workers. + - `receiveLECUpdate()`: receive the updates of LECs from lecbuilderd and dispatch them to corresponding workers. + - `sendMessage()`: receive the sendResult requests + from workers and send them to corresponding devices. + - `sendAlert()`: if a worker specifies in its sendResult request that the result indicates a violation of an +operator-specified requirement, the dispatcher sends an alert to the operators. + + +### Worker + +This class executes the lightweight computation tasks specified by the planner. + Each node in DVNet corresponds to a worker thread. + The running state of workers is controlled by the thread pool. + The main methods in Worker include: + - `receiveMessage()`: receive the verification message from the dispatcher, and execute the computation task incrementally. + - `receiveLECUpdate()`: receive the updates of LECs from the dispatcher, and execute the computation task incrementally. + - `sendResult()`: send the result of the computation task to the dispatcher, which either forwards it to a corresponding neighbor device or sends an alert to the operators, depending on whether a violation of an operator-specified requirement is found by the worker. + +# Use Case Examples + +We use two examples to demonstrates how the DDPV feature works. More +illustrations can be found at [distributeddpvdemo.tech](DDPV-Demos). The first +example is in a network in Figure 2. + +