change the EgressNode to EgressTunnel (#730)
Signed-off-by: dcwbq <biqiang.wu@daocloud.io>
biqiangwu authored Aug 31, 2023
1 parent fac1855 commit 4c9b05d
Showing 33 changed files with 327 additions and 323 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -38,7 +38,7 @@ The gateway provides network egress capabilities for Kubernetes clusters.

### CRDs

* EgressNode
* EgressTunnel
* EgressGateway
* EgressPolicy
* EgressClusterPolicy
@@ -31,9 +31,9 @@ spec:
jsonPath: .status.eip.ipv6
name: ipv6
type: string
- description: egressNode
- description: egressTunnel
jsonPath: .status.node
name: egressNode
name: egressTunnel
type: string
name: v1beta1
schema:
2 changes: 1 addition & 1 deletion docs/README.en.md
@@ -24,7 +24,7 @@ There are two clusters A and B. Cluster A is VMWare-based and runs mainly Databa

### CRDs

* EgressNode
* EgressTunnel
* EgressGateway
* EgressPolicy
* EgressClusterPolicy
2 changes: 1 addition & 1 deletion docs/README.zh.md
@@ -24,7 +24,7 @@ EgressGateway 项目为 Kubernetes 提供 Egress 能力。

### CRDs

* EgressNode
* EgressTunnel
* EgressGateway
* EgressPolicy
* EgressClusterPolicy
16 changes: 8 additions & 8 deletions docs/concepts/Architecture.zh.md
@@ -5,26 +5,26 @@ EgressGateway 由控制面和数据面 2 部分组成,控制面由 4 个控制

## Controller

### EgressNode reconcile loop (a)
### EgressTunnel reconcile loop (a)

#### 初始化

1. 从 ConfigMap 配置文件中获取双栈开启情况及对应的隧道 CIDR
2. 通过节点名称根据算法生成唯一的标签值
3. 会检查 Node 是否有对应的 EgressNode,没有的话就创建对应的 EgressNode,且状态设置为 `Pending`。有隧道 IP 则将 IP 与节点绑定,绑定前会检查 IP 是否合法,不合法则将状态设置为 `Pending`
3. 会检查 Node 是否有对应的 EgressTunnel,没有的话就创建对应的 EgressTunnel,且状态设置为 `Pending`。有隧道 IP 则将 IP 与节点绑定,绑定前会检查 IP 是否合法,不合法则将状态设置为 `Pending`

#### EgressNode Event
#### EgressTunnel Event

- Del:先释放隧道 IP,再删除。如果 EgressNode 对应的节点还存在,重新创建 EgressNode
- Del:先释放隧道 IP,再删除。如果 EgressTunnel 对应的节点还存在,重新创建 EgressTunnel
- Other:
- phase != `Init` || phase != `Ready`:则分配 IP,分配成功将状态设置为 `Init`,分配失败将状态设置为 `Failed`。这里是全局唯一会分配隧道 IP 的地方
- mark != algorithm(NodeName):该字段禁止修改,直接报错返回

#### Node Event

- Del:删除对应的 EgressNode
- Del:删除对应的 EgressTunnel
- Other:
- 节点对应的 EgressNode 不存在,则创建 EgressNode
- 节点对应的 EgressTunnel 不存在,则创建 EgressTunnel
- 无隧道 IP,设置 phase 为 `Pending`
- 有隧道 IP,校验隧道是否合法,不合法则设置 phase 为 `Pending`
- 隧道 IP 合法,校验 IP 是否分配给本节点,不是则设置 phase 为 `Pending`
@@ -40,11 +40,11 @@ EgressGateway 由控制面和数据面 2 部分组成,控制面由 4 个控制

- Other:
* EIP 减少,如果 EIP 被引用,禁止修改。分配 IPV4 与 IPV6 时,要求一一对应,所以两者的个数需要一致。
* 如果 nodeSelector 被修改,从 status 获取旧的 Node 信息,与最新的 Node 进行对比。将删除节点上的 EIP 重新分配到新的 Node 上。更新对应 EgressNode 中的 EIP 信息。
* 如果 nodeSelector 被修改,从 status 获取旧的 Node 信息,与最新的 Node 进行对比。将删除节点上的 EIP 重新分配到新的 Node 上。更新对应 EgressTunnel 中的 EIP 信息。

#### EgressPolicy Event

- Del:列出 EgressPolicy 找到被引用的 EgressGateway,再对 EgressPolicy 与 EgressGateway 解绑。解绑需要做的事情有,找到对应的 EIP 信息。如果使用了 EIP,则判断是否需要回收 EIP。如果此时 EIP 已经没有 policy 使用,则回收 EIP,更新自身及 EgressNode 的 EIP 信息。
- Del:列出 EgressPolicy 找到被引用的 EgressGateway,再对 EgressPolicy 与 EgressGateway 解绑。解绑需要做的事情有,找到对应的 EIP 信息。如果使用了 EIP,则判断是否需要回收 EIP。如果此时 EIP 已经没有 policy 使用,则回收 EIP,更新自身及 EgressTunnel 的 EIP 信息。
- Other:
* EgressPolicy 不能修改绑定的 EgressGateway。如果允许修改,则列出 EgressGateway 找到原先绑定的 EgressGateway,进行解绑。再对新的进行绑定。
* 新增 EgressPolicy,则将 EgressPolicy 与 EgressGateway 进行绑定,绑定中,判断是否需要分配 EIP。
2 changes: 1 addition & 1 deletion docs/mkdocs.yml
@@ -85,7 +85,7 @@ nav:
- Architecture: concepts/Architecture.md
- Datapath: concepts/Datapath.md
- reference:
- CRD EgressNode: reference/EgressNode.md
- CRD EgressTunnel: reference/EgressTunnel.md
- CRD EgressGateway: reference/EgressGateway.md
- CRD EgressPolicy: reference/EgressPolicy.md
- CRD EgressClusterPolicy: reference/EgressClusterPolicy.md
14 changes: 7 additions & 7 deletions docs/proposal/01-egress-gateway/EgressGateway.md
@@ -2,7 +2,7 @@

### CRDS

The egress gateway model abstracts three Custom Resource Definitions (CRDs): `EgressNode` , `EgressNode` and `EgressGatewayPolicy`. They are cluster scoped CRDs.
The egress gateway model abstracts three Custom Resource Definitions (CRDs): `EgressTunnel` , `EgressTunnel` and `EgressGatewayPolicy`. They are cluster scoped CRDs.

#### EgressGateway
```yaml
@@ -38,10 +38,10 @@ status:
* `ipv4` address list.
* `ipv6` address list.

#### EgressNode
#### EgressTunnel
```yaml
apiVersion: egressgateway.spidernet.io/v1
kind: EgressNode
kind: EgressTunnel
metadata:
name: "node1"
spec:
@@ -55,10 +55,10 @@ status:
physicalInterfaceIPv6: ""
```

The `EgressNode` CRD stores vxlan tunnel information, which is generated by the Controller from the Node CR.
The `EgressTunnel` CRD stores vxlan tunnel information, which is generated by the Controller from the Node CR.

* status
* `phase` indicates the status of EgressNode. If 'Ready' has been assigned and the tunnel has been built, 'Pending' is waiting for IP assignment, 'Init' succeeds in assigning the tunnel IP address, and 'Failed' fails to assign the tunnel IP address.
* `phase` indicates the status of EgressTunnel. If 'Ready' has been assigned and the tunnel has been built, 'Pending' is waiting for IP assignment, 'Init' succeeds in assigning the tunnel IP address, and 'Failed' fails to assign the tunnel IP address.
* `vxlanIPv4IP` field represents the IPv4 address of VXLAN tunnel.
* `vxlanIPv6IP` field represents the IPv6 address of VXLAN tunnel.
* `tunnelMac` field represents the MAC address of IPv4 VXLAN tunnel Interface.
@@ -164,10 +164,10 @@ Controller consists of Webhook Validator and Reconcile Flow.

<img src="./Controller Reconcile Flow.png" width="70%"></img>

Controller has 2 control processes, the first Watch cluster nodes, generate tunnel IP address and MAC address for Node, then `Create` or `Update` EgressNode CR Status. The second control flow watch `EgressNode` and `Egressgateway`, sync match node list from `labelSelector`, election egress gateway node.
Controller has 2 control processes, the first Watch cluster nodes, generate tunnel IP address and MAC address for Node, then `Create` or `Update` EgressTunnel CR Status. The second control flow watch `EgressTunnel` and `Egressgateway`, sync match node list from `labelSelector`, election egress gateway node.

### Agent

<img src="./Agent Reconcile Flow.png" width="70%"></img>

Agent has two control processes, the first Watch `EgressNode` CR, which manages node tunnel, and node tunnel is a pluggable interface that can be replaced by Geneve. The second control process manages datapath policy, which watches `EgressNode`, `EgressGateway` and `Egresspolicy`, and sends them to the host through the police interface. It is currently implemented by a combination of *ipset*, *iptables*, and *route*, and it can be replaced by *eBPF*.
Agent has two control processes, the first Watch `EgressTunnel` CR, which manages node tunnel, and node tunnel is a pluggable interface that can be replaced by Geneve. The second control process manages datapath policy, which watches `EgressTunnel`, `EgressGateway` and `Egresspolicy`, and sends them to the host through the police interface. It is currently implemented by a combination of *ipset*, *iptables*, and *route*, and it can be replaced by *eBPF*.
@@ -1,8 +1,8 @@
## EgressNode CRD
## EgressTunnel CRD

```yaml
apiVersion: egressgateway.spidernet.io/v1
kind: EgressNode
kind: EgressTunnel
metadata:
name: "node1"
spec:
@@ -20,7 +20,7 @@ status:
字段说明
* status
* `phase` 表示 EgressNode 的状态,’Ready’ 隧道IP已分配,且隧道已建成,’Pending’ 等待分配IP,’Init’ 分配隧道 IP 成功,’Failed’ 隧道 IP 分配失败
* `phase` 表示 EgressTunnel 的状态,’Ready’ 隧道IP已分配,且隧道已建成,’Pending’ 等待分配IP,’Init’ 分配隧道 IP 成功,’Failed’ 隧道 IP 分配失败
* `vxlanIPv4IP` 隧道 IPV4 地址
* `vxlanIPv6IP` 隧道 IPV6 地址
* `tunnelMac` 隧道 Mac 地址
@@ -33,22 +33,22 @@ status:

### 初始化
1. 从 CM中获取 IPv4、IPv6 及对应的 CIDR
2. 会检查node 是否有对应的 EgressNode,没有的话就创建对应的EgressNode,且状态设置为 “pending”。有隧道 IP 则将 IP 与节点绑定,绑定前会检查 IP 是否合法,不合法则将状态设置为 “Pending”
2. 会检查node 是否有对应的 EgressTunnel,没有的话就创建对应的EgressTunnel,且状态设置为 “pending”。有隧道 IP 则将 IP 与节点绑定,绑定前会检查 IP 是否合法,不合法则将状态设置为 “Pending”

### 节点事件:
- 删除事件:删除对应的 EgressNode
- 其他事件:如果没有对应的 EgressNode,则创建 EgressNode
- 其他事件:如果有对应的 EgressNode,则对EgressNode进行校验。校验逻辑如下:
- 删除事件:删除对应的 EgressTunnel
- 其他事件:如果没有对应的 EgressTunnel,则创建 EgressTunnel
- 其他事件:如果有对应的 EgressTunnel,则对EgressTunnel进行校验。校验逻辑如下:

- - 无隧道IP,将状态置为 “Pending”
如果有隧道IP,判断是否合法,不合法,就将状态置为 “Pending”
如果合法,校验 IP 是否已分配,如果已分配,且分配给其他节点了,则将状态置为 “Pending”
未分配给其他节点,就分配给本 “EgressNode”,将状态设置为 “Init”
未分配给其他节点,就分配给本 “EgressTunnel”,将状态设置为 “Init”
如果已分配,且就是分配给本节点的,则将状态设置为 “Init”

### EgressNode事件
- 删除事件:先释放IP。如果 EgressNode 对应的节点存在,则释放IP,重新创建 EgressNode
- 其他事件:如果 EgressNode 状态为 “Init” 或 者“Ready” 时,不做任何处理。如果不是,则分配 IP,分配成功将状态设置为 “Init”,分配失败将状态设置为 “Failed”。这里是全局唯一会分配隧道 IP 的地方
### EgressTunnel事件
- 删除事件:先释放IP。如果 EgressTunnel 对应的节点存在,则释放IP,重新创建 EgressTunnel
- 其他事件:如果 EgressTunnel 状态为 “Init” 或 者“Ready” 时,不做任何处理。如果不是,则分配 IP,分配成功将状态设置为 “Init”,分配失败将状态设置为 “Failed”。这里是全局唯一会分配隧道 IP 的地方


## 分配隧道 IP
4 changes: 2 additions & 2 deletions docs/proposal/03-egress-ip/README_zh-CN.md
@@ -24,13 +24,13 @@

### CRD

#### EgressNode
#### EgressTunnel

用于记录跨节点通信的隧道网卡信息。集群级资源,与 Kubernetes Node 资源名称一一对应。

```yaml
apiVersion: egressgateway.spidernet.io/v1beta1
kind: EgressNode
kind: EgressTunnel
metadata:
name: "node1"
status:
@@ -1,8 +1,8 @@
The EgressNode CRD is used to record tunnel network interface information for cross-node communication. It is a cluster scope resource that corresponds one-to-one with the Kubernetes Node resource name.
The EgressTunnel CRD is used to record tunnel network interface information for cross-node communication. It is a cluster scope resource that corresponds one-to-one with the Kubernetes Node resource name.

```yaml
apiVersion: egressgateway.spidernet.io/v1beta1
kind: EgressNode
kind: EgressTunnel
metadata:
name: "node1"
status:
@@ -1,8 +1,8 @@
EgressNode CRD 用于记录跨节点通信的隧道网卡信息。这是一个集群级资源,它与 Kubernetes Node 资源名称一一对应。
EgressTunnel CRD 用于记录跨节点通信的隧道网卡信息。这是一个集群级资源,它与 Kubernetes Node 资源名称一一对应。

```yaml
apiVersion: egressgateway.spidernet.io/v1beta1
kind: EgressNode
kind: EgressTunnel
metadata:
name: "node1"
status:
2 changes: 1 addition & 1 deletion docs/usage/Install.zh.md
@@ -169,7 +169,7 @@ EgressPolicy 对象是租户级别的,因此,它务必创建在 selected 应
3. 查看 EgressPolicy 的状态
$ kubectl get EgressPolicy -A
NAMESPACE NAME GATEWAY IPV4 IPV6 EGRESSNODE
NAMESPACE NAME GATEWAY IPV4 IPV6 EGRESSTUNNEL
default test default 172.22.0.110 egressgateway-worker2
$ kubectl get EgressPolicy test -o yaml
4 changes: 2 additions & 2 deletions docs/usage/Uninstall.en.md
@@ -67,10 +67,10 @@ To ensure that the running applications are not affected before uninstalling Egr

It is worth noting that before uninstalling EgressGateway, it is recommended to back up related data and ensure that the uninstall operation does not affect the ongoing business applications.

4. During the uninstallation process, sometimes the EgressNodes CRD of EgressGateway may remain in a waiting state for deletion. If you encounter this situation, you can try using the following command to resolve the issue:
4. During the uninstallation process, sometimes the EgressTunnels CRD of EgressGateway may remain in a waiting state for deletion. If you encounter this situation, you can try using the following command to resolve the issue:

```shell
kubectl patch crd egressnodes.egressgateway.spidernet.io -p '{"metadata":{"finalizers": []}}' --type=merge
kubectl patch crd egresstunnels.egressgateway.spidernet.io -p '{"metadata":{"finalizers": []}}' --type=merge
```

This command removes the finalizer in the EgressGateway CRD, allowing Kubernetes to delete it. This issue is caused by the controller-manager, and we are monitoring the Kubernetes team's progress on fixing it.
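
For readers who apply this patch from a program rather than from the shell, the merge-patch body used in the command above can be built in Go. This is a minimal sketch: `finalizerClearPatch` is a hypothetical helper name, and sending the body to the API server is deliberately left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// finalizerClearPatch builds the JSON merge patch body that sets
// metadata.finalizers to an empty list, matching the -p argument of
// the kubectl command above.
func finalizerClearPatch() ([]byte, error) {
	patch := map[string]any{
		"metadata": map[string]any{
			// A non-nil empty slice is used so that encoding/json
			// emits [] rather than null.
			"finalizers": []string{},
		},
	}
	return json.Marshal(patch)
}

func main() {
	body, err := finalizerClearPatch()
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(body)) // prints {"metadata":{"finalizers":[]}}
}
```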
4 changes: 2 additions & 2 deletions docs/usage/Uninstall.zh.md
@@ -67,10 +67,10 @@

需要注意的是,在卸载 EgressGateway 之前,建议先备份相关数据,并确保卸载操作不会影响正在使用的业务应用。

4. 在卸载过程中,有时候会遇到 EgressGateway 的 EgressNodes CRD 一直处于等待删除的情况。如果您遇到了这种情况,可以尝试使用下面的命令解决问题:
4. 在卸载过程中,有时候会遇到 EgressGateway 的 EgressTunnels CRD 一直处于等待删除的情况。如果您遇到了这种情况,可以尝试使用下面的命令解决问题:

```shell
kubectl patch crd egressnodes.egressgateway.spidernet.io -p '{"metadata":{"finalizers": []}}' --type=merge
kubectl patch crd egresstunnels.egressgateway.spidernet.io -p '{"metadata":{"finalizers": []}}' --type=merge
```

这个命令的作用是删除 EgressGateway CRD 中的 finalizer,从而允许 Kubernetes 删除这个 CRD。此问题是由 controller-manager 引起的,我们正在关注 Kubernetes 团队对此问题的修复情况。
2 changes: 1 addition & 1 deletion pkg/agent/agent.go
@@ -68,7 +68,7 @@ func New(cfg *config.Config) (types.Service, error) {

metrics.RegisterMetricCollectors()

err = newEgressNodeController(mgr, cfg, log)
err = newEgressTunnelController(mgr, cfg, log)
if err != nil {
return nil, fmt.Errorf("failed to create node controller: %w", err)
}
12 changes: 6 additions & 6 deletions pkg/agent/police.go
@@ -196,7 +196,7 @@ func (r *policeReconciler) initApplyPolicy() error {
node := new(egressv1.EgressTunnel)
err := r.client.Get(context.Background(), types.NamespacedName{Name: val.NodeName}, node)
if err != nil {
r.log.Error(err, "failed to get egress node, skip building rule of policy")
r.log.Error(err, "failed to get egress tunnel, skip building rule of policy")
continue
}
policyName := policy.Name
Expand Down Expand Up @@ -521,7 +521,7 @@ func buildNatStaticRule(base uint32) map[string][]iptables.Rule {
Match: iptables.MatchCriteria{}.MarkMatchesWithMask(base, 0xffffffff),
Action: iptables.AcceptAction{},
Comment: []string{
"Accept for egress traffic from pod going to EgressNode",
"Accept for egress traffic from pod going to EgressTunnel",
},
},
{
Expand Down Expand Up @@ -712,14 +712,14 @@ func buildFilterStaticRule(base uint32) map[string][]iptables.Rule {
Match: iptables.MatchCriteria{}.MarkMatchesWithMask(base, 0xffffffff),
Action: iptables.AcceptAction{},
Comment: []string{
"Accept for egress traffic from pod going to EgressNode",
"Accept for egress traffic from pod going to EgressTunnel",
},
}},
"OUTPUT": {{
Match: iptables.MatchCriteria{}.MarkMatchesWithMask(base, 0xffffffff),
Action: iptables.AcceptAction{},
Comment: []string{
"Accept for egress traffic from pod going to EgressNode",
"Accept for egress traffic from pod going to EgressTunnel",
},
}},
}
Expand All @@ -732,14 +732,14 @@ func buildMangleStaticRule(base uint32) map[string][]iptables.Rule {
Match: iptables.MatchCriteria{}.MarkMatchesWithMask(base, 0xff000000),
Action: iptables.SetMaskedMarkAction{Mark: base, Mask: 0xffffffff},
Comment: []string{
"Accept for egress traffic from pod going to EgressNode",
"Accept for egress traffic from pod going to EgressTunnel",
},
}},
"POSTROUTING": {{
Match: iptables.MatchCriteria{}.MarkMatchesWithMask(base, 0xffffffff),
Action: iptables.AcceptAction{},
Comment: []string{
"Accept for egress traffic from pod going to EgressNode",
"Accept for egress traffic from pod going to EgressTunnel",
},
}},
"PREROUTING": {{