Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib: fix binding to a vrf #8644

Merged
merged 4 commits into from
Jun 2, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion bgpd/bgp_network.c
Original file line number Diff line number Diff line change
Expand Up @@ -895,7 +895,10 @@ int bgp_socket(struct bgp *bgp, unsigned short port, const char *address)
frr_with_privs(&bgpd_privs) {
sock = vrf_socket(ainfo->ai_family,
ainfo->ai_socktype,
ainfo->ai_protocol, bgp->vrf_id,
ainfo->ai_protocol,
(bgp->inst_type
!= BGP_INSTANCE_TYPE_VIEW
? bgp->vrf_id : VRF_DEFAULT),
(bgp->inst_type
== BGP_INSTANCE_TYPE_VRF
? bgp->name : NULL));
Expand Down
3 changes: 0 additions & 3 deletions doc/user/Useful_Sysctl_Settings.md
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,4 @@ net.ipv6.neigh.default.base_reachable_time_ms=14400000

# Use neigh information on selection of nexthop for multipath hops
net.ipv4.fib_multipath_use_neigh=1

# Allows Apps to Work with VRF
net.ipv4.tcp_l3mdev_accept=1
```
23 changes: 9 additions & 14 deletions doc/user/installation.rst
Original file line number Diff line number Diff line change
Expand Up @@ -541,20 +541,15 @@ Additional kernel modules are also needed to support MPLS forwarding.

:makevar:`VRF forwarding`
General information on Linux VRF support can be found in
https://www.kernel.org/doc/Documentation/networking/vrf.txt. Kernel
support for VRFs was introduced in 4.3 and improved upon through
4.13, which is the version most used in FRR testing (as of June
2018). Additional background on using Linux VRFs and kernel specific
features can be found in
http://schd.ws/hosted_files/ossna2017/fe/vrf-tutorial-oss.pdf.

A separate BGP TCP socket is opened per VRF.

**Important note** as of June 2018, Kernel versions 4.14-4.18 have a
known bug where VRF-specific TCP sockets are not properly handled. When
running these kernel versions, if unable to establish any VRF BGP
adjacencies, downgrade to 4.13. The issue was fixed in 4.14.57, 4.17.9
and more recent kernel versions.
https://www.kernel.org/doc/Documentation/networking/vrf.txt.

Kernel support for VRFs was introduced in 4.3, but there are known issues
in versions up to 4.15 (for IPv4) and 5.0 (for IPv6). The FRR CI system
doesn't perform VRF tests on older kernel versions, and VRFs may not work
on them. If you experience issues with VRF support, you should upgrade your
kernel version.

.. seealso:: :ref:`zebra-vrf`

Building
^^^^^^^^
Expand Down
4 changes: 2 additions & 2 deletions doc/user/zebra.rst
Original file line number Diff line number Diff line change
Expand Up @@ -261,8 +261,6 @@ Link Parameters Commands
Allows nexthop tracking to resolve via the default route. This is useful
when e.g. you want to allow BGP to peer across the default route.

.. _zebra-vrf:

Administrative Distance
=======================

Expand Down Expand Up @@ -338,6 +336,8 @@ work FRR must use the same metric when issuing the replace command.
Currently FRR only supports Route Replace semantics using the Linux
Kernel.

.. _zebra-vrf:

Virtual Routing and Forwarding
==============================

Expand Down
51 changes: 38 additions & 13 deletions lib/vrf.c
Original file line number Diff line number Diff line change
Expand Up @@ -994,25 +994,50 @@ const char *vrf_get_default_name(void)
return vrf_default_name;
}

int vrf_bind(vrf_id_t vrf_id, int fd, const char *name)
int vrf_bind(vrf_id_t vrf_id, int fd, const char *ifname)
{
int ret = 0;
struct interface *ifp;
struct vrf *vrf;

if (fd < 0)
return -1;

if (vrf_id == VRF_UNKNOWN)
return -1;

/* can't bind to a VRF that doesn't exist */
vrf = vrf_lookup_by_id(vrf_id);
if (!vrf_is_enabled(vrf))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppose at this step, vrf is not enabled. Meaning zebra has not yet processed this message and has not created this netdevice. So the code returns from here without binding the socket-id to the VRF (or interface).
Later, zebra creates the netdevice. Now, after creation of netdevice, will this socket-id be rebinded to the VRF (or interface) ? (i.e. will this code get called again ? )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually up to the daemon to correctly process the VRF disable/enable events and recreate its sockets when the VRF is enabled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @idryzhov

return -1;
pguibert6WIND marked this conversation as resolved.
Show resolved Hide resolved

if (ifname && strcmp(ifname, vrf->name)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an issue #7432
Can you please test the below scenario with your changes.
Setup:
Sonic1------Sonic2

Test Steps:

Configure eBGP session between Sonic-1 and Sonic2 in default vrf using Peer-group.
Create a dummy non-default vrf BGP instance on Sonic1 device.
Save and reload Sonic1 device.
Once the device is up, try to configure 'password ' on Sonic1 BGP neighbor alone.
The eBGP session should go down and must not come up until matching password is configured on Sonic2 device. But it comes up. This is because BGP packets are sent using other BGP socket (which was created using dummy non-default vrf BGP instance where vrf binding to the BGP socket had failed).
Expected Behavior :
The eBGP session should go down and must not come up until matching password is configured on Sonic2 device.

Copy link
Contributor

@sudhanshukumar22 sudhanshukumar22 May 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I tested that the above scenario works with your fixes in vrf.c and vrf.h. I will defer my pull request (#7432) in favor of this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the testing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 1014: From netdevice creation point of view, how is the VRF interface different from a physical interface ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I get this question and its relevance to this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this code, when ifname is interface (say Ethernet64), we are processing it differently (by calling if_lookup_by_name() and checking inside if_name_head database). But if ifname is vrf (e.g. Vrf-Red), we will not check inside the database. I was wondering why we have this difference in code. From zebra perspective, both look same as below.
sonic# show interface Vrf1
Interface Vrf1 is up, line protocol is up
Link ups: 0 last: (never)
Link downs: 0 last: (never)
vrf: Vrf1
index 8 metric 0 mtu 65536 speed 0
flags: <UP,RUNNING,NOARP>
group: 0
Type: Ethernet
HWaddr: a6:2b:4f:6a:2a:b5
Interface Type VRF
sonic# show interface Ethernet64
Interface Ethernet64 is up, line protocol is up
Link ups: 0 last: (never)
Link downs: 0 last: (never)
vrf: default
index 64 metric 0 mtu 9100 speed 100000
flags: <UP,BROADCAST,RUNNING,MULTICAST>
group: 0
Type: Ethernet
HWaddr: b8:6a:97:e2:5e:9f
inet 64.1.1.1/24
Interface Type Other

Copy link
Contributor Author

@idryzhov idryzhov May 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done, because the code flow in the daemons is the following:

  1. receive vrf registration from zebra
  2. create and bind socket to the vrf
  3. receive interface registration from zebra

If we try to look for an interface on the second step, we always fail, because we didn't receive the interface registration message yet. It's enough to check for the VRF existence if the ifname is the VRF name.

/* binding to a regular interface */

/* can't bind to an interface that doesn't exist */
ifp = if_lookup_by_name(ifname, vrf_id);
if (!ifp)
return -1;
pguibert6WIND marked this conversation as resolved.
Show resolved Hide resolved
} else {
/* binding to a VRF device */

/* nothing to do for netns */
if (vrf_is_backend_netns())
return 0;

/* nothing to do for default vrf */
if (vrf_id == VRF_DEFAULT)
return 0;

ifname = vrf->name;
}

if (fd < 0 || name == NULL)
return fd;
/* the device should exist
* otherwise we should return
* case ifname = vrf in netns mode => return
*/
ifp = if_lookup_by_name(name, vrf_id);
if (!ifp)
return fd;
#ifdef SO_BINDTODEVICE
ret = setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, name, strlen(name)+1);
ret = setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, ifname,
strlen(ifname) + 1);
if (ret < 0)
zlog_debug("bind to interface %s failed, errno=%d", name,
errno);
zlog_err("bind to interface %s failed, errno=%d", ifname,
errno);
#endif /* SO_BINDTODEVICE */
return ret;
}
Expand Down
11 changes: 5 additions & 6 deletions lib/vrf.h
Original file line number Diff line number Diff line change
Expand Up @@ -251,15 +251,14 @@ extern int vrf_sockunion_socket(const union sockunion *su, vrf_id_t vrf_id,
const char *name);

/*
* Binds a socket to a VRF device.
* Binds a socket to an interface (ifname) in a VRF (vrf_id).
*
* If name is null, the socket is not bound, irrespective of any other
* arguments.
* If ifname is NULL or is equal to the VRF name then bind to a VRF device.
* Otherwise, bind to the specified interface in the specified VRF.
*
* name should be the name of the VRF device. vrf_id should be the
* corresponding vrf_id (the ifindex of the device).
* Returns 0 on success and -1 on failure.
*/
extern int vrf_bind(vrf_id_t vrf_id, int fd, const char *name);
extern int vrf_bind(vrf_id_t vrf_id, int fd, const char *ifname);

/* VRF ioctl operations */
extern int vrf_getaddrinfo(const char *node, const char *service,
Expand Down
2 changes: 0 additions & 2 deletions tests/topotests/bgp_evpn_rt5/test_bgp_evpn.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,6 @@
from lib import topotest
from lib.topogen import Topogen, TopoRouter, get_topogen
from lib.topolog import logger
from lib.common_config import adjust_router_l3mdev

# Required to instantiate the topology builder class.
from mininet.topo import Topo
Expand Down Expand Up @@ -130,7 +129,6 @@ def setup_module(mod):
logger.info("result: " + output)

router = tgen.gears["r2"]
adjust_router_l3mdev(tgen, "r2")
for cmd in cmds_vrflite:
logger.info("cmd to r2: " + cmd.format("r2"))
output = router.run(cmd.format("r2"))
Expand Down
7 changes: 0 additions & 7 deletions tests/topotests/bgp_l3vpn_to_bgp_vrf/customize.py
Original file line number Diff line number Diff line change
Expand Up @@ -84,7 +84,6 @@
from lib.topogen import Topogen, TopoRouter, get_topogen
from lib.topolog import logger
from lib.ltemplate import ltemplateRtrCmd
from lib.common_config import adjust_router_l3mdev

# Required to instantiate the topology builder class.
from mininet.topo import Topo
Expand Down Expand Up @@ -176,9 +175,6 @@ def ltemplatePreRouterStartHook():
"ip link set dev {0}-cust1 up",
]
for rtr in rtrs:
# adjust handling of VRF traffic
adjust_router_l3mdev(tgen, rtr)

for cmd in cmds:
cc.doCmd(tgen, rtr, cmd.format(rtr))
cc.doCmd(tgen, rtr, "ip link set dev {0}-eth4 master {0}-cust1".format(rtr))
Expand Down Expand Up @@ -219,9 +215,6 @@ def ltemplatePreRouterStartHook():
"ip link set dev {0}-cust2 up",
]
for rtr in rtrs:
# adjust handling of VRF traffic
adjust_router_l3mdev(tgen, rtr)

for cmd in cmds:
cc.doCmd(tgen, rtr, cmd.format(rtr))
cc.doCmd(tgen, rtr, "ip link set dev {0}-eth0 master {0}-cust2".format(rtr))
Expand Down
20 changes: 0 additions & 20 deletions tests/topotests/bgp_l3vpn_to_bgp_vrf/scripts/check_linux_vrf.py
Original file line number Diff line number Diff line change
@@ -1,24 +1,4 @@
from lib.lutil import luCommand
from lib.common_config import kernel_requires_l3mdev_adjustment

l3mdev_accept = kernel_requires_l3mdev_adjustment()
l3mdev_rtrs = ["r1", "r3", "r4", "ce4"]
for rtr in l3mdev_rtrs:
luCommand(rtr, "sysctl net.ipv4.tcp_l3mdev_accept", " = \d*", "none", "")
found = luLast()
luCommand(
rtr, "ss -naep", ":179", "pass", "IPv4:bgp, l3mdev{}".format(found.group(0))
)
luCommand(rtr, "ss -naep", ":.*:179", "pass", "IPv6:bgp")
luCommand(
rtr,
"sysctl net.ipv4.tcp_l3mdev_accept",
" = {}".format(l3mdev_accept),
"pass",
"l3mdev matches expected (real/expected{}/{})".format(
found.group(0), l3mdev_accept
),
)

rtrs = ["r1", "r3", "r4"]
for rtr in rtrs:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@
from lib import topotest
from lib.topogen import Topogen, TopoRouter, get_topogen
from lib.topolog import logger
from lib.common_config import adjust_router_l3mdev
from lib.common_config import required_linux_kernel_version

# Required to instantiate the topology builder class.
from mininet.topo import Topo
Expand All @@ -66,6 +66,12 @@ def build(self, *_args, **_opts):

def setup_module(mod):
"Sets up the pytest environment"

# Required linux kernel version for this suite to run.
result = required_linux_kernel_version("5.0")
if result is not True:
pytest.skip("Kernel requirements are not met")

tgen = Topogen(BGPIPV6RTADVVRFTopo, mod.__name__)
tgen.start_topology()

Expand All @@ -84,9 +90,6 @@ def setup_module(mod):
for cmd in cmds:
output = tgen.net[rname].cmd(cmd.format(rname))

# adjust handling of vrf traffic
adjust_router_l3mdev(tgen, rname)

for rname, router in router_list.items():
router.load_config(
TopoRouter.RD_ZEBRA, os.path.join(CWD, "{}/zebra.conf".format(rname))
Expand Down
8 changes: 1 addition & 7 deletions tests/topotests/isis_topo1_vrf/test_isis_topo1_vrf.py
Original file line number Diff line number Diff line change
Expand Up @@ -41,10 +41,7 @@
from lib.topogen import Topogen, TopoRouter, get_topogen
from lib.topolog import logger
from lib.topotest import iproute2_is_vrf_capable
from lib.common_config import (
required_linux_kernel_version,
adjust_router_l3mdev,
)
from lib.common_config import required_linux_kernel_version

from mininet.topo import Topo

Expand Down Expand Up @@ -124,9 +121,6 @@ def setup_module(mod):
for cmd in cmds:
output = tgen.net[rname].cmd(cmd.format(rname))

# adjust handling of vrf traffic
adjust_router_l3mdev(tgen, rname)

for rname, router in tgen.routers().items():
router.load_config(
TopoRouter.RD_ZEBRA, os.path.join(CWD, "{}/zebra.conf".format(rname))
Expand Down
48 changes: 0 additions & 48 deletions tests/topotests/lib/common_config.py
Original file line number Diff line number Diff line change
Expand Up @@ -4487,51 +4487,3 @@ def verify_ip_nht(tgen, input_dict):

logger.debug("Exiting lib API: verify_ip_nht()")
return False


def kernel_requires_l3mdev_adjustment():
"""
Checks if the L3 master device needs to be adjusted to handle VRF traffic
based on kernel version.

Returns
-------
1 or 0
"""

if version_cmp(platform.release(), "4.15") >= 0:
return 1
return 0


def adjust_router_l3mdev(tgen, router):
"""
Adjusts a routers L3 master device to handle VRF traffic depending on kernel
version.

Parameters
----------
* `tgen` : tgen object
* `router` : router id to be configured.

Returns
-------
True
"""

l3mdev_accept = kernel_requires_l3mdev_adjustment()

logger.info(
"router {0}: setting net.ipv4.tcp_l3mdev_accept={1}".format(
router, l3mdev_accept
)
)

output = tgen.net[router].cmd("sysctl -n net.ipv4.tcp_l3mdev_accept")
logger.info("router {0}: existing tcp_l3mdev_accept was {1}".format(router, output))

tgen.net[router].cmd(
"sysctl -w net.ipv4.tcp_l3mdev_accept={}".format(l3mdev_accept)
)

return True
13 changes: 6 additions & 7 deletions tests/topotests/ospf6_topo1_vrf/test_ospf6_topo1_vrf.py
Original file line number Diff line number Diff line change
Expand Up @@ -92,10 +92,7 @@
from lib.topogen import Topogen, TopoRouter, get_topogen
from lib.topolog import logger
from lib.topotest import iproute2_is_vrf_capable
from lib.common_config import (
required_linux_kernel_version,
adjust_router_l3mdev,
)
from lib.common_config import required_linux_kernel_version

#####################################################
##
Expand Down Expand Up @@ -159,6 +156,11 @@ def build(self, **_opts):
def setup_module(mod):
"Sets up the pytest environment"

# Required linux kernel version for this suite to run.
result = required_linux_kernel_version("5.0")
if result is not True:
pytest.skip("Kernel requirements are not met")

tgen = Topogen(NetworkTopo, mod.__name__)
tgen.start_topology()

Expand Down Expand Up @@ -197,9 +199,6 @@ def setup_module(mod):
for cmd in cmds2:
output = tgen.net[rname].cmd(cmd.format(rname))

# adjust handling of vrf traffic
adjust_router_l3mdev(tgen, rname)

for rname, router in tgen.routers().items():
router.load_config(
TopoRouter.RD_ZEBRA, os.path.join(CWD, "{}/zebra.conf".format(rname))
Expand Down