YAP: Yet Another Private WAN
This is an alternative method of providing private WAN in bonding. Instead of funneling traffic into private WAN routers via GRE tunnels, it peers space tables directly on VLANs off the aggregators using OSPF. This allows for the following improvements over standard private WAN:
- Custom, more efficient backhauls can be used, improving speed in most cases
- Tables can be peered with any switches or routers in the data centers
- Processing load on aggregators is reduced thanks to simplified rulesets
If a backhaul is not already set up in a data center, additional "VXR" boxes can be added to each data center to provide an overlay backhaul using VXLAN-over-IPSEC.
Installation and setup
Initial installation
First, install the software on the bondingadmin server:
make install
Note
The rest of the yap commands are run on the management server, unless otherwise stated.
Then add a read-only user in the Bondingadmin web interface to allow the tool to query the API. Add the user details using the yap tool:
yap auth-set user@example.com mypassword
Setting up regions
Each region will have a series of aggregators and VLAN assignments for the spaces. To add a region:
yap region-add yvr
Adding spaces
To add the space with key foo:
yap space-add foo
Setting VLAN region associations
If a VLAN is not associated with a space in a region, none of the nodes in that region will set up peering for the space. To add a VLAN association for space foo in region yvr on VLAN 1234:
yap vlan-set foo yvr 1234
Enabling IPSEC
To enable IPSEC:
yap ipsec-enable
Setting up a VXR
If using VXR hosts to provide a backhaul overlay, install the latest openSUSE Leap distribution on a host, set up the base networking, then install and set up salt-minion.
Assuming we are going to call the node yvr-vxr01 and our Bondingadmin host is bondingadmin.mydomain.com:
zypper in salt-minion
echo yvr-vxr01 > /etc/salt/minion_id
echo "master: bondingadmin.mydomain.com" > /etc/salt/minion.d/yap.conf
echo -e "grains:\n type: vxr" >> /etc/salt/minion.d/yap.conf
systemctl enable --now salt-minion
On the Bondingadmin server, accept the salt key for the box:
salt-key -a yvr-vxr01
Then add a record using yap, with the name, IP, region, and VLAN trunk port:
yap vxr-add yvr-vxr01 1.2.3.4 yvr eth1
The necessary software will be installed automatically.
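The minion settings written by the echo commands above can also be generated by a small function, which makes it possible to preview the files in a scratch directory before touching a real host. This is a minimal sketch: write_vxr_minion_conf and its root argument are hypothetical and not part of yap or salt.

```shell
#!/bin/sh
# write_vxr_minion_conf is a hypothetical helper, not part of yap or salt.
# It writes the same two files as the manual echo commands, under a chosen root.
write_vxr_minion_conf() {
    root=$1        # "" for the live system, or a scratch directory for a preview
    minion_id=$2   # e.g. yvr-vxr01
    master=$3      # e.g. bondingadmin.mydomain.com

    mkdir -p "$root/etc/salt/minion.d" || return 1
    printf '%s\n' "$minion_id" > "$root/etc/salt/minion_id"
    # Master address plus the "type: vxr" grain, written in one pass.
    printf 'master: %s\ngrains:\n type: vxr\n' "$master" \
        > "$root/etc/salt/minion.d/yap.conf"
}
```

For example, `write_vxr_minion_conf /tmp/vxr-preview yvr-vxr01 bondingadmin.mydomain.com` writes both files under /tmp/vxr-preview. Unlike the manual commands, which append the grain with `>>`, this sketch overwrites yap.conf, so running it twice does not duplicate the grains section.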
Adding aggregators
To add an aggregator, get its ID from Bondingadmin, select a region for it, set up a VLAN trunk interface, then add it:
yap agg-add 1 yvr eth1
This will install some software on the aggregator to maintain the VLANs and OSPF peering on the eth1 trunk port.
Showing status
On each aggregator and VXR, there is a yap command that manages the local state. To show the state of space foo:
yap status foo
From the bondingadmin server, you can check state on multiple hosts simultaneously by specifying a node list to the salt cmd.run command. For example, to show the state of space foo on the VXR yvr-vxr01 and the aggregator with ID 1:
salt -C 'L@yvr-vxr01,node-1' cmd.run "yap status foo"
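When the node list gets longer, the target string can be assembled by a helper. The sketch below assumes that aggregator minion IDs always follow the node-<ID> pattern seen in the example above (node-1 for ID 1); yap_salt_target is a hypothetical name, not a yap or salt command.

```shell
#!/bin/sh
# yap_salt_target is a hypothetical helper. Assumption: aggregator minion IDs
# are "node-<ID>", as in the example above; VXRs are addressed by name.
yap_salt_target() {
    list=
    for name in "$@"; do
        case $name in
            ''|*[!0-9]*) id=$name ;;       # not purely numeric: use as-is (VXR name)
            *)           id=node-$name ;;  # purely numeric: aggregator ID
        esac
        list=${list:+$list,}$id            # join entries with commas
    done
    printf 'L@%s\n' "$list"
}
```

Usage: `salt -C "$(yap_salt_target yvr-vxr01 1)" cmd.run "yap status foo"` expands to the same command as the example above.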
Architectural overview
The following diagram shows an overview of the various nodes involved in a typical YAP deployment for a space. This fictional space has a firewall in YVR only, but bonds in both YVR and TOR.
The red circles denote details and troubleshooting commands that can be run on each respective node.
Adding spaces
Prerequisites
- All bonds are moved to yap-enabled aggregators.
- A VLAN is designated for each region that will host bonds. For example, for a space that has bonds on aggregators in two regions, YVR and TOR, you must designate a VLAN for both regions.
Migrating existing private WAN spaces
The following commands are all to be run on the management server.
Warning
There will be a brief outage when migrating a space.
Add the space:
yap space-add <key>
This can be run in advance as it does not make any runtime changes.
To calculate the subnet for each region/space, you can run the following command. This only returns the network that will be designated for the VLAN on the aggregators in the region; it does not apply any changes:
yap subnet-get <key> <region>
This will return the base subnet for this space-region pair, as well as the specific IPs of the aggregators in that region. The first IP in the subnet is reserved for the firewall:
Subnet: 100.31.88.0/21
Firewall: 100.31.88.1
Aggregators:
  agg03: 100.31.88.5
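The firewall address can be cross-checked by hand from any address in the subnet. The following is a minimal sketch in POSIX shell, relying only on the rule stated above that the first IP in the subnet is reserved for the firewall. The function names are illustrative and not part of yap, and the arithmetic assumes a shell with 64-bit integers.

```shell
#!/bin/sh
# Illustrative helpers, not part of yap. Assumes 64-bit shell arithmetic.

# Dotted quad -> integer. Runs in a subshell so the IFS change stays local.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Integer -> dotted quad.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print the first host address of the network containing <address>/<prefix>.
first_host() {
    net=${1%/*}
    prefix=${1#*/}
    mask=$(( (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF ))
    base=$(( $(ip_to_int "$net") & $mask ))
    int_to_ip $(( $base + 1 ))
}
```

For the sample output above, `first_host 100.31.88.0/21` prints 100.31.88.1, matching the Firewall line. An aggregator address with its prefix length works as input too, since the host bits are masked off first.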
Configure the firewall's VLAN interface with the firewall IP shown in step 2 (the yap subnet-get output), then configure OSPF. While the exact settings will be vendor-specific, here are the general details:
- area 0.0.0.0
- subnet <from step 2>
- redistribute connected
- hello interval 10s
- dead interval 40s
Add a VLAN association for each region:
yap vlan-set <key> <region> <vlan_id>
This will start the VLAN interfaces on each yap-enabled aggregator in the region using the same subnet reflected in step 2.
Caution
This is the start of an outage for the space, as the private WAN router's BGP protocols for the space are brought down to prevent routing loops/conflicts.
Confirm OSPF is up in each region by running this command on the aggregators:
yap status <key>
If the OSPF protocol is not 'Running', jump to troubleshooting B: Aggregator.
Once OSPF is up and the routes have propagated both ways, you can disable the outbound gateway configured in the existing space to finish cleanup.
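Because the outage starts at the vlan-set step, it can help to print the full command sequence for a space before running anything. The sketch below only echoes commands and changes nothing; yap_migration_plan and the region:vlan argument format are hypothetical conveniences, not part of yap.

```shell
#!/bin/sh
# yap_migration_plan is a hypothetical helper. It prints the commands from the
# steps above in order and does not execute any of them.
# Arguments: <key> <region>:<vlan_id> [<region>:<vlan_id> ...]
yap_migration_plan() {
    key=$1
    shift
    echo "yap space-add $key"
    for pair in "$@"; do
        echo "yap subnet-get $key ${pair%%:*}"
    done
    echo "# configure the firewall VLAN interface and OSPF before continuing"
    for pair in "$@"; do
        # The outage for the space begins with the first vlan-set.
        echo "yap vlan-set $key ${pair%%:*} ${pair#*:}"
    done
    echo "# on each aggregator: yap status $key"
}
```

For example, `yap_migration_plan foo yvr:1234 tor:1235` lists one subnet-get and one vlan-set per region; the VLAN ID 1235 here is made up for illustration.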
Adding new private WAN spaces
Follow the same steps as for migrating an existing space, with these two exceptions:
- Enable private WAN on the space through the management server interface.
- An outbound gateway should not be enabled in the space's private WAN tab. However, you may wish to add a disabled gateway to keep a record of the firewall's IP.
Troubleshooting
A: Bond
While YAP doesn't directly affect bonds, it can be useful to troubleshoot private WAN routes at the bond level by inspecting the bond's routing table:
ip route show table bonding-pwan
B: Aggregator
YAP-enabled aggregators have a yap command installed that can be used to show information about the spaces currently running on the aggregator.
The most useful command is yap status <space key>, which shows the status of the bird protocols and the current routing table for that space:
agg:~# yap status bammya
spcbammya BGP krt8251 up 2018-12-06 Established
ospf_bammya OSPF krt8251 up 07:21:22 Running
default via 100.109.152.1 dev vl-bammya proto bird
10.10.1.0/24 via 100.109.152.8 dev vl-bammya proto bird
192.168.33.0/24 via 100.109.152.8 dev vl-bammya proto bird
The BGP protocol for the space is controlled by bonding and should be in 'Established' state. The ospf_<key> protocol is the one managed by YAP and should be in 'Running' state. If the status is 'Alone' instead, it means there are no OSPF neighbors.
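For scripted checks, the OSPF state can be pulled out of the status output. This is a minimal sketch that assumes the real output has the same column layout as the sample above, with the state in the last column of the ospf_<key> line; ospf_state is a hypothetical helper, not a yap subcommand.

```shell
#!/bin/sh
# ospf_state is a hypothetical helper. It reads `yap status <key>` output on
# stdin and prints the last column of the ospf_<key> line. Assumes the column
# layout shown in the sample output. Returns non-zero if no such line exists.
ospf_state() {
    awk -v proto="ospf_$1" '
        $1 == proto { print $NF; found = 1 }
        END { exit !found }
    '
}
```

Usage: `yap status bammya | ospf_state bammya` prints Running for the sample above. A result of Alone points to a missing OSPF neighbor, as described in the preceding paragraph.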
To show the current OSPF neighbors for a space:
pwanbirdc - show ospf neighbor ospf_<key>
An aggregator has one VLAN interface per space, which follows the naming convention vl-<key>. You can use this command to show the VLAN ID:
ip -d link show dev vl-bammya
Lastly, you can look at the VLAN interface to see the aggregator's IP, as well as the subnet designated for the space and routing group:
agg:~# ip address show dev vl-bammya
440: vl-bammya@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether d0:43:1e:c5:1b:44 brd ff:ff:ff:ff:ff:ff
inet 100.109.152.7/21 scope global vl-bammya
In the example above, the firewall would be configured with 100.109.152.1/21.
Knowing the subnet, you can test ICMP connectivity to the firewall IP:
ping <gateway IP>
When troubleshooting OSPF it may be useful to run a packet capture on the VLAN interface to see which options are set:
tcpdump -ni vl-<key> -vvv proto 89
D: VXR
The most useful command is yap status <space key>, which shows the status of the bird protocol and the current routing table for that space:
agg:~# yap status bammya
ospf_bammya OSPF bammya up 07:21:23.175 Running
default via 100.109.152.1 dev vl-bammya proto bird metric 32
10.10.1.0/24 via 100.109.152.8 dev vl-bammya proto bird metric 32
Otherwise, the same troubleshooting steps apply as on the aggregator.
If you need to troubleshoot the VXLAN as well, you can view the interface details with the standard Linux utilities:
agg:~# ip -d l show dev vx-<key>
191: vx-bammya: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1432 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/ether 66:da:5c:17:37:38 brd ff:ff:ff:ff:ff:ff promiscuity 0
vxlan id 59 srcport 0 0 dstport 4789 ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
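When comparing VXRs in different data centers, the VXLAN ID is usually the first field worth checking. The sketch below extracts it from the detailed link output shown above; link_vni is a hypothetical helper name, and the parsing assumes the "vxlan id <n>" wording seen in that output.

```shell
#!/bin/sh
# link_vni is a hypothetical helper. It reads `ip -d link show dev vx-<key>`
# output on stdin and prints the number that follows "vxlan id".
link_vni() {
    awk '{
        for (i = 1; i + 2 <= NF; i++)
            if ($i == "vxlan" && $(i + 1) == "id") { print $(i + 2); exit }
    }'
}
```

Usage: `ip -d link show dev vx-bammya | link_vni` prints 59 for the interface shown above.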
E: Firewall
Out of YAP's control. Here be dragons.
F: bondingadmin
As on all the nodes, there is a command in the path called yap that serves as the entry point for all things backhauled. Most of the commands are described above in their relevant sections. You can always run yap with no arguments to see what actions are available:
root@bondingadmin:~# yap
/usr/local/bin/yap <action> [args]
Actions:
region-list
region-show <region>
region-add <region>
...