YAP: Yet Another Private WAN

This is an alternative method of providing private WAN in bonding. Instead of funneling traffic into private WAN routers via GRE tunnels, it peers space tables directly on VLANs off the aggregators using OSPF. This allows for the following improvements over standard private WAN:

  • Custom, more efficient backhauls can be used, improving speed in most cases
  • Tables can be peered with any switches or routers in the data centers
  • Reduces processing load on aggregators due to simplified rulesets

If a backhaul is not already set up in a data center, additional "VXR" boxes can be added to each data center to provide an overlay backhaul using VXLAN-over-IPSEC.

Installation and setup

Initial installation

First, install the software on the bondingadmin server:

make install

Note

The rest of the yap commands are run on the management server, unless otherwise stated.

Then add a read-only user in the Bondingadmin web interface to allow the tool to query the API. Add the user details using the yap tool:

yap auth-set user@example.com mypassword

Setting up regions

Each region will have a series of aggregators and VLAN assignments for the spaces. To add a region:

yap region-add yvr

Adding spaces

To add the space with key foo:

yap space-add foo

Setting VLAN region associations

If no VLAN is associated with a space in a region, none of the nodes in that region will set up peering for the space. To add a VLAN association for space foo in region yvr on VLAN 1234:

yap vlan-set foo yvr 1234
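
Since a mistyped VLAN ID would leave a region without peering for the space, it may be worth checking the ID before calling yap. The following is a minimal sketch, not part of YAP: valid_vlan_id and safe_vlan_set are hypothetical helper names, and the YAP variable defaults to "echo yap" so the sketch only prints the command it would run. The 1-4094 range is the usable 802.1Q VLAN ID range.

```shell
#!/bin/sh
# Succeed only if $1 is a usable 802.1Q VLAN ID (1-4094).
valid_vlan_id() {
    case "$1" in
        ''|0*|*[!0-9]*) return 1 ;;   # empty, leading zero, or non-digit
    esac
    [ "${#1}" -le 4 ] && [ "$1" -le 4094 ]
}

# Hypothetical wrapper: validate the VLAN ID, then hand off to yap.
# Arguments: <key> <region> <vlan_id>
safe_vlan_set() {
    if ! valid_vlan_id "$3"; then
        echo "invalid VLAN ID: $3" >&2
        return 1
    fi
    # Dry run by default; set YAP=yap to apply the change for real.
    ${YAP:-echo yap} vlan-set "$1" "$2" "$3"
}

safe_vlan_set foo yvr 1234
```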

Enabling IPSEC

To enable IPSEC:

yap ipsec-enable

Setting up a VXR

If using VXR hosts to provide a backhaul overlay, install the latest openSUSE Leap distribution on a host, set up the base networking, then install and set up salt-minion.

Assuming we are going to call the node yvr-vxr01 and our Bondingadmin host is bondingadmin.mydomain.com:

zypper in salt-minion
echo yvr-vxr01 > /etc/salt/minion_id
echo "master: bondingadmin.mydomain.com" > /etc/salt/minion.d/yap.conf
echo -e "grains:\n  type: vxr" >> /etc/salt/minion.d/yap.conf
systemctl enable --now salt-minion

On the Bondingadmin server, accept the salt key for the box:

salt-key -a yvr-vxr01

Then add a record using yap, with the name, IP, region, and VLAN trunk port:

yap vxr-add yvr-vxr01 1.2.3.4 yvr eth1

The necessary software will be installed automatically.

Adding aggregators

To add an aggregator, get the ID from Bondingadmin, select a region for it, set up a VLAN trunk interface, then add it:

yap agg-add 1 yvr eth1

This will install some software on the aggregator to maintain the VLANs and OSPF peering on the eth1 trunk port.

Showing status

On each aggregator and VXR, there is a yap command that manages the local state. To show the state of space foo:

yap status foo

From the bondingadmin server, you can check state on multiple hosts simultaneously by specifying a node list to the salt cmd.run command. For example, to show the state of space foo on the VXR yvr-vxr01 and the aggregator with ID 1:

salt -C 'L@yvr-vxr01,node-1' cmd.run "yap status foo"
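
When the node list gets longer, the list target can be assembled from the node names rather than typed by hand. This is a small sketch under the naming seen above (VXRs by minion ID, aggregators as node-<ID>); salt_list_target is a hypothetical helper, and the final line only prints the salt command instead of running it.

```shell
#!/bin/sh
# Build an "L@a,b,c" list target for salt -C from node names.
salt_list_target() {
    target=""
    for node in "$@"; do
        # Add a comma separator only when the target is already non-empty.
        target="${target:+$target,}$node"
    done
    printf 'L@%s\n' "$target"
}

# Dry run: show the command that would be executed.
echo salt -C "'$(salt_list_target yvr-vxr01 node-1)'" cmd.run '"yap status foo"'
```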

Architectural overview

The following diagram shows an overview of the various nodes involved in a typical YAP deployment for a space. This fictional space has a firewall in YVR only, but bonds in both YVR and TOR.

The red circles denote details and troubleshooting commands that can be run on each respective node.

VXLAN backhaul diagram

Adding spaces

Prerequisites

  • All bonds are moved to yap-enabled aggregators.
  • A VLAN is designated for each region that will host bonds. For example, for a space that has bonds on aggregators in two regions, YVR and TOR, you must designate a VLAN for both regions.

Migrating existing private WAN spaces

The following commands are all to be run on the management server.

Warning

There will be a brief outage when migrating a space.

  1. Add the space:

    yap space-add <key>

    This can be run in advance as it does not make any runtime changes.

  2. To calculate the subnet for each region/space, you can run the following command. It only returns the network that will be designated for the VLAN on the aggregators in the region; it does not apply any changes:

    yap subnet-get <key> <region>

    This will return the base subnet for this space-region pair, as well as the specific IPs of the aggregators in that region. The first IP in the subnet is reserved for the firewall:

    Subnet: 100.31.88.0/21
    Firewall: 100.31.88.1
    Aggregators:
        agg03: 100.31.88.5

  3. Configure the firewall with the IP shown in step 2 on the VLAN interface and configure OSPF. While the exact settings will be vendor-specific, here are the general details:

    • area 0.0.0.0
    • subnet <from step 2>
    • redistribute connected
    • hello interval 10s
    • dead interval 40s

  4. Add a VLAN association for each region:

    yap vlan-set <key> <region> <vlan_id>

    This will start the VLAN interfaces on each yap-enabled aggregator in the region using the same subnet reflected in step 2.

    Caution

    This is the start of an outage for the space, as the private WAN router's BGP protocols for the space are brought down to prevent routing loops/conflicts.

  5. Confirm OSPF is up in each region by running this command on the aggregators:

    yap status <key>

    If the OSPF protocol is not 'Running', jump to troubleshooting B: Aggregator.

  6. Once OSPF is up and the routes have propagated both ways, you can disable the outbound gateway configured in the existing space to finish cleanup.
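
For a space that spans several regions, it can help to write out the full command sequence before the maintenance window. The sketch below only prints the yap commands from the steps above in order and runs nothing; plan_migration is a hypothetical helper, and the tor region with VLAN 1235 is an illustrative value rather than anything taken from a real deployment.

```shell
#!/bin/sh
# Print the yap commands for migrating one space, in step order.
# Arguments: <key> <region>:<vlan_id> [<region>:<vlan_id> ...]
plan_migration() {
    key=$1
    shift
    echo "yap space-add $key"
    for pair in "$@"; do
        # ${pair%%:*} is the region part of "region:vlan".
        echo "yap subnet-get $key ${pair%%:*}"
    done
    echo "# configure the firewall VLAN interface and OSPF, then continue"
    echo "# the outage starts with the first vlan-set below"
    for pair in "$@"; do
        # ${pair#*:} is the VLAN ID part of "region:vlan".
        echo "yap vlan-set $key ${pair%%:*} ${pair#*:}"
    done
}

plan_migration foo yvr:1234 tor:1235
```

Keeping the subnet-get calls ahead of the vlan-set calls mirrors the steps above: everything before the first vlan-set can be done in advance without runtime changes.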

Adding new private WAN spaces

Follow the same steps as for migrating an existing space, with these two exceptions:

  • Enable private WAN on the space through the management server interface.
  • An outbound gateway should not be enabled in the space's private WAN tab. However, you may wish to add a disabled gateway as a record of the firewall's IP.

Troubleshooting

A: Bond

While YAP doesn't directly affect bonds, it can be useful to troubleshoot private WAN routes at the bond level, by inspecting their routing table:

ip route show table bonding-pwan

B: Aggregator

YAP-enabled aggregators have a yap command installed that can be used to show information about the spaces currently running on the aggregator.

The most useful command is yap status <space key>, which shows the status of the bird protocols and the current routing table for that space:

agg:~# yap status bammya

spcbammya BGP      krt8251  up     2018-12-06  Established
ospf_bammya OSPF     krt8251  up     07:21:22    Running

default via 100.109.152.1 dev vl-bammya proto bird
10.10.1.0/24 via 100.109.152.8 dev vl-bammya proto bird
192.168.33.0/24 via 100.109.152.8 dev vl-bammya proto bird

The BGP protocol for the space is controlled by bonding and should be in 'Established' state. The ospf_<key> protocol is the one managed by YAP and should be in 'Running' state. If the status is 'Alone' instead, it means there are no OSPF neighbors.
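
To check the OSPF state from a script, the status output can be filtered for the ospf_<key> line. This is a rough sketch that assumes the state is the last whitespace-separated column, as in the sample output above; ospf_state is a hypothetical helper, and the here-document stands in for real yap status output.

```shell
#!/bin/sh
# Read `yap status <key>` output on stdin and print the state column
# (for example Running or Alone) of the ospf_<key> protocol.
# Exits non-zero if no ospf_<key> line is present.
ospf_state() {
    awk -v proto="ospf_$1" '
        $1 == proto { print $NF; found = 1 }
        END { exit !found }
    '
}

# On an aggregator this would be: yap status bammya | ospf_state bammya
ospf_state bammya <<'EOF'
spcbammya BGP      krt8251  up     2018-12-06  Established
ospf_bammya OSPF     krt8251  up     07:21:22    Running
EOF
```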

To show the current OSPF neighbors for a space:

pwanbirdc - show ospf neighbor ospf_<key>

An aggregator has one VLAN interface per space, which follows the naming convention of vl-<key>. You can use this command to show the VLAN id:

ip -d link show dev vl-bammya

Lastly, you can look at the VLAN interface to see the aggregator's IP, as well as the subnet designated for the space and routing group:

agg:~# ip address show dev vl-bammya

440: vl-bammya@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether d0:43:1e:c5:1b:44 brd ff:ff:ff:ff:ff:ff
    inet 100.109.152.7/21 scope global vl-bammya

In the example above, the firewall would be configured with 100.109.152.1/21.
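
The firewall address can be derived from any aggregator address in the subnet, since the first IP in the subnet is reserved for the firewall. The following is a minimal sketch of that arithmetic in plain shell; first_host is a hypothetical helper and handles IPv4 only.

```shell
#!/bin/sh
# Print the first usable address of the subnet containing <ip>/<prefix>.
first_host() {
    addr=${1%/*}
    prefix=${1#*/}
    # Split the dotted quad into $1..$4.
    old_ifs=$IFS
    IFS=.
    set -- $addr
    IFS=$old_ifs
    ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    # Netmask as a 32-bit integer, e.g. /21 gives 255.255.248.0.
    mask=$(( (4294967295 << (32 - $prefix)) & 4294967295 ))
    # Network base address plus one.
    first=$(( (ip & mask) + 1 ))
    echo "$(( (first >> 24) & 255 )).$(( (first >> 16) & 255 )).$(( (first >> 8) & 255 )).$(( first & 255 ))/$prefix"
}

first_host 100.109.152.7/21   # prints 100.109.152.1/21
```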

Knowing the subnet, you can test ICMP connectivity to the firewall IP:

ping <gateway IP>

When troubleshooting OSPF it may be useful to run a packet capture on the VLAN interface to see which options are set:

tcpdump -ni vl-<key> proto 89 -vvv

D: VXR

The most useful command is yap status <space key>, which shows the status of the bird protocol and the current routing table for that space:

agg:~# yap status bammya

ospf_bammya OSPF       bammya up     07:21:23.175  Running

default via 100.109.152.1 dev vl-bammya proto bird metric 32
10.10.1.0/24 via 100.109.152.8 dev vl-bammya proto bird metric 32

Otherwise, the same troubleshooting steps apply as on the aggregator.

If you need to troubleshoot the VXLAN as well, you can view the interface details with the standard Linux utilities:

agg:~# ip -d l show dev vx-<key>

191: vx-bammya: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1432 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 66:da:5c:17:37:38 brd ff:ff:ff:ff:ff:ff promiscuity 0
    vxlan id 59 srcport 0 0 dstport 4789 ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
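
When comparing VXLAN IDs across several VXRs, it may be easier to extract just the ID from that output. This is a small sketch that relies on the "vxlan id <n>" token order shown above; vxlan_id is a hypothetical helper, and the here-document stands in for real ip output.

```shell
#!/bin/sh
# Read `ip -d link show dev vx-<key>` output on stdin and print the VXLAN ID.
vxlan_id() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "vxlan" && $(i + 1) == "id") { print $(i + 2); exit }
    }'
}

# On a VXR this would be: ip -d link show dev vx-bammya | vxlan_id
vxlan_id <<'EOF'
191: vx-bammya: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1432 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 66:da:5c:17:37:38 brd ff:ff:ff:ff:ff:ff promiscuity 0
    vxlan id 59 srcport 0 0 dstport 4789 ageing 300
EOF
```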

E: Firewall

Out of YAP's control. Here be dragons.

F: bondingadmin

As on all the other nodes, a command called yap is in the path and serves as the entry point for all things backhauled. Most of the commands are described above in their relevant sections. You can always run yap with no arguments to see what actions are available:

root@bondingadmin:~# yap
/usr/local/bin/yap <action> [args]

Actions:

region-list
region-show <region>
region-add <region>
...
Description
Yet Another Private WAN (DEPRECATED as of bonding 6.5)