I recently wrapped up a Dell networking deployment consisting of both Dell S-series switches running the Force10 Operating System (FTOS) and N-series switches running the Dell Network Operating System (DNOS). Both boasted straightforward configuration and were pleasant to work with.
The FTOS switches in particular offered a powerful and Dell-recommended feature called Peer Routing that could be used in conjunction with the Virtual Link Trunking (VLT) capabilities. VLT is similar to Cisco’s Virtual Port Channel (vPC) feature, and allows for a single port channel to be multihomed to two Dell FTOS switches. This type of design has a few benefits, namely:
- Link aggregation and redundancy to two upstream switches using a port-channel
- Non-blocking port-channels. Without VLT, you would need to run a separate port-channel to each upstream switch, and spanning-tree would block one of those port-channels entirely, which is an expensive proposition when dealing with 10GigE links.
- “Split-brain” architecture in the upstream core or distribution layer. Each upstream switch, despite participating in a VLT domain and sharing port-channels, retains an independent control and management plane. For example, upgrading the code on one of the upstream switches is non-disruptive, as the other switch can act on behalf of its downed peer. This is in contrast to a stacked-switch architecture, where all of the switches in the stack share a single control and management plane, creating a potential single point of failure.
If you’ve spent any amount of time doing data center networking, then you’re probably already aware of the benefits listed above, even if the technologies are called different things by different vendors. I don’t want to spend too much time discussing the VLT architecture, so I’d encourage you to read the [PDF Warning] VLT Reference Architecture for additional information.
In this article, I’ll discuss the Peer Routing functionality and provide some insight on a caveat that I came across while deploying it in the lab. Luckily, Dell’s helpful enterprise networking support representatives were able to provide some clarification after taking a look at my issue.
Let’s start by considering the topology above. We can see a fairly simple collapsed core network, with two Dell S4048 FTOS switches at the core and a downstream access switch with a connected host. The S4048 switches exist in a Virtual Link Trunking domain. They’re joined together by two 40GigE interfaces in a port-channel for the VLT Interconnect (VLTi) and a VLT Heartbeat link across their management interfaces. Each has a single 40GigE connection to the upstream network. The access switch is uplinked with two 10GigE connections in a Virtual Link Trunk. One 10GigE connection goes to each upstream core switch, but both are able to exist in a single non-blocking port-channel thanks to the VLT functionality.
This type of design raises some questions for routing purposes. Namely, how can we ensure load-balancing (or at least failover) of our default gateway? Using a standard First Hop Redundancy Protocol (FHRP), such as the Virtual Router Redundancy Protocol (VRRP), could result in inefficiencies. While it would satisfy our failover requirement, only one switch would actively route traffic for end hosts. It could also result in inefficient utilization of our 2x40GigE VLTi links and of the 40GigE uplinks to the upstream network. Not to mention, we would have to burn an additional IP address for the VRRP VIP. The Peer Routing functionality in Dell FTOS switches is designed to solve these exact problems.
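For reference, a VLT domain along the lines of the one described above is built from a dedicated port-channel for the VLTi plus a backup heartbeat destination across the management network. The sketch below shows a minimal FTOS configuration for one of the core switches; the interface numbers, domain ID, and the peer’s management IP are hypothetical and would need to match your environment.

```
! Core-S4048-01 -- minimal VLT domain sketch (hypothetical values)
interface Port-channel 128
 description VLTi to Core-S4048-02
 channel-member fortyGigE 1/49
 channel-member fortyGigE 1/50
 no shutdown
!
vlt domain 1
 peer-link port-channel 128
 ! heartbeat to the peer's management IP
 back-up destination 10.10.10.2
 unit-id 0
```

The peer switch gets a mirror-image configuration with its own `unit-id` and the opposite `back-up destination` address.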
Peer Routing Overview
Peer Routing, simply put, allows a switch in a VLT domain to forward traffic on behalf of its peer switch. With this type of solution, an end host can be assigned the gateway IP address of either upstream switch in the VLT domain, and traffic will be forwarded by whichever switch receives it instead of being sent across the VLTi. Consider the green traffic flow shown above. Host 1 is assigned a gateway IP address of 192.168.10.1, which is the IP of Core-S4048-01’s VLAN 10 interface. Due to the port-channel hashing, Host 1’s traffic flows across the port-channel to Core-S4048-02. Since the traffic is destined at layer 2 for Host 1’s default gateway, it carries the destination MAC address of Core 1. However, instead of passing the traffic across the VLTi to Core 1, Core 2 happily forwards the traffic upstream on behalf of its peer.
This solution works because each switch installs a LOCAL_DA CAM table entry with the MAC address of its peer’s interfaces. This allows each peer to forward traffic on behalf of the other peer. Naturally, the upstream routing information should be consistent across both peers for traffic forwarding to be accomplished efficiently, without unnecessary traversal of the VLTi. The implementation of Peer Routing alleviates a few key pain points:
- Links are used more efficiently. Traffic is able to cross either link in the port-channel and be routed by either peer without having to unnecessarily traverse the VLTi.
- Gateway redundancy is introduced, as each switch will continue routing traffic even if its peer fails. The length of time that the peer continues to handle traffic can be adjusted with a timer, but by default it continues indefinitely.
- There is no need to dedicate an additional IP address to a VIP, as is necessary with traditional FHRPs.
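Enabling the feature is a one-line addition under the VLT domain. The sketch below, again with hypothetical values, also shows the peer-routing timeout knob; when it is omitted, the surviving peer routes on behalf of its failed neighbor indefinitely.

```
! Enable Peer Routing under the existing VLT domain (hypothetical domain ID)
vlt domain 1
 peer-routing
 ! optional: stop routing on the failed peer's behalf after 300 seconds
 ! (with no timeout configured, the default is to continue forever)
 peer-routing-timeout 300
```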
Next, we’ll take a brief look at a caveat of Peer Routing that I experienced while testing a failure scenario in the lab.
Peer Routing Failure Scenario Caveat
One of the goals when testing out the VLT and Peer Routing architecture in the lab was to ensure that traffic would continue to flow throughout the environment when either switch in the VLT failed. I figured this would be simple enough to design an experiment around. I ran a constant ping on Host 1 to the default gateway (192.168.10.1) and cut power to Core 1. I expected the ping to continue uninterrupted due to installation of the CAM table entry on Core 2 for Core 1’s MAC addresses. After all, pings to the default gateway with any normal FHRP would continue largely uninterrupted, save for any delay due to a dead timer.
Much to my surprise, my pings began timing out completely when I failed Core 1. My colleague and I tried a variety of troubleshooting methods. We adjusted timers, removed and replaced configuration sections in differing orders, and eventually gave up and called Dell support. We even stumped them for about an hour. Finally, they were able to grab an engineer who was an expert on the Peer Routing capabilities. He confirmed that this behavior was actually expected. While the switches will continue to route traffic on their peer’s behalf in the event of a failure, they won’t respond to management or control plane traffic destined to the peer’s IP address.
Confirming this behavior was simple enough. We just added another host on a different VLAN and, without changing the default gateway of either host, ensured that they could still communicate with each other while Core 1 was down. The gateway IPs of both hosts were still set to IP addresses on Core 1, so the fact that traffic continued to flow while Core 1 was powered off provided validation of Peer Routing’s redundancy functionality.
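If you want to check the state of the domain during this kind of failover testing, FTOS provides a handful of show commands; a few useful ones are sketched below (output omitted, as it varies by platform and release).

```
! Useful FTOS verification commands during failover testing
show vlt brief        ! VLT domain status, including Peer Routing state
show vlt detail       ! per-VLT-LAG status on the local and peer switch
show vlt statistics   ! heartbeat and VLTi counters
```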
This caveat makes a great deal of sense in retrospect. We wouldn’t want the switches to respond to control plane or management plane traffic destined for their peers, as that would wreak havoc on routing protocols, SSH, and other traffic whose destination must be unambiguous. Our only goal for Peer Routing is redundancy and load-sharing for routed traffic.
It’s important to be aware of this caveat, particularly as it relates to ICMP, when deploying Peer Routing in your environment. During a failure scenario, an end host may be unable to ping its default gateway yet still reach remote subnets without issue. ICMP is an important troubleshooting tool, and some applications even rely on pings to their default gateway to assess network health, so this behavior can be misleading, especially since it doesn’t seem to be covered anywhere in Dell’s documentation on Peer Routing.
Peer Routing is a really nice feature of the Dell FTOS switching line that can easily bring additional redundancy, load-sharing, and more efficient traffic flows into an environment. It’s simple to enable and exhibits clear benefits over a traditional FHRP. However, it’s important to understand how it works and its caveats prior to deployment. While the ICMP caveat is minor, it’s also not clearly documented and can leave you scratching your head when one of your switches fails, as it certainly did for me in the lab.
For an excellent in-depth technical discussion of Peer Routing with configuration examples, I’d encourage you to check out Dell’s whitepaper: [PDF Warning] The Case of Routed VLT, Peer Routing, VLT Proxy Gateway and Their Relationship to VRRP by Victor Lama.