My OTV Take

Posted: April 1, 2013 | Filed under: Cisco Nexus, Networking | Tags: cisco otv, data center interconnect, datacenter interconnect, dci, l2 dci, layer 2 dci, otv | 3 Comments
After my recent DFW VMUG presentation where I spoke on the topic, a friend emailed me and asked what I thought about OTV.
“You mentioned that you were against OTV. Curious on your take on this, as we are using it across two datacenters using N7K, UCS, NetApp and VMware.”
I’d like to share my response to him here.
Please don’t get me wrong. If one is forced to implement a Layer 2 Data Center Interconnect (DCI), OTV is probably the best solution. Sometimes, L2 connectivity between data centers is a functional requirement – perhaps even a constraint. In these cases, one should weigh the benefits and risks of an L2 DCI and make an informed decision on whether to continue with such a deployment. Should the choice be to deploy OTV, someone needs to accept the risks that come with OTV in its current implementation.
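For context, extending a VLAN with OTV comes down to a small amount of NX-OS configuration on each edge device. The sketch below is illustrative only – the interface names, site identifier, multicast groups, and VLAN range are hypothetical placeholders, not values from any deployment discussed here:

```
! Hypothetical OTV edge device (NX-OS) -- all values are placeholders
feature otv
otv site-identifier 0x1            ! unique per site
otv site-vlan 99                   ! VLAN for edge-device adjacency within the site
interface Overlay1
  otv join-interface Ethernet1/1   ! L3 uplink toward the DCI core
  otv control-group 239.1.1.1      ! multicast group for the OTV control plane
  otv data-group 232.1.1.0/28      ! SSM range for multicast data traffic
  otv extend-vlan 100-110          ! the VLANs stretched between data centers
  no shutdown
```

Every VLAN listed under `otv extend-vlan` becomes part of a fault domain shared by both data centers.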
For instance, one problem that still exists with OTV in its current form is a traffic trombone when a VM moves from one DC to the other, say from Data Center A to Data Center B. OTV solves only half the traffic trombone problem through FHRP Isolation: only outbound traffic is path optimized. That is, after the vMotion, the VM sees its default gateway in Data Center B and *not* in Data Center A. But external clients with active connections to the VM while it is vMotioned (which is the whole point of a live migration – keeping active connections) from Data Center A to Data Center B will still send traffic to Data Center A. That external client traffic must then traverse the WAN to reach the VM in Data Center B.
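FHRP Isolation is typically achieved by filtering FHRP hellos at the OTV edge so that each data center answers its own first-hop gateway locally. One commonly published pattern uses a VLAN access map to drop HSRP traffic; the names and VLAN list below are hypothetical, and HSRPv1/v2 is assumed (GLBP or VRRP would need different matches):

```
! Sketch: drop HSRP hellos (UDP 1985 to 224.0.0.2 / 224.0.0.102) on extended VLANs
ip access-list ALL_IPs
  10 permit ip any any
ip access-list HSRP_IP
  10 permit udp any 224.0.0.2/32 eq 1985
  20 permit udp any 224.0.0.102/32 eq 1985
vlan access-map HSRP_Localization 10
  match ip address HSRP_IP
  action drop
vlan access-map HSRP_Localization 20
  match ip address ALL_IPs
  action forward
vlan filter HSRP_Localization vlan-list 100-110
```

Note that this only localizes the *outbound* gateway; it does nothing for inbound client traffic, which is the half of the trombone OTV leaves unsolved.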
The other problem with OTV is that a broadcast storm in one data center, say Data Center A, can spill over into Data Center B, provided the VLAN in which the storm occurs is stretched across the WAN using the overlay. Don’t mistake this for the STP Isolation that OTV offers. STP Isolation means only that STP BPDUs are blocked from traversing the overlay; it does *not* block these broadcasts. One must implement storm control or similar technologies to protect against broadcast storms.
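A minimal mitigation, assuming Nexus gear, is interface-level storm control on the ports carrying the extended VLANs. The interface and thresholds below are illustrative only, not recommendations – appropriate levels depend entirely on your baseline traffic:

```
! Sketch: cap broadcast/multicast as a percentage of interface bandwidth
interface Ethernet1/2
  description Internal interface facing the extended VLANs
  storm-control broadcast level 1.00
  storm-control multicast level 1.00
```

This limits how hard a storm can hit the overlay, but it does not remove the shared fault domain; it only throttles the damage.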
Some easy solutions to the traffic trombone problem would be to cold migrate the VM, reboot it once it’s in the far DC, or somehow kill all of its current external connections after the move. That way, new connections enter Data Center B in the first place, assuming your load balancers/DNS servers are aware of the move. The problem can be solved fairly easily and with little downtime, at the expense of losing active connections during the move from DC A to DC B.
The one big argument that has existed for decades and will likely continue is that Layer 2 switching does not scale well. And when you force one Layer 2 domain (read: VLAN) to exist in two data centers at the same time, you’ve created a single fault domain where there used to be two. Now, many folks will tell you that they’ve been using OTV for over a year and a half with no problems. Those same people will say that they’ve turned off Spanning Tree in their LAN because they’ve been burned by it in the past, and they haven’t had any problems so far. These are accidents waiting to happen. Someone always plugs a cable into the wrong port, fat-fingers something at the CLI, a NIC starts flapping, or one could just wait for the next new software bug – these problems always happen. When they do, they’ll not only take out a VLAN in Data Center A, but they’ll also take out that same VLAN in Data Center B.
Now, again, sometimes one really is forced to implement an L2 DCI. In these cases, just be sure to document your objections and be a good soldier. Make sure the decision makers understand the risks and accept them. Then implement and move on.
To say I’m against OTV just isn’t accurate. I agree with smarter dudes than I that one should be sure a Layer 3 solution wouldn’t work better before settling for an L2 DCI. There are also some up-and-coming technologies, such as LISP, that show promise in solving the traffic trombone. Note that, as of today, VXLAN is *not* an L2 DCI.
The January 2012 OTV primer from Cisco (http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/DCI/whitepaper/DCI3_OTV_Intro_WP.pdf) states their broadcast control policy as such (speaking to my broadcast storm statement above):
“Broadcast Policy Control
…OTV will provide additional functionality such as broadcast suppression, broadcast white-listing, and so on, to reduce the amount of overall Layer 2 broadcast traffic sent across the overlay. *Details will be provided upon future functional availability.* (emphasis mine)”
The way I read this, such broadcast control is not implemented yet. Now, January 2012 is getting pretty dated. I’m not familiar with newer documentation that states otherwise.
Mike, well written. My only comment is about inbound traffic to the data center. As you say, OTV is not meant to solve the issue where client traffic hits the wrong data center and has to cross the DCI to reach the server. You do need to rely on your load balancers/GSS to optimize inbound traffic flows. Alternatively, a technology like LISP would work extremely well at optimizing those flows as well. No matter what though, you almost always need existing flows to continue to route through the original data center (where the VM lived prior to the hot move) in order to satisfy stateful services such as firewalls and load balancers. A thorough design would definitely optimize new flows though.
Totally agree with this: “Make sure the decision makers understand the risks and accept them.”
I have implemented a couple of ASR1K-based OTV solutions recently. You make a very good point about a trombone remaining for any traffic that I’d call northbound. It is the firewall state and external routing that have yet to be resolved. I’ve found that many clients are willing to accept traffic flowing east/west in order to return to the owning firewall. Since most have sufficient bandwidth on the L2 DCI(s), they accept that traffic pattern in order to gain the DR-type mobility of moving guests as dictated by load or disaster.
I’m wondering if firewall vendors might be looking to create a ‘cluster’ solution extended across a Layer 2 DCI that would help better share state, so that traffic could go directly northbound rather than having to make that extra east/west trip?
As you mentioned, many of us are watching for the maturity and production adoption of LISP.
This is a conversation my architect and I have had recently. His point is essentially that, many times, there are CPU cycles and bandwidth to spare these days, especially with 10Gb links between sites. Perhaps if these do, indeed, exist in a particular situation, the decision for OTV is a bit easier.
All the best,