Typical ESXi host-facing switchport configs


I was troubleshooting a production issue a couple of days ago, which led me to ask our Networking team for the switchport configs of the ESXi 5.0 hosts that pass virtual machine traffic. Here’s a snippet of what they came back with for two particular ports:


interface GigabitEthernet1/5
description -=R910 ESX# 1 – Front Side=-
switchport mode trunk
end


interface GigabitEthernet1/6
description -=R910 ESX# 1 – Front Side=-
end

Well. Not only do I see our problem (no config *at all* on one port!), but I see something else that troubles me. Our ESXi host-facing ports are only configured as trunk ports. Absolutely *nothing* else. Well, this just won’t do.

I’ve wanted to perform a virtual infrastructure audit since I started working here in June but haven’t asked for it, yet. Apparently, this should be done sooner rather than later because if something as blatant, obvious, and important as a switchport config can be overlooked, who knows what else is wrong.

I know this isn’t necessarily earth-shattering, but it’s important to know the difference. It seems as if folks think that simply because vSphere software is installed, their virtual infrastructure will auto-magically do all the great things that it’s designed to do. Obviously, it won’t. One needs to design it and configure it correctly to do these things. So without further ado, please let me remind everyone what should be configured, somewhat minimally. I know configs will differ from shop to shop, but take these and adjust to taste. For instance, all things being equal, I’d like to pass only virtual machine VLANs over these ports. Something like:


interface GigabitEthernet1/10
description -=R910 ESX# 1 – Front Side=-
switchport mode trunk
switchport trunk allowed vlan 10,20,30-50,70

In our environment, we have over one hundred server VLANs (basically one VLAN per server function or application), so you can imagine the administrative overhead in adding VLANs to such a list to allow traffic to and from each ESXi host. There’s also the idea that one day a Network Admin could input something like the line below, intending to add a VLAN to the allowed list above, and *really* bork something up:

switchport trunk allowed vlan 80

What this actually does is replace the entire allowed list: every VLAN currently allowed on the interface is removed, and only VLAN 80 remains on the trunk. Any virtual machines on VLANs 10, 20, 30 through 50, and 70 using that particular uplink would lose connectivity. In vSphere 5.1, the new networking health check features would alert a VMware Admin to the VLAN misconfiguration. For an overview of these super-cool and awesome features, check out Chris Wahl’s blog. Hooray for that, but I say it shouldn’t happen in the first place. To avoid that all-too-easy mistake, I think it’s perhaps OK to trunk all VLANs to the host. I understand this will also add CPU overhead on the ESXi host because every broadcast on every VLAN must be processed by the host. I’m not certain, but I don’t think VLAN pruning on the immediate upstream switch would help reduce the unneeded broadcasts reaching the host. If you know, please leave a comment.
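
For what it’s worth, IOS does have a safer form for this kind of change: the add and remove keywords modify the existing allowed list instead of replacing it. A minimal sketch, reusing the example interface and VLAN 80 from above (syntax on other platforms may differ):

```
interface GigabitEthernet1/10
 ! Appends VLAN 80 to the existing allowed list; 10,20,30-50,70 stay allowed
 switchport trunk allowed vlan add 80
 ! Takes a single VLAN back out without touching the rest of the list
 switchport trunk allowed vlan remove 80
```

This doesn’t make the mistake impossible, since leaving out the keyword still overwrites the list, but it’s the form to ask for in change requests.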

Finally, let me add my two cents (more of a reminder) of what should be configured on ESXi host-facing switchports:

spanning-tree portfast trunk (for those running IOS; the trunk keyword is needed because plain portfast only takes effect on access ports)

or

spanning-tree port type edge trunk (for those running NX-OS; again, the trunk keyword is what makes it apply on a trunk port)

In most circumstances, VMs should not be sending BPDUs (and ESXi hosts don’t run STP at all). For deep dives into why it’s safe to connect virtual switches to physical switches with portfast enabled, see Ivan’s blog at blog.ioshints.info and search for “standard switch STP.” The idea here is that it’s safe and, therefore, you should bring the port up as fast as possible.

spanning-tree bpduguard enable (on the interface, for those running IOS or NX-OS)

spanning-tree portfast bpduguard default (globally on IOS, for every PortFast-enabled port)

With bpduguard enabled, the switch disables the physical switchport as soon as it receives a BPDU from the ESXi host connected to it. To be more precise, it puts the port in the errdisable state and logs why the port was shut down, i.e. what caused the error. In this case, the port was shut down because it received an STP BPDU on a port where you said it would never receive one. I like the idea of disabling the port because it brings attention to the fact that a VM has sent an STP BPDU when, most likely, a VM *never* should have sent such a packet. If you have monitoring software like SolarWinds or some such, someone would be alerted to this state immediately and could then investigate. One should be careful with this configuration, though: when the ESXi host learns of its uplink “failure,” it fails the offending VM over to another uplink, where the same thing can happen again, so it’s possible that every port the VMs use for communication gets shut down, causing a Denial of Service. The switch can be configured to bring an errdisabled port back up automatically after a configurable amount of time.
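
The automatic recovery is handled by the switch’s errdisable recovery feature. A minimal sketch for IOS, where the 300-second interval is only an illustrative value and not a recommendation:

```
! Global configuration, not per interface
! Automatically re-enable ports that BPDU Guard put into errdisable
errdisable recovery cause bpduguard
! Seconds to wait before bringing the port back up
errdisable recovery interval 300
```

On most Catalyst platforms, show errdisable recovery lists which causes have recovery enabled and the timer value. Keep in mind that if the VM is still sending BPDUs when the timer expires, the port will simply go down again.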

Another option is to use BPDU Filter, which silently drops STP BPDUs instead of disabling the port. I don’t like this idea as much because you’re not alerted to the fact that VMs are sending BPDUs when they shouldn’t. I believe VMware released this feature with their vSphere 5.1 networking enhancements on all vSwitches, both standard and distributed. Both Chris and Ivan talk about this on their respective blogs.
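
If you wanted to filter on the physical switch rather than on the vSwitch, a sketch on IOS could look like the following, assuming the same example interface used earlier. Note that at the interface level this also stops the switch from sending BPDUs out of the port, so spanning tree is effectively off on that link:

```
interface GigabitEthernet1/10
 ! Silently drop BPDUs received on this port instead of err-disabling it
 spanning-tree bpdufilter enable
```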

Finally, one should configure the switch’s host-facing ports to not negotiate a trunk with the host. ESXi doesn’t speak DTP, so the switch’s attempt to negotiate the trunk only adds delay.

switchport nonegotiate

I’m not sure if this command is different on NX-OS. Leave a comment if you know. I hope this helps a bit. I have to go off and write up a pleasant request to Networks to make the changes. Cheers!
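
Putting the pieces from this post together, a host-facing trunk port on IOS might look something like the sketch below. The interface, description, and VLAN list are just the example values from earlier, and exact syntax varies by platform and release, so adjust to taste:

```
interface GigabitEthernet1/10
 description -=R910 ESX# 1 - Front Side=-
 ! Some platforms require "switchport trunk encapsulation dot1q" before the next line
 switchport mode trunk
 switchport trunk allowed vlan 10,20,30-50,70
 ! ESXi doesn't speak DTP, so don't try to negotiate
 switchport nonegotiate
 ! The trunk keyword makes PortFast apply on a trunking port
 spanning-tree portfast trunk
 ! Err-disable the port if a BPDU ever arrives
 spanning-tree bpduguard enable
```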


5 Comments on “Typical ESXi host-facing switchport configs”

  1. anonymous says:

    “spanning-tree port type edge trunk” for trunks, should work on NX-OS and IOS

  2. Cool, thanks for letting me know. And yeah, as I was looking at it, it seemed to me that the port type edge command came out with NX-OS.

  3. vmmojo says:

    If you don’t have access to VMware Health Analyzer, get with your VMware partner and have them download it from the Partner Portal. This will analyze your environment and generate a Word report and Report Card that you can present. It also recommends fixes and references VMware documents and best practices, so you know where you stand. After making changes, run it again to set a baseline. Be aware that some things it finds may be part of a storage optimization effort such as NetApp VSC, so they are not actually incorrect, but you can document and justify the discrepancy.

  4. Hey Miguel! Thanks for the comment. You’re absolutely right. I’ve only read about the VMware Health Analyzer – I haven’t used it. Not yet, anyways. I was using the VMware Compliance Checker for vSphere 5.0 for some security audits, but I have yet to present my recommendations to my team – I have several things going on. Thanks for the note, though. I’ll look into it.

