Cisco Nexus Fibre Channel configuration template

Posted: May 23, 2013
I recently had the opportunity to configure native fibre channel in my test lab at work using Nexus 55xx series switches and Cisco’s UCS. What I’ll do in this post is to share my templatized fibre channel configuration in a somewhat ordered way, at least from the Nexus point of view. This test lab was configured with the attitude that it should show off the capabilities of the hardware and software. Concepts included in this initial configuration include NPIV, NPV, SAN port-channels, F_Port trunking, VSANs, device aliases, and of course, standard FC concepts like zones and zonesets.
Let me first share the end-state as of today, what I’ll call Phase I. I’ll explain what my initial plan was for Phase I and, after learning a bit more, what I plan to do for Phase II. Please feel free to correct me in the comments below – I made a lot of mistakes configuring this and I wouldn’t be surprised if there’re a few more in there.
So we’ll start from the top. You can see that each NetApp controller, Controller A and Controller B, is connected to both fabrics, Fabric A and Fabric B, using two 8 Gbps links. If you’re already somewhat familiar with FC, you’ll remember that it’s useful to provide two distinct fabrics from initiators to targets, such as a server to storage, or storage to tape.
You may ask why the storage is cross-connected to each fabric when hosts are generally not cross-connected. I’ll remind you that there are actually two controllers in the diagram. Each controller can be considered an individual host. The storage system is actually a NetApp FAS2240-2 in a dual-controller, HA configuration. Cross-connecting each controller is useful because in case of a failure in a fabric, the multipathing software will sense the failure and move traffic to the other fabric. Multipath software installed on the initiator operating system or hypervisor will discover and monitor each distinct path across each fabric to each controller’s targets.

For example, imagine that each controller is only connected to one fabric, say, Controller A to Fabric A and Controller B to Fabric B. Imagine an ESXi host with FC datastores residing on both Controller A and Controller B. If a fabric fails, such as Fabric Interconnect A or the Nexus 5596, the ESXi host might be forced to take a non-optimal path to its LUN through Controller B. It’s considered a non-optimal path because Controller A hasn’t failed, and thus still owns the LUN, but I/O must traverse Controller B through an HA interconnect to reach Controller A and the associated LUN. This is the best case. A worst case might be that ALUA or multipathing somewhere in the stack is not configured properly and there’s no controller failover because the storage controller hasn’t actually failed – rather a switch in the fabric has failed. In this case, the datastores hosted on Controller A go offline because the host cannot access them.

This is why it’s useful to connect each storage controller to each fabric. Should there be a complete fabric failure, multipathing software in the host will simply move all traffic to the other fabric, which still has an optimum path to the storage.
In the case of a storage controller failing, we assume that some type of high-availability is configured and all disks and associated logical storage units become owned by the remaining, online controller. The host notices the failure of the original storage controller and, upon successful controller failover, will begin accessing the same datastores now hosted by the online controller through an optimized path, possibly on the same fabric.
ISLs and why I don’t need them
Now, here’s one thing I now know I did wrong in this configuration. While attending a NetApp SAN bootcamp for partners this week, I had the opportunity to talk to some experts (though, to be fair, I suspected what I’d learn when I broached the subject). I’m talking about whether or not I need the interswitch link (ISL) between the two Nexus 5500s. Currently, it’s configured as an Extended ISL (EISL) SAN port-channel trunk (it’s “Extended” because it’s carrying multiple VSANs). It’s trunking the two VSANs in the fabric, 100 and 200.
As I stepped back to admire my handy-work in the lab, I realized that I created what I wanted to: two distinct fabrics in two distinct VSANs which gives ESXi hosts in the UCS chassis two distinct paths to each storage controller. But I was bothered by the idea that I was trunking the VSANs across this ISL. Only VSAN 100 exists in Fabric A and only VSAN 200 exists in Fabric B. But with the ISL in place, now each VSAN exists on each Nexus and therefore, each fabric. An example of what the 5596 ISL looks like is shown below, but each switch shows the same thing: each VSAN active on each switch.
In addition to this, I’ve created SAN port-channel trunks from each Nexus to each UCS Fabric Interconnect (operating in End Host Mode). Now, the trunk to the interconnects is OK, and warranted, because I’d like to be able to add more VSANs (read: more security zones, perhaps for multi-tenancy) at a later time. But the idea that was clarified in the SAN bootcamp this week is that ISLs are only needed if initiators and targets sit on either side of the link in each trunked VSAN. In this case, initiators and targets are going north-south. This particular ISL would be useful if I had initiators in both VSANs attached to, say, the 5548 and targets in both VSANs attached to the 5596, i.e. east-west traffic. That is not what’s configured. So in short, this ISL is not needed. I’ll be removing it the next chance I get.
Nexus configuration overview
Included in this post is the actual FC configuration for the Nexus switches. Let me share the overview of that configuration first.
Each switch has the VSANs that run through it defined. As I mentioned above, this configuration has each fabric’s VSAN defined on each Nexus. This is not needed; there doesn’t need to be an ISL or VSAN trunking between fabrics. So the only VSANs defined on each switch, in each fabric, should be those that use that fabric for communication. Seems obvious, I know; leave me alone about it already. So VSAN 100 is defined in Fabric A on the Nexus 5596 and VSAN 200 is defined in Fabric B on the Nexus 5548.
The next configuration item is switch priority. Priority is set per VSAN. Switch priority will define which switch in a VSAN is the principal switch. The principal switch is responsible for assigning domain IDs. Domain IDs uniquely identify each switch in the fabric. The default priority for the Nexus is 128, lower priority has precedence. I wanted to force the election of the 5596 as the principal switch so I configured a priority of 64 for each VSAN.
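As a quick sketch, this is the priority configuration plus the commands I use to check the election result afterward (`show fcdomain` reports whether the local switch is the principal for that VSAN):

```
!force the 5596 to win the principal switch election (default priority is 128)
fcdomain priority 64 vsan 100
fcdomain priority 64 vsan 200
!verify which switch became principal for the VSAN
show fcdomain vsan 100
show fcdomain domain-list vsan 100
```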
As just mentioned, the domain ID of a switch uniquely identifies it on an FC fabric, specifically within a VSAN. When end hosts log into the fabric, they’re assigned FC IDs, and the switch’s domain ID is included in the FC ID, thus uniquely identifying the end host port on the fabric. It’s recommended to statically assign domain IDs per switch. This is likely done as a security precaution. By assigning static domain IDs, one can configure the entire fabric to only allow certain domain IDs. Switches that can’t obtain a dynamically assigned domain ID (some dude walks up and plugs an FC switch into your fabric) are isolated from the fabric. This reminds me, somewhat, of VTP in an Ethernet network: if you don’t have the statically assigned VTP domain name and/or password, you can’t share in the joy of auto-magic VLAN distribution. Though if you’re isolated in an FC fabric, your traffic is going nowhere.
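To make the domain-ID-to-FC-ID relationship concrete: an FC ID is a 24-bit address built from the switch’s domain ID (8 bits), an area (8 bits), and a port (8 bits). So with the 5596 statically assigned domain 10 (0x0a), every device logging in through it receives an FC ID beginning with 0x0a. You can see this on the switch:

```
!the fabric login database shows each logged-in port and its assigned FCID
!(devices behind domain 10 show FCIDs like 0x0a0000)
show flogi database vsan 100
!list every domain ID in the VSAN and which switch is principal
show fcdomain domain-list vsan 100
```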
I configured two-link SAN port-channels between each Nexus and its respective UCS Fabric Interconnect. I also configured these SAN port-channels as trunks so that we can show off how multiple tenants can be provided their own VSAN and thus keep their storage traffic completely segregated in the fabric. As mentioned above, the ISL SAN port-channel is going away, so I won’t mention it any more.
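For reference, these are the commands I lean on to confirm a SAN port-channel trunk came up with the expected member links and VSANs (port-channel 51 and fc2/9 are from my lab numbering; yours will differ):

```
!overall port-channel state and member interfaces
show san-port-channel summary
!operational trunk mode and which trunked VSANs are up
show interface san-port-channel 51 brief
!state of an individual member link
show interface fc2/9 brief
```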
I suspended the default VSAN (VSAN 1) and denied traffic flow between default-zone members. This ensures that members not assigned to an active zone, and thus sitting in the default zone, are not permitted to talk to each other.
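On NX-OS, deny is already the default-zone policy, but it can be stated explicitly per VSAN and verified:

```
!deny traffic between members of the default zone (the NX-OS default behavior)
no zone default-zone permit vsan 100
no zone default-zone permit vsan 200
!confirm the default-zone policy and zoning mode
show zone status vsan 100
```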
FC port initialization
Depending on what the port is plugged into, it will negotiate (or can be statically assigned) an FC port type. Common FC port types include:
- N_Port – These are end device ports, such as an end host HBA port. N_Ports plug into F_Ports.
- F_Port – These are switch ports that plug into end host ports, N_Ports.
- E_Port – An E_Port plugs into another E_Port and is configured in an ISL.
Variations of these ports are also seen, such as TE_Port, which is a trunking E_Port, which is part of an EISL. You’ll also see TF_Ports, which are ports connected to end hosts but are also trunking VSANs. NP_Ports also exist and are used in N_Port virtualization to proxy other N_Port fabric logins.
As most of us can attest, working with strings of hexadecimal numbers is not exactly glorious work. It’s useful to define aliases for these WWNs (in my case, WWPNs) so we don’t have to manage hex strings. The Nexus supports two types of aliases. FC aliases are defined per VSAN. As long as an FC alias is used in the VSAN in which it’s defined, you can reference it all day long. But if you wanted to move that piece of hardware into another VSAN, you’d have to redefine the FC alias in that particular VSAN. Probably not that big of a deal, but still annoying. A better idea is to use device aliases. Device aliases can be used just like FC aliases, except they’re defined once and can forevermore be used in any VSAN without ever having to be redefined. Pretty useful if you ask me. So that’s what I did here.
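A sketch of what a device-alias definition looks like; the alias names and WWPNs below are made up for illustration:

```
device-alias database
  device-alias name ESX01-VHBA-A pwwn 20:00:00:25:b5:0a:00:0f
  device-alias name NTAP-CTRL-A-0C pwwn 50:0a:09:83:8d:53:43:54
device-alias commit
!the aliases can now be referenced in any VSAN without redefinition
show device-alias database
```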
Zones and zonesets
And finally, the meat and potatoes of any FC fabric: zones and zonesets. I might have to rework this a bit. I created ZONESET-1 in Fabric A on the Nexus 5596 and ZONESET-2 in Fabric B on the Nexus 5548. ZONESET-1 only contained zones with initiators and targets from Fabric A, and ZONESET-2 only contained zones with initiators and targets from Fabric B. Because of the ISL between the Nexus switches (and our friend CFS distribution), these zonesets, zones, and device aliases were shared across switches. For example, the 5596 now has both ZONESET-1 and ZONESET-2 defined. As you know, only one zoneset can be active at any one time in any particular VSAN. But each switch shows two zonesets: the active zoneset in its own fabric and a zoneset from the *other* fabric that should never be active on that particular switch. So once I remove the ISL between the Nexus switches, I’ll remove the (alien?) zoneset that shouldn’t be in either switch.
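To tie it together, here’s what a single-initiator, single-target zone and its zoneset look like in Fabric A. The zone and alias names follow my lab’s convention and are illustrative:

```
zone name ESX01-TO-NTAP-A vsan 100
  member device-alias ESX01-VHBA-A
  member device-alias NTAP-CTRL-A-0C
zoneset name ZONESET-1 vsan 100
  member ESX01-TO-NTAP-A
zoneset activate name ZONESET-1 vsan 100
zone commit vsan 100
!verify what's actually enforced
show zoneset active vsan 100
```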
Since I only have ESXi hosts on my UCS blades and no other servers to work with, I wanted to configure a VM in FC NPIV mode so I could show an end-to-end FC configuration. After getting my lab configured, the last part I needed to do was configure the actual VM for FC NPIV and start presenting storage. From what I could find, or couldn’t find, UCS doesn’t support an NPIV-enabled VM. By the time I reached this point, I had already converted the UCS Fabric Interconnects back to their default FC End Host Mode. End Host Mode is also known as NPV mode: rather than acting as a full FC switch, the interconnect proxies its hosts’ fabric logins upstream, and the upstream switch must support NPIV to accept those proxied logins. The interconnects had been in FC Switch Mode because we had our NetApp controllers directly attached to them, which requires such a configuration. So what I have is FC storage presented to the ESXi hosts only.
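For clarity on the switch side, NPIV and NPV are distinct NX-OS features enabled with separate commands. Roughly:

```
!on the upstream switch that accepts proxied logins
!(the NPIV "core" role, e.g. the Nexus 5500s here)
feature npiv
!on an edge switch acting as an N_Port proxy
!(the role UCS End Host Mode plays; note that enabling NPV
!on a Nexus/MDS erases the configuration and reloads the switch)
feature npv
```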
I won’t walk through the UCS configuration, but I’ll include some screenshots. For instance, while in FC Switch Mode, the interconnects didn’t connect to the Nexii (that’s the plural for Nexus. If it’s not, it should be) via FC. Remember, the storage was directly connected via FC to the interconnects and so were the ESXi hosts. All storage traffic stayed local to the interconnects. So I had to create Uplink FC ports and in particular, an uplink SAN port-channel. I did this through the SAN tab and the SAN Cloud configuration. For those not familiar, SAN Cloud on UCS refers to a real SAN fabric, i.e. the interconnects uplink to FC switches. The Storage Cloud, on the other hand, is used when FC targets are connected directly to the interconnects in FC Switch Mode. As you can see in the screenshot to the right, I’ve only defined VSAN 100 trunked to the UCS in Fabric A. Fabric B is just the opposite, with only VSAN 200 defined and trunked. The corresponding configuration for Fabric A FC Port-Channel 51 is also shown.
Besides creating a SAN port-channel for each interconnect, I had to configure each ESXi host’s vHBA to reside on their respective fabrics. This configuration change varies depending on if or how you’re using templates. The end result, though, looks like this:
Cisco Nexus FC configuration template
Now for my benefit and yours, I want to share my somewhat templatized or at least ordered Nexus FC switch configuration. I couldn’t find the equivalent of this anywhere on the webs, so I pored over the SAN Switching Configuration Guide for my version of Nexus, 6.0(2)N1(1).
!enable FC features
feature fcoe
!fport-channel-trunk only needed for Nexus-to-UCS port-channels
feature fport-channel-trunk

!enable FC interfaces. I'm using the N55-M16UP in slot 2 of each Nexus
slot 2
  port 9-16 type fc
copy running-config startup-config
reload

!CFS distribution is only needed if there are multiple switches in the fabric
!it's also enabled by default, but let's make sure
cfs distribute

!configure new VSANs
vsan database
  vsan 100
  vsan 100 name FABRIC-A
  vsan 200
  vsan 200 name FABRIC-B

!configure switch priority
fcdomain priority 64 vsan 100
fcdomain priority 64 vsan 200

!configure static domain IDs (one per switch: 10 on the 5596, 20 on the 5548)
fcdomain domain 10 static vsan 100
fcdomain domain 10 static vsan 200
fcdomain domain 20 static vsan 100
fcdomain domain 20 static vsan 200
fcdomain restart vsan 100
fcdomain restart vsan 200

!configure allowed domain IDs
!distribute command only needed if multiple switches in fabric
fcdomain allowed 10,20 vsan 100
fcdomain allowed 10,20 vsan 200
fcdomain distribute

!configure a SAN port-channel
!Cisco recommends channel mode active
!switchport mode could be F if connecting to an
!End Host Mode UCS Fabric Interconnect
interface san-port-channel <number>
  switchport description <description!>
  channel mode active
  switchport mode E
  switchport trunk mode on
  switchport trunk allowed vsan 100
  switchport trunk allowed vsan add 200
  no shutdown

!assign FC ports to the previously configured SAN port-channel
!again, could be mode F
interface fc<slot>/<port>
  switchport description <description!>
  switchport mode E
  channel-group <number> force

!assign FC interfaces or SAN port-channels to VSANs
vsan database
  vsan <vsan-id> interface san-port-channel <number>

!configure default zone security: suspend the default VSAN
vsan database
  vsan 1 suspend

!configure device aliases
!distribute command only needed if multiple switches in fabric
device-alias distribute
device-alias database
  device-alias name <alias> pwwn <wwpn>
device-alias mode enhanced
device-alias commit

!create zones
zone name <zone-name> vsan <vsan-id>
  member device-alias <alias>
zone commit vsan <vsan-id>

!create zonesets
zoneset name <zoneset-name> vsan <vsan-id>
  member <zone-name>
  member <zone-name>
zoneset distribute full vsan <vsan-id>
zoneset activate name <zoneset-name> vsan <vsan-id>
zone commit vsan <vsan-id>
copy running-config startup-config

!if troubleshooting zone merges across ISLs, try
show port internal info interface san-port-channel <number>

!also useful
show vsan membership
show zone
show zoneset
show device-alias database
show interface san-port-channel <number>
We also have two MDS switches in the lab that I’d like to get up and running. I’ll put these in front of the NetApp, but I’m curious as to how to interconnect them to the Nexii. Should I have some sort of partial-mesh or should I stay with two completely separate fabrics? I’ll have to whiteboard it and look at failure scenarios. Feel free to leave your comments on this.