Options for Managing vSphere Replication Traffic on Cisco UCS

I was recently designing a vSphere Replication and SRM solution for a client, and I stated that we would use static routes on the ESXi hosts.  When asked why, I was able to (1) explain why the default gateway on the management network wouldn’t work and (2) present some options for separating the vSphere Replication traffic in a way that would allow flexibility in throttling its bandwidth usage.
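To illustrate the routing argument, here is a minimal sketch of longest-prefix route selection, assuming the host has a single default gateway that lives on the management network. The subnets, gateway addresses, and vmkernel port names (vmk0, vmk2) are hypothetical and not taken from the client's environment; the `pick_route` helper is illustrative only.

```python
import ipaddress


def pick_route(routes, destination):
    """Return the most specific (longest-prefix) route containing destination."""
    dest = ipaddress.ip_address(destination)
    matches = [r for r in routes if dest in ipaddress.ip_network(r["network"])]
    return max(matches, key=lambda r: ipaddress.ip_network(r["network"]).prefixlen)


# Hypothetical routing table with only the management default gateway.
default_only = [
    {"network": "0.0.0.0/0", "gateway": "10.0.10.1", "vmk": "vmk0"},   # default gateway (management)
    {"network": "10.0.10.0/24", "gateway": None, "vmk": "vmk0"},       # directly connected management subnet
    {"network": "10.0.50.0/24", "gateway": None, "vmk": "vmk2"},       # directly connected replication subnet
]

# Same table plus a static route to the (hypothetical) remote replication subnet.
with_static = default_only + [
    {"network": "10.1.50.0/24", "gateway": "10.0.50.1", "vmk": "vmk2"},
]

remote_replication_target = "10.1.50.20"  # hypothetical address at the DR site

# Without the static route, only the default route matches, so traffic
# leaves through the management interface.
print(pick_route(default_only, remote_replication_target)["vmk"])

# With the static route, the more specific /24 wins and traffic leaves
# through the replication interface.
print(pick_route(with_static, remote_replication_target)["vmk"])
```

Under these assumptions, replication traffic bound for the remote subnet follows the default route out of the management interface until a more specific static route exists. On ESXi 5.1 and later, such a route is typically added with `esxcli network ip route ipv4 add`, supplying the remote network and the gateway on the replication subnet.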

You won’t see Network I/O Control (NIOC) listed here because this particular client didn’t have Enterprise Plus licensing and therefore wasn’t using a vDS.  In addition, this client was using a Fibre Channel SAN on top of Cisco UCS with only a single VIC in each blade.  This configuration doesn’t work well with NIOC because NIOC doesn’t take into account the FC traffic, which shares bandwidth with all the Ethernet traffic NIOC *is* managing.

DR Options for SQL Server in a vSphere Environment

While SQL Server is not one of my core competencies, I have worked with clients to protect their business-critical applications in a VMware environment that uses SRM for DR.  These options rely on either native SQL Server protection schemes or VMware options such as SRM or vSphere Replication.  There are, of course, many third-party options as well, depending on the storage array in use, which I won’t go into here.  While there are usually good, better, and best options, the idea I’d like to get across here is that there are many ways to protect SQL Server.  They can even all be used at the same time.  I’ve had clients with so many SQL Servers that this is essentially what they did: they had to pick and choose how to protect each one based on its relative importance.

SQL Server 2012 AlwaysOn Availability Groups

For the most critical SQL Servers, the image below shows the high-level view of what my clients have used with success.  To handle server failures at the Primary Data Center, there are multiple SQL Server nodes.  AAGs can use either an Active-Active model or an Active-Passive model with regard to where the active database resides.  At the Primary Site, Node 1 can host both an Active and a Passive database.  Node 2 can host an Active and a Passive database as well, working with Node 1 to perform synchronous replication.  Through asynchronous replication, both databases can be replicated to the DR site, where only Passive copies reside.  In the event the Primary Site (Site A) completely fails, Node 3 at the DR site can be brought online.

SRM error – failed to recover datastore. An error occurred during host configuration

I’d like to share an error I was receiving when running test recoveries with SRM 5.1.1 on a NetApp running ONTAP version 8.1.2. The datastores in question were NFS. The error in the SRM report consistently appears in section 4, Create Writable Storage Snapshot, but strangely on different datastores: first datastore 6, then datastore 5, then back to datastore 6. This happened without any changes to the environment between tests. Weird, huh? The exact error in the report is:

“Error – Failed to recover datastore ‘<datastore6>. An error occurred during host configuration.”

