Microsoft made lots of changes with CSV 2.0 in Windows Server 2012. But it seems like that message has not gotten through to people. I’ve responded to quite a few comments here on the blog and I’m seeing stuff on forums. What’s really annoying is that when you tell people that X has changed, they don’t listen.
I would strongly recommend that people take some time (I don’t care about excuses) to watch the TechEd presentation, Cluster Shared Volumes Reborn in Windows Server 2012: Deep Dive, by Rob Hindman and Amitabh Tamhane (Microsoft). There are lots of changes. But I want to focus on the big ones that people repeatedly question.
OK, what are the major changes?
There IS NO Redirected IO in WS2012 CSV Backup
Let me restate that in another way: Windows Server 2012 does not use Redirected IO to back up CSVs.
This has been made possible thanks to substantial changes in how VSS places VMs that are stored on CSV into a quiescent state. The backup agent (VSS Requestor) kicks off a backup request with a list of virtual machines. The Hyper-V Writer identifies the storage location(s) of the VMs’ files. A new component, the CSV Writer, is responsible for coordinating the Hyper-V nodes in the cluster, so that all VMs on a CSV that is being backed up are placed into a quiescent state at the same time. This allows for a single distributed VSS snapshot of each CSV. The provider (hardware, software or system) can then go to work and get the snapshot.
This is much simpler than what CSV did in Windows Server 2008 R2. [The following does not happen in WS2012] There was no CSV Writer. There was no coordination, so Redirected IO was required. The node performing a snapshot needed exclusive access to the volume so all IO went through it for the time being. A lot of people knew that bit up to there. The bit that most people didn’t know was that each node (hosting VMs that were being backed up) took snapshots of each CSV that was being backed up. And that could cause problems.
I’ve heard several times now from people who’ve experienced issues with volumes going offline during backup. There were two causes that I’ve seen, and both were related to a third-party hardware VSS provider:
- Using a hardware VSS provider that did not support CSV
- The rapidly rotating and repeated snapshot process caused chaos in the SAN with the hardware snapshots
But, all that is G-O-N-E when backing up CSV on Windows Server 2012:
- There is no redirected IO
- There is a single VSS snapshot performed
SCSI3 Reservation Starvation Should Go Away
Every node in a Hyper-V cluster used SCSI3 persistent reservations and SCSI3 reservations to connect to CSVs. Every SAN has a finite number of those persistent reservations and reservations. The SCSI3 persistent reservations were a bottleneck. No manufacturer shares that number, and it’s a hell of a lot smaller than you’d expect; we typically find out about it during a support call. To compound this, each host required a number of SCSI3 persistent reservations, and that multiplied based on:
- Number of hosts in the cluster
- Number of CSVs
- Number of storage channels per host (possibly even a multiple of the number of physical HBAs/NICs, depending on the SAN)
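To get a feel for how quickly that multiplication adds up, here is a rough Python sketch. It assumes the simplest possible model, a straight product of the three factors above. The real formula depends on your SAN and isn’t published, so treat the function name, the model, and the figure of 2 storage channels per host as my assumptions, not as anything from Microsoft or a SAN vendor.

```python
def estimated_reservations(hosts: int, csvs: int, channels_per_host: int) -> int:
    """Hypothetical model: one reservation per host, per CSV, per storage channel.

    This is an illustration only; a real SAN may count differently.
    """
    if min(hosts, csvs, channels_per_host) < 0:
        raise ValueError("all inputs must be non-negative")
    return hosts * csvs * channels_per_host


# A small cluster: 3 hosts, 5 CSVs, and (assumed) 2 storage channels per host.
small = estimated_reservations(hosts=3, csvs=5, channels_per_host=2)

# The same CSV and channel counts on a 64 node cluster.
large = estimated_reservations(hosts=64, csvs=5, channels_per_host=2)

print(f"3 hosts:  {small} reservations")
print(f"64 hosts: {large} reservations")
```

Under this assumed model, every extra host, CSV, or channel multiplies the total, which is why even a small deployment can run into the limit on an entry level SAN.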
What happens when you deploy too many nodes, CSVs, or storage channels? CSVs go offline. Yup. The SAN is starved of resources to connect the hosts to the LUNs. I saw this with small deployments with an entry level SAN, 3 hosts, and 5 CSVs. And it ain’t pretty.
Imagine a cluster with 64 nodes!?!?! With Windows Server 2012, each node gets a static key instead of using the legacy persistent reservation multiplication. That means your SAN can support more CSVs and more hosts running Windows Server 2012 than it would have with Windows Server 2008 R2. Note that the static key is assigned when the node is added to the cluster.
You can find the static keys in the registry of your cluster nodes in \HKEY_LOCAL_MACHINE\Cluster\Nodes\<Node Number>\ReserveID (REG_QWORD). You can identify which node number is which host by the NodeName (REG_SZ) value. You can see an example of this below.
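If you’d rather not click through the registry on every node, something like the following Python sketch could list the keys for you. It uses the standard library winreg module and the registry path and value names given above. The function names are my own, and I’m assuming ReserveID comes back as an integer (which is how Python returns REG_QWORD values). Run it on a cluster node, with enough rights to read the Cluster hive.

```python
import sys


def format_reserve_id(value: int) -> str:
    """Render a 64-bit ReserveID as a zero-padded hexadecimal string."""
    return f"0x{value:016X}"


def list_reserve_ids(nodes_path: str = r"Cluster\Nodes") -> dict:
    """Return {NodeName: ReserveID} read from HKEY_LOCAL_MACHINE\\Cluster\\Nodes."""
    import winreg  # Windows only, so it is imported here rather than at the top

    result = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, nodes_path) as nodes:
        subkey_count = winreg.QueryInfoKey(nodes)[0]  # number of node subkeys
        for index in range(subkey_count):
            node_number = winreg.EnumKey(nodes, index)
            with winreg.OpenKey(nodes, node_number) as node:
                name, _ = winreg.QueryValueEx(node, "NodeName")
                reserve_id, _ = winreg.QueryValueEx(node, "ReserveID")
            result[name] = reserve_id
    return result


if __name__ == "__main__" and sys.platform == "win32":
    for node_name, reserve_id in sorted(list_reserve_ids().items()):
        print(f"{node_name}: {format_reserve_id(reserve_id)}")
```

This only reads the values. It doesn’t change anything in the cluster configuration.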
This new system, which replaces persistent reservations, gives you better cluster infrastructure scalability, but it doesn’t eliminate the scalability limits of your SAN.