2011
07.26

In the PowerPoint that I posted yesterday, I mentioned that you should not go overboard with creating CSVs (Cluster Shared Volumes).  In the last two weeks, I’ve heard of several people who have.  I’m not going to play the blame game.  Let’s dig into the technical side of things and figure out what should be done.

In Windows Server 2008 Hyper-V clustering, we did not have a shared disk mechanism like CSV.  Every disk in the cluster was single owner/operator.  Realistically (and as required by VMM 2008), we had to have 1 LUN/cluster disk for each VM.

That went away with CSV in Windows Server 2008 R2.  We can size our storage (IOPS from MAP) and plan our storage (DR replication, backup policy, fault tolerance) accordingly.  The result is you can have lots of VMs and virtual hard disks (VHDs) on a single LUN.  But for some reason, some people are still putting 1 VM, and even 1 VHD, on a CSV.

An example: someone is worried about disk performance, so they spread the VHDs of a single VM across 3 CSVs on the SAN.  What does that gain them?  In reality: nothing.  It is actually a negative.  Let’s look at the first issue:

SAN Disk Grouping is not like Your Daddy’s Server Storage

If you read some of the product guidance on big software publishers’ support sites, you can tell that there is still some confusion out there.  I’m going to use HP EVA lingo because it’s what I know.

If I had a server with internal disks, and wanted to create three mirrored (RAID 1) LUNs, then I would need 6 disks.

image

The first pair would be grouped together to make LUN1 at a desired RAID level.  The second pair would be grouped together to make the second LUN, and so on.  This means that LUN1 is on a completely separate set of spindles to LUN2 and LUN3.  They may or may not share a storage controller.

A lot of software documentation assumes that this is the sort of storage that you’ll be using.  But that’s not the case with a cluster with a hardware SAN. You need to use the storage it provides, and it’s usually nothing like the storage in a server.

By the way, I’m really happy that Hans Vredevoort is away on vacation and probably will miss this post.  He’d pick it to shreds :)

Things are kind of reversed.  You start off by creating a disk group (HP lingo!)  This is a set of disks that will work as a team, and there is often a minimum number required.

image

From there you will create a virtual disk (not a VHD – it’s HP lingo for a LUN in this type of environment).  This is the LUN that you want to create your CSV volume on.  The interesting thing is that each virtual disk in the disk group spans every disk in the disk group.  How that spanning is done depends on the desired RAID level.  RAID 10 will stripe using pairs of disks, and RAID 5 will stripe using all of the disks.  That gives you the usual expected performance hit/benefits of those RAID levels and the expected amount of usable capacity.
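To put rough numbers on the capacity side of that trade-off, here is a minimal Python sketch using the textbook formulas for RAID 10 (half the disks hold mirror copies) and RAID 5 (one disk's worth of parity).  The function name and the 300 GB disk size are made up for illustration, and a real SAN disk group will reserve extra space for sparing and may lay out parity differently, so treat the result as a ballpark only.

```python
def usable_capacity_gb(disks: int, disk_size_gb: float, raid_level: str) -> float:
    """Textbook usable capacity; ignores sparing and vendor-specific overhead."""
    if raid_level == "RAID10":
        # Mirrored pairs: half of the raw capacity is usable.
        if disks < 4 or disks % 2 != 0:
            raise ValueError("RAID 10 needs an even number of disks, 4 or more")
        return disks / 2 * disk_size_gb
    if raid_level == "RAID5":
        # One disk's worth of capacity is consumed by parity.
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_size_gb
    raise ValueError(f"unsupported RAID level: {raid_level}")


# Hypothetical disk group: 6 disks of 300 GB each.
for level in ("RAID10", "RAID5"):
    print(level, usable_capacity_gb(6, 300, level), "GB usable")
```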

In the image below, you can see two virtual disks (LUNs) have been created in the disk group.  The benefit of this approach is that the virtual disks have many more spindles to use.  The sales pitch is that you are getting much better performance than the alternative server internal storage.  Compare LUN1 from above (2 spindles) with vDisk1 below (6 spindles).  More spindles = more speed.

I did say it was a sales pitch.  You’ve got other factors like SAN latency, controller cache/latency, vDisks competing for disk I/O, etc. But most often, the sales pitch holds fairly true.

image

If you think about it, a CSV spread across a lot of disk spindles will have a lot of horsepower.  It should provide excellent storage performance for a VM with multiple VHDs.

A MAP assessment is critical.  I’ve also pointed out in that PowerPoint that customers/implementers are not doing this.  This is the only true way to plan storage and decide between VHD or passthrough disk.  Gut feeling, “experience”, “knowledge of your network” are a bunch of BS.  If I hear someone saying “I just know I need multiple physical disks or passthrough disks” then my BS-ometer starts sending alerts to OpsMgr – can anyone write that management pack for me?

Long story short: a CSV on a SAN with this type of storage offers a lot of I/O horsepower.  Don’t think old school because that’s how you’ve always thought.  Run a MAP assessment to figure out what you really need.

Persistent Reservations

Windows Server 2008 and 2008 R2 Failover Clustering use SCSI-3 persistent reservations (PRs) to access storage.  Each SAN solution has a limit on how many PRs it can support.  You can roughly calculate what you need using:

PRs = Number of Hosts * Number of Storage Channels per Host * Number of CSVs

Let’s do an example.  We have 2 hosts, with 2 iSCSI connections each, with 4 CSVs.  That works out as:

2 [hosts] * 2 [channels] * 4 [CSVs] = 16 PRs
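If you want to play with the numbers, the rough formula is easy to wrap in a few lines of Python.  This is only a sketch of the estimate described here; the function name is made up, and your SAN vendor may count PRs differently.

```python
def persistent_reservations(hosts: int, channels_per_host: int, csvs: int) -> int:
    """Rough PR estimate: hosts * storage channels per host * CSVs."""
    return hosts * channels_per_host * csvs


# The example above: 2 hosts, 2 iSCSI connections each, 4 CSVs.
print(persistent_reservations(hosts=2, channels_per_host=2, csvs=4))  # 16
```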

OK, things get more complicated with some storage solutions, especially modular ones.  Here you really need to consult an expert (and I don’t mean Honest Bob who once sold you a couple of PCs at a nice price).  The key piece may end up being the number of storage channels.  For example, each host may have 2 iSCSI channels, but it maintains connections to each module in the SAN.

Here’s another example.  There is an iSCSI SAN with 2 storage modules.  Once again, we have 2 hosts, with 2 iSCSI connections each, with 4 CSVs.  This now works out as:

2 [hosts] * 4 [channels –> 2 modules * 2 iSCSI connections] * 4 [CSVs] = 32 PRs

Add 2 more storage modules and double the number of CSVs to 8 and suddenly:

2 [hosts] * 8 [channels –> 4 modules * 2 iSCSI connections] * 8 [CSVs] = 128 PRs

Your storage solution may actually calculate PRs using a formula with higher demands.  But the question is: how many PRs can your storage solution handle?  Deploy too many CSVs and/or storage modules and you may find that you have disks disappearing from your cluster.  And that leads to very bad circumstances.
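The modular examples can be sketched in Python too, along with a check against the SAN's limit.  The limit of 100 PRs below is completely made up for illustration (get the real figure from your storage vendor), and the function name is hypothetical.

```python
# Made-up limit for illustration only; ask your storage vendor for the real one.
HYPOTHETICAL_SAN_PR_LIMIT = 100


def modular_prs(hosts: int, connections_per_host: int, modules: int, csvs: int) -> int:
    """Rough PR estimate when each host connection reaches every storage module."""
    channels_per_host = connections_per_host * modules
    return hosts * channels_per_host * csvs


# The two modular examples above: (storage modules, CSVs).
for modules, csvs in [(2, 4), (4, 8)]:
    required = modular_prs(hosts=2, connections_per_host=2, modules=modules, csvs=csvs)
    verdict = "OK" if required <= HYPOTHETICAL_SAN_PR_LIMIT else "over the limit"
    print(f"{modules} modules, {csvs} CSVs: {required} PRs ({verdict})")
```

Doubling both the modules and the CSVs quadruples the PR count, which is why a cluster that was comfortably inside the limit can suddenly find itself outside of it.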

You may find that a storage firmware update increases the number of required PRs.  But eventually you reach a limit that is set by the storage manufacturer.  They obviously cripple the firmware to create a reason to buy the next higher up model.  But that’s not something you want to hear after spending €50K or €100K on a new SAN.

The way to limit your PR requirement is to deploy only the CSVs you need.

Undoing The Damage

If you find yourself in a situation with way too many CSVs, then you can use SCVMM Quick Storage Migration to move VMs onto fewer, larger CSVs, and then remove the empty CSVs.

Recommendations

Slow down to hurry up.  You MUST run an assessment of your pre-virtual environment to understand what storage you need to buy.  You also use this data as a factor for planning CSV design and virtual machine/VHD placement.  Like my old woodwork teacher used to say: “measure twice and cut once”.

Take that performance requirement information and combine it with backup policy (1 CSV backup policy = 1 or more CSVs, 2 CSV backup policies = 2 or more CSVs, etc), fault tolerance (place clustered or load balanced VMs on different CSVs), and DR policy (different storage level VM replication policies require different CSVs).
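As a rough illustration of how those rules combine, here is a minimal Python sketch that works out a lower bound on the number of CSVs.  Everything in it is hypothetical: the VM names, the policy labels, and the `ha_group` field (VMs in the same group are clustered or load balanced with each other, so they should sit on different CSVs).  It deliberately ignores performance and capacity, which is what the assessment data is for.

```python
from collections import Counter, defaultdict
from typing import NamedTuple, Optional


class VM(NamedTuple):
    name: str
    backup_policy: str
    dr_policy: str
    ha_group: Optional[str] = None  # members of one group go on different CSVs


def minimum_csvs(vms):
    """Lower bound on CSVs needed per (backup policy, DR policy) combination."""
    per_combo = defaultdict(Counter)
    for vm in vms:
        groups = per_combo[(vm.backup_policy, vm.dr_policy)]  # registers the combo
        if vm.ha_group is not None:
            groups[vm.ha_group] += 1
    # Each combination needs at least 1 CSV, or as many as its largest HA group.
    return {combo: max(groups.values(), default=1) for combo, groups in per_combo.items()}


# Hypothetical inventory.
vms = [
    VM("web1", "daily", "none", ha_group="web"),
    VM("web2", "daily", "none", ha_group="web"),
    VM("file1", "daily", "none"),
    VM("sql1", "hourly", "replicated"),
]
plan = minimum_csvs(vms)
for combo, count in plan.items():
    print(combo, "needs at least", count, "CSV(s)")
print("Minimum CSVs overall:", sum(plan.values()))
```

The point of a sketch like this is the direction of the logic: policies and fault tolerance set the floor, and the assessment data tells you whether you need to go above it.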

4 comments so far

  1. Very good article. We are in the planning stages of our own new Hyper-V virtualization project for our organization, and this was helpful. I am wondering how node growth should be planned when considering Cluster Shared Volumes, since each host owns/coordinates each CSV. For example, if you start out with 2 hosts and 4 CSVs, and then add an additional host to the cluster, can it access the same CSVs that are already defined and host its own VMs?

    • Steve, each CSV has 1 CSV coordinator. It delegates rights to each other host in the cluster so they can run VMs that are stored on that CSV. The thing you do have to watch out for when you add hosts/CSVs is the usage of persistent reservations and the limit on how many of those PRs the SAN can handle.

  2. Disk IO is so important to any VM implementation. Without access to large and complex Storage Networks, I have only really been able to use server host Direct Attach storage.
    Many times I “feel” (unscientific, I know) that separate LUNs do give me better performance. Although the LUN configured with a significant number of disks is able to access multiple disks, it is still being accessed by multiple VMs. So one advantage negates the other.
    Surely, for best IO with SQL databases in mind and rapid access to data in a smaller business environment, it is best to just allocate physical disks and assign them to a LUN for each VM.

    • You’ll gain maybe 2% over the speed of fixed VHD, and lose access to features like easy host/storage level backup, Live Storage Migration (and Shared Nothing Live Migration), Hyper-V Replica, and on … and on … and on.

      Using passthrough disk = FAIL in my opinion. #2 reason we virtualise (and stats prove it) is flexibility. Passthrough disks (raw device mapping) are not flexible.

      Just follow my advice.
