2011
04.27

My whitepaper on How to Build a Hyper-V Cluster Using the Microsoft iSCSI Software Target v3.3 proved to be popular, getting over 500 downloads, thanks to many of you linking, retweeting, and so on.

At the time, a TechNet page stated that MPIO would not be supported with iSCSI initiators that were members of a failover cluster. I quoted that page, and I excluded MPIO from the setup. That omission disappointed a lot of people.

Hans Vredevoort (clustering MVP) contacted some of the storage folks in Microsoft to discuss the MPIO/cluster member initiators issue. It turns out that the Microsoft page in question was incorrect. It used to be true, but the v3.3 Software Target does support MPIO with iSCSI initiators that are members of a failover cluster. The whitepaper has been updated with this note, but I have not added configuration steps for MPIO.

5 comments so far

  1. Damnskippy, that is excellent news! Thx for updating us.

    Question tho, you mentioned “it used to be true”, any chance you can specify which versions don’t support it?

    • No idea.

  2. Hi.

    I have your book and I thought I’d share my experiences trying to build the absolute cheapest HA cluster using the Microsoft iSCSI Software Target. I think I have a solution, but I’m not sure if there are any gotchas if I implement it in a production environment.

    I’ve set up a test environment similar to the one you describe in your book and white paper. However, for the shared storage, my solution is to build two servers (which I’ll call “SANs”) using the software target and using another product to mirror the data between the primary and secondary SANs. If the primary SAN goes down, the secondary takes over. Or if you need to do maintenance, you just reverse the roles. The primary SAN will have a 16-member array using fast disks. The secondary array will use fewer, larger, slower, cheaper disks to save money.

    I set up a volume on the primary SAN to hold the shared storage. Mirror that volume to the secondary SAN. Set up identical iSCSI targets on each SAN, and configure each member of the cluster to connect to all of the targets on both SANs. For each mirrored pair of iSCSI targets, only one of them is available at any given time because the replication software locks the secondary mirror. If the primary SAN goes down, the mirror reverses direction but the iSCSI Target on the secondary doesn’t know that its VHD has been unlocked. All that’s required is to restart the iSCSI Target services on both SANs. This causes the iSCSI initiators on each member of the cluster to automatically reconnect and remount the shared storage, this time from the secondary SAN. The cluster doesn’t know the difference.

    You can automate most if not all of these tasks using the task scheduler, so that when it detects a mirror reversal, it automatically restarts the iSCSI target service. However, you can’t reverse the direction of the mirrors without shutting down the VMs first unless you want them to crash. Similarly, a hardware failure on the primary SAN will crash the VMs. However, they’ll reboot from the secondary SAN. So it’s not completely foolproof.
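
    As a rough illustration of that automation, here is a minimal Python sketch of a watchdog that restarts the target service when the mirror role changes. Everything in it is an assumption for illustration: the service name `WinTarget` should be checked against the service list on your own storage servers, and `get_role` is a hypothetical callable that you would have to write against whatever your replication product exposes. It is not how any particular replication product or the Software Target itself handles this.

```python
import subprocess
import time

# Assumption: the service name of the iSCSI Software Target on the storage
# server. Verify it in the Services console before relying on it.
SERVICE_NAME = "WinTarget"


def restart_service(name, run=subprocess.run):
    """Stop and then start a Windows service using the `net` command."""
    # `net stop` exits non-zero if the service is already stopped,
    # so a failure here is tolerated.
    run(["net", "stop", name], check=False)
    # A failed start is a real problem, so let it raise.
    run(["net", "start", name], check=True)


def watch_for_reversal(get_role, on_reversal, polls, interval=5.0, sleep=time.sleep):
    """Poll the mirror role and call on_reversal(new_role) each time it changes.

    get_role is a hypothetical callable returning the current role of this
    server (for example "primary" or "secondary"). How the role is read
    depends entirely on the replication software in use.
    Returns the number of reversals seen during `polls` polls.
    """
    last_role = get_role()
    reversals = 0
    for _ in range(polls):
        sleep(interval)
        current_role = get_role()
        if current_role != last_role:
            on_reversal(current_role)
            reversals += 1
            last_role = current_role
    return reversals
```

    On each storage server you would call something like `watch_for_reversal(read_mirror_role, lambda role: restart_service(SERVICE_NAME), polls=720)` from a scheduled task, where `read_mirror_role` is your own (hypothetical) function. Note that this only covers the service restart; it does nothing to stop the VMs crashing during the switch.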

    Do you see anything wrong with this setup?

    • Sounds innovative. I’d just be worried about what happens to VMs if one of these storage hosts goes down. Is failover instant? Probably not due to some heartbeat in the storage replication/clustering solution. So that may lead to 9E (I think that’s the code) BSODs caused by clustering not having access to the storage resources on the active Hyper-V hosts.

  3. A failure on the primary storage will cause all of the VMs to crash. The failover process is something like this:

    1) Secondary becomes primary
    2) iSCSI target service restarts
    3) iSCSI initiators on cluster reconnect
    4) Cluster reconnects to shared storage
    5) VMs can restart.

    That process takes a good minute or so. There are commercial products that make it seamless, but you pay for it.

    I’ve looked at a lot of systems, but they seem overly confusing and/or expensive. Are there “turnkey” solutions that you’d recommend? We’d be running about 20 VMs and use about 4-5 TB of storage.
