2013
12.11

At work we run a small number of VMs to operate the business.  For our headcount that is actually a lot of VMs, but distribution requires lots of systems for lots of vendors.  I generally have very little to do with our internal IT, but I get involved with some engineering work from time to time.

Two non-clustered hosts (HP DL380 G6) were set up before I joined the company.  I upgraded/migrated those hosts to WS2012 earlier this year (networking = 4 * 1 GbE NIC team with virtualized converged networking for the management OS and Live Migration).

We decided to migrate the non-clustered hosts into a Hyper-V cluster.  This was made affordable by Storage Spaces running on a shared JBOD.  We distribute DataOn, so we went with a single DNS-1640, attached to both servers using the LSI 9207-8e dual-port SAS card.

Yes, we’re doing the small biz option where two Hyper-V hosts are directly connected to a JBOD where Storage Spaces is running.  If we had more than 2 hosts, we would have used the SMB 3.0 architecture of Scale-Out File Server (SOFS).  Here is the process we have followed so far (all going perfectly up to now):

Step 1 – Upgrade RAM

Each host had enough RAM for its solo workload.  In a cluster, a single node must be capable of handling all VMs after a failover.  In our case, we doubled the RAM in each of the two servers.
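The sizing rule above comes down to simple arithmetic, so here is a rough sketch of the check.  The RAM figures, the VM list, and the 4 GB management OS reserve are all made-up illustration values, not our real numbers, and the function name is hypothetical.

```python
def survives_any_single_failure(host_ram_gb, vm_ram_gb, reserve_gb=4):
    """Return True if the remaining hosts can hold every VM after any one host fails.

    reserve_gb is an assumed amount of RAM kept back for each management OS.
    This compares totals only; with more than two nodes it does not prove
    that each individual VM fits on some surviving node.
    """
    total_vm_ram = sum(vm_ram_gb)
    for failed in range(len(host_ram_gb)):
        usable = sum(
            ram - reserve_gb
            for index, ram in enumerate(host_ram_gb)
            if index != failed
        )
        if usable < total_vm_ram:
            return False
    return True


# Hypothetical workload: 48 GB of VM memory in total across both hosts.
vm_ram_gb = [8, 8, 4, 4, 8, 8, 4, 4]

before = survives_any_single_failure([32, 32], vm_ram_gb)  # original RAM (assumed)
after = survives_any_single_failure([64, 64], vm_ram_gb)   # doubled RAM (assumed)
print(before, after)  # False True
```

With two nodes the check is exact: the one surviving host has to carry everything.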

Step 2 – Drain VMs from Host1

Using Shared-Nothing Live Migration, we moved VMs from Host1 to Host2.  This allows us to operate on a host for an extended period without affecting production VMs.

Note that this only worked because we had already upgraded the RAM (step 1) and we had sufficient free disk space in Host2.

Step 3 – Connect Host1

We added an LSI card into Host1.  We racked the JBOD.  And then we connected Host1 to the JBOD, one SAS cable going to port1/module1 in the JBOD, and the other SAS cable going to port1/module2 in the JBOD (for HA).

Host1 was booted up.  I downloaded the drivers, firmware, and BIOS for the adapter from LSI and installed them (never, ever use the drivers that come on the Windows media when an OEM driver is available).

Step 4 – Create Cluster

I installed two Windows features on Host1:

  • Failover Clustering
  • MPIO

I added support for SAS devices in MPIO, which required a reboot.

An additional vNIC called Cluster2 was added to the management OS.  I then renamed the Live Migration network to Cluster1.  QoS was configured so that the VMSwitch has 25% in the default bucket, and each of the 3 vNICs in the management OS has 25%.
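To sanity check the QoS split, here is a rough arithmetic sketch.  It treats the team as a flat 4 Gbps aggregate, which is a simplification, and "Management" is my assumed name for the third vNIC; only Cluster1 and Cluster2 are named above.

```python
def guaranteed_shares(weights, team_gbps):
    """Turn relative minimum-bandwidth weights into Gbps floors under contention."""
    total = sum(weights.values())
    return {name: team_gbps * weight / total for name, weight in weights.items()}


# Weights from the text: 25 for the switch default bucket, 25 per vNIC.
weights = {
    "default bucket (VM traffic)": 25,
    "Management": 25,  # assumed vNIC name
    "Cluster1": 25,
    "Cluster2": 25,
}

shares = guaranteed_shares(weights, team_gbps=4.0)  # 4 * 1 GbE team
for name, gbps in shares.items():
    print(f"{name}: {gbps:.1f} Gbps minimum")
```

These are floors that only matter when the links are congested; any bucket can use more while the others are idle.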

SMB Multichannel constraints were configured for Cluster1 and Cluster2 for all servers.  That’s to control which NICs are used by SMB Multichannel (used by Redirected IO).

I then created a single node cluster and configured it.  Then it was time for more patching from Windows Update.

Step 5 – Hotfixes

I downloaded the recommended updates for WS2012 Hyper-V and Failover Clustering (not found on Windows Update) using a handy PowerShell script.  Then I installed them on Host1 and rebooted it.

Step 6 – Storage Spaces

In Failover Cluster Manager I configured a new storage pool.  We’re still on WS2012 so a single hot spare disk was assigned.  Note that I strongly recommend WS2012 R2 and not assigning a hot spare; parallelized restore is a much faster and better option.

3 virtual disks (LUNs) were created:

  • Witness for the cluster
  • CSV1
  • CSV2

Rule of thumb: create 1 CSV per node in the cluster that is connected by SAS to the Storage Pool.

Step 7 – Configure Cluster Disks

The cluster is still single-node, so configuring a witness disk for quorum will cause alerts.  You can do it, but be aware of the alerts.

Each of the two CSV virtual disks was converted to a CSV, and they were renamed to CSV1 and CSV2, including the mount points.

Step 8 – Test

Using Shared-Nothing Live Migration, a VM was moved to the cluster and placed on a CSV. 

This is where we are now, and we’re observing the performance/health of the new infrastructure.

Step 9 – Shared-Nothing Live Migration From Host2

All of the VMs will be moved from the D: drive of Host2 to the cluster and spread evenly across the two CSVs in the cluster, running on Host1.  This will leave Host2 drained.
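"Spread evenly" can be planned ahead of the move.  Below is a minimal sketch of one way to do it: a greedy largest-first assignment by VM storage size.  The VM names and sizes are invented for illustration, and balancing on capacity alone is an assumption; you might prefer to balance on IOPS instead.

```python
def spread_vms(vm_sizes_gb, csv_names):
    """Assign each VM to the currently least-used CSV, largest VMs first."""
    used = {name: 0 for name in csv_names}
    placement = {}
    for vm, size in sorted(vm_sizes_gb.items(), key=lambda item: item[1], reverse=True):
        target = min(csv_names, key=lambda name: used[name])  # ties go to the first CSV
        placement[vm] = target
        used[target] += size
    return placement, used


# Hypothetical VMs and their total VHD sizes in GB.
vm_sizes_gb = {"vm-a": 200, "vm-b": 120, "vm-c": 80, "vm-d": 60}

placement, used = spread_vms(vm_sizes_gb, ["CSV1", "CSV2"])
print(placement)
print(used)  # {'CSV1': 260, 'CSV2': 200}
```

The resulting plan is just a lookup table to work from when choosing the destination path for each Shared-Nothing Live Migration.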

Remember to reconfigure backups to back up VMs from the cluster!

Step 10 – Finish The Job

We will:

  1. Reconfigure the networking of Host2 as above (I’ve saved the PowerShell)
  2. Insert the LSI card in Host2 and connect it to the JBOD
  3. Install all the LSI drivers & updates on Host2 as we did on Host1
  4. Add the Failover Clustering and MPIO features to Host2
  5. Add Host2 as a node in the cluster
  6. Patch up Host2
  7. Test Live Migration
  8. Plan out VM failover prioritization
  9. Configure Cluster Aware Updating self-updating for lunch time on the second Monday of every month.  In most months that is close to a full month after the previous Patch Tuesday, giving MSFT plenty of time to fix any broken updates (I’m thinking of Cumulative Updates/Update Rollups).
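The patching schedule in the last item is worth checking against a calendar.  Here is a small sketch that works out the second Monday of a month and how many days it falls after the most recent Patch Tuesday (the second Tuesday).  The function names are my own; only the standard library is used.

```python
import datetime as dt


def nth_weekday(year, month, weekday, n):
    """Date of the n-th given weekday in a month (Monday=0 ... Sunday=6)."""
    first = dt.date(year, month, 1)
    offset = (weekday - first.weekday()) % 7
    return first + dt.timedelta(days=offset + 7 * (n - 1))


def latest_patch_tuesday_before(run_date):
    """Most recent second Tuesday that falls strictly before run_date."""
    candidate = nth_weekday(run_date.year, run_date.month, 1, 2)
    if candidate < run_date:
        return candidate
    if run_date.month == 1:
        return nth_weekday(run_date.year - 1, 12, 1, 2)
    return nth_weekday(run_date.year, run_date.month - 1, 1, 2)


def days_since_patch_tuesday(year, month):
    """Gap in days between Patch Tuesday and the second-Monday CAU run."""
    run_date = nth_weekday(year, month, 0, 2)
    return (run_date - latest_patch_tuesday_before(run_date)).days


print(nth_weekday(2013, 12, 0, 2))         # 2013-12-09
print(days_since_patch_tuesday(2013, 12))  # 27
print(days_since_patch_tuesday(2013, 10))  # 6
```

The gap is usually four to five weeks, but when a month starts on a Tuesday (October 2013, for example) the second Monday lands only 6 days after that month's Patch Tuesday.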

And that should be that!

4 comments so far

  1. Hi Aidan,

    Do you have any link for configuring Cluster Aware Updates as you would recommend?

  2. Hey
    Do you know a method to minimize the redirected IOs?
    Something like a script that moves a VM to the host that owns the CSV holding the VM’s storage?

    • You just answered your own question.
