Maintenance Windows For Patching A WS2012 Hyper-V Cluster Make No Sense To Me

Here I am, working on a Sunday (when I wrote this post).  It’s not so bad, it’s raining outside, so that rules out going for a walk or doing some photography.  I jumped onto Twitter and saw someone moaning that they had to work on a Sunday to patch their Hyper-V cluster.  To me that’s a WTF! moment.
Windows Server 2012 Failover Clustering gives us Cluster Aware Updating (CAU).  Using this you can patch a Hyper-V cluster without getting manually involved in “maintenance modes” and Live Migration.  The process will:
  1. Download updates from Microsoft, WSUS, or a file share to the hosts (this is extensible to third-party updates, such as those from OEMs).
  2. Put host 1 into maintenance mode – that drains it of virtual machines using Live Migration and … Quick Migration (for VMs marked as LOW priority, by default, which I DO NOT agree with).  You can make it 100% Live Migration so no services suffer an outage during the moves.  The more bandwidth your Live Migration network has, the faster this will be – using 1 Gbps networking for 512 GB RAM hosts is stupid!
  3. Patch and reboot host 1
  4. Wait for host 1 to come back online
  5. Bring host 1 out of maintenance mode
  6. Repeat steps 2-5 for each host
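To put a rough number on the bandwidth point in step 2, here is a back-of-the-envelope sketch in Python. The function name is made up for illustration, and the assumptions are generous: decimal units, a host whose 512 GB of RAM is entirely in use by VMs, and a link running at full line rate. Real drains take longer, so treat the result as a lower bound only.

```python
def min_drain_seconds(ram_gb: float, link_gbps: float) -> float:
    """Lower bound on the time to copy ram_gb of VM memory over a link_gbps link.

    Assumes decimal units (1 GB = 8 gigabits) and a perfectly utilised link.
    Real migrations take longer: memory pages change during the copy and no
    link sustains 100% of line rate.
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    return ram_gb * 8 / link_gbps


for gbps in (1, 10):
    minutes = min_drain_seconds(512, gbps) / 60
    print(f"{gbps:>2} Gbps: at least {minutes:.0f} minutes to drain 512 GB")
```

Under those assumptions, draining one fully loaded 512 GB host needs at least 68 minutes on 1 Gbps, against about 7 minutes on 10 Gbps, and that is per host.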
CAU orchestrates the entire process.  All you’ve got to do is make it happen:
  • You can manually invoke CAU from a Failover Cluster Manager console not running on a cluster member
  • You can set up a special CAU role on the cluster with a patching schedule – it’s a clustered role so it will move just like the VMs
And the process is customizable, e.g. don’t proceed/continue if Y hosts are offline.
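The steps above can be sketched as a toy simulation in Python. This is not how CAU is implemented and it calls no real cluster API; the `Host` class, `drain`, `rolling_update`, and the `max_offline` threshold are all hypothetical names used only to show the shape of the loop: check the cluster, then drain, patch, and resume one host at a time.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    vms: list = field(default_factory=list)
    online: bool = True
    paused: bool = False   # stands in for "maintenance mode"
    patched: bool = False


def drain(host, cluster):
    """Move every VM off `host` onto the least-loaded other available host."""
    targets = [h for h in cluster if h is not host and h.online and not h.paused]
    if not targets:
        raise RuntimeError(f"no host available to take VMs from {host.name}")
    host.paused = True
    while host.vms:
        target = min(targets, key=lambda h: len(h.vms))
        target.vms.append(host.vms.pop())


def rolling_update(cluster, max_offline=0):
    """Patch one host at a time; refuse to start if too many hosts are offline."""
    offline = sum(1 for h in cluster if not h.online)
    if offline > max_offline:
        raise RuntimeError(f"{offline} host(s) offline, limit is {max_offline}")
    for host in cluster:
        if not host.online:
            continue
        drain(host, cluster)   # step 2: maintenance mode, VMs move away
        host.patched = True    # step 3: patch and reboot (simulated)
        host.paused = False    # steps 4-5: back online, out of maintenance mode
    return cluster


cluster = [Host("host1", ["vm1", "vm2"]), Host("host2", ["vm3"]), Host("host3", ["vm4"])]
rolling_update(cluster)
print({h.name: h.vms for h in cluster})
```

The point of the sketch is that at no step does a VM lack a running host: every VM is moved before its host is touched, which is why the patching run does not need an outage window.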
So … let me ask you a question.  If your VMs are moving around using Live Migration, and their services never go offline … why do you need a maintenance window?  Why exactly do you want to be a sad bastard like me and work on a Sunday?
Me, I think I’d do my host patching on a Wednesday morning, at around 11am, in a typical business.  Why?  A few reasons:
  1. Live Migration keeps services online so the business should not notice.
  2. I’m “in” the office already.  If something does go wrong, I am not getting a call at 3am or at the weekend.  I’m sober, awake (as much as I will be, anyway), and able to respond immediately.
  3. Any support services will have their primary staff available.  If I do need to call someone for hardware or software support, they are online, and I’m not dealing with the red-eye team at 3am on a Sunday morning.
  4. I can monitor for exceptions quite happily.
  5. The business doesn’t need to pay me overtime or give me time-in-lieu.
  6. Peak business in IT is at either end of the week (“password reset Monday” and “I didn’t want to bother you” Friday afternoons) so Wednesday seems like a nice balance.
So yeah, I do think that CAU should kill the Hyper-V cluster patching window.
Edit 1:
The same person was on Twitter many hours later, complaining that patching Hyper-V took them “11 hours”.  Really!?!?! Hmm, I think if that was me I’d be asking what I was doing wrong.  Just sayin’  is all …
You can learn more about Windows Server 2012 Hyper-V from the book Windows Server 2012 Hyper-V Installation And Configuration Guide.

9 Comments on Maintenance Windows For Patching A WS2012 Hyper-V Cluster Make No Sense To Me

  1. I’d like to add that CAU works flawlessly. We have been using it on all our Hyper-V clusters since RTM, during business hours, for the reasons Aidan mentioned. Our business is more than 95% virtualized on Windows Server 2012 Hyper-V clusters, so we cannot afford to mess this up. It will always return your cluster to the state it was in when you started, whether the updates fail or not. We have grown to trust it. You can even use it to deploy firmware, BIOS updates, etc.

  2. Miha Pecnik // May 2, 2013 at 4:53 PM

    All well and good, but that’s just for the hosts; one still has VMs, and not all of those are always clustered.

  3. Maxim Batourine // May 3, 2013 at 2:10 AM

    It all looks nice on paper, but in reality I get blue screens from time to time when moving machines between nodes, always the same BSoD: vmswitch.sys.
    That points to the same problem: various versions of the Intel drivers, I guess.
    The issue is that it is not repeatable. But in a 3-node cluster of 300 GB RAM and 40 VMs, it happens regularly. I tried 3 different versions of the drivers and can’t confirm any of them as guilty; the BSoDs still occur sporadically.
    The only common factor is the Intel 520 10G adapters in converged mode: each adapter has its own VM switch, with jumbo frames enabled.
    So that is a huge blocker to using Live Migration; I never know when it is going to crash the node.
    Therefore CAU is limited by what people are comfortable running.
    Simple live migrations are usually fine; massive movements of 100 GB RAM VMs result in a BSoD one time in four.

    • Aidan Finn // May 7, 2013 at 11:41 AM

      Open a support case with your h/w vendor. If you buy the h/w, you should expect it to work correctly.

  4. I have yet to do even a HV 2008R2 project where we do not patch the cluster during the day. Not automated (unless they also have SCorch and use it) but even that works great.

  5. Is there a VMware or KVM (or any other platform, I don’t want to enumerate all of them) equivalent to this? It seems that this is an absolute killer feature for a common scenario. I just heard from a person who moved their pfsense out of a VirtualBox VM and onto a physical machine because the host updates would take his network down all the time.
    BTW. Aidan, we love you. Keep the Hyper-V flame burning!

  6. We use SCCM 2012 and I have multiple Hyper-V host clusters. I have NODE1 in a Sunday device collection and NODE2 in a Tuesday device collection to receive their updates. I still manually put the host in maintenance mode during the day, so when the host machine gets patched and restarted the VMs are already on the other node. From what I’ve read, SCCM 2012 and CAU do not play well together yet. Have you heard otherwise, or do you know of any articles that explain how these two can work together? Do you know of a better way of utilizing just SCCM 2012 to do Hyper-V host cluster updates?
