2013
03.11

Fellow MVP Carsten Rachfahl just retweeted an interesting article on the Ask PFE blog (PFE is Microsoft Premier Field Engineering, a consulting support service offered to customers with lots of money) that discusses best practices for Windows Server 2012 Hyper-V.  A friend of mine is a PFE and I know how deep into the weeds they can get in their jobs.  That means this could be a very interesting article.  I’ve read it.  Most of it I 100% agree with.  A small bit of it I don’t agree with.  Some of it I’d like to expand a bit on.

On Server Core

PFEs work for Microsoft so I expected and got the company line.  As you probably know, I prefer a full install because (a) it’s easier to troubleshoot when things go wrong and (b) third party management and configuration software (such as that from your h/w vendor) often relies on not just a GUI but also the presence of IE on the local machine.  The ability to switch between full, Core, and Minimal UI is not there yet, in my opinion, because it requires a reboot.

I don’t care about numbers of patches, I care about numbers of reboots, which is still going to be around once per month.  And thanks to Live Migration (clusters and SMB 3.0 enabled non-clustered hosts), I don’t even care about the reboots because I’ll patch during the workday with no service downtime.

As for memory: you’ll save a few MB with Core.  When your hosts have 48 GB+ of RAM (all the way up to 4 TB) then a few MB is meaningless.  You might save 4 GB of disk space.  When the smallest LUN I can put in a host for the management OS is 300 GB (that’s the smallest disk you can get delivered from HP these days) then I really couldn’t give a flying monkey about a 6 GB Windows install versus a 12 GB one :)

On BIOS/Firmware/Drivers

100% agree on being up to date.  Some h/w vendors, such as IBM, will screw around with you to delay shipment of a replacement for a dead disk (firmwares, gathering logs, analysis of said logs by support, etc) so minimise the risks.  Didier Van Hoye (MVP) has done some blogging and presenting on how to use Cluster Aware Updating to install firmware/drivers on clustered Dell servers.

On selection of h/w, I’m not alone in recommending that you find a mix of components that you like and are happy with, and stick to them as much as possible.  Not all h/w, drivers, and firmwares are made equal, even by the same manufacturer!  You’ll have a lot of eggs in these baskets and you want these baskets to be well made.

Use of GPO

I like and use this.  I put my hosts, even in the lab, in their own OU and have a GPO just for these hosts.  Some of it is for overrides (e.g. not to force patch installs to my hosts like with other physical servers) and some of it can be for other customizations.  I like the power plan setting idea by PFE.  You could also use this GPO to push out your firewall settings, AV configs, manage services, etc.

Store VM Files On Non-System Drive

This is important for non-HA VMs (typically not on a cluster).  This is to avoid Dynamic VHDs, snapshots (AVHD/AVHDX), and Hyper-V Replica logs (HRL) growing to the point of filling the system drive and rendering the host dead while pausing the VMs.  Do you really want to have to boot the host up off a WinPE USB disk to resolve this issue?  The most common offenders here will be small businesses, especially uneducated field engineers who are deploying their first hosts.

Place the VMs on a dedicated LUN – I don’t care how small the company or host is.  We advise this for a very valid reason!  I don’t care about nor value your “virtualisation experience” on your laptop!
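
A minimal sketch of how you might audit this, assuming you have already gathered the VM storage paths from your hosts by some other means.  The function name and the sample paths are hypothetical, and the check only compares drive letters, so it will wrongly flag volumes that are mounted under a folder on the system drive (CSVs under C:\ClusterStorage, for example).

```python
import ntpath  # Windows path rules, usable from any platform


def on_system_drive(vm_path, system_drive="C:"):
    """Return True if vm_path has the same drive letter as the system drive.

    Assumption: a plain drive-letter comparison is good enough. It is not
    for mount points or CSVs, which live under a C:\\ path but sit on
    another volume.
    """
    drive, _ = ntpath.splitdrive(vm_path)
    return drive.upper() == system_drive.upper()


# Hypothetical paths for illustration only.
vm_paths = [
    r"C:\Users\Public\Documents\Hyper-V\Virtual Hard Disks",
    r"D:\VMs\web01",
]

for path in vm_paths:
    verdict = "MOVE ME" if on_system_drive(path) else "ok"
    print(f"{verdict}: {path}")
```

Anything reported as being on the system drive is a candidate for moving to the dedicated LUN.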

The BIN File

There’s a good reminder there that VMs with the “save state” automatic host shutdown action will maintain a BIN file.  This used to be all VMs.  Now, only those VMs maintain this placeholder file to write the memory to disk.  This file matches the amount of RAM currently assigned to the VM.  VMs with Dynamic Memory enabled will see this file grow and shrink, and you need to account for how big this file can get.

TIP: a host with 96 GB RAM can never assign more than 96 GB RAM, and therefore cannot generate more than 96 GB of BIN file on its storage.  You also cannot have more than X GB of BIN file if your VMs with the “save state” shutdown action have a total of X maximum RAM (dynamic memory setting).
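
The tip above boils down to taking the smaller of two numbers.  Here is a rough sketch of that arithmetic, assuming you know each VM’s maximum RAM and its automatic stop action; the VM class, the function name, and the sample VMs are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    max_ram_gb: int    # Dynamic Memory maximum, or the static RAM amount
    saves_state: bool  # True if the automatic stop action is "save state"


def bin_file_upper_bound_gb(host_ram_gb, vms):
    """Worst-case total BIN file size on this host, in GB.

    Only VMs with the "save state" action keep a BIN file, and the host
    can never assign more RAM than it physically has.
    """
    total_max = sum(vm.max_ram_gb for vm in vms if vm.saves_state)
    return min(host_ram_gb, total_max)


# Hypothetical VMs: dc01 does not save state, so it has no BIN file.
vms = [
    VM("sql01", 32, True),
    VM("web01", 8, True),
    VM("dc01", 4, False),
]

print(bin_file_upper_bound_gb(96, vms))  # 40
```

Here the two “save state” VMs can reach 32 + 8 = 40 GB between them, which is under the 96 GB host limit, so 40 GB is the most BIN file space you need to plan for.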

PAL

I’d never heard of this tool.  Well worth noting – I have heard very interesting stories about the abilities of PFEs to troubleshoot problems based on perfmon metrics alone!

VMQ

There’s much more to VMQ than just enabling it.  BE VERY CAREFUL!  You need to know what you are doing, especially if implementing RSS as well or doing converged fabrics or NIC teaming.

Jumbo Frames

I wouldn’t be so liberal about recommending Jumbo Frames for iSCSI.  Consult your h/w vendor first.

iSCSI and NIC Teaming

Correct: iSCSI NICs should not be NIC teamed.  It’s not supported and it will end badly.

HOWEVER, there is a subtle exception to this in converged fabrics.  Note that the iSCSI virtual NICs in this design are not NIC teamed, and MPIO is used instead.  The actual NIC team is abstracted beneath the virtual switch.  But you should still check with your SAN manufacturer for support of this option.

Recommended Networking on Hosts

There is something subtle here that most are missing.

1) You only need iSCSI if you are using iSCSI.  That should seem obvious to everyone … but there are always a few people …

2) Note the poster talks about the recommended number of networks.  They are not talking about the recommended number of physical NICs.  I can quite happily create these networks using a single 10 GbE NIC.  See converged fabrics.

Dynamic disks

I like that they recommend fixed VHD/X files for production.  That’s what I recommend.  Yes, Microsoft are back on the “Dynamic VHDs are just as good” bandwagon, just as they were with W2008 R2.  And many of us found that fragmentation caused read performance issues, particularly for relational databases.

BTW, there is a near religious split in the MVP world over Dynamic versus Fixed VHD/X.  Some of the optimisations in VHDX (TRIM and UNMAP) muddy the waters, but I always come back to fragmentation.  Storage (particularly for databases) only ever grows, and tiny growth increments lead to fragmentation.  Fragmentation leads to read performance issues, and that slows down queries and user interaction with applications.  And that leads to helldesk calls.

As for passthrough disks.  I hate passthrough disks.  If you find an engineer or consultant who says you should use passthrough disks for scalability or performance, then I want you to do this:

Kick them in the balls.  Repeatedly.

Fixed VHDX will run (read and write) at nearly the same speed as the underlying physical disk.  There will be contention across the physical spindles on your storage.  More spindles = more IOPS.  Creating a passthrough disk on the same disk group as a CSV is pointless and shows how dumb the engineer really is.  And VHDX scales out to 64 TB.  Few people need virtual LUNs bigger than 64 TB.

Page File

The PFE blog tells us to set the paging file to 4 GB.  That is my advice … for W2008 and W2008 R2 Hyper-V.  However, we have been told not to do this for WS2012 Hyper-V.  It is intelligent enough to figure out how to manage its own paging file.

Management OS Memory Reserve

The PFE blog tells us to configure the MemoryReserve registry key.  I also used to tell people to do this on W2008 R2 to reserve memory on the host against the needs of Dynamic Memory because the default reservation algorithm might not do enough.  We are told not to use MemoryReserve in WS2012 Hyper-V unless Microsoft Support instructs you specifically to do otherwise.  The memory management has changed under the hood and the default reservation algorithm should be enough.

Integration Services

I need to disagree with the following:

Enlightened OS’s (Server 2008 or higher, Windows 7 or higher) don’t need IS installed manually.

Yes, they contain Hyper-V ICs … as they were at the release of the media … YEARS AGO.  Even the built-in ICs in WS2012 and Windows 8 are already out of date (a Windows Update late last year brought an update).  You should always update the ICs to (a) have bug/security fixes and (b) gain access to new features.  This can be painful if you are doing it manually (it requires a VM reboot).  This is why I like ConfigMgr: I can distribute the new ICs as a custom update or as a piece of software, and schedule the install/reboot during a maintenance window (possibly configured in my collections).

EDIT#1

The post author, Roger Osborne (PFE), and I have chatted offline.  I’ve also reached out to the product group to get advice on the paging file and MemoryReserve.  The last we (authors of the new book) had heard from Redmond, what I posted above was correct.

18 comments so far

  1. Thanks for the timely article – I was just reading through the article and wondering how much I should trust their recommendations.

  2. What is your opinion on AV on hosts? My feeling is no but of course security guys at work are going “THAT’S HORRIBLE, YOU’RE GOING TO GET THE ENTIRE COMPANY INFECTED!!”

    • I personally hate AV on hosts. If the exceptions are removed, for any reason (manual error, bad program update, etc), then you can end up in a world of hurt. Hosts should have firewall up and not be used as file servers, for browsing, etc. Local admin/login should be limited entirely to troubleshooting. Therefore no need for AV.

    • I second Aidan’s sentiment. Frankly we’d have a cluster down incident once/year if we ran AV on the cluster hosts due to bugs/false positives (McAfee in our case, ’nough said). Lock those hosts down, keep ’em patched and manage according to their importance.

  3. Great post, btw I have a good reason for core servers: If you have a core server most admins won’t touch it and that’s good. Otherwise you may end up with an iSCSI Target Server installed on the Hyper-V host… ;)

  4. Thanks for the nice post!
    I totally agree with Thomas. Having a GUI enabled host invites all the “they don’t know what they’re doing”-admins to browse around all available Management Tools. If Server 2012 had the Windows Store they’d install pinball or whatever other types of craplications some day. Treating the host as if it were an ESX or XenServer host gives you additional protection from dumb sysadmins. Sometimes it’s my final task to remove the GUI before handing over the host to the customer ;-)

    • Just like Domain Admins, not everyone should have admin rights on your hosts. Use OU, GPO, etc, and limit the local admins and log on rights for your hosts. And if an admin on the customer site fraks up, then tell their boss to stop hiring morons, and you’ll stop having to come in, bail their asses out, and invoice them for lots of money :)

  5. How about pagefiles in the VMs ? I’ve seen different recommendations. My personal experience: Simply don’t have one IF you’re using Dynamic Memory and can do without a crashdump. I set Dynamic memory to 50% buffer, and the pagefile never gets used. I tried putting it on a different drive, etc. but what’s the use if it never gets used anyway. What’s your best experience regarding this ?

    • The page file in the VM is an it-depends situation. If using Dynamic Memory then that page file (if system managed) will grow so the drive storing it must be large enough to accommodate this. I wouldn’t have some rule that says the buffer must be 50%. That buffer is used by the guest OS for file caching unless pressure spikes drastically. Some apps won’t use it, e.g. SQL has a recommendation of setting the buffer to the minimum of 5%.

      • Are you able to advise or point to MS best practices for pagefiles on a VM, specifically a SQL server? We are getting grief from our SharePoint and SQL admins wanting us to add a 70 odd GB disk for a pagefile (32GB memory VM) and we cannot find anything definitive on this – worse, we keep finding conflicting recommendations on different MS sites/blogs.
        We do not want to use a large pagefile on a VM no matter what it is running.
        We are running Server 2008 R2 for both host and guests at present (2012 migration in the pipeline).

  6. Great post, thank you. Could you please explain in a few words if there is a way to manually update Integration Components on the host OS? Just to be clear, I know how to update Integration Components within a virtual machine; what I don’t know is where/how I can find the latest version of Integration Components and download them to the Hyper-V host server so that I can deploy them to virtual machines from there.

    • This is done for you either by a Windows update or a service pack.

      • Thanks. I thought that there was a way to do this manually.

        • There is no need for a manual update of the ICs on the host. They do nothing for the host itself – they are an addition to enlightened guest OSs. Updates to functionality on the host enable new features in the guest OS ICs. Those updates include the updated ICs that are stored on the management OS. Therefore, no manual update required if you’re patching/service packing your hosts.

          • Thanks once again Aidan. I wasn’t clear enough – we host VMs for our customers and according to best practices we remove the virtual DVD drive from each VM. Basically, in order to update IC on a VM, I need 2 reboots – one to attach the virtual DVD, and another one to update IC. Guys at MS should have thought a little bit more about this scenario I guess…anyway, I hope there is a way around the virtual DVD in order to update IC in a VM.

  7. The podcast Runas Radio has some interviews with the creator of PAL.

  8. As for passthrough disks. I hate passthrough disks. If you find an engineer or consultant who says you should use passthrough disks for scalability or performance, then I want you to do this:

    Kick them in the balls. Repeatedly.

    Classic :)
