2014
02.12

Microsoft has released a February update rollup for Windows Server 2012, as well as Windows 8 and Windows RT (8). There’s one included update that Hyper-V or clustering folks should be aware of:

KB2920469 deals with a situation where you cannot change the schedule for CAU self-updating mode in Windows Server 2012 or Windows Server 2012 R2 by using CAU GUI.

Assume that you have a cluster that runs Windows Server 2012 or Windows Server 2012 R2. When you try to change the Cluster-Aware Updating (CAU) schedule by using the GUI, the time for the automatic update schedule is not changed, and the old time remains.

As usual with update rollups, don’t be an IT haemophiliac; delay approval of this update for a month and let the rest of the world be Microsoft’s test lab. If you don’t see any update to this update in one month, then approve and deploy it if you’re happy.

2014
02.11

Remember when HP announced that they were considering selling their PC division? They felt the market was weak and that they should focus more on servers & storage. That killed PC sales for HP, ensured Lenovo became the number one choice in business, and led to the firing of yet another HP CEO. Eventually the non-decision was reversed, but at what we can only guess was a huge cost.

Meg Whitman, the current CEO, seems determined to kill off HP’s enterprise business completely. If you follow me on Twitter then you would have read a tweet I sent out on Feb 7th (while on vacation):

image

HP formally announced (we read rumours over a month ago) that they would be restricting access to firmware updates. You would need to maintain an active support contract on your hardware (a la Cisco) to have the right to download firmware for your servers and storage.

Huh!?!? Sure, HP, this firmware is your “intellectual property” as you asserted in the announcement. But I’m sure that people who bought the hardware with 3 years support expect, you know, support for 3 years. With new Linux variants out every few months, vSphere updated annually, and Windows versions appearing every 12-18 months, we kind of need those firmware updates for a stable platform. If HP doesn’t want to offer me stability, then why the hell would I consider using their out-of-date hardware? Seriously?!?!

It appears that Mary McCoy of HP felt like she needed to defend the boneheaded decision. There is no defence. This is about as stupid as changing the licensing of a virtualization product to be based on maximum VM RAM – and we saw how quickly that course got reversed.

image

HP is truly a Blackberry in the making, but just bigger. Ineptitude is the central quality you need to sit on the board or to be an executive. Cluelessness and a disconnection from reality are desirable skills. In my non-guru hands, 3Par underperforms against Dell Compellent (and much better people than me have proven this) and the Gen8 servers are now doomed.

I used to be an HP advocate. Their server hardware was my first choice every time for a decade. But that all changed with the release of WS2012, when I saw how Dell had taken the lead – or was it that HP stopped competing? And now HP wants to commit seppuku at the hands of the samurai at the top. Bye bye HP.

In other recent news, Lenovo bought the X series server business from IBM. I HATE IBM’s products and support. But I do love what Lenovo has done to the IBM PC business. I wonder how or if they’ll repair the IBM server business to give Dell some competition that HP evidently doesn’t want to offer?

2014
02.11

Dell, in cooperation with Microsoft, announced the release of their supported hardware for Windows Server 2012 R2 Storage Spaces and Scale-Out File Server.

image

Microsoft said:

Dell’s announcement is an exciting development which will help more customers take advantage of the performance and availability of virtualized storage with Windows Server.

Dell went on:

Microsoft’s Storage Spaces, a technology in Windows Server 2012 R2, combined with Dell’s PowerEdge servers and PowerVault storage expansion solutions, can help organizations like hosters and cloud-providers that don’t have the feature-set needs for a separate storage array to deliver advanced, enterprise-class storage capabilities, such as continuous availability and scalability, on affordable industry-standard servers and storage.

The HCL has not been updated yet, but it appears that Dell has two appliances that they are pushing:

  • MD1200
  • MD1220: a 24 drive tray, similar to the DataOn DNS-1640

Dell has also published Deploying Windows Server 2012 R2 Storage Spaces on Dell PowerVault.

image 2 x clustered servers with 2 x MD12xx JBODs

So, one of the big storage companies has blinked. Who is next?

BTW, when I checked out the Irish pricing, the Dell MD1220 was twice the price of the DataOn DNS-1640. After bid price, that’ll be an even match, so it’ll come down to disks for the pricing comparison.

2014
02.11

I am pretty particular about where I store virtual machine files. I STRONGLY DISLIKE the default storage paths of Hyper-V. I use 3 options:

  • Local storage: Virtual hard disks and virtual machine files go into D:\Virtual Machines\<VM Name>
  • CSV: Virtual hard disks and virtual machine files go into C:\ClusterStorage\<CSV Mount Name>\<VM Name>
  • SMB 3.0: Virtual hard disks and virtual machine files go into \\<SMB 3.0 Server Name>\<Share Name>\<VM Name>

Each VM gets its own folder. All files for that VM, including virtual hard disks, go into that folder. I NEVER use the default VM file locations on the C: of the management OS. Using those locations is STUPID. And if you cannot see why … please put down the mouse and hand in your resignation now.
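For what it's worth, the same convention can be scripted with the Hyper-V PowerShell module. Here's a minimal sketch – the D: paths and the VM name are placeholders, not recommendations:

# Point the host defaults at a dedicated volume instead of the C: of the management OS
Set-VMHost -VirtualMachinePath 'D:\Virtual Machines' -VirtualHardDiskPath 'D:\Virtual Machines'

# Create a VM with its configuration under -Path and its VHDX in the VM's own folder
New-VM -Name 'VM01' -MemoryStartupBytes 2GB -Path 'D:\Virtual Machines' `
    -NewVHDPath 'D:\Virtual Machines\VM01\VM01.vhdx' -NewVHDSizeBytes 60GB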

Microsoft has published a KB article to reinforce the fact that there are supported file share path formats. The wording is a bit iffy – see my above examples to see what is supported. Long story short: Place the VM files into a dedicated subfolder for that VM.

2014
01.31

This KB applies to Windows Vista and Windows Server 2008 up to Windows 8 and Windows Server 2012. There’s no mention of Hyper-V, but considering that hosts have lots of NICs, it seemed relevant to me. The scenario is when duplicate friendly names of network adapters are displayed in Windows.

Symptoms

Consider the following scenario:

  • You have one or more network adapters installed on a computer that is running one of the following operating systems:
    • Windows Vista
    • Windows Server 2008
    • Windows 7
    • Windows Server 2008 R2
    • Windows 8
    • Windows Server 2012
  • The display names of the network adapters are changed. For example, the device driver is updated.
  • You add new network adapters to the computer. The new network adapters are of the same make and model as the original network adapters.

In this scenario, duplicate friendly names of the original network adapters are displayed in Device Manager.
For example, you have two network adapters installed on a computer. Before you update the driver, Device Manager shows the following:

  • <Network adapter name>
  • <Network adapter name> #2

After the driver is updated, the names of the network adapters are changed to the following in Device Manager:

  • <Network adapter new name>
  • <Network adapter new name> #2

After you add new network adapters that are of the same make and model, Device Manager shows the following:

  • <Network adapter new name>
  • <Network adapter new name> #2
  • <Network adapter new name> #3
  • <Network adapter new name> #4
  • <Network adapter new name> #5
  • <Network adapter new name> #6
  • <Network adapter new name>
  • <Network adapter new name> #2

In this scenario, Device Manager displays duplicate friendly names of the original network adapters.

A hotfix is available to resolve this issue.
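If you want to untangle which friendly name maps to which physical port (on Windows 8/Windows Server 2012 and later), PowerShell is quicker than Device Manager. A minimal sketch – the adapter names in the rename are made-up examples:

# List adapters with friendly name, driver description, MAC and status
Get-NetAdapter | Sort-Object Name | Format-Table Name, InterfaceDescription, MacAddress, Status -AutoSize

# Optionally give the NICs names that mean something to you
Rename-NetAdapter -Name 'Ethernet 3' -NewName 'Mgmt1'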

2014
01.31

You’d think that after all these years, considering how critical and pervasive IT has become, that employers would understand that:

  • IT is complex: there is more to IT services and infrastructure than clicking Install in an app store.
  • No one person can know everything: Hell, no one person can know all of System Center!!!! (You hear that Microsoft certification process managers?!?!?!)
  • Good people are required: There are lots of “cowboys” out there who can do a shit job, but you need good people to do a good job.
  • Good people are rare, and therefore expensive: You’d think that business people would understand the rules of supply and demand.

But, it appears that lessons have not been learned.

Exhibit A:

Here’s a tweet from earlier today by MVP Didier Van Hoye:

image

Yes, some employer wants a person with little to no experience to decide and plan the future of their IT, and therefore the ability of their business to function. That’s smart … no … that’s moronic.

Exhibit B:

Some company (I haven’t bothered to figure out who yet) in Dublin (Ireland) is recruiting for a cloud consultant. I was spam-emailed last week, I’ve seen adverts on LinkedIn, and I’ve been cold called by head hunters. This employer is seeking a unicorn, bigfoot, or abominable snowman type of creature. They want a consultant who knows EVERYTHING:

  • Hyper-V, vSphere, etc
  • System Center, VMware’s suite, etc
  • Hardware and storage
  • AWS
  • Azure
  • I think there also might have been some networking stuff in the laundry list

And that person will earn the princely sum of €55K per year. Firstly, this person does not exist. Secondly, €55K is the going rate for a mid-level consultant that has a few of those skills.

The world still needs to learn that IT pro staff are not glorified cleaners. It’s not like we can go to college for 2 years to learn how to balance or cook the books and we’re set for the rest of our careers.

2014
01.30

Automatic Virtual Machine Activation (AVMA) is the one Hyper-V feature in the Datacenter edition of Windows Server that you won’t find in the other versions (Standard or Hyper-V Server). This is a technical feature that enables a licensing feature. Hosts that are licensed with the Datacenter edition are entitled to host as many VM installations of Windows Server as you can fit on that licensed physical machine. The complication for larger or hosting companies is activating the installations: firewalls and NVGRE network virtualization (SDN, or software-defined networking) make routing to Microsoft’s clearing house or a KMS a little difficult. So Microsoft allows you to activate the host, and install AVMA keys into the guest OS of your template virtual machines.
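For reference, putting the AVMA key into the guest is a one-liner. A minimal sketch – the placeholder stands for the published AVMA key for whichever edition the guest runs:

# Run inside the guest OS (WS2012 R2 guests on a WS2012 R2 Datacenter host)
slmgr.vbs /ipk <AVMA key for the guest edition>

Once the key is in, activation happens against the host rather than a KMS or Microsoft’s clearing house.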

Microsoft has published a KB article that is related to a funny you might see in your virtual machines that is related to AVMA. 

Symptoms

On a Windows Server 2012 R2 Datacenter Hyper-V host, you may see 2 unknown devices under Other Devices in Device Manager of any virtual machine running an operating system earlier than Windows Server 2012 R2.
If you view the properties of these devices and check driver details, Hardware IDs or Compatible IDs, they will show the following:

  • vmbus{4487b255-b88c-403f-bb51-d1f69cf17f87}
  • vmbus{3375baf4-9e15-4b30-b765-67acb10d607b}
  • vmbus{99221fa0-24ad-11e2-be98-001aa01bbf6e}
  • vmbus{f8e65716-3cb3-4a06-9a60-1889c5cccab5}

Cause

These Virtual Devices (VDev) are provided for Automatic Virtual Machine Activation (AVMA) to communicate with the host. AVMA is only supported on virtual machines running Windows Server 2012 R2 or later versions of operating systems.

According to Microsoft the unknown devices are “harmless and can be ignored”. Hosting companies might want to add this one to their customer knowledgebase. In my experience, this is one of those little things that will create annoying and time-consuming helpdesk calls.

2014
01.29

Microsoft has released Update Rollup 1 for System Center 2012 R2, covering everything except Endpoint Protection and Configuration Manager (they’re almost a separate group).

As usual with update rollups, I would caution you to let others download, install, and test this rollup. Don’t approve it for deployment for another month. And even then, make sure you read each product’s documentation before doing an installation.

Those who lived through URs over the last 12-18 months will remember that System Center had as bad a time with these Update Rollups as Windows Server 2012 did, if not worse.

EDIT:

Update Rollup 5 for System Center 2012 Service Pack 1 was also released. The same advice applies; don’t deploy for 1 month and let others be the guinea pigs.

2014
01.24

I’m in an IE hating kind of mood this week. For no reason, IE11 decided to die on my new laptop on Thursday. That’s forced me back into the hands of Google (I find Firefox to be the worst to use of the 3 big browsers).

What’s happening? When I open IE11 it comes up with “Your last browsing session closed unexpectedly”. And then it locks up. I’ve reset IE and I’ve deleted all items. I’ve disabled all plugins and no joy.

One thing I found was interesting: I reset the home page to the default. IE opens just fine then. But try to browse to a page and it freezes before the page can load. It’s as if the rendering of the page is causing the issue.

A nice suggestion I got via twitter from Tero Alhonen was to:

  1. Disable Browser sync via PC Settings > SkyDrive > Sync Settings
  2. Uninstall IE from Programs & Features > Windows Features
  3. Reboot
  4. Reinstall IE
  5. Reboot
  6. Re-test

Why try this? Because a new test user on the same machine has no issues.

No joy.

I have also tried removing and recreating my user. That, in my opinion, is going too far to fix this issue, and although I could go to some extremes down this path, I am not willing to do so. Why the frak should I? A browser should just damned well work.

BTW, I have found plenty of people on forums having the same issue for months. A few seem to have resolved their issue by installing a new NVIDIA graphics driver. My Yoga’s drivers are up to date and it has Intel HD graphics.

So IE11 is now dead (literally) to me. And MSFT wonders why Win8x isn’t doing well ….

EDIT:

A quick update. Most sites will not load, e.g. Bing or Google. Some (a very few) load slowly, e.g. independent.ie. This leads me to think that there is a rendering issue in IE11 that is specific to my user profile, and was synced in via Skydrive.

EDIT 2:

Tero came back to me with another idea. Disable GPU rendering in the advanced IE settings. I opened up Internet Options in Control Panel and checked Use Software Rendering Instead Of GPU Rendering. I started up IE and pages are opening as expected. Thanks Tero!

image
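If you want to script that setting (or push it out), I believe the checkbox maps to a per-user registry value. A minimal sketch – the value name is my assumption, so verify it on a test profile first:

# Assumption: 'Use software rendering instead of GPU rendering' maps to UseSWRender under the per-user IE settings
Set-ItemProperty -Path 'HKCU:\Software\Microsoft\Internet Explorer\Main' -Name 'UseSWRender' -Type DWord -Value 1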

2014
01.22

I’ve heard several times in various presentations about a whitepaper by Microsoft that discusses how the Windows build team in Microsoft HQ replaced traditional SAN storage (from a certain big name storage company) with Scale-Out File Server architecture based on:

  • Windows Server 2012 R2
  • JBOD
  • Storage Spaces

I searched for this whitepaper time and time again and never found it. Then today I was searching for a different storage paper (which I have yet to find) but I did stumble on the whitepaper with the build team details.

The paper reveals that:

  • The Windows Build Team were using traditional SAN storage
  • They needed 2 petabytes of storage to do 40,000 Windows installations per day
  • 2 PB was enough space for just 5 days of data!!!!
  • A disk failure could affect dozens of teams in Microsoft

They switched to WS2012 R2 with SOFS architectures:

  • 20 x WS2012 R2 clustered file servers provide the SOFS HA architecture with easy manageability.
  • 20 x  JBODs (60 x 3.5″ disk slots) were selected. Do the maths; that’s 20 x 60 x 4 TB = 4800 TB or > 4.6  petabytes!!! Yes, the graphic says they are 3 TB drives but the text in the paper says the disks are 4 TB.
  • There is an aggregate of 80 Gbps of networking to the servers. This is accomplished with 10 Gbps networking – I would guess it is iWARP.

The result of the switch was:

  • Doubling of the storage throughput via SMB 3.0 networking
  • Tripling of the raw storage capacity
  • Lower overall cost – reduced the cost/TB by 33%
  • In conjunction with Windows Server dedupe, they achieved a 5x increase in capacity with a 45-75% de-duplication rate (there’s a sketch of enabling dedupe after this list).
  • This led to data retention going from 5 days to nearly a month.
  • 8 full racks of gear were culled. They reduced the server count by 6x.
  • Each week 720 petabytes of data flows across this network to/from the storage.

image
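If you want to play with the dedupe side of this, enabling it on a volume is trivial. A minimal sketch – E: is a placeholder volume and the Data Deduplication role service must be installed:

# Enable dedupe on the volume and kick off an optimization job
Enable-DedupVolume -Volume 'E:'
Start-DedupJob -Volume 'E:' -Type Optimization

# Check the savings once the job has run
Get-DedupStatus -Volume 'E:' | Format-List SavedSpace, SavingsRate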

Check out the whitepaper to learn more about how Windows Server 2012 R2 storage made all this possible. And then read my content on SMB 3.0 and SOFS here (use the above search control) and on The Petri IT Knowledgebase.

2014
01.20

Microsoft’s Jeff Woolsey just tweeted about a new release of FreeBSD that has built-in support for running as a guest operating system in a Hyper-V virtual machine.

image

The release notes for FreeBSD Release 10.0 say:

Major enhancements in virtualization, including the addition of bhyve(8), virtio(4), and native paravirtualized drivers providing support for FreeBSD as a guest operating system on Microsoft Hyper-V.

According to the FreeBSD Wiki, the following Hyper-V features are added to FreeBSD 10.0:

  • Support for integrated shutdown from Hyper-V console.
  • Support for keeping time synchronized between FreeBSD guest and Hyper-V host.
  • Support for Hyper-V specific IDE and SCSI storage devices.
  • Support for Hyper-V specific network adapter.
  • Live migration with and without static IP migration. Note that to enable static IP migration, administrators will need to include the KVP driver and daemon available in FreeBSD 10.0 ports for Hyper-V.

There are also some workarounds to a couple of issues:

  • The Hyper-V integration services are not activated in i386 release of FreeBSD 10.0 due to an oversight during the development process.
  • Device names may change once the Hyper-V storage integration service is installed on FreeBSD.

Now I know what’s going to happen here, because it happened before when the FreeBSD community said that this support was coming. NO ONE with authority has publicly said that Microsoft supports FreeBSD yet, as far as I know. Until then, please ignore any tweets or press reports that claim that Hyper-V supports FreeBSD. The way I read it is that FreeBSD is supporting being used on Hyper-V, and not the other way around. Look at the wording carefully.

What does that mean? FreeBSD probably works great as a guest OS installed into a Hyper-V VM. But if you have an issue with the guest OS’s stability or performance then take it up with the FreeBSD community because Hyper-V does not support FreeBSD.

That won’t change until there is an announcement on a formal Microsoft blog such as Ben Armstrong’s one, the Virtualization Team blog, the Openness Blog, or the Server & Cloud blog. Otherwise, please ignore any claims that Hyper-V supports FreeBSD, even if it says microsoft.com in the URL – I’m being serious about that. Some of the MSFT bloggers and DPEs got carried away with misinterpreting the previous development announcement. Until you see one of the aforementioned blogs clearly saying that “Hyper-V supports FreeBSD” (in that order, not FreeBSD supports Hyper-V) or it’s posted in the official list of supported guest OSs, then FreeBSD is not a supported guest OS on Hyper-V.

On a positive note, this development does open up some interesting possibilities. A number of appliances are based on FreeBSD, including NetApp (Data ONTAP) who I believe were one of the players behind this support. You’ll also see a number of security and networking solutions in the list. Wouldn’t it be nice to see some Hyper-V appliances appearing?!

2014
01.17

The company I work for is a distributor. We sell Microsoft licensing (retail, OEM, volume licensing), retail and business laptops, Apple, and much more. Every summer I see how busy our Apple sales folks get. Back-to-school is a huge season for them and Apple recognises this by getting product out in time for the shopping spree.

Meanwhile, Microsoft has been doing general availability releases in October, completely missing the season when parents go spend crazy on their precious darlings. Microsoft has effectively halved their seasons by only catching Christmas. Apple gets both the summer buzz and the winter holidays. Sure, Microsoft has gotten lots of biz from €400 laptops in this season, but we know how much that market has been shrinking thanks to the constant IDC headlines.

We know now that “Windows 9” (codename “threshold”) is coming out in April 2015 (or thereabouts). I suspect that is an RTM date. GA will probably be the end of May or start of June. That’s a good thing.

The releases of Windows 8 and Windows 8.1 have shown us that the interval between RTM and GA is not enough for OEMs to get product out onto shelves. We’ve seen October GAs and previously announced stuff has taken 4-6 months to appear in the retail channel where customers can buy it. I suspect there are two factors in the delay:

  • OEMs are slow to build and ship
  • Retailers are focusing on clearing old stock before ordering next generation stock

For Microsoft and the willing consumer that is a lose-lose perfect storm.

With GA possibly in June, that gives the channel a chance to get stock out in the market by August, the sweet spot in the back-to-school market, and even longer for products to mature for the Christmas shopping season (November onwards).

If this is what happens then I would hope that Microsoft sticks to April RTM dates.

2014
01.17

Microsoft announced the general availability of Hyper-V Recovery Manager (HRM) overnight. HRM is an Azure-based subscription service that allows you to manage and orchestrate your Hyper-V Replica disaster recovery between sites.

As you can see in the below diagram, HRM resides in Azure. You have an SCVMM-managed cloud in the primary site.  You have another SCVMM-managed cloud in a secondary site; yes, there is a second SCVMM installation – this probably keeps things simple to be honest. Agents are downloaded from HRM to each SCVMM install to allow both SCVMM installations to integrate with HRM in the cloud. Then you manage everything through a portal. Replication remains direct from the primary site to the secondary site; replication traffic never passes through Azure. Azure/HRM are only used to manage and orchestrate the process.

There is a big focus on failover orchestration in HRM, including the ability to tier and build dependencies, just as real-world applications require.

I’ve not played with the service yet. I’ve sat through multiple demos and read quite a bit. There are nice features but there is one architectural problem that concerns me, and an economic issue that Microsoft can and must fix or else this product will go the way of Google Reader.

Pros

  • Simple: It’s a simple product. There is little to set up (agents) and the orchestration process has a pretty nice GUI. Simple is good in these days of increasing infrastructure & service complexity.
  • Orchestration: You can configure nice and complex orchestration. The nature of this interface appears to lend itself to being quite scalable.
  • Failover: The different kinds of failover, including test, can be performed.

Cons

  • Price: HRM is stupid expensive. I’ve talked to a good few people who knew about the pricing and they all agreed that they wouldn’t pay €11.92/month per virtual machine for a replication orchestration tool. That’s €143.04 per year per VM – just for orchestration!!! Remember that the replication mechanism (Hyper-V Replica) is built into Hyper-V (a free hypervisor) for free.
  • Reliance on System Center: Microsoft touts the possibility of hosting companies using HRM in multi-tenant DR services. Let’s be clear here; the majority of customers that will want a service like this will be small-to-medium enterprises (SMEs). Larger enterprises will either already have their own service or have already shifted everything into public cloud or co-location hosting (where DR should already exist). Those SMEs mostly have been priced out of the System Center market. That means that service providers would be silly to think that they can rely on HRM to orchestrate DR for the majority of their customers – the many small ones that need the most automation because of the high engineering time versus profit ratio.
  • Location! Location! Location!: I need more than a bullet point for this most critical of problems. See below.

I would never rely on a DR failover/orchestration system that resides in a location that is outside of my DR site. I can’t trust that I will have access to that tool. Those of us who were working during 9/11 remember what the Internet was like – yes, even 3,000 miles away in western Europe; the Internet ground to a halt. Imagine a disaster on the scale of 9/11 that drew the same level of immediate media and social interest. Now imagine trying to invoke your business continuity plan (BCP) and logging into the HRM portal. If the Net was stuffed like it was on 9/11 then you would not be able to access the portal and would not be able to start your carefully crafted and tested failover plan. And don’t limit this to just 9/11; consider other scenarios where you just don’t have remote access because ISPs have issues or even the Microsoft data centre has issues.

In my opinion, and I’m not alone here, the failover management tool must reside in the DR site as an on-premise appliance where it can be accessed locally during a disaster. Do not depend on any remote connections during a disaster. Oh; and at least halve the price of HRM.

2014
01.16

I just got called over by a panicking sales person in the office who had been reading the BBC News site. The BBC incorrectly reported that Microsoft was extending support and patching for Windows XP, beyond the end date of April 8th (also the end of support for Office 2003).

Let me repeat this:

Support for Windows XP and Office 2003 ENDS on April 8th, 2014

 

There will be no changes to this, no matter what some clueless intern in the BBC news department might have imagined up.

The story links to an announcement by Microsoft that clarifies that support for Microsoft antivirus products on Windows XP will continue through to July 14, 2015. Some people will continue to use Windows XP beyond the end of support date and Microsoft will be providing them with a minimum level of security. They’ll still be vulnerable to attack via vulnerabilities that will be patched on Windows Vista and newer, but still exist in Windows XP.

Another ZDNet blogger (some beardy dude I never heard of) was complaining that Microsoft will continue to allow people to activate Windows XP. I’m not even going to link to that click-bait article because it doesn’t deserve it. Of course activations will continue. People bought the product, still own it, and still have the legal right to use it.

Geez! There really are only two tech journalists out there: Paul Thurrott and Mary Jo Foley.

2014
01.15

I had an email from Bart Van Der Beek earlier this week questioning an aspect of my kit list for a Hyper-V cluster that is using a SOFS with Storage Spaces for the shared cluster storage. I had added RAM to the SOFS nodes to use for CSV Cache. Bart had talked to some MSFT people who told him that CSV Cache would not be used with tiered storage spaces. He asked if I knew about this. I did not.

So I had the chance to ask Elden Christensen (Failover Clustering PM, TechEd speaker, and author of many of the clustering blog posts, and all around clustering guru) about it tonight. Elden explained that:

  • No, CSV Cache is not used with tiered storage spaces where the heat map is used. This is when the usage of 1 MB blocks is tracked and those blocks are automatically promoted to the hot tier, demoted to the cold tier, or left where they are on a scheduled basis.
  • CSV Cache is used when the heat map is not used and you manually pin entire files to a tier. This would normally only be used in VDI. However, enabling dedupe on that volume will offer better performance than CSV Cache.

So, if you are creating tiered storage spaces in your SOFS, there is no benefit in adding lots of RAM to the SOFS nodes.
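For completeness: where CSV Cache does apply (non-tiered spaces, or pinned tiers without dedupe), it’s a single cluster-wide setting. A minimal sketch, with 512 MB as an arbitrary example value:

# WS2012 R2: BlockCacheSize is in MB and applies to every CSV in the cluster
(Get-Cluster).BlockCacheSize = 512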

Thanks for the heads up, Bart.

2014
01.15

This one has been bugging me for a couple of weeks and I just managed to find a fix. Right-clicking on the start button and pressing Windows+X failed to do anything.

The fix?

1) Open up command prompt and run:

xcopy %SystemDrive%\Users\Default\AppData\Local\Microsoft\Windows\WinX %userprofile%\AppData\Local\Microsoft\Windows\WinX /e /y

2) Log out and log in again. Everything should work as expected.

2014
01.15

The Failover Cluster Validation Wizard can perform a number of storage tests to determine the suitability and supportability of the shared storage of a potential new cluster. This is important for a Hyper-V cluster that will use directly attached shared storage such as a SAN (not SMB 3.0).

Microsoft has published a KB article for when these storage tests on a multi-site (stretch, cross-campus, or metro) failover cluster may not discover all shared LUNs on Windows Server 2012 or Windows Server 2012 R2.

Symptoms

Consider the following scenario:

  • You have a Windows Server 2012 or Windows Server 2012 R2 multi-site failover cluster.
  • A multi-site storage area network (SAN) is configured to have site-to-site mirroring.
  • You use the Validate a Configuration Wizard to run a set of validation tests on the failover cluster.

In this scenario, storage tests may not detect all logical unit numbers (LUNs) as shared LUNs.

Cause

The storage validation tests select only shared LUNs. A LUN is determined to be shared if its disk signature, device identification number (page 0x83), and storage array serial number are the same on all cluster nodes. When you have site-to-site mirroring configured, a LUN in one site (site A) has a mirrored LUN in another site (site B). These LUNs have the same disk signatures and device identification numbers (page 0x83), but the storage array serial numbers are different. Therefore, they are not recognized as shared LUNs.

Resolution

To resolve the issue, run all the cluster validation tests before you configure the site-to-site LUN mirroring.

Note If the validation test is needed afterward for support situations, LUNs that are not selected for storage validation tests are supported by Microsoft and the storage vendor as valid Shared LUNs.
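If you want to script this, the storage tests can be run on their own before the mirroring is configured. A minimal sketch with placeholder node names:

# Run only the storage category of the validation tests against the intended nodes
Test-Cluster -Node 'Node1','Node2' -Include 'Storage'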

2014
01.15

This KB article looks like it affects Windows Server 2012 clusters, so I’m including it in today’s posts. The fix is for when a Stop error 0x9E occurs in Windows Server 2012 or Windows 8.

Symptoms

When you have a cluster node that is running Windows Server 2012, you may encounter a 0x9E Stop error.

Cause

This issue occurs because of lock contention between the memory manager and the Cluster service or a resource monitor when a large file is mapped into system cache.

A hotfix is available to resolve this issue.

2014
01.15

Microsoft has published a hotfix for when OffloadWrite does PrepareForCriticalIo for the whole VHD on a Windows Server 2012 or Windows Server 2012 R2 Hyper-V host.

Symptoms

Consider the following scenario:

  • You have a Hyper-V host that is running Windows Server 2012 or Windows Server 2012 R2.
  • You copy a file in a virtual machine.
  • There is an offload write for Virtual Hard Disk (VHD) in the host.

In this scenario, NTFS in the host would do PrepareForCriticalIo for the whole VHD. This operation may cause the following bad consequences for Cluster Shared Volumes:

  • Redirected I/O may time out.
  • Snapshot creation can be stuck until the offload write is complete.
  • Volume dismount will be blocked by in-flight I/O. This can cause the Physical Disk Resource to be detected as deadlocked if dismount takes more than 3 minutes, or the cluster node to be bug checked if dismount takes more than 20 minutes.

A hotfix is available for this issue.

2014
01.15

Keep in mind that one of the features of Live Migration is that the original virtual machine and/or files are not removed until after the entire process is complete. This “burn no bridges” approach ensures that your virtual machine remains running no matter what happens during the migration process. I’ve seen this personally during the preview releases of WS2012 and WS2012 R2 when stress testing Live Migration and other features.

Microsoft has published a KB article for when Hyper-V storage migration fails when you try to migrate VHD and configuration files to CSV volumes in Windows Server 2012.

Symptoms

Consider the following scenario:

  • You install the Hyper-V role on a Windows Server 2012-based two-node failover cluster.
  • You have two Cluster Shared Volumes (CSV) volumes.
  • You create a virtual machine on a cluster node. The virtual machine has a single 60-gigabyte (GB) fixed-size virtual hard disk (VHD).
    Note The virtual machine is not created on a CSV volume.
  • On the cluster node, the available space on drive C is less than 20 GB.
  • In the Hyper-V Manager console, you try to move the VHD file to one CSV volume, and you try to move the configuration files to the other CSV volume.
    Note The CSV volumes have enough space to hold the VHD file and the configuration files.

In this scenario, the migration operation fails, and you receive an error message that resembles the following:

Migration did not succeed. Not enough disk space at ”.

Note This issue still occurs after you install hotfix 2844296. For more information, see KB2844296 (http://support.microsoft.com/kb/2844296/): Shared Nothing Live Migration fails when you try to migrate a virtual machine to a destination server in Windows Server 2012.

Cause

This issue occurs because the target CSV volumes are incorrectly identified as being a system drive volume instead of multiple separate CSV volumes.

A hotfix is available to resolve this issue.

2014
01.15

The Virtual Machine Management Service (VMMS) runs in user mode in the management OS of every Hyper-V host. It has nothing to do with SCVMM; that’s just an unfortunate similarity in names. The VMMS provides the WMI or management interface to Hyper-V for all management tools, such as PowerShell, Hyper-V Manager, or Failover Cluster Manager.
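If you ever need to poke at it, the service short name is vmms. A minimal sketch – restarting VMMS does not stop running VMs (they live in their own worker processes), but treat production hosts with care anyway:

# The Hyper-V Virtual Machine Management service that the management tools talk to
Get-Service -Name vmms
Restart-Service -Name vmms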

Microsoft published a KB article for when a standard Hyper-V replica is created unexpectedly after you restart the VMMS service in Windows Server 2012.

Symptoms

Consider the following scenario:

  • You have Windows Server 2012-based Hyper-V servers that are running in an environment that has Hyper-V Replica deployed.
  • You set more than two recovery points. 
  • You restart the VMMS service on a replica server, or you restart the replica server.
  • You wait about 5 minutes until the first delta arrives from the primary site.

In this scenario, a standard replica (recovery point) is created unexpectedly. 
Note If the time interval between the latest recovery point and the arrival of the delta is less than 60 minutes, a standard replica should not be created.

Cause

This issue occurs because the VMMS service incorrectly compares the time stamp of the earliest recovery point to the latest delta time stamp. Therefore, the system takes a new snapshot every time the VMMS service is restarted.

A hotfix has been published to resolve this issue. It’s not an issue I’d expect to see too often but the fix is there.

2014
01.15

Time for you to do … exactly nothing for a month, because Microsoft has pushed out another UR for Windows 8, Windows RT, and Windows Server 2012. So make sure this sucker is unapproved and sits like that for a month until some other sucker has tested it for you. If there is a problem (and based on the last 12 months, there probably is one or more) then let that other person find the issue, report it, and Microsoft re-issue a fixed update rollup.

After digging into the contents of the update, we can see that there are networking fixes and a cluster fix. The latter is KB2876391, "0x0000009E" Stop error on cluster nodes in a Windows Server-based multi-node failover cluster environment.

Symptoms

Assume that you have a Windows Server 2008 R2 Service Pack 1 (SP1) or Windows Server 2012-based multi-node failover cluster that uses the Microsoft Device Specific Module (MSDSM) and Microsoft Multipath I/O (MPIO). The following events occur at almost the same time:

  • A new instance of an existing device arrives. Specifically, a new path to an MPIO disk is generated.
  • MSDSM finishes an I/O request. The request was the last outstanding I/O request.

In this scenario, some cluster nodes crash. Additionally, you receive a Stop error message that resembles the following:

STOP: 0x0000009E (parameter1, parameter2, parameter3, parameter4)

Notes

  • This Stop error describes a USER_MODE_HEALTH_MONITOR issue.
  • The parameters in this Stop error message vary, depending on the configuration of the computer.
  • Not all "Stop 0x0000009E" errors are caused by this issue.

Cause

This issue occurs because a remove lock on a logical unit number (LUN) is obtained two times, but only released one time. Therefore, the Plug and Play (PnP) manager cannot remove the device, and then the node crashes.

 

The hotfix is included in the UR. Despite what the Premier Sustained Engineering author wrote, this is not just for “Windows Server 2008 R2 SP1-based multi-node failover cluster environment” but it is also for WS2012.

2014
01.14

Microsoft has released a KB article to confirm a problem where Hyper-V Manager incorrectly reports "Update required" for the Hyper-V integration services in Windows Server 2012 guest operating systems that use SR-IOV on Windows Server 2012 R2 Hyper-V hosts.

Symptoms

Assume that you have a Windows Server 2012 R2-based Hyper-V server. A Windows Server 2012-based guest operating system that has integration services up to date and that uses Single Root I/O Virtualization (SR-IOV) is running on the server. After you restart the guest operating system, Hyper-V Manager incorrectly reports the integration services state of the guest operating system as Update required.

Status

This is a known issue of Windows Server 2012 R2. Except for the status report, there are no negative effects to the Hyper-V system.

Microsoft has confirmed that this is a problem.

You can ignore this annoying warning. I suspect that if the warning status appears in VMM then it is really annoying. But very few of you should be affected because SR-IOV is not needed by many VMs.
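You can check the same status from the host without opening Hyper-V Manager. A minimal sketch:

# IntegrationServicesState is what Hyper-V Manager reports as "Update required"
Get-VM | Select-Object Name, State, IntegrationServicesState, IntegrationServicesVersion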

2014
01.14

I’ve had a number of requests to specify the pieces of a solution where there is a Windows Server 2012 R2 Hyper-V cluster that uses SMB 3.0 to store virtual machines on a Scale-Out File Server with Storage Spaces (JBOD). So that’s what I’m going to try to do with this post. Note that I am not going to bother with pricing:

  • It takes too long to calculate
  • Prices vary from country to country
  • List pricing is usually meaningless; work with a good distributor/reseller and you’ll get a bid/discount price.
  • Depending on where you live in the channel, you might be paying distribution price, trade price, or end-customer price, and that determines how much margin has been added to each component.
  • I’m lazy

Scale-Out File Server

Remember that an SOFS is a cluster that runs a special clustered file server role for application data. A cluster requires shared storage. That shared storage will be one or more Mini-SAS-attached JBOD trays (on the Storage Spaces HCL list) with Storage Spaces supplying the physical disk aggregation and virtualization (normally done by SAN controller software).

On the blade versus rack server question: I always go rack server. I’ve been burned by the limited flexibility and high costs of blades. Sure, you can get 64 blades into a rack … but at what cost!?!?! FlexFabric-like solutions are expensive and, strictly speaking, not supported by Microsoft – not to mention they limit your bandwidth options hugely. The massive data centres that I’ve seen and been in use 1U and 2U rack servers. I like 2U rack servers over 1U because 1U rack servers such as the R420 have only 1 full-height and 1 half-height PCI expansion slot. That half-height slot makes for tricky expansion.

For storage (and more) networking, I’ve elected to go with RDMA networking. Here you have two good choices:

  • iWARP: More affordable and running at 10 GbE – what I’ve illustrated here. Your vendor choice is Chelsio.
  • Infiniband: Amazing speeds (56 Gbps with faster to come) but more expensive. Your vendor choice is Mellanox.

I’ve ruled out RoCE. It’s too damned complicated – just ask Didier Van Hoye (@workinghardinit).

There will be two servers:

  • 2 x Dell R720: Dual Xeon CPU, 6 GB RAM, rail kits, on-board quad port 1 GbE NICs. The dual CPU gives me scalability to handle lots of hosts/clusters. The 4 x 1 GbE NICs are teamed (dynamic load distribution – see the teaming sketch after this list) for management functionality. I’d upgrade the built-in iDRAC Essentials to the Enterprise edition to get the KVM console and virtual media features. A pair of disks in RAID1 configuration are used for the OS in each of the SOFS nodes.
  • 10 x 1 GbE cables: This is to network the 4 x 1 GbE onboard NICs and the iDRAC management port. Who needs KVM when you’ve already bought it in the form of iDRAC.
  • 2 x Chelsio T520-CR: Dual port 10 GbE SFP+ iWARP (RDMA) NICs. These two rNICs are not teamed (not compatible with RDMA). They will reside on different VLANs/subnets for SMB Multichannel (cluster requirement). The role of these NICs is to converge SMB 3.0 storage, and cluster communications. I might even use these networks for backup traffic.
  • 4 x SFP+ cables: These are to connect the two servers to the two SFP+ 10 GbE switches.
  • 2 x LSI 9280-8e Mini-SAS HBAs: These are dual port Mini-SAS adapters that you insert into each server to connect to the JBOD(s). Windows MPIO provides the path failover.
  • 2 x Windows Server Standard Edition: We don’t need virtualization rights on the SOFS nodes. Standard edition includes Failover Clustering.
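The management team on each SOFS node is just native Windows NIC teaming. A minimal sketch, assuming the on-board ports show up as NIC1-NIC4 (the names are placeholders):

# Switch-independent team with dynamic load distribution for the 4 x 1 GbE management NICs
New-NetLbfoTeam -Name 'ManagementTeam' -TeamMembers 'NIC1','NIC2','NIC3','NIC4' `
    -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic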

Regarding the JBODs:

Only use devices on the Microsoft HCL for your version of Windows Server. There are hardware features in these “dumb” JBODs that are required. And the testing process will probably lead to the manufacturer tweaking their hardware.

Note that although “any” dual channel SAS drive can be used, some firmwares are actually better than others. DataOn Storage maintain their own HCL of tested HDDs & SSDs and HBAs. Stick with the list that your JBOD vendor recommends.

How many and what kind of drives do you need? That depends. My example is just that: an example.

How many trays do you need? Enough to hold your required number of drives :D Really though, if I know that I will scale out to fill 3 trays then I will buy those 3 trays up front. Why? Because 3 trays is the minimum required for tray fault tolerance with 2-way mirror virtual disks (LUNs). Simply going from 1 tray to 2 and then 3 won’t do because data does not relocate.

Also remember that if you want tiered storage then there is a minimum number of SSDs (STRONGLY) recommended per tray.
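To show how those tiers get consumed, here’s a minimal sketch of creating one tiered, 2-way mirrored virtual disk. The pool name, friendly names, and sizes are all placeholders, and it assumes a clustered storage pool already exists over the JBOD disks:

# Define the SSD and HDD tiers in the pool
$ssdTier = New-StorageTier -StoragePoolFriendlyName 'Pool1' -FriendlyName 'SSDTier' -MediaType SSD
$hddTier = New-StorageTier -StoragePoolFriendlyName 'Pool1' -FriendlyName 'HDDTier' -MediaType HDD

# Create a 2-way mirrored virtual disk that spans both tiers
New-VirtualDisk -StoragePoolFriendlyName 'Pool1' -FriendlyName 'CSV1' `
    -StorageTiers $ssdTier,$hddTier -StorageTierSizes 100GB,2TB `
    -ResiliencySettingName Mirror -NumberOfDataCopies 2 -WriteCacheSize 1GB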

Regarding using SATA drives: DON’T DO IT! The available interposer solution is strongly discouraged, even by DataOn.  If you really need SSD for tiered storage then you really need to pay (through the nose).

Here’s my EXAMPLE configuration:

  • 3 x DataOn Storage DNS-1640D: 24 x 2.5” disk slots in each 2U tray, each with a blank disk caddy for a dual channel SAS SSD or HDD drive. Each has dual boards for Mini-SAS connectivity (A+B for server 1 and A+B for server 2), and A+B connectivity for tray stacking. There is also dual PSU in each tray.
  • 18 x Mini-SAS cables: These are used to connect the LSI cards in the servers to the JBODs and to daisy chain the trays. The connections are fault tolerant – hence the high number of cables. At least I think 18 cables are required; I’m not looking at a diagram. They’re short cables because the servers are on top of/under the JBOD trays and the entire storage solution is just 10U in height.
  • 12 x STEC S842E400M2 400GB SSD: Go google the price of these for a giggle! These are not your typical (or even “enterprise”) SSD that you’ll stick in a laptop.  I’m putting 4 into each JBOD, the recommended minimum number of SSDs in tiered storage if doing 2-way mirroring.
  • 48 x Seagate ST900MM0026 900 GB 10K SAS HDD: This gives us the bulk of the storage. There are 20 slots free (after the SSDs) in each JBOD and I’ve put 16 disks into each. That gives me loads of capacity and some wiggle room to add more disks of either type.

And that’s the SOFS, servers + JBODs with disks.

Just to remind you: it’s a sample spec. You might have one JBOD, you might have 4, or you might go with the 60 disk slot models. It all depends.

Hyper-V Hosts

My hosting environment will consist of one Hyper-V cluster with 8 nodes. This could be:

  • A few clusters, all sharing the same SOFS
  • One or more clusters with some non-clustered hosts, all sharing the same SOFS
  • Lots of non-clustered hosts, all sharing the same SOFS

One of the benefits of SMB 3.0 storage is that a shared folder is more flexible than a CSV on a SAN LUN. There are more sharing options, and this means that Live Migration can span the traditional boundary of storage without involving Shared-Nothing Live Migration.

Regarding host processors, the L2/L3 cache plays a huge role in performance. Try to get as new a processor as possible. And remember, it’s all Intel or all AMD; do not mix the brands.

There are lots of possible networking designs for these hosts. I’m going to use the design that I’ve implemented in the lab at work, and it’s also one that Microsoft recommends. A pair of rNICs (iWARP) will be used for the storage and cluster networking, residing on the same two VLANs/subnets as the cluster/storage networks that the SOFS nodes are on. Then two other NICs are going to be used for host and VM networking. These two NICs could be 1 GbE or 10 GbE or faster, depending on the needs of your VMs. I’ve got 4 pNICs to play with so I will team them.

  • 8 x Dell R720: Dual Xeon CPU, 256 GB RAM, rail kits, on-board quad port 1 GbE NICs. These are some big hosts. Put lots of RAM in because that’s the cheapest way to scale. CPU is almost never the 1st or even 2nd bottleneck in host capacity. The 4 x 1 GbE NICs are teamed (dynamic load distribution) for VM networking and management functionality. I’d upgrade the built-in iDRAC Essentials to the Enterprise edition to get the KVM console and virtual media features. A pair of disks in RAID1 configuration are used for the management OS.
  • 40 x 1 GbE cables: This is to network the 4 x 1 GbE onboard NICs and the iDRAC management port in each host. Who needs KVM when you’ve already bought it in the form of iDRAC.
  • 8 x Chelsio T520-CR: Dual port 10 GbE SFP+ iWARP (RDMA) NICs. These two rNICs are not teamed (not compatible with RDMA). They will reside on the same two different VLANs/subnets as the SOFS nodes. The role of these NICs is to converge SMB 3.0 storage, SMB 3.0 Live Migration (you gotta see it to believe it!), and cluster communications. I might even use these networks for backup traffic.
  • 16 x SFP+ cables: These are to connect the eight hosts to the two SFP+ 10 GbE switches.
  • 8 x Windows Server Datacenter Edition: The Datacenter edition gives us unlimited rights to install Windows Server into VMs that will run on these licensed hosts, making it the economical choice. Enabling Automatic Virtual Machine Activation in the VMs will simplify VM guest OS activation.

There are no HBAs in the Hyper-V hosts; the storage (SOFS) is accessed via SMB 3.0 over the rNICs.
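Once the hosts are built, it’s worth confirming that SMB 3.0 is really using the rNICs. A minimal sketch, run from a host after generating some storage traffic:

# RDMA should show as enabled on the Chelsio ports
Get-NetAdapterRdma

# SMB Multichannel should list RDMA-capable connections to the SOFS over both subnets
Get-SmbMultichannelConnection
Get-SmbClientNetworkInterface | Where-Object RdmaCapable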

Other Stuff

Hmm, we’re going to need:

  • 2 x SFP+ 10 GbE Switches with DCB support: Data Center Bridging really is required to do QoS of RDMA traffic. You would need PFC (Priority Flow Control) support if using RoCE for RDMA (not recommended – do either iWARP or Infiniband). Each switch needs at least 12 ports – allow for scalability. For example, you might put your backup server on this network.
  • 2 x 1 GbE Switches: You really need a pair of 48 port top-of-rack switches in this design due to the number of 1 GbE ports being used and the need for growth.
  • Rack
  • PDU

And there’s probably other bits. For example, you might run a 2-node cluster for System Center and other management VMs. The nodes would have 32-64 GB RAM each. Those VMs could be stored on the SOFS or even on a JBOD that is directly attached to the 2 nodes with Storage Spaces enabled. You might run a server with lots of disk as your backup server. You might opt to run a pair of 1U servers as physical domain controllers for your infrastructure.

I recently priced up a kit, similar to above. It came in much cheaper than the equivalent blade/SAN configuration, which was a nice surprise. Even better was that the SOFS had 3 times more storage included than the SAN in that pricing!

2014
01.10

This is an odd KB article from Microsoft for Hyper-V. It deals with a virtual machine that is running on Hyper-V Server 2012 or Hyper-V Server 2012 R2. The VM is configured to use the DLC protocol but cannot connect to an SNA host.

Symptoms

You use Microsoft Hyper-V Server 2012 or Microsoft Hyper-V Server 2012 R2 to host a virtual machine such as for Microsoft Host Integration Server 2009. If the virtual machine is configured to use the Data Link Control (DLC) protocol to connect to a Systems Network Architecture (SNA) host such as an IBM Mainframe z/OS system, the connection fails.

Cause

This problem occurs because Hyper-V Server 2012 and Hyper-V Server 2012 R2 do not support 802.3 frame types that do not have a Sub-Network Access Protocol (SNAP) header.

A hotfix is available from Microsoft to resolve this issue.
