I love it when I read about someone saying “my virtualisation solution supports more processors and more memory in a host than yours”. It’s like arguing over who’s a better captain: Kirk or Picard? By the way, it’s Janeway.
In my experience, I’ve yet to see a host with more than 200 GB RAM or 4 sockets (physical processors), and those have been the rare ones. Most have been 1 or 2 CPUs with 32-48 GB RAM, and often less than that.
For me, sizing a host comes down to a few things that need to be balanced:
- How much physical resource (such as RAM, IOPS, storage bandwidth, fault tolerance, CPU, etc) do I need? An assessment or a proof of concept will help with this. Failing that, do some googling for a reference architecture (which will give you a guesstimate).
- What will this stuff cost me to buy? Too many people get caught up on this one. I’ll come back to this in a moment.
- What will this stuff cost me to own? Ah, the forgotten element of the equation!
Purchase cost is only the starting point of the cost of owning your shiny new piece of kit. These things can cost as much to run over 3 years as they do to purchase. But I hardly ever hear of anyone trying to figure that cost out, ask about it, or include it in their budgeting. That €10,000 server can cost you a total of €20,000 over 3 years. Build a 10 node cluster and the numbers get big pretty quickly. The ownership cost is complicated, but some of the big elements are host licensing, rack space, and the ever-increasing cost of electricity.
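To make that point concrete, here’s a rough sketch in Python of the kind of sum involved. The per-kWh rate, rack charge, licensing cost, and average wattage below are made-up placeholder figures for illustration, not real quotes:

```python
def three_year_tco(purchase_eur, avg_watts, eur_per_kwh=0.15,
                   rack_eur_per_year=800.0, licensing_eur_per_year=2000.0,
                   years=3):
    """Very rough TCO: purchase + electricity + rack space + host licensing."""
    hours = years * 365 * 24                       # ignore leap years for a rough sum
    power_eur = (avg_watts / 1000.0) * hours * eur_per_kwh
    return purchase_eur + power_eur + years * (rack_eur_per_year + licensing_eur_per_year)

# That EUR 10,000 server, drawing an average of 600 W (hypothetical figure):
print(round(three_year_tco(10_000, 600)))   # roughly EUR 20,765 over 3 years
```

Plug in your own rates and the “it costs as much to run as to buy” claim stops being abstract very quickly.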
OK, let’s assume we’ve done a sizing process and we need host fault tolerance with 40+ vCPUs and 400 GB of RAM for the VMs. Storage will be courtesy of a 10 Gb iSCSI SAN. How do you size those hosts? Do you get 2 big old beasts with enough capacity for all the VMs? Or do you get lots of machines stocked full of 4 GB DIMMs?
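Here’s one way to sketch that sizing decision. This is my own illustration, not a formal method: the 12:1 vCPU-to-core ratio and the 4 GB parent-partition reserve are assumptions you’d tune for your own environment:

```python
import math

def hosts_needed(vm_ram_gb, vm_vcpus, host_ram_gb, host_cores,
                 vcpus_per_core=12, parent_ram_gb=4):
    """How many hosts for a VM load if the cluster must survive one host failure (N+1)?"""
    usable_ram = host_ram_gb - parent_ram_gb              # reserve RAM for the parent partition
    by_ram = math.ceil(vm_ram_gb / usable_ram)            # hosts needed for the RAM load
    by_cpu = math.ceil(vm_vcpus / (host_cores * vcpus_per_core))  # hosts needed for vCPUs
    return max(by_ram, by_cpu) + 1                        # +1 host for fault tolerance

# 400 GB of VM RAM and 48 vCPUs on 2-socket, 8-core, 192 GB hosts:
print(hosts_needed(400, 48, 192, 16))   # 4 hosts
```

Notice that with today’s high core counts, RAM is almost always the binding constraint, not CPU.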
Things got a little more complicated in the last 12-18 months because hardware manufacturers are giving us machines that can support 24, 32 or more DIMMs in a single server. One machine (an HP DL585 G7) can take 4 * 12 core AMD processors (48 cores, giving you enough vCPU capacity to exceed the supported maximums of Hyper-V with the new W2008 R2 SP1 ratio of 12:1 vCPUs to cores) and 512 GB RAM using 16 GB DIMMs. But here’s the catch: how much does that beastie cost? A high end CPU can cost around €1,200. And pricing for DIMMs is not linear; in other words, a 16 GB DIMM costs a good deal more than 4 * 4 GB DIMMs.
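The non-linearity is easy to see if you work out cost per GB. The prices below are invented to illustrate the shape of the curve, not real HP pricing:

```python
# Hypothetical EUR prices per DIMM, chosen only to show that bigger
# DIMMs cost disproportionately more per GB.
dimm_prices_eur = {4: 120.0, 8: 300.0, 16: 900.0}

for size_gb in sorted(dimm_prices_eur):
    per_gb = dimm_prices_eur[size_gb] / size_gb
    print(f"{size_gb:>2} GB DIMM: EUR {per_gb:.2f}/GB")
```

With numbers anything like these, fully populating a box with 16 GB DIMMs carries a hefty per-GB premium over the same capacity in smaller DIMMs.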
On the plus side, the big beastie lets you minimise your power consumption. If carbon footprint is your primary objective then this is your puppy! Or is it? Don’t fewer servers mean less power?
OK, so the big beast is expensive. What about going for something that uses 4 GB DIMMs? Typically that’ll mean a 2 CPU server with 96 GB of RAM (96 GB is the new 32 GB).
This does mean that you’re using more economical components. But it has an interesting effect on power costs: they go up. You’re using more CPUs, more power supplies, more rack space, more networking, and the costs go on and on.
So where is the sweet spot? I’ve done some very rough sums using the Irish retail prices of HP servers and components, combined with the HP power calculator and Irish electricity prices. I took the hardware costs and the power costs over 3 years to create a total cost of owning a host server solution. Then I took the above requirements and sized and priced up 4 different solutions: the big iron servers, the budget spec servers, and a couple of points in between.
| Solution | Bid Price Hardware (80%) | Total Cost (3 Years) |
|---|---|---|
| 3 * DL385 G7, 2 * 12 Core, 256 GB RAM | | |
| 4 * DL385 G7, 2 * 8 Core, 192 GB RAM | | |
| 6 * DL385 G7, 2 * 8 Core, 96 GB RAM | | |
| 2 * DL585 G7, 4 * 12 Core, 512 GB RAM | | |
A few notes on the pricing first. I took retail pricing from the HP Ireland site and assumed a 20% discount for the bid price, which is pretty conservative. The power costs used Irish retail power rates (all I had available to me). I did not include rack space or network costs (more servers equals more of both, thus driving up prices). Each server had additional CPUs (fastest available), an extra dual port 1 Gb NIC, an extra dual port 10 Gb NIC (iSCSI), and 2 * 300 GB SAS drives.
So what was the result? The big iron DL585 boxes were not the cheapest to power; in fact, they came in third. I was a little surprised by this. I guess all those 16 GB DIMMs and 4 CPUs require a lot of cooling. There was no low-power 16 GB DIMM; I used low-power 4 GB and 8 GB DIMMs in the 2 middle specifications.
The DL385 G7 seemed to be the way to go then. I picked out models that came with the fastest of the AMD CPUs that were available. I then tweaked the choice of memory, thus increasing/decreasing the number of hosts required for the VM RAM load, and further tweaked the CPU cores that were used (8 or 12) to match requirements.
The “budget” hardware purchase using 4 GB DIMMs came with a sting in the tail. It was the most expensive solution to power, simply because it requires 6 servers instead of 2, 3 or 4. And the purchase price was not actually budget at all; it was the second most expensive.
OK, the DL385 G7 is a virtualisation server. Why not spec it according to the maximums, using 12 core CPUs and 16 GB DIMMs? This gave me a 3 node cluster. It was the cheapest solution to power, which the greener computing fans will be happy to hear. The purchase price was the second lowest, which is good news. But over 3 years the total cost for this solution came in second. Maybe it would do better in a company where servers stay in production for longer, but virtualisation makes it easier to change hardware every 3 years, and newer hardware tends to be cheaper to power and offer greater density.
Finally I found the sweet spot. I used the DL385 G7, loaded it with 8 Core CPUs and fully populated it with 8 GB DIMMs, giving me 192 GB RAM per host. This 4 node cluster came with the lowest total purchase price. The CPU switch and the change from 16 GB to 8 GB DIMMs made a huge dent, despite requiring an extra chassis. The power cost was the second highest, but not by much.
So what do I make of all this? I say it in the book, and I find myself saying it several times a week when talking to people. Your business and technology requirements should drive every decision you make. If you work for a company that must have a greener image then you’ll pay a little extra for the solution with the smallest footprint. If you’re concerned about rack space then you’ll take the solution that requires the fewest Us. If you are worried about fault tolerance then you’ll increase the cluster size to spread the load a little more. In my example, it appears that the sweet spot is a solution somewhere between the extremes, but built from regular server models.
My advice to you is to open up Excel, get the various specifications, get the costs, and use a manufacturer’s power calculator to figure out what this stuff will cost you to power. You’ll probably need someone from Accounts to give you rack space, power, network, etc. costs – or help you calculate them. Don’t just pick out some arbitrary specification. And to complicate things: bid pricing (which you should be getting) will always change the equation, as will the inevitable price/model changes over the following 3 years. And try memory configurations beyond the ones I’ve done; there may be more possibilities that I haven’t calculated.
This blog post is the property of Aidan Finn (@joe_elway / http://www.aidanfinn.com) and may not be reused in any manner without prior consent of Aidan Finn. You may quote one paragraph from this blog post if you link to the original blog post.