In this post I want to show you the amazing effect that Write-Back Cache can have on the write performance of Windows Server 2012 R2 Storage Spaces. But before I do, let’s fill in some gaps.
Background on Storage Spaces Write-Back Cache
Hyper-V, like many other applications and services, does something called write-through. In other words, it bypasses the write caches of your physical storage. This is to avoid corruption. Keep this in mind while I move on.
In WS2012 R2, Storage Spaces introduces tiered storage. This allows us to mix one tier of HDD (giving us bulk capacity) with one tier of SSD (giving us performance). Normally a heat map process runs at 1am (it is a scheduled task, and therefore customisable) and moves 1 MB slices of files to the hot SSD tier or to the cold HDD tier, based on demand. You can also pin entire files (maybe a VDI golden image) to the hot tier.
In addition, WS2012 R2 gives us something called Write-Back Cache (WBC). Think about this … SSD gives us really fast write speeds. Write caches are there to improve write performance. Some applications use write-through to avoid storage caches because they need the acknowledgement to mean that the write really went to disk.
What if an abnormal increase in write activity led to the virtual disk (a LUN in Storage Spaces) using its allocated SSD tier to absorb that spike, and then demoting the data to the HDD tier later on if the slices are measured as cold?
That’s exactly what WBC, a feature of Storage Spaces with tiered storage, does. A Storage Spaces tiered virtual disk will use the SSD tier to accommodate extra write activity. The SSD tier increases the available write capacity until the spike decreases and things go back to normal. We get the effect of a write cache, but write-through still happens because the write really is committed to disk rather than sitting in the RAM of a controller.
Putting Storage Spaces Write-Back Cache To The Test
What does this look like? I set up a Scale-Out File Server that uses a DataOn DNS-1640D JBOD. The 2 SOFS cluster nodes are each attached to the JBOD via dual port LSI 6 Gbps SAS adapters. In the JBOD there is a tier of 2 * STEC SSDs (4-8 SSDs is a recommended starting point for a production SSD tier) and a tier of 8 * Seagate 10K HDDs. I created 2 * 2-way mirrored virtual disks in the clustered Storage Space:
- CSV1: 50 GB SSD tier + 150 GB HDD tier with 5 GB write cache size (WBC enabled)
- CSV2: 200 GB HDD tier with no write cache (no WBC)
Note: I have 2 SSDs (sub-optimal starting point but it’s a lab and SSDs are expensive) so CSV1 has 1 column. CSV2 has 4 columns.
Each virtual disk was converted into a CSV: CSV1 and CSV2. A share was created on each CSV and shared as \\Demo-SOFS1\CSV1 and \\Demo-SOFS1\CSV2. Yeah, I like naming consistency.
Then I logged into a Hyper-V host where I have installed SQLIO. I configured a couple of params.txt files, one to use the WBC-enabled share and the other to use the WBC-disabled share:
- Param1.TXT: \\demo-sofs1\CSV1\testfile.dat 32 0x0 1024
- Param2.TXT: \\demo-sofs1\CSV2\testfile.dat 32 0x0 1024
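As I read the SQLIO readme, each line of a parameter file is four space-separated fields: the target file, the number of threads, a CPU affinity mask, and the file size in MB. Here is a minimal Python sketch that rebuilds the two lines above under that assumption. The helper name `param_line` is my own invention for illustration and is not part of SQLIO.

```python
def param_line(path, threads=32, mask="0x0", size_mb=1024):
    """Build one SQLIO parameter-file line.

    Assumed field order: <file> <threads> <affinity mask> <size in MB>.
    """
    return f"{path} {threads} {mask} {size_mb}"


# The two test files used in this post, one per share.
shares = {
    "param1.txt": r"\\demo-sofs1\CSV1\testfile.dat",  # WBC-enabled share
    "param2.txt": r"\\demo-sofs1\CSV2\testfile.dat",  # WBC-disabled share
}

lines = {name: param_line(path) for name, path in shares.items()}

for name, line in lines.items():
    print(f"{name}: {line}")
```

Writing each string to its own file in the SQLIO folder would give you the same two parameter files.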
I pre-expanded the test files that would be created in each share by running:
- "C:\Program Files (x86)\SQLIO\sqlio.exe" -kW -s5 -fsequential -o4 -b64 -F"C:\Program Files (x86)\SQLIO\param1.txt"
- "C:\Program Files (x86)\SQLIO\sqlio.exe" -kW -s5 -fsequential -o4 -b64 -F"C:\Program Files (x86)\SQLIO\param2.txt"
And then I ran a script that ran SQLIO with the following flags to write random 64 KB blocks (similar to VHDX) for 30 seconds:
- "C:\Program Files (x86)\SQLIO\sqlio.exe" -BS -kW -frandom -t1 -o1 -s30 -b64 -F"C:\Program Files (x86)\SQLIO\param1.txt"
- "C:\Program Files (x86)\SQLIO\sqlio.exe" -BS -kW -frandom -t1 -o1 -s30 -b64 -F"C:\Program Files (x86)\SQLIO\param2.txt"
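The script itself is nothing special. A rough Python equivalent might look like the sketch below; it is not the script I actually ran. The flags are copied from the two commands above, while the function names `random_write_cmd` and `run_all` are hypothetical. Nothing is executed until you call `run_all()` on a Windows host that has SQLIO installed in the default folder.

```python
import ntpath
import subprocess

SQLIO_DIR = r"C:\Program Files (x86)\SQLIO"
SQLIO_EXE = ntpath.join(SQLIO_DIR, "sqlio.exe")


def random_write_cmd(param_file, seconds=30, block_kb=64):
    """Argument list for a random-write SQLIO run (flags as used in this post)."""
    return [
        SQLIO_EXE,
        "-BS",               # same buffering flag as in the commands above
        "-kW",               # write test
        "-frandom",          # random rather than sequential IO
        "-t1",               # 1 thread
        "-o1",               # 1 outstanding IO
        f"-s{seconds}",      # duration in seconds
        f"-b{block_kb}",     # block size in KB
        f"-F{param_file}",   # parameter file naming the target test file
    ]


commands = [
    random_write_cmd(ntpath.join(SQLIO_DIR, name))
    for name in ("param1.txt", "param2.txt")
]


def run_all():
    """Run each test in turn and print SQLIO's own report (Windows only)."""
    for cmd in commands:
        done = subprocess.run(cmd, capture_output=True, text=True, check=True)
        print(done.stdout)


for cmd in commands:
    print(cmd)
```

Passing the arguments as a list means the spaces in "Program Files (x86)" need no manual quoting.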
That gave me my results. To summarise:
The WBC-enabled share ran at:
- 2258.60 IOs/second
- 141.16 Megabytes/second
The WBC-disabled share ran at:
- 197.46 IOs/second
- 12.34 Megabytes/second
Storage Spaces Write-Back Cache enabled the share on CSV1 to run 11.44 times faster than the non-enhanced share!!! Everyone’s mileage will vary depending on the number of SSDs versus HDDs, the assigned cache size per virtual disk, the speed of the SSDs and HDDs, the number of columns per virtual disk, and your network. But one thing is for sure: with just a few SSDs, I can efficiently cater for brief spikes in write operations by the services that I am storing on my Storage Pool.
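If you want to sanity-check those numbers, the arithmetic is simple: throughput is IOs/second multiplied by the 64 KB block size, and the speed-up is the ratio of the two IOPS figures. A small Python sketch, using only the figures reported above (the variable and function names are mine):

```python
BLOCK_KB = 64  # block size used in the SQLIO runs (-b64)

# IOs/second reported by SQLIO for each share
iops = {
    "CSV1 (WBC enabled)": 2258.60,
    "CSV2 (no WBC)": 197.46,
}


def mb_per_sec(ios_per_sec, block_kb=BLOCK_KB):
    """Convert IOs/second at a fixed block size into MB/second."""
    return ios_per_sec * block_kb / 1024


throughput = {name: round(mb_per_sec(value), 2) for name, value in iops.items()}
speedup = round(iops["CSV1 (WBC enabled)"] / iops["CSV2 (no WBC)"], 2)

for name, value in throughput.items():
    print(f"{name}: {value} MB/s")
print(f"Speed-up from WBC: {speedup}x")
```

This reproduces the 141.16 MB/s and 12.34 MB/s figures and the 11.44 times speed-up, so the reported IOPS and throughput numbers are consistent with each other.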