Enabling Multi-Tenancy and Converged Fabric for the Cloud Using QoS

Speakers: Charley Wen and Richard Wurdock

Pretty demo-intensive session. We start off with a demo of “fair sharing of bandwidth” where PowerShell is used with a minimum bandwidth setting to give equal weight to a set of VMs. One VM needs more bandwidth but can’t get it. A new policy is deployed by script and it gets a higher weight; it can then access more of the pipe. A maximum bandwidth setting would have capped the VM so it couldn’t use idle b/w.

Minimum Bandwidth Policy

  • Enforce bandwidth allocation -> get performance predictability
  • Redistribute unused bandwidth -> get high link utilisation

The effect is that VMs get an SLA. They always get their minimum if they require it. If they don’t use it, they consume nothing, and that b/w is available to others so they can exceed their own minimums.

Min BW % = Weight / Sum of Weights

Example on a 1 Gbps pipe (see the sketch below):

  • VM 1: weight 1 = 100 Mbps minimum
  • VM 2: weight 2 = 200 Mbps minimum
  • VM 3: weight 5 = 500 Mbps minimum

(For these numbers to line up with the formula, the weights must sum to 10; presumably something else, e.g. the default flow, holds the remaining weight of 2.)

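To make the weight arithmetic and the redistribution of unused bandwidth concrete, here is a minimal Python sketch. It is not the Hyper-V implementation; the iterative weighted fair-share loop, the function name and the demand figures are my own assumptions.

```python
def allocate(capacity_mbps, weights, demands_mbps):
    """Weighted fair sharing with redistribution: each VM is guaranteed
    weight / sum(weights) of the pipe, and anything it does not use is
    shared out among the VMs that still want more, by weight."""
    alloc = {vm: 0.0 for vm in weights}
    remaining = dict(demands_mbps)
    spare = float(capacity_mbps)
    active = [vm for vm in weights if remaining[vm] > 0]

    while spare > 1e-9 and active:
        total_w = sum(weights[vm] for vm in active)
        handed_out = 0.0
        for vm in active:
            share = spare * weights[vm] / total_w   # weighted fair share this round
            grant = min(share, remaining[vm])       # never give more than the VM wants
            alloc[vm] += grant
            remaining[vm] -= grant
            handed_out += grant
        spare -= handed_out
        active = [vm for vm in active if remaining[vm] > 1e-9]
    return alloc

weights = {"VM1": 1, "VM2": 2, "VM3": 5}
demands = {"VM1": 50, "VM2": 400, "VM3": 900}   # VM1 is mostly idle
print(allocate(1000, weights, demands))
# VM1 only takes the 50 Mbps it wants; the rest of its guaranteed share is
# redistributed to VM2 and VM3 in proportion to their weights.
```

When every VM is pushing traffic, this degenerates to the straight weight / sum-of-weights split from the formula above.
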
If you have NIC teaming, there is no way to guarantee a minimum b/w share of the team’s total potential pipe.

Maximum Bandwidth

Example: you have an expensive WAN link. You can cap a customer’s ability to use the pipe based on what they pay.
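
As a rough illustration of what a cap means in practice, here is a token-bucket sketch in Python. The token bucket is just the classic rate-limiting technique, assumed here for illustration; it is not the Windows QoS code, and the class and parameter names are mine.

```python
import time

class BandwidthCap:
    """Token bucket: traffic may burst up to `burst_bytes`, but the long-run
    rate can never exceed `rate_bps`, even if the link is otherwise idle."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0            # refill rate in bytes per second
        self.burst = burst_bytes              # bucket depth
        self.tokens = float(burst_bytes)
        self.last = time.monotonic()

    def allow(self, packet_bytes):
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_bytes:
            self.tokens -= packet_bytes
            return True                       # under the cap: send now
        return False                          # over the cap: queue or drop until refill

# Cap a tenant at 100 Mbps on the expensive WAN link, regardless of idle capacity.
wan_cap = BandwidthCap(rate_bps=100_000_000, burst_bytes=1_500_000)
```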

How it Works Under the Covers

A bunch of VMs are trying to use a pNIC. The pNIC reports its speed, and it reports each packet it sends; this is recorded in a capacity meter. That feeds into the traffic meter, which determines the classification of each packet and uses that to figure out whether the traffic exceeds the capacity of the NIC. The peak bandwidth meter is fed by the latter, and it is what stops traffic that goes over its cap (the draining process).

The reserved bandwidth meter is what guarantees the minimum bandwidth.
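
My rough mental model of that pipeline, as a Python sketch. The meter names mirror the talk, but the decision logic is my own simplification, not how the real scheduler works.

```python
class QosPipeline:
    """Simplified model: classify a packet's traffic class, then check it
    against that class's reserved (min) and peak (max) bandwidth meters."""

    def __init__(self, link_speed_bps, policies):
        self.link_speed = link_speed_bps    # capacity meter: speed reported by the pNIC
        self.policies = policies            # per-class {"min_bps", "max_bps"}

    def decide(self, traffic_class, class_rate_bps, total_offered_bps):
        p = self.policies[traffic_class]    # traffic meter: classification -> policy
        if class_rate_bps > p["max_bps"]:
            return "drain"                  # peak bandwidth meter: hold traffic over its cap
        if class_rate_bps <= p["min_bps"]:
            return "send"                   # reserved bandwidth meter: guaranteed share
        # Between min and max: send only if the pipe as a whole has headroom.
        return "send" if total_offered_bps <= self.link_speed else "drain"

pipeline = QosPipeline(10_000_000_000, {
    "tenant": {"min_bps": 1_000_000_000, "max_bps": 4_000_000_000},
})
print(pipeline.decide("tenant", 5_000_000_000, 6_000_000_000))   # -> "drain", over the 4 Gbps cap
```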

All of this is software, and it is h/w vendor independent. 

With all this you can do multi-tenancy without over-provisioning.

Converged Fabric

Simple image of two fabrics: network I/O, and storage I/O across iSCSI, SMB, NFS, and Fibre Channel.

That’s expensive, so we’re trying to converge onto one fabric. QoS can be used to guarantee service for the various functions of the converged fabric, e.g. run all network connections through a single Hyper-V Extensible Switch via a 10 Gbps NIC team.
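
To picture how a single 10 Gbps team might be carved up by minimum-bandwidth weights, here is a hypothetical split in Python; the traffic types and numbers are my own example, not figures from the session.

```python
# Hypothetical weights for the functions sharing one converged 10 Gbps team.
team_gbps = 10
weights = {
    "management":     5,
    "cluster_csv":    10,
    "live_migration": 25,
    "storage_smb":    30,
    "tenant_vms":     30,
}
total = sum(weights.values())
for traffic, w in weights.items():
    print(f"{traffic:15} guaranteed >= {team_gbps * w / total:.1f} Gbps when congested")
```

When the team is idle, any of these can burst past its share; the weights only bite under contention.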

Windows Server 8 takes advantage of hardware where available to offload QoS.

We get a demo where a Live Migration cannot complete because the converged fabric is saturated (no QoS). In the demo a traffic class QoS policy is created and deployed. Now the LM works as expected … the required b/w is allocated to the LM job. The NIC in the demo supports h/w QoS, so it does the work.

Business benefit: reduced capital costs by using fewer switches, etc.

Traffic Classification:

  • You can have up to 8 traffic classes – 1 of them is for storage, by default by the sound of it.
  • It appears that DCB is involved with the LAN miniport and the iSCSI miniport, combined with QoS traffic classification (see the sketch after this list). My head hurts.
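
For a feel of what a DCB/ETS-style table with up to 8 traffic classes could look like, here is an illustrative Python sketch; the priority-to-class mapping and the percentages are assumptions for the example, not the defaults quoted in the session.

```python
# Illustrative DCB (ETS) table: 802.1p priorities grouped into traffic classes,
# each class given a share of the link under congestion.
traffic_classes = [
    {"tc": 0, "priorities": [0, 1, 2, 5, 6, 7], "bandwidth_pct": 50, "use": "LAN (default)"},
    {"tc": 1, "priorities": [3],                "bandwidth_pct": 40, "use": "iSCSI / storage"},
    {"tc": 2, "priorities": [4],                "bandwidth_pct": 10, "use": "Live Migration"},
]
assert sum(tc["bandwidth_pct"] for tc in traffic_classes) == 100

def class_for_priority(p8021):
    """Map an 802.1p priority tag to its DCB traffic class."""
    for tc in traffic_classes:
        if p8021 in tc["priorities"]:
            return tc
    raise ValueError("unmapped priority")

print(class_for_priority(3)["use"])   # -> "iSCSI / storage"
```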

Hmm, they finished after using only half of their time allocation.
