Storage According to the VMware Admin: SDRS, SIOC, VASA & Storage vMotion


System Admins were generally the early adopters and end users of VMware ESX, as they immediately recognized the benefits of virtualization. Having been bogged down with the pains of running physical servers, such as downtime for maintenance, patching and upgrades, they were natural converts to the bare metal hypervisor. The once Windows 2003 system admin was soon configuring virtual networks and VLANs as well as carving up Storage datastores, quickly becoming the master of this new domain that was revolutionizing the datacenter. As the industry matured in its understanding of VMware, so did VMware's recognition that the networking, security and storage expertise should be broadened to those who had been doing such work in the physical world. Along came features such as the Nexus 1000v and vShield that enabled the network and security teams to also plug into the 'VM world', allowing them to add their expertise and participate in the configuration of the virtual environment.

With vSphere 5, VMware took the initiative further by bridging the Storage and VMware gap with new features that Storage teams could also take advantage of. Despite this, terms such as SIOC, Storage DRS, VASA and Storage vMotion still seem to draw blanks from most Storage folk, or are looked down upon as 'a VMware thing'. So what exactly are these features, and why should Storage admins as well as VM admins take note of them and work together to take full advantage of their benefits?

Firstly there’s Storage DRS (SDRS), in my opinion the most exciting new feature of vSphere 5. SDRS enables the initial placement and ongoing space and load balancing of VMs across datastores that are part of the same datastore cluster. Simply put, think of a datastore cluster as an aggregation of multiple datastores into a single object, with SDRS balancing the space and I/O load across it.

In the case of space utilization, balancing takes place by ensuring that a set threshold is not exceeded. So should a datastore reach, say, a 70% space-utilization threshold, SDRS will move VMs off it via Storage vMotion to other datastores in the cluster to balance out the load.
[Figure: Storage DRS based on space utilisation]
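To make the mechanics concrete, here is a minimal Python sketch of that space-balancing idea. It is purely illustrative of the decision logic, not VMware's actual algorithm, and the datastore names, sizes and 70% threshold are invented for the example.

```python
# Illustrative sketch of SDRS-style space balancing (not VMware's actual
# algorithm): when a datastore crosses its space-utilisation threshold,
# recommend moving a VM to the least-used datastore in the same cluster.
from dataclasses import dataclass, field

@dataclass
class Datastore:
    name: str
    capacity_gb: float
    vms: dict = field(default_factory=dict)   # VM name -> VMDK size in GB

    @property
    def used_gb(self) -> float:
        return sum(self.vms.values())

    @property
    def utilisation(self) -> float:
        return self.used_gb / self.capacity_gb

def space_rebalance(cluster, threshold=0.70):
    """Yield (vm, source, destination) Storage vMotion recommendations."""
    for src in cluster:
        while src.utilisation > threshold and src.vms:
            # Candidate destination: the least-utilised datastore in the cluster
            dst = min((d for d in cluster if d is not src),
                      key=lambda d: d.utilisation)
            if dst.utilisation >= src.utilisation:
                break                          # no move would improve the balance
            # Move the smallest VM off the overloaded datastore
            vm, size = min(src.vms.items(), key=lambda kv: kv[1])
            src.vms.pop(vm)
            dst.vms[vm] = size
            yield vm, src.name, dst.name

cluster = [Datastore("ds1", 1000, {"vm1": 400, "vm2": 350}),
           Datastore("ds2", 1000, {"vm3": 200})]
for vm, src, dst in space_rebalance(cluster):
    print(f"Storage vMotion {vm}: {src} -> {dst}")   # vm2: ds1 -> ds2
```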
The other balancing feature, load balancing based on I/O metrics, uses the vSphere feature Storage I/O Control (SIOC). In this instance SIOC is used to evaluate the datastores in the cluster by continuously monitoring how long an I/O takes to do a round trip, and it feeds this information to Storage DRS. If the latency value for a particular datastore stays above a set threshold for a period of time, SDRS will rebalance the VMs across the datastores in the cluster via Storage vMotion. With many Storage administrators operating 'dynamic tiering' or 'fully automated tiering' at the back end of their storage arrays, it's vital that design decisions are made co-operatively to ensure that the right features are utilized at the right time.
[Figure: Storage DRS based on I/O latency]
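As a rough illustration of how that monitoring might feed SDRS, the sketch below keeps a sliding window of observed round-trip latencies per datastore and only flags a datastore for rebalancing when its average stays above a threshold across the whole window. The window length and 15 ms threshold are values chosen for the example, not authoritative vSphere defaults.

```python
# Illustrative sketch of SDRS consuming SIOC-style latency observations:
# a datastore becomes a rebalancing candidate only when its average I/O
# round-trip latency exceeds the threshold over a full sampling window.
from collections import deque
from statistics import mean

class LatencyMonitor:
    def __init__(self, window=16, threshold_ms=15.0):
        self.samples = {}              # datastore name -> deque of latencies
        self.window = window
        self.threshold_ms = threshold_ms

    def record(self, datastore, latency_ms):
        q = self.samples.setdefault(datastore, deque(maxlen=self.window))
        q.append(latency_ms)

    def overloaded(self):
        """Datastores whose average latency over a full window exceeds the threshold."""
        return [ds for ds, q in self.samples.items()
                if len(q) == q.maxlen and mean(q) > self.threshold_ms]

mon = LatencyMonitor(window=4, threshold_ms=15.0)
for sample in (18, 22, 17, 25):
    mon.record("ds1", sample)          # sustained high latency
for sample in (5, 7, 6, 8):
    mon.record("ds2", sample)          # comfortably below threshold
print(mon.overloaded())                # ['ds1'] -> candidate for rebalancing
```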
While most are aware of vMotion's ability to seamlessly migrate VMs across hosts, Storage vMotion is a slightly different feature that allows the migration of running VMs from one datastore to another without incurring any downtime. In vSphere 5.0, Storage vMotion has been improved so that the operation completes much faster.
It does this by using a new Mirror Driver mechanism that keeps blocks on the destination synchronized with any changes made to the source after the initial copy. The migration process does a single pass of the disk, copying all the blocks to the destination disk. If any blocks change during this copy, the mirror driver synchronizes them from the source to the destination. It's this single-pass block copy that enables Storage vMotion to complete so much quicker, letting the end user reap the benefits immediately.
[Figure: Storage vMotion & the new Mirror Driver]
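The following toy Python sketch shows why the mirror-driver approach converges in a single pass: once mirroring is engaged, a guest write lands on both source and destination, so a block that changes after it has been copied never needs a second copy pass. This is a simplified, single-threaded model of the idea, not the actual driver.

```python
# Simplified model of the vSphere 5 Storage vMotion mirror-driver idea:
# mirrored writes keep already-copied blocks in sync, so one sequential
# pass over the disk is enough to converge.
src = [f"block{i}" for i in range(8)]       # source VMDK contents
dst = [None] * len(src)                     # destination datastore
mirroring = True                            # mirror driver engaged

def guest_write(block_index, data):
    """A write issued by the running VM during the migration."""
    src[block_index] = data
    if mirroring:
        dst[block_index] = data             # mirrored synchronously

# Single pass: copy each block exactly once.
for i in range(len(src)):
    dst[i] = src[i]
    if i == 3:
        guest_write(1, "changed-mid-copy")  # write to an already-copied block

assert dst == src                           # destination converged in one pass
print("migration complete, no second copy pass needed")
```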
As for the new feature named VASA (vSphere APIs for Storage Awareness), this focuses on providing insight and information to the VM admin about the underlying storage. In its simplest terms, VASA is a new set of APIs that enables storage arrays to provide vCenter with visibility into the storage's configuration, health status and functional capabilities. It allows the VM admin to see details of the underlying physical storage such as the number of spindles behind a volume, the expected IOPS or MB/s, the RAID levels, whether the LUNs are thick or thin provisioned, and even any deduplication or compression details. SDRS can also leverage the information provided by VASA to make its recommendations on space and I/O load balancing. Basically, VASA is a great feature that ensures VM admins can quickly provision VMs onto the storage that is most applicable to them.
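As a simple illustration, the record below shows the sort of per-datastore detail VASA can surface to vCenter; the field names and values are invented for this example and are not the actual VASA schema.

```python
# Invented example of the kind of capability detail a VASA provider
# might surface for one datastore; not the real VASA data model.
from dataclasses import dataclass

@dataclass
class StorageCapabilities:
    datastore: str
    spindles: int            # physical disks behind the volume
    expected_iops: int       # what the array expects to sustain
    raid_level: str
    thin_provisioned: bool
    deduplication: bool
    compression: bool

reported = StorageCapabilities(
    datastore="ds1",
    spindles=24,
    expected_iops=5000,
    raid_level="RAID-5",
    thin_provisioned=True,
    deduplication=True,
    compression=False,
)
print(reported)              # what a VM admin would see surfaced in vCenter
```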
This leads on to the feature termed Profile Driven Storage, which enables you to select the correct datastore on which to deploy your VMs based on that datastore's capabilities. A storage profile can be built in two ways: either the storage device's capabilities are associated automatically via VASA, or they are user-defined and associated manually.
[Figure: VASA & Profile Driven Storage]
With the user-defined option you can apply labels to your storage, such as Bronze, Silver & Gold, based on the capabilities of that storage. So for example, once a profile is created and the user-defined capabilities are added to a datastore, you can then use that profile to select the correct storage for a new VM. If the datastore's capabilities match the VM profile's requirements, the VM is said to be compliant; if they do not, the VM is said to be non-compliant. So while VASA and Profile Driven Storage are still new features, their potential is immense, especially as Storage admins can potentially work alongside VM admins to help classify and tier their data.
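A minimal sketch of that compliance check, assuming user-defined capability labels: a datastore is deemed compliant for a VM when it carries every capability the VM's storage profile requires. The tier names and the matching rule are simplified for illustration.

```python
# Simplified profile-driven placement with user-defined capability
# labels (the Bronze/Silver/Gold tiers from the text).
datastore_capabilities = {
    "ds-sas-01":  {"Silver", "Replicated"},
    "ds-ssd-01":  {"Gold", "Replicated", "Deduplicated"},
    "ds-sata-01": {"Bronze"},
}

def compliant_datastores(profile):
    """Datastores whose capabilities satisfy the VM's storage profile."""
    return [ds for ds, caps in datastore_capabilities.items()
            if profile <= caps]           # profile is a subset of capabilities

vm_profile = {"Gold", "Replicated"}       # requirements for a critical VM
print(compliant_datastores(vm_profile))   # ['ds-ssd-01'] -> compliant choice
```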
As mentioned before, Storage I/O Control (SIOC) is a feature that enables you to configure rules and policies to specify the business priority of each VM. It does this by dynamically allocating I/O resources to your critical application VMs whenever I/O congestion is detected. Furthermore, enabling SIOC on a datastore triggers the monitoring of device latency as observed by the hosts. As SIOC takes charge of I/O allocation to VMs, it also by default ignores Disk.SchedNumReqOutstanding (DSNRO). Typically it's DSNRO that sets the queue depth at the hypervisor layer, but once SIOC is enabled it takes on this responsibility, basing its judgements on the observed I/O congestion and the policy settings. This offloads a significant amount of performance design work from the admins, but it ultimately still requires the involvement of the Storage team to ensure that I/O contention isn't falsely arising from poorly configured storage and highly congested LUNs.

[Figure: SIOC ignores Disk.SchedNumReqOutstanding to set the Queue Depth at the hypervisor level]
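To illustrate the behaviour, here is a rough Python sketch of SIOC-style queue allocation: below the congestion threshold every VM sees the full device queue, while above it the queue slots are split in proportion to each VM's configured shares. The share values, queue depth and 30 ms threshold here are illustrative only, not a statement of how SIOC is implemented.

```python
# Illustrative SIOC-style behaviour: under normal latency every VM gets
# the full device queue; once observed latency crosses the congestion
# threshold, queue slots are divided in proportion to configured shares.
def allocate_queue_slots(shares, queue_depth, latency_ms, threshold_ms=30.0):
    """Return per-VM queue slot allocation for one datastore."""
    if latency_ms <= threshold_ms:
        # No congestion detected: SIOC stays out of the way
        return {vm: queue_depth for vm in shares}
    total = sum(shares.values())
    # Congestion detected: throttle each VM proportionally to its shares
    return {vm: max(1, queue_depth * s // total) for vm, s in shares.items()}

vm_shares = {"sql-prod": 2000, "web-01": 1000, "test-vm": 500}
print(allocate_queue_slots(vm_shares, queue_depth=32, latency_ms=42.0))
# {'sql-prod': 18, 'web-01': 9, 'test-vm': 4} -> the critical VM keeps priority
```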
So while these new features are ideal for the SMB, they may still not be the sole answer to every Storage / VMware problem related to virtualizing mission-critical applications. As with any new feature or technology, their success relies on correct planning, design and implementation, and for that to happen a siloed VM-only or Storage-only approach needs to be avoided.