Monitoring the SAN shine with Virtual Instruments

Posted by Archie Hendryx on Tuesday, August 31, 2010

It was about three months ago that one of my friends informed me he was leaving HDS to join a company named Virtual Instruments. ‘Virtual Instruments?’ I asked myself, trying to fathom whether I’d heard of them before, only to realize that I had once seen a write-up on their SAN monitoring solution, which was then termed NetWisdom. I was then inadvertently asked to mention Virtual Instruments in one of my blogs - nice try pal, but I had made it clear several times before to vendors requesting the same that I didn’t want my blog to become an advertising platform. Despite this, I was still intrigued by what could have persuaded someone to leave a genuinely stable position at HDS for a company I hadn’t really had much exposure to myself. Fast forward a few months, several whitepapers and numerous discussions, and I find myself writing a blog about that very company.

The simple fact is that it’s rare to find a solution or product in the storage and virtualization market that can truly be regarded as unique. More often than not, new developments fall victim to what I term the ‘six month catch up’ syndrome, in which a vendor brings out a new feature only for its main competitor to initially bash it and then release a rebranded and supposedly better version six months later. The original proponents of thin provisioning, automated tiered storage, deduplication, SSDs etc. can all testify to this. That is why I have taken great interest in a company that currently occupies a niche in the SAN monitoring market and as yet doesn’t seem to have a worthy competitor, namely Virtual Instruments.

My own experience of storage monitoring has always been a pain, in the sense that nine times out of ten it was a defensive exercise in proving to the application, database or server guys that the problem didn’t lie with the storage. Storage most of the time is fairly straightforward: if there are any performance problems with the storage system, they’ve usually stemmed from a recent change. For example, provision a write-intensive LUN to an already busy RAID group and you only have to count the seconds before your IT director rings your phone on the verge of a heart attack at how significantly his reporting times have increased. But then there was always the other situation, when a problem would occur with no apparent changes having been made. Such situations required the old hat method of troubleshooting supposed storage problems by pinpointing whether the problem was between the storage and the SAN fabric or between the server and the SAN, but therein dwelled the Bermuda Triangle at the centre of it all, i.e. the SAN. Try to get a deeper look into the central meeting point of your storage infrastructure to see what real-time changes have occurred on your SAN fabric and you’d enter a labyrinth of guesses and predictions.

Such a situation occurred to me when I was asked to analyze and fix an ever-slowing backup of an Oracle database. Having bought more LTO4 tapes, incorporated a destaging device, spent exorbitant amounts of money on man days for the vendor’s technical consultants, played around with the switches’ buffer credits and even considered buying more FC disks, the client still hadn’t resolved the situation. Now enter yours truly into the labyrinth of guesses and predictions. Thankfully I was able to solve the issue by staying up all night and running iostat on Solaris, while simultaneously having the storage system up on another screen. Eventually I was able to pinpoint (albeit with trial and error tactics) the problem to rather large block sizes and particular LUNs that were using the same BEDs and causing havoc on their respective RAID groups. With several more sleepless nights to verify the conclusion, the problem was finally resolved. Looking back, surely there was a better, more cost-effective and more productive way to have solved this issue. There was, but I just wasn’t aware of it.
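
For anyone wanting to repeat that kind of check, here is a rough sketch rather than the exact procedure I followed. Solaris `iostat -xn` reports reads and writes per second (`r/s`, `w/s`) alongside kilobytes read and written per second (`kr/s`, `kw/s`), so the average I/O size per device is simply throughput divided by operations. The device names and figures below are invented for illustration, and the 256 KB threshold is an assumption, not a recommendation.

```python
from dataclasses import dataclass


@dataclass
class IostatSample:
    """One device row from `iostat -xn` (all values here are illustrative)."""
    device: str
    reads_per_s: float       # r/s column
    writes_per_s: float      # w/s column
    kb_read_per_s: float     # kr/s column
    kb_written_per_s: float  # kw/s column


def avg_io_size_kb(sample: IostatSample) -> float:
    """Average I/O size in KB: total throughput divided by total operations."""
    ops = sample.reads_per_s + sample.writes_per_s
    if ops == 0:
        return 0.0  # idle device, nothing to average
    return (sample.kb_read_per_s + sample.kb_written_per_s) / ops


def large_block_devices(samples, threshold_kb=256.0):
    """Devices whose average I/O size exceeds the (assumed) threshold."""
    return [s.device for s in samples if avg_io_size_kb(s) > threshold_kb]


# Hypothetical samples: one LUN doing large backup-style I/O, one doing small I/O.
samples = [
    IostatSample("c2t0d0", 120.0, 30.0, 61440.0, 15360.0),
    IostatSample("c2t1d0", 400.0, 100.0, 3200.0, 800.0),
]

for s in samples:
    print(s.device, avg_io_size_kb(s), "KB per I/O")
print("Large-block devices:", large_block_devices(samples))
```

Matching the flagged devices against the RAID groups and back-end directors they share is still a manual step with this approach, which is precisely the gap a fabric-wide view is meant to close.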

Furthermore, ask any storage guy who’s familiar with SAN management/monitoring software such as HDS’ Tuning Manager, EMC’s ControlCenter, HP’s Storage Essentials and their like and they’ll know full well that, despite all the SNIA SMI-S compliance, these tools still fail to provide metrics beyond the customary RAID group utilization, historic IOPS, cache hit rate, disk response times etc. In other words, from the perspective of the end user there really is little to monitor and hence troubleshoot. Frustratingly, such solutions still fail to provide performance metrics from an application-to-storage-system view and thus also fail to allow the end user to verify whether they are indeed meeting the SLAs for that application. Put this scenario in the ever-growing virtual server environment and you are further blinded by not knowing the relation between the I/Os and the virtual machines from which they originated.

Moreover, storage vendors don’t seem to be in a rush to solve this problem either, and the pessimist in me says this is understandable when such a solution would inevitably lead to the non-procurement of unnecessary hardware. With precise analysis and pinpointing of performance problems and degradation comes the consequent annulment of the haphazard ‘let’s throw some more storage at it’, ‘let’s buy SSDs’ or ‘let’s upgrade our Storage System’ solutions that are currently music to the ears of storage vendor sales guys. So amidst these vendor-provided monitoring tools with their partial view, which lack that essential I/O transaction-level visibility, Virtual Instruments (VI) pushes forth its solution, which boldly claims to encompass the most comprehensive monitoring and management of end-to-end SAN traffic. From the intricacies of a virtual machine’s application to the Fibre Channel cable that’s plugged into your USPV, VMax etc., VI say they have an insight. So looking back, had I had VI’s ability to instantly access trending data on metrics such as MB/sec, CRC errors, log ins and outs etc., I could have quickly pinpointed and resolved many of the labyrinth quests I had ventured through so many times in the past.

Looking even closer at VI, there are situations beyond the SAN troubleshooting syndrome in which it can benefit an organization. As in most datacenters, if you have one of the Empire State Building-esque monolithic storage systems it is more than likely being underutilized, with the majority of its resident applications not requiring the cost and performance of such a system. So while most organizations are aware of this and look to save costs by tiering their infrastructure onto cheaper storage via the alignment of their data values to the underlying storage platform, it’s seldom seen in reality due to the headaches and lack of insight related to such operations. Tiering off an application onto a cheaper storage platform requires justification from the Storage Manager that there will be no performance impact to the end users, but due to the lack of precise monitoring information, many are not prepared to take that risk. In an indirect acknowledgement of this problem, several storage vendors have looked at introducing automated tiering software for their arrays, which in essence merely looks at LUN utilization before migrating LUNs to either higher-performance drives or cheaper SATA drives. In reality this is still a rather crude way of tiering an infrastructure when you consider it ignores SAN fabric congestion or improper HBA queue depths. In such a situation a monitoring tool that tracks I/Os across the SAN infrastructure without being pigeonholed to a specific device is essential to enabling performance optimization and the consequent delivery of Tier 1 SLAs with cheaper storage – cue VI and their VirtualWisdom 2.0 solution.

In the same way that server virtualisation exposed the underutilization of physical server CPU and memory, the VirtualWisdom solution is doing the same for the SAN. While vendors are more than pleased to sell more upgraded modules packed with ports for their enterprise directors, it is becoming increasingly apparent that most SAN fabrics are significantly over-provisioned, with utilization rates often being less than 10%. With many SAN fabric architects seeming to overlook fan-in ratios and oversubscription rates in a rush to finish deployments within specified project deadlines, underutilized SAN ports are now an ever-increasing reality that in turn brings with it the additional costs of switch and storage ports, SFPs and cables.
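
To put rough numbers on that, here is a minimal sketch of the two sums involved: port utilization and fan-in ratio. It assumes the common rule of thumb of roughly 100 MB/s of throughput per 1 Gbps of nominal Fibre Channel speed (so about 400 MB/s for a 4 Gbps link). The port names, throughput figures and the 10% cut-off are all made up for illustration.

```python
def port_utilization(throughput_mb_s: float, link_gbps: float) -> float:
    """Fraction of a link's nominal capacity in use.

    Assumes ~100 MB/s per 1 Gbps of nominal FC speed (a rule of thumb).
    """
    capacity_mb_s = link_gbps * 100.0
    return throughput_mb_s / capacity_mb_s


def fan_in_ratio(host_ports: int, storage_ports: int) -> float:
    """Number of host ports sharing each storage port."""
    return host_ports / storage_ports


def underused_ports(ports: dict, threshold: float = 0.10) -> list:
    """Names of ports whose utilization is below the threshold, sorted."""
    return sorted(
        name
        for name, (throughput_mb_s, link_gbps) in ports.items()
        if port_utilization(throughput_mb_s, link_gbps) < threshold
    )


# Hypothetical fabric: port name -> (average MB/s observed, link speed in Gbps)
ports = {
    "host-a": (12.0, 4),
    "host-b": (30.0, 4),
    "host-c": (350.0, 4),
}

print("Under 10% utilized:", underused_ports(ports))
print("Fan-in ratio for 24 hosts on 4 storage ports:", fan_in_ratio(24, 4))
```

Averages hide bursts, so a port that looks idle on this measure may still be busy for short periods; peak figures over the same window would need checking before any ports are consolidated.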

Within the context of server virtualisation itself, which has undoubtedly brought many advantages with it, one irritating side effect has been the rapid expansion of FC traffic to accommodate the increased number of servers going through a single SAN switch port and the complexity now required to monitor it. Then there’s the virtual maze which starts with applications within the virtual machines that are in turn running on multi-socket and multi-core servers, which are then connected to a VSAN infrastructure only to finally end up on storage systems which also incorporate virtualization layers, whether that be with externally attached storage systems or thinly-provisioned disks. Finding an end-to-end monitoring solution in such a cascade of complexities seems almost an impossibility. Not so, it seems, for the team at Virtual Instruments.

Advancing upon the original NetWisdom premise, VI’s updated VirtualWisdom 2.0 has a virtual software probe named ProbeV. ProbeV collects the necessary information from the SAN switches via SNMP, and on a port-by-port basis metrics such as the number of frames and bytes are collated alongside potential faults such as CRC errors, synchronization loss, packet discards or link resets/failures. Then, via the installation of splitters (which VI name TAPs - Traffic Access Points) between the storage array ports and the rest of the SAN, a percentage of the light from the fibre cable is copied to a data recorder for playback and analysis. VI’s Fibre Channel probes (ProbeFCXs) then analyze every frame header, measuring every SCSI I/O transaction from beginning to end. This enables a view of traffic performance whether related to the LUN, HBA, read/write level or application level, allowing the user to instantly detect application performance slowdowns or transmission errors. The concept seems straightforward enough, but it’s a concept no one else has yet been able to put into practice, despite growing competition from products such as Akorri's BalancePoint, Aptare's StorageConsole or Emulex's OneCommand Vision.
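
To illustrate the switch-polling half of that description, here is a minimal sketch of how two successive readings of per-port counters can be turned into rates. It is not VI’s implementation, which I have no visibility of. It makes no actual SNMP calls; the counter names and values are hypothetical, and the counters are assumed to be 64-bit and monotonically increasing, wrapping back to zero at 2^64.

```python
COUNTER_MODULUS_64 = 2**64  # assumed width of the polled counters


def counter_delta(prev: int, curr: int, modulus: int = COUNTER_MODULUS_64) -> int:
    """Increase between two readings of a counter, allowing for one wrap."""
    return (curr - prev) % modulus


def port_rates(prev: dict, curr: dict, interval_s: float) -> dict:
    """Per-second rate for every counter present in the current reading."""
    return {
        name: counter_delta(prev[name], curr[name]) / interval_s
        for name in curr
    }


def error_counters(prev: dict, curr: dict, names=("crc_errors", "link_resets")) -> list:
    """Error counters (hypothetical names) that increased between polls."""
    return [n for n in names if n in curr and counter_delta(prev[n], curr[n]) > 0]


# Two hypothetical polls of one switch port, 60 seconds apart.
previous = {"bytes": 1_000_000, "frames": 1_000, "crc_errors": 2, "link_resets": 0}
current = {"bytes": 601_000_000, "frames": 301_000, "crc_errors": 5, "link_resets": 0}

rates = port_rates(previous, current, interval_s=60.0)
print("Bytes/s:", rates["bytes"], "Frames/s:", rates["frames"])
print("Counters showing new errors:", error_counters(previous, current))
```

Polled counters like these only show what happened per port per interval. Tying a slowdown to an individual SCSI transaction needs the frame-level view that the TAP and probe combination is described as providing.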

Added to this, VI’s capabilities can also provide a clear advantage in preparing for a potential virtualization deployment or, dare I fall for the marketing terminology, a move to the private cloud. Lack of insight into performance metrics has evidently stalled the majority of organizations in virtualising their tier 1 applications. Server virtualization has reaped many benefits for many organizations, but ask those same organizations how many of them have migrated their I/O-intensive tier 1 applications from their SPARC-based physical platforms to an Intel-based virtual one and you’re probably looking at a paltry figure. The simple reason is risk and fear of performance degradation, despite logic showing that a virtual platform with resources set up as a pool could potentially bring numerous advantages. Put this now in the context of a world where cloud computing is the new buzz as more and more organizations look to outsource many of their services and applications, and you then have even fewer willing to launch their mission-critical applications from the supposed safety and assured performance of the in-house datacenter into the unknown territory of the clouds. It is here that VirtualWisdom 2.0 has the potential to be absolutely huge in the market and at the forefront of the inevitable shift of tier 1 applications to the cloud. While I admittedly find it hard to currently envision a future where a bank launches its OLTP into the cloud, based on security issues alone, I’d be blinkered not to realize that there is a future where some mission-critical applications will indeed take that route. With VirtualWisdom’s ability to pinpoint virtualized application performance bottlenecks in the SAN, it’s a given that the consequences will be significantly higher virtual infrastructure utilization and subsequent ROI.

The VI strategy is simple: by recognizing I/O as the largest cause of application latency, VirtualWisdom’s inclusion of baseline comparisons of I/O performance, bandwidth utilization and average I/O completions comfortably provides the insight fundamental to any major virtualization or cloud considerations an organization may be planning. With its ProbeVM, a virtual software probe that collects status from VMware servers via vCenter, the data flow from virtual machine through to the storage system can be comprehensively analyzed with historical and real-time performance dashboards, leading to an enhanced as well as accurate understanding of resource utilization and performance requirements. With a predictive analysis feature based on real production data, the tool also gives the user the ability to accurately understand the effects of any potential SAN configuration or deployment changes. With every transaction from virtual machine to LUN being monitored, latency sources can quickly be identified, whether they lie in the SAN or the application itself, enabling a virtual environment to be easily diagnosed and remedied should any performance issues occur. With such metrics at their disposal and the resultant confidence given to the administrator, the worry of meeting SLAs could quickly become a thing of the past, while also hastening the shift towards tier 1 applications being on virtualized platforms. So despite growing attention being given to other VM monitoring tools such as Xangati or Hyperic, their solutions still lack the comprehensive nature of VI.
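
As a way of picturing the baseline comparison idea, here is a small sketch under my own assumptions rather than a description of how VirtualWisdom does it. Each monitored transaction is reduced to a hypothetical record of virtual machine, LUN, start time and end time, the average completion time per VM-to-LUN path is computed, and any path running at more than twice its baseline is flagged. The record layout, the names and the factor of two are all invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def avg_completion_ms(transactions) -> dict:
    """Average completion time per (vm, lun) path.

    `transactions` is an iterable of (vm, lun, start_ms, end_ms) tuples.
    """
    by_path = defaultdict(list)
    for vm, lun, start_ms, end_ms in transactions:
        by_path[(vm, lun)].append(end_ms - start_ms)
    return {path: mean(times) for path, times in by_path.items()}


def slow_paths(current: dict, baseline: dict, factor: float = 2.0) -> list:
    """Paths whose current average exceeds `factor` times their baseline."""
    return sorted(
        path
        for path, avg in current.items()
        if path in baseline and avg > factor * baseline[path]
    )


# Hypothetical observations for two virtual machines.
transactions = [
    ("vm-oltp", "lun-7", 0.0, 4.0),
    ("vm-oltp", "lun-7", 10.0, 16.0),
    ("vm-mail", "lun-3", 0.0, 30.0),
    ("vm-mail", "lun-3", 50.0, 90.0),
]
baseline = {("vm-oltp", "lun-7"): 4.0, ("vm-mail", "lun-3"): 10.0}

current = avg_completion_ms(transactions)
print("Current averages (ms):", current)
print("Paths well above baseline:", slow_paths(current, baseline))
```

A plain average is the simplest thing to compare, but it can mask a handful of very slow transactions, so percentiles would be the obvious next refinement.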

The advantages to blue-chip, big corporate customers are obvious and as their SAN and virtual environments continue to grow, an investment in a VirtualWisdom solution should soon become compulsory for any end of year budget approval. In saying that though, the future of VI also quite clearly lies beyond the big corporates, with benefits which include enabling an organization to have real-time proactive monitoring and alerting, consolidation, preemptive analysis of any changes within the SAN or virtual environment and comprehensive trend analysis of application, host HBA, switch, virtualization appliance, storage port and LUN performance. Any company therefore looking to consolidate its costly over-provisioned SAN, accelerate troubleshooting, improve its VMware server utilization & capacity planning, implement a tiering infrastructure or migrate to a cloud would find the CAPEX improvements that come with VirtualWisdom a figure too hard to ignore. So while storage vendors don’t seem to be in any rush to fill this gap, they too have an opportunity to undercut their competitors by working alongside VI and promoting its benefits as a complement to their latest hardware, something which EMC, HDS, IBM and most recently Dell have cottoned on to, having signed an agreement to sell the VI range as part of their portfolio. Despite certain pretenders claiming to take its throne, FC is certainly here to stay for the foreseeable future. If the market/customer base is allowed to fully understand and recognize its need, then there’s no preventing a future when just about every SAN fabric comes part and parcel with a VI solution ensuring its optimal use.

Whether VI eventually get bought out by one of the large whales or continue to swim the shores independently, there is no denying that companies will need to seriously consider the VI option if they’re to avoid drowning in the apprehensive nature of virtual infrastructure growth or the ever-increasing costs of under-utilized SAN fabrics.

VDI – A Vulnerably Dangerous Investment or A Virtual Dream Inclusion?

Posted by Archie Hendryx on Saturday, August 21, 2010

PCs are part of everyday life in just about every organization. First there’s the purchase of the hardware and the necessary software, followed by an inventory recorded and maintained by the IT department. Normal procedure would then dictate that the same IT department installs all required applications before physically delivering the machines to the end users. Over a period of time the laptop/PC is then maintained by the IT department with software updates, patches, troubleshooting etc. to keep employees fully productive. Once the PC/laptop becomes outdated, the IT department is left with the monotonous job of removing the hardware, deleting sensitive data and removing any installed applications to free up licenses. All of this is done to enable the whole cycle to be repeated all over again. So in this vicious circle, there are obvious opportunities to better manage resources and save unnecessary OPEX & CAPEX costs, one such solution being virtual desktops.

Having witnessed the financial rewards of server virtualization, enterprises are now taking note of the benefits of using virtualization to support their desktop workloads. Consolidation and centralization are no longer buzz words used for marketing spin but tangible realities for IT managers who initially took that unknown plunge into what were then the deep mystical waters of virtualization. Now they’re also realizing that by enabling thin clients the cost of their endpoint hardware is also significantly driven down by the consequent lifespan extension of existing PCs. Indeed the future of endpoint devices is one that could revolutionize their existing IT offices – a future of PC/laptop-less office desks replaced by thin client compatible portable iPads? Anything is now possible.

There’s also no doubting that VDI brings with it even further advantages, one being improved security. With data always being administered via the datacenter rather than from the vulnerability of an end user’s desktop, risks of data loss or theft are instantly mitigated. No longer can sensitive data potentially walk out of the company’s front doors. Also, with centralized administration, data can instantly be protected in scenarios where access needs to be limited or copying needs to be prevented. For example, a company that has numerous outsourcers/contractors on site can quickly restrict their data and application access or even turn it off. Indeed there is nothing stopping an organization from setting up ‘a contractor’ desktop template which can be provisioned instantly and then decommissioned the moment the outsourced party’s contract expires.

By centralizing the infrastructure, fully compliant backup policies can also become significantly easier. With PCs and hard drives constantly crashing and leading to potential data loss, the centralized virtual desktop has an underlying infrastructure which is continuously backed up. Additionally, with the desktop instance not being bound to the PC’s local storage but instead stored on the server, recovery from potential outages is significantly quicker, with even the option of reverting the virtual desktops back to their last known good states. Imagine the amount of work that the employees who constantly bombard the IT helpdesk with countless “help I’ve accidentally deleted my hard drive” phone calls could actually get done now, not to mention the amount of time it will free up for your IT helpdesk team. In fact you might even end up with an IT helpdesk that gets to answer the phone instead of taking you straight to voicemail.

Additionally, an IT helpdesk team would also be better utilized with the centralized, server-based approach, which allows both desktop images and specific user data to be maintained without having to visit the end user’s office. Hence with nothing needing to be installed on the endpoint, deployment becomes far faster and easier with VDI than with traditional PC desktop deployment. The same applies to the laborious practice of having to individually visit each desktop to patch applications, provision and decommission users, and upgrade to newer operating systems. By removing such activities, the OPEX savings are more than substantial.

OPEX savings can also be seen with the added benefit of optimizing the productivity of highly paid non-technical end users by avoiding them having to needlessly maintain their desktop applications and data. Furthermore the productivity of employees can also be improved significantly by a centralized control of which applications are used by end users and a full monitoring of their usage, so long gone should be the days of employees downloading torrents or mindlessly chatting away on social networks during working hours. Even the infamously slow start up time of Windows which has consequently brought with it the traditional yet unofficial morning coffee/cigarette break can be eradicated with the faster Windows boot up times found with VDI. Even lack of access to an employee’s corporate PC can no longer be used as an excuse to not log in from home or elsewhere remotely when required – a manager’s dream and a slacker’s nightmare.

So with all these benefits, where lies the risk or obstacle to adopting a VDI infrastructure for your company? Well as with most technology there rarely exists a one solution fits all scenario and VDI is no different. Prior to any consideration for VDI, a company must first assess their infrastructure and whether VDI could indeed reap these benefits or alternatively possibly cause it more problems.

One of the first issues to look for is whether the organization has a high percentage of end users who manipulate complex or very large files. In other words, if a high proportion of end users are constantly in need of multimedia, 2D or 3D modeling applications, or VoIP, then VDI should possibly be reconsidered in favour of a better managed desktop environment. The performance limitations that came about with server-based computing platforms such as Microsoft's Terminal Services with regards to bandwidth, latency and graphics capabilities are still fresh in the minds of many old school IT end users, and without the correct pre-assessment those old monsters could rear their ugly heads. For example, an infrastructure that has many end users using high performance/real-time applications should think carefully before going down the VDI route, regardless of what the sales guys claim.

That said, if having taken all this into consideration you realize your environment is suited to a VDI deployment, the benefits and consequent savings are extensive despite the initial expenditure. As for which solution to take, this leads to another careful consideration and one that needs to be investigated beyond the usual vendor marketing hype.

Firstly, when it comes to server virtualization, there currently is no threatening competition (certainly not in the Enterprise infrastructure) to VMware’s vSphere 4. In the context of desktop virtualization though, the story has been somewhat different. Those who’ve deployed Citrix’s XenDesktop certainly know that it has better application compatibility than VMware View 3. Add to that the problems of multimedia freeze framing that would often occur with the View 3 solution and Citrix looked to have cornered a market in the virtual sphere which initially seemed destined to be monopolized by VMware. Since then VMware have hit back with View 4, which brought in the vastly improved PCoIP display protocol that dwarfs the original RDP protocol, and simplified their integration with Active Directory and the overall installation of the product, but in performance terms XenDesktop still has an edge. So it comes as no surprise that rumours are rife that VMworld 2010, which takes place in a couple of weeks, will be the launching pad for View 4.5 and a consequent onslaught on the Citrix VDI model. Subsequent retaliation is bound to follow from Citrix, who seem to have moved their focus away from the server virtualization realm in favour of the VDI milieu, which can only be better for the clients that they are aiming for. Already features such as Offline Desktop, which allows end users to download and run their virtual desktops offline and then later resynchronize with the data center, are being developed beyond the beta stage.

The fact remains that quickly provisioning desktops from a master image and instantly administering policies, patches and updates without affecting user settings, data or preferences is an advantage many will find hard to ignore. So while VDI still has many areas for improvement, depending on your infrastructure it may already be an appropriate time to reap the rewards of its numerous benefits.

VSphere 4 still leaves Microsoft Hyper V-entilating

Posted by Archie Hendryx on Tuesday, August 03, 2010

When faced with a barrage of client consultations and disaster recovery proposals/assessments, you can’t help but be inundated with opportunities to showcase the benefits of server virtualization and more specifically VMware’s Site Recovery Manager. It’s a given that if an environment has a significant number of applications running on x86 platforms, then virtualization is the way to go, not just for all the consolidation and TCO savings but for the ease with which high availability, redundancy and business continuity can be deployed. Add to that the benefit of a virtualized disaster recovery solution that can easily be tested, failed over or failed back. What was once a complex procedure can now be tested via a simple GUI-based recovery plan. Thus one should consequently see the eradication of the trepidation that often existed in testing out how foolproof an existing DR procedure actually was. Long gone should be the days of the archaic approach of the 1000 page Doomsday Book-like disaster recovery plans which the network, server and storage guys had to rummage through during a recovery situation, often becoming a disaster in itself. Hence there really is little argument against going with a virtualized DR site and more specifically VMware’s Site Recovery Manager, but not so, it seems, if you’ve been cornered and inculcated by the Microsoft Hyper-V sales team.

Before I embark further, let’s be clear that I am not an employee or sales guy for VMware - I’m just a techie at heart who loves to showcase great technology. Furthermore let it go on record that I’ve never really had a bone of contention with Microsoft before – their Office products are great, Exchange still looks fab and I still run Windows on my laptop (albeit on VMware Fusion). I didn’t even take that much offense when I recently purchased Windows 7 only to realize that it was just a well marketed patch for the heir to the disastrous Windows ME throne i.e. Windows Vista. I also took it with a pinch of salt that Microsoft were falsely telling customers that Exchange would run better on local disks as opposed to the SAN in an attempt to safeguard themselves from the ongoing threat of Google Apps (a point well exposed and iterated on David Vellante’s Wikibon article, “Why Microsoft has it’s head up it’s DAS”). Additionally my purchase of Office 2010, in which I struggled to fathom any significant difference from Office 2007, still didn’t irk me that much. What has turned out to be the straw that broke the camel’s back though are the constant claims Microsoft are making that Hyper-V is somehow an equally good substitute for VMware, consequently pushing customers to avoid a Disaster Recovery Plan that includes Site Recovery Manager. So what exactly are the main differences between the two hypervisors and why is it that I so audaciously refuse to even consider Hyper-V as an alternative to vSphere 4?

Firstly, one of the contentions often faced with virtualizing is the notion that some applications don’t perform well, if at all, on a virtualized platform. This is true when put in the context of Hyper-V, which currently limits the number of vCPUs per guest to only 4. That’s pretty much a no go for CPU-thirsty applications, leading to the erroneous idea that a large set of applications should be excluded from virtualization. This is not the case when put in the vSphere 4 context, where guests can have up to 8 vCPUs. In an industry which is following a trend of CPUs scaling up by adding cores instead of increasing clock rates, the future of high-end x86 servers provides vast potential for just about any CPU-hungry application to run on a virtualized platform – something vSphere 4 is already taking the lead in.

Then there’s the management infrastructure, in which Hyper-V uses software named System Center (SC) and more specifically the System Center Virtual Machine Manager (SCVMM), whereas the vSphere 4 equivalent is named vCenter Server. With Hyper-V being part of a complete Microsoft virtualization solution, System Center is generally used to manage Windows Server deployments. The System Center Virtual Machine Manager on the other hand not only manages Hyper-V-hosted guests but also Virtual Server, VMware Server and VMware ESX and GSX guests. Ironically this can then also be extended to managing vMotion operations between ESX hosts (perhaps an inadvertent admission from Microsoft that vMotion wipes the floor with their equivalent Live Migration). Compared to vCenter Server, which can be either a physical or virtual machine, this comes across as somewhat paltry when vSphere 4 now offers the ability to link multiple vCenter servers together and control them from a single console, enabling consolidated management of thousands of virtual machines and several datacenters. Add to this the fact that vCenter Server provides a search-based navigation tool that finds virtual machines, physical hosts and other inventory objects based on user-defined criteria, and you have the ability to quickly find unused virtual machines or resources in the largest of environments, all through a single management pane.

Taking the linked management capabilities of vCenter further, vSphere 4 also offers what VMware term the vNetwork Distributed Switch. Previously a virtual network switch was provisioned, managed and configured separately on each ESX server. With the vNetwork Distributed Switch, virtual switches can now span multiple ESX servers while also allowing the integration of third-party distributed switches. For example, the Cisco Nexus 1000v is the gateway for the network gurus to enter the world of server virtualization and take the reins of the virtual network, which was previously being run by VM system admins. Put this in the context of multiple vCenter Servers in the new linked mode and end users have the capability to manage not only numerous virtual machines but also the virtual network switches. In an Enterprise environment where there are hundreds of servers and thousands of virtual machines, what previously would have been a per-ESX switch configuration change can now be done centrally and in one go with the vNetwork Distributed Switch. Hyper-V as of yet has no equivalent.

That broad approach has also pushed VMware to incorporate not only the network guys into their world, but also the security and backup gurus. With vSphere 4’s VMsafe, VMware have now enabled the use of 3rd party security products within their virtual machines, an avenue for the security guys to at last enter the virtual matrix they previously had little or no input in. Then there’s the doorway that vSphere 4 has opened for backup gurus such as Veeam to plug into virtual machines and take advantage of the latest developments such as Changed Block Tracking and the vStorage APIs, bringing customers a more sophisticated and sound approach to VM backups. Hyper-V still has no VMsafe equivalent and certainly no Changed Block Tracking.

Furthermore as Microsoft flaunt Hyper V’s latest developments, scrutiny shows that they are merely features that have been available on VMware for several years and even then still don’t measure up in terms of performance. A case in point is Hyper V’s rather ironically titled ‘Quick Migration’. For high availability and unplanned downtime protection, Hyper-V clusters have a functionality that restarts Virtual Machines on other cluster nodes if a node fails. With ‘Quick Migration’ a Virtual Machine can also be moved between cluster hosts. Where it fails though is in its inability to do so instantly, the way VMware’s vMotion and HA features do. This hardly inspires confidence in Hyper V when a move that can take several seconds leaves you exposed to the risk of a network connection failure, which in turn results in further unplanned downtime. Consequently, Quick Migration’s inability to seamlessly move Virtual Machines across physical platforms means downtime is required for any server maintenance. This is certainly not the case with VMware and vMotion, wherein server maintenance requiring downtime is a thing of the past.

Moreover so seamless is the vMotion process that end users have no idea that their virtual machine has just crossed physical platforms while they were inputting new data. This leads us to Hyper V’s reaction and improved offering, now termed Live Migration, which Microsoft claim is now on a par with vMotion. Upon further inspection this still isn’t the case, as the number of live migrations that can be performed simultaneously between physical servers is still far more limited with Hyper V. Additionally, while Hyper V claims to be gaining ground, VMware in return have shot even further ahead with VSphere 4’s Storage vMotion capability, which allows ‘on the fly’ relocation of virtual disks between the storage resources within a given cluster. So as VMware advances and fine tunes its features such as Distributed Resource Scheduler (DRS), Distributed Power Management (DPM), Thin Provisioning, High Availability (HA) etc., Hyper V is only just announcing similar functions.

Another issue with Hyper-V is that it’s simply an add-on to Windows Server which relies on a Windows 2008 parent partition, i.e. it’s not a bare metal hypervisor, as virtual machines have to run on the physical system’s operating system (something akin to VMware’s Workstation). Despite Microsoft’s claims that the device drivers have low latency access to the hardware, thus providing a hypervisor-like layer that runs alongside the full Windows Server software, in practical terms those who have deployed both Hyper V and VMware can testify that the performance stats are still not comparable. One of the reasons for this is that VMware have optimized their drivers with the hardware vendors themselves, unlike Hyper V which sadly is stuck in the ‘Windows’ world.

This leads to my next point: with VSphere 4 there is no reliance on a general operating system, and the range of operating systems now supported by VMware continues to grow. Microsoft on the other hand, being the potential sinking ship that she is in the Enterprise Datacenter, have tried to counter this advantage by marketing Hyper V as being able to run on a larger variety of hardware configurations. One snag they don’t talk about so much is that it has to be a hardware configuration that is designed to support Windows. Ironic when one of the great things about virtualization is that Virtual Machines with just about any operating system can now be run together on the same physical server, sharing pools of resources – not so for Microsoft and Hyper V, who desperately try to corner customers into remaining on a made-for-PC operating system that somehow got drafted into datacenters. The question now is: how many more inevitable reboots will it take on a Windows Enterprise Server before IT managers say enough is enough?

Then there are some of the new features introduced in VSphere 4 which have still failed to take similar shape in the Hyper V realm. One example is VMDirectPath I/O, which allows device drivers in virtual machines to bypass the virtualization layer and access the physical resources directly – a great feature for workloads that need constant and frequent access to I/O devices.

There are also the Hot-Add features, wherein a virtual machine running Windows 2000 or above can have its network cards, SCSI adaptors, sound cards and CD-ROMs added or removed while still powered on. They even go further by letting your Win 2003 or above VM hot add memory or CPU and even extend your VMDK files – all while the machine is still running. There’s still nothing ‘hot’ to add from the Hyper V front.

Also, instead of the headache inducing complexities that come with Microsoft’s Cluster Service, VSphere 4 comes with Fault Tolerance – a far easier alternative for mission critical applications that can’t tolerate downtime or data loss. By simply creating a duplicate virtual machine on a separate physical host and using vLockstep technology to keep the two consistent, VSphere 4 offers a long awaited and straightforward alternative to complex clustering that further enhances the benefits of virtualization. No surprise then that the Microsoft Hyper V sales guys currently tend to belittle it as no great advantage.

Another VSphere 4 feature which holds great benefits and is non-existent in Hyper V is Memory Overcommitment. This feature allows the allocation of more RAM to virtual machines than is physically available on the host. Via techniques such as Transparent Page Sharing, virtual machines can share identical memory pages such as their common code, leading to significant savings in the all too common situation where adding more memory to an existing server costs more than the server itself did.
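The page sharing idea can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not how ESX is actually implemented: the page size, the function name and the VM names are hypothetical, and as far as I understand a real hypervisor would also compare candidate pages byte for byte and use copy-on-write before sharing them. The sketch hashes each guest page and counts how many distinct pages actually need physical backing.

```python
import hashlib

PAGE_SIZE = 4096  # bytes; a typical page size, assumed for illustration


def shared_page_count(vm_memories):
    """Return (total_guest_pages, unique_pages) across all VMs.

    vm_memories maps a VM name to its memory contents as bytes.
    """
    seen = set()
    total = 0
    for memory in vm_memories.values():
        for offset in range(0, len(memory), PAGE_SIZE):
            page = memory[offset:offset + PAGE_SIZE]
            total += 1
            # Identical pages hash to the same digest and are counted once.
            seen.add(hashlib.sha256(page).digest())
    return total, len(seen)


# Two hypothetical guests running the same OS image plus one private page each.
common = b"\x01" * PAGE_SIZE + b"\x02" * PAGE_SIZE
vms = {
    "vm-a": common + b"\xaa" * PAGE_SIZE,
    "vm-b": common + b"\xbb" * PAGE_SIZE,
}

total, unique = shared_page_count(vms)
print(f"{total} guest pages backed by {unique} physical pages")
```

In this toy case six guest pages need only four physical pages, and the more virtual machines that run the same operating system on a host, the bigger that gap becomes.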

So while Hyper V has also recently caught up with a Site Recovery Manager equivalent via the Citrix Essentials for Hyper V package, it’s still doing just that, i.e. playing catch up. One of the main arguments for Hyper V is that it’s free or nearly free, but again that’s marketing jargon that fails to mention that you have to buy a license for Windows Server first, and hence help prolong the dwindling lifespan of Microsoft within the Datacenter. Another selling point for Hyper V was that it was better suited to small and medium sized businesses due to its lower cost. The recent announcement of VSphere 4.1 may now also put that claim to bed. So like all great empires, collapses are imminent, and while I don’t believe Microsoft are going to the I.T. Black Hole, they certainly don’t look like catching up with VMware in the ever emerging and growing market of virtualization.