The SANMAN: January 2010

Sun’s Oracle Merger – A marriage made in heaven or a deal with the devil?

Posted by Archie Hendryx on Monday, January 25, 2010

With only the ‘you may now kiss the bride’ custom to follow, the ORACLE/SUN marriage (or dare I say SUN/ORACLE) is now finally complete. After months of legal wrangling which caused nothing but embarrassment and diminished SUN’s stature in the market, reports have also come out that half of Sun's 27,000 staff will be made redundant. The initial indications are thus clear that Oracle, known for its past agnosticism towards open source, sees the merger first and foremost as a means of maximizing profit. In the meantime Sun’s competitors are probably smiling wryly, as the delay of the merger played into their immediate interests. But what threats and challenges does this partnership now pose to the once great open source vendor which did so much to develop the tech and e-commerce industry?

One thing which Oracle will most probably do is address and remediate the main cause of Sun’s tragic decline prior to the days when talk of ‘takeovers’ and ‘falling stock shares’ became the norm. In my humble opinion that decline was linked to Sun failing to consolidate its strengths and instead audaciously venturing into unknown avenues, only to find that it couldn’t compete with the existing competition. By spreading itself too thinly, the ambitious nature of the company soon led it into labyrinths it couldn’t escape from. One such adventure was its acquisition of StorageTek.

StorageTek, known for their solid modular storage arrays and robust tape libraries, had a decent reputation of their own prior to Sun’s takeover. Data Center managers, IT Directors and their like knew they had solid products when they purchased the StorageTek brand, but in a miscalculated maneuver Sun decided to rename all of those storage products with the Sun Microsystems brand. Suddenly Sun’s Sales team had to sell what, for the average IT Director, was seemingly a new and unproven product on the back of an unneeded name change. Additionally, when these storage products took on the same name as Sun’s existing storage brand, StorEdge, further confusion came into the mix. Couple that with an emerging market for disk-based backups, and purchasing a company whose forte was tape libraries didn’t particularly make the best business sense.

So what future does Oracle have planned for Sun’s current Storage portfolio? One certainty is that the OEM partnership with HDS’ enterprise arrays will continue, but as for their own range of modular arrays the future doesn’t look so promising. In a market with products such as EMC’s Clariion, HDS’ AMS range and, ironically, Larry Ellison’s Pillar Data Systems, the truth of the matter is that Sun’s current modular range simply can’t compete. As cost effective as they are, their performance and scalability were always limited in relation to their direct competitors, something that was already acknowledged by Sun prior to the takeover when they discontinued the SE6920 due to its direct competition with the HDS equivalent, the USP VM.

Furthermore, if Oracle’s push with the Exadata V2 is a sign of things to come, one can hardly see them developing an integrated backup model based on an increasingly frowned upon tape infrastructure made by StorageTek. Don’t get me wrong, I’ve worked with the SL8500 tape library and often watch in amazement as the robotic arms gesticulate as if they were in the climactic scene from a Terminator movie. But that’s the problem … it’s so 1990s. Add to the equation the NAS-based SUN 7000 Unified Storage System, which has received rave reviews, and the question arises as to whether Oracle will forsake its modular storage and tape libraries to focus solely on this trend.

Another venture which Sun entered, yet which in hindsight did little to further its reputation, was server virtualization. While VMware was taking off at the time with ESX 3 and the magic of VMotion, DRS, HA, VCB etc., Sun had the dilemma that the server virtualization revolution was taking place on x86 architecture and not on Sun’s mainstay SPARC. Not satisfied with reselling VMware for its x86 platforms, Sun decided to introduce its own version of virtualization which was compatible with its SPARCs, namely Solaris zones. With huge monster servers such as the M series, the concept was to have numerous servers (zones) utilizing the resources of the one physical box, i.e. the global zone. But in an industry that was moving further towards blade servers and consolidation via virtualization, the concept of having huge physical servers housing several virtual servers that couldn’t be VMotioned, and could only offer high availability by having a cluster of even more huge servers, seemed bizarre to say the least. No one disputes the great performance and power of Sun’s SPARC servers, but to offer them as a virtualization platform is completely unnecessary. Moreover Sun’s x86 platforms, which haven’t radically changed over the years apart from their purple casing now being a slicker silver one, have also proved to be less than reliable when ESX is installed upon them. Indeed my only experience of the legendary PSOD was on the one occasion I witnessed ESX installed on Sun x86 hardware. As Red Hat and others make moves into the virtualization sphere with solutions superior to the Sun model, the question arises as to what role virtualization will hold for Oracle. Larry Ellison has already made it evident that he wants to give full support to SPARC, but I’m not so sure, especially when Oracle decided to house Intel Xeons and not Sun SPARCs at the core of their Exadata V2.

As for the excellent host-based virtualization of VirtualBox, the open source nature of the product simply doesn’t fit in with Oracle’s approach of utilizing its dominant position to leverage big bucks from its customer base. With Oracle also already having Xen-based virtualization technology, I doubt virtualization will remain on the development radar of the newly occupied Sun offices. Come to think of it, will any of the open source products remain?

Another aspect which worries me even further is the future of Solaris and ZFS. Despite Larry Ellison’s quotes about focusing on Java and Solaris, Solaris administrators still feel a touch uneasy, something which Red Hat have taken advantage of by offering discounted Solaris to Red Hat conversion courses. As for ZFS, I’ve made no secret of my admiration for what is the most system admin friendly file system and logical volume manager on the market. But the recent legal wrangling over patents with NetApp, which is sure to escalate, and Apple’s subsequent rejection of it for their OS leave the revolutionary filesystem in a rather precarious position. Is Oracle going to put up a fight, or will it be a case of no profit means no gain?

Despite the great wedding celebrations and fanfare which will inevitably occur during the honeymoon period, I will sadly shed a tear as a fair maiden who believed in and stood for the virtues of platform independent technologies is whisked off into the sunset by another burly corporate man. One can only hope that the aforementioned kiss is one of love and understanding which will rejuvenate Sun, and not a fatal kiss of death.

Green Storage: MAID To Do More Than Just Spin Down

Posted by Archie Hendryx on Tuesday, January 19, 2010

The fundamental Green problem of all data centers is that they cost a fortune to power up, which in turn produces heat which then costs a fortune to cool down. Within this vicious circle a bastion was set up to counter this, namely ‘Green Storage’, which has mostly taken shape in the form of virtualized storage, data deduplication, compression, thin provisioning, DC power and SSDs. Add to the circle power conservation techniques such as server virtualization and archiving onto nearline storage, and you have most companies claiming they are successfully moving towards being ‘Green’. Hence bring forth the proposition of MAID storage and many users would not see the need for it. Further reluctance towards the technology would also come from the fact that MAID has now somewhat tragically become synonymous with being merely a disk spin down technique, despite having the potential to be far more and concurrently bringing greater cost savings.

First coined around 2002, MAID (a massive array of idle disks) held the promise of only utilizing 25% or less of its hard drives at any one time. As Green technology became fashionable, major storage vendors such as EMC, HDS, Fujitsu and NEC began incorporating MAID technology, promising that drives could power down when they weren't in use, thus extending the lifecycle of cheap SATA drives and in turn reducing the costs of running the data center. But caught in the midst of being one of many features that the larger vendors were trumpeting, the development and progress of MAID failed to advance beyond its disk spin-down tag, leaving Data Center and Storage managers oblivious to its full potential. Furthermore MAID became associated with being a solution only suited to backup and archive applications, as users were cautious of the longer access times that resulted when disks spun up after being idle.

Fast forward to 2010: with government regulations demanding more power savings, and on the back of a year which saw data grow while budgets shrank, users are suddenly looking for further ways to maximize the efficiency of their data storage systems. Hence in a world where persistent data keeps increasing, a hero in the vein of MAID may just be ideal.

To be frank, the detractors did have a point with the original MAID 1.0 concept, which in essence was simply to stop spinning non-accessed disk drives. Also, with a MAID LUN having to use an entire RAID group, a user without a large amount of data was left with an awful lot of wasted storage. Add in the scenario of data put on MAID that suddenly requires more user access, and hence constant disk spin, and the overall cost savings became minuscule. Therefore those that did go for MAID ended up utilizing the technology for situations where access requirements and data retrieval were not paramount, i.e. backup and archiving.

In retrospect, what often gets overlooked is that even with tier 2 and tier 3 storage only a fraction of the data is frequently accessed, which leaves MAID as a suitable home for the less-active data sets. The real crux of the matter is the potential access time overhead that occurs as disks have to be started up, which is a given when only one spin-down level is available.

Now with updated ‘MAID 2.0’ technologies such as AutoMAID from Nexsan, varying levels of disk-drive ‘spin down’ are available which utilize LUN access history to adjust the MAID levels accordingly. With Level 0 you have hard drive full-spin mode, with full power consumption and the shortest data access time, while Level 1 unloads the disk read/write heads, using 15%-20% less power than Level 0 at the cost of only a fraction of a second more in access time. Additionally you have Level 2, which not only unloads the disk heads but also slows the platters by a further 30-50% from full speed, giving an access time in the region of 15 seconds on the initial I/O before the disks jolt back up to full speed. Similar to MAID 1.0, Level 3 allows the disk platters to stop spinning, bringing power consumption down by 60%-70% with an access time of 30-45 seconds on the initial I/O. In a nutshell these various levels of MAID now open up the doors for the technology to be a viable option for tier 2, 3 and 4 storage data without the apprehension of delayed access times.
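To make the idea of access-history-driven levels a little more concrete, here is a minimal Python sketch. The power and access-time figures are the ones quoted above; the idle-time thresholds, the names `MaidLevel` and `pick_level`, and the policy itself are purely my own illustrative assumptions and not Nexsan's actual algorithm. Level 2's power saving is left as `None` because no figure is quoted for it here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MaidLevel:
    level: int
    description: str
    # (low, high) percentage saved versus Level 0, or None if not quoted in the post
    power_saving_pct: Optional[Tuple[int, int]]
    first_io_delay: str


# Figures as quoted in the post; list index equals the level number.
LEVELS = [
    MaidLevel(0, "full spin", (0, 0), "shortest"),
    MaidLevel(1, "heads unloaded", (15, 20), "fraction of a second"),
    MaidLevel(2, "heads unloaded, platters slowed", None, "about 15 seconds"),
    MaidLevel(3, "platters stopped", (60, 70), "30-45 seconds"),
]

# HYPOTHETICAL policy: (minimum idle minutes, level), checked from deepest to lightest.
IDLE_THRESHOLDS_MIN = [(120, 3), (30, 2), (10, 1)]


def pick_level(idle_minutes: float) -> MaidLevel:
    """Pick the deepest spin-down level whose idle threshold has been reached."""
    for threshold, level in IDLE_THRESHOLDS_MIN:
        if idle_minutes >= threshold:
            return LEVELS[level]
    return LEVELS[0]


if __name__ == "__main__":
    for idle in (5, 10, 45, 500):
        chosen = pick_level(idle)
        print(f"idle {idle:>3} min -> Level {chosen.level} ({chosen.description})")
```

A real policy would presumably also weigh how costly a 30-45 second first I/O is for the application sitting on the LUN, which is exactly why good data classification matters.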

Some companies have gone even further with the technology by adding the ability to dedupe and replicate data in their libraries. Thus users have the option to isolate drives from the MAID pool and dedicate others for cache, leaving the cache drives to spin continuously while simultaneously increasing the payback of deduplication. The possibilities for remote replication as well as policy-based tiering and migrations are obvious. An organization with a sound knowledge of its applications could make significant savings by moving data off expensive tier 1 disks to a MAID technology that incorporates both deduplication and replication capabilities, with minimal, if any, performance loss.

Moreover, where data becomes inactive during the night (user directories, CRM databases etc.), MAID technology allows disks to be spun down easily when users leave their office. This saves unnecessary spin cost and energy for numerous hours each evening, and by also using an automated process for long periods of inactivity such as holiday periods, users would quickly increase energy savings as well as decrease management costs.
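As a rough back-of-the-envelope illustration of the overnight case, the sketch below estimates the daily energy drawn by a set of drives that drop into a spin-down level while idle. The 1,000 W drive load and the 12 idle hours are hypothetical figures of my own; the 60% saving is simply the low end of the Level 3 range quoted earlier. The model deliberately ignores spin-up energy and everything in the array other than the drives.

```python
def daily_drive_energy_kwh(drive_power_watts: float,
                           idle_hours: float,
                           idle_saving: float) -> float:
    """Energy per day (kWh) for drives that spin down during idle hours.

    idle_saving is the fraction of drive power saved while spun down (0.0 to 1.0).
    """
    if not 0 <= idle_hours <= 24:
        raise ValueError("idle_hours must be between 0 and 24")
    if not 0 <= idle_saving <= 1:
        raise ValueError("idle_saving must be between 0 and 1")
    active_hours = 24 - idle_hours
    watt_hours = drive_power_watts * (active_hours + idle_hours * (1 - idle_saving))
    return watt_hours / 1000


if __name__ == "__main__":
    always_on = daily_drive_energy_kwh(1000, 0, 0.0)     # hypothetical 1 kW of drives
    overnight = daily_drive_energy_kwh(1000, 12, 0.60)   # 12 idle hours at a 60% saving
    print(f"always spinning: {always_on:.1f} kWh/day")
    print(f"overnight spin-down: {overnight:.1f} kWh/day")
```

Under those assumptions the drives go from 24 kWh to roughly 16.8 kWh a day, before counting the reduced cooling load.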

No doubt in the current mainstream MAID is still best suited to persistent data that's not static, and its success depends largely upon accurate data classification practices. But once MAID 2.0 and its features of variable drive spin-down, deduplication and replication begin to get the attention they deserve, we may well see a ‘Green’ solution which really does bring significant cost and energy savings. With the real ‘Green’ concern of most IT Directors being that of the paper kind with Benjamin Franklin’s face on, that attention may just occur sooner than we think.

When RAID 10 Is Worth The Economic Cost

Posted by Archie Hendryx on Wednesday, January 06, 2010

Faced with the choice between the extra disks needed for RAID 10 and the heavily marketed RAID 5, most users would go with the economic option and choose the latter, believing they have spared themselves potential capacity problems in the future. But with 15k FC disks now seen as the norm for OLTP (with some users even going for SSDs), the decision between RAID 10 and RAID 5 needs to go beyond economic considerations.

Storage administration becoming easier and more accessible via GUIs and single management panes has ironically had the downside of poor storage practices becoming more commonplace. Enterprise storage systems with unbalanced loads on their BEDs and FEDs, write pending cache levels which are too high despite an abundance of cache, and array groups hitting maximums while others lie dormant are just some of the situations occurring and causing unnecessary strain and performance problems. Coupled with a sometimes negligent approach to matching application demands with the relevant storage, performance degradation of expensive high end arrays often leads to hard-to-trace degradation of replication and backup procedures.

The obvious case for considering RAID 10 instead of RAID 5 for an OLTP or Exchange database falls into a similar category, with a host of Storage administrators ready to shake their heads in disagreement, content with the performance they’ve attained with RAID 5. While on the surface that may be the case, a closer inspection of how the arrays are affected by the varying read/write combinations coming from different hosts quickly paints an alternate picture.

In the case when several hundred reads and writes bombard the system, suddenly the effects on the array become apparent. Even with the best tuned cache you will struggle in such a situation if the amount of data reaching the FEDs can't be passed on at the same rate to the BEDs. The result is a high write pending cache level which will then lead to other problems such as high I/O wait times on the host or even arrays unable to accept further writes due to a chockablock backlog. The point to remember is that just because the host believes it has completed its writes, the actual writes being committed to disk are still the responsibility of the array. Like any Christmas shopping rush, order is maintained by a steady flow of customers going through the shop doors avoiding crushes, stampedes and eventual door blocks. In the same vein, pending writes in cache need to be written to disk at a rate similar to how they come in.
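The shop-door analogy can be sketched as a very simple queue model. This is a toy illustration only: the function name, the per-second loop and all of the MB/s and cache figures are my own assumptions, and a real array would start throttling hosts at a write-pending threshold well before the cache is completely full.

```python
def simulate_write_pending(incoming_mb_s: float,
                           destage_mb_s: float,
                           cache_mb: float,
                           seconds: int) -> list:
    """Return the write-pending level (% of cache) at the end of each second."""
    pending = 0.0
    history = []
    for _ in range(seconds):
        pending += incoming_mb_s                # writes accepted into cache via the FEDs
        pending -= min(pending, destage_mb_s)   # writes destaged to disk via the BEDs
        pending = min(pending, cache_mb)        # cache full: further writes must wait
        history.append(100.0 * pending / cache_mb)
    return history


if __name__ == "__main__":
    # Hypothetical figures: hosts push 500 MB/s, back end destages only 400 MB/s.
    overloaded = simulate_write_pending(500, 400, 1000, 12)
    # Same cache, but the back end keeps pace with the incoming writes.
    balanced = simulate_write_pending(300, 400, 1000, 12)
    print("overloaded:", [round(p) for p in overloaded])
    print("balanced:  ", [round(p) for p in balanced])
```

In the overloaded case the write-pending level climbs by ten percentage points every second until the cache is saturated, no matter how large the cache is; a bigger cache only postpones the moment.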

With the parity-based RAID 5, applications with heavy random writes can easily cause high write-pending ratios. Thus the suitability of RAID 10 for such applications becomes evident. For sequential reads and writes RAID 5 works a treat due to the minimized head movement. Yet if you fall for the trap of placing a write-intensive workload on RAID 5 volumes instead of RAID 10, you will soon have the burden of the overhead of parity calculations, which will in turn affect the performance of the processors. It is therefore erroneous to conclude that you can save costs by utilizing RAID 5 and compensate for the performance with more cache; doing so will only lead to a high write pending level. Hence establishing the ratio of reads and writes generated at the host level, and matching the appropriate RAID type to it, will lead to better performance as well as optimization of your storage array.
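The effect of the read/write ratio can be put into numbers with the commonly used ‘write penalty’ rule of thumb: a small random write costs roughly four back-end disk I/Os on RAID 5 (read data, read parity, write data, write parity) and two on RAID 10 (one per mirror). That rule of thumb and the 700/300 workload below are assumptions for illustration; cache hits, full-stripe writes and vendor optimizations will change the real figures.

```python
# Rule-of-thumb back-end I/Os generated per host write (an assumption, not a measurement).
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4}


def backend_iops(read_iops: int, write_iops: int, raid_type: str) -> int:
    """Estimate the disk I/Os per second that a host workload puts on the back end."""
    if raid_type not in WRITE_PENALTY:
        raise ValueError(f"unknown RAID type: {raid_type}")
    return read_iops + write_iops * WRITE_PENALTY[raid_type]


if __name__ == "__main__":
    # Hypothetical OLTP workload: 700 reads/s and 300 writes/s at the host.
    for raid in ("RAID5", "RAID10"):
        print(raid, backend_iops(700, 300, raid), "back-end IOPS")
```

For that workload the rule of thumb gives 1,900 back-end IOPS on RAID 5 against 1,300 on RAID 10, and the gap widens as the write share grows.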

To conclude, a high write-pending utilization on your cache could be the symptom of an imbalance on either your host or your storage system. If your host has volume management with striping deployed on the volumes, chances are you're not spreading the stripe across all the BEDs. Furthermore the RAID type is probably not the most suitable. With migration tools such as Hitachi’s Tiered Storage Manager (formerly known as Cruise Control) it’s a straightforward process to migrate data transparently from one LUN to another, thus allowing you to change from a RAID 5 to a RAID 10 parity group. In such circumstances the cost of RAID 10 may be higher, but the performance cost of mismatching the RAID type to the relevant applications will be higher still.