Storage Optimization


The impending storage crunch

Posted in Storage by storageoptimization on July 28, 2008

No one can miss the fact that data storage is spiraling upward at a terrifying rate. Joerg Hallbauer, writing on Dell's Future of Storage blog, hit the nail on the head with his post: "We are running out of places to put things."

Citing data collected by IDC, Hallbauer concludes that in a mere three years there will be 1,400 exabytes sitting on disk. Currently, according to the study, 281 exabytes of data are being stored, and the CAGR is 70 percent. Much of this data sits on laptops, home computers, or servers under your desk today, but as Joerg correctly notes, there's no question it's migrating quickly to the cloud. Huge data centers will end up holding most of this data, and disk drives are no longer growing fast enough to keep up with it.
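
As a sanity check on those numbers, here is a quick back-of-the-envelope projection. It simply compounds the cited 281 exabytes at a flat 70 percent per year, which is a simplification of what IDC actually models:

```python
# Back-of-the-envelope check of the figures above: 281 EB stored today,
# compounding at roughly 70 percent per year (both numbers as cited from IDC).
base_exabytes = 281
cagr = 0.70

for year in range(1, 4):
    projected = base_exabytes * (1 + cagr) ** year
    print(f"Year {year}: ~{projected:,.0f} EB")

# Year 3 comes out around 1,380 EB, consistent with the ~1,400 EB projection.
```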

So, where do we go from here? Well, if the traditional answer was to wait for bigger drives so you could put more stuff on each disk, the other logical move is to ask: how can I put a lot more stuff on the disks I already have? The answer is advanced storage optimization. The first simple storage optimization solutions are out there today – single instancing, deduplication, and compression. But the field of storage optimization is really just taking off, and much more sophisticated approaches are emerging that will allow a disk – whatever its physical size – to store 10, 20, or 100 times more data than it does today.
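
To make the "simple" end of that spectrum concrete, here is a minimal sketch of single instancing: keep one copy of identical content and point every duplicate at it. This is an illustration only, not any particular product's implementation; the function name and in-memory dictionaries are invented for the example.

```python
import hashlib
from pathlib import Path

def single_instance_store(paths):
    """Toy single-instancing sketch: identical file contents are stored
    once, keyed by a SHA-256 content hash; duplicates become references."""
    store = {}   # content hash -> bytes (the single stored instance)
    index = {}   # file path -> content hash (a lightweight reference)
    for path in paths:
        data = Path(path).read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        if digest not in store:          # only the first copy costs capacity
            store[digest] = data
        index[str(path)] = digest        # every later copy is just a pointer
    return store, index
```

Block-level deduplication applies the same hashing idea to chunks within files rather than to whole files, and compression then shrinks whatever unique data remains; the more sophisticated approaches mentioned above go further by looking inside specific file formats.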

What's more, the move to large data centers providing huge cloud storage services will amplify the effect, because storage optimization is all about finding redundant information and figuring out how to store it more efficiently. So the larger the data set, the more likely you are to see big wins from next-generation storage optimization.

This also naturally leads to more tiering. Where today you have fast disks (Fibre Channel or SAS) and slow disks (SATA) making up the tiers, it's much more likely that in the future the fast tiers will be solid state storage of some sort (SSD and Flash, as Joerg points out), and the massive tiers that hold the bulk of all these exabytes will be the largest possible disks, integrated into systems that have very efficient storage optimization built in.
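
As a toy illustration of what such tiering implies operationally, here is a sketch of an age-based placement policy. The tier names and the 30-day threshold are invented for the example, not taken from any product.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical placement policy: recently touched data stays on the fast
# (SSD/Flash) tier; cold data moves to the capacity tier, where the biggest,
# storage-optimized disks live. The 30-day threshold is purely illustrative.
COLD_AFTER = timedelta(days=30)

def choose_tier(last_access, now=None):
    now = now or datetime.now(timezone.utc)
    if now - last_access > COLD_AFTER:
        return "capacity tier (large disks + storage optimization)"
    return "performance tier (SSD/Flash)"

# Example: a file last read 90 days ago lands on the capacity tier.
print(choose_tier(datetime.now(timezone.utc) - timedelta(days=90)))
```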


Capacity-Optimized Storage: The Emergence of the O Tier

Posted in Storage by storageoptimization on July 23, 2008

 

Everyone is talking about the explosive growth of storage, but not all growth is the same. In fact, unstructured data (files) is growing much faster than structured data (databases), and capacity-optimized storage for files is growing much faster than traditional filer-based storage. This is driving some key developments in storage technology, as storage offerings emerge that are designed specifically for where the growth is.

Traditionally, the difference between performance-optimized storage and capacity-optimized storage was just whether a storage system shipped with Fibre Channel drives or SATA drives, and maybe how much cache was in the storage controller. Now, the differences between Performance-Optimized and Capacity-Optimized storage are becoming much bigger, with advances in both tiers taking them in different directions and further away from each other.

The "P Tier" – long dominated by NetApp and EMC – is seeing lots of advances, including bigger caches, solid state disk, and more fault tolerance. It's where data gets created, and there is a huge focus on never losing data that has just been created. The "P" in this tier doesn't just stand for "Performance," but also "Protection." Performance is measured in SPEC SFS and IOPS, and protection features include mirroring, RAID levels, synchronous replication to DR sites, and snapshots every time a file is modified or deleted. However, the P Tier is very costly per terabyte because of the premium technology required to provide all those protection mechanisms while delivering stellar low-latency performance at the same time.

Enter the "O Tier" – or what IDC calls capacity-optimized storage. This is no longer just a NetApp or EMC filer with SATA drives instead of Fibre Channel. True O Tier offerings – which are starting to come out from the major vendors – have several major architectural differences. EMC's Hulk, IBM's XIV, and HP's exciting ExDS Extreme Storage are all based on scale-out architectures. You buy "bricks" of capacity at near-commodity prices, and you can scale out these systems by just adding more bricks. Almost all of the scale-out O Tier offerings are based on clustered or distributed file systems. These architectures are drastically cheaper than P Tier storage, even P Tier offerings with SATA disks.

What's more, the O Tier is becoming clearer in what its metrics are. Data may be created in the P Tier, but it moves to the O Tier for long-term storage. That means there is less focus on extravagant protection measures. Data that makes it to the O Tier has already been backed up, snapped, replicated, and protected many times in the P Tier. On the O Tier, the key metrics are Cost per Terabyte, Terabytes per Admin, and Watts per Terabyte over its lifecycle.
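
Those metrics are simple ratios, but it helps to see them side by side. The sketch below computes them for a single system; every number in the example is invented purely for illustration.

```python
def o_tier_metrics(cost_usd, usable_tb, admins, avg_watts):
    """Compute the capacity-tier metrics discussed above. All inputs here
    are hypothetical; plug in real quotes and power measurements."""
    return {
        "cost_per_tb_usd": cost_usd / usable_tb,
        "tb_per_admin": usable_tb / admins,
        "watts_per_tb": avg_watts / usable_tb,
    }

# A hypothetical 500 TB scale-out system run by one admin:
print(o_tier_metrics(cost_usd=400_000, usable_tb=500, admins=1, avg_watts=6_000))
# {'cost_per_tb_usd': 800.0, 'tb_per_admin': 500.0, 'watts_per_tb': 12.0}
```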

The O Tier is evolving to solve different storage problems than the legacy P Tier, and because of that it is developing its own new features for capacity optimization. The most important of these new features is integrated data reduction. That can take the form of block-level dedupe, next-generation compression, or content-aware optimization. Several technologies are coming out that aim for 5X, 10X, or 20X data reduction for online storage in the O Tier. Expect these technologies to be embedded as integrated elements in leading O Tier storage offerings. Examples would include Data Domain moving from being a storage solution for backups to offering nearline storage with dedupe, or the several storage vendors who are integrating my company Ocarina Networks' storage optimization solution into their O Tier storage offerings.
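
One way to get a feel for what integrated data reduction might buy you on a given data set is to measure its compressibility directly. The sketch below uses plain zlib as a generic stand-in; real O Tier products layer deduplication, stronger codecs, and content-aware optimizers on top of this, so treat the result as a floor rather than a prediction.

```python
import zlib
from pathlib import Path

def estimated_reduction_ratio(paths):
    """Rough reduction estimate: total original bytes / total compressed bytes.
    zlib is only a stand-in for whatever reduction a real system applies."""
    original = compressed = 0
    for path in paths:
        data = Path(path).read_bytes()
        original += len(data)
        compressed += len(zlib.compress(data, 9))
    return original / compressed if compressed else 1.0

# Example (hypothetical path): estimated_reduction_ratio(Path("/data").rglob("*.log"))
# might return something like 8.2, i.e. roughly 8X on highly redundant logs.
```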

Anyone who is tracking trends in storage needs to start differentiating these tiers not just by what disks are in a given filer, but by whether they are really P Tier filers or O Tier filers, with true Performance and Protection in the P Tier, or true Capacity Optimization in the O Tier.

While the traditional NAS leaders, EMC and NetApp, will certainly come out with O Tier offerings, the emergence of a new tier with different characteristics creates a new market opportunity for other major players to become the new leaders in the O Tier. Look for HP in particular, as well as IBM, Ibrix, Isilon, and BlueArc, to be making major pushes in the O Tier this year and especially in '09.

In Startup City’s spotlight

Posted in Featured,Storage,Video by storageoptimization on July 21, 2008

I was interviewed on video by John Foley for InformationWeek's Startup City a month or so ago, and have just discovered that the video is now up on the site. If you haven't already, I encourage you to explore Foley's blog – he does a great job of covering the vast and growing landscape of IT startups. Enjoy.

What’s Hot in Storage — Spending Less

Posted in Featured,Storage by storageoptimization on July 18, 2008

Byte & Switch has once again released its “Top 10 Storage Startups to Watch” for 2008, and it’s definitely worth a read. My company Ocarina Networks was on that same list last year, and so I can say with confidence that they got it right at least once before. 

As reflected in this year's list, data reduction technologies continue to be hot. It makes sense that in a down economy, anything that increases effective capacity will continue to get budget dollars. As we're finding, the dollars for something like Ocarina are already there in every data center's budget – they're just listed as disk expense. We're not only ahead of the revenue goals for the storage optimization product we launched in April, but we're also having to triple the size of our sales force to keep up with demand.

If you had planned to buy 100 TB of disk and can instead spend half as much on an optimization solution that shrinks your files, you may not need to buy any new disk at all. That's a win all the way around. While Ocarina started out with wins at large web sites – where the fastest year-over-year storage growth is taking place – we're now seeing installs in life sciences, energy, movie studios, and finance.
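
To make that arithmetic explicit, here is the back-of-the-envelope version. Every price and capacity below is invented for illustration; none of these figures are real quotes.

```python
# All figures below are invented to illustrate the trade-off described above.
planned_new_tb   = 100        # the disk purchase that was budgeted
disk_cost_per_tb = 3_000      # hypothetical $/TB for enterprise capacity
optimizer_cost   = 150_000    # hypothetical: "half as much" as the disk buy
existing_tb      = 200        # hypothetical data already sitting on disk
reduction_factor = 2          # optimization shrinks existing files 2:1

disk_budget = planned_new_tb * disk_cost_per_tb            # $300,000 planned spend
freed_tb    = existing_tb - existing_tb / reduction_factor # 100 TB reclaimed

if freed_tb >= planned_new_tb:                 # reclaimed space covers the need
    savings = disk_budget - optimizer_cost     # $150,000 stays in the budget
    print(f"Reclaimed {freed_tb:.0f} TB, net savings ${savings:,.0f}")
```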

The chief takeaway from what I’ve seen: some nice-to-have new technologies may be facing a tough summer with an economic downturn, but data reduction scores high on both saving money and green IT, and is likely to stay strong, or maybe even move up in priority, during a down cycle in storage spending.

Rackspace down – what’s the lesson?

Posted in Uncategorized by storageoptimization on July 10, 2008

TechCrunchIT reports today that one of Rackspace’s data centers went down overnight due to a cooling issue. This is the second outage in recent months for the company, as they also went down in November after a car accident knocked out their power.

Fact is, data centers are becoming more and more complex, and the need for rack cooling is not only taking its toll on the environment but is also making data centers vulnerable to breakdowns like this one. Slowing the growth of these facilities needs to be a high priority, in my opinion. Reducing the amount of space needed for storage – in other words, storage optimization – is a highly effective counterbalance to this out-of-control growth.