Storage Optimization

Nice to be quoted/mentioned

Posted in Analyst,Storage by storageoptimization on June 30, 2008

It’s not always clear whether I’m making an impact, but this past week was one of those times when I realized that others are taking note of my excitement about storage optimization and the importance I attach to it. In a June 27 editorial article in Processor, “Doing More With Less,” I was quoted in a section on “Saving Space” as follows:

“Carter George notes that, ‘… storage optimization is the key technology for utilization of the space needed for data storage. By using this technology, users can shrink existing files by as much as 90%, thus enabling the storage of up to 10 times more data on disks already owned by the enterprise.'” 
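As a quick sanity check on the arithmetic in that quote, here is a minimal Python sketch. The function name capacity_multiplier is just for illustration and does not come from the article. The idea is that if files shrink by a fraction r, the same disks hold 1 / (1 - r) times as much data, so a 90% reduction works out to roughly 10 times.

```python
def capacity_multiplier(reduction: float) -> float:
    """How many times more data fits on the same disks
    when files shrink by `reduction` (a fraction from 0.0 up to, not including, 1.0)."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    # A file shrunk by `reduction` occupies (1 - reduction) of its old size,
    # so the same space holds 1 / (1 - reduction) times as many files.
    return 1.0 / (1.0 - reduction)


for pct in (50, 75, 90):
    multiplier = capacity_multiplier(pct / 100)
    print(f"{pct}% smaller files -> about {multiplier:.1f}x the data on the same disks")
```

Note how steeply the payoff climbs near the top of the range: going from 50% to 90% reduction moves the multiplier from 2x to about 10x.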

That same day, my company, Ocarina Networks, earned a mention on Jon Toigo’s excellent Drunken Data blog. In a post recalling a conversation with Chris Santilli of Copan Systems, he writes:

“Chris noted that de-duplication technology was past the hype stage (not sure about that one) but that the technology was still undergoing substantial development — rather like compression in its early days:  a lot of variations, no standards.  He further noted that some interesting work was being done by companies such as Ocarina on improved file type awareness that might help mitigate some nagging technical issues involving de-dupe of data on disks that had been defragged.  (Lot’s of “D’s” in that sentence.)”

Thanks, guys. Good to know I can get the word out to the folks who really know and understand what’s going on.



Commenting on Compression

Posted in Uncategorized by storageoptimization on June 23, 2008

On Storage Soup, the TechTarget Storage Blog, Tory Skyers wrote a really interesting post on “Compression, Dedupe and the Law” last week that I felt compelled to comment on. He raised a question about what dedupe could mean from a legal standpoint, considering that the data is altered when it goes through this process. 

My response, which you are welcome to read in detail on the site, points out one issue Tory missed: in-band compression is scary for the reasons he outlines, but fortunately it’s not the only option these days.

The other comments posted are well worth the read as well. Good to hear people debating these issues.

What to do about the coming video explosion

Posted in Analyst,Storage,Video by storageoptimization on June 4, 2008

Pete Steege’s Storage Effect is commenting today on an ABI report that highlights the explosion of video content on the web, with the audience expected to grow to one billion viewers by 2013. Steege’s response is that the report ignores the “digital home,” which will no doubt become ubiquitous in the coming years.

I agree, and would add that there are still other things driving video storage growth as well, such as a drastic increase in the number of video surveillance cameras and their resolution. But mainly, what I see is that the storage problem itself could actually be solved to a great extent with the proper optimization. For video, since video files are already compressed for transmission, the proper storage optimization has to include both video-specific recompression and video-specific deduplication.

For video on the internet, you have two related but different problems. One is to store the vast amount of content that is being generated. The second is to provide the bandwidth needed for high-definition viewing of hot content.

Most video content is not hot. People upload thousands of hours of video per day to popular sites like YouTube, but only a small fraction of that gets wide viewership. It all needs to be stored, but the key thing for most of it is to store it cheaply. That’s going to mean not just cheap disks, but video-specific storage optimization that greatly reduces the size of the video files.     

The relatively few videos (meaning a couple hundred a day) that do become popular won’t be so aggressively compressed, or they’ll be compressed for bandwidth rather than for storage. Solving the speed problem for the hot stuff that everyone is watching is easy – it will be replicated and cached, and people will get access to their hot shows and user-contributed videos. Solving the “store 900 Petabytes of user-generated video really cheaply” problem is not so easy.

Another major opportunity for optimizing video storage is that most of the videos people want access to are duplicated across many homes. Today, a blockbuster movie, a hit TV show, a TiVo recording of the big game – these are all stored hundreds of thousands of times across millions of households.

As video storage moves to cloud storage services, a lot of that can be deduplicated. For whole pieces of licensed content (e.g., a studio movie) that’s relatively easy – you’d say, here are 10,000,000 users uploading their copy of the Lion King… let’s just save one. But to get real optimization, cloud storage providers are going to want to be able to find and compress video at finer granularity than that. Let’s say there’s a football game broadcast on ABC in some markets, and carried by ESPN (with different commercials) in others. User A records it in standard def. User B records it in high def. The user in Atlanta records it from ABC. The user in Portland records it from ESPN. To be efficient, you’ll want storage optimization that recognizes that those users are all uploading versions of the same thing, and takes out the redundant information as part of the compression / deduplication process.
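The easy case described above, many users uploading the identical file, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DedupStore class and its method names are hypothetical, the store lives in memory, and matching is done by a SHA-256 hash of the whole file, so it only catches byte-for-byte identical uploads.

```python
import hashlib


class DedupStore:
    """Hypothetical in-memory store that keeps one copy of each distinct file."""

    def __init__(self):
        self.blobs = {}    # content digest -> file bytes (stored once)
        self.catalog = {}  # (user, title) -> content digest

    def upload(self, user: str, title: str, data: bytes) -> bool:
        """Record an upload; return True only if new bytes had to be stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        # Every user gets a catalog entry, even when the bytes are shared.
        self.catalog[(user, title)] = digest
        return is_new

    def fetch(self, user: str, title: str) -> bytes:
        """Return the bytes a user uploaded, served from the shared copy."""
        return self.blobs[self.catalog[(user, title)]]

    def stored_bytes(self) -> int:
        """Total bytes actually kept after deduplication."""
        return sum(len(blob) for blob in self.blobs.values())


store = DedupStore()
movie = b"studio-movie-bytes" * 1000  # stand-in for a real video file
for user in ("user_a", "user_b", "user_c"):
    store.upload(user, "movie", movie)

# A different encoding of the same movie has different bytes,
# so whole-file hashing treats it as unrelated content.
store.upload("user_d", "movie", movie + b"different-encoding")

print(len(store.catalog), "uploads,", len(store.blobs), "copies stored")
```

Note what the sketch does not do: the standard-def and high-def recordings of the same game hash differently and are stored separately. Catching that kind of redundancy takes content-aware, video-specific techniques, which this example makes no attempt at.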

Without aggressive storage optimization – including video-specific compression and dedupe – the explosive growth of video content is going to overwhelm storage capability.