As promised, the Storage Optimization blog is getting a major facelift, with a brand new URL. This WordPress web address will soon disappear, so please update your links, RSS feed, and bookmarks.
The timing of this update is fitting, as it has been almost exactly two years since Ocarina Networks was first formed by a small group of dedicated folks who recognized the upwardly spiraling data growth problem for what it is: an opportunity to innovate.
The new blog will include regular news and commentary from our staff and writers, as well as plenty of guest posts from industry observers and insiders. All of this is aimed at helping you determine what you need to know when making decisions about storage. We sincerely hope you’ll enjoy it.
Again, the new link is www.onlinestorageoptimization.com. It is live now, so please update your settings!
Nice piece today in Bioinform about our compression solution for genomics data. Carter George of Ocarina spoke to the author of the piece, Vivien Marx, last week, as did Dave Lifka at Cornell. The article details the work we’re doing with Cornell University’s Center for Advanced Computing (CAC), in partnership with DataDirect, to increase their effective storage capacity by up to 90 percent.
Gene sequencing has opened up new vistas in medical research that could lead to a completely new era of “personalized medicine,” with targeted treatments and few or no side effects from medications. Momentum for this type of medicine is building–the FDA announced today that it has created a new position dedicated to “coordinating and upgrading” the agency’s involvement in genomics and other elements of personalized medicine.
The potential is huge, and it is sobering to think that all this progress could be slowed or stopped by the cost of storage. Freeing up disk space really can be a matter of life and death.
We’ve addressed this by developing compression solutions specifically designed for sequencing technologies such as those from Illumina and Affymetrix. The Bioinform article offers significant detail on the types of files we compress as well as the checksums Ocarina performs on each before any shadow files are deleted. We hope you’ll take a look at the piece.
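Ocarina’s actual implementation isn’t spelled out here, but the verify-before-delete workflow the article describes can be sketched as follows. This is a minimal illustration only: gzip and SHA-256 stand in for whatever codec and digest the product actually uses, and the data is made up.

```python
import gzip
import hashlib

def digest(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()

def optimize_with_verify(original: bytes) -> bytes:
    """Compress data, then prove a lossless round trip before the
    caller is allowed to delete the original (shadow) copy."""
    compressed = gzip.compress(original)
    if digest(gzip.decompress(compressed)) != digest(original):
        # On any mismatch the shadow file must be kept, never deleted.
        raise RuntimeError("checksum mismatch: keeping shadow file")
    return compressed

reads = b"ACGTTGCA" * 10_000   # stand-in for a sequencer output file
optimized = optimize_with_verify(reads)
print(len(reads), "->", len(optimized), "bytes")
```

The point of the pattern is ordering: the checksum comparison happens on the fully round-tripped data, so the original is only ever discarded after the optimized copy has been proven recoverable.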
Stephen Foskett had a nice post on his Packrat blog today that delves into the question of whether encryption can be done in such a way that it doesn’t interfere with compression. The whole post is worth a read. We were also pleased to see him describe Ocarina in the following manner:
“The software from Ocarina, for example, actually decompresses jpg and pdf files before recompressing them, resulting in astonishing capacity gains!”
The Packrat blog is on our RSS and Stephen is one of those bloggers who seems to have a grasp of just about everything that’s happening in storage–always adding his own fresh twist to the conversation. He’s also got a Twitter feed worth following, @sfoskett.
This blog and the rest of the folks at Ocarina Networks are now part of what could be termed the “micro-blogosphere.” We speak, of course, of Twitter. You can find us at http://www.twitter.com/optimizestorage. We hope you’ll follow us, and we look forward to following you.
Twitter has its pitfalls, as TechCrunch pointed out this week. But overall, we think it’s a good way to link up with those who are part of the emerging conversation around storage and where it is headed.
For Cornell, one of the biggest pain points was storage of genomics files. Genomics research is accelerating at a dizzying rate, and it turns out to be a very image-intensive research area. Here’s one way to think about it: when J. Craig Venter and his team first sequenced the human genome, it took up 2GB of storage. Nowadays, a single sequencing instrument can generate 100GB of data per HOUR. That’s not to say that all of it needs to be stored every time, but you get the idea. All around the world, genome sequencers are spitting out files as they race to find cures for life-threatening diseases such as cancer, Alzheimer’s and heart disease. If researchers don’t get storage under control, the pace of genomic research could actually slow.
Here’s a quick quiz to see how smart you are about primary storage optimization:
1. True or false: the only type of deduplication on the market today is block level deduplication–the type that looks at the zeros and ones on disk, and removes the duplicates.
2. Content aware deduplication is:
a) More effective than other types of optimization for primary storage;
b) The best approach to optimizing online files, such as photos, PDFs, and other already compressed files because it extracts them and reads them in their non-compressed format before optimizing them;
c) Only available from Ocarina Networks;
d) All of the above.
3. True or false: block level dedupe gets the same results on online (primary) data sets as it does on backups.
4. With online data sets, block level dedupe and content aware dedupe get:
a) About the same results;
b) Different results–block level is better;
c) Radically different results–Ocarina’s content aware deduplication solution gets 5x or better results than block level dedupe.
Answers:
1. FALSE. There’s a new type of dedupe on the storage scene: content aware dedupe. It works in part by analyzing the ones and zeros in files that have been extracted out of their compressed format, a far more effective approach for the types of files that are driving storage growth, such as images, PDFs, and Windows files. More info at: http://www.ocarinanetworks.com
2. d. All of the above.
3. FALSE. Block level dedupe gets its results from the repetitive nature of backups: daily backups create dupes, and dedupe takes them back out. With online data sets you won’t see those results, because the data is not repetitive. You need a different approach, one that can find the dedupe and compression opportunities within a single online set of files.
4. c. See the chart below for a comparison of results.
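The gap between the two approaches can be demonstrated with a toy experiment. This is only a sketch, not Ocarina’s engine: gzip stands in for any compressed file format, the fixed 4KB chunking is a simplified model of block-level dedupe, and the sample data is invented.

```python
import gzip
import hashlib

CHUNK = 4096  # fixed-size chunks, a simple model of block-level dedupe

def chunk_hashes(data: bytes) -> set:
    """Hash every fixed-size chunk; dedupe keeps one copy per unique hash."""
    return {hashlib.sha256(data[i:i + CHUNK]).digest()
            for i in range(0, len(data), CHUNK)}

payload = b"the same photo bytes " * 4096  # ~86 KB of identical content

# The same content compressed with different settings produces different
# on-disk bytes, so block-level dedupe of the raw files finds no overlap.
file_a = gzip.compress(payload, compresslevel=1)
file_b = gzip.compress(payload, compresslevel=9)
shared_blocks = chunk_hashes(file_a) & chunk_hashes(file_b)

# Content aware: extract back to the uncompressed representation first,
# then dedupe. Now every chunk of the two files matches.
shared_content = chunk_hashes(gzip.decompress(file_a)) & \
                 chunk_hashes(gzip.decompress(file_b))

print("shared chunks, block level:  ", len(shared_blocks))
print("shared chunks, content aware:", len(shared_content))
```

The design point is the one in answer 1 above: compressed formats scramble identical underlying content into non-identical bytes, so a dedupe engine that extracts files to their uncompressed form before looking for duplicates sees matches that a raw block-level pass cannot.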
Pete Steege has a post today that rightly alerts us to the next wave of storage capacity demand–fatter network pipes, which, as he puts it, “beget fat storage.”
Remember the holodeck? It turns out some Stanford researchers have figured out a way to use holography to store data. This is a quantum leap; actual commercial usage is many years away, but this is the kind of innovation that makes Silicon Valley great. Thanks to Robin Harris for discovering and posting this.
The explosive growth of data is threatening to overwhelm any number of industries. Whether we’re talking about an online photo sharing site or a high-throughput gene sequencing lab, the pain is the same: there is too much data and not enough space to store it, and costs are spiraling out of control. A recent white paper from the Taneja Group, “Extending the Vision for Primary Storage Optimization: Ocarina Networks,” takes a look at the emerging capacity optimization technologies built to handle this influx of data. It concludes that ours is one of the most compelling, as the only content-aware primary storage optimization (PSO) solution on the market today.
In its conclusion, the report states: “If you’re looking at PSO technology, Ocarina needs to be on your short list.”
Click here to access this report.
Great news — the Storage Optimization blog is getting a face lift. Stay tuned as we will be changing our look and feel, with a lot more features, and a tie-in to microblogging. For those of you who subscribe to this blog or have it bookmarked, look out for a new Web address.
Thanks to all our readers and we hope you’ll enjoy the new, improved Storage Optimization.