Storage Optimization

Less is More–Part 2

Posted in Featured, File Systems, Storage by storageoptimization on April 25, 2008

As we all know, the internet is where storage growth is greatest: multi-petabyte scale, with a need to stay very close to the commodity price point on storage costs. Two common threads run through all of the “less is more” file systems that have been popping up to handle this growth.

First, they are all designed so that you can build very scalable, very large pools of storage using generic white box servers stuffed with cheap disks. Second, they mostly support only the most primitive operations: create a new file, read that file, delete a file. I’m generalizing, and this is not exactly true for all of these new file systems, but many just skip things that are considered standard in traditional file systems: locking, POSIX semantics, authentication, ACLs, concurrency control, metadata, or the ability to list and search for files.
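To make the “primitive operations only” idea concrete, here is a minimal sketch in Python. It is not taken from any of the file systems discussed; the class name MinimalBlobStore and its methods are hypothetical, and a local directory stands in for a storage node. The point is what is missing: no locking, no ACLs, no listing, no update in place.

```python
import hashlib
import os
import tempfile


class MinimalBlobStore:
    """Hypothetical store that exposes only create, read, and delete."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        # Hash the key so the store needs no directory hierarchy of its own.
        name = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self.root, name)

    def create(self, key, data):
        # Mode "xb" raises FileExistsError if the blob already exists,
        # so blobs are write-once and never updated in place.
        with open(self._path(key), "xb") as f:
            f.write(data)

    def read(self, key):
        with open(self._path(key), "rb") as f:
            return f.read()

    def delete(self, key):
        os.remove(self._path(key))


if __name__ == "__main__":
    store = MinimalBlobStore(tempfile.mkdtemp())
    store.create("photo-1", b"jpeg bytes")
    print(store.read("photo-1"))
    store.delete("photo-1")
```

Anything beyond these three calls (who may read the blob, what else exists, how to find it) has to live somewhere other than the store itself.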

The overhead of all those traditional file system operations is too much for massive internet-scale operations where the primary purpose of a file system is for a user to upload something, for millions of people to look at it over and over, and maybe someday, sometime, someone will delete something.

These file systems stand in contrast to advanced file system developments such as NetApp’s latest Data ONTAP and WAFL releases, HP’s PolyServe cluster file system, or the transaction-enabled NTFS from Microsoft that you can find in Windows Server 2008.

The dividing line is this: there are file systems designed to be used by people, and file systems designed to be used only by specific applications.

The commercial file systems grew up serving the needs of business users and business applications. They are designed to host a wide variety of applications, including production databases, to let users peruse and manage their files, and to let storage administrators keep up with growth, availability, and corporate compliance requirements.

As a consequence, more and more value-add features are being put into the file system to support these use cases. The “less is more” crowd, on the other hand, wants a very cost-effective but massively scalable pool of storage to make available to their web applications. A global namespace (so it looks like one giant pool of storage) and a very low cost per terabyte are the drivers of these file systems.
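As a rough illustration of what a global namespace over commodity nodes means, the sketch below hashes each key to one of several nodes so that callers see a single pool. Everything here is an assumption for illustration: the names pick_node and GlobalNamespace are hypothetical, in-memory dictionaries stand in for the per-node disks, and real systems use their own placement schemes.

```python
import hashlib


def pick_node(key, node_names):
    """Map a key to one node with a stable hash (hypothetical placement rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(node_names)
    return node_names[index]


class GlobalNamespace:
    """Presents several per-node stores as one pool of storage."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        # Each dict stands in for the local storage of one white box server.
        self.nodes = {name: {} for name in self.node_names}

    def put(self, key, data):
        self.nodes[pick_node(key, self.node_names)][key] = data

    def get(self, key):
        return self.nodes[pick_node(key, self.node_names)][key]


if __name__ == "__main__":
    pool = GlobalNamespace(["node-a", "node-b", "node-c"])
    pool.put("videos/1234", b"video bytes")
    print(pool.get("videos/1234"))
```

Simple modulo placement like this moves most keys when a node is added or removed, which is one reason production designs tend to use something more elaborate, such as consistent hashing.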

Users don’t list their directories in these file systems. In fact, users never see these file systems. Users see web applications, and the web applications use databases to keep track of what files are where in the massive storage pool, and who is allowed to see them. In that sense, in the “less is more” file system world, a lot of the value-add and management functionality of the file system is moving up into the application layer, especially in the largest content-rich web sites.
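The division of labor described above can be sketched with a small application-side catalog. This is only an illustration under assumptions: the FileCatalog class, its table layout, and the node and key names are hypothetical, and an in-memory SQLite database stands in for whatever database a real site would run. The catalog, not the file system, answers “where is this file?”, “who may see it?”, and “what files does this user have?”.

```python
import sqlite3


class FileCatalog:
    """Hypothetical application-layer index over a flat storage pool."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE files (file_id TEXT PRIMARY KEY, node TEXT, blob_key TEXT)"
        )
        self.db.execute(
            "CREATE TABLE grants (file_id TEXT, user TEXT, PRIMARY KEY (file_id, user))"
        )

    def register(self, file_id, node, blob_key, allowed_users):
        # Record where the blob lives and who may see it, in one transaction.
        with self.db:
            self.db.execute(
                "INSERT INTO files VALUES (?, ?, ?)", (file_id, node, blob_key)
            )
            self.db.executemany(
                "INSERT INTO grants VALUES (?, ?)",
                [(file_id, user) for user in allowed_users],
            )

    def locate(self, file_id, user):
        # Returns (node, blob_key), or None if the file is unknown
        # or the user has no grant for it.
        return self.db.execute(
            "SELECT f.node, f.blob_key FROM files f "
            "JOIN grants g ON g.file_id = f.file_id "
            "WHERE f.file_id = ? AND g.user = ?",
            (file_id, user),
        ).fetchone()

    def list_files(self, user):
        # Listing happens in the database, never in the storage pool.
        rows = self.db.execute(
            "SELECT file_id FROM grants WHERE user = ? ORDER BY file_id", (user,)
        )
        return [row[0] for row in rows]


if __name__ == "__main__":
    catalog = FileCatalog()
    catalog.register("vid-1", "node-07", "ab12", ["alice", "bob"])
    print(catalog.locate("vid-1", "bob"))
    print(catalog.list_files("alice"))
```

With this split, the storage pool only ever has to serve a blob given its node and key; access checks and browsing are handled before the request reaches it.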

From my point of view, the feature-rich commercial file systems will continue to evolve to meet the needs of corporate customers, including scaling to meet their growth needs. The “less is more” file systems will continue to push out traditional file systems in the highest-growth web properties and at other customers whose data growth is at that many-petabyte scale. Finally, the two approaches are not entirely incompatible: most of the new web tier file systems actually have a bunch of single-node file systems buried in them, one on each storage node, at the bottom building-block level of their architecture.

But it’s time for these two file system approaches to evolve and develop some kind of relationship, because for now neither is perfectly suited to the problem at hand. There’s no reason why those building blocks couldn’t have richer functionality, such as the transparent clustering and failover that come from commercial file systems, and still give you the massive scale and cheap $/petabyte of a global namespace and commodity building blocks.

The internet has often been the cauldron in which new technologies are forged that then eventually move into the corporate data center. We saw this in the server world, where low-cost Linux servers displaced Sun and other Unix systems early on, and eventually that movement to cheaper, standard servers pushed Big Unix out of the corporate data center too.

The cost difference between a corporation’s EMC DMX storage array and a storage pool of white boxes with disks is even greater than the cost difference between Unix machines and standard Linux boxes. People are more hesitant to change storage platforms than server platforms (for good reason), but that huge cost difference and the rate at which storage is growing are going to cause the shift to happen sooner or later.

My prediction (and hope) is that someone will figure out a way to marry the “less is more” simple file system layers with richer underlying commercial file systems. This is what’s needed.


One Response to 'Less is More–Part 2'

  1. draft_ceo said,

    Authentication and access control are a very small part of what a filesystem does, and they generally do not cost much. The same goes for locking, etc.

    In web 2.0 systems, you will still need to traverse a directory path to reach the appropriate files. And that needs to be fast like any other traditional filesystem requirement. Reading and writing will also need to be fast like in a traditional filesystem. I am trying to understand how these web 2.0 filesystems differ fundamentally from traditional filesystems.

    I guess the main difference is that web 2.0 stuff is free, and they do not have to guard too much against data loss 🙂
