Computerworld

Five hot -- and cool -- storage technologies

By 2010, we'll be creating close to one trillion gigabytes of data. Here's a look at some of the technologies that will manage all that information
  • Dave Webb (Network World Canada)
  • 16 June, 2008 11:09

Down economy or not, the growing appetite for enterprise data storage won't be sated anytime soon, if ever. The rise of data-heavy multimedia files, new customer touchpoints, evolving reporting and compliance standards and other trends are contributing to near exponential growth rates in the amount of data created and stored in the digital universe.

In 2006, according to research firm IDC, 161 billion gigabytes (161 exabytes) of data were created - three million times the information in all the books ever written. That will grow to 988 billion gigabytes, or 988 exabytes, by 2010. And while 70 per cent of that data will be created by individuals, organizations will be responsible for storing 85 per cent of it.
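
For the arithmetically inclined, a quick back-of-the-envelope sketch in Python - using nothing beyond IDC's two figures above - shows why analysts reach for the word exponential:

# Implied compound annual growth rate from IDC's 2006 figure and 2010 forecast.
start_eb, end_eb = 161, 988       # exabytes created in 2006 and (projected) 2010
years = 2010 - 2006

cagr = (end_eb / start_eb) ** (1 / years) - 1
print(f"Implied compound annual growth: {cagr:.0%}")   # roughly 57 per cent a year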

Against that backdrop, it's not hard to see why these five hot - and cool - storage technologies are significant in the market today.

Solid State Drives

DRAM and flash-based drives are finding their way onto the market - EMC and Samsung are among the companies that have announced products - and the fact that they draw considerably less power, both to run and to cool, is resonating with enterprises wrestling with high power costs and the optics of green technology use.

"The long and short of it is, to use current parlance, they're very green and very fast," says Mark Peters, analyst with Enterprise Strategy Group. How much faster than their spinning hard drive brethren? "It's orders of magnitude," Peters says. "While you're talking milliseconds for disk, you're talking microseconds for solid state." With mechanical seek time removed, especially for random I/O, solid state is far superior, with access times faster than 200 Mbps. They're also quieter.

On the downside, SSDs are more expensive - way more expensive - than hard disks. "You can easily be talking 10 to 30 times the price," Peters says. But that number comes with a caveat: if you define the price in cost-per-I/O terms, rather than cost per gigabyte, the multiples get smaller. For a small percentage of data centre needs, like memory for heavily transactional processes, it might make sense to view it in terms of I/O, he says.
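
A hypothetical worked example makes Peters' caveat concrete; every price and performance figure in it is assumed for illustration rather than drawn from any vendor's list:

# Pricing the same drives per gigabyte and per I/O. All numbers are hypothetical.
disk = {"price": 300.0, "gb": 300, "iops": 200}       # assumed enterprise 15K disk
ssd  = {"price": 4000.0, "gb": 146, "iops": 10000}    # assumed early enterprise SSD

for name, d in [("disk", disk), ("ssd", ssd)]:
    print(f"{name}: ${d['price'] / d['gb']:.2f} per GB, ${d['price'] / d['iops']:.2f} per IOPS")
# Per gigabyte the SSD costs roughly 27 times more; per I/O it comes out cheaper.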

Mean time between failure numbers are similar - about one to two million hours - but critics point to the limit on the number of times an individual block can be written as an Achilles' heel. (Blocks can be written to about 10,000 times; there's no limit to the number of times they can be read.) Peters says there are workarounds, like those incorporated in EMC's Symmetrix flash drives. And tools like thin provisioning can use space more efficiently. "We're a very ingenious race," Peters says. "We'll find ways around that."
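
Some back-of-the-envelope arithmetic, using the 10,000-writes-per-block figure above plus an assumed drive size and workload, suggests why the write limit is considered manageable:

# Rough lifetime estimate assuming perfect wear levelling spreads writes
# evenly across every block. Capacity and workload are assumptions.
capacity_gb = 146            # hypothetical drive size
writes_per_block = 10_000    # endurance limit quoted above
daily_writes_gb = 500        # hypothetical sustained write workload

total_write_budget_gb = capacity_gb * writes_per_block
print(f"Estimated lifetime: ~{total_write_budget_gb / daily_writes_gb / 365:.0f} years")
# About eight years under this workload before the blocks wear out.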

Virtual Tape and Deduplication

If we could store all our enterprises' data on the fastest technology, we would, Peters says. Since we can't, we tier. Archives and often backups are traditionally stored offline, on magnetic tape. Virtual tape libraries mitigate some of the hassles of the physical medium - locating, transporting and mounting tapes for a restore.

There are two types of virtual tape library - those that are truly tapeless and entirely disk-based, and those that use a disk front-end to prep files for storage on tape, Peters says. Deduplication goes hand-in-hand with VTLs, and it's a technology that vendors including IBM and EMC have been pursuing aggressively of late.

"(Virtual tape) keeps data online longer," says Peters. "(And) deduplication has wider benefits ... you can either save money, or you could move that storage up a tier in the hierarchy."

Storage Virtualization

Yes, it's the "V" word, but we're not talking about server virtualization, which has dominated tech headlines this year. Storage virtualization - managing a number of storage resources such as storage area networks as a single pool and presenting to the user as a single resource - has been slow to take off in Canada, partly because of the smaller number of large enterprise operations compared to the US.

But when VMware launched VMware Infrastructure 3 with support for an iSCSI SAN, the storage virtualization pitch became more compelling for the midsized enterprise, says John Sloan, senior research analyst with Canada-based Info-Tech Research Group. Many midsized companies have direct-attached storage on their servers and have been sitting on the fence about consolidating and making a big investment in Fibre Channel, Sloan says. But now, as the server refresh window opens, those companies are considering virtualization's more famous Baldwin brother. "This is where the pitch for server virtualization comes in," Sloan says. And to get the best out of a virtualized production environment, "they really need to abstract the storage from the server." Ironically, storage virtualization, which once appealed only to the large enterprise, is now more attractive to midsized businesses that don't have an existing investment in Fibre Channel to write off.

Cloud Storage

"Cloud" is a 2008 buzzword to rival the "V" word above. On the storage side, everyone from IBM to EMC to Microsoft has weighed in with a strategy or offering regarding storing data in the server cloud.

This trend, though, may be more sizzle than steak.

"The idea of hosting all your applications and files 'in the cloud' and using a thin client as your main means of access to them will never take off," argues Jon Stokes, senior editor and co-founder of Ars Technica. "I store quite a bit of data in my IMAP inbox via attachments, and I can access most of those files on my iPhone. So that's a form of cloud-plus-thin client computing, I suppose. But I'm not going to work an eight-hour day that way."

Though Ontario's privacy commissioner has been probing the security and privacy aspects of the cloud, Stokes says latency is the bigger challenge. There's an unchangeable inverse relationship between latency and cost per bit, and the cloud doesn't change that.

"Computer designers always place the maximum amount of the lowest-latency storage that they can afford as close to the ALUs as they can get it," he says. "Economics dictate that that maximum amount is never really enough to do everything you want to do, so they have to back that low-latency storage pool up with a larger, cheaper, higher-latency pool. And then they back the new pool up with an even larger pool, and so on until you get very far away from the ALUs."

Most of our work takes place in the layers of that hierarchy closest to the processor, extending out to the computer's hard disk, file servers on the local area network and the data centre. The cloud just provides more cheap, high-latency storage much further down in the hierarchy.
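
A rough sketch of that hierarchy, with order-of-magnitude latencies assumed rather than measured, shows why skipping the middle layers hurts; the cloud figure assumes a wide-area round trip:

# Each layer is larger, cheaper and slower than the one before it.
# Latencies are rough orders of magnitude for illustration only.
hierarchy = [
    ("CPU cache",        10e-9),
    ("main memory",     100e-9),
    ("local disk",        5e-3),
    ("LAN file server",  10e-3),
    ("cloud storage",   100e-3),
]

base = hierarchy[0][1]
for name, latency in hierarchy:
    print(f"{name:16s} ~{latency * 1e9:>12,.0f} ns  ({latency / base:,.0f}x the cache)")
# Cut out the middle layers and every access pays the bottom row's price.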

"That bottom, very cheap/slow layer is useful, but when you start suggesting that people will cut out the middle layer and spend a ton of cycles waiting for data to go back and forth between main memory and some distant networked storage device, then you're just talking nonsense."

That said, the Web services market will take it seriously - it's not that latency-sensitive - and consumers and enterprises will use it for backup and document sharing, Stokes says.

Holographic Storage

Rumours of productization of the futuristic-sounding holographic storage have been in the wind for years as researchers fought to come up with a 3-D medium to pack more bits into the same optical storage space. Last year, InPhase Technologies, a Bell Labs venture spun out of Lucent Technologies in December 2000, introduced Tapestry 300r, a line of drives and media the company called the first commercial holographic storage products. Generation 1 of Tapestry packs 300GB of data onto a 1.5-mm thick, 130-mm diameter disk at read/write rates of 20MBps. The company says by Generation 3, within four years, the media will hold 1.6TB and read and write at 120MBps.
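
Some quick arithmetic on those roadmap figures gives a feel for the medium; only the capacities and transfer rates quoted above go into it, with the rates taken as megabytes per second:

# Time to write a full disk at each generation's quoted capacity and rate.
generations = [
    ("Generation 1",  300, 20),     # 300GB at 20MBps
    ("Generation 3", 1600, 120),    # 1.6TB at 120MBps
]

for name, capacity_gb, rate_mb_per_s in generations:
    hours = capacity_gb * 1024 / rate_mb_per_s / 3600
    print(f"{name}: ~{hours:.1f} hours to write a full disk")
# Roughly 4.3 hours for Generation 1 and 3.8 hours for Generation 3.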

Company co-founder and CTO Kevin Curtis begged off an interview ("We're really busy at the moment") but offered this by e-mail:

"We are still in development of a 300GB removable holographic drive. We have some early evaluation units that are going out next week to a couple of customers. Earlier prototype drives were used by Turner (Broadcasting System) for some simple testing and actually to broadcast a commercial over TBS."

The theory of holographic storage, short version: A signal beam carrying the data to be written interferes with a reference beam, creating a three-dimensional interference pattern that is recorded through the entire depth of the storage medium, not just on the surface like CD or DVD storage. Shining the reference beam at the disk reconstructs the stored data on readback; conversely, reading the disk with a data beam will reconstruct the matching reference beam, speeding search and seek.

"The main thing with holographic storage is it has potential for longer-term storage of large data sets," says Info-Tech's Sloan. Given the growing need for data storage for compliance purposes, a long-term medium at a lower cost per bit than tape or existing optical media is promising, he says.

Two things might slow uptake though. First, proponents claim holographic storage can maintain data for 50 years. "It's hard to do a proof-of-concept on that in two weeks," Sloan says. And that very lifespan - one of its key benefits - can be a deterrent. Storage technologies evolve quickly. "What kind of technology will there be in four years?" Sloan asks - and will that investment in storage with a 50-year lifespan still be relevant?

Regardless, it's a fascinating technology, Sloan says. All other magnetic and optical media work by storing discrete bits of data - they're either ones or zeroes, on or off. "The idea that a single piece on the media can mean something different depending on the angle or the colour of the beam is interesting," he says.