From business-critical, decision-support information to endless amounts of possibly useful customer data, companies today are storing more information than many of them can handle. But data storage without a solid network architecture around it is increasingly like a filing cabinet that doesn't have any labels inside and is often stuck shut.
"Companies that treat data like a strategic asset -- that know how to manage and analyse data to gain new insights to their business -- will be in the best position to capitalize on the new business models, the new market opportunities, and the new ways to attract and develop customers," said Bill Russell, executive vice-president and COO of Hewlett-Packard Co.'s Enterprise Computing division, during a presentation in New York City in May.
"Storage capacity requirements are growing at a rate of over 50 percent per year ... and, of course, you don't solve this storage problem by throwing more capacity at it. It requires a fundamental shift in how you design and implement your storage infrastructure," Russell said.
By far, the most popular concept emerging in storage management is the storage area network (SAN), which brings all of the storage devices together on one network to better distribute server access to those devices.
Commonly touted benefits of SANs include: greater efficiency since servers can utilize various storage devices instead of all waiting for one; better overall network performance as storage and back-up activities are moved off the LAN and onto the SAN subnetwork; and disk mirroring for redundant protection.
Vendors confidently state that among the advantages of a SAN is a lower cost of ownership, but, as with any technology, that calculation depends on many factors such as cost of the hardware, security of the design, and the big kicker in storage management: interoperability.
In terms of secure design, it's important to ensure that the effort to grant servers access to multiple storage devices doesn't allow some servers to make a wild grab at any storage device they can see.
Michael Casey, a research director with consultancy Gartner Group Inc. in San Jose, California, said this is a problem particularly with Windows NT.
"NT, when it boots up, tries to take over all of the storage it can see," Casey said. "Many of the Unix variants aren't as bad ... you have to explicitly tell them these (storage devices) exist before the server tries to take control of them."
Furthermore, as Bruce Gordon, director of strategic planning with CLARiiON, a division of Data General Corp., in Southboro, Massachusetts, explained, allowing every server to openly see every device presents a security threat.
"If one of those machines is hooked up to the Internet," Gordon said, "a malicious person could get in there and reconfigure the storage software to start writing over another machine's storage."
Casey said the disks need to be masked so that only the servers with permission to access particular devices can see those devices. This can be done through topology, through software, or by taking advantage of the fact that every Fibre Channel host adapter has a unique worldwide name, similar to an IP address, he said.
"You can use that unique host adapter name to map the host adapter to a specific set of disk drive volumes that the subsystem presents, so that host adapter and the host that it's in can only see the disk drives that are assigned to it. That way, you can share a pool of 100 drives among multiple servers by assigning particular logical volumes to particular hosts, even though the hosts are sharing the same connection," Casey said.
Gordon and Casey both said the management could be done in a switch, but agreed that such a method is slow and expensive.
"[This] requires that switch crack the packets open and see where they're going, which isn't normally something a switch has to do. Either it slows it down or the switch has to be a more powerful and more expensive switch to handle the same amount of traffic. So it's the wrong place architecturally to do it," Casey said.
Both agreed that the right place to do this management is in software, with an array-based topology. Gordon explained that this arrangement lets the switch manage its ports without having to look at packets and make decisions, because the software handles all of that.
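The host-adapter masking Casey describes amounts to a lookup table kept by the storage subsystem: each adapter's unique name maps to the set of volumes that adapter may see. The Python sketch below is illustrative only. The worldwide names, the volume labels and the `visible_volumes` and `can_access` helpers are invented for the example and do not reflect any vendor's management interface.

```python
# Illustrative masking table kept by the storage subsystem:
# host adapter worldwide name -> volumes that adapter is allowed to see.
# Every name and volume label here is hypothetical.
MASKING_TABLE = {
    "10:00:00:00:c9:aa:aa:01": {"vol-00", "vol-01"},  # adapter in one server
    "10:00:00:00:c9:bb:bb:02": {"vol-02"},            # adapter in another server
}

# The full pool of volumes the subsystem presents.
ALL_VOLUMES = {"vol-00", "vol-01", "vol-02", "vol-03"}


def visible_volumes(wwn):
    """Return the volumes assigned to this adapter; unknown adapters see nothing."""
    return MASKING_TABLE.get(wwn, set()) & ALL_VOLUMES


def can_access(wwn, volume):
    """True only if the volume has been assigned to the adapter with this name."""
    return volume in visible_volumes(wwn)
```

In this model both adapters share the same pool and the same connection, yet a server that tries to claim every volume it can see finds only the ones assigned to it. An adapter that is not in the table sees nothing at all.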
Another area of vendor buzz in the SAN arena is server-free back-up. Despite the name, server-free back-up does involve a server. But instead of the data moving into the server and then back out to the back-up device, the server merely controls the movement of data directly from the storage device to the back-up device, which minimizes the load on the server.
In this often confusing realm of topological hype, vendors will tout one system as superior without clearly explaining that each storage method has its own benefits, and that the right choice depends on the customer's specific storage needs.
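The difference between the two back-up paths can be shown with a toy model in Python. This is a sketch under simplifying assumptions, not a description of any product: the `Device` and `Server` classes, the `copy_to` method and the byte counter are all hypothetical, and real devices would be driven by storage commands rather than Python calls.

```python
class Device:
    """Toy storage or back-up device holding numbered blocks (hypothetical)."""

    def __init__(self, name, blocks=None):
        self.name = name
        self.blocks = dict(blocks or {})

    def copy_to(self, target, block_ids):
        # Device-to-device transfer: the data never enters the server.
        for block_id in block_ids:
            target.blocks[block_id] = self.blocks[block_id]


class Server:
    """Toy server that counts how many bytes of back-up data pass through it."""

    def __init__(self):
        self.bytes_through_server = 0

    def conventional_backup(self, source, target):
        # Data is read into the server, then written back out to the target.
        for block_id, data in source.blocks.items():
            self.bytes_through_server += len(data)
            target.blocks[block_id] = data

    def server_free_backup(self, source, target):
        # The server only issues the instruction; the devices move the data.
        source.copy_to(target, list(source.blocks))
```

Running both methods against the same source device leaves identical copies on the back-up device. Only the conventional path adds to the server's byte count, which is the load that server-free back-up is meant to remove.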
For example, many vendors claim that network-attached storage (NAS) is in competition with SANs, but Gartner Group's Casey said the two storage methods are located in different parts of the network and don't even operate on the same concept. He said NAS involves a LAN-attached file server responding to file requests over some kind of file transfer protocol.
"SANs are a back-end network ... that tie the servers together with storage. They're not moving files, they're moving SCSI blocks," Casey said.
So a NAS system could have clients attached on one end and still be connected to a SAN or other back-up system on the other end, just like any other server, he said.
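Casey's distinction between moving files and moving blocks can be sketched in Python. The model below is a deliberate simplification: `BlockDevice`, `NasHead`, the four-byte block size and the file name used in the usage note are all invented for illustration, and no real file protocol or SCSI command set is implemented.

```python
BLOCK_SIZE = 4  # unrealistically small so the example is easy to follow


class BlockDevice:
    """SAN-style storage: addressed by block number, with no notion of files."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks

    def write_block(self, number, data):
        # Pad short writes with zero bytes so every block has the same size.
        self.blocks[number] = data.ljust(BLOCK_SIZE, b"\0")

    def read_block(self, number):
        return self.blocks[number]


class NasHead:
    """NAS-style file server: answers file requests, stores blocks behind it."""

    def __init__(self, device):
        self.device = device
        self.file_table = {}  # file name -> (block numbers, file length)
        self.next_free = 0

    def write_file(self, name, content):
        block_numbers = []
        for offset in range(0, len(content), BLOCK_SIZE):
            chunk = content[offset:offset + BLOCK_SIZE]
            self.device.write_block(self.next_free, chunk)
            block_numbers.append(self.next_free)
            self.next_free += 1
        self.file_table[name] = (block_numbers, len(content))

    def read_file(self, name):
        block_numbers, length = self.file_table[name]
        data = b"".join(self.device.read_block(n) for n in block_numbers)
        return data[:length]  # drop the padding added to the last block
```

A client of the `NasHead` asks for a file such as `report.txt` by name and never sees a block number, while the `BlockDevice` behind it deals only in numbered blocks. That is why the two approaches sit in different parts of the network and can be combined rather than traded off against each other.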
Fibre Channel vs. SCSI
In general, vendors are encouraging customers to buy Fibre Channel products instead of SCSI, even if they are not going to implement a SAN right away. Hewlett-Packard, CLARiiON, and StorageTek, among others, have all pointed out that it's better to buy Fibre Channel products now (usually at a higher cost than SCSI) and be SAN-ready for the future.
But not everyone agrees with this approach.
"That's baloney," said Alea Fairchild, managing director with consulting firm Greiner International in Boechout, Belgium. "If you need SCSI, go with SCSI. There are benefits to SANs, but sometimes the organization just needs simple storage," such as the direct-attached model where a server simply has a drive connected to it, Fairchild said.
Gartner Group's Casey said Fibre Channel is very beneficial from the server to the storage subsystem, but the link from the storage subsystem controller to the disk drives need not be Fibre Channel.
"The back-end connection to the disk drives can continue to be SCSI for all intents and purposes for another couple of years and it won't make much practical difference," Casey said.
The major benefits of using Fibre Channel as opposed to SCSI are higher throughput speeds, the ability to cable over longer distances and the ability to connect more devices.
"Our recommendation in general is people who need those benefits should go to Fibre Channel host connections," Casey said. "They don't necessarily have to go with fibre on the back end right away. They can go with Ultra SCSI or the next-generation SCSI which is even faster, and many of the vendors will continue to support back-end SCSI disks for another couple of years."
Interoperability and Standards
The big headache in the SAN arena right now is the lack of standards and associated interoperability problems.
"You can't just go plug a host adapter from here and a switch from there and a subsystem from a third place and expect them to work together," Casey said.
He recommended buying only from vendors that have done integration testing on the customer's specific configuration and are willing to certify that all of the parts will work together. He said the heterogeneous SAN is about five years away.
The Storage Networking Industry Association (SNIA) is working toward standards in this area.
Roger Reich, vice-chairman of SNIA in Colorado Springs, Colorado, said the organization is focused on four key areas of standardization: GUI-based management interfaces into network storage devices; third-party copy, or the ability to run back-up between storage elements without the data having to pass through and bog down the server; specifications for file systems that run over the top of all network storage elements; and the actual definitions of storage terminology.
Reich said customers cannot confidently implement a multivendor storage network right now, but he said that's no reason to avoid purchasing SAN technology.
"Everyone knows that the real Holy Grail of implementing the SAN is this heterogeneous or multivendor interoperability. We want to get there as fast as we can, but there's absolutely no reason to delay an investment in SAN technology today as long as it's cost-justified. SANs are just too powerful a technology to wait for, in that regard," Reich said.
"If you've got a back-up application today that needs a SAN, if you've got an on-line storage application that needs the sharing and connectivity that a SAN can provide from an individual vendor, you should go and buy it.
"It is undoubtedly true that some of the hardware that is shipping today will not be completely compatible with the truly heterogeneous solutions tomorrow, but it is SNIA's objective to limit that pain to the end user," Reich said, adding that most incompatible equipment could probably be bridged to the future network.
Advice from the Trenches
Wayne Chemy, senior architect and principal with Vespera Logic, has been installing SANs for Newcourt Financial in Toronto. He said the most important lesson he has learned is the value of building an isolation layer into the network that hides the storage elements from other services.
Chemy also found that the design had to take into account the specific needs of various applications, and that there was no need to go with full Fibre Channel.
"It didn't make sense for us to say one suit fits all. We have some applications where it makes sense to have the higher bandwidths and the extra costs associated with fibre, but as well we do some Ethernet trunking just to facilitate some of the smaller applications," Chemy said.
He said he did have to muddle through interoperability problems, but the isolation layer and varying transport kept those problems out of the business-critical level. He advised that anyone considering building a SAN start with a strong set of services that are transport-independent to protect against shifts in the market.
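One way to picture a transport-independent services layer is as a thin interface that applications call, with the actual storage elements registered behind it. The Python sketch below is an assumption about how such a layer might look, not a description of the Newcourt installation: `StorageBackend`, `InMemoryBackend`, `StorageService` and the transport labels are all hypothetical.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """What the services layer expects from any storage element (hypothetical)."""

    @abstractmethod
    def put(self, key, data):
        ...

    @abstractmethod
    def get(self, key):
        ...


class InMemoryBackend(StorageBackend):
    """Stand-in for a real device; `transport` is only a descriptive label."""

    def __init__(self, transport):
        self.transport = transport
        self._data = {}

    def put(self, key, data):
        self._data[key] = data

    def get(self, key):
        return self._data[key]


class StorageService:
    """Isolation layer: applications use keys and never see the transport."""

    def __init__(self):
        self._backends = {}
        self._placement = {}  # key -> name of the backend holding it

    def register(self, name, backend):
        self._backends[name] = backend

    def store(self, key, data, backend_name):
        self._backends[backend_name].put(key, data)
        self._placement[key] = backend_name

    def fetch(self, key):
        return self._backends[self._placement[key]].get(key)

    def migrate(self, key, new_backend_name):
        # Copy the data to another backend and repoint the key;
        # callers keep using fetch(key) exactly as before.
        self.store(key, self.fetch(key), new_backend_name)
```

Because applications only ever call `store` and `fetch`, data can be moved from an older SCSI-attached element to a newer Fibre Channel one with `migrate`, and nothing above the isolation layer has to change.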
"If you're storage independent, you can take advantage of your old storage as well as net-new storage. For instance, in this framework we've got StorageTek with CLARiiON kinds of storage in the back. We've got some older stuff that's running off straight SCSI and we've got newer stuff in this implementation that's got fibre all the way down to the disk ... That allows you to get the framework ready and react to change, but not get stuck and boxed into a certain path," Chemy said.