Sizing up storage alternatives

It wasn't long ago that IT professionals shrugged off storage as a straightforward, albeit very boring, aspect of maintaining a computing infrastructure. But in the last few years, a push towards shared enterprise storage has given rise to several deployment options.

For instance, when does a network-attached storage (NAS) device do a better job storing hordes of enterprise data than a storage-area network (SAN)? And, how do these newer technologies compare with local storage, where a hard disk is directly accessed by a server via a SCSI cable connection?

With the help of leading switch maker McData Corp. and leading server vendor Compaq Computer Corp. - which both contributed components of our storage testing infrastructure - Network World Global Test Alliance partner Miercom kicked the competitive tires of these storage technology alternatives to see how performance varied across several common storage scenarios.

Our test bed was set up to loosely emulate file servers, Web servers, video servers and other application servers with regard to the data they routinely transfer to and from a storage location. We varied the storage location between a local SCSI-attached disk drive, a disk drive on a storage server across a Gigabit Ethernet LAN, and a disk drive in a SAN disk array connected over a Fibre Channel SAN.

Which setup worked best? It depends. Our tests show the right storage route to take depends on the storage network environment, the size of the files being stored or retrieved, the type of PCI bus connection, and how your users access the stored data.

Specifically, our tests indicate that:

* The NAS environment - where data moves between a server "initiator" and a storage "target" over a Gigabit Ethernet network - can deliver better data-transfer performance than a SAN in certain cases, such as when file sizes are small.

* SANs really outperform the NAS alternative when data reads or writes are sequential and file sizes are large, as in the video-serving and backup scenarios described below.

* When connecting a server to a SAN, performance is virtually the same whether the SAN adapter uses a 32-bit or 64-bit PCI-bus connection.

* For a Gigabit Ethernet network interface card (NIC) in our NAS environment, performance was typically better via a 64-bit PCI-bus connection than a 32-bit PCI-bus connection. But the difference isn't much - only about 10 percent in our tests.

* In all cases, writing data to a storage device takes more time and resources than reading it, and consequently yields much lower data-transfer performance.

* With random data reads - when there's no correlation between data from one read to the next - data-transfer performance is much lower than with sequential reads of large data files in all scenarios we tested.

* With random reads, data-transfer performance over a Gigabit Ethernet NAS is nearly as good as reading data from a local disk drive on a SCSI bus.

The data presented is, we believe, among the first such published storage-comparison results. Still, we caution readers to keep two points in mind.

First, these results are based on the particular equipment we deployed. A SAN disk array other than the Hitachi 5800 we used, for example, might exhibit different performance characteristics.

Second, due to the broad differences between SAN, NAS and SCSI environments, the results should not necessarily be viewed as perfect apples-to-apples comparisons. For example, while direct SCSI data storage exhibits the best data-transfer performance in some scenarios, it is not generally accessible by multiple servers concurrently, as stand-alone storage nodes in the NAS or SAN environments are.

Also, while we used an off-the-shelf Compaq server as a NAS storage target, we employed a specialized Hitachi Disk Storage Array as the target node in the SAN environment. There are specialized NAS storage nodes available, too, but our attempts to procure one for this testing were unsuccessful.

How we did it

The test scenarios we created involved an application-processing server, which, depending on the application, could be an e-mail server, Web server, database server or video server. This server was the initiator of each storage operation, meaning that it issued all disk read and/or write requests.

Those requests were sent to and processed by a storage target, which varied depending on the environment. In the NAS environment, the storage target was a Compaq ProLiant server, accessed via an IP-based Gigabit Ethernet network. In the SAN environment, the storage target was a Hitachi 5800 Disk Array, which was built for the purpose of being a SAN node. In the SCSI environment, the storage target was one of the application server's internal disk drives.

We used the same Compaq server configuration as the initiator in all the scenarios. This was a fairly robust Compaq ProLiant ML370, with dual 866-MHz Pentium III processors and 1G byte of RAM.

We only changed the initiator server configuration when we changed from a NAS to a SAN environment. Then we replaced the 3Com Gigabit Ethernet NIC with an Emulex LP7000e host bus adapter.

In the SAN and NAS environments, we also compared data-transfer performance between 32-bit and 64-bit PCI-bus connections. This was the connection inside the application server used by the Gigabit NIC and the SAN host bus adapter. The 3Com Gigabit NIC we used, model 3C985B-SX, can be plugged into a 32-bit PCI slot or 64-bit PCI slot within the Compaq server. The Emulex LP7000e HBA comes in different models for 32-bit and 64-bit PCI-bus connections.

The SCSI environment is not affected by whether a Gigabit Ethernet or Fibre Channel storage network is in place. The internal disk drive was directly SCSI-bus-connected to the processor motherboard of the Compaq server. No network I/O or NICs were involved.

Another key component of this testing was a sophisticated, public domain software test tool from Intel called Iometer. This software is well suited for this mixed-technology environment because it measures and reports average data transfer in megabytes per second - whether the data is being sent to a local SCSI-connected disk, out over a Gigabit Ethernet network via a NIC or out over a SAN via a host bus adapter. Iometer issues disk reads and/or writes to any defined disk drive, which can be a local drive, a network drive mapped to a NAS node, or a drive on a remote SAN disk array.

Iometer, which consists of client and server software components, can also perform the same tests across multiple platforms concurrently and consolidate the results, or it can perform a test via multiple "threads" - instances of the same software process running concurrently and independently - on the same processor. This was the method we used for running two and five servers against the same storage target at the same time.
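To make the measurement concrete, here is a minimal Python sketch of the kind of figure such a tool reports: bytes moved divided by elapsed time, for either sequential or random reads. This is not Iometer and does not reproduce its behavior; the function names, the scratch-file helper and the parameters are all hypothetical. Pointing `path` at a file on a local disk, a mapped NAS drive or a SAN volume would exercise the corresponding transport, though operating system caching will inflate the numbers for small files.

```python
import os
import random
import tempfile
import time


def measure_read_rate(path, block_size, num_reads, sequential=True, seed=0):
    """Return (bytes_read, megabytes_per_second) for reads against `path`."""
    file_size = os.path.getsize(path)
    max_blocks = file_size // block_size
    if max_blocks == 0:
        raise ValueError("file is smaller than one block")
    rng = random.Random(seed)
    bytes_read = 0
    start = time.perf_counter()
    # buffering=0 bypasses Python's own buffer; the OS cache still applies.
    with open(path, "rb", buffering=0) as f:
        for i in range(num_reads):
            # Sequential access walks the file in order (wrapping at the end);
            # random access picks an unrelated block each time.
            block = i % max_blocks if sequential else rng.randrange(max_blocks)
            f.seek(block * block_size)
            bytes_read += len(f.read(block_size))
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return bytes_read, bytes_read / elapsed / 1_000_000


def make_scratch_file(size_bytes):
    """Create a throwaway file of random bytes and return its path (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path
```

Calling `measure_read_rate(path, 4096, 1000, sequential=False)` approximates the small random reads of the file-server scenarios, while a 64K-byte block with `sequential=True` is closer to the video scenario.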

Scenarios

In our research on how to characterize different real-world storage applications, we found that storage scenarios vary in three regards: the relative percentage of storage requests that are reads vs. writes; whether disk access is random or sequential; and the typical file size. Based on this information, we developed five scenarios for this comparative testing.

In our first file-server scenario, we designed the server to imitate an application server, such as an e-mail or file server, that conducts many, typically small, reads and writes continuously. This scenario is characterized by 80 percent reads, 20 percent writes. File size is fixed at 4K bytes, and disk access is random in all cases. This scenario tests how well small files can be served across a Gigabit Ethernet network vs. a Fibre Channel SAN.

The cumulative data-transfer rates achieved in this scenario are relatively scant - less than 1M byte/sec. This is the impact of moving fairly small files, running a mix of reads and writes and using random disk access, all of which tend to slow things down. In this scenario, data-transfer performance for all three storage environments is fairly comparable. It is only when five or more servers are collectively accessing the disk storage that the SAN environment provides slightly greater aggregate throughput. A SAN might be a slightly better choice in this type of scenario, but only if you expect to have multiple servers concurrently accessing the same disk storage.

Our second file-server scenario was similar to the first with one exception. Rather than fixing all the file sizes at 4K bytes, we also included larger files: 10 percent were 8K bytes and another 10 percent were 16K bytes. This scenario tested how the storage alternatives compared with some larger file sizes added in.
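The two file-server profiles can be written down as data, which makes the three variables (read/write mix, access pattern and file size) explicit. The sketch below is an illustration in Python, not the configuration format of any test tool; the class and function names are hypothetical, and the 80 percent share of 4K-byte files in the second profile is inferred from the stated 10 percent shares rather than given directly.

```python
import random
from dataclasses import dataclass

KB = 1024


@dataclass(frozen=True)
class WorkloadProfile:
    name: str
    read_fraction: float  # share of operations that are reads
    sequential: bool      # False means random disk access
    size_mix: tuple       # ((size_in_bytes, weight), ...)


FILE_SERVER_1 = WorkloadProfile("file server 1", 0.80, False, ((4 * KB, 1.0),))
FILE_SERVER_2 = WorkloadProfile(
    "file server 2", 0.80, False,
    # 8K and 16K shares are as stated; the 80 percent remainder at 4K is inferred.
    ((4 * KB, 0.80), (8 * KB, 0.10), (16 * KB, 0.10)),
)


def mean_request_size(profile):
    """Weighted average request size in bytes."""
    total_weight = sum(w for _, w in profile.size_mix)
    return sum(s * w for s, w in profile.size_mix) / total_weight


def generate_ops(profile, count, seed=0):
    """Yield (operation, size_in_bytes) pairs drawn from the profile's mix."""
    rng = random.Random(seed)
    sizes = [s for s, _ in profile.size_mix]
    weights = [w for _, w in profile.size_mix]
    for _ in range(count):
        op = "read" if rng.random() < profile.read_fraction else "write"
        yield op, rng.choices(sizes, weights=weights, k=1)[0]
```

Under that inferred mix, the average request in the second profile is about 5.6K bytes versus 4K bytes in the first, so each operation moves more data.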

Our tests with this scenario showed that, as file sizes increased, data-transfer throughput also increased. As with the first file-server scenario, though, there was no clear winner among NAS, SAN and local SCSI disk. It's noteworthy that, even with five servers collectively accessing the same disk storage, only 1 percent to 2 percent of the Gigabit Ethernet or Fibre Channel bandwidth was used. In other words, the transport capacity of Fibre Channel and Gigabit Ethernet far exceeds what this kind of workload demands.

In our third scenario, one, two and then five Web servers were serving the same set of Web pages and files. All disk operations were reads; all disk access was random. File sizes were variable, ranging from 20 percent very small (512 bytes) to 10 percent fairly large (128K bytes). This scenario showed how well Web pages can be served over the different storage/transport options.

We were surprised by the results of our testing with this scenario. Given random-access retrieval of a range of file sizes, the NAS environment clearly outperformed the SAN in data-transfer rate. Indeed, the NAS throughput was roughly double the SAN throughput in all cases. And despite the hype concerning the throughput speed offered by SANs, it was surprising to see Gigabit Ethernet perform so much better than Fibre Channel SANs in any situation.

In our fourth scenario, one, two and five video servers delivered streaming video. As with the previous scenario, all disk operations were reads. However, all disk access here was sequential. The same 64K-byte file size was used in all cases. This scenario tested the relative performance of serving streaming video over the different storage/transport options.

When comparing the video server scenario with the results of the Web server scenario, we saw the opposite result. With sequential disk access to consistent, fairly sizable files, the SAN environment outperformed the NAS alternative by a considerable margin - from more than double for a single video server, to nearly four times the throughput when five video servers were reading the same disk files across the SAN or NAS. In the case of five video servers, the cumulative SAN throughput, 47M byte/sec, tapped roughly half the Fibre Channel SAN's bandwidth. These tests indicated that if you are going to serve large amounts of video from a shared-storage node, your best bet is a SAN deployment.
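The bandwidth comparisons in these scenarios come down to simple arithmetic, sketched here in Python. The Fibre Channel figure is an assumption: the link speed is not stated above, and the sketch uses the commonly quoted 100M byte/sec payload rate of 1-Gbit Fibre Channel. The Gigabit Ethernet figure is the raw line rate before protocol overhead. The function names are hypothetical.

```python
# Nominal link capacities in megabytes per second.
GIGABIT_ETHERNET_MB_S = 125.0  # 1,000 Mbit/sec divided by 8, before protocol overhead
FIBRE_CHANNEL_1G_MB_S = 100.0  # ASSUMPTION: 1-Gbit Fibre Channel payload rate


def utilization_percent(throughput_mb_s, capacity_mb_s):
    """Return throughput as a percentage of nominal link capacity."""
    if capacity_mb_s <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * throughput_mb_s / capacity_mb_s


def throughput_at(percent, capacity_mb_s):
    """Return the throughput in MB/sec that corresponds to a utilization percentage."""
    return capacity_mb_s * percent / 100.0
```

With these assumptions, `utilization_percent(47, FIBRE_CHANNEL_1G_MB_S)` gives 47 percent, in line with "roughly half," and `throughput_at(2, GIGABIT_ETHERNET_MB_S)` shows that 2 percent of Gigabit Ethernet is only 2.5M byte/sec.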

In our final scenario, one, two and five application servers were writing folders and directories to the storage target in large, 1M-byte files. All disk operations were writes, and disk access was 100 percent sequential. This scenario tests how well large files are transported and written sequentially to a back-up storage disk, emulating server backup to tape.

In this scenario, application servers were writing massive amounts of sequential data to a storage target's disk. In the NAS and SAN environments, it seemed the maximum disk-write-throughput point might have been reached because the storage data-transfer rate did not increase with two or more servers, compared with a single-server initiator. With the Hitachi 5800 disk array in the SAN environment, the peak we reached was about 30M byte/sec; with the Compaq NAS server the write capacity to a single disk peaked at about 5M byte/sec. The specialized SAN storage node clearly outperformed the off-the-shelf server acting as a NAS node in our test bed. We don't know how well a specialized NAS device would have fared by comparison, but given these two storage nodes, the SAN alternative delivered much better performance.
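The reasoning used here, that a target is saturated when adding initiators no longer raises aggregate throughput, can be expressed as a small check. The Python sketch below is one possible formulation, not the method used in the lab; the function name and the 5 percent tolerance are hypothetical, and the sample readings are placeholders rather than measured results.

```python
def saturation_point(aggregate_by_servers, tolerance=0.05):
    """Return the smallest server count beyond which aggregate throughput never
    grows by more than `tolerance` (a fraction), or None if it keeps growing."""
    counts = sorted(aggregate_by_servers)
    for i, count in enumerate(counts[:-1]):
        base = aggregate_by_servers[count]
        later = [aggregate_by_servers[c] for c in counts[i + 1:]]
        if all(rate <= base * (1 + tolerance) for rate in later):
            return count
    return None


# Illustrative values only: a peak of about 30M byte/sec with no growth beyond
# one server matches the description above, but the per-count readings are made up.
flat_writes = {1: 30.0, 2: 30.0, 5: 30.0}

# Hypothetical contrasting case in which throughput keeps scaling with servers.
scaling_reads = {1: 10.0, 2: 20.0, 5: 47.0}
```

For the flat series the function returns 1, meaning a single initiator already reaches the target's write ceiling; for the scaling series it returns None.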

The SCSI option did well here for backing up a single server. Indeed, performance was comparable to doing backup over a SAN. However, a key motivation for doing a backup in the first place is to create and maintain a copy of a server's data in a location where it would be safe if something took out the server. Local SCSI doesn't accomplish that end.

In the end

There are many other scenarios that could still be tested. For example, it would be interesting to see how data-transfer performance compares if disk storage were striped across multiple target disk drives, instead of just one. It would also be interesting to see how different, specialized storage nodes - such as those from Network Appliance in the case of NAS, or EMC in the case of SANs - perform by comparison. However, neither vendor was willing to participate in this novel test bed.

The data presented represents a first step toward quantifying which of the various storage alternatives does the best job for a particular set of requirements. As our testing shows, there are cases in which each delivers the best relative data-transfer performance.

It is clear that, as far as storage technologies go, one size does not fit all. Indeed, the moral of this story may be that users need to gain a better understanding of their storage needs before they sign on the bottom line for a SAN or NAS-based storage network.

Mier is founder of Miercom, a network consultancy and product test center in Princeton Junction, N.J. Percy is lab test engineer for storage systems at Miercom. They can be reached at ed@mier.com or kpercy@mier.com.
