BlueArc Titan 3200 a giant among NAS systems
- 19 August, 2008 03:18
We don't have Olympic Games for file server systems, but the SPEC SFS (System File Server) benchmark serves as the next best thing, providing a comparative ranking of file server performance. If you sifted through all of the SPEC SFS results published to the SPEC Web site, you'd find that the fastest NAS systems are from NetApp, BlueArc, and EMC, who take what in Beijing would have been a gold, a silver, and a bronze medal, in that order.
Like Olympic records, SPEC results tend to change over time. In fact, the current top 10 list, which reflects results as of June 18, 2008, looks quite different from a SPEC SFS results snapshot I took about two years ago.
Other differences aside, one important fact I want to highlight is that the new BlueArc Titan 3200, which the vendor announced in March, shows significantly improved performance over previous models and puts the Titan 3210 Cluster into second place for number of operations per second, surpassed only by the NetApp Data ONTAP GX.
With the Titan 3000, BlueArc claims to have doubled the theoretical performance of its systems, a claim the company has maintained at each major release. Combined with a full set of storage applications including snapshots, replication, mirroring, and WORM (write once read many) capability, to give only a short list, the amazing performance trajectory provided enough incentive for me to review the system.
I conducted the review at the BlueArc Customers Training Lab in San Jose, where BlueArc had prepared a redundant test bed with two Titan 3210 servers connected via FC (Fibre Channel) to a SATA storage array with 90 drives and a second array with 128 FC drives. A separate machine ran the BlueArc management software.
Network, storage, and file system modules
The Titan 3210 has a modular hardware architecture with four blades mounted horizontally, each providing specialized functionality. For example, the NIM (Network Interface Module) blade hosts six GbE or two 10G Ethernet ports and controls connectivity to the application server side of the storage network.
For storage connectivity, the Titan 3210 mounts a SIM (Storage Interface Module) with eight FC ports. Each of the remaining two slots of the unit hosts an FSM (File System Module), which manages protocols such as CIFS, NFS, and iSCSI.
While more primitive NAS solutions run on beefed-up servers and queue up parallel tasks on general-purpose processors, each module in the Titan 3210 has built-in ASICs programmed to execute in parallel. Multiple dedicated chips on the NIM handle IP, TCP, and UDP, so that several network tasks can be processed at the same time. Similarly, the FSM takes advantage of ASICs to expedite the processing of file operations.
Other performance-enhancing features include battery-backed NVRAM (Non-Volatile RAM) and, in clustered configurations, an optional Dynamic Read Caching feature. Dynamic Read Caching expedites read access to selected files while maintaining a consistent cache across as many as four nodes.
How far can you push a Titan 3200? According to BlueArc's specs, the system can scale to as much as 4 petabytes of storage, with each file system ranging up to 256TB. Performance rates are on the order of 1,600 megabytes per second and as high as 380,000 I/O operations per second.
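To put those two capacity figures side by side, here is a small back-of-the-envelope sketch in Python. It only does the arithmetic on the numbers quoted above; the function name is made up for illustration, and whether BlueArc counts a petabyte as 1,000 or 1,024 terabytes is an assumption either way.

```python
import math

def min_file_systems(total_tb, max_fs_tb):
    """Smallest number of file systems needed to address total_tb of storage
    when each file system is capped at max_fs_tb (hypothetical helper)."""
    if total_tb <= 0 or max_fs_tb <= 0:
        raise ValueError("capacities must be positive")
    return math.ceil(total_tb / max_fs_tb)

# 4 PB system limit and 256 TB per-file-system limit, as quoted in the specs.
# Assumption: 1 PB = 1,024 TB (binary) or 1,000 TB (decimal).
binary_count = min_file_systems(4 * 1024, 256)
decimal_count = min_file_systems(4 * 1000, 256)
print(binary_count, decimal_count)
```

Under either unit convention, reaching the full 4 petabytes takes at least 16 file systems at the 256TB ceiling.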
Getting to know the beast
Perhaps even more impressive than those hardware specs is the Titan's software architecture, which is easy to navigate from its browser-based management GUI. The Titan organizes storage in file systems, storage pools, and system disks. Each system disk contains a number of physical disk drives and assigns properties such as RAID level and the granularity of each storage fragment.
Storage pools bring together multiple system disks to create reservoirs of capacity from which you can create file systems. This scheme gives admins the flexibility to tune for capacity and performance but is hidden from end users, who see only a large directory containing their files.
An admin can anchor file systems to a static capacity or allow them to automatically expand when a threshold is reached. Expansion can be limited to a specific capacity target per file system. Setting that hard limit makes sense because no file system can be shrunk at the moment, although according to BlueArc that feature should become available in future versions.
To help prevent runaway capacity increases, admins can define two threshold levels for file storage and separate percentages for the space used by automatic snapshots. To help admins monitor the situation, the management GUI displays a color-coded bar showing the percentage used, and the system generates a warning when one of those thresholds is reached.
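The two-threshold idea is simple enough to sketch in a few lines of Python. This is not BlueArc's code or API; the function name, the status labels, and the default 75 and 90 percent levels are all assumptions chosen for illustration.

```python
def usage_status(used_bytes, capacity_bytes, warn_pct=75.0, severe_pct=90.0):
    """Classify usage against two thresholds.

    The threshold defaults and the labels "ok", "warning", and "severe"
    are hypothetical; the Titan's actual levels are set by the admin.
    Returns a (percent_used, label) tuple.
    """
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    if warn_pct > severe_pct:
        raise ValueError("warn_pct must not exceed severe_pct")
    pct = 100.0 * used_bytes / capacity_bytes
    if pct >= severe_pct:
        return pct, "severe"
    if pct >= warn_pct:
        return pct, "warning"
    return pct, "ok"

# Example: a file system that is 80 percent full crosses the first threshold.
print(usage_status(80, 100))
```

A monitoring script could map the three labels to the colors of a usage bar, which is roughly what the GUI appears to do.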
For a more selective approach to monitoring capacity, admins can use the quota features of the Titan. Quotas allow you to define user directories (Virtual Volumes in BlueArc lingo) and their limits according to capacity or number of files. As with BlueArc's file systems, you can set soft and hard thresholds for user directories. You can also set limits per user and per group.
I was immediately comfortable with the Titan management GUI, and the capacity monitoring features of the system are adequate to prevent runaway expansions. But if you need more control over data content, such as the ability to limit or prevent storage of certain file types, you'll have to use third-party applications.
Setting limits on the space used by snapshots may seem overly cautious until you realize that the Titan can take 1,024 snapshots for each file system, either manually or driven by a schedule. Considering that you can define more than one hundred file systems, snapshots could quickly become a significant part of your storage allocation.
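A rough calculation shows how quickly the numbers add up. The sketch below uses the 1,024-snapshots-per-file-system figure from above; the per-snapshot space estimate is purely an assumption (that each snapshot retains only the data changed since the previous one), and the 2GB figure in the example is invented for illustration.

```python
MAX_SNAPSHOTS_PER_FS = 1024  # per-file-system limit quoted in the review

def max_snapshots(file_systems):
    """Upper bound on snapshots across a number of file systems."""
    return file_systems * MAX_SNAPSHOTS_PER_FS

def snapshot_space_gb(snapshots, changed_gb_per_snapshot):
    """Rough space estimate. Assumption: each snapshot holds only the data
    that changed since the previous one (not confirmed by the source)."""
    return snapshots * changed_gb_per_snapshot

total = max_snapshots(100)
print(total)                          # 102400 snapshots across 100 file systems
print(snapshot_space_gb(total, 2.0))  # 204800.0 GB at a hypothetical 2 GB each
```

Even if real-world change rates are far smaller than the made-up 2GB, the multiplication explains why a separate cap on snapshot space is worth having.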
The Titan makes it easy for users to recover lost files from a snapshot. Users can access a read-only folder hidden in their own directory to retrieve lost files without having to wait for an admin's help.
Moving beyond file systems
If you expect to go beyond the 256TB limit of a BlueArc file system, or if you want location-independent shares, the Titan offers a powerful CNS (Clustered Name Space) option. Conceptually similar to Microsoft DFS, Titan's CNS makes user directories immune from any move (users always find their files at the "same" network drive), and lets you put the aggregate capacity of multiple Titan file systems to good use.
CNS is easy to set up from the multifaceted management GUI. A wizard leads you through the steps: First you create a namespace, which also establishes the root for all future directories under CNS. Next you create a CIFS share or an NFS export on that root, which you should do to keep files from different departments separate, for example. The final step is to link an existing file system to the CNS folders.
That's how you migrate existing folders to CNS and how you can bring multiple file systems under the same name space. In my case, I had only two file systems to add.
It's important to note that adding an existing folder to CNS is a disruptive action. For users to reach their files at the new location, they will have to point to the CNS address of their folder. I typed a simple net use command in Windows and immediately gained access to the new folder under CNS.
It's the price to pay to start with a global name system, but after the folder is under CNS, any subsequent moves -- to balance load across file systems, for example -- will be transparent to users. Moreover, as mentioned, CNS opens wider horizons of capacity and performance. Depending on your requirements, that could be a compelling reason in itself to consider the Titan.
Cutting the fat on primary storage
CNS does the trick for painless and user-friendly administration. But what if users crowd your expensive storage arrays with files that are seldom if ever touched? The Titan has some powerful tools to automatically and painlessly move old files to a different storage tier.
Setting up automated data migration takes only a few steps from the management GUI. I first defined a target file system, then defined a migration path listing my source and target file system, and finally created a migration policy.
In the migration policy, you define the files to be moved according to location, file name, and the date the file was last accessed. Naturally, you can combine those criteria.
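To make the three criteria concrete, here is a minimal Python sketch of how such a selection could work on an ordinary file system. It is not how the Titan implements its policies; the function name, its parameters, and the 180-day default are assumptions, and it relies on the operating system's last-access timestamp being maintained.

```python
import fnmatch
import os
import time

def select_for_migration(root, name_pattern="*", min_idle_days=180, now=None):
    """Return files under root (location) whose names match name_pattern
    and whose last access is at least min_idle_days old.

    Hypothetical helper for illustration only; it just lists candidates
    and does not move anything.
    """
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86400  # seconds per day
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not fnmatch.fnmatch(name, name_pattern):
                continue
            path = os.path.join(dirpath, name)
            if os.stat(path).st_atime <= cutoff:
                selected.append(path)
    return sorted(selected)
```

Calling select_for_migration("/data/projects", "*.log", 365), for instance, would list log files under a hypothetical /data/projects directory that nobody has opened in a year.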
I set my source to a crowded file system on my FC arrays, and created a target file system on the less expensive and larger SATA drives.
After the migration, my source directory still looked intact, because the process replaces every migrated file with a stub pointing at the target location. Clicking on the stub immediately opens the file.
Data migration is probably the most effective way to remove clutter from primary storage. You avoid disturbing users with annoying file-cleaning campaigns, and you can save some money by buying less expensive arrays for the second tier. Plus, it's the easiest way I have seen to start with tiered storage.
During my evaluation of the Titan 3200 I ran across more interesting features than I have space to describe: for example, creating iSCSI targets, file systems with two levels of WORM, and the intriguing replication capability, which requires an additional license for incremental, only-the-files-that-changed replicas but is included at no extra cost if you can live with replicas consisting of full copies of data and metadata.
The Titan doesn't come cheap. BlueArc estimates that my test configuration including storage would run between US$600,000 and US$700,000, which is serious money but compares favorably on many levels with several competing solutions. For a company that needs unified storage offering fast performance, reliability, good management tools, and top-notch scalability, the Titan 3000 should top the list.