Storage-area networks have helped scores of organizations more easily allocate and manage storage resources, but most SAN installations are dedicated to servers supporting one application. To gain more value, these SAN islands need to be integrated. But what is the best way to do that?
Because SANs weren't initially designed to work together, a number of issues come into play, from performance and stability to political hurdles concerning who has responsibility for what. The latter can be more challenging than any of the technical problems.
What follows is a summary of lessons learned from network executives and storage experts who already have started down the path of integrating SANs. The bottom line is that while technology in this area is still maturing and standards have yet to be defined, it makes sense to at least start moving in this direction.
"Companies have spontaneously acquired SAN islands to serve specific applications or departments," says Tom Clark, director of SAN technology for McData. "Shops may have two, four, 10 or 100 SANs. By connecting SAN islands you can share assets like tape libraries, add a storage array to provide capacity to different applications or consolidate management."
Kent Smith, president of IPSO, a systems integrator in Wayland, Mass., says the first thing you need to do is establish deployment standards.
"Standards for the hardware you're going to allow to be used, the technology you're using to interconnect the SANs - whether it's Fibre Channel or SCSI or iSCSI or whatever - and standards for the software layer to manage those SANs," he says.
That might mean scrapping some investments and modifying others, but keep in mind that technology is moving in the direction of integration.
"What you're seeing from a lot of vendors is hardware that makes it easier to integrate these things," Smith says. "EMC, Brocade, HP, everybody is trying to make it easier to put a hardware layer in between SANs to create a virtual single SAN out of independent islands."
In the end, SAN islands typically are integrated in three ways, experts say:
- Consolidated in a simple core fashion in which a large director-level switch - a chassis-based switch with 64 or more ports and built-in redundancy and availability features - replaces smaller fixed-port switches.
- Deployed in a core-to-edge strategy, where larger director-level switches at the core of the data center are attached to smaller fixed-port switches at the network edge.
- Linked over distances with Fibre Channel over IP (FC/IP) or Internet Fibre Channel Protocol (iFCP).
What's most appropriate depends on legacy infrastructure and the ultimate goal of the project.
On the cutting edge
When United Airlines Loyalty Services, the wholly owned e-commerce arm of United Airlines, realized it was time to take the next step and integrate its SAN islands, it didn't want to scrap existing investments, so it went with a core-to-edge strategy.
Gary Pilafas, senior storage/systems architect, started with three SANs: one based on Brocade Silkworm 12000 director-level switches in the company's central Elk Grove, Ill., data center and two more based on six Silkworm 2800 and two 3800 switches several miles away in a Schaumburg, Ill., data center.
To link them, Pilafas installed CNT UltraNet Edge Routers, which convert Fibre Channel traffic into IP for transmission over a Gigabit Ethernet metropolitan-area network from service provider Nacio Systems. Besides providing core connectivity, the FC/IP link supports replication and disaster recovery.
Pilafas already had integrated the SAN fabrics within UAL Loyalty Services when the call came from corporate to connect them to UAL's existing SANs. He'll again use a core-to-edge strategy to do that.
"As we move toward integrating with UAL, we want to establish a point in one SAN where we put our director-level switches," he says. "Then we will consolidate our SAN islands into the core. Each island can logically become an edge SAN. We want to have a common fabric across all of UAL to be able to utilize resources that are not always busy."
Like UAL, MasterCard is pursuing a core-to-edge SAN integration approach, one that it hopes will save money in the long run.
MasterCard initially brought in SANs to address the extensive data synchronization that was required with its direct-attached storage systems.
"We were consolidating from individual servers with non-shareable [storage] resources into an environment of larger servers capable of supporting multiple applications" all backed up by SANs, says Jerry McElhatton, president of MasterCard's Global Technology Operations in O'Fallon, Mo.
While the SANs "released pockets of underutilized capacity and reduced the need to have redundant copies of data and the associated synchronization issues," each SAN was still land-locked.
"We [now] are bridging the SAN islands with a common set of platforms and tools to allow for complete cross-platform sharing and accessibility," McElhatton says.
He used a series of edge switches that feed into larger director-level switches to give servers multiple paths to storage.
"This final phase reduces our ports, cabling and switch requirements, and decreases the need to buy additional disk for individual servers, all of which reduces our overall costs," he says.
As with any technology deployment, there are traps you need to watch for. Case in point: extending a SAN with too many edge and director-level switches.
"You have to be concerned with inter-switch links [ISL] so you don't create artificial bottlenecks," says Randy Kerns, an analyst with Evaluator Group. An ISL is created when two Fibre Channel switches are tied together via ports called e-Ports. Each ISL, or hop, introduces latency, and the general goal is to limit hop count.
"ISL hops, when used improperly, can become a problem," says Lee Abrahamson, Solutions Development Manager for CNT. "The Fibre Channel protocol is about how a server finds a target. If you have a random ISL structure with ISLs going every which way, you may have created ISLs that never get used because the [routing] algorithm is based on hop count," Abrahamson says. "For instance, if one route has two hops, and the other has three, Fibre Channel will never select the one that is longer."
Another thing to watch out for is queues backing up, which slows data transfer. McData's Clark says a technique called trunking can be used to manipulate traffic queues and manage traffic across integrated SANs.
"Instead of having each SAN port manipulate a queue, which can get backed up and overloaded [with] slowing traffic, a customer can aggregate the ports via trunking and move the queue so it can be serviced by multiple ports. Throughput is increased and traffic is load balanced across those ports."
United's Pilafas uses trunking to increase access to his storage. "We trunk to the 3800 and then with the 2800s we use ISLs to get to the core of the SAN," he says. "We have more servers connected to the ISL fabric than the trunk fabric. Our total environment is about 600 ports."
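The effect of trunking that Clark describes can be sketched with some back-of-the-envelope arithmetic. This is a rough model only: it assumes every port forwards one frame per time unit and that a trunk spreads the combined queue evenly across its member ports. The queue depths are made-up numbers, not measurements.

```python
import math

def per_port_drain_time(queue_depths):
    """Each port services only its own queue; the deepest queue finishes last."""
    return max(queue_depths)

def trunked_drain_time(queue_depths):
    """All queued frames are shared across every port in the trunk."""
    total_frames = sum(queue_depths)
    return math.ceil(total_frames / len(queue_depths))

# Hypothetical backlog: one port is overloaded while three sit nearly idle.
queues = [900, 50, 30, 20]

separate = per_port_drain_time(queues)   # limited by the overloaded port
trunked = trunked_drain_time(queues)     # load balanced across four ports
```

With these assumed figures, the separate ports need 900 time units to clear the backlog, while the four-port trunk needs 250, because the idle ports help drain the overloaded queue.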
Kerns says integrating SAN islands might require dealing with political problems within an organization.
Because SANs have grown independently, they often are segmented by political and departmental boundaries. "When you centralize SAN islands you do it to see dollar savings, but you have to cross political domains. You have the political concern you have to deal with first," he says.
Common management tools sometimes can help mitigate these disputes. "You want to make sure you have roles-based administration so the guy who's handling the topology has a different view of the SAN from the guy managing the local domain," Kerns says. "The big decision is what software I'm going to use to manage that visibility."
Besides management and politics, McData's Clark says the biggest concern with SAN integration is stability.
"The problem today is that if I simply e-Port switches together, then I create this large Layer 2 fabric, and that may be problematic in a number of ways," Clark says. "If you create a native Fibre Channel extended fabric with two or more SAN islands, the whole fabric becomes susceptible to fabric reconfiguration, to state change notification broadcasts."
Fibre Channel switches are similar to Layer 2 Ethernet switches, he says. They maintain information about all the possible routes between devices. When a link breaks, a fabric reconfiguration occurs because the switches must reevaluate which switch is the primary switch in the SAN, what its unique address is and what the addresses of the devices attached to it are, which disrupts SAN traffic. When a device is added to or removed from the fabric, a state-change notification is issued, which slows traffic.
Rich Copple, CTO of Community Health in Indianapolis, decided to avoid integration for just those reasons.
"We created three SAN islands on purpose and have not consolidated/integrated any," he says. "We had a specific purpose for keeping our fabrics separate, which is redundancy and fault tolerance. Our design eliminates pushing a bad switch configuration to a domain that includes our whole environment and possibly impacting our Tier-1 SAN."
What has to be considered is whether the benefits of integrating SANs outweigh any challenges, IPSO's Smith says. If integrating SANs seems to be the best way to go, the first thing to do is "establish standards for hardware and software," he says. "Establish your benchmark standards as something to evolve to. That doesn't mean you have to immediately conform to those standards, but establish what your objective standards are."
Then after investigating options and putting a solid plan in place, "overestimate the amount of storage you're going to need," Smith says. "You'll still come up short. Data expands to fill the space available. The more space you make available to your users the more they will demand, which brings up the last piece of advice: Have a reasonable charge-back policy to limit what is currently unlimited growth. Storage has to cost something. It has to have a value for it to be controllable."