SAN FRANCISCO (04/28/2000) - Storage service providers (SSPs) have added a new facet to the growing e-commerce industry. By providing managed storage services, they make it easier to ramp up a customer's storage capacity quickly for a new project or business. However, the providers have some issues to overcome.
The role of an SSP, similar to that of an Internet service provider and an application service provider (ASP), is to offer managed services for products a company cannot or chooses not to manage itself. Although storage is cheap, maintaining arrays of disks, tape libraries, and CD-ROM servers to support large database systems and data warehouse applications is neither cheap nor easy.
Now that the capabilities of high-speed wide area networks are reaching, and even surpassing, those of server-internal storage systems, an SSP can offer its services to a customer location over the network. The customer thus maintains its data offsite, but its servers have a direct connection to the data.
Sharing alike on the Internet
Storage sharing over the Internet between different host platforms has been possible since the 1980s and the rise of the Network File System (NFS) and other distributed, shared-file systems. Such protocols typically define how to exchange the files between computers. Some can define user permissions and shared access control between the hosts, while others can use transport-level security mechanisms to encrypt the data for transfer, but such protocols are in the minority.
The problem with storage over the Internet isn't the ability to share data, but the ability to do it quickly and reliably. Storage buses run at speeds from 10 Mbps to several hundred megabytes per second, which is faster than the network interfaces on many servers. Most companies don't have very-high-speed links, such as OC-3c running at 155 Mbps, allocated between sites for the sole purpose of providing storage services. Such a configuration is simply too expensive if the sites are widely distributed. Additionally, network protocols such as TCP/IP add overhead to each block of data transmitted between the server and the storage systems.
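To put rough numbers on that gap, the following minimal sketch compares how long one gigabyte takes to cross an OC-3c link and a local storage bus. The 155 Mbps line rate comes from the paragraph above; the 100 MB/s bus speed and the 10 percent protocol overhead are assumptions chosen only for illustration, and the function name is hypothetical.

```python
# Rough transfer-time comparison. Apart from the OC-3c line rate,
# every figure here is an illustrative assumption, not a measurement.

def transfer_seconds(size_bytes: float, link_bits_per_sec: float,
                     overhead: float = 0.0) -> float:
    """Seconds to move size_bytes over a link that loses a fraction
    `overhead` of its raw bandwidth to protocol headers."""
    if not 0.0 <= overhead < 1.0:
        raise ValueError("overhead must be in [0, 1)")
    usable_bits_per_sec = link_bits_per_sec * (1.0 - overhead)
    return size_bytes * 8 / usable_bits_per_sec

GIGABYTE = 10**9
OC3C_BPS = 155e6        # OC-3c line rate, in bits per second
BUS_BPS = 100e6 * 8     # assumed 100 MB/s storage bus, in bits per second

wan = transfer_seconds(GIGABYTE, OC3C_BPS, overhead=0.10)  # assumed 10% overhead
bus = transfer_seconds(GIGABYTE, BUS_BPS)

print(f"1 GB over OC-3c:     {wan:.1f} s")
print(f"1 GB over local bus: {bus:.1f} s")
```

Under these assumptions the wide area link is several times slower than the local bus, before any congestion or retransmission is taken into account.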
Until the advent of SSPs, the fastest means had been to use storage area networks (SANs) directly attached to the server through a controller or host bus adapter. These typically run over Fibre Channel and provide direct channelized bandwidth between the servers and storage subsystems. This is exactly what an SSP offers between the two units, with the difference being that the network infrastructure and storage subsystems are managed by the SSP instead of by the customer.
An inherent problem exists in accessing data across widely dispersed sites.
Concurrent access to the same data requires some sort of locking mechanism, whether the servers sharing it sit on the same LAN or in two different cities. Record locking becomes more important the farther apart the sites are, because even with direct high-speed connections to an SSP at both ends, there will always be some latency. Until we find a way around the laws of physics, this problem will remain.
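The physical limit is easy to estimate. The sketch below computes the minimum round-trip time that signal propagation alone imposes on a fiber path, and from it an upper bound on how many lock requests per second a server could complete if each one had to wait for a remote reply. The two-thirds velocity factor is an approximation that depends on the fiber, the 4,000 km distance is an assumed example, and the function names are hypothetical.

```python
# Lower bound on round-trip time from signal propagation in fiber.
# Real links add switching, queuing, and protocol delays on top of this.

SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum
FIBER_VELOCITY_FACTOR = 2 / 3       # approximate; varies with the fiber

def min_round_trip_ms(distance_km: float) -> float:
    """Best-case round-trip time in milliseconds over a fiber path."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_VELOCITY_FACTOR)
    return 2 * one_way_s * 1000

def max_serial_lock_ops_per_sec(distance_km: float) -> float:
    """Upper bound on lock requests per second when each request
    must wait for one full round trip before the next is issued."""
    return 1000 / min_round_trip_ms(distance_km)

distance_km = 4000  # assumed path length between two distant sites
print(f"Minimum round trip: {min_round_trip_ms(distance_km):.1f} ms")
print(f"Serial lock requests per second, at most: "
      f"{max_serial_lock_ops_per_sec(distance_km):.0f}")
```

However much bandwidth the link offers, this floor on latency does not move, which is why locking granularity matters more as sites grow farther apart.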
The actual effect on user data varies with implementation. RAID storage systems can implement data mirroring across multiple drives or even storage subsystems.
If bandwidth is limited between the storage subsystems, and as the latency increases, replication becomes a better solution than direct mirroring. With replication, some of the data at the two sites may be out of sync between replication cycles, but this is temporarily acceptable.
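The difference between the two approaches can be shown with a toy model. This is a minimal sketch, not how any vendor's product works: the class and method names are hypothetical, and plain dictionaries stand in for the storage subsystems at the two sites.

```python
from collections import deque

class MirroredVolume:
    """Synchronous mirroring: a write finishes only after both copies
    are updated, so every write pays the inter-site latency."""
    def __init__(self):
        self.primary = {}
        self.secondary = {}

    def write(self, block, data):
        self.primary[block] = data
        self.secondary[block] = data  # a real system waits on the remote link here

class ReplicatedVolume:
    """Periodic replication: writes finish locally and are shipped
    to the other site during the next replication cycle."""
    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.pending = deque()

    def write(self, block, data):
        self.primary[block] = data
        self.pending.append((block, data))

    def replicate(self):
        """Run one replication cycle, applying queued writes in order."""
        while self.pending:
            block, data = self.pending.popleft()
            self.secondary[block] = data

    def in_sync(self):
        return self.primary == self.secondary

vol = ReplicatedVolume()
vol.write(0, b"order-1")
print(vol.in_sync())   # False: the remote copy lags until the next cycle
vol.replicate()
print(vol.in_sync())   # True
```

The window between the two print statements is the period during which the sites disagree; the replication interval determines how long that window stays open.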
Another problem with using NFS and other protocols to share data between sites pertains to reliability. Although any network connection can go down at any time, customers using an SSP service don't use the public -- and often unpredictable -- Internet to access the storage servers. Furthermore, SSPs can build quality-of-service guarantees into the network infrastructure to ensure reliability and bandwidth between sites.
Deploying a large-scale storage server, SAN, or network-attached storage system requires more than simply employing new hardware that attaches to the application servers. With some systems this may mean planning an entirely new fiber network to provide Fibre Channel connections to the various host bus adapters, storage switches, and hubs.
Designing a storage solution for an organization with a nationwide enterprise network takes even more careful planning. To properly support the time, security, and reliability needs of today's fast-paced companies, SSPs should be able to deploy storage services on the Internet timescale of days and weeks, rather than months and years.
An SSP should have the experience needed to plan and deploy a complicated structure for storage needs across the various servers. An SSP should be able to plan a SAN, know which products are compatible, and understand how data transfers are affected by network delays and other technical problems.
An SSP also can make it more affordable to connect distributed systems across several office locations, either nationwide or internationally. Such configurations are the specialized province of ISPs or network service providers, which offer bandwidth between sites. An SSP, on the other hand, may specialize in the type of connections needed for high-speed delivery of data storage services.
StorageNetworks was one of the first companies to implement the SSP concept.
The startup, formed in November 1998 by storage industry veterans Bill Miller and Peter Bell, now has more than 400 employees. The Waltham, Massachusetts, company has offices in Atlanta, Boston, Chicago, Colorado Springs, Colorado, Dallas, Denver, Houston, New York, Philadelphia, Washington, D.C., and San Jose, California.
StorageNetworks' PACS service offering consists of several levels: DataPACS, which offers direct data storage; BackPACS, which offers backup services; and SafePACS, which offers real-time data replication. The company is planning a global distributed storage network (GDSN) to interconnect its various storage points of presence (S-POPs) and offer direct fiber links between major cities and sites. Thus, customers with offices in, say, New York and San Francisco can access the same data stored in a replicated system between the S-POPs. Each S-POP offers 24-7 managed service in a secure environment supported by system administrators.
Who needs an SSP?
For startups with large storage demands, an SSP is an ideal way to get projects off the ground quickly and provide scalability for the midterm. For the long term, startups may move to their own storage systems, although an SSP should be able to offer a competitive rate for its services.
Web companies that need to store lots of server logs, user session information, or content can make use of an SSP's ability to expand available storage capacity easily. Fast ramp-up of storage lets an SSP readily handle temporary bursts of activity at a site. In most cases, network congestion to Web sites is a bigger problem than throughput to data storage, but sites can still crash when disks fill up with activity logs and customer orders.
Whether backup over the Internet makes sense is still being debated, particularly for corporations, where in-house backup systems are already a common feature. Home users may find Internet backup systems more useful. If your data is being stored by an SSP, it makes perfect sense for the SSP to handle the backup activity as well. Because the backup occurs at the SSP itself, the backups don't tie up your own network bandwidth. With replication and multiport access to the storage subsystems, backups can be done even while the data is still online.
Replicating data across multiple sites is important for both data access and data security. The redundant storage in multiple locations keeps the data safe from failures at one office location or another. In addition, disaster recovery is much faster when the data -- identical to the original -- is preserved in a quickly accessible format.
Tribulations of an SSP
Today it's not unusual for a large server alone to need terabytes of storage space, and in a large corporation this can scale up to many, many petabytes.
The largest SAN products from major vendors are barely hitting a petabyte of capacity now. An SSP would thus need many petabyte-capable SANs at each site to manage the capacity of many large customers. To handle replicated information across sites, the capacities must match the needs at the other site, in addition to the customers handled at that location alone.
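The sizing rule in the last sentence can be written out as simple arithmetic. In this minimal sketch, each site must hold its own customers' data plus every replica it hosts on behalf of another site. The site names and terabyte figures are invented for illustration, and the function name is hypothetical.

```python
from typing import Dict, List, Tuple

def site_capacity_tb(local_tb: Dict[str, float],
                     replicas: List[Tuple[str, str, float]]) -> Dict[str, float]:
    """Capacity each site needs: its local customer data plus any
    replicas it hosts. `replicas` holds (source_site, target_site,
    terabytes) entries; the source site's data is already counted
    in local_tb, so only the target site grows."""
    required = dict(local_tb)
    for _source, target, terabytes in replicas:
        required[target] = required.get(target, 0.0) + terabytes
    return required

# Hypothetical two-site example
local = {"site_a": 300.0, "site_b": 200.0}
replicas = [
    ("site_a", "site_b", 150.0),  # half of site_a's data is replicated to site_b
    ("site_b", "site_a", 200.0),  # all of site_b's data is replicated to site_a
]
print(site_capacity_tb(local, replicas))
```

In this example site_a needs 500 TB and site_b needs 350 TB, well above the 300 TB and 200 TB their local customers account for.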
Several problems are essentially financial in nature. First, such storage capacity costs many millions of dollars. Second, such storage requires a great deal of physical space that must be specially cooled and uses extensive power.
Third, that kind of storage also needs tape libraries of substantial size, which further compounds the space and power issues. Using co-location space at an ISP is probably not as affordable for an SSP as purchasing its own building or office space to house this equipment. Finally, an SSP must have sufficient staff to monitor the activity of these storage systems, as well as to handle disk and subsystem failures.
Customers should not expect to pay the full costs for purchasing the hardware, setting up the S-POP sites, and so on. SSPs must offer acceptable rates and provide customers a rationale for using an SSP rather than buying their own storage systems. To realize profits, an SSP would have to amortize its costs over a fairly long time and probably endure several years in the red.
With an SSP, customers also should be able to define and manage service-level agreements that track exactly the services provided. Due to the changing nature of data, the actual use of storage space can vary a good deal. Because of this, service-level agreements should take into account such variations, and an SSP should offer a monthly billing system that doesn't produce a monumental tome of service charges.
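One way to bill for fluctuating usage without a long itemized statement is to charge on the average of daily usage samples. The sketch below illustrates that idea only; it does not describe any SSP's actual billing scheme, and the function name, usage figures, and rate are all hypothetical.

```python
from typing import Sequence

def monthly_charge(daily_usage_gb: Sequence[float],
                   rate_per_gb_month: float) -> float:
    """Charge for one month based on the mean of daily usage samples,
    so a short burst costs less than holding that space all month."""
    if not daily_usage_gb:
        raise ValueError("need at least one usage sample")
    average_gb = sum(daily_usage_gb) / len(daily_usage_gb)
    return round(average_gb * rate_per_gb_month, 2)

# Hypothetical month: 20 days at 100 GB, then a 10-day burst at 400 GB
usage = [100.0] * 20 + [400.0] * 10
print(monthly_charge(usage, rate_per_gb_month=0.50))
```

The customer sees a single line item based on a 200 GB average, rather than a charge for the 400 GB peak or a day-by-day list.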
An SSP should be able to delegate control of the disk volumes and data contents to each customer's system administrators. Managing storage on a server is a complex task, even without the introduction of another level of administrative control over the hardware systems. The customer's system administrators must also be permitted to contact the SSP's staff directly with requests to swap hardware or tapes, or to do other physical tasks that are now out of their hands.
Making the connection
SSPs are stepping into virgin territory, but they stand to produce significant income if they can persuade customers of the advantages of using their services. This shouldn't be too difficult because today's companies work on Internet time. As StorageNetworks describes it, SSPs offer companies the means to build their storage needs on a pay-as-you-go and pay-as-you-grow basis.
Companies want to move faster without delving into the complexities of large storage systems. This is exactly the role an SSP can fill.
About the author
Rawn Shah is an independent consultant in Tucson, Arizona. He has written for years on Unix-to-PC connectivity and has watched many of today's systems come into being. He has worked as a systems and network administrator in heterogeneous computing environments since 1990.