Decentralized production and market globalization have created a demand for data consolidation over wide-area networks.
Companies with remote branch offices, which have critical data to store and share among users on a day-to-day basis, face a dilemma: Should they select and implement a file-caching system, or should they optimize WAN traffic to carry the data more effectively?
The remote-office dilemma
Organizations with remote offices, such as financial, media, insurance or pharmaceutical companies, have users compiling data at each remote location and saving their work locally, creating islands of essential information. This data needs to be updated regularly, backed up and often accessed by users at other branches.
According to Strategic Research Corp., up to 60% of a corporation's data is kept outside its managed servers, at the remote edge, and the large majority (about 75%) of the "edge data" is unprotected and thus unrecoverable.
When file servers are installed at every remote office, backup and restore procedures are done on-site. However, accessing files stored at another location becomes a logistical nightmare. Let's say you're in New York and you're working on a project with two other teams, one in Denver, one in Houston. How are you going to make sure that you always see the current version of the working file for each team? E-mailing the files back and forth is hardly an option. In many cases, it's not only impractical; it's also plain impossible.
What you want is the freedom to save your work on a central server and have the other teams do the same so everyone will have access to all the latest files. That would prevent discrepancies between old and new versions, and among the teams.
When the enterprise is widely distributed, consolidating all crucial data into a centralized location sounds like a good plan, but accessing that data center over the WAN remains problematic. It becomes crucial to use the network to allow WAN-wide file sharing, data storage and information retrieval. But with limited personnel resources and increasing business-continuity requirements, IT managers quickly find themselves overwhelmed by the remote-office dilemma. They clearly have to consider the access needs of remote users and harvest critical data, but they can't afford to put too heavy a workload on the WAN, or it would result in increased congestion.
Ideally, remote users should be able to seamlessly access the information stored at the data center, but traditional file systems such as Common Internet File System and Network File System were designed for the LAN and are hardly adapted to the long-distance bandwidth and latency challenges of the global enterprise network. Research firm Gartner Inc. finds that "80% of enterprise data is file-based (such as Microsoft Office documents, e-mail attachments, or graphical applications), using file system protocols that resist management and don't operate well over the WAN."
The remote-office dilemma is reflected in the philosophy underlying the different technologies in the field of file sharing or data consolidation over the WAN. The basic assumption is that ultimate data coherency is achieved through synchronous transmission over the WAN, just like over the LAN, wherein each user can access, modify and save in real time common work files such as Word documents.
The obvious problem with transmitting all the data all the time is heavy workload through limited-bandwidth pipes, leading to longer response times. File-caching technology addresses that issue by analyzing, streaming and storing the data at the remote site in order to transmit only the "deltas," or the parts of the files that have been modified, to the data center or central location. The remote-office cache keeps the most active files locally in "persistent storage" and emulates the central file server to reduce the number of transaction "handshakes" and to effectively manage the access to consolidated data.
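The "deltas" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's actual algorithm: the fixed block size, the SHA-256 fingerprints and the function names (split_blocks, compute_delta, apply_delta) are all hypothetical. The remote cache compares the saved file with the last version it holds, block by block, and ships only the blocks that changed.

```python
import hashlib

# Hypothetical block size, kept tiny so the example is easy to follow.
BLOCK_SIZE = 4


def split_blocks(data, size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def compute_delta(old, new, size=BLOCK_SIZE):
    """Return only the blocks of `new` that differ from `old`, keyed by block index."""
    old_hashes = [hashlib.sha256(b).digest() for b in split_blocks(old, size)]
    changed = {}
    for index, block in enumerate(split_blocks(new, size)):
        is_new_block = index >= len(old_hashes)
        if is_new_block or hashlib.sha256(block).digest() != old_hashes[index]:
            changed[index] = block
    # The total length lets the receiver truncate a file that got shorter.
    return {"length": len(new), "blocks": changed}


def apply_delta(old, delta, size=BLOCK_SIZE):
    """Rebuild the new file at the data center from the old copy plus the delta."""
    blocks = split_blocks(old, size)
    for index, block in delta["blocks"].items():
        if index < len(blocks):
            blocks[index] = block
        else:
            blocks.append(block)
    return b"".join(blocks)[:delta["length"]]


if __name__ == "__main__":
    old_file = b"quarterly report v1 draft"
    new_file = b"quarterly report v2 draft"
    delta = compute_delta(old_file, new_file)
    sent = sum(len(b) for b in delta["blocks"].values())
    print("bytes sent:", sent, "of", len(new_file))
    print("rebuilt correctly:", apply_delta(old_file, delta) == new_file)
```

A fixed-block comparison like this one misses insertions that shift every later block, which is one reason real products are likely to rely on more elaborate differencing.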
Currently, there are three main vendors in the field of data consolidation over the WAN based on file-caching technology: DiskSites Inc., Actona Technologies Inc. and Tacit Networks Inc. All three have implemented data-compression mechanisms and other optimization techniques to ensure better performance, together with stringent authentication processes for data integrity. They all aim to provide a LAN-like experience to the user, and their bench tests or pilots seem to back up that ambition with hard numbers.
Since the technology is relatively young, they are mostly at the post-beta stage, although all three claim a small number of fully operative customer implementations. One visible difference among them is in each company's approach to transmission mode, with DiskSites the only adopter of full synchronous mode over the WAN so far.
While their differential and compression algorithms place them at similar performance levels, DiskSites is more affected by WAN faults than the other two. On the other hand, synchronous mode includes real-time user quota management and dynamic authentication, authorization and auditing processes that are identical to the way the LAN works and are central to global organizations.
With that mode, data can't be lost in the pipe, and usage conflict is completely avoided, ensuring ultimate information integrity. In fact, Actona and Tacit have developed advanced features, such as sync-on-close save or file lock and release, to overcome the system reliability and data coherency limitations that come with operating in asynchronous (store and forward) mode.
Tally Eitan, managing partner of Eitan, Pearl, Latzer & Cohen-Zedek LLP, a law firm operating in New York and Tel Aviv, recently implemented a file-caching system. "As a global law firm on both sides of the Atlantic, we tried various alternatives for data consolidation," he says. "We selected a synchronous solution, as it provides us with instantaneous, reliable global access to our file system."
A different approach is the acceleration technology for client/server transactions, which eliminates the caching mechanism and implements complex algorithms to ensure optimal performance over the WAN. Two companies operating in the field of transaction-acceleration technology for several years are Expand Networks Inc. and Peribit Networks Inc. Both now focus on increasing the WAN capacity through bandwidth compression or, in Peribit's case, through "molecular sequence reduction" (basically, the identification and deletion of repetitive data patterns).
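The general idea of identifying and removing repetitive data patterns can be illustrated with a short Python sketch. It is not Peribit's algorithm, whose details are not described here. It assumes fixed 16-byte chunks, a 4-byte SHA-256 prefix as the reference and a dictionary kept in step on both ends of the link; the names encode and decode are hypothetical. A chunk that has already crossed the WAN is replaced by its short reference.

```python
import hashlib

CHUNK = 16   # hypothetical chunk size in bytes
KEY_LEN = 4  # hypothetical length of the reference sent instead of a repeat


def chunk_key(chunk):
    """Short fingerprint used as the on-the-wire reference for a chunk."""
    return hashlib.sha256(chunk).digest()[:KEY_LEN]


def encode(stream, seen):
    """Sender side: emit ("raw", chunk) the first time, ("ref", key) for repeats."""
    tokens = []
    for i in range(0, len(stream), CHUNK):
        chunk = stream[i:i + CHUNK]
        key = chunk_key(chunk)
        if seen.get(key) == chunk:
            tokens.append(("ref", key))
        else:
            seen[key] = chunk  # also overwrites on a fingerprint collision
            tokens.append(("raw", chunk))
    return tokens


def decode(tokens, seen):
    """Receiver side: rebuild the stream, learning chunks exactly as the sender did."""
    out = []
    for kind, payload in tokens:
        if kind == "ref":
            out.append(seen[payload])
        else:
            seen[chunk_key(payload)] = payload
            out.append(payload)
    return b"".join(out)


if __name__ == "__main__":
    data = b"HEADER-0001-DATA" * 4
    sender_dict, receiver_dict = {}, {}
    tokens = encode(data, sender_dict)
    wire_bytes = sum(len(payload) for _, payload in tokens)
    print("bytes on the wire:", wire_bytes, "instead of", len(data))
    print("round trip ok:", decode(tokens, receiver_dict) == data)
```

Because both dictionaries are updated by the same rule, the receiver can always resolve a reference the sender emits, provided no token is lost along the way.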
Riverbed Technology Inc. provides WAN transaction acceleration by tackling low bandwidth and high latency with data suppression -- vs. compression -- techniques together with innovative "transaction prediction" algorithms. The company is now going through beta testing and is officially launching its product offering at the end of the first quarter.
The advantage of the technology is its application transparency: Unlike file-caching, it can carry any type of TCP traffic over the WAN, even Web or MS Exchange applications. It is, in a sense, the ultimate synchronous technology, with differentials (deltas) carrying only useful bits of information across the network for better performance.
However, simply carrying the data (albeit more efficiently) at the WAN layer, without going up into the application layer (e.g., file protocols), forfeits the benefits of getting into the file system and suppressing unnecessary file transactions. Eliminating these numerous handshakes through file-caching effectively tackles WAN latency, often providing a better response time and user experience. In addition, the transaction-acceleration technology is completely WAN-dependent, and there can be concerns about having no local cache at the remote site when WAN communication is interrupted.
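A back-of-the-envelope model shows why cutting handshakes can matter more than squeezing the payload. The Python sketch below assumes a deliberately simple formula, response time = round trips x round-trip time + payload / bandwidth, and every figure in it (200 handshakes, 100 ms of latency, a 1MB file, a 1.5Mbit/sec. link, 2-to-1 compression, 10 remaining handshakes) is a made-up illustration, not a measurement from any product.

```python
def response_time(round_trips, rtt_seconds, payload_bytes, bandwidth_bps):
    """Simplified model: latency cost of the handshakes plus time to push the bytes."""
    latency_cost = round_trips * rtt_seconds
    transfer_cost = payload_bytes * 8 / bandwidth_bps
    return latency_cost + transfer_cost


# All values below are hypothetical and chosen only for illustration.
RTT = 0.100             # 100 ms round trip across the WAN
LINK = 1_500_000        # 1.5 Mbit/sec. link
FILE_SIZE = 1_000_000   # 1 MB file

# Plain file protocol: many handshakes, full payload.
baseline = response_time(200, RTT, FILE_SIZE, LINK)

# Carrying the data more efficiently: payload halved, handshakes untouched.
compressed_only = response_time(200, RTT, FILE_SIZE // 2, LINK)

# Caching at the file-protocol level: most handshakes answered locally.
fewer_handshakes = response_time(10, RTT, FILE_SIZE, LINK)

if __name__ == "__main__":
    for label, seconds in [
        ("baseline", baseline),
        ("compression only", compressed_only),
        ("fewer handshakes", fewer_handshakes),
    ]:
        print(f"{label:>18}: {seconds:.1f} s")
```

Under these assumed numbers, the latency term dominates, so halving the payload helps far less than removing round trips. On a low-latency link or with a very large file, the balance would tip the other way.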
Considering the early stage at which some of these companies are now, industry experts are expectant, but cautious. "You don't really know how well an appliance works until you get a chance to actually pound on it," says John Webster, senior analyst at and founder of Data Mobility Group LLC.
Is there some visible convergence between the two philosophies? Maybe, if we consider the file-caching products that, in synchronous mode, integrate some benefits of WAN bandwidth optimization, although they remain application-dependent. For their part, "carry" vendors have systematically differentiated themselves from file-caching solutions through their protocol-agnostic approach.
So if you face the remote-office dilemma, which way should you go? "If you only have a hammer, that's the tool you use to fix every problem," concludes Stan Schatt, research director and vice president at Forrester Research Inc.
"Many of our clients have problems linked to their distributed architecture, which may or may not be caused by bandwidth," he continues. "Caching vendors, compression vendors, application-acceleration vendors and prioritization vendors all are pressing them to go with their products. Put the problem in a wider perspective, identify its main cause and then select the appropriate technological solution."
Val Golan is the managing director of GoLAN Consulting, a New Jersey-based company providing an array of business services, including technology mapping, market analysis and technical marketing content to global corporations, with specific expertise in LANs, WANs and storage-area networks.