New methodologies for 'bottomless' e-mail storage

File-based storage systems add up to a powerful enterprise tool

With e-mail the dominant enterprise communication vehicle -- used for everything from simple notes to purchase orders, contracts, invoices and other critical business documents -- managing swelling message stores has become a primary concern.

Add to that the range of new compliance and records-retention issues, and it is no surprise that a recent survey by Osterman Research shows e-mail use growing at 20 percent per year and message stores growing at more than 35 percent per year.
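To put those growth rates in perspective, the compounding can be worked out in a few lines of Python. This is only an illustrative sketch: the function names and the 100 GB starting size are assumptions chosen for the example, not figures from the survey.

```python
import math


def years_to_double(annual_growth: float) -> float:
    """Years for a quantity to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


def projected_size(initial: float, annual_growth: float, years: int) -> float:
    """Size after `years` of compound growth from `initial`."""
    return initial * (1 + annual_growth) ** years


if __name__ == "__main__":
    # Hypothetical 100 GB message store growing 35 percent per year.
    print(f"doubling time: {years_to_double(0.35):.1f} years")
    print(f"size after 5 years: {projected_size(100, 0.35, 5):.0f} GB")
```

At 35 percent annual growth a message store doubles in roughly 2.3 years, so a store that starts at 100 GB would reach about 448 GB after five years.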

But today's most commonly deployed enterprise e-mail servers store data using database architectures that perform large numbers of separate I/O operations to complete a single transaction. To keep up with demand, organizations typically add dedicated, costly storage and impose strict limits on the size of individual mailboxes.

Newer, open e-mail servers, however, enable a messaging-storage approach based on less expensive, modern file systems that overcome the limitations of database architectures and improve overall performance. File-based storage lets these e-mail systems scale cost-effectively and decreases system-management complexity and administration overhead. This flexible approach simplifies the storage model and lets mailboxes (potentially bottomless) grow to sizes that better match the way employees use e-mail.

Performance issues

Modern Linux file systems, such as XFS and Ext3, are fast, flexible, reliable and efficient. These systems support, for example, journaling, which lets the file system replay logged operations after a power cut, and semi-offline storage, which provides low-cost storage for rarely accessed files. Linux file systems also support clustering, letting enterprises build file-system clusters that deliver the level of reliability they require.

A file-based e-mail store offers significant performance improvements and potential cost savings. Performance improves because the file system does not require multiple read/write commands between the e-mail server and the storage subsystem. Cost savings come from using less expensive commodity-storage systems that let IT provide much larger mailboxes economically. Open e-mail servers may also support virtualized file systems, and a single server might even run multiple file systems, if that's desired.

If the goal of the IT department is to reduce costs, increase performance and give each user a significantly larger mail store, a file system solves the problem at the source, making e-mail server storage easy to manage and maintain. A file-system approach also addresses:

-- Storage for large data objects. Open e-mail server systems can employ single-instance storage at the file-system level for large data objects attached to messages -- or even for large e-mail bodies. Each large object can be put into a separate file that can be linked to from multiple places. Single-instance storage not only saves on storage space but also provides much higher performance, because the file is written only once.

-- Backup operations. Backing up a file-based store is simple, live (no freeze or snapshot step is required), incremental, and detailed down to the message (file) level, which makes backing up the mail store as simple as backing up a file server. Industry-standard backup tools can run incremental backups that copy just the messages that have changed since the previous day. And because the backup window is effectively eliminated, administrators can make mailboxes significantly larger.

-- Restoration. Backup records let enterprises easily restore records that are accidentally lost or deleted, or that are required for compliance or other regulatory purposes. The file system's "one file per message" architecture simplifies restoration because it has no database-synchronization issues. This allows a detailed restoration; IT can restore a single message by restoring a single file, a folder by restoring a folder, a user by restoring that user's folder and subfolders, or the whole store by restoring the folder tree that contains all the users -- without worrying about synchronizing the live database with the backup.

-- Database corruption. File-based storage eliminates the problem of database corruption, because it has no intermediate database that can fragment or become corrupted. Each user has an individual folder within the store; each folder contains subfolders corresponding to the calendar, in-box and other e-mail functions. Each message in a subfolder is represented by a file. This "one file per message" approach means that any corruption caused by a disk malfunction or a fault within the store is limited to a single file and cannot spread over time until it crashes the entire system.

-- Disaster recovery. Disaster recovery of an e-mail message store is also faster and simpler with a file-system architecture. It provides an easy way to build low-cost server clusters (an active and passive pair of servers in front of the file system), and it eliminates the database-synchronization issues that complicate recovery.
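
The store layout described in the list above can be sketched in Python. This is a minimal illustration under stated assumptions, not how any particular e-mail server implements its store: the directory names (users/, objects/), the .eml and .att suffixes, the 64 KB threshold and all function names are hypothetical, and file-system hard links stand in for the "linked to from multiple places" mechanism.

```python
import hashlib
import os
import shutil
from pathlib import Path

LARGE = 64 * 1024  # hypothetical size threshold (bytes) for single-instance storage


def store_object(store: Path, data: bytes) -> Path:
    """Write a large object once, in a file named by its SHA-256 digest."""
    obj = store / "objects" / hashlib.sha256(data).hexdigest()
    if not obj.exists():  # later deliveries of the same bytes skip the write
        obj.parent.mkdir(parents=True, exist_ok=True)
        obj.write_bytes(data)
    return obj


def deliver(store: Path, user: str, folder: str, msg_id: str,
            body: bytes, attachment: bytes = b"") -> Path:
    """Save one message as one file; hard-link a large attachment instead of copying it."""
    box = store / "users" / user / folder
    box.mkdir(parents=True, exist_ok=True)
    msg = box / f"{msg_id}.eml"
    msg.write_bytes(body)
    if len(attachment) >= LARGE:
        os.link(store_object(store, attachment), box / f"{msg_id}.att")
    elif attachment:
        (box / f"{msg_id}.att").write_bytes(attachment)
    return msg


def changed_since(store: Path, cutoff: float) -> list[Path]:
    """Incremental backup set: message files modified after `cutoff` (epoch seconds)."""
    return sorted(p for p in (store / "users").rglob("*.eml")
                  if p.stat().st_mtime > cutoff)


def restore(backup: Path, store: Path, relative: str) -> None:
    """Restore one message file, or a whole folder tree, from a backup copy."""
    src, dst = backup / relative, store / relative
    if src.is_dir():
        shutil.copytree(src, dst, dirs_exist_ok=True)
    else:
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
```

In this sketch, delivering the same large attachment to two users leaves a single copy on disk with a link from each mailbox. Restoring users/alice/inbox/1.eml copies back one file, while restoring users/alice copies back that user's whole folder tree, with no database to resynchronize in either case.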

All the advantages of file-based storage systems add up to a powerful new way for enterprises to bring the capabilities of their e-mail system in line with the needs of their e-mail users. Users get all the storage they need, and enterprises gain a far easier, more cost-effective method to handle e-mail storage.

Chang, vice president of engineering at PostPath, can be reached at vchang@postpath.com.
