Fail-over friends keep Exchange chugging

Solutions that keep Exchange Server 2003 up and running

An e-mail server can stop delivering e-mail for several reasons: a loss of Internet connectivity, a hardware failure, an operating system crash, an e-mail server software crash, or corruption of the database that stores the messages. The traditional backup-and-restore process can take hours to resurrect a server, and mail that arrives while the server is down may be delayed or, if sending servers give up retrying, lost. Not surprisingly, many organizations demand CDP (continuous data protection) for e-mail.

The options start with Microsoft's own Windows Server 2003 Clustering Services and extend to a range of third-party fail-over and high-availability solutions. Windows clustering allows Exchange Server 2003 to be set up in either an active/passive cluster or a cluster of multiple servers with one standby server. This is highly effective in ensuring uptime, but it is complex to set up, requires extra hardware and licenses, and does not protect against data loss or database corruption.

The solutions reviewed here can cope with almost any Exchange-related mishap except Internet failures, and they do so more simply, at lower cost, and with more flexibility or protection than the native Exchange cluster. Two solutions, Neverfail for Exchange and SteelEye LifeKeeper, bring true fail-over to an entire Exchange server. Two others, Cemaphore Systems MailShadow and Quest Availability Manager, protect individual mailboxes on one or more Exchange servers. And one, Lucid8 DigiVault, provides backup of data stores that can be restored to a secondary Exchange server. For maximum protection, administrators might choose to implement a fail-over system plus the CDP that DigiVault provides.

Each product takes a different approach to protecting Exchange and offers different advantages. The differentiators include whether an Exchange server license is required for the backup server, whether a single backup server can protect more than one server, whether an agent is required on each Exchange server, and whether replication over WAN links is supported.

The test setup for each product consisted of a domain controller (Active Directory), two Exchange servers (the primary and secondary), and any additional servers as required by the individual product. I set up replication of the primary Exchange server to the secondary and then simulated failures by unplugging the network cable from the primary, stopping the Exchange Information Store service, and dismounting the drive the information store was running on, while monitoring incoming messages and simulating traffic using LoadSim. I observed the Outlook client experience when the primary server failed, as well as the time required to fail over to the secondary server.
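The fail-over timing described above can be approximated with a simple probe log. The following is a minimal sketch, not the harness used in this review: it assumes the Exchange service is reachable on a TCP port (SMTP port 25 is used here as an illustration), and the names probe and failover_window are hypothetical.

```python
import socket
from typing import Iterable, Optional, Tuple


def probe(host: str, port: int = 25, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 25 (SMTP) is an assumed stand-in for "the mail service is up".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def failover_window(samples: Iterable[Tuple[float, bool]]) -> Optional[float]:
    """Seconds from the first failed probe to the next successful one.

    samples is a time-ordered sequence of (timestamp, probe_succeeded).
    Returns None if the service never failed or never came back.
    """
    down_since = None
    for timestamp, ok in samples:
        if not ok and down_since is None:
            down_since = timestamp      # first probe that saw the outage
        elif ok and down_since is not None:
            return timestamp - down_since  # service answered again
    return None
```

In use, one would call probe against the server's hostname every few seconds, record (time.time(), result) pairs, pull the plug on the primary, and pass the log to failover_window. The result is only as precise as the probe interval.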

Neverfail for Exchange

Neverfail is a true, automatic, active/passive fail-over solution. It uses primary and secondary Exchange servers linked via crossover cable to maintain a heartbeat connection and perform data synchronization. If the primary server experiences a hardware or software failure, the secondary server assumes its IP address and hostname and resumes operation. I tested Neverfail for Exchange 5.0. Neverfail Group offers a variety of application modules other than Exchange, including IBM Lotus Domino, Microsoft File Server, Oracle Database, SharePoint, and SQL Server.
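To illustrate the general idea of heartbeat-driven fail-over, here is a minimal sketch of a missed-heartbeat detector as it might run on a passive server. It is not Neverfail's implementation, which the vendor does not expose; the class name, the one-second interval, and the three-beat threshold are all assumptions chosen for illustration.

```python
from typing import Optional


class HeartbeatMonitor:
    """Tracks heartbeats from a primary server (hypothetical sketch)."""

    def __init__(self, interval: float = 1.0, max_missed: int = 3) -> None:
        self.interval = interval        # expected seconds between beats (assumed)
        self.max_missed = max_missed    # beats missed before failing over (assumed)
        self.last_beat: Optional[float] = None

    def record_beat(self, now: float) -> None:
        """Call whenever a heartbeat arrives from the primary."""
        self.last_beat = now

    def missed_beats(self, now: float) -> int:
        """Whole heartbeat intervals elapsed since the last beat."""
        if self.last_beat is None:
            return 0  # no beat seen yet, so nothing has been missed
        return int((now - self.last_beat) // self.interval)

    def should_fail_over(self, now: float) -> bool:
        """True once the primary has been silent for max_missed intervals."""
        return self.missed_beats(now) >= self.max_missed
```

Requiring several consecutive missed beats, rather than one, is a common way to avoid failing over on a momentary network hiccup. A real product would also have to take over the primary's IP address and hostname, which this sketch does not attempt.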

Neverfail provides functionality comparable to that of Windows Clustering, and because it doesn't require Windows Server 2003 Enterprise or DataCenter and Exchange Enterprise Edition, the overall cost is comparable. Neverfail goes beyond Windows Clustering in providing easier setup, great management, and an intelligent analysis and monitoring tool that can find and resolve problems on the Exchange server before they cause failures. Further, unlike Windows Clustering, Neverfail doesn't require the primary and secondary systems to have identical hardware.

With the Neverfail system, LAN users don't need to restart Outlook. The interval between failure of the primary server and starting the secondary server is short, about two minutes in my testing. Users connecting via MAPI or the Outlook Web Access client may need to restart the client to connect to the backup server.
