Compaq's Cluster-in-a-Box Ensures Uptime

BOSTON (05/15/2000) - If your branch offices are calling out for ensured server uptime, you might want to take a look at Compaq Computer Corp.'s cluster-in-a-box server.

The CL1850 is positioned to be a midlevel, high-availability server for use in remote sites and branch offices where containing downtime is critical but expensive horsepower and disk capacity aren't warranted. This product is a cluster-in-a-box, the first in this class of servers that we have evaluated. It ships with two redundant servers, two RAID controllers, a built-in keyboard, a KVM switch and a shared drive array. All components reside in a single 10-rack enclosure.

Compaq supported only Windows NT 4.0 on the CL1850 at the time we did our testing, so we limited our tests to that operating system, although our standard server tests typically include NetWare 5.1 as well. Also installed on the server was Microsoft Corp.'s Clustering Services (MSCS). This installation was not easy and required intervention by Compaq support staff, but once everything was installed, the server ran without any noticeable problems.

The CL1850 has two redundant servers - which Compaq calls nodes - in the master enclosure. A node can be easily removed by pushing two metal retaining tabs and sliding the unit out from the front after disconnecting the Ethernet connections, KVM connections, drive connections and power source from the rear of the unit. A nice improvement would be having these connections made internally so that removing the node would make the disconnections as well. The unit comes in either a stand-alone tower configuration or a 19-inch rack configuration. We tested the stand-alone version.

Our unit came with 1G byte of RAM, two Pentium III 550MHz processors with 512K byte of L2 cache, three Ethernet network interface cards (NIC), two RAID controllers and 16 9.1G-byte drives. Eight of the drives were loaded in an external Compaq 4214T array enclosure in a tower configuration.

Two of the Ethernet NICs were used for network connections outside the cluster and the third was used for connection between the two nodes. This connection provided each node with information on the state of the other node. When the primary node is no longer operational, the back-up node detects the change and becomes the primary node. The CL1850 is a failover cluster, not a load-balancing cluster, because MSCS on NT does not support application load balancing.
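The failover behavior described above can be illustrated with a minimal Python sketch. It is not MSCS code and does not reflect how MSCS is implemented; the node names, the five-second timeout and the check_failover function are all hypothetical, chosen only to show a back-up node taking over once the primary stops reporting over the interconnect.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str              # "primary", "backup" or "failed"
    last_heartbeat: float  # when the peer last heard from this node, in seconds


def check_failover(primary: Node, backup: Node, now: float, timeout: float) -> str:
    """Promote the backup if the primary has been silent longer than timeout.

    Returns the name of the node that is primary after the check.
    The timeout is a hypothetical value, not an MSCS setting.
    """
    if now - primary.last_heartbeat > timeout:
        primary.role = "failed"
        backup.role = "primary"
        return backup.name
    return primary.name


node_a = Node("node-a", "primary", last_heartbeat=0.0)
node_b = Node("node-b", "backup", last_heartbeat=0.0)

# Heartbeat is only 2 seconds old: node-a stays primary.
active_early = check_failover(node_a, node_b, now=2.0, timeout=5.0)
# Heartbeat is 9 seconds old: node-b takes over.
active_late = check_failover(node_a, node_b, now=9.0, timeout=5.0)
print(active_early, active_late)
```

Only one node is active at a time in this sketch, which matches the failover (rather than load-balancing) design described above.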

Our hard drive configuration was complicated. The system had to be partitioned to optimize performance for our tests, and it also requires partitions for cluster operations. One drive in each node is used exclusively for the operating system. An UltraWide SCSI controller built into each node controls each operating system drive. The remaining 14 drives are split across two drive bays and are managed via the shared SCSI controllers. Six drives are in the internal shared cluster drive and eight drives are in the external array enclosure. Ten of these drives are striped with RAID-0 into two partitions. One partition is used for the file data set, and the other is used for the SQL data set. These 10 drives are managed by one of the RAID controllers that ship with the server.

The other RAID controller handles the remaining four drives. Two drives are striped with RAID-0 into two partitions used for SQL executables and the cluster quorum. The quorum partition is used for cluster system housekeeping.

The remaining two drives are striped with RAID-0 into one partition for the SQL log files.
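The drive layout is easier to check as a tally. The short Python sketch below simply restates the counts given above; the labels "RAID controller A" and "RAID controller B" are our own placeholders for the two shared controllers, not Compaq names.

```python
# (use, controller, number of drives), taken from the description above.
# Controller labels are hypothetical placeholders.
volumes = [
    ("operating system, node 1", "node 1 built-in SCSI", 1),
    ("operating system, node 2", "node 2 built-in SCSI", 1),
    ("file and SQL data sets", "RAID controller A", 10),
    ("SQL executables and quorum", "RAID controller B", 2),
    ("SQL log files", "RAID controller B", 2),
]

# Add up the drives behind each controller.
per_controller = {}
for _use, controller, count in volumes:
    per_controller[controller] = per_controller.get(controller, 0) + count

total_drives = sum(count for _use, _controller, count in volumes)
shared_drives = sum(
    count for _use, controller, count in volumes if controller.startswith("RAID")
)

print(total_drives, shared_drives, per_controller)
```

The totals agree with the configuration as shipped: 16 drives in all, 14 of them shared, with 10 behind one RAID controller and four behind the other.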

In this configuration, the RAID controllers are not redundant, so replacing the controllers will result in downtime. The controllers can be configured to allow failover. Our controllers were configured without failover so both controllers can be used simultaneously to improve performance.

The CL1850 offers a new twist in serviceability: server hot swap. The cluster management software from Microsoft is called Cluster Manager; it resides on either node and lets you take a node down from the cluster. The node can then be powered down and swapped. This process takes about five minutes and has no bearing on server performance or uptime. The fit and finish of the product's components could be polished to make swapping nodes easier. Once the node is out of the enclosure, it is easy to work on, with plenty of room to service all the components.

We used Benchmark Factory test software to determine how long it took for the server to switch context from one node to another and for the network clients to adjust. We found the clients need about 160 seconds to recover from the server switching active nodes. This is a little long by our estimation but relatively minor compared with the hard downtime that can occur during a server failure.

The Cluster Manager software also lets you move an application from one node to another, configure the nodes and monitor the cluster. The application is fairly intuitive, but the MSCS installation is cumbersome. The cluster does not always operate as expected, as we had to make some registry changes to alter the network configuration of the nodes.

But what about performance?

The CL1850 earned an 8.0 in overall performance. This server did best in file performance, scoring a respectable 8.3 due in part to the number of disk drives in the RAID-0 stripe set and the good performance of the RAID controllers. The CL1850 earned an 8.2 for CPU performance, which is about what we expect for dual 550-MHz PIII processors. The server scored a 7.5 in network performance due in part to possible network overhead of the cluster stack.

The CL1850 scored high with a 9.2 in features and flexibility because of its clustering capabilities, 66MHz PCI and drive capacity.

The CL1850 earned a 7.8 in manageability and a 10 for serviceability.

Manageability was lacking due to the bugs and usability problems with MSCS.

Compaq provides a proprietary management platform called Compaq Insight Manager, which includes Cluster Monitor and Intelligent Cluster Administrator.

Serviceability was a breeze, allowing an administrator to completely replace the processors and system memory one "server" at a time without any downtime.

The Compaq CL1850 is a great server if uptime in your small to midsize server is of utmost importance. Once the cluster is up and running, it is a breeze to maintain and service.

Configuration is another story. This server set is not for the faint of heart.

Installing the cluster with NT 4.0 with MSCS and then installing applications over the cluster can be frustrating. This is an issue with Microsoft clustering, not the Compaq design. As Microsoft cleans up the usability of its clustering software, especially with the addition of application load balancing over the cluster, these hardware cluster solutions have the potential to become prevalent in the mission-critical small to midsize server market.

Bass, a senior technical staff member at CNL, designs and leads the execution of the test suites. He can be reached at john_bass@ncsu.edu.

Server testing is performed at North Carolina State University's Centennial Networking Labs (CNL) in Raleigh, North Carolina. CNL tests network equipment and network-attached devices for interoperability and performance.
