Transforming the data center from hell
- 24 July, 2008 08:23
Some CIOs and data center managers have found themselves having to wring performance out of monstrously deficient facilities. Ultimately, these three wrestled with their infrastructures, made major changes and won the day.
Pomona Valley Medical Center CIO Kent Hoyos says he became embroiled in a too-literal version of "Dante's Inferno" when he tried to consolidate data center operations for the 436-bed acute-care hospital, located in Pomona, Calif., near Los Angeles. Escalating heat problems threatened ongoing operations and future technology deployments as temperatures inside the data center began to spike to as much as 102 degrees.
"It was horrific," Hoyos says. "It was about as dire a situation as I think anyone would ever want to have to deal with. We were trying to get as much out of our data center as possible, but what we wound up with was just an awful environment."
The firm that had originally been hired to design the consolidated data center had promised that there would be adequate cooling, but by the time temperatures began to increase, the firm had gone out of business and left Pomona with an untenable working environment.
The hospital was using two 5-ton air conditioners to cool the data center, but the increasing heat generated by the tightly spaced server installation began to push typical temperatures inside the data center past 90 degrees. Two portable air-conditioning units were placed on the floor, and a hole was cut into the facility's Plexiglas window to allow for the installation of a third air conditioner. But even then the heat persisted, so multiple box fans were hung at various points along the ceilings in an attempt to reduce problems at specific hot spots.
"It was one of the most ridiculous things you've ever seen," Hoyos says. "We simply did not have the infrastructure necessary to move forward."
Alas, it was not a divine comedy for Pomona. The hospital attempted to add a picture archiving and communication system (PACS) for digital radiology that required a large SAN and archiving platform, which "were just like putting a furnace in the room."
When one of the air conditioning units failed, the data center saw temperatures exceed 100 degrees, leading to the loss of several hard drives and a lab system. In all, more than US$40,000 worth of equipment was damaged, and Hoyos' IT staff was besieged by help desk calls.
Space limitations made it impossible to add more large air conditioners, and a raised floor only 6 inches high also limited options. Hoyos investigated using chilled-water solutions but was reluctant to introduce water in his data center, particularly since the small floor space would mean the cooling units would have to be mounted above the servers. He instead began working with Emerson Network Power's Liebert subsidiary and decided to install its XD high-density cooling systems.
Twenty Liebert XDV cooling units and two XDP pumping units were installed inside the facility to supplement the existing 5-ton units. The XD systems were mounted directly on top of the hot server racks, providing cooling of up to 500 watts per square foot. The units use a coolant that is pumped as a refrigerated liquid to the modules, where the coolant is then vaporized as a gas that absorbs heat.
In total, the new cooling units bought the facility the equivalent of 44 tons of air conditioning, allowing Hoyos to transform his previously sizzling environment to a more stable and comfortable 66 degrees.
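For readers more used to thinking in kilowatts than in tons of air conditioning, the figures above can be converted with a short sketch. It assumes the standard definition of a ton of refrigeration (12,000 BTU per hour, roughly 3.517 kW of heat removal); the function names are illustrative, and nothing here reflects Liebert's own sizing method.

```python
# Rough conversion of cooling capacity from tons of refrigeration.
# Assumption: 1 ton = 12,000 BTU/h, which is about 3.517 kW of heat removal.
BTU_PER_HOUR_PER_TON = 12_000
KW_PER_TON = 3.517


def tons_to_btu_per_hour(tons: float) -> float:
    """Return cooling capacity in BTU per hour."""
    return tons * BTU_PER_HOUR_PER_TON


def tons_to_kw(tons: float) -> float:
    """Return approximate cooling capacity in kilowatts."""
    return tons * KW_PER_TON


if __name__ == "__main__":
    # The two original 5-ton units, and the 44-ton equivalent cited for the new units.
    for label, tons in (("original units", 2 * 5), ("new units", 44)):
        print(f"{label}: {tons} tons ~ {tons_to_kw(tons):.1f} kW "
              f"({tons_to_btu_per_hour(tons):,.0f} BTU/h)")
```

By this estimate the original pair of 5-ton units could remove roughly 35 kW of heat, against roughly 155 kW for the 44-ton equivalent.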
"All of a sudden, the help desk got quiet," he says. "The problems that we were experiencing because of the un-optimized operating condition just ceased."
College faced dying servers from blackouts
As the College of Southern Nevada grew to its current size of about 40,000 students and 500 full-time faculty members, its IT operations expanded randomly in response to specific departmental requirements, leading to the operation of five separate facilities at three different campuses located as much as 30 miles apart.
"They were really little more than large network closets with a UPS and some server racks," says Josh Feudi, interim CIO. "We were trying to add more services for our students, staff and administration and began running into an increasing number of issues. Here in Nevada we had cooling problems, humidity issues, and then we started experiencing local power outages." Units, including servers, "were dying on us," he adds.
On days with high winds, the area began experiencing rolling brownouts that could lead to server crashes, Feudi says. As temperatures rose from spring to fall, the college's IT department was forced to turn off selected services, including limiting the admissions office's ability to access student data. The college was attempting to cool the five rooms primarily by using the building's general central air conditioning.
"Temperatures were getting into the hundreds," Feudi says. "We knew these data rooms had outgrown their usefulness and determined that centralizing in one location would have less of an economic impact than trying to upgrade the individual rooms."
Working with Hewlett-Packard and American Power Conversion, the college created a single consolidated data center that takes advantage of blade servers, virtualization software and specialized heat-containment and cooling modules.
The college had more than 250 physical servers in its five data rooms. Servers that were still under maintenance contracts -- about 150 -- were moved to the new central data center, and the remaining systems were consolidated using VMware virtualization software on three HP server blades placed in a single cabinet. As the older servers still in operation reach end of life, Feudi plans to further consolidate on server blades.
To address heat and cooling issues, the college began using APC's InfraStruXure hot aisle containment platforms. The platforms place two rows of servers back to back, which forces hot air into a middle row that is contained using a ceiling and doors on both ends. In-row cooling equipment inside the platform allows Feudi to directly address the contained hot air instead of attempting to cool all the heat that would be released throughout the data center in a conventional design.
The project was completed with no addition to IT staff, has significantly reduced staff travel time to the formerly far-flung locations, and has increased service availability from a low point of 79 per cent to 99.99 per cent today, he says. The new data center has allowed the college to expand services, enabling students to enroll in online courses in particular.
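To put those availability figures in perspective, the sketch below turns a percentage into hours of downtime per year. It assumes availability is measured around the clock over a 365-day year, which the college did not specify; the function name is illustrative.

```python
# Translate an availability percentage into expected downtime per year.
# Assumption: 24x7 measurement over a 365-day year (8,760 hours).
HOURS_PER_YEAR = 365 * 24


def annual_downtime_hours(availability_percent: float) -> float:
    """Return hours per year a service is unavailable at the given availability."""
    return (100.0 - availability_percent) / 100.0 * HOURS_PER_YEAR


if __name__ == "__main__":
    for pct in (79.0, 99.99):
        hours = annual_downtime_hours(pct)
        print(f"{pct}% availability ~ {hours:.2f} hours of downtime per year")
```

Under that assumption, 79 per cent availability corresponds to roughly 1,840 hours (about 77 days) of downtime a year, while 99.99 per cent corresponds to less than one hour.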
Healthcare facility: data center as physical hazard
Norton Healthcare has four hospitals, 10 urgent-care centers and more than 50 physician practice facilities under its direction, all serviced from a single data center housed in a 100-year-old building. The data center itself is about 35 years old and had been expanded without an overall floor design throughout its decades of use, says Mike Moore, technical planning director.
The 3,900-square-foot facility originally housed mainframes but moved to a client/server architecture beginning about 15 years ago. Equipment was added piecemeal over the years, resulting in more than 600 servers that were spread haphazardly across the floor.
"We had rows going horizontally, perpendicular, parallel, and there was certainly no hot-aisle/cold-aisle concept being utilized," Moore says.
The hospital had multiple CRAC units attempting to cool the facility, and the growth of server deployments inside the facility maxed out the available power for distribution inside the data center, making it impossible to add more equipment or expand services and applications.
One of the major limiting factors was the raised floor inside the data center, which had been constructed using 24-inch squares of wood that had at some point in the past been covered with 18-inch square pieces of carpet. As the hospital attempted to add more airflow to the floor, saws were used to cut out pieces of the wood floor and carpet, and aluminum vents were then placed on the floor to direct the flow up into the data center.
"When you walked through the data center, it was a physical hazard," Moore says. "You could trip over the edges of the carpet that laid one piece on top of another, and once the holes were cut, there weren't many alternatives to how you could move and lay out your equipment. When we eventually pulled up the floor, there was cabling underneath a foot and a half deep, and a lot of it was old mainframe cabling that had never been removed, which of course was severely limiting the distribution of cooling."
It was determined that a major renovation of the facility would be required, including removal of all the old floor and installation of a new floor manufactured by Tate Access Floors that would accommodate under-floor airflow, as well as new wiring and data cabling.
The renovation was completed in two phases, with the facility gutted and retrofitted one half at a time. As the new floor was readied, the server rows were laid out in a hot-aisle/cold-aisle design, and new Liebert CRAC cooling units were brought in to provide adequate airflow and cooling. In addition, a redundant power distribution and UPS system was installed.
Where previously people entering the facility walked directly onto the data center floor with no security measures, a network operation center was created in the new facility, as well as a "dead man's zone" with dual doors requiring a log-in for entry onto the data center floor.
Liebert's Nform monitoring software was added to monitor the cooling and power distribution equipment, as well as alarm systems. Norton is relying heavily on virtualization to reduce the number of physical devices in the new facility. There are currently about 475 physical servers and 175 virtualized servers.
"We've beefed up security, doubled our available power, increased workload by 100 per cent without increasing staff, reduced help desk calls by 20 per cent," Moore says. With the new data center now in full operation, he believes that the hospital will have adequate room for continued expansion for another five to 10 years, and the hospital now has the ability to begin planning for the construction of a separate disaster recovery site.
"We've extended the life of the facility we have substantially, while allowing us to continue to expand the services we can offer our more than 13,000 users on the system," Moore says.