Monday | 8 September, 2008
Computerworld
Not the usual back-end trouble
The physical network's forensic files find an unlikely culprit for the company's server failure
Anonymous (InfoWorld) 02/07/2008 10:51:53

In the late 90s I worked for a midsize ISP in the midwestern United States. Like any ISP at that time, we had dialup service, which meant we had a consumer-oriented tech support number. A friend of one of my co-workers was hired for dialup tech support. We'll call him Jake. He had some computer experience but no specific qualifications to do more than what he was hired to do.

Jake proved himself fairly quickly, and within a few months, he had risen through the ranks and was managing the tech support department. Everybody was pretty impressed with him overall, since it was rare for anyone with a clue or any ambition to work in dialup support.

Jake's prior job was a bit more ... active than this one, and the fact that he sat in a chair all day answering the phone started to manifest itself physically. The poor guy must have gained 30 or 40 pounds in a matter of a few months. Perils of the job, I guess.

I was the systems lead at that time. One day, the NOC contacted me to let me know our main user server was down. This affected shell accounts, personal Web sites, and customer e-mail service. We found the system up and talking to the network, but it was throwing errors left and right about the /home filesystem, which is where everything and anything that mattered on that machine lived.

After some remote tinkering, we decided to take a look at the machine itself. It was some kind of SPARC pizza-box machine, with an external disk array attached via SCSI cable. We quickly found the problem: the SCSI cable was dangling, barely hanging on to the socket. We got the disks back online and after some filesystem repairs, everything was back in order.

We were interested to find out what had happened, though. Cables don't just miraculously work themselves loose from their sockets. After some investigation, we found out the only person who had carded into the datacenter near the time of the outage was Jake. Since he had no business with the user server, we talked to him to find out what had happened. It turned out that he had been checking up on something in the next row, in the rack just behind the user server. The aforementioned SCSI cable was at just about the same height as his rear end ...

After that, whenever a server went down, we would joke that someone must have "pulled a Jake" on us again.

Some years later, Jake had moved into Network Engineering, and he was tasked with doing a UPS bypass test during a facility audit. The UPS had a big dial on the front with several positions (on, bypass, and so forth). He turned the dial from "on" to "bypass" at the proper time, and was suddenly standing in a very dark, quiet room. Nobody had told him that you couldn't turn the dial that way; you had to turn it the other way, through every other position, to execute a bypass.

The next day we all asked him if he'd turned the dial with his rear end ...
