Communications failure dooms IDS alert process

Despite the sophisticated technology in place, when people fail to communicate with each other, the process breaks down

Our intrusion-detection system consists mostly of PCs that act as network sensors by running Snort open-source software. The IDS worked very well in giving us an early warning of an impending SQL Slammer attack a few months ago. But communication between my group and the operations group broke down, turning what should have been a minor issue into a major problem. Now management is talking about merging remediation responsibilities into my small group -- something we're not prepared to handle.

We have more than 25 IDS sensors across our network worldwide, and we can see about 90 percent of the company's internal network traffic. The remaining 10 percent comes from our engineering labs and remote sales offices, which we plan to monitor as soon as we can get the resources.

Our IDS gives us a unique view into our network. We're the only IT organization in the company that can see all traffic as it enters and leaves the network and examine it at the packet level. With this comprehensive view, it's not surprising that we were the first to observe initial SQL Slammer activity.

The Slammer worm entered our network via an unpatched server in one of our engineering labs. The person monitoring the IDS noticed outbound traffic consistent with SQL Slammer at about 7:30 one morning and traced it back to a lab server. The staffer sent an e-mail that included details on the suspected traffic and followed up with a phone call and a voice-mail message. The operations group gets so many e-mails that if you don't let it know you've sent something important, the message might get missed. That's exactly what happened this time. The e-mail alert wasn't read, and our voice message wasn't retrieved in time to block the attack.
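
To make the detection concrete: Slammer propagates in a single UDP packet sent to port 1434, the SQL Server resolution service, with 0x04 as the first byte of the payload. A simplified, hypothetical local rule in Snort's rule language illustrates the kind of signature a sensor can fire on for outbound traffic (our production rule set and variable definitions differ):

    # Hypothetical, simplified rule for outbound Slammer-like traffic
    alert udp $HOME_NET any -> any 1434 \
        (msg:"Possible SQL Slammer propagation (outbound)"; \
        content:"|04|"; depth:1; content:"sock"; content:"send"; \
        classtype:misc-attack; sid:1000001; rev:1;)

A match on outbound traffic like this is what told us the infection was already inside the network, on that lab server, rather than knocking at the perimeter.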

Although the SQL Slammer worm was initially released in January 2003, variations of it continue to float around the Internet. Meanwhile, people at my company are still deploying new servers, especially in lab environments, without the proper patches and service packs installed. That leaves us vulnerable to Slammer and many other exploits.

The consequences have been costly. During this latest incident, we had to configure access-control lists on key routers to contain the attack, a job that tied up 15 to 20 people for many hours. If the machines had been patched, there might not have been an incident at all.
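
To give a sense of what that router work involved: the stopgap on each affected router was an access list that drops the worm's traffic, UDP destination port 1434, at the interface. A hypothetical Cisco IOS fragment (the list number and interface name are placeholders) looks roughly like this:

    ! Hypothetical stopgap ACL -- drops SQL Server resolution traffic wholesale
    access-list 115 deny udp any any eq 1434 log
    access-list 115 permit ip any any
    !
    interface FastEthernet0/0
     ip access-group 115 in

It's a blunt instrument, since it can also block legitimate SQL Server name-resolution traffic, and repeating the change across key routers worldwide is what consumed those staff hours.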

But even with the machines unpatched, better communication and a more timely response to our initial warning would have kept the problem from escalating.

My team and I are trying to address the lab vulnerability. Since we have limited control over how the lab builds servers, we're in the process of deploying a device between the lab and corporate network segments that will provide URL filtering, virus scanning and some firewall protection. We also plan to address the communication and reporting problem by deploying a data-correlation tool that routes alerts to a single, manageable console instead of relying on e-mail, which has proved too easy to miss.

After we regained control of the situation, the IT security group received an e-mail from a high-level manager suggesting that we be the central point of contact for all virus-related activity and that we should be responsible for managing and creating all incident reports for viruses. This made sense, he argued, since my team has consistently been the first responder whenever malicious code has appeared on our networks.

While we were grateful for the recognition, the manager's e-mail also concerned us and prompted much discussion. On one hand, we're in the best position to detect malicious activity within the network, and we can provide the most meaningful information on issues ranging from viruses to hacking activity. On the other hand, we don't feel that our small group should be responsible for managing every virus once it has been detected.

We should handle some incidents. But with some guidance from the security group, the desktop-support group has traditionally handled viruses very well. Given our staff size and abilities, our group should be used as a resource for detecting problems. But remediation should continue to be handled by other groups. Because of our limited resources, managing virus problems would consume most of our time and hurt our ability to attend to other security-related matters. This is also a politically hot turf issue. We need to respond very carefully, so as not to alienate our peers in the other IT operations groups.

This article is written by an IT manager whose name and employer have been disguised for obvious reasons.
