A disk failure in a Sun Microsystems server caused the US Federal Aviation Administration's NOTAM database to crash for nearly 20 hours last week, according to the FAA.
The NOTAM (notice to airmen) system alerts pilots to issues affecting airports, equipment, and security. The system went down late on May 22 and was back up at around 7 p.m. on May 23.
Because of the disk failure, information had to be delivered to pilots through local air traffic controllers and alternate systems, including a Web site set up to disseminate the most up-to-date information, said Barry Davis, manager of Aeronautical Information Management for the FAA. However, flight safety was never a problem, the FAA said.
"What happened was the drive in an end-of-life Sun box failed in the middle of updating the information on the hard drive, so it screwed up the database," Davis said.
Davis said that was only the beginning of the complications. His team replaced the hardware and the drive on May 22, which got the system running again.
"We already had the equipment to replace [the box], we just hadn't done it yet, and that's why the hardware recovery was quite simple - we just put the boxes in," Davis said.
But even then, the system was running slowly, in a degraded mode, and it got so bad, Davis said, that his team decided to reopen the problem to see what was going on.
As the technicians were working to fix the database, they decided to switch to the back-up system. As they did so, they realized they had written the error over to the back-up system and corrupted it as well, Davis said.
"So because we had already replaced the hardware and the drives, we just had to pull the latest information and extract it out of the [corrupted] database, then re-import it into the [new] database," Davis said. "Then we resynchronized all of the subsystems so everyone had the same database copy, and then we opened the gates up at 4:40 p.m. on Friday so that all of the information would come into the system."
Davis and his team spent the rest of that night monitoring the situation to make sure there were no other errors.
While the automated system was out, pilots and other affected organizations were able to get the latest information from a Web site set up for that purpose. Although everything was updated by 7 p.m. Friday, Davis said the decision was made to keep the Web site up until midnight as a precaution.












