On the morning of September 11 in Manhattan, US, 800 workers at global investment bank Lehman Bros were in their offices on the 40th floor of the first World Trade Centre tower, while another 600 staff were working at the World Financial Centre on a typical business day.
Then disaster, in the form of a terrorist attack, struck.
Amazingly, there was only one life lost, but the headquarters of the investment bank as well as millions of dollars of equipment were destroyed.
In the fast-moving world of trading, the bank could not afford to be down for long, yet in one fell swoop it had lost power, its primary data centre and equipment, while its displaced workers were scattered throughout the city.
Rafman Azeez, vice president of global e-commerce strategy and architecture at Lehman Bros, spoke to Computerworld about the disaster recovery efforts that went into full swing. Even now, as these stories continue to emerge, it is hard for CIOs to manage a disaster recovery effort of this scale.
"We switched to our backup site, but it was not set up to handle the same amount of load as the primary site," Azeez said. "Displaced employees who didn't move to our New Jersey site were set up in the Sheraton Hotel, or set up as remote workers and worked from home."
Michael Cunningham, EMC's regional program manager of business continuity, said although Lehman Bros, and other organisations in the WTC, had a disaster recovery plan, "a large extent of disaster recovery effort was thought up on the fly".
EMC was among many vendors, including RSA Security, Citrix, Compaq and IBM, that stepped in to help Lehman get operational again.
"Compaq shipped over a bunch of servers and IBM sent us laptops," Azeez said.
"But the laptops were brand new and not configured. There were 5000 laptops and it takes over an hour to configure each one. The task of reinstalling all the applications needed and configuring the laptops would have taken forever," said Michael Abel, product marketing manager at Citrix.
To overcome this, Lehman switched to a thin-client server environment provided by Citrix's ICA product.
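A rough back-of-the-envelope calculation shows the scale of the manual alternative Abel describes. This is only a sketch: it rounds "over an hour" down to exactly one hour per laptop, and the team of 20 technicians working eight-hour shifts is a hypothetical figure, not one reported by Lehman or Citrix.

```python
import math


def configuration_effort(laptops, hours_per_laptop, technicians, hours_per_shift):
    """Return (total technician-hours, number of full-team shifts needed)."""
    total_hours = laptops * hours_per_laptop
    # Each shift, the whole team together clears technicians * hours_per_shift hours of work.
    shifts = math.ceil(total_hours / (technicians * hours_per_shift))
    return total_hours, shifts


# 5000 laptops at one hour each (from the article);
# 20 technicians on 8-hour shifts (assumed for illustration only).
total_hours, shifts = configuration_effort(5000, 1.0, 20, 8)
print(f"{total_hours:.0f} technician-hours, about {shifts} shifts")
```

Under those assumptions the job runs to roughly a month of shifts, which is the delay the thin-client approach avoided.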
"We just needed ICA client and the user downloaded the applications themselves. This way we could deploy the applications very fast," Azeez said.
"With the thin-client environment, nothing is installed on the laptop. Users just go to their DOS client and download what they need and off it goes," Abel said.
Remote workers could log in from their home PCs and use this facility after the attacks, whatever operating system and hardware they had at home, Azeez said.
"No matter if they were running on Wintel OS from 95, 98, ME, NT, 2000 or XP, or even on a Pentium 166MHz with 64MB of memory, they could still do it. You should see how many home users were running on Windows 95 at 64MHz. Our large trading apps would just kill it. But with thin client, it's all done on the server, so we could support 1500 users concurrently."
As more Compaq servers arrived, Azeez said they could just keep adding users in.
Azeez said the new environment also highlighted the redundancy in the terms "primary" and "secondary" data centres.
"In current disaster recovery thinking, you've got a primary and a back-up data centre. But if the workplace is virtual to begin with and you can get access at home or in the office, does it really matter if there is a primary or back-up to get access anywhere?" he said.
Since September 11, remote usage has dropped off, but the number of applications published has increased, which means more workers are logging on at home after hours, Azeez said.
Many companies in the US are running on two data centres. Cunningham pointed out that when US organisations lost a data centre as a result of the attacks, many were left with just one data centre running, "which means you've got a single point of failure again. Now companies in the US are investigating the option of putting in three or four data centres."
Security was also a concern for the displaced workers at the Sheraton. "There were cleaning staff wandering in and out of the rooms while people were logged on, so we had RSA SecurID cards which identify persons logging on and also an automatic shut-off feature on the connection. If there was no activity for 15 minutes, the connection would close down," Azeez said.
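The automatic shut-off Azeez describes is a standard idle timeout. The sketch below shows the general idea only; the class and method names are hypothetical and it does not represent how Citrix or RSA implemented the feature. The only figure taken from the article is the 15-minute limit.

```python
import time

IDLE_LIMIT_SECONDS = 15 * 60  # the 15-minute limit quoted in the article


class IdleSession:
    """Hypothetical session that closes after a period with no activity."""

    def __init__(self, idle_limit=IDLE_LIMIT_SECONDS, clock=time.monotonic):
        self._idle_limit = idle_limit
        self._clock = clock  # injectable so the timeout can be tested without waiting
        self._last_activity = clock()
        self.open = True

    def touch(self):
        """Record user activity (a keystroke, a mouse move) on an open session."""
        if self.open:
            self._last_activity = self._clock()

    def check(self):
        """Close the session if it has been idle too long; return whether it is open."""
        if self.open and self._clock() - self._last_activity >= self._idle_limit:
            self.open = False
        return self.open


session = IdleSession()
session.touch()
print(session.check())  # the session was just used, so it is still open
```

A server would call `check()` periodically for each connection. Once a session has closed it stays closed, so the user has to authenticate again.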