My Exchange Server Hell

10:00am Ha! There is a God: saved from the WIP meeting by the finance director's e-mail problem. My gut feeling: a local software issue, easily fixed.

10:03am As I step outside, worried workers ambush me and I realise it must be more serious. In fact, by the time I reach the server room I am being mobbed. A few in marketing and sales curse and swear bloody revenge unless those responsible fix it. That's me. There is the usual barrage of questions. Do they think I just go into this magical room and flip a switch? I wish. When the e-mail system goes down the world seems to stop. From this point onwards it’s just a rollercoaster ride of adrenalin and heart-stopping action until the server starts humming again.

10:15am One of the most heart-stopping moments of any Exchange Server crash is checking whether the databases have dismounted. They had.

I’d come to the conclusion that I was going to have to run a consistency check but was aware that it would take a good hour to complete due to the size of the information stores. Sometimes you have to take one step back to go two steps forward. I alerted everyone that we were working on the problem but did not have an ETA just yet.

11:45am Bingo! Check completed and the databases mounted successfully. I felt 10ft tall and headed through the sales and marketing department skipping and shouting that the system was back up. There’s always one smarty pants, and they are typically from the sales department. “What took you so long?” he said. I was tempted to reply, “Well, had you not hassled me so much, I might have got the job done faster.” I was tempted, very tempted, but chose to walk away.

1:30pm Had lunch and my day was cruising; my ‘to do’ list had more ticks than crosses. Just a second ... freeze frame. Surely I could not get off that easily. Talking to the HR manager, we shared a joke or two about system crashes, and it was smiles all round. This turned out to be the kiss of death. There I was, saying how we had had only one hour of downtime in six months, when I felt a firm hand on my shoulder. It was the finance director again, and he had bad news. Warning: never brag about server uptime.

1:45pm My user base knew me well; when they saw me sprinting through the office like an Olympic athlete they knew there must be something seriously wrong. Within a couple of hours of the first fix, the mail system was down again. Both information stores had dismounted again and, having no other option, I applied the same fix. An hour later the system was back up.

3:00pm Although we were up and running, I didn't know specifically what the problem was, so I took a drastic step. We had been looking at installing a new Exchange Server for a while, and our current system was running on dissimilar hardware to our other systems. I am a big fan of standardization and fortunately I had a new Compaq ProLiant DL380 in my testing area. I made a quick call to my supplier and ordered six new disks; the lead time was a couple of hours, so I played the waiting game.

5:00pm The system was still up when the courier arrived, but I got straight to work. Within the space of an hour I had the hardware configured, the RAID array was nearly formatted and I was ready to install Windows.

6:30pm As the office emptied I got the feeling that this was going to be a long night. You know that lonely feeling when you see the arrival of the office cleaners? The echo of vacuum cleaners throughout the empty office set the mood for the rest of the evening.

8:00pm It was a slow process but I was starting to make strides forward in the installation of Exchange, although I still wasn't sure why the server had crashed twice in one day.

12:30am The Exchange back-end was up and running on the new server. I’d decided to keep both servers in the production environment at once and configure a routing group between them so that they could pass messages to each other.

1:30am Finally... all configured. The tests had completed successfully and I was ready to move users between servers. Since we had more than 10GB of mailboxes this was going to take some time. Not being the type who takes risks, I decided to move the mailboxes in batches of five, starting with the smaller ones. It would be too risky to start with the executive mailboxes, so I focused on the data entry department.
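For the technically curious, the batching plan above can be sketched in a few lines of Python. This is only an illustration of the ordering idea, not what I actually ran that night: the `Mailbox` type, the `plan_batches` helper, the sample names and sizes are all hypothetical, and pushing executive mailboxes to the very end is an assumption (the only firm rule was not to start with them).

```python
from typing import NamedTuple


class Mailbox(NamedTuple):
    owner: str               # hypothetical mailbox name
    size_mb: int             # hypothetical size in megabytes
    executive: bool = False  # assumption: executives are moved last


def plan_batches(mailboxes, batch_size=5):
    """Order mailboxes smallest first (executives last), then split into batches."""
    # False sorts before True, so non-executive mailboxes come first;
    # within each group the smaller mailboxes come first.
    ordered = sorted(mailboxes, key=lambda m: (m.executive, m.size_mb))
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


if __name__ == "__main__":
    sample = [
        Mailbox("ceo", 900, executive=True),
        Mailbox("data-entry-1", 20),
        Mailbox("data-entry-2", 35),
        Mailbox("sales-1", 400),
        Mailbox("marketing-1", 250),
        Mailbox("hr-1", 120),
        Mailbox("finance-director", 700, executive=True),
    ]
    for number, batch in enumerate(plan_batches(sample), start=1):
        print(number, [m.owner for m in batch])
```

Moving the small, low-risk mailboxes first means any problem with the new server shows up early, while little data is in flight.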

2:00am And well into the graveyard shift. My friend the business analyst, who always seemed to work very late, had now gone. To remain focused I headed to the kitchen and made a cup of tea using the biggest mug I could find.

2:15am Then the answer came to me. Even though the antivirus software was not meant to scan the Exchange directory, it appeared that on this occasion it had taken it upon itself to scan one of the main log files in Exchange and quarantine it. I was fairly sure that this was the problem but still decided to continue with my migration. I also modified the antivirus settings so that they definitely did not scan the Exchange directory. Solution in place.

3:00am I had now moved half of the mailboxes and tested a couple at random on the new mail system, and everything was looking very rosy.

3:30am I decided I was here for the long haul, so I took a taxi home and had a quick shower. Not sure whether to have lunch, dinner or breakfast, as my days had rolled into one, I asked the taxi driver to take me through the 24-hour McDonald's on the way back to the office. Funnily enough, the Big Mac tasted like the best gourmet burger I had ever eaten.

5:00am All the mailboxes had now been moved to the new system but the MX record in my DNS still pointed to the old system. I needed to change this, since it left the old server as a single point of failure for incoming mail. I contacted my ISP. Even at this early hour someone answered the phone, but they were unable to make the change since they only worked on the helpdesk, so they paged an engineer who was on call. I wasn’t the only one working through the night to fix an Exchange crash!

5:15am Now that’s what I call service. A touch less than 15 minutes later my MX record had been pointed at the new IP address and all new mail was being delivered to the new system. Things were finally coming together in a big way.

6:00am The first couple of workers arrived in the office and muttered morning greetings as they rubbed the sleep from their eyes. Having lost track of time, I nearly replied good evening. Everything was feeling a little surreal, as the rush of adrenalin had long since gone and the overwhelming feeling of tiredness was starting to engulf me.

6:30am I suddenly realised that the interstate offices would not be able to see our new mail system due to the access control lists on the head office firewall. It was not the time to be re-inventing security policies but, crossing my fingers, I made the changes without making any mistakes.

7:30am As more people trickled into the office, I got the odd comment about what I had been up to last night. No one realised that I had actually been here all night, since there was no interruption in service and everyone was happily accessing their mailboxes. Luckily I had taken a shower earlier and changed clothes, so everyone thought I had actually slept. Little did they know I had been there for 24 hours straight!

8:30am My boss arrived and, as he is not one for pushing people to work through the night, gave me a pat on the back and a bacon roll with HP brown sauce from the café downstairs. At this stage he probably didn’t realise it would only keep me going for another hour.

9:00am Now for the niceties. Some would call them necessities, but the target so far had been to get Exchange up and running. We have a strong mobile user base that uses Outlook Web Access heavily, and some users even have BlackBerry devices, so of course I had to reconfigure those as well.

12:00pm Three hours later I had successfully reconfigured OWA on the new server, our BlackBerry Enterprise Server was forwarding e-mails back out to Research In Motion again and the CEO was very happy!

1:00pm It was finally time to call it a day. All systems were up and running, we had a multiple Exchange server environment for fault tolerance purposes and it appeared that the original problem had been resolved. My boss offered to drive me home in his flash new BMW M3. How could I say no? Trying to make conversation on my way back was virtually impossible, so all I could do was nod and agree.

1:30pm Finally back at home, I was exhausted but satisfied: a job well done. I felt like an insomniac and it was all I could do to take off my shoes, check my e-mail (I am an IT manager after all and it could have been something important) and fall into bed. Ahh, blessed sleep!
