CIOs looking to find a reliable cloud provider should seek one whose service has already fallen over, according to Dan Foody, vice president at Web services management provider Progress Actional.
“Most people who look for a SaaS or cloud provider will look for one who has never had a major failure, but that’s the exact opposite of what they should do,” he says.
“You want to look for someone who has had a major failure: when someone’s house burns down, the next one they will put sprinklers in, where they might not have thought about having them in their first house.”
Only by having major failures can a provider of cloud services understand what it really takes to address and resolve downtime issues, Foody says.
Reliable cloud computing can be further complicated by questions of ownership and responsibility when problems arise, he says.
“If a company’s IT organisation builds and designs an application, but they deploy it on something like Amazon’s EC2 cloud infrastructure, who is responsible if something goes wrong: the IT team or Amazon?” Foody says.
“Having an SLA with your service provider is rarely enough, as the moment you customise an application, you as an organisation need to ensure it’s working. You need to make sure where the line of responsibility is drawn.”
Smaller cloud providers can complicate the issue further, Foody says, in that they will often rely on fourth- or fifth-party infrastructure or billing systems for their third-party CRM application.
“The lesson for IT organisations is that you really have to understand whether your provider is taking responsibility for their downstream providers and what their contingency plan is for those providers,” he says.
Foody says CIOs should consider four points to improve their organisation’s cloud reliability:
1. A contract is not enough: CIOs need to understand how their provider will achieve the SLA and quality of service being promised to them. Understand what the provider’s contingency plans for failures are, along with its processes, internal architecture and systems.
2. Look to people who have been there before: Speak to people who have been through failures in practice, not just on paper. Find out where their internal processes broke down, and how they are now better prepared for the next time a failure occurs.
3. IT is responsible: Make the IT organisation aware that, in most cases, it is responsible for the application logic layer if anything goes wrong. It needs to have the tools and the technology in place with its provider so it can resolve that level of issue.
4. Diagnostic tools: Given the physical distance between a cloud provider and the customer, the customer needs to have the diagnostic tools in place in advance to proactively find problems. Many service providers don’t have good stories for how a customer can do that.
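To make points 1 and 4 concrete, here is a minimal Python sketch of the arithmetic a customer-side diagnostic tool might do: turn an SLA percentage into a downtime budget, and compare it with availability measured from the customer's own probes. It is an illustration only, not a method Foody describes. The `Probe` record, the function names, the 30-day month and the 2,000 ms latency threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    """One customer-side health check result (hypothetical record shape)."""
    ok: bool            # did the request succeed?
    latency_ms: float   # how long the response took


def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Downtime budget implied by an SLA percentage over a period of `days`."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


def measured_availability(probes: list[Probe], max_latency_ms: float = 2000.0) -> float:
    """Percentage of probes that succeeded within the latency threshold.

    The threshold is an assumption: a response that is too slow is
    counted as a failure from the customer's point of view.
    """
    if not probes:
        raise ValueError("no probes recorded")
    good = sum(1 for p in probes if p.ok and p.latency_ms <= max_latency_ms)
    return 100 * good / len(probes)


def meets_sla(probes: list[Probe], sla_percent: float,
              max_latency_ms: float = 2000.0) -> bool:
    """True if the measured availability is at least the promised SLA."""
    return measured_availability(probes, max_latency_ms) >= sla_percent
```

Under these assumptions, a 99.9 per cent SLA leaves a budget of about 43 minutes of downtime in a 30-day month. Measuring from the customer's side, rather than relying only on the provider's status page, is what lets the IT organisation see failures in its own application layer.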