Coping With a Cloud Outage
By Samuel Greengard | Posted 03-27-2014
In recent years, as organizations have embraced cloud computing, CIOs and other executives have witnessed significant gains. In many cases, their enterprises have boosted IT availability, reduced demands on internal infrastructure and notched productivity improvements along with cost savings. Last October, Gartner reported that cloud computing will account for the bulk of IT spending by 2016 and that half of all cloud services will take a hybrid cloud approach by 2017.
But as more and more organizations drift into the cloud, one fact is perfectly clear: the risk of an outage or outright failure is real, and such an event could have significant repercussions both while it lasts and afterward. Already, a number of high-profile cloud providers have endured episodic outages and failures, including Amazon Web Services, Google Drive, Dropbox and Microsoft Azure. In some instances, companies using these products and services haven't just endured downtime; they've also lost data.
"There's a fairly high degree of certainty that at some point an outage will occur," says Jay Heiser, a research vice president at Gartner. "Any business using the cloud must ensure that a contingency plan is in place," he says. This encompasses everything from a short-term outage to a cloud provider exiting the business. "It's not enough to make sure that all data is backed up," says Heiser. "It's critical to have a way to get to it and redeploy it. Otherwise, a company might find its own business at risk."
One prominent example of a cloud provider that shut down is Nirvanix, which offered public, private and hybrid clouds and storage services. The company, which was founded in 2007, simply disappeared in 2013. Customers received only 14 days' notice of the closure, and that left some of them reeling. "You don't move a petabyte of data in two weeks," Heiser notes. "If a cloud provider fails, you could be left with a difficult, if not impossible, situation."
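A rough calculation shows why two weeks is so tight for a petabyte. The sketch below is illustrative only: it assumes a decimal petabyte (10^15 bytes), a link that runs at full speed around the clock, and no protocol overhead, none of which comes from Heiser or Nirvanix. The function names are hypothetical.

```python
# Back-of-the-envelope transfer math for moving data out of a failing provider.
# Assumptions (not from the article): decimal units, a fully saturated link,
# zero protocol overhead and no retries.

PETABYTE_BYTES = 10**15  # decimal petabyte
SECONDS_PER_DAY = 24 * 60 * 60


def required_gbps(data_bytes: float, days: float) -> float:
    """Sustained throughput (gigabits per second) needed to move data_bytes in days."""
    seconds = days * SECONDS_PER_DAY
    return data_bytes * 8 / seconds / 1e9


def days_to_transfer(data_bytes: float, link_gbps: float) -> float:
    """Days needed to move data_bytes over a link sustaining link_gbps."""
    seconds = data_bytes * 8 / (link_gbps * 1e9)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Throughput needed to evacuate 1 PB inside a 14-day notice period.
    print(f"{required_gbps(PETABYTE_BYTES, 14):.2f} Gbps")      # prints 6.61 Gbps
    # Time the same petabyte would take on a saturated 1 Gbps link.
    print(f"{days_to_transfer(PETABYTE_BYTES, 1.0):.1f} days")  # prints 92.6 days
```

Under these assumptions, a customer would need to sustain more than 6 Gbps for the full 14 days, while a saturated 1 Gbps connection would take roughly three months. Real-world transfers would likely be slower still.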
Of course, it's critical for CIOs to conduct due diligence and hammer out a solid service-level agreement before signing any contract. But where the situation gets particularly tricky, Heiser points out, is understanding exactly how a cloud provider handles data and how it replicates it within its data center or beyond. "Every cloud provider advertises that they offer business continuity capabilities," he says. "The question is whether they operate a mirror site of their own. If there's some kind of failure that brings down both the primary site and copies of the data, recovery may not be possible."
Anil Desai, an independent technology consultant based in Austin, Texas, says organizations can hedge against the risks of a public cloud by using a hybrid approach to help in the event of an outage. "It may be possible, especially if you have taken steps ahead of time, to redirect some of your traffic back to your own resources," Desai says. He also believes it's crucial to view IT in a holistic way and examine everything from bandwidth, cloud connections and internal infrastructure to APIs, security and policies before adopting a plan. "Any weak point can become a factor during an outage," he adds.
Businesses shopping for cloud services should ask a number of key questions, Heiser says, including: Where is the offline backup stored? How long will it take to restore? What is the process for restoring data? "Too often, providers cannot adequately answer these questions," he says. But Heiser also believes it's important to weigh an organization's ability to handle IT tasks internally. "The reality is that many small and medium companies, and even some larger ones, are not able to back up and manage data as effectively, or achieve the same level of availability and business continuity, as a cloud provider."
One thing is certain: these challenges won't disappear anytime soon. Current standards are relatively weak and businesses using cloud services are doing little to push for change. "The Cloud Security Alliance and ISO are not looking into these issues," Heiser says. "Until buyers begin demanding more information, better answers and better capabilities, there's no reason for vendors to become more vigilant."
The takeaway? "Buyers need to be more aggressive about getting the right information up front," explains Heiser, "and having a contingency plan in place when an unforeseen outage or failure occurs."
About the Author
Samuel Greengard is a contributing writer for CIO Insight. His previous CIO Insight article is "Sustainability 2:0: The CSO's Role Keeps Growing."