What the Five-Day AWS Outage Means for the Cloud

By CIOinsight  |  Posted 04-27-2011
Amazon Web Services experienced disruptions in its EC2 hosting service, taking down popular Web sites including Foursquare, Reddit, Quora and Hootsuite, leaving IT managers to wonder whether to continue using the service.

Five full days after its largest outage hit on the morning of April 21, Amazon Web Services said it finally has restored virtually all services to its customers.

However, there still are a lot of smoldering IT managers who haven't yet cooled off completely from the outage that started at 1:41 a.m. PDT April 21 at the AWS data center in Northern Virginia.

The mishap caused disruptions in AWS's EC2 (Elastic Compute Cloud) hosting service, knocking thousands of Websites--including such popular ones as Foursquare, Reddit, Quora and Hootsuite--off the Internet. A limited number of customers still were reporting data being "stuck" in the EBS (Elastic Block Store) service on April 25.

Income that AWS-hosted businesses lost during that one- to five-day window will never be regained. This was a serious business problem for hundreds, perhaps thousands, of IT managers, who are now wondering whether to continue using the service.

"EBS is now operating normally for all APIs and recovered EBS volumes," Amazon reported April 25 on its status dashboard. "The vast majority of affected volumes have now been recovered. We're in the process of contacting a limited number of customers who have EBS volumes that have not yet recovered and will continue to work hard on restoring these remaining volumes." The company said it will post a detailed incident report.

What are industry people saying in the wake of the mishap? What might be the long- and short-term results of an outage that shackled one of the sturdiest, most trusted Web services providers in the world?

Reaction from Far and Wide

Several AWS users commented with frustration on eWEEK stories covering the mishap. The blogosphere, as one might imagine, was rife with commentary.

"In short, if your systems failed in the Amazon cloud this week, it wasn't Amazon's fault," blogged O'Reilly Media's George Reese. "You either deemed an outage of this nature an acceptable risk or you failed to design for Amazon's cloud computing model. The strength of cloud computing is that it puts control over application availability in the hands of the application developer and not in the hands of your IT staff, data center limitations, or a managed services provider."

For more, read the eWEEK article: Final Thoughts on the Five-Day AWS Outage.



 
