Last Friday night, severe storms knocked out power to millions on the East Coast of the United States, and to Amazon's Elastic Compute Cloud in northern Virginia. Just as tornadoes seem to target trailer parks, these huge cloud outages seemed to hit social media sites hardest, although I suspect those were just the most visible services, since people tend to use them more on the weekend. It makes me wonder whether losing Pinterest and Instagram (among others) makes us less sociable or more.
Now I'm not one to throw stones at Amazon or any other IaaS provider. I worked in the service provider industry for years, and I understand that service providers operate under a much more powerful microscope than internal IT organizations do. One of the benefits of cloud computing is its elasticity and ability to move workloads around, so in theory, a cloud provider should be more resilient when dealing with natural disasters, spikes in demand, or other causes of performance problems. But as we've seen here and in other highly publicized outages, public cloud customers can't expect this sort of resiliency to happen magically, without some planning and investment. Service providers have to be profitable, so they limit their services to exactly what is ordered. Unlike an internal IT organization, they are in business for themselves, and you get only what you pay for.
That doesn't necessarily mean that running IT services from a public cloud is a bad choice. An internal IT organization can fall victim to natural disasters just as easily. What it does mean is that buyers of cloud services have to take caveat emptor more seriously, realizing that outsourcing the computing infrastructure of a service doesn't equate to outsourcing all the risks associated with that service. It is easy to slip into a false sense of security and assume that the public cloud provider is supposed to handle these details.
Much has been written about how this latest outage exposed the risks of cloud computing, but I don't think that's a fair way to look at it. The reality is that the risks of outages, security breaches, regulatory violations and performance problems exist whether an IT service is run from a public or private cloud, or traditional infrastructure. The level of risk and how these risks are managed and mitigated are going to be different in each scenario. So, while we can't control the weather, choosing the right cloud solution (or no cloud) should include a measured look at all the risks and costs involved, with the same concern for disaster recovery, data protection, performance and availability regardless of the venue.
Jul 03 2012, 02:59 PM