This content is part of the Essential Guide: Breaking down what's in your cloud SLA

When cloud providers fail: Recovering from an SLA violation

Chris Moyer offers insights into cloud provider service-level agreements and what to do in the case of an SLA violation.

We're evaluating cloud providers, and I've heard some horror stories about bad service-level agreements (SLAs). What should we watch out for?

SLAs are tricky. An SLA doesn't necessarily cover your specific servers, or even your particular service.


Instead, what most cloud providers cover in their SLAs is the general availability of their services. That means any single server can go down without it counting as an SLA violation.

Cloud-provider SLAs also typically allow for a certain amount of failure before a problem qualifies as an outage. For example, with Amazon Elastic Compute Cloud, or EC2, it's considered a true outage only if all your instances within two availability zones (AZs) are down. That means that if a single AZ is down, or if you're running in only a single AZ, you're not covered by the SLA.
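To make that threshold concrete, here is a minimal sketch of the rule as this answer describes it. It is a simplified reading, not the provider's actual SLA text: the function name `qualifies_as_outage` and the input shape are hypothetical, and it assumes "two AZs" means at least two zones in which every one of your instances is down.

```python
from typing import Dict, List


def qualifies_as_outage(instances_up_by_az: Dict[str, List[bool]]) -> bool:
    """Return True if the rule described above would count this as an outage.

    instances_up_by_az maps a zone name to one boolean per instance you run
    there; True means the instance is reachable. (Hypothetical input shape.)
    """
    # Only zones in which you actually run instances are considered.
    populated = {az: states for az, states in instances_up_by_az.items() if states}
    # Zones in which every one of your instances is down.
    fully_down = [az for az, states in populated.items() if not any(states)]
    # Assumption: the rule requires at least two such zones.
    return len(fully_down) >= 2


# Running in a single zone never qualifies, however bad the failure is.
print(qualifies_as_outage({"az-a": [False, False]}))
# Two zones, all instances down in both: this would qualify.
print(qualifies_as_outage({"az-a": [False, False], "az-b": [False]}))
```

The first call prints False and the second prints True, which matches the point above: a deployment confined to one AZ falls outside the SLA.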

It's also important to understand what the SLA covers in terms of what you'll receive and how you'll receive it if the provider breaches the agreement. In Amazon's case, you typically receive only statement credits, which don't help you recoup any potential revenue lost during downtime.

In addition, many providers specifically remove the requirement to automatically grant such credits, instead offering a way for you to request statement credits if you see an SLA violation. Going through that process is usually more trouble than it's worth.

In general, an SLA won't help you recover your losses when the provider violates it. For that reason, it's important to plan your own recovery steps so that provider outages don't cost you money.

Some questions to consider in developing those steps: If the servers in one area are down, do you have the ability to launch servers in another area? Can you migrate your data and instances from one location to another? And what can you do if the provider does violate the SLA?
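The first of those questions can be sketched as a small decision helper. This is only an illustration under stated assumptions: the function `pick_failover_location`, the location names, and the health map are all hypothetical, and how you gather health data or actually launch servers in the other location depends on your provider and is not shown.

```python
from typing import Dict, Optional, Sequence


def pick_failover_location(
    health: Dict[str, bool],
    current: str,
    preferred: Sequence[str],
) -> Optional[str]:
    """Choose where to run, given which locations are currently healthy.

    health maps a location name to True (healthy) or False (down).
    Locations missing from the map are treated as down.
    """
    # Stay put if the current location is still healthy.
    if health.get(current, False):
        return current
    # Otherwise take the first healthy alternative, in order of preference.
    for location in preferred:
        if location != current and health.get(location, False):
            return location
    # Nothing healthy is available: there is nowhere to fail over to.
    return None


# Hypothetical example: the primary location is down, the second choice is up.
health = {"location-1": False, "location-2": True, "location-3": True}
print(pick_failover_location(health, "location-1", ["location-2", "location-3"]))
```

A result of None is the case worth planning for in advance, since it means the answer to "can you launch servers in another area?" is currently no.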
