
Climbing out of an AWS EC2 cost sinkhole

Building a cloud without considering the future can be costly. Hybrid clouds or moving off a provider are ways to stem financial bleeding.

In a world where Amazon Web Services is a household name in public cloud, sometimes it's tough to admit that building a Web-based business on AWS Elastic Compute Cloud (EC2) may have been a mistake. So when the time comes to find a new Infrastructure as a Service cloud provider, what should you look for? And what's at stake if you make any missteps?

SearchCloudComputing spoke with Dan Lucky, vice president of Cloud Solutions at Micro Strategies Inc., a cloud integrator and Tier 1 IBM partner, about where the enterprise hybrid cloud market really is and his work moving app developer Music Mastermind Inc. off AWS EC2 and onto IBM SoftLayer.

We heard a lot at IBM Pulse 2014 and at other conferences about the push to hybrid cloud, but it seems the average enterprise is still just researching hybrid cloud. Where is the separation?


Dan Lucky: Enterprises have been investing a boatload of money into building out their private infrastructure. When they go to the public cloud, in some cases, guess who's not needed? …

The piece companies are missing all has to do with, Why do you need a public [cloud] environment? You may need it for short-term workload requirements. And can you really cost-justify a private environment for short-term workload environments? It's really tough to do that. To go from a private environment with autoscaling to a public [cloud] for short-term capacity requires some diligent software work. In other words, the same software architecture you’re running in the private environment for autoscaling and for application implementation, you’ll probably have to run in the public environment. And not too many traditional software products can do that.

Most companies are using private clouds -- or start in the public cloud -- and then they realize that public cloud costs are eating them alive, so they start looking at a private implementation. Once they look at a private implementation, they don't think forward enough to say, 'How do I make this work in a hybrid implementation?' When you look at a private environment, it's no big deal. You can build up private environments all day long. But that application provisioning software you decide to use may not function with a public cloud environment. Thinking about a private implementation and then thinking forward four or five months from now -- what do I need to have in place for a hybrid cloud? You better think about it before you build your private cloud.

What are the major roadblocks with running a hybrid cloud?

Lucky: For one customer implementation with a hybrid environment, we had to use the same software stack for provisioning a workload inside the private cloud that the customer was using in Amazon [Web Services]. That software stack has to run in both places. If not, you have two completely separate implementations that act completely different. Remember, it's not just about provisioning a bare-bones operating system in the virtual machine. It's about provisioning the application on top of that bare-bones machine.

It's not about firing up a VM in the cloud; anyone can do that. How do you provision your application up there? What are the dependencies of that application? In many cases, people don't understand how to build an application to make it scale in a cloud. Scaling in a private environment is fairly simple; scaling in a public environment is a different discussion. How do you architect that multiserver environment so it can scale?

With autoscaling, if it takes you an hour to an hour and a half to launch something, then autoscaling just went out the window -- it's not going to happen. So you have to educate application developers on how to break those machines down to help them understand what will scale well and what won't scale in a public environment. If you're going to run static workloads, who cares? If you're going to run static workloads, you better look at the cost of that workload and ask yourself how much it will cost to build out a private cloud for that static workload.

You worked with Music Mastermind to move its business from AWS EC2 to IBM SoftLayer. Why did they start with AWS in the first place -- and what spurred the desire to move?

Lucky: Music Mastermind is about a 40-person company. Its app -- Zya -- is free in the Apple store, and it's approaching 300,000 users in three months.

In [January 2011], over a four-month period, they were adding about $5,000 per month [in AWS EC2 costs]. When their bill hit $20,000, they knew they were going to go to $50,000 and $60,000 and $70,000 [per month] if they continued to grow. The issue was, how do they cap that AWS expense?

One of the problems they encountered was something everyone faces in the public cloud. Developers save everything. In a public cloud, a developer would fire up a machine, use it for the day and then go home at 5 p.m. or 6 p.m. without shutting down his machine. The next day, he uses it a little bit more, changes the code and fires up a new machine. He starts doing the same thing on the new machine. That old machine is just sitting there burning up money. You need to put something in place to ask developers when they haven't used a machine in two hours, so that you can shut it down. But even with those stops in place in AWS [with Music Mastermind], the cost was too high.
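The two-hour idle check Lucky describes could be sketched roughly as follows. This is a minimal illustration, not the tooling Music Mastermind used: the `Machine` record, the `find_idle` helper and the sample fleet are all hypothetical, and a real version would pull last-activity times from the provider's monitoring data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Two-hour idle threshold, as mentioned in the interview.
IDLE_LIMIT = timedelta(hours=2)


@dataclass
class Machine:
    """Hypothetical record of a development machine and its last recorded use."""
    name: str
    owner: str
    last_used: datetime


def find_idle(machines, now, idle_limit=IDLE_LIMIT):
    """Return machines that have gone unused for at least idle_limit."""
    return [m for m in machines if now - m.last_used >= idle_limit]


# Sample data (made up): one machine left running overnight, one in active use.
now = datetime(2014, 3, 1, 18, 0)
fleet = [
    Machine("dev-old", "alice", datetime(2014, 2, 28, 17, 30)),
    Machine("dev-new", "alice", datetime(2014, 3, 1, 17, 15)),
]

idle_names = [m.name for m in find_idle(fleet, now)]
print(idle_names)  # candidates to ask the owner about, then shut down
```

Keeping detection separate from shutdown leaves room for the step Lucky mentions: asking the developer first, and stopping the machine only if there is no response.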

And it wasn't just the [AWS EC2] instances that cost them. It was all the other stuff that comes along with it that you don't think about: storage costs, I/O costs -- all those increase the monthly invoice.

When we provisioned the Music Mastermind Zya application in SoftLayer and ran all the tests, we were very conscious of the SoftLayer cloud costs. Music Mastermind gave us a budget. So every day, we'd fire up the test, do our work, and then at 6 p.m., I'd go through and see what was running. If it was running and I couldn't get a response from the developer who was doing the work, it was my job to shut the VM down. That was a manual process, and that's how we kept our development costs contained. It's a different discussion when you're in production. When you're in production, you better think about autoscaling. You have to have all that figured out before you move to production.

Was Music Mastermind already using IBM?

Lucky: They were on IBM hardware; no IBM software. Three years ago, there wasn't any IBM software for application provisioning and autoscaling in AWS. But now, if you think about where IBM is going, they're going to OpenStack and IBM SmartCloud Orchestrator. I can tell you from what I know from a features-and-functions standpoint, IBM got their money's worth for the $2 billion they paid for SoftLayer. They got a cloud you can tune to the application requirements.

What was wrong with [IBM's] other cloud, SmartCloud Enterprise? Did it work? Yes. Did it give people the flexibility they wanted? No. Did it have a single network? No. If I wanted to provision something in Boulder and move it to Raleigh, I had to move it over the Internet, which was a nightmare. In SoftLayer, it's all internal. I can move the SoftLayer provisioning scripts from San Jose to Amsterdam in seconds. I can't do that in the Amazon world. There's a lot of value in the way SoftLayer has architected their environment, for example, with the availability of bare-metal machines.

In some cases, the bare-metal machine option is less expensive than the virtual machine option, especially when moving data to the public environment. What does this mean? If you're running a bare-metal environment, you get 5 TB of outbound bandwidth in the cost of the bare-metal machine. In a virtual hourly environment, you get no outbound bandwidth. If you provision a bare-metal machine as a load balancer, in effect, you may get the machine for free. You may pay for the 5 TB of bandwidth, but you're going to get the machine almost for free. So it's not just building out an environment, it's also showing the customer the financial benefits and the financial comparison of what works and what doesn't work.
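The bandwidth comparison above can be worked through with a small calculation. Only the 5 TB included allowance comes from the interview; every price in this sketch is a hypothetical placeholder, not a SoftLayer or AWS rate, and the `monthly_cost` helper is invented for illustration.

```python
def monthly_cost(base_price, included_tb, used_tb, price_per_extra_tb):
    """Machine price plus outbound bandwidth beyond the included allowance."""
    billable_tb = max(0, used_tb - included_tb)
    return base_price + billable_tb * price_per_extra_tb


# Hypothetical workload: a load balancer pushing 5 TB outbound per month.
USED_TB = 5

# Hypothetical prices in dollars per month and dollars per TB.
bare_metal = monthly_cost(base_price=500, included_tb=5,
                          used_tb=USED_TB, price_per_extra_tb=90)
virtual = monthly_cost(base_price=100, included_tb=0,
                       used_tb=USED_TB, price_per_extra_tb=90)

print(bare_metal, virtual)
```

With these made-up numbers, the virtual machine's bandwidth charge alone approaches the full price of the bare-metal machine, which is the effect Lucky describes as getting the machine almost for free.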

What are some tried and true methods for keeping cloud costs under control?

Lucky: If it's a production environment, you better have application provisioning and autoscaling logic to determine and monitor that the particular stack of machines -- your application pool -- is running at 75%. If it drops down to 50% in the next hour, scale back. That's autoscaling logic -- and you have to have some type of software to do that. That's how you contain your costs. There are two different approaches: whether it's a development environment or a production environment.
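A threshold rule of the kind Lucky outlines might look like the sketch below. The 75% and 50% figures come from the interview; treating 75% as the point to add capacity, the one-machine step size and the `scaling_decision` name are assumptions made for illustration, not a description of any particular autoscaling product.

```python
def scaling_decision(pool_size, utilization,
                     scale_out_at=0.75, scale_in_at=0.50, min_size=1):
    """Return the new pool size for the observed average utilization (0.0 to 1.0).

    Assumed rule: add one machine at or above scale_out_at, remove one at or
    below scale_in_at, and never shrink below min_size.
    """
    if utilization >= scale_out_at:
        return pool_size + 1
    if utilization <= scale_in_at and pool_size > min_size:
        return pool_size - 1
    return pool_size


# Hypothetical hourly utilization readings for a four-machine application pool.
size = 4
for reading in (0.80, 0.60, 0.50, 0.45):
    size = scaling_decision(size, reading)
print(size)
```

The gap between the two thresholds keeps the pool from adding and removing machines on every small fluctuation in load.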

No one likes to admit failure. It is pretty hard for the people that started in AWS like Music Mastermind to admit they were going down a black hole -- a sinkhole, basically. Their costs kept escalating, and they didn't know how to get out of it. [Music Mastermind's] original private cloud design is 100% different [from how] it is now. If you design it wrong and have no flexibility, you're going to pay an awful price. It's not that easy to say, 'Just take it back. We changed our mind about what we really needed.' That's why you need to ask all these intricate questions about where you are going. If you don't know where you are going, you won't like it when you get there.  

So what advice would you give to other enterprises running AWS EC2?

Lucky: Not a lot of people are migrating off AWS. People are hesitant to move off something that works and is running in production simply because they've got it working. To get off [that product] will cost them money, and unless it's really, really expensive, they're not going to consider it. They'll consider new projects with a new tool set, but in terms of ripping out something that works to move it to someplace else? Not very likely.

Other customers are considering moving out of Amazon to SoftLayer because of the price difference. Customers that are spending a lot of money -- from $50,000 to $60,000 a month -- they'll consider moving at that time. But customers that have smaller workloads -- spending under $10,000 to $15,000 per month -- won't move just to save $2,000 to $3,000 per month because the cost of moving is expensive as well. And unless they're going to be running the same workload in that cloud for the next three, four, five years, they're not going to think that much forward.
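The trade-off between monthly savings and the cost of moving comes down to a payback calculation, sketched here. The $2,000 to $3,000 monthly savings are from the interview; the one-time migration cost is a hypothetical figure, since the interview does not give one.

```python
import math


def months_to_break_even(migration_cost, monthly_saving):
    """Whole months of savings needed to recover a one-time migration cost."""
    if monthly_saving <= 0:
        raise ValueError("monthly_saving must be positive")
    return math.ceil(migration_cost / monthly_saving)


# Hypothetical one-time cost of re-platforming a small workload, in dollars.
MIGRATION_COST = 60_000

slow_payback = months_to_break_even(MIGRATION_COST, 2_000)
fast_payback = months_to_break_even(MIGRATION_COST, 3_000)
print(slow_payback, fast_payback)
```

Under this assumed migration cost, the move pays for itself only after roughly two years, which is why the expected lifetime of the workload matters to the decision.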

I would say, think about what your costs are, think about where you're going and think about whether you want to tune your environment to the cloud limitations or tune the cloud to maximize your application requirements. It's that simple. It's all about the numbers. At the end of the day, it's all about how much money you're going to save. Will it work in AWS? Absolutely. Will it work in SoftLayer? Absolutely. But where is it going to cost you the least amount of money? What cloud gives you the best value for your money based on your application requirements?
