Testing applications in the cloud raises several important issues, including protecting sensitive data and automating the deployment of an application on Amazon Web Services (AWS).
Ron Bowers, a computer scientist at Army Research Labs (ARL), and Dennis Reedy, founder of Elastic Grid LLC, discussed these issues during a recent session at the JavaOne conference in San Francisco.
"The issue is that we don't have that many computers, but need to support a community of hundreds of users. We have to test this system and test the availability and policies of the system," Reedy said.
As the team makes changes to the application code, it needs to know whether the application will scale to meet peak demand. The team does not have enough surplus hardware to run these tests on its existing compute infrastructure, so it turned to AWS for performance testing.
Cloud security concerns
The main challenge with using Amazon is that both the data and many of the application models are sensitive. The Army does not want to make its models of the effects of various weapons on American equipment available to a potential enemy.
The team needed a way to dynamically provision the application both in the cloud during testing and locally once the finished application is deployed. "We don't want to build an Amazon Machine Image (AMI) or go through the time to provision an entire distributed computing stack every night," Reedy said.
Elastic Grid is an open source technology that uses a domain-specific language to express an application's services and their requirements. It allows developers to automatically push out an instance of the entire distributed infrastructure to either Amazon EC2 or a LAN.
When a developer finalizes the release of a new version, Elastic Grid organizes the creation of clusters. The key to the success of this process is being able to release the new code and then run it with no operator input.
Going forward, the team hopes to adopt continuous integration. It plans to connect Hudson to the Amazon S3 repository so that code changes are tested automatically. Occasionally, the team also wants to run a scalability test with 50-100 machines to see how the system behaves under heavy load.
Reedy said one of the harder problems to solve is how to take the test result information from a cluster of fifty machines and look for subtle problems and bottlenecks in a way that a human can understand. He said, "We have a lot of metrics built in about how fast it is working and the wait time so we can be notified if something goes bad. But how can we be notified in a post mortem about a pattern developing?"
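One way to approach the post-mortem question Reedy raises is to reduce each machine's metrics to a few robust summary numbers and compare them across the cluster. The Python sketch below is only an illustration under stated assumptions, not the approach used by ARL or Elastic Grid: the function names, the input shape (a mapping from node name to a time-ordered list of wait times), and the thresholds are all hypothetical.

```python
from statistics import median


def flag_slow_nodes(wait_times_by_node, threshold=3.0):
    """Return nodes whose median wait time is unusually high for the cluster.

    Hypothetical input: {node_name: [wait_time, ...]} collected after a run.
    Uses the median absolute deviation (MAD), which a few bad nodes
    cannot distort as easily as a mean and standard deviation.
    """
    node_medians = {
        node: median(samples)
        for node, samples in wait_times_by_node.items()
        if samples
    }
    if not node_medians:
        return []
    cluster_median = median(node_medians.values())
    mad = median(abs(m - cluster_median) for m in node_medians.values())
    if mad == 0:
        # Every node looks the same, so there is nothing to rank.
        return []
    return sorted(
        node
        for node, m in node_medians.items()
        if (m - cluster_median) / mad > threshold
    )


def flag_drifting_nodes(wait_times_by_node, ratio=1.5):
    """Return nodes whose wait times grew noticeably during the run.

    Compares the median of the first half of each node's samples with the
    median of the second half; `ratio` is an arbitrary illustrative cutoff.
    """
    drifting = []
    for node, samples in wait_times_by_node.items():
        half = len(samples) // 2
        if half == 0:
            continue
        early = median(samples[:half])
        late = median(samples[-half:])
        if early > 0 and late / early > ratio:
            drifting.append(node)
    return sorted(drifting)
```

A report built from these two lists gives a human a short set of machines to inspect rather than fifty raw logs. Choosing medians over means is a design decision for this sketch: a single stalled request should not make an otherwise healthy node look slow.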
One developer asked how the team gets around security concerns, since his administrators would balk at the idea of deploying government applications in a cloud. Bowers said this was a big issue, and that the team had demonstrated it could be addressed by properly separating the sensitive data, code, and algorithms.
Another developer in the audience, who worked for a large unnamed telco, pointed out that telecommunications companies have been addressing these concerns for years by tagging data and code sets. She said that these same concepts could also be applied to cloud projects by other developers wanting to test new apps in the cloud.
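The tagging idea can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not a description of how ARL or any telco implements it: the `Artifact` type, the tag names, and the `partition_for_cloud` helper are all hypothetical. The sketch fails closed, meaning anything not explicitly cleared stays on local hardware.

```python
from dataclasses import dataclass

# Hypothetical tag vocabulary, chosen for illustration only.
CLEARED_TAG = "cloud-ok"
RESTRICTED_TAGS = frozenset({"sensitive"})


@dataclass(frozen=True)
class Artifact:
    """A data set or code module together with the tags assigned to it."""
    name: str
    tags: frozenset = frozenset()


def partition_for_cloud(artifacts):
    """Split artifacts into (cloud, local_only) lists.

    An artifact may go to the cloud only if it carries the cleared tag and
    none of the restricted tags. Untagged artifacts stay local.
    """
    cloud, local_only = [], []
    for artifact in artifacts:
        cleared = CLEARED_TAG in artifact.tags
        restricted = bool(artifact.tags & RESTRICTED_TAGS)
        if cleared and not restricted:
            cloud.append(artifact)
        else:
            local_only.append(artifact)
    return cloud, local_only
```

A deployment script could call `partition_for_cloud` before pushing a test build, so that sensitive models and data are swapped for stand-ins in the cloud run while the real ones are used only on the local network.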