For most of us, building applications to run in a cloud environment is a new ball of wax. But to exploit the flexibility of a cloud environment, you need to understand which application architectures are properly structured to operate in the cloud, the kinds of applications and data that run well in cloud environments, data backup needs and system workloads. There are three architectural choices for cloud environments:
- Traditional architecture;
- Asynchronous application architecture (i.e., one focused on information processing, not end-user interaction); and
- Synchronous application architecture.
This article outlines these architectures and indicates when each is most appropriate based on system needs and cloud provider capabilities.
Cloud environments differ
As part of your system design, you need to research and design your application appropriately for the particular environment offered by your cloud provider.
Cloud environments are not all created equal: They offer different mechanisms to implement applications. Amazon Elastic Compute Cloud (EC2), for instance, delivers "empty" virtual machines into which any type of software may be installed and run; achieving scalability for individual applications is left up to the application creator.
By contrast, Google and Microsoft provide programming frameworks (Google App Engine, a component-based framework, and Microsoft's Azure Services Platform, a .NET-based framework, respectively) that transparently scale, relieving the app creator of that burden; however, these frameworks limit a system designer's architectural options.
Your environment: The considerations
- Data and applications. The first step is to determine which applications and data sources used by the application will run in the cloud. These apps or data probably reside within the corporate data center, requiring that the cloud-based app be able to reach into the data center to access them. In these cases, the best option is for apps or data sources to be made available as services that can be called remotely. Despite service-oriented architecture fervor, though, many applications and data sources have not been front-ended with service interfaces. And while workarounds such as screen scraping exist, these approaches are inelegant, make a cloud-based application more complex and tend to be fragile.
- Data backup. While Infrastructure as a Service (IaaS) offerings allow traditional SQL Server databases to run, you still need to perform data backups to ensure recoverability in case of crashes. Since the data and applications may reside in different places, the established backup mechanisms won't work. Amazon's Simple Storage Service (S3) can store system recovery data via the backup capabilities of the system database itself. For system recovery, the application itself can also be protected via snapshots into S3.
Because of its internal replication of S3 data, Amazon makes a recovery scenario less likely, but it's important not to rely on that capability blindly. Several startups have released products to back up system data. Amazon has also released Amazon Elastic Block Store (EBS), which can "persist" system data without needing S3. You can install a database on EBS, which is then protected against system failure by Amazon's redundant internal backup inside its data centers. EBS will not, however, create point-in-time backups, so you must do them manually or use a third-party product.
Traditional apps in the cloud
These applications follow an enterprise architecture model and are designed to meet roughly stable demand rather than tolerate huge variations in system load. They don't require an architecture that can scale up or down, so an architecture scoped for a steady state works fine. A traditional architecture calls for one or more Web server systems interacting through a middle-tier software framework, ultimately interacting with a database.
The good news is that Infrastructure as a Service cloud providers, such as Amazon Web Services (AWS), GoGrid and Rackspace, accommodate this architecture. The bad news is that so-called Platform as a Service (PaaS) offerings, such as those from Microsoft, Google, and Salesforce.com, do not: These offerings have pre-built frameworks within which your application must operate. So unless you design for one of these frameworks, your application won't run.
But one reason cloud computing draws such interest is that applications expected to have stable demand often don't: In these cases, hardware and architecture designed with a certain load in mind often prove inadequate in the real world.
There are two kinds of applications that need the scalability of cloud environments: user-facing and, for lack of a better word, batch. Another way to define them is synchronous (i.e., a user interacts with the system and waits for an answer) and asynchronous (i.e., data is input, processed, and eventually finished, with no one sitting around waiting for the results).
Synchronous cloud applications
For synchronous apps, end-user interaction is the primary factor, as with Web usage. With these kinds of applications, large numbers of users may hit the system in a short period and potentially overwhelm its capacity or degrade performance. These usage patterns have several system design implications.
- Provide enough Web servers to handle total traffic. The key here is to monitor each Web server's load and, when performance passes a given threshold, start another Web server system to share traffic. This can be done in both directions (i.e., more servers can be started as load increases, and unneeded servers stopped as load decreases). Load-balancing software spreads the traffic across all live Web servers so that capacity can be dynamically increased as needed. It's also easy to remove Web servers from a pool and shut them down. While many companies use load-balancing appliances within their data centers, cloud providers offer software-based load-balancing capabilities. If these capabilities are not sophisticated enough, you can add more load-balancing software in a cloud environment to distribute the load.
- Provide enough middleware to manage demand. Just as end-user demand can overwhelm a Web server layer, it can overload the middle layer. So this layer must be designed to scale as well and enable traffic to be transparently spread across middle-tier servers. It's common for a load-balancing strategy to be used here as well, with dynamic registration and de-registration of middle-tier servers to enable end-user traffic to flow across the appropriate number of servers.
- Provide a data tier that scales. The data layer often proves to be the Achilles' heel of scalable systems. In many designs, all these multisystem tiers funnel down to a single database.
There are several ways to avoid overwhelming a database. First, plan your data approach to minimize trips to the database. Retrieve as much data as the entire session needs in one call, which can prevent subsequent calls to the database.
Second, set up a caching mechanism between the database itself and the middle tier. The open source product memcached is often used to store frequently retrieved data; it is filled on the first call for a given piece of data, but subsequent calls pull the data from memcached's storage rather than calling all the way down to the database.
Third, consider more sophisticated database uses. You can use replication technology to run multiple copies of a database and keep databases consistent. Finally, you can minimize use of relational database technology and create a file-based data storage mechanism. This method is more complex in terms of data schema planning and requires application-level concurrency control, but it can provide better data performance characteristics.
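The caching approach described above (fill on the first call, serve later calls from the cache) can be sketched in a few lines. This is a minimal illustration, not memcached's actual client API: the `ReadThroughCache` class, its `load_from_db` callback and the in-memory dictionary standing in for memcached are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, Hashable


class ReadThroughCache:
    """Check the cache first; go to the database only on a miss."""

    def __init__(self, load_from_db: Callable[[Hashable], Any]) -> None:
        # Hypothetical function that fetches one value from the database.
        self._load_from_db = load_from_db
        # Plain dict used here as a stand-in for a memcached server.
        self._store: Dict[Hashable, Any] = {}
        # Counts how many lookups actually reached the database.
        self.db_calls = 0

    def get(self, key: Hashable) -> Any:
        if key in self._store:
            # Cache hit: no trip to the database.
            return self._store[key]
        # Cache miss: load from the database and remember the result.
        self.db_calls += 1
        value = self._load_from_db(key)
        self._store[key] = value
        return value

    def invalidate(self, key: Hashable) -> None:
        # Drop a cached entry, e.g. after the underlying row changes.
        self._store.pop(key, None)
```

In a real deployment the cached data would also need an expiry or invalidation policy so that stale values are not served after the database changes; the `invalidate` method hints at one way to handle that.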
But how do you do all this monitoring and management of instances? While every cloud provider offers a way to view individual system components and determine whether they're operational, providers' tools don't typically monitor load or provide the ability to dynamically (i.e., without manual intervention) spawn new server instances to manage Web interaction or middle-tier functionality.
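To make the load-balancing idea from the Web and middle tiers more concrete, here is a rough sketch of a server pool that supports dynamic registration and de-registration and hands out servers in round-robin order. The `ServerPool` class and the server names are hypothetical; real load balancers, whether appliances or cloud provider services, use their own interfaces and usually more sophisticated policies.

```python
import threading
from typing import List


class ServerPool:
    """Round-robin pool with dynamic registration and de-registration."""

    def __init__(self) -> None:
        self._servers: List[str] = []
        self._next = 0
        self._lock = threading.Lock()

    def register(self, server: str) -> None:
        # Add a newly started server so it begins receiving traffic.
        with self._lock:
            if server not in self._servers:
                self._servers.append(server)

    def deregister(self, server: str) -> None:
        # Remove a server before it is shut down.
        with self._lock:
            if server in self._servers:
                self._servers.remove(server)
                self._next = 0  # restart the rotation on the smaller pool

    def pick(self) -> str:
        # Return the next server in rotation.
        with self._lock:
            if not self._servers:
                raise RuntimeError("no servers registered")
            server = self._servers[self._next % len(self._servers)]
            self._next = (self._next + 1) % len(self._servers)
            return server
```

A monitoring process would call `register` when it starts an instance and `deregister` before stopping one, while request routing calls `pick` for each incoming request.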
Asynchronous cloud applications
The term asynchronous might also be thought of as batch, to conjure a term from the early days of computing. In other words, these applications do not support end-user interaction, but rather work on a set of data: an extract from a database, a set of files, and so on. Batch processing doesn't necessarily experience fluctuations in demand, but it can experience transitory loads (such as once-a-month reporting or a onetime processing request).
So while a system designer need not plan for end-user deluge, he may need to design a system that can sustain occasional large loads. Cloud computing offers an attractive system design feature: scalability of resources. A batch job that might have taken days or weeks when assigned to a single system within a data center can be processed in a much shorter time frame in a cloud environment, where multiple machines can be brought up to share the processing load.
The general rule is to separate functionality into processing components that can be linked by asynchronous communication mechanisms. The most common form of communication mechanism is the message queue; fortunately, all major cloud providers offer a secure, persistent queue mechanism for inter-component communication. Queues enable the output of one component to be made available to a subsequent component. If the next component in the chain is not ready to take on the work due to load, the work item sits in the queue safely, waiting for the next component to read the work item and perform its task.
In this way, systems that have components that complete their work quickly, along with other components that take longer to do their job, can operate without logjams. Further, a queue mechanism is ideal for load balancing: If a particular step in processing takes a long time, multiple versions of that component can be operated, each of which reads off the input queue. Thus the numbers of individual components can be balanced to ensure overall system throughput.
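The queue-based pattern can be illustrated with a small local sketch. It uses Python's in-process `queue.Queue` and threads as a stand-in for a cloud provider's persistent queue service and separate machines; the `run_stage` function and its parameters are assumptions made for the example, not any provider's API.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List

_STOP = object()  # sentinel telling a worker to shut down


def run_stage(work_items: Iterable[Any],
              handler: Callable[[Any], Any],
              num_workers: int) -> List[Any]:
    """Run one processing stage with several workers reading one input queue."""
    inbox: "queue.Queue[Any]" = queue.Queue()
    results: "queue.Queue[Any]" = queue.Queue()

    def worker() -> None:
        while True:
            item = inbox.get()  # blocks until a work item is available
            if item is _STOP:
                break
            results.put(handler(item))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in work_items:
        inbox.put(item)  # items wait safely in the queue until a worker is free
    for _ in threads:
        inbox.put(_STOP)  # one stop signal per worker
    for t in threads:
        t.join()

    # Collect results; order depends on which worker finished first.
    out = []
    while not results.empty():
        out.append(results.get())
    return out
```

Raising `num_workers` for a slow stage is the local equivalent of running more instances of that component against the same input queue.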
The trick is how to ensure that each component has the properly operating number of instances. And that leads us to the last piece of the architecture puzzle: systems management.
Managing systems in the cloud
The obvious questions are: How do you know when you have to add new resources or can remove them, and how do you implement their addition or removal?
Framework cloud systems (e.g., Microsoft Azure and Google App Engine) take care of these tasks for you. They feature a monitoring mechanism that views system load and spawns new instances as required. An infrastructure-based system like Amazon provides Web service calls that can be used to start or stop individual instances. A couple of tools perform the same work via a graphical user interface: Amazon provides an Ajax-enabled AWS management page, and the handy Firefox plug-in ElasticFox performs the same function. But these are still manual approaches. You have to look at the system and then intervene to change the resources available to the application.
Now, however, several system management companies specialize in providing cloud management tools that can not only monitor the general health of the systems (i.e., whether the system is up and running) but also track the load factor of machines and, based on pre-assigned levels, automatically spawn new instances to handle system load. Naturally, when system loads decrease below assigned minimums, they release system resources. In this way, you can be assured that your application can achieve desired performance levels while avoiding the cost of leaving unneeded resources up and running.
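The decision logic behind such a tool can be sketched as a simple threshold rule. The function name, the default thresholds (75% and 25% average load) and the instance limits below are illustrative assumptions only; commercial management tools apply their own metrics and policies.

```python
def desired_instance_count(current: int,
                           avg_load: float,
                           scale_up_at: float = 0.75,
                           scale_down_at: float = 0.25,
                           min_instances: int = 1,
                           max_instances: int = 10) -> int:
    """Return how many instances should run, given the average load (0.0 to 1.0)."""
    if avg_load > scale_up_at:
        target = current + 1   # load above the upper level: add an instance
    elif avg_load < scale_down_at:
        target = current - 1   # load below the lower level: release an instance
    else:
        target = current       # load within the band: leave things alone
    # Never go below the minimum or above the maximum pool size.
    return max(min_instances, min(max_instances, target))
```

A management loop would call this periodically with measured load, then use the provider's start and stop calls to move the running pool toward the returned count. Keeping a gap between the two thresholds helps avoid starting and stopping instances repeatedly when load hovers near a single cutoff.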
System management is evolving among traditional and new vendors. Spend the time to sort through your options. It's hard to get the full benefit of cloud computing without having a more sophisticated management mechanism than merely examining ElasticFox to determine whether a particular system is up and running.
For IT shops, cloud computing offers great promise. System architectures that are difficult to implement in capacity-constrained data centers are much easier to create in cloud environments. But making your application flexible enough to respond to changing demand or short-duration use requires a different architectural approach than that used for traditional applications. Planning for systems that can easily accommodate adding or subtracting computing resources is key for cloud system design.
About the author:
Bernard Golden has more than 20 years of experience in the technology field and has worked in global consultancies, enterprise software companies, and large IT organizations. Today he is the CEO of HyperStratus, a consulting firm that helps its clients define and implement cloud strategy and systems.
Golden is the author of Virtualization for Dummies. He also serves as the virtualization and cloud computing adviser for CIO magazine. Golden is a popular speaker, appearing at many conferences like CloudWorld, OSCON, and Educause.