Business processes span a range of applications and require coordination among multiple business units. In a cloud computing environment, this coordination is called orchestration, and it hinges on three factors. To design for orchestration in the private cloud, IT teams must manage server runtimes, direct the process flow among applications and deal with exceptions to typical workflows.
For simple application requirements, custom scripts can be sufficient for setting up basic orchestration. Scripts can implement the business logic behind a workflow. For example, a database loader script should run once data files are written to the staging directory.
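As a minimal sketch of that loader example, the Python script below checks a staging directory and hands each new data file to a loader. The names `pending_files` and `run_loader`, the `.csv` suffix and the `.done` rename convention are assumptions made for illustration; a real script would call your actual database loader and would likely run on a schedule.

```python
import tempfile
from pathlib import Path


def pending_files(staging_dir: Path, suffix: str = ".csv") -> list[Path]:
    """Return data files waiting in the staging directory, sorted by name."""
    return sorted(p for p in staging_dir.iterdir() if p.is_file() and p.suffix == suffix)


def run_loader(staging_dir: Path, load) -> int:
    """Call load(path) for each pending file, then mark the file as processed."""
    loaded = 0
    for path in pending_files(staging_dir):
        load(path)
        # Rename the file so the next run does not pick it up again.
        path.rename(path.with_name(path.name + ".done"))
        loaded += 1
    return loaded


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        staging = Path(tmp)
        (staging / "orders.csv").write_text("id,total\n1,9.99\n")
        # The lambda stands in for a real database loader (hypothetical).
        count = run_loader(staging, load=lambda p: print("loading", p.name))
        print("files loaded:", count)
```

The business logic here is a single rule: a file in the staging directory triggers the loader. That is why a script is enough at this scale.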
As your cloud workflows become more complex, you may be tempted to add more logic to your orchestration script. For example, you may need additional logic to make sure several processes complete successfully before starting another process. Additionally, you might need to correct errors in an application running on another server. This approach can quickly become unmanageable as the number of processes increases and the dependencies between processes become more difficult to track.
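To illustrate the kind of logic that creeps into such a script, here is a hedged sketch that starts several prerequisite commands in parallel and runs the next step only if all of them exit successfully. The function names `run_all` and `run_after` are hypothetical, and the commands are placeholders that invoke the Python interpreter.

```python
import subprocess
import sys


def run_all(commands: list[list[str]]) -> bool:
    """Start every command in parallel; return True only if all exit with code 0."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    # Build the full list first so every process is waited on, even after a failure.
    return all([p.wait() == 0 for p in procs])


def run_after(prerequisites: list[list[str]], next_step: list[str]) -> bool:
    """Run next_step only when every prerequisite succeeded."""
    if not run_all(prerequisites):
        return False
    return subprocess.run(next_step).returncode == 0


if __name__ == "__main__":
    ok = [sys.executable, "-c", "pass"]  # placeholder for a real job
    print(run_after([ok, ok], ok))
```

Even this small version hard-codes the dependency between steps. Each new process or error-handling rule adds another branch, which is how such scripts become difficult to track.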
Managing decoupled cloud workflows
One way to reduce the challenges of managing multiple cloud applications that have to work together is to minimize the need for direct connections between applications. For example, instead of assuming that process A will be called by process B after an event occurs, you could program process B to write data to a message queue indicating the event has occurred and including all relevant data.
In this example, process B doesn’t need any information about process A, nor does it depend on process A to run at the time the message is written. This allows you to run process B, or multiple instances of it, until a sufficient number of messages are in the queue. Process A will then start and work through all messages in the queue before it terminates. This is especially useful when process A works through the messages faster than process B can accumulate them.
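The pattern can be sketched in a few lines of Python. This is an illustration only: the in-process `queue.Queue` stands in for a hosted message queue, and the message fields, the `threshold` parameter and the function names are assumptions, not part of any particular service.

```python
import queue


def process_b(events: queue.Queue, event_id: int) -> None:
    """Producer: record that an event occurred, with its data. Knows nothing about A."""
    events.put({"event_id": event_id, "payload": f"record-{event_id}"})


def process_a(events: queue.Queue) -> list[int]:
    """Consumer: work through every message currently in the queue, then stop."""
    handled = []
    while True:
        try:
            message = events.get_nowait()
        except queue.Empty:
            return handled
        handled.append(message["event_id"])


def maybe_start_a(events: queue.Queue, threshold: int) -> list[int]:
    """Start process A only once enough messages have accumulated."""
    if events.qsize() < threshold:
        return []
    return process_a(events)


if __name__ == "__main__":
    q = queue.Queue()
    for i in range(3):
        process_b(q, i)
    print(maybe_start_a(q, threshold=3))
```

Neither function calls the other. The queue is the only thing they share, so either side can be replaced or run in multiple instances without changing the other.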
A decoupled architecture that uses message queues is an improvement over custom scripting for complex processes. It also works well when you need to scale certain parts of a workflow but not others. If there are more messages in a queue than a single instance can handle in the time allowed, for example, additional instances can be brought online. There is no need to change the code or alter the system architecture.
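The scaling decision comes down to simple arithmetic, sketched below. The function name `instances_needed` and its parameters (per-message processing time, deadline and an instance cap) are assumptions for illustration; actual values would come from your own measurements and limits.

```python
import math


def instances_needed(queue_depth: int, seconds_per_message: float,
                     deadline_seconds: float, max_instances: int = 20) -> int:
    """Estimate how many consumer instances are needed to drain the queue in time."""
    if queue_depth <= 0:
        return 0
    # How many messages one instance can finish before the deadline.
    per_instance = math.floor(deadline_seconds / seconds_per_message)
    if per_instance < 1:
        raise ValueError("one message takes longer than the deadline")
    return min(max_instances, math.ceil(queue_depth / per_instance))


if __name__ == "__main__":
    # 1,000 queued messages, 0.5 seconds each, 60 seconds allowed.
    print(instances_needed(1000, 0.5, 60))
```

In that example one instance can handle 120 messages in the time allowed, so nine instances are needed for 1,000 messages. Only the consumer side scales; the producers are untouched.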
However, because this technique distributes the logic for processing the workflow throughout the system, it can be problematic. One part of the application embeds some of the logic and writes to a message queue while another part reads from the queue and embeds the logic for processing data structures. There is no single global repository of business logic.
Amazon Simple Workflow Service and alternatives
Another approach to orchestration is to use a workflow system, such as the Amazon Simple Workflow Service (SWF). This service implements workflows as a collection of tasks that actors execute. Actors are abstractions that include both code that will be executed and agents that start workflows, implement decision logic about workflows and execute programs to complete tasks.
Tasks are broken down into two types: activity tasks and decision tasks. Activity tasks run a program to perform a specific activity; decision tasks use information about the state of the workflow to control the next set of tasks. Amazon SWF implements other constructs needed for realistic workflows, such as task lists, workflow-execution-closure services and recorded data about the history of a workflow. It also supports signals for event-driven processing as well as task polling. SWF includes an API for implementing and controlling orchestrated workflows.
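The split between activity tasks and decision tasks can be shown with a toy example. The sketch below is plain Python and does not use the Amazon API; the activity names, the `decide` function and the list-based history are all hypothetical stand-ins for the concepts described above.

```python
from typing import Callable, Optional

# Activity tasks: each one runs a program to perform a specific activity.
ACTIVITIES: dict[str, Callable[[dict], dict]] = {
    "extract": lambda data: {**data, "rows": 3},
    "load": lambda data: {**data, "loaded": data["rows"]},
}


def decide(history: list[str]) -> Optional[str]:
    """Decision task: read the workflow history and choose the next activity.

    Returns None when the workflow should close.
    """
    if "extract" not in history:
        return "extract"
    if "load" not in history:
        return "load"
    return None


def run_workflow(data: dict) -> tuple[list[str], dict]:
    """Alternate decision and activity tasks until the decider closes the workflow."""
    history: list[str] = []
    while (next_activity := decide(history)) is not None:
        data = ACTIVITIES[next_activity](data)
        history.append(next_activity)  # recorded history drives the next decision
    return history, data


if __name__ == "__main__":
    print(run_workflow({"source": "orders"}))
```

All of the ordering logic lives in `decide`, while the activities know nothing about what comes before or after them. A workflow service provides the same separation, along with the history, task lists and polling that this sketch fakes with a loop.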
A primary advantage of SWF and other workflow services is that the logic of the workflow is managed as a single unit. Workflows are specified using the management console or API, meaning they are language-independent. SWF workflows are generic, so the same mechanisms can be reused across multiple workflows.
There is always a risk of vendor lock-in when you use proprietary cloud services. As an alternative to Amazon SWF, you could use a workflow engine such as Ruote, an open source application written in Ruby. RightScale uses a combination of SWF and Ruote for its orchestration services. Other open source cloud options include jBPM and Apache ODE (Orchestration Director Engine).
The cloud can be used to efficiently deploy servers as needed to run complex business processes. But you need to optimize how you use instances in the cloud. You can implement your own scripts for relatively simple workflows, but as your business logic becomes more complex, you may find that message queues and workflow engines are compelling enough to switch to a more structured workflow approach.
About the author:
Dan Sullivan, M.Sc., is an author, systems architect and consultant with over 20 years of IT experience with engagements in advanced analytics, systems architecture, database design, enterprise security and business intelligence. He has worked in a broad range of industries, including financial services, manufacturing, pharmaceuticals, software development, government, retail and education, among others. Dan has written extensively about topics ranging from data warehousing, cloud computing and advanced analytics to security management, collaboration, and text mining.