Poor performance of public cloud-based applications leads to end user frustration. SLAs from PaaS and IaaS providers cover availability, but not overall response time. DevOps teams can monitor cloud computing apps for performance using a few different tools.
According to a Compuware survey, nearly 350 IT pros throughout North America estimated an average annual revenue loss of $985,260 from performance problems with cloud-based applications. Respondents in the EU estimated a $777,000 loss (in U.S. dollars). Concerns about app performance caused 58% of North American and 57% of EU-based respondents to delay adoption of cloud-based applications.
The report also found that 94% of North American respondents and 84% of EU respondents think cloud application service-level agreements (SLAs) are based on the actual end-user experience, not just service provider availability metrics. This naivety likely won't convince public or private cloud providers to offer potentially costly end-to-end SLAs. Therefore, it is DevOps' job to instrument cloud-based apps with error logging, analytics and diagnostic code.
So what's the best way to monitor performance of your data-intensive applications? Free or low-cost uptime and response time reports from Pingdom.com, Mon.itor.us and other site monitoring providers can confirm that applications meet SLAs and can track end-to-end site performance.
Firms like LoadStorm and Soasta sell customized cloud application load testing. Compuware's CloudSleuth site provides a free monthly Global Performance Ranking of end-to-end response times of major public Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) providers.
Figure 1 shows a Windows Azure sample project that demonstrates paging as well as create, read, update and delete (CRUD) operations on an Azure table. Attempting this for OakLeaf's Azure Table Services Sample Project would result in negative values for code execution.
The Time text box at the bottom of the window indicates the last action's code execution time, which was 28 ms for a new page. By clearing the Batch Updates check box, you'll see a dramatic increase in execution time for individual CRUD operations on the 91 customer records.
Relying on reports isn't enough; organizations also must acquire or develop on-premises diagnostic management tools to download and analyze performance logs, as well as to deliver alarms and graphical reports. IaaS providers like Amazon Web Services (AWS) concentrate on building their hardware with native metrics; PaaS products like Windows Azure and SQL Azure provide deeper, more customized insight into the application and its code.
How AWS CloudWatch keeps an eye on performance
Amazon CloudWatch allows DevOps teams to automatically monitor CPU, data transfer, disk activity, latency and request counts for their Amazon Elastic Compute Cloud (EC2) instances. Basic metrics for EC2 instances, EBS volumes, SQS queues, SNS topics, Elastic Load Balancers and Amazon RDS database instances are collected at five-minute intervals at no additional cost. You can add standard metrics and alarms in the AWS Management Console's CloudWatch tab and view graphs of metrics by navigating to the Metric page in the Navigation Pane (Figure 2).
The DevOps team can use AWS's Auto Scaling feature to provide elastic availability by adding or deleting Amazon EC2 instances dynamically based on an app's CloudWatch metrics. AWS added new notification, recurrence and other Auto Scaling features in July 2011.
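The scaling decision described above boils down to a threshold rule applied to a metric average. The following is a minimal sketch of that logic only, not AWS's Auto Scaling API; the function name, the 70% and 30% CPU thresholds and the instance limits are all hypothetical values chosen for illustration.

```python
from statistics import mean


def desired_instances(current, cpu_samples, *, scale_out_at=70.0,
                      scale_in_at=30.0, minimum=1, maximum=10):
    """Return the instance count a simple threshold policy would ask for.

    cpu_samples is a list of CPU-utilization percentages, such as the
    datapoints a monitoring service reports for one period. All
    thresholds and limits are illustrative assumptions.
    """
    avg_cpu = mean(cpu_samples)
    if avg_cpu > scale_out_at:
        target = current + 1      # busy: add one instance
    elif avg_cpu < scale_in_at:
        target = current - 1      # idle: remove one instance
    else:
        target = current          # within the band: no change
    # Never go below the floor or above the ceiling.
    return max(minimum, min(maximum, target))
```

Changing one instance at a time and keeping a gap between the two thresholds helps keep the policy from oscillating when load hovers near a single cutoff.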
CloudWatch doesn't provide built-in application monitoring metrics because AWS is an IaaS offering that's OS- and development-platform agnostic. However, developers can program applications to submit API requests in response to app events, such as handled or unhandled errors, function or module execution time, and other app-related metrics.
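As a rough sketch of that approach, the decorator below times a function and reports its execution time, plus an error count if the call raises. The datum fields (MetricName, Value, Unit, Timestamp) follow the shape of a CloudWatch custom-metric entry; the names `timed`, `build_datum` and `publish` are hypothetical, and the actual API call is left to whatever `publish` callable you pass in.

```python
import time
from datetime import datetime, timezone
from functools import wraps


def build_datum(name, value, unit):
    """Build one entry shaped like a CloudWatch custom-metric datum."""
    return {
        "MetricName": name,
        "Value": float(value),
        "Unit": unit,
        "Timestamp": datetime.now(timezone.utc),
    }


def timed(metric_name, publish):
    """Time the wrapped call and hand the resulting data to publish().

    publish is any callable that accepts a list of datum dicts. With
    boto3 (an assumption, not shown here) it could wrap
    client.put_metric_data(Namespace=..., MetricData=data).
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            failed = False
            try:
                return func(*args, **kwargs)
            except Exception:
                failed = True
                raise                      # let the caller see the error
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                data = [build_datum(metric_name + "Time", elapsed_ms,
                                    "Milliseconds")]
                if failed:
                    data.append(build_datum(metric_name + "Errors", 1,
                                            "Count"))
                publish(data)
        return wrapper
    return decorator
```

For local experiments, `publish` can simply be the `extend` method of a list, which collects the data without making any network call.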
Windows Azure performance logging and analytics
The Windows Azure team has been adding logging, analytic and diagnostic features to the platform steadily since its PaaS service became available in January 2010. Because Windows Azure Portal's interface doesn't support adding metrics, analytics or alarms, DevOps teams must write code and edit configuration files to enable diagnostics and logging for Windows Azure compute and storage services. Table 1 shows logs that were available for analysis by DevOps with Windows Azure's Diagnostic API in June 2010.
TABLE 1. Windows Azure diagnostic data collection logs, with the names of Azure tables and binary large objects (blobs) that store the log data, enabled by Windows Azure SDK v1.2.
SDK v1.4 and Visual Studio Tools for Azure v1.4 added the capability to profile a Windows Azure app with Visual Studio 2010 Premium or Ultimate when it runs in Windows Azure's production fabric. An early August 2011 update to the tools and a new Windows Azure Storage Analytics feature enabled logs that trace executed requests for storage accounts, as well as metrics that summarize capacity and request statistics for blobs, tables and queues.
An Aug. 22 update to the sample project generates analytics tables for both table and blob storage, as well as internal timing data based on TraceWriter items, but it requires a separate app to read, display and manage diagnostics. Cerebrata's Azure Diagnostics Manager reads and displays log and diagnostics data in tabular or graphic format. The firm's free Windows Azure Storage Configuration Utility lets IT operations teams turn on storage analytics without writing code. Cerebrata's Cloud Storage Studio adds table and blob management capabilities (see Figure 3).
The System Center Monitoring Pack for Windows Azure Applications gives System Center Operations Manager (SCOM) 2007 and 2012 beta users capabilities similar to those of Cerebrata's products. Autoscaling Windows Azure for elastic availability currently requires a third-party solution, such as Paraleap's AzureWatch, hand-crafted .NET functions or the forthcoming Windows Azure Integration Pack for Enterprise Library. Monitoring SQL Azure databases requires you to use SQL Azure's three categories of dynamic management views: database-related views, execution-related views and transaction-related views.
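To make the three view categories concrete, here is a small sketch that pairs each one with a sample query and runs it through a Python DB-API cursor. The dictionary keys, the `run_dmv_query` helper and the choice of queries are illustrative assumptions; the connection itself (for example, through an ODBC driver) is not shown.

```python
# One example query per category of dynamic management view.
DMV_QUERIES = {
    # Database-related: approximate database size in MB (8 KB pages).
    "database": (
        "SELECT SUM(reserved_page_count) * 8.0 / 1024 AS size_mb "
        "FROM sys.dm_db_partition_stats"
    ),
    # Execution-related: the five cached queries using the most CPU time.
    "execution": (
        "SELECT TOP 5 total_worker_time, execution_count "
        "FROM sys.dm_exec_query_stats "
        "ORDER BY total_worker_time DESC"
    ),
    # Transaction-related: currently active transactions.
    "transaction": (
        "SELECT transaction_id, name, transaction_begin_time "
        "FROM sys.dm_tran_active_transactions"
    ),
}


def run_dmv_query(cursor, category):
    """Run the sample query for one category and return all rows.

    cursor is any DB-API 2.0 cursor connected to the database; the
    category must be one of the keys in DMV_QUERIES.
    """
    cursor.execute(DMV_QUERIES[category])
    return cursor.fetchall()
```

A scheduled job could call `run_dmv_query` for each category and log the rows, which gives a simple baseline for spotting growth in database size or CPU-heavy queries.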
ABOUT THE AUTHOR
Roger Jennings is a data-oriented .NET developer and writer, the principal consultant of OakLeaf Systems and curator of the OakLeaf Systems blog. He's also the author of 30+ books on the Windows Azure Platform, Microsoft operating systems (Windows NT and 2000 Server), databases (SQL Azure, SQL Server and Access), .NET data access, Web services and InfoPath 2003. His books have more than 1.25 million English copies in print and have been translated into 20+ languages.