Cloud application monitoring is a multidisciplinary affair. To optimize infrastructure, admins and developers must perform several distinct types of cloud application monitoring, including checks on performance, spending and security. What's more, some of these monitoring disciplines contain sub-disciplines.
To make matters more complicated, different types of cloud apps or services need to be monitored in particular ways. Monitoring a serverless function, for example, requires a different approach from monitoring an app running on a virtual server.
To devise a strategy for cloud application monitoring, let's explore the primary types of monitoring, how they apply to various types of apps and cloud services, and which native and third-party tools are available.
Key types of cloud monitoring
Cloud monitoring can be broken into three important but overlapping categories:
- Performance monitoring. This type of monitoring aims to ensure that cloud applications are available and that they perform adequately. The goal is to identify and diagnose the various types of problems that could undercut performance, ranging in scope from infrastructure issues, such as a lack of network bandwidth, to configuration problems, such as an ineffective load-balancing setup, to application errors.
- Cost monitoring. In the cloud, where inefficient use of resources can quickly lead to a large bill, it is especially critical to monitor costs. As a result, cloud cost monitoring has emerged as a discipline of its own, with a variety of tools and strategies dedicated to cost optimization.
- Security monitoring. Security monitoring is important in any context, but it can be particularly challenging to perform in the cloud. That is true not only because cloud environments are typically composed of multiple overlapping layers of infrastructure and software, but also because the cloud provides no hard boundaries between public and private networks.
These categories of cloud monitoring overlap in various ways. For example, security monitoring includes the identification of distributed denial-of-service (DDoS) attacks, which also threaten application availability and performance. The three categories also encompass several narrower types of monitoring, such as database monitoring and log monitoring, that are necessary to optimize cloud application performance, cost and security. Thus, the categories should be thought of as overlapping disciplines of cloud application monitoring -- not as neat, mutually exclusive types of monitoring.
Along similar lines, it is worth noting that although primary responsibility for each of the three types of monitoring described above typically falls to different types of IT staff, the best cloud monitoring strategies make all types of monitoring a collective effort. For example, security monitoring may be the primary responsibility of security professionals, but other IT admins and developers will also need to be involved in helping to identify and respond to security issues in order to address them quickly.
Likewise, cost monitoring should be a concern of everyone within the IT organization, because anyone who monitors the cloud in any way can help identify and address wasteful or inefficient processes within the cloud.
Building a cloud-monitoring strategy
To perform each type of monitoring, IT teams should review specific metrics and information. This isn't a comprehensive list, but it provides some real-world examples of what teams should look for.
Importantly, these metrics also highlight the ways in which monitoring strategies vary depending on the specific cloud workloads that a team deploys. The types of information you would look for when running, for example, a cloud-based VM are quite different from those required to monitor a serverless application.
A variety of metrics and information sources contribute to performance monitoring, including what follows:
- Resource availability. Are the cloud services or instances that you deployed up and running? If a VM unexpectedly shuts down or a database no longer responds to requests, these could indicate a looming cloud application performance problem.
- Response time. How long does it take cloud resources to respond to requests? Slow responses may mean that the resources themselves lack the compute power or memory to respond quickly, or that a lack of network bandwidth is the root problem.
- Application errors. How many errors do your cloud applications produce? And what is the source of those errors? Your ability to track this information will vary depending on which types of applications you run and how those apps log errors. A serverless function produces relatively little log data, for example, while a conventional web app running in a VM will produce much more log data. Operating system logs are also an important source of error information, if the cloud service you use provides access to them.
- Traffic levels. How many users are accessing your cloud services or applications at a given time, and how do traffic patterns vary over time? If there is a sudden spike in traffic, are you prepared to scale up your cloud resource allocations to meet it?
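As a rough illustration of how the response time, error and traffic metrics above can be derived from raw request data, here is a minimal Python sketch. The `RequestSample` record and the `summarize` function are hypothetical names invented for this example, not part of any cloud provider's API, and the sketch assumes that HTTP status codes of 500 and above count as application errors.

```python
from dataclasses import dataclass


@dataclass
class RequestSample:
    """One observed request (hypothetical record shape for illustration)."""
    latency_ms: float
    status: int  # HTTP status code


def summarize(samples: list[RequestSample]) -> dict[str, float]:
    """Return request count, error rate and 95th-percentile latency."""
    if not samples:
        raise ValueError("at least one sample is required")

    latencies = sorted(s.latency_ms for s in samples)
    # Assumption: only 5xx responses are treated as application errors.
    errors = sum(1 for s in samples if s.status >= 500)

    # Nearest-rank percentile: the smallest value with at least 95% of
    # samples at or below it. Integer math avoids float rounding issues.
    rank = (95 * len(latencies) + 99) // 100
    p95 = latencies[rank - 1]

    return {
        "requests": float(len(samples)),       # traffic level
        "error_rate": errors / len(samples),   # application errors
        "p95_latency_ms": p95,                 # response time
    }
```

In practice, the samples would come from load balancer logs or a monitoring agent, and a percentile is usually a more useful response-time signal than an average because it is not hidden by a large number of fast requests.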
To track and optimize cloud computing costs, look for unused or unattached resources. VM instances, cloud databases and other resources that are running but are not being actively used are a common source of cost inefficiency in the cloud. Identify and shut down these resources. Teams can also consider migrating workloads to different types of architectures -- such as serverless, which requires companies to pay only when the service is in active use.
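The search for unused or unattached resources can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `ResourceUsage` record, the `find_idle` function and the 5% average CPU threshold are hypothetical, and real utilization data would come from a provider's monitoring service rather than hard-coded lists.

```python
from dataclasses import dataclass


@dataclass
class ResourceUsage:
    """Utilization summary for one resource (hypothetical record shape)."""
    resource_id: str
    attached: bool                    # e.g., a disk attached to a VM
    cpu_percent_samples: list[float]  # recent CPU utilization readings


def find_idle(resources: list[ResourceUsage],
              cpu_threshold: float = 5.0) -> list[str]:
    """Return IDs of resources that are unattached or barely used."""
    idle = []
    for r in resources:
        samples = r.cpu_percent_samples
        # Treat a resource with no readings as having zero utilization.
        average = sum(samples) / len(samples) if samples else 0.0
        if not r.attached or average < cpu_threshold:
            idle.append(r.resource_id)
    return idle
```

A list like this is best treated as a set of candidates for review, not as an automatic shutdown queue, because low CPU use does not always mean a resource is unneeded.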
Another area to focus on is virtual server instance optimization. Most cloud providers let users select from dozens of VM instance types. When you're able to identify which one is most cost-efficient for a given workload, you will be less likely to overspend. Also, users should strive to take advantage of discounted instance offerings, such as reserved instances, when available.
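One simple way to reason about instance selection is to pick the cheapest instance type that still meets a workload's resource needs. The Python sketch below assumes a hypothetical `InstanceType` record and a caller-supplied catalog; the instance names and prices in the usage example are invented for illustration and do not reflect any provider's real offerings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceType:
    """One entry in an instance catalog (hypothetical record shape)."""
    name: str
    vcpus: int
    memory_gib: float
    hourly_usd: float


def cheapest_fit(catalog: list[InstanceType],
                 need_vcpus: int,
                 need_memory_gib: float) -> Optional[InstanceType]:
    """Return the lowest-priced type that satisfies both requirements."""
    candidates = [
        t for t in catalog
        if t.vcpus >= need_vcpus and t.memory_gib >= need_memory_gib
    ]
    # default=None covers the case where nothing in the catalog fits.
    return min(candidates, key=lambda t: t.hourly_usd, default=None)


# Made-up catalog, for illustration only.
catalog = [
    InstanceType("small", 2, 4.0, 0.04),
    InstanceType("medium", 4, 16.0, 0.16),
    InstanceType("large", 8, 32.0, 0.32),
]
choice = cheapest_fit(catalog, need_vcpus=3, need_memory_gib=8.0)
```

A real comparison would also factor in discounts such as reserved instances, which change the effective hourly price.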
Security, of course, is a crucial consideration in cloud application monitoring. The types of information that teams collect to support cloud security will vary widely depending on the workloads they deploy and the threats they face. In general, though, most cloud security monitoring strategies will include a focus on these areas:
- Identity and access management. IAM policies must be configured properly to prevent undesired access to cloud resources and services. If you use containers, they also need to be configured in ways that maximize isolation between the containers and the host. For example, be sure to prevent containers from running as root.
- Vulnerability detection. Is code that is deployed in the cloud -- whether on a VM, container, serverless function or something else -- being properly scanned for known vulnerabilities and malware signatures?
- Runtime anomaly detection. Applications and services running in the cloud should be monitored for unusual behavior that could signal a breach or attempted breach.
- DDoS. DDoS attacks, which overwhelm cloud applications by flooding them with requests, are a threat to both cloud security and performance. IT teams should use cloud providers' tools -- such as AWS Shield -- to mitigate DDoS attacks, while also monitoring for network traffic patterns that indicate that such an attack is being attempted.
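To give a sense of what monitoring for unusual traffic patterns can look like, here is a minimal Python sketch that flags intervals in which the request count rises far above a rolling baseline. It is a simplified illustration, not how AWS Shield or any other provider tool works. The `find_spikes` function, the five-interval window and the threshold of three standard deviations are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def find_spikes(counts: list[int],
                window: int = 5,
                threshold: float = 3.0) -> list[int]:
    """Return indexes whose request count far exceeds the recent baseline.

    Each count is compared with the mean of the preceding `window` counts;
    it is flagged if it exceeds that mean by more than `threshold`
    standard deviations.
    """
    spikes = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        mu = mean(baseline)
        # Floor the deviation at 1.0 so a perfectly flat baseline does not
        # flag every tiny increase.
        sigma = max(pstdev(baseline), 1.0)
        if counts[i] > mu + threshold * sigma:
            spikes.append(i)
    return spikes


# Requests per minute, with one sudden surge at index 6.
requests_per_minute = [100, 102, 98, 101, 99, 100, 900, 101]
flagged = find_spikes(requests_per_minute)
```

A flagged interval is only a prompt for investigation. A surge may be a DDoS attempt, but it may also be legitimate traffic that calls for scaling up instead of blocking.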
Tools for cloud application monitoring
Between native monitoring tools that are offered by cloud providers, such as Azure Monitor and AWS CloudWatch, and third-party monitoring products, there is no shortage of options for building a cloud monitoring tool set.
Most organizations will use the native tools of their cloud provider, or providers, as the foundation for performing all types of cloud monitoring. In many cases, however, native tools are not enough. It's wise to add a third-party monitoring platform that can take the data collected by tools such as CloudWatch and help teams to analyze and visualize it more effectively. Many of these third-party tools include an application performance monitoring (APM) platform. Some have a security information and event management (SIEM) platform that can analyze data from the cloud for security events.
Effective cloud application monitoring is a complex endeavor that requires you to take it on from multiple angles. The entire IT team will need to contribute, deploying a variety of tools and strategies to collect metrics that provide a holistic overview of cloud application performance, cost-efficiency and security.