Log files document the actions and responses of hardware systems, enterprise services, applications and users.
IT administrators analyze and review log activity to improve performance, troubleshoot problems, spot malicious activities and more.
But log availability has lagged in the public cloud. Customers have traditionally had little, if any, visibility into the cloud provider's infrastructure. Fortunately, public cloud providers are realizing that logs are crucial for troubleshooting, security and compliance, and providers such as Amazon Web Services (AWS), Microsoft Azure and Google now all offer some type of log management service.
Here's a closer look at public cloud log management tools and how to get more from those services.
What are some options for public cloud log management tools?
Most major public cloud vendors offer log management services. Pricing, integration and features vary, so evaluate prospective log management tools before adding them to your monthly cloud bill.
AWS provides CloudWatch Logs, which is designed to monitor and troubleshoot virtual servers, such as Elastic Compute Cloud (EC2) instances, and applications with native or custom log files. Log files are passed to AWS CloudWatch Logs programmatically through an API and are monitored in near-real time. Users can have log data trigger alarms -- for example, to alert them when the logs record a certain number of errors -- or use the metrics to produce graphs or reports.
Google Cloud Platform (GCP) offers the fully managed Stackdriver Logging service, which is designed to store, monitor, search, analyze and produce real-time alerts from log data and event metrics generated from GCP instances. The Google Stackdriver service, currently in beta, is intended to be compatible with AWS EC2 instances, as well as custom log data from any source.
Microsoft takes a more hybrid approach to cloud log management. Rather than offer log management as a menu item in the Azure service portfolio, Microsoft provides the Operations Management Suite as an add-on to System Center, which can ingest, correlate and visualize log data from Windows and Linux workloads across on-premises, Azure and AWS instances.
What are events, streams and groups in cloud log management?
Providers like AWS break logs into events, streams and groups. Logs are generally composed of events, or specific activities that systems, services or applications generate. Logging must be running to capture the events. A log event usually includes a timestamp that denotes the event date and time, along with a raw message that describes the event.
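To make the shape of a log event concrete, here is a minimal Python sketch. It is not any provider's API: the LogEvent class, the parse_line helper and the sample line are hypothetical, and the sketch assumes each line starts with an ISO 8601 timestamp followed by a space and the raw message.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogEvent:
    """One log event: when it happened and the raw text describing it."""
    timestamp: datetime
    message: str


def parse_line(line: str) -> LogEvent:
    """Split '<ISO 8601 timestamp> <raw message>' into a LogEvent (assumed format)."""
    stamp, _, message = line.partition(" ")
    return LogEvent(datetime.fromisoformat(stamp), message)


# Hypothetical sample line, for illustration only.
event = parse_line("2024-01-15T10:30:00+00:00 ERROR disk quota exceeded on /var")
```

Real services define their own event fields, so treat the two attributes here as the minimum described above rather than a complete schema.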
Related log entries are often grouped into streams. For example, a stream may show all of the events a particular server instance produces. Administrators can watch specific streams to get a better view of resource behaviors. They can delete old streams to open storage space for more logs.
Consider the retention period for cloud logs
Administrators can typically select retention periods for log groups, ranging from one day to 10 years, or choose to have them never expire. But log data can be substantial -- especially with many log groups. Select a retention period that provides adequate adherence to compliance requirements, while maintaining manageable cloud log storage costs.
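As a rough illustration of what a retention period does, the following Python sketch drops events older than a cutoff. The apply_retention function and the sample events are hypothetical; a cloud provider enforces retention on its side, so this only models the effect.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple

Event = Tuple[datetime, str]  # (timestamp, raw message)


def apply_retention(events: List[Event],
                    retention_days: Optional[int],
                    now: datetime) -> List[Event]:
    """Keep events inside the retention window; None means never expire."""
    if retention_days is None:
        return list(events)
    cutoff = now - timedelta(days=retention_days)
    return [event for event in events if event[0] >= cutoff]


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
events = [
    (now - timedelta(days=10), "recent event"),
    (now - timedelta(days=40), "old event"),
]
kept = apply_retention(events, retention_days=30, now=now)
```

With a 30-day window, only the 10-day-old event survives. A shorter window stores less data, which is the cost and compliance tradeoff described above.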
Related log streams are organized into groups for better management. Each stream associated with a group usually shares the same configuration, such as administrative access, retention and other characteristics.
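The relationship between streams and groups can be sketched as a small data model. This Python example is an assumption-laden illustration, not a provider SDK: the LogGroup class, its methods and the instance names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LogGroup:
    """A group holds shared settings plus named streams of event messages."""
    name: str
    retention_days: Optional[int]  # shared by every stream; None = never expire
    streams: Dict[str, List[str]] = field(
        default_factory=lambda: defaultdict(list))

    def put_event(self, stream_name: str, message: str) -> None:
        """Append an event to a stream, creating the stream on first use."""
        self.streams[stream_name].append(message)

    def delete_stream(self, stream_name: str) -> None:
        """Remove an old stream to free storage."""
        self.streams.pop(stream_name, None)


group = LogGroup(name="web-servers", retention_days=30)
group.put_event("instance-a", "GET /index.html 200")
group.put_event("instance-a", "GET /missing 404")
group.put_event("instance-b", "GET /index.html 200")
group.delete_stream("instance-b")
```

Here each stream maps to one server instance, and the retention setting lives on the group so that every stream in it inherits the same configuration.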
Cloud log management tools provide metric filters that look for events and convert them to data points. For example, an admin might create a metric filter that counts invalid login attempts listed within a log group. This can help identify possible attacks or malicious login activity. Once a metric is established against a log, it is also possible to create an alarm that fires when certain conditions are met.
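The invalid-login example can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not a provider's filter syntax: count_matches, alarm_state, the "Invalid login" pattern, the threshold of 3 and the sample messages are all hypothetical.

```python
from typing import Iterable


def count_matches(messages: Iterable[str], pattern: str) -> int:
    """Metric filter: turn matching log events into a single data point."""
    return sum(1 for message in messages if pattern in message)


def alarm_state(metric_value: int, threshold: int) -> str:
    """Alarm: compare the metric against a threshold."""
    return "ALARM" if metric_value >= threshold else "OK"


# Hypothetical messages from one log group.
messages = [
    "Accepted password for alice",
    "Invalid login for root",
    "Invalid login for admin",
    "Invalid login for test",
]
invalid_logins = count_matches(messages, "Invalid login")
state = alarm_state(invalid_logins, threshold=3)
```

Three of the four messages match, so the metric is 3 and the alarm trips. Managed services usually evaluate such metrics over a time window, which this sketch leaves out.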
How do log management tools or practices differ in the public cloud vs. on premises?
There are no profound differences between on-premises log management and the log management services from public cloud providers. However, there are some wrinkles to consider.
First, you may have to use more than one tool. Ideally, a single log management tool would service both on-premises and cloud deployments. Providers may be able to ingest on-premises logs and add that content to the logs collected for cloud deployments. Still, underlying log management tools, such as AWS CloudWatch Logs and Google Stackdriver Logging, are services from the cloud provider; this can be a tough pill to swallow for organizations that have already made a serious investment in on-premises log management tools.
Second, providers' cloud log management tools may not be sufficient on their own. Organizations may need additional cloud services, such as storage, messaging and alerting, to create a suitable log management environment. Every added feature, however, may increase the monthly log management cost.
Finally, successful cloud logging may require IT teams to install agents on each cloud instance they log. Not all applications or services generate log files, and even those that do can vary in format and detail. This makes it difficult for the log management tool to process log content. Use a uniform agent to capture events in a consistent manner and pass them to the log tool in the correct format.
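To show what capturing events "in a consistent manner" might involve, here is a minimal Python sketch of the normalization step an agent could perform. It assumes only two input formats, a JSON line and a space-separated "timestamp level message" line. The normalize function and the field names time, level and msg are hypothetical.

```python
import json
from typing import Dict


def normalize(raw: str, source: str) -> Dict[str, str]:
    """Convert one raw log line into a uniform record."""
    raw = raw.strip()
    if raw.startswith("{"):
        # Assumed JSON shape: {"time": ..., "level": ..., "msg": ...}
        record = json.loads(raw)
        timestamp, level, message = record["time"], record["level"], record["msg"]
    else:
        # Assumed plain-text shape: "<timestamp> <level> <message>"
        timestamp, level, message = raw.split(" ", 2)
    return {
        "source": source,
        "timestamp": timestamp,
        "level": level.upper(),
        "message": message,
    }


from_app = normalize(
    '{"time": "2024-01-15T10:30:00Z", "level": "error", "msg": "db timeout"}',
    source="app")
from_proxy = normalize(
    "2024-01-15T10:30:01Z warn cache miss rate high", source="proxy")
```

Both records end up with the same four keys, so the log tool downstream only has to understand one format.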
Also, have a plan for handling agent updates; log management services typically use agents to collect and pass data from deployed cloud instances to the log service. Log agents may periodically update to reflect changes in the log service API, feature set or dependencies. Know how to check agent versions, update agents in running deployments and adjust VM image files that include agents.
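Checking agent versions across a fleet can be as simple as comparing each reported version against a required minimum. The Python sketch below assumes plain dotted numeric versions such as 1.10.0. The outdated function, the instance names and the version numbers are hypothetical, and how the versions are collected depends on the agent in use.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def outdated(installed: Dict[str, str], minimum: str) -> List[str]:
    """Return instance names whose agent is older than the minimum version."""
    floor = parse_version(minimum)
    return sorted(name for name, version in installed.items()
                  if parse_version(version) < floor)


# Hypothetical inventory: instance name -> installed agent version.
installed = {"web-1": "1.4.2", "web-2": "1.10.0", "db-1": "1.9.9"}
needs_update = outdated(installed, minimum="1.10.0")
```

Versions are compared as tuples of integers because plain string comparison would rank "1.4.2" above "1.10.0" and miss an outdated agent.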