Cloud instances run the gamut in terms of compute power, memory, storage and support for GPUs, machine learning and other specialized features. Administrators should let application requirements dictate cloud instance types and sizes -- especially since the wrong match can lead to poor performance and high costs.
Some apps might require a larger cloud instance -- meaning a virtual machine with high levels of compute, storage and other resources -- while others can perform well on a smaller instance with fewer resources. But not everyone lands on the right cloud instance at first, and the dynamic nature of cloud apps means that pairings don't always work as planned.
Use cloud optimization tools and techniques to choose the right cloud instance types and sizes, and then reset and adjust them, as needed, over time.
Define application requirements
Cloud-hosted applications require a different mindset than those on premises. IT managers typically overprovision for on-premises apps, because it's a challenge to scale up additional resources later. Cloud apps are easier to scale on demand, but the specific resources they need to scale -- such as compute, RAM and storage -- vary depending on the workload. For example, a database app requires comparatively more RAM and storage IOPS than a web server, which instead places greater demand on compute.
This makes it difficult to right-size a cloud instance, as enterprises need to know an application's resource usage pattern, ideally over an entire year, said Torsten Volk, managing research director at Enterprise Management Associates.
Check the cloud vendor's resource limits for CPU, RAM, storage and more. Although cloud providers' tools can help determine the best instance type and setup for an application, these vendors have little incentive to provide customers with sophisticated cloud optimization tools, since overprovisioning can be a source of revenue, Volk said. Admins will need to implement other techniques, and potentially third-party optimization tools, to get the full picture.
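To make the matching step above concrete, here is a minimal sketch of picking the smallest instance that covers an app's resource profile. The catalog names and specs are hypothetical examples, not any provider's real offerings:

```python
# Sketch: pick the smallest instance that satisfies an app's resource
# profile. Instance names and specs are hypothetical, not a real
# provider catalog.

# (name, vCPUs, RAM in GiB, baseline storage IOPS)
CATALOG = [
    ("small",  2,  4,  3000),
    ("medium", 4,  8,  6000),
    ("large",  8, 16, 12000),
]

def smallest_fit(vcpus, ram_gib, iops):
    """Return the first (smallest) catalog entry that covers the app's needs."""
    for name, c, r, i in CATALOG:
        if c >= vcpus and r >= ram_gib and i >= iops:
            return name
    return None  # nothing fits: the app needs a bigger instance family

# A database workload: modest CPU, but heavy on RAM and storage IOPS.
print(smallest_fit(vcpus=2, ram_gib=12, iops=8000))  # large
# A web server: compute-bound, light on memory and IOPS.
print(smallest_fit(vcpus=4, ram_gib=4, iops=1000))   # medium
```

Note that the two workloads land on different sizes for different reasons: the database is forced up by RAM and IOPS, the web server by vCPU count.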
Analyze metrics to drive optimization
Utilization metrics, such as CPU, storage, memory and network capacity, can show whether an instance is the right size for an application. For example, a high amount of request latency might signal an instance size is too small and can't handle the current load.
"Be sure to incorporate an understanding of how spikes affect your metrics when determining the right instance size," said Beau Bennett, senior cloud architect at Candid Partners, a cloud consultancy.
There should be enough spare capacity to handle a spike in usage just long enough for horizontal scaling to kick in. If a single metric is far higher than the rest -- for example, if RAM is fully used, but CPU and network capacity is low -- it might be time for a different instance size.
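The imbalance check described above can be sketched as a simple rule over utilization metrics. The thresholds below are illustrative, not provider guidance:

```python
# Sketch: flag an instance whose resource usage is badly imbalanced --
# e.g., RAM nearly exhausted while CPU and network sit idle.
# Thresholds are illustrative defaults, not provider guidance.

def diagnose(utilization, high=0.85, low=0.30):
    """utilization: dict of metric name -> fraction of capacity in use."""
    maxed = [m for m, u in utilization.items() if u >= high]
    idle = [m for m, u in utilization.items() if u <= low]
    if maxed and idle:
        return (f"imbalanced: {maxed} saturated while {idle} idle; "
                "consider a different instance family")
    if maxed:
        return "undersized: scale up or out before the next spike"
    return "ok"

# RAM is fully used, but CPU and network capacity are low.
print(diagnose({"cpu": 0.20, "ram": 0.95, "network": 0.15}))
```

A result like this points toward a memory-optimized instance family rather than simply a bigger size in the same family.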
Build a monitoring and adjustment loop
To ensure a cloud application scales efficiently, create a feedback loop between metrics and operations.
When monitoring metrics display high or low utilization, make small adjustments to the instance sizes or types. Continue monitoring these metrics to see how changes affect application performance and efficiency.
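The small-adjustment loop can be sketched as a one-step-at-a-time policy: move up or down a single size per observation cycle, then re-measure. The size ladder and target band below are hypothetical:

```python
# Sketch of the metrics -> adjustment feedback loop: step the instance
# size up or down one notch at a time, then keep observing. The size
# names and the 40-75% CPU target band are hypothetical choices.

SIZES = ["small", "medium", "large", "xlarge"]

def next_size(current, cpu_util, target_low=0.40, target_high=0.75):
    """Make one small adjustment based on observed CPU utilization."""
    i = SIZES.index(current)
    if cpu_util > target_high and i < len(SIZES) - 1:
        return SIZES[i + 1]   # running hot: one step bigger
    if cpu_util < target_low and i > 0:
        return SIZES[i - 1]   # overprovisioned: one step smaller
    return current            # inside the target band: leave it alone

print(next_size("medium", 0.90))  # large
print(next_size("large", 0.20))   # medium
print(next_size("medium", 0.55))  # medium
```

Moving one step per cycle keeps each change small enough that the next round of metrics shows its effect clearly.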
A robust CI/CD pipeline, where changes occur with the aid of policy-based controls and automation, helps cloud administrators adjust settings without unexpected consequences or lengthy manual processes. Infrastructure as code, where resources are templatized and defined in machine-readable configuration, is also helpful for ongoing cloud optimization, Bennett said.
Performance and utilization metrics are core offerings across cloud providers, or organizations can turn to third-party products that provide an overview of the instances in use.
Use cloud cost management tools
Native cost management tools, including Azure Cost Management, AWS Cost Management and Google Rightsizing Recommendations, can help with optimization, but may not be enough to accurately estimate all workloads' needs. For example, Google Rightsizing Recommendations can suggest instance sizing based on a running eight-day average. However, that average utilization information doesn't suffice for spiky apps with peaks that occur once a month or quarter, or that are tied to specific events, such as Black Friday or Tax Day.
"Always profile important workloads to include both average and peak usage in your sizing decision," Volk cautioned.
Additionally, these tools might not track all the required resources that are important for an application. For example, Google Rightsizing Recommendations only considers CPU, RAM and storage size. Some applications have other limiting factors, such as network IOPS. Also, the tool doesn't account for certain deployment types, such as Kubernetes clusters, which leaves administrators to make their own cloud optimization decisions.
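A quick calculation shows why an average-only recommendation misleads for spiky workloads. The utilization series below is hypothetical: a quiet month with one event-driven peak:

```python
# Sketch: why an average-only recommendation misleads for spiky apps.
# 30 days of hypothetical daily peak CPU utilization with one
# Black Friday-style spike on the last day.

utilization = [0.25] * 29 + [0.98]

average = sum(utilization) / len(utilization)
peak = max(utilization)

print(f"average: {average:.2f}")  # looks safely overprovisioned
print(f"peak:    {peak:.2f}")     # the instance nearly saturates

# Size for the peak plus headroom, not the average.
headroom = 0.20
print(peak * (1 + headroom) > 1.0)  # True: too small for the spike
```

An averaging tool would recommend shrinking this instance; profiling the peak shows it actually needs more capacity, which is exactly the point of including both numbers in the sizing decision.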
Load test different instance types
Conduct load tests for applications on different instance types and sizes to determine the expected average and burst performance metrics, Volk recommended. Model load tests as closely as possible to real-life usage patterns, and how these usage patterns intersect. For example, peak loads could occur during API function calls and, simultaneously, through user interaction with an application's graphical front end.
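Once load tests run against several candidate types, the comparison reduces to filtering measured results against the app's performance targets. All numbers below are hypothetical measurements, not benchmarks:

```python
# Sketch: compare load-test results across candidate instance types.
# Instance names, measurements and targets are all hypothetical.

# instance type -> (avg requests/sec, p99 latency in ms under burst)
results = {
    "general-4cpu": (1200, 180),
    "compute-4cpu": (1900, 95),
    "memory-4cpu":  (1100, 210),
}

required_rps, max_p99_ms = 1500, 120

candidates = [name for name, (rps, p99) in results.items()
              if rps >= required_rps and p99 <= max_p99_ms]
print(candidates)  # ['compute-4cpu']
```

Filtering on both average throughput and burst-latency percentile, rather than either alone, reflects the combined usage patterns the load tests are meant to model.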
To make things more complex, integrated third-party systems might have requirements in terms of communication speed with a cloud-hosted app. Integration becomes even trickier when applications share common microservices. Cloud optimization across these interconnected, distributed architectures requires vigilance and complex modeling.
Consider dynamic instance types
AWS offers burstable performance instances that enable admins to dynamically add and pay for CPU performance, as needed. While this kind of service boosts cloud performance, it is difficult to know how well an application will scale with only CPU improvement and no upgrade to the speed or size of memory and storage. Whenever you rely on instance bursting, always conduct load tests, Volk said.
In addition, if you're spending significantly on added CPUs, review the application's needs; it may be time to convert to a larger instance size.
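That review can be framed as a simple break-even check: if the base rate plus the average hourly burst surcharge exceeds the next fixed size up, the larger instance is cheaper. All prices below are hypothetical, purely for illustration:

```python
# Sketch: decide when sustained burst charges justify a larger fixed
# instance. All hourly prices are hypothetical illustrations, not
# real provider rates.

def should_upsize(base_hourly, burst_hourly_avg, larger_hourly):
    """True if the base instance plus its average burst surcharge
    costs more than the next fixed size up."""
    return base_hourly + burst_hourly_avg > larger_hourly

# Bursting most of the day pushes effective cost past the bigger size.
print(should_upsize(base_hourly=0.0416, burst_hourly_avg=0.05,
                    larger_hourly=0.0832))  # True
# Occasional bursting stays cheaper than upsizing.
print(should_upsize(base_hourly=0.0416, burst_hourly_avg=0.01,
                    larger_hourly=0.0832))  # False
```

The cost comparison is only half the decision; as noted above, the larger instance also changes memory and storage characteristics that bursting alone does not.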
Watch out for container issues
Container clusters might have dependencies that cause higher network throughput or storage load than expected. A key culprit is often Kubernetes scheduling policies. "This is an insidious issue to diagnose," Volk said.
Take cloud optimization reviews beyond just CPU, storage and RAM if you plan to launch containers. Determine whether there are restrictions in terms of how many Kubernetes pods users can launch or delete in a certain period of time, and whether there are size restrictions for individual containers running on an instance.
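The checks above can be wired into a pre-flight script before launching containers. The limits below are hypothetical stand-ins for whatever a provider or cluster policy actually enforces:

```python
# Sketch: pre-flight check for container limits beyond CPU/RAM --
# pod-count and per-container size caps. The limits are hypothetical
# stand-ins for real provider or cluster-policy restrictions.

def fits_cluster(pods_needed, container_mem_gib,
                 max_pods_per_node=110, max_container_gib=16, nodes=3):
    if container_mem_gib > max_container_gib:
        return "container exceeds per-container size limit"
    if pods_needed > max_pods_per_node * nodes:
        return "pod count exceeds cluster capacity; add nodes or raise limits"
    return "ok"

print(fits_cluster(pods_needed=250, container_mem_gib=8))
print(fits_cluster(pods_needed=400, container_mem_gib=24))
```

Catching these limits before deployment avoids the "insidious" diagnosis problem Volk describes, where scheduling restrictions surface only as unexplained throughput or storage load.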