To most IT organizations, "performance" means response time or the quality of user experience. Like all applications, a multicloud app -- or an application that moves across different cloud platforms -- can be impacted by three primary factors: overall availability; network delay and packet loss; and processing delays in the application and its components.
The design of a multicloud app also plays a role in its performance. Applications are trending toward "componentization," or the separation of application functions and features into independent components. Microservices are the most recent example of this trend, and since separate components can scale horizontally to improve application processing, componentization is seen as a natural fit for the cloud. That is only partly true.
Componentization creates a long, threaded workflow that passes in and among cloud providers and hosts. An issue anywhere along that workflow will impact the quality of experience (QoE) for users.
A key part of application performance management is measuring user QoE and having a benchmark to compare it to. The most reliable place to get that information is the device or front-end application component that is closest to the user. If you measure performance somewhere inside an application workflow, you'll detect and address only that specific part of the application. If software can't measure response time at the user's device, measure it manually, either by timing transactions individually or in a group. The "normal" or expected value becomes your benchmark, and any deviations signal the need to identify and address problems.
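The benchmark comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names, the sample values and the three-standard-deviation threshold are all assumptions chosen for the example, and the response times are assumed to have been collected at the user's device or front-end component.

```python
from statistics import mean, stdev


def build_benchmark(samples):
    """Return (mean, standard deviation) of baseline response times in seconds."""
    if len(samples) < 2:
        raise ValueError("need at least two baseline samples")
    return mean(samples), stdev(samples)


def find_deviations(benchmark, observed, tolerance=3.0):
    """Return observed times more than `tolerance` standard deviations above the baseline mean.

    The default tolerance of 3.0 is an illustrative assumption, not a recommendation.
    """
    base_mean, base_stdev = benchmark
    limit = base_mean + tolerance * base_stdev
    return [t for t in observed if t > limit]


# Hypothetical response times (seconds) measured during normal operation.
baseline = [0.50, 0.52, 0.48, 0.51, 0.49]
benchmark = build_benchmark(baseline)

# Later measurements; anything far above the baseline is flagged for investigation.
slow = find_deviations(benchmark, [0.50, 0.53, 1.20])
```

In practice, the right tolerance depends on how much variation users already accept as normal, so treat the threshold as something to tune rather than a fixed rule.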
Optimize multicloud app performance
When application performance suffers in a multicloud environment, most organizations first try to determine whether the problem lies with a specific cloud provider. Look at cloud management logs to see whether the performance problems stem from a cloud event, such as a failover or the addition or removal of application instances. If so, the problem is likely with that specific cloud provider, and you should address it there before looking at multicloud causes.
If no single provider appears to be the cause of the issue, trace the app's workflow across the multiple cloud platforms you use. A multicloud app generally falls into one of two categories: one where each cloud provider hosts the app for a given geography or user group, and one where the app's components are spread across multiple providers. Problems with QoE in the first case will be specific to one set of users, and that will identify the cloud provider involved. The second case, however, is more complicated.
Data loss or delay is the cause of most application performance problems, so know how work is passed between your cloud providers. There are three broad options: the providers pass work via the internet, the providers pass work through your own central VPN or the providers are themselves interconnected through a private network. Each of these options requires different testing and remediation.
If the providers pass work over the internet, the connection is difficult to monitor, and the cloud providers themselves may be unable to help. To effectively monitor the workflow handoff points between providers, build loss and delay detection into your application components. Fortunately, many applications use TCP/IP, and by monitoring the window size and reading middleware network logs, you can often detect long delays, which are shown as large windows or buffers, as well as packet loss.
Window/buffer and network logs can also help you monitor internal VPN or direct cloud connections, but there are alternatives. Pick an option based on your internal tools and skills or consider using professional services from an integrator, vendor or cloud provider to set up your application for easy monitoring.
If your multicloud app workflow is connected between providers through your own VPN, use a data monitoring probe to look at the actual packet flow. In some cases, you will see direct evidence of a delay or loss. Check with monitoring vendors to ensure you can get the data you need and have someone on staff who understands packet flows to interpret results.
It's also possible to insert a software probe into application workflows at the cloud provider boundary points. There are some standards for monitoring, such as RMON, but vendors also provide proprietary test and monitoring tools that may offer better features. Where possible, always perform the analysis at the probe itself rather than creating a stream of monitoring packets that are sent back to a remote location. The latter approach creates its own delays and variations that often disguise the real problem.
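To illustrate the difference between analyzing at the probe and streaming raw data elsewhere, here is a minimal sketch of a probe that aggregates delay measurements locally and hands back only a compact summary. The `ProbeSummary` class and its method names are hypothetical and do not correspond to RMON or to any vendor tool.

```python
class ProbeSummary:
    """Accumulate delay statistics locally so only a small summary leaves the probe."""

    def __init__(self):
        self.reset()

    def reset(self):
        """Clear all counters at the start of a new reporting interval."""
        self.count = 0
        self.total_delay = 0.0
        self.max_delay = 0.0

    def observe(self, delay_seconds):
        """Record one measured delay without forwarding it anywhere."""
        self.count += 1
        self.total_delay += delay_seconds
        self.max_delay = max(self.max_delay, delay_seconds)

    def report(self):
        """Return the summary for this interval, then start a new one."""
        avg = self.total_delay / self.count if self.count else 0.0
        summary = {"count": self.count, "avg_delay": avg, "max_delay": self.max_delay}
        self.reset()
        return summary


probe = ProbeSummary()
for delay in (0.1, 0.3, 0.2):  # hypothetical per-transaction delays in seconds
    probe.observe(delay)
summary = probe.report()
```

Sending one small summary per interval, instead of one monitoring packet per observation, keeps the measurement traffic from adding its own load to the path being measured.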
The best way to identify the source of application performance problems in a multicloud model is to build the capability into the applications at the component level, or at least where workflows cross a public cloud provider boundary. Sequence numbers and timestamps on transactions will provide reliable packet delay and loss data, and both network and cloud providers accept this information as indication of a problem.
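As a rough sketch of how sequence numbers and timestamps expose loss and delay at a provider boundary, the example below checks a batch of received transactions for gaps and slow handoffs. The `Record` fields, the `analyze_handoff` function and the 0.2-second threshold are assumptions made for illustration. The sketch also assumes that the sending and receiving components have synchronized clocks, because one-way delay figures are meaningless otherwise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    seq: int            # sequence number assigned by the sending component
    sent_at: float      # sender timestamp, in seconds
    received_at: float  # receiver timestamp, in seconds (clocks assumed synchronized)


def analyze_handoff(records, first_seq, last_seq, max_delay):
    """Return (missing sequence numbers, [(seq, delay)] for transactions over max_delay)."""
    seen = {r.seq for r in records}
    missing = [s for s in range(first_seq, last_seq + 1) if s not in seen]
    delayed = [
        (r.seq, r.received_at - r.sent_at)
        for r in records
        if r.received_at - r.sent_at > max_delay
    ]
    return missing, delayed


# Hypothetical transactions crossing a cloud provider boundary; sequence 3 never arrived.
records = [
    Record(seq=1, sent_at=0.0, received_at=0.05),
    Record(seq=2, sent_at=1.0, received_at=1.04),
    Record(seq=4, sent_at=3.0, received_at=3.5),
]
missing, delayed = analyze_handoff(records, first_seq=1, last_seq=4, max_delay=0.2)
```

Here `missing` identifies the lost transaction and `delayed` identifies the slow one, which is the kind of specific evidence a provider can act on.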
To remediate any problems with multicloud app performance, take the evidence to your cloud and network providers and work with them on a resolution. The goal is to associate your performance issue with a condition within or among the providers that can be remedied. In some cases, you'll have to pay for a premium service to increase performance, so don't assume it's a cloud provider "error" that can be fixed -- you're in this together.