
What are the different types of cloud load balancing?

Learn how load balancing in the cloud differs from traditional network traffic distribution, and explore the different services available from AWS, Google and Microsoft.

Load balancing is the process of distributing network traffic across two or more instances of a workload. IT teams use load balancing to ensure each instance performs at peak efficiency, without any one instance becoming overburdened or failing due to excess network traffic.

Traditionally, a load balancer exists in a local data center as a dedicated physical network appliance. However, load balancing is more frequently performed by an application installed on a server and offered as a network service. Public cloud providers use the service paradigm and provide software-based load balancers as a distinct feature.

Once a load balancer is implemented, it acts as a network front end and often uses a single IP address to receive all network traffic intended for the target workload. The load balancer can distribute the network traffic evenly across the available workload instances, or it can weight the distribution so that each instance receives a specific percentage of the traffic.
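The two distribution modes described above can be illustrated with a minimal Python sketch. The instance names, the 80/20 split and the helper functions are hypothetical and chosen only for illustration; real load balancers implement these policies inside the service, not in application code.

```python
import itertools
import random
from collections import Counter


def round_robin(instances):
    """Return an endless iterator that hands out instances in turn (even distribution)."""
    return itertools.cycle(instances)


def weighted_choice(weights, rng):
    """Pick one instance with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Even distribution: nine requests across three hypothetical instances.
instances = ["app-1", "app-2", "app-3"]
rr = round_robin(instances)
even = Counter(next(rr) for _ in range(9))

# Weighted distribution: roughly 80% of requests to app-1, 20% to app-2.
rng = random.Random(0)  # seeded so that repeated runs give the same result
weights = {"app-1": 80, "app-2": 20}
weighted = Counter(weighted_choice(weights, rng) for _ in range(10_000))

print(even)
print(weighted)
```

Round robin gives each instance an identical share, while the weighted picker only approximates the target percentages over many requests.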

With a load balancer, the target workloads can be in different physical places. Cloud load balancing offers the same flexibility: users can distribute network traffic across multiple instances within the same region, or across multiple availability zones or regions.

Layer 4 vs. Layer 7 cloud load balancing

Load balancing is classified by the layer at which it handles network traffic, based on the traditional seven-layer Open Systems Interconnection (OSI) network model. Each layer corresponds to specific traffic types. Cloud load balancing is most commonly performed at Layer 4 (the transport layer) or Layer 7 (the application layer).

[Figure: the seven-layer OSI model as it relates to cloud load balancing]

For example, AWS' Network Load Balancer service operates at Layer 4 and directs traffic carried over Transmission Control Protocol (TCP), User Datagram Protocol (UDP) and Transport Layer Security (TLS) connections. Google Cloud Platform (GCP) refers to its equivalent as TCP/UDP Load Balancing, while Microsoft calls its Layer 4 service Azure Load Balancer. Because traffic is handled at a lower level of the network stack, without inspecting application content, Layer 4 load balancing generally offers the highest performance. These services can handle millions of network requests per second at low latency, which makes them a good option for erratic or unpredictable network traffic patterns.
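To show what it means to balance at the transport layer, here is a minimal sketch of one common approach: hashing a connection's addresses, ports and protocol to select a back end. The `pick_backend` function, the hash choice and the IP addresses are assumptions made for illustration; this is not a description of how any of the providers named above implement their services.

```python
import hashlib


def pick_backend(src_ip, src_port, dst_ip, dst_port, protocol, backends):
    """Map a connection's 5-tuple to a backend with a stable hash.

    Only transport-level fields are used; the payload is never inspected.
    """
    key = f"{protocol}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Hypothetical back-end addresses and client connection.
backends = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
first = pick_backend("203.0.113.7", 50123, "198.51.100.1", 443, "tcp", backends)
again = pick_backend("203.0.113.7", 50123, "198.51.100.1", 443, "tcp", backends)

print(first == again)  # packets of the same connection reach the same backend
```

Because the decision depends only on header fields, every packet in a connection lands on the same instance, and the balancer does very little work per packet.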

At the top of the network stack, Layer 7 handles more complex traffic, such as HTTP and HTTPS requests. Each of the major cloud providers has its own feature or service for this.

Since this traffic is much higher up the network stack, IT teams can implement more advanced options, such as content- or request-based routing decisions. This type of cloud load balancing works well with modern application instances and architectures, including microservices and container-based workloads.
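Content- and request-based routing can be sketched in a few lines of Python. The route table, pool names and the `X-Canary` header below are hypothetical; cloud providers expose this kind of logic through listener rules or URL maps, not through application code like this.

```python
from urllib.parse import urlsplit

# Hypothetical routing rules: URL path prefix -> backend pool name.
ROUTES = [
    ("/api/", "api-service"),
    ("/static/", "static-assets"),
]
DEFAULT_POOL = "web-frontend"


def route_request(url, headers=None):
    """Choose a backend pool from the request's headers and URL path."""
    headers = headers or {}
    # Request-based rule: a (hypothetical) header sends traffic to a canary pool.
    if headers.get("X-Canary") == "true":
        return "canary-pool"
    # Content-based rule: match the URL path against the known prefixes.
    path = urlsplit(url).path
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL


print(route_request("https://example.com/api/v1/users"))
print(route_request("https://example.com/index.html"))
print(route_request("https://example.com/api/v1/users", {"X-Canary": "true"}))
```

Routing on paths and headers is what lets a single front end serve several microservices, since each rule can point at a separately scaled pool.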

The choice of a cloud load balancer should extend beyond traffic types alone. Cloud providers also differentiate load-balancing services based on scope and framework. For example, GCP suggests global load-balancing services when workloads are distributed across multiple regions, while regional load-balancing services are a good fit when all workloads are in the same region. Similarly, GCP suggests external load balancers when traffic comes into the workloads from the internet and internal load balancers when traffic stays within GCP.
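The guidance above reduces to two independent questions, which the following sketch encodes. The `suggest_load_balancer` helper and the region names are hypothetical, and the function reflects only the rule of thumb described in this article, not any provider's actual product matrix.

```python
def suggest_load_balancer(regions, internet_facing):
    """Return (scope, exposure) following the rule of thumb in the text.

    regions: region names where the workload instances run.
    internet_facing: True if clients reach the workload from the internet.
    """
    if not regions:
        raise ValueError("at least one region is required")
    scope = "global" if len(set(regions)) > 1 else "regional"
    exposure = "external" if internet_facing else "internal"
    return scope, exposure


print(suggest_load_balancer(["us-central1", "europe-west1"], internet_facing=True))
print(suggest_load_balancer(["us-central1"], internet_facing=False))
```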

Be sure to consider the broader suite of features and capabilities available with cloud load-balancing services. In particular, features can include support for a single front-end IP address, support for automatic workload scaling, and integration with other cloud services, such as monitoring and alerting.
