-
Overview
- Google Cloud external TCP/UDP Network Load Balancing (Network Load Balancing) is a regional, non-proxied load balancer
- Network Load Balancing distributes traffic among virtual machine (VM) instances in the same region in a Virtual Private Cloud (VPC) network
- A network load balancer directs TCP or UDP traffic across regional backends
- Use Network Load Balancing to load balance UDP, TCP, and SSL traffic on ports that are not supported by the TCP proxy load balancers and SSL proxy load balancers
-
Characteristics
- Network Load Balancing is a managed service
- Network Load Balancing is implemented by using Andromeda virtual networking and Google Maglev
- The network load balancers are not proxies
- Responses from the backend VMs go directly to the clients, not back through the load balancer
- The industry term for this is direct server return
- The load balancer preserves the source IP addresses of packets
- The destination IP address for packets is the regional external IP address associated with the load balancer's forwarding rule
- Instances that participate as backend VMs for network load balancers must be running the appropriate Linux guest environment, Windows guest environment, or other processes that provide equivalent functionality
- The guest OS environment (or an equivalent process) is responsible for configuring local routes on each backend VM
- These routes allow the VM to accept packets that have a destination that matches the IP address of the load balancer's forwarding rule
- On the backend instances that accept load-balanced traffic, configure the software to bind to the IP address associated with the load balancer's forwarding rule (or to any IP address, 0.0.0.0)
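- As a rough illustration of both points, the sketch below (for a Linux backend; the VIP 203.0.113.10 and port 80 are placeholders) checks the guest environment's local route and starts a throwaway server bound to all addresses:

```bash
# The guest environment installs a local route for the forwarding
# rule's IP address; 203.0.113.10 is a documentation placeholder.
# This should print a matching "local" route entry.
ip route show table local | grep 203.0.113.10

# Bind a throwaway test server to all addresses (0.0.0.0) so it
# accepts packets whose destination is the forwarding rule's IP.
python3 -m http.server 80 --bind 0.0.0.0
```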
-
Protocols, scheme, and scope
- Each network load balancer supports either TCP or UDP traffic (not both)
- A network load balancer uses a target pool to contain the backend instances among which traffic is load balanced
- A network load balancer balances traffic originating from the internet
- You cannot use it to load balance traffic that originates within Google Cloud between instances
- The scope of a network load balancer is regional, not global
- A network load balancer cannot span multiple regions
- Within a single region, the load balancer services all zones
- Use Network Load Balancing to balance UDP traffic, or to load balance a TCP port that isn't supported by other load balancers (see the sketch after this list)
- Use it when it is acceptable to have SSL traffic decrypted by your backends instead of by the load balancer; the network load balancer cannot perform SSL offload
- When the backends decrypt SSL traffic, there is a greater CPU burden on the VMs
- Self-managing the load balancer's SSL certificates is acceptable
- Google-managed SSL certificates are only available for HTTP(S) Load Balancing and SSL Proxy Load Balancing
- Use it when you need to forward the original packets to the backends unproxied
- Use it when you have an existing setup that uses a pass-through load balancer and want to migrate it without changes
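- As a sketch of how a UDP workload is configured (each forwarding rule handles a single protocol), a UDP rule might look like the following; all names, the region, and the port are illustrative placeholders, and udp-pool is assumed to already exist:

```bash
# Each forwarding rule matches a single protocol, so UDP traffic
# (for example, a game server on port 27015) needs its own UDP rule.
gcloud compute forwarding-rules create udp-rule \
    --region=us-central1 \
    --ip-protocol=UDP \
    --ports=27015 \
    --target-pool=udp-pool
```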
-
Architecture
- The network load balancers balance the load on systems based on incoming IP protocol data, such as address, port, and protocol type
- The network load balancer is a pass-through load balancer, so backends receive the original client request
- The network load balancer doesn't do any Transport Layer Security (TLS) offloading or proxying
- Traffic is directly routed to VMs
- When you create a forwarding rule for the load balancer, you either receive an ephemeral virtual IP address (VIP) or reserve a VIP that originates from a regional network block (see the sketch after this list)
- The forwarding rule is associated with the backends
- The VIP is anycasted from Google's global points of presence, but the backends for a network load balancer are regional
- The load balancer cannot have backends that span multiple regions
- Google Cloud firewalls can be used to control or filter access to the backend VMs
- The network load balancer examines the source and destination ports, IP address, and protocol to determine how to forward packets
- For TCP traffic, modify the forwarding behavior of the load balancer by configuring session affinity
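- A minimal sketch of reserving a static regional VIP with gcloud; the name network-lb-ip and the region are placeholders:

```bash
# Reserve a static regional external IP address (the VIP). If you
# instead omit --address when creating the forwarding rule, an
# ephemeral VIP is assigned.
gcloud compute addresses create network-lb-ip \
    --region=us-central1

# Show the address that was allocated from the regional block.
gcloud compute addresses describe network-lb-ip \
    --region=us-central1 --format='get(address)'
```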
-
Load distribution algorithm
- By default, to distribute traffic to instances, the session affinity value is set to NONE
- Cloud Load Balancing picks an instance based on a hash of the source IP and port, destination IP and port, and protocol
- Incoming TCP connections are spread across instances, and each new connection may go to a different instance
- Regardless of the session affinity setting, all packets for a connection are directed to the same instance until the connection is closed
- Established connections are not taken into account in the load balancing process: an existing connection has no impact on decisions for new incoming connections
- This can result in an imbalance among backends if long-lived TCP connections are in use
- If multiple connections from a client need to go to the same instance, choose a different session affinity setting
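- For example, session affinity is chosen when the target pool is created; the pool name and region below are placeholders:

```bash
# CLIENT_IP hashes on source and destination IP only (2-tuple), so
# all connections from one client IP land on the same backend. Other
# values are NONE (5-tuple, the default) and CLIENT_IP_PROTO (3-tuple).
gcloud compute target-pools create sticky-pool \
    --region=us-central1 \
    --session-affinity=CLIENT_IP
```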
-
Target pools
- A target pool resource defines a group of instances that should receive incoming traffic from forwarding rules
- When a forwarding rule directs traffic to a target pool, Cloud Load Balancing picks an instance from these target pools based on a hash of the source IP and port and the destination IP and port
- Target pools can only be used with forwarding rules that handle TCP and UDP traffic
- For all other protocols, create a target instance
- You must create a target pool before you can use it with a forwarding rule
- Each project can have up to 50 target pools
- For a target pool with a single VM instance, consider using the protocol forwarding feature instead
- Network Load Balancing supports Cloud Load Balancing Autoscaler, which allows users to perform autoscaling on the instance groups in a target pool based on backend utilization
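- A sketch of a typical target pool setup; every name, the region, and the zone are placeholders, and www-1/www-2 are assumed to be existing VMs:

```bash
# Create a target pool and add two existing instances to it.
gcloud compute target-pools create www-pool \
    --region=us-central1

gcloud compute target-pools add-instances www-pool \
    --region=us-central1 \
    --instances=www-1,www-2 \
    --instances-zone=us-central1-a

# If the backends come from a managed instance group, that group can
# be autoscaled, for example on CPU utilization.
gcloud compute instance-groups managed set-autoscaling www-mig \
    --zone=us-central1-a \
    --max-num-replicas=10 \
    --target-cpu-utilization=0.6
```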
-
Forwarding rules
- Forwarding rules work in conjunction with target pools to support load balancing
- To use load balancing, create a forwarding rule that directs traffic to specific target pools
- It is not possible to load balance traffic without a forwarding rule
- Each forwarding rule matches a particular IP address, protocol, and optionally, port range to a single target pool
- When traffic is sent to an external IP address that is served by a forwarding rule, the forwarding rule directs that traffic to the corresponding target pool
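- Continuing the hypothetical names from the sketches above (www-pool and network-lb-ip must already exist in the same region), a TCP forwarding rule might look like this:

```bash
# Send TCP port-80 traffic arriving at the reserved VIP to the pool.
gcloud compute forwarding-rules create www-rule \
    --region=us-central1 \
    --ip-protocol=TCP \
    --ports=80 \
    --address=network-lb-ip \
    --target-pool=www-pool
```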
-
Multiple forwarding rules
- Users can configure multiple regional external forwarding rules for the same external TCP/UDP network load balancer
- Optionally, each forwarding rule can have a different regional external IP address, or multiple forwarding rules can have the same regional external IP address
- Configuring multiple regional external forwarding rules can be useful for configuring more than one external IP address for the same target pool
- They are also useful for configuring different port ranges or different protocols, by using the same external IP address, for the same target pool (see the sketch after this list)
- When using multiple forwarding rules, configure the software running on backend VMs so that it binds to all necessary IP addresses
- This is required because the destination IP address for packets delivered through the load balancer is the regional external IP address associated with the respective regional external forwarding rule
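- A sketch of two rules that share one regional external IP address but match different protocols and ports; all names carry over from the earlier hypothetical examples:

```bash
# Both rules use the same VIP and the same target pool.
gcloud compute forwarding-rules create tcp-80-rule \
    --region=us-central1 --ip-protocol=TCP --ports=80 \
    --address=network-lb-ip --target-pool=www-pool

gcloud compute forwarding-rules create udp-53-rule \
    --region=us-central1 --ip-protocol=UDP --ports=53 \
    --address=network-lb-ip --target-pool=www-pool
```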
-
Health checks
- Health checks ensure that Compute Engine forwards new connections only to instances that are up and ready to receive them
- Compute Engine sends health check requests to each instance at the specified frequency
- After an instance exceeds its allowed number of health check failures, it is no longer considered an eligible instance for receiving new traffic
- Existing connections are not actively terminated, which allows instances to shut down gracefully and close TCP connections
- The health checker continues to query unhealthy instances, and returns an instance to the pool once the specified number of successful checks occurs
- If all instances are marked as UNHEALTHY, the load balancer directs new traffic to all existing instances
- Network Load Balancing relies on legacy HTTP health checks to determine instance health
- Even if the service does not use HTTP, a basic web server must be run on each instance that the health check system can query
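- A sketch of a legacy HTTP health check attached to the hypothetical www-pool; the /healthz path, thresholds, and names are assumptions:

```bash
# Create a legacy HTTP health check. It probes port 80 on each
# backend, so an HTTP responder must be listening there even if the
# load-balanced service itself isn't HTTP.
gcloud compute http-health-checks create basic-check \
    --port=80 \
    --request-path=/healthz \
    --check-interval=5s \
    --healthy-threshold=2 \
    --unhealthy-threshold=3

# Attach the check to the target pool.
gcloud compute target-pools add-health-checks www-pool \
    --region=us-central1 \
    --http-health-check=basic-check
```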
-
Return path
- Google Cloud uses special routes not defined in the VPC network for health checks
-
Firewall rules
- Health checks for network load balancers are sent from specific IP ranges
- Create ingress allow firewall rules that permit traffic from those ranges
- In addition to the IP ranges for health check probes, backends might also receive health check traffic from their metadata servers, 169.254.169.254.
- Each backend VM can receive packets from its metadata server because that is always allowed traffic
- Network Load Balancing is a pass-through load balancer, which means that firewall rules must allow traffic from the client source IP addresses
- If the service is open to the internet, it is easiest to allow traffic from all IP ranges
- To restrict access so that only certain source IP addresses are allowed, set up firewall rules to enforce that restriction, but allow access from the health check IP ranges
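- A sketch of the two ingress allow rules, assuming the default network and a service on TCP port 80; the source ranges shown are the ones Google documents for network load balancer health checks:

```bash
# Allow legacy health check probes to reach the backends.
gcloud compute firewall-rules create allow-nlb-health-checks \
    --network=default \
    --allow=tcp:80 \
    --source-ranges=35.191.0.0/16,209.85.152.0/22,209.85.204.0/22

# Allow client traffic; tighten --source-ranges to restrict access.
gcloud compute firewall-rules create allow-nlb-clients \
    --network=default \
    --allow=tcp:80 \
    --source-ranges=0.0.0.0/0
```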
-
Session affinity
- Network Load Balancing doesn't use backend service session affinity
- Instead, network load balancers configure session affinity on their target pools
-
Load balancing and fragmented UDP packets
- Unfragmented packets are handled normally in all configurations
- UDP packets may become fragmented before reaching Google Cloud
- Intervening networks may wait for all fragments to arrive before forwarding them, causing delay, or may drop fragments
- Google Cloud does not wait for all fragments; it forwards each fragment as soon as it arrives
- Because subsequent UDP fragments do not contain the destination port, problems can occur if the target pool's session affinity is set to NONE (5-tuple affinity)
- The subsequent fragments may be dropped because the load balancer cannot calculate the 5-tuple hash
- If there is more than one UDP forwarding rule for the same load-balanced IP address, subsequent fragments may arrive at the wrong forwarding rule.
- For fragmented UDP packets, set session affinity to CLIENT_IP_PROTO or CLIENT_IP.
- Do not use NONE (5-tuple hashing).
- Because CLIENT_IP_PROTO and CLIENT_IP do not use the destination port for hashing, they can calculate the same hash for subsequent fragments as for the first fragment
- Use only one UDP forwarding rule per load-balanced IP address.
- This ensures that all fragments arrive at the same forwarding rule.
- With these settings, UDP fragments from the same packet are forwarded to the same instance for reassembly.
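- Putting those recommendations together as a sketch; all names, the region, and the port are placeholders:

```bash
# Non-5-tuple affinity, so fragments without a destination port still
# hash to the same backend as the first fragment.
gcloud compute target-pools create udp-frag-pool \
    --region=us-central1 \
    --session-affinity=CLIENT_IP_PROTO

# Exactly one UDP forwarding rule for this VIP, so all fragments
# arrive at the same rule.
gcloud compute forwarding-rules create udp-only-rule \
    --region=us-central1 --ip-protocol=UDP --ports=5000 \
    --address=network-lb-ip --target-pool=udp-frag-pool
```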