-
Overview
- A regional load balancer that enables users to run and scale services behind an internal load balancing IP address that is accessible only to internal virtual machine (VM) instances
- Distributes traffic among VM instances in the same region in a Virtual Private Cloud (VPC) network by using an internal IP address
- An Internal TCP/UDP Load Balancing service has a frontend (the forwarding rule) and a backend (the backend service and instance groups)
-
Protocols, scheme, and scope
- Each internal TCP/UDP load balancer supports either TCP or UDP traffic (not both)
- Supports backend service with load balancing scheme: internal
- Backend VMs in one VPC network
- Backend VMs in all zones within one region
- Clients in any region if global access is enabled
- An internal TCP/UDP load balancer doesn't support backend VMs in multiple regions
- Does not support balancing traffic that originates from the internet, unless you're using it with an external load balancer
-
Client access
- Enable global access to allow client VM instances from any region to access the internal TCP/UDP load balancer
- The client VM must be in the same VPC network as the load balancer or in a VPC network connected to it by using VPC Network Peering
- With global access disabled, clients must be in the same region as the load balancer
- They also must be in the same VPC network as the load balancer or in a VPC network that is connected to the load balancer's VPC network by using VPC Network Peering
- On-premises clients can access the load balancer through Cloud VPN tunnels or interconnect attachments (VLANs)
- These tunnels or attachments must be in the same region as the load balancer
- With global access enabled, clients can be in any region
- They still must be in the same VPC network as the load balancer or in a VPC network that's connected to the load balancer's VPC network by using VPC Network Peering
- On-premises clients can access the load balancer through Cloud VPN tunnels or interconnect attachments (VLANs)
- These tunnels or attachments can be in any region
- Access an internal TCP/UDP load balancer in a VPC network from a connected network by using VPC Network Peering, Cloud VPN, or Cloud Interconnect
-
Three-tier web service
- Use Internal TCP/UDP Load Balancing in conjunction with other load balancers
- Where external HTTP(S) load balancers are incorporated, the external load balancer is the web tier and relies on services behind the internal load balancer
- With global access enabled, web-tier VMs can be in another region
- A globally-available internet-facing web tier can be configured to load balance traffic with HTTP(S) Load Balancing
- An internal backend load-balanced database tier that is accessed by the global web tier
- A client VM that is part of the web tier that accesses the internal load-balanced database tier
-
Characteristics
- It's a managed service
- It's not a proxy
- It's implemented in virtual networking
- Unlike a device-based or VM instance-based load balancer, an internal TCP/UDP load balancer doesn't terminate connections from clients.
- Instead of traffic being sent to a load balancer and then to backends, clients send traffic to backends directly
- There's no intermediate device or single point of failure
- Client requests to the load balancer's IP address go directly to the backend VMs
- Responses from the backend VMs go directly to the clients, not back through the load balancer
- TCP responses use direct server return
- The Google Cloud Linux guest environment, Windows guest environment, or an equivalent process configures each backend VM with the IP address of the load balancer
- Google Cloud virtual networking manages traffic delivery, scaling as appropriate
-
Architecture
- An internal TCP/UDP load balancer with multiple backend instance groups distributes connections among backend VMs in all of those instance groups
- Any type of instance group (unmanaged instance groups, zonal managed instance groups, or regional managed instance groups), but not network endpoint groups (NEGs), can be used as backends for the load balancer
- The High availability section describes how to design an internal load balancer that is not dependent on a single zone
- Instances that participate as backend VMs for internal TCP/UDP load balancers must be running the appropriate Linux or Windows guest environment or other processes that provide equivalent functionality
- The guest environment must be able to contact the metadata server (metadata.google.internal, 169.254.169.254) to read instance metadata so that it can generate local routes to accept traffic sent to the load balancer's internal IP address (see the sketch at the end of this section)
- Internal TCP/UDP Load Balancing can be used with either a custom mode or auto mode VPC network
- Internal TCP/UDP load balancers can be created with an existing legacy network
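- The snippet below is a minimal Python sketch of the metadata lookup mentioned above: it reads the forwarded IP addresses for the first network interface, which is how a guest agent can learn the load balancer IP it must accept; the forwarded-ips path and response format are assumptions based on the standard metadata server, not the actual guest environment code

```python
# Sketch: ask the metadata server which IP addresses are forwarded to this VM
# (for example, an internal load balancer's forwarding rule IP). Only works
# when run on a Compute Engine VM; the forwarded-ips entry is an assumption.
import urllib.request

METADATA_URL = ("http://metadata.google.internal/computeMetadata/v1/"
                "instance/network-interfaces/0/forwarded-ips/?recursive=true")

def forwarded_ips() -> str:
    request = urllib.request.Request(METADATA_URL,
                                     headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.read().decode()

if __name__ == "__main__":
    # The guest environment uses addresses like these to add local routes so
    # the VM accepts traffic sent to the load balancer's internal IP address.
    print(forwarded_ips())
```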
-
High availability
- The internal TCP/UDP load balancer is highly available by design
- There are no special steps to make the load balancer highly available because the mechanism doesn't rely on a single device or VM instance
- To ensure that backend VM instances are deployed to multiple zones, use regional managed instance groups if your software can be deployed by using instance templates
- Regional managed instance groups automatically distribute VM instances among multiple zones, providing the best option to avoid potential issues in any given zone
- If zonal managed instance groups or unmanaged instance groups are configured, use multiple instance groups in different zones (in the same region) for the same backend service
- Using multiple zones protects against potential issues in any given zone
-
Internal IP address
- Internal TCP/UDP Load Balancing uses an internal IPv4 address from the primary IP range of the subnet that is selected when the internal forwarding rule is created
- The IP address can't be from a secondary IP range of the subnet
- Specify the IP address for an internal TCP/UDP load balancer when the forwarding rule is created
- Either an ephemeral IP address can be provisioned or a reserved IP address can be used (a sketch of a reservation request follows this list)
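- As an illustration, the following Python sketch builds a request body for reserving a static internal address; the field names follow the Compute Engine REST API as recalled, and the resource name, subnet, and address are hypothetical

```python
# Sketch: request body for reserving a static internal IP address that an
# internal forwarding rule can then use. Field names are as recalled from the
# Compute Engine addresses API; name, subnet, and address are hypothetical.
import json

address_body = {
    "name": "ilb-ip",                      # hypothetical resource name
    "addressType": "INTERNAL",             # internal, not external
    "subnetwork": "regions/us-central1/subnetworks/backend-subnet",
    "address": "10.128.0.100",             # must come from the subnet's primary IP range
}

print(json.dumps(address_body, indent=2))
```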
-
Forwarding rules
- A forwarding rule specifies the protocol and ports on which the load balancer accepts traffic
- Because internal TCP/UDP load balancers are not proxies, they pass traffic to backends on the same protocol and port
- An internal TCP/UDP load balancer requires at least one internal forwarding rule
- Define multiple forwarding rules for the same load balancer
- The forwarding rule must reference a specific subnet in the same VPC network and region as the load balancer's backend components
- The subnet specified for the forwarding rule doesn't need to be the same as any of the subnets used by backend VMs; however, the subnet must be in the same region as the forwarding rule
- When an internal forwarding rule is created, Google Cloud chooses an available regional internal IP address from the primary IP address range of the subnet selected
- Alternatively, specify an internal IP address in the subnet's primary IP range
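- The following Python sketch shows a plausible request body for an internal forwarding rule; field names are as recalled from the Compute Engine REST API, and all resource names and addresses are hypothetical

```python
# Sketch: request body for an internal forwarding rule. Field names are as
# recalled from the Compute Engine forwardingRules API; names, subnet, and
# backend service are hypothetical. Omit "IPAddress" for an ephemeral address.
import json

forwarding_rule_body = {
    "name": "ilb-forwarding-rule",
    "loadBalancingScheme": "INTERNAL",
    "IPProtocol": "TCP",                 # TCP or UDP, matching the backend service
    "ports": ["80"],                     # one to five ports (or use "allPorts": True)
    "IPAddress": "10.128.0.100",         # optional reserved IP in the subnet's primary range
    "subnetwork": "regions/us-central1/subnetworks/backend-subnet",
    "backendService": "regions/us-central1/backendServices/ilb-backend-service",
    "allowGlobalAccess": False,          # set to True to allow clients in any region
}

print(json.dumps(forwarding_rule_body, indent=2))
```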
-
Forwarding rules and global access
- An internal TCP/UDP load balancer's forwarding rules are regional, even when global access is enabled
- After global access is enabled, the regional internal forwarding rule's allowGlobalAccess flag is set to true
-
Forwarding rules and port specifications
- When an internal forwarding rule is created, specify at least one and up to five ports by number, or specify ALL to forward traffic on all ports
- Google Kubernetes Engine (GKE) doesn't support creating a Service of type LoadBalancer with an internal forwarding rule that uses all ports
- Users can manually create an internal TCP/UDP load balancer with an internal forwarding rule that uses all ports for GKE nodes
- An internal forwarding rule that supports either all TCP ports or all UDP ports allows backend VMs to run multiple applications, each on its own port
- Traffic sent to a given port is delivered to the corresponding application, and all applications use the same IP address
- To forward traffic on more than five specific ports, combine firewall rules with forwarding rules
- When the forwarding rule is created, specify all ports, and then create ingress allow firewall rules that only permit traffic to the desired ports
- Apply the firewall rules to the backend VMs
- A forwarding rule cannot be modified after it is created
- If the specified ports or the internal IP address for an internal forwarding rule needs to be changed, it must be deleted and recreated
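- The sketch below pairs an all-ports internal forwarding rule with an ingress allow firewall rule limited to the desired ports, as described above; field names are as recalled from the Compute Engine REST API, and the names, tags, and ranges are hypothetical

```python
# Sketch: forward on all ports, then restrict reachable ports with an ingress
# allow firewall rule applied to the backend VMs. Field names are as recalled
# from the Compute Engine API; all names, tags, and ranges are hypothetical.
import json

forwarding_rule_body = {
    "name": "ilb-all-ports-rule",
    "loadBalancingScheme": "INTERNAL",
    "IPProtocol": "TCP",
    "allPorts": True,                    # instead of listing up to five ports
    "subnetwork": "regions/us-central1/subnetworks/backend-subnet",
    "backendService": "regions/us-central1/backendServices/ilb-backend-service",
}

firewall_body = {
    "name": "allow-ilb-ports",
    "network": "global/networks/my-vpc",
    "direction": "INGRESS",
    "sourceRanges": ["10.0.0.0/8"],      # hypothetical client range
    "targetTags": ["ilb-backend"],       # tag carried by the backend VMs
    "allowed": [
        {"IPProtocol": "tcp",
         "ports": ["80", "443", "8080", "8443", "9000", "9443"]},  # more than five ports
    ],
}

print(json.dumps({"forwardingRule": forwarding_rule_body,
                  "firewall": firewall_body}, indent=2))
```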
-
Multiple forwarding rules
- Configure multiple internal forwarding rules for the same internal load balancer
- Each forwarding rule must have a unique IP address and can only reference a single backend service
- Multiple internal forwarding rules can reference the same backend service
- Configuring multiple internal forwarding rules can be useful if more than one IP address is needed for the same internal TCP/UDP load balancer or to associate certain ports with different IP addresses
- When using multiple internal forwarding rules, configure the software running on backend VMs so that it binds to all necessary IP addresses
- The destination IP address for a packet delivered through the load balancer is the internal IP address associated with the respective internal forwarding rule
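- A minimal Python sketch of a backend process that binds to each forwarding rule's IP address explicitly (the addresses and port are hypothetical); binding a single socket to 0.0.0.0 instead would also accept traffic for every forwarding rule IP routed to the VM

```python
# Sketch: listen on two internal forwarding rule IP addresses so packets
# delivered through either rule are accepted. Addresses and port are
# hypothetical; the guest environment must already have added local routes.
import socket

FORWARDING_RULE_IPS = ["10.128.0.100", "10.128.0.101"]
PORT = 80

listeners = []
for ip in FORWARDING_RULE_IPS:
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((ip, PORT))   # bind to each forwarding rule's IP explicitly
    sock.listen()
    listeners.append(sock)

# An accept loop (omitted here) would then serve connections on both sockets.
```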
-
Backend service
- Each internal TCP/UDP load balancer has one regional internal backend service that defines backend parameters and behavior
- The name of the backend service is the name of the internal TCP/UDP load balancer shown in the Google Cloud Console
- Each backend service defines a protocol
- A backend service accepts either TCP or UDP traffic, but not both, on the ports specified by one or more internal forwarding rules
- The backend service allows traffic to be delivered to backend VMs on the same ports to which traffic was sent
- The backend service protocol must match the forwarding rule's protocol
- A backend service allows traffic to be distributed according to a configurable session affinity
- A backend service must have an associated health check.
- Each backend service operates in a single region and distributes traffic for backend VMs in a single VPC network
- Backends are instance groups in the same region as the backend service (and forwarding rule)
- The backends can be unmanaged instance groups, zonal managed instance groups, or regional managed instance groups
- All backend VMs must have a network interface in the VPC network associated with the backend service
- Either explicitly specify a backend service's network or use an implied network.
- Every internal forwarding rule's subnet must be in the backend service's VPC network
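- As a sketch, a regional internal backend service body might look like the following; field names are as recalled from the Compute Engine REST API, and the resource names, network, health check, and instance groups are hypothetical

```python
# Sketch: request body for the regional internal backend service. Field names
# are as recalled from the Compute Engine backendServices API; the names,
# network, health check, and instance groups are hypothetical.
import json

backend_service_body = {
    "name": "ilb-backend-service",
    "loadBalancingScheme": "INTERNAL",
    "protocol": "TCP",                    # must match the forwarding rule's protocol
    "network": "global/networks/my-vpc",  # backend VMs need an interface in this network
    "healthChecks": ["global/healthChecks/ilb-health-check"],
    "sessionAffinity": "NONE",            # see the session affinity options below
    "backends": [
        {"group": "zones/us-central1-a/instanceGroups/ig-a"},
        {"group": "zones/us-central1-c/instanceGroups/ig-c"},
    ],
}

print(json.dumps(backend_service_body, indent=2))
```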
-
Backend services and network interfaces
- Each backend service operates in a single VPC network and Google Cloud region
- Different backend VMs in the same unmanaged instance group might use different interface identifiers if each VM has an interface in the specified VPC network
-
Health check
- The load balancer's backend service must be associated with a health check
- Special routes outside of the VPC network facilitate communication between health check systems and the backends
- Use an existing health check or define a new one
- The internal TCP/UDP load balancers use health check status to determine how to route new connections
- The protocol of the health check does not have to match the protocol of the load balancer
- HTTP, HTTPS, or HTTP/2: if backend VMs serve traffic by using HTTP, HTTPS, or HTTP/2, it's best to use a health check that matches that protocol because HTTP-based health checks offer options appropriate to that protocol
- Serving HTTP-type traffic through an internal TCP/UDP load balancer means that the load balancer's protocol is TCP
- SSL or TCP: if the backend VMs do not serve HTTP-type traffic, use either an SSL or TCP health check
- Regardless of the type of health check, Google Cloud sends health check probes to the IP address of the internal TCP/UDP load balancer, simulating how load-balanced traffic is delivered
- Software running on backend VMs must respond to both load-balanced traffic and health check probes sent to the IP address of the load balancer
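- For backends that serve HTTP, a health check body could look like this sketch; field names are as recalled from the Compute Engine REST API, and the port, path, and timing values are hypothetical

```python
# Sketch: an HTTP health check for backends that serve HTTP traffic. Field
# names are as recalled from the Compute Engine healthChecks API; the port,
# request path, and timing values are hypothetical.
import json

health_check_body = {
    "name": "ilb-health-check",
    "type": "HTTP",
    "httpHealthCheck": {"port": 80, "requestPath": "/healthz"},
    "checkIntervalSec": 5,
    "timeoutSec": 5,
    "healthyThreshold": 2,
    "unhealthyThreshold": 2,
}

print(json.dumps(health_check_body, indent=2))
```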
-
Health checks and UDP traffic
- Google Cloud does not offer a health check that uses the UDP protocol
- When Internal TCP/UDP Load Balancing is used with UDP traffic, run a TCP-based service on backend VMs to provide health check information
- In this configuration, client requests are load balanced by using the UDP protocol, and a TCP service is used to provide information to Google Cloud health check probers
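- A minimal Python sketch of such a TCP health endpoint running next to a UDP service: it answers HTTP probes with 200 while the UDP service is considered healthy; the port and the udp_service_is_healthy() placeholder are hypothetical

```python
# Sketch: a TCP (HTTP) health endpoint that runs alongside a UDP service and
# answers Google Cloud health check probes, since UDP health checks are not
# offered. Binding to "" (all addresses) lets it also answer probes addressed
# to the load balancer's IP. Port 8080 and the health test are placeholders.
from http.server import BaseHTTPRequestHandler, HTTPServer

def udp_service_is_healthy() -> bool:
    # Placeholder: replace with a real liveness check of the local UDP service.
    return True

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200 if udp_service_is_healthy() else 503)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), HealthHandler).serve_forever()
```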
-
TCP and UDP request and return packets
- When a client system sends a TCP or UDP packet to an internal TCP/UDP load balancer, the packet's source is the client's primary internal IP address or an IP address from one of the client's alias IP ranges
- The packet's destination is the IP address of the load balancer's forwarding rule
- When the load balancer sends a response packet, that packet's source and destination depend on the protocol
- TCP is connection-oriented, and internal TCP/UDP load balancers use direct server return
- This means that response packets are sent from the IP address of the load balancer's forwarding rule
- In contrast, UDP is connectionless
- By default, return packets are sent from the primary internal IP address of the backend instance's network interface
- However, users can change this behavior
- Configuring a UDP server to bind to the forwarding rule's IP address causes response packets to be sent from the forwarding rule's IP address
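- A minimal Python sketch of a UDP echo server bound to a hypothetical forwarding rule IP, so that replies are sourced from the load balancer's IP rather than the VM's primary internal IP

```python
# Sketch: bind the UDP server to the forwarding rule's IP address (hypothetical
# 10.128.0.100) so reply packets use that IP as their source. Binding to
# "0.0.0.0" instead would leave replies sourced from the VM's primary internal
# IP, which is the default behavior described above.
import socket

FORWARDING_RULE_IP = "10.128.0.100"   # hypothetical load balancer IP
PORT = 5000                           # hypothetical UDP port

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind((FORWARDING_RULE_IP, PORT))

while True:
    data, client = sock.recvfrom(4096)
    sock.sendto(data, client)         # sourced from the bound forwarding rule IP
```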
-
Traffic distribution
- An internal TCP/UDP load balancer distributes new connections depending on whether failover is configured
- If failover is not configured, an internal TCP/UDP load balancer distributes new connections among all of its healthy backend VMs if at least one backend VM is healthy
- When all backend VMs are unhealthy, the load balancer distributes new connections among all backends as a last resort
- If failover is configured, an internal TCP/UDP load balancer distributes new connections among VMs in its active pool, according to the configured failover policy
- When all backend VMs are unhealthy, users can choose to drop traffic
- By default, the method for distributing new connections uses a hash calculated from five pieces of information: the client's IP address, the source port, the load balancer's internal forwarding rule IP address, the destination port, and the protocol
- Users can modify the traffic distribution method for TCP traffic by specifying a session affinity option
- The health check state controls the distribution of new connections
- An established TCP session persists on an unhealthy backend VM if the unhealthy backend VM is still handling the connection
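- The following Python sketch only illustrates the idea of five-tuple hashing over healthy backends; it is not Google Cloud's actual algorithm

```python
# Sketch: conceptual five-tuple hashing to pick one healthy backend per
# connection. Illustrative only; not Google Cloud's implementation.
import hashlib

def pick_backend(client_ip, client_port, lb_ip, dest_port, protocol, healthy_backends):
    """Map a connection's five-tuple onto one of the healthy backends."""
    key = f"{client_ip}|{client_port}|{lb_ip}|{dest_port}|{protocol}".encode()
    digest = int(hashlib.sha256(key).hexdigest(), 16)
    return healthy_backends[digest % len(healthy_backends)]

backends = ["vm-a", "vm-b", "vm-c"]
print(pick_backend("10.0.0.5", 44321, "10.128.0.100", 80, "TCP", backends))
```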
-
Session affinity options
- Session affinity controls the distribution of new connections from clients to the load balancer's backend VMs
- Set session affinity when backend VMs need to keep track of state information for their clients when sending TCP traffic
- This is a common requirement for web applications
- Session affinity works on a best-effort basis for TCP traffic
- Because the UDP protocol doesn't support sessions, session affinity doesn't affect UDP traffic
- Internal TCP/UDP load balancers support the NONE session affinity option, which is effectively the same as Client IP, protocol, and port
- Client IP directs a particular client's requests to the same backend VM based on a hash created from the client's IP address and the destination IP address
- Client IP and protocol directs a particular client's requests to the same backend VM based on a hash created from three pieces of information: the client's IP address, the destination IP address, and the load balancer's protocol (TCP or UDP)
- Client IP, protocol, and port directs a particular client's requests to the same backend VM based on a hash
- The hash is created from the source IP address of the client sending the request, the source port of the client sending the request, the destination IP address, the destination port, and the protocol (TCP or UDP)
- The destination IP address is the IP address of the load balancer's forwarding rule, unless packets are delivered to the load balancer because of a custom static route
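- As an illustration of the options above, the Python sketch below shows which packet fields each session affinity setting would hash (the option names follow the backend service's sessionAffinity values as recalled); it is conceptual, not Google Cloud's implementation

```python
# Sketch: the packet fields each session affinity option hashes. Option names
# follow the backend service sessionAffinity values as recalled; conceptual only.
def affinity_key(option, src_ip, src_port, dst_ip, dst_port, protocol):
    if option == "CLIENT_IP":                 # Client IP: 2-tuple
        return (src_ip, dst_ip)
    if option == "CLIENT_IP_PROTO":           # Client IP and protocol: 3-tuple
        return (src_ip, dst_ip, protocol)
    # NONE and CLIENT_IP_PORT_PROTO both hash the full 5-tuple
    return (src_ip, src_port, dst_ip, dst_port, protocol)

# Connections with the same key are directed to the same backend VM.
print(affinity_key("CLIENT_IP", "10.0.0.5", 44321, "10.128.0.100", 80, "TCP"))
```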