-
Failover for Internal TCP/UDP Load Balancing
- Configure an internal TCP/UDP load balancer to distribute connections among virtual machine (VM) instances in primary backends, and then, if needed, switch to using failover backends
- Failover provides one method of increasing availability, while also giving you greater control over how to manage your workload when primary backend VMs aren't healthy
- Configuring failover modifies the internal TCP/UDP load balancer's standard traffic distribution algorithm
- By default, when a backend is added to an internal TCP/UDP load balancer's backend service, that backend is a primary backend
- A backend can be designated to be a failover backend when it is added to the load balancer's backend service, or by editing the backend service later
- Failover backends receive connections from the load balancer only after the ratio of healthy primary VMs falls below a configurable failover ratio (see the sketch below)
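As a rough illustration of how these pieces fit together, the sketch below models a backend service with one primary backend, one failover backend, and a failover policy as a plain Python dict. The field names mirror the Compute API's backendServices resource, but the service name, zones, and values are assumptions for illustration, not a real configuration call.

```python
# Minimal sketch: a plain dict that models how failover might be expressed on
# a backend service. The field names (failover, failoverRatio,
# dropTrafficIfUnhealthy, disableConnectionDrainOnFailover) mirror the
# Compute API's backendServices resource, but nothing here calls Google Cloud.
backend_service = {
    "name": "be-ilb",  # hypothetical backend service name
    "backends": [
        # Primary backend: the default role when a backend is added
        {"group": "zones/us-central1-a/instanceGroups/ig-primary", "failover": False},
        # Failover backend: designated at add time or by editing the backend later
        {"group": "zones/us-central1-b/instanceGroups/ig-failover", "failover": True},
    ],
    "failoverPolicy": {
        "failoverRatio": 0.5,             # healthy-primary threshold for failover
        "dropTrafficIfUnhealthy": False,  # keep last-resort distribution (default)
        "disableConnectionDrainOnFailover": False,
    },
}

# A backend's role is read from its failover flag
failover_groups = [b["group"] for b in backend_service["backends"] if b["failover"]]
print(failover_groups)  # ['zones/us-central1-b/instanceGroups/ig-failover']
```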
-
Supported instance groups
- Managed and unmanaged instance groups are supported as backends
- Using managed instance groups with autoscaling and failover might cause the active pool to repeatedly fail over and fail back between the primary and failover backends
- Google Cloud does not prevent users from configuring failover with managed instance groups
-
Backend instance groups and VMs
- The backend instance groups in Internal TCP/UDP Load Balancing are either primary backends or failover backends
- You can designate a backend to be a failover backend when it is added to the backend service or by editing the backend after it is added
- Otherwise, backends are primary by default
- Multiple primary backends and multiple failover backends can be configured in a single internal TCP/UDP load balancer by adding them to the load balancer's backend service
- A primary VM is a member of an instance group defined to be a primary backend
- The VMs in a primary backend participate in the load balancer's active pool, unless the load balancer switches to using its failover backends
- A backup VM is a member of an instance group defined to be a failover backend
- The VMs in a failover backend participate in the load balancer's active pool when primary VMs become unhealthy
- The proportion of primary VMs that must be unhealthy before failover is triggered is configurable (see Failover ratio below)
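The role of a VM follows from the role of its instance group, as the minimal sketch below (with hypothetical group and VM names) makes explicit.

```python
# Minimal sketch: a VM is a primary VM or a backup VM depending on whether its
# instance group was added as a primary or a failover backend. Names are
# hypothetical.
GROUP_ROLE = {
    "ig-primary": "primary",   # added as a primary backend (the default)
    "ig-failover": "backup",   # added as a failover backend
}

VM_GROUP = {
    "vm-a": "ig-primary",
    "vm-b": "ig-primary",
    "vm-c": "ig-failover",
}

def vm_role(vm: str) -> str:
    """A VM's role is inherited from its backend instance group."""
    return GROUP_ROLE[VM_GROUP[vm]]

print({vm: vm_role(vm) for vm in VM_GROUP})
# {'vm-a': 'primary', 'vm-b': 'primary', 'vm-c': 'backup'}
```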
-
Active pool
- The active pool is the collection of backend VMs to which an internal TCP/UDP load balancer sends new connections
- Membership of backend VMs in the active pool is computed automatically, based on which backends are healthy and on the conditions you specify in the failover policy
- The active pool never combines primary VMs and backup VMs
- During failover, the active pool contains only backup VMs
- During normal operation (failback), the active pool contains only primary VMs
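A minimal sketch of the active-pool rule: the pool holds either the primary VMs or the backup VMs, never both at once. How the failed-over state is decided is covered under Failover ratio below; here it is just a boolean input, and the VM names are placeholders.

```python
# Minimal sketch: the active pool never mixes primary and backup VMs.
def active_pool(primary_vms, backup_vms, failed_over: bool):
    """Return the VMs that receive new connections."""
    return list(backup_vms) if failed_over else list(primary_vms)

print(active_pool(["vm-a", "vm-b"], ["vm-c"], failed_over=False))  # ['vm-a', 'vm-b']
print(active_pool(["vm-a", "vm-b"], ["vm-c"], failed_over=True))   # ['vm-c']
```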
-
Failover and failback
- Failover and failback are the automatic processes that switch backend VMs into or out of the load balancer's active pool
- When Google Cloud removes primary VMs from the active pool and adds healthy failover VMs to the active pool, the process is called failover
- When Google Cloud reverses this, the process is called failback
-
Failover policy
- A failover policy is a collection of parameters that Google Cloud uses for failover and failback
- Each internal TCP/UDP load balancer has one failover policy with multiple settings: the failover ratio, whether to drop traffic when all backend VMs are unhealthy, and connection draining on failover and failback
-
Failover ratio
- A configurable failover ratio determines when Google Cloud performs a failover or failback, changing membership in the active pool
- A failover ratio of 1.0 requires that all primary VMs be healthy
- When at least one primary VM becomes unhealthy, Google Cloud performs a failover, moving the backup VMs into the active pool
- A failover ratio of 0.1 requires that at least 10% of the primary VMs be healthy; otherwise, Google Cloud performs a failover
- A failover ratio of 0.0 means that Google Cloud performs a failover only when all the primary VMs are unhealthy
- Failover doesn't happen if at least one primary VM is healthy
- An internal TCP/UDP load balancer distributes connections among VMs in the active pool according to the traffic distribution algorithm
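A small sketch of the failover-ratio thresholds described above. It assumes the rule is "fail over when the fraction of healthy primary VMs drops below the configured ratio, or when no primary VM is healthy" (the latter covers the 0.0 case); the exact boundary behavior is an assumption here.

```python
# Minimal sketch of the failover-ratio decision described above.
def should_fail_over(healthy_primaries: int, total_primaries: int, ratio: float) -> bool:
    if healthy_primaries == 0:
        return True                      # ratio 0.0: fail over only in this case
    return healthy_primaries / total_primaries < ratio

# Worked examples matching the bullets above (10 primary VMs):
print(should_fail_over(9, 10, 1.0))   # True  — ratio 1.0: one unhealthy VM triggers failover
print(should_fail_over(1, 10, 0.1))   # False — ratio 0.1: 10% healthy is still enough
print(should_fail_over(0, 10, 0.1))   # True  — fewer than 10% healthy
print(should_fail_over(1, 10, 0.0))   # False — ratio 0.0: fail over only when none are healthy
```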
-
Dropping traffic when all backend VMs are unhealthy
- By default, when all primary and backup VMs are unhealthy, Google Cloud distributes new connections among all primary VMs as a last resort
- You can instead configure the internal TCP/UDP load balancer to drop new connections when all primary and backup VMs are unhealthy
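A minimal sketch of the two behaviors described above. The flag name follows the Compute API's dropTrafficIfUnhealthy field, but the function is only a local model of the rule, not load balancer code.

```python
# Minimal sketch of the "all backends unhealthy" behavior described above.
def targets_for_new_connections(healthy_active_pool, all_primary_vms,
                                drop_traffic_if_unhealthy: bool):
    if healthy_active_pool:
        return list(healthy_active_pool)  # normal case
    if drop_traffic_if_unhealthy:
        return []                         # drop new connections
    return list(all_primary_vms)          # default last resort: all primary VMs

print(targets_for_new_connections([], ["vm-a", "vm-b"], drop_traffic_if_unhealthy=False))
# ['vm-a', 'vm-b'] — last-resort distribution
print(targets_for_new_connections([], ["vm-a", "vm-b"], drop_traffic_if_unhealthy=True))
# [] — new connections are dropped
```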
-
Connection draining on failover and failback
- Connection draining allows existing TCP sessions to remain active for up to a configurable time period even after backend VMs become unhealthy
-
If the load balancer's protocol is TCP
- By default, connection draining is enabled
- Existing TCP sessions can persist on a backend VM for up to 300 seconds (5 minutes), even if the backend VM becomes unhealthy or isn't in the load balancer's active pool
- You can disable connection draining during failover and failback events
- Disabling connection draining ensures that all TCP sessions, including established ones, are terminated quickly
- Connections to backend VMs might be closed with a TCP reset (RST) packet
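A minimal sketch of how long existing TCP sessions may persist on a VM that has just left the active pool, assuming the 300-second default described above. The flag name follows the Compute API's disableConnectionDrainOnFailover field; the function is only an illustration of the documented behavior.

```python
# Minimal sketch: how long existing TCP sessions may persist on a VM that has
# left the active pool, under the 300-second default drain timeout.
def drain_window_seconds(disable_connection_drain_on_failover: bool,
                         drain_timeout_sec: int = 300) -> int:
    if disable_connection_drain_on_failover:
        return 0                    # sessions are terminated quickly (possibly with RST)
    return drain_timeout_sec        # sessions may persist up to the drain timeout

print(drain_window_seconds(False))  # 300
print(drain_window_seconds(True))   # 0
```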
-
Disabling connection draining on failover and failback is useful for scenarios such as
- Patching backend VMs. Prior to patching, configure your primary VMs to fail health checks so that the load balancer performs a failover (one way to do this is sketched after this list)
- Disabling connection draining ensures that all connections are moved to the backup VMs quickly and in a planned fashion
- This allows users to install updates and restart the primary VMs without existing connections persisting
- After patching, Google Cloud can perform a failback when a sufficient number of primary VMs (as defined by the failover ratio) pass their health check
- If you need to ensure that only one primary VM is the destination for all connections, disable connection draining so that switching from a primary VM to a backup VM does not allow existing connections to persist on both
- This reduces the possibility of data inconsistencies by keeping just one backend VM active at any given time
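For the patching scenario above, one hypothetical way to make primary VMs fail their health checks on purpose is a health endpoint that returns 503 while a local maintenance flag file exists. The port, path, and file name below are placeholders, not anything prescribed by the load balancer.

```python
# Hypothetical sketch: an HTTP health endpoint that fails on demand. Touching
# the maintenance flag file makes the load balancer's health check fail, which
# can trigger a planned failover before patching; removing the file allows
# failback once enough primary VMs pass health checks again.
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

MAINTENANCE_FLAG = "/var/run/maintenance"  # create this file to fail health checks

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            status = 503 if os.path.exists(MAINTENANCE_FLAG) else 200
            self.send_response(status)
            self.end_headers()
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    # The load balancer's health check would be pointed at this port and path.
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```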