1. Overview
    1. Google Cloud Internal HTTP(S) Load Balancing is a proxy-based, regional Layer 7 load balancer that enables users to run and scale services behind an internal load balancing IP address
    2. Internal HTTP(S) Load Balancing distributes HTTP and HTTPS traffic to backends hosted on Compute Engine and Google Kubernetes Engine (GKE)
    3. The load balancer is accessible only in the chosen region of the Virtual Private Cloud (VPC) network on an internal IP address
    4. Internal HTTP(S) Load Balancing is a managed service based on the open source Envoy proxy
    5. It enables rich traffic control capabilities based on HTTP(S) parameters
    6. After the load balancer has been configured, it automatically allocates Envoy proxies to meet traffic needs
    7. At a high level, an internal HTTP(S) load balancer consists of an internal IP address to which clients send traffic.
    8. Only clients that are located in the same region as the load balancer can access this IP address
    9. Internal client requests stay internal to the network and region.
    10. There can be one or more backend services to which the load balancer forwards traffic.
    11. Backends can be Compute Engine VMs, groups of Compute Engine VMs (through instance groups), or GKE nodes (through network endpoint groups [NEGs])
    12. These backends must be located in the same region as the load balancer
    13. A URL map defines traffic control rules (based on Layer 7 parameters such as HTTP headers) that map requests to specific backend services
    14. The load balancer evaluates incoming requests against the URL map to route traffic to backend services or perform additional actions (such as redirects)
    15. Health checks periodically check the status of backends, reducing the risk that client traffic is sent to a non-responsive backend
  2. Path-based routing
    1. One common use case is load balancing traffic among services
    2. An internal client can request video and image content by using the same base URL, mygcpservice.internal, with the paths /video and /images
    3. The internal HTTP(S) load balancer's URL map specifies that requests to path /video should be sent to the video backend service, while requests to path /images should be sent to the images backend service
    4. When an internal client sends a request to the load balancer's internal IP address, the load balancer evaluates the request according to this logic and sends the request to the correct backend service
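The path-based routing described above can be sketched with a regional URL map and a path matcher. This is a minimal, hypothetical example; the map, matcher, backend service names, and region are all placeholders, not values from the source:

```shell
# Hypothetical names: mygcpservice-map, web-backend, video-backend,
# images-backend; region us-west1. The default service catches any
# request that matches no path rule.
gcloud compute url-maps create mygcpservice-map \
    --default-service=web-backend \
    --region=us-west1

# Route /video and /images (and everything beneath them) to their
# dedicated backend services.
gcloud compute url-maps add-path-matcher mygcpservice-map \
    --path-matcher-name=media-matcher \
    --default-service=web-backend \
    --path-rules='/video=video-backend,/video/*=video-backend,/images=images-backend,/images/*=images-backend' \
    --region=us-west1
```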
  3. Modernizing legacy services
    1. Internal HTTP(S) Load Balancing can be an effective tool for modernizing legacy applications
    2. Deploy an internal HTTP(S) load balancer in front of a legacy application
    3. Use the load balancer's traffic control capabilities to direct a subset of traffic to new microservices that replace the functionality that the legacy application provides
    4. Configure the load balancer's URL map to route all traffic to the legacy application by default
    5. This maintains the existing behavior while the replacement services are developed; as each replacement service becomes ready, update the URL map to route a portion of traffic to it
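Shifting a portion of traffic to a replacement service is typically done with weighted backend services in the URL map, which gcloud configures through an imported YAML definition. The sketch below assumes a hypothetical project (my-project), region, and backend service names; the 90/10 split is illustrative only:

```shell
# Hypothetical sketch: send 90% of all traffic to the legacy backend
# service and 10% to a new microservice via weighted backend services.
cat > legacy-map.yaml <<'EOF'
name: legacy-map
defaultService: projects/my-project/regions/us-west1/backendServices/legacy-backend
hostRules:
- hosts:
  - '*'
  pathMatcher: split
pathMatchers:
- name: split
  defaultService: projects/my-project/regions/us-west1/backendServices/legacy-backend
  routeRules:
  - priority: 1
    matchRules:
    - prefixMatch: /
    routeAction:
      weightedBackendServices:
      - backendService: projects/my-project/regions/us-west1/backendServices/legacy-backend
        weight: 90
      - backendService: projects/my-project/regions/us-west1/backendServices/new-service-backend
        weight: 10
EOF

gcloud compute url-maps import legacy-map \
    --source=legacy-map.yaml \
    --region=us-west1
```

Increasing the weight on the new service over time completes the migration without changing the client-facing IP address.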
  4. Three-tier web services
    1. Use Internal HTTP(S) Load Balancing to support traditional three-tier web services
    2. At each tier, the load balancer type depends on the traffic type
    3. Web tier: Traffic enters from the internet and is load balanced by using an external HTTP(S) load balancer
    4. Application tier: The application tier is scaled by using a regional internal HTTP(S) load balancer
    5. Database tier: The database tier is scaled by using an internal TCP/UDP load balancer
  5. Architecture and resources
    1. An internal managed forwarding rule specifies an internal IP address, port, and regional target HTTP(S) proxy.
    2. Clients use the IP address and port to connect to the load balancer's Envoy proxies
    3. A regional target HTTP(S) proxy receives a request from the client
    4. The HTTP(S) proxy evaluates the request by using the URL map to make traffic routing decisions
    5. The proxy can also authenticate communications by using SSL certificates
    6. For Internal HTTP(S) Load Balancing, the HTTP(S) proxy uses regional SSL certificates to prove its identity to clients
    7. A target HTTP(S) proxy can reference multiple SSL certificates, up to the documented per-proxy limit
    8. The HTTP(S) proxy uses a regional URL map to make a routing determination based on HTTP attributes (such as the request path, cookies, or headers)
    9. Based on the routing decision, the proxy forwards client requests to specific regional backend services
    10. The URL map can specify additional actions to take such as rewriting headers, sending redirects to clients, and configuring timeout policies (among others)
    11. A regional backend service distributes requests to healthy backends (either instance groups containing Compute Engine VMs or NEGs containing GKE containers)
    12. One or more backends must be connected to the backend service.
    13. Backends can be managed instance groups (zonal or regional), unmanaged instance groups (zonal), or network endpoint groups (zonal)
    14. Instance groups and NEGs cannot be used on the same backend service
    15. A regional health check periodically monitors the readiness of backends
    16. This reduces the risk that requests might be sent to backends that can't service the request
    17. A proxy-only subnet is required; its IP addresses are the source of traffic sent from the load balancer's proxies to your backends
    18. Create one proxy-only subnet in each region of a VPC network in which you use internal HTTP(S) load balancers
    19. Google manages this subnet, and all internal HTTP(S) load balancers in the region share it
    20. This subnet cannot be used to host backends
    21. A firewall rule is required so that backends accept connections from the proxy-only subnet.
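The resource chain above (proxy-only subnet, firewall rule, health check, backend service, URL map, target proxy, forwarding rule) can be sketched end to end. All names, the region, the IP ranges, and the forwarding-rule address below are hypothetical placeholders:

```shell
# Proxy-only subnet: shared by all internal HTTP(S) load balancers in
# the region; its addresses are the source of proxy-to-backend traffic.
gcloud compute networks subnets create proxy-only-subnet \
    --purpose=INTERNAL_HTTPS_LOAD_BALANCER \
    --role=ACTIVE \
    --region=us-west1 \
    --network=lb-network \
    --range=10.129.0.0/23

# Firewall rule so backends accept connections from the proxy-only subnet.
gcloud compute firewall-rules create fw-allow-proxies \
    --network=lb-network \
    --action=allow \
    --direction=ingress \
    --source-ranges=10.129.0.0/23 \
    --rules=tcp:80

# Regional health check and backend service (INTERNAL_MANAGED scheme).
gcloud compute health-checks create http l7-ilb-hc \
    --region=us-west1 \
    --use-serving-port

gcloud compute backend-services create l7-ilb-backend \
    --load-balancing-scheme=INTERNAL_MANAGED \
    --protocol=HTTP \
    --health-checks=l7-ilb-hc \
    --health-checks-region=us-west1 \
    --region=us-west1

gcloud compute backend-services add-backend l7-ilb-backend \
    --instance-group=l7-ilb-mig \
    --instance-group-zone=us-west1-a \
    --region=us-west1

# URL map, regional target proxy, and internal managed forwarding rule.
gcloud compute url-maps create l7-ilb-map \
    --default-service=l7-ilb-backend \
    --region=us-west1

gcloud compute target-http-proxies create l7-ilb-proxy \
    --url-map=l7-ilb-map \
    --region=us-west1

gcloud compute forwarding-rules create l7-ilb-rule \
    --load-balancing-scheme=INTERNAL_MANAGED \
    --network=lb-network \
    --subnet=backend-subnet \
    --address=10.1.2.99 \
    --ports=80 \
    --region=us-west1 \
    --target-http-proxy=l7-ilb-proxy \
    --target-http-proxy-region=us-west1
```

Note the forwarding rule's address comes from a regular backend subnet, not the proxy-only subnet, which cannot host backends or the load balancer's IP.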
  6. Traffic types, scheme, and scope
    1. Backend services support the HTTP, HTTPS, or HTTP/2 protocols
    2. Clients and backends do not need to use the same request protocol
    3. For example, clients can send requests to the load balancer by using HTTP/2, and the load balancer can forward these requests to backends by using HTTP/1.1
    4. Because the scope of an internal HTTP(S) load balancer is regional, not global, clients and backend VMs or endpoints must all be in the same region
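The client-side and backend-side protocols are configured independently: clients negotiate HTTP/2 over TLS with a regional HTTPS proxy, while the backend protocol is a separate property of the backend service. A hedged sketch with hypothetical certificate files and resource names:

```shell
# Regional SSL certificate for the client-facing side (cert.pem and
# key.pem are placeholder files).
gcloud compute ssl-certificates create l7-ilb-cert \
    --certificate=cert.pem \
    --private-key=key.pem \
    --region=us-west1

# HTTPS proxy lets clients speak HTTPS (and HTTP/2 over TLS) to the
# load balancer.
gcloud compute target-https-proxies create l7-ilb-https-proxy \
    --url-map=l7-ilb-map \
    --ssl-certificates=l7-ilb-cert \
    --ssl-certificates-region=us-west1 \
    --region=us-west1

# The proxy-to-backend protocol is set on the backend service; here the
# proxies forward to backends over plain HTTP/1.1.
gcloud compute backend-services update l7-ilb-backend \
    --protocol=HTTP \
    --region=us-west1
```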
  7. Limitations
    1. Internal HTTP(S) Load Balancing operates at a regional level
    2. There's no guarantee that a request from a client in one zone of the region is sent to a backend that's in the same zone as the client
    3. Session affinity doesn't reduce communication between zones
    4. An internal HTTP(S) load balancer supports HTTP/2 only over TLS
    5. Google Cloud doesn't warn if the proxy-only subnet runs out of IP addresses
    6. Within each VPC network, each internal managed forwarding rule must have its own IP address
    7. The internal forwarding rule that an internal HTTP(S) load balancer uses must have exactly one port
    8. The WebSocket protocol is not supported
  8. Incompatible features
    1. Identity-Aware Proxy
    2. Cloud CDN
    3. Google Cloud Armor
    4. Cloud Storage buckets
    5. Google-managed SSL certificates
    6. SSL policies
    7. VPC Network Peering
  9. Shared VPC host project
    1. Client VMs can be located in either the host project or any connected service project
    2. The client VMs must use the same Shared VPC network and the same region as the load balancer
    3. All the load balancer's components and backends must be in the host project
    4. This differs from other Google Cloud load balancers: when the load balancer uses a Shared VPC network, none of the internal HTTP(S) load balancer's components can be in a service project
    5. The host project within the Shared VPC network owns and creates the proxy-only subnet (purpose=INTERNAL_HTTPS_LOAD_BALANCER)
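Creating the proxy-only subnet from the host project might look like the following; the project, network, region, and range are hypothetical placeholders:

```shell
# Run against the Shared VPC host project, which owns the proxy-only subnet.
gcloud compute networks subnets create proxy-only-subnet \
    --project=host-project \
    --purpose=INTERNAL_HTTPS_LOAD_BALANCER \
    --role=ACTIVE \
    --region=us-west1 \
    --network=shared-vpc-network \
    --range=10.129.0.0/23
```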