-
Overview
- Kubernetes' advanced software-defined networking (SDN) enables packet routing and forwarding for Pods, Services, and nodes in a cluster.
- Kubernetes and Google Cloud dynamically configure IP filtering rules, routing tables, and firewall rules on each node
- Do not make manual changes to nodes: GKE overrides any such changes, and the cluster may not function correctly
- The only reason to access a node directly is to debug configuration problems
- The Kubernetes networking model relies heavily on IP addresses
- Services, Pods, containers, and nodes communicate using IP addresses and ports
- Kubernetes provides different types of load balancing to direct traffic to the correct Pods
- ClusterIP: The IP address assigned to a Service. This address is stable for the lifetime of the Service
- Pod IP: The IP address assigned to a given Pod. This is ephemeral
- Node IP: The IP address assigned to a given node
- Each node has an IP address assigned from the cluster's Virtual Private Cloud (VPC) network
- This node IP provides connectivity from control components like kube-proxy and the kubelet to the Kubernetes API server
- This IP is the node's connection to the rest of the cluster
- Each node has a pool of IP addresses that GKE assigns to Pods running on that node (a /24 CIDR block by default)
- Users can optionally specify the range of IPs when creating the cluster
- The Flexible Pod CIDR range feature allows users to reduce the size of the range for Pod IPs for nodes in a given node pool
- Users can run a maximum of 110 Pods on a node with a /24 range, not 256 as you might expect
- This provides a buffer so that Pods don't become unschedulable due to a transient lack of IP addresses in the Pod IP range for a given node
- For ranges smaller than /24, roughly half as many Pods can be scheduled as there are IP addresses in the range; for example, a /25 range (128 addresses) supports a maximum of 64 Pods
- Each Pod has a single IP address assigned from the Pod CIDR range of its node
- This IP address is shared by all containers running within the Pod, and connects them to other Pods running in the cluster
- Each Service has an IP address, called the ClusterIP, assigned from the cluster's VPC network
- Users can optionally customize the VPC network when creating the cluster
- A Pod is the most basic deployable unit within a Kubernetes cluster
- A Pod runs one or more containers
- Zero or more Pods run on a node
- Each node in the cluster is part of a node pool
- In GKE, these nodes are virtual machines, each running as an instance in Compute Engine
- Pods can also attach to external storage volumes and other custom resources
- When Kubernetes schedules a Pod to run on a node, it creates a network namespace for the Pod in the node's Linux kernel
- This network namespace connects the node's physical network interface, such as eth0, with the Pod using a virtual network interface, so that packets can flow to and from the Pod
- The associated virtual network interface in the node's root network namespace connects to a Linux bridge that allows communication among Pods on the same node
- A Pod can also send packets outside of the node using the same virtual interface
- Kubernetes assigns an IP address (the Pod IP) to the virtual network interface in the Pod's network namespace from a range of addresses reserved for Pods on the node
- This address range is a subset of the IP address range assigned to the cluster for Pods, which can be configured when a cluster is created
- A container running in a Pod uses the Pod's network namespace
- From the container's point of view, the Pod appears to be a physical machine with one network interface
- All containers in the Pod see this same network interface
- Each container's localhost is connected, through the Pod, to the node's physical network interface, such as eth0
- By default, each Pod has unfiltered access to all the other Pods running on all nodes of the cluster, but users can limit access among Pods
- A Pod's IP address is an implementation detail, and should not be relied upon
- Kubernetes provides stable IP addresses using Services
- Users can assign arbitrary key-value pairs called labels to any Kubernetes resource
- Kubernetes uses labels to group multiple related Pods into a logical unit called a Service
- A Service has a stable IP address and ports, and provides load balancing among the set of Pods whose labels match all the labels defined in the label selector when the Service is created
- Kubernetes assigns a stable, reliable IP address to each newly-created Service (the ClusterIP) from the cluster's pool of available Service IP addresses
- Kubernetes also assigns a hostname to the ClusterIP, by adding a DNS entry
- The ClusterIP and hostname are unique within the cluster and do not change throughout the lifecycle of the Service
- Kubernetes only releases the ClusterIP and hostname if the Service is deleted from the cluster's configuration
- Users can reach a healthy Pod running an application using either the ClusterIP or the hostname of the Service
- Kubernetes spreads traffic as evenly as possible across the full set of Pods, running on many nodes, so a cluster can withstand an outage affecting one or more (but not all) nodes
- Kubernetes manages connectivity among Pods and Services using the kube-proxy component, which runs on each node
- kube-proxy is not an in-line proxy, but an egress-based load-balancing controller
- It watches the Kubernetes API server and continually maps the ClusterIP to healthy Pods by adding and removing destination NAT (DNAT) rules in the node's iptables subsystem
- When a container running in a Pod sends traffic to a Service's ClusterIP, the node selects a Pod at random and routes the traffic to that Pod
- When a Service is configured, users can optionally remap its listening port by defining values for port and targetPort
- The port is where clients reach the application
- The targetPort is the port where the application is actually listening for traffic within the Pod
- kube-proxy manages this port remapping by adding and removing iptables rules on the node
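- As a minimal sketch of this remapping (the Service name, label, and port numbers are illustrative, not from the source):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service            # hypothetical Service name
spec:
  selector:
    app: my-app               # any Pod carrying this label is a member of the Service
  ports:
  - protocol: TCP
    port: 80                  # port: where clients reach the application (on the ClusterIP)
    targetPort: 8080          # targetPort: where the application actually listens in the Pod
```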
- By default, Pods do not expose an external IP address, because kube-proxy manages all traffic on each node
- Pods and their containers can communicate freely, but connections from outside the cluster cannot reach a Service through its ClusterIP
- GKE provides three different types of Load Balancers to control access and to spread incoming traffic across clusters as evenly as possible
- Users can configure one Service to use multiple types of Load Balancers simultaneously
- External Load Balancers manage traffic coming from outside the cluster and outside your Google Cloud Virtual Private Cloud (VPC) network
- They use forwarding rules associated with the Google Cloud network to route traffic to a Kubernetes node
- Internal Load Balancers manage traffic coming from within the same VPC network
- Like external load balancers, they use forwarding rules associated with the Google Cloud network to route traffic to a Kubernetes node
- HTTP(S) Load Balancers are specialized external load balancers used for HTTP(S) traffic
- They use an Ingress resource rather than a forwarding rule to route traffic to a Kubernetes node
- When traffic reaches a Kubernetes node, it is handled the same way, regardless of the type of load balancer
- The load balancer is not aware of which nodes in the cluster are running Pods for its Service
- The Service balances traffic across all nodes in the cluster, even those not running a relevant Pod
- On a regional cluster, the load is spread across all nodes in all zones for the cluster's region
- When traffic is routed to a node, the node routes the traffic to a Pod, which may be running on the same node or a different node
- The node forwards the traffic to a randomly chosen Pod by using the iptables rules that kube-proxy manages on the node
- When a load balancer sends traffic to a node, the traffic might get forwarded to a Pod on a different node
- This requires extra network hops
- To avoid the extra hops, specify that traffic must go to a Pod on the same node that initially receives the traffic
- To do so, set externalTrafficPolicy to Local in the Service manifest
- When externalTrafficPolicy is set to Local, the load balancer sends traffic only to nodes that have a healthy Pod that belongs to the Service
- The load balancer uses a health check to determine which nodes have the appropriate Pods
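- A minimal sketch of such a Service follows; the name, label, and ports are illustrative:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-lb-service          # hypothetical name
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local # load balancer sends traffic only to nodes with a healthy matching Pod
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
```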
- If a Service needs to be reachable from outside the cluster and outside the VPC network, the Service can be configured as a LoadBalancer
- GKE then provisions a Network Load Balancer in front of the Service
- The Network Load Balancer is aware of all nodes in the cluster and configures the VPC network's firewall rules to allow connections to the Service from outside the VPC network, using the Service's external IP address
- A static external IP address can be assigned to the Service
- When using the external load balancer, arriving traffic is initially routed to a node using a forwarding rule associated with the Google Cloud network
- After the traffic reaches the node, the node uses its iptables NAT table to choose a Pod
- kube-proxy manages the iptables rules on the node
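- A sketch of a LoadBalancer Service with a reserved static external IP (the address and names are placeholders; this assumes the address was reserved in advance):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-external-service    # hypothetical name
spec:
  type: LoadBalancer
  loadBalancerIP: 203.0.113.10 # a previously reserved static external IP (illustrative)
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
```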
- For traffic that needs to reach a cluster from within the same VPC network, configure the Service to provision an Internal Load Balancer
- The Internal Load Balancer chooses an IP address from the cluster's VPC subnet instead of an external IP address
- Applications or services within the VPC network can use this IP address to communicate with Services inside the cluster
- When the traffic reaches a given node, that node uses its iptables NAT table to choose a Pod, even if the Pod is on a different node
- kube-proxy manages the iptables rules on the node
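- A sketch of a Service that provisions an Internal Load Balancer, assuming the cloud.google.com/load-balancer-type annotation used by GKE clusters of this era (names and ports are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-internal-service    # hypothetical name
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
spec:
  type: LoadBalancer           # the annotation makes GKE provision an internal load balancer
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
```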
- Many applications, such as RESTful web service APIs, communicate using HTTP(S)
- Users can allow clients external to the VPC network to access applications using a Kubernetes Ingress resource
- An Ingress resource maps hostnames and URL paths to Services within the cluster
- When using an HTTP(S) load balancer, configure the Service to use a NodePort as well as a ClusterIP
- When traffic accesses the Service on a node's IP at the NodePort, GKE routes traffic to a healthy Pod for the Service
- A specific NodePort can be set explicitly, or the field can be left unset so that GKE assigns a random unused port
- When an Ingress resource is created, GKE provisions an HTTP(S) Load Balancer in the Google Cloud project
- The load balancer sends a request to a node's IP address at the NodePort
- After the request reaches the node, the node uses its iptables NAT table to choose a Pod
- kube-proxy manages the iptables rules on the node
- When an Ingress object is created, the GKE Ingress controller configures a Google Cloud HTTP(S) load balancer according to the rules in the Ingress manifest and the associated Service manifests
- The client sends a request to the HTTP(S) load balancer
- The load balancer is an actual proxy
- The load balancer chooses a node and forwards the request to that node's NodeIP:NodePort combination
- The node uses its iptables NAT table to choose a Pod
- kube-proxy manages the iptables rules on the node
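- A minimal NodePort Service and Ingress pairing, as a sketch (names are illustrative; the v1beta1 Ingress schema matching the (serviceName, servicePort) form used in these notes is assumed, while newer clusters use networking.k8s.io/v1 with a different backend schema):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-service          # hypothetical name
spec:
  type: NodePort                # exposes the Service on a port on every node
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080            # nodePort is omitted, so GKE assigns a random unused port
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-ingress
spec:
  backend:
    serviceName: my-app-service # all traffic goes to this Service's NodePort
    servicePort: 80
```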
- By default, all Pods running within the same cluster can communicate freely
- Access among Pods can be limited using a network policy
- Network policy definitions allow users to restrict the ingress and egress of Pods based on an arbitrary combination of labels, IP ranges, and port numbers
- By default, there is no network policy, so all traffic among Pods in the cluster is allowed
- As soon as the first network policy is created in a namespace, all other traffic is denied
- For network policies to take effect, network policy enforcement must also be explicitly enabled for the cluster
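- A sketch of a network policy that combines labels and a port number (all names, labels, and ports are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-db    # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: db                   # the policy applies to Pods carrying this label
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend         # only Pods with this label may connect
    ports:
    - protocol: TCP
      port: 5432                # and only on this port
```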
- If a Service uses an External Load Balancer, traffic from any external IP address can access the Service by default
- If a Service uses the HTTP(S) Load Balancer, a Google Cloud Armor security policy can be used to limit which external IP addresses can access the Service and which responses are returned when access is denied by the security policy
- Users can configure Logging to log information about these interactions
- If a Google Cloud Armor security policy is not fine-grained enough, users can enable the Identity-Aware Proxy on endpoints to implement user-based authentication and authorization for applications
-
DNS
- Service discovery is implemented with autogenerated service names that map to the service's IP address
- Service names follow a standard specification, my-svc.my-namespace.svc.my-zone, where my-zone is the cluster domain (cluster.local by default)
- Pods can access external services, like example.com, through their names
- GKE provides managed DNS for resolving service names and for resolving external names
- Managed DNS is implemented by kube-dns, a cluster add-on that is deployed by default in all GKE clusters
- kube-dns runs as a Deployment that schedules redundant kube-dns Pods to nodes in the cluster
- The kube-dns Pods are in the kube-system namespace
- The kube-dns deployment is accessed through a corresponding Service that groups the kube-dns Pods and gives them a single IP address
- By default, all Pods in a cluster use this service to resolve DNS queries
- kube-dns scales to serve the DNS demands of the cluster
- This scaling is controlled by the kube-dns-autoscaler which is deployed by default in all GKE clusters
- kube-dns-autoscaler adjusts the number of replicas in the kube-dns deployment based on the number of nodes and cores in the cluster
- The kubelet agent running on each node configures each Pod's /etc/resolv.conf to use the kube-dns Service's ClusterIP
- kube-dns is the authoritative name server for the cluster domain (cluster.local) and it recursively resolves external names
- Short names that are not fully qualified, like myservice, are completed first with local search paths, e.g. myservice.default.svc.cluster.local, myservice.svc.cluster.local, myservice.cluster.local, myservice.c.my-project-id.internal, and myservice.google.internal.
-
Load Balancing
- Ingress object defines rules for routing external HTTP(S) traffic to applications
- Ingress object is associated with one or more Service objects, each associated with Pods
- GKE ingress controller creates and configures a Google Cloud HTTP(S) load balancer based on Ingress object configuration
- Configuration of the HTTP(S) load balancer and its components, including target proxies, URL maps, and backend services must not be manually updated
- Ingress defines how traffic reaches Services and is routed to an application
- Ingress can provide a single IP address for multiple Services in a cluster
- Ingress can configure Google Cloud features such as Google-managed SSL certificates, Google Cloud Armor, Cloud CDN, and Identity-Aware Proxy
- Ingress can specify the use of multiple TLS certificates for request termination
- Google-managed SSL certificates are provisioned, deployed, renewed, and managed for domains
- Managed certificates do not support wildcard domains or multiple subject alternative names (SANs)
- Users can provision their own SSL certificate and create a certificate resource in their Google Cloud project
- Users can list the certificate resource in an annotation on an Ingress to create an HTTP(S) load balancer that uses the certificate
- Users can provision their own SSL certificate and create a Secret to hold it
- Secret can be referenced in an Ingress specification to create an HTTP(S) load balancer that uses the certificate
- For the GKE ingress controller to use readinessProbes as health checks, the Pods for an Ingress must exist at the time of Ingress creation
- If replicas are scaled to 0, the default health check applies
- Changes to a Pod's readinessProbe do not affect the Ingress after it is created
- HTTPS load balancer terminates TLS in locations that are distributed globally, to minimize latency between clients and the load balancer
- If geographic control over where TLS is terminated is required, use a custom ingress controller and GCP Network Load Balancing instead, and terminate TLS on backends that are located in appropriate regions
- HTTP(S) load balancer provides one stable IP address to route requests to a variety of backend services
- Load balancer can route requests to different backend services depending on the URL path
- Load balancer can route requests according to the hostname
- Users create and configure an HTTP(S) load balancer by creating a Kubernetes Ingress object
- Ingress object must be associated with one or more Service objects, each of which is associated with a set of Pods
- When an Ingress is created, GKE ingress controller creates and configures an HTTP(S) load balancer according to the information in the Ingress and the associated Services
- The ingress load balancer is given a stable IP address that can be associated with a domain name
- The only supported wildcard character for the path field of an Ingress is the * character.
- The * character must follow a forward slash (/) and must be the last character in the pattern
- A more specific pattern takes precedence over a less specific pattern, so where there is /foo/* and /foo/bar/*, then /foo/bar/bat is taken to match /foo/bar/*
- A Service of type NodePort is required for an Ingress that is used to configure an HTTP(S) load balancer
- In the Service manifest, the selector field indicates that any Pod that has the specified label is a member of the Service
- A default backend can be configured by providing a backend field in the Ingress manifest
- Any requests that don't match the paths in the rules field are sent to the Service and port specified in the backend field
- If a default backend is not specified, GKE provides a default backend that returns 404
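- A sketch of an Ingress with hostname and path rules plus a default backend (hostnames, Service names, and paths are illustrative; the v1beta1 schema is assumed, as above):

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: fanout-ingress          # hypothetical name
spec:
  backend:                      # default backend for requests matching no rule
    serviceName: default-service
    servicePort: 80
  rules:
  - host: example.com           # route by hostname
    http:
      paths:
      - path: /foo/*            # * must follow a / and be the last character
        backend:
          serviceName: foo-service
          servicePort: 80
      - path: /foo/bar/*        # more specific pattern; /foo/bar/bat matches here
        backend:
          serviceName: bar-service
          servicePort: 80
```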
- A Service exposed through an Ingress must respond to health checks from the load balancer
- Any container that is the final destination of load-balanced traffic must serve a response with an HTTP 200 status to GET requests on the / path to indicate that it is healthy
- Configure an HTTP readiness probe
- Serve a response with an HTTP 200 status to GET requests on the path specified by the readiness probe
- The Service exposed through an Ingress must point to the same container port on which the readiness probe is enabled
- If the Deployment is configured or scaled to 0 Pods, the HTTP readiness probe's path is set to /, regardless of the value of readinessProbe.path
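- A sketch of a Deployment whose readiness probe satisfies the load balancer health check (the image, path, and ports are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                  # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: gcr.io/my-project/my-app:1.0  # illustrative image
        ports:
        - containerPort: 8080   # the Service exposed through the Ingress must point to this port
        readinessProbe:
          httpGet:
            path: /healthz      # must answer GET with HTTP 200
            port: 8080
```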
- A Kubernetes Service and a Google Cloud backend service are distinct
- GKE ingress controller creates a Google Cloud backend service for each (serviceName, servicePort) pair in an Ingress manifest
- Kubernetes Service object can be related to several Google Cloud backend services
- BackendConfig can be used to configure HTTP(S) load balancer to use features like Google Cloud Armor, Cloud CDN, and IAP
- BackendConfig is a custom resource that holds configuration information for Google Cloud features
- An Ingress manifest refers to a Service, and the Service manifest refers to a BackendConfig by using a beta.cloud.google.com/backend-config annotation
- With HTTP(S) Load Balancing, the WebSocket protocol works without any additional configuration
- You might want to use a timeout value larger than the default 30 seconds
- To set the timeout value for a backend service configured through Ingress, create a BackendConfig object, and use the beta.cloud.google.com/backend-config annotation in your Service manifest
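- A sketch of the timeout configuration, assuming the cloud.google.com/v1beta1 BackendConfig API of this era (names, ports, and the 120-second value are illustrative):

```yaml
apiVersion: cloud.google.com/v1beta1
kind: BackendConfig
metadata:
  name: my-backendconfig        # hypothetical name
spec:
  timeoutSec: 120               # raises the backend service timeout above the 30-second default
---
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
  annotations:
    beta.cloud.google.com/backend-config: '{"ports": {"80": "my-backendconfig"}}'
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
```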
- An Ingress object is associated with a stable external IP address that clients can use to access Services and in turn, running containers
- The stable IP address lasts for the lifetime of the Ingress object
- If the Ingress is deleted and recreated from the same manifest file, there is no guarantee of getting the same external IP address
- To get a permanent IP address, reserve a global static external IP address
- Include the kubernetes.io/ingress.global-static-ip-name annotation in the Ingress manifest, set to the name of the reserved static IP address
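- For example, a sketch of an Ingress using a reserved global static IP (the annotation value names a hypothetical reserved address):

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: my-static-ip  # name of the reserved global static IP
spec:
  backend:
    serviceName: my-app-service
    servicePort: 80
```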
- HTTP(S) load balancer acts as a proxy between clients and application
- To accept HTTPS requests from clients, the load balancer must have a certificate to prove its identity to clients
- The load balancer must also have a private key to complete the HTTPS handshake
- HTTPS traffic between the client and the load balancer is encrypted using TLS
- Load balancer terminates TLS encryption, and forwards the request without encryption to the application
- Use self-managed or Google-managed SSL certificates
- Create a Kubernetes Secret to provide an HTTP(S) load balancer with a certificate and key
- To use the Secret, add its name in the tls field of the Ingress manifest
- Changes to Secrets are picked up periodically, so it can take up to 10 minutes for changes to be applied to the load balancer
- To force all traffic between the client and the HTTP(S) load balancer to use HTTPS, disable HTTP by including the kubernetes.io/ingress.allow-http annotation in the Ingress manifest and setting its value to "false"
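- A sketch combining a certificate Secret with HTTP disabled (the Secret and Service names are illustrative):

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-tls-ingress
  annotations:
    kubernetes.io/ingress.allow-http: "false"  # HTTPS only between client and load balancer
spec:
  tls:
  - secretName: my-tls-secret   # Secret holding the certificate and private key
  backend:
    serviceName: my-app-service
    servicePort: 80
```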
- As an alternative to using Kubernetes Secrets to provide certificates to the load balancer for HTTP(S) termination, certificates previously uploaded to the GCP project can be used
- Clients can use HTTP or HTTPS to communicate with the load balancer proxy
- The connection from the load balancer proxy to application uses HTTP by default
- Where the application running in a GKE Pod is capable of receiving HTTPS requests, configure the load balancer to use HTTPS to forward requests to the application
- To configure the protocol used between the load balancer and the application, use the cloud.google.com/app-protocols annotation in the Service manifest to specify ports
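- A sketch of the annotation, mapping a named Service port to HTTPS (the names and ports are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-https-service        # hypothetical name
  annotations:
    cloud.google.com/app-protocols: '{"my-port": "HTTPS"}'  # load balancer speaks HTTPS to this port
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
  - name: my-port
    port: 443
    targetPort: 8443            # the application must serve HTTPS on this port
```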
- To enable an HTTP(S) load balancer to serve content from two hostnames using separate certificates, specify multiple certificates in an Ingress manifest
- The load balancer chooses a certificate if the Common Name (CN) in the certificate matches the hostname used in the request.
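- A sketch of a two-hostname, two-certificate Ingress (hostnames, Secrets, and Services are illustrative):

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: multi-cert-ingress      # hypothetical name
spec:
  tls:
  - secretName: example-com-cert  # chosen when the certificate CN matches example.com
  - secretName: example-org-cert  # chosen when the certificate CN matches example.org
  rules:
  - host: example.com
    http:
      paths:
      - backend:
          serviceName: example-com-service
          servicePort: 80
  - host: example.org
    http:
      paths:
      - backend:
          serviceName: example-org-service
          servicePort: 80
```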
-
BackendConfig
- BackendConfig is a custom resource definition that is used by the Kubernetes Engine Ingress controller
- Users can provide configuration for a Cloud load balancer by associating Service ports with BackendConfig objects
- A BackendConfig can be used to configure the Cloud CDN, Google Cloud Armor, Identity-Aware Proxy (IAP), timeout, connection draining timeout, session affinity, and user-defined request header features of HTTP(S) Load Balancing
- When a Kubernetes Ingress object is created, the GKE Ingress controller creates and configures an HTTP(S) load balancer
- Ingress has rules, each of which references a port in a Kubernetes Service object
- If a port of a Service is referenced by an Ingress, the port is associated with an HTTP(S) Load Balancing backend service
- If a Service port is referenced by an Ingress, and if the Service port is associated with a BackendConfig, then the HTTP(S) load balancing backend service takes part of its configuration from the BackendConfig
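- As a closing sketch, a BackendConfig that enables Cloud CDN for the backend service (the cache policy values are illustrative, assuming the cloud.google.com/v1beta1 API as above):

```yaml
apiVersion: cloud.google.com/v1beta1
kind: BackendConfig
metadata:
  name: cdn-backendconfig       # hypothetical name; associate it with a Service port as shown earlier
spec:
  cdn:
    enabled: true
    cachePolicy:
      includeHost: true         # cache keys include the request host
      includeProtocol: true     # ...and the protocol
      includeQueryString: false # ...but ignore query strings
```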