-
Service
- The idea of a Service is to group a set of Pod endpoints into a single resource
- By default, a Service has a stable cluster IP address that clients inside the cluster can use to contact Pods in the Service
- A client sends a request to the stable IP address, and the request is routed to one of the Pods in the Service
- A Service identifies its member Pods with a selector
- For a Pod to be a member of the Service, the Pod must have all of the labels specified in the selector
- A label is an arbitrary key/value pair that is attached to an object
- In a Kubernetes cluster, each Pod has an internal IP address
- Pods in a Deployment come and go, and their IP addresses change
- It doesn't make sense to use Pod IP addresses directly
- A Service provides a stable IP address that lasts for the life of the Service, even as the IP addresses of the member Pods change
- A Service provides load balancing
- Clients call a single, stable IP address, and their requests are balanced across the Pods that are members of the Service
- ClusterIP service type (default) enables internal clients to send requests to a stable internal IP address
- NodePort service type enables clients to send requests to the IP address of a node on one or more nodePort values that are specified by the Service
- LoadBalancer service type enables clients to send requests to the IP address of a network load balancer
- ExternalName service type enables internal clients to use the DNS name of a Service as an alias for an external DNS name
- A headless service provides a Pod grouping, without a stable IP address
- The NodePort type is an extension of the ClusterIP type
- A Service of type NodePort has a cluster IP address
- The LoadBalancer type is an extension of the NodePort type
- A Service of type LoadBalancer has a cluster IP address and one or more nodePort values
- When a Service of type ClusterIP is created, Kubernetes creates a stable IP address that is accessible from nodes in the cluster
- Clients in the cluster call the Service using the cluster IP address, and the TCP port specified in the port field of the Service manifest
- The request is forwarded to one of the member Pods on the TCP port specified in the targetPort field
- When a Service of type NodePort is created, Kubernetes allocates a nodePort value
- The Service is accessible using the IP address of any node along with the nodePort value
- External clients call the Service using the external IP address of a node along with the TCP port specified by nodePort
- The request is forwarded to one of the member Pods on the TCP port specified by the targetPort field
- The NodePort Service type is an extension of the ClusterIP Service type
- Internal clients can call a Service using clusterIP and port, or a node's internal IP address and nodePort
- An HTTP(S) load balancer is a proxy server, and is fundamentally different from a network load balancer
- A nodePort value in the 30000–32767 range can be specified
- It is best to omit the nodePort field and let Kubernetes allocate a nodePort that avoids collisions between Services
- When a Service of type LoadBalancer is created, a Google Cloud controller wakes up and configures a network load balancer.
- The load balancer has a stable IP address that is accessible from outside the project
- A network load balancer is not a proxy server
- A network load balancer forwards packets with no change to the source and destination IP addresses
- After a Service is created, kubectl get service -o yaml can be used to view its specification and see the stable external IP address
- External clients call the Service by using the load balancer's IP address and the TCP port specified by port
- The request is forwarded to one of the member Pods on the TCP port specified by targetPort
- The LoadBalancer Service type is an extension of the NodePort type, which is an extension of the ClusterIP type
- In addition to its cluster IP address and nodePort values, a Service of type LoadBalancer is given the stable external IP address of the network load balancer it provisions (see the second example manifest after this list)
- Some load balancing resources incur charges
- A Service of type ExternalName provides an internal alias for an external DNS name
- Internal clients make requests using the internal DNS name, and the requests are redirected to the external name
- When a Service is created, Kubernetes creates a DNS name that internal clients can use to call the Service
- An example DNS name is my-xn-service.default.svc.cluster.local
- When an internal client makes a request to my-xn-service.default.svc.cluster.local, the request gets redirected to an external domain, e.g. example.com (see the ExternalName example manifest after this list)
- The ExternalName Service type is fundamentally different from the other Service types
- A Service of type ExternalName is not associated with a set of Pods, and it does not have a stable IP address
- A Service of type ExternalName is a mapping from an internal DNS name to an external DNS name
- A Service is an abstraction in the sense that it is not a process that listens on some network interface
- Part of the Service abstraction is implemented in the iptables rules of the cluster nodes
- Depending on the type of the Service, other parts of the abstraction are implemented by Network Load Balancing or HTTP(S) load balancing
- The value of the port field in a Service manifest is arbitrary, but the value of targetPort is not arbitrary
- Each member Pod must have a container listening on targetPort
- The ports field of a Service is an array of ServicePort objects
- Where there is more than one ServicePort, each ServicePort must have a unique name (see the first example manifest after this list)
- When a Service is created, Kubernetes creates an Endpoints object that has the same name as the Service
- Kubernetes uses the Endpoints object to keep track of which Pods are members of the Service
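- A minimal sketch of a ClusterIP Service manifest; the name my-service, the label app: metrics, and the port numbers are hypothetical and only illustrate the selector, port, and targetPort fields, plus the naming rule for multiple ServicePorts:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service           # hypothetical name
spec:
  type: ClusterIP            # default type; can be omitted
  selector:
    app: metrics             # Pods must carry every label listed here to be members
  ports:
  - name: web                # names are required when more than one ServicePort is listed
    port: 80                 # port that internal clients call on the cluster IP
    targetPort: 8080         # port a container in each member Pod listens on
  - name: metrics
    port: 9090
    targetPort: 9090
```

- Internal clients would reach the member Pods through the cluster IP, or through the DNS name my-service.default.svc.cluster.local, on ports 80 and 9090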
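- A sketch of the same workload exposed externally; changing the type to LoadBalancer (which implies a cluster IP and a nodePort as well) is the main difference, and the names here are again hypothetical:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-lb-service        # hypothetical name
spec:
  type: LoadBalancer         # on GKE, provisions a network load balancer
  selector:
    app: metrics
  ports:
  - port: 80                 # port on the load balancer's external IP
    targetPort: 8080         # port the member Pods listen on
    # nodePort is omitted so Kubernetes allocates one in the 30000-32767 range
```

- Running kubectl get service my-lb-service -o yaml would show the allocated nodePort and, once provisioning finishes, the external IP address under status: loadBalancer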
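- A sketch of an ExternalName Service; my-xn-service and example.com follow the example above and only illustrate the DNS mapping, since this type has no selector and no cluster IP:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-xn-service
spec:
  type: ExternalName
  externalName: example.com  # internal lookups of my-xn-service resolve to this external name
```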
-
StatefulSet
- StatefulSets represent a set of Pods with unique, persistent identities and stable hostnames that are maintained regardless of where they are scheduled
- The state information and other resilient data for any given StatefulSet Pod is maintained in persistent disk storage associated with the StatefulSet
- StatefulSets use an ordinal index for the identity and ordering of their Pods
- By default, StatefulSet Pods are deployed in sequential order and are terminated in reverse ordinal order
- For example, a StatefulSet named web has its Pods named web-0, web-1, and web-2
- When the web Pod specification is changed, its Pods are gracefully stopped and recreated in an ordered way; for example, web-2 is terminated first, then web-1, and so on
- podManagementPolicy: Parallel field can be used to have a StatefulSet launch or terminate all of its Pods in parallel, rather than waiting for Pods to become Running and Ready or to be terminated prior to launching or terminating another Pod
- StatefulSets use a Pod template, which contains a specification for its Pods
- Pod specification determines how each Pod should look: what applications should run inside its containers, which volumes it should mount, its labels and selectors, and more
- StatefulSets are designed to deploy stateful applications and clustered applications that save data to persistent storage, such as Google Compute Engine persistent disks
- StatefulSets are suitable for deploying Kafka, MySQL, Redis, ZooKeeper, and other applications needing unique, persistent identities and stable hostnames
- StatefulSet can be created using kubectl apply
- StatefulSet ensures that the desired number of Pods are running and available at all times
- StatefulSet automatically replaces Pods that fail or are evicted from their nodes, and automatically associates new Pods with the storage resources, resource requests and limits, and other configurations defined in the StatefulSet's Pod specification
- To help prevent data loss, PersistentVolumes and PersistentVolumeClaims are not deleted when a StatefulSet is deleted
- They must be manually deleted using kubectl delete pv and kubectl delete pvc
- StatefulSets can be updated by making changes to their Pod specification, which includes their container images and volumes
- StatefulSet object resource requests and limits, labels, and annotations can be updated using kubectl, the Kubernetes API, or the GKE Workloads menu in Google Cloud Console
- To decide how to handle updates, StatefulSets use an update strategy defined in spec: updateStrategy
- The OnDelete strategy does not automatically delete and recreate Pods when the object's configuration is changed
- The old Pods must be deleted manually to cause the controller to create updated Pods
- The RollingUpdate strategy automatically deletes and recreates Pods when the object's configuration is changed
- New Pods must be in Running and Ready states before their predecessors are deleted
- Changing the Pod specification automatically triggers a rollout
- This is the default update strategy for StatefulSets
- StatefulSets update Pods in reverse ordinal order
- Update rollouts can be monitored by running the command kubectl rollout status statefulset [STATEFULSET_NAME]
- Rolling updates can be partitioned
- Partitioning is useful for staging an update, rolling out a canary, or performing a phased roll out
- When an update is partitioned, all Pods with an ordinal greater than or equal to the partition value are updated when the StatefulSet's Pod specification is updated (see the example manifest after this list)
- Pods with an ordinal less than the partition value are not updated; even if they are deleted, they are recreated using the previous version of the specification
- If the partition value is greater than the number of replicas, the updates are not propagated to the Pods
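- A sketch of the relevant fields of a StatefulSet manifest; the web name mirrors the example above, while the image, storage size, and partition value are hypothetical:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web                    # headless Service backing the stable hostnames
  replicas: 3                         # Pods web-0, web-1, web-2
  podManagementPolicy: OrderedReady   # default; Parallel launches/terminates Pods together
  updateStrategy:
    type: RollingUpdate               # default; OnDelete requires manual Pod deletion
    rollingUpdate:
      partition: 2                    # only web-2 is updated; web-0 and web-1 keep the old spec
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25             # hypothetical image
        volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi                # hypothetical size; backed by a persistent disk on GKE
```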
-
DaemonSet
- DaemonSets manage groups of replicated Pods
- DaemonSets attempt to adhere to a one-Pod-per-node model, either across the entire cluster or a subset of nodes
- As nodes are added to a node pool, DaemonSets automatically add Pods to the new nodes as needed
- DaemonSets use a Pod template, which contains a specification for its Pods
- The Pod specification determines how each Pod should look: what applications should run inside its containers, which volumes it should mount, its labels and selectors, and more
- DaemonSets are useful for deploying ongoing background tasks needed to run on all or certain nodes, and which do not require user intervention
- Examples of such tasks include storage daemons like ceph, log collection daemons like fluentd, and node monitoring daemons like collectd
- A single DaemonSet can be configured for each type of daemon to run on all nodes, or multiple DaemonSets for a single type of daemon can use different configurations for different hardware types and resource needs (see the example manifest after this list)
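- A sketch of a DaemonSet manifest for a log collection agent; fluentd appears in the list above, but the exact image tag and mount paths here are hypothetical:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluentd:v1.16          # hypothetical tag
        volumeMounts:
        - name: varlog
          mountPath: /var/log         # read the node's logs
      volumes:
      - name: varlog
        hostPath:
          path: /var/log              # one Pod per node reads its host's log directory
```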
-
Pod
- Pods are the smallest, most basic deployable objects in Kubernetes
- A Pod represents a single instance of a running process in a cluster
- Pods contain one or more containers, such as Docker containers
- When a Pod runs multiple containers, the containers are managed as a single entity and share the Pod's resources
- Generally, running multiple containers in a single Pod is an advanced use case
- Pods contain shared networking and storage resources for their containers
- Pods are automatically assigned unique IP addresses
- Pod containers share the same network namespace, including IP address and network port
- Containers in a Pod communicate with each other inside the Pod on localhost
- Pods can specify a set of shared storage volumes that can be shared among the containers
- A Pod is a self-contained, isolated "logical host" that contains the systemic needs of the application it serves
- A Pod is meant to run a single instance of an application on a cluster
- It is not recommended to create individual Pods directly
- Generally, a set of identical Pods, called replicas, is created to run an application
- A replicated set of Pods are created and managed by a controller, such as a Deployment
- Controllers manage the lifecycle of their constituent Pods and can also perform horizontal scaling, changing the number of Pods as necessary
- Although users might occasionally interact with Pods directly to debug, troubleshoot, or inspect them, it is highly recommended to use a controller to manage Pods
- Pods run on nodes in a cluster
- Once created, a Pod remains on its node until its process is complete, the Pod is deleted, the Pod is evicted from the node due to lack of resources, or the node fails
- If a node fails, Pods on the node are automatically scheduled for deletion
- Pods are ephemeral
- Pods are not designed to run forever, and when a Pod is terminated it cannot be brought back
- Pods do not disappear until they are deleted by a user or by a controller
- Pods do not "heal" or repair themselves
- If a Pod is scheduled on a node which later fails, the Pod is deleted
- If a Pod is evicted from a node for any reason, the Pod does not replace itself
- Each Pod has a PodStatus API object, which is represented by a Pod's status field
- Pods publish their phase to the status: phase field
- The phase of a Pod is a high-level summary of the Pod in its current state
- Pending: Pod has been created and accepted by the cluster, but one or more of its containers are not yet running
- Pending phase includes time spent being scheduled on a node and downloading images
- Running: Pod has been bound to a node, and all of the containers have been created
- In Running state, at least one container is running, is in the process of starting, or is restarting
- Succeeded: All containers in the Pod have terminated successfully
- Terminated Pods do not restart
- Failed: All containers in the Pod have terminated, and at least one container has terminated in failure
- A container "fails" if it exits with a non-zero status
- Unknown: The state of the Pod cannot be determined
- PodStatus contains an array called PodConditions, which is represented in the Pod manifest as conditions
- Each condition has a type field and a status field
- The conditions indicate more specifically what within the Pod is causing its current status
- The type field can contain PodScheduled, Ready, Initialized, and Unschedulable
- The status field corresponds with the type field, and can contain True, False, or Unknown
- Run kubectl get pod [POD_NAME] -o yaml to view the Pod's entire manifest, including the phase and conditions fields
- Because Pods are ephemeral, it is not necessary to create Pods directly
- Because Pods cannot repair or replace themselves, it is not recommended to create Pods directly
- Use a controller, such as a Deployment, which creates and manages Pods
- Controllers are useful for rolling out updates, such as changing the version of an application running in a container, because the controller manages the whole update process for you
- When a Pod starts running, it requests an amount of CPU and memory
- Requesting memory and CPU helps Kubernetes schedule the Pod onto an appropriate node to run the workload
- A Pod will not be scheduled onto a node that doesn't have the resources to honor the Pod's request
- A request is the minimum amount of CPU or memory that Kubernetes guarantees to a Pod
- Pod requests differ from and work in conjunction with Pod limits
- Configure the CPU and memory requests for a Pod, based on the application's needs
- Requests can be specified for individual containers running in the Pod
- The default request for CPU is 100m
- The default request for CPU is too small for many applications
- There is no default request for memory
- A Pod with no default memory request could be scheduled onto a node without enough memory to run the Pod's workloads
- Setting too small a value for CPU or memory requests could cause too many Pods or a suboptimal combination of Pods to be scheduled onto a given node and reduce performance
- Setting too large a value for CPU or memory requests could cause the Pod to be unschedulable and increase the cost of the cluster's resources
- In addition to, or instead of, setting a Pod's resources, users can specify resources for individual containers running in the Pod
- Where resources are specified only for the containers, the Pod's requests are the sum of the requests specified for the containers
- Where resources are specified for both the Pod and its containers, the sum of requests for all containers must not exceed the Pod's requests
- It is strongly recommended the requests for Pods are configured based on the requirements of the actual workloads
- By default, a Pod has no upper bound on the maximum amount of CPU or memory it can use on a node
- Limits can be set to control the amount of CPU or memory a Pod can use on a node
- A limit is the maximum amount of CPU or memory that Kubernetes allows a Pod to use
- In addition to, or instead of, setting a Pod's limits, limits for individual containers running in the Pod can be specified
- Where limits are specified only for the containers, the Pod's limits are the sum of the limits specified for the containers
- Each container can only access resources up to its limit
- Where limits are specified on containers only, a limit must be specified for each container
- The sum of limits for all containers must not exceed the Pod limit
- A limit must always be greater than or equal to a request for the same type of resource
- If an attempt is made to set the limit below the request, the Pod's container cannot run and an error is logged
- Limits are not taken into consideration when scheduling Pods, but can prevent resource contention among Pods on the same node, and can prevent a Pod from causing system instability on the node by starving the underlying operating system of resources
- Pod limits differ from and work in conjunction with Pod requests
- It is strongly recommended that users configure limits for Pods, based on the requirements of the actual workloads (see the example Pod manifest after this list)
- Controller objects, such as Deployments and StatefulSets, contain a Pod template field
- Pod templates contain a Pod specification which determines how each Pod should run, including which containers should be run within the Pods and which volumes the Pods should mount
- Controller objects use Pod templates to create Pods and to manage their "desired state" within a cluster
- When a Pod template is changed, all future Pods reflect the new template, but all existing Pods do not
- By default, Pods run on nodes in the default node pool for the cluster
- Which node pool a Pod selects, either explicitly or implicitly, can be configured
- A Pod can be explicitly forced to be deployed to a specific node pool by setting a nodeSelector in the Pod manifest
- Setting a nodeSelector forces a Pod to run only on Nodes in that node pool
- Resource requests can be specified for the containers
- Pod will only run on nodes that satisfy the resource requests
- For example, if a Pod's containers together require four CPUs, the Pod will not be scheduled onto a node with only two CPUs
- The simplest and most common Pod pattern is a single container per pod, where the single container represents an entire application
- Pods with multiple containers are primarily used to support colocated, co-managed applications that share resources
- Colocated containers might form a single cohesive unit of service—one container serving files from a shared volume while another container refreshes or updates those files
- Pods wrap containers and storage resources together as a single manageable entity
- Each Pod is meant to run a single instance of a given application
- Replicated Pods are created and managed as a group by a controller, such as a Deployment
- Pods terminate gracefully when their processes are complete
- Kubernetes imposes a default graceful termination period of 30 seconds
- When deleting a Pod, the grace period can be overridden by specifying the number of seconds to wait for the Pod to terminate before it is forcibly terminated (see the example manifest after this list)
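- A sketch of a Pod manifest pulling these pieces together; the node pool name, image, and resource amounts are hypothetical and only illustrate nodeSelector, per-container requests and limits, and the termination grace period:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  nodeSelector:
    cloud.google.com/gke-nodepool: high-mem-pool    # hypothetical node pool name
  terminationGracePeriodSeconds: 60                 # overrides the 30-second default
  containers:
  - name: app
    image: us-docker.pkg.dev/my-project/my-repo/app:1.0   # hypothetical image
    resources:
      requests:
        cpu: 250m             # minimum guaranteed; used when scheduling the Pod
        memory: 256Mi
      limits:
        cpu: 500m             # must be greater than or equal to the request
        memory: 512Mi
```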
-
Deployment
- Deployments represent a set of multiple, identical Pods with no unique identities
- A Deployment runs multiple replicas of an application and automatically replaces any instances that fail or become unresponsive
- Deployments help ensure that one or more instances of an application are available to serve user requests
- Deployments are managed by the Kubernetes Deployment controller
- Deployments use a Pod template, which contains a specification for its Pods
- The Pod specification determines what applications should run inside its containers, which volumes the Pods should mount, its labels, and more
- When a Deployment's Pod template is changed, new Pods are automatically created one at a time
- Deployments are well-suited for stateless applications that use ReadOnlyMany or ReadWriteMany volumes mounted on multiple replicas, but are not well-suited for workloads that use ReadWriteOnce volumes
- For stateful applications using ReadWriteOnce volumes, use StatefulSets
- StatefulSets are designed to deploy stateful applications and clustered applications that save data to persistent storage, such as Compute Engine persistent disks
- StatefulSets are suitable for deploying Kafka, MySQL, Redis, ZooKeeper, and other applications needing unique, persistent identities and stable hostnames
- You can create a Deployment using the kubectl run, kubectl apply, or kubectl create commands
- Once created, the Deployment ensures that the desired number of Pods are running and available at all times
- The Deployment automatically replaces Pods that fail or are evicted from their nodes
- A Deployment can be updated by making changes to the Deployment's Pod template specification
- Making changes to the specification field automatically triggers an update rollout
- You can use kubectl, the Kubernetes API, or the GKE Workloads menu in Google Cloud Console
- By default, when a Deployment triggers an update, the Deployment performs a rolling update: it gradually drains and terminates the old Pods
- At the same time, the Deployment uses the updated Pod template to bring up new Pods
- Old Pods are not removed until a sufficient number of new Pods are Running, and new Pods are not created until a sufficient number of old Pods have been removed
- To see in which order Pods are brought up and are removed, you can run kubectl describe deployments
- During an update, Deployments ensure that at least one less than the desired number of replicas is running, so at most one Pod is unavailable at a time
- Deployments also ensure that at most one more than the desired number of replicas is running, so at most one extra Pod is created at a time
- An update can be rolled back using the kubectl rollout undo command
- The kubectl rollout pause command can be used to temporarily halt a Deployment
- A Deployment can be in one of three states during its lifecycle: progressing, completed, or failed
- A progressing state indicates that the Deployment is in the process of performing its tasks, like bringing up or scaling its Pods
- A completed state indicates that the Deployment has successfully completed its tasks, all of its Pods are running with the latest specification and are available, and no old Pods are still running
- A failed state indicates that the Deployment has encountered one or more issues that prevent it from completing its tasks
- Some causes include insufficient quotas or permissions, image pull errors, limit ranges, or runtime errors
- To investigate what causes a Deployment to fail, you can run kubectl get deployment [DEPLOYMENT_NAME] -o yaml and examine the messages in the status: conditions field
- A Deployment's progress can be monitored using the kubectl rollout status command (see the example manifest after this list)
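- A sketch of a Deployment manifest with the rolling update behavior described above made explicit; the name, label, image, and surge/unavailability values are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate       # default strategy
    rollingUpdate:
      maxUnavailable: 1       # at most one Pod below the desired count during an update
      maxSurge: 1             # at most one Pod above the desired count during an update
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: us-docker.pkg.dev/my-project/my-repo/app:1.0   # hypothetical image
```

- Changing the image tag in the template and re-running kubectl apply would trigger a rollout that can be watched with kubectl rollout status deployment my-app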
-
Batch
- Batch on GKE (Batch) is a cloud-native solution for scheduling and managing batch workloads
- Batch can leverage the on-demand and flexible nature of cloud
- Batch is based on Kubernetes and containers so jobs are portable
- A BatchJob describes the work to be done and the compute resources, data, etc. needed to do it
- BatchQueues are the central organising construct in Batch
- A Queue specifies a BatchJobConstraint, which defines the jobs and policies acceptable in that Queue
- A single BatchJobConstraint resource may be assigned to multiple Queues
- A Queue can specify a BatchBudget that defines how many resources a Queue may use or how much it can spend at a time or over a number of days
- Several Queues may consume the same budget
- A Queue may also specify a BatchPriority
- Admins may define multiple different priority levels in the system
- A Queue can have one priority level, but the same BatchPriority resource can be attached to multiple Queues