Pool
- A node pool is a group of nodes within a cluster that all have the same configuration
- Each node in the pool has a Kubernetes node label, cloud.google.com/gke-nodepool, which has the node pool's name as its value
- A node pool can contain only a single node or many nodes
- When a cluster is created, the number and type of nodes specified becomes the default node pool
- Additional custom node pools of different sizes and types can be added to a cluster
- All nodes in any given node pool are identical to one another
- Custom node pools are useful when there is a need to schedule Pods that require more resources than others
- Node taints can be used to control where Pods are scheduled
- Any configuration changes to a pool affect all nodes in the node pool
- By default, all new node pools run the latest stable version of Kubernetes
- Existing node pools can be manually upgraded or automatically upgraded
- Multiple Kubernetes node versions can be configured for each node pool in a cluster
- Each node pool can be updated independently
- Each node pool can be targeted by specific deployments
- The node pool a Service is deployed into can be controlled at deployment time
- The target node pool is not dependent on the configuration of the Service, but the configuration of the Pods
- A Pod can be explicitly deployed to a specific node pool by setting a nodeSelector in the Pod manifest
- A Pod is scheduled only onto a node that can satisfy the resource requests of its containers
- In a multi-zonal cluster, all node pools are replicated automatically across the cluster's zones
- Any new node pool in a multi-zonal cluster is automatically created in all zones
- Deleting a node pool in a multi-zonal cluster deletes it from all zones
- The multiplicative effect of multi-zonal clusters may result in more of a project’s quota being consumed
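The nodeSelector mechanism above can be sketched as a Pod manifest. This is a minimal illustration; the pool name `high-mem-pool`, the Pod name, and the image are hypothetical:

```yaml
# Hypothetical Pod pinned to a node pool named "high-mem-pool"
# via the label GKE applies to every node in a pool.
apiVersion: v1
kind: Pod
metadata:
  name: high-mem-app
spec:
  nodeSelector:
    cloud.google.com/gke-nodepool: high-mem-pool
  containers:
  - name: app
    image: gcr.io/my-project/app:latest   # hypothetical image
    resources:
      requests:
        memory: "8Gi"
```

The scheduler will only place this Pod onto nodes carrying the matching `cloud.google.com/gke-nodepool` label, i.e. nodes in that pool.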
Images
- GKE offers the following node image options for clusters: Container-Optimized OS from Google, Ubuntu, Windows (beta), Container-Optimized OS with containerd (cos_containerd), and Ubuntu with containerd (ubuntu_containerd)
- The Container-Optimized OS node image is based on a recent version of the Linux kernel and is optimized to enhance node security
- It is backed by a team at Google that can quickly patch it for security and iterate on features
- The Container-Optimized OS image provides better support, security, and stability than the other images
- Ubuntu node image has been validated against GKE's node image requirements
- Use the Ubuntu node image if nodes require support for XFS, CephFS, or Debian packages
- Containerd is an important building block and the core runtime component of Docker
- Docker cannot view or access containers or images managed by Kubernetes
- Applications should not interact with Docker directly
- For general troubleshooting or debugging, use crictl instead
- cos_containerd is a variant of the Container-Optimized OS image with containerd as the container runtime directly integrated with Kubernetes
- ubuntu_containerd is a variant of the Ubuntu image that uses containerd as the container runtime
- The cos and cos_containerd node images use a minimal root file system with built-in support for the container runtime (Docker or containerd, respectively), which also serves as the mechanism for installing software on the host
- The Container-Optimized OS image does not provide package management software such as apt-get
- Users can't install arbitrary software onto the nodes using conventional mechanisms
- Instead, create a container image that contains the software needed
- For debugging purposes only, Container-Optimized OS includes the CoreOS Toolbox for installing and running common debugging tools such as ping, psmisc, or pstree
- The Ubuntu image has the apt package manager pre-installed
- Use the apt-get command to install packages on these images
- To view logs on a node with the Container-Optimized OS or Ubuntu node image, you must use the journalctl command
- Container-Optimized OS node image file system layout is optimized to enhance node security
- The boot disk is split into a root partition, which is mounted read-only; stateful partitions, which are writable and persist across reboots; and stateless partitions, which are writable but whose contents do not persist across reboots
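As a sketch, inspecting logs and debugging tools on a journald-based node image (Container-Optimized OS or Ubuntu) might look like the following commands, run on the node itself (for example over SSH):

```shell
# View kubelet logs (COS and Ubuntu images use journald, not /var/log files)
sudo journalctl -u kubelet

# Follow recent container-runtime logs
sudo journalctl -u docker --since "1 hour ago"

# On Container-Optimized OS only: launch the CoreOS Toolbox
# to install and run debugging tools (ping, psmisc, pstree, ...)
/usr/bin/toolbox
```

These commands are illustrative; the exact unit names depend on the image and runtime in use.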
TPU
- Tensor Processing Units (TPUs) are Google’s custom-developed application-specific integrated circuits (ASICs) used to accelerate TensorFlow machine learning workloads
- GKE sets up and manages the Cloud TPU VM and the CIDR block
- GKE scales VMs and Cloud TPU nodes automatically based on workloads and traffic
- Users only pay for Cloud TPU and the VM when they run workloads on them
- A one-line change in Pod spec is required to request a different hardware accelerator (CPU, GPU, or TPU)
- GKE provides APIs (Job and Deployment) that can easily scale to hundreds of Pods and Cloud TPU nodes
- GKE's Job API, together with the TensorFlow checkpoint mechanism, provides run-to-completion semantics
- Training jobs will automatically rerun with the latest state read from the checkpoint if failures occur on the VM instances or Cloud TPU nodes
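The one-line accelerator request mentioned above is a resource limit in the Pod spec. A minimal sketch, assuming a TPU v2 request (the Pod name, image, and TPU version/count are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tpu-training            # hypothetical name
spec:
  containers:
  - name: trainer
    image: gcr.io/my-project/trainer:latest   # hypothetical image
    resources:
      limits:
        # Requesting 8 TPU v2 cores; switching accelerators
        # (CPU, GPU, or TPU) comes down to changing this one line
        cloud-tpus.google.com/v2: 8
```

Wrapping the same spec in a Job rather than a bare Pod gives the run-to-completion behavior described above.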
containerd
- cos_containerd and ubuntu_containerd images use containerd as the container runtime in a GKE cluster
- cos_containerd or ubuntu_containerd can be selected as the image type when a new GKE cluster or new Node Pool is created in an existing cluster, or when an existing GKE cluster is upgraded
- Docker is still available on each containerd Node, but Kubernetes uses containerd as the container runtime
- Docker commands cannot be used to view or interact with containers as Docker does not manage Kubernetes containers on the nodes
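Selecting the containerd image type at cluster or node-pool creation, and inspecting containers with crictl rather than docker, might look like the following sketch (the cluster and pool names are hypothetical):

```shell
# Create a cluster whose default node pool uses the containerd image
gcloud container clusters create my-cluster --image-type=cos_containerd

# Add a containerd node pool to an existing cluster
gcloud container node-pools create containerd-pool \
    --cluster=my-cluster --image-type=ubuntu_containerd

# On a node: docker ps will not show Kubernetes-managed containers;
# use crictl to list them instead
sudo crictl ps
```

These commands are illustrative only; flags and defaults may differ across gcloud versions.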