Pool
- A node pool is a group of nodes within a cluster that all have the same configuration
- Each node in the pool has a Kubernetes node label, cloud.google.com/gke-nodepool, which has the node pool's name as its value
- A node pool can contain only a single node or many nodes
- When a cluster is created, the number and type of nodes specified becomes the default node pool
- Additional custom node pools of different sizes and types can be added to a cluster
- All nodes in any given node pool are identical to one another
- Custom node pools are useful when there is a need to schedule Pods that require more resources than others
- Node taints can be used to control where Pods are scheduled
- Any configuration changes to a pool affect all nodes in the node pool
- By default, all new node pools run the latest stable version of Kubernetes
- Existing node pools can be manually upgraded or automatically upgraded
- Multiple Kubernetes node versions can be configured for each node pool in a cluster
- Each node pool can be updated independently
- Each node pool can be targeted by specific deployments
- The node pool a Service is deployed into can be controlled at deployment time
- The target node pool is not dependent on the configuration of the Service, but the configuration of the Pods
- A Pod can be explicitly deployed to a specific node pool by setting a nodeSelector in the Pod manifest
- A Pod is scheduled only onto a node that can satisfy the resource requests of its containers
- In a multi-zonal cluster, all node pools are replicated automatically across the cluster's zones
- Any new node pool in a multi-zonal cluster is automatically created in all zones
- Deleting a node pool in a multi-zonal cluster deletes it from all zones
- The multiplicative effect of multi-zonal clusters may result in more of a project’s quota being consumed
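The nodeSelector mechanism above can be sketched as a Pod manifest. This is a minimal illustration; the pool name `high-mem-pool`, the Pod name, and the image are hypothetical:

```yaml
# Hypothetical Pod pinned to a node pool named "high-mem-pool"
# via the label GKE applies to every node in a pool.
apiVersion: v1
kind: Pod
metadata:
  name: high-mem-app
spec:
  nodeSelector:
    cloud.google.com/gke-nodepool: high-mem-pool
  containers:
  - name: app
    image: gcr.io/my-project/app:latest   # hypothetical image
    resources:
      requests:
        memory: "8Gi"
```

The scheduler will only place this Pod onto nodes carrying the matching `cloud.google.com/gke-nodepool` label, i.e. nodes in that pool.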
Images
- GKE offers the following node image options for clusters: Container-Optimized OS from Google, Ubuntu, Windows (beta), Container-Optimized OS with containerd (cos_containerd), and Ubuntu with containerd (ubuntu_containerd)
- The Container-Optimized OS node image is based on a recent version of the Linux kernel and is optimized to enhance node security
- It is backed by a team at Google that can quickly patch it for security and iterate on features
- The Container-Optimized OS image provides better support, security, and stability than the other images
- Ubuntu node image has been validated against GKE's node image requirements
- Use the Ubuntu node image if nodes require support for XFS, CephFS, or Debian packages
- Containerd is an important building block and the core runtime component of Docker
- Docker cannot view or access containers or images managed by Kubernetes
- Applications should not interact with Docker directly
- For general troubleshooting or debugging, use crictl instead
- cos_containerd is a variant of the Container-Optimized OS image with containerd as the container runtime directly integrated with Kubernetes
- ubuntu_containerd is a variant of the Ubuntu image that uses containerd as the container runtime
- The cos and cos_containerd node images use a minimal root file system with built-in support for the container runtime (Docker or containerd, respectively), which also serves as the mechanism for installing software on the host
- The Container-Optimized OS image does not provide package management software such as apt-get
- Users can't install arbitrary software onto the nodes using conventional mechanisms
- Instead, create a container image that contains the software needed
- For debugging purposes only, Container-Optimized OS includes the CoreOS Toolbox for installing and running common debugging tools such as ping, psmisc, or pstree
- The Ubuntu image has the apt package manager pre-installed
- Use the apt-get command to install packages on these images
- To view logs on a node with the Container-Optimized OS or Ubuntu node image, you must use the journalctl command
- Container-Optimized OS node image file system layout is optimized to enhance node security
- The boot disk is split into a root partition, which is mounted read-only; stateful partitions, which are writable and persist across reboots; and stateless partitions, which are writable but whose contents do not persist across reboots
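As a sketch, inspecting logs and debugging tools on a journald-based node image (Container-Optimized OS or Ubuntu) might look like the following commands, run on the node itself (for example over SSH):

```shell
# View kubelet logs (COS and Ubuntu images use journald, not /var/log files)
sudo journalctl -u kubelet

# Follow recent container-runtime logs
sudo journalctl -u docker --since "1 hour ago"

# On Container-Optimized OS only: launch the CoreOS Toolbox
# to install and run debugging tools (ping, psmisc, pstree, ...)
/usr/bin/toolbox
```

These commands are illustrative; the exact unit names depend on the image and runtime in use.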
TPU
- Tensor Processing Units (TPUs) are Google’s custom-developed application-specific integrated circuits (ASICs) used to accelerate TensorFlow machine learning workloads
- GKE sets up and manages the Cloud TPU VM and the CIDR block
- GKE scales VMs and Cloud TPU nodes automatically based on workloads and traffic
- Users only pay for Cloud TPU and the VM when they run workloads on them
- A one-line change in Pod spec is required to request a different hardware accelerator (CPU, GPU, or TPU)
- GKE provides APIs (Job and Deployment) that can easily scale to hundreds of Pods and Cloud TPU nodes
- GKE's Job API, together with the TensorFlow checkpoint mechanism, provides run-to-completion semantics
- Training jobs will automatically rerun with the latest state read from the checkpoint if failures occur on the VM instances or Cloud TPU nodes
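The one-line accelerator request mentioned above is a resource limit in the Pod spec. A minimal sketch, assuming a TPU v2 request (the Pod name, image, and TPU version/count are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tpu-training            # hypothetical name
spec:
  containers:
  - name: trainer
    image: gcr.io/my-project/trainer:latest   # hypothetical image
    resources:
      limits:
        # Requesting 8 TPU v2 cores; switching accelerators
        # (CPU, GPU, or TPU) comes down to changing this one line
        cloud-tpus.google.com/v2: 8
```

Wrapping the same spec in a Job rather than a bare Pod gives the run-to-completion behavior described above.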
containerd
- cos_containerd and ubuntu_containerd images use containerd as the container runtime in a GKE cluster
- cos_containerd or ubuntu_containerd can be selected as the image type when a new GKE cluster or new Node Pool is created in an existing cluster, or when an existing GKE cluster is upgraded
- Docker is still available on each containerd Node, but Kubernetes uses containerd as the container runtime
- Docker commands cannot be used to view or interact with containers as Docker does not manage Kubernetes containers on the nodes
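Selecting the containerd image type at cluster or node-pool creation, and inspecting containers with crictl rather than docker, might look like the following sketch (the cluster and pool names are hypothetical):

```shell
# Create a cluster whose default node pool uses the containerd image
gcloud container clusters create my-cluster --image-type=cos_containerd

# Add a containerd node pool to an existing cluster
gcloud container node-pools create containerd-pool \
    --cluster=my-cluster --image-type=ubuntu_containerd

# On a node: docker ps will not show Kubernetes-managed containers;
# use crictl to list them instead
sudo crictl ps
```

These commands are illustrative only; flags and defaults may differ across gcloud versions.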