-
Overview
- Protecting workloads in Google Kubernetes Engine involves many layers of the stack, including the contents of the container image, the container runtime, the cluster network, and access to the cluster API server
- Take a layered approach to protecting clusters and workloads
- Apply the principle of least privilege to the access granted to users and applications
- Each layer involves tradeoffs between flexibility and security that the organization must weigh in order to deploy and maintain its workloads securely
- User accounts are accounts that are known to Kubernetes, but are not managed by Kubernetes
- Service accounts are accounts that are created and managed by Kubernetes, but can only be used by Kubernetes-created entities, such as pods
- In a Google Kubernetes Engine cluster, Kubernetes user accounts are managed by Google Cloud, and may be Google Accounts or Google Cloud service accounts
- Once authenticated, authorize these identities to create, read, update or delete Kubernetes resources
- Kubernetes service accounts and Google Cloud service accounts are different entities
- Kubernetes service accounts are part of the cluster in which they are defined and are typically used within that cluster
- Google Cloud service accounts can be granted permissions both within clusters and to the Google Cloud project itself, as well as to any Google Cloud resource, using Cloud Identity and Access Management (Cloud IAM)
- Google Cloud service accounts are more powerful than Kubernetes service accounts
- To follow the security principle of least privilege, use Google Cloud service accounts only when their capabilities are required
- To configure more granular access to Kubernetes resources at the cluster level or within Kubernetes namespaces, use Role-Based Access Control (RBAC)
- RBAC allows users to create detailed policies that define which operations and resources users and service accounts are allowed to access
- With RBAC, users can control access for Google Accounts, Google Cloud service accounts, and Kubernetes service accounts
- Use Kubernetes RBAC and Cloud IAM as the sources of truth
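- A minimal sketch of a namespaced RBAC Role and RoleBinding; the namespace, resource names, and the Google Account email are illustrative assumptions, not values from these notes
```yaml
# Grants read-only access to Pods in an example "prod" namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: prod
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
# Binds the Role to a Google Account; a Google Cloud or Kubernetes
# service account could be bound in the same way
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: prod
subjects:
- kind: User
  name: alice@example.com
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```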
- In Google Kubernetes Engine, the Kubernetes master components are managed and maintained by Google
- The master components host the software that runs the Kubernetes control plane, including the API server, scheduler, controller manager and the etcd database where Kubernetes configuration is persisted
- By default, the master components use a public IP address
- Protect the Kubernetes API server by using master authorized networks and private clusters, which allow users to assign a private IP address to the master and disable access via the public IP address
- Handle cluster authentication in Google Kubernetes Engine by using Cloud IAM as the identity provider
- For enhanced authentication security, disable Basic Authentication by setting an empty username and password for the MasterAuth configuration
- Disable the client certificate, which ensures that there is one less key to think about when locking down access to the cluster
- Another way to help secure a Kubernetes master is to perform credential rotation on a regular basis
- When credential rotation is initiated, the TLS certificates and cluster certificate authority are rotated
- This process is automated by Google Kubernetes Engine and also ensures that the master IP address rotates
- Google Kubernetes Engine deploys workloads on Compute Engine instances running in the Google Cloud project
- These instances are attached to the Google Kubernetes Engine cluster as nodes
- By default, Google Kubernetes Engine nodes use Google's Container-Optimized OS as the operating system on which to run Kubernetes and its components
- Container-Optimized OS implements a locked-down firewall
- Container-Optimized OS implements a read-only filesystem where possible
- Container-Optimized OS implements limited user accounts and disables root login
- A best practice is to patch the OS on a regular basis
- From time to time, security issues in the container runtime, Kubernetes itself, or the node operating system might require an upgrade to the nodes more urgently
- When a node is upgraded, the node's software components are upgraded to their latest versions
- Users can manually upgrade the nodes in the cluster, but Google Kubernetes Engine also allows users to enable automatic upgrades
- For clusters that run unknown or untrusted workloads, a good practice is to protect the operating system on the node from the untrusted workload running in a Pod
- Multi-tenant clusters such as software-as-a-service (SaaS) providers often execute unknown code submitted by their users
- Enable GKE Sandbox on clusters to isolate untrusted workloads in sandboxes on the node
- GKE Sandbox is built using gVisor, an open source project
- Google Kubernetes Engine nodes run as Compute Engine instances, and as such they have access to instance metadata by default
- Instance metadata is used to provide nodes with credentials and configurations used in bootstrapping and connecting to the Kubernetes master nodes
- A Pod running on a node does not necessarily need this information, which contains sensitive data, like the node's service account key
- Users can lock down sensitive instance metadata paths by disabling legacy APIs and by using metadata concealment
- Metadata concealment ensures that Pods running in a cluster are not able to access sensitive data by filtering requests to fields such as the kube-env
- Most workloads running in Google Kubernetes Engine need to communicate with other services that could be running either inside or outside of the cluster
- Use several different methods to control what traffic is allowed to flow through clusters and their Pods
- By default, all Pods in a cluster can be reached over the network via their Pod IP address
- By default, egress traffic allows outbound connections to any address accessible in the VPC into which the cluster was deployed
- Cluster administrators and users can lock down the ingress and egress connections created to and from the Pods in a namespace by using network policies
- By default, when there are no network policies defined, all ingress and egress traffic is allowed to flow into and out of all Pods
- Network policies allow users to use labels to define the traffic flowing to and from Pods
- Once a network policy is applied in a namespace, all traffic is dropped to and from Pods that don't match the configured labels
- As part of cluster or namespace creation, users can apply a default-deny policy to both ingress and egress of every Pod, ensuring that all new workloads added to the cluster must explicitly authorize the traffic they require (see the sketch below)
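- A minimal sketch of such a default-deny NetworkPolicy, applied per namespace; the namespace name is an example, and network policy enforcement must be enabled on the cluster for it to take effect
```yaml
# Denies all ingress and egress for every Pod in the namespace
# until other NetworkPolicies explicitly allow specific traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: prod
spec:
  podSelector: {}        # empty selector matches all Pods in the namespace
  policyTypes:
  - Ingress
  - Egress
```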
- To load balance Kubernetes Pods with a network load balancer, create a Service of type LoadBalancer that matches the Pods' labels
- With the Service created, there will be an external-facing IP that maps to ports on the Kubernetes Pods
- Filtering authorized traffic is achieved at the node level by kube-proxy, which filters based on IP address
- To configure filtering, use the loadBalancerSourceRanges configuration of the Service object
- With this configuration parameter, provide a list of CIDR ranges to whitelist for access to the Service
- If loadBalancerSourceRanges is not configured, all addresses are allowed to access the Service via its external IP
- For cases in which external access to the Service is not required, consider using an internal load balancer
- The internal load balancer also respects the loadBalancerSourceRanges when it is necessary to filter out traffic from inside of the VPC
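- A minimal sketch of a LoadBalancer Service restricted with loadBalancerSourceRanges; the Service name, labels, ports, and CIDR ranges are example values
```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer
  selector:
    app: web                     # must match the Pods' labels
  ports:
  - port: 80
    targetPort: 8080
  loadBalancerSourceRanges:      # only these CIDR ranges may reach the Service's external IP
  - 203.0.113.0/24
  - 198.51.100.0/24
```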
- Kubernetes allows users to quickly provision, scale, and update container-based workloads
- Limiting the privileges of containerized processes is important for the overall security of a cluster
- Google Kubernetes Engine allows users to set security-related options via the Security Context on both Pods and containers
- Security Context settings allow users to change security settings of processes, such as the user and group to run as, the available Linux capabilities, and whether privilege escalation is allowed
- To change Security Context settings at the cluster level rather than at the Pod or container level, implement a PodSecurityPolicy
- Cluster administrators can use PodSecurityPolicies to ensure that all Pods in a cluster adhere to a minimum baseline policy
- The Google Kubernetes Engine node operating systems, both Container-Optimized OS and Ubuntu, apply the default Docker AppArmor security policies to all containers started by Kubernetes
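- A minimal sketch of Pod-level and container-level Security Context settings as described above; the Pod name, image, and numeric IDs are illustrative assumptions
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app
spec:
  securityContext:               # Pod-level: applies to all containers in the Pod
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  containers:
  - name: app
    image: gcr.io/example-project/app:1.0    # example image
    securityContext:             # container-level: further restricts this container
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]            # drop all Linux capabilities not explicitly needed
```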
- The simplest and most secure way to authorize Pods to access Google Cloud resources is with Workload Identity
- Workload Identity allows a Kubernetes service account to run as a Google Cloud service account
- Pods that run as the Kubernetes service account have the permissions of the Google Cloud service account
- Workload Identity can be used with GKE Sandbox
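- A minimal sketch of Workload Identity wiring; the project, namespace, and account names are example values, and an accompanying roles/iam.workloadIdentityUser binding on the Google Cloud service account is also required
```yaml
# Kubernetes service account annotated to impersonate a Google Cloud service account
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-ksa
  namespace: prod
  annotations:
    iam.gke.io/gcp-service-account: app-gsa@example-project.iam.gserviceaccount.com
---
# Pods that use this Kubernetes service account receive the
# permissions of the Google Cloud service account
apiVersion: v1
kind: Pod
metadata:
  name: app
  namespace: prod
spec:
  serviceAccountName: app-ksa
  containers:
  - name: app
    image: gcr.io/example-project/app:1.0    # example image
```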
- Pods can also authenticate to Google Cloud using the node's service account credentials from the metadata server
- However, these credentials can be reached by any Pod running in the cluster
- Create and configure a custom service account that has the minimum Cloud IAM roles that are required by all the Pods running in the cluster
- This approach is not compatible with GKE Sandbox because GKE Sandbox blocks access to the metadata server
- A third way to grant credentials for Google Cloud resources to applications is to manually use the service account's key
- This approach is strongly discouraged because of the difficulty of securely managing account keys
- Application-specific GCP service accounts should be used to provide credentials so that applications have the minimal necessary permissions
- Each service account is assigned only the Cloud IAM roles that are needed for its paired application to operate successfully
- Keeping the service account application-specific makes it easier to revoke its access in the case of a compromise without affecting other applications
- Once a service account has been assigned the correct Cloud IAM roles, a JSON service account key can be created and then mounted into a Pod using a Kubernetes Secret
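- A minimal sketch of mounting such a key into a Pod; the Secret, path, and image names are example values, and the Secret holding the downloaded JSON key would be created separately (for example with kubectl create secret generic)
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: gcr.io/example-project/app:1.0       # example image
    env:
    - name: GOOGLE_APPLICATION_CREDENTIALS      # standard variable read by Google Cloud client libraries
      value: /var/secrets/google/key.json
    volumeMounts:
    - name: gcp-key
      mountPath: /var/secrets/google
      readOnly: true
  volumes:
  - name: gcp-key
    secret:
      secretName: app-sa-key                    # Secret containing the JSON service account key
```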
- Binary Authorization is a service on Google Cloud that provides software supply-chain security for applications that run in the Cloud
- Binary Authorization works with images that deploy to GKE from Container Registry or another container image registry
- With Binary Authorization, users can ensure that internal processes that safeguard the quality and integrity of software have successfully completed before an application is deployed to your production environment
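- A sketch of a Binary Authorization policy that requires attestations before deployment; the project and attestor names are examples, and the exact field names should be checked against the Binary Authorization documentation
```yaml
globalPolicyEvaluationMode: ENABLE
defaultAdmissionRule:
  evaluationMode: REQUIRE_ATTESTATION             # only images with required attestations may deploy
  enforcementMode: ENFORCED_BLOCK_AND_AUDIT_LOG   # block and log non-conformant deployments
  requireAttestationsBy:
  - projects/example-project/attestors/built-by-ci
```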
- Audit logging provides a way for administrators to retain, query, process, and alert on events that occur in Google Kubernetes Engine environments
- Administrators can use the logged information to do forensic analysis, real-time alerting, or to catalog how a fleet of Google Kubernetes Engine clusters is being used and by whom
- By default, Google Kubernetes Engine logs Admin Activity logs
- Users can optionally also log Data Access events, depending on the types of operations they are interested in inspecting
-
Control Plane
- The control plane includes the Kubernetes API server, etcd, and a number of controllers
- Google is responsible for securing the control plane, though users might be able to configure certain options based on their requirements
- Users are responsible for securing their nodes, containers, and Pods.
- GKE control plane components run on Container-Optimized OS, which is a security-hardened operating system designed by Google
- In a GKE cluster, the control plane components run on Compute Engine instances owned by Google, in a Google-managed project
- Each instance runs these components for only one customer
- Authentication to the Kubernetes API server and etcd is done the same way it's done for other Google Cloud services
- Application-layer Transport Security (ALTS) protects these communications
- SSH sessions by Google Site Reliability Engineers are audit logged through Google's internal audit infrastructure, which is available for forensics and security response
- In Google Cloud, customer content is encrypted at the filesystem layer by default
- Disks that host etcd storage for GKE clusters are encrypted at the filesystem layer
- In a regional cluster, communication between etcd servers to establish a quorum is encrypted by mutual TLS
- Each cluster has its own root certificate authority (CA)
- An internal Google service manages root keys for this CA
- Each cluster also has its own CA for etcd
- Root keys for the etcd CA are distributed to the metadata of the VMs that run the Kubernetes API server
- Communication between nodes and the Kubernetes API server is protected by TLS
- GKE adheres to Google standards for testing, qualifying, and gradually rolling out changes to the control plane
- GKE control plane components are managed by a team of Google site reliability engineers, and are kept up to date with the latest security patches
- This includes patches to the host operating system, Kubernetes components, and containers running on the control plane VMs
- GKE applies new kernel, OS, and Kubernetes-level fixes promptly to control plane VMs
- When these contain fixes for known vulnerabilities, additional information is available in the GKE Security Bulletins
- GKE scans all Kubernetes system and GKE-specific containers for vulnerabilities using Container Registry Vulnerability Scanning, and keeps the containers patched, benefitting the whole Kubernetes ecosystem
- Google engineers participate in finding, fixing, and disclosing Kubernetes security bugs
- Google pays external security researchers, through the Google-wide vulnerability reward program, to look for security bugs
- In some instances, Google has been able to patch all running clusters before the vulnerability became public
- Audit Logging is enabled by default
- This provides a detailed record of calls made to the Kubernetes API server
- Users can view the log entries on the Logs page in the GCP console
- Users can also use BigQuery to view and analyze logs
- By default, the Kubernetes API server uses a public IP address
- Protect the Kubernetes API server by using master authorized networks and private clusters, which allow users to assign a private IP address to the Kubernetes API server and disable access via the public IP address
- Users can handle cluster authentication in GKE by using Cloud Identity and Access Management (Cloud IAM) as the identity provider
- Basic Authentication should be disabled by setting an empty username and password for the MasterAuth configuration
- Disable the client certificate, which ensures there is one less key to think about when locking down access to clusters
- Enhance the security of the control plane by doing credential rotation on a regular basis
- When credential rotation is initiated, the TLS certificates and cluster certificate authority are rotated automatically
- GKE also rotates the IP address of the Kubernetes API server
-
Trust
- The master communicates with a node for managing containers
- When the master sends a request to the node, for example for kubectl logs, that request is sent over an SSH tunnel and further protected with unauthenticated TLS, providing integrity and encryption
- When a node sends a request to the master, for example, kubelet to API server, that request is authenticated and encrypted using mutual TLS
- A node may communicate with another node as part of a specific workload
- When the node sends a request to another node, that request is authenticated, and will be encrypted if that connection crosses a physical boundary controlled by Google
- Note that no Kubernetes components require node-to-node communication
- A Pod may communicate with another Pod as part of a specific workload
- When the Pod sends a request to another Pod, that request is neither authenticated nor encrypted
- Note that no Kubernetes components require Pod-to-Pod communication
- Pod-to-Pod traffic can be restricted with a Network Policy, and can be encrypted using a service mesh like Istio or otherwise implementing application-layer encryption
- An instance of etcd may communicate with another instance of etcd to keep state updated
- When an instance of etcd sends a request to another instance, that request is authenticated and encrypted using mutual TLS
- The traffic never leaves a GKE-owned network protected by firewalls
- Master-to-etcd communication is entirely over localhost, and is not authenticated or encrypted
- The cluster root Certificate Authority (CA) is used to validate the API server and kubelets' client certificates; that is, masters and nodes have the same root of trust
- Any kubelet within the cluster node pool can request a certificate from this CA using the certificates.k8s.io API, by submitting a certificate signing request
- A separate per-cluster etcd CA is used to validate etcd's certificates
- The API server and kubelets rely on Kubernetes' cluster root CA for trust
- In GKE, the master API certificate is signed by the cluster root CA
- Each cluster runs its own CA, so that if one cluster's CA were to be compromised, no other cluster CA would be affected
- An internal Google service manages root keys for this CA, which are non-exportable
- This service accepts certificate signing requests, including those from the kubelets in each GKE cluster
- Even if the API server in a cluster were compromised, the CA would not be compromised, so no other clusters would be affected
- Each node in the cluster is injected with a shared Secret at creation, which it can use to submit certificate signing requests to the cluster root CA and obtain kubelet client certificates
- These certificates are then used by the kubelet to authenticate its requests to the API server
- Note that this shared Secret is reachable by Pods, unless metadata concealment is enabled.
- The API server and kubelet certs are valid for five years, but they can be manually rotated sooner by performing a credential rotation
- etcd relies on a separate per-cluster etcd CA for trust in GKE
- Root keys for the etcd CA are distributed to the metadata of each VM on which the master runs
- Any code executing on master VMs, or with access to compute metadata for these VMs, can sign certificates as this CA
- Even if etcd in a cluster were compromised, the CA is not shared between clusters, so no other clusters would be affected
- The etcd certs are valid for five years
- To rotate a cluster's API server and kubelet certificates, perform a credential rotation
- There is no way to trigger a rotation of the etcd certificates; this is managed in GKE
- Performing a credential rotation causes GKE to upgrade all node pools to the closest supported node version, and causes brief downtime for the cluster API
-
Shielded Nodes
- Shielded GKE Nodes are built on top of Compute Engine Shielded VMs
- Shielded GKE Nodes provide Node OS provenance check, a cryptographically verifiable check to make sure the node OS is running on a virtual machine in a Google data center
- Shielded GKE Nodes provide enhanced rootkit and bootkit protection against gaining persistence in the node, using secure and measured boot, a virtual Trusted Platform Module (vTPM), UEFI firmware, and integrity monitoring
- Shielded GKE Nodes can be used with GPUs
- There is no additional cost to run Shielded GKE Nodes
- Shielded GKE Nodes are available in all zones and regions
- Shielded GKE Nodes can be used with Container-Optimized OS (COS), COS with containerd, and Ubuntu node images
- After Shielded GKE Nodes is enabled for a cluster, any nodes created in a node pool without Shielded GKE Nodes enabled or created outside of any node pool aren't able to join the cluster
-
Sandbox
- GKE Sandbox provides an extra layer of security to prevent untrusted code from affecting the host kernel on cluster nodes
- A container runtime such as docker or containerd provides some degree of isolation between the container's processes and the kernel running on the node
- The container runtime can run as a privileged user on the node and has access to most system calls into the host kernel
- Multi-tenant clusters and clusters whose containers run untrusted workloads are more exposed to security vulnerabilities than other clusters
- Examples include SaaS providers, web-hosting providers, or other organizations that allow their users to upload and run code
- A flaw in the container runtime or in the host kernel could allow a process running within a container to "escape" the container and affect the node's kernel, potentially bringing down the node
- The potential also exists for a malicious tenant to gain access to and exfiltrate another tenant's data in memory or on disk, by exploiting such a defect
- An untrusted workload could potentially access other Google Cloud services or cluster metadata
- gVisor is a userspace re-implementation of the Linux kernel API that does not need elevated privileges
- In conjunction with a container runtime such as containerd, the userspace kernel re-implements the majority of system calls and services them on behalf of the host kernel
- Direct access to the host kernel is limited
- From the container's point of view, gVisor is nearly transparent, and does not require any changes to the containerized application
- When GKE Sandbox is enabled on a node pool, a sandbox is created for each Pod running on a node in that node pool
- In addition, nodes running sandboxed Pods are prevented from accessing other Google Cloud services or cluster metadata
- Pods that do not run in a sandbox are called regular Pods
- Each sandbox uses its own userspace kernel
- Decisions can be made about how to group containers into Pods, based on the level of isolation required and the characteristics of applications
- GKE Sandbox is an especially good fit for untrusted or third-party applications using runtimes such as Rust, Java, Python, PHP, Node.js, or Golang
- GKE Sandbox is a good fit for web server front-ends, caches, or proxies
- GKE Sandbox is a good fit for applications processing external media or data using CPUs
- GKE Sandbox is a good fit for machine-learning workloads using CPUs
- GKE Sandbox is a good fit for CPU-intensive or memory-intensive applications
- It is highly recommended that users specify resource limits on all containers running in a sandbox
- This protects against the risk of a defective or malicious application starving the node of resources and negatively impacting other applications or system processes running on the node
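- A minimal sketch of a sandboxed Pod with resource limits as recommended above; the gvisor RuntimeClass is provided by GKE Sandbox, while the Pod name, image, and resource values are example assumptions
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: untrusted-app
spec:
  runtimeClassName: gvisor       # runs the Pod in a GKE Sandbox on a sandbox-enabled node pool
  containers:
  - name: app
    image: gcr.io/example-project/untrusted-app:1.0   # example image
    resources:
      requests:
        cpu: 250m
        memory: 256Mi
      limits:                    # limits protect the node from a runaway or malicious workload
        cpu: 500m
        memory: 512Mi
```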
- GKE Sandbox works well with many applications, but not all
- GKE Sandbox protects the cluster from untrusted or third-party workloads
- There is generally no advantage to running trusted first-party workloads in a sandbox
- GKE Sandbox cannot be enabled on the default node pool
- When using GKE Sandbox, the cluster must have at least two node pools
- There must be at least one node pool where GKE Sandbox is disabled
- This node pool must contain at least one node, even if all workloads are sandboxed
- Nodes running sandboxed Pods are prevented from accessing cluster metadata at the level of the operating system on the node
- Regular Pods can run on a node with GKE Sandbox enabled
- By default regular Pods cannot access Google Cloud services or cluster metadata
- Use Workload Identity to grant Pods access to Google Cloud services
- gVisor nodes have Hyper-Threading disabled by default to mitigate Microarchitectural Data Sampling (MDS) vulnerabilities announced by Intel
- By default, the container is prevented from opening raw sockets, to reduce the potential for malicious attacks
- Certain network-related tools, such as ping and tcpdump, create raw sockets as part of their core functionality
- To enable raw sockets, explicitly add the NET_RAW capability to the container's security context
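- A minimal sketch of adding the NET_RAW capability for a sandboxed container that needs raw sockets; the Pod name and image are example values
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: network-debug
spec:
  runtimeClassName: gvisor
  containers:
  - name: debug
    image: gcr.io/example-project/net-tools:1.0   # example image with ping/tcpdump
    securityContext:
      capabilities:
        add: ["NET_RAW"]          # allows tools like ping and tcpdump to open raw sockets
```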
- Untrusted code running inside the sandbox may be allowed to reach external services such as database servers, APIs, other containers, and CSI drivers
- These services are running outside the sandbox boundary and need to be individually protected
- An attacker can try to exploit vulnerabilities in these services to break out of the sandbox
- Consider the risk and impact of these services being reachable by the code running inside the sandbox, and apply the necessary measures to secure them
- This includes file system implementations for container volumes such as ext4 and CSI drivers
- CSI drivers run outside the sandbox isolation and may have privileged access to the host and services
- An exploit in these drivers can affect the host kernel and compromise the entire node
- Google recommends running the CSI driver inside a container with the least amount of permissions required, to reduce the exposure in case of an exploit
- The Compute Engine Persistent Disk CSI driver is supported for use with GKE Sandbox
- Imposing an additional layer of indirection for accessing the node's kernel comes with performance trade-offs
- GKE Sandbox provides the most tangible benefit on large multi-tenant clusters where isolation is important
- Keep the following guidelines in mind when testing workloads with GKE Sandbox
- GKE Sandbox might not be a good fit where direct access to the host kernel on the node is needed
-
Risk Management
- To detect potential incidents, Google recommends setting up a process that collects and monitors workload's logs
- Set up alerts based on abnormal events detected in logs
- Alerts notify the security team when something unusual is detected
- The security team can then review the potential incident
- Alerts can be customized based on specific metrics or actions
- For example, alerting on high CPU usage on GKE nodes can reveal nodes that have been compromised for cryptomining
- Alerts should be generated where the user aggregates logs and metrics
- Use GKE's Audit Logging in combination with logs-based alerting in Cloud Logging
- After a user has been alerted to an incident, they should take action
- Fix the vulnerability if possible
- If the root cause of the vulnerability is not known or a fix is not yet available, apply mitigations
- The mitigations might depend on the severity of the incident and certainty that the issue has been identified
- A snapshot of the host VM's disk lets users perform forensics on the VM state at the time of the anomaly after the workload has been redeployed or deleted
- Connecting to the host VM or workload container can provide information about the attacker's actions
- Redeploying a container kills currently running processes in the affected container and restarts them
- Deleting the workload kills currently running processes in the affected container without a restart
- Before taking any of the actions, consider if there will be a negative reaction from the attacker if they are discovered
- The attacker may decide to delete data or destroy workloads
- If the risk is too high, consider more drastic mitigations such as deleting a workload before performing further investigation
- Creating a snapshot of the VM's disk allows forensic investigation after the workload has been redeployed or deleted
- Snapshots can be created while disks are attached to running instances
- Snapshots only capture state written to disk.
- Snapshots do not capture contents of the VM's memory
- In severe incidents, workloads on the same node or the same cluster may also be compromised
- This is known as a container escape
- Monitor all of workloads for abnormal behavior and take appropriate actions
- Consider what access an attacker may have before taking action
- If a user suspects a container has been compromised and are concerned about informing the attacker, connect to the container and inspect it
- Inspecting is useful for quick investigation before taking more disruptive actions
- Inspecting is also the least disruptive approach to the workload, but it doesn't stop the incident
- To avoid logging into a machine with a privileged credential, analyze workloads by setting up live forensics (such as GRR Rapid Response), on-node agents, or network filtering
- For more information on suggested forensics tools, see Security controls and forensic analysis for GKE apps
- By cordoning, draining, and limiting network access to the VM hosting a compromised container, partially isolate the compromised container from the rest of the cluster
- Limiting access to the VM reduces risk but does not prevent an attacker from moving laterally in an environment if they take advantage of a critical vulnerability
- Cordoning and draining a node moves workloads colocated with the compromised container to other VMs in the cluster
- Cordoning and draining reduces an attacker's ability to impact other workloads on the same node
- Cordoning and draining does not necessarily prevent them from inspecting a workload's persistent state
- Google recommends blocking both internal and external traffic from accessing the host VM
- Allow inbound connections to the quarantined VM only from a specific VM on your network or VPC
- The first step is to abandon the VM from the Managed Instance Group that owns it
- Abandoning the VM prevents the node from being marked unhealthy and auto-repaired (re-created) before the investigation is complete
- Creating a firewall between the affected container and other workloads in the same network helps prevent an attacker from moving into other parts of the environment while further analysis is conducted
- Firewalling a VM prevents new outbound connections to other VMs in your cluster using an egress rule.
- Firewalling a VM prevents inbound connections to the compromised VM using an ingress rule
- Adding firewall rules doesn't close existing connections
- Removing the VM's external IP address breaks existing connections from the external internet, although not from inside the network
- An attacker who compromises a privileged container or breaks out of an unprivileged container can access the VM's metadata
- Google recommends using Shielded GKE Nodes to remove the privileged bootstrap keys from the metadata server
- By redeploying a container, start a fresh copy of the container and delete the compromised container
- Redeploy a container by deleting the Pod that hosts it
- If the Pod is managed by a higher-level Kubernetes construct (for example, a Deployment or DaemonSet), deleting the Pod schedules a new Pod
- This Pod runs new containers
-
Audit policy
- In a Kubernetes Engine cluster, the Kubernetes API server writes audit log entries to a backend that is managed by Kubernetes Engine
- As Kubernetes Engine receives log entries from the Kubernetes API server, it writes them to the project's Admin Activity log and Data Access log
- The Kubernetes audit policy defines rules for which events are recorded as log entries, and what data the log entries should include
- The Kubernetes Engine audit policy determines which entries are written to the Admin Activity log and which are written to the Data Access log
- The Kubernetes API server follows a policy that is specified in the --audit-policy-file flag of the kube-apiserver command
- When Kubernetes Engine starts the Kubernetes API server, it supplies an audit policy file by setting the value of the --audit-policy-file flag
- The configure-helper.sh script in the open-source Kubernetes repository generates the audit policy file
- When a person or component makes a request to the Kubernetes API server, the request goes through one or more stages
- Each stage of a request generates an event, which is processed according to a policy
- The policy specifies whether the event should be recorded as a log entry and if so, what data should be included in the log entry
- The Kubernetes audit policy file contains a list of rules
- In the policy file, the first rule that matches an event sets the audit level for the event
- A rule can specify one of the audit levels: None, Metadata, Request, or RequestResponse
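- A sketch of the Kubernetes audit policy file format, showing rules at different audit levels; the rules themselves are illustrative, not GKE's actual policy
```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# The first rule that matches an event sets its audit level
- level: None                   # don't log routine watches by kube-proxy on these resources
  users: ["system:kube-proxy"]
  verbs: ["watch"]
  resources:
  - group: ""
    resources: ["endpoints", "services"]
- level: Metadata               # log request metadata only for Secrets and ConfigMaps, never their contents
  resources:
  - group: ""
    resources: ["secrets", "configmaps"]
- level: RequestResponse        # log full request and response bodies for everything else
```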
-
Metadata
- GKE uses instance metadata to configure node VMs, but some of this metadata is potentially sensitive and should be protected from workloads running on the cluster
- Because each node's service account credentials will continue to be exposed to workloads, ensure that a service account is configured with the minimal permissions it needs
- When attaching a service account to nodes, ensure it cannot be used to circumvent GKE's metadata protections by calling the Compute Engine API to access the node instances directly
- Do not use a service account that has compute.instances.get permission, the Compute Instance Admin role, or other similar permissions, as they allow potential attackers to obtain instance metadata using the Compute Engine API
- The best practice is to restrict the permissions of a node VM by using service account permissions, not access scopes
- Legacy Compute Engine metadata endpoints are disabled by default on new clusters
- GKE's metadata concealment protects some potentially sensitive system metadata from user workloads running on clusters
- Enable metadata concealment to prevent user Pods from accessing certain VM metadata for your cluster's nodes, such as Kubelet credentials and VM instance information
- Metadata concealment protects access to kube-env (which contains Kubelet credentials) and the VM's instance identity token
- Metadata concealment firewalls traffic from user Pods (Pods not running on HostNetwork) to the cluster metadata server, only allowing safe queries
- The firewall prevents user Pods from using Kubelet credentials for privilege escalation attacks, or from using VM identity for instance escalation attacks
- Metadata concealment only protects access to kube-env and the node's instance identity token
- Metadata concealment does not restrict access to the node's service account
- Metadata concealment does not restrict access to other related instance metadata
- Metadata concealment does not restrict access to other legacy metadata APIs