-
Overview
- Multi-tenancy in Google Kubernetes Engine (GKE) refers to one or more clusters that are shared between tenants
- In Kubernetes, a tenant can be defined as either a team responsible for developing and operating one or more workloads, or a set of related workloads, whether operated by one or more teams
- A tenant can be a single workload, such as a Deployment
- Cluster multi-tenancy is often implemented to reduce costs or to consistently apply administration policies across tenants
- Incorrectly configuring a GKE cluster or its associated GKE resources can result in unachieved cost savings, incorrect policy application, or destructive interactions between different tenants' workloads
- Each tenant is a single team developing a single workload
- The platform team owns the clusters and defines the amount of resources each tenant team can use; each tenant can request more
- Each tenant team should be able to deploy their application through the Kubernetes API without having to communicate with the platform team
- A tenant should not be able to affect other tenants in the shared cluster, except through explicit design decisions such as API calls or shared data sources
-
Multi-tenancy
- Cluster multi-tenancy is an alternative to managing many single-tenant clusters
- A multi-tenant cluster is shared by multiple users and/or workloads, which are referred to as "tenants"
- This includes clusters shared by different teams within a single organization and clusters shared by per-customer instances of a SaaS application
- Operators of multi-tenant clusters must isolate tenants from each other to minimize the damage a malicious tenant can do to other tenants
- Cluster resources must be fairly allocated among tenants
- When planning a multi-tenant architecture, consider the layers of resource isolation in Kubernetes: cluster, namespace, node, pod, and container
- Consider the security implications of sharing different types of resources among tenants
- There might be a need to prevent certain workloads from being colocated
- For example, it may be inadvisable to run untrusted code from outside the organization on the same node as sensitive workloads
- Although Kubernetes cannot guarantee perfectly secure isolation between tenants, it offers features that may be sufficient for specific use cases
- Kubernetes allows users to separate each tenant and their Kubernetes resources into their own namespaces
- Policies can be used to enforce tenant isolation
- Policies are usually scoped by namespace and can be used to restrict API access, constrain resource usage, and restrict container privileges
- The tenants of a multi-tenant cluster share extensions, controllers, add-ons, and custom resource definitions
- Cluster operations, security, and auditing are centralized in the cluster control plane
- Operating a multi-tenant cluster reduces management overhead and resource fragmentation
- With a multi-tenant cluster, there is no need to wait for cluster creation to create new tenants
- In an enterprise environment, the tenants of a cluster are distinct teams within the organization
- Typically, each tenant has a corresponding namespace
- Alternative models of multi-tenancy with a tenant per cluster, or a tenant per Google Cloud project, are harder to manage
- Kubernetes network policy can be used to require network traffic between namespaces to be explicitly whitelisted
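- A minimal sketch of such a policy for a hypothetical namespace tenant-a; it allows ingress only from Pods in the same namespace, so cross-namespace traffic must be explicitly allowed by additional rules:
```yaml
# Hypothetical example: allow ingress to Pods in tenant-a only from Pods
# in the same namespace; traffic from other namespaces matches no rule
# and is therefore dropped.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: tenant-a
spec:
  podSelector: {}          # applies to every Pod in the namespace
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector: {}      # any Pod in the same namespace
```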
- Cluster administrator role is for administrators of the entire cluster, who manage all tenants
- Cluster administrators can create, read, update, and delete any policy object
- Cluster administrators can create namespaces and assign them to namespace administrators.
- Namespace administrator role is for administrators of specific, single tenants
- A namespace administrator can manage the users in their namespace
- Developer role can create, read, update, and delete namespaced non-policy objects like Pods, Jobs, and Ingresses
- Developers only have privileges in the namespaces they have access to
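- A sketch of such a developer role, scoped to a hypothetical tenant namespace tenant-a (the resource list is illustrative):
```yaml
# Hypothetical namespaced role for tenant developers: full access to common
# non-policy workload objects, no access to policy objects such as
# ResourceQuotas or NetworkPolicies.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-developer
  namespace: tenant-a
rules:
- apiGroups: [""]
  resources: ["pods", "services", "configmaps", "secrets"]
  verbs: ["create", "get", "list", "watch", "update", "patch", "delete"]
- apiGroups: ["apps"]
  resources: ["deployments", "statefulsets"]
  verbs: ["create", "get", "list", "watch", "update", "patch", "delete"]
- apiGroups: ["batch"]
  resources: ["jobs", "cronjobs"]
  verbs: ["create", "get", "list", "watch", "update", "patch", "delete"]
- apiGroups: ["networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["create", "get", "list", "watch", "update", "patch", "delete"]
```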
- The tenants of a SaaS provider's cluster are the per-customer instances of the application, and the SaaS provider's control plane
- To take advantage of namespace-scoped policies, application instances should be organized into their own namespaces
-
Resource organization
- For enterprise organizations deploying multi-tenant clusters, configuration is needed to manage the additional complexity
- Project configuration is needed to isolate administrative concerns as well as to map the organization structure to cloud identities and accounts
- Controls are needed to manage additional Google Cloud resources, such as databases, logging and monitoring, storage, and networking
- Folders and projects can be used to enforce separation of concerns
- Folders allow teams to set policies that cascade across multiple projects
- Projects can be used to segregate production vs. staging environments and teams from each other
- Control access to Google Cloud resources through Cloud Identity and Access Management (Cloud IAM) policies
- Start by identifying the groups needed for the organization and their scope of operations, then assign the appropriate Cloud IAM role to the group
- Use Google Groups to efficiently assign and manage Cloud IAM for users
- If resources cannot be supported by a single cluster, create more clusters
- To ease deployments across multiple environments that are hosted in different clusters, standardize the namespace naming convention
- Avoid tying the environment name to the namespace name; instead, use the same namespace name across environments
- Using the same name avoids having to change the config files across environments
- Create a tenant-specific Google service account for each distinct workload in a tenant namespace
- This ensures that tenants can manage service accounts for the workloads that they own/deploy in their respective namespaces
- The Kubernetes service account for each namespace can be mapped to one Google service account by using Workload Identity
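- A sketch of this mapping, assuming a hypothetical Kubernetes service account tenant-a-app and Google service account tenant-a-gsa@my-project.iam.gserviceaccount.com; the corresponding roles/iam.workloadIdentityUser binding on the Google service account is assumed to exist:
```yaml
# Hypothetical Kubernetes service account in the tenant namespace, mapped to
# a tenant-owned Google service account via Workload Identity.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tenant-a-app
  namespace: tenant-a
  annotations:
    iam.gke.io/gcp-service-account: tenant-a-gsa@my-project.iam.gserviceaccount.com
```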
- To ensure all tenants that share a cluster have fair access to the cluster resources, enforce resource quotas
- Create a resource quota for each namespace based on the number of Pods deployed by each tenant, and the amount of memory and CPU required by Pods
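- A sketch of such a quota, with illustrative limits for a hypothetical namespace tenant-a:
```yaml
# Hypothetical per-tenant quota: caps the Pod count and the total CPU/memory
# that Pods in the namespace may request or consume.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-a-quota
  namespace: tenant-a
spec:
  hard:
    pods: "100"
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
```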
-
Networking
- To maintain centralized control over network resources, such as subnets, routes, and firewalls, use Shared VPC networks
- Resources in a Shared VPC can communicate with each other securely and efficiently across project boundaries using internal IPs
- Each Shared VPC network is defined and owned by a centralized host project, and can be used by one or more service projects
- Using Shared VPC and Cloud IAM, users can separate network administration from project administration
- Separating network administration from project administration helps implement the principle of least privilege
- When setting up a Shared VPC, configure the subnets and their secondary IP ranges in the VPC
- To determine the subnet size, consider the expected number of tenants, the number of Pods and Services they are expected to run, and the maximum and average Pod size
- Calculating the total cluster capacity needed requires an understanding of the desired instance size and total node count
- With the total number of nodes, the total IP space consumed can be calculated to determine the desired subnet size
- The Node, Pod, and Services IP ranges must all be unique.
- A subnet's primary and secondary IP address ranges cannot overlap
- The maximum number of Pods and Services for a given GKE cluster is limited by the size of the cluster's secondary ranges
- The maximum number of nodes in the cluster is limited by the size of the cluster's subnet's primary IP address range and the cluster's Pod address range
- For flexibility and control over IP address management, configure the maximum number of Pods that can run on a node
- By reducing the maximum number of Pods per node, the CIDR range allocated per node is reduced, requiring fewer IP addresses; for example, GKE reserves a /24 Pod range per node at the default maximum of 110 Pods, but only a /26 when the maximum is lowered to 32
- To calculate subnets for clusters, use the GKE IPAM calculator open source tool
- IP Address Management (IPAM) enables efficient use of IP space/subnets and avoids having overlaps in ranges
- Tenants that require further isolation for resources that run outside the shared clusters may use their own VPC, which is peered to the Shared VPC
- This provides security at the cost of increased complexity and numerous other limitations
-
Security
- Create one cluster per project to reduce the risk of project-level configurations adversely affecting many clusters ("blast radius"), and to provide separation for quota and billing
- Make the production cluster private to disable access to the nodes and manage access to the control plane
- Use private clusters for development and staging environments
- Ensure the control plane for the cluster is regional to provide high availability for multi-tenancy; any disruptions to the control plane will impact tenants
- Create an HTTP(S) load balancer to allow a single ingress per cluster, where each tenant's Services are registered with the cluster's Ingress resource
- Create a Kubernetes Ingress resource to define how traffic reaches Services and how the traffic is routed to a tenant's application
- By registering Services with the Ingress resource, the Services' naming convention becomes consistent, accessible via a single ingress
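- A sketch of host-based routing through a single Ingress on GKE (host names and Service names are illustrative):
```yaml
# Hypothetical Ingress using host-based routing to reach tenant Services.
# Note: plain Kubernetes Ingress backends must live in the same namespace
# as the Ingress object itself.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: shared-ingress
  annotations:
    kubernetes.io/ingress.class: gce   # external HTTP(S) load balancer on GKE
spec:
  rules:
  - host: tenant-a.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: tenant-a-frontend
            port:
              number: 80
  - host: tenant-b.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: tenant-b-frontend
            port:
              number: 80
```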
- To control network communication between Pods in each of the cluster's namespaces, create network policies based on tenants' requirements
- As an initial recommendation, block traffic between namespaces that host different tenants' applications
- The cluster administrator can apply a default deny-all network policy that blocks all ingress traffic, so that Pods from one namespace cannot accidentally send traffic to Services or databases in other namespaces (a sketch follows)
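```yaml
# Hypothetical baseline policy for one tenant namespace: selects every Pod
# and, because it specifies no ingress rules, denies all incoming traffic.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: tenant-a
spec:
  podSelector: {}
  policyTypes:
  - Ingress
```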
- Clusters that run untrusted workloads are more exposed to security vulnerabilities than other clusters
- Use GKE Sandbox to harden the isolation boundaries between workloads for multi-tenant environments
- For security management, Google recommends starting with GKE Sandbox and then using Pod security policies to fill in any gaps
- GKE Sandbox is based on gVisor, an open source container sandboxing project, and provides additional isolation for multi-tenant workloads
- GKE Sandbox adds an extra layer between containers and the host OS
- Container runtimes often run as a privileged user on the node and have access to most system calls into the host kernel
- In a multi-tenant cluster, one malicious tenant can gain access to the host kernel and to other tenants' data
- GKE Sandbox mitigates these threats by reducing the need for containers to interact with the host, shrinking the attack surface of the host, and restricting the movement of malicious actors
- GKE Sandbox provides a user-space kernel, written in Go, that handles system calls and limits interaction with the host kernel
- Each Pod has its own isolated user-space kernel
- The user-space kernel also runs inside namespaces, with seccomp filtering applied to system calls
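- A sketch of opting a workload into GKE Sandbox, assuming a node pool with sandboxing enabled (names and image are illustrative):
```yaml
# Hypothetical Deployment that requests the gVisor sandbox; GKE schedules
# these Pods onto nodes in a sandbox-enabled node pool.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: untrusted-workload
  namespace: tenant-a
spec:
  replicas: 2
  selector:
    matchLabels:
      app: untrusted-workload
  template:
    metadata:
      labels:
        app: untrusted-workload
    spec:
      runtimeClassName: gvisor          # run this Pod in GKE Sandbox
      containers:
      - name: app
        image: us-docker.pkg.dev/my-project/my-repo/app:latest   # illustrative image
```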
- To restrict which Pods can run in a cluster, create Policy Controller constraints that specify the conditions Pods must meet
- Authorize the use of policies for a Pod by binding the Pod's service account to a role that has access to use those policies
- Google recommends defining the most restrictive policy bound to system:authenticated and more permissive policies bound as needed for exceptions
- To ensure that no child process of a container can gain more privileges than its parent, set the allowPrivilegeEscalation parameter to false
- To disallow privilege escalation outside of the container, disable access to the host namespaces (hostNetwork, hostIPC, and hostPID)
- This also blocks snooping on network activity of other Pods on the same node
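- A sketch of a Pod spec that applies these settings (names and image are illustrative):
```yaml
# Hypothetical Pod spec: no privilege escalation for child processes and
# no access to the host's network, IPC, or PID namespaces.
apiVersion: v1
kind: Pod
metadata:
  name: restricted-app
  namespace: tenant-a
spec:
  hostNetwork: false
  hostIPC: false
  hostPID: false
  containers:
  - name: app
    image: us-docker.pkg.dev/my-project/my-repo/app:latest   # illustrative image
    securityContext:
      allowPrivilegeEscalation: false
```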
- To securely grant workloads access to Google Cloud services, enable Workload Identity in the cluster
- Workload Identity helps administrators manage Kubernetes service accounts that Kubernetes workloads use to access Google Cloud services
- When a user creates a cluster with Workload Identity enabled, an Identity Namespace is established for the project that the cluster is housed in
- To protect the control plane, restrict access to authorized networks
- In GKE, when master authorized networks is enabled, users can allowlist a limited set of CIDR ranges so that only IP addresses in those ranges can access the control plane
- GKE uses Transport Layer Security (TLS) and authentication to provide secure access to the cluster master endpoint from the public internet
- By using authorized networks, users can further restrict access to specified sets of IP addresses
- To host a tenant's non-cluster resources, create a service project for each tenant
- These service projects contain logical resources specific to the tenant applications (for example, logs, monitoring, storage buckets, service accounts, etc.)
- All tenant service projects are connected to the Shared VPC in the tenant host project
- Define finer-grained access to cluster resources for tenants by using Kubernetes RBAC
- On top of the read-only access initially granted with Cloud IAM to tenant groups, define namespace-wide Kubernetes RBAC roles and bindings for each tenant group
- In addition to the RBAC roles and bindings that grant Google Workspace or Cloud Identity groups permissions inside their namespace, tenant admins often require the ability to manage the users in each of those groups
- To efficiently manage tenant permissions in a cluster, bind RBAC permissions to Google Groups
- The membership of those groups is maintained by Google Workspace administrators, so cluster administrators do not need detailed information about users
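- A sketch of such a binding, granting a hypothetical Google Group the tenant-developer role from the earlier sketch inside its namespace (Google Groups for RBAC is assumed to be enabled on the cluster):
```yaml
# Hypothetical RoleBinding: grants members of a tenant Google Group the
# tenant-developer Role inside their namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-a-developers
  namespace: tenant-a
subjects:
- kind: Group
  name: tenant-a-developers@example.com   # Google Group, illustrative
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: tenant-developer
  apiGroup: rbac.authorization.k8s.io
```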
- To provide a logical isolation between tenants that are on the same cluster, implement namespaces
- As part of the Kubernetes RBAC process, the cluster admin creates namespaces for each tenant group
- The Tenant admin manages users (tenant developers) within their respective tenant namespace
- Tenant developers are then able to use cluster and tenant specific resources to deploy their applications
-
Availability
- There are cost implications with running regional clusters
- Ensure the nodes in the cluster span at least three zones to achieve zonal reliability
- There are cost implications for egress between zones in the same region
- To accommodate the demands of tenants, automatically scale nodes in the cluster by enabling autoscaling
- Autoscaling helps systems appear responsive and healthy when heavy workloads are deployed by various tenants in their namespaces, or to respond to zonal outages
- When enabling autoscaling, specify the minimum and maximum number of nodes in a cluster based on the expected workload sizes
- By specifying the maximum number of nodes, users can ensure there is enough space for all Pods in the cluster, regardless of the namespace they run in
- Cluster autoscaling rescales node pools based on the min/max boundary, helping to reduce operational costs when the system load decreases
- Cluster autoscaling helps avoid Pods going into a pending state when there aren't enough available cluster resources
- To determine the maximum number of nodes, identify the maximum total amount of CPU and memory that each tenant requires
- Using the maximum number of nodes, users can choose instance sizes and counts, taking into consideration the IP subnet space made available to the cluster
- Use Pod autoscaling to automatically scale Pods based on resource demands
- Vertical Pod Autoscaler (VPA) scales the CPU/memory allocated to existing Pods, while Horizontal Pod Autoscaler (HPA) scales the number of Pod replicas based on CPU/memory utilization or custom metrics
- Unlike VPA, HPA does not modify the workload's configured requests; it scales only the number of replicas
- Do not use VPA with HPA on the same Pods unless scaling is based on different metrics
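- A sketch of an HPA for a hypothetical tenant Deployment (names and thresholds are illustrative):
```yaml
# Hypothetical HPA: scales the Deployment between 2 and 10 replicas to keep
# average CPU utilization around 70%.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: tenant-a-frontend
  namespace: tenant-a
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: tenant-a-frontend
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```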
- The sizing of the cluster is dependent on the type of workloads
- If workloads have greater density, the cost efficiency is higher but there is also a greater chance for resource contention
- The minimum size of a cluster is defined by the number of zones it spans: one node for a zonal cluster and three nodes for a regional cluster
- To reduce downtimes during cluster/node upgrades and maintenance, schedule maintenance windows to occur during off-peak hours
- During upgrades, there can be temporary disruptions when workloads are moved to recreate nodes
- To ensure minimal impact of such disruptions, schedule upgrades for off-peak hours and design application deployments to handle partial disruptions seamlessly, if possible
-
Metering
- To obtain cost breakdowns on individual namespaces and labels in a cluster, enable GKE usage metering
- GKE usage metering tracks information about resource requests and resource usage of a cluster's workloads, which can be further broken down by namespaces and labels
- With GKE usage metering, users can approximate the cost breakdown for departments/teams that are sharing a cluster.
- GKE usage metering enables users to understand the usage patterns of individual applications (or even components of a single application)
- GKE usage metering helps cluster admins triage spikes in usage and provides data for better capacity planning and budgeting
- When GKE usage metering is enabled on the multi-tenant cluster, resource usage records are written to a BigQuery table
- Tenant-specific metrics can be exported to BigQuery datasets in the corresponding tenant project, which auditors can then analyze to determine cost breakdowns
- Auditors can visualize GKE usage metering data by creating dashboards with plug-and-play Google Data Studio templates
- Tenants can be provided with logs data specific to their project workloads by using Stackdriver Kubernetes Engine Monitoring
- Cloud Monitoring manages both the Monitoring and Logging services together and provides a dashboard customized for GKE clusters
- To create tenant-specific logs, the cluster admin creates a sink to export log entries to BigQuery datasets, filtered by tenant namespace
- The exported data in BigQuery can then be accessed by the tenants
- To provide tenant-specific monitoring, the cluster admin can use a dedicated namespace that contains a Prometheus to Stackdriver adapter (prometheus-to-sd) with a per-namespace configuration
- This configuration ensures tenants can only monitor their own metrics in their projects
- However, the downside to this design is the extra cost of managing Prometheus deployment(s)
- Alternatively, teams can accept shared tenancy within the Monitoring environment and allow tenants to have visibility into all metrics in the projects
- A single Grafana instance can be deployed per tenant, which communicates with the shared Monitoring environment.
- Configure the Grafana instance to only view the metrics from a particular namespace
- The downside to this option is the cost and overhead of managing these additional deployments of Grafana
-
Implementation
-
Organizational setup
- Define resource hierarchy
- Create folders based on organizational hierarchy and environmental needs
- Create host and service projects for clusters and tenants
-
Identity and access management
- Identify and create a set of Google Groups for organization
- Assign users and Cloud IAM policies to the groups
- Refine tenant access with namespace-scoped roles and role bindings
- Grant tenant admin access to manage tenant users
-
Networking
- Create per-environment Shared VPC networks for the tenant and cluster networks
-
High availability and reliability
- Create one cluster per project to reduce the "blast radius"
- Create the cluster as a private cluster
- Ensure the control plane for the cluster is regional
- Span nodes for the cluster over at least three zones
- Enable cluster autoscaling and Pod autoscaling
- Specify maintenance windows to occur during off-peak hours
- Create an HTTP(S) load balancer to allow a single ingress per multi-tenant cluster
-
Security
- Create namespaces to provide isolation between tenants that are on the same cluster
- Create network policies to restrict communication between Pods
- Mitigate threats by running workloads on GKE Sandbox
- Create Pod Security Policies to constrain how Pods operate on clusters
- Enable Workload Identity to manage Kubernetes service accounts and access
- Enable master authorized networks to restrict access to the control plane
-
Logging and monitoring
- Enforce resource quotas for each namespace
- Track usage metrics with GKE usage metering
- Set up tenant-specific logging with Kubernetes Engine Monitoring
- Set up tenant-specific monitoring