Harden your cluster's security


With the speed of development in Kubernetes, there are often new security features for you to use. This page guides you through implementing our current guidance for hardening your Google Kubernetes Engine (GKE) cluster.

This guide prioritizes high-value security mitigations that require customer action at cluster creation time. Less critical features, secure-by-default settings, and features that you can enable after cluster creation are covered later in the document. For a general overview of security topics, read the Security Overview.

If you are creating new clusters in GKE, many of these protections are enabled by default. If you are upgrading existing clusters, make sure to regularly review this hardening guide and enable new features.

Clusters created in the Autopilot mode implement many GKE hardening features by default.

Many of these recommendations, as well as other common misconfigurations, can be automatically checked using Security Health Analytics.

Where the recommendations below relate to a CIS GKE Benchmark Recommendation, this is specified.

Upgrade your GKE infrastructure in a timely fashion

CIS GKE Benchmark Recommendation: 6.5.3. Ensure Node Auto-Upgrade is enabled for GKE nodes

Keeping the version of Kubernetes up to date is one of the simplest things you can do to improve your security. Kubernetes frequently introduces new security features and provides security patches.

See the GKE security bulletins for information on security patches.

In Google Kubernetes Engine, the control planes are patched and upgraded for you automatically. Node auto-upgrade also automatically upgrades nodes in your cluster.

Node auto-upgrade is enabled by default for clusters created using the Google Cloud console since June 2019, and for clusters created using the API starting November 11, 2019.

If you choose to disable node auto-upgrade, we recommend upgrading monthly on your own schedule. Older clusters should opt in to node auto-upgrade and closely follow the GKE security bulletins for critical patches.
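For an existing Standard node pool, opting in to node auto-upgrade might look like the following sketch. The cluster, node pool, and location values are hypothetical placeholders, and the run helper is an illustrative dry-run wrapper that is not part of the gcloud CLI; it prints the command so that you can review it before running it against a real project.

```shell
# Hypothetical placeholder values; replace them with your own.
CLUSTER_NAME="example-cluster"
NODE_POOL_NAME="default-pool"
LOCATION="us-central1"

# Illustrative dry-run helper (not part of gcloud): prints the command.
# Set DRY_RUN=0 to run the command against your project.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Turn on node auto-upgrade for an existing node pool.
enable_auto_upgrade() {
  run gcloud container node-pools update "$NODE_POOL_NAME" \
      --cluster="$CLUSTER_NAME" \
      --location="$LOCATION" \
      --enable-autoupgrade
}

enable_auto_upgrade
```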

To learn more, see Auto-upgrading nodes.

Restrict network access to the control plane and nodes

CIS GKE Benchmark Recommendations: 6.6.2. Prefer VPC-native clusters, 6.6.3. Ensure Authorized Networks is Enabled, 6.6.4. Ensure clusters are created with Private Endpoint Enabled and Public Access Disabled, and 6.6.5. Ensure clusters are created with Private Nodes

By default, the GKE cluster control plane and nodes have internet-routable addresses that can be accessed from any IP address.

Best practice:

Limit exposure of your cluster control plane and nodes to the internet.

Restrict access to the control plane

To restrict access to the GKE cluster control plane, see Configure the control plane access. The following are the options you have for network-level protection:

  • DNS-based endpoint enabled (recommended): You can control who can access the DNS-based endpoint with VPC Service Controls. VPC Service Controls lets you define one security perimeter for all Google APIs in your project with context-aware attributes such as network origin. These settings can be controlled centrally for a project across all Google APIs, reducing the number of places where you'd have to configure access rules.

  • External and internal IP-based endpoints access disabled: This prevents all access to the control plane through IP-based endpoints.

  • External IP-based endpoint access disabled: This prevents all internet access to the control plane. This is a good choice if you have configured your on-premises network to connect to Google Cloud using Cloud Interconnect or Cloud VPN. Those technologies effectively connect your company network to your cloud VPC.

  • External IP-based endpoint access enabled, authorized networks enabled: This option provides restricted access to the control plane from source IP addresses that you define. This is a good choice if you don't have existing VPN infrastructure or have remote users or branch offices that connect over the public internet instead of the corporate VPN and Cloud Interconnect or Cloud VPN.

  • External IP-based endpoint access enabled, authorized networks disabled: This allows anyone on the internet to make network connections to the control plane.

If using IP-based endpoints, we recommend clusters use authorized networks.

This ensures the control plane is reachable by:

  • The allowed CIDRs in authorized networks.
  • Nodes within your cluster's VPC.
  • Google-reserved IP addresses for cluster management purposes.
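As a rough sketch, enabling authorized networks on an existing cluster that uses the external IP-based endpoint could look like the following. The cluster name, location, and CIDR range (taken from a range reserved for documentation) are hypothetical, and the run helper is an illustrative dry-run wrapper rather than a gcloud feature.

```shell
# Hypothetical placeholder values; replace them with your own.
CLUSTER_NAME="example-cluster"
LOCATION="us-central1"
AUTHORIZED_CIDR="203.0.113.0/24"   # documentation-only range

# Illustrative dry-run helper (not part of gcloud): prints the command.
# Set DRY_RUN=0 to run the command against your project.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Allow control plane access only from the listed CIDR ranges.
enable_authorized_networks() {
  run gcloud container clusters update "$CLUSTER_NAME" \
      --location="$LOCATION" \
      --enable-master-authorized-networks \
      --master-authorized-networks="$AUTHORIZED_CIDR"
}

enable_authorized_networks
```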

Restrict access to nodes

Best practice:

Enable private nodes on your clusters to prevent external clients from accessing the nodes.

To disable direct internet access to nodes, specify the gcloud CLI option --enable-private-nodes at cluster creation.

This tells GKE to provision nodes with internal IP addresses, which means the nodes aren't directly reachable over the public internet.
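A minimal sketch of creating a cluster with private nodes follows. It uses only the --enable-private-nodes option named above plus a location; the cluster name and location are hypothetical, a real cluster usually needs further networking options, and the run helper is an illustrative dry-run wrapper that is not part of gcloud.

```shell
# Hypothetical placeholder values; replace them with your own.
CLUSTER_NAME="example-cluster"
LOCATION="us-central1"

# Illustrative dry-run helper (not part of gcloud): prints the command.
# Set DRY_RUN=0 to run the command against your project.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Create a cluster whose nodes receive internal IP addresses only.
create_private_node_cluster() {
  run gcloud container clusters create "$CLUSTER_NAME" \
      --location="$LOCATION" \
      --enable-private-nodes
}

create_private_node_cluster
```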

Use least-privilege firewall rules

Minimize the risk of unintended access by using the principle of least privilege for firewall rules.

GKE creates default VPC firewall rules to enable system functionality and to enforce good security practices. For a full list of automatically created firewall rules, see Automatically created firewall rules.

GKE creates these default firewall rules with a priority of 1000. In VPC firewall rules, a lower number means a higher priority. If you create permissive firewall rules with a higher priority (a priority value lower than 1000), for example an allow-all firewall rule for debugging, your cluster is at risk of unintended access.
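One way to spot rules that take precedence over the GKE defaults is to list the firewall rules for the cluster's network sorted by priority and look for permissive entries with a value below 1000. The following is a sketch under assumptions: the network name is hypothetical, the chosen columns are only one possible view, and the run helper is an illustrative dry-run wrapper. The printed preview drops shell quoting, so treat it as something to read rather than paste.

```shell
# Hypothetical placeholder value; replace it with your cluster's network.
NETWORK_NAME="default"

# Illustrative dry-run helper (not part of gcloud): prints the command.
# Set DRY_RUN=0 to run the command against your project.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# List firewall rules on the network, highest priority (lowest number) first.
list_firewall_rules() {
  run gcloud compute firewall-rules list \
      --filter="network:$NETWORK_NAME" \
      --sort-by=priority \
      --format="table(name,priority,direction,sourceRanges.list())"
}

list_firewall_rules
```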

Group authentication

CIS GKE Benchmark Recommendation: 6.8.3. Consider managing Kubernetes RBAC users with Google Groups for RBAC

You should use groups to manage your users. Groups let you control identities through your identity management system and identity administrators. When you adjust group membership, you don't need to update your RBAC configuration each time someone is added to or removed from the group.

To manage user permissions using Google Groups, you must enable Google Groups for RBAC on your cluster. This allows you to manage users with the same permissions easily, while allowing your identity administrators to manage users centrally and consistently.

See Google Groups for RBAC for instructions on enabling Google Groups for RBAC.
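To illustrate what group-based RBAC looks like once the feature is enabled, the following sketch writes a RoleBinding that grants the built-in view ClusterRole to a Google Group in a single namespace. The namespace, group address, and file name are hypothetical, and the sketch assumes the group is already set up as Google Groups for RBAC requires.

```shell
# Hypothetical names; replace them with your own.
NAMESPACE="team-a"
GROUP_EMAIL="team-a-devs@example.com"

# Write a RoleBinding that grants read-only access in one namespace
# to every member of the group.
cat > group-viewer-binding.yaml <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: group-viewer
  namespace: ${NAMESPACE}
subjects:
- kind: Group
  name: ${GROUP_EMAIL}
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
EOF

# After review, apply it with: kubectl apply -f group-viewer-binding.yaml
cat group-viewer-binding.yaml
```

Because the binding names the group rather than individual users, membership changes in the group take effect without editing this manifest.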

Container node choices

The following sections describe secure node configuration choices.

Enable Shielded GKE Nodes

CIS GKE Benchmark Recommendation: 6.5.5. Ensure Shielded GKE Nodes are enabled

Shielded GKE Nodes provide strong, verifiable node identity and integrity to increase the security of GKE nodes and should be enabled on all GKE clusters.

You can enable Shielded GKE Nodes at cluster creation or update. Shielded GKE Nodes should be enabled with secure boot. Secure boot should not be used if you need third-party unsigned kernel modules. For instructions on how to enable Shielded GKE Nodes, and how to enable secure boot with Shielded GKE Nodes, see Using Shielded GKE Nodes.
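As a sketch of the two steps described above, the following enables Shielded GKE Nodes on an existing cluster and then creates a node pool with secure boot turned on. All names are hypothetical placeholders, and the run helper is an illustrative dry-run wrapper that is not part of gcloud.

```shell
# Hypothetical placeholder values; replace them with your own.
CLUSTER_NAME="example-cluster"
NODE_POOL_NAME="secure-boot-pool"
LOCATION="us-central1"

# Illustrative dry-run helper (not part of gcloud): prints the command.
# Set DRY_RUN=0 to run the command against your project.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Enable Shielded GKE Nodes on an existing cluster.
enable_shielded_nodes() {
  run gcloud container clusters update "$CLUSTER_NAME" \
      --location="$LOCATION" \
      --enable-shielded-nodes
}

# Create a node pool with secure boot. Skip this if your workloads
# need third-party unsigned kernel modules.
create_secure_boot_pool() {
  run gcloud container node-pools create "$NODE_POOL_NAME" \
      --cluster="$CLUSTER_NAME" \
      --location="$LOCATION" \
      --shielded-secure-boot
}

enable_shielded_nodes
create_secure_boot_pool
```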

Choose a hardened node image with the containerd runtime

The Container-Optimized OS with containerd (cos_containerd) image is a variant of the Container-Optimized OS image with containerd as the main container runtime directly integrated with Kubernetes.

containerd is the core runtime component of Docker and has been designed to deliver core container functionality for the Kubernetes Container Runtime Interface (CRI). It is significantly less complex than the full Docker daemon, and therefore has a smaller attack surface.

To use the cos_containerd image in your cluster, see Containerd images.

The cos_containerd image is the preferred image for GKE because it has been custom built, optimized, and hardened specifically for running containers.

Enable Workload Identity Federation for GKE

CIS GKE Benchmark Recommendation: 6.2.2. Prefer using dedicated Google Cloud Service Accounts and Workload Identity

Workload Identity Federation for GKE is the recommended way to authenticate to Google Cloud APIs.

Workload Identity Federation for GKE replaces the need to use Metadata Concealment and as such, the two approaches are incompatible. The sensitive metadata protected by Metadata Concealment is also protected by Workload Identity Federation for GKE.
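Enabling Workload Identity Federation for GKE on an existing Standard cluster could be sketched as follows. The project ID, cluster name, and location are hypothetical, and the run helper is an illustrative dry-run wrapper that is not part of gcloud.

```shell
# Hypothetical placeholder values; replace them with your own.
PROJECT_ID="example-project"
CLUSTER_NAME="example-cluster"
LOCATION="us-central1"

# Illustrative dry-run helper (not part of gcloud): prints the command.
# Set DRY_RUN=0 to run the command against your project.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Set the cluster's workload identity pool.
enable_workload_identity() {
  run gcloud container clusters update "$CLUSTER_NAME" \
      --location="$LOCATION" \
      --workload-pool="$PROJECT_ID.svc.id.goog"
}

enable_workload_identity
```

Node pools that existed before the change typically also need to be updated to use the GKE metadata server before their workloads can use the feature.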

Harden workload isolation with GKE Sandbox

CIS GKE Benchmark Recommendation: 6.10.4. Consider GKE Sandbox for hardening workload isolation, especially for untrusted workloads

GKE Sandbox provides an extra layer of security to prevent malicious code from affecting the host kernel on your cluster nodes.

You can run containers in a sandboxed environment to mitigate most container escape attacks, also called local privilege escalation attacks. This type of attack lets an attacker gain access to the host VM of the container, and therefore gain access to other containers on the same VM. For past container escape vulnerabilities, refer to the security bulletins. A sandbox such as GKE Sandbox can help limit the impact of these attacks.

You should consider sandboxing a workload in situations such as:

  • The workload runs untrusted code.
  • You want to limit the impact if an attacker compromises a container in the workload.

Learn how to use GKE Sandbox in Harden workload isolation with GKE Sandbox.
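For illustration, a Pod opts in to the sandbox by setting its runtime class to gvisor. The following sketch writes such a manifest; the Pod name, image, and file name are hypothetical, and it assumes the cluster already has a node pool with GKE Sandbox enabled.

```shell
# Hypothetical names; replace them with your own.
POD_NAME="untrusted-workload"
IMAGE="nginx"

# Write a Pod manifest that requests the gVisor-based sandbox runtime.
cat > sandboxed-pod.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${POD_NAME}
spec:
  runtimeClassName: gvisor
  containers:
  - name: app
    image: ${IMAGE}
EOF

# After review, apply it with: kubectl apply -f sandboxed-pod.yaml
cat sandboxed-pod.yaml
```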

Enable security bulletin notifications

When security bulletins are available that are relevant to your cluster, GKE publishes notifications about those events as messages to Pub/Sub topics that you configure. You can receive these notifications on a Pub/Sub subscription, integrate with third-party services, and filter for the notification types you want to receive.

For more information about receiving security bulletins using GKE cluster notifications, see Cluster notifications.
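A sketch of pointing cluster notifications at a Pub/Sub topic follows. It assumes the topic already exists; the project, topic, cluster, and location values are hypothetical, and the run helper is an illustrative dry-run wrapper that is not part of gcloud. Filtering to security bulletin events only is possible but not shown here.

```shell
# Hypothetical placeholder values; replace them with your own.
PROJECT_ID="example-project"
TOPIC_NAME="gke-notifications"
CLUSTER_NAME="example-cluster"
LOCATION="us-central1"

# Illustrative dry-run helper (not part of gcloud): prints the command.
# Set DRY_RUN=0 to run the command against your project.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Publish cluster notifications to an existing Pub/Sub topic.
enable_notifications() {
  run gcloud container clusters update "$CLUSTER_NAME" \
      --location="$LOCATION" \
      --notification-config="pubsub=ENABLED,pubsub-topic=projects/$PROJECT_ID/topics/$TOPIC_NAME"
}

enable_notifications
```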

Disable the insecure kubelet read-only port

Disable the kubelet read-only port and switch any workloads that use port 10255 to use the more secure port 10250 instead.

The kubelet process running on nodes serves a read-only API using the insecure port 10255. Kubernetes doesn't perform any authentication or authorization checks on this port. The kubelet serves the same endpoints on the more secure, authenticated port 10250.

For instructions, see Disable the kubelet read-only port in GKE clusters.
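As a sketch, disabling the read-only port on an existing cluster might look like the following. The cluster name and location are hypothetical, the run helper is an illustrative dry-run wrapper that is not part of gcloud, and you should confirm that the flag is available in your gcloud CLI version before relying on it.

```shell
# Hypothetical placeholder values; replace them with your own.
CLUSTER_NAME="example-cluster"
LOCATION="us-central1"

# Illustrative dry-run helper (not part of gcloud): prints the command.
# Set DRY_RUN=0 to run the command against your project.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Turn off the unauthenticated kubelet port 10255.
disable_readonly_port() {
  run gcloud container clusters update "$CLUSTER_NAME" \
      --location="$LOCATION" \
      --no-enable-insecure-kubelet-readonly-port
}

disable_readonly_port
```

Before making the change, check that no workload still scrapes port 10255; such workloads need to move to the authenticated port 10250 first.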

Permissions

Use least privilege IAM service accounts

CIS GKE Benchmark Recommendation: 6.2.1. Prefer not running GKE clusters using the Compute Engine default service account

GKE uses IAM service accounts that are attached to your nodes to run system tasks like logging and monitoring. At a minimum, these node service accounts must have the Kubernetes Engine Default Node Service Account (roles/container.defaultNodeServiceAccount) role on your project. By default, GKE uses the Compute Engine default service account, which is automatically created in your project, as the node service account.

If you use the Compute Engine default service account for other functions in your project or organization, the service account might have more permissions than GKE needs, which could expose you to security risks.

The service account that's attached to your nodes should be used only by system workloads that perform tasks like logging and monitoring. For your own workloads, provision identities using Workload Identity Federation for GKE.

To create a custom service account and grant it the required role for GKE, complete the following steps:

console

  1. In the Google Cloud console, enable the Cloud Resource Manager API:

    Enable the API

  2. Go to the Service accounts page:

    Go to Service accounts

  3. Click Create service account.
  4. Enter a name for the service account. The Service account ID field automatically generates a unique ID for the service account based on the name.
  5. Click Create and continue.
  6. In the Select a role menu, select the Kubernetes Engine Default Node Service Account role.
  7. Click Done.

gcloud

  1. Enable the Cloud Resource Manager API:
    gcloud services enable cloudresourcemanager.googleapis.com
  2. Create the service account:
    gcloud iam service-accounts create SERVICE_ACCOUNT_ID \
        --display-name=DISPLAY_NAME

    Replace the following:

    • SERVICE_ACCOUNT_ID: a unique ID for the service account.
    • DISPLAY_NAME: a display name for the service account.
  3. Grant the Kubernetes Engine Default Node Service Account (roles/container.defaultNodeServiceAccount) role to the service account:
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member="serviceAccount:SERVICE_ACCOUNT_ID@PROJECT_ID.iam.gserviceaccount.com" \
        --role=roles/container.defaultNodeServiceAccount

    Replace the following:

    • PROJECT_ID: your Google Cloud project ID.
    • SERVICE_ACCOUNT_ID: the service account ID that you created.

Config Connector

Note: This step requires Config Connector. Follow the installation instructions to install Config Connector on your cluster.

  1. To create the service account, download the following resource as service-account.yaml:
    apiVersion: iam.cnrm.cloud.google.com/v1beta1
    kind: IAMServiceAccount
    metadata:
      name: [SA_NAME]
    spec:
      displayName: [DISPLAY_NAME]

    Replace the following:

    • [SA_NAME]: the name of the new service account.
    • [DISPLAY_NAME]: a display name for the service account.
  2. Create the service account:
    kubectl apply -f service-account.yaml
  3. Apply the roles/logging.logWriter role to the service account:
    1. Download the following resource as policy-logging.yaml.
      apiVersion: iam.cnrm.cloud.google.com/v1beta1
      kind: IAMPolicyMember
      metadata:
        name: policy-logging
      spec:
        member: serviceAccount:[SA_NAME]@[PROJECT_ID].iam.gserviceaccount.com
        role: roles/logging.logWriter
        resourceRef:
          kind: Project
          name: [PROJECT_ID]

      Replace the following:

      • [SA_NAME]: the name of the service account.
      • [PROJECT_ID]: your Google Cloud project ID.
    2. Apply the role to the service account:
      kubectl apply -f policy-logging.yaml
  4. Apply the roles/monitoring.metricWriter role to the service account:
    1. Download the following resource as policy-metrics-writer.yaml. Replace [SA_NAME] and [PROJECT_ID] with your own information.
      apiVersion: iam.cnrm.cloud.google.com/v1beta1
      kind: IAMPolicyMember
      metadata:
        name: policy-metrics-writer
      spec:
        member: serviceAccount:[SA_NAME]@[PROJECT_ID].iam.gserviceaccount.com
        role: roles/monitoring.metricWriter
        resourceRef:
          kind: Project
          name: [PROJECT_ID]

      Replace the following:

      • [SA_NAME]: the name of the service account.
      • [PROJECT_ID]: your Google Cloud project ID.
    2. Apply the role to the service account:
      kubectl apply -f policy-metrics-writer.yaml
  5. Apply the roles/monitoring.viewer role to the service account:
    1. Download the following resource as policy-monitoring.yaml.
      apiVersion: iam.cnrm.cloud.google.com/v1beta1
      kind: IAMPolicyMember
      metadata:
        name: policy-monitoring
      spec:
        member: serviceAccount:[SA_NAME]@[PROJECT_ID].iam.gserviceaccount.com
        role: roles/monitoring.viewer
        resourceRef:
          kind: Project
          name: [PROJECT_ID]

      Replace the following:

      • [SA_NAME]: the name of the service account.
      • [PROJECT_ID]: your Google Cloud project ID.
    2. Apply the role to the service account:
      kubectl apply -f policy-monitoring.yaml
  6. Apply the roles/autoscaling.metricsWriter role to the service account:
    1. Download the following resource as policy-autoscaling-metrics-writer.yaml.
      apiVersion: iam.cnrm.cloud.google.com/v1beta1
      kind: IAMPolicyMember
      metadata:
        name: policy-autoscaling-metrics-writer
      spec:
        member: serviceAccount:[SA_NAME]@[PROJECT_ID].iam.gserviceaccount.com
        role: roles/autoscaling.metricsWriter
        resourceRef:
          kind: Project
          name: [PROJECT_ID]

      Replace the following:

      • [SA_NAME]: the name of the service account.
      • [PROJECT_ID]: your Google Cloud project ID.
    2. Apply the role to the service account:
      kubectl apply -f policy-autoscaling-metrics-writer.yaml

You can also use this service account for resources in other projects. For instructions, see Enabling service account impersonation across projects.

Grant access to private image repositories

To use private images in Artifact Registry, grant the Artifact Registry Reader role (roles/artifactregistry.reader) to the service account.

gcloud

gcloud artifacts repositories add-iam-policy-binding REPOSITORY_NAME \
    --member=serviceAccount:SA_NAME@PROJECT_ID.iam.gserviceaccount.com \
    --role=roles/artifactregistry.reader

Replace REPOSITORY_NAME with the name of your Artifact Registry repository.

Config Connector

Note: This step requires Config Connector. Follow the installation instructions to install Config Connector on your cluster.

  1. Save the following manifest as policy-artifact-registry-reader.yaml:

    apiVersion: iam.cnrm.cloud.google.com/v1beta1
    kind: IAMPolicyMember
    metadata:
      name: policy-artifact-registry-reader
    spec:
      member: serviceAccount:SA_NAME@PROJECT_ID.iam.gserviceaccount.com
      role: roles/artifactregistry.reader
      resourceRef:
        apiVersion: artifactregistry.cnrm.cloud.google.com/v1beta1
        kind: ArtifactRegistryRepository
        name: "REPOSITORY_NAME"

    Replace the following:

    • SA_NAME: the name of your IAM service account.
    • PROJECT_ID: your Google Cloud project ID.
    • REPOSITORY_NAME: the name of your Artifact Registry repository.
  2. Grant the Artifact Registry Reader role to the service account:

    kubectl apply -f policy-artifact-registry-reader.yaml
    

If you use private images in Container Registry, you also need to grant access to those:

gcloud

gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member=serviceAccount:SA_NAME@PROJECT_ID.iam.gserviceaccount.com \
  --role=roles/storage.objectViewer

The bucket that stores your images has a name, BUCKET_NAME, in one of the following forms:

  • artifacts.PROJECT_ID.appspot.com for images pushed to a registry in the host gcr.io
  • STORAGE_REGION.artifacts.PROJECT_ID.appspot.com for images pushed to a registry in a location-specific host such as us.gcr.io

Replace the following:

  • PROJECT_ID: your Google Cloud project ID.
  • STORAGE_REGION: the location of the storage bucket:
    • us for registries in the host us.gcr.io
    • eu for registries in the host eu.gcr.io
    • asia for registries in the host asia.gcr.io

Refer to the gcloud storage buckets add-iam-policy-binding documentation for more information about the command.

Config Connector

Note: This step requires Config Connector. Follow the installation instructions to install Config Connector on your cluster.

Apply the storage.objectViewer role to your service account. Download the following resource as policy-object-viewer.yaml. Replace [SA_NAME] and [PROJECT_ID] with your own information.

apiVersion: iam.cnrm.cloud.google.com/v1beta1
kind: IAMPolicyMember
metadata:
  name: policy-object-viewer
spec:
  member: serviceAccount:[SA_NAME]@[PROJECT_ID].iam.gserviceaccount.com
  role: roles/storage.objectViewer
  resourceRef:
    kind: Project
    name: [PROJECT_ID]

Then apply the resource:

kubectl apply -f policy-object-viewer.yaml

If you want another human user to be able to create new clusters or node pools with this service account, you must grant them the Service Account User role on this service account:

gcloud

gcloud iam service-accounts add-iam-policy-binding \
    SA_NAME@PROJECT_ID.iam.gserviceaccount.com \
    --member=user:USER \
    --role=roles/iam.serviceAccountUser

Config Connector

Note: This step requires Config Connector. Follow the installation instructions to install Config Connector on your cluster.

Apply the iam.serviceAccountUser role to your service account. Download the following resource as policy-service-account-user.yaml. Replace [SA_NAME] and [PROJECT_ID] with your own information.

apiVersion: iam.cnrm.cloud.google.com/v1beta1
kind: IAMPolicyMember
metadata:
  name: policy-service-account-user
spec:
  member: serviceAccount:[SA_NAME]@[PROJECT_ID].iam.gserviceaccount.com
  role: roles/iam.serviceAccountUser
  resourceRef:
    kind: Project
    name: [PROJECT_ID]

Then apply the resource:

kubectl apply -f policy-service-account-user.yaml

For existing Standard clusters, you can now create a new node pool with this new service account. For Autopilot clusters, you must create a new cluster with the service account. For instructions, see Create an Autopilot cluster.

  • Create a node pool that uses the new service account:

    gcloud container node-pools create NODE_POOL_NAME \
    --service-account=SA_NAME@PROJECT_ID.iam.gserviceaccount.com \
    --cluster=CLUSTER_NAME

If you need your GKE cluster to have access to other Google Cloud services, you should use Workload Identity Federation for GKE.

Restrict access to cluster API discovery

By default, Kubernetes bootstraps clusters with a permissive set of discovery ClusterRoleBindings which give broad access to information about a cluster's APIs, including those of CustomResourceDefinitions.

Users should be aware that the system:authenticated Group included in the subjects of the system:discovery and system:basic-user ClusterRoleBindings can include any authenticated user (including any user with a Google account), and does not represent a meaningful level of security for clusters on GKE. For more information, see Avoid default roles and groups.

Those wishing to harden their cluster's discovery APIs should consider one or more of the following:

  • Only enable the DNS-based endpoint for access to the control plane.
  • Configure authorized networks to restrict access to set IP ranges.
  • Restrict access to the control plane and enable private nodes.

If none of these options are suitable for your GKE use case, you should treat all API discovery information (namely the schema of CustomResources, APIService definitions, and discovery information hosted by extension API servers) as publicly disclosed.

Use namespaces and RBAC to restrict access to cluster resources

CIS GKE Benchmark Recommendation: 5.6.1. Create administrative boundaries between resources using namespaces

Give teams least-privilege access to Kubernetes by creating separate namespaces or clusters for each team and environment. Assign cost centers and appropriate labels to each namespace for accountability and chargeback. Only give developers the level of access to their namespace that they need to deploy and manage their application, especially in production. Map out the tasks that your users need to undertake against the cluster and define the permissions that they require to do each task.

For more information about creating namespaces, see the Kubernetes documentation. For best practices when planning your RBAC configuration, see Best practices for GKE RBAC.

IAM and Role-based access control (RBAC) work together, and an entity must have sufficient permissions at either level to work with resources in your cluster.

Assign the appropriate IAM roles for GKE to groups and users to provide permissions at the project level and use RBAC to grant permissions on a cluster and namespace level. To learn more, see Access control.

You can use IAM and RBAC permissions together with namespaces to restrict user interactions with cluster resources on Google Cloud console. For more information, see Enable access and view cluster resources by namespace.

Restrict traffic among Pods with a network policy

CIS GKE Benchmark Recommendation: 6.6.7. Ensure Network Policy is Enabled and set as appropriate

By default, all Pods in a cluster can communicate with each other. You should control Pod to Pod communication as needed for your workloads.

Restricting network access to services makes it much more difficult for attackers to move laterally within your cluster, and also offers services some protection against accidental or deliberate denial of service. Two recommended ways to control traffic are:

  1. Use Istio. See Installing Istio on Google Kubernetes Engine if you're interested in load balancing, service authorization, throttling, quota, metrics and more.
  2. Use Kubernetes network policies. See Creating a cluster network policy. Choose this if you're looking for the basic access control functionality exposed by Kubernetes. To implement common approaches for restricting traffic using network policies, follow the implementation guide from the GKE Enterprise Security Blueprints. Also, the Kubernetes documentation has an excellent walkthrough for a simple nginx deployment. Consider using network policy logging to verify that your network policies are working as expected.

Istio and network policy may be used together if there is a need to do so.
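A common starting point with Kubernetes network policies is a default-deny ingress policy for a namespace, followed by narrower policies that allow only the traffic each workload needs. The following sketch writes such a policy; the namespace and file name are hypothetical, and it assumes that network policy enforcement is enabled on the cluster.

```shell
# Hypothetical namespace; replace it with your own.
NAMESPACE="team-a"

# Write a policy that selects every Pod in the namespace and, because it
# lists no ingress rules, denies all inbound traffic to them.
cat > default-deny-ingress.yaml <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: ${NAMESPACE}
spec:
  podSelector: {}
  policyTypes:
  - Ingress
EOF

# After review, apply it with: kubectl apply -f default-deny-ingress.yaml
cat default-deny-ingress.yaml
```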

Secret management

CIS GKE Benchmark Recommendation: 6.3.1. Consider encrypting Kubernetes Secrets using keys managed in Cloud KMS

You should provide an additional layer of protection for sensitive data, such as secrets, stored in etcd. To do this you need to configure a secrets manager that is integrated with GKE clusters. Some solutions will work both in GKE and in Google Distributed Cloud, and so may be more desirable if you are running workloads across multiple environments. If you choose to use an external secrets manager such as HashiCorp Vault, you'll want to have that set up before you create your cluster.

You have several options for secret management.

  • You can use Kubernetes secrets natively in GKE. Optionally, you can encrypt these at the application-layer with a key you manage, using Application-layer secrets encryption.
  • You can use a secrets manager such as HashiCorp Vault. When run in a hardened HA mode, this will provide a consistent, production-ready way to manage secrets. You can authenticate to HashiCorp Vault using either a Kubernetes service account or a Google Cloud service account. To learn more about using GKE with Vault, see Running and connecting to HashiCorp Vault on Kubernetes.

GKE VMs are encrypted at the storage layer by default, which includes etcd.

Use admission controllers to enforce policy

Admission controllers are plugins that govern and enforce how the cluster is used. They must be enabled to use some of the more advanced security features of Kubernetes and are an important part of the defense-in-depth approach to hardening your cluster.

By default, Pods in Kubernetes can operate with capabilities beyond what they require. You should constrain the Pod's capabilities to only those required for that workload.

Kubernetes supports numerous controls for restricting your Pods to execute with only explicitly granted capabilities. For example, Policy Controller is available for clusters in fleets. Kubernetes also has the built-in PodSecurity admission controller that lets you enforce the Pod Security Standards in individual clusters.

Policy Controller is a feature of GKE Enterprise that lets you enforce and validate security on GKE clusters at scale by using declarative policies. To learn how to use Policy Controller to enforce declarative controls on your GKE cluster, see Install Policy Controller.

The PodSecurity admission controller lets you enforce pre-defined policies in specific namespaces or in the entire cluster. These policies correspond to the different Pod Security Standards.
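With the PodSecurity admission controller, a policy is selected per namespace through labels. The following sketch writes a Namespace manifest that enforces the Baseline standard and warns about violations of the Restricted standard; the namespace and file name are hypothetical, and the choice of levels is only an example.

```shell
# Hypothetical namespace; replace it with your own.
NAMESPACE="team-a"

# Write a Namespace whose labels tell the PodSecurity admission controller
# to reject Pods that violate Baseline and to warn on Restricted violations.
cat > namespace-pod-security.yaml <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: ${NAMESPACE}
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
EOF

# After review, apply it with: kubectl apply -f namespace-pod-security.yaml
cat namespace-pod-security.yaml
```

Starting with warn for the stricter level lets you see which workloads would be rejected before you raise the enforce level.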

Restrict the ability for workloads to self-modify

Certain Kubernetes workloads, especially system workloads, have permission to self-modify. For example, some workloads vertically autoscale themselves. While convenient, this can allow an attacker who has already compromised a node to escalate further in the cluster. For example, an attacker could have a workload on the node change itself to run as a more privileged service account that exists in the same namespace.

Ideally, workloads should not be granted the permission to modify themselves in the first place. When self-modification is necessary, you can limit permissions by applying Gatekeeper or Policy Controller constraints, such as NoUpdateServiceAccount from the open source Gatekeeper library, which provides several useful security policies.

When you deploy policies, it is usually necessary to allow the controllers that manage the cluster lifecycle to bypass the policies. This is necessary so that the controllers can make changes to the cluster, such as applying cluster upgrades. For example, if you deploy the NoUpdateServiceAccount policy on GKE, you must set the following parameters in the Constraint:

parameters:
  allowedGroups:
  - system:masters
  allowedUsers:
  - system:addon-manager

Restrict the use of the deprecated gcePersistentDisk volume type

The deprecated gcePersistentDisk volume type lets you mount a Compute Engine persistent disk to Pods. We recommend that you restrict usage of the gcePersistentDisk volume type in your workloads. GKE doesn't perform any IAM authorization checks on the Pod when mounting this volume type, although Google Cloud performs authorization checks when attaching the disk to the underlying VM. An attacker who already has the ability to create Pods in a namespace can therefore access the contents of Compute Engine persistent disks in your Google Cloud project.

To access and use Compute Engine persistent disks, use PersistentVolumes and PersistentVolumeClaims instead. Apply security policies in your cluster that prevent usage of the gcePersistentDisk volume type.

To prevent usage of the gcePersistentDisk volume type, apply the Baseline or Restricted policy with the PodSecurity admission controller, or you can define a custom constraint in Policy Controller or in the Gatekeeper admission controller.

To define a custom constraint to restrict this volume type, do the following:

  1. Install a policy-based admission controller such as Policy Controller or Gatekeeper OPA.

    Policy Controller

    Install Policy Controller in your cluster.

    Policy Controller is a paid feature for GKE users. Policy Controller is based on open source Gatekeeper, but you also get access to the full constraint template library, policy bundles, and integration with Google Cloud console dashboards to help observe and maintain your clusters. Policy bundles are opinionated best practices that you can apply to your clusters, including bundles based on recommendations like the CIS Kubernetes Benchmark.

    Gatekeeper

    Install Gatekeeper in your cluster.

    For Autopilot clusters, open the Gatekeeper gatekeeper.yaml manifest in a text editor. Modify the rules field in the MutatingWebhookConfiguration specification to replace wildcard (*) characters with specific API groups and resource names, such as in the following example:

    apiVersion: admissionregistration.k8s.io/v1
    kind: MutatingWebhookConfiguration
    ...
    webhooks:
    - admissionReviewVersions:
      - v1
      - v1beta1
      ...
      rules:
      - apiGroups:
        - core
        - batch
        - apps
        apiVersions:
        - '*'
        operations:
        - CREATE
        - UPDATE
        resources:
        - Pod
        - Deployment
        - Job
        - Volume
        - Container
        - StatefulSet
        - StorageClass
        - Secret
        - ConfigMap
      sideEffects: None
      timeoutSeconds: 1
    

    Apply the updated gatekeeper.yaml manifest to your Autopilot cluster to install Gatekeeper. This is required because, as a built-in security measure, Autopilot disallows wildcard characters in mutating admission webhooks.

  2. Deploy the built-in Pod Security Policy Volume Types ConstraintTemplate:

    kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/master/library/pod-security-policy/volumes/template.yaml
    
  3. Save the following Constraint with a list of allowed volume types as constraint.yaml:

    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: K8sPSPVolumeTypes
    metadata:
      name: nogcepersistentdisk
    spec:
      match:
        kinds:
          - apiGroups: [""]
            kinds: ["Pod"]
      parameters:
        volumes: ["configMap", "csi", "projected", "secret", "downwardAPI", "persistentVolumeClaim", "emptyDir", "nfs", "hostPath"]
    

    This constraint restricts volumes to the list in the spec.parameters.volumes field.

  4. Deploy the constraint:

    kubectl apply -f constraint.yaml
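
One way to check that the constraint is enforced is a server-side dry run of a Pod that mounts a volume type missing from the allowed list. The following is a minimal sketch, not an official verification procedure. The function names, the Pod name `volume-type-test`, and the disk name `example-disk` are hypothetical. It assumes that `kubectl` points at the cluster and that the admission webhook is called during dry runs.

```shell
# Hypothetical helper: writes a throwaway Pod manifest that mounts a
# gcePersistentDisk volume, a type that is not in the allowed list.
write_test_pod() {
  cat > "$1" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: volume-type-test        # hypothetical name
spec:
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.9
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    gcePersistentDisk:
      pdName: example-disk      # hypothetical disk; nothing is created in a dry run
EOF
}

# Submits the manifest as a server-side dry run, so admission webhooks are
# called but nothing is persisted. A failure suggests that the Pod was
# rejected; read the error text to confirm that Gatekeeper was the cause.
check_volume_constraint() {
  write_test_pod "$1"
  if kubectl apply --dry-run=server -f "$1"; then
    echo "NOT ENFORCED: Pod with a gcePersistentDisk volume was admitted"
  else
    echo "ENFORCED: Pod with a gcePersistentDisk volume was rejected"
  fi
}
```

For example, `check_volume_constraint /tmp/volume-type-test.yaml` writes the manifest to that path and reports the result. Any other failure, such as a connectivity error, also takes the rejected branch, so treat the result as a hint rather than proof.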
    

Monitor your cluster configuration

You should audit your cluster configurations for deviations from your defined settings.

Many of the recommendations covered in this hardening guide, as well as other common misconfigurations, can be automatically checked using Security Health Analytics.

Secure defaults

The following sections describe options that are securely configured by default in new clusters. You should verify that preexisting clusters are configured securely.

Protect node metadata

CIS GKE Benchmark Recommendations: 6.4.1. Ensure legacy Compute Engine instance metadata APIs are Disabled and 6.4.2. Ensure the GKE Metadata Server is Enabled

The v0.1 and v1beta1 Compute Engine metadata server endpoints were deprecated and shut down on September 30, 2020. These endpoints did not enforce metadata query headers. For the shutdown schedule, refer to v0.1 and v1beta1 metadata server endpoints deprecation.

Some practical attacks against Kubernetes rely on access to the VM's metadata server to extract credentials. These attacks are blocked if you are using Workload Identity Federation for GKE or Metadata Concealment.
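
To see whether a node pool on a preexisting cluster uses the GKE metadata server, you can read the node pool's workload metadata mode. The sketch below is illustrative only. The function names are hypothetical, and it assumes that a default location is configured for `gcloud` and that the mode is reported in the `config.workloadMetadataConfig.mode` field.

```shell
# Hypothetical helper: prints the workload metadata mode of one node pool.
# Arguments: cluster name, node pool name.
node_pool_metadata_mode() {
  gcloud container node-pools describe "$2" \
      --cluster="$1" \
      --format="value(config.workloadMetadataConfig.mode)"
}

# GKE_METADATA indicates that the GKE metadata server is enabled.
# Any other value, or an empty value, is flagged for review.
check_metadata_server() {
  mode=$(node_pool_metadata_mode "$1" "$2")
  if [ "$mode" = "GKE_METADATA" ]; then
    echo "OK: $2 uses the GKE metadata server"
  else
    echo "REVIEW: $2 reports mode '${mode:-unset}'"
  fi
}
```

For example, `check_metadata_server CLUSTER_NAME NODE_POOL_NAME` reports on one node pool. Repeat the check for each node pool in the cluster.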

Leave legacy client authentication methods disabled

CIS GKE Benchmark Recommendations: 6.8.1. Ensure Basic Authentication using static passwords is Disabled and 6.8.2. Ensure authentication using Client Certificates is Disabled

There are several methods of authenticating to the Kubernetes API server. In GKE, the supported methods are service account bearer tokens, OAuth tokens, and x509 client certificates. GKE manages authentication with gcloud for you using the OAuth token method, setting up the Kubernetes configuration, getting an access token, and keeping it up to date.

Before GKE integrated with OAuth, a one-time generated x509 certificate and a static password were the only available authentication methods. Neither method is recommended now, and both should be disabled. These methods present a larger attack surface for cluster compromise and have been disabled by default since GKE version 1.12. If you are using legacy authentication methods, we recommend that you turn them off. Authentication with a static password is deprecated and has been removed in GKE versions 1.19 and later.

Existing clusters should move to OAuth. If a system external to the cluster needs a long-lived credential, we recommend that you create a Google service account or a Kubernetes service account with the necessary privileges and export the key.

To update an existing cluster and remove the static password, see Disabling authentication with a static password.

Currently, there is no way to remove the pre-issued client certificate from an existing cluster, but it has no permissions if RBAC is enabled and ABAC is disabled.
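
To find out whether a preexisting cluster still has legacy credentials, you can inspect the `masterAuth` fields of the cluster description. This is a rough sketch under stated assumptions: the function name is hypothetical, a default location is configured for `gcloud`, and an empty `masterAuth.username` or `masterAuth.clientCertificate` value is taken to mean that the method is not configured.

```shell
# Hypothetical helper: reports whether a cluster exposes a static password
# username or an issued client certificate. Argument: cluster name.
check_legacy_auth() {
  username=$(gcloud container clusters describe "$1" \
      --format="value(masterAuth.username)")
  client_cert=$(gcloud container clusters describe "$1" \
      --format="value(masterAuth.clientCertificate)")

  if [ -n "$username" ]; then
    echo "REVIEW: static password authentication is configured"
  else
    echo "OK: no static password username is set"
  fi

  if [ -n "$client_cert" ]; then
    echo "REVIEW: a client certificate has been issued"
  else
    echo "OK: no client certificate has been issued"
  fi
}
```

The certificate value is only tested for presence and is never printed.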

Leave Cloud Logging enabled

CIS GKE Benchmark Recommendation: 6.7.1. Ensure Stackdriver Kubernetes Logging and Monitoring is Enabled

To reduce operational overhead and to maintain a consolidated view of your logs, implement a logging strategy that is consistent wherever your clusters are deployed. GKE Enterprise clusters are integrated with Cloud Logging by default, and you should leave that integration enabled.

All GKE clusters have Kubernetes audit logging enabled by default, which keeps a chronological record of calls that have been made to the Kubernetes API server. Kubernetes audit log entries are useful for investigating suspicious API requests, for collecting statistics, or for creating monitoring alerts for unwanted API calls.

GKE clusters integrate Kubernetes Audit Logging with Cloud Audit Logs and Cloud Logging. Logs can be routed from Cloud Logging to your own logging systems.
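
As an illustration of how audit log entries can be used to investigate API requests, the sketch below reads recent Admin Activity entries for one cluster from Cloud Logging. The function names are hypothetical. The filter assumes the `k8s_cluster` monitored resource type and the `cloudaudit.googleapis.com%2Factivity` log name, and the selected columns are only one possible choice.

```shell
# Hypothetical helper: builds a Cloud Logging filter for the Admin Activity
# audit log of one cluster. Argument: cluster name.
audit_log_filter() {
  printf 'resource.type="k8s_cluster" AND resource.labels.cluster_name="%s" AND logName:"cloudaudit.googleapis.com%%2Factivity"' "$1"
}

# Lists recent audit entries from the last day.
# Arguments: cluster name, optional maximum number of entries (default 10).
recent_audit_entries() {
  gcloud logging read "$(audit_log_filter "$1")" \
      --limit="${2:-10}" \
      --freshness=1d \
      --format="table(timestamp, protoPayload.authenticationInfo.principalEmail, protoPayload.methodName)"
}
```

For example, `recent_audit_entries CLUSTER_NAME 20` lists up to 20 entries with the caller and the API method of each request.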

Leave the Kubernetes web UI (Dashboard) disabled

CIS GKE Benchmark Recommendation: 6.10.1. Ensure Kubernetes web UI is Disabled

You should not enable the Kubernetes web UI (Dashboard) when running on GKE.

The Kubernetes web UI (Dashboard) is backed by a highly privileged Kubernetes Service Account. The Google Cloud console provides much of the same functionality, so you don't need these permissions.

To disable the Kubernetes web UI:

gcloud container clusters update CLUSTER_NAME \
    --update-addons=KubernetesDashboard=DISABLED

Leave ABAC disabled

CIS GKE Benchmark Recommendation: 6.8.4. Ensure Legacy Authorization (ABAC) is Disabled

You should disable Attribute-Based Access Control (ABAC), and instead use Role-Based Access Control (RBAC) in GKE.

By default, ABAC is disabled for clusters created using GKE version 1.8 and later. In Kubernetes, RBAC is used to grant permissions to resources at the cluster and namespace level. RBAC allows you to define roles with rules containing a set of permissions. RBAC has significant security advantages over ABAC.

If you're still relying on ABAC, first review the Prerequisites for using RBAC. If you upgraded your cluster from an older version and are using ABAC, you should update your access controls configuration:

gcloud container clusters update CLUSTER_NAME \
    --no-enable-legacy-authorization

To create a new cluster with the above recommendation:

gcloud container clusters create CLUSTER_NAME \
    --no-enable-legacy-authorization
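
To confirm the setting on a preexisting cluster, you can read the `legacyAbac.enabled` field of the cluster description. This is a minimal sketch with a hypothetical function name. It assumes that a default location is configured for `gcloud` and that the field is empty or false when ABAC is disabled.

```shell
# Hypothetical helper: reports whether legacy authorization (ABAC) is
# enabled on a cluster. Argument: cluster name.
check_abac() {
  enabled=$(gcloud container clusters describe "$1" \
      --format="value(legacyAbac.enabled)")
  case "$enabled" in
    [Tt]rue) echo "REVIEW: legacy authorization (ABAC) is enabled on $1" ;;
    *)       echo "OK: legacy authorization (ABAC) is not enabled on $1" ;;
  esac
}
```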

Leave the DenyServiceExternalIPs admission controller enabled

Do not disable the DenyServiceExternalIPs admission controller.

The DenyServiceExternalIPs admission controller blocks Services from using ExternalIPs and mitigates a known security vulnerability.

The DenyServiceExternalIPs admission controller is enabled by default on new clusters created on GKE versions 1.21 and later. For clusters upgrading to GKE versions 1.21 and later, you can enable the admission controller using the following command:

gcloud beta container clusters update CLUSTER_NAME \
    --no-enable-service-externalips
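
Before you enable the admission controller on an upgraded cluster, it can help to know which existing Services set ExternalIPs. The sketch below lists them. The function name is hypothetical, and it relies on `kubectl` printing `<none>` in a custom column when the `spec.externalIPs` field is absent.

```shell
# Hypothetical helper: prints "NAMESPACE NAME EXTERNAL_IPS" for every
# Service whose spec.externalIPs field is set.
list_services_with_external_ips() {
  kubectl get services --all-namespaces --no-headers \
      -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,EXTERNAL_IPS:.spec.externalIPs \
    | awk '$3 != "<none>"'
}
```

An empty result suggests that no Service in the cluster currently uses ExternalIPs.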

What's next