This document describes how to harden the security of your clusters created with Google Distributed Cloud (software only) on bare metal.
Secure your containers using SELinux
You can secure your containers by enabling SELinux, which is supported for Red Hat Enterprise Linux (RHEL). If your host machines are running RHEL and you want to enable SELinux for your cluster, you must enable SELinux in all of your host machines. See secure your containers using SELinux for details.
Use seccomp to restrict containers
Secure computing mode (seccomp) is available in version 1.11 and higher of Google Distributed Cloud. Running containers with a seccomp profile improves the security of your cluster because it restricts the system calls that containers are allowed to make to the kernel. This reduces the chance of kernel vulnerabilities being exploited.
The default seccomp profile contains a list of system calls that a container is allowed to make. Any system calls not on the list are disallowed. seccomp is enabled by default in version 1.11 and higher clusters. This means that all system containers and customer workloads are run with the container runtime's default seccomp profile. Even containers and workloads that don't specify a seccomp profile in their configuration files are subject to seccomp restrictions.
How to disable seccomp cluster-wide or on particular workloads
You can disable seccomp during cluster creation or cluster upgrade only. bmctl update can't be used to disable this feature. If you want to disable seccomp within a cluster, add the following clusterSecurity section to the cluster's configuration file:
apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: example
  namespace: cluster-example
spec:
  ...
  clusterSecurity:
    enableSeccomp: false
  ...
In the unlikely event that some of your workloads need to execute system calls that seccomp blocks by default, you don't have to disable seccomp on the whole cluster. Instead, you can single out particular workloads to run in unconfined mode. Running a workload in unconfined mode frees that workload from the restrictions that the seccomp profile imposes on the rest of the cluster.
To run a container in unconfined mode, add the following securityContext section to the Pod manifest:
apiVersion: v1
kind: Pod
....
spec:
  securityContext:
    seccompProfile:
      type: Unconfined
....
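Because unconfined workloads bypass the seccomp restrictions, it can help to keep track of which ones opt out. The following Python sketch is illustrative only: it assumes the Pod manifest has already been loaded into a dictionary, the function names and the sample manifest are hypothetical, and it relies on the Kubernetes rule that a container-level seccompProfile overrides the Pod-level one.

```python
def _seccomp_type(obj):
    """Return securityContext.seccompProfile.type from a Pod spec or container, or None."""
    ctx = obj.get("securityContext") or {}
    profile = ctx.get("seccompProfile") or {}
    return profile.get("type")


def unconfined_containers(pod):
    """List containers in a Pod manifest whose effective seccomp profile is Unconfined."""
    spec = pod.get("spec") or {}
    pod_level = _seccomp_type(spec)
    names = []
    for container in spec.get("containers") or []:
        # A container-level profile takes precedence over the Pod-level one.
        effective = _seccomp_type(container) or pod_level
        if effective == "Unconfined":
            names.append(container["name"])
    return names


# Hypothetical manifest: the Pod opts out of seccomp, but one container opts back in.
example_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "name-of-pod"},
    "spec": {
        "securityContext": {"seccompProfile": {"type": "Unconfined"}},
        "containers": [
            {"name": "app", "image": "example/app"},
            {
                "name": "sidecar",
                "image": "example/sidecar",
                "securityContext": {"seccompProfile": {"type": "RuntimeDefault"}},
            },
        ],
    },
}

print(unconfined_containers(example_pod))  # ['app']
```

A container with no explicit profile and no Pod-level profile is not reported, because on these clusters it still runs with the container runtime's default profile.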
Don't run containers as root user
By default, processes in containers execute as root. This poses a potential security problem, because if a process breaks out of the container, that process runs as root on the host machine. It's therefore advisable to run all your workloads as a non-root user.
The following sections describe two ways of running containers as a non-root user.
Method #1: add USER instruction in Dockerfile
This method uses a Dockerfile to ensure that containers don't run as a root user. In a Dockerfile, you can specify which user the process inside a container should be run as. The following snippet from a Dockerfile shows how to do this:
....
# Add a user with user ID 8877 and name nonroot
RUN useradd -u 8877 nonroot
# Run the container as nonroot
USER nonroot
....
In this example, the Linux command useradd -u creates a user called nonroot inside the container. This user has a user ID (UID) of 8877.
The next line in the Dockerfile runs the command USER nonroot. This command specifies that from this point on in the image, commands are run as the user nonroot.
Grant permissions to UID 8877 so that the container processes can execute properly for nonroot.
Method #2: add securityContext fields in Kubernetes manifest file
This method uses a Kubernetes manifest file to ensure that containers don't run as a root user. Security settings are specified for a Pod, and those security settings are in turn applied to all containers within the Pod.
The following example shows an excerpt of a manifest file for a given Pod:
apiVersion: v1
kind: Pod
metadata:
  name: name-of-pod
spec:
  securityContext:
    runAsUser: 8877
    runAsGroup: 8877
....
The runAsUser field specifies that for any containers in the Pod, all processes run with user ID 8877. The runAsGroup field specifies that these processes have a primary group ID (GID) of 8877. Remember to grant the necessary and sufficient permissions to UID 8877 so that the container processes can execute properly.
This ensures that processes within a container are run as UID 8877, which has fewer privileges than root.
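A quick way to review a manifest for this setting is to compute each container's effective runAsUser. The Python sketch below is a hedged illustration rather than an official tool: the function name and sample manifest are hypothetical, the manifest is assumed to be loaded into a dictionary already, and a container-level runAsUser is assumed to override the Pod-level value, as in standard Kubernetes.

```python
def possibly_root_containers(pod):
    """List containers whose effective runAsUser is 0 or not set at all."""
    spec = pod.get("spec") or {}
    pod_uid = (spec.get("securityContext") or {}).get("runAsUser")
    flagged = []
    for container in spec.get("containers") or []:
        uid = (container.get("securityContext") or {}).get("runAsUser")
        if uid is None:
            # No container-level value, so the Pod-level value applies.
            uid = pod_uid
        # An unset UID defers to the image's USER instruction, which this
        # check cannot see, so it is flagged conservatively.
        if uid is None or uid == 0:
            flagged.append(container["name"])
    return flagged


# Hypothetical manifest: the Pod runs as UID 8877, but one container overrides it.
nonroot_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "name-of-pod"},
    "spec": {
        "securityContext": {"runAsUser": 8877, "runAsGroup": 8877},
        "containers": [
            {"name": "app", "image": "example/app"},
            {
                "name": "debug",
                "image": "example/debug",
                "securityContext": {"runAsUser": 0},
            },
        ],
    },
}

print(possibly_root_containers(nonroot_pod))  # ['debug']
```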
System containers in Google Distributed Cloud software-only help install and manage clusters. The UIDs and GIDs used by these containers can be controlled by the field startUIDRangeRootlessContainers in the cluster specification. The startUIDRangeRootlessContainers field is optional and, if not specified, has a value of 2000. Allowed values for startUIDRangeRootlessContainers are 1000-57000. The startUIDRangeRootlessContainers value can be changed during upgrades only. The system containers use the UIDs and GIDs in the range startUIDRangeRootlessContainers to startUIDRangeRootlessContainers + 2999.
The following example shows an excerpt of a manifest file for a Cluster resource:
apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: name-of-cluster
spec:
  clusterSecurity:
    startUIDRangeRootlessContainers: 5000
  ...
Choose the value for startUIDRangeRootlessContainers so that the UID and GID spaces used by the system containers don't overlap with those assigned to user workloads.
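The range arithmetic described above can be checked before you pick a value. The following Python sketch uses only the numbers stated in this section (default 2000, allowed values 1000-57000, and a span of start to start + 2999). The function names and the workload IDs in the example are hypothetical.

```python
RANGE_SIZE = 3000      # start .. start + 2999
MIN_START = 1000       # lowest allowed startUIDRangeRootlessContainers
MAX_START = 57000      # highest allowed startUIDRangeRootlessContainers
DEFAULT_START = 2000   # value used when the field is not specified


def system_id_range(start=DEFAULT_START):
    """Return the UIDs/GIDs used by system containers for a given start value."""
    if not MIN_START <= start <= MAX_START:
        raise ValueError(f"start must be between {MIN_START} and {MAX_START}")
    return range(start, start + RANGE_SIZE)


def overlapping_ids(start, workload_ids):
    """Return the workload UIDs/GIDs that fall inside the system range."""
    ids = system_id_range(start)
    return sorted(i for i in set(workload_ids) if i in ids)


# With start = 5000 the system range is 5000-7999, so 6000 collides and 8877 does not.
print(overlapping_ids(5000, [8877, 6000]))  # [6000]
```

An empty result means the chosen start value keeps the system range clear of the workload IDs you supplied.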
How to disable rootless mode
Starting with Google Distributed Cloud release 1.10, Kubernetes control plane containers and system containers run as non-root users by default. Google Distributed Cloud assigns these users UIDs and GIDs in the range 2000-4999. However, this assignment can cause problems if those UIDs and GIDs have already been allocated to processes running inside your environment.
Starting with release 1.11, you can disable rootless mode when you upgrade your cluster. When rootless mode is disabled, Kubernetes control plane containers and system containers run as the root user.
To disable rootless mode, perform the following steps:
Add the following clusterSecurity section to the cluster's configuration file:
apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: example
  namespace: cluster-example
spec:
  ...
  clusterSecurity:
    enableRootlessContainers: false
  ...
Upgrade your cluster. For details, see Upgrade clusters.
Restrict the ability for workloads to self-modify
Certain Kubernetes workloads, especially system workloads, have permission to self-modify. For example, some workloads vertically autoscale themselves. While convenient, this can allow an attacker who has already compromised a node to escalate further in the cluster. For example, an attacker could have a workload on the node change itself to run as a more privileged service account that exists in the same namespace.
Ideally, workloads shouldn't be granted the permission to modify themselves in the first place. When self-modification is necessary, you can limit permissions by applying Gatekeeper or Policy Controller constraints, such as NoUpdateServiceAccount from the open source Gatekeeper library, which provides several useful security policies.
When you deploy policies, it's usually necessary to allow the controllers that manage the cluster lifecycle to bypass the policies. This is necessary so that the controllers can make changes to the cluster, such as applying cluster upgrades. For example, if you deploy the NoUpdateServiceAccount policy on your clusters, you must set the following parameters in the Constraint:
parameters:
  allowedGroups:
  - system:masters
  allowedUsers: []
Disable kubelet read-only port
Starting with release 1.15.0, Google Distributed Cloud disables by default port 10255, the kubelet read-only port. Any customer workloads that are configured to read data from this insecure kubelet port 10255 should migrate to use the secure kubelet port 10250.
Only clusters created with version 1.15.0 or higher have this port disabled by default. The kubelet read-only port 10255 remains accessible for clusters created with a version lower than 1.15.0, even after a cluster upgrade to version 1.15.0 or higher.
This change was made because the kubelet leaks low-sensitivity information over port 10255, which is unauthenticated. The information includes the full configuration information for all Pods running on a Node, which can be valuable to an attacker. It also exposes metrics and status information, which can provide business-sensitive insights.
Disabling the kubelet read-only port is recommended by the CIS Kubernetes Benchmark.
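Because clusters created before version 1.15.0 keep the read-only port open after an upgrade, you might want to confirm whether a given node still accepts connections on port 10255. The following Python sketch is one simple way to do that with a plain TCP connection attempt. The function name is hypothetical, and NODE_IP in the usage comment is a placeholder for a node address that you supply. A successful connection only shows that something is listening on the port, not what it serves.

```python
import socket

KUBELET_READ_ONLY_PORT = 10255


def tcp_port_open(host, port=KUBELET_READ_ONLY_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Usage (NODE_IP is a placeholder):
#   tcp_port_open("NODE_IP")  -> True means the read-only port is still reachable
```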
Maintenance
Monitoring security bulletins and upgrading your clusters are important security measures to take once your clusters are up and running.
Monitor security bulletins
The GKE security team publishes security bulletins for high and critical severity vulnerabilities.
These bulletins follow a common Google Cloud vulnerability numbering scheme and are linked to from the main Google Cloud bulletins page and the release notes.
When customer action is required to address these high and critical vulnerabilities, Google contacts customers by email. In addition, Google might contact customers with support contracts through support channels.
For more information about how Google manages security vulnerabilities and patches for GKE and GKE Enterprise, see Security patching.
Upgrade clusters
Kubernetes regularly introduces new security features and provides security patches. Google Distributed Cloud releases incorporate Kubernetes security enhancements that address security vulnerabilities that may affect your clusters.
You are responsible for keeping your clusters up to date. For each release, review the release notes. To minimize security risks to your clusters, plan to update to new patch releases every month and minor versions every four months.
One of the many advantages of upgrading a cluster is that it automatically refreshes the cluster kubeconfig file. The kubeconfig file authenticates a user to a cluster. The kubeconfig file is added to your cluster directory when you create a cluster with bmctl. The default name and path is bmctl-workspace/CLUSTER_NAME/CLUSTER_NAME-kubeconfig.
When you upgrade a cluster, that cluster's kubeconfig file is automatically
renewed. Otherwise, the kubeconfig file expires one year after it was created.
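To plan upgrades around the one-year lifetime, you can estimate how many days a kubeconfig file has left. The Python sketch below is a rough aid under stated assumptions: "one year" is approximated as 365 days, you supply the creation or last-renewal time yourself, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: the one-year lifetime is approximated as 365 days.
KUBECONFIG_LIFETIME = timedelta(days=365)


def kubeconfig_expiry(created_at):
    """Return the approximate expiry time for a kubeconfig created at created_at."""
    return created_at + KUBECONFIG_LIFETIME


def days_remaining(created_at, now):
    """Return whole days left before the approximate expiry (negative if expired)."""
    return (kubeconfig_expiry(created_at) - now).days


# A file created on 2024-01-01 is treated as expiring on 2024-12-31.
print(days_remaining(datetime(2024, 1, 1), datetime(2024, 12, 1)))  # 30
```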
For information about how to upgrade your clusters, see upgrade your clusters.
Use VPC Service Controls with Cloud Interconnect or Cloud VPN
Cloud Interconnect provides low latency, high availability connections that let you transfer data reliably between your on-premises bare metal machines and Google Cloud Virtual Private Cloud (VPC) networks. To learn more about Cloud Interconnect, see Dedicated Interconnect provisioning overview.
Cloud VPN securely connects your peer network to your Virtual Private Cloud (VPC) network through an IPsec VPN connection. To learn more about Cloud VPN, see Cloud VPN overview.
VPC Service Controls works with either Cloud Interconnect or Cloud VPN to provide additional security for your clusters. VPC Service Controls helps to mitigate the risk of data exfiltration. Using VPC Service Controls, you can add projects to service perimeters that protect resources and services from requests that originate outside the perimeter. To learn more about service perimeters, see Service perimeter details and configuration.
To fully protect your clusters created with Google Distributed Cloud, you need to use Restricted VIP and add the following APIs to the service perimeter:
- Artifact Registry API (artifactregistry.googleapis.com)
- Resource Manager API (cloudresourcemanager.googleapis.com)
- Compute Engine API (compute.googleapis.com)
- Connect gateway API (connectgateway.googleapis.com)
- Google Container Registry API (containerregistry.googleapis.com)
- GKE Connect API (gkeconnect.googleapis.com)
- GKE Hub API (gkehub.googleapis.com)
- GKE On-Prem API (gkeonprem.googleapis.com)
- Identity and Access Management (IAM) API (iam.googleapis.com)
- Cloud Logging API (logging.googleapis.com)
- Cloud Monitoring API (monitoring.googleapis.com)
- Config Monitoring for Ops API (opsconfigmonitoring.googleapis.com)
- Service Control API (servicecontrol.googleapis.com)
- Cloud Storage API (storage.googleapis.com)
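With this many APIs, it is easy to leave one out of a perimeter. The following Python sketch compares a perimeter's restricted services against the list above. How you obtain the perimeter's current service list is outside the scope of this sketch, and the function name and example input are hypothetical.

```python
# The APIs listed above, by service name.
REQUIRED_SERVICES = frozenset({
    "artifactregistry.googleapis.com",
    "cloudresourcemanager.googleapis.com",
    "compute.googleapis.com",
    "connectgateway.googleapis.com",
    "containerregistry.googleapis.com",
    "gkeconnect.googleapis.com",
    "gkehub.googleapis.com",
    "gkeonprem.googleapis.com",
    "iam.googleapis.com",
    "logging.googleapis.com",
    "monitoring.googleapis.com",
    "opsconfigmonitoring.googleapis.com",
    "servicecontrol.googleapis.com",
    "storage.googleapis.com",
})


def missing_from_perimeter(restricted_services):
    """Return the required services that the perimeter does not yet restrict."""
    return sorted(REQUIRED_SERVICES - set(restricted_services))


# Hypothetical perimeter that restricts every required service except two.
partial = REQUIRED_SERVICES - {"gkehub.googleapis.com", "iam.googleapis.com"}
print(missing_from_perimeter(partial))
# ['gkehub.googleapis.com', 'iam.googleapis.com']
```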
When you use bmctl to create or upgrade a cluster, use the --skip-api-check flag to bypass calling Service Usage API (serviceusage.googleapis.com). Service Usage API isn't supported by VPC Service Controls.