google/gke-policy-automation

GKE Policy Automation

This repository contains the tool and the policy library for validating GKE clusters against configuration best practices and scalability limits.

Note: this is not an officially supported Google product.


Installation

Container image

Container images for the GKE Policy Automation tool are hosted on ghcr.io. Check the packages page for a list of all tags and versions.

docker pull ghcr.io/google/gke-policy-automation:latest
docker run --rm ghcr.io/google/gke-policy-automation check \
--project my-project --location europe-west2 --name my-cluster

Krew

GKE Policy Automation is available as a Krew plugin.

kubectl krew install gke-policy
kubectl gke-policy check --discovery -p my-project

Binary

Binaries for Linux, Windows and Mac are available as tarballs on the releases page.

Source code

Go v1.22 or newer is required. Check the development guide for more details.

git clone https://github.com/google/gke-policy-automation.git
cd gke-policy-automation
make build
./gke-policy check \
--project my-project --location europe-west2 --name my-cluster

Usage

Full user guide: GKE Policy Automation User Guide.

Checking best practices

The configuration best practices check validates GKE clusters against the set of GKE configuration policies.

./gke-policy check \
--project my-project --location europe-west2 --name my-cluster

Checking scalability limits

The scalability limits check validates GKE clusters against the GKE quotas and limits. The tool reports a violation when a current value crosses its defined threshold.

./gke-policy check scalability \
--project my-project --location europe-west2 --name my-cluster

NOTE: you need to run kube-state-metrics to export cluster metrics in order to use the cluster scalability limits check. Refer to the kube-state-metrics installation & configuration guide for more details.

The tool assumes that metrics are available in Cloud Monitoring, e.g. as a result of metrics collection based on Google Cloud Managed Service for Prometheus. If self-managed Prometheus collection is used, be sure to:

  • Configure Prometheus scraping for kube-state-metrics using PodMonitor / ServiceMonitor and the corresponding annotations, e.g. prometheus.io/scrape

  • Configure a custom Prometheus API server address in the tool

    • Prepare config.yaml:

      inputs:
        metricsAPI:
          enabled: true
          address: http://my-prometheus-svc:8080 # Prometheus server API endpoint
          username: user   # username for basic authentication (optional)
          password: secret # password for basic authentication (optional)
    • Run ./gke-policy check scalability -c config.yaml
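Before running the scalability check against a self-managed Prometheus, it can help to confirm that kube-state-metrics series are actually queryable there. The following is a minimal sketch, not part of the tool: prom_query is a hypothetical helper, CURL_BIN is an assumed override that only exists to allow dry runs, and it assumes the address from config.yaml is the base URL of a server exposing the standard Prometheus HTTP API at /api/v1/query.

```shell
# Hypothetical pre-flight helper (not part of gke-policy).
# Usage: prom_query <base-url> <promql> [user:password]
# Sends an instant query to the Prometheus HTTP API and prints the JSON reply.
prom_query() {
  base="${1%/}"   # drop a single trailing slash, if any
  if [ -n "${3:-}" ]; then
    # Basic authentication, matching the optional username/password in config.yaml
    "${CURL_BIN:-curl}" -fsS --get -u "$3" \
      --data-urlencode "query=$2" "$base/api/v1/query"
  else
    "${CURL_BIN:-curl}" -fsS --get \
      --data-urlencode "query=$2" "$base/api/v1/query"
  fi
}

# Example (kube_node_info is a metric exported by kube-state-metrics):
#   prom_query http://my-prometheus-svc:8080 'count(kube_node_info)' user:secret
```

A reply containing "status":"success" with a non-empty result suggests the metrics are being scraped; an empty result usually points at the scraping configuration rather than at the tool.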

Common check options

The common options apply to all types of check commands.

Selecting multiple clusters

Check multiple GKE clusters using a configuration file.

./gke-policy check -c config.yaml

The config.yaml file:

clusters:
  - name: prod-central
    project: my-project-one
    location: europe-central2
  - id: projects/my-project-two/locations/europe-west2/clusters/prod-west
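The example above identifies a cluster in two ways: by name, project and location, or by a single id. As a small illustration of how the two forms relate, the sketch below builds the id string from its parts. The cluster_id function is a hypothetical helper, and the format is inferred from the id shown in the example.

```shell
# Hypothetical helper: build a cluster id from project, location and name.
# The format mirrors the id used in the config.yaml example.
cluster_id() {
  printf 'projects/%s/locations/%s/clusters/%s\n' "$1" "$2" "$3"
}

# Example:
#   cluster_id my-project-two europe-west2 prod-west
#   prints: projects/my-project-two/locations/europe-west2/clusters/prod-west
```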

Using cluster discovery

Check multiple clusters by discovering them in selected GCP projects, folders, or the entire organization, using Cloud Asset Inventory and a configuration file.

./gke-policy check -c config.yaml

The config.yaml file:

clusterDiscovery:
  enabled: true
  organization: "123456789012"

It is possible to use cluster discovery on a given project using command line flags only:

./gke-policy check --discovery -p my-project-id
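When only command line flags are used, discovery over a handful of projects can be scripted by repeating the command per project. This is a minimal sketch using only the flags shown above; check_projects is a hypothetical wrapper, and GKE_POLICY_BIN is an assumed variable for overriding the binary path.

```shell
# Hypothetical wrapper: run project-level discovery for each given project ID.
# GKE_POLICY_BIN (assumed name) overrides the binary; defaults to ./gke-policy.
check_projects() {
  status=0
  for project in "$@"; do
    # Keep going after a failed run, but remember that one occurred
    "${GKE_POLICY_BIN:-./gke-policy}" check --discovery -p "$project" || status=1
  done
  return "$status"
}

# Example:
#   check_projects my-project-one my-project-two
```

For larger scopes, the configuration file approach with clusterDiscovery is likely the better fit, since it performs a single run.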

Defining inputs

Data for cluster validation can be retrieved from multiple data sources, e.g. the GKE API, the Cloud Monitoring API, or a local JSON file exported from the GKE API. For best practices checks, the GKE API is enabled by default; for scalability checks, the metrics API is enabled as well. Check the Inputs user guide for more details.

Example:

  • Scalability check with the metrics API input reading from Cloud Monitoring in a dedicated project, and all other values left at their defaults

    inputs:
      gkeAPI:
        enabled: true
      gkeLocal:
        enabled: false
        file:
      metricsAPI:
        enabled: true
        project: sample-project
        metrics:

Defining outputs

The cluster validation results can be published to multiple outputs, including a JSON file, a Pub/Sub topic, a Cloud Storage bucket, or Security Command Center. Check the Outputs user guide for more details.

Examples:

  • JSON file output with command line flags

    ./gke-policy check \
    --project my-project --location europe-west2 --name my-cluster \
    --out-file output.json
  • All outputs enabled in a configuration file

    clusters:
      - name: my-cluster
        project: my-project
        location: europe-west2
    outputs:
      - file: output.json
      - pubsub:
          topic: Test
          project: my-pubsub-project
      - cloudStorage:
          bucket: bucket-name
          path: path/to/write
      - securityCommandCenter:
          organization: "153963171798"

Custom Policy repository

Specify a custom repository containing GKE cluster best practice policies and check the cluster against them.

  • Custom policies source with command line flags

    ./gke-policy check \
    --project my-project --location europe-west2 --name my-cluster \
    --git-policy-repo "https://github.com/google/gke-policy-automation" \
    --git-policy-branch "main" \
    --git-policy-dir "gke-policies-v2"
  • Custom policies source with configuration file

    ./gke-policy check -c config.yaml

    The config.yaml file:

    clusters:
      - name: my-cluster
        project: my-project
        location: europe-west2
    policies:
      - repository: https://domain.com/your/custom/repository
        branch: main
        directory: gke-policies-v2

Authentication

The tool fetches GKE cluster details using GCP APIs. Application default credentials are used by default.

  • When running in a GCP environment, the tool uses the attached service account by default
  • When running locally, use the gcloud auth application-default login command to obtain application default credentials
  • To use credentials from a service account key file, pass the --creds parameter with the path to the file

The minimum required IAM role is roles/container.clusterViewer on each cluster's project. Additional roles may be needed, depending on the configured outputs; check the authentication section in the user guide.
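As an illustration of the minimum role requirement, the sketch below grants roles/container.clusterViewer to a service account at the project level with gcloud. The grant_cluster_viewer function, the GCLOUD_BIN override and the service account email are hypothetical; adjust them to your environment.

```shell
# Hypothetical helper: grant the minimum required role on a project.
# Usage: grant_cluster_viewer <project-id> <service-account-email>
# GCLOUD_BIN (assumed name) overrides the gcloud binary, e.g. for dry runs.
grant_cluster_viewer() {
  "${GCLOUD_BIN:-gcloud}" projects add-iam-policy-binding "$1" \
    --member="serviceAccount:$2" \
    --role="roles/container.clusterViewer"
}

# Example (the service account name is a placeholder):
#   grant_cluster_viewer my-project checker@my-project.iam.gserviceaccount.com
```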

Serverless execution

The GKE Policy Automation tool can be executed in a serverless way to perform automatic evaluations of the clusters running in your organization. Please check our reference Terraform Solution, which leverages GCP serverless services including Cloud Scheduler and Cloud Run.

Contributing

Please check out Contributing and Code of Conduct docs before contributing.

Development

Please check GKE Policy Automation development for guides on building and developing the application.

Policy authoring

Please check the GKE Policy authoring guide for details on authoring Rego rules for GKE Policy Automation.

License

Apache License 2.0