
Review and establish configurable quotas for users in the new Kubernetes cluster
Closed, ResolvedPublic

Description

In the course of finishing up PodSecurityPolicies etc., we should ensure the new cluster is configured more like a public-use environment and includes quotas to prevent any one tool from consuming all resources. These quotas need to be adjustable as well as mostly stable. The new cluster has eviction limits that will mostly prevent nodes from running out of RAM, but resource management can be baked into the cluster instead of being optional or handled in webservice in a vague way.


Event Timeline

Bstorm triaged this task as Medium priority. Oct 5 2019, 12:28 AM

Overall, it seems that these would be defined on the namespace level and thus should be set by maintain-kubeusers on creation of the namespace, which would allow adjustments to be made after that point when users request greater resources.

This implies that, aside from experimentation and such, once we have a list of what we want, I should close this task and merge it into T228499.

webservice tries to set some limits today that we can use as we try to decide what reasonable defaults for a tool's namespace are. They vary a bit by language runtime, but are fairly consistent:
php5.6, php7.2, tcl, python, python2, ruby2, golang, nodejs

limits:
  memory: 2Gi
  cpu: 2
requests:
  memory: 256Mi
  cpu: 0.125

jdk8

limits:
  memory: 4Gi
  cpu: 2
requests:
  memory: 256Mi
  cpu: 0.125
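
For reference, webservice puts those values into the container spec's resources stanza. A minimal sketch of the php7.2 case as a rendered pod (the names and image below are placeholders, not the real ones):

apiVersion: v1
kind: Pod
metadata:
  name: example-webservice        # hypothetical name
spec:
  containers:
  - name: webservice
    image: example/php72-web      # placeholder image, not the real registry path
    resources:
      limits:
        memory: 2Gi
        cpu: "2"
      requests:
        memory: 256Mi
        cpu: 125m                 # same as 0.125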

On the grid engine side, the default h_vmem limit is 4G for a webservice job. 17 tools currently have override files in /data/project/.system/config that grant a larger limit: 7 x 6G, 8 x 7G, 2 x 8G.

For a 'typical' tool account using Kubernetes we expect one pod running a webservice and occasional use of a second interactive pod (webservice [...] shell) for doing things like running a language specific package manager (composer, pip, npm, etc). The interactive pods are currently started without any explicit resource limits. I guess this means they would get the namespace default memory and cpu limits?
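
Worth noting on that last question: a container created with no resources stanza in a namespace with no LimitRange runs with no limits at all (BestEffort QoS), and if a ResourceQuota covering cpu/memory is in place, the pod would actually be rejected outright for not specifying them. With a LimitRange supplying defaults, the admission controller fills the values in, so the interactive pod's container would effectively end up with a stanza like the following (the numbers are made-up defaults, purely for illustration):

# Injected by the LimitRanger admission plugin into the container spec;
# the values here are hypothetical defaults, not a concrete proposal.
resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 250m
    memory: 256Mi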

It looks like we could preserve current assumed limits with something like:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: tool-{name}-quota
spec:
  hard:
    requests.cpu: "0.25"         # 2 x 0.125
    requests.memory: 512Mi       # 2 x 256Mi
    limits.cpu: "2"
    limits.memory: 4Gi           # Could be lower, but java webservice users would need a bump
    pods: "2"                    # webservice + interactive
    replicationcontrollers: "1"  # Assumes only 1 deployment in a 'typical' tool
    resourcequotas: "1"          # Would keep us from accidentally making multiple ResourceQuota objects per namespace
    services: "1"                # ¿Only a LoadBalancer?
    services.nodeports: "0"      # No tool should use a nodeport
    services.loadbalancers: "1"  # Assumes only 1 webservice in a 'typical' tool
    secrets: "16"                # Arbitrarily chosen
    configmaps: "2"              # Arbitrarily chosen, I know we are planning on 1 per ns right now for state tracking
    persistentvolumeclaims: "0"  # ¿Are we going to be using PVCs yet?

If and when we are ready to have folks start running scheduled jobs, these defaults would need to be re-examined. There are likely some folks already running custom deployments that will not fit in these limits. That's not a huge problem as long as we document the default limits well, have a process for requesting higher limits, and have some rubric for evaluating those requests.

Thanks! I had no idea the JVM one was different.

I'm trying to think about this as the default starting point for a namespace in much the same way we handle quotas for an openstack that can be adjusted later. With webservice as the only gateway into Kubernetes, none of this even needs to be considered (and there is zero flexibility in webservice at this time), but I want to start on the right foot and make sure we are thinking ahead.

I also want to introduce a LimitRange in each namespace that should basically make the default limits configurable by admins if they are removed from webservice--so there would be a story for a user who wishes to have their per-pod/container limit changed.

Setting pods at 2 seems very contrary to any possibility of growing the usage of Kubernetes. It kind of codes the needs of webservice into the cluster. We could easily set limits.memory even higher than that if we set a reasonable LimitRange for individual containers--that's the limit for the whole namespace, after all (and we are not far from placing a simple wrapper somewhere for cron jobs).

I need to test if I can set quotas on things like ingresses (should be able to in 1.15). I'll come back with a counter/slightly altered proposal shortly.

No need to include PVCs. Users cannot create them with the current RBAC (or mount them with the current PSP, though maybe we should change that)--only cluster admins can. We have no use for them yet either, but if we can get away from a huge, shared NFS...

> I'm trying to think about this as the default starting point for a namespace in much the same way we handle quotas for an openstack that can be adjusted later.

Agreed, I was thinking about it in the same manner. My main reason for starting with really low limits is a belief that it is always easier to raise defaults than to lower them. Maybe I aimed too low with the single web container core use case.

I do think that the default tool account's quota should be relatively constrained. This is more for social/community reasons than technical reasons though. A large quota gives the tool's maintainers more space to spread out in, meaning that they are not incentivized to build focused, single-purpose tools. 'Suites' of tools all by the same author were common on Toolserver (XTools, etc). Toolforge's feature of multi-maintainer tool accounts is helped much more by smaller, stand-alone tools. Smaller tools are easier for others to understand for the purpose of adoption or forking. I am not opposed to large tools or to small tools with larger than 'normal' resource needs, but it would be nice to put at least the small hurdle of asking for more quota in front of folks so they have a moment to think. We could even think about a 3-tier setup with a low default, some self-serve interface to jump up to a medium size quota, and then something like the process current Cloud VPS users follow to step up beyond medium into large territory.

Makes sense. Here's an example of using a LimitRange and a quota, with lots of comments and opinions. This works on minikube, but the user experience is kind of annoying in certain places.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: tool-blurp           # When the resource is already named by the API, it's just extra keystrokes to add "quota"
spec:
  hard:
    requests.cpu: "2"         # Across all pods a namespace can deploy, they can only grab from a pool of "2"
    requests.memory: 6Gi      # This would allow a java webservice and a webservice shell to run
    limits.cpu: "2"           # You can only acquire as much as you request.
    limits.memory: 8Gi        # Allows a burst of memory; the limit on each individual container will be MUCH smaller
    pods: "4"                 # Webservice with no replicas or state machines, a shell and 2 crons
    services: "1"             # Initial usecase = webservice
    # not limiting loadbalancers because they don't work anyway, and they are services (thus 1 only) -- and that 1 is type ClusterIP with webservice
    services.nodeports: "0"   # Seems to break most of our model so far if we open that can of worms
    # I'm not sure limiting resourcequotas would "work" -- it would limit inside the NS only where users cannot touch them
    replicationcontrollers: "1"   # Possibly (probably) redundant due to the pod and services limits; redundant limits are just more complication and work -- see below
    secrets: "10"             # These are totally unused by users in the webservice regime, but they could fill etcd if unchecked
    configmaps: "10"          # Ditto (but the limit could be 100, and we wouldn't have to worry much)
    persistentvolumeclaims: "3"  # Users cannot create them! However, if we leave the option open for us to...
---

apiVersion: v1
kind: LimitRange
metadata:
  name: tool-blurp
spec:
  limits:
  - default:
      cpu: "500m"       # If we stop setting this in webservice, this is what containers will get (the default limits; default requests below)
      memory: 512Mi
    defaultRequest:
      cpu: "250m"
      memory: 256Mi
    max:
      memory: 4Gi      # We could allow webservice to set limits up to these values.
      cpu: 1
    min:
      memory: 256Mi
    type: Container
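
Since the whole point is that these are adjustable per namespace, granting a tool more resources later should just be a matter of re-applying its ResourceQuota with bigger numbers. A hypothetical example (the values are made up purely for illustration) for a tool that has asked for extra headroom:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: tool-blurp
spec:
  hard:
    requests.cpu: "4"         # bumped from "2", illustrative only
    requests.memory: 12Gi     # bumped from 6Gi, illustrative only
    limits.cpu: "4"
    limits.memory: 16Gi
    pods: "8"
    # the object-count keys (services, secrets, configmaps, etc.) would stay as in the default quota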

One place it is annoying is that the replicationcontrollers restriction acts like something is simply broken: you deploy a Deployment and it just doesn't work, but nothing stops you up front. It does the same for pods: https://github.com/kubernetes/kubernetes/issues/55037
It may not be worth setting some of them.

In the current structure, none of this is actually limited, obviously. However, setting the limits assumes that we will make more use of kubernetes than just webservice sometime "soon".

I should probably mention that only standard objects in the core API can be quota limited...except for truly custom ones that are fully qualified and outside of k8s.io. It's a bit strange at the moment. This is why I didn't add one for ingresses. I couldn't find a way that made it work.
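
For what it's worth, the generic object-count syntax (count/<resource>.<group>) does cover the fully qualified custom resources mentioned above. A hypothetical sketch, using widgets.example.com, the example resource name from the upstream docs:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: tool-blurp-object-counts   # hypothetical separate quota object, purely for illustration
spec:
  hard:
    count/widgets.example.com: "5" # caps how many of that custom resource the namespace may hold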

Change 542501 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[labs/tools/maintain-kubeusers@master] quotas: Add default quotas and limitranges to new tools

https://gerrit.wikimedia.org/r/542501

Change 542501 merged by Bstorm:
[labs/tools/maintain-kubeusers@master] quotas: Add default quotas and limitranges to new tools

https://gerrit.wikimedia.org/r/542501

This should work when the new user service does.

Can the note "Currently tool memory limits can only be adjusted for Grid Engine Web services (T183436)." be removed from the wiki page? Or is this still the case?