Setup Kubernetes Masters in a HA setup
Closed, Resolved, Public

Assigned To

Authored By

	yuvipanda
	Aug 12 2016, 8:15 PM

Description

Since we have instances randomly freezing, and it could happen to the kubernetes master too, let's make sure it's got a HA setup going.

Need to follow http://kubernetes.io/docs/admin/high-availability/#replicated-api-servers

Details

	Subject	Repo	Branch	Lines +/-
	tools: Allow multiple k8s master to access etcd	operations/puppet	production	+2 -2
	k8s: Make controller-manager & scheduler be HA	operations/puppet	production	+4 -2


Related Objects

Status	Assigned	Task
		Restricted Task
Resolved	• Bstorm	T246122 Upgrade the Toolforge Kubernetes cluster to v1.16
		Restricted Task
Resolved	• bd808	T232536 Toolforge Kubernetes internal API down, causing `webservice` and other tooling to fail
Resolved	• Bstorm	T236565 "tools" Cloud VPS project jessie deprecation
Resolved	aborrero	T101651 Set up toolsbeta more fully to help make testing easier
Resolved	• Bstorm	T166949 Homedir/UID info breaks after a while in Tools Kubernetes (can't read replica.my.cnf)
Resolved	• Bstorm	T246059 Add admin account creation to maintain-kubeusers
Resolved	• Bstorm	T154504 Make webservice backend default to kubernetes
Declined	None	T245230 Investigate cpu/ram requests and limits for DaemonSets pods
Resolved	• Bstorm	T214513 Deploy and migrate tools to a Kubernetes v1.15 or newer cluster
Resolved	aborrero	T142862 Setup Kubernetes Masters in a HA setup
Resolved	aborrero	T215663 Stand up upgraded Toolforge etcd clusters
Resolved	aborrero	T215530 Sort out the best method of spinning up multiple toolforge kubernetes masters
Resolved	aborrero	T215679 Sort out and test deploying the worker nodes in a sane fashion
Resolved	• Bstorm	T215529 Puppetize/stand up a load balancer for K8s API servers
Resolved	aborrero	T215975 Package/copy kubeadm, kubelet, docker-ce and kubectl to Toolforge Aptly or Reprepro

Event Timeline

yuvipanda created this task. Aug 12 2016, 8:15 PM

Restricted Application added a project: Cloud-Services. Aug 12 2016, 8:15 PM

Restricted Application added a subscriber: Aklapper.

Change 304503 had a related patch set uploaded (by Yuvipanda):
k8s: Make controller-manager & scheduler be HA

https://gerrit.wikimedia.org/r/304503

Change 304504 had a related patch set uploaded (by Yuvipanda):
tools: Allow multiple k8s master to access etcd

https://gerrit.wikimedia.org/r/304504

yuvipanda mentioned this in rOPUP5c2bcd527896: k8s: Make controller-manager & scheduler be HA. Aug 12 2016, 8:37 PM

yuvipanda mentioned this in rOPUP862efcebf8c0: tools: Allow multiple k8s master to access etcd.

This ran into a bump: we have maintain-kubeusers, which is used to populate the token auth of all the masters. It should run in only one place, however, and push updates to every master.

To do this, I am going to do the following:

Move maintain-kubeusers to a centralized location (puppetmaster maybe?)
Set up some way for it to push config to all the masters and restart them only when it's sure the config has propagated everywhere.
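The second step above (push everywhere, confirm, only then restart) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not what maintain-kubeusers or the Puppet patches actually do: `push_then_restart`, `digest_of`, and the three callables (`push`, `read_back_digest`, `restart`) are hypothetical names, and how a file reaches a master or how a master is restarted is left to the caller.

```python
import hashlib
from typing import Callable, Iterable


def digest_of(data: bytes) -> str:
    """Return the SHA-256 hex digest used to compare config copies."""
    return hashlib.sha256(data).hexdigest()


def push_then_restart(
    masters: Iterable[str],
    config: bytes,
    push: Callable[[str, bytes], None],        # hypothetical: copy config to one master
    read_back_digest: Callable[[str], str],    # hypothetical: digest of the file now on that master
    restart: Callable[[str], None],            # hypothetical: restart the API server on that master
) -> bool:
    """Push config to every master; restart only if all copies match.

    Returns True if every master was verified and restarted,
    False if any master held a stale copy (nothing is restarted then).
    """
    masters = list(masters)
    expected = digest_of(config)

    # Phase 1: copy the new config everywhere. An exception here
    # propagates before any restart has happened.
    for master in masters:
        push(master, config)

    # Phase 2: read back from every master and compare digests.
    stale = [m for m in masters if read_back_digest(m) != expected]
    if stale:
        return False

    # Phase 3: every master has the same config, so restarting is safe.
    for master in masters:
        restart(master)
    return True
```

Separating the verify phase from the restart phase means a master that missed the update never restarts into a token file that disagrees with its peers.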

yuvipanda mentioned this in rOPUP088f1f7ca7d3: k8s: Make controller-manager & scheduler be HA. Aug 12 2016, 10:54 PM

yuvipanda mentioned this in rOPUPead26379361d: tools: Allow multiple k8s master to access etcd.

Change 304503 merged by Yuvipanda:
k8s: Make controller-manager & scheduler be HA

https://gerrit.wikimedia.org/r/304503

Change 304504 merged by Yuvipanda:
tools: Allow multiple k8s master to access etcd

https://gerrit.wikimedia.org/r/304504

yuvipanda created subtask T144153: Move kubernetes authentication to using X.509 client certs. Aug 29 2016, 7:30 AM

scfc removed a project: Patch-For-Review. Nov 26 2016, 11:41 PM

scfc triaged this task as Medium priority. Feb 16 2017, 8:14 PM

scfc moved this task from Backlog to Ready to be worked on on the Toolforge board.

yuvipanda removed yuvipanda as the assignee of this task. Mar 22 2017, 10:31 PM

• Phabricator_maintenance removed a subscriber: yuvipanda. Jun 7 2017, 6:40 PM

• bd808 edited projects, added Kubernetes; removed Cloud-Services. Jul 28 2017, 11:01 PM

• Bstorm added a parent task: T214513: Deploy and migrate tools to a Kubernetes v1.15 or newer cluster. Feb 7 2019, 12:29 AM

• Bstorm removed a subtask: T144153: Move kubernetes authentication to using X.509 client certs. Feb 7 2019, 9:06 PM

• Bstorm added subtasks: T215663: Stand up upgraded Toolforge etcd clusters, T215530: Sort out the best method of spinning up multiple toolforge kubernetes masters. Feb 12 2019, 10:27 PM

aborrero closed subtask T215530: Sort out the best method of spinning up multiple toolforge kubernetes masters as Resolved. Jul 4 2019, 3:41 PM

aborrero closed subtask T215663: Stand up upgraded Toolforge etcd clusters as Resolved. Jul 4 2019, 3:45 PM

We know how to do this now.

In T215531 (Deploy upgraded Kubernetes to toolsbeta) we are developing a new k8s cluster that is deployed with kubeadm. This new mechanism takes care of building the multi-master setup for us.

The next version of the toolforge k8s service should contain a fix for this.

Closing task now. Feel free to reopen if required.

• ayounsi added a project: Wikimedia-Incident. Sep 30 2019, 9:55 PM

Krinkle edited projects, added Sustainability (Incident Followup); removed Wikimedia-Incident. Aug 19 2022, 3:05 PM
