
Switchover s5 primary database master db1070 -> db1100 - 15th Oct 05:00 - 05:30 UTC
Closed, Resolved, Public

Description

The current s5 master, db1070, is old and needs to be decommissioned.
It also has a disk in predictive failure, and it is in row D, from which some masters need to be moved to avoid a high concentration of masters in that row.

We are going to switch over from db1070 to db1100 on Tuesday 15th October from 05:00 to 05:30 UTC.
Read-only mode is required, and the following wikis will not accept writes during this maintenance window:

cebwiki
dewiki
enwikivoyage
mgwiktionary
shwiki
srwiki

Event Timeline

Marostegui triaged this task as Medium priority. Oct 1 2019, 4:42 AM
Marostegui moved this task from Triage to Pending comment on the DBA board.

Change 540762 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Promote db1100 to s5 master

https://gerrit.wikimedia.org/r/540762

Change 540763 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/dns@master] wmnet: Update s5-master alias

https://gerrit.wikimedia.org/r/540763

Mentioned in SAL (#wikimedia-operations) [2019-10-14T09:48:11Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db1130 into s5 api, db1100 will be removed later in preparation for tomorrow's failover T234300', diff saved to https://phabricator.wikimedia.org/P9325 and previous config saved to /var/cache/conftool/dbconfig/20191014-094809-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2019-10-14T10:07:59Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Set db1100 with weight 0 in preparation for tomorrow's failover T234300', diff saved to https://phabricator.wikimedia.org/P9326 and previous config saved to /var/cache/conftool/dbconfig/20191014-100758-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2019-10-15T04:15:46Z] <marostegui> Start pre-switchover steps T234300

Change 540762 merged by Marostegui:
[operations/puppet@production] mariadb: Promote db1100 to s5 master

https://gerrit.wikimedia.org/r/540762

Mentioned in SAL (#wikimedia-operations) [2019-10-15T05:00:08Z] <marostegui> Starting s5 failover from db1070 to db1100 - T234300

Mentioned in SAL (#wikimedia-operations) [2019-10-15T05:00:16Z] <marostegui@cumin2001> dbctl commit (dc=all): 'Set s5 as read-only for maintenance T234300', diff saved to https://phabricator.wikimedia.org/P9336 and previous config saved to /var/cache/conftool/dbconfig/20191015-050016-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2019-10-15T05:00:43Z] <marostegui@cumin2001> dbctl commit (dc=all): 'Promote db1100 to s5 master and remove read-only from s5 T234300', diff saved to https://phabricator.wikimedia.org/P9337 and previous config saved to /var/cache/conftool/dbconfig/20191015-050042-marostegui.json

Change 540763 merged by Marostegui:
[operations/dns@master] wmnet: Update s5-master alias

https://gerrit.wikimedia.org/r/540763

This was done.
Read-only start: 05:00:17
Read-only stop: 05:00:43

Total read only time: 26 seconds
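
For anyone wanting to double-check the figure above, here is a minimal sketch of the arithmetic. It assumes both timestamps are UTC times on the same day; the function name `read_only_seconds` and the constants are hypothetical, made up for this illustration, and are not part of any switchover tooling.

```python
from datetime import datetime

# Timestamps as reported in this task (assumed UTC, same day).
READ_ONLY_START = "05:00:17"
READ_ONLY_STOP = "05:00:43"


def read_only_seconds(start: str, stop: str) -> int:
    """Return the elapsed seconds between two HH:MM:SS times on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(stop, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


duration = read_only_seconds(READ_ONLY_START, READ_ONLY_STOP)
print(f"Total read-only time: {duration} seconds")
```

The same helper could be reused for other switchovers by passing in the start and stop times noted in the task.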