Common information
- dashboard: TODO
- runbook: https://wikitech.wikimedia.org/wiki/Management_Interfaces#Reset_the_management_card
- alertname: ManagementSSHDown
- job: probes/mgmt
- module: ssh_banner
- prometheus: ops
- rack: E1
- severity: task
- site: eqiad
- source: prometheus
- team: dcops
Firing alerts
- dashboard: TODO
- description: The management interface at an-presto1006.mgmt:22 has been unresponsive for multiple hours.
- runbook: https://wikitech.wikimedia.org/wiki/Management_Interfaces#Reset_the_management_card
- summary: Unresponsive management for an-presto1006.mgmt:22
- alertname: ManagementSSHDown
- instance: an-presto1006.mgmt:22
- job: probes/mgmt
- module: ssh_banner
- prometheus: ops
- rack: E1
- severity: task
- site: eqiad
- source: prometheus
- team: dcops
- dashboard: TODO
- description: The management interface at backup1010.mgmt:22 has been unresponsive for multiple hours.
- runbook: https://wikitech.wikimedia.org/wiki/Management_Interfaces#Reset_the_management_card
- summary: Unresponsive management for backup1010.mgmt:22
- alertname: ManagementSSHDown
- instance: backup1010.mgmt:22
- job: probes/mgmt
- module: ssh_banner
- prometheus: ops
- rack: E1
- severity: task
- site: eqiad
- source: prometheus
- team: dcops
- dashboard: TODO
- description: The management interface at dse-k8s-worker1005.mgmt:22 has been unresponsive for multiple hours.
- runbook: https://wikitech.wikimedia.org/wiki/Management_Interfaces#Reset_the_management_card
- summary: Unresponsive management for dse-k8s-worker1005.mgmt:22
- alertname: ManagementSSHDown
- instance: dse-k8s-worker1005.mgmt:22
- job: probes/mgmt
- module: ssh_banner
- prometheus: ops
- rack: E1
- severity: task
- site: eqiad
- source: prometheus
- team: dcops
- dashboard: TODO
- description: The management interface at dumpsdata1006.mgmt:22 has been unresponsive for multiple hours.
- runbook: https://wikitech.wikimedia.org/wiki/Management_Interfaces#Reset_the_management_card
- summary: Unresponsive management for dumpsdata1006.mgmt:22
- alertname: ManagementSSHDown
- instance: dumpsdata1006.mgmt:22
- job: probes/mgmt
- module: ssh_banner
- prometheus: ops
- rack: E1
- severity: task
- site: eqiad
- source: prometheus
- team: dcops
- dashboard: TODO
- description: The management interface at elastic1090.mgmt:22 has been unresponsive for multiple hours.
- runbook: https://wikitech.wikimedia.org/wiki/Management_Interfaces#Reset_the_management_card
- summary: Unresponsive management for elastic1090.mgmt:22
- alertname: ManagementSSHDown
- instance: elastic1090.mgmt:22
- job: probes/mgmt
- module: ssh_banner
- prometheus: ops
- rack: E1
- severity: task
- site: eqiad
- source: prometheus
- team: dcops
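To manually confirm whether one of the instances listed above is still unresponsive, a check along the following lines may help. This is a minimal sketch, not the actual probe behind the `ssh_banner` module: the function names (`split_instance`, `looks_like_ssh_banner`, `probe_ssh_banner`) and the timeout value are hypothetical, and the check assumes the first bytes the server sends are its SSH identification string.

```python
from __future__ import annotations

import socket


def split_instance(instance: str) -> tuple[str, int]:
    """Split an 'instance' label such as 'backup1010.mgmt:22' into (host, port)."""
    host, _, port = instance.rpartition(":")
    return host, int(port)


def looks_like_ssh_banner(data: bytes) -> bool:
    """Return True if the data starts with an SSH identification string.

    Simplification: a server may send other lines before the 'SSH-' line;
    this sketch only inspects the first bytes received.
    """
    return data.startswith(b"SSH-")


def probe_ssh_banner(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Connect to host:port and report whether an SSH banner comes back."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            data = sock.recv(256)
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
    return looks_like_ssh_banner(data)


if __name__ == "__main__":
    # Instance labels copied from the firing alerts above.
    for instance in ("an-presto1006.mgmt:22", "backup1010.mgmt:22"):
        host, port = split_instance(instance)
        state = "responding" if probe_ssh_banner(host, port) else "unresponsive"
        print(f"{instance}: {state}")
```

A `False` result only indicates that no SSH banner was received from where the script runs; the runbook linked above remains the reference for resetting the management card.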