Cisco UCS Troubleshooting


BRKINI-2011

Lessons Learned:
Troubleshooting UCS
from a TAC Engineer’s
Perspective

Aaron Sampliner, Systems Engineer


Jason Lill, CX Technical Leader
Cisco Webex Teams

Questions?
Use Cisco Webex Teams (formerly Cisco Spark)
to chat with the speaker after the session

How
1 Find this session in the Cisco Events Mobile App
2 Click “Join the Discussion”
3 Install Webex Teams or go directly to the team space
4 Enter messages/questions in the team space

cs.co/ciscolivebot#BRKINI-2011

BRKINI-2011 © 2019 Cisco and/or its affiliates. All rights reserved. Cisco Public 3
Agenda
• The Network is Down!!?
• The Story with Storage
• UCS Management Mastery

The Network is Down!!?
Isolating Networking Issues
UCS Networking Basics
Ethernet Switching Modes

• UCS has two Ethernet switching modes
• The mode affects how Layer 2 forwarding concepts are applied
• End Host Mode
• Appears like a hypervisor host to the upstream network
• Default and recommended best practice

• Switch Mode
• FI works like a normal Layer 2 switch with spanning-tree

Physical and Logical Ports in UCS

Fabric Interconnect
• Uplink Ports
• Server Ports
IOM / FEX
• Network Interface (NIF) Ports
• Host Interface (HIF) Ports
Mezz Adapter
• Adapter Port
• vNIC
vNIC / vEthernet / Virtual Interface (VIF)

Network Troubleshooting for UCS
TAC Methodology…
• Simplify the issue
• UCS only deals with Layer 2
• Are we learning the MAC address on the FI?
• Is the issue fabric specific?
• If multiple servers are affected, pick one to work with

Common Problem: My VM is unreachable on the network
Lab Topology
[Diagram: two Nexus 5548 switches (vPC) above two UCS 6248UP Fabric Interconnects; links labeled Port-Channel]

Tracing the Path in UCS
Start from the most southbound endpoint
Gather information from the hypervisor about the VM…
• Hypervisor Host (blade)

• VM MAC Address
• vSwitch Port Group
• IP Address

Understand Virtual Network Path
Verifying virtual switch is configured as expected…

• VLAN 211 being tagged on vSwitch
• Both vmnics connected to vSwitch
• Both vmnics are active

Understand Virtual Network Path
Using “esxtop” command with the ‘n’ option…
[root@localhost:~] esxtop

4:50:33pm up 2 days 23:29, 696 worlds, 3 VMs, 6 vCPUs; CPU load average: 0.00, 0.00, 0.00

PORT-ID USED-BY TEAM-PNIC DNAME PKTTX/s MbTX/s PSZTX PKTRX/s MbRX/s PSZRX %DRPTX %DRPRX
33554433 Management n/a vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
33554434 vmnic0 - vSwitch0 0.00 0.00 0.00 12.25 0.01 82.00 0.00 0.00
33554435 Shadow of vmnic0 n/a vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
33554439 vmk0 vmnic1 vSwitch0 10.11 0.01 181.00 8.16 0.00 73.00 0.00 0.00
33554440 vmnic1 - vSwitch0 10.11 0.01 181.00 20.02 0.01 79.00 0.00 0.00
33554441 Shadow of vmnic1 n/a vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
33554442 512108:rhel7-1 vmnic1 vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
33554443 512185:jlil-central vmnic1 vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
33554444 510584:Win-01 vmnic0 vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50331649 Management n/a vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50331650 vmnic2 - vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50331651 Shadow of vmnic2 n/a vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50331652 vmnic4 - vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50331653 Shadow of vmnic4 n/a vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50331654 vmk1 vmnic2 vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50331655 vmk2 vmnic4 vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
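
The USED-BY and TEAM-PNIC columns in the output above tell us which vmnic a VM is currently using. The lookup can be sketched in Python, assuming the esxtop network view has been saved as plain text in the column layout shown here. `team_pnic_for_vm` is a hypothetical helper for illustration, not part of any VMware tool.

```python
def team_pnic_for_vm(esxtop_text, vm_name):
    """Return the TEAM-PNIC (uplink vmnic) for the port used by vm_name, or None."""
    for line in esxtop_text.splitlines():
        fields = line.split()
        # A data row is PORT-ID, USED-BY (may contain spaces), TEAM-PNIC,
        # DNAME, then eight numeric counters, so at least 12 fields.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        used_by = " ".join(fields[1:-10])
        team_pnic = fields[-10]
        # VM ports are listed as "<world id>:<VM name>".
        if used_by.split(":", 1)[-1] == vm_name:
            return team_pnic
    return None


# Rows copied from the esxtop output above.
sample = """\
PORT-ID USED-BY TEAM-PNIC DNAME PKTTX/s MbTX/s PSZTX PKTRX/s MbRX/s PSZRX %DRPTX %DRPRX
33554435 Shadow of vmnic0 n/a vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
33554442 512108:rhel7-1 vmnic1 vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
33554444 510584:Win-01 vmnic0 vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
"""

print(team_pnic_for_vm(sample, "Win-01"))   # vmnic0
print(team_pnic_for_vm(sample, "rhel7-1"))  # vmnic1
```

The helper only reads the text; it assumes VM names appear exactly as shown after the world ID.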

Understand Virtual Network Path
What we know so far…
• VM specifics
• MAC Address
• IP Address
• Virtual machine is actively using vmnic0 to send traffic northbound in the UCS
• Next, we need to understand which fabric this traffic should be traversing

Determining Fabric Path in UCS
vmnic to vNIC mapping…
We use the vmnic MAC Address and match it with the vNIC in UCS
[root@localhost:~] esxcfg-nics -l
Name PCI Driver Link Speed Duplex MAC Address MTU Description
vmnic0 0000:06:00.0 enic Up 10000Mbps Full 00:25:b5:a1:a1:a0 1500 Cisco Systems Inc Cisco VIC Ethernet NIC
vmnic1 0000:07:00.0 enic Up 10000Mbps Full 00:25:b5:b1:b1:b1 1500 Cisco Systems Inc Cisco VIC Ethernet NIC
vmnic2 0000:08:00.0 enic Up 10000Mbps Full 00:25:b5:a1:a1:b0 1500 Cisco Systems Inc Cisco VIC Ethernet NIC
vmnic3 0000:85:00.0 enic Up 10000Mbps Full 00:25:b5:b1:b1:b0 1500 Cisco Systems Inc Cisco VIC Ethernet NIC
vmnic4 0000:86:00.0 enic Up 10000Mbps Full 00:25:b5:b1:b1:c0 1500 Cisco Systems Inc Cisco VIC Ethernet NIC
vmnic5 0000:87:00.0 enic Up 10000Mbps Full 00:25:b5:a1:a1:a1 1500 Cisco Systems Inc Cisco VIC Ethernet NIC

Common mistake – assuming vNIC# and vmnic# are the same without verifying

Determining Fabric Path in UCS
Match the MAC of the vmnic to the vNIC on the service profile…

Are we learning the MAC address on the FI?
Send traffic from the VM and see what is working…
• Based on what we have found we expect the following:
• VM traffic should be traversing Fabric Interconnect A
• VLAN ID should be 211 based on vSwitch config
• VM MAC Address – 00:50:56:8d:29:15
CiscoLive-2019-A# connect nxos a
CiscoLive-2019-A(nxos)# show mac address-table vlan 211
Legend:
* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link
VLAN MAC Address Type age Secure NTFY Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
* 211 0050.568d.2915 dynamic 10 F F Veth4173
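
ESXi prints the VM MAC as 00:50:56:8d:29:15 while NX-OS prints it as 0050.568d.2915, which makes it easy to miss a match when searching a long MAC table. A small Python sketch of the conversion follows; `to_cisco_mac` is a hypothetical helper name used only for illustration.

```python
def to_cisco_mac(mac):
    """Convert any common MAC notation to NX-OS dotted form (xxxx.xxxx.xxxx)."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    # Group the 12 hex digits into three blocks of four.
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


print(to_cisco_mac("00:50:56:8d:29:15"))  # 0050.568d.2915
```

The result can be pasted after `show mac address-table vlan 211 | include` style filters or compared directly against the table output.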

vEthernet to Server mapping
Verifying that traffic is flowing as expected…
CiscoLive-2019-A(nxos)# show interface vethernet 4173
Vethernet4173 is up
Bound Interface is port-channel1364
Port description is server 1/1, VNIC vNIC2
Hardware is Virtual, address is 547f.eef6.faa0
Port mode is trunk
Speed is auto-speed
Duplex mode is auto
300 seconds input rate 0 bits/sec, 0 packets/sec
300 seconds output rate 0 bits/sec, 0 packets/sec
Rx
111 unicast packets 281 multicast packets 2259 broadcast packets
2651 input packets 264597 bytes
0 input packet drops
Tx
82 unicast packets 79044 multicast packets 1841271 broadcast packets
1920397 output packets 180687663 bytes
0 flood packets
0 output packet drops

• All vNICs are programmed as vethernet interfaces in NX-OS
• Port description shows the vNIC name and server that it belongs to

Uplink Pinning
Which uplink is being used?
Understanding how pinning works…
Basic rules to define which interface to pin to:
1. Which uplink interfaces are active?
2. Which uplink interfaces carry ALL of the vNIC’s configured VLANs?
3. Which uplink has the fewest VIFs pinned to it currently?
Fault raised when no uplink satisfies these rules (ENM source pinning failed):
Severity: Major
Code: F0283
Last Transition Time: 2014-02-18T23:08:51.270
ID: 1157440
Status: None
Description: ether VIF 1369 on server 6 / 4 of switch B down, reason: ENM source pinning failed
Affected Object: sys/chassis-6/blade-4/fabric-B/path-1/vc-1369
Name: Dcx Vc Down
Cause: Link Down
Type: Network
Acknowledged: No
Occurrences: 7
Creation Time: 2014-02-11T12:57:11.768
Original Severity: Major
Previous Severity: Cleared
Highest Severity: Major
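
The three pinning rules can be sketched as a small selection function. This is a simplified Python model for reasoning about the rules, not the Fabric Interconnect's actual implementation; the `Uplink` class, the `choose_uplink` name, and the VLAN sets and VIF counts below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Uplink:
    name: str
    active: bool
    vlans: set            # VLANs carried by this uplink
    pinned_vifs: int = 0  # VIFs currently pinned to it


def choose_uplink(uplinks, vnic_vlans):
    """Apply the three rules in order; return None when no uplink qualifies."""
    candidates = [
        u for u in uplinks
        if u.active                    # rule 1: uplink is active
        and set(vnic_vlans) <= u.vlans  # rule 2: carries ALL vNIC VLANs
    ]
    if not candidates:
        return None  # corresponds to a pinning failure
    # Rule 3: prefer the uplink with the fewest VIFs pinned.
    return min(candidates, key=lambda u: u.pinned_vifs)


# Hypothetical uplinks for illustration only.
uplinks = [
    Uplink("Po1", active=True, vlans={1, 210, 212}, pinned_vifs=14),
    Uplink("Eth1/18", active=True, vlans={1, 211}, pinned_vifs=0),
]

print(choose_uplink(uplinks, {211}).name)  # Eth1/18
print(choose_uplink(uplinks, {211, 300}))  # None
```

The second call returns None because no single uplink carries both VLANs, which is the situation the "ENM source pinning failed" fault describes.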

Which uplink is being used?
Uplink pinning commands…
Two common ways to view:

CiscoLive-2019-A(nxos)# show pinning border-interfaces active

--------------------+---------+----------------------------------------
Border Interface Status SIFs
--------------------+---------+----------------------------------------
Po1 Active sup-eth2 Veth4137 Veth4145 Veth4173
Veth4175 Veth4178 Veth4183 Veth4195
Veth4197 Veth4200 Veth4208 Veth4210
Veth4212 Veth4214 Veth4216
Eth1/18 Active

CiscoLive-2019-A(nxos)# show pinning server-interfaces

---------------+-----------------+------------------------+-----------------
SIF Interface Sticky Pinned Border Interface Pinned Duration
---------------+-----------------+------------------------+-----------------
Eth1/1 No - -
Eth1/2 No - -
Eth1/3 No - -
Eth1/4 No - -
Eth1/11 No - -
Eth1/12 No - -
Veth4137 No Po1 1d 58:3:23
Veth4145 No Po1 1d 57:47:47
Veth4173 No Po1 1d 57:54:31

UCS forwarding seems to be working as expected
What’s upstream?
CiscoLive-2019-A(nxos)# show port-channel summary
Flags: D - Down P - Up in port-channel (members)
I - Individual H - Hot-standby (LACP only)
s - Suspended r - Module-removed
S - Switched R - Routed
U - Up (port-channel)
M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port- Type Protocol Member Ports
Channel
--------------------------------------------------------------------------------
1 Po1(SU) Eth LACP Eth1/31(P) Eth1/32(P)

• Use show cdp neighbors to determine the upstream device and ports
• If not available, trace cables

Reviewing the upstream switches
Are we learning the MAC address?

MAC address not learned for our VM on either upstream switch

f241-03-08-5596-a# show mac address-table vlan 211

Legend:
* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link
VLAN MAC Address Type age Secure NTFY Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
* 211 002a.6a35.4a41 static 0 F F sup-eth2
* 211 002a.6a39.2a41 static 0 F F Po3
* 211 547f.ee2f.3381 dynamic 60 F F Po33

f241-03-08-5596-b# show mac address-table vlan 211


Legend:
* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link
VLAN MAC Address Type age Secure NTFY Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
* 211 002a.6a35.4a41 static 0 F F Po3
* 211 002a.6a39.2a41 static 0 F F sup-eth2
* 211 547f.ee2f.3381 dynamic 300 F F Po33

Reviewing the upstream switches
Configuration correct?
Upstream switchport is Ethernet 1/8 on Nexus 5K

f241-03-08-5596-a# show run interface ethernet 1/8

interface Ethernet1/8
  switchport mode trunk
  switchport trunk allowed vlan 1-210,212-4094
  channel-group 11 mode active

f241-03-08-5596-a# show run interface port-channel 11

interface port-channel11
  description jlill-ucs-pod
  switchport mode trunk
  switchport trunk allowed vlan 1-210,212-4094
  spanning-tree port type edge trunk
  speed 10000
  vpc 11

Looking at the port-channel configuration, we can see that VLAN 211 is not allowed
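
A gap in an allowed-VLAN list such as 1-210,212-4094 is easy to overlook by eye. A minimal Python sketch for checking a VLAN against such a list follows; `vlan_allowed` is a hypothetical helper that only parses the string copied from the running config.

```python
def vlan_allowed(allowed, vlan):
    """Return True if vlan falls inside a list like '1-210,212-4094'."""
    for part in allowed.split(","):
        part = part.strip()
        if not part:
            continue
        # "212-4094" is a range; a bare "211" is treated as a single-VLAN range.
        low, _, high = part.partition("-")
        if int(low) <= vlan <= int(high or low):
            return True
    return False


print(vlan_allowed("1-210,212-4094", 211))  # False
print(vlan_allowed("1-210,212-4094", 212))  # True
```

Running the same check against every trunk in the path (vNIC, FI uplink, upstream port-channel) is a quick way to confirm the VLAN is carried end to end.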

What if we don’t learn the MAC address on the FI?
Did we have issues traversing IOM and VIC?
For 1st-3rd Gen FIs…
• Three components left to investigate:
• OS/Driver issues – Did the OS actually send the frame northbound?
• VIC Adapter
• IOM (NIF and HIF ports)

Cisco VIC Adapter
Connecting and identifying logical interfaces…
CiscoLive-2019-A# connect adapter 1/1/1
adapter 1/1/1 # connect

adapter 1/1/1 (top):2# attach-mcp


adapter 1/1/1 (mcp):36# vnic -m
vnic id : internal id of vnic, use for other vnic cmds
vnic name/mac : ucsm provisioned name (-n) or mac address (-m)
vnic type : enet=ethernet, enet_pt=dynamic ethernet, fc=fcoe
vnic host : host
vnic state : state of vnic
lif : internal logical if id, use for other lif/vif cmds
lif state : state of lif
vif uif : bound uplink 0 or 1, =:primary, -:secondary, >:current
vif ucsm : ucsm id for this vif
vif idx : switch id for this vif
vif vlan : default vlan for traffic
vif state : state of vif
-------------------------------------- --------- --------------------------
v n i c l i f v i f
id mac type host state lif state uif ucsm idx vlan state
---- -------------- ------- ---- ----- --- ----- --- ----- ----- ---- -----
14 0025:b5a1:a1a0 enet 0 UP 4 UP =>1 4173 30 1 UP
15 0025:b5b1:b1b1 enet 0 UP 5 UP =>0 4174 29 1 UP
16 0025:b5a1:a1b0 enet 0 UP 6 UP =>1 4175 31 1 UP
17 aa25:b5a1:a1a0 fc 0 UP 7 UP =>1 4179 21 1000 UP
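
The `vnic -m` table ties together three identifiers needed for the next steps: the vNIC MAC, the lif ID (used with other lif commands), and the "vif ucsm" ID, which in this lab matches the Veth number seen on the FI (4173). A Python sketch of that lookup follows, assuming the twelve-column row layout shown above; `find_vnic` is a hypothetical helper.

```python
def _hex_only(mac):
    """Strip separators so 00:25:b5:a1:a1:a0 and 0025:b5a1:a1a0 compare equal."""
    return "".join(c for c in mac.lower() if c in "0123456789abcdef")


def find_vnic(vnic_output, mac):
    """Return vnic id, lif id and Veth name for the given MAC, or None."""
    wanted = _hex_only(mac)
    for line in vnic_output.splitlines():
        f = line.split()
        # Data rows: id mac type host state lif state uif ucsm idx vlan state
        if len(f) == 12 and f[0].isdigit() and _hex_only(f[1]) == wanted:
            return {"vnic_id": int(f[0]), "lif": int(f[5]), "veth": f"Veth{f[8]}"}
    return None


# Rows copied from the vnic -m output above.
sample = """\
id mac type host state lif state uif ucsm idx vlan state
14 0025:b5a1:a1a0 enet 0 UP 4 UP =>1 4173 30 1 UP
15 0025:b5b1:b1b1 enet 0 UP 5 UP =>0 4174 29 1 UP
"""

print(find_vnic(sample, "00:25:b5:a1:a1:a0"))
# {'vnic_id': 14, 'lif': 4, 'veth': 'Veth4173'}
```

The returned lif value is the number to pass to the lif-level counters on the adapter.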

Cisco VIC Adapter
Viewing counters for drops and errors…
adapter 1/1/1 (mcp):28# lifstats -a 4
DELTA TOTAL DESCRIPTION
0 0 Tx unicast frames without error
0 0 Tx multicast frames without error
0 0 Tx broadcast frames without error
0 0 Tx unicast bytes without error
0 0 Tx multicast bytes without error
0 0 Tx broadcast bytes without error
0 0 Tx frames dropped
0 0 Tx frames with error
0 0 Tx TSO frames
0 0 Rx unicast frames without error
216 479103 Rx multicast frames without error
5321 10558692 Rx broadcast frames without error
0 0 Rx unicast bytes without error
19077 44142182 Rx multicast bytes without error
386336 778358713 Rx broadcast bytes without error
0 0 Rx frames dropped
0 0 Rx rq drop pkts (no bufs or rq disabled)
0 0 Rx rq drop bytes (no bufs or rq disabled)
0 0 Rx frames with error
0 0 Rx good frames with RSS
0 0 Rx frames with Ethernet FCS error
24 42055 Rx frames len == 64
5403 10715806 Rx frames 64 < len <= 127
27 81854 Rx frames 128 <= len <= 255
83 198061 Rx frames 256 <= len <= 511
0 19 Rx frames 512 <= len <= 1023
0 0 Rx frames 1024 <= len <= 1518
0 0 Rx frames len > 1518

• Tx would mean we sent frames with errors to the IOM
• Rx would mean OS sent bad frame to the adapter

We don’t see any issues on the adapter…
IOM Troubleshooting
CiscoLive-2019-A# connect iom 1
fex-1# show platform software woodside sts
Board Status Overview: Uplink #: 1 2 3 4 5 6 7 8
legend: Link status: | | | |
' '= no-connect +-+--+--+--+--+--+--+--+-+
X = Failed SFP: [$][$][$][$][ ][ ][ ][ ]
- = Disabled +-+--+--+--+--+--+--+--+-+
: = Dn | N N N N N N N N |
| = Up | I I I I I I I I |
[$] = SFP present | 0 1 2 3 4 5 6 7 |
[ ] = SFP not present | |
[X] = SFP validation failed | NI (0-7) |
------------------------------ +------------+-----------+
|
+-------------------------+-------------+-------------+---------------------------+
| | | |
+------------+-----------+ +-----------+------------+ +------------+-----------+ +-------------+----------+
| HI (0-7) | | HI (8-15) | | HI (16-23) | | HI (24-31) |
| | | | | | | |
| H H H H H H H H | | H H H H H H H H | | H H H H H H H H | | H H H H H H H H |
| I I I I I I I I | | I I I I I I I I | | I I I I I I I I | | I I I I I I I I |
| 0 1 2 3 4 5 6 7 | | 8 9 1 1 1 1 1 1 | | 1 1 1 1 2 2 2 2 | | 2 2 2 2 2 2 3 3 |
| | | 0 1 2 3 4 5 | | 6 7 8 9 0 1 2 3 | | 4 5 6 7 8 9 0 1 |
+-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+
[ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ]
+-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+
- - | | - | - | | | | | - | | |
1 1 1 1 1 1 1 9 8 7 6 5 4 3 2 1
6 5 4 3 2 1 0
\__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/
blade8 blade7 blade6 blade5 blade4 blade3 blade2 blade1

IOM Troubleshooting
fex-1# show platform software {tiburon/woodside} rmon 0 hi31
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+
| TX | Current | Diff | RX | Current | Diff |
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+
| TX_PKT_LT64 | 0| 0| RX_PKT_LT64 | 0| 0|
| TX_PKT_64 | 0| 0| RX_PKT_64 | 386| 15|
| TX_PKT_65 | 379| 15| RX_PKT_65 | 13| 0|
| TX_PKT_128 | 8| 0| RX_PKT_128 | 754| 75|
| TX_PKT_256 | 717| 51| RX_PKT_256 | 0| 0|
| TX_PKT_512 | 12| 0| RX_PKT_512 | 22| 4|
| TX_PKT_1024 | 0| 0| RX_PKT_1024 | 0| 0|
| TX_PKT_1519 | 24| 0| RX_PKT_1519 | 0| 0|
| TX_PKT_2048 | 0| 0| RX_PKT_2048 | 0| 0|
| TX_PKT_4096 | 0| 0| RX_PKT_4096 | 0| 0|
| TX_PKT_8192 | 0| 0| RX_PKT_8192 | 0| 0|
| TX_PKT_GT9216 | 0| 0| RX_PKT_GT9216 | 0| 0|
| TX_PKTTOTAL | 1140| 66| RX_PKTTOTAL | 1175| 94|
| TX_OCTETS | 341435| 20207| RX_OCTETS | 163687| 15984|
| TX_PKTOK | 1140| 66| RX_PKTOK | 1175| 94|
| TX_UCAST | 384| 15| RX_UCAST | 588| 55|
| TX_MCAST | 756| 51| RX_MCAST | 543| 38|
| TX_BCAST | 0| 0| RX_BCAST | 44| 1|
| TX_VLAN | 0| 0| RX_VLAN | 0| 0|
| TX_PAUSE | 0| 0| RX_PAUSE | 0| 0|
| TX_USER_PAUSE | 0| 0| RX_USER_PAUSE | 0| 0|
| TX_FRM_ERROR | 0| 0| | | |
| | | | RX_OVERSIZE | 0| 0|
| | | | RX_TOOLONG | 0| 0|
| | | | RX_DISCARD | 0| 0|
| | | | RX_UNDERSIZE | 0| 0|
| | | | RX_FRAGMENT | 0| 0|
| | | | RX_CRC_NOT_STOMPED | 0| 0|
| | | | RX_CRC_STOMPED | 0| 0|
| TX_OCTETSOK | 341435| 20207| RX_OCTETSOK | 163687| 15984|
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+

IOM Troubleshooting
Do we have errors on the NIF or HIF ports…
fex-1# show platform software woodside loss
+-------+-------------------------------------+------------+-+-----------------------------------+---------------------------------------+
| | | | | | |
| | | | | | frm_to |
| | |Port Extra | | +---------------------------------------|
| | RMON | Drop |S| SS Loss Counters | COS | XOFF |
| +------------+-----------+------------+------------|S|-----------+-----------+-----------+---------------------------------------|
| Port | Tx Pause | Rx Pause | Errors | Counters |x| RX SS | Tx SS | SS Total |0 |1 |2 |3 |4 |5 |6 |7 |0 |1 |
+-------+------------+-----------+------------+------------+-+-----------+-----------+-----------+---+---+---+---+---+---+---+---+---+---+
| 0- NI3| 0| 59896| 0| 7|0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |1| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |2| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |3| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |4| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |5| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |6| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |7| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
+-------+-------------------------------------+------------+-+-----------+-----------+-----------+---+---+---+---+---+---+---+---+---+---+
| 0-HI27| 770| 0| 0| 0|0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |1| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |2| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |3| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |4| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |5| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |6| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |7| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
+-------+-------------------------------------+------------+-+-----------+-----------+-----------+---+---+---+---+---+---+---+---+---+---+

Troubleshooting Forwarding Issues
4th Gen FI - ELAM
ELAM
(Embedded Logic Analyzer Module)
• New to UCS – added on 4th Gen Fabric Interconnect

• Engineering tool used to understand how a packet is being forwarded


• ELAM is “embedded” within the forwarding pipeline
• Captures packet in real time without performance impact
• Questions answered:
• Did the packet reach the Fabric Interconnect?
• On what port and VLAN was the packet received?
• What did the packet look like?
• What forwarding decision was made (outbound interface or drop)?

ELAM
Example 1 – UCS Server sends ping to SVI on upstream switch
[Diagram: Blade (IP 14.17.211.31) behind UCS 6454, uplinked to Nexus 5548 with the SVI for VLAN 211 (14.17.211.250)]

UCS ELAM – Walkthrough on CLI
Connect to the switching software (NX-OS) for the Fabric Interconnect and then attach
to the hardware module:
F241-03-09-UCSFabric-6454-1-A# connect nxos a
F241-03-09-UCSFabric-6454-1-A(nx-os)# attach module 1

Enter the following debug shell for the ASIC:


module-1# debug platform internal tah elam asic 0

module-1(TAH-elam)# trigger init asic 0 slice 0 lu-a2d 1 in-select 6


param values: start asic 0, start slice 0, lu-a2d 1, in-select 6, out-select 0

Note – in the 6400 Fabric Interconnect, the ASIC and SLICE will always be 0.

Set capture filters:


module-1(TAH-elam-insel6)# set outer ?
arp ARP Fields
fcoe FCoE Fields
ipv4 IPv4 Fields
ipv6 IPv6 Fields
l2 All Layer 2 Fields
l4 L4 Fields

ELAM – Configuring Filter Criteria
For this example, we only capture traffic destined for IPv4 14.17.211.250
module-1(TAH-elam-insel6)# set outer ipv4 dst_ip 14.17.211.250

To start the capture:


module-1(TAH-elam-insel6)# start
GBL_C++: [MSG] rocky_elam_wrapper_init:36:asic type 8 inst 0 slice 0 a_to_d 1 insel 6 outsel 0
GBL_C++: [MSG] rocky_elam_wrapper_enable:95:asic type 8 inst 0 slice 0 a_to_d 1
GBL_C++: [MSG] – writing
data=0x000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000708E
9FD0000000000000000000000100000000000000 0000000000000000000000000000000000000000000001
GBL_C++: [MSG] - writing
mask=0x000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007FFFF
FFF8000000000000000000000380000000000000 0000000000000000000000000000000000000000000001
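
The data and mask words written by `start` can be sanity-checked against the filter. In this capture, the destination IP appears in the data word shifted left by three bits, and the mask covers the same 32 bits. The 3-bit offset is an inference from this one output, not documented behavior, so treat the Python sketch below as a worked check only.

```python
import ipaddress

# Filter value from "set outer ipv4 dst_ip 14.17.211.250".
dst_ip = int(ipaddress.IPv4Address("14.17.211.250"))

# Assumption drawn from this capture only: the field sits 3 bits up
# from where the hex digits would otherwise align.
OBSERVED_SHIFT = 3

print(hex(dst_ip))                        # 0xe11d3fa
print(hex(dst_ip << OBSERVED_SHIFT))      # 0x708e9fd0
print(hex(0xFFFFFFFF << OBSERVED_SHIFT))  # 0x7fffffff8
```

The second and third values match the 708E9FD0 and 7FFFFFFF8 runs in the data= and mask= lines above, which confirms the trigger was programmed with the intended address.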

ELAM Capture Report – No Drops
module-1(TAH-elam-insel6)# report
HOMEWOOD ELAM REPORT SUMMARY
slot - 1, asic - 0, slice - 0
============================
Incoming Interface: Eth1/11
Src Idx : 0x1000, Src BD : 211
Outgoing Interface Info: dmod 1, dpid 3
Dst Idx : 0x602, Dst BD : 211

Packet Type: IPv4

Dst MAC address: 00:2A:6A:35:4A:41
Src MAC address: 00:0C:29:D7:3C:89
.1q Tag0 VLAN: 211, cos = 0x0

Dst IPv4 address: 14.17.211.250
Src IPv4 address: 14.17.211.31
Ver = 4, DSCP = 0, Don't Fragment = 0
Proto = 1, TTL = 128, More Fragments = 0
Hdr len = 20, Pkt len = 60, Checksum = 0x1ac8

L4 Protocol : 1
ICMP type : 8
ICMP code : 0

Drop Info:
----------
LUA:
LUB:
LUC:
LUD:
Final Drops:

• Incoming Interface = ingress port on the FI
• Src BD = BD stands for Bridge Domain (usually a 1-to-1 mapping to a VLAN)
• Outgoing Interface = egress port on the FI
• DPID = Destination Port ID
• Dst MAC / Src MAC = source and destination MAC
• .1q Tag0 = CoS value and VLAN tag info
• Dst IPv4 / Src IPv4 = IP address information
• Drop Info = Drop Information section; will display the reason for drops if applicable

ELAM – DPID
F241-03-09-UCSFabric-6454-1-A(nx-os)# show interface hardware-mappings

Legends:
SMod - Source Mod. 0 is N/A
Unit - Unit on which port resides. N/A for port channels
HPort - Hardware Port Number or Hardware Trunk Id:
HName - Hardware port name. None means N/A
FPort - Fabric facing port number. 255 means N/A
NPort - Front panel port number
VPort - Virtual Port Number. -1 means N/A
Slice - Slice Number. N/A for BCM systems
SPort - Port Number wrt Slice. N/A for BCM systems
SrcId - Source Id Number. N/A for BCM systems
------------------------------------------------------------------------
Name Ifindex Smod Unit HPort FPort NPort VPort Slice SPort SrcId
------------------------------------------------------------------------
Interface long name is Ethernet1/1
Eth1/1 1a000000 1 0 16 255 0 -1 0 16 32
Interface long name is Ethernet1/2
Eth1/2 1a000200 1 0 17 255 4 -1 0 17 34
Interface long name is Ethernet1/17
Eth1/17 1a002000 1 0 0 255 64 -1 0 0 0
Interface long name is Ethernet1/20
Eth1/20 1a002600 1 0 3 255 76 -1 0 3 6

SPort 3 = dpid 3, so the outgoing interface in the ELAM report is Eth1/20
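
The dpid from the ELAM report can be translated to a front-panel interface by matching it against the SPort column. A minimal Python sketch follows, assuming the eleven-column row layout shown above and a single slice (the deck notes slice is always 0 on the 6400); `interface_for_dpid` is a hypothetical helper.

```python
def interface_for_dpid(mapping_text, dpid):
    """Return the interface whose SPort equals dpid, or None."""
    for line in mapping_text.splitlines():
        f = line.split()
        # Data rows: Name Ifindex Smod Unit HPort FPort NPort VPort Slice SPort SrcId
        if len(f) == 11 and f[0].startswith("Eth") and int(f[9]) == dpid:
            return f[0]
    return None


# Rows copied from the hardware-mappings output above.
sample = """\
Interface long name is Ethernet1/1
Eth1/1 1a000000 1 0 16 255 0 -1 0 16 32
Interface long name is Ethernet1/17
Eth1/17 1a002000 1 0 0 255 64 -1 0 0 0
Interface long name is Ethernet1/20
Eth1/20 1a002600 1 0 3 255 76 -1 0 3 6
"""

print(interface_for_dpid(sample, 3))  # Eth1/20
```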

ELAM – Drop Scenario

ELAM Walkthrough
Example 2 – Understanding why packets are not forwarded
[Diagram: Blade (IP 14.17.211.31) behind UCS 6454, uplinked to Nexus 5548 with the SVI for VLAN 211 (14.17.211.250)]

ELAM – Configuring Filter Criteria
Same parameters as previous ELAM capture
F241-03-09-UCSFabric-6454-1-A# connect nxos a
F241-03-09-UCSFabric-6454-1-A(nx-os)# attach module 1

module-1# debug platform internal tah elam asic 0


module-1(TAH-elam)# trigger init asic 0 slice 0 lu-a2d 1 in-select 6

module-1(TAH-elam-insel6)# set outer ipv4 dst_ip 14.17.211.250


module-1(TAH-elam-insel6)# start

ELAM Capture Report – Drop Info
module-1(TAH-elam-insel6)# report
HOMEWOOD ELAM REPORT SUMMARY
slot - 1, asic - 0, slice - 0
============================
Incoming Interface: Eth1/11
Src Idx : 0x1000, Src BD : 211
Outgoing Interface Info: dmod 1, dpid 3
Dst Idx : 0x602, Dst BD : 211

Packet Type: IPv4

Dst MAC address: 00:2A:6A:35:4A:41
Src MAC address: 00:0C:29:D7:3C:89
.1q Tag0 VLAN: 211, cos = 0x0

Dst IPv4 address: 14.17.211.250
Src IPv4 address: 14.17.211.31
Ver = 4, DSCP = 0, Don't Fragment = 0
Proto = 1, TTL = 128, More Fragments = 0
Hdr len = 20, Pkt len = 60, Checksum = 0x1a93

L4 Protocol : 1
ICMP type : 8
ICMP code : 0

Drop Info:
----------
LUA:
LUB:
LUC:
LUD:
SRC_VLAN_MBR
Final Drops:
SRC_VLAN_MBR

NOTE: When it comes to confirming whether the packet is actually being dropped, the "Final Drops" field is the ONLY one to consider.

Drop Information section – this time shows SRC_VLAN_MBR as the reason. This indicates an issue with VLAN membership on the source packet.
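
Because only the "Final Drops" field decides whether the packet was dropped, it helps to pull that field out on its own when comparing several captures. A Python sketch follows, assuming the report text has been saved and that drop reasons appear either on the "Final Drops:" line or on the lines directly after it, as in the output above; `final_drops` is a hypothetical helper.

```python
def final_drops(report_text):
    """Return the list of reasons listed under 'Final Drops:' (empty if none)."""
    reasons = []
    in_final = False
    for raw in report_text.splitlines():
        line = raw.strip()
        if line.startswith("Final Drops:"):
            in_final = True
            # Accept reasons printed on the same line as the label.
            reasons.extend(line[len("Final Drops:"):].split())
        elif in_final:
            if not line:
                break  # a blank line ends the section
            reasons.extend(line.split())
    return reasons


drop_report = """\
Drop Info:
----------
LUA:
LUB:
LUC:
LUD:
SRC_VLAN_MBR
Final Drops:
SRC_VLAN_MBR
"""

clean_report = """\
Drop Info:
----------
LUA:
LUB:
LUC:
LUD:
Final Drops:
"""

print(final_drops(drop_report))   # ['SRC_VLAN_MBR']
print(final_drops(clean_report))  # []
```

Entries under LUA through LUD are deliberately ignored, in line with the note that "Final Drops" is the only field to consider.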
UCS ELAM – Remove VLAN

Networking Troubleshooting Summary
Quick recap…
• Keep it simple – UCS Networking is all Layer 2

• Are we learning the MAC on FI?


• 4th Gen FI – Use ELAM to understand forwarding decisions

The Story with Storage
FC and FCoE with UCS
FC Switching Modes
Different from Ethernet modes!
• End Host Mode
• Default Mode for FIs (NPV)
• Requires NPIV enabled device upstream
• Switch Mode
• Most common use - Direct Attached Storage

How do we talk to each other?
[Image: UCS 5108 chassis, slots 1-8]


Fibre Channel Port Types
• ‘N’ port: Node ports used to connect devices to a switched fabric or in point-to-point configurations (N to N)
• ‘F’ port: Fabric ports residing on switches, connecting ‘N’ port devices (N to F)
• ‘E’ port: Expansion ports are essentially trunk ports used to connect two Fibre Channel switches (E to E)

What Is NPIV?
• N-Port ID Virtualization (NPIV) provides a means to assign multiple FCIDs to a single N_Port
• Without NPIV, FC hands out only a single FCID per F-Port, so an F-Port can accept only a single FLOGI
• Allows multiple applications to share the same Fibre Channel adapter port
• Main use case is Virtualization

[Diagram: an application server running Email, Web, and File Services I/O, each with its own N_Port_ID (1, 2, 3), sharing a single N_Port connected to an F_Port on the FC NPIV core switch]

What Is NPV?
• N-Port Virtualizer (NPV) utilizes NPIV functionality to allow a “switch” to act like a server performing multiple logins through a single physical link
• Physical servers connected to the NPV switch log in to the upstream NPIV core switch
• No local switching is done on an FC switch in NPV mode
• FC edge switch in NPV mode does not take up a domain ID
• Helps to alleviate domain ID exhaustion in large fabrics

[Diagram: Server1, Server2, and Server3 (N-Ports) attach to F-Ports Eth1/1 through Eth1/3 on the NPV switch; the NPV switch's NP-Port connects to an F-Port on the FC NPIV core switch, with N_Port_ID 1, 2, and 3 assigned]

SAN “End Host” NPV Mode
• Fabric Interconnects in NPV (N Port Virtualization) Mode
• Fabric Interconnect operates in N_Port Proxy (NP) mode
• SAN switch sees Fabric Interconnect as an FC End Host with many N_Ports and many FC IDs assigned
• Server facing ports function as F-proxy ports
• Server vHBA pinned to an FC uplink in the same VSAN. Round Robin selection.
• Provides multiple FC end nodes to one F_Port off an FC Switch

[Diagram: Server 1 and Server 2 (VSAN 1), each with vHBA0 and vHBA1 N_Ports, connect to F_Proxy vFC ports on 6100-A and 6100-B; each FI's N_Proxy (NP) uplink in VSAN 1 sends FLOGI/FDISC to an NPIV-enabled F_Port on SAN A or SAN B]

SAN FC Switch Mode
• UCS Fabric Interconnect behaves like a FC fabric switch
• Direct Attach FC & FCoE Storage to UCS
• Storage ports can be FC or FCoE
• Light subset of FC Switching features
• Set VSAN on Storage ports

[Diagram: FC and FCoE storage N_Ports (VSAN 1 and VSAN 2) attach to F_Ports on 6100-A and 6100-B acting as FC switches; Server 1 (VSAN 1) and Server 2 (VSAN 2), each with vHBA0 and vHBA1 N_Ports, connect to F_Port vFC interfaces on the FIs]

FC Boot Configuration
FC Boot - Topology
Traditional deployment of UCS in End Host Mode

[Diagram: FC Storage Array, MDS or N5K, and UCS (UCS-FI-6332 Fabric Interconnect; UCS 5108 chassis with UCS B200 M4 blades)]

FC Boot - UCS Configuration
Boot from SAN requirements

• vHBA
• Boot Policy

Important Settings
• vHBA Name
• WWPN of Target
• Boot LUN ID

FC Boot - UCS Verification
CiscoLive-2019-A# show service-profile circuit server 1/7
Service Profile: asamplin/liveTest
Server: 1/7
Fabric ID: A
Path ID: 1
VIF vNIC Link State Oper State Prot State Prot Role Admin Pin Oper Pin Transport
---------- --------------- ----------- ---------- ------------- ----------- ---------- ---------- ---------
12430 Up Active No Protection Unprotected 0/0/0 0/0/0 Ether
4228 eth0 Offline Unknown No Protection Unprotected 0/0/0 0/0/0 Ether
4238 vhba1 Up Active No Protection Unprotected 0/0/0 2/0/16 Fc

Ensure vHBA is FLOGI’d into FI:

CiscoLive-2019-A(nxos)# show npv flogi-table
--------------------------------------------------------------------------------
SERVER                                                                  EXTERNAL
INTERFACE VSAN FCID     PORT NAME               NODE NAME               INTERFACE
--------------------------------------------------------------------------------
vfc4238   1000 0x6e0051 20:00:00:25:d5:00:00:2f 20:00:00:25:d5:00:00:0f fc2/16

FC Boot - Adapter Programming
Option ROM Programmed Correctly!

WWPN from storage array seen before server boots

FC Boot - We can see the LUN!
Verify that OS installer can see the boot LUN

Troubleshooting FC Boot
FC Boot Troubleshooting – Lunlist Command
Commands entered are bolded
CiscoLive-2019-A# connect adapter 1/7/1
adapter 1/7/1 # connect
No entry for terminal type "dumb";
using dumb terminal settings.
adapter 1/7/1 (top):1# attach-fls
No entry for terminal type "dumb";
using dumb terminal settings.
adapter 1/7/1 (fls):1# lunlist
vnic : 15 lifid: 5
- FLOGI State : flogi est (fc_id 0x6e0051)
- PLOGI Sessions
- WWNN 50:0a:09:83:87:49:80:24 WWPN 50:0a:09:83:87:49:80:24 fc_id 0x450000
- LUN's configured (SCSI Type, Version, Vendor, Serial No.)
LUN ID : 0x0000000000000000 (0x0, 0x4, NETAPP , Hn/ZR40PU7K9)
- REPORT LUNs Query Response
LUN ID : 0x0000000000000000
- Nameserver Query Response
- WWPN : 50:0a:09:83:87:49:80:24
- WWPN : 50:0a:09:81:87:49:80:24

Can only be performed while the server is in the BIOS/Boot Menu

Lunlist – Step 1: Is the vHBA FLOGI present?
Let’s break it down!

FLOGI Must be Established

CiscoLive-2019-A# connect adapter 1/7/1
adapter 1/7/1 # connect
No entry for terminal type "dumb";
using dumb terminal settings.
adapter 1/7/1 (top):1# attach-fls
No entry for terminal type "dumb";
using dumb terminal settings.
adapter 1/7/1 (fls):1# lunlist
vnic : 15 lifid: 5
- FLOGI State : flogi est (fc_id 0x6e0051)

CiscoLive-2019-A(nxos)# show npv flogi-table


--------------------------------------------------------------------------------
SERVER EXTERNAL
INTERFACE VSAN FCID PORT NAME NODE NAME INTERFACE
--------------------------------------------------------------------------------
vfc4238 1000 0x6e0051 20:00:00:25:d5:00:00:2f 20:00:00:25:d5:00:00:0f fc2/16

Lunlist – Step 2: Is zoning correctly configured?

UCS Boot Target


adapter 1/7/1 (fls):1# lunlist
vnic : 15 lifid: 5
- FLOGI State : flogi est (fc_id 0x6e0051)
- PLOGI Sessions
- WWNN 50:0a:09:83:87:49:80:24 WWPN 50:0a:09:83:87:49:80:24 fc_id 0x450000
- LUN's configured (SCSI Type, Version, Vendor, Serial No.)
LUN ID : 0x0000000000000000 (0x0, 0x4, NETAPP , Hn/ZR40PU7K9)
- REPORT LUNs Query Response
LUN ID : 0x0000000000000000
- Nameserver Query Response
- WWPN : 50:0a:09:83:87:49:80:24
- WWPN : 50:0a:09:81:87:49:80:24
Nameserver Query Response: WWPNs returned from upstream switch

If the boot target does not match a WWPN in the returned zone, the server will not boot

Lunlist – Verify Zoning on Upstream Switch
Similar to ACLs for Ethernet, default behavior is Deny
The zone configured on the upstream switch must have the following WWPNs:
• WWPN of the server vHBA (Initiator)
• WWPN of the storage port (Target)

f241-03-08-5596-a# show zoneset name netapp1-1000 vsan 1000
zoneset name netapp1-1000 vsan 1000
  zone name asamplin-live vsan 1000
  * fcid 0x6e0051 [pwwn 20:00:00:25:d5:00:00:2f]
  * fcid 0x450000 [pwwn 50:0a:09:83:87:49:80:24] [netapp1-1-0a]
  * fcid 0x450040 [pwwn 50:0a:09:81:87:49:80:24] [netapp1-1-0c]

Lunlist Output:
vnic : 15 lifid: 5
- FLOGI State : flogi est (fc_id 0x6e0051)
- PLOGI Sessions
- WWNN 50:0a:09:83:87:49:80:24 WWPN 50:0a:09:83:87:49:80:24

Lunlist – Step 3: What is my boot LUN ID?
adapter 1/7/1 (fls):1# lunlist
vnic : 15 lifid: 5
- FLOGI State : flogi est (fc_id 0x6e0051)
- PLOGI Sessions
- WWNN 50:0a:09:83:87:49:80:24 WWPN 50:0a:09:83:87:49:80:24 fc_id 0x450000
- LUN's configured (SCSI Type, Version, Vendor, Serial No.)
LUN ID : 0x0000000000000000 (0x0, 0x4, NETAPP , Hn/ZR40PU7K9)
- REPORT LUNs Query Response
LUN ID : 0x0000000000000000
- Nameserver Query Response
- WWPN : 50:0a:09:83:87:49:80:24
- WWPN : 50:0a:09:81:87:49:80:24

• LUN's configured: Setting from UCS Boot Policy
• REPORT LUNs Query Response: Returned from Storage Array

All of these settings are defined in boot policy!

FC Boot Troubleshooting – Lunlist Command
Nonworking output
CiscoLive-2019-A# connect adapter 1/7/1
adapter 1/7/1 # connect
No entry for terminal type "dumb";
using dumb terminal settings.
adapter 1/7/1 (top):1# attach-fls
No entry for terminal type "dumb";
using dumb terminal settings.
adapter 1/7/1 (fls):1# lunlist
vnic : 15 lifid: 5
- FLOGI State : flogi est (fc_id 0x6e0051)
- PLOGI Sessions
- WWNN 50:0a:09:83:87:49:80:24 WWPN 50:0a:09:83:87:49:80:24 fc_id 0x000000
- LUN's configured (SCSI Type, Version, Vendor, Serial No.)
LUN ID : 0x0000000000000000 access failure
- REPORT LUNs Query Response
- Nameserver Query Response

Zoning is not correct on upstream switch
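
The three lunlist checks (FLOGI, PLOGI/zoning, LUN) can be scripted against captured output. The following is a minimal sketch, assuming the output has been saved as text; the function name `diagnose_lunlist` and the wording of the findings are illustrative and not part of any Cisco tool. The two sample strings are copied from the working and nonworking outputs shown in these slides.

```python
import re

WORKING = """\
vnic : 15 lifid: 5
- FLOGI State : flogi est (fc_id 0x6e0051)
- PLOGI Sessions
- WWNN 50:0a:09:83:87:49:80:24 WWPN 50:0a:09:83:87:49:80:24 fc_id 0x450000
- LUN's configured (SCSI Type, Version, Vendor, Serial No.)
LUN ID : 0x0000000000000000 (0x0, 0x4, NETAPP , Hn/ZR40PU7K9)
- REPORT LUNs Query Response
LUN ID : 0x0000000000000000
- Nameserver Query Response
- WWPN : 50:0a:09:83:87:49:80:24
- WWPN : 50:0a:09:81:87:49:80:24
"""

NONWORKING = """\
vnic : 15 lifid: 5
- FLOGI State : flogi est (fc_id 0x6e0051)
- PLOGI Sessions
- WWNN 50:0a:09:83:87:49:80:24 WWPN 50:0a:09:83:87:49:80:24 fc_id 0x000000
- LUN's configured (SCSI Type, Version, Vendor, Serial No.)
LUN ID : 0x0000000000000000 access failure
- REPORT LUNs Query Response
- Nameserver Query Response
"""


def diagnose_lunlist(output: str) -> list[str]:
    """Return human-readable findings for one vnic block of lunlist output."""
    findings = []

    # Step 1: the vHBA must have completed FLOGI.
    if "flogi est" not in output:
        findings.append("FLOGI not established")

    # Step 2: PLOGI session lines end with the target fc_id; all zeros means
    # the adapter never logged in to the boot target.
    plogi_fcids = re.findall(r"WWPN\s+[0-9a-f:]+\s+fc_id\s+(0x[0-9a-f]+)", output)
    if not plogi_fcids:
        findings.append("No PLOGI session listed")
    elif all(int(fcid, 16) == 0 for fcid in plogi_fcids):
        findings.append("PLOGI fc_id is 0x000000: check zoning on upstream switch")

    # Step 3: the boot LUN must be accessible.
    if "access failure" in output:
        findings.append("LUN access failure")

    # WWPNs returned by the name server appear as '- WWPN : xx:xx:...'.
    if not re.findall(r"-\s*WWPN\s*:\s*([0-9a-f:]+)", output):
        findings.append("Nameserver query returned no WWPNs")

    return findings


if __name__ == "__main__":
    print(diagnose_lunlist(WORKING))
    print(diagnose_lunlist(NONWORKING))
```

An empty list means all checks passed; the nonworking sample trips the PLOGI, LUN and nameserver checks together, which matches the zoning diagnosis on the slide.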

FC User Story
I added links but no extra bandwidth!
TAC Case Example
• Hosts reporting high storage latency
• Customer added 2 additional FC uplinks (tripling bandwidth)
• Customer reported to TAC that nothing changed!

[Diagram: Fabric Interconnect connected to FC Switch; newly added uplinks in black]

FC Uplink Behavior
Individual FC Uplinks
Servers pin to individual uplinks
No load balancing

[Diagram: Fabric Interconnect and FC Switch joined by individual uplinks; server logins fcid1 to fcid6 are sent as FDISC. Yellow link indicates original uplink.]

• Customer would need to reboot their servers to send FDISC and utilize new uplinks
• Any new servers would be able to utilize the additional bandwidth
FC Port Channels
The power of the bundle

[Diagram: Fabric Interconnect connected to FC Switch by a bundled port channel]

• Frames are sent round robin per link

Note on FC Port Channels
Multiple VSANs Optional

• FC Port Channels ensure that load balancing happens across FC links
• SAN Uplinks normally carry 1 VSAN at a time
• Port Channel allows dynamic modification of members
• Multiple VSANs need F-port-channel-trunking enabled
• Requires Cisco FC device upstream

FC Port Channel Verification
Which VSANs are active on my port channel?
201 and 1001 are up because they are the only VSANs with vHBAs that have an active FLOGI into the FI.

CiscoLive-2019-B(nxos)# show int san 44
san-port-channel 44 is trunking
Hardware is Fibre Channel
Port WWN is 24:2c:54:7f:ee:c5:6c:c0
Admin port mode is NP, trunk mode is on
snmp link state traps are enabled
Port mode is TNP
Port vsan is 1001
Speed is 16 Gbps
Trunk vsans (admin allowed and active) (1,10,200-203,888,1000-1001)
Trunk vsans (up) (201,1001)
Trunk vsans (isolated) (10,200,202,888,1000)
Trunk vsans (initializing) (1,203)
1 minute input rate 13560 bits/sec, 1695 bytes/sec, 4 frames/sec
1 minute output rate 7480 bits/sec, 935 bytes/sec, 4 frames/sec
1940486 frames input, 2478434904 bytes
83 discards, 0 errors
0 CRC, 0 unknown class
0 too long, 0 too short
736055 frames output, 89418044 bytes
0 discards, 0 errors
4 input OLS, 4 LRR, 3 NOS, 0 loop inits
10 output OLS, 2 LRR, 0 NOS, 0 loop inits
last clearing of "show interface" counters never
Member[1] : fc2/15
Member[2] : fc2/16
Interface last changed at Sun May 14 20:03:28 2017
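
To see at a glance which allowed VSANs are not up, the "Trunk vsans" lines can be expanded into sets. This is a minimal sketch that assumes the line format shown in this output; `expand_vsans` and `trunk_vsan_states` are hypothetical helper names.

```python
import re

SAMPLE = """\
Trunk vsans (admin allowed and active) (1,10,200-203,888,1000-1001)
Trunk vsans (up) (201,1001)
Trunk vsans (isolated) (10,200,202,888,1000)
Trunk vsans (initializing) (1,203)
"""


def expand_vsans(spec: str) -> set[int]:
    """Expand a list such as '1,10,200-203' into a set of VSAN IDs."""
    vsans: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            vsans.update(range(int(low), int(high) + 1))
        else:
            vsans.add(int(part))
    return vsans


def trunk_vsan_states(output: str) -> dict[str, set[int]]:
    """Map each trunk state label (for example 'up') to its set of VSAN IDs."""
    pattern = r"Trunk vsans \(([^)]+)\)\s+\(([^)]*)\)"
    return {state: expand_vsans(spec) for state, spec in re.findall(pattern, output)}


if __name__ == "__main__":
    states = trunk_vsan_states(SAMPLE)
    allowed = states["admin allowed and active"]
    print("allowed but not up:", sorted(allowed - states["up"]))
```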
Troubleshooting FC in UCS
How to Check for Possible FC Issues within UCS
Since UCS is normally in End Host mode, congestion is hard to find
• Storage traffic within UCS is FCoE

• Track down the affected blades within UCS via IOM

• Isolate which blades have the issue

Is there a hardware issue?
Check the IOM ports from NX-OS…

• Ethernet ports x/y/z correlate to HIFs on the IOM
• x = chassis
• y = module on IOM
• z = port

CiscoLive-2019-A(nxos)# show interface counters errors

--------------------------------------------------------------------------------
Port          Align-Err    FCS-Err   Xmit-Err    Rcv-Err  UnderSize OutDiscards
--------------------------------------------------------------------------------
Eth1/1/1              0       1170          0       1170          0           0
Eth1/1/2              0          0          0          0          0           0
Eth1/1/3              0          0          0          0          0           0
Eth1/1/4              0          0          0          0          0           0
Eth1/1/5              0          0          0          0          0           0
Eth1/1/6              0          0          0          0          0           0
Eth1/1/7              0          0          0          0          0           0
Eth1/1/8              0          0          0          0          0           0
Eth1/1/9              0          0          0          0          0           0
Eth1/1/10             0          0          0          0          0           0
Eth1/1/11             0          0          0          0          0           0
Eth1/1/12             0          0          0          0          0           0
Eth1/1/13             0          0          0          0          0           0
Eth1/1/14             0          0          0          0          0           0
Eth1/1/15             0          0          0          0          0           0
Eth1/1/16             0          0          0          0          0           0
Eth1/1/17             0          0          0          0          0           0
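
With many ports, it helps to filter the counters down to rows that are not all zero. A minimal sketch, assuming the six-column layout of the `show interface counters errors` output above has been captured as text; the helper names are illustrative only.

```python
COLUMNS = ["Align-Err", "FCS-Err", "Xmit-Err", "Rcv-Err", "UnderSize", "OutDiscards"]

SAMPLE = """\
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
Eth1/1/1 0 1170 0 1170 0 0
Eth1/1/2 0 0 0 0 0 0
"""


def ports_with_errors(output: str) -> dict[str, dict[str, int]]:
    """Return {port: {counter: value}} for every port with a nonzero counter."""
    bad: dict[str, dict[str, int]] = {}
    for line in output.splitlines():
        fields = line.split()
        # A data row is a port name followed by six integer counters.
        if len(fields) != 7 or not all(f.isdigit() for f in fields[1:]):
            continue
        nonzero = {name: int(v) for name, v in zip(COLUMNS, fields[1:]) if int(v)}
        if nonzero:
            bad[fields[0]] = nonzero
    return bad


def split_hif_name(port: str) -> tuple[int, int, int]:
    """Split a three-part name such as 'Eth1/1/4' into (chassis, module, port)."""
    chassis, module, hif_port = port[len("Eth"):].split("/")
    return int(chassis), int(module), int(hif_port)


if __name__ == "__main__":
    for port, counters in ports_with_errors(SAMPLE).items():
        print(port, split_hif_name(port), counters)
```

`split_hif_name` only applies to x/y/z names; two-part uplink names such as Eth1/6 and port-channels are not HIFs and would raise a `ValueError`.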

IOM Troubleshooting
Example where errors on IOM are indicating issues downstream…

CiscoLive-2019-A(nxos)# show interface counters errors

--------------------------------------------------------------------------------
Port          Align-Err    FCS-Err   Xmit-Err    Rcv-Err  UnderSize OutDiscards
--------------------------------------------------------------------------------
Eth1/6                0        103          0        103          0           0
Eth1/21               0        103          0        103          0           0
Po1027                0        206          0        206          0           0
Po1351                0        207          0        207          0           0
Eth3/1/1              0          0          0          0          0           0
Eth3/1/2              0          0          0          0          0           0
Eth3/1/3              0          0          0          0          0           0
Eth3/1/4              0        207          0        207          0           0

• Eth1/6, Eth1/21: Uplink interfaces rcvd bad frames
• Po1027: Uplink port-channel counters
• Po1351: Adapter-IOM port-channel
• Eth3/1/1 to Eth3/1/4: HIF ports on IOM

Which server is seeing an issue?
Determine interface association

CiscoLive-2019(nxos)# show system internal fex info satport ethernet 3/1/4


Interface-Name ifindex State Fabric-if Pri-fabric Expl-Pinned
Eth3/1/4 0x1f000000 Up Po1027 Po1027 NoConf
Port Phy Up. Port dn req: Not pending
SDB entry: ifindex(1f000000) fabric if(16000400)
Dev: 0 Nif0 Hif26 (Nif:0x16000400 Hif:0x1f000000)

The Host Interface (HIF) is now known. You can use this info to determine the affected blade.

Note: NIF port will always be nif0 if a port-channel is configured on the IOM

IOM Troubleshooting
Reviewing HIF ports to isolate blades impacted
fex-1# show platform software woodside sts
Board Status Overview: Uplink #: 1 2 3 4 5 6 7 8
legend: Link status: | | | |
' '= no-connect +-+--+--+--+--+--+--+--+-+
X = Failed SFP: [$][$][$][$][ ][ ][ ][ ]
- = Disabled +-+--+--+--+--+--+--+--+-+
: = Dn | N N N N N N N N |
| = Up | I I I I I I I I |
[$] = SFP present | 0 1 2 3 4 5 6 7 |
[ ] = SFP not present | |
[X] = SFP validation failed | NI (0-7) |
------------------------------ +------------+-----------+
|
+-------------------------+-------------+-------------+---------------------------+
| | | |
+------------+-----------+ +-----------+------------+ +------------+-----------+ +-------------+----------+
| HI (0-7) | | HI (8-15) | | HI (16-23) | | HI (24-31) |
| | | | | | | |
| H H H H H H H H | | H H H H H H H H | | H H H H H H H H | | H H H H H H H H |
| I I I I I I I I | | I I I I I I I I | | I I I I I I I I | | I I I I I I I I |
| 0 1 2 3 4 5 6 7 | | 8 9 1 1 1 1 1 1 | | 1 1 1 1 2 2 2 2 | | 2 2 2 2 2 2 3 3 |
| | | 0 1 2 3 4 5 | | 6 7 8 9 0 1 2 3 | | 4 5 6 7 8 9 0 1 |
+-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+
[ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ]
+-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+
- - | | - | - | | | | | | | | |
1 1 1 1 1 1 1 9 8 7 6 5 4 3 2 1
6 5 4 3 2 1 0
\__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/
blade8 blade7 blade6 blade5 blade4 blade3 blade2 blade1
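
The diagram groups the 32 HIFs in fours, with HI 0-3 under blade8 and HI 28-31 under blade1. A minimal sketch of that lookup follows; it assumes exactly the layout drawn above (8 blades, 4 HIFs per blade), and other IOM models may number their HIFs differently. `hif_to_blade` is a hypothetical helper name.

```python
def hif_to_blade(hif: int, hifs_per_blade: int = 4, blades: int = 8) -> int:
    """Map a HIF number to a blade slot for the layout in the diagram above.

    HIFs are grouped in blocks of `hifs_per_blade`, and the lowest block
    belongs to the highest-numbered blade.
    """
    if not 0 <= hif < hifs_per_blade * blades:
        raise ValueError(f"HIF {hif} is outside 0..{hifs_per_blade * blades - 1}")
    return blades - hif // hifs_per_blade


if __name__ == "__main__":
    for hif in (0, 26, 31):
        print(f"HIF{hif} -> blade{hif_to_blade(hif)}")
```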

Understanding PAUSE Frames
What are PAUSE Frames?
• Storage traffic needs to be lossless, so PAUSE frames are used so frames
are not dropped
• PAUSE frames are used in FCoE and allow an interface to send a request
for a short pause in frame transmission to avoid drops
• Can be a sign of an issue, but not always…
• Under normal operations, we would expect PAUSE frames to increment
• Requires detailed review – Remember the numbers are relative!

Do we have congestion in our UCS?
CiscoLive-2019(nx-os)# show interface priority-flow-control

============================================================
Port               Mode Oper(VL bmap)  RxPPP      TxPPP
============================================================

Ethernet1/5        Auto Off            0          0
Ethernet1/6        Auto Off            0          0
Ethernet1/7        Auto Off            0          0
Ethernet1/8        Auto Off            0          0
Ethernet1/9        On   On  (8)        0          195820
Ethernet1/10       On   On  (8)        0          195820
Ethernet1/11       On   On  (8)        0          196207
Ethernet1/12       On   On  (8)        0          196446
Ethernet1/13       Auto Off            0          0
Ethernet1/14       Auto Off            0          0
Ethernet1/15       Auto Off            0          0
Ethernet1/16       Auto Off            0          0

CiscoLive-2019(nx-os)# show interface fex-fabric
1 Eth1/9  Active 2 N20-C6508 FOX1334G690
1 Eth1/10 Active 3 N20-C6508 FOX1334G690
1 Eth1/11 Active 4 N20-C6508 FOX1334G690
1 Eth1/12 Active 1 N20-C6508 FOX1334G690

Need to run multiple times and evaluate rate of change
Typically +10,000 in 1-2 second intervals is a sign of congestion
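
Because a single reading says little, the useful number is the change between two runs taken 1-2 seconds apart. A minimal sketch of that comparison: the first sample uses the TxPPP values shown above, while the second sample is made-up data for illustration. The threshold comes from the slide's rule of thumb, and the function names are hypothetical.

```python
CONGESTION_DELTA = 10_000  # slide guidance: "+10,000 in 1-2 second intervals"


def pause_deltas(first: dict[str, int], second: dict[str, int]) -> dict[str, int]:
    """Per-port increase in the pause counter between two readings."""
    return {port: second[port] - first[port] for port in first if port in second}


def congested_ports(first: dict[str, int], second: dict[str, int],
                    threshold: int = CONGESTION_DELTA) -> list[str]:
    """Ports whose counter grew by at least `threshold` between the readings."""
    deltas = pause_deltas(first, second)
    return sorted(port for port, delta in deltas.items() if delta >= threshold)


if __name__ == "__main__":
    # TxPPP values from the output above.
    first_run = {"Ethernet1/9": 195820, "Ethernet1/10": 195820}
    # Hypothetical second reading taken about two seconds later.
    second_run = {"Ethernet1/9": 207100, "Ethernet1/10": 195900}
    print(pause_deltas(first_run, second_run))
    print(congested_ports(first_run, second_run))
```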

IOM Troubleshooting
Reviewing HIF ports to isolate blades impacted
fex-1# show platform software woodside sts
Board Status Overview: Uplink #: 1 2 3 4 5 6 7 8
legend: Link status: | | | |
' '= no-connect +-+--+--+--+--+--+--+--+-+
X = Failed SFP: [$][$][$][$][ ][ ][ ][ ]
- = Disabled +-+--+--+--+--+--+--+--+-+
: = Dn | N N N N N N N N |
| = Up | I I I I I I I I |
[$] = SFP present | 0 1 2 3 4 5 6 7 |
[ ] = SFP not present | |
[X] = SFP validation failed | NI (0-7) |
------------------------------ +------------+-----------+
|
+-------------------------+-------------+-------------+---------------------------+
| | | |
+------------+-----------+ +-----------+------------+ +------------+-----------+ +-------------+----------+
| HI (0-7) | | HI (8-15) | | HI (16-23) | | HI (24-31) |
| | | | | | | |
| H H H H H H H H | | H H H H H H H H | | H H H H H H H H | | H H H H H H H H |
| I I I I I I I I | | I I I I I I I I | | I I I I I I I I | | I I I I I I I I |
| 0 1 2 3 4 5 6 7 | | 8 9 1 1 1 1 1 1 | | 1 1 1 1 2 2 2 2 | | 2 2 2 2 2 2 3 3 |
| | | 0 1 2 3 4 5 | | 6 7 8 9 0 1 2 3 | | 4 5 6 7 8 9 0 1 |
+-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+
[ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ]
+-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+
- - | | - | - | | | | | - | | |
1 1 1 1 1 1 1 9 8 7 6 5 4 3 2 1
6 5 4 3 2 1 0
\__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/
blade8 blade7 blade6 blade5 blade4 blade3 blade2 blade1

IOM Troubleshooting
fex-1# show platform software woodside rmon 0 hi31
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+
| TX | Current | Diff | RX | Current | Diff |
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+
| TX_PKT_LT64 | 0| 0| RX_PKT_LT64 | 0| 0|
| TX_PKT_64 | 0| 0| RX_PKT_64 | 19985| 0|
| TX_PKT_65 | 30235045| 0| RX_PKT_65 | 22468| 0|
| TX_PKT_128 | 713668| 0| RX_PKT_128 | 46488| 0|
| TX_PKT_256 | 26427672| 2| RX_PKT_256 | 19112| 0|
| TX_PKT_512 | 6425| 0| RX_PKT_512 | 5996| 0|
| TX_PKT_1024 | 12184| 0| RX_PKT_1024 | 21769| 0|
| TX_PKT_1519 | 2690146| 0| RX_PKT_1519 | 106682| 0|
| TX_PKT_2048 | 33075| 0| RX_PKT_2048 | 0| 0|
| TX_PKT_4096 | 0| 0| RX_PKT_4096 | 0| 0|
| TX_PKT_8192 | 0| 0| RX_PKT_8192 | 0| 0|
| TX_PKT_GT9216 | 0| 0| RX_PKT_GT9216 | 0| 0|
| TX_PKTTOTAL | 60118215| 2| RX_PKTTOTAL | 242500| 0|
| TX_OCTETS | 16177265983| 712| RX_OCTETS | 213575710| 0|
| TX_PKTOK | 60118215| 2| RX_PKTOK | 242500| 0|
| TX_UCAST | 2833669| 0| RX_UCAST | 198047| 0|
| TX_MCAST | 6234374| 0| RX_MCAST | 44299| 0|
| TX_BCAST | 51050172| 2| RX_BCAST | 154| 0|
| TX_VLAN | 0| 0| RX_VLAN | 0| 0|
| TX_PAUSE | 0| 0| RX_PAUSE | 0| 0|
| TX_USER_PAUSE | 0| 0| RX_USER_PAUSE | 37754621| 9843|
| TX_FRM_ERROR | 0| 0| | | |
| | | | RX_DISCARD | 0| 0|
| | | | RX_UNDERSIZE | 0| 0|
| | | | RX_FRAGMENT | 0| 0|
| | | | RX_CRC_NOT_STOMPED | 0| 0|
| | | | RX_CRC_STOMPED | 0| 0|
| | | | RX_INRANGEERR | 0| 0|
| | | | RX_JABBER | 0| 0|
| TX_OCTETSOK | 16177265983| 712| RX_OCTETSOK | 213575710| 0|
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+

IOM Troubleshooting
Checking the network port between FI and IOM
fex-1# show platform software woodside rmon 0 ni0 | in PAUSE
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+
| TX | Current | Diff | RX | Current | Diff |
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+
| PORT CNTRS NI0 |
| TX_PAUSE | 0| 0| RX_PAUSE | 0| 0|
| TX_USER_PAUSE | 1956| 78| RX_USER_PAUSE | 87512| 3564|

Significantly more user pause on RX

• RX from FI (coming from upstream)
• TX to FI (sending to upstream)

Pause frames are normal behavior. The number is relative!
Remember that TX and RX are from the IOM’s perspective

[Image: UCS 2208XP IOM]
Understanding FC Aborts

Common Scenario - FC Aborts Seen in OS Logs
ESXi host – vmkernel.log
2017-09-05T08:14:04.267Z cpu26:2727449)<7>fnic : 0 :: Abort Cmd called FCID 0x450060, LUN 0x61 TAG f3 flags 273
2017-09-05T08:14:04.267Z cpu4:2727482)<7>fnic : 0 :: Abort Cmd called FCID 0x450060, LUN 0x63 TAG f5 flags 273
2017-09-05T08:14:04.270Z cpu11:2605734)<7>fnic : 0 :: abts cmpl recd. id 243 status FCPIO_ABORTED
2017-09-05T08:14:04.270Z cpu26:2727449)<7>fnic : 0 :: Returning from abort cmd type 2 SUCCESS
2017-09-05T08:14:04.270Z cpu28:33545)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.600601602e503900fd54acb15dfae511"
state in doubt; requested fast path state update...
2017-09-05T08:14:04.270Z cpu28:33545)ScsiDeviceIO: 2651: Cmd(0x43a601e05d80) 0xfe, CmdSN 0xaa13f from world 32822 to dev
"naa.600601602e503900fd54acb15dfae511" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2017-09-05T08:14:04.273Z cpu11:2605734)<7>fnic : 0 :: abts cmpl recd. id 245 status FCPIO_ABORTED

Aborts are normal behavior, however a significant number of aborts can indicate an issue!
Numbers are relative!

Common Scenario – Check UCS Adapter Logs
What does the mezzanine adapter show?
• OS log showing issues hitting storage
• Investigate adapter logs on UCS

CiscoLive-2019-A# connect adapter 1/1/1
adapter 1/1/1 # connect
adapter 1/1/1 (top):1# show-log

160309-20:25:43.456386 ecom.ecom_main ecom(8:2): abort called for exch 68f1,
status 3 rx_id 8517 s_stat 0x1 xmit_recvd 0x400 burst_offset 0x400 burst_len 0x0
sgl_err 0x0 last_param 0x0 last_seq_cnt 0x0 tot_bytes_exp 0x400 h_seq_cnt 0x0
exch_type 0x1 s_id 0x450020 d_id 0x450060 host_tag 0x58

160309-20:25:45.526540 ecom.ecom_main ecom(8:2): abort called for exch 69db,
status 3 rx_id 87f0 s_stat 0x1 xmit_recvd 0x2000 burst_offset 0x2000 burst_len 0x0
sgl_err 0x0 last_param 0x0 last_seq_cnt 0x3 tot_bytes_exp 0x2000 h_seq_cnt 0x0
exch_type 0x1 s_id 0x450060 d_id 0x6e0051 host_tag 0xf

Aborts are normal behavior, however a significant number of aborts can indicate an issue!
Numbers are relative!

Review Upstream FC Switch
Correlate Source and Destination FCIDs to devices in the fabric
• Source FCID and Dest FCID should be present
• Conversation was between array and itself, and array to UCS

Abort 1: s_id 0x450020 d_id 0x450060
Abort 2: s_id 0x450060 d_id 0x6e0051

f241-03-08-5596-a# show fcns database vsan 1000

VSAN 1000:
--------------------------------------------------------------------------
FCID        TYPE  PWWN                    (VENDOR)        FC4-TYPE:FEATURE
--------------------------------------------------------------------------
0x450000    N     50:0a:09:83:87:49:80:24 (NetApp)        scsi-fcp:target
                  [netapp1-1-0a]
0x450020    N     50:0a:09:83:97:49:80:24 (NetApp)        scsi-fcp:target
                  [netapp1-2-0a]
0x450040    N     50:0a:09:81:87:49:80:24 (NetApp)        scsi-fcp:target
                  [netapp1-1-0c]
0x450060    N     50:0a:09:81:97:49:80:24 (NetApp)        scsi-fcp:target
                  [netapp1-2-0c]
0x6e0051    N     20:00:00:25:d5:00:00:2f                 scsi-fcp:init fc-gs
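
Matching each abort's s_id and d_id against the name server database can be done with a small lookup. A minimal sketch: the `FCNS` table is transcribed by hand from the `show fcns database` output above, the label for the initiator entry is an illustrative description rather than a device alias, and the function name is hypothetical.

```python
import re

# FCID -> device, transcribed from the fcns database output above.
FCNS = {
    0x450000: "netapp1-1-0a",
    0x450020: "netapp1-2-0a",
    0x450040: "netapp1-1-0c",
    0x450060: "netapp1-2-0c",
    0x6E0051: "20:00:00:25:d5:00:00:2f (UCS vHBA, initiator)",
}


def abort_endpoints(log_line: str, fcns: dict[int, str]) -> tuple[str, str]:
    """Return (source, destination) device names for one adapter abort entry."""
    match = re.search(r"s_id (0x[0-9a-f]+) d_id (0x[0-9a-f]+)", log_line)
    if match is None:
        raise ValueError("no s_id/d_id pair found in log line")
    source, dest = (int(value, 16) for value in match.groups())
    return fcns.get(source, "unknown"), fcns.get(dest, "unknown")


if __name__ == "__main__":
    abort_1 = "exch_type 0x1 s_id 0x450020 d_id 0x450060 host_tag 0x58"
    abort_2 = "exch_type 0x1 s_id 0x450060 d_id 0x6e0051 host_tag 0xf"
    print(abort_endpoints(abort_1, FCNS))
    print(abort_endpoints(abort_2, FCNS))
```

An "unknown" result would mean the FCID is not present in the fabric's name server database for that VSAN.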

4th Gen FI
New Commands for FC Troubleshooting
FC Credit Mechanism Basics
• Frames are only transmitted when it is known that the receiver has buffer space
• For each frame sent, an R_RDY (B2B Credit) should be returned
• R_RDYs can only be returned once the frame that has previously occupied that buffer location has been handled

[Diagram: SW1 sending to SW2]
• Frame sent, B2B Credit –1 on SW1
• R_RDY sent after frame processed
• R_RDY received (B2B +1) and new frame transmitted
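
The credit exchange can be modeled in a few lines. This is a toy sketch of the sender's view only, not how any switch implements it; the class and attribute names are made up for illustration.

```python
class CreditedLink:
    """Toy model of buffer-to-buffer credits as seen by the sending port."""

    def __init__(self, credits: int) -> None:
        self.credits = credits            # B2B credits the sender may still use
        self.in_flight = 0                # frames occupying receiver buffers
        self.zero_credit_transitions = 0  # times the sender ran out of credit

    def send_frame(self) -> bool:
        """Send one frame if a credit is available; return False if blocked."""
        if self.credits == 0:
            return False                  # must wait for an R_RDY
        self.credits -= 1
        self.in_flight += 1
        if self.credits == 0:
            self.zero_credit_transitions += 1
        return True

    def receive_r_rdy(self) -> None:
        """The receiver freed a buffer and returned one credit."""
        if self.in_flight == 0:
            raise RuntimeError("R_RDY received with no outstanding frame")
        self.in_flight -= 1
        self.credits += 1


if __name__ == "__main__":
    link = CreditedLink(credits=2)
    print(link.send_frame(), link.send_frame(), link.send_frame())
    link.receive_r_rdy()
    print(link.send_frame(), link.zero_credit_transitions)
```

A receiver that is slow to return R_RDYs keeps the sender at zero credit, which is the condition the transition-to-zero counters on the following slides are counting.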

Buffer to Buffer Credit Counters
CiscoLive-2019-A# connect nxos a
CiscoLive-2019-A(nx-os)# attach module 1

module-1# show hardware internal fc-mac 1 port 1 statistics


ADDRESS STAT COUNT
__________ ________ __________________
0x0000003c FCP_CNTR_MAC_RX_LOSS_OF_SYNC 0
0x0000003d FCP_CNTR_MAC_RX_BAD_WORDS_FROM_DECODER 0
<snip>
0xffffffff FCP_CNTR_TIMEOUT_DISCARDS 0
0xffffffff FCP_CNTR_CREDIT_LOSS 0
0xffffffff FCP_CNTR_FORCE_TIMEOUT_ON 0
0xffffffff FCP_CNTR_FORCE_TIMEOUT_OFF 0
0xffffffff FCP_CNTR_TX_WT_AVG_B2B_ZERO 0
0xffffffff FCP_CNTR_RX_WT_AVG_B2B_ZERO 0

These counters display the number of Tx and Rx transitions to zero credit within the 100ms monitor window

Credit Loss Events and Slow Drain Detection
module-1# show process creditmon credit-loss-events
Credit Loss Events: YES

----------------------------------------------------
| Interface | Total | Timestamp |
| | Events | |
----------------------------------------------------
| fc1/1 | 2 | 1. Thu Aug 9 17:16:03 2018 |
| | | 2. Thu Aug 9 17:14:09 2018 |
----------------------------------------------------

module-1# show platform software fcpc info interface fc 1/1 | begin CREDIT
CREDIT MONITOR INFO
if index: 0x1000000
monitor event: on
number of err functions invoked: 0
number of GSM events generated: 0
number of err entries: 1
fcp port mode: trunking port
e port previous rrdy count: 21879
e port credit loss count: 0

• number of GSM events generated: Slow drain event detected count
• number of err entries: Think of it as “number of times credit-loss-recovery was detected”
• e port previous rrdy count: Number of RRDY counts sent
• e port credit loss count: Number of credit loss in current monitor window

When in End-Host Mode, uplink NP port counters are displayed as E Ports

Storage Troubleshooting Summary
Quick recap…
• Lunlist output only available before boot

• Virtually no drawback from a SAN port channel


• Aborts and pause frame numbers are relative, need to look at the
aggregate
• 4th Gen FI has improved FC troubleshooting tools

UCS Management Mastery
UCS Health Checks
When are they recommended?
• Before Upgrades
• Prior to planned maintenance of any kind

UCS Manager Health Check
• Connect local-mgmt (a/b) via SSH
• Show cluster extended-state
• Check for L1/L2 up, is HA ready

CiscoLive-2019-A(local-mgmt)# show cluster extended-state
Cluster Id: 0x2c092182748311e2-0x8ed9547feec569c4
Start time: Wed Apr 19 14:41:20 2017
Last election time: Wed Apr 19 14:42:58 2017

A: UP, PRIMARY
B: UP, SUBORDINATE

A: memb state UP, lead state PRIMARY, mgmt services state: UP
B: memb state UP, lead state SUBORDINATE, mgmt services state: UP
heartbeat state PRIMARY_OK

INTERNAL NETWORK INTERFACES:
eth1, UP
eth2, UP

HA READY
Detailed state of the device selected for HA storage:
Chassis 1, serial: FOX1330GDH1, state: active
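
Before an upgrade, the same checks (both FIs up, L1/L2 up, HA ready) can be run against saved output. A minimal sketch, assuming the text layout shown above; `cluster_health` is a hypothetical helper, and the sample is an abbreviated copy of the slide output.

```python
import re

SAMPLE = """\
A: UP, PRIMARY
B: UP, SUBORDINATE

A: memb state UP, lead state PRIMARY, mgmt services state: UP
B: memb state UP, lead state SUBORDINATE, mgmt services state: UP
heartbeat state PRIMARY_OK

INTERNAL NETWORK INTERFACES:
eth1, UP
eth2, UP

HA READY
"""


def cluster_health(output: str) -> dict[str, bool]:
    """Summarize 'show cluster extended-state' text as pass/fail checks."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]

    # Short summary lines look like 'A: UP, PRIMARY'.
    summary = {}
    for line in lines:
        match = re.fullmatch(r"([AB]): (\w+), (\w+)", line)
        if match:
            summary[match.group(1)] = (match.group(2), match.group(3))

    # Internal (L1/L2) interface lines look like 'eth1, UP'.
    links = [line for line in lines if re.fullmatch(r"eth\d+, \w+", line)]
    roles = sorted(role for _, role in summary.values())

    return {
        "both_fis_up": len(summary) == 2
        and all(state == "UP" for state, _ in summary.values()),
        "one_primary_one_subordinate": roles == ["PRIMARY", "SUBORDINATE"],
        "l1_l2_up": bool(links) and all(link.endswith(", UP") for link in links),
        "ha_ready": "HA READY" in lines,
    }


if __name__ == "__main__":
    for check, passed in cluster_health(SAMPLE).items():
        print(f"{check}: {'ok' if passed else 'CHECK'}")
```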

UCS Manager Health Check
Healthy FI Working Output
• Connect local-mgmt (a/b) via SSH
• Show pmon state local to each FI
• If there are issues, please contact TAC before upgrading

CiscoLive-2019-A(local-mgmt)# show pmon state

SERVICE NAME             STATE     RETRY(MAX)    EXITCODE    SIGNAL    CORE
------------             -----     ----------    --------    ------    ----
svc_sam_controller     running           0(4)           0         0      no
svc_sam_dme            running           0(4)           0         0      no
svc_sam_dcosAG         running           0(4)           0         0      no
svc_sam_bladeAG        running           0(4)           0         0      no
svc_sam_portAG         running           0(4)           0         0      no
svc_sam_statsAG        running           0(4)           0         0      no
svc_sam_hostagentAG    running           0(4)           0         0      no
svc_sam_nicAG          running           0(4)           0         0      no
svc_sam_licenseAG      running           0(4)           0         0      no
svc_sam_extvmmAG       running           0(4)           0         0      no
httpd.sh               running           0(4)           0         0      no
httpd_cimc.sh          running           0(4)           0         0      no
svc_sam_sessionmgrAG   running           0(4)           0         0      no
svc_sam_pamProxy       running           0(4)           0         0      no
dhcpd                  running           0(4)           0         0      no
sam_core_mon           running           0(4)           0         0      no
svc_sam_rsdAG          running           0(4)           0         0      no
svc_sam_svcmonAG       running           0(4)           0         0      no
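
Scanning the service list by eye is error-prone, so a small filter can list anything that is not running or has dumped core. A minimal sketch, assuming the six-column row layout of the `show pmon state` output above; the function name is hypothetical, and the failed row in the usage example is made-up data.

```python
import re


def unhealthy_services(output: str) -> list[str]:
    """Names of services that are not 'running' or that report a core dump."""
    bad = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have six fields and a RETRY(MAX) value such as '0(4)'.
        if len(fields) != 6 or not re.fullmatch(r"\d+\(\d+\)", fields[2]):
            continue
        name, state, _retry, _exitcode, _signal, core = fields
        if state != "running" or core != "no":
            bad.append(name)
    return bad


if __name__ == "__main__":
    healthy = (
        "SERVICE NAME STATE RETRY(MAX) EXITCODE SIGNAL CORE\n"
        "------------ ----- ---------- -------- ------ ----\n"
        "svc_sam_controller running 0(4) 0 0 no\n"
        "svc_sam_dme running 0(4) 0 0 no\n"
    )
    # Hypothetical failure row, for illustration only.
    degraded = healthy + "svc_sam_bladeAG failed 4(4) 0 11 yes\n"
    print(unhealthy_services(healthy))
    print(unhealthy_services(degraded))
```

Per the slide, any service reported here is a reason to contact TAC before upgrading.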

UCS Management Database Health
Scheduled Checks...
CiscoLive-2019-B# scope system
CiscoLive-2019-B /system # show mgmt-db-check-policy detail

Management Database Integrity Check Policy:


Health Check Interval (hours): 24
Last Integrity Check Time: 2018-12-18T23:15:56.615
Internal Backup Interval (days): 14
Last Internal Backup Time: 2018-12-18T20:39:40.54

UCS Management Database Health
Manually Checking…
CiscoLive-2019-B# scope system
CiscoLive-2019-B /system # start-db-check
CiscoLive-2019-B /system* # commit-buffer

<wait…>

CiscoLive-2019-B /system # show mgmt-db detail

Management Database Status:


Fabric Id Corrupted Count Last Occurrence Time
--------- ----------------------- --------------------
A 0 1970-01-01T00:00:00.000
B 2 2018-08-13T12:15:56.01

Hardware Diagnostics
Diagnostic ISO
Tests available:
• Memtest 86 – Memory and CPU cache tests
• Quick tests – CIMC, CPU, Storage, Memory
• Comprehensive tests – CIMC, CPU, Memory, Video
• Test Suite – User chooses which tests to run

Blade Memory Diagnostics
• Added in UCSM version 3.1(3a) as an embedded tool
• Only tests for issues with memory
• Several options configurable in diag-policy:

Blade Memory Diagnostics
Available under Diagnostics tab of Server view

Blade Memory Diagnostics
Results

The Power of Intersight
• Connecting UCS to the Cloud

• Visibility into multiple domains

• Supports UCS-B/C series and HX deployments

Intersight - Claim UCS Domain
• First step is to claim your domain in Intersight!
• Log in at https://www.intersight.com

Intersight - Claim UCS Domain

Intersight - Claim UCS Domain

Enter your Device ID and Claim Code

Intersight – New Domain Added!

Intersight - HCL

Server is not compliant with the HCL!

One minute while we update
firmware…

Intersight - HCL

We are compliant!

Cisco Intersight
Connected TAC
Cisco Intersight: Enhanced Support
Connected TAC

Overview:
Automated transmission of technical support files to the Cisco Technical Assistance Center (TAC) for accelerated troubleshooting.

Supports:
• UCS Manager
• Standalone C-Series
• HyperFlex (ESXi & Hyper-V)

Intersight + TAC Real World Example #1

P2 Case Opened Automatically -> Diagnostic Data Collected Automatically -> Diagnostic Results Generated -> Resolution Implemented

• 2018-12-04 16:25: Defect Signature Detected
• +20 Minutes
• +30 Minutes to Resolution

Intersight + TAC Real World Example #2

P3 Case Opened Automatically -> Diagnostic Data Collected Automatically -> Diagnostic Results Gathered -> RMA Processed & Part Replaced

• 2018-06-12 08:11
• +11 Minutes
• +1:59 Hours (RMA DIMM)

Intersight – Future Features
• Intersight Appliance (On-Prem version)
• TAC SR Creation
• More integration with TAC Digitized IC (automatic issue detection)
• More server platforms integrated
• New C series models!
• HyperFlex Connect
• Deploy HX through Intersight!
• New features and documentation at https://www.intersight.com/help/

Questions?

Complete your online session survey
• Please complete your Online Session Survey after each session
• Complete 4 Session Surveys & the Overall Conference Survey (available from Thursday) to receive your Cisco Live T-shirt
• All surveys can be completed via the Cisco Events Mobile App or the Communication Stations

Don’t forget: Cisco Live sessions will be available for viewing on demand after the event at ciscolive.cisco.com

Continue Your Education

• Demos in the Cisco Showcase
• Walk-in self-paced labs
• Meet the engineer 1:1 meetings
• Related sessions

Thank you
Disjoint Layer 2
What is Disjoint L2?
Two different L2 Domains…
[Figure: two L2 domains, Prod and Backup, with a Prod vNIC and a Backup vNIC]

Configuration done half-way...

What about these VLANs?

Uplink port configuration in this scenario…
CiscoLive-2019-B(nxos)# show running-config interface ethernet 1/17

interface Ethernet1/17
  description U: Uplink
  pinning border
  pinning server nf-exporter
  switchport mode trunk
  switchport trunk allowed vlan 1,104,111,204,211,304,311,900
  udld disable
  no shutdown

CiscoLive-2019-B(nxos)# show running-config interface port-channel 2

interface port-channel2
  description U: Uplink
  switchport mode trunk
  switchport trunk allowed vlan 1,104,111,204,211,304,311
  pinning border
  speed 10000
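
The two outputs above can be compared mechanically. The following is a minimal sketch, not a Cisco tool: it assumes the running-config text has been captured into a string, handles only numeric VLAN lists and ranges (not keywords such as `all` or `none`), and uses hypothetical helper names. It lists every VLAN that is allowed on more than one uplink, which is the symptom of a half-finished disjoint L2 setup.

```python
import re
from collections import defaultdict

# Sample text reduced from the two "show running-config interface" outputs above.
RUNNING_CONFIG = """\
interface Ethernet1/17
  switchport mode trunk
  switchport trunk allowed vlan 1,104,111,204,211,304,311,900
interface port-channel2
  switchport mode trunk
  switchport trunk allowed vlan 1,104,111,204,211,304,311
"""


def expand_vlans(spec):
    """Expand a numeric VLAN list such as '1,104-106,900' into a set of ints."""
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-")
            vlans.update(range(int(low), int(high) + 1))
        elif part:
            vlans.add(int(part))
    return vlans


def allowed_vlans_by_interface(config):
    """Map each interface name to the set of VLANs allowed on its trunk."""
    allowed = defaultdict(set)
    current = None
    for raw in config.splitlines():
        line = raw.strip()
        match = re.match(r"interface (\S+)", line)
        if match:
            current = match.group(1)
            continue
        match = re.match(r"switchport trunk allowed vlan (?:add )?(.+)", line)
        if match and current:
            allowed[current] |= expand_vlans(match.group(1))
    return dict(allowed)


def vlans_on_multiple_uplinks(allowed):
    """Return {vlan: sorted interface names} for VLANs present on more than one uplink."""
    seen = defaultdict(list)
    for ifname, vlans in allowed.items():
        for vlan in vlans:
            seen[vlan].append(ifname)
    return {vlan: sorted(ifs) for vlan, ifs in sorted(seen.items()) if len(ifs) > 1}


if __name__ == "__main__":
    shared = vlans_on_multiple_uplinks(allowed_vlans_by_interface(RUNNING_CONFIG))
    for vlan, ifs in shared.items():
        print(f"VLAN {vlan}: {', '.join(ifs)}")
```

With this sample, every VLAN except 900 appears on both Ethernet1/17 and port-channel2. Whether a shared VLAN is a problem depends on the design; the check only points out where to look.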

Understanding the Designated Receiver
• Absence of STP means we rely on other mechanisms to avoid loops

• An uplink will be selected as the broadcast and multicast receiver


• This is referred to as the Designated Receiver
• Done on a per VLAN basis
[Figure: Prod and Backup L2 domains; uplinks Po2 and Eth 1/17; Prod vNIC and Backup vNIC]
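
The effect of the designated receiver can be illustrated with a small model. This is a simplified sketch under stated assumptions, not the FI's actual selection logic: it assumes that in End Host Mode inbound broadcast and multicast for a VLAN is accepted only on that VLAN's designated receiver, and the `UPSTREAM_UPLINK` map (which uplink actually leads to each VLAN's upstream domain) is a hypothetical input that would come from your own design. The function names are illustrative.

```python
# Designated receiver per VLAN, as an FI might report it (illustrative values).
DESIGNATED_RECEIVER = {1: "Po2", 104: "Eth1/17", 111: "Po2", 900: "Eth1/17"}

# Assumption: the uplink behind which each VLAN's upstream L2 domain really lives.
UPSTREAM_UPLINK = {104: "Po2", 111: "Po2", 900: "Eth1/17"}


def inbound_flood_accepted(vlan, ingress_uplink, designated_receiver):
    """Simplified model: inbound broadcast/multicast is accepted only on the
    uplink chosen as designated receiver for that VLAN."""
    return designated_receiver.get(vlan) == ingress_uplink


def find_blackholed_vlans(designated_receiver, upstream_uplink):
    """Return VLANs whose upstream domain sits behind a non-designated uplink."""
    return sorted(
        vlan
        for vlan, uplink in upstream_uplink.items()
        if not inbound_flood_accepted(vlan, uplink, designated_receiver)
    )


if __name__ == "__main__":
    for vlan in find_blackholed_vlans(DESIGNATED_RECEIVER, UPSTREAM_UPLINK):
        print(
            f"VLAN {vlan}: broadcasts arrive on {UPSTREAM_UPLINK[vlan]} "
            f"but the designated receiver is {DESIGNATED_RECEIVER[vlan]}"
        )
```

In this model VLAN 104 is flagged: its broadcasts (ARP requests, for example) arrive on Po2, while the designated receiver is Eth1/17, so they would be dropped.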

Who is the Designated Receiver?
CiscoLive-2019-B(nxos)# show platform software enm internal info vlandb all

vlan_id 1
-------------
Designated receiver: Po2
Membership:
Eth1/17 Po2

vlan_id 104
-------------
Designated receiver: Eth1/17
Membership:
Eth1/17 Po2

vlan_id 111
-------------
Designated receiver: Po2
Membership:
Eth1/17 Po2

vlan_id 900
-------------
Designated receiver: Eth1/17
Membership:
Eth1/17

[Figure: Prod and Backup L2 domains; uplinks Po2 and Eth 1/17; annotation "VLAN 900 Only"; Prod vNIC and Backup vNIC]
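
When many VLANs are involved, the output above is easier to review once parsed. The sketch below assumes the output has exactly the layout shown on this slide (a `vlan_id` line, a dashed rule, a `Designated receiver:` line and a `Membership:` list); real output may carry additional fields, and `parse_vlandb` is a hypothetical helper name.

```python
import re

# Sample text in the layout shown on the slide above.
VLANDB_OUTPUT = """\
vlan_id 1
-------------
Designated receiver: Po2
Membership:
Eth1/17 Po2

vlan_id 104
-------------
Designated receiver: Eth1/17
Membership:
Eth1/17 Po2

vlan_id 111
-------------
Designated receiver: Po2
Membership:
Eth1/17 Po2

vlan_id 900
-------------
Designated receiver: Eth1/17
Membership:
Eth1/17
"""


def parse_vlandb(text):
    """Return {vlan_id: {"designated_receiver": str or None, "members": [str, ...]}}."""
    vlans = {}
    current = None
    in_membership = False
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"vlan_id (\d+)$", line)
        if match:
            current = int(match.group(1))
            vlans[current] = {"designated_receiver": None, "members": []}
            in_membership = False
            continue
        # Skip blank lines, dashed rules, and anything before the first VLAN.
        if current is None or not line or set(line) == {"-"}:
            continue
        if line.startswith("Designated receiver:"):
            value = line.split(":", 1)[1].strip()
            vlans[current]["designated_receiver"] = value or None
            in_membership = False
        elif line.startswith("Membership:"):
            in_membership = True
            vlans[current]["members"].extend(line.split(":", 1)[1].split())
        elif in_membership:
            vlans[current]["members"].extend(line.split())
    return vlans


if __name__ == "__main__":
    for vlan, info in parse_vlandb(VLANDB_OUTPUT).items():
        note = "  <- on more than one uplink" if len(info["members"]) > 1 else ""
        print(f"VLAN {vlan}: DR={info['designated_receiver']} members={info['members']}{note}")
```

VLANs that are members of more than one uplink are the ones where the designated receiver may land on the wrong side of a disjoint L2 design.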

Disjoint Layer 2 Configured in full…

You must explicitly define which interfaces the VLAN should traverse for ALL VLANs in DJL2

Correct configuration from CLI
CiscoLive-2019-B(nxos)# show running-config interface ethernet 1/17

interface Ethernet1/17
  description U: Uplink
  pinning border
  pinning server nf-exporter
  switchport mode trunk
  switchport trunk allowed vlan 1,900
  udld disable
  no shutdown

CiscoLive-2019-B(nxos)# show running-config interface port-channel 2

interface port-channel2
  description U: Uplink
  switchport mode trunk
  switchport trunk allowed vlan 1,104,111,204,211,304,311
  pinning border
  speed 10000

