Cisco UCS Troubleshooting
Lessons Learned: Troubleshooting UCS from a TAC Engineer’s Perspective
BRKINI-2011 © 2019 Cisco and/or its affiliates. All rights reserved. Cisco Public 3
Agenda
• The Network is Down!!?
• The Story with Storage
• UCS Management Mastery
The Network is Down!!?
Isolating Networking Issues
UCS Networking Basics
Ethernet Switching Modes
• Switch Mode
• FI works like a normal layer 2 switch with spanning-tree
Physical and Logical Ports in UCS
• Fabric Interconnect: Uplink Ports, Server Ports
• IOM / FEX: Network Interface (NIF) Ports, Host Interface (HIF) Ports
• Mezz Adapter: Adapter Port, vNIC
• Logical: vNIC / vEthernet / Virtual Interface (VIF)
Network Troubleshooting for UCS
TAC Methodology…
• Simplify the issue
Common Problem: My VM is unreachable on the network
Lab Topology
vPC
Nexus 5548 Nexus 5548
Port-Channel
Tracing the Path in UCS
Start from the most southbound endpoint
Gather information from the hypervisor about the VM…
• Hypervisor Host (blade)
• VM MAC Address
• vSwitch Port Group
• IP Address
Understand Virtual Network Path
Verifying virtual switch is configured as expected…
Understand Virtual Network Path
Using “esxtop” command with the ‘n’ option…
[root@localhost:~] esxtop
4:50:33pm up 2 days 23:29, 696 worlds, 3 VMs, 6 vCPUs; CPU load average: 0.00, 0.00, 0.00
PORT-ID USED-BY TEAM-PNIC DNAME PKTTX/s MbTX/s PSZTX PKTRX/s MbRX/s PSZRX %DRPTX %DRPRX
33554433 Management n/a vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
33554434 vmnic0 - vSwitch0 0.00 0.00 0.00 12.25 0.01 82.00 0.00 0.00
33554435 Shadow of vmnic0 n/a vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
33554439 vmk0 vmnic1 vSwitch0 10.11 0.01 181.00 8.16 0.00 73.00 0.00 0.00
33554440 vmnic1 - vSwitch0 10.11 0.01 181.00 20.02 0.01 79.00 0.00 0.00
33554441 Shadow of vmnic1 n/a vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
33554442 512108:rhel7-1 vmnic1 vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
33554443 512185:jlil-central vmnic1 vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
33554444 510584:Win-01 vmnic0 vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50331649 Management n/a vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50331650 vmnic2 - vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50331651 Shadow of vmnic2 n/a vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50331652 vmnic4 - vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50331653 Shadow of vmnic4 n/a vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50331654 vmk1 vmnic2 vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
50331655 vmk2 vmnic4 vSwitch1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
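The esxtop network view above can also be read programmatically. The following is a minimal sketch, assuming the column layout shown on this slide (PORT-ID, USED-BY, TEAM-PNIC, DNAME, then eight numeric counters); the function names are hypothetical and the sample rows are copied from the output above.

```python
SAMPLE = """\
PORT-ID USED-BY TEAM-PNIC DNAME PKTTX/s MbTX/s PSZTX PKTRX/s MbRX/s PSZRX %DRPTX %DRPRX
33554434 vmnic0 - vSwitch0 0.00 0.00 0.00 12.25 0.01 82.00 0.00 0.00
33554435 Shadow of vmnic0 n/a vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
33554442 512108:rhel7-1 vmnic1 vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
33554444 510584:Win-01 vmnic0 vSwitch0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
"""


def parse_esxtop_network(text):
    """Map each USED-BY entry to its TEAM-PNIC and vSwitch (DNAME)."""
    ports = {}
    for line in text.strip().splitlines():
        fields = line.split()
        # Data rows start with a numeric PORT-ID and end with 8 counters.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        # USED-BY may contain spaces, so count columns from the right.
        used_by = " ".join(fields[1:-10])
        ports[used_by] = {"team_pnic": fields[-10], "vswitch": fields[-9]}
    return ports


def uplink_for_vm(ports, vm_name):
    """Return the vmnic a VM is using; VM rows look like '<world-id>:<name>'."""
    for used_by, info in ports.items():
        if used_by.split(":", 1)[-1] == vm_name:
            return info["team_pnic"]
    return None


ports = parse_esxtop_network(SAMPLE)
print(uplink_for_vm(ports, "Win-01"))
```

For the sample rows, the Win-01 VM resolves to vmnic0, which is the uplink to trace northbound.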
Understand Virtual Network Path
What we know so far…
• VM specifics
• MAC Address
• IP Address
• Virtual machine is actively using vmnic0 to send traffic northbound in the
UCS
• Next, we need to understand which fabric this traffic should be traversing
Determining Fabric Path in UCS
vmnic to vNIC mapping…
We use the vmnic MAC Address and match it with the vNIC in UCS
[root@localhost:~] esxcfg-nics -l
Name PCI Driver Link Speed Duplex MAC Address MTU Description
vmnic0 0000:06:00.0 enic Up 10000Mbps Full 00:25:b5:a1:a1:a0 1500 Cisco Systems Inc Cisco VIC Ethernet NIC
vmnic1 0000:07:00.0 enic Up 10000Mbps Full 00:25:b5:b1:b1:b1 1500 Cisco Systems Inc Cisco VIC Ethernet NIC
vmnic2 0000:08:00.0 enic Up 10000Mbps Full 00:25:b5:a1:a1:b0 1500 Cisco Systems Inc Cisco VIC Ethernet NIC
vmnic3 0000:85:00.0 enic Up 10000Mbps Full 00:25:b5:b1:b1:b0 1500 Cisco Systems Inc Cisco VIC Ethernet NIC
vmnic4 0000:86:00.0 enic Up 10000Mbps Full 00:25:b5:b1:b1:c0 1500 Cisco Systems Inc Cisco VIC Ethernet NIC
vmnic5 0000:87:00.0 enic Up 10000Mbps Full 00:25:b5:a1:a1:a1 1500 Cisco Systems Inc Cisco VIC Ethernet NIC
Common mistake – assuming vNIC# and vmnic# are the same without
verifying
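To avoid the vNIC#/vmnic# assumption, the match can be made on MAC address alone. This is a rough sketch: the parsing assumes the `esxcfg-nics -l` layout shown above, and the `SERVICE_PROFILE_VNICS` table is a hypothetical stand-in for the vNIC MAC addresses you would read from the service profile in UCS Manager.

```python
import re

MAC_RE = re.compile(r"\b(?:[0-9a-f]{2}:){5}[0-9a-f]{2}\b", re.IGNORECASE)

ESXCFG_NICS = """\
Name PCI Driver Link Speed Duplex MAC Address MTU Description
vmnic0 0000:06:00.0 enic Up 10000Mbps Full 00:25:b5:a1:a1:a0 1500 Cisco Systems Inc Cisco VIC Ethernet NIC
vmnic1 0000:07:00.0 enic Up 10000Mbps Full 00:25:b5:b1:b1:b1 1500 Cisco Systems Inc Cisco VIC Ethernet NIC
vmnic2 0000:08:00.0 enic Up 10000Mbps Full 00:25:b5:a1:a1:b0 1500 Cisco Systems Inc Cisco VIC Ethernet NIC
"""

# Hypothetical values: copy the real vNIC names and MACs from the service profile.
SERVICE_PROFILE_VNICS = {
    "vNIC2": "00:25:B5:A1:A1:A0",
    "vNIC0": "00:25:B5:B1:B1:B1",
}


def vmnic_macs(esxcfg_output):
    """Return {vmnic name: lower-case MAC} from 'esxcfg-nics -l' output."""
    result = {}
    for line in esxcfg_output.splitlines():
        fields = line.split()
        match = MAC_RE.search(line)
        if fields and fields[0].startswith("vmnic") and match:
            result[fields[0]] = match.group(0).lower()
    return result


def match_vnics(vmnics, vnic_macs):
    """Pair each vmnic with the vNIC that has the same MAC (None if no match)."""
    by_mac = {mac.lower(): name for name, mac in vnic_macs.items()}
    return {vmnic: by_mac.get(mac) for vmnic, mac in vmnics.items()}


mapping = match_vnics(vmnic_macs(ESXCFG_NICS), SERVICE_PROFILE_VNICS)
print(mapping)
```

With the hypothetical table above, vmnic0 pairs with vNIC2 rather than vNIC0, which is exactly the kind of mismatch the slide warns about.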
Determining Fabric Path in UCS
Match the MAC of the vmnic to the vNIC on the service profile…
Are we learning the MAC address on the FI?
Send traffic from the VM and see what is working…
• Based on what we have found we expect the following:
• VM traffic should be traversing Fabric Interconnect A
• VLAN ID should be 211 based on vSwitch config
• VM MAC Address – 00:50:56:8d:29:15
CiscoLive-2019-A# connect nxos a
CiscoLive-2019-A(nxos)# show mac address-table vlan 211
Legend:
* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link
VLAN MAC Address Type age Secure NTFY Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
* 211 0050.568d.2915 dynamic 10 F F Veth4173
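Note that the hypervisor reports the MAC as 00:50:56:8d:29:15 while NX-OS prints it as 0050.568d.2915. A small helper, sketched here with hypothetical function names, converts between the two notations and looks up the learned port in the table output shown above.

```python
MAC_TABLE = """\
VLAN MAC Address Type age Secure NTFY Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
* 211 0050.568d.2915 dynamic 10 F F Veth4173
"""


def to_cisco_mac(mac):
    """Convert any common MAC notation to NX-OS dotted form (xxxx.xxxx.xxxx)."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


def lookup_port(table_text, mac):
    """Return the port a MAC was learned on, or None if it is not in the table."""
    wanted = to_cisco_mac(mac)
    for line in table_text.splitlines():
        fields = line.split()
        if wanted in fields:
            return fields[-1]  # last column is the port (e.g. a Veth)
    return None


print(to_cisco_mac("00:50:56:8d:29:15"))
print(lookup_port(MAC_TABLE, "00:50:56:8d:29:15"))
```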
vEthernet to Server mapping
Verifying that traffic is flowing as expected…
CiscoLive-2019-A(nxos)# show interface vethernet 4173
Vethernet4173 is up
Bound Interface is port-channel1364
Port description is server 1/1, VNIC vNIC2
Hardware is Virtual, address is 547f.eef6.faa0
Port mode is trunk
Speed is auto-speed
Duplex mode is auto
300 seconds input rate 0 bits/sec, 0 packets/sec
300 seconds output rate 0 bits/sec, 0 packets/sec
Rx
111 unicast packets 281 multicast packets 2259 broadcast packets
2651 input packets 264597 bytes
0 input packet drops
Tx
82 unicast packets 79044 multicast packets 1841271 broadcast packets
1920397 output packets 180687663 bytes
0 flood packets
0 output packet drops
• All vNICs are programmed as vethernet interfaces in NX-OS
• Port description shows the vNIC name and server that it belongs to
Uplink Pinning
Which uplink is being used?
Understanding how pinning works…
Basic rules to define which interface to pin to:
1. Which uplink interfaces are active?
2. Which uplink interfaces carry ALL of the vNIC’s configured VLANs?
3. Which uplink currently has the fewest VIFs pinned to it?
Severity: Major
Code: F0283
Last Transition Time: 2014-02-18T23:08:51.270
ID: 1157440
Status: None
Description: ether VIF 1369 on server 6 / 4 of switch B down, reason: ENM source pinning failed
Affected Object: sys/chassis-6/blade-4/fabric-B/path-1/vc-1369
Name: Dcx Vc Down
Cause: Link Down
Type: Network
Acknowledged: No
Occurrences: 7
Creation Time: 2014-02-11T12:57:11.768
Original Severity: Major
Previous Severity: Cleared
Highest Severity: Major
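The three pinning rules can be expressed as a short selection function. This is a simplified model for reasoning about faults like F0283, not the Fabric Interconnect's actual implementation: the `Uplink` class, the VLAN sets, and the VIF counts are illustrative assumptions, and pin groups and tie-breaking are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Uplink:
    name: str
    active: bool
    vlans: frozenset      # VLANs allowed on this uplink
    pinned_vifs: int = 0  # VIFs currently pinned to this uplink


def choose_uplink(uplinks, vnic_vlans):
    """Apply the three rules; None models an 'ENM source pinning failed' case."""
    wanted = set(vnic_vlans)
    # Rule 1: uplink must be active. Rule 2: it must carry ALL vNIC VLANs.
    candidates = [u for u in uplinks if u.active and wanted <= u.vlans]
    if not candidates:
        return None
    # Rule 3: prefer the uplink with the fewest VIFs pinned to it.
    return min(candidates, key=lambda u: u.pinned_vifs)


# Hypothetical uplinks loosely modeled on the lab (Po1 and Eth1/18).
uplinks = [
    Uplink("Po1", True, frozenset({1, 211}), pinned_vifs=14),
    Uplink("Eth1/18", True, frozenset({1}), pinned_vifs=0),
]

print(choose_uplink(uplinks, {211}).name)
print(choose_uplink(uplinks, {211, 999}))
```

In this model, a vNIC that carries a VLAN not present on any single active uplink gets no pin target, which matches the usual reason behind a pinning failure fault: a VLAN configured on the vNIC but missing from the uplinks.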
Which uplink is being used?
Uplink pinning commands… (two common ways to view)
CiscoLive-2019-A(nxos)# show pinning border-interfaces active
--------------------+---------+----------------------------------------
Border Interface Status SIFs
--------------------+---------+----------------------------------------
Po1 Active sup-eth2 Veth4137 Veth4145 Veth4173
Veth4175 Veth4178 Veth4183 Veth4195
Veth4197 Veth4200 Veth4208 Veth4210
Veth4212 Veth4214 Veth4216
Eth1/18 Active
CiscoLive-2019-A(nxos)# show pinning server-interfaces
---------------+-----------------+------------------------+-----------------
SIF Interface Sticky Pinned Border Interface Pinned Duration
---------------+-----------------+------------------------+-----------------
Eth1/1 No - -
Eth1/2 No - -
Eth1/3 No - -
Eth1/4 No - -
Eth1/11 No - -
Eth1/12 No - -
Veth4137 No Po1 1d 58:3:23
Veth4145 No Po1 1d 57:47:47
Veth4173 No Po1 1d 57:54:31
UCS forwarding seems to be working as
expected
What’s upstream?
CiscoLive-2019-A(nxos)# show port-channel summary
Flags: D - Down P - Up in port-channel (members)
I - Individual H - Hot-standby (LACP only)
s - Suspended r - Module-removed
S - Switched R - Routed
U - Up (port-channel)
M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port- Type Protocol Member Ports
Channel
--------------------------------------------------------------------------------
1 Po1(SU) Eth LACP Eth1/31(P) Eth1/32(P)
Reviewing the upstream switches
Are we learning the MAC address?
f241-03-08-5596-a# show mac address-table vlan 211
MAC address not learned for our
Reviewing the upstream switches
Configuration correct?
f241-03-08-5596-a# show run interface ethernet 1/8
What if we don’t learn the MAC address on the FI?
Did we have issues traversing IOM and VIC?
For 1st-3rd Gen FIs…
• Three components left to investigate:
• OS/Driver issues – Did the OS actually send the frame northbound?
• VIC Adapter
• IOM (NIF and HIF ports)
Cisco VIC Adapter
Connecting and identifying logical interfaces…
CiscoLive-2019-A# connect adapter 1/1/1
adapter 1/1/1 # connect
Cisco VIC Adapter
Viewing counters for drops and errors…
adapter 1/1/1 (mcp):28# lifstats -a 4
DELTA TOTAL DESCRIPTION
0 0 Tx unicast frames without error
0 0 Tx multicast frames without error
0 0 Tx broadcast frames without error
0 0 Tx unicast bytes without error
0 0 Tx multicast bytes without error
0 0 Tx broadcast bytes without error
0 0 Tx frames dropped
0 0 Tx frames with error
• Tx would mean we
We don’t see any issues on the adapter…
IOM Troubleshooting
CiscoLive-2019-A# connect iom 1
fex-1# show platform software woodside sts
Board Status Overview: Uplink #: 1 2 3 4 5 6 7 8
legend: Link status: | | | |
' '= no-connect +-+--+--+--+--+--+--+--+-+
X = Failed SFP: [$][$][$][$][ ][ ][ ][ ]
- = Disabled +-+--+--+--+--+--+--+--+-+
: = Dn | N N N N N N N N |
| = Up | I I I I I I I I |
[$] = SFP present | 0 1 2 3 4 5 6 7 |
[ ] = SFP not present | |
[X] = SFP validation failed | NI (0-7) |
------------------------------ +------------+-----------+
|
+-------------------------+-------------+-------------+---------------------------+
| | | |
+------------+-----------+ +-----------+------------+ +------------+-----------+ +-------------+----------+
| HI (0-7) | | HI (8-15) | | HI (16-23) | | HI (24-31) |
| | | | | | | |
| H H H H H H H H | | H H H H H H H H | | H H H H H H H H | | H H H H H H H H |
| I I I I I I I I | | I I I I I I I I | | I I I I I I I I | | I I I I I I I I |
| 0 1 2 3 4 5 6 7 | | 8 9 1 1 1 1 1 1 | | 1 1 1 1 2 2 2 2 | | 2 2 2 2 2 2 3 3 |
| | | 0 1 2 3 4 5 | | 6 7 8 9 0 1 2 3 | | 4 5 6 7 8 9 0 1 |
+-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+
[ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ]
+-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+
- - | | - | - | | | | | - | | |
1 1 1 1 1 1 1 9 8 7 6 5 4 3 2 1
6 5 4 3 2 1 0
\__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/
blade8 blade7 blade6 blade5 blade4 blade3 blade2 blade1
IOM Troubleshooting
fex-1# show platform software {tiburon/woodside} rmon 0 hi31
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+
| TX | Current | Diff | RX | Current | Diff |
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+
| TX_PKT_LT64 | 0| 0| RX_PKT_LT64 | 0| 0|
| TX_PKT_64 | 0| 0| RX_PKT_64 | 386| 15|
| TX_PKT_65 | 379| 15| RX_PKT_65 | 13| 0|
| TX_PKT_128 | 8| 0| RX_PKT_128 | 754| 75|
| TX_PKT_256 | 717| 51| RX_PKT_256 | 0| 0|
| TX_PKT_512 | 12| 0| RX_PKT_512 | 22| 4|
| TX_PKT_1024 | 0| 0| RX_PKT_1024 | 0| 0|
| TX_PKT_1519 | 24| 0| RX_PKT_1519 | 0| 0|
| TX_PKT_2048 | 0| 0| RX_PKT_2048 | 0| 0|
| TX_PKT_4096 | 0| 0| RX_PKT_4096 | 0| 0|
| TX_PKT_8192 | 0| 0| RX_PKT_8192 | 0| 0|
| TX_PKT_GT9216 | 0| 0| RX_PKT_GT9216 | 0| 0|
| TX_PKTTOTAL | 1140| 66| RX_PKTTOTAL | 1175| 94|
| TX_OCTETS | 341435| 20207| RX_OCTETS | 163687| 15984|
| TX_PKTOK | 1140| 66| RX_PKTOK | 1175| 94|
| TX_UCAST | 384| 15| RX_UCAST | 588| 55|
| TX_MCAST | 756| 51| RX_MCAST | 543| 38|
| TX_BCAST | 0| 0| RX_BCAST | 44| 1|
| TX_VLAN | 0| 0| RX_VLAN | 0| 0|
| TX_PAUSE | 0| 0| RX_PAUSE | 0| 0|
| TX_USER_PAUSE | 0| 0| RX_USER_PAUSE | 0| 0|
| TX_FRM_ERROR | 0| 0| | | |
| | | | RX_OVERSIZE | 0| 0|
| | | | RX_TOOLONG | 0| 0|
| | | | RX_DISCARD | 0| 0|
| | | | RX_UNDERSIZE | 0| 0|
| | | | RX_FRAGMENT | 0| 0|
| | | | RX_CRC_NOT_STOMPED | 0| 0|
| | | | RX_CRC_STOMPED | 0| 0|
| TX_OCTETSOK | 341435| 20207| RX_OCTETSOK | 163687| 15984|
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+
IOM Troubleshooting
Do we have errors on the NIF or HIF ports…
fex-1# show platform software woodside loss
+-------+-------------------------------------+------------+-+-----------------------------------+---------------------------------------+
| | | | | | |
| | | | | | frm_to |
| | |Port Extra | | +---------------------------------------|
| | RMON | Drop |S| SS Loss Counters | COS | XOFF |
| +------------+-----------+------------+------------|S|-----------+-----------+-----------+---------------------------------------|
| Port | Tx Pause | Rx Pause | Errors | Counters |x| RX SS | Tx SS | SS Total |0 |1 |2 |3 |4 |5 |6 |7 |0 |1 |
+-------+------------+-----------+------------+------------+-+-----------+-----------+-----------+---+---+---+---+---+---+---+---+---+---+
| 0- NI3| 0| 59896| 0| 7|0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |1| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |2| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |3| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |4| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |5| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |6| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |7| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
+-------+-------------------------------------+------------+-+-----------+-----------+-----------+---+---+---+---+---+---+---+---+---+---+
| 0-HI27| 770| 0| 0| 0|0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |1| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |2| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |3| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |4| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |5| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |6| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
| | | | | |7| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
+-------+-------------------------------------+------------+-+-----------+-----------+-----------+---+---+---+---+---+---+---+---+---+---+
Troubleshooting Forwarding Issues
4th Gen FI - ELAM
ELAM
(Embedded Logic Analyzer Module)
• New to UCS – added on 4th Gen Fabric Interconnect
ELAM
Example 1 – UCS Server sends ping to SVI on upstream switch
SVI
VLAN 211
14.17.211.250
Nexus 5548
UCS 6454
Blade IP
14.17.211.31
UCS ELAM – Walkthrough on CLI
Connect to the switching software (NX-OS) for the Fabric Interconnect and then attach
to the hardware module:
F241-03-09-UCSFabric-6454-1-A# connect nxos a
F241-03-09-UCSFabric-6454-1-A(nx-os)# attach module 1
Note – in the 6400 Fabric Interconnect, the ASIC and SLICE will always be 0.
ELAM – Configuring Filter Criteria
For this example, we only capture traffic destined for IPv4 14.17.211.250
module-1(TAH-elam-insel6)# set outer ipv4 dst_ip 14.17.211.250
ELAM Capture Report – No Drops
module-1(TAH-elam-insel6)# report
HOMEWOOD ELAM REPORT SUMMARY
slot - 1, asic - 0, slice - 0
============================
Incoming Interface: Eth1/11
Src Idx : 0x1000, Src BD : 211
Outgoing Interface Info: dmod 1, dpid 3
Dst Idx : 0x602, Dst BD : 211
Packet Type: IPv4
Dst MAC address: 00:2A:6A:35:4A:41
Src MAC address: 00:0C:29:D7:3C:89
.1q Tag0 VLAN: 211, cos = 0x0
Dst IPv4 address: 14.17.211.250
Src IPv4 address: 14.17.211.31
Ver = 4, DSCP = 0, Don't Fragment = 0
Proto = 1, TTL = 128, More Fragments = 0
Hdr len = 20, Pkt len = 60, Checksum = 0x1ac8
L4 Protocol : 1
ICMP type : 8
ICMP code : 0
Drop Info:
----------
LUA:
LUB:
LUC:
LUD:
Final Drops:
• Incoming Interface = ingress port on the FI
• Src BD = BD stands for Bridge Domain (usually 1-to-1 mapping to a VLAN)
• Outgoing Interface = egress port on the FI
• DPID = Destination Port ID
• Dst/Src MAC address = source and destination MAC
• .1q Tag0 = CoS value and VLAN tag info
• Dst/Src IPv4 address = IP address information
• Drop Info = Drop Information section; will display reason for drops if applicable
ELAM – DPID
F241-03-09-UCSFabric-6454-1-A(nx-os)# show interface hardware-mappings
Legends:
SMod - Source Mod. 0 is N/A
Unit - Unit on which port resides. N/A for port channels
HPort - Hardware Port Number or Hardware Trunk Id:
HName - Hardware port name. None means N/A
FPort - Fabric facing port number. 255 means N/A
NPort - Front panel port number
VPort - Virtual Port Number. -1 means N/A
Slice - Slice Number. N/A for BCM systems
SPort - Port Number wrt Slice. N/A for BCM systems
SrcId - Source Id Number. N/A for BCM systems
------------------------------------------------------------------------
Name Ifindex Smod Unit HPort FPort NPort VPort Slice SPort SrcId
------------------------------------------------------------------------
Interface long name is Ethernet1/1
Eth1/1 1a000000 1 0 16 255 0 -1 0 16 32
Interface long name is Ethernet1/2
Eth1/2 1a000200 1 0 17 255 4 -1 0 17 34
Interface long name is Ethernet1/17
Eth1/17 1a002000 1 0 0 255 64 -1 0 0 0
Interface long name is Ethernet1/20
Eth1/20 1a002600 1 0 3 255 76 -1 0 3 6
• SPort 3 = dpid 3
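The dpid-to-interface lookup can be scripted when many ports are involved. The sketch below assumes the 11-column layout of `show interface hardware-mappings` shown above and keys only on SPort, relying on the earlier note that ASIC and slice are always 0 on the 6400 Fabric Interconnect; the function name is hypothetical.

```python
HARDWARE_MAPPINGS = """\
Name Ifindex Smod Unit HPort FPort NPort VPort Slice SPort SrcId
Interface long name is Ethernet1/1
Eth1/1 1a000000 1 0 16 255 0 -1 0 16 32
Interface long name is Ethernet1/2
Eth1/2 1a000200 1 0 17 255 4 -1 0 17 34
Interface long name is Ethernet1/17
Eth1/17 1a002000 1 0 0 255 64 -1 0 0 0
Interface long name is Ethernet1/20
Eth1/20 1a002600 1 0 3 255 76 -1 0 3 6
"""


def sport_to_interface(mappings_text):
    """Build {SPort: interface name} from the hardware-mappings table."""
    table = {}
    for line in mappings_text.splitlines():
        fields = line.split()
        # Data rows have 11 columns and start with the short interface name.
        if len(fields) == 11 and fields[0].startswith("Eth"):
            table[int(fields[9])] = fields[0]  # column 10 is SPort
    return table


ports = sport_to_interface(HARDWARE_MAPPINGS)
print(ports[3])  # egress interface for "dpid 3" in the ELAM report
```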
ELAM – Drop Scenario
ELAM Walkthrough
Example 2 – Understanding why packets are not forwarded
SVI
VLAN 211
14.17.211.250
Nexus 5548
UCS 6454
Blade IP
14.17.211.31
ELAM – Configuring Filter Criteria
Same parameters as previous ELAM capture
F241-03-09-UCSFabric-6454-1-A# connect nxos a
F241-03-09-UCSFabric-6454-1-A(nx-os)# attach module 1
ELAM Capture Report – Drop Info
module-1(TAH-elam-insel6)# report
L4 Protocol : 1
ICMP type : 8
ICMP code : 0
Drop Info:
----------
LUA:
LUB:
LUC:
LUD:
SRC_VLAN_MBR
Final Drops:
SRC_VLAN_MBR
NOTE: When it comes to confirming whether the packet is actually being dropped, the "Final Drops" field is the ONLY one to consider.
• Drop Information section: this time it shows SRC_VLAN_MBR as the reason. This indicates an issue with VLAN membership on the source packet.
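Because only the "Final Drops" field is authoritative, a script that checks ELAM reports should read that field and ignore the LUA to LUD lines. The following sketch assumes the text layout shown on these slides, with drop reasons listed on the lines after "Final Drops:"; the function name is hypothetical.

```python
NO_DROP_REPORT = """\
Drop Info:
----------
LUA:
LUB:
LUC:
LUD:
Final Drops:
"""

DROP_REPORT = """\
Drop Info:
----------
LUA:
LUB:
LUC:
LUD:
SRC_VLAN_MBR
Final Drops:
SRC_VLAN_MBR
"""


def final_drops(report_text):
    """Return the drop reasons listed under 'Final Drops:' (empty if none)."""
    reasons = []
    in_final = False
    for line in report_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Final Drops:"):
            in_final = True
            # Reasons may share the line with the label or follow it.
            reasons.extend(stripped[len("Final Drops:"):].split())
            continue
        if in_final:
            if not stripped:
                break  # a blank line ends the section
            reasons.extend(stripped.split())
    return reasons


print(final_drops(NO_DROP_REPORT))
print(final_drops(DROP_REPORT))
```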
UCS ELAM – Remove VLAN
Networking Troubleshooting Summary
Quick recap…
• Keep it simple – UCS Networking is all Layer 2
The Story with Storage
FC and FCoE with UCS
FC Switching Modes
Different from Ethernet modes!
• End Host Mode
• Default Mode for FIs (NPV)
• Requires NPIV enabled device upstream
• Switch Mode
• Most common use - Direct Attached
Storage
Fibre Channel Port Types
• ‘N’ port: Node ports used to connect devices to switched fabric or point to point configurations.
N N
• ‘F’ port: Fabric ports residing on switches connecting ‘N’ port devices
N F
• ‘E’ port: Expansion ports are essentially trunk ports used to connect two Fibre Channel switches
E E
What Is NPIV?
• N-Port ID Virtualization (NPIV) provides a means to assign multiple FCIDs to a single N_Port
• Without NPIV, FC hands out only a single FCID per F_Port, so an F_Port can accept only a single FLOGI
• Allows multiple applications to share the same Fibre Channel adapter port
• Main use case is virtualization
[Diagram labels: Application Server, NPV Switch, FC NPIV Core Switch, F-Port, N-Port, Eth1/3, Server3, N_Port_ID 3]
SAN “End Host” NPV Mode
• Fabric Interconnects in NPV (N Port Virtualization) mode
• Fabric Interconnect operates in N_Port Proxy (NP) mode
• SAN switch sees the Fabric Interconnect as an FC End Host with many N_Ports and many FC IDs assigned
• Server-facing ports function as F-proxy ports
• Server vHBA is pinned to an FC uplink in the same VSAN, with round-robin selection
• Provides multiple FC end nodes to one F_Port off an FC switch
[Diagram labels: SAN A and SAN B (NPIV, F_Port, VSAN 1); FLOGI/FDISC; Fabric Interconnects 6100-A and 6100-B with N_Proxy (NP) uplinks and F_Proxy ports vFC 1 and vFC 2; Server 1 and Server 2 (VSAN 1), each with N_Port vHBA0 and vHBA1]
SAN FC Switch Mode
• UCS Fabric Interconnect behaves like an FC fabric switch
[Diagram labels: FC, FCoE, N_Port, vHBA0/vHBA1, Server 1 (VSAN 1), Server 2 (VSAN 2)]
FC Boot Configuration
FC Boot - Topology
Traditional deployment of UCS in End Host Mode
[Figure: Cisco UCS-FI-6332 Fabric Interconnect (ports 1-32) and UCS 5108 chassis (slots 1-8)]
FC Boot - UCS Configuration
Boot from SAN requirements
vHBA
Boot Policy
Important Settings
• vHBA Name
• WWPN of Target
• Boot LUN ID
FC Boot - UCS Verification
CiscoLive-2019-A# show service-profile circuit server 1/7
Service Profile: asamplin/liveTest
Server: 1/7
Fabric ID: A
Path ID: 1
VIF vNIC Link State Oper State Prot State Prot Role Admin Pin Oper Pin Transport
---------- --------------- ----------- ---------- ------------- ----------- ---------- ---------- ---------
12430 Up Active No Protection Unprotected 0/0/0 0/0/0 Ether
4228 eth0 Offline Unknown No Protection Unprotected 0/0/0 0/0/0 Ether
4238 vhba1 Up Active No Protection Unprotected 0/0/0 2/0/16 Fc
FC Boot - Adapter Programming
Option ROM Programmed Correctly!
FC Boot - We can see the LUN!
Verify that OS installer can see the boot LUN
Troubleshooting FC Boot
FC Boot Troubleshooting – Lunlist Command
Commands entered are bolded
CiscoLive-2019-A# connect adapter 1/7/1
adapter 1/7/1 # connect
No entry for terminal type "dumb";
using dumb terminal settings.
adapter 1/7/1 (top):1# attach-fls
No entry for terminal type "dumb";
using dumb terminal settings.
adapter 1/7/1 (fls):1# lunlist
vnic : 15 lifid: 5
- FLOGI State : flogi est (fc_id 0x6e0051)
- PLOGI Sessions
- WWNN 50:0a:09:83:87:49:80:24 WWPN 50:0a:09:83:87:49:80:24 fc_id 0x450000
- LUN's configured (SCSI Type, Version, Vendor, Serial No.)
LUN ID : 0x0000000000000000 (0x0, 0x4, NETAPP , Hn/ZR40PU7K9)
- REPORT LUNs Query Response
LUN ID : 0x0000000000000000
- Nameserver Query Response
- WWPN : 50:0a:09:83:87:49:80:24
- WWPN : 50:0a:09:81:87:49:80:24
Lunlist – Step 1: Is the vHBA FLOGI present?
Let’s break it down!
Lunlist – Step 2: Is zoning correctly configured?
Lunlist – Verify Zoning on Upstream Switch
Similar to ACLs for Ethernet, default behavior is Deny
The zone configured on the upstream switch must have the following WWPNs:
f241-03-08-5596-a# show zoneset name netapp1-1000 vsan 1000
zoneset name netapp1-1000 vsan 1000
Lunlist – Step 3: What is my boot LUN ID?
adapter 1/7/1 (fls):1# lunlist
vnic : 15 lifid: 5
- FLOGI State : flogi est (fc_id 0x6e0051)
- PLOGI Sessions
- WWNN 50:0a:09:83:87:49:80:24 WWPN 50:0a:09:83:87:49:80:24 fc_id 0x450000
- LUN's configured (SCSI Type, Version, Vendor, Serial No.)
LUN ID : 0x0000000000000000 (0x0, 0x4, NETAPP , Hn/ZR40PU7K9)
- REPORT LUNs Query Response
LUN ID : 0x0000000000000000
- Nameserver Query Response
- WWPN : 50:0a:09:83:87:49:80:24
- WWPN : 50:0a:09:81:87:49:80:24
• Setting from UCS Boot Policy
• Returned from Storage Array
• All of these settings are defined in boot policy!
FC Boot Troubleshooting – Lunlist Command
Nonworking output
CiscoLive-2019-A# connect adapter 1/7/1
adapter 1/7/1 # connect
No entry for terminal type "dumb";
using dumb terminal settings.
adapter 1/7/1 (top):1# attach-fls
No entry for terminal type "dumb";
using dumb terminal settings.
adapter 1/7/1 (fls):1# lunlist
vnic : 15 lifid: 5
- FLOGI State : flogi est (fc_id 0x6e0051)
- PLOGI Sessions
- WWNN 50:0a:09:83:87:49:80:24 WWPN 50:0a:09:83:87:49:80:24 fc_id 0x000000
- LUN's configured (SCSI Type, Version, Vendor, Serial No.)
LUN ID : 0x0000000000000000 access failure
- REPORT LUNs Query Response
- Nameserver Query Response
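The lunlist steps can be turned into a quick triage script. This is a heuristic sketch, not a Cisco tool: it assumes the text layout shown in the working and nonworking captures above, and the interpretation attached to each finding (for example, that an empty nameserver response points at zoning) follows the steps on these slides rather than any documented guarantee.

```python
import re

NONWORKING = """\
vnic : 15 lifid: 5
- FLOGI State : flogi est (fc_id 0x6e0051)
- PLOGI Sessions
- WWNN 50:0a:09:83:87:49:80:24 WWPN 50:0a:09:83:87:49:80:24 fc_id 0x000000
- LUN's configured (SCSI Type, Version, Vendor, Serial No.)
LUN ID : 0x0000000000000000 access failure
- REPORT LUNs Query Response
- Nameserver Query Response
"""

WORKING = """\
vnic : 15 lifid: 5
- FLOGI State : flogi est (fc_id 0x6e0051)
- PLOGI Sessions
- WWNN 50:0a:09:83:87:49:80:24 WWPN 50:0a:09:83:87:49:80:24 fc_id 0x450000
- LUN's configured (SCSI Type, Version, Vendor, Serial No.)
LUN ID : 0x0000000000000000 (0x0, 0x4, NETAPP , Hn/ZR40PU7K9)
- REPORT LUNs Query Response
LUN ID : 0x0000000000000000
- Nameserver Query Response
- WWPN : 50:0a:09:83:87:49:80:24
- WWPN : 50:0a:09:81:87:49:80:24
"""


def diagnose_lunlist(output):
    """Return a list of human-readable findings; empty means no issue spotted."""
    findings = []
    # Step 1: is the vHBA FLOGI present?
    if "flogi est" not in output:
        findings.append("FLOGI not established: check vHBA, VSAN and uplink")
    # Step 2: did the target answer? An all-zero fc_id means no PLOGI session.
    plogi_ids = re.findall(r"WWPN \S+ fc_id (0x[0-9a-fA-F]+)", output)
    if not plogi_ids or any(int(fcid, 16) == 0 for fcid in plogi_ids):
        findings.append("no PLOGI session to target: check zoning and target WWPN")
    parts = output.split("Nameserver Query Response", 1)
    ns_wwpns = re.findall(r"WWPN\s*:\s*(\S+)", parts[1]) if len(parts) == 2 else []
    if not ns_wwpns:
        findings.append("nameserver returned no targets: check zoning")
    # Step 3: is the boot LUN reachable with the LUN ID from the boot policy?
    if "access failure" in output:
        findings.append("boot LUN access failure: check LUN ID and LUN masking")
    return findings


for finding in diagnose_lunlist(NONWORKING):
    print(finding)
```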
FC User Story
I added links but no extra bandwidth!
TAC Case Example
• Hosts reporting high storage latency
• Customer added 2 additional FC uplinks (tripling bandwidth)
• Customer reported to TAC that nothing changed!
[Diagram: FC Switch and Fabric Interconnect; newly added uplinks in black]
FC Uplink Behavior
Individual FC Uplinks
• Servers pin to individual uplinks
• No load balancing
[Diagram labels: FC Switch, Fabric Interconnect, FDISC, fcid1 to fcid6]
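To picture what "servers pin to individual uplinks" means, the sketch below models round-robin pinning of vHBAs to FC uplinks in the same VSAN. It is a toy model under stated assumptions: the vHBA and uplink names are hypothetical, and the real Fabric Interconnect selection logic may differ in detail.

```python
from itertools import cycle


def pin_vhbas(vhbas, uplinks):
    """Pin each (vhba, vsan) to one uplink in the same VSAN, round robin."""
    by_vsan = {}
    for name, vsan in uplinks:
        by_vsan.setdefault(vsan, []).append(name)
    pickers = {vsan: cycle(names) for vsan, names in by_vsan.items()}
    pinning = {}
    for name, vsan in vhbas:
        # A vHBA whose VSAN is on no uplink cannot be pinned at all.
        pinning[name] = next(pickers[vsan]) if vsan in pickers else None
    return pinning


# Hypothetical names for illustration only.
vhbas = [("server1-vhba0", 1), ("server2-vhba0", 1), ("server3-vhba0", 1)]
uplinks = [("fc1/1", 1), ("fc1/2", 1)]

for vhba, uplink in pin_vhbas(vhbas, uplinks).items():
    print(vhba, "->", uplink)
```

Each vHBA ends up on exactly one uplink, so a single busy server is limited to the bandwidth of the uplink it is pinned to, regardless of how many uplinks exist.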
Note on FC Port Channels
Multiple VSANs Optional
FC Port Channel Verification
Which VSANs are active on my port channel?
CiscoLive-2019-B(nxos)# show int san 44
san-port-channel 44 is trunking
Hardware is Fibre Channel
Port WWN is 24:2c:54:7f:ee:c5:6c:c0
Admin port mode is NP, trunk mode is on
snmp link state traps are enabled
Port mode is TNP
Port vsan is 1001
Speed is 16 Gbps
Trunk vsans (admin allowed and active) (1,10,200-203,888,1000-1001)
Trunk vsans (up) (201,1001)
Trunk vsans (isolated) (10,200,202,888,1000)
Trunk vsans (initializing) (1,203)
1 minute input rate 13560 bits/sec, 1695 bytes/sec, 4 frames/sec
1 minute output rate 7480 bits/sec, 935 bytes/sec, 4 frames/sec
1940486 frames input, 2478434904 bytes
83 discards, 0 errors
0 CRC, 0 unknown class
0 too long, 0 too short
736055 frames output, 89418044 bytes
0 discards, 0 errors
4 input OLS, 4 LRR, 3 NOS, 0 loop inits
10 output OLS, 2 LRR, 0 NOS, 0 loop inits
last clearing of "show interface" counters never
Member[1] : fc2/15
Member[2] : fc2/16
Interface last changed at Sun May 14 20:03:28 2017
• 201 and 1001 are up because they are the only vHBAs with active FLOGI into FI
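The trunk VSAN lists are compact ranges, which makes them awkward to compare by eye. As a small sketch (function names are hypothetical), the ranges can be expanded into sets to see which allowed VSANs are up, isolated, or still initializing.

```python
import re

SAN_PORT_CHANNEL = """\
Trunk vsans (admin allowed and active) (1,10,200-203,888,1000-1001)
Trunk vsans (up) (201,1001)
Trunk vsans (isolated) (10,200,202,888,1000)
Trunk vsans (initializing) (1,203)
"""


def expand_vsans(spec):
    """Expand '1,10,200-203' into the set {1, 10, 200, 201, 202, 203}."""
    vsans = set()
    for part in spec.strip("() ").split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            vsans.update(range(int(low), int(high) + 1))
        else:
            vsans.add(int(part))
    return vsans


def trunk_vsan_states(show_output):
    """Return {state label: set of VSAN IDs} from 'Trunk vsans' lines."""
    states = {}
    pattern = r"Trunk vsans \(([^)]+)\)\s+\(([^)]*)\)"
    for match in re.finditer(pattern, show_output):
        states[match.group(1)] = expand_vsans(match.group(2))
    return states


states = trunk_vsan_states(SAN_PORT_CHANNEL)
print(sorted(states["up"]))
print(sorted(states["admin allowed and active"] - states["up"]))
```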
Troubleshooting FC in UCS
How to Check for Possible FC Issues within UCS
Because UCS normally runs in End Host mode, congestion is hard to find
• Storage traffic within UCS is FCoE
Is there a hardware issue?
Check the IOM ports from NX-OS…
• z = port
Eth1/1/5 0 0 0 0 0 0
Eth1/1/6 0 0 0 0 0 0
Eth1/1/7 0 0 0 0 0 0
Eth1/1/8 0 0 0 0 0 0
Eth1/1/9 0 0 0 0 0 0
Eth1/1/10 0 0 0 0 0 0
Eth1/1/11 0 0 0 0 0 0
Eth1/1/12 0 0 0 0 0 0
Eth1/1/13 0 0 0 0 0 0
Eth1/1/14 0 0 0 0 0 0
Eth1/1/15 0 0 0 0 0 0
Eth1/1/16 0 0 0 0 0 0
Eth1/1/17 0 0 0 0 0 0
IOM Troubleshooting
Example where errors on IOM are indicating issues downstream…
--------------------------------------------------------------------------------
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
--------------------------------------------------------------------------------
Eth1/6 0 103 0 103 0 0
Eth1/21 0 103 0 103 0 0
Po1027 0 206 0 206 0 0
Po1351 0 207 0 207 0 0
Eth3/1/1 0 0 0 0 0 0
Eth3/1/2 0 0 0 0 0 0
Eth3/1/3 0 0 0 0 0 0
Eth3/1/4 0 207 0 207 0 0
• Eth1/6, Eth1/21: uplink interfaces rcvd bad frames
• Po1027: uplink port-channel counters
• Po1351: Adapter-IOM port-channel
• Eth3/1/1 to Eth3/1/4: HIF ports on IOM
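When the counter table is long, filtering for nonzero FCS errors on HIF ports narrows the search quickly. The sketch below assumes the seven-column error table shown above and treats three-part interface names (such as Eth3/1/4) as HIF ports, as on this slide; the function names are hypothetical.

```python
ERROR_COUNTERS = """\
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
Eth1/6 0 103 0 103 0 0
Eth1/21 0 103 0 103 0 0
Po1027 0 206 0 206 0 0
Po1351 0 207 0 207 0 0
Eth3/1/1 0 0 0 0 0 0
Eth3/1/2 0 0 0 0 0 0
Eth3/1/3 0 0 0 0 0 0
Eth3/1/4 0 207 0 207 0 0
"""


def fcs_errors(table_text):
    """Return {port: FCS-Err count} for ports with a nonzero FCS-Err column."""
    bad = {}
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows: port name followed by six numeric counters.
        if len(fields) != 7 or not fields[1].isdigit():
            continue
        fcs = int(fields[2])
        if fcs > 0:
            bad[fields[0]] = fcs
    return bad


def is_hif(port):
    """HIF ports appear with a three-part name, e.g. Eth3/1/4."""
    return port.startswith("Eth") and port.count("/") == 2


suspect_hifs = [port for port in fcs_errors(ERROR_COUNTERS) if is_hif(port)]
print(suspect_hifs)
```

Here the only HIF port with errors is Eth3/1/4, which is the interface to map back to a blade.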
Which server is seeing an issue?
Determine interface association
The Host Interface (HIF) is now known. You can use this
info to determine affected blade
IOM Troubleshooting
Reviewing HIF ports to isolate blades impacted
fex-1# show platform software woodside sts
Board Status Overview: Uplink #: 1 2 3 4 5 6 7 8
legend: Link status: | | | |
' '= no-connect +-+--+--+--+--+--+--+--+-+
X = Failed SFP: [$][$][$][$][ ][ ][ ][ ]
- = Disabled +-+--+--+--+--+--+--+--+-+
: = Dn | N N N N N N N N |
| = Up | I I I I I I I I |
[$] = SFP present | 0 1 2 3 4 5 6 7 |
[ ] = SFP not present | |
[X] = SFP validation failed | NI (0-7) |
------------------------------ +------------+-----------+
|
+-------------------------+-------------+-------------+---------------------------+
| | | |
+------------+-----------+ +-----------+------------+ +------------+-----------+ +-------------+----------+
| HI (0-7) | | HI (8-15) | | HI (16-23) | | HI (24-31) |
| | | | | | | |
| H H H H H H H H | | H H H H H H H H | | H H H H H H H H | | H H H H H H H H |
| I I I I I I I I | | I I I I I I I I | | I I I I I I I I | | I I I I I I I I |
| 0 1 2 3 4 5 6 7 | | 8 9 1 1 1 1 1 1 | | 1 1 1 1 2 2 2 2 | | 2 2 2 2 2 2 3 3 |
| | | 0 1 2 3 4 5 | | 6 7 8 9 0 1 2 3 | | 4 5 6 7 8 9 0 1 |
+-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+
[ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ]
+-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+
- - | | - | - | | | | | | | | |
1 1 1 1 1 1 1 9 8 7 6 5 4 3 2 1
6 5 4 3 2 1 0
\__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/
blade8 blade7 blade6 blade5 blade4 blade3 blade2 blade1
Understanding PAUSE Frames
What are PAUSE Frames?
• Storage traffic needs to be lossless, so PAUSE frames are used so frames
are not dropped
• PAUSE frames are used in FCoE and allow an interface to send a request
for a short pause in frame transmission to avoid drops
• Can be a sign of an issue, but not always…
• Under normal operations, we would expect PAUSE frames to increment
• Requires detailed review – Remember the numbers are relative!
Do we have congestion in our UCS?
CiscoLive-2019(nx-os)# show interface priority-flow-control
============================================================
Port Mode Oper(VL bmap) RxPPP TxPPP
============================================================
IOM Troubleshooting
Reviewing HIF ports to isolate blades impacted
fex-1# show platform software woodside sts
Board Status Overview: Uplink #: 1 2 3 4 5 6 7 8
legend: Link status: | | | |
' '= no-connect +-+--+--+--+--+--+--+--+-+
X = Failed SFP: [$][$][$][$][ ][ ][ ][ ]
- = Disabled +-+--+--+--+--+--+--+--+-+
: = Dn | N N N N N N N N |
| = Up | I I I I I I I I |
[$] = SFP present | 0 1 2 3 4 5 6 7 |
[ ] = SFP not present | |
[X] = SFP validation failed | NI (0-7) |
------------------------------ +------------+-----------+
|
+-------------------------+-------------+-------------+---------------------------+
| | | |
+------------+-----------+ +-----------+------------+ +------------+-----------+ +-------------+----------+
| HI (0-7) | | HI (8-15) | | HI (16-23) | | HI (24-31) |
| | | | | | | |
| H H H H H H H H | | H H H H H H H H | | H H H H H H H H | | H H H H H H H H |
| I I I I I I I I | | I I I I I I I I | | I I I I I I I I | | I I I I I I I I |
| 0 1 2 3 4 5 6 7 | | 8 9 1 1 1 1 1 1 | | 1 1 1 1 2 2 2 2 | | 2 2 2 2 2 2 3 3 |
| | | 0 1 2 3 4 5 | | 6 7 8 9 0 1 2 3 | | 4 5 6 7 8 9 0 1 |
+-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+
[ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ] [ ][ ][ ][ ][ ][ ][ ][ ]
+-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+ +-+--+--+--+--+--+--+--+-+
- - | | - | - | | | | | - | | |
1 1 1 1 1 1 1 9 8 7 6 5 4 3 2 1
6 5 4 3 2 1 0
\__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/ \__\__/__/
blade8 blade7 blade6 blade5 blade4 blade3 blade2 blade1
IOM Troubleshooting
fex-1# show platform software woodside rmon 0 hi31
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+
| TX | Current | Diff | RX | Current | Diff |
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+
| TX_PKT_LT64 | 0| 0| RX_PKT_LT64 | 0| 0|
| TX_PKT_64 | 0| 0| RX_PKT_64 | 19985| 0|
| TX_PKT_65 | 30235045| 0| RX_PKT_65 | 22468| 0|
| TX_PKT_128 | 713668| 0| RX_PKT_128 | 46488| 0|
| TX_PKT_256 | 26427672| 2| RX_PKT_256 | 19112| 0|
| TX_PKT_512 | 6425| 0| RX_PKT_512 | 5996| 0|
| TX_PKT_1024 | 12184| 0| RX_PKT_1024 | 21769| 0|
| TX_PKT_1519 | 2690146| 0| RX_PKT_1519 | 106682| 0|
| TX_PKT_2048 | 33075| 0| RX_PKT_2048 | 0| 0|
| TX_PKT_4096 | 0| 0| RX_PKT_4096 | 0| 0|
| TX_PKT_8192 | 0| 0| RX_PKT_8192 | 0| 0|
| TX_PKT_GT9216 | 0| 0| RX_PKT_GT9216 | 0| 0|
| TX_PKTTOTAL | 60118215| 2| RX_PKTTOTAL | 242500| 0|
| TX_OCTETS | 16177265983| 712| RX_OCTETS | 213575710| 0|
| TX_PKTOK | 60118215| 2| RX_PKTOK | 242500| 0|
| TX_UCAST | 2833669| 0| RX_UCAST | 198047| 0|
| TX_MCAST | 6234374| 0| RX_MCAST | 44299| 0|
| TX_BCAST | 51050172| 2| RX_BCAST | 154| 0|
| TX_VLAN | 0| 0| RX_VLAN | 0| 0|
| TX_PAUSE | 0| 0| RX_PAUSE | 0| 0|
| TX_USER_PAUSE | 0| 0| RX_USER_PAUSE | 37754621| 9843|
| TX_FRM_ERROR | 0| 0| | | |
| | | | RX_DISCARD | 0| 0|
| | | | RX_UNDERSIZE | 0| 0|
| | | | RX_FRAGMENT | 0| 0|
| | | | RX_CRC_NOT_STOMPED | 0| 0|
| | | | RX_CRC_STOMPED | 0| 0|
| | | | RX_INRANGEERR | 0| 0|
| | | | RX_JABBER | 0| 0|
| TX_OCTETSOK | 16177265983| 712| RX_OCTETSOK | 213575710| 0|
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+
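Because the absolute numbers are relative, the counters that are still moving matter most. The sketch below pulls out every counter with a non-zero Diff value from output in this layout. It assumes the Diff column is the change since the command was last run, and the function name `moving_counters` is hypothetical.

```python
# A few rows copied from the rmon output on the slide.
SAMPLE = """\
| TX_PKT_256    | 26427672| 2| RX_PKT_256    | 19112| 0|
| TX_BCAST      | 51050172| 2| RX_BCAST      | 154| 0|
| TX_PAUSE      | 0| 0| RX_PAUSE      | 0| 0|
| TX_USER_PAUSE | 0| 0| RX_USER_PAUSE | 37754621| 9843|
| TX_FRM_ERROR  | 0| 0|               | | |
"""

def moving_counters(rmon_text):
    """Return {counter: (current, diff)} for every counter whose Diff is non-zero."""
    moving = {}
    for line in rmon_text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Each data row holds two (name, current, diff) triples: TX, then RX.
        if len(cells) != 6:
            continue
        for name, current, diff in (cells[0:3], cells[3:6]):
            if name and current.isdigit() and diff.isdigit() and int(diff) > 0:
                moving[name] = (int(current), int(diff))
    return moving

moving = moving_counters(SAMPLE)
for name, (current, diff) in moving.items():
    print(f"{name}: current={current} diff={diff}")
```

For the sample rows, RX_USER_PAUSE is the counter that stands out, which is the value to compare across repeated runs.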
IOM Troubleshooting
Checking the network port between FI and IOM
fex-1# show platform software woodside rmon 0 ni0 | in PAUSE
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+
| TX | Current | Diff | RX | Current | Diff |
+----------------------+----------------------+-----------------+----------------------+----------------------+-----------------+
| PORT CNTRS NI0 |
| TX_PAUSE | 0| 0| RX_PAUSE | 0| 0|
| TX_USER_PAUSE | 1956| 78| RX_USER_PAUSE | 87512| 3564|
[Diagram: UCS 2208XP IOM and its link to the FI]
• RX from FI (coming from upstream)
• TX to FI (sending to upstream)
• TX and RX are from the IOM’s perspective
• … are normal behavior. The number is relative!
Understanding FC Aborts
Common Scenario - FC Aborts Seen in OS Logs
ESXi host – vmkernel.log
2017-09-05T08:14:04.267Z cpu26:2727449)<7>fnic : 0 :: Abort Cmd called FCID 0x450060, LUN 0x61 TAG f3 flags 273
2017-09-05T08:14:04.267Z cpu4:2727482)<7>fnic : 0 :: Abort Cmd called FCID 0x450060, LUN 0x63 TAG f5 flags 273
2017-09-05T08:14:04.270Z cpu11:2605734)<7>fnic : 0 :: abts cmpl recd. id 243 status FCPIO_ABORTED
2017-09-05T08:14:04.270Z cpu26:2727449)<7>fnic : 0 :: Returning from abort cmd type 2 SUCCESS
2017-09-05T08:14:04.270Z cpu28:33545)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.600601602e503900fd54acb15dfae511"
state in doubt; requested fast path state update...
2017-09-05T08:14:04.270Z cpu28:33545)ScsiDeviceIO: 2651: Cmd(0x43a601e05d80) 0xfe, CmdSN 0xaa13f from world 32822 to dev
"naa.600601602e503900fd54acb15dfae511" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2017-09-05T08:14:04.273Z cpu11:2605734)<7>fnic : 0 :: abts cmpl recd. id 245 status FCPIO_ABORTED
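When a vmkernel.log contains many of these entries, it can help to count aborts per FCID and LUN before going further. This is a minimal sketch that assumes the fnic message format shown above; the regular expression and the name `count_aborts` are illustrative, not part of any VMware or Cisco tool.

```python
import re
from collections import Counter

# Matches the "Abort Cmd called" lines in the format shown on the slide.
ABORT_RE = re.compile(
    r"fnic : \d+ :: Abort Cmd called FCID (0x[0-9a-fA-F]+), LUN (0x[0-9a-fA-F]+)"
)

SAMPLE = [
    "2017-09-05T08:14:04.267Z cpu26:2727449)<7>fnic : 0 :: Abort Cmd called FCID 0x450060, LUN 0x61 TAG f3 flags 273",
    "2017-09-05T08:14:04.267Z cpu4:2727482)<7>fnic : 0 :: Abort Cmd called FCID 0x450060, LUN 0x63 TAG f5 flags 273",
    "2017-09-05T08:14:04.270Z cpu11:2605734)<7>fnic : 0 :: abts cmpl recd. id 243 status FCPIO_ABORTED",
]

def count_aborts(lines):
    """Count abort requests per (FCID, LUN) pair."""
    counts = Counter()
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts

counts = count_aborts(SAMPLE)
for (fcid, lun), total in counts.most_common():
    print(f"FCID {fcid} LUN {lun}: {total} abort(s)")
```

If the aborts cluster on one FCID, that narrows the investigation to a single device in the fabric.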
Common Scenario – Check UCS Adapter Logs
What does the mezzanine adapter show?
• OS log showing issues hitting storage
• Investigate adapter logs on UCS
CiscoLive-2019-A# connect adapter 1/1/1
adapter 1/1/1 # connect
adapter 1/1/1 (top):1# show-log
160309-20:25:43.456386 ecom.ecom_main ecom(8:2): abort called for exch 68f1,
status 3 rx_id 8517 s_stat 0x1 xmit_recvd 0x400 burst_offset 0x400 burst_len 0x0
sgl_err 0x0 last_param 0x0 last_seq_cnt 0x0 tot_bytes_exp 0x400 h_seq_cnt 0x0
exch_type 0x1 s_id 0x450020 d_id 0x450060 host_tag 0x58
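The s_id and d_id fields are what you carry to the upstream fabric. The sketch below extracts them from an adapter log entry in the format shown here and splits each 24-bit FCID into its domain, area and port bytes, which is standard Fibre Channel addressing. The names `parse_aborts` and `split_fcid` are hypothetical, and the pattern assumes every abort entry contains both fields.

```python
import re

# Assumes each "abort called" entry carries s_id and d_id, possibly on a wrapped line.
ABORT_RE = re.compile(
    r"abort called for exch (?P<exch>[0-9a-f]+).*?"
    r"s_id (?P<s_id>0x[0-9a-f]+) d_id (?P<d_id>0x[0-9a-f]+)",
    re.DOTALL,
)

SAMPLE_LOG = """\
160309-20:25:43.456386 ecom.ecom_main ecom(8:2): abort called for exch 68f1,
status 3 rx_id 8517 s_stat 0x1 xmit_recvd 0x400 burst_offset 0x400 burst_len 0x0
sgl_err 0x0 last_param 0x0 last_seq_cnt 0x0 tot_bytes_exp 0x400 h_seq_cnt 0x0
exch_type 0x1 s_id 0x450020 d_id 0x450060 host_tag 0x58
"""

def parse_aborts(log_text):
    """Return one dict per abort entry with exch, s_id and d_id."""
    return [match.groupdict() for match in ABORT_RE.finditer(log_text)]

def split_fcid(fcid):
    """Split a 24-bit FCID into (domain, area, port) bytes."""
    value = int(fcid, 16)
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF

aborts = parse_aborts(SAMPLE_LOG)
for entry in aborts:
    domain, area, port = split_fcid(entry["d_id"])
    print(entry, f"destination domain=0x{domain:02x} area=0x{area:02x} port=0x{port:02x}")
```

The domain byte identifies the switch that assigned the FCID, so it is a useful first hint about where in the fabric the destination sits.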
Review Upstream FC Switch
Correlate Source and Destination FCIDs to devices in the fabric
• Source FCID and Dest FCID should be present
Abort 1: s_id 0x450020 d_id 0x450060
Abort 2: s_id 0x450060 d_id 0x6e0051
f241-03-08-5596-a# show fcns database vsan 1000
4th Gen FI: New Commands for FC Troubleshooting
FC Credit Mechanism Basics
• Frames are only transmitted when it is known that the receiver
has buffer space
• For each frame sent, an R_Rdy (B2B Credit) should be returned
• R_Rdys can only be returned once the frame that has previously
occupied that buffer location has been handled
[Diagram: SW1 sends a frame to SW2] Frame sent, B2B Credit –1 on SW1
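The credit rules above can be sketched as a toy model of the transmitting side of one link. This is only an illustration of the bookkeeping, not how a switch implements it; the class name `B2BLink` and the starting credit count are assumptions.

```python
class B2BLink:
    """Toy model of buffer-to-buffer credits as seen by the transmitter (SW1)."""

    def __init__(self, credits):
        self.max_credits = credits   # credits agreed for the link (illustrative value)
        self.credits = credits       # credits currently available to the transmitter

    def send_frame(self):
        """Send one frame if a credit is available; return True if it was sent."""
        if self.credits == 0:
            return False             # no known free buffer on the receiver: wait
        self.credits -= 1            # frame sent, B2B credit -1 on the transmitter
        return True

    def receive_r_rdy(self):
        """The receiver handled a frame and returned R_RDY: restore one credit."""
        if self.credits < self.max_credits:
            self.credits += 1

link = B2BLink(credits=2)
sent = [link.send_frame() for _ in range(3)]   # third attempt finds no credit left
link.receive_r_rdy()                           # receiver frees one buffer
resumed = link.send_frame()
print(sent, resumed, link.credits)
```

If R_RDYs stop coming back, the transmitter stays at zero credits, which is the condition the credit-loss counters are meant to expose.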
Buffer to Buffer Credit Counters
CiscoLive-2019-A# connect nxos a
CiscoLive-2019-A(nx-os)# attach module 1
Credit Loss Events and Slow Drain Detection
module-1# show process creditmon credit-loss-events
Credit Loss Events: YES
----------------------------------------------------
| Interface | Total | Timestamp |
| | Events | |
----------------------------------------------------
| fc1/1 | 2 | 1. Thu Aug 9 17:16:03 2018 |
| | | 2. Thu Aug 9 17:14:09 2018 |
----------------------------------------------------
module-1# show platform software fcpc info interface fc 1/1 | begin CREDIT
CREDIT MONITOR INFO
if index: 0x1000000
monitor event: on
number of err functions invoked: 0
number of GSM events generated: 0 <- Slow drain event detected count
number of err entries: 1 <- Think of it as “number of times credit-loss-recovery was detected”
fcp port mode: trunking port
e port credit loss count: 0 <- Number of credit losses in current monitor window
When in End-Host Mode, uplink NP port counters are displayed as E Ports
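To line credit-loss events up against host or array logs, the timestamps in the credit-loss-events table above can be converted into datetime objects. The parser below is a sketch that assumes the three-column table layout shown on this slide; `parse_credit_loss` is a hypothetical name.

```python
import re
from datetime import datetime

SAMPLE = """\
| Interface | Total | Timestamp |
| | Events | |
| fc1/1 | 2 | 1. Thu Aug 9 17:16:03 2018 |
| | | 2. Thu Aug 9 17:14:09 2018 |
"""

def parse_credit_loss(text):
    """Return {interface: [datetime, ...]} from the credit-loss-events table."""
    events = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 3:
            continue
        # A row with an interface name and an event total starts a new entry.
        if cells[0] and cells[1].isdigit():
            current = cells[0]
            events[current] = []
        stamp = re.match(r"\d+\.\s+(.+)", cells[2])
        if stamp and current:
            events[current].append(
                datetime.strptime(stamp.group(1), "%a %b %d %H:%M:%S %Y")
            )
    return events

events = parse_credit_loss(SAMPLE)
first, second = events["fc1/1"]
print((first - second).total_seconds(), "seconds between the two events")
```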
Storage Troubleshooting Summary
Quick recap…
• Lunlist output is only available before boot
UCS Management Mastery
UCS Health Checks
When are they recommended?
• Before upgrades
• Prior to planned maintenance of any kind
UCS Manager Health Check
• Connect to local-mgmt (a/b) via SSH
CiscoLive-2019-A(local-mgmt)# show cluster extended-state
Cluster Id: 0x2c092182748311e2-0x8ed9547feec569c4
Start time: Wed Apr 19 14:41:20 2017
Last election time: Wed Apr 19 14:42:58 2017
HA READY
Detailed state of the device selected for HA storage:
Chassis 1, serial: FOX1330GDH1, state: active
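If you collect this output before every maintenance window, a small check can confirm the two things the slide highlights: the HA READY line and the state of each HA storage device. This is a sketch against the excerpt shown here (the real output contains more lines than the slide shows), and `ha_summary` is a hypothetical helper name.

```python
import re

SAMPLE = """\
Cluster Id: 0x2c092182748311e2-0x8ed9547feec569c4
Start time: Wed Apr 19 14:41:20 2017
Last election time: Wed Apr 19 14:42:58 2017
HA READY
Detailed state of the device selected for HA storage:
Chassis 1, serial: FOX1330GDH1, state: active
"""

# Matches the HA storage device lines in the format shown on the slide.
DEVICE_RE = re.compile(r"Chassis (\d+), serial: ([^,\s]+), state: (\S+)")

def ha_summary(output):
    """Report whether 'HA READY' is present and list HA devices that are not active."""
    lines = [line.strip() for line in output.splitlines()]
    devices = [m.groups() for m in map(DEVICE_RE.match, lines) if m]
    return {
        "ha_ready": "HA READY" in lines,
        "devices": devices,
        "inactive_devices": [d for d in devices if d[2] != "active"],
    }

summary = ha_summary(SAMPLE)
print(summary)
```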
UCS Manager Health Check
Healthy FI Working Output
CiscoLive-2019-A(local-mgmt)# show pmon state
UCS Management Database Health
Scheduled Checks...
CiscoLive-2019-B# scope system
CiscoLive-2019-B /system # show mgmt-db-check-policy detail
UCS Management Database Health
Manually Checking…
CiscoLive-2019-B# scope system
CiscoLive-2019-B /system # start-db-check
CiscoLive-2019-B /system* # commit-buffer
<wait…>
Hardware Diagnostics
Diagnostic ISO
Tests available:
• Memtest 86 – Memory and CPU cache tests
Blade Memory Diagnostics
• Added in UCSM version 3.1(3a) as an embedded tool
• Only tests for issues with memory
• Several options configurable in diag-policy:
Blade Memory Diagnostics
Available under Diagnostics tab of Server view
Blade Memory Diagnostics
Results
The Power of Intersight
• Connecting UCS to the Cloud
Intersight - Claim UCS Domain
• First step is to claim your domain in Intersight!
• Log in at https://www.intersight.com
Intersight – New Domain Added!
Intersight - HCL
One minute while we update firmware…
Intersight - HCL
We are compliant!
Cisco Intersight
Connected TAC
Cisco Intersight: Enhanced Support
Connected TAC
Overview:
Automated transmission of technical support files to the Cisco Technical Assistance Center (TAC) for accelerated troubleshooting.
Supports:
• UCS Manager
• Standalone C-Series
• HyperFlex (ESXi & Hyper-V)
Intersight + TAC Real World Example #1
Intersight + TAC Real World Example #2
Intersight – Future Features
• Intersight Appliance (On-Prem version)
• TAC SR Creation
• More integration with TAC Digitized IC (automatic issue detection)
• More server platforms integrated
• New C series models!
• HyperFlex Connect
• Deploy HX through Intersight!
• New features and documentation at https://www.intersight.com/help/
Thank you
Disjoint Layer 2
What is Disjoint L2?
Two different L2 Domains…
[Diagram: separate Prod and Backup L2 domains, with a Prod vNIC and a Backup vNIC]
Configuration done half-way...
Uplink port configuration in this scenario…
CiscoLive-2019-B(nxos)# show running-config interface ethernet 1/17
interface Ethernet1/17
description U: Uplink
pinning border
pinning server nf-exporter
switchport mode trunk
switchport trunk allowed vlan 1,104,111,204,211,304,311,900
udld disable
no shutdown
interface port-channel2
description U: Uplink
switchport mode trunk
switchport trunk allowed vlan 1,104,111,204,211,304,311
pinning border
speed 10000
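One way to spot a half-finished disjoint L2 setup is to compare the allowed VLAN lists of the uplinks and look for VLANs carried on more than one of them. The sketch below does that for running-config text in the format above. Treating VLAN 1 as an expected overlap is an assumption based on the corrected configuration later in this deck, and `allowed_vlans` is a hypothetical helper name.

```python
SAMPLE = """\
interface Ethernet1/17
  switchport mode trunk
  switchport trunk allowed vlan 1,104,111,204,211,304,311,900
interface port-channel2
  switchport mode trunk
  switchport trunk allowed vlan 1,104,111,204,211,304,311
"""

def allowed_vlans(config_text):
    """Return {interface: set of VLAN ids} from 'show running-config interface' text."""
    vlans = {}
    current = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("interface "):
            current = line.split(None, 1)[1]
            vlans[current] = set()
        elif line.startswith("switchport trunk allowed vlan ") and current:
            # Handles comma-separated ids and simple ranges such as 10-20.
            for part in line.rsplit(" ", 1)[1].split(","):
                low, _, high = part.partition("-")
                vlans[current].update(range(int(low), int(high or low) + 1))
    return vlans

vlans = allowed_vlans(SAMPLE)
# Assumption: VLAN 1 is allowed to appear on both uplinks, so it is ignored here.
overlap = (vlans["Ethernet1/17"] & vlans["port-channel2"]) - {1}
print("VLANs on both uplinks:", sorted(overlap))
```

Any VLAN in the overlap can have either uplink chosen as its designated receiver, which is the problem the next slides walk through.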
Understanding the Designated Receiver
• Absence of STP means we rely on other mechanisms to avoid loops
[Diagram: Prod vNIC and Backup vNIC]
Who is the Designated Receiver?
CiscoLive-2019-B(nxos)# show platform software enm internal info vlandb all
vlan_id 1
-------------
Designated receiver: Po2
Membership:
Eth1/17 Po2
vlan_id 104
-------------
Designated receiver: Eth1/17
Membership:
Eth1/17 Po2
vlan_id 111
-------------
Designated receiver: Po2
Membership:
Eth1/17 Po2
vlan_id 900
-------------
Designated receiver: Eth1/17
Membership:
Eth1/17
[Diagram labels: Prod, Backup, VLAN 900 Only, Po2, Eth 1/17, Prod vNIC, Backup vNIC]
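The designated receiver output can be checked against the uplink each VLAN is supposed to use. In the sketch below, the `INTENDED_UPLINK` mapping is an assumption made for illustration (production VLANs on Po2, VLAN 900 on Eth1/17), and `parse_vlandb` is a hypothetical helper that relies on the layout shown above.

```python
SAMPLE = """\
vlan_id 104
-------------
Designated receiver: Eth1/17
Membership:
Eth1/17 Po2
vlan_id 111
-------------
Designated receiver: Po2
Membership:
Eth1/17 Po2
vlan_id 900
-------------
Designated receiver: Eth1/17
Membership:
Eth1/17
"""

# Assumption for illustration: which uplink each VLAN should be received on.
INTENDED_UPLINK = {104: "Po2", 111: "Po2", 900: "Eth1/17"}

def parse_vlandb(text):
    """Return {vlan: {'receiver': str, 'members': [str, ...]}}."""
    db, vlan, members_next = {}, None, False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("vlan_id "):
            vlan = int(line.split()[1])
            db[vlan] = {"receiver": None, "members": []}
            members_next = False
        elif vlan is None:
            continue
        elif line.startswith("Designated receiver:"):
            db[vlan]["receiver"] = line.split(":", 1)[1].strip()
        elif line.startswith("Membership:"):
            members_next = True          # the next non-empty line lists the members
        elif members_next and line:
            db[vlan]["members"] = line.split()
            members_next = False
    return db

db = parse_vlandb(SAMPLE)
misplaced = [v for v, want in INTENDED_UPLINK.items() if db[v]["receiver"] != want]
print("VLANs with an unexpected designated receiver:", misplaced)
```

With the sample data, VLAN 104 is flagged because its designated receiver is Eth1/17 rather than the uplink assumed for production traffic.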
Disjoint Layer 2 Configured in full…
Correct configuration from CLI
CiscoLive-2019-B(nxos)# show running-config interface ethernet 1/17
interface Ethernet1/17
description U: Uplink
pinning border
pinning server nf-exporter
switchport mode trunk
switchport trunk allowed vlan 1,900
udld disable
no shutdown
interface port-channel2
description U: Uplink
switchport mode trunk
switchport trunk allowed vlan 1,104,111,204,211,304,311
pinning border
speed 10000