Strategies and Tactics: Application Troubleshooting Simplified
Fluke Networks Is
Global
A $300+ million company
Profitable since its inception as a separate operating entity
Over 800 employees worldwide serving customers in more than 120 countries
Approximately 30% of revenue from outside the U.S.
Worldwide Headquarters: Everett, WA
Major Facilities: Colorado Springs, CO; Duluth, GA; Bridgewater, NJ; Rockville, MD
Sales Offices & Associates: Worldwide
Technical Assistance Centers: Everett, WA; Eindhoven, NL
Fluke Networks Is
Thriving
Backed by a $12B corporate parent, Danaher Corporation
Fluke Networks and Fluke are both part of Danaher (NYSE: DHR)
Trusted
Trusted by 98 of the Fortune 100 who use Fluke Networks solutions to deploy, solve, manage and optimize their networks.
Agenda
Today's Challenges
  Complexity: Applications and the infrastructure that delivers them
  Change: You think you know your network? Wanna bet?
  Triage: Determining just who owns this problem
  Root cause analysis (RCA): What is the specific cause of latency?
Best Practices
  Getting in the Path of the Packets
  Capturing all the Packets
  Discovering Problems before the Customer Discovers Them
  Resolving Problems in a Timely Manner
Challenges
And this is just the view of the application inside a data center. What is happening in the user's network?
Can lead to increased problem resolution time
Without a clear view of the current state of the network, it is very difficult to quickly resolve network- and application-related problems.
It's the network!
Whether it is a network problem, a server problem, or an application problem, the network always gets blamed first.
The faster we can determine the fault domain of the problem, the faster we can get the right resources working on it.
Without a history of normal network operation, it is difficult to determine what is not normal.
Keeping a history of:
  Utilization levels
  Round-trip latencies
  Protocol distributions
  Packet captures of working applications
allows us to get to the root of the problem without chasing symptoms that are not really part of the problem.
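As an illustration of how a stored baseline can be used, the sketch below flags a measurement that falls far outside its recorded history. The function name, the three-standard-deviation threshold, and the sample latencies are assumptions chosen for the example, not values from the seminar.

```python
from statistics import mean, stdev

def is_abnormal(history, current, k=3.0):
    """Return True if `current` is more than k standard deviations
    away from the mean of the baseline `history` samples."""
    if len(history) < 2:
        raise ValueError("need at least two baseline samples")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any different value is abnormal.
        return current != mu
    return abs(current - mu) > k * sigma

# Hypothetical round-trip latencies (ms) recorded while things were normal.
baseline_latency_ms = [20.0, 22.0, 19.0, 21.0, 20.0, 23.0, 18.0]

print(is_abnormal(baseline_latency_ms, 21.5))
print(is_abnormal(baseline_latency_ms, 95.0))
```

The same check could be applied to utilization levels or to the share of each protocol, as long as a history for that metric was kept.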
Best Practices
Getting in the Path of the Packets
Capturing all the Packets
Discovering Problems before the Customer Does
Resolving Problems in a Timely Manner
Best Practices
Getting in the Path of the Packets
No Network Documentation
Understanding Application Dependencies
Tapping Technologies
Virtual Machines
Host Conversations
[Diagram labels: Data Center, Application Servers, Database Servers]
Hubs
Pros
  Cheap
  Available
  Easy to install
Cons
  Reduce link to half duplex
  May not be a true hub
  Not practical on servers or switch uplinks
  If power drops, link drops
  10/100 Mbps speeds
Span/Mirror Ports
Pros
  Free
  Available
  Does not require link to be dropped
  Great for one-time link monitoring
Cons
  Requires switch access
  Configuration mistakes can result in network outages
  Can quickly become over-provisioned
  Requires a free switch port
[Figure: Catalyst 3550 switch front panel]
Taps
Pros
  Truly monitors full-duplex traffic
  If power is lost, link stays active
  Can monitor gigabit links without packet loss
  Once installed, can stay
Cons
  Most expensive option
  Have to break the link to install
  Can over-provision the monitor port and drop packets
Tap Deployment
Analysis equipment can be quickly connected to the network without the need for configuration changes.
Aggregators can be used to merge the traffic from multiple taps into a single stream.
This allows a single analyzer to monitor traffic at multiple locations as well as redundant paths.
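Merging several taps into one stream is also where a monitor port can be over-provisioned: a full-duplex link carries traffic in both directions, so its combined rate can exceed what a single monitor port of the same speed can accept. The sketch below is a rough capacity check under that assumption; the function names and the traffic figures are hypothetical.

```python
def aggregated_load_mbps(tapped_links):
    """Sum both directions of every tapped full-duplex link.

    tapped_links: list of (tx_mbps, rx_mbps) observed rates per link.
    """
    return sum(tx + rx for tx, rx in tapped_links)

def is_oversubscribed(tapped_links, monitor_port_mbps):
    """True if the merged stream exceeds the monitor port capacity,
    meaning the aggregator would have to drop packets."""
    return aggregated_load_mbps(tapped_links) > monitor_port_mbps

# Two hypothetical gigabit links merged onto one 1000 Mbps monitor port.
links = [(400, 300), (250, 200)]
print(aggregated_load_mbps(links))
print(is_oversubscribed(links, monitor_port_mbps=1000))
```

Using observed peak rates rather than averages gives a more conservative answer, since short bursts are what usually overflow the monitor port.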
Tap Deployment
Having taps deployed at key locations provides easy access for the analysis equipment.
These points include:
  In front of server farms
  At the Internet connection
  Switch uplinks
  Demarcation points between responsibilities
Tap Deployment
[Diagram labels: Data Center, Application Servers, Database Servers]
If the capture buffer is not big enough, the packets will roll out of the buffer before anyone knows the problem even occurred.
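How quickly packets roll out of a buffer follows from simple arithmetic: buffer size divided by the rate at which traffic arrives. The sketch below works one example; the function name and the 1 GB and 400 Mbps figures are assumptions for illustration, using decimal units.

```python
def retention_seconds(buffer_bytes, traffic_bits_per_second):
    """Approximate time until the oldest packet is overwritten in a
    circular capture buffer, ignoring per-packet storage overhead."""
    if traffic_bits_per_second <= 0:
        raise ValueError("traffic rate must be positive")
    return buffer_bytes * 8 / traffic_bits_per_second

# Hypothetical case: 1 GB buffer on a link averaging 400 Mbps.
print(retention_seconds(1_000_000_000, 400_000_000))  # 20.0 seconds
```

If a problem report typically arrives minutes or hours after the event, the buffer has to be sized for that delay, not for the duration of the event itself.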
Portable Solution
[Diagram labels: Remote Offices, Data Center, Application Servers, Database Servers]
When these services are not performing well, the customer wants them fixed and fixed now!
Understanding Applications
While the network analyst does not need to understand applications down to the code level, it is important to understand the network traffic related to applications.
This understanding will help reduce the amount of time it takes to troubleshoot the application.
A good practice is to capture the application traffic when the application is running well.
This good capture can be compared with the problem trace to reduce the amount of time it takes to isolate the problem.
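One way to compare a good capture with a problem trace is to line up the response time of each request in both and list the ones that slowed down the most. The sketch below assumes the response times have already been extracted from the two captures into dictionaries; the function name, the factor of 2, and the timings are hypothetical.

```python
def slower_requests(baseline, problem, factor=2.0):
    """Return (request, good_time, bad_time) tuples for requests whose
    response time in the problem trace exceeds `factor` times the
    baseline, worst slowdown first. Times are in seconds."""
    findings = []
    for request, good_time in baseline.items():
        bad_time = problem.get(request)
        if bad_time is not None and bad_time > factor * good_time:
            findings.append((request, good_time, bad_time))
    return sorted(findings, key=lambda f: f[2] - f[1], reverse=True)

# Hypothetical response times taken from a known-good capture
# and from a trace recorded while users were complaining.
good = {"GET /tradepage.aspx": 0.12, "POST /submit_order.asp": 0.30}
bad = {"GET /tradepage.aspx": 0.13, "POST /submit_order.asp": 2.40}

for request, good_time, bad_time in slower_requests(good, bad):
    print(f"{request}: {good_time:.2f}s -> {bad_time:.2f}s")
```

Requests that appear only in one of the two captures are ignored here; in practice a missing or extra request can itself be a useful clue.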
What is a Transaction?
Business Transaction
User Action
  Go to Trade Page
  Look up Danaher Symbol
  Enter Symbol and Qty
  Submit Order
Application Transaction
  GET /tradepage.aspx
  GET /border.gif
  GET /dnarrow.gif
  GET /displayDHR.gif
  GET /stylesheet.css
  GET /javascript.js
  POST /submit_order.asp
Packets
  Packet #1 through Packet #16
Multi-Segment Analysis
In order to get a complete picture of the problem, we may need to see both sides of the conversation at the same time.
By capturing on both sides and merging that traffic together, we are able to quickly identify the source of packet loss and delays.
To perform this multi-segment analysis, we must be able to synchronize the traces based on timestamp.
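The idea of merging two synchronized traces can be sketched in a few lines. This is only an illustration of the principle, not how any particular analyzer implements it: it assumes each trace is already sorted by time, that the clocks at both capture points are synchronized, and that each packet carries an identifier that survives the path. The record layout and function names are hypothetical.

```python
import heapq

def merge_traces(*traces):
    """Merge time-sorted traces into one timeline ordered by timestamp.
    Each record is (timestamp_seconds, capture_point, packet_id)."""
    return list(heapq.merge(*traces, key=lambda record: record[0]))

def transit_delays(merged, near_point, far_point):
    """Return ({packet_id: delay_seconds}, [lost_packet_ids]) for
    packets seen at near_point and, possibly, later at far_point."""
    first_seen = {}
    delays = {}
    for timestamp, point, packet_id in merged:
        if point == near_point:
            first_seen.setdefault(packet_id, timestamp)
        elif point == far_point and packet_id in first_seen and packet_id not in delays:
            delays[packet_id] = timestamp - first_seen[packet_id]
    lost = [pid for pid in first_seen if pid not in delays]
    return delays, lost

client_trace = [(10.000, "client", 1), (10.010, "client", 2), (10.020, "client", 3)]
server_trace = [(10.004, "server", 1), (10.075, "server", 3)]

merged = merge_traces(client_trace, server_trace)
delays, lost = transit_delays(merged, "client", "server")
print(delays)  # packet 3 took far longer to cross the path than packet 1
print(lost)    # packet 2 never arrived at the server side
```

If the two capture clocks are not synchronized, the computed delays include the clock offset, which is why timestamp synchronization is a precondition for this kind of analysis.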
Multi-Segment Analysis
ClearSight merges trace files from both analyzers
[Diagram labels: Client, Network, Web Server, OptiView]
Multi-Segment Analysis
Firewall Latency
Router Latency
Core Latency
Multimedia Playback
In some cases it takes more than just looking at packets to resolve an application problem.
When troubleshooting VoIP and video problems, it is helpful to be able to play the media stream back to judge the quality.
Problems such as echo with VoIP cannot be determined by looking at the statistics or packets. The only way to detect echo is to listen to the audio stream.
Portable Solution
Having a portable analysis solution allows the analyst to connect at various locations to isolate the problem.
In the case of remote offices, the analysis solution can be shipped to the office to capture the end-user experience.
Use of Taps
Having taps installed ahead of time provides a quick and easy way to connect the analyzer.
The use of taps ensures that the timing of the multimedia packets is not changed, which could adversely impact the metrics.
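Jitter is one metric that depends directly on packet timing, so it shows why the capture method must not alter arrival times. The sketch below follows the interarrival jitter estimator described in RFC 3550 (the RTP specification), simplified to use send and arrival times in seconds rather than RTP timestamp units. The function name and the sample timings are assumptions for the example.

```python
def interarrival_jitter(packets):
    """Running jitter estimate in the style of RFC 3550.

    packets: list of (send_time, arrival_time) in seconds, in order.
    For each consecutive pair, D is the change in transit time, and
    the estimate is smoothed as J = J + (|D| - J) / 16.
    """
    jitter = 0.0
    for (prev_send, prev_arrival), (send, arrival) in zip(packets, packets[1:]):
        d = (arrival - prev_arrival) - (send - prev_send)
        jitter += (abs(d) - jitter) / 16
    return jitter

# Hypothetical voice packets sent every 20 ms; the third arrives 5 ms late.
stream = [(0.00, 0.100), (0.02, 0.120), (0.04, 0.145)]
print(interarrival_jitter(stream))
```

Any capture device that buffers or reorders packets changes the arrival times fed into this calculation, so the jitter it reports would describe the monitoring path rather than the network.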
Method
Flow of the packets
Application Dependencies
Span/Tap
Multimedia Analysis
Resources
90-Day ClearSight Trial: requires the unique Proof of Purchase (POP) Code found on the ClearSight flyer handed out at the seminar
14-Day ClearSight Trial: if you misplaced your POP Code, you can download the 14-day trial at www.flukenetworks.com/csatrial
Application-Centric Resource Center: www.flukenetworks.com/app-centric
Network Forensics Resource Center: www.flukenetworks.com/ntmresources
Portable Network Analysis: www.flukenetworks.com/optiview
Request OptiView 5-Day Evaluation: www.flukenetworks.com/optivieweval
For additional information: Email: [email protected]. Phone 800-283-5853 (US/Canada) or 425-446-4519 (other locations).