Introduction To Internet Measurement: What, Where, Why and How
Nina Taft
Technicolor Research Palo Alto
Outline
Motivation & Ecosystem
Quick examples of where and why
Areas
  traffic
  performance
  topology
  security
  traffic matrices
Summary
Internet structure
Access Internet Service Providers (ISPs) connect to a global transit ISP. Customer and provider ISPs have an economic agreement.
[Figure: access and enterprise networks connecting to ISP A, ISP B and ISP C]
Internet structure
Global ISPs must be interconnected
[Figure: ISP A, ISP B and ISP C joined by peering links, with access and enterprise networks attached]
Internet structure
Content provider networks (e.g., Google, Microsoft, Akamai) may run their own network,
[Figure: ISP A, ISP B and ISP C with attached access and enterprise networks]
[Figure: hierarchy of Tier 1 ISPs, regional ISPs and access ISPs]
[Figure: Tier-1 and Tier-2 ISPs connected via BGP]
AS (Autonomous System): a collection of routers under the same technical and administrative domain.
BGP (Border Gateway Protocol): used between two ASes to allow them to exchange routing information so that traffic can be forwarded across AS borders.
[Figure: BGP between AS1 and AS2; routers R1, R2 and R3; a customer with address range A; AS1 traffic to A]
[Figure: ISP 1 with border router, access routers, access links and customers]
What should we measure?
Where should we measure?
How do we capture data?
It all depends upon what you want to do with the data!
Quick examples of where and why
Why? ISPs can see whether their customers' traffic is growing, or whether there is unwanted traffic.
[Figure: ISP 1 with border router, access routers, access links and customers]
Why? ISPs need to know when to upgrade a router or add a new one, and when to upgrade link capacity.
[Figure: ISP 1 with border router, access routers, access links and customers]
Example 3: What? Failures. Why? To know whether the network recovers well from outages, and how often they occur.
[Figure: ISP 1 with access routers, access links and customers]
Example 4: Where? Peering links. What? BGP routing announcements. Why? Learn who is connected to whom.
[Figure: ISP 1 and ISP 2 connected through border routers; access routers, access links and customers]
Example 5: Where? In the home. Why? Understand the performance of video streaming.
[Figure: ISP 1 and ISP 2 connected through border routers; access routers, access link and customer]
Did you get enough samples (of routers, clients, servers, paths) to make a general claim about the Internet?
Many things are hard to do without access to a large infrastructure. Researchers have created distributed measurement platforms.
What
Traffic
Can my network accommodate bursts and peaks?
Provisioning depends upon the aggregate traffic mix
Delay & loss performance (for example)
Identify abnormalities
Failures
Security
Approach 1: capture packets
Example tools:
  libpcap: for general-purpose systems
  OC192MON: specialized hardware for backbone links (~10 Gbps)
  Analysis tools: tcpdump & Wireshark
Approach 2: capture flows
A flow is commonly defined by a 5-tuple: (srcIPaddr, srcPort, dstIPaddr, dstPort, protocol-id)
Example tools: Cisco's NetFlow, Juniper's cflowd
Typical data recorded: start time, end time, number of bytes, number of packets
Reduced trace size compared to packet capture
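To make the flow idea concrete, here is a minimal sketch of turning a packet trace into flow records keyed by the 5-tuple. It is an illustration only, not how NetFlow or cflowd are implemented: the packet tuple layout, the `FlowKey` name and the `aggregate_flows` helper are assumptions made for this example, and real exporters also apply timeouts to decide when a flow ends.

```python
from collections import namedtuple

# Hypothetical 5-tuple key, matching the flow definition above.
FlowKey = namedtuple("FlowKey", "src_ip src_port dst_ip dst_port proto")

def aggregate_flows(packets):
    """Group packets into flow records.

    Each packet is assumed to be a tuple:
    (timestamp, src_ip, src_port, dst_ip, dst_port, proto, size_bytes).
    """
    flows = {}
    for ts, src_ip, src_port, dst_ip, dst_port, proto, size in packets:
        key = FlowKey(src_ip, src_port, dst_ip, dst_port, proto)
        rec = flows.get(key)
        if rec is None:
            # First packet of this flow: open a new record.
            flows[key] = {"start": ts, "end": ts, "bytes": size, "packets": 1}
        else:
            # Later packet: update the typical recorded fields.
            rec["start"] = min(rec["start"], ts)
            rec["end"] = max(rec["end"], ts)
            rec["bytes"] += size
            rec["packets"] += 1
    return flows

# Made-up packets: three TCP packets (proto 6) and one UDP packet (proto 17).
packets = [
    (0.00, "10.0.0.1", 5000, "10.0.0.2", 80, 6, 1500),
    (0.05, "10.0.0.1", 5000, "10.0.0.2", 80, 6, 1500),
    (0.10, "10.0.0.3", 5353, "10.0.0.2", 53, 17, 80),
    (0.20, "10.0.0.1", 5000, "10.0.0.2", 80, 6, 400),
]
flows = aggregate_flows(packets)
for key, rec in flows.items():
    print(key, rec)
```

Four packets collapse into two flow records, which is the trace-size reduction mentioned above.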
[Figure: Wireshark packet analyzer on top of pcap, which copies packets]
slide from Kurose & Ross Computer Networking, Ch. 1 (online demo)
WIRESHARK DEMO
Traffic Trends
Example of backbone traffic in a Tier-1 ISP
Average is growing over time
Variability is growing over time
Spikes are present throughout
Traffic Patterns
[Figure: traffic (Mbps) versus time]
Timescale: zoom in to one month of the previous trace
We see strong diurnal trends
Clear periodicities at 24- and 12-hour periods
Weekdays vs. weekends
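One simple way to check for a diurnal pattern in a traffic time series is the autocorrelation at different lags. The sketch below uses a synthetic hourly series (the amplitudes and the 28-day length are invented for illustration, not taken from the trace above) and looks for the lag with the strongest correlation.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag))
    return cov / var

# Synthetic hourly traffic in Mbps over 28 days: a 24-hour cycle plus a
# weaker 12-hour cycle. All numbers here are made up for the example.
traffic = [
    100
    + 40 * math.sin(2 * math.pi * h / 24)
    + 15 * math.sin(2 * math.pi * h / 12)
    for h in range(24 * 28)
]

# Skip very short lags, where neighbouring samples are trivially similar.
acf = {lag: autocorrelation(traffic, lag) for lag in range(6, 49)}
best_lag = max(acf, key=acf.get)
print("strongest periodicity at lag (hours):", best_lag)
```

On real traces a spectral method (for example a periodogram) would usually be used to separate the 24-hour and 12-hour components; the autocorrelation here is only the simplest check for a daily cycle.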
Recap on Traffic
Who and Why?
ISPs collect packets and flows inside their networks to better manage them, and to be ready for the future.
pcap can be run anywhere; an end user can do it from home: how is my traffic composition changing?
Enterprise networks also need to understand corporate traffic.
Performance Measurement
What are bandwidth measurements useful for?
Applications that adapt their rate accordingly (video streaming)
Verifying SLAs (service level agreements)
and more
Usually measures the latency of a path
Any app that involves choosing among different paths to deliver data: CDNs, P2P, multiuser games,
Measuring such things often involves a measurement process at multiple places in the network
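Packets of equal size are transmitted back-to-back onto a path.
The slowest (narrow) link creates a fixed delay D between the start of pkt1 and the start of pkt2.
When the packets leave the narrow link, the fixed delay D is preserved.
[Figure: pkt1 and pkt2 spaced apart after the narrow link]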
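The packet-pair idea above leads to a simple estimate: if the delay D between the two packets is set by the slowest link, then that link's capacity is roughly the packet size divided by D. The sketch below assumes an idealized path with no cross traffic; the function names and link speeds are hypothetical.

```python
def narrow_link_dispersion(packet_size_bytes, link_capacities_bps):
    """Delay D between two back-to-back packets after the slowest link.

    Idealized model: no cross traffic, so D is the time the slowest
    link needs to transmit one packet.
    """
    return packet_size_bytes * 8 / min(link_capacities_bps)

def capacity_from_dispersion(packet_size_bytes, dispersion_seconds):
    """Estimate the narrow-link capacity (bits/s) from the measured delay D."""
    return packet_size_bytes * 8 / dispersion_seconds

# Hypothetical three-hop path: 100 Mbps, 10 Mbps (the narrow link), 1 Gbps.
path = [100e6, 10e6, 1e9]
d = narrow_link_dispersion(1500, path)
estimate = capacity_from_dispersion(1500, d)
print(f"D = {d * 1e3:.2f} ms, estimated capacity = {estimate / 1e6:.1f} Mbps")
```

In practice cross traffic can stretch or compress the spacing between the packets, so tools based on this idea typically send many pairs and filter the results.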
Topology
A way to visualize the interconnection patterns of Internet components
AS graph
Interconnection pattern of ASes: each node is an AS, and each edge is a peering link (or an AS exchange point).
Router-level graph
Each node is a router, and each edge is a one-hop link between 2 routers. Can annotate edges with link capacity.
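As a rough sketch of how an AS graph can be assembled, the snippet below builds an undirected graph from a list of AS paths, such as those seen in BGP routing announcements. The AS numbers are taken from the range reserved for documentation, the paths are invented, and `as_graph_from_paths` is a hypothetical helper, not part of any measurement tool.

```python
from collections import defaultdict

def as_graph_from_paths(as_paths):
    """Build an undirected AS graph: node = AS, edge = adjacency in a path."""
    graph = defaultdict(set)
    for path in as_paths:
        for a, b in zip(path, path[1:]):
            if a != b:  # ignore repeated ASNs (AS-path prepending)
                graph[a].add(b)
                graph[b].add(a)
    return graph

# Invented AS paths using documentation-only AS numbers.
as_paths = [
    [64500, 64501, 64502],
    [64500, 64501, 64501, 64503],  # 64501 prepended once
    [64504, 64502],
]
graph = as_graph_from_paths(as_paths)
degree = {asn: len(neighbors) for asn, neighbors in graph.items()}
for asn in sorted(degree):
    print(asn, "degree", degree[asn], "neighbors", sorted(graph[asn]))
```

A graph built this way only contains the links visible from the vantage points that supplied the paths, which is one reason the sampling question raised earlier matters.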
[Figure: example AS graph with nodes labeled by AS number: 1239 Sprint, 7922 Comcast, 22390 Xerox, 12576 Orange]
What does a router level topology look like? For one ISP (simple version):
slide from Kurose & Ross Computer Networking, Ch. 1 (online slides)
Combines 2 different data types: traceroute data and BGP routing tables.
Has ~70 monitors worldwide.
Built a worldwide community of thousands of users who installed host software (the DIMES agent) to run traceroutes and pings (at low rates) from their hosts. Data is collected centrally for analysis.
Traceroute data has ambiguities in it. There was much debate over a number of years on the best way to produce these topologies. After settling on a model many researchers liked, the Internet changed again.
Security
Reached a very high scanning rate in 3 minutes. Overwhelmed many servers; some were disabled, and links slowed down.
Slammer studied using telescope data, and predictions made: 1) could target small populations; 2) spreading speed could increase.
Traffic matrices
[Figure: traffic demand between City A, City B and City C, e.g. 250 Mbps]
A traffic matrix describes traffic demands for all node pairs: end-to-end flows within a single network (e.g., an AS domain)
ingress node:
egress node:
Example Problem
How much traffic flows between origin-destination pairs? A->D, A->C, B->C, B->D
[Figure: example network with nodes A, C and D, shown twice with the same link counts and different per-pair traffic values]
Type of equations: Y = AX, where A is a matrix of 0/1 entries
Problem Statement
System: Y = AX (A has # links rows and # origin-destination (OD) pairs columns)
We have Y from SNMP link measurements
We know A from OSPF link weights (so we can compute shortest paths)
Problem: find X
Issue: # links << # OD pairs => underconstrained system => infinite # of solutions
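A tiny worked example makes the underconstrained problem visible. The routing matrix and traffic values below are invented for illustration (they are not the numbers from the slide): two different OD traffic vectors X produce exactly the same link measurements Y, so Y alone cannot tell them apart.

```python
def link_loads(routing_matrix, od_traffic):
    """Compute Y = A X: the load on each link given per-OD-pair traffic."""
    return [sum(a * x for a, x in zip(row, od_traffic))
            for row in routing_matrix]

# Columns are OD pairs in the order: A->C, A->D, B->C, B->D.
# Rows are measured links; an entry is 1 if that OD pair crosses the link.
A = [
    [1, 1, 0, 0],  # link leaving A
    [0, 0, 1, 1],  # link leaving B
    [1, 0, 1, 0],  # link entering C
    [0, 1, 0, 1],  # link entering D
]

x1 = [1, 4, 2, 0]  # one candidate traffic matrix (hypothetical values)
x2 = [3, 2, 0, 2]  # a different candidate

y1 = link_loads(A, x1)
y2 = link_loads(A, x2)
print("Y from x1:", y1)
print("Y from x2:", y2)
print("same measurements, different traffic matrices:", y1 == y2 and x1 != x2)
```

Even in this small case the four equations are not independent (the two "leaving" rows sum to the same total as the two "entering" rows), so extra assumptions or measurements are needed to pick a single X.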
Summary
What: Traffic. How: packet or flow capture. Who: ISPs, everyone. Why: network planning, traffic evolution.
What: Performance. How: active probes. Who: CDNs, users, ISPs. Why: web download performance, fairness, delivering promised performance.
What: Topology.
What: Security.
Internet is flattening
Content & search providers are building their own infrastructure
[Figure: Google, Comcast, Akamai]
Acknowledgements
Book "Internet Measurement" by Prof. Mark Crovella (Boston University) and Bala Krishnamurthy (AT&T Research).
Some slideware thanks to Prof. Jim Kurose (University of Massachusetts at Amherst): online course material from the book "Computer Networking: A Top-Down Approach" by Kurose and Ross.
Thank You
Questions?