
CSci5221: Internet Measurement Basics 1

Internet Measurement Basics
• Measurement Overview and Internet Challenges
  – Why measure? Why model measurements?
  – What to measure? Where to measure?
• Measurement tools
  – Active: ping, traceroute, and pathchar
  – Passive: logs, SNMP, packet, and flow monitoring
• Two Case Studies:
  – traceroute-based routing behavior measurement [Pa97]
  – OSPF-based passive monitoring of intra-domain routing [AG04]
• Operational applications of measurement

Readings: Please do the required readings


Why Measure?
• The Internet is a man-made system, so why do we need to measure it?
  – Because we still don’t really understand it
  – Because sometimes things go wrong
• Measurement for network operations
  – Reliability analysis, traffic engineering, capacity planning
    • Better and more efficient management of network resources
    • Detecting, diagnosing, and predicting problems
    • What-if analysis of future changes
• Measurement for scientific discovery
  – Characterizing a complex system as an organism
  – Creating accurate models that represent reality
  – Identifying new features and phenomena


Why Build Models of Measurements?
• Compact summary of measurements
  – Efficient way to represent a large data set
  – E.g., exponential distribution with mean 100 sec
• Expose important properties of measurements
  – Reveals underlying cause or engineering question
  – E.g., mean RTT to help explain TCP throughput
• Generate random but realistic data as input
  – Generate new data that agree in key properties
  – E.g., topology models to feed into simulators

“All models are wrong, but some models are useful.” – George Box
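
As a small illustration of the "compact summary" and "generate realistic data" ideas, the sketch below summarizes a data set with a single parameter (the mean of an exponential distribution) and then draws new samples from that model. The function names and the choice of 10,000 samples are assumptions made for this example, not part of the lecture.

```python
import random
import statistics


def fit_exponential_mean(samples):
    """Maximum-likelihood estimate of an exponential distribution's mean
    (for the exponential, this is simply the sample mean)."""
    return statistics.fmean(samples)


def generate_exponential(mean, n, seed=0):
    """Draw n synthetic values from an exponential model with the given mean."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return [rng.expovariate(1.0 / mean) for _ in range(n)]


# Hypothetical workload: inter-arrival times with mean 100 sec.
synthetic = generate_exponential(100.0, 10_000)
estimated_mean = fit_exponential_mean(synthetic)
print(f"estimated mean: {estimated_mean:.1f} sec")
```

The whole data set is replaced by one number, which is exactly why the model is compact and also why it can be wrong for data that are not exponential.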


What Can be Measured?
• Traffic
  – Load statistics
  – Packet or flow traces
• Performance of paths
  – Application performance, e.g., Web download time
  – Transport performance, e.g., TCP bulk throughput
  – Network performance, e.g., packet delay and loss
• Network structure
  – Topology, and paths on the topology
  – Dynamics of the routing protocol


Where to Measure, and How?
• Short answer
  – Anywhere you can!
• End hosts
  – Application logs, e.g., Web server logs
  – Sending active probes to measure performance
• Individual links/routers
  – Load statistics, packet traces, flow traces
  – Configuration state
  – Routing-protocol messages or table dumps
  – Alarms
• How: Active vs. Passive Measurement
  – First understand some measurement challenges


Internet Challenges Make Measurement an Art
• Stateless routers
  – Routers do not routinely store packet/flow state
  – Measurement is an afterthought, adds overhead
• IP narrow waist
  – IP measurements cannot see below network layer
  – E.g., link-layer retransmission, tunnels, etc.
• Violations of end-to-end argument
  – E.g., firewalls, address translators, and proxies
  – Not directly visible, and may block measurements
• Decentralized control
  – Autonomous Systems may block measurements
  – No global notion of time


Active Measurement Example: Ping
• Adding traffic for purposes of measurement
  – Trade-offs between accuracy and overhead
  – Need careful methods to avoid introducing bias
• Ping
  – Host sends an ICMP ECHO packet to a target
  – … and captures the ICMP ECHO REPLY
  – Useful for checking connectivity, and RTT
  – Only requires control of one of the two end-points
• Problems with ping
  – Round-trip rather than one-way delays
  – Some hosts might not respond
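
A minimal sketch of turning ping results into RTT and loss figures. It assumes reply lines in the common `time=12.4 ms` style printed by many ping implementations; the sample lines, addresses, and the `summarize_ping` helper are hypothetical. Probes that receive no reply are counted as lost, which conflates real loss with hosts that simply do not respond.

```python
import re
import statistics

# Matches the RTT field of a reply line, e.g. "time=12.4 ms" (assumed format).
RTT_PATTERN = re.compile(r"time[=<]([\d.]+)\s*ms")


def summarize_ping(output_lines, probes_sent):
    """Return min RTT, median RTT, and loss fraction from ping reply lines."""
    rtts = []
    for line in output_lines:
        match = RTT_PATTERN.search(line)
        if match:
            rtts.append(float(match.group(1)))
    return {
        "min_ms": min(rtts) if rtts else None,
        "median_ms": statistics.median(rtts) if rtts else None,
        "loss": 1.0 - len(rtts) / probes_sent,
    }


# Hypothetical output: four probes sent, the third got no reply.
sample = [
    "64 bytes from 192.0.2.1: icmp_seq=1 ttl=57 time=12.4 ms",
    "64 bytes from 192.0.2.1: icmp_seq=2 ttl=57 time=11.9 ms",
    "64 bytes from 192.0.2.1: icmp_seq=4 ttl=57 time=13.1 ms",
]
summary = summarize_ping(sample, probes_sent=4)
print(summary)
```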


Active Measurement Example: Pathchar for Links

$$\mathrm{rtt}(i+1) - \mathrm{rtt}(i) = d + L/c + \epsilon$$

Three delay components:
  d: propagation delay
  L/c: transmission delay
  $\epsilon$: queueing delay + noise

where
  i: initial TTL value
  L: packet size
  c: link capacity

How to infer d, c?

[Figure: rtt(i+1) − rtt(i) versus packet size L; the min. RTT(L) line has slope 1/c and intercept d]
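
The inference step can be sketched as a straight-line fit: for each packet size L, take the minimum observed rtt(i+1) − rtt(i) (the minimum filters out queueing noise), then fit a line whose intercept estimates d and whose slope estimates 1/c. The code below is an ordinary least-squares fit on synthetic, noise-free data; the function name and the numbers are assumptions for illustration, and the real pathchar tool uses more elaborate filtering.

```python
def fit_link(sizes_bytes, min_rtt_diffs_sec):
    """Least-squares line through (L, min rtt difference).

    Returns (d, c): intercept = propagation delay d in seconds,
    1 / slope = link capacity c in bytes per second.
    """
    n = len(sizes_bytes)
    mean_x = sum(sizes_bytes) / n
    mean_y = sum(min_rtt_diffs_sec) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes_bytes)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(sizes_bytes, min_rtt_diffs_sec))
    slope = sxy / sxx
    d = mean_y - slope * mean_x
    return d, 1.0 / slope


# Synthetic link: d = 2 ms, c = 1.25e6 bytes/s (10 Mbit/s), no noise.
true_d, true_c = 0.002, 1.25e6
sizes = [64, 256, 512, 1024, 1500]
diffs = [true_d + L / true_c for L in sizes]

d_hat, c_hat = fit_link(sizes, diffs)
print(f"d = {d_hat * 1e3:.3f} ms, c = {c_hat * 8 / 1e6:.2f} Mbit/s")
```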


Active Measurement Example: Traceroute
• Time-To-Live field in IP packet header
  – Source sends a packet with a TTL of n
  – Each router along the path decrements the TTL
  – “TTL exceeded” sent when TTL reaches 0
• Traceroute tool exploits this TTL behavior

[Figure: source sends probes with TTL=1, TTL=2, … toward the destination; the router where each probe expires returns “Time exceeded”]

Send packets with TTL=1, 2, 3, … and record source of “time exceeded” message
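
The TTL mechanism can be illustrated with a small simulation that needs no raw sockets. The path is a plain list of hypothetical router names; `probe` walks the list, decrementing the TTL at each router, and reports which node answers. This models only the idealized behavior on the slide (one fixed path, every router replies).

```python
def probe(path, ttl):
    """Send one simulated probe along `path` with the given initial TTL.

    Returns (responder, reply_type).
    """
    for router in path[:-1]:      # intermediate routers decrement the TTL
        ttl -= 1
        if ttl == 0:
            return router, "time exceeded"
    return path[-1], "reached destination"


def simulate_traceroute(path, max_ttl=30):
    """Probe with TTL = 1, 2, 3, ... and record who answers each probe."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, reply = probe(path, ttl)
        hops.append(responder)
        if reply == "reached destination":
            break
    return hops


hops = simulate_traceroute(["r1", "r2", "r3", "dst"])
print(hops)
```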


Challenges of Traceroute
• Measuring multiple paths
  – Successive probes may traverse different paths
• Non-participating network elements
  – Some routers and firewalls don’t reply
• Inaccurate delay information
  – Includes processing delays on the router CPU
• Round-trip vs. one-way measurements
  – Paths may have asymmetric properties
• Interfaces, not routers
  – Returns IP address of interfaces, not routers


Applications of Traceroute
• Network troubleshooting
  – Identify forwarding loops and black holes
  – Identify long and convoluted paths
  – See how far the probe packets get
• Network topology inference
  – Launch traceroute probes from many places
  – … toward many destinations
  – Join together to fill in parts of the topology
  – … though traceroute undersamples the edges
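
A rough sketch of the "join together" step: each traceroute is a list of responding interface addresses, and every pair of consecutive hops contributes one undirected edge. The hop names are hypothetical, `None` stands for a hop that did not reply, and no alias resolution is attempted, so the result is an interface-level graph rather than a router-level one.

```python
def build_topology(traces):
    """Merge traceroute hop lists into a set of undirected edges."""
    edges = set()
    for hops in traces:
        for a, b in zip(hops, hops[1:]):
            if a is None or b is None or a == b:
                continue  # skip non-responding hops and repeated interfaces
            edges.add(tuple(sorted((a, b))))
    return edges


# Hypothetical traces from two vantage points toward the same destination.
traces = [
    ["a", "b", "c"],
    ["d", "b", "c"],
    ["a", None, "c"],   # middle hop did not reply, so no edge is inferred
]
topology = build_topology(traces)
print(sorted(topology))
```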


Paxson Study: Forwarding Loops
• Forwarding loop
  – Packet returns to same router multiple times
• May cause traceroute to show a loop
  – If loop lasted long enough
  – So many packets traverse the loopy path
• Traceroute may reveal false loops
  – Path change that leads to a longer path
  – Causing later probe packets to hit same nodes
• Heuristic solution
  – Require traceroute to return same path 3 times
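
One possible reading of this heuristic, as a sketch: a hop list "shows a loop" if some router reappears after a different router in between, and the loop is only accepted when the same looping path is returned three times in a row. Both function names and the exact interpretation of "same path 3 times" are assumptions for this example, not Paxson's code.

```python
def has_loop(hops):
    """True if a router reappears after at least one different hop."""
    last_seen = {}
    for index, hop in enumerate(hops):
        if hop is None:
            continue  # non-responding hop
        if hop in last_seen and last_seen[hop] != index - 1:
            return True
        last_seen[hop] = index
    return False


def confirmed_loop(measurements, required=3):
    """Accept a loop only if the last `required` traceroutes are identical
    and the repeated path contains a loop."""
    recent = measurements[-required:]
    return (len(recent) == required
            and all(m == recent[0] for m in recent)
            and has_loop(recent[0]))


loopy = ["r1", "r2", "r3", "r2", "r3"]
print(has_loop(loopy), confirmed_loop([loopy, loopy, loopy]))
```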


Paxson Study: Causes of Loops
• Transient vs. persistent
  – Transient: routing-protocol convergence
  – Persistent: likely configuration problem
• Challenges
  – Appropriate time boundary between the two?
  – What about flaky equipment going up and down?
  – Determining the cause of persistent loops?
• Anecdote on recent study of persistent loops
  – Provider has static route for customer prefix
  – Customer has default route to the provider


Paxson Study: Path Fluttering
• Rapid changes between paths
  – Multiple paths between a pair of hosts
  – Load balancing policies inside the network
• Packet-based load balancing
  – Round-robin or random
  – Multiple paths for packets in a single flow
• Flow-based load balancing
  – Hash of some fields in the packet header
  – E.g., IP addresses, port numbers, etc.
  – To keep packets in a flow on one path
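
The difference between the two policies can be sketched in a few lines. The flow-based selector hashes the header fields that identify a flow, so every packet of that flow maps to the same path; the packet-based selector cycles through paths regardless of flow. SHA-256 is used here only as a convenient stand-in hash (routers typically use much cheaper functions), and the packet dictionary layout is an assumption for this example.

```python
import hashlib
from itertools import count

FLOW_FIELDS = ("src_ip", "dst_ip", "src_port", "dst_port", "proto")


def flow_based_path(pkt, num_paths):
    """Pick a path from a hash of the flow-identifying header fields."""
    key = "|".join(str(pkt[field]) for field in FLOW_FIELDS)
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_paths


def make_round_robin(num_paths):
    """Return a packet-based selector that cycles through the paths."""
    counter = count()
    return lambda pkt: next(counter) % num_paths


pkt = {"src_ip": "192.0.2.1", "dst_ip": "198.51.100.7",
       "src_port": 40000, "dst_port": 80, "proto": 6}

round_robin = make_round_robin(2)
print([flow_based_path(pkt, 2) for _ in range(4)])  # same path every time
print([round_robin(pkt) for _ in range(4)])         # alternates: flutter
```

Traceroute probes that vary a hashed field between probes (for example, the destination port) can be split across paths even under flow-based balancing, which is one way fluttering shows up in measurements.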


Paxson Study: Routing Stability
• Route prevalence
  – Likelihood of observing a particular route
  – Relatively easy to measure with sound sampling
  – Poisson arrivals see time averages (PASTA)
  – Most host pairs have a dominant route
• Route persistence
  – How long a route endures before a change
  – Much harder to measure through active probes
  – Look for cases of multiple observations
  – Typical host pair has path persistence of a week
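
A sketch of the prevalence measurement under the PASTA assumption: schedule measurements at Poisson times (exponentially distributed gaps), then estimate each route's prevalence as the fraction of observations in which it appeared. The rate, duration, and route labels are hypothetical values chosen for illustration.

```python
import random
from collections import Counter


def poisson_sample_times(rate_per_hour, duration_hours, seed=0):
    """Measurement times with exponentially distributed gaps (Poisson process)."""
    rng = random.Random(seed)
    now, times = 0.0, []
    while True:
        now += rng.expovariate(rate_per_hour)
        if now >= duration_hours:
            return times
        times.append(now)


def route_prevalence(observed_routes):
    """Fraction of observations in which each route was seen."""
    total = len(observed_routes)
    return {route: n / total for route, n in Counter(observed_routes).items()}


times = poisson_sample_times(rate_per_hour=2.0, duration_hours=24.0)
# Hypothetical observations, one per measurement of a host pair.
prevalence = route_prevalence(["A", "A", "B", "A"])
print(len(times), prevalence)
```

Persistence cannot be read off the same way, because a route may change and change back between two samples; that is why the slide calls it much harder to measure.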


Paxson Study: Route Asymmetry
• Hot-potato routing
• Other causes
  – Asymmetric link weights in intradomain routing
  – Cold-potato routing, where an AS requests that traffic enter at a particular place
• Consequences
  – Lots of asymmetry
  – One-way delay is not necessarily half of the round-trip time

[Figure: early-exit routing; Customer A and Customer B attach to Provider A and Provider B, which interconnect at multiple peering points]


Passive Measurement Example: Logs at Hosts
• Web server logs
  – Host, time, URL, response code, content length, …
  – E.g., 122.345.131.2 - - [15/Oct/1998:00:00:25 -0400] "GET /images/wwwtlogo.gif HTTP/1.0" 304 - "http://www.aflcio.org/home.htm" "Mozilla/2.0 (compatible; MSIE 3.02; Update a; AK; AOL 4.0; Windows 95)" "-"
• DNS logs
  – Request, response, time
• Useful for workload characterization, troubleshooting, etc.
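
A minimal sketch of extracting the fields named above from the example log line. The regular expression targets the leading fields of this line's layout (host, two placeholder fields, bracketed timestamp, quoted request, status, length); it is an assumption that other servers' logs follow the same layout, and the trailing referrer and user-agent fields are ignored.

```python
import re

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<version>[^"]+)" '
    r'(?P<status>\d{3}) (?P<length>\S+)'
)


def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # "-" means no body was sent (e.g., a 304 Not Modified response).
    record["length"] = None if record["length"] == "-" else int(record["length"])
    return record


line = ('122.345.131.2 - - [15/Oct/1998:00:00:25 -0400] '
        '"GET /images/wwwtlogo.gif HTTP/1.0" 304 - '
        '"http://www.aflcio.org/home.htm" '
        '"Mozilla/2.0 (compatible; MSIE 3.02; Update a; AK; AOL 4.0; Windows 95)" "-"')
record = parse_log_line(line)
print(record)
```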


“Passive” Traffic Measurement
• Packet-level:
  – Tcpdump: software based
  – Special hardware packet collectors
• Flow-level:
  – Cisco NetFlow; other vendors have similar facilities
  – 5-tuple flow: srcIP, dstIP, srcPort, dstPort, protocol
    • Use a time-out value to “terminate” a flow
    • Statistics collected: start/end time, packet/byte counts
  – Sampling may be used for scalability
• Link-level:
  – SNMP traffic statistics, often over 5-min intervals
  – IETF MIB (management information base)
    • Byte counts, packet counts, etc.
• Pros and cons of each?
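
The flow-level bullets can be sketched as a small aggregation routine: packets are keyed by their 5-tuple, an idle time-out closes a flow, and each flow record keeps start/end time and packet/byte counts. This is an illustrative sketch, not NetFlow's actual algorithm; the 30-second time-out and the record layout are assumptions.

```python
def packets_to_flows(packets, timeout=30.0):
    """Aggregate packets into flow records.

    packets: iterable of (timestamp_sec, five_tuple, size_bytes),
             sorted by timestamp.
    """
    active, finished = {}, []
    for ts, key, size in packets:
        flow = active.get(key)
        if flow is not None and ts - flow["end"] > timeout:
            finished.append(active.pop(key))  # idle too long: terminate flow
            flow = None
        if flow is None:
            flow = {"key": key, "start": ts, "end": ts, "packets": 0, "bytes": 0}
            active[key] = flow
        flow["end"] = ts
        flow["packets"] += 1
        flow["bytes"] += size
    finished.extend(active.values())  # flush flows still open at the end
    return finished


# Hypothetical 5-tuple: (srcIP, dstIP, srcPort, dstPort, protocol).
k = ("192.0.2.1", "198.51.100.7", 40000, 80, 6)
flows = packets_to_flows([(0.0, k, 100), (1.0, k, 200), (100.0, k, 50)])
for flow in flows:
    print(flow)
```

The two packets one second apart form one flow; the packet at t=100 starts a second flow because the gap exceeds the time-out.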


Passive Measurement: SNMP
• Simple Network Management Protocol
  – Coarse-grained counters on the router
  – E.g., byte and packet counts
• Polling
  – Management system can poll the counters
  – E.g., once every five minutes
• Limitations
  – Extremely coarse-grained statistics
  – Delivered over UDP!
• Advantages: ubiquitous
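
A sketch of what a poller typically does with two successive byte-counter readings: take the difference, allow for the counter wrapping around, and convert to average utilization over the polling interval. The 32-bit counter width matches the classic interface octet counters; the function name and sample values are assumptions. Note that if the counter wraps more than once between polls, or a poll is lost (SNMP runs over UDP), the result is wrong, which is part of why the statistics are coarse.

```python
def link_utilization(prev_octets, curr_octets, interval_sec,
                     capacity_bps, counter_bits=32):
    """Average utilization (0..1) between two counter readings."""
    # Modular subtraction handles a single wrap of the counter.
    delta_octets = (curr_octets - prev_octets) % (2 ** counter_bits)
    return delta_octets * 8 / (interval_sec * capacity_bps)


# Hypothetical 5-minute poll on a 100 Mbit/s link.
util = link_utilization(0, 375_000_000, 300, 100_000_000)
print(f"utilization: {util:.1%}")

# The counter wrapped between polls: 1000 bytes before the wrap, 500 after.
wrapped = link_utilization(2 ** 32 - 1000, 500, 300, 1_000_000)
```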


Passive Measurement: Packet Monitoring
• Tapping a link

[Figure: four ways to attach a monitor. (1) Shared media (Ethernet, wireless): Host A, Host B, and Monitor on the same medium. (2) Splitting a point-to-point link: Monitor between Router A and Router B. (3) Line card that does packet sampling: Router A. (4) Multicast switch: Host A, Host B, Host C, and Monitor attached to a switch.]


Packet Monitoring: Selecting the Traffic
• Filter to focus on a subset of the packets
  – IP addresses/prefixes (e.g., to/from specific Web sites, client machines, DNS servers, mail servers)
  – Protocol (e.g., TCP, UDP, or ICMP)
  – Port numbers (e.g., HTTP, DNS, BGP, Napster)
• Collect first n bytes of packet (snap length)
  – Medium access control header (if present)
  – IP header (typically 20 bytes)
  – IP+UDP header (typically 28 bytes)
  – IP+TCP header (typically 40 bytes)
  – Application-layer message (entire packet)
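
To illustrate why a 20-byte snap length is enough for IP-level analysis, the sketch below decodes an IPv4 header (without options) from the first 20 bytes of a captured packet. It assumes the capture starts at the IP header, i.e., any link-layer header has already been stripped; the sample header is constructed by hand with documentation addresses.

```python
import socket
import struct

IPV4_FORMAT = "!BBHHHBBH4s4s"   # 20 bytes, network byte order


def parse_ipv4_header(snap):
    """Decode the fixed part of an IPv4 header from captured bytes."""
    if len(snap) < 20:
        raise ValueError("snap length too short for an IPv4 header")
    (ver_ihl, _tos, total_len, _ident, _flags_frag,
     ttl, proto, _checksum, src, dst) = struct.unpack(IPV4_FORMAT, snap[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,   # IHL is in 32-bit words
        "total_len": total_len,
        "ttl": ttl,
        "protocol": proto,                    # 6 = TCP, 17 = UDP, 1 = ICMP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


# Hand-built header: 40-byte TCP packet, TTL 64 (checksum left as 0).
header = struct.pack(IPV4_FORMAT, 0x45, 0, 40, 1, 0, 64, 6, 0,
                     socket.inet_aton("192.0.2.1"),
                     socket.inet_aton("198.51.100.7"))
fields = parse_ipv4_header(header)
print(fields)
```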


Analysis of Packet Traces
• IP header
  – Traffic volume by IP addresses or protocol
  – Burstiness of the stream of packets
  – Packet properties (e.g., sizes, out-of-order, etc.)
• TCP header
  – Traffic breakdown by application (e.g., Web)
  – TCP congestion and flow control
  – Number of bytes and packets per session
• Application header
  – URLs, HTTP headers (e.g., cacheable response?)
  – DNS queries and responses, user key strokes, …


Packet vs. Flow Measurement
• Basic statistics (available from both techniques)
  – Traffic mix by IP addresses, port numbers, and protocol
  – Average packet size
• Traffic over time
  – Both: traffic volumes on a medium-to-large time scale
  – Packet: burstiness of the traffic on a small time scale
• Statistics per TCP connection
  – Both: number of packets & bytes transferred over the link
  – Packet: frequency of lost or out-of-order packets, and the number of application-level bytes delivered
• Per-packet info (available only from packet traces)
  – TCP seq/ack #s, receiver window, per-packet flags, …
  – Probability distribution of packet sizes
  – Application-level header and body (full packet contents)


Network Topology Measurement
• Use traceroute
  – Pros
    • Can be done at end hosts
    • “Router-level” topology
    • Can obtain a “sample” of the “global” Internet topology
  – Cons
    • Active measurement; incurs overhead/load on routers
    • Not all routers respond to traceroutes
    • IP address aliasing problem
      – Also, MPLS tunnels may “obscure” the real topology
    • Only “sampled”, or “snapshots”
• BGP routing data
  – “Global” AS-level topology
  – Partial view, unless you can get BGP data from all BGP routers
• ISP topology
  – If you are the ISP operator, an easier task, but not necessarily an easy task

CSci5221: Network Failures and Fast Convergence

OSPF Protocol: A Quick Recap
• Link-state protocol
  – Routers flood Link State Advertisements (LSAs)
  – Routers compute shortest paths based on weights
  – Routers identify next-hop to reach other routers

[Figure: example network topology annotated with OSPF link weights]
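
The "compute shortest paths" and "identify next-hop" steps can be sketched with Dijkstra's algorithm over the link-state database. The four-router topology and its weights below are made up for illustration (they are not the ones in the slide's figure), and real OSPF implementations also handle areas, equal-cost multipath, and other details omitted here.

```python
import heapq


def shortest_paths(graph, source):
    """Dijkstra over {node: {neighbor: weight}}.

    Returns (dist, next_hop): the cost to each node, and the first hop
    the source should use to reach it.
    """
    dist = {source: 0}
    next_hop = {}
    heap = [(0, source, None)]
    while heap:
        d, node, first = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                # Leaving the source, the first hop is the neighbor itself.
                hop = neighbor if node == source else first
                next_hop[neighbor] = hop
                heapq.heappush(heap, (candidate, neighbor, hop))
    return dist, next_hop


# Hypothetical topology with symmetric link weights.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
dist, next_hop = shortest_paths(graph, "A")
print(dist, next_hop)
```

Because every router floods its LSAs, every router (and a passive monitor) can run the same computation on the same database.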


Measurement: Intradomain Route Monitoring
• OSPF is a flooding protocol
  – Every link-state advertisement is sent on every link
  – Very helpful for simplifying the monitor
• Can participate in the protocol
  – Shared media (e.g., Ethernet)
    • Join multicast group and listen to LSAs
  – Point-to-point links
    • Establish an adjacency with a router
• … or passively monitor packets on a link
  – Tap a link and capture the OSPF packets


Intradomain Route Monitoring
• Construct continuous view of topology
  – Detect when equipment goes up or down
  – Input to traffic-engineering and planning tools
• Detect routing anomalies
  – Identify failures, LSA storms, and route flaps
  – Verify that LSA load matches expectations
  – Flag strange weight settings as misconfigurations
• Analyze convergence delay
  – Monitor LSAs in multiple locations with go
  – Compare the times when LSAs arrive
• Detect router implementation mistakes
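
The convergence-delay idea can be sketched as follows: if the same LSA is timestamped at several monitors, the spread between its first and last arrival gives a rough measure of how long flooding took. The data layout and names are hypothetical, and the comparison is only meaningful if the monitors' clocks are synchronized.

```python
def flooding_spread(arrivals):
    """arrivals: {lsa_id: {monitor_name: arrival_time_sec}}.

    Returns {lsa_id: last_arrival - first_arrival} for LSAs seen by at
    least two monitors.
    """
    return {
        lsa: max(times.values()) - min(times.values())
        for lsa, times in arrivals.items()
        if len(times) >= 2
    }


# Hypothetical arrival times (seconds) at three monitors.
arrivals = {
    "lsa-1": {"m1": 10.0, "m2": 10.25, "m3": 10.5},
    "lsa-2": {"m1": 20.0},   # seen at one monitor only, so no spread
}
delays = flooding_spread(arrivals)
print(delays)
```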


Passive Collection of LSAs
• OSPF is a flooding protocol
  – Every LSA sent on every participating link
  – Very helpful for simplifying the monitor
• Can participate in the protocol
  – Shared media (e.g., Ethernet)
    • Join multicast group and listen to LSAs
  – Point-to-point links
    • Establish an adjacency with a router
• … or passively monitor packets on a link
  – Tap a link and capture the OSPF packets
• Note LSAs do not tell us the “root causes” of failures!
  – Need to gather route configurations, syslogs, …
  – Need to dig below IP: link/physical layers, …


Reducing Volume of Information
• Prioritizing the messages
  – Router failure over router recovery
  – Link failure or weight change over a refresh
  – Informational messages about weight settings
• Grouping related messages
  – Link failure: group messages for the two ends
  – Router failure: group the affected links
  – Common failure: group links failing close in time
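
The "close in time" grouping can be sketched as a single pass over time-sorted events: an event joins the current group if it arrives within a window of the previous event, otherwise it starts a new group. The one-second window and the event descriptions are assumptions for illustration; a production monitor would also use topology (shared router or shared link) to decide what belongs together.

```python
def group_by_time(events, window=1.0):
    """Group (timestamp_sec, description) events separated by <= window."""
    groups = []
    for ts, desc in sorted(events):
        if groups and ts - groups[-1][-1][0] <= window:
            groups[-1].append((ts, desc))   # close to the previous event
        else:
            groups.append([(ts, desc)])     # start a new group
    return groups


# Hypothetical link-down messages derived from LSAs.
events = [
    (0.0, "link r1-r2 down (reported by r1)"),
    (0.4, "link r1-r2 down (reported by r2)"),
    (5.0, "link r3-r4 down (reported by r3)"),
]
groups = group_by_time(events)
print([len(g) for g in groups])
```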


Anomalies Found in Shaikh04 paper
• Intermittent hardware problem
  – Router periodically losing OSPF adjacencies
  – Risk of network partition if 2nd failure occurred
• External link flaps
  – Congestion on edge link causing lost messages
  – Lost adjacency leading to flapping routes
• Configuration errors
  – Two routers assigned the same IP address
  – Inefficient config leading to duplicate LSAs
• Vendor implementation bug
  – More frequent refreshing of LSAs than specified


Measurement Challenges for Operators
• Network-wide view
  – Crucial for evaluating control actions
  – Multiple kinds of data from multiple locations
• Large scale
  – Large number of high-speed links and routers
  – Large volume of measurement data
• Poor state-of-the-art
  – Working within existing protocols and products
  – Technology not designed with measurement in mind
• The “do no harm” principle
  – Don’t degrade router performance
  – Don’t require disabling key router features
  – Don’t overload the network with measurement data


Network Operations Tasks
• Reporting of network-wide statistics
  – Generating basic information about usage and reliability
• Performance/reliability troubleshooting
  – Detecting and diagnosing anomalous events
• Security
  – Detecting, diagnosing, and blocking security problems
• Traffic engineering
  – Adjusting network configuration to the prevailing traffic
• Capacity planning
  – Deciding where and when to install new equipment


Basic Reporting
• Producing basic statistics about the network
  – For business purposes, network planning, ad hoc studies
• Examples
  – Proportion of transit vs. customer-customer traffic
  – Total volume of traffic sent to/from each private peer
  – Mixture of traffic by application (Web, Napster, etc.)
  – Mixture of traffic to/from individual customers
  – Usage, loss, and reliability trends for each link
• Requirements
  – Network-wide view of basic traffic and reliability statistics
  – Ability to “slice and dice” measurements in different ways (e.g., by application, by customer, by peer, by link type)


Troubleshooting
• Detecting and diagnosing problems
  – Recognizing and explaining anomalous events
• Examples
  – Why a backbone link is suddenly overloaded
  – Why the route to a destination prefix is flapping
  – Why DNS queries are failing with high probability
  – Why a route processor has high CPU utilization
  – Why a customer cannot reach certain Web sites
• Requirements
  – Network-wide view of many protocols and systems
  – Diverse measurements at different protocol levels
  – Thresholds for isolating significant phenomena


Security
• Detecting and diagnosing problems
  – Recognizing suspicious traffic or disruptions
• Examples
  – Denial-of-service attack on a customer or service
  – Spread of a worm or virus through the network
  – Route hijack of an address block by adversary
• Requirements
  – Detailed measurements from multiple places
  – Including deep-packet inspection, in some cases
  – Online analysis of the data
  – Installing filters to block the offending traffic


Traffic Engineering
• Adjusting resource allocation policies
  – Path selection, buffer management, and link scheduling
• Examples
  – OSPF weights to divert traffic from congested links
  – BGP policies to balance load on peering links
  – Link-scheduling weights to reduce delay for “gold” traffic
• Requirements
  – Network-wide view of the traffic carried in the backbone
  – Timely view of the network topology and configuration
  – Accurate models to predict impact of control operations (e.g., the impact of RED parameters on TCP throughput)


Capacity Planning
• Deciding whether to buy/install new equipment
  – What? Where? When?
• Examples
  – Where to put the next backbone router
  – When to upgrade a link to higher capacity
  – Whether to add/remove a particular peer
  – Whether the network can accommodate a new customer
  – Whether to install a caching proxy for cable modems
• Requirements
  – Projections of future traffic patterns from measurements
  – Cost estimates for buying/deploying the new equipment
  – Model of the potential impact of the change (e.g., latency reduction and bandwidth savings from a caching proxy)


Examples of Public Data Sets
• Network-wide data
  – Abilene and GEANT backbones
  – Netflow, IGP, and BGP traces
• CAIDA DatCat
  – Data catalogue maintained by CAIDA
  – http://imdc.datcat.org/
• Interdomain routing
  – RouteViews and RIPE-NCC
  – BGP routing tables and update messages
• Traceroute and looking glass servers
  – http://www.traceroute.org/
  – http://www.nanog.org/lookingglass.html