The Sprint IP Monitoring Project and
Traffic Dynamics at a Backbone POP
Supratik Bhattacharyya
Sprint ATL, http://www.sprintlabs.com
The IP Group at Sprintlabs
Charter:
- Investigate IP technologies for robust, efficient, QoS-enabled networks
- Anticipate and evaluate new services and applications
Major Projects:
- Monitoring Sprint's IP Backbone
- Service Platform
IP Backbone: POP-to-POP view
POP: Point of Presence, typically a metropolitan area
[Diagram: POPs interconnected by OC-48, OC-12, and OC-3 links]
Motivation: Need for Monitoring
Current network is over-provisioned, over-engineered, best-effort...
- Diagnosis: detect and report problems at the IP level
- Management: configuration problems, traffic engineering, resource provisioning, network dimensioning
- Value-added service: feedback to customers (performance, traffic characteristics)
- Detect attacks and anomalies
Existing Measurement Efforts
- Passive measurements: SNMP-based tools, NetFlow (Cisco proprietary), OC3MON, OC12MON
- Active measurements: ping, traceroute, NIMI, MINC, Surveyor, Skitter, Keynote, Matrix
- Integrated approach: AT&T NetScope
  - Network topology and routes
  - Traffic at flow-level granularity
  - Delay and loss statistics
Our approach
- Passive monitoring: capture the first 44 bytes of every packet
  - full TCP/IP headers, no HTTP information
- GPS time stamping allows accurate correlation of packets on different links
- Day-long traces
- Simultaneously monitor multiple links and sites
- Collect routing information along with packet traces
- Traces archived for future use
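The 44-byte capture can be illustrated with a minimal record parser. This is only a sketch, not the IPMON code: it assumes a simplified record layout (an 8-byte GPS timestamp followed by the 44 captured header bytes) and IPv4 packets without IP options; the real DAG record format differs.

```python
import struct

# Hypothetical record layout: 8-byte timestamp + first 44 bytes of the packet
# (IPv4 header + TCP/UDP header start, no payload). Not the actual DAG format.
REC_FMT = "!d44s"
REC_LEN = struct.calcsize(REC_FMT)  # 52 bytes per record

def parse_record(rec: bytes):
    """Decode one trace record into (ts, src, dst, proto, sport, dport, length)."""
    ts, hdr = struct.unpack(REC_FMT, rec)
    # IPv4 header fields (RFC 791): total length at offset 2, protocol at 9,
    # source address at 12, destination address at 16.
    total_len = struct.unpack("!H", hdr[2:4])[0]
    proto = hdr[9]
    src = ".".join(str(b) for b in hdr[12:16])
    dst = ".".join(str(b) for b in hdr[16:20])
    ihl = (hdr[0] & 0x0F) * 4            # IP header length in bytes
    # Port fields are only meaningful for TCP/UDP, but both place them first.
    sport, dport = struct.unpack("!HH", hdr[ihl:ihl + 4])
    return ts, src, dst, proto, sport, dport, total_len
```

Because the GPS timestamps are globally synchronized, records parsed this way from different monitors can be correlated directly by timestamp.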
Applications
Data from a commercial Tier-1 IP backbone. Applications of the data:
- traffic modeling
- traffic engineering
- provisioning
- pricing, SLAs
- hardware design, in collaboration with vendors
- denial-of-service detection
Measurement Facilities
- IPMON system: collects packet traces by passively tapping the fiber with optical splitters; supports OC-3 to OC-48 data rates
- Data repository: large tape library to archive data
- Analysis platform: initially a 17-node computing cluster; SAN under deployment
IPMON Architecture
[Diagram: an optical splitter taps the OC-3/12/48 link; a SONET DAG card, disciplined by a GPS clock, writes packet records into a main-memory buffer and then to a disk array; the IPMON system is a Linux PC with multiple PCI buses]
Monitoring links at a POP
[Diagram: a POP with backbone routers connected to backbone links and peering points, and access routers connecting customers]
Current Status of IPMONs
Currently operational in one major west coast POP on OC3 links
Under way in two major east coast POPs for OC3 and OC12 -- (we hope by July 2001)
OC48 in preparation for 1 east coast POP and 1 west coast POP -- summer 2001
Future: Sprint Dial-Up Network, more POPs, European network
Practical Constraints
- Difficult to monitor an operational network: complex procedure for deploying equipment; POPs evolve too fast
- Too costly to be ubiquitous
- Technology limitations (PCs, disks, etc.): only off-line analysis is possible
- Are 44 bytes enough?
Ongoing Projects
- Routing and Traffic Dynamics
- Delay measurement across a router
- TCP flow analysis
- Denial of service
- Bandwidth provisioning and pricing
Routing and Traffic Dynamics Project
- Part 1: What are the traffic demands between pairs of POPs? How stable is this demand?
- Part 2: What paths do those demands take? Are link utilization levels similar throughout the backbone?
- Part 3: Is there a better way to spread the traffic across paths? At what level of granularity should traffic be split?
POP-to-POP Traffic Matrix
[Diagram: matrix with ingress POPs (City A, City B, City C) as rows and egress POPs as columns]
- Measure traffic over different timescales
- Divide traffic per destination prefix, protocol, etc.
- For every ingress POP: identify total traffic to each egress POP, then further analyze this traffic
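The per-(ingress, egress) accumulation can be sketched in a few lines. The input shape (an iterable of `(ingress_pop, dst_prefix, nbytes)` tuples) and the `egress_of` map are illustrative assumptions, not part of the IPMON tools:

```python
from collections import defaultdict

def traffic_matrix(packets, egress_of):
    """Accumulate bytes per (ingress POP, egress POP) matrix cell.

    `packets` yields (ingress_pop, dst_prefix, nbytes) tuples;
    `egress_of` maps a destination prefix to its egress POP
    (e.g. derived from BGP tables). Hypothetical shapes for illustration.
    """
    matrix = defaultdict(int)
    for ingress, prefix, nbytes in packets:
        matrix[(ingress, egress_of[prefix])] += nbytes
    return matrix
```

Running this over successive time windows gives the matrix at different timescales; replacing `nbytes` with per-protocol counters gives the finer per-protocol split mentioned above.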
Applications
- Intra-domain routing
- Analyzing routing anomalies
- Verifying BGP peering
- Capacity planning and dimensioning
- POP architecture
Mapping BGP destinations to POPs
- Step 1: From the BGP table, find the best next hop for each destination -> (Dst, Next-Hop)
- Step 2: Extract the unique next hops
- Step 3: Recursive BGP lookup to find the last Sprint hop -> (Next-Hop, Last Sprint Hop)
- Step 4: Map each last Sprint hop to its POP -> (Next-Hop, POP)
- Step 5: Map each destination to a POP -> (BGP Dst, POP)
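The recursive lookup can be sketched as below. The table shapes (`igp_routes`, `sprint_hops`, `pop_of_hop`) are illustrative assumptions, not the actual Sprint data structures:

```python
def last_sprint_hop(next_hop, igp_routes, sprint_hops):
    """Follow next-hop entries recursively until a hop inside Sprint is found.

    `igp_routes` maps an address to the next hop toward it; `sprint_hops`
    is the set of addresses belonging to the Sprint backbone. Hypothetical.
    """
    seen = set()
    hop = next_hop
    while hop not in sprint_hops:
        if hop in seen or hop not in igp_routes:
            return None                    # loop or unresolved hop: give up
        seen.add(hop)
        hop = igp_routes[hop]
    return hop

def dst_to_pop(bgp_table, igp_routes, sprint_hops, pop_of_hop):
    """Build the (BGP destination prefix -> egress POP) map from the slide."""
    mapping = {}
    for dst, next_hop in bgp_table.items():
        hop = last_sprint_hop(next_hop, igp_routes, sprint_hops)
        if hop is not None:
            mapping[dst] = pop_of_hop[hop]
    return mapping
```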
Data Processing
- Step 1: Use BGP tables to generate a [prefix, egress POP] map
- Step 2: Run IP lookup software on the packet trace using the above map; output: a single trace file per egress POP, e.g. all packets headed to POP k from the monitored POP
- Step 3: Use our traffic analysis tool to evaluate statistics
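Step 2's per-packet lookup is a longest-prefix match against the [prefix, egress POP] map. Production lookup software uses a trie; this is only a sketch using a linear scan over prefixes sorted longest-first:

```python
import ipaddress

def build_lookup(prefix_to_pop):
    """Pre-parse [prefix, egress POP] pairs, longest prefix first."""
    nets = [(ipaddress.ip_network(p), pop) for p, pop in prefix_to_pop.items()]
    return sorted(nets, key=lambda np: np[0].prefixlen, reverse=True)

def egress_pop(addr, lookup):
    """Longest-prefix match of a destination address against the map.

    Because the list is sorted by descending prefix length, the first
    containing network is the most specific match.
    """
    a = ipaddress.ip_address(addr)
    for net, pop in lookup:
        if a in net:
            return pop
    return None
```

Splitting a trace then amounts to appending each packet record to the output file named by `egress_pop(dst, lookup)`.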
Monitored links at a single POP
CoreCore
Core
Peer 2
Access
Access
Access
Access
web hosting
ISP
Peer 1
Data
5 traces collected on Aug 9, 2000:

  Access Link Type | Trace Length (hours)
  -----------------+---------------------
  Webhost 1        | 19
  Webhost 2        | 13
  Peer 1           | 24
  Peer 2           | 15
  ISP              | 8
Summary
- Wide disparity in "traffic demands" among egress POPs
- POPs can be roughly categorized as small, medium, or large, and they maintain their rank during the day
- Traffic is heterogeneous in space yet stable in time
- Traffic varies by (access link, egress POP) pair
- Time-of-day behaviour is hard to characterize: 20-50% reduction at night
Routing and Traffic Dynamics Project
- Part 1: What are the traffic demands between pairs of POPs? How stable is this demand?
- Part 2: What paths do those demands take? Are link utilization levels similar throughout the backbone?
- Part 3: Is there a better way to spread the traffic across paths? At what level of granularity should traffic be split?
What we’ve seen so far
- Wide disparity in traffic demands between (ingress, egress) POP pairs
- Wide disparity in link utilization levels, with many underutilized routes
- Routing policies concentrate traffic on a few paths
Question: Can we divert some traffic to the lightly loaded paths?
Routing and Traffic Dynamics Project
- Part 1: What are the traffic demands between pairs of POPs? How stable is this demand?
- Part 2: What paths do those demands take? Are link utilization levels similar throughout the backbone?
- Part 3: Is there a better way to spread the traffic across paths? At what level of granularity should traffic be split?
Creating traffic aggregates
- To study splitting traffic over multiple paths, we need to define "streams" within the traffic
- How should packets be aggregated into streams?
  - Coarse granularity: POP-to-POP
  - Very fine granularity: the 5-tuple
  - Initial criterion: destination address prefix
Elephants and Mice among /8 streams
Stream: all packets in a group with the same /8 destination address prefix
[Plot: traffic grouped by egress POP; ingress: Webhost link]
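Grouping packets into /8 streams and separating elephants from mice can be sketched as follows. The 80% byte-share cutoff for "elephants" is an assumed illustrative threshold, not a definition from this work:

```python
from collections import Counter

def streams_by_slash8(packets):
    """Aggregate bytes per /8 destination prefix (the 'stream' of the slide).

    `packets` yields (dst_address, nbytes) tuples; hypothetical input shape.
    """
    bytes_per_stream = Counter()
    for dst, nbytes in packets:
        bytes_per_stream[dst.split(".")[0] + ".0.0.0/8"] += nbytes
    return bytes_per_stream

def elephants(bytes_per_stream, fraction=0.8):
    """Smallest set of streams carrying `fraction` of total bytes.

    The 0.8 default is an arbitrary illustrative threshold.
    """
    total = sum(bytes_per_stream.values())
    picked, acc = [], 0
    for prefix, b in bytes_per_stream.most_common():
        if acc >= fraction * total:
            break
        picked.append(prefix)
        acc += b
    return picked
```

The same aggregation applied with `/16` or `/24` masks exposes the recursive elephant-and-mice structure described on the next slide.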
Observations about prefix-based streams
- Recursive structure: a /8 elephant contains a few /16 elephants and many mice; likewise at the /24 level
- The phenomenon is less pronounced at the /24 level
- Question: Are elephants stable?
  - Definition: R_i(n) = the rank of flow i in time slot n
  - Delta(i, n, k) = | R_i(n) - R_i(n+k) |, where each time slot is 30 minutes
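The rank-displacement metric can be computed directly from per-slot byte counts; a minimal sketch with hypothetical helper names:

```python
def ranks(byte_counts):
    """Rank streams by volume within one time slot: rank 1 = largest."""
    order = sorted(byte_counts, key=byte_counts.get, reverse=True)
    return {stream: r for r, stream in enumerate(order, start=1)}

def rank_shift(slot_n, slot_n_plus_k):
    """Delta(i, n, k) = |R_i(n) - R_i(n+k)| for streams present in both slots.

    Inputs are {stream: bytes} dicts for the two 30-minute slots.
    """
    rn, rk = ranks(slot_n), ranks(slot_n_plus_k)
    return {i: abs(rn[i] - rk[i]) for i in rn.keys() & rk.keys()}
```

Small displacements for the top-ranked streams over many slot pairs would indicate that elephants are stable enough to route separately.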
Conclusions
- Monitoring and measurement are key to better network design
- IPMON: a passive monitoring system for packet-level information
- We have used our data to build components of traffic matrices for traffic engineering
- Backbone traffic can be better load-balanced: destination prefix is a possible (simple) splitting criterion