![Page 1: Automatic Test Packet Generation - SIGCOMMconferences.sigcomm.org/co-next/2012/slides/Zeng_66.pdf · Automatic Test Packet Generation James Hongyi Zeng with Peyman Kazemian, George](https://reader031.fdocuments.in/reader031/viewer/2022030506/5ab55dfa7f8b9a1a048cc113/html5/thumbnails/1.jpg)
Automatic Test Packet Generation
James Hongyi Zeng
with Peyman Kazemian, George Varghese, Nick McKeown
Stanford University, UCSD, Microsoft Research
http://eastzone.github.com/atpg/
CoNEXT 2012, Nice, France
CS@Stanford Network Outage
Tue, Oct 2, 2012 at 7:54 PM:
“Between 18:20-19:00 tonight we experienced a complete network outage in the building when a loop was accidentally created by CSD-CF staff. We're investigating the exact circumstances to understand why this caused a problem, since automatic protections are supposed to be in place to prevent loops from disabling the network.”
2
Outages in the Wild
Hosting.com's New Jersey data center was taken down on June 1, 2010, igniting a cloud outage and connectivity loss for nearly two hours… Hosting.com said the connectivity loss was due to a software bug in a Cisco switch that caused the switch to fail.
On April 26, 2010, NetSuite suffered a service outage that rendered its cloud-based applications inaccessible to customers worldwide for 30 minutes… NetSuite blamed a network issue for the downtime.
The Planet was rocked by a pair of network outages that knocked it off line for about 90 minutes on May 2, 2010. The outages caused disruptions for another 90 minutes the following morning.... Investigation found that the outage was caused by a fault in a router in one of the company's data centers.
3
Is network troubleshooting a problem?
• Survey of NANOG mailing list (June 2012)
– Data set: 61 respondents, including 23 medium-size networks (<10K hosts) and 12 large networks (<100K hosts)
– Frequency: 35% generate >100 tickets per month
– Downtime: 25% take over an hour to resolve a ticket (downtime is estimated to cost $60K-110K/hour [1])
– Current tools: Ping, Traceroute, SNMP
– 70% asked for better tools and automatic tests
[1] http://www.evolven.com/blog/downtime-outages-and-failures-understanding-their-true-costs.html
4
The Battle
Hardware: buffers, fiber cuts, broken interfaces, mis-labeled cables, flaky links
Software: firmware bugs, crashed module
vs
ping, traceroute, SNMP, tcpdump + wisdom and intuition
5
Automatic Test Packet Generation
Goal: automatically generate test packets to test the network state, and pinpoint faults before applications notice them.
Augment human wisdom and intuition. Reduce downtime. Save money.
Non-Goal: ATPG cannot explain why forwarding state is in error.
6
ATPG Workflow
[Diagram: the network supplies its topology, FIBs, and ACLs to ATPG; ATPG sends test packets into the network and collects the test results.]
7
Systematic Testing
• Comparison: chip design
– Testing is a billion dollar market
– ATPG = Automatic Test Pattern Generation
8
Roadmap
• Reachability Analysis
• Test packet generation and selection
• Fault localization
• Implementation and Evaluation
9
Reachability Analysis
• Header Space Analysis (NSDI 2012)
• All-pairs reachability: Compute all classes of packets that can flow between every pair of ports.
[Diagram: Header Space Analysis takes FIBs, config files, and topology as input; for each port pair <Port X, Port Y> it outputs all Forwarding Equivalent Classes (FECs) flowing X->Y.]
10
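The reachability computation can be illustrated with a toy sketch. It is not the Hassel library and not the authors' implementation: the 4-bit headers, the box and rule contents, and the function names `intersect` and `reachability` are all assumptions made for illustration. Headers are ternary strings in which `x` is a wildcard, and the rules inside each box are assumed not to overlap, so rule priorities can be ignored.

```python
from typing import Optional

def intersect(a: str, b: str) -> Optional[str]:
    """Intersect two ternary headers ('0', '1', 'x' = wildcard); None if empty."""
    out = []
    for p, q in zip(a, b):
        if p == "x":
            out.append(q)
        elif q == "x" or p == q:
            out.append(p)
        else:
            return None  # a fixed bit disagrees, so no packet matches both
    return "".join(out)

# Hypothetical network: box -> {rule name: (match pattern, next hop)}.
# A next hop is either another box or an egress port.
BOXES = {
    "A": {"rA1": ("10xx", "B"), "rA2": ("0xxx", "C")},
    "B": {"rB1": ("1xx0", "PB"), "rB2": ("01xx", "PB")},
    "C": {"rC1": ("00xx", "PC"), "rC2": ("01xx", "B")},
}
INGRESS = {"PA": "A"}  # ingress port -> box it attaches to

def reachability(in_port, box, header="xxxx", history=()):
    """Push a header set through the boxes, recording every rule it exercises."""
    rows = []
    for name, (match, nxt) in BOXES[box].items():
        h = intersect(header, match)
        if h is None:
            continue
        hist = history + (name,)
        if nxt in BOXES:
            rows.extend(reachability(in_port, nxt, h, hist))
        else:
            rows.append((in_port, nxt, h, hist))  # reached an egress port
    return rows

# Each row: (in port, out port, packet class, rules exercised).
table = [row for port, box in INGRESS.items() for row in reachability(port, box)]
for row in table:
    print(row)
```

Each row is one class of packets together with the rules it exercises along its path, which is the information test packet selection needs.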
Example
[Figure: three boxes with ports PA, PB, and PC. Box A holds rules rA1, rA2, rA3; Box B holds rules rB1, rB2, rB3, rB4; Box C holds rules rC1, rC2.]
11
All-pairs reachability
[Figure: the example network (Box A, Box B, Box C; ports PA, PB, PC).]
12
New Viewpoint: Testing and coverage
• HSA represents networks as chips/programs
• Standard testing finds inputs that cover every gate/flipflop (HW) or branch/function (SW)
[Diagram labels, chip side: chip model (Boolean algebra), test patterns, testbench, device under test, results, cover. Network side: HSA network model (reachability), test packets, network under test.]
13
New Viewpoint: Testing and coverage
• In networks, packets are the inputs, and there are different covers
– Links: packets that traverse every link
– Queues: packets that traverse every queue
– Rules: packets that test each router rule
• Mission impossible?
– Testing all rules 10 times per second needs <1% of link capacity (Stanford/Internet2)
14
Roadmap
• Reachability Analysis
• Test packet generation and selection
• Fault localization
• Implementation and Evaluation
15
All-pairs reachability and covers
[Figure: the example network (Box A, Box B, Box C; ports PA, PB, PC).]
16
Test Packet Selection
• The all-pairs reachability table contains more packets than necessary
• Goal: select a minimum subset of packets whose histories cover the whole rule set
• This is a Min-Set-Cover problem
17
Min-Set-Cover
[Figure: two panels mapping packets to rules R1-R6. Before selection, packets A through G cover the rules; after Min-Set-Cover, packets B, C, and G alone cover R1-R6.]
18
Test Packets Selection
• Min-Set-Cover splits the test packets into two groups
– Regular packets: exercise all rules; sent out periodically
– Reserved packets: “redundant”; will be used in fault localization
• Min-Set-Cover
– Optimization is NP-Hard
– Polynomial approximation (O(N^2))
19
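One common polynomial-time approximation for Min-Set-Cover is the standard greedy heuristic: repeatedly pick the packet that covers the most still-uncovered rules. The sketch below assumes that heuristic; the slides do not say which approximation ATPG uses. The packet-to-rule mapping is invented for illustration, and the name `greedy_set_cover` is hypothetical.

```python
def greedy_set_cover(histories):
    """Split packets into regular (chosen cover) and reserved (left over).

    histories maps each packet to the set of rules it exercises.
    """
    uncovered = set().union(*histories.values())
    regular = []
    while uncovered:
        # Take the packet covering the most uncovered rules; sorting the
        # packet names makes ties resolve deterministically.
        best = max(sorted(histories), key=lambda p: len(histories[p] & uncovered))
        regular.append(best)
        uncovered -= histories[best]
    reserved = sorted(set(histories) - set(regular))
    return regular, reserved

# Hypothetical histories for packets A..G over rules R1..R6.
histories = {
    "A": {"R1"},
    "B": {"R1", "R2", "R3"},
    "C": {"R3", "R4"},
    "D": {"R4"},
    "E": {"R2"},
    "F": {"R5"},
    "G": {"R5", "R6"},
}

regular, reserved = greedy_set_cover(histories)
print("regular:", regular)
print("reserved:", reserved)
```

With this invented input, three regular packets exercise all six rules, and the four left over become reserved packets for fault localization.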
Roadmap
• Reachability analysis
• Test packet generation and selection
• Fault localization
• Evaluation: offline (Stanford/Internet2), emulated network, experimental deployment
20
Fault Localization
21
Fault Localization
• Network Tomography? → Minimum Hitting Set
• In ATPG: we can choose packets!
• Step 1: Use results from regular test packets
– F (potentially broken rules) = Union from all failing packets
– P (known good rules) = Union from all passing packets
– Suspect Set = F – P
[Diagram: sets F and P; the suspects are the part of F outside P.]
22
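Step 1 is a set difference. A minimal sketch, assuming each test packet's rule history is known and each result is a simple pass/fail boolean (the packet names, rule names, and the function name `suspect_set` are hypothetical):

```python
def suspect_set(results, histories):
    """Return F - P: rules seen on failing packets but on no passing packet."""
    failing, passing = set(), set()
    for pkt, passed in results.items():
        (passing if passed else failing).update(histories[pkt])
    return failing - passing

# Hypothetical regular test packets and the rules each one exercises.
histories = {
    "p1": {"R1", "R2"},
    "p2": {"R2", "R3", "R5"},
    "p3": {"R4"},
}
results = {"p1": True, "p2": False, "p3": True}  # True = packet arrived as expected

suspects = suspect_set(results, histories)
print(sorted(suspects))
```

Here R2 appears on the failing packet p2 but also on the passing packet p1, so it is treated as known good and dropped from the suspects.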
Fault Localization
• Step 2: Use reserved test packets
– Pick packets that test only one rule in the suspect set, and send them out for testing
– Passed: eliminate the rule from the suspect set
– Failed: label the rule as “broken”
• Step 3: (Brute force…) Continue with test packets that test two or more rules in the suspect set, until the set is small enough
23
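Step 2 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: `refine_suspects` and `send_probe` are hypothetical names, and `send_probe` stands in for whatever mechanism injects a reserved packet and reports whether it arrived. The simulated network at the bottom assumes R5 is the only faulty rule.

```python
def refine_suspects(suspects, reserved, send_probe):
    """Use reserved packets that touch exactly one suspect rule.

    Returns (broken, remaining): rules confirmed broken, and suspects that
    no single-suspect packet could resolve.
    """
    broken = set()
    remaining = set(suspects)
    for pkt, rules in reserved.items():
        overlap = rules & suspects
        if len(overlap) != 1:
            continue  # touches zero or several suspects; left for step 3
        (rule,) = overlap
        if rule not in remaining:
            continue  # already resolved by an earlier probe
        if not send_probe(pkt):
            broken.add(rule)  # Failed: label the rule as broken
        remaining.discard(rule)  # Passed or failed, the rule is now resolved
    return broken, remaining

# Hypothetical reserved packets and their rule histories.
reserved = {
    "q1": {"R1", "R3"},
    "q2": {"R4", "R5"},
    "q3": {"R3", "R5"},
}

def send_probe(pkt):
    """Simulated network in which only rule R5 is faulty."""
    return "R5" not in reserved[pkt]

broken, remaining = refine_suspects({"R3", "R5"}, reserved, send_probe)
print("broken:", sorted(broken), "unresolved:", sorted(remaining))
```

Packet q3 touches two suspect rules, so this step skips it; packets like it are what step 3 would fall back on.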
Roadmap
• Reachability analysis
• Test packet generation and selection
• Fault localization
• Implementation and Evaluation
24
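Putting them all together
[Diagram: the ATPG pipeline. (1) A parser reads topology, FIBs, ACLs, etc. and produces transfer functions. (2) Header Space Analysis computes all-pairs reachability. (3) The test packet generator (sampling + Min-Set-Cover) selects packets from the all-pairs reachability table. (4) Test terminals send the test packets. (5) Fault localization runs on the results.]
All-pairs Reachability Table
| Header | In Port | Out Port | Rules |
|---|---|---|---|
| 10xx… | 1 | 2 | R1,R5,R20 |
| … | … | … | … |
25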
Implementation
• Cisco/Juniper Parsers
– Translate router configuration files and forwarding tables (FIB) into header space representation
• Test Packet Generation/Selection
– Hassel: a Python header space library
– Min-Set-Cover
– Python's multiprocessing module to parallelize
• SDN can simplify the design
26
Datasets
• Stanford and Internet2
– Public datasets
• Stanford University backbone
– ~10,000 HW forwarding entries (compressed from 757,000 FIB rules), 1,500 ACLs
– 16 Cisco routers
• Internet2
– 100,000 IPv4 forwarding entries
– 9 Juniper routers
27
Test Packet Generation
| | Stanford | Internet2 |
|---|---|---|
| Computation Time | ~1 hour | ~40 min |
| Regular Packets | 3,871 | 35,462 |
| Packets/Port (Avg) | 12.99 | 102.8 |
| Min-Set-Cover Reduction | 160x | 85x |
<1% link utilization when testing 10 times per second!
[Slide annotation: Ruleset structure]
28
Using ATPG for Performance Testing
• Beyond functional problems, ATPG can also be used for detecting and localizing performance problems
• Intuition: generalize results of a test from success/failure to performance (e.g. latency)
• To evaluate, we used an emulated Stanford network in Mininet-HiFi
– Open vSwitch as routers
– Same topology, translated into OpenFlow rules
• Users can inject performance errors
29
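One way to read "generalize results from success/failure to performance" is to turn each measured latency into a pass/fail verdict against a per-packet baseline, so that the existing fault localization logic can be reused unchanged. The sketch below assumes that reading; the baseline values, the 2x tolerance, and the name `to_pass_fail` are hypothetical.

```python
def to_pass_fail(latency_ms, baseline_ms, tolerance=2.0):
    """Mark a test packet as passing if its latency stays within tolerance x baseline."""
    return {
        pkt: latency_ms[pkt] <= tolerance * baseline_ms[pkt]
        for pkt in latency_ms
    }

# Hypothetical per-packet baselines and one round of measurements (milliseconds).
baseline_ms = {"p1": 1.0, "p2": 1.2, "p3": 0.8}
measured_ms = {"p1": 1.1, "p2": 9.5, "p3": 0.9}

results = to_pass_fail(measured_ms, baseline_ms)
print(results)
```

The resulting dictionary has the same shape as functional test results, so a slow packet can be treated like a failing one when narrowing down the congested link or queue.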
[Figure: network topology with switches s1, s2, s3, s4, s5 and routers bbra, yoza, boza, coza, pozb, poza, roza, goza.]
30
Does it work?
• Production Deployment
– 3 buildings on Stanford campus
– 30+ Ethernet switches
• Link cover only (instead of rule cover)
– 51 test terminals
31
CS@Stanford Network Outage
Tue, Oct 2, 2012 at 7:54 PM:
“Between 18:20-19:00 tonight we experienced a complete network outage in the building when a loop was accidentally created by CSD-CF staff. We're investigating the exact circumstances to understand why this caused a problem, since automatic protections are supposed to be in place to prevent loops from disabling the network.”
32
[Figure with two annotations: “The problem in the email” and “Unreported problem”.]
33
ATPG Limitations
• Dynamic/Non-deterministic boxes
– e.g. NAT
• “Invisible” rules
– e.g. backup rules
• Transient network states
• Ambiguous states (work in progress)
– e.g. ECMP
34
Related work
[Diagram labels: Policy (“Group X can talk to Group Y”); Control Plane; Topology, Forwarding Rules; Forwarding State. Tools shown: NICE, Anteater, HSA, VeriFlow; ATPG.]
• Forwarding Rule != Forwarding State
• Topology on File != Actual Topology
35
Takeaways
• ATPG tests the forwarding state by generating minimal link, queue, rule covers automatically
• Brings lens of testing and coverage to networks
• For Stanford/Internet2, testing 10 times per second needs <1% of link overhead
• Works in real networks.
36