UH SWARM: Dense perfSONAR Deployment With Small, Inexpensive Devices
Alan Whinery, U. Hawaii ITS
Internet2 Global Summit 2015
April 28, 2015
The Swarm
● Wrote a paragraph into our CC-NIE campus networking proposal about using recently available ~$50 computers to “sense” the network with elements of perfSONAR.
● Funded a project to deploy 100 nodes on one campus over two years, exploiting the ~$50 price point to build a dense mesh.
Goals/Challenges
● Finding nodes to buy in the face of market exhaustion
● Getting node deployment work-flow down to nil
– Keeping recoveries of off-line nodes to a minimum
● Tracking assets and reliability, generating metrics
● Evaluating capabilities of the whole set-up
● Developing a test program for many nodes
● Slicing/Dicing data to see what it has to tell us
● Developing visualizations and distillations to put tools in the hands of network maintainers, merging into the pS Toolkit
Devices We Have/Are Getting
● Raspberry Pi – famous, $50, med-perf, file system on SD card, 100 Mb Ethernet, USB 2.0
● BeagleBone Black – $50, more perf, FS on internal flash, and/or SD card, 100 Mb, USB 2.0
Honorable mention:
● CuBox i4 – $147, more perf, FS on SD, GigE, WiFi, USB 2.0
● MiraBox – $149, most perf, FS on SD, dual GigE, WiFi, USB 3.0
Reliability
● Raspberry Pi (July 2014)
– UH ITS owns 47; 1 has failed
– 22 SD card hard failures
– 10 file-system failures
● BeagleBone Black Rev. A (December 2013)
– UH ITS owns 10 (+50 NIB); 1 has corrupted firmware
– 9 in production; one had to be power-cycled, once
● CuBox – one deployed; 6 months of service, zero problems (using SD from OEM).
SD Cards
● Dane-Elec 8 GB Class 4
– 10 cards, 2 failures in light duty
● SanDisk Ultra 8 GB Class 10
– 10 cards, 0 failures, 3 FS corruptions in 42k hours
● Kingston 8 GB Class 10
– 10 cards, 0 failures, 7 FS corruptions in 42k hours
● Kingston 4 GB Class 4
– 20 hard failures in less than 20k hours
– (100% across 6 weeks, < 1000 hr MTBF)
● SanDisk Ultra 8 GB Class 10
– Most recent batch of replacements
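The Kingston Class 4 MTBF figure can be sanity-checked from the slide's numbers: 20 cards, all failed within roughly six weeks of continuous duty. A minimal sketch; the assumption that all 20 cards ran the full six-week window is mine, which makes the result an upper bound:

```python
# Rough MTBF estimate for the Kingston 4 GB Class 4 batch:
# 20 cards, all failed within ~6 weeks of continuous operation.
cards = 20
weeks_in_service = 6                               # assumed full window
device_hours = cards * weeks_in_service * 7 * 24   # total powered-on hours
failures = 20

# Upper bound, since each failure occurred before the window closed.
mtbf_hours = device_hours / failures
print(f"MTBF <= {mtbf_hours:.0f} hours")
```

This lands at about 1000 hours, consistent with the slide's "< 1000 hr MTBF".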
Year 1
● Tried 10 BeagleBones, liked them
– And a few Raspberry Pis
● The market vacuum around the release of BBB Rev. C made BBB impossible to obtain
● Bought 43 Raspberry Pis
● Although we are going with the BeagleBone Black for the completion, we could make Raspberry Pis work if necessary.
● Bought 2 Dell rack servers as test facilitators and data archives.
2nd Year Completion
● 50 BeagleBone Black Rev. C (4 GB internal flash)
– BBB internal flash is more reliable than SD
– Internal flash + SD card enables separating system and data partitions
– Better 100 Mb Ethernet performance
● 5 Raspberry Pi 2 Model B
● As the number deployed approaches 100, we will be placing nodes in new/special roles.
Management
● Puppet/The Foreman
– https://puppetlabs.com/
– http://theforeman.org/
– Easy to push changes and updates out to the swarm.
– Also easy to push errors out to the swarm and require 50 SSH sessions.
● Work-flow – try to minimize per-node actions and attended setup
– RPi – ua-netinstall with tweaks for Puppetization
– BBB – custom SD that auto-images the internal flash
Characteristics Of Dense Sensor Deployment
● Having many observations makes the loss of a single one less important.
● You can correlate topology and test results to “triangulate” on the source of a problem.
Test Programs: powstream (owamp)
● powstream from the pS Toolkit node to/from each sensor node
– Really, really, really boring at first glance: all loss appears to be about zero, with always one or two nodes losing a packet per day (1 in 864,000)
– Standard deviation in latency groups is somewhat interesting; it may reflect queuing, and flares in latency std dev may precede loss events
– Longitudinal analysis reveals damaging loss rates that would otherwise be invisible
– Higher packet rates might expose low loss probabilities in shorter time
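The last bullet can be quantified: if losses are independent with per-packet probability p, the expected number of packets until the first loss is 1/p, so the expected waiting time falls in proportion to the probe rate. A small illustration; the probe rates and the 1e-6 loss probability here are hypothetical:

```python
# Expected time to observe the first lost packet, assuming independent
# losses with per-packet probability p: E[packets] = 1/p,
# so E[time] = 1 / (p * rate).
def hours_to_first_loss(p, pps):
    """Expected hours of probing before one loss is seen."""
    return 1.0 / (p * pps) / 3600.0

for pps in (10, 100, 1000):
    print(f"{pps:5d} pps -> {hours_to_first_loss(1e-6, pps):8.2f} h "
          f"to expect one loss at p = 1e-6")
```

At 10 pps a 1e-6 loss probability takes more than a day of probing to show up even once; at 1000 pps it shows up within the hour.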
30 nodes, in/out
Mathis, Semke, Mahdavi, Ott, “The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm”, ACM SIGCOMM Computer Communication Review, Vol. 27, No. 3, July 1997
Slide: Phil Dykstra, SC2006. Used with permission.
Slide: Used with permission
Speed Limits You Can't See
For 45 milliseconds RTT, a typical minimum to get onto the continental US from Hawaii:

Loss Rate   Packets Lost/Day     TCP AIMD Coastal Limit    TCP AIMD Coastal Limit
            (10 pps powstream)   @1460 MSS (Mbit/s),       @8960 MSS (Mbit/s),
                                 45 ms RTT                 45 ms RTT
1.82E-05        15.75                  42.56                    261.18
2.25E-06         1.94                 121.11                    743.23
1.87E-06         1.62                 132.76                    814.72
9.38E-07         0.81                 187.58                   1151.16
6.05E-07         0.52                 233.55                   1433.28
5.93E-07         0.51                 236.03                   1448.52
3.35E-07         0.29                 314.03                   1927.21
2.51E-07         0.22                 362.49                   2224.57
1.74E-07         0.15                 435.64                   2673.49
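These limits follow the Mathis et al. model cited above, rate ≤ (MSS/RTT)·(C/√p). The sketch below reproduces the rows to within rounding; note the constant C ≈ 0.7 is inferred by fitting the table values, not stated on the slide:

```python
import math

def mathis_limit_mbps(p, mss_bytes, rtt_s=0.045, c=0.7):
    """TCP AIMD throughput ceiling (Mbit/s) per Mathis et al. 1997:
    rate <= (MSS/RTT) * (C / sqrt(p)).
    C ~= 0.7 is an assumption fitted to the table above."""
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(p)) / 1e6

for p in (1.82e-5, 2.25e-6, 1.74e-7):
    lost_per_day = p * 10 * 86400          # 10 pps powstream = 864,000 pkt/day
    print(f"p={p:.2e}  lost/day={lost_per_day:6.2f}  "
          f"1460 MSS: {mathis_limit_mbps(p, 1460):7.2f} Mb/s  "
          f"8960 MSS: {mathis_limit_mbps(p, 8960):7.2f} Mb/s")
```

The 8960-MSS column is simply the 1460-MSS column scaled by 8960/1460, which is why jumbo frames raise the ceiling by a little over 6x at the same loss rate.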
Test Programs: 50-Node Full-Mesh TCP Throughput
● <= 100 Mbps RPi/BBB throughput tests resemble real-life user flows
– Unlike a high-performance iperf tester, which “punches the network in the face”
● I run a 50x50 full-mesh iperf matrix (2450 tests) in about 7 hours (5-second tests).
● Full-mesh traceroute is collected concurrently
● By scoring every hop encountered on the average performance of the paths it appears in, a “per-hop confidence” can be derived.
● Using multi-rate UDP vs. TCP is worth investigating.
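The per-hop confidence idea above can be sketched: average each test's throughput onto every hop its traceroute traversed, so a hop shared by many poor paths sinks in the ranking. A toy illustration; the hop names and scores are hypothetical:

```python
from collections import defaultdict

# Toy data: each full-mesh test contributes a throughput result (Mbit/s)
# plus the traceroute hops its path traversed. All names are hypothetical.
tests = [
    (94.0, ["gw-a", "core-1", "gw-b"]),
    (92.5, ["gw-a", "core-1", "gw-c"]),
    (11.0, ["gw-a", "core-2", "gw-d"]),   # poor result
    (12.3, ["gw-e", "core-2", "gw-d"]),   # poor result, shares core-2/gw-d
]

totals, counts = defaultdict(float), defaultdict(int)
for score, hops in tests:
    for hop in hops:
        totals[hop] += score
        counts[hop] += 1

# Per-hop "confidence": mean throughput of all paths that crossed the hop.
confidence = {hop: totals[hop] / counts[hop] for hop in totals}
for hop, conf in sorted(confidence.items(), key=lambda kv: kv[1]):
    print(f"{hop:7s} {conf:6.1f}")
```

In this toy data core-2 and gw-d fall to the bottom of the ranking because only poor paths cross them, while gw-a stays mid-ranked since it carries good paths too.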
The Matrix
● Cut-out view of iperf3 tests to/from a chosen node...
● This row/column represents all tests to/from that chosen node.
● Leaves one wondering what the correlation is among the pink squares showing retransmissions
(Matrix axes: Sources × Destinations)
Correlating Full-Mesh Throughput And Traceroute Results For Fault Isolation
Graph of per-hop “confidence”, with colored links where retransmissions were observed (names/addresses obfuscated).
This graph shows hops involved in in-bound throughput testing between a chosen node and all partners.
Each oval represents an IP interface as reported in traceroute output.
Graph rendered from test data with GraphViz (graphviz.org).
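The rendering step can be sketched as emitting GraphViz DOT text, coloring the links where retransmissions were seen. The hop names and the retransmission set here are hypothetical:

```python
# Emit a GraphViz DOT description of traceroute hops, coloring links
# where the throughput tests reported retransmissions.
# All names below are hypothetical stand-ins for obfuscated interfaces.
edges = [("gw-a", "core-1"), ("core-1", "gw-b"), ("core-1", "gw-c")]
retrans_links = {("core-1", "gw-c")}   # links with observed retransmissions

lines = ["digraph hops {", "  node [shape=oval];"]
for a, b in edges:
    color = "red" if (a, b) in retrans_links else "black"
    lines.append(f'  "{a}" -> "{b}" [color={color}];')
lines.append("}")
dot = "\n".join(lines)
print(dot)   # render with: dot -Tpng hops.dot -o hops.png
```

Keeping the DOT generation as plain text means no library dependency on the nodes; only the machine that renders the picture needs GraphViz installed.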
Ongoing
● perfSONAR Toolkit integration
– Not so much new development as making some pieces fit together
● Correlation of other sources to zero in on a fault
– NetDot
– Flows/MRTG
● Ancillary programs
– Log collection (honeypot-ish info)
– Name-resolution tests (v6/v4 precedence)
PerfClub
● http://perfclub.org
● Monthly conference call for perfSONAR deployers
– 3rd Monday, 22:00 GMT
● Send email to [email protected] or [email protected] to join the mailing list.