Fast and Efficient Flow and Loss Measurement for Data...
Transcript of Fast and Efficient Flow and Loss Measurement for Data...
FastandEfficientFlowandLossMeasurementforDataCenterNetworks
YuliangLiRui Miao Changhoon Kim Minlan Yu
1
FlowRadar:Captureallflowsonafinetimescale
2
Flowcoverage
• Weneedtoinspecteachindividualflow– Definedby5tuples:source,dest IPs,ports,protocol
3
Transientloops/blackholes Fine-grainedtrafficanalysis
0%
10%
20%
30%
40%
50%
60%
1 10 100 1000 10000
Distrib
ution
Size (Byte)
Temporalcoverage
• Weneedflowinformationonafinetimescale
4
0
1
2
3
4
5
6
0 50 100
#Loss
Time (ms)
ShorttimescaleLossrates Timelyattackdetection
DoS
Key insight: division of labor
• Goal:Monitoralltheflowsonafinetimescale
5
Overhead atthecollector
Overhead at theswitches
NetFlow
Mirroring
Limitedper-packetprocessingtimeLimitedmemory(10sofMB)
Collectorhaslimitedbandwidthandstorage
Needssampling
Key insight: division of labor
• Goal:Monitoralltheflowsonafinetimescale
6
FlowRadar
Usefixedoperationsperpacketatswitches
Overhead atthecollector
Overhead at theswitches
NetFlow
Mirroring
Smalldatastructures
FlowRadar architecture
7
Extractper-flowcountersacrossswitchesCollector
Switches
AnalysisApplications
Flows&counters
Compactflowcounterswithfixedper-packetoperations
Periodicreport
Challenge:Howtohandlecollisions?
• Handling hashcollisionsishard• Large hash tablesà highmemoryusage• Linked list/Cuckoohashingàmultiple,non-constant#memoryaccesses
8
Flow a b cPacketCount … … …
Flowd
d1
collision
Solution:Embracescollisions
• Handling hashcollisionsishard• Large hash tablesà highmemoryusage• Linked list/Cuckoohashingàmultiple,non-constant#memoryaccesses
• Embracingthecollisions• Lessmemoryandconstant#memoryaccesses
9
FlowXorFlowCountPacketCount
Solution:Embracescollisions
• Embracingthecollisions:xor upalltheflows• Lessmemoryandconstant#memoryaccesses
10
Countingtable
flowa:S(a) Flowb:S(b) Flowc:S(c)a a b b c c
[InvertibleBloomLookupTable(arXiv 2011)]
S(x):#packetsinx
H1 H2 H3
⊕ba a1S(a)
1S(a)
a1S(a)
b1
S(b)+S(b) +S(c)
⊕c b⊕c1
S(b)+S(c)
c1S(c)
2 2 2
FlowFiltertoidentifynewflows
• 1.Checkandupdatetheflowfilter• 2.Updatecountingtable– Thefirstpacketfromanewflow,updateallfields– SubsequentpacketsupdateonlyPacketCount
11
bloomfilter:identifynewflow
FlowXor a a⊕b b⊕c b⊕c a cFlowCount 1 2 2 2 1 1PacketCount S(a) S(a)+S(b) S(b)+S(c) S(b)+S(c) S(a) S(c)
+1+1 +1
d
⊕d⊕d ⊕d+1+1 +1
+1+1 +1
d
Flowfilter
[InvertibleBloomLookupTable(arXiv 2011)]
EncodedFlowset
Countingtable
H1 H2 H3
Easytoimplementinmerchantsilicon
• Switchdataplane– Fixedoperations per packet– Smallmemory: 2.36MBfor100Kflows
• Switchcontrolplane– Collectsthesmallencodedflowset every 10ms
• WeimplementeditusingP4Language.– Portabletomanyp4-compatibleswitchchips
12
Flows&counters
Extractper-flowcountersacrossswitches
AnalysisApplications
FlowRadar architecture
13
Compactflowcounterswithfixedper-packetoperations
Collector
Switches
Periodicreport
Encodedflowset
Encodedflowset
Encodedflowset
Stage1.SingleDecode Stage2.Network-wideDecode
Stage1.SingleDecode
14
FlowXor …FlowCount …PacketCount …
Bloomfilter
Countingtable
FlowfilterFlow #packeta S(a)… …
Input:anencodedflowset fromasingleswitch
Output:per-flowcounters
Flowfilter
Flow #packeta 5
FlowXor a a⊕b b⊕c⊕d b⊕c⊕d a c⊕dFlowCount 1 2 3 3 1 2PacketCount 5 12 13 13 5 6
Stage1.SingleDecode
• Findapurecell:acellwithoneflow• Removetheflowfromallcells
15
Purecell
Decoded:
-1 -1 -1-5 -5 -5
H1 H2 H3
FlowXor 0 b b⊕c⊕d b⊕c⊕d 0 c⊕dFlowCount 0 1 3 3 0 2PacketCount 0 7 13 13 0 6
Flowfilter
Stage1.SingleDecode
• Findapurecell:acellwithoneflow• Removetheflowfromallcells– Createmorepurecells
• Iterateuntilnopurecells
16
Flow #packeta 5Decoded:
H1 H2 H3
Stage1.SingleDecode
17
FlowXor 0 0 c⊕d c⊕d 0 c⊕dFlowCount 0 0 2 2 0 2PacketCount 0 0 6 6 0 6
Flowfilter
Flow #packeta 5b 7
Decoded:
H1 H2 H3
Flows&counters
FlowRadar architecture
18
Collector
Switches
Stage1.SingleDecode Stage2.Network-wideDecode
Periodicreport
Encodedflowset
Encodedflowset
Encodedflowset
AnalysisApplications
Keyinsight:overlappingsetsofflows
• Thesetsofflowsoverlapacrosshops–Wecanusetheredundancytodecodemoreflows
• Usedifferenthashfunctionsacrosshops
19
abcd
abcd
a a⊕c⊕d
b⊕c⊕d
a⊕b⊕c b⊕d
1 3 3 3 2
a⊕d a⊕c b⊕c⊕d
a⊕b⊕c b⊕d
2 2 3 3 2
FlowXor
FlowCountPktCount
FlowXor
FlowCountPktCount
Use5cellstodecode4flows10
Collectorcanleverageflowsets fromallswitchestodecodemoreflows
Keyinsight:overlappingsetsofflows
• Thesetsofflowsoverlapacrosshops–Wecanusetheredundancytodecodemoreflows
• Usedifferenthashfunctionsacrosshops• Provisionswitchmemorybasedonavg(#flows),notmax(#flows)– SingleDecode fornormalcases– Network-widedecodeforburstsofflows
20
Challenge1:setsofflowsnotfullyoverlapped
• Flowsfromoneswitchmaygotodifferentnexthops• Oneswitchreceivesflows frommultiplehops
21
abcd
a
e
cd
Solution:ZigzagdecodewithFlowFilters
• Generalizetotheentirenetwork– Noneedforroutinginformation– Supportincrementaldeployment
22
Flowfilter Flowfilter
a
a b c d
Decoded:
a ec d
ab cc d d eSingleDecode SingleDecodeRemovefromtheother
FlowDecode
Challenge2:differentcountervalues
• Thecounters ofthesameflowmaybedifferentacrosshops– Somepacketsmaygetlost– Somepacketsmaybeonthefly
23drop
Solution:calculatecountersforeachswitch
• Solvealinearequationsystemforeachswitch– Alreadyknowthefullsetofflowsateachswitch– Solvablewithnflowsand>=ncombinationsofcounters
24
a b c d
a⊕b⊕d …
3 …14 …
FlowXor
FlowCountPktCount
Flow #pkta 5b 7c 4d 2
!𝑆(𝑎)+𝑆(𝑏)+𝑆(𝑑)=14…
Solving the linearequationsystem
• Challenge:solvingthelinearequationsystemis notfast• Thematrixisverysparse,eachcolumnhask“1”s.• We use iterative solvers• Speeduptheiteration:– Starttheiterationfromcloseapproximationsofcounters,whichisgotfromtheFlowDecode
– Stoptheiterationwhentheresultisfloatingwithin±0.5aninteger.
25
Flows&counters
FlowRadar architecture
26
Collector
Switches
Stage1.SingleDecode Stage2.Network-wideDecode
Periodicreport
Encodedflowset
Encodedflowset
Encodedflowset
Stage2.1FlowDecode
Stage2.2CounterDecode
FlowXor …FlowCount …PacketCount …
Flow filter
Flow filter
Flow #pktabc
CounterDecode
Flow #pkta 5b 7c 4
FlowDecode
CounterDecode
FlowXor …FlowCount …PacketCount …
Flow #pktxyz
Flow #pktx 3y 5z 6
AnalysisApplications
Evaluations
27
AnalysisApplicationsFlows&countersCollector
Switches
Stage1.SingleDecode
Stage2.1FlowDecode
Stage2.2CounterDecode
Periodicreport
Encodedflowset
Encodedflowset
Encodedflowset
• Memoryusageisefficient• Bandwidthusageislow• NetDecode vsSingleDecode
Evaluation
• Simulationofk=8FatTree (80switches,128hosts)inns3• Configurethememorybaseonavg(#flow),– whenburstofflowshappens,usenetwork-widedecode
• Theworstcaseisallswitchesarepushedtomax(#flow)– Traffic:eachswitchhassamenumberofflows,andthussamememory
• Eachswitchreportstheflowsets every10ms.
28
�
�
��
��
��
��
��
� �� ��� ����
�����������������������
� ����� ��� ������ ���
��������� �� ���������
Memoryefficiency
29
�
�
��
��
��
��
��
� �� ��� ����
�����������������������
� ����� ��� ������ ���
��������� �� ���������������� ���� �� ��
#cell=#flow(Impractical) FlowRadar:2.36MB
FlowRadar:24.8MB
Bandwidthusage
• EvaluatedbasedonFacebookdatacenters(SIGCOMM’ 15)– EachToR haslessthan100Kflowsper10msà lessthan2.3Gbps– EachToR talksto44*10G-hosts– FlowRadar consumes0.52%oftotalToR bandwidth
30
NetDecode vs. SingleDecode
31
���
�
��
���
����
�����
�� �� �� �� ��� ��� ���
����������������������������
� ����� ��� ������ ���
SingleDecode NetDecode
SingleDecode:100Kflow,10ms
NetDecodeneedsmoretime
NetDecode:126.8Kflow,3sec
NetDecode
• TheCounterDecode takesthemajorityofthedecodingtime• TheCounterDecode limitsthe#counterscouldbedecoded– If#flows>126.8K,wecanstilldecodeallflows(upto152K)withoutcountersusingFlowDecode
32
AnalysisApplicationsFlows&counters
FlowRadar analyzer
33
Collector
Switches
Stage1.SingleDecode Stage2.Network-wideDecode
Periodicreport
Encodedflowset
Encodedflowset
Encodedflowset
Analysisapplications
• Flowcoverage– Transientloops/blackholes– Errorinmatch-actiontable– Fine-grainedtrafficanalysis
• Temporalcoverage– Shorttimescaleper-flowlossrate– ECMPloadimbalance– Timelyattackdetection
34
Per-flowlossmap:bettertemporalcoverage
• Detectlossesaftereveryflowlet
35
012345
012345
1 2 3 4 5 6 7 8 9 10 11
Switch1
Switch2
15packets
14packets
35packets
34packets
NetFlowdetectsloss
FlowRadardetectsloss
FlowRadar conclusion
• Reportallper-flowcountersonafinetimescale• Fullyleveragebothswitchesandthecollector– Switches:Encodedflowsets withfixedper-packetprocessingtime,andlowmemoryusage
– Collector:Network-widedecoding
36
FlowRadar
Overhead atthecollector
Overhead at switches
NetFlow
Mirroring
LossRadar:Detectindividuallossesonafinetimescale
37
Losseshavesignificantimpact
• Significantlatencyincreaseandthroughputdrop– ViolatingSLAanddroprevenue
• Takesoperatorsuptotensofhourstofindandfixtheproblem
38
Takeaway:WewanttoknowlossASAP
Challenge:manytypesoflosses
39
OutputPort 1OutputPortn
InputPort1InputPortn
Input buffer
… parser Ingressmatch-actions(L2->L3->ACL)
Sharedbuffer
Egressmatch-actions
…
SwitchCPU
Configure
corruptions
misconfigurations updates
ResourceshortageResourceshortage corruptions
Greyfailure
Takeaway:Weneedagenerictool
Challenge: losses could happen everywhere
40
TCP:I have losses
But I don’t knowwhere
ECMP:multiplepossible paths
Takeaway:Weneedthelocationsofthelosses
Challenge:lackofdetailsoflosses
• Difficulttoinfertherootcauses
41
Flow1
Flow2 Pass
Drop
0
1
2
3
4
5
6
0 50 100
#Loss
Time (ms)
Differentflowsexperiencedifferentlosspatterns
Losspatternschangeovertime
Takeaway:Needtheinformationofindividuallosses
LossRadar overview
• Fastdetectionà Periodicallysendtrafficdigesttocollector• Knowinglocationà Install meters to monitor traffic• Needinfoofindividuallossesà Includedetailsofeachloss• Beinggenericà Comparethesetsofpackets
42
DigestCollector
TrafficDigest
TrafficDigest
TrafficDigest
Howtocoverthewholenetwork
• Need to coverallpipelines
43
UM
DMUM
DM
DMUMIngresspipeline
sharedbuffer
Egresspipeline
Cover ingress pipeline Cover egress pipeline
UM:UpstreammeterDM:Downstreammeter
S2
S1S3
LossRadar overview
• Fastdetectionà Periodicallysendtrafficdigesttocollector• Knowinglocationà Install meters to monitor traffic• Needinfoofindividuallossesà Includedetailsofeachloss• Beinggenericà Comparethesetsofpackets
44
DigestCollector
TrafficDigest
TrafficDigest
TrafficDigest
Howtocomparethesetsofpackets
• UsingInvertibleBloomFilter(IBF)[Sigcomm’11]• OnlyO(#loss)memory– Eachendkeepsasmallpieceofmemory– Same packetsat both ends will cancel out– Only the differences remain
45
upstream
downstream
Collector
How to achieve low memory usage
46
xorSumcount
00
00
00
00
00
xorSumcount
00
00
00
00
00
upstream
downstream
Collector
How to achieve low memory usage
47
a a axorSumcount
a1
00
a1
a1
00
xorSumcount
00
00
00
00
00
upstream
downstream
Collector
a
drop
a=5-tuple+IPID
How to achieve low memory usage
48
bb
bxorSumcount
a1
b1
a⊕b2
a1
b1
xorSumcount
00
b1
b1
00
b1
a a a upstream
downstream
Collector
b
bb b b
How to achieve low memory usage
49
c c cupstream
downstream
Collector
xorSumcount
a⊕c2
b⊕c2
a⊕b2
a⊕c2
b1
xorSumcount
00
b1
b1
00
b1
bb
ba a a
b b b
c
drop
How to achieve low memory usage
50
d dd
upstream
downstream
Collector
xorSumcount
a⊕c⊕d3
b⊕c2
a⊕b⊕d3
a⊕c2
b⊕d2
xorSumcount
d1
b1
b⊕d2
00
b⊕d2
c c cb
bba a a
b b bdd d
d
d
xorSumcount
a⊕c2
c1
a1
a⊕c2
00
ca a ac
c
Collectordoessubtraction
Benefits
• Memory-efficient– Proportionalto#losses– 10KBpermeter(trafficpatternreportedinDCTCP)
• Extend to collect more packet information– TTL:helpidentifyloops– Timestamp:taggedinthepacketheadersattheupstream– Anyotherfieldsthatprogrammableswitchescanconfigure
51
Challenges
• HowtoensureUMandDMcapturethesamesetofpackets– UsethepacketheadertocarrybatchID
• Incrementaldeployment– Put UMandDMaroundtheblackbox– Comparesum(UMi)andsum(DMj)
52
LossRadar conclusion
• Designafastandefficientlossdetectionsystem– Providelocationsanddetailsofindividuallosses– Generictoanytypesoflosses– Memoryonlyproportionalto#losses
• Futurework:designananalyzertodiagnoseproblems– Temporalanalysis,correlationacrossflowsandswitches– Combinewithinfoprovidedbyhosts
53
Conclusion
• FullvisibilityofDataCenters– Reportalltheflowsandallthelostpacketsinafewmillisecondsacrossalltheswitches
• Efficientdatastructuresonprogrammableswitches– BothFlowRadar andLossRadar areimplementedonP4– TechnologytransfertoBarefootswitches(Seemydemotomorrow)
54