Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt...
Transcript of Scaling BIRD Routeservers - RIPE 73 · Scaling BIRD Routeservers Benedikt Rudolph, Sebastian Abt...
Where networks meetwww.de-cix.net
RIPE 73, Madrid
Scaling BIRD Routeservers
Benedikt Rudolph, Sebastian Abt
DE-CIX Research and Development
» Dynamic IP routing daemon (BGP, RIP, OSPF, Babel, BFD)
» Written in C
» OpenSource Software (GNU GPLv2)
» Widely deployed for Route Servers at IXPs» Mid-size to small ones» Few large ones with〜1000 BGP neighbors
» Flexible configuration file syntax
» Reliable and stable operation
The BIRD Internet Routing Daemon
2
Help:• Scaleout tounusedCPUs• ReduceloadonBIRDprocesses
– Fewerpeers/prefixesperPID• Servemorepeers
Avoid:• “spiralof death” syndrome
– Re-convergence afterhighchurn• performancedegradation on
protocolreload/maintainance
Easytotest,deploy andmaintain• Noclient-sideconfigurationchanges• NoalterationstotheSourceCode
Motivation and Goals
3
» Make multiple BIRD processes behave like one
» Share a common RIB (master table)
» Calculate the same BGP best-paths
» Appear as the same BGP neighbor
» Share an IP address
» Balance incoming BGP connections
» Share load with 1, 2, 3,…, n processes
Problem Statement
4
» Private subnet on loopback interface
» Link master tables (full-mesh, eBGP Add-Path)
» iBGP fails at best-path selection
» eBGP exhibits path-hiding
» “Balance” BGP peers based on IP subnet
» BGPv4 incompatible with TCP proxies
» DestinationNAT for incomming connections
Solution Building Blocks
DNAT
PublicIPv4:203.0.113.1
5
» Peering LAN split in n subnets
» DNAT: subnet to BIRD pocess
Overhead:
» Master table: n copies
» New best-path: n-1 updates
Details: Load Balancing
Src Dst
10.0.0.0/18 192.168.0.1
10.0.64.0/18 192.168.0.2
10.0.128.0/18 192.168.0.3
10.0.192.0/18 192.168.0.4
DNAT
10.0.0.40 10.0.64.40 10.0.128.40 10.0.192.40
...
6
Initialization• parameters• config files
Setup• Monitor• Tester• Target(BIRD)
Execution• LogTime• LogCPU• LogMemory
Test Execution with bgperf
7
./bgperf.py ExaBGP0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180
20
40
60
80
100
0
0.5
1
2
4
8
0
10
20
30
40
50
60
70
80
time [s]
CPU
utilization[%]
·230
MemoryUsagen[GiB]
·103
max. prefixes = 76500
#prefixes
Repeat3times,for 1,2,4processes at252,507,756,1022,1278,1518peers
0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180
20
40
60
80
100
0
0.5
1
2
4
8
0
10
20
30
40
50
60
70
80
time [s]
CPU
utilization[%]
·230
MemoryUsagen[GiB]
·103
max. prefixes = 76500
#prefixes
Benchmark: Use of Multiple Processes» 1 BIRD Process, Time to learn 76500 prefixes = 165 s
8
0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180
20
40
60
80
100
120
200
220
0
0.5
1
2
4
8
0
10
20
30
40
50
60
70
80
time [s]
CPU
utilization[%]
·230
MemoryUsagen[GiB]
·103
max. prefixes = 76500
#prefixes
Benchmark: Use of Multiple Processes» 2 BIRD Processes, Time to learn 76500 prefixes = 150 s
9
0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180
20
40
60
80
100
120
200
220
300
320
400
420
0
0.51
2
4
8
16
0
10
20
30
40
50
60
70
80
time [s]
CPU
utilization[%]
·230
MemoryUsagen[GiB]
·103
max. prefixes = 76500
#prefixes
Benchmark: Use of Multiple Processes» 4 BIRD Processes, Time to learn 76500 prefixes = 108 s
10
» Large variation for small settings ⇒ excluded
» Super linear scaling
Results: Execution Time, 1 Process
11
2964 125 252 507 765 1019 1276 1532
·102
0
100
200
300
400
500
600
700
800
900
1000
1100
prefixes
Avg.Timeuntilallprefixeslearned(s)
measured data
polynomial-fit
ax
2+ bx+ c; R
2= 0.9922
» Low variation
» No results for 1532 peers⇒ RAM limit
Results: Execution Time, 2 Processes
12
2964 125 252 507 765 1019 1276 1532
·102
0
50
100
150
200
250
300
350
400
prefixes
Avg.Timeuntilallprefixeslearned(s)
measured data
polynomial-fit
ax
2+ bx+ c; R
2= 0.9518
» Anomaly for small settings
» No results for 1532, 1276 peers⇒ RAM limit
Results: Execution Time, 4 Processes
2964 125 252 507 765 1019 1276 1532
·102
0
100
200
300
400
500
600
prefixes
Avg.Timeuntilallprefixeslearned(s)
measured data
polynomial-fit
ax
2+ bx+ c; R
2= 0.9531
13
Comparison of Execution Times» Anomalies for small peer count
» 2 BIRD processes: low overhead
» 4 BIRD processes: faster in some, but not all settings
2900 6100 12500 25200 50700 76500 10190 153200
0
100
200
300
400
500
600
#prefixes
Avg.Timeuntilallprefixeslearned(s)
1P
2P
4P
Up to 60%lower
14
» Multi-process setup improves response to high load
» Increased ressource usage:» Moderate increase in Memory (2 processes)» Memory usage doubles (4 processes)
» Speedup is partly egalised by overhead» Replication of Master Table» Internal Communication: TCP over loopback interface (use Sockets)
» Possible improvements» Simple multi-threading in BIRD» Use of “realistic” peers» Investigate cause for “waiting“ BIRD processes (System CPU?)» Use a separate (physical) host for load generation (Tester)» Check overhead with 8 BIRD processes
Summary and Outlook
15
Where networks meetwww.de-cix.net
Questions & AnswersTalk to us!
Benedikt Rudolph, Sebastian Abt
DE-CIX Research and Development