Taming Latency in Software Defined...
Transcript of Taming Latency in Software Defined...
Understanding Latency in Software Defined Networks
Keqiang He, Sourav Das, Aditya Akella
1
Junaid Khalid
Li Erran Li, Marina Thottan
Latency in SDN
Centralized Controller
Mobility
100 flows
Openflow msgs
2
Some XXX msecs?? Can be as large as 10 secs!!!
Time taken to install 100 rules ?
Do applications care about latency?
3
Intra-DC Traffic Engineering
• MicroTE routes predictable traffic on time scales of 1-2s
• Route updates can take as long as 0.5s
Latency can undermine MicroTE’s effectiveness
Fast Failover
• Reroute the affected flows quickly in face of failures
• Longer update time increases congestion and drops
Latency can inflate failover time by nearly 20s!
Latency is critical to many applications
Limits the applicability of SDN applications
Applications assume latency is low and constant
Factors contributing to Latency?
4
Speed of Control Programs and Network Latency
Latency in network switches
Robust Control Software Design and Distributed Controllers
Not received much attention
Our Work
5
Latency in network switches
Two contributions:
• Systematic experiments to explore latencies in production switches
• Design a framework to overcome the impact of latencies
Outline
• Motivation
• Elements of Latency
• Measurement Methodology
• Inbound Latency
• Outbound Latency • Insertion
• Modification
• Deletion
6
Outline
• Motivation
• Elements of Latency
• Measurement Methodology
• Inbound Latency
• Outbound Latency • Insertion
• Modification
• Deletion
7
Inbound Latency
Elements of Latency
PHY PHY
Switch
Controller
I1: Send to ASIC SDK I2: Send to OF Agent I3: Send to Controller
Packet
8
CPU Memory DMA
ASIC SDK
OF Agent
I2 CPU board
Forwarding Engine Lookup Hardware
Tables
No Match
I1
PC
I
Switch Fabric
I3
1. Inbound Latency
Outbound Latency
Elements of Latency
PHY PHY
Switch
Controller
O1: Parse OF Msg O2: Software schedules the rule O3: Reordering of rules in table O4: Rule is updated in table
9
Memory CPU
ASIC SDK
OF Agent
O2 DMA
O1
Forwarding Engine Lookup Hardware
Tables
PC
I O4
O3
Switch Fabric
CPU board
1. Inbound Latency 2. Outbound Latency
Outline
• Motivation
• Elements of Latency
• Measurement Methodology
• Inbound Latency
• Outbound Latency • Insertion
• Modification
• Deletion
10
Latency Measurements - Setup
eth0
eth1
eth2
Control Channel
Flows IN
Flows OUT
Openflow Switch
POX Controller
Pktgen
Libpcap
11
Switch CPU RAM Flow table size
Data Plane
Vendor A 1 Ghz 1 GB 896 14*10Gbps + 4*40Gbps
Vendor B 2 Ghz 2 GB 4096 40*10Gbps +4*40Gbps
Outline
• Motivation
• Elements of Latency
• Measurement Methodology
• Inbound Latency
• Outbound Latency • Insertion
• Modification
• Deletion
12
Inbound Latency
• Increases with flow arrival rate
• CPU Usage is higher for higher flow arrival rates
13
Flow Arrival Rate (packets/sec)
Mean Delay per packet_in (msec)
CPU Usage ( % )
0 - 7.1
100 3.32 15.7
200 8.33 26.5
vendor A switch
Low Power CPU
Inbound Latency
vendor A switch 200 flows/sec
• Increases with interference from outbound msgs
14
0
50
100
150
200
250
300
0 200 400 600 800 1000
inb
ou
nd
del
ay (
ms)
flow #
(a) with flow_mod/pkt_out
0
50
100
150
200
250
300
0 200 400 600 800 1000
inb
ou
nd
del
ay (
ms)
flow #
(b) w/o flow_mod/pkt_out
Low Power CPU
Software Inefficiency
Outline
• Motivation
• Elements of Latency
• Measurement Methodology
• Inbound Latency
• Outbound Latency • Insertion
• Modification
• Deletion
15
Outbound Latency
• Latency for three different flow_mod operations Insertion
Modification
Deletion
• Impact of key factors on these latencies Table occupancy
Rule priority structure
16
Outline
• Motivation
• Elements of Latency
• Measurement Methodology
• Inbound Latency
• Outbound Latency • Insertion
• Modification
• Deletion
17
Insertion Latency – Priority Effects
Vendor B switch
18
0
5
10
15
20
25
30
0 20 40 60 80 100
inse
rtio
n d
elay
(m
s)
rule #
(a) Burst size 100, same priority
0
5
10
15
20
25
30
0 20 40 60 80 100
inse
rtio
n d
elay
(m
s)
rule #
(b) Burst size 100, increasing priority
Insertion Latency – Priority Effects
Vendor B switch
19
0
5
10
15
20
25
30
0 20 40 60 80 100
inse
rtio
n d
elay
(m
s)
rule #
(a) Burst size 100, same priority
0
5
10
15
20
25
30
0 20 40 60 80 100
inse
rtio
n d
elay
(m
s)
rule #
(b) Burst size 100, increasing priority
100
100
0x0000
TCAM
Insertion Latency – Priority Effects
Vendor B switch
20
0
5
10
15
20
25
30
0 20 40 60 80 100
inse
rtio
n d
elay
(m
s)
rule #
(a) Burst size 100, same priority
0
5
10
15
20
25
30
0 20 40 60 80 100
inse
rtio
n d
elay
(m
s)
rule #
(b) Burst size 100, increasing priority
100
100
99
0x0000
TCAM
Insertion Latency – Priority Effects
Vendor B switch
21
0
5
10
15
20
25
30
0 20 40 60 80 100
inse
rtio
n d
elay
(m
s)
rule #
(a) Burst size 100, same priority
0
5
10
15
20
25
30
0 20 40 60 80 100
inse
rtio
n d
elay
(m
s)
rule #
(b) Burst size 100, increasing priority
100
100 99
101
0x0000
TCAM
Insertion Latency – Priority Effects
Vendor B switch
22
0
5
10
15
20
25
30
0 20 40 60 80 100
inse
rtio
n d
elay
(m
s)
rule #
(a) Burst size 100, same priority
0
5
10
15
20
25
30
0 20 40 60 80 100
inse
rtio
n d
elay
(m
s)
rule #
(b) Burst size 100, increasing priority
100
100 99
101
0x0000
TCAM
Insertion Latency – Priority Effects
Vendor A switch
23
0
2
4
6
8
10
0 100 200 300 400 500 600 700 800
inse
rtio
n d
elay
(m
s)
rule # (a) Burst size 800, same priority
0
2
4
6
8
10
0 100 200 300 400 500 600 700 800
inse
rtio
n d
elay
(m
s)
rule # (b) Burst size 800, decreasing priority
Insertion Latency – Priority Effects
Vendor A switch
24
0
2
4
6
8
10
0 100 200 300 400 500 600 700 800
inse
rtio
n d
elay
(m
s)
rule # (a) Burst size 800, same priority
0
2
4
6
8
10
0 100 200 300 400 500 600 700 800
inse
rtio
n d
elay
(m
s)
rule # (b) Burst size 800, decreasing priority
0x0000
TCAM
Insertion Latency – Priority Effects
Vendor A switch
25
0
2
4
6
8
10
0 100 200 300 400 500 600 700 800
inse
rtio
n d
elay
(m
s)
rule # (a) Burst size 800, same priority
0
2
4
6
8
10
0 100 200 300 400 500 600 700 800
inse
rtio
n d
elay
(m
s)
rule # (b) Burst size 800, decreasing priority
0x0000
TCAM
100
101
102
103
104
Insertion Latency – Priority Effects
Vendor A switch
26
0
2
4
6
8
10
0 100 200 300 400 500 600 700 800
inse
rtio
n d
elay
(m
s)
rule # (a) Burst size 800, same priority
0
2
4
6
8
10
0 100 200 300 400 500 600 700 800
inse
rtio
n d
elay
(m
s)
rule # (b) Burst size 800, decreasing priority
0x0000
TCAM
100
101
102
103
104
99
Insertion Latency – Priority Effects
Vendor A switch
27
0
2
4
6
8
10
0 100 200 300 400 500 600 700 800
inse
rtio
n d
elay
(m
s)
rule # (a) Burst size 800, same priority
0
2
4
6
8
10
0 100 200 300 400 500 600 700 800
inse
rtio
n d
elay
(m
s)
rule # (b) Burst size 800, decreasing priority
0x0000
TCAM
100
101
102
103
104
99
Insertion Latency – Priority Effects
Vendor A switch
28
0
2
4
6
8
10
0 100 200 300 400 500 600 700 800
inse
rtio
n d
elay
(m
s)
rule # (a) Burst size 800, same priority
0
2
4
6
8
10
0 100 200 300 400 500 600 700 800
inse
rtio
n d
elay
(m
s)
rule # (b) Burst size 800, decreasing priority
0x0000
TCAM
100
101
102
103
104
99
0
2000
4000
6000
8000
10000
12000
14000
16000
18000
20000
0 100 200 300 400 500 600
avg
com
ple
tio
n t
ime
(ms)
burst size
(b) high priority rules into a table with low priority existing rules
table 100 table 400
0
2000
4000
6000
8000
10000
12000
14000
16000
18000
20000
0 100 200 300 400 500 600
avg
com
ple
tio
n t
ime
(ms)
burst size
(a) low priority rules into a table with high priority existing rules
table 100 table 400
Insertion Latency – Table occupancy Effects
• Affected by the types of rules in the table
Vendor B switch 29
Rule Priority Structure and Table Occupancy
TCAM Organization and # Of Slices
Switch Software Overhead
Outline
• Motivation
• Elements of Latency
• Measurement Methodology
• Inbound Latency
• Outbound Latency • Insertion
• Modification
• Deletion
30
Modification Latency
• Higher than Insertion latency for vendor B
• Not affected by rule priority but affected by table occupancy
Vendor B switch 31
0
20
40
60
80
100
120
140
160
0 20 40 60 80 100
mo
dif
icat
ion
del
y (m
s)
rule #
(a) 100 rules in the table
0
20
40
60
80
100
120
140
160
0 50 100 150 200
mo
dif
icat
ion
del
ay (
ms)
rule #
(b) 200 rules in table
Poorly optimized software in vendor B
Outline
• Motivation
• Elements of Latency
• Measurement Methodology
• Inbound Latency
• Outbound Latency • Insertion
• Modification
• Deletion
32
Deletion Latency
• Higher than Insertion latency for both vendor A and B
• Not affected by rule priority but affected by table occupancy
Vendor B switch
33
0
20
40
60
80
100
120
0 50 100 150 200
del
etio
n d
elay
(m
s)
rule #
(b) 200 rules in table
0
20
40
60
80
100
120
0 20 40 60 80 100
del
etio
n d
elay
(m
s)
rule #
(a) 100 rules in table
Deletion Latency
• Higher than Insertion latency for both vendor A and B
• Not affected by rule priority but affected by table occupancy
34
Vendor A switch
0
2
4
6
8
10
12
14
0 20 40 60 80 100
del
etio
n d
elay
(m
s)
rule #
(a) 100 rules in table
0
2
4
6
8
10
12
14
0 50 100 150 200
del
etio
n d
elay
(m
s)
rule #
(b) 200 rules in table
Deletion is causing TCAM Reorganization
Summary
• Latency in SDN is critical to many applications
• Assumption: Latency is small or constant
• Latency is high and variable
• Varies with Platforms, Type of operations, Rule priorities, Table occupancy, Concurrent operations
• Key Factors: TCAM Organization, Switch CPU and inefficient Software Implementation
35
Summary
• Latency in SDN is critical to many applications
• Assumption: Latency is small or constant
• Latency is high and variable
• Varies with Platforms, Type of operations, Rule priorities, Table occupancy, Concurrent operations
• Key Factors: TCAM Organization, Switch CPU and inefficient Software Implementation
36
Backup Slides
37
Impact of Concurrent CPU jobs
38
• Impacts insertion delay. E.g. Polling Statistics
• Polling stats impacts more when table occupancy is higher
0
200
400
600
800
1000
1200
0 1 2 3 4 5
flow stats polling count
table occupancy 0 table occupancy 500
Low Power CPU delays concurrent jobs
Vendor B