INF5050 – Protocols and Routing in Internet (Friday 9.2.2018)
Presented by Tor Skeie
Subject: IP-router architecture
Nick McKeown 2
High Performance Switching and Routing, Telecom Center Workshop: Sept 4, 1997.
This presentation is based on slides from Nick McKeown, with updates.
Nick McKeown, Professor of Electrical Engineering and Computer Science, Stanford University
www.stanford.edu/~nickm
Stanford High Performance Networking group: http://klamath.stanford.edu
Nick McKeown 3
Outline
Background What is a router?
Why do we need faster routers?
Why are they hard to build?
Architectures and techniques The evolution of router architecture.
IP address lookup.
Packet buffering.
Switching.
The Future
Nick McKeown 4
What is Routing?
[Figure: an example network of routers R1-R5 connecting hosts A-F, with a routing table:]

Destination | Next Hop
D           | R3
E           | R3
F           | R5
Nick McKeown 5
What is Routing?
[Figure: the same network and routing table as above, alongside the 20-byte IPv4 header that carries the destination address. Fields per 32-bit word:]

Ver | HLen | T.Service | Total Packet Length
Fragment ID | Flags | Fragment Offset
TTL | Protocol | Header Checksum
Source Address
Destination Address
Options (if any)
Data
Nick McKeown 6
Points of Presence (POPs)
[Figure: an example backbone of eight points of presence (POP1-POP8) interconnecting customers A-F.]
Nick McKeown 7
Where High Performance Routers are Used

[Figure: routers R1-R16 in a backbone; core links run at 400 Gb/s, edge links at 2.5 Gb/s.]
Nick McKeown 8
What a Router Looks Like
Cisco CRS-X (16-slot single-shelf chassis pictured):
Dimensions: 2.14m x 0.60m x 0.91m
Capacity: 12.8 Tb/s; Power: 11.2 kW; Weight: 723 kg

Juniper M320 (M160 pictured):
Dimensions: 0.88m x 0.65m x 0.44m
Capacity: 160 Gb/s; Power: 3.5 kW

Capacity is the sum of the rates of the linecards.

Some multi-rack routers: Alcatel 7670 RSP, Juniper TX8/T640, Chiaro, Avici TSR.
Some Multi-rack Routers
Nick McKeown 11
Generic Router Architecture
[Figure: the datapath: header processing (IP address lookup against an address table of ~1M prefixes in off-chip DRAM, then header update), followed by queueing the packet into buffer memory (~1M packets in off-chip DRAM).]
Nick McKeown 12
Generic Router Architecture
[Figure: the same datapath replicated per linecard: each of N linecards has its own header processing (IP address lookup with its own address table, then header update) and its own buffer manager and buffer memory.]
Nick McKeown 13
Why do we Need Faster Routers?
1. To prevent routers from becoming the bottleneck in the Internet.
2. To increase POP capacity, and to reduce cost, size and power.
Nick McKeown 14
Why we Need Faster Routers 1: To prevent routers from being the bottleneck

[Figure: two log-scale plots, 1985-2000. Left: SPEC95Int CPU results (packet processing power), doubling every 18 months. Right: fiber capacity (Gbit/s) under TDM and DWDM, doubling every 7 months.]

Packet processing power: 2x / 18 months. Link speed: 2x / 7 months.
Source: SPEC95Int & David Miller, Stanford.

More recently, transmission speeds of a petabit per second have been demonstrated in labs (multicore fiber).
Nick McKeown 15
Why we Need Faster Routers 2: To reduce cost, power & complexity of POPs

[Figure: a POP built from many smaller routers vs. one built from a few large routers.]

Ports: price >$100k, power >400W. It is common for 50-60% of ports to be used for interconnection within the POP.
Nick McKeown 16
Why are Fast Routers Difficult to Make?
1. It’s hard to keep up with Moore’s Law:
The bottleneck is memory speed.
Memory speed is not keeping up with Moore’s Law.
Nick McKeown 17
1. It’s hard to keep up with Moore’s Law:
The bottleneck is memory speed.
Memory speed is not keeping up with Moore’s Law.
Why are Fast Routers Difficult to Make? Speed of Commercial DRAM

[Figure: log-scale plot of DRAM access time (ns), 1980-2011. Moore's Law improves 2x / 18 months, while DRAM access time improves only about 1.1x / 18 months.]

DDR4 (2018): speed 3200 Mb/s; latency ~13 ns; capacity 64 GB.
Nick McKeown 18
Why are Fast Routers Difficult to Make?
1. It’s hard to keep up with Moore’s Law:
The bottleneck is memory speed.
Memory speed is not keeping up with Moore’s Law.
2. Moore’s Law is too slow:
Routers need to improve faster than Moore’s Law.
Nick McKeown 19
Router Performance Exceeds Moore’s Law
Growth in capacity of commercial routers:
1992: ~2 Gb/s
1995: ~10 Gb/s
1998: ~40 Gb/s
2001: ~160 Gb/s
2003: ~640 Gb/s
2008: ~100 Tb/s
2013: ~922 Tb/s

Average growth rate: 2.2x / 18 months; over the last 5 years: 2.8x / 18 months.

2013: The Cisco CRS-X multishelf router has a capacity of 921.6 Tb/s (1152 ports).
Nick McKeown 20
Outline
Background What is a router?
Why do we need faster routers?
Why are they hard to build?
Architectures and techniques The evolution of router architecture.
IP address lookup.
Packet buffering.
Switching.
The Future
Nick McKeown 21
First Generation Routers

[Figure: a shared backplane connects a central CPU (with route table and buffer memory) to several line interfaces (MACs); every packet crosses the backplane and is processed by the central CPU.]

Typically <0.5 Gb/s aggregate capacity.
Nick McKeown 22
Second Generation Routers

[Figure: line cards gain their own buffer memory and a forwarding cache, so most packets are forwarded card-to-card across the shared backplane without visiting the central CPU and route table.]

Typically <5 Gb/s aggregate capacity.
Nick McKeown 23
Third Generation Routers

[Figure: a switched backplane replaces the shared bus; each line card holds a MAC, local buffer memory, and a complete forwarding table, while a CPU card maintains the routing table.]

Typically <50 Gb/s aggregate capacity.
Nick McKeown 24
Fourth Generation Routers/Switches: Optics inside a router for the first time

[Figure: linecards connected to a separate switch core over optical links, 100s of metres long.]

100s of Tb/s routers available/in development.
Nick McKeown 25
Outline
Background What is a router?
Why do we need faster routers?
Why are they hard to build?
Architectures and techniques The evolution of router architecture.
IP address lookup.
Packet buffering.
Switching.
The Future
Nick McKeown 26
Generic Router Architecture
[Figure: the per-linecard datapath again, now highlighting the IP address lookup stage (lookup against the address table) on each of the N linecards, ahead of the buffer manager and buffer memory.]
Nick McKeown 27
IP Address Lookup
Why it’s thought to be hard:
1. It’s not an exact match: it’s a longest-prefix match.
2. The table is large: about 700,000 entries today, and growing.
3. The lookup must be fast: about 2 ns for an 80 Gb/s line.
Nick McKeown 28
Longest Prefix Match is Harder than Exact Match
• The destination address of an arriving packet does not carry with it the information needed to determine the length of the longest matching prefix.
• Hence, one needs to search the space of all prefix lengths, as well as the space of all prefixes of a given length.
Nick McKeown 29
IP Lookups find Longest Prefixes

[Figure: the address space 0 to 2^32-1 holding the prefixes 65.0.0.0/8, 128.9.0.0/16 and 142.12.0.0/19, with 128.9.16.0/21, 128.9.172.0/21 and 128.9.176.0/24 nested inside 128.9.0.0/16. The address 128.9.16.14 falls inside both 128.9.0.0/16 and 128.9.16.0/21.]

Routing lookup: find the longest matching prefix (aka the most specific route) among all prefixes that match the destination address.
Nick McKeown 30
Address Tables are Large
Nick McKeown 31
Lookups Must be Fast

Year | Aggregate line-rate | Arriving rate of 40B packets (Mpkt/s)
1997 | 622 Mb/s            | 1.94
2001 | 10 Gb/s             | 31.25
2006 | 80 Gb/s             | 250
2010 | 140 Gb/s            | 437.5
2013 | 400 Gb/s            | 1250

1. The lookup mechanism must be simple and easy to implement.
2. Memory access time is the bottleneck:
250 Mpps × 2 lookups/pkt = 500 Mlookups/sec → 2 ns per lookup
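The 2 ns budget follows directly from the line rate. A quick check of the arithmetic, assuming (as the table does) worst-case 40-byte packets arriving back to back:

```python
def lookups_per_sec(line_rate_bps, pkt_bytes=40, lookups_per_pkt=2):
    """Worst-case lookup load: minimum-size packets back to back."""
    pkts_per_sec = line_rate_bps / (pkt_bytes * 8)
    return pkts_per_sec * lookups_per_pkt

rate = lookups_per_sec(80e9)   # 80 Gb/s -> 250 Mpps -> 500 Mlookups/s
budget_ns = 1e9 / rate         # -> 2.0 ns per lookup
```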
Nick McKeown 32
IP Address Lookup: Binary tries

Example prefixes:
a) 00001
b) 00010
c) 00011
d) 001
e) 0101
f) 011
g) 100
h) 1010
i) 1100
j) 11110000

[Figure: the binary trie built from these prefixes; each left edge is a 0 bit, each right edge a 1 bit, and nodes a-j mark where prefixes end.]
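A minimal binary trie over these example prefixes, sketched to show how longest-prefix match falls out of the bit-by-bit walk (the labels a-j follow the slide; the code itself is illustrative):

```python
class TrieNode:
    __slots__ = ("children", "label")
    def __init__(self):
        self.children = [None, None]   # child for bit 0, child for bit 1
        self.label = None              # set if a prefix ends at this node

def insert(root, prefix, label):
    node = root
    for bit in prefix:
        i = int(bit)
        if node.children[i] is None:
            node.children[i] = TrieNode()
        node = node.children[i]
    node.label = label

def lookup(root, addr_bits):
    """Walk the address bits, remembering the last (longest) prefix seen."""
    best, node = None, root
    for bit in addr_bits:
        node = node.children[int(bit)]
        if node is None:
            break
        if node.label is not None:
            best = node.label
    return best

root = TrieNode()
for label, prefix in [("a", "00001"), ("b", "00010"), ("c", "00011"),
                      ("d", "001"),   ("e", "0101"),  ("f", "011"),
                      ("g", "100"),   ("h", "1010"),  ("i", "1100"),
                      ("j", "11110000")]:
    insert(root, prefix, label)
```

A lookup of 00101111 walks 0-0-1, passes through d, and finds no longer match, so d is returned. The cost is one memory access per bit of the address, which motivates the multi-bit tries on the next slide.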
Nick McKeown 33
Multi-bit Tries

Binary trie: Depth = W, Degree = 2, Stride = 1 bit.
Multi-ary trie: Depth = W/k, Degree = 2^k, Stride = k bits.

Time ~ W/k; Storage ~ (N·W/k) · 2^(k-1),
where W = longest prefix length and N = number of prefixes.
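Plugging the deck's own figures into these estimates (W = 32 for IPv4, N = 700,000 prefixes) shows the trade-off of widening the stride:

```python
import math

def trie_costs(W=32, N=700_000, k=1):
    """The slide's estimates for a k-bit-stride trie:
    time ~ W/k memory accesses, storage ~ (N*W/k) * 2^(k-1) nodes."""
    time = math.ceil(W / k)
    storage = (N * W // k) * 2 ** (k - 1)
    return time, storage
```

Going from k=1 to k=4 cuts a lookup from 32 memory accesses to 8, but doubles the storage estimate (22.4M to 44.8M nodes).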
Nick McKeown 34
Prefix Length Distribution

[Figure: histogram of the number of prefixes at each prefix length 1-32; the mass is concentrated at /24 and shorter. 99.5% of prefixes are 24 bits or shorter. Source: Geoff Huston, Oct 2001.]
Nick McKeown 35
24-8 Direct Lookup Trie

[Figure: a first-level table directly indexed by the top 24 bits of the address (2^24 entries, covering 0000…0000 to 1111…1111); entries for longer prefixes point to second-level tables indexed by the remaining 8 bits (2^8 entries each).]

When pipelined, this allows one lookup per memory access. Inefficient use of memory, though.
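A toy sketch of the 24-8 scheme. A real implementation uses a flat 2^24-entry array for the first level (and replicates prefixes shorter than /24 across many entries); a dict stands in for that array here just to keep the example small:

```python
class DirectLookup:
    """Illustrative 24-8 lookup: level 1 keyed by the top 24 bits,
    level 2 a 256-entry table for the remaining 8 bits."""
    def __init__(self, default=None):
        self.l1 = {}            # stands in for the flat 2^24-entry array
        self.default = default
    def add(self, prefix24, hop, low8=None):
        if low8 is None:                     # prefix of length <= 24
            self.l1[prefix24] = hop
        else:                                # longer prefix: expand level 2
            entry = self.l1.get(prefix24)
            if not isinstance(entry, list):
                entry = [entry] * 256        # inherit any shorter match
                self.l1[prefix24] = entry
            entry[low8] = hop
    def lookup(self, addr):                  # addr: 32-bit integer
        entry = self.l1.get(addr >> 8, self.default)
        if isinstance(entry, list):
            hop = entry[addr & 0xFF]
            return hop if hop is not None else self.default
        return entry
```

One level-1 read resolves 99.5% of prefixes; the memory cost is the 2^24 entries allocated whether or not they are used, which is the inefficiency the slide notes.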
Nick McKeown 38
Outline
Background What is a router?
Why do we need faster routers?
Why are they hard to build?
Architectures and techniques The evolution of router architecture.
IP address lookup.
Packet buffering.
Switching.
The Future
Nick McKeown 40
Conceptual architecture

[Figure: line cards, each hosting one or more bidirectional ports, attached to one or more non-blocking switching cores under arbitration/control.]
Nick McKeown 41
Conceptual Packet Buffering: Input buffering

[Figure: the conceptual architecture, with the packet buffers placed at the input side of the non-blocking switching core.]
Nick McKeown 43
Arbitration

[Figure: N linecards, each with header processing and an input packet queue in buffer memory; an arbiter decides which inputs may send across the switch to which of the N outputs in each time slot.]
Nick McKeown 44
Head of Line Blocking
Nick McKeown 45
A Router with Input Queues: Head of Line Blocking

[Figure: average delay vs. offered load (0-100%). The FIFO input-queued router saturates at 2−√2 ≈ 58% load, well short of the best that any queueing system can achieve.]
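The ~58% figure can be reproduced with a small simulation of the saturated FIFO-input-queue model (uniform random destinations; the port count and slot count below are arbitrary choices for illustration):

```python
import random

def hol_throughput(n=16, slots=20_000, seed=1):
    """Saturated NxN switch, one FIFO per input: every input always has a
    packet, destinations uniform. Each slot, every contested output serves
    one head-of-line packet; blocked inputs keep their HoL packet and retry."""
    rng = random.Random(seed)
    heads = [rng.randrange(n) for _ in range(n)]  # HoL destination per input
    served = 0
    for _ in range(slots):
        winners = {}
        inputs = list(range(n))
        rng.shuffle(inputs)                       # random winner on conflict
        for i in inputs:
            winners.setdefault(heads[i], i)       # first claimant wins output
        for out, i in winners.items():
            served += 1
            heads[i] = rng.randrange(n)           # next packet in that FIFO
    return served / (n * slots)
```

For large n the result approaches the classic 2−√2 ≈ 0.586 limit; for n = 16 it comes out near 0.60.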
Nick McKeown 46
The Best Performance

[Figure: the same delay vs. load plot, showing only the ideal curve: the best that any queueing system can achieve, saturating only at 100% load.]
Nick McKeown 47
Conceptual Packet Buffering: Central buffer

[Figure: the conceptual architecture, with a single shared buffer placed centrally in the switching core.]
Nick McKeown 48
Fast Packet Buffers (http://yuba.stanford.edu/fastbuffers/)

Example: a 40 Gb/s packet buffer. Size = RTT × BW = 10 Gb; 40-byte packets.

[Figure: a buffer manager in front of buffer memory, with a write rate and a read rate of one packet every 8 ns each.]

Use SRAM?
+ fast enough random access time, but
− too low density to store 10 Gb of data.
Use DRAM?
+ high density means we can store the data, but
− too slow (~15 ns random access time).
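The 8 ns figure is just the serialization time of a minimum-size packet at line rate; a quick check:

```python
def per_packet_ns(line_rate_bps, pkt_bytes=40):
    """Time between back-to-back minimum-size packets on the line."""
    return pkt_bytes * 8 / line_rate_bps * 1e9

budget = per_packet_ns(40e9)   # 8.0 ns per 40-byte packet at 40 Gb/s
# Each packet needs one write and one read, i.e. a memory access every
# ~4 ns on average -- beyond a single ~15 ns DRAM, while SRAM is fast
# enough but cannot hold 10 Gb.
```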
Nick McKeown 49
DRAM Buffer Memory: Packet Caches to implement a Central buffer

[Figure: the buffer manager keeps a small ingress SRAM cache of FIFO tails for arriving packets and a small egress SRAM cache of FIFO heads for departing packets; the bulk of every queue (1..Q) lives in DRAM buffer memory, which is read and written b >> 1 packets at a time.]
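A toy model of the idea: SRAM holds the head and tail of a queue, and DRAM is touched only in wide transfers of b packets, so each slow DRAM access amortizes over b packets' worth of line time. The value of b and the single-queue layout here are simplifications of the actual design:

```python
from collections import deque

class HybridBuffer:
    """Illustrative SRAM/DRAM packet buffer for one FIFO."""
    def __init__(self, b=8):
        self.b = b
        self.tail = deque()   # SRAM: most recently arrived packets
        self.dram = deque()   # bulk storage, moved b packets at a time
        self.head = deque()   # SRAM: next packets to depart
    def write(self, pkt):
        self.tail.append(pkt)
        if len(self.tail) >= self.b:              # one wide DRAM write
            self.dram.extend(self.tail.popleft() for _ in range(self.b))
    def read(self):
        if not self.head:
            if self.dram:                          # one wide DRAM read
                for _ in range(min(self.b, len(self.dram))):
                    self.head.append(self.dram.popleft())
            else:                                  # bypass: data still in SRAM
                self.head.extend(self.tail)
                self.tail.clear()
        return self.head.popleft() if self.head else None
```

Packets still depart in strict FIFO order; only the granularity of DRAM access changes.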
Nick McKeown 50
Conceptual Packet Buffering: Output buffering

[Figure: the conceptual architecture, with the packet buffers placed at the output side of the switching core.]
Nick McKeown 51
Output buffering

[Figure: N linecards; after header processing, packets cross the switch immediately and are queued in a buffer manager and buffer memory at each output.]
Nick McKeown 54
Speed-up

[Figure: with output buffering, each output's buffer memory must be able to accept packets from all N inputs at once: both the switch and the output memory must run at N times the line rate.]
Nick McKeown 55
Conceptual Packet Buffering: Input buffering with virtual output queues

[Figure: the conceptual architecture with buffers at the inputs, each input maintaining a separate virtual output queue (VOQ) per output, eliminating head-of-line blocking.]
Nick McKeown 56
Virtual Output Queues
Nick McKeown 57
Matching
A matching on a graph is a subset of its edges such that no two edges share a vertex.

[Figure: a graph of vertices and edges, with a matching highlighted.]
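The definition translates directly into code. A sketch, using (input, output) pairs as the edges a switch scheduler would pick, so that no input sends twice and no output receives twice in one time slot:

```python
def is_matching(edges):
    """A set of edges is a matching iff no vertex appears in two of them."""
    seen = set()
    for u, v in edges:
        if u in seen or v in seen:
            return False
        seen.update((u, v))
    return True
```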
Nick McKeown 59
A Router with Virtual Output Queues

[Figure: delay vs. load. With virtual output queues and a good matching algorithm, the router achieves the best that any queueing system can achieve, sustaining up to 100% load.]
Nick McKeown 67
Current Internet Router Technology: Summary

There are three potential bottlenecks: address lookup, packet buffering, and switching.
Techniques exist today for 100s of Tb/s Internet routers, with 400 Gb/s linecards.
But what comes next…?
Nick McKeown 68
Outline
Background What is a router? Why do we need faster routers? Why are they hard to build?
Architectures and techniques The evolution of router architecture. IP address lookup. Packet buffering. Switching.
The Future More parallelism. Eliminating schedulers. Introducing optics into routers.
The Future
Nick McKeown 72
Complex linecards

[Figure: a typical IP router linecard: optics and physical layer, framing & maintenance, packet processing with lookup tables, and buffer management & scheduling with buffer & state memory on both the ingress and egress paths, connected to the switch fabric under arbitration.]

10 Gb/s linecard: number of gates: 30M; amount of memory: 2 Gbits; cost: >$20k; power: 300W.
Nick McKeown 73
External Parallelism: Multiple Parallel Routers

[Figure: what we'd like: a single NxN IP router with a capacity of 100s of Tb/s. The building blocks we'd like to use: several smaller routers in parallel.]
Nick McKeown 74
Multiple parallel routers: Load Balancing

[Figure: traffic at rate R is spread across k parallel routers (1, 2, …, k), each carrying rate R/k, and recombined at rate R on the far side.]
Nick McKeown 75
Intelligent Packet Load-balancing: Parallel Packet Switching

[Figure: N external ports at rate R; a bufferless demultiplexor at each input spreads packets over k parallel routers, each running at rate R/k, and a multiplexor at each output recombines them.]
Nick McKeown 76
Parallel Packet Switching
Advantages:
Single stage of buffering.
No excess link capacity.
1/k the power per subsystem.
1/k the memory bandwidth.
1/k the lookup rate.
Nick McKeown 77
Parallel Packet Switching

Advantages:
Load-balancing: output links are less congested.
Scalability: new routers can be added dynamically.
Redundancy.
Nick McKeown 78
Parallel Packet Switch Theorem

If Speedup ≥ 2k/(k+2) ≈ 2, then a parallel packet switch can precisely emulate a single big router.
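The bound is easy to tabulate (a sketch; `pps_speedup` is just a name for the theorem's expression):

```python
def pps_speedup(k):
    """Speedup each of the k slower switch layers needs so that the
    parallel packet switch can precisely emulate a single big router."""
    return 2 * k / (k + 2)
```

The expression grows with k and approaches 2 from below, which is why a fixed speedup of 2 suffices for any number of layers.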
Nick McKeown 80
Eliminating schedulers: Two-Stage Switch

[Figure: external inputs 1..N connect through a first fixed round-robin stage (load balancing) to internal inputs 1..N, which connect through a second fixed round-robin stage to external outputs 1..N.]
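A toy model of the two stages, assuming fixed rotating permutations for both round-robins (the class and its layout are illustrative, not a reference design). The point is that neither stage consults queue state, so no scheduler is needed:

```python
class TwoStageSwitch:
    """Illustrative load-balanced two-stage switch: stage 1 spreads each
    input's packets round-robin over the internal inputs; stage 2 connects
    internal inputs to outputs round-robin. Both permutations are fixed
    rotations, independent of the queues."""
    def __init__(self, n):
        self.n = n
        # VOQs held at the internal inputs: mid[internal][output]
        self.mid = [[[] for _ in range(n)] for _ in range(n)]
        self.t = 0
    def step(self, arrivals):
        """arrivals: list of (input, output) packets for this time slot."""
        n, t = self.n, self.t
        for inp, out in arrivals:              # stage 1: rotating permutation
            internal = (inp + t) % n
            self.mid[internal][out].append((inp, out))
        delivered = []
        for internal in range(n):              # stage 2: rotating permutation
            out = (internal + t) % n
            if self.mid[internal][out]:
                delivered.append(self.mid[internal][out].pop(0))
        self.t += 1
        return delivered
```

A packet may wait at the middle stage until the rotation connects its internal input to its output, trading a little delay for scheduler-free operation.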
Nick McKeown 83
Optics in routers
[Figure: linecards connected to a separate switch core over optical links.]
Nick McKeown 85
The Stanford Phicticious Optical Router

[Figure: a 100 Tb/s IP router built from 625 linecards, each operating at 160 Gb/s; each linecard's traffic is split into 40 Gb/s streams feeding an optical 2-stage switch with 160-320 Gb/s internal links.]
Nick McKeown 90
References
General
1. J. S. Turner, “Design of a Broadcast Packet Switching Network”, IEEE Trans. Comm., June 1988, pp. 734-743.
2. C. Partridge et al., “A Fifty Gigabit per Second IP Router”, IEEE Trans. Networking, 1998.
3. N. McKeown, M. Izzard, A. Mekkittikul, W. Ellersick, M. Horowitz, “The Tiny Tera: A Packet Switch Core”, IEEE Micro Magazine, Jan-Feb 1997.

Fast Packet Buffers
1. Sundar Iyer, Ramana Rao, Nick McKeown, “Design of a fast packet buffer”, IEEE HPSR 2001, Dallas.
Nick McKeown 91
References
IP Lookups
1. A. Brodnik, S. Carlsson, M. Degermark, S. Pink, “Small Forwarding Tables for Fast Routing Lookups”, Sigcomm 1997, pp. 3-14.
2. B. Lampson, V. Srinivasan, G. Varghese, “IP lookups using multiway and multicolumn search”, Infocom 1998, pp. 1248-56, vol. 3.
3. M. Waldvogel, G. Varghese, J. Turner, B. Plattner, “Scalable high speed IP routing lookups”, Sigcomm 1997, pp. 25-36.
4. P. Gupta, S. Lin, N. McKeown, “Routing lookups in hardware at memory access speeds”, Infocom 1998, pp. 1241-1248, vol. 3.
5. S. Nilsson, G. Karlsson, “Fast address lookup for Internet routers”, IFIP Intl Conf on Broadband Communications, Stuttgart, Germany, April 1-3, 1998.
6. V. Srinivasan, G. Varghese, “Fast IP lookups using controlled prefix expansion”, Sigmetrics, June 1998.
Nick McKeown 92
References
Switching
• N. McKeown, A. Mekkittikul, V. Anantharam, and J. Walrand, “Achieving 100% Throughput in an Input-Queued Switch”, IEEE Transactions on Communications, 47(8), Aug 1999.
• A. Mekkittikul and N. W. McKeown, “A practical algorithm to achieve 100% throughput in input-queued switches”, in Proc. IEEE INFOCOM ’98, March 1998.
• L. Tassiulas, “Linear complexity algorithms for maximum throughput in radio networks and input queued switches”, in Proc. IEEE INFOCOM ’98, San Francisco, CA, April 1998.
• D. Shah, P. Giaccone and B. Prabhakar, “An efficient randomized algorithm for input-queued switch scheduling”, in Proc. Hot Interconnects 2001.
• J. Dai and B. Prabhakar, “The throughput of data switches with and without speedup”, in Proc. IEEE INFOCOM ’00, Tel Aviv, Israel, March 2000, pp. 556-564.
• C.-S. Chang, D.-S. Lee, Y.-S. Jou, “Load balanced Birkhoff-von Neumann switches”, Proc. IEEE HPSR ’01, May 2001, Dallas, Texas.
Nick McKeown 93
References
Future
• C.-S. Chang, D.-S. Lee, Y.-S. Jou, “Load balanced Birkhoff-von Neumann switches”, Proc. IEEE HPSR ’01, May 2001, Dallas, Texas.
• Pablo Molinero-Fernández, Nick McKeown, “TCP Switching: Exposing circuits to IP”, Hot Interconnects IX, Stanford University, August 2001.
• S. Iyer, N. McKeown, “Making parallel packet switches practical”, in Proc. IEEE INFOCOM ’01, April 2001, Alaska.
• I. Keslassy et al., “Scaling Internet Routers Using Optics”, in Proc. ACM SIGCOMM ’03, August 2003, Germany.