High-speed serial links: Design Trends and Challenges
Vladimir Stojanović
Integrated Systems Group, Massachusetts Institute of Technology
Backbone router – lots of high-speed links
State of the art: up to 1 Tb/s throughput
Lots of line cards – power-constrained system
What matters is energy cost per bit
source: Juniper Networks; Alcatel, Tyco
Inside the router
Line cards: 8 to 16 per system
Switch cards: 2 to 4 per system
Passive backplane
[Figure: router block diagram. Line card blocks: optics, SerDes, NPU, MAC, TM/fabric IF, MEM. Switch card blocks: SerDes, crossbar. Link labels: OC-192; 12.5 Gb/s laser driver link; 4 x 3.125 Gb/s XAUI serial links (chip-to-chip); 3.125-12.5 Gb/s backplane serial links]
Regardless of where the links are, there is a constant desire to signal faster and with less power
Scaling the throughput to 100 Tb/s
Electrical I/O challenges
100 Tb/s I/O throughput with 10 Gb/s per link
  10,000 transceivers
  20,000 high-speed I/O pairs
  10,000 mm2 in 0.13 µm technology
Power: 4 kW
  40 mW/Gb/s – energy cost per bit
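The budget on this slide follows from simple arithmetic. The sketch below reproduces the transceiver count, pair count and power figure. The function name `io_budget` and the rule of two differential pairs per transceiver (one transmit, one receive) are assumptions made for illustration; the slide gives only the totals.

```python
def io_budget(total_gbps, link_gbps, mw_per_gbps):
    """Return (transceivers, io_pairs, power_watts) for an electrical I/O budget."""
    transceivers = round(total_gbps / link_gbps)
    # Assumption: each transceiver uses one Tx pair and one Rx pair.
    io_pairs = 2 * transceivers
    # Energy cost per bit in mW/(Gb/s) times throughput in Gb/s gives mW.
    power_watts = total_gbps * mw_per_gbps / 1000.0
    return transceivers, io_pairs, power_watts


# 100 Tb/s = 100,000 Gb/s, 10 Gb/s per link, 40 mW/Gb/s
transceivers, io_pairs, power_watts = io_budget(100_000, 10, 40)
print(transceivers, io_pairs, power_watts)
```

Calling the same function with 1 mW/Gb/s gives 100 W for the same throughput.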
Scaling the throughput to 100 Tb/s
Density issues
Connectors
  50 diff pairs/inch
  400" long connector
Trace routing
  50 mils pitch
  250" wide 4-signal-layer line card
  Backplane less critical
Package
  Package/chip ball pitch (1 mm / 200 µm)
  4000 mm2 / 160 mm2
source: Teradyne, Rambus
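The connector and trace-routing figures can be checked the same way. This is a minimal sketch that assumes 20,000 differential pairs and one routing pitch per pair, spread evenly over the signal layers; the function names are hypothetical, and the package-area numbers are not modeled because the slide does not state how they were derived.

```python
def connector_length_in(diff_pairs, pairs_per_inch):
    """Connector length in inches for a given pair density."""
    return diff_pairs / pairs_per_inch


def routing_width_in(diff_pairs, pitch_mils, signal_layers):
    """Board width in inches needed to route all pairs.

    Assumption: each pair occupies one routing pitch and pairs are
    spread evenly across the signal layers.
    """
    return diff_pairs * pitch_mils / 1000.0 / signal_layers


print(connector_length_in(20_000, 50))   # connector length in inches
print(routing_width_in(20_000, 50, 4))   # line-card width in inches
```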
Design challenge
Goal
  Fit 100 Tb/s on a 100 W crossbar chip
  Reasonable system/rack size
Need
  Power: reduce energy/bit to 1 mW/Gb/s
  Density: increase data rate per link by 10-15x
What makes it challenging
[Figure: high-speed link chip]
> 2 GHz signals
Now, the bandwidth limit is in wires
source: Rambus
High-speed link efficiency – energy cost per bit
How efficient are high-speed links?
[Figure: energy cost per bit, mW/(Gb/s), log scale: 56 Kb/s V.92 modem, 12x12Mb/s ADSL modem, Gigabit Ethernet, 10 Gb/s high-speed link]
[Figure: energy cost per bit [mW/Gb/s] by link block (TxTap, RxTap, RxSamp, PLL, CDR), PAM2 vs. PAM4]
[Figure: energy cost per bit [mW/Gb/s] vs. data rate [Gb/s], 0 to 20 Gb/s; curves: PAM2 Tx5 Rx20, PAM2 Tx5 Rx1+20, PAM2 Tx50 Rx80, PAM4 Tx5 Rx20, PAM4 Tx50 Rx80]
2-3 orders of magnitude more energy-efficient than traditional wireline systems
Starting to pay the price for band-limited channels
Outline
Show the path to efficient 100 Tb/s systems
Look at all aspects of system design
  High-speed link environment
  Improving the channel
  What can chips do?
Backplane environment
[Figure: signal path through the backplane; labels: on-chip parasitic (termination resistance and device loading capacitance), package, package via, line card trace, line card via, backplane connector, backplane via, backplane trace]
Line attenuation
Reflections from stubs (vias)
Backplane channel
Loss is variable
  Same backplane
  Different lengths
  Different stubs (top vs. bottom)
Attenuation is large
  >30 dB @ 3 GHz
  But is that bad?
Required signal amplitude set by noise
[Figure: attenuation [dB] vs. frequency [GHz], 0 to 10 GHz; curves: 9" FR4; 9" FR4, via stub; 26" FR4; 26" FR4, via stub]
Interference
[Figure, left: attenuation [dB] vs. frequency [GHz], 0 to 10 GHz; curves: THROUGH, FEXT, NEXT]
[Figure, right: pulse response vs. time [ns], 0 to 3 ns; Tsymbol = 160 ps]
Inter-symbol interference
  Dispersion (skin effect, dielectric loss) – short latency
  Reflections (impedance mismatches – connectors, via stubs, device parasitics, package) – long latency
Co-channel interference (far-end and near-end crosstalk)
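To make the inter-symbol interference point concrete, the sketch below takes symbol-spaced samples of a pulse response and compares the main cursor with the sum of all other samples. The sample values are hypothetical and are not read off the slide's plot; `worst_case_eye` is an illustrative name. A positive margin means a binary eye is still open under the worst-case data pattern.

```python
def worst_case_eye(pulse_samples, cursor):
    """Return (margin, isi) for symbol-spaced pulse-response samples.

    isi is the sum of magnitudes of every sample except the main cursor,
    i.e. the worst-case interference from neighboring symbols.
    margin is the main cursor minus that worst-case interference.
    """
    main = pulse_samples[cursor]
    isi = sum(abs(h) for i, h in enumerate(pulse_samples) if i != cursor)
    return main - isi, isi


# Hypothetical samples: one pre-cursor, the main cursor, three post-cursors
# (the last one negative, as a reflection might produce).
samples = [0.05, 0.6, 0.25, 0.1, -0.05]
margin, isi = worst_case_eye(samples, cursor=1)
print(margin, isi)
```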
Reflections and crosstalk
Don't just receive the signal you want
  Get versions of signals "close" to you
  Vertical connections have worst coupling
  "Close" in these vertical connection regions
[Figure: desired signal, reflections, far-end XTALK (FEXT), near-end XTALK (NEXT)]
source: Sercu, DesignCon03
A complex system
PCB only
PCB + connectors
PCB, connectors, via stubs & devices
Outline
Show the path to efficient 100 Tb/s systems
Look at all aspects of system design
  High-speed link environment
  Improving the channel
  What can chips do?
Dispersion: material loss
[Figure: attenuation vs. frequency [Hz], 1.0E+06 to 1.0E+10, for an FR4 dielectric, 8 mil wide, 1 m long, 50 Ohm strip line; curves: total loss, conductor loss, dielectric loss. source: Kollipara, DesignCon03]
PCB loss: skin & dielectric loss
  Skin loss ∝ √f
  Dielectric loss ∝ f: a bigger issue at higher f
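The two scaling laws can be combined into a simple loss model. The sketch below assumes loss in dB with a skin term proportional to √f and a dielectric term proportional to f; the coefficients are hypothetical placeholders, not values for FR4 or for the trace in the figure. Under this model the dielectric term overtakes the skin term above f = (skin_coeff / diel_coeff)².

```python
import math


def trace_loss_db(f_ghz, skin_coeff, diel_coeff):
    """Return (skin, dielectric, total) loss in dB at f_ghz.

    skin_coeff is in dB/sqrt(GHz), diel_coeff in dB/GHz.
    Both are hypothetical fitting parameters.
    """
    skin = skin_coeff * math.sqrt(f_ghz)
    diel = diel_coeff * f_ghz
    return skin, diel, skin + diel


def crossover_ghz(skin_coeff, diel_coeff):
    """Frequency at which dielectric loss equals skin loss."""
    return (skin_coeff / diel_coeff) ** 2


# Hypothetical coefficients chosen only to show the trend.
for f in (1.0, 4.0, 16.0):
    print(f, trace_loss_db(f, skin_coeff=2.0, diel_coeff=1.0))
print(crossover_ghz(2.0, 1.0))
```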
Better dielectric
[Figure: channel loss curves: Rogers, FR4, FR4 + stubs]
Rogers is expensive – but smallest loss
source: Alcatel, Tyco
Minimizing reflections - the vias
Minimizing via stubs
  Thinner PCBs are better… but sometimes impossible
  Counter-boring
  Blind vias
  SMT technology
All are costly
  1.1x - 2x
[Figure: plated through hole, counter-bored via, blind via]
Connector technologies
[Figure: connector cross-sections (labels: BP, LC, microvia, trace): Standard - Press-Fit; Side-Interface (Tyco); Orthogonal - Teradyne (Differential Plated Through Hole); Surface-mount + microvia]
Stubs big problem in standard press-fit connectors
Side-Interface eliminates DC stubs and diff-pair length mismatch
Orthogonal interconnect DPTH eliminates the backplane
Eliminating the backplane - orthogonal interconnect
No backplane trace
No backplane via-stub
Coax-like shielding and diff-pair matching in DPTH mid-plane
source: Teradyne
M. Cartier et al., "Optimized Signal Path for Orthogonal System Architectures," DesignCon 2005.
DPTH connector performance
[Figure: no shared vias (non-DPTH) vs. shared vias (DPTH)]
Insertion loss of DPTH very small
Reflections minimized
NEXT and FEXT minimized
Outline
Show the path to efficient 100 Tb/s systems
Look at all aspects of system design
  High-speed link environment
  Improving the channel
  What can chips do?
New link design
Dealing with bandwidth-limited channels
This is an old research area
  Textbooks on digital communications
  Think modems, DSL
But can't directly apply their solutions
  Standard approach requires high-speed A/Ds and digital signal processing
  20 Gs/s A/Ds are expensive
(Un)fortunately need to rethink issues
Baseline channels
[Figure: S21 [dB] vs. frequency [GHz], 0 to 15 GHz; curves: Short ATCA BP, 3"; N6K BP, 26"; Legacy FR4 BP, 26", via stub]
Legacy (FR4) - lots of reflections
Microwave engineered (N6K)
Emerging standards (IEEE 802.3ap, ATCA)
Capacity and multi-tone (MT) data rates – the impact of noise
[Figure, left: capacity [Gb/s] vs. noise factor [dB], 0 to 20 dB, with thermal and phase noise; curves: Short ATCA BP, N6K BP, Legacy FR4 BP]
[Figure, right: uncoded MT data rate [Gb/s] vs. noise factor [dB], 0 to 20 dB, with thermal and phase noise; curves: Short ATCA BP, N6K BP, Legacy FR4 BP]
Capacity
  Much higher than data rates in today's links
Noise
  Thermal – 50 Ohm termination
  Phase noise – best LC PLL (0.14%UI rms)
Uncoded MT
  Half the capacity
  BER target of 10^-15
  Peak-power constraint
  Coding can help
Removing ISI – baseband link
[Figure: transceiver block diagram with a linear transmit equalizer, the channel and a decision-feedback equalizer; labels: Tx data, anticausal taps, causal taps, tap select logic, sampled data, deadband, feedback taps, outP, outN, 50 Ω terminations, Ieq]
Transmit and receive equalization
  Changes signal to correct for ISI
  Often easier to work at transmitter
  DACs easier than ADCs
J. Zerbe et al., "Design, Equalization and Clock Recovery for a 2.5-10Gb/s 2-PAM/4-PAM Backplane Transceiver Cell," IEEE Journal of Solid-State Circuits, Dec. 2003.
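A decision-feedback equalizer subtracts the ISI contributed by already-decided symbols before slicing the current one. The sketch below is a noiseless, idealized illustration, not the circuit in the cited paper: it assumes binary ±1 symbols, a hypothetical three-sample pulse response, and feedback taps that match the post-cursor samples exactly. The transmit equalizer shown on the slide is not modeled.

```python
def channel(symbols, pulse):
    """Convolve +/-1 symbols with a symbol-spaced pulse response."""
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, h in enumerate(pulse):
            if n - k >= 0:
                acc += h * symbols[n - k]
        out.append(acc)
    return out


def slicer(v):
    """Binary decision: +1 for non-negative input, -1 otherwise."""
    return 1 if v >= 0 else -1


def dfe(received, feedback_taps):
    """Subtract post-cursor ISI estimated from past decisions, then slice."""
    decisions = []
    for n, y in enumerate(received):
        correction = 0.0
        for k, tap in enumerate(feedback_taps, start=1):
            if n - k >= 0:
                correction += tap * decisions[n - k]
        decisions.append(slicer(y - correction))
    return decisions


pulse = [1.0, 0.7, 0.5]              # hypothetical: main cursor + 2 post-cursors
tx = [-1, -1, 1, -1, 1, 1, -1, 1]
rx = channel(tx, pulse)
plain = [slicer(y) for y in rx]      # slicing without equalization
equalized = dfe(rx, pulse[1:])       # ideal feedback taps
print(plain == tx, equalized == tx)
```

With this pulse response the post-cursor ISI exceeds the main cursor, so the unequalized slicer makes errors while the ideal DFE recovers the data. Real links also have noise and error propagation, which this sketch ignores.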
Pulse amplitude modulation
Binary (NRZ)
  1 bit / symbol
  Symbol rate = bit rate
PAM4
  2 bits / symbol
  Symbol rate = bit rate/2
[Figure: signal levels for binary (0, 1) and PAM4 (00, 01, 10, 11)]
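The relation between bit rate and symbol rate can be shown with a small mapper. The sketch below assumes Gray-coded PAM4 levels of -3, -1, +1, +3; the slide does not state which bit pair maps to which level, so the table `GRAY_PAM4` is an assumption made for illustration.

```python
# Assumed Gray mapping: adjacent levels differ in exactly one bit.
GRAY_PAM4 = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def pam4_modulate(bits):
    """Map a bit sequence to PAM4 symbols, two bits per symbol."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [GRAY_PAM4[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def symbol_rate_gsps(bit_rate_gbps, bits_per_symbol):
    """Symbol rate needed to carry a given bit rate."""
    return bit_rate_gbps / bits_per_symbol


print(pam4_modulate([0, 0, 0, 1, 1, 1, 1, 0]))
print(symbol_rate_gsps(10, 1), symbol_rate_gsps(10, 2))  # binary vs. PAM4
```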
Multi-level: offset and jitter are crucial
[Figure: data rate [Gb/s] vs. symbol rate [Gs/s], 0 to 20 Gs/s, for PAM2, PAM4, PAM8 and PAM16; three panels: thermal noise; thermal noise + offset; thermal noise + offset + jitter]
To make better use of available bandwidth, need better circuits
PAM2/PAM4 robust candidate for next-generation links
Full ISI compensation too costly
[Figure: data rate [Gb/s] vs. symbol rate [Gs/s], 0 to 16 Gs/s, for PAM2, PAM4, PAM8 and PAM16; three panels: thermal noise; thermal noise + offset; thermal noise + offset + jitter]
Today's links cannot afford to compensate all ISI
  Too much power
  Limits today's maximum achievable data rates
Capacity – Bit Loading
[Figure: # bits per dimension vs. frequency [GHz], 0 to 15 GHz; two panels: excess noise factor 0 dB and 20 dB; curves: Short ATCA BP, N6K BP, Legacy FR4 BP]
Bandwidth is limited by attenuation and noise
  Can't just keep increasing the signaling frequency
  Need to focus on available bandwidth (at most 10-20 GHz)
Need circuits that can create/sense 4-8 bits/dim
Uncoded Multi-tone – Bit Loading
[Figure: # bits per dimension vs. frequency [GHz], 0 to 15 GHz; two panels: excess noise factor 0 dB and 20 dB; curves: Short ATCA BP, N6K BP, Legacy FR4 BP]
Integer constellations and target BER = 10^-15
Bandwidth not affected much (still 10-20 GHz)
In high-noise case - less advantage over baseband
With coding can improve by up to 2x – closer to capacity
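The gap between capacity and uncoded multi-tone loading can be sketched with the standard SNR-gap approximation, b = ½·log2(1 + SNR/Γ) bits per dimension, rounded down for integer constellations. Everything numeric below is an assumption for illustration: the SNR profile is hypothetical rather than taken from the measured backplanes, the 13 dB gap is a placeholder for the penalty of running uncoded at a very low BER, and the rate calculation assumes two dimensions per hertz per second.

```python
import math


def bits_per_dim(snr_db, gap_db=0.0, integer=False):
    """Bits per dimension from the SNR-gap approximation.

    gap_db = 0 gives the capacity expression; a positive gap models an
    uncoded system at a target error rate (value assumed, not from the slides).
    """
    snr = 10 ** (snr_db / 10)
    gap = 10 ** (gap_db / 10)
    b = 0.5 * math.log2(1 + snr / gap)
    return math.floor(b) if integer else b


def data_rate_gbps(snr_profile_db, tone_width_ghz, gap_db=0.0, integer=False):
    """Sum the loading over tones, assuming 2 dimensions per Hz per second."""
    return sum(2 * tone_width_ghz * bits_per_dim(s, gap_db, integer)
               for s in snr_profile_db)


# Hypothetical channel: fifteen 1 GHz tones, SNR falling 3 dB per tone.
snr_profile_db = [50 - 3 * k for k in range(15)]
capacity = data_rate_gbps(snr_profile_db, 1.0)
uncoded = data_rate_gbps(snr_profile_db, 1.0, gap_db=13.0, integer=True)
print(capacity, uncoded)
```

Tones whose integer loading falls to zero carry no data, which is one way to see why the useful bandwidth is bounded regardless of how high the signaling frequency is pushed.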
Impact of jitter on baseband
[Figure: log10 BER vs. jitter factor [dB], 0 to 20 dB; two panels: Legacy FR4 BP and Short ATCA BP; curves labeled by data rate, from 6 Gb/s to 25 Gb/s]
With proper coding
  Increase data rate
  Relax PLL jitter spec – save power
Original jitter – rms = 1.4%UI (ring oscillator based PLL)
BER vs. hardware complexity
[Figure: log10 BER vs. # feedback eq taps; two panels: Legacy FR4 BP and Short ATCA BP; curves labeled by data rate, from 6 Gb/s to 25 Gb/s]
Partially eliminate ISI (leave most of the reflections)
Let simple code take care of the rest
  Can recover from raw BER of 10^-5
  And save up to 50 feedback taps - up to 15 mW/Gb/s in 0.13 µm
But, need to be careful
Always know what you're optimizing
  Powerful coders/encoders often costly
Example - fastest RS (255,239) implementation
  10 – 40 Gb/s throughput
  Energy cost - 12 mW/Gb/s
  50x area of the high-speed link (extensive parallelism)
Need to include the energy cost per bit in the code design spec
L. Song, M-L Yu, M.S. Shaffer, "10- and 40-Gb/s Forward Error Correction Devices for Optical Communications," IEEE Journal of Solid-State Circuits, vol. 37, no. 11, Nov. 2002.
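A quick calculation shows why the code's energy cost belongs in the spec. The sketch below derives the basic parameters of an RS(n, k) code and then compares a code's energy cost with an equalization saving. Setting the 12 mW/Gb/s RS figure against the 15 mW/Gb/s feedback-tap saving quoted on the previous slide is only an illustrative pairing, since those numbers come from different designs; the function names are hypothetical.

```python
def rs_params(n, k):
    """Code rate, overhead and symbol-error correction capability of RS(n, k)."""
    parity = n - k
    return {
        "rate": k / n,
        "overhead": parity / k,
        "correctable_symbols": parity // 2,
    }


def net_energy_saving(eq_saving_mw_per_gbps, code_cost_mw_per_gbps):
    """Energy per bit saved after paying for the coder and decoder."""
    return eq_saving_mw_per_gbps - code_cost_mw_per_gbps


print(rs_params(255, 239))
print(net_energy_saving(15, 12))
```

A powerful code that costs nearly as much energy as the equalizer taps it replaces leaves little net benefit, which is the slide's argument for simpler codes designed against an energy budget.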
Opportunity for coding
Break the coding/equalization/modulation hierarchy
  Goal to minimize overall energy cost per bit
Proper coding can be more energy-efficient in achieving the low BER than modulation/equalization
  Especially with lots of crosstalk and numerous small reflections
Need new paradigms in code development to specification
  Non-Gaussian (system) noise
  Circuit non-idealities
  Crosstalk and residual channel memory (ISI)
  Energy cost constraint on code performance
Bridging the gap: Multi-tone link
[Figure: multi-tone data rates with thermal noise: # bits/Hz vs. frequency [GHz], 0 to 14 GHz; Nelco 64 Gb/s, FR4 38 Gb/s]
A. Amirkhany, V. Stojanovic, M.A. Horowitz, "Multi-tone Signaling for High-speed Backplane Electrical Links," IEEE Global Telecommunications Conference, November 2004.
Bridging the gap: Multi-tone link
[Figure: multi-tone data rates with thermal noise: # bits/Hz (# levels) vs. frequency [GHz], 0 to 14 GHz; Nelco 64 Gb/s, FR4 38 Gb/s]
[Figure: multi-tone transceiver block diagram; labels: data0, data1, …, dataN; LPF and BPF filters; mixers e^{jw1t} … e^{jwNt}]
Challenge – balancing the inter-symbol and inter-channel interference
  Microwave filter techniques
  Custom signal processing
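The idea behind the block diagram can be illustrated with an idealized complex-baseband model: each data stream is mixed up to its own tone, the tones are summed, and the receiver mixes each tone back down and averages over one symbol. This is a sketch under strong assumptions, not the analog filter-based architecture on the slide: the tones are spaced so they are exactly orthogonal over a symbol, the averaging stands in for the low-pass filter, and the channel is ideal, so the inter-symbol and inter-channel interference that the slide identifies as the challenge do not appear.

```python
import cmath


def modulate(symbols, samples_per_symbol):
    """Sum of tones: symbol k rides on exp(j*2*pi*k*t), t in [0, 1) symbols.

    Tone 0 is the baseband channel (no mixing).
    """
    out = []
    for n in range(samples_per_symbol):
        t = n / samples_per_symbol
        out.append(sum(a * cmath.exp(2j * cmath.pi * k * t)
                       for k, a in enumerate(symbols)))
    return out


def demodulate(waveform, num_tones):
    """Mix each tone down and average over the symbol (integrate-and-dump)."""
    n_samp = len(waveform)
    result = []
    for k in range(num_tones):
        acc = sum(w * cmath.exp(-2j * cmath.pi * k * n / n_samp)
                  for n, w in enumerate(waveform))
        result.append(acc / n_samp)
    return result


tx_symbols = [1, -3, 3, -1]          # one PAM4 symbol per tone (hypothetical data)
waveform = modulate(tx_symbols, samples_per_symbol=16)
rx_symbols = demodulate(waveform, num_tones=len(tx_symbols))
print([round(s.real, 6) for s in rx_symbols])
```

The number of samples per symbol must be at least the number of tones for the discrete tones to stay orthogonal.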
Conclusions
Interfaces are challenging system designs
  Good space to explore system-level optimization
Better backplanes are around the corner
  2-3x improvement in data rate possible
State-of-the-art baseband links (chips)
  Far from utilizing the capacity of the channels
  10-20x difference in data rates
  Looking into multi-tone and coding to bridge the gap
  Useful channel bandwidth 10-20 GHz
  Focus on lower-speed precision circuits for higher-order constellations
Coding
  If careful, can lower the energy cost per bit for the whole system
  Problem formulation different in so many ways
Acknowledgments
MARCO Interconnect Focus Center
Jared Zerbe and Ravi Kollipara - Rambus
John D'Ambrosia – Tyco
IEEE 802.3ap, ATCA forum
Alcatel, Teradyne, Juniper Networks