M.H. Perrott
High Speed Communication Circuits and Systems
Lecture 22
Delay Locked Loops and High Speed Circuit Highlights
Michael H. Perrott
April 28, 2004
Copyright © 2004 by Michael H. Perrott
All rights reserved.
Recall the CDR Model (Hogge Det.) From Lecture 21
Similar to the frequency synthesizer model, except:
- No divider
- Phase detector gain depends on the transition density of the input data
[Figure: CDR loop model. Φdata(t) → Hogge detector (phase sampler, gain α/π) → e(t) → charge pump (Icp) → i(t) → loop filter H(s) → v(t) → VCO (2πKv/s) → Φout(t), fed back to the detector.]
α = transition density, 0 < α < 1 (α = 1/2 for PRBS input)
Key Observation: Must Use a Type II Implementation
Integrator in H(s) forces the steady-state phase error to zero
- Important to achieve aligned clock and to minimize jitter
[Figure: the CDR loop model of the previous slide, with the loop filter H(s) drawn as a network of C1, C2, and R1 that converts the charge pump current i(t) into v(t).]
α = transition density, 0 < α < 1 (α = 1/2 for PRBS input)
Loop filter transfer function:
$$H(s) = \frac{1+s/w_z}{sC_{tot}(1+s/w_p)} = \frac{1}{s(C_1+C_2)}\cdot\frac{1+sR_1C_2}{1+sR_1C_{||}}, \qquad C_{||} = \frac{C_1C_2}{C_1+C_2}$$
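As a quick numerical check of the loop filter expression, the sketch below evaluates H(s) both from the circuit and from the pole/zero form and confirms that they agree. It assumes the network is C1 in parallel with a series R1-C2 branch (inferred from the zero at 1/(R1·C2), not stated on the slide); the component values and function names are hypothetical.

```python
import math

def loop_filter_params(r1, c1, c2):
    """Return (C_tot, w_z, w_p) for the assumed C1 || (R1 + C2) network."""
    c_tot = c1 + c2
    c_par = c1 * c2 / (c1 + c2)   # C|| = C1*C2/(C1+C2)
    w_z = 1.0 / (r1 * c2)         # stabilizing zero (rad/s)
    w_p = 1.0 / (r1 * c_par)      # parasitic pole (rad/s)
    return c_tot, w_z, w_p

def h_from_circuit(s, r1, c1, c2):
    """Impedance v/i of C1 in parallel with the series R1-C2 branch."""
    z_c1 = 1.0 / (s * c1)
    z_rc = r1 + 1.0 / (s * c2)
    return z_c1 * z_rc / (z_c1 + z_rc)

def h_from_pole_zero(s, r1, c1, c2):
    """Same transfer function written as (1+s/wz) / (s*Ctot*(1+s/wp))."""
    c_tot, w_z, w_p = loop_filter_params(r1, c1, c2)
    return (1 + s / w_z) / (s * c_tot * (1 + s / w_p))

# Hypothetical component values, chosen only for illustration
R1, C1, C2 = 10e3, 1e-12, 20e-12
s = 2j * math.pi * 1e6
h_a = h_from_circuit(s, R1, C1, C2)
h_b = h_from_pole_zero(s, R1, C1, C2)
c_tot, w_z, w_p = loop_filter_params(R1, C1, C2)
```

With these values the pole sits a factor (C1+C2)/C1 above the zero, which is why C1 is normally kept much smaller than C2.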
Issue: Type II System Harder to Design than Type I
A stabilizing zero is required
Undesired closed loop pole/zero doublet causes peaking
[Figure: left panel, "Evaluation of Phase Margin": open loop gain 20log|A(f)| and angle(A(f)) versus f, with fz, fp, and the 0 dB crossing marked for three open loop gain settings A, B, and C (phase axis ticks at 120°, 140°, 160°, and 180° of lag). Right panel, "Closed Loop Pole Locations of G(f)": s-plane (Re{s}, Im{s}) showing the dominant pole pair and the non-dominant pole for A, B, and C.]
PM = 53° for A, PM = 54° for B, PM = 55° for C
Delay Locked Loops
Delay element used in place of a VCO
- No integration from voltage input to phase output
- System is Type I
[Figure: two loop models. Top: the CDR model with Hogge detector (gain α/π), charge pump (Icp), loop filter H(s), and VCO (2πKv/s). Bottom: the DLL model, identical except that a voltage-controlled delay element (2πKv, with no 1/s term) replaces the VCO.]
α = transition density, 0 < α < 1 (α = 1/2 for PRBS input)
System Design Is Easier Than For CDR
No stabilizing zero required
- No peaking in closed loop frequency response
[Figure: left panel, "Evaluation of Phase Margin": open loop gain 20log|A(f)| and angle(A(f)) versus f, with fp, fp2, fp3, and the 0 dB crossing marked as the open loop gain is increased through settings A, B, and C (phase axis from -90° to -315°). Right panel, "Closed Loop Pole Locations of G(f)": s-plane (Re(s), Im(s)) showing the dominant pole pair and the non-dominant poles for A, B, and C.]
PM = 72° for A, PM = 51° for B, PM = -12° for C
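The trend in the phase margin plot can be reproduced with a short numerical sketch. It assumes a Type I open loop response A(s) = K / (s·(1+s/wp)(1+s/wp2)(1+s/wp3)) and uses hypothetical pole frequencies and gain settings, so the numbers it produces are not the ones on the slide; only the trend (phase margin falls, and eventually goes negative, as gain rises) is the point.

```python
import numpy as np

def type1_phase_margin_deg(k, pole_freqs_hz):
    """Phase margin of A(s) = k / (s * prod(1 + s/(2*pi*fp))), in degrees."""
    f = np.logspace(3, 10, 200001)                        # frequency grid (Hz)
    ratio = f[:, None] / np.asarray(pole_freqs_hz)[None, :]
    mag = k / (2 * np.pi * f) / np.prod(np.sqrt(1 + ratio**2), axis=1)
    # One integrator contributes -90 degrees; each pole adds -atan(f/fp)
    phase_deg = -90.0 - np.degrees(np.sum(np.arctan(ratio), axis=1))
    idx = np.argmax(mag < 1.0)                            # first point past 0 dB
    return 180.0 + phase_deg[idx]

# Hypothetical parasitic poles and three increasing gain settings
POLES_HZ = [10e6, 50e6, 100e6]
GAINS = [2 * np.pi * 1e6, 2 * np.pi * 10e6, 2 * np.pi * 100e6]
margins = [type1_phase_margin_deg(k, POLES_HZ) for k in GAINS]
```

Because the only low frequency phase contribution is the single integrator, no zero is needed to recover margin; the designer only has to keep the crossover well below the parasitic poles.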
Example Delay-Locked Loop Implementation
Assume an input clock is provided that is perfectly matched in frequency to the data sequence
- However, phase must be adjusted to compensate for propagation delays between clock and data on the PC board
A variable delay element is used to lock phase to the appropriate value
- Phase detector can be similar to that used in a CDR
  - Hogge, Bang-Bang, or other structures possible
[Figure: DLL block diagram. clk(t) passes through an adjustable delay element to produce adjusted clk(t); the PD takes data(t) and adjusted clk(t) and produces retimed data(t) and e(t); e(t) drives the charge pump and loop filter, whose output v(t) controls the delay element. Annotations: clock and data arrive misaligned in phase, and are re-aligned in phase after the delay element.]
The Catch
Delay needs to support an infinite range if the system is to be operated continuously
- Can otherwise end up at the end of the range of the delay element
  - Won’t be able to accommodate temperature variations
Methods have been developed to achieve infinite range delay elements
- Efficient implementation of such delay elements is often the key issue for high performance designs
[Figure: DLL block diagram, repeated from the previous slide.]
The Myth
Delay locked loop designers always point to the jitter accumulation problem of phase locked loops
- Implication is that delay locked loops can achieve much lower jitter than clock and data recovery circuits
The reality: phase locked loops can actually achieve lower jitter than delay locked loops
- PLLs can clean up high frequency jitter of the input clock
- Whether a PLL or DLL is better depends on the application (and achievable VCO performance)
[Figure: DLL block diagram, repeated from the previous slides.]
One Method of Achieving Infinite Delay
Phase shift of a sine wave can be implemented with I/Q modulation
$$\cos(2\pi f_{in}t+\phi) = \cos(2\pi f_{in}t)\cos(\phi) - \sin(2\pi f_{in}t)\sin(\phi)$$
[Figure: I/Q plane with a phasor at angle φ; input cos(2πf_in t), output cos(2πf_in t + φ).]
Note: infinite delay range allows the DLL to be used to adjust frequency as well as phase
- Phase adjustment now must vary continuously
- Hard to get low jitter in practical implementations
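The I/Q identity is easy to confirm numerically. The sketch below uses ideal cosine and sine inputs and a hypothetical 1 GHz input frequency; it shows that the phase setting is unbounded (φ and φ + 2π give the same output) and that a phase ramp acts as a frequency offset, which is the behavior the note above refers to.

```python
import numpy as np

def iq_phase_shift(i_in, q_in, phi):
    """Weight I by cos(phi) and Q by sin(phi) to rotate the phase by phi."""
    return i_in * np.cos(phi) - q_in * np.sin(phi)

F_IN = 1e9                                   # hypothetical input frequency (Hz)
t = np.linspace(0.0, 5.0 / F_IN, 1001)       # five input periods
i_in = np.cos(2 * np.pi * F_IN * t)
q_in = np.sin(2 * np.pi * F_IN * t)

# The phase setting wraps: phi and phi + 2*pi produce the same waveform,
# so the element never runs out of range.
out_a = iq_phase_shift(i_in, q_in, 0.3)
out_b = iq_phase_shift(i_in, q_in, 0.3 + 2 * np.pi)

# A phase that ramps linearly in time is equivalent to a frequency offset.
F_OFFSET = 1e6                               # hypothetical offset (Hz)
out_ramp = iq_phase_shift(i_in, q_in, 2 * np.pi * F_OFFSET * t)
```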
Conceptual Implementation of Infinite Delay Range
[Figure: the input cos(2πf_in t) is split into iin(t) and, through a 90° phase shift, qin(t); iin(t) is weighted by cos(φ) and qin(t) by sin(φ), and the two paths are combined to give cos(2πf_in t + φ). The I/Q phasor diagram of the previous slide is shown alongside.]
$$\cos(2\pi f_{in}t+\phi) = \cos(2\pi f_{in}t)\cos(\phi) - \sin(2\pi f_{in}t)\sin(\phi)$$
Practical designs often implement cos() and sin() signals as phase shifted triangle waves
Some References on CDRs and Delay-Locked Loops
Tom Lee et al. were pioneers of the previous infinite range DLL approach
- See T. Lee et al., “A 2.5 V CMOS Delay-Locked Loop for an 18 Mbit, 500 Megabyte/s DRAM”, JSSC, Dec 1994
Check out papers from Mark Horowitz’s group at Stanford
- Oversampling data recovery approach
  - See C-K K. Yang et al., “A 0.5-µm CMOS 4.0-Gbit/s Serial Link Transceiver with Data Recovery using Oversampling”, JSSC, May 1998
- Multi-level signaling
  - See Ramin Farjad-Rad et al., “A 0.3-µm CMOS 8-Gb/s 4-PAM Serial Link Transceiver”, JSSC, May 2000
- Bi-directional signaling
  - See E. Yeung, “A 2.4 Gb/s/pin simultaneous bidirectional parallel link …”, JSSC, Nov 2000
High Speed Circuit Highlights
Examine Techniques from a Few Recent Papers
Circuit architectures utilizing circular topologies
- “A 40-Gb/s Clock and Data Recovery Circuit in 0.18-µm CMOS Technology”, Jri Lee and Behzad Razavi, JSSC, Dec. 2003
- “Fully Integrated CMOS Power Amplifier Design Using the Distributed Active-Transformer Architecture”, Ichiro Aoki, ..., Ali Hajimiri, JSSC, March 2002
- “A Circular Standing Wave Oscillator”, W. Andress, Donhee Ham, ISSCC 2004
  - Donhee will talk about this (and other things) in his guest lecture
Low Noise, High Bandwidth Sigma-Delta Fractional-N Frequency Synthesizers
- “A Fractional-N Frequency Synthesizer Architecture Utilizing a Mismatch Compensated PFD/DAC Structure …”, Scott Meninger, Michael Perrott, TCASII, Nov 2003
A 40 Gb/s CDR in 0.18 µm CMOS! (Razavi et al.)
Achieves high speed operation using interleaving
- 4 parallel 10 Gb/s detectors are fed by an 8-phase VCO
  - 4 phases used for sampling registers
  - 4 phases used for bang-bang phase detection registers
Key challenges
- Low jitter and low mismatch between clock phases
  - We will look at this issue in detail here
- Achievement of 10 Gb/s sampling/bang-bang detection
[Figure: CDR block diagram. DIN (40 Gb/s) enters the phase detector, which outputs demuxed data (10 Gb/s) and drives the charge pump, loop filter, and VCO; the VCO returns differential phase shifted clocks (10 GHz) CLK0, CLK45, CLK90, and CLK135 to the phase detector.]
The Need for Low Mismatch Between Clock Phases
8 phases generated by 4 VCO clock signals and their complements
Desired spacing between clock signals is only 12.5 ps!
- Must meet setup and hold times of each 10 Gb/s sampler and phase detector register (limited by 0.18 µm technology)
- Mismatch and jitter on clock phases quickly eat into any margin left over after meeting setup/hold times
  - Unacceptable bit error rates can easily result
[Figure: timing diagram of DIN with CLK0, CLK45, CLK90, and CLK135, showing 12.5 ps spacing between adjacent clock edges.]
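The 12.5 ps figure follows directly from the clock rate and the number of phases. A minimal arithmetic sketch (function names are illustrative only):

```python
def phase_spacing_ps(f_clk_hz, n_phases):
    """Spacing between adjacent clock phases, in picoseconds."""
    period_ps = 1e12 / f_clk_hz
    return period_ps / n_phases

def bit_period_ps(bit_rate_bps):
    """Duration of one data bit, in picoseconds."""
    return 1e12 / bit_rate_bps

spacing = phase_spacing_ps(10e9, 8)   # 10 GHz clock, 8 phases -> 12.5 ps
bit_time = bit_period_ps(40e9)        # 40 Gb/s data -> 25 ps per bit
```

The spacing is half of the 25 ps bit period, so the whole setup/hold and mismatch budget for each register has to fit inside a window of that size.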
A Method to Generate Clock Phases
Use transmission delay lines to generate each phase
Advantage over using buffers as delay elements
- Wide bandwidth and lower noise
- Mismatch only a function of geometry variation
  - Buffer mismatch is a function of both geometry and device variation (i.e., doping variation, etc.)
Issue: transmission line is big
- Loss (and finite bandwidth) due to finite resistance of metal
- Long distance between clock phase outputs undesirable
[Figure: transmission line tapped at CLK0, CLK45, CLK90, CLK135, and CLK180, with a delay of 12.5 ps between adjacent taps.]
Realize a Lumped Parameter Version of Trans. Line
Approximate transmission line as an LC ladder network
- Allows a much more compact implementation
- Offers the same advantage of having mismatch depend only on geometry
Issue: now that mismatch has been dealt with, how do we achieve low jitter?
[Figure: the tapped transmission line (12.5 ps delay per tap, CLK0 through CLK180) and its LC ladder equivalent with the same taps.]
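To get a feel for the element values involved, the sketch below sizes one LC section using the standard low frequency approximations for a ladder line: delay per section ≈ sqrt(L·C) and characteristic impedance ≈ sqrt(L/C). The 50 Ω impedance is a hypothetical choice, not a value from the paper, and the approximations degrade as the clock frequency approaches the ladder cutoff.

```python
import math

def lc_section_values(delay_s, z0_ohm):
    """Per-section L and C for a target delay and characteristic impedance."""
    l_sec = z0_ohm * delay_s      # L = Z0 * td
    c_sec = delay_s / z0_ohm      # C = td / Z0
    return l_sec, c_sec

def ladder_cutoff_hz(l_sec, c_sec):
    """Cutoff frequency of a series-L, shunt-C ladder: 1 / (pi * sqrt(L*C))."""
    return 1.0 / (math.pi * math.sqrt(l_sec * c_sec))

# 12.5 ps per section, hypothetical Z0 = 50 ohms
L_SEC, C_SEC = lc_section_values(12.5e-12, 50.0)
F_CUT = ladder_cutoff_hz(L_SEC, C_SEC)
```

Under these assumptions each section needs roughly 625 pH and 0.25 pF, and the ladder cutoff lands near 25 GHz, a few times the 10 GHz clock frequency.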
Combine VCO and Phase Generator
Can satisfy the Barkhausen criterion by inverting the output of the line and feeding it back to the input
- Looks a bit like a ring oscillator, but with much better phase noise performance
[Figure: LC ladder with taps CLK0, CLK45, CLK90, CLK135, and CLK180; CLK180 is fed back to CLK0 through a gain of -1.]
Sustain Oscillation by Including Negative Resistance
Place negative resistance at each phase to keep amplitudes identical
- Must be careful to minimize impact on mismatch
Issue: how do you match the feedback path from CLK180 to CLK0 with the other phases?
[Figure: LC ladder oscillator with -1 feedback from CLK180 to CLK0 and -Gm cells along the line.]
Use a Circular Geometry!
Note use of differential inductors, etc.
[Figure: circular oscillator layout. Four sections around the ring, each with a -Gm cell, a Vtune input, and a buffer, drive CLK45, CLK90, CLK135, and CLK180.]
Other Nice Nuggets in the Razavi Paper
Phase detection using 4 bang-bang detectors
- Clever combining of individual detectors to create an overall control voltage
- Note: Bang-bang detection linearized by metastable behavior of registers
Achievement of 10 Gb/s registers in 0.18 µm CMOS
- Leverages a large amplitude clock signal using a tuned VCO buffer
- Uses SCL registers with resistor loads; bottom current sources eliminated to leverage large amplitude clock
Fast XOR gate and amplifier structures
Take a look at the paper for more details:
“A 40-Gb/s Clock and Data Recovery Circuit in 0.18-µm CMOS Technology”, Jri Lee and Behzad Razavi, JSSC, Dec. 2003
A 2 Watt, 2.4 GHz CMOS Power Amplifier (Hajimiri et al.)
Key issue facing CMOS power amps:
- Breakdown voltage is too low for transistors with sufficient speed
- Example
  - 0.35 µm CMOS limited to about a 3 V supply
  - To keep M1 in saturation, assume we need Vout > 0.5 V
[Figure: single-transistor amplifier. Vin drives M1; the output node Vout connects to VDD and to the load presented by the antenna, RL = 50 Ω.]
Key Idea: Use a Transformer!
To achieve 1 Watt at the output, we need:
We know that:
Therefore, setting n = 4 is adequate:
[Figure: the amplifier of the previous slide with a 1:n transformer inserted between the M1 output node Vout and the antenna load RL = 50 Ω; Zin is the impedance looking into the transformer.]
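The arithmetic behind n = 4 can be sketched as follows. This is one plausible reading of the numbers given on the previous slide, not a transcription of the slide's own equations: it assumes a sinusoidal output, an ideal lossless 1:n transformer, and a drain voltage amplitude limited to VDD - 0.5 V = 2.5 V.

```python
import math

def required_peak_voltage(p_out_w, r_load_ohm):
    """Peak sine amplitude across the load: P = Vpk^2 / (2 * RL)."""
    return math.sqrt(2.0 * p_out_w * r_load_ohm)

def required_turns_ratio(v_load_peak, v_drain_swing):
    """Voltage step-up needed from the transistor side to the load side."""
    return v_load_peak / v_drain_swing

def primary_impedance(r_load_ohm, n):
    """Load impedance reflected through an ideal 1:n transformer."""
    return r_load_ohm / n**2

VDD, V_MIN = 3.0, 0.5                       # supply and minimum Vout from the example
v_peak = required_peak_voltage(1.0, 50.0)   # 10 V peak for 1 W into 50 ohms
n = required_turns_ratio(v_peak, VDD - V_MIN)
z_in = primary_impedance(50.0, 4)
```

Under these assumptions the transistor sees about 3.1 Ω, which is why transformer loss matters so much in the slides that follow.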
A Practical Issue for High Frequency Transformers
High frequency transformers are formed by coupled inductors
- Will typically have a net inductive impedance at the operating frequency (assuming self-resonant frequency is well above operating frequency)
Use a capacitor to resonate out the inductive component of Zin at the desired frequency
[Figure: transformer-coupled amplifier (M1, 1:n transformer, RL = 50 Ω) with capacitor C0 added at the transformer input Zin.]
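A rough sizing sketch for the resonating capacitor: if the inductive part of Zin is treated as a single equivalent inductance L, the capacitor that cancels it at frequency f satisfies (2πf)²·L·C = 1. The 1 nH value below is hypothetical; the 2.4 GHz frequency is the one quoted for this amplifier.

```python
import math

def resonating_capacitance(l_henry, f_hz):
    """Capacitance that resonates with inductance L at frequency f."""
    w = 2.0 * math.pi * f_hz
    return 1.0 / (w * w * l_henry)

def resonant_frequency(l_henry, c_farad):
    """Resonant frequency of an LC pair."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))

L_EQ = 1e-9                                  # hypothetical equivalent inductance (H)
C_RES = resonating_capacitance(L_EQ, 2.4e9)  # about 4.4 pF
```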
The Issue of Bondwires
The presence of bondwires will alter the impedance seen by the transistor
- Would prefer to desensitize the circuit to the bondwire inductances
[Figure: transformer-coupled amplifier (M1, C1, 1:n transformer, RL = 50 Ω) with two bondwire inductances Lbondwire added.]
The Fix
A differential topology places the bondwire nodes at incremental ground
- Bondwire inductance now has little impact
[Figure: the single-ended amplifier with bondwire inductances, and its differential counterpart: M1 and M2 driven by differential Vin, with C1, VDD, a 1:n transformer, and RL = 50 Ω.]
How Do We Implement the Transformer?
Classical options
- Spiral 1:n transformer
  - Problem: very lossy
- Resonant L-match or π-match transformer
  - Problem: still too lossy (though better than a spiral transformer)
A novel approach by Aoki & Hajimiri: create a distributed, active transformer
[Figure: differential amplifier (M1, M2, C1, VDD) with a 1:n transformer driving RL = 50 Ω.]
A 1:4 Transformer Achieved Using Four 1:1 Sections
1:1 transformers can be implemented much more efficiently than their 1:4 counterparts
- Winding ratio is one-to-one, and integrated processes allow very close proximity between the two windings
Cascading of the secondary windings leads to their output voltages being summed
- Net effect is a 1:4 transformer!
[Figure: four identical differential stages (M1, M2, C1, VDD), each driving a 1:1 transformer; the four secondaries are cascaded to drive RL = 50 Ω.]
Implementation of 1:1 Transformer Sections
High efficiency using slab (i.e., straight wire) inductors
- Avoids inefficiency of current crowding at corners of windings
[Figure: the four-section 1:1 transformer amplifier of the previous slide, with one section (M1, M2, C1, VDD) redrawn using slab inductors.]
Problem: Long Wires Required for Diff. Pair Elements
The use of slab inductors would seem to imply that an equally long return path for the current is required
- Implies that long wires are required for connection to capacitor and differential pair transistors
  - Issue: loss and undesired inductance
[Figure: slab inductor transformer section with M1, M2, C1, and VDD.]
A Clever Fix: Redefine The Differential Pairs
Observation: neighbors of adjoining transformer sections have opposite signaling on their transistor gates
- Can define differential pairs to be between the sections rather than within each section
- Short wires can now be achieved for capacitor and transistors
Issue: what do you do about the ends?
[Figure: adjoining slab inductor sections, with the differential pair (M1, M2) and C1 placed between neighboring sections.]
Use a Circular Topology!
Removes the end effects!
[Figure: four sets of M1, M2, C1, and VDD connections arranged around a ring of slab inductors, with the output delivered to RL = 50 Ω.]
Other Issues to Consider
Efficient achievement of 50 Ohm matching at the input of the amplifier
Efficiency calculations
Input power distribution
Harmonic suppression
Take a look at the paper for more details:
“Fully Integrated CMOS Power Amplifier Design Using the Distributed Active-Transformer Architecture”, Ichiro Aoki, Scott Kee, David Rutledge, Ali Hajimiri, JSSC, March 2002
Wide Bandwidth, Low Noise Fractional-N Synthesizers
Fractional-N frequency synthesis
- Achieves very high frequency resolution
- There is a noise/bandwidth tradeoff
[Figure: fractional-N synthesizer. ref(t) and div(t) feed the PFD, whose output e(t) drives the charge pump and loop filter; v(t) controls the VCO, which produces out(t). The divider is controlled by Nsd[m], generated from N[m] by a dithering modulator, and switches between M and M+1. Spectrum sketch: Fout = M.F·Fref.]
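The way dithering between M and M+1 produces a fractional average divide value can be sketched with a first-order accumulator. This is only one simple choice of dithering modulator, and the values (M = 100, fraction 1/4) and function name are hypothetical.

```python
def accumulator_divide_sequence(m, frac_num, frac_den, n_cycles):
    """Divide values (m or m+1) whose long-run average is m + frac_num/frac_den."""
    acc = 0
    sequence = []
    for _ in range(n_cycles):
        acc += frac_num
        if acc >= frac_den:          # accumulator overflow: divide by m+1 once
            acc -= frac_den
            sequence.append(m + 1)
        else:
            sequence.append(m)
    return sequence

seq = accumulator_divide_sequence(100, 1, 4, 400)
average_divide = sum(seq) / len(seq)     # 100.25, i.e. M.F with M = 100, F = 0.25
```

Here the divider uses M+1 once every four reference cycles, so the output settles at 100.25·Fref even though every individual divide value is an integer. The periodic pattern is also the source of the noise and spurs discussed next.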
The Issue of Quantization Noise
Divide value dithering introduces noise
Sigma-Delta modulation shapes noise to high frequencies
[Figure: the fractional-N synthesizer of the previous slide with a Σ-Δ modulator generating Nsd[m] from N[m]. Spectrum sketch versus f: output at Fout = M.F·Fref, with Σ-Δ quantization noise rising at high frequencies.]
Impact of Quantization Noise on Synth. Output
Lowpass action of PLL dynamics suppresses the shaped Σ-Δ quantization noise
[Figure: PLL with PFD, loop filter, and N/N+1 divider between Ref and Out. An M-bit frequency selection input drives a Σ-Δ modulator whose 1-bit output controls the divider. Spectrum sketches: frequency selection, Σ-Δ quantization noise spectrum, PLL dynamics, and output spectrum with noise around Fout.]
Impact of Increasing the PLL Bandwidth
Higher PLL bandwidth leads to less quantization noise suppression
- There is a direct trade-off between PLL bandwidth and jitter
[Figure: the PLL and spectrum sketches of the previous slide, redrawn for a wider PLL bandwidth: more of the Σ-Δ quantization noise passes through the PLL dynamics to the output spectrum.]
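The bandwidth trade-off can be illustrated with a rough numerical model. The sketch below assumes that the phase noise contributed by an order-m Σ-Δ modulator has a power spectrum proportional to |2·sin(πf/Fref)|^(2(m-1)), and that the PLL acts as a Butterworth-like lowpass; both are modeling assumptions, the scale factor is dropped, and all parameter values are hypothetical. Only the ratio between the two results is meaningful.

```python
import numpy as np

def relative_sd_noise(f_ref, f_bw, sd_order=2, filt_order=3, n_pts=100001):
    """Integrated shaped noise after the PLL lowpass, up to a constant factor."""
    f = np.linspace(0.0, f_ref / 2, n_pts)
    # Assumed shaped phase-noise spectrum of an order-sd_order modulator
    shaped = (2 * np.sin(np.pi * f / f_ref)) ** (2 * (sd_order - 1))
    # Assumed PLL closed loop response: Butterworth-like magnitude squared
    g_sq = 1.0 / (1.0 + (f / f_bw) ** (2 * filt_order))
    psd = shaped * g_sq
    # Trapezoidal integration over frequency
    return float(np.sum(0.5 * (psd[1:] + psd[:-1]) * np.diff(f)))

F_REF = 20e6                                   # hypothetical reference frequency
noise_narrow = relative_sd_noise(F_REF, 100e3)
noise_wide = relative_sd_noise(F_REF, 1e6)
```

In this model a tenfold increase in bandwidth raises the integrated quantization noise power by roughly three orders of magnitude, which is the motivation for the noise reduction methods that follow.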
Method 1 of Reducing Quantization Noise
Lower quantization step size by switching between multiple phases of the VCO output
- Generate phases by using a ring oscillator or delay locked loop
Issue: noise induced by mismatch between phases
[Figure: fractional-N synthesizer in which the VCO supplies N phases to phase shifting logic ahead of the divider; a Σ-Δ modulator generates Nsd[m] from N[m].]
Method 2 of Reducing Quantization Noise
Use the classical fractional-N approach of “phase interpolation” to cancel out quantization noise
- Use a D/A converter matched to PFD/Charge Pump output
Issue: limited by mismatch between the gain of the D/A and the PFD/Charge Pump output, and by nonlinearity in the D/A
[Figure: fractional-N synthesizer with an accumulator in place of the Σ-Δ modulator. The accumulator carry out controls the divider, and its residue drives a D/A whose output is combined with the charge pump output ahead of the loop filter.]
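Why the accumulator residue is the right signal to feed the D/A can be checked with integer arithmetic. The sketch below (hypothetical values and names, first-order accumulator assumed) shows that the accumulated divider timing error, measured in VCO cycles, equals minus the residue divided by the accumulator modulus, so a D/A scaled by the residue can cancel it.

```python
def accumulator_run(m, frac_num, frac_den, n_cycles):
    """Return (divide values, residues) for a first-order accumulator."""
    acc = 0
    divide_values, residues = [], []
    for _ in range(n_cycles):
        acc += frac_num
        carry = acc >= frac_den
        if carry:
            acc -= frac_den
        divide_values.append(m + 1 if carry else m)   # carry out selects m+1
        residues.append(acc)                          # residue after this cycle
    return divide_values, residues

def scaled_phase_error(m, frac_num, frac_den, divide_values):
    """Accumulated (actual - ideal) VCO cycle count, multiplied by frac_den."""
    errors, total = [], 0
    for k, n in enumerate(divide_values, start=1):
        total += n
        errors.append(frac_den * total - k * (m * frac_den + frac_num))
    return errors

divs, res = accumulator_run(100, 3, 16, 64)
err = scaled_phase_error(100, 3, 16, divs)
```

The sign convention here is an arbitrary choice; the point is that the error is known exactly from the residue, so cancellation is limited only by how well the D/A gain and linearity match the PFD/charge pump path.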
Comparison of Approaches
Phase shifting
- “Vertical” approach
Phase interpolation
- “Horizontal” approach
[Figure: charge pump output waveforms over successive Tvco intervals, with peak level Ichp, for "Vertical Slicing with B = 2" and "Horizontal Slicing with B = 2". In both panels the charge transferred in the dashed box is dQ = ε·Ichp·Tvco, shown for ε = 0/4, 1/4, 2/4, 3/4, and 4/4.]
Which Is Best?
Phase shifting
- Limited by number of phases that can be generated and their mismatch
- Ring oscillators have poor phase noise
Phase interpolation
- Limited by ability to match DAC output to that of the PFD/Charge pump
- High spurious noise can result due to DAC nonlinearity
Key observation
- Phase interpolation allows us to take advantage of advances in DAC design over the last 20 years
  - We can now largely overcome the above limitations!
Two Recent Approaches to the Cancellation Method
“A Wideband 2.4-GHz Delta-Sigma Fractional-N PLL With 1-Mb/s In-Loop Modulation”, Sudhakar Pamarti and Ian Galton, JSSC, Nov 2004
- Impact of DAC mismatch mitigated by using a Σ-Δ modulator rather than an accumulator to perform dithering
- Impact of DAC nonlinearity mitigated by using mismatch noise shaping techniques
- Overall: reliably achieves 20 dB noise suppression
“A Fractional-N frequency synthesizer architecture utilizing a mismatch compensated PFD/DAC structure…”, Scott Meninger and M.H. Perrott, TCAS II, Nov 2003
- Utilizes a mismatch compensated PFD/DAC structure
- Simulations show that 40 dB noise suppression is achievable!
Key Element: A PFD/DAC Structure
Leverages application of selective delays of parallel PFD outputs to realize the D/A function
- No explicit D/A required
- Delay of one VCO cycle can be easily achieved using registers clocked by the VCO
Illustrate the idea through animation
[Figure: four parallel PFD outputs and their combined waveform, with Tvco marked.]
Apply Phase Shift to Two out of the Four PFDs
Net horizontal level shifts to halfway point
[Figure: four parallel PFD outputs, two of them shifted by Tvco, and their combined waveform.]
Apply Phase Shift to Three out of the Four PFDs
Net horizontal point shifts up
DAC function is self-aligned in gain to PFD output!
[Figure: four parallel PFD outputs, three of them shifted by Tvco, and their combined waveform.]
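The self-aligned D/A behavior can be sketched as simple charge bookkeeping. The model below assumes each of the parallel PFD paths switches an equal share of the charge pump current, and that shifting a path by one VCO cycle lengthens its pulse by Tvco; the parameter values and the sign of the shift are assumptions made for illustration.

```python
def pfd_dac_charge(n_shifted, n_paths, i_chp, t_vco, base_width):
    """Total charge delivered when n_shifted of n_paths PFD outputs are delayed."""
    unit_current = i_chp / n_paths               # each path carries an equal share
    total = 0.0
    for path in range(n_paths):
        width = base_width + t_vco if path < n_shifted else base_width
        total += unit_current * width
    return total

# Hypothetical values: 1 mA charge pump, 1 ns VCO period, 2 ns nominal pulse
I_CHP, T_VCO, BASE = 1e-3, 1e-9, 2e-9
q0 = pfd_dac_charge(0, 4, I_CHP, T_VCO, BASE)
extra = [pfd_dac_charge(k, 4, I_CHP, T_VCO, BASE) - q0 for k in range(5)]
```

Each additional shifted path adds one quarter of Ichp·Tvco, so the step size is set by the same current and the same VCO period as the PFD output itself. That is the sense in which the D/A gain is self-aligned.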
Actual PFD/DAC Implementation
A current DAC is used, but is self-aligned to PFD output using the phase shifting method just discussed
Nonlinearity of the DAC is removed using mismatch noise shaping techniques
Note: approach overcomes mismatch limitations of prior art: Y. Dufour, “… Fractional Division Charge Compensation …”, US Patent 6,130,561
[Figure: Ref and Div feed PFD0 and PFD1, with a register based delay clocked by the VCO; the PFD outputs Φ0 and Φ1 control Charge Pump0 and Charge Pump1. A DAC mismatch shaping block takes B bits from the Σ-Δ and selects among 2^B-1 current sources. The combined output goes to the loop filter.]
Goal: GSM Level Noise Performance with 1 MHz Bandwidth!
- Left: Calculated Performance (PLL Design Assistant)
- Right: Simulated Performance (CppSim)
[Figure: two phase noise plots (right panel titled "Simulated Phase Noise of Freq. Synth."), L(f) (dBc/Hz) versus frequency offset from carrier (Hz) from 10^6 to 10^9, showing total noise and Σ-Δ noise. Both panels are annotated "BW=1MHz, = 1/64".]
CppSim simulations verify this is possible with only a 6-bit DAC!
Other Issues to Consider
Additional nonidealities must be dealt with
- Timing mismatch
- Impact of shape of horizontal cancellation waveforms
- Impact of both DAC element and timing mismatch sources on achievable spurious performance
Note: detailed analytical examination of the above items is difficult
- CppSim is an invaluable tool for exploring such issues
Take a look at the paper for more details:
“A Fractional-N Frequency Synthesizer Architecture Utilizing a Mismatch Compensated PFD/DAC Structure …”, Scott Meninger, Michael Perrott, TCASII, Nov 2003