A Capacitive DC-DC Converter for Stacked Loads With Wide Range DVS Achieving 98.2% Peak Efficiency in 40nm CMOS
Tim Thielemans, Nicolas Butzen, Athanasios Sarafianos, Michiel Steyaert, Filip Tavernier
E-mail: {Tim.Thielemans, Filip.Tavernier}@esat.kuleuven.be
Department of Electrical Engineering ESAT-MICAS - KU Leuven, Belgium
Abstract—This paper presents a fully integrated gearbox-type switched capacitor DC-DC converter that is able to provide two stacked loads with a wide voltage range for Dynamic Voltage Scaling (DVS). This combination offers significant efficiency improvements in both the power delivery and the functional blocks. To face the challenges of wide voltage range stacked loads, 7 topologies with custom gate and bulk drive were implemented, together with a threefold fully integrated control loop. This 8-phase interleaved converter has been fabricated in a 40nm CMOS technology. Measurements demonstrate a peak system efficiency of 98.2%, while a best-in-class DVS-range from 0.45V to 0.9V is guaranteed for two independent stacked loads.
Index Terms—DC-DC Converter, Switched Capacitor, Gearbox, Stacked Voltage Domains, Dynamic Voltage Scaling
I. INTRODUCTION
Dynamic Voltage Scaling (DVS) balances performance and
energy consumption in digital circuits and is a widely used
technique to improve power efficiency [1]. Although DVS is
a powerful technique, it is far from ideal from a power
management perspective, as each block requires its own highly
loaded power converter. Even assuming up to 93% efficient
converters for DVS [1]–[3], the conventional DVS-approach
still requires a large chip area and induces significant losses
in power delivery (PD) because the DVS-converters have to
convert all the power, as indicated by I1 and I2 in Fig. 1a.
Voltage domain stacking (VDS) significantly reduces the
overhead of the PD because, due to charge recycling, only
the mismatch current (I2-I1 in Fig. 1b) has to be converted.
Besides reducing the chip area occupied by the PD, this leads
to a large decrease in PD-losses and thus an increased
system efficiency (ηsystem). When loads are matched and
draw nearly identical currents, the converter barely converts
power, unlocking system efficiencies up to 100%, regardless
of the converter efficiency (ηconv) [4]. Only in the worst case,
when one load is completely turned off, does ηsystem drop to ηconv.
Moreover, as a consequence of Vbus < Vin, the initial power
converter at Vsup can also achieve higher efficiencies
than in conventional DVS. All these advantages of VDS
require a converter that is able to sink and source current, as
the mismatch current can flow both into and out of the converter.
Ideally, the converter supports one load being completely
turned off to guarantee proper operation while both loads can
work fully independently.
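The relation between ηsystem and ηconv described above can be illustrated with a minimal sketch. It rests on a first-order assumption that is not taken from the paper: the converter processes only the mismatch current, at the voltage of the more heavily loaded domain, and dissipates a fixed fraction of that processed power. The function name and the loss model are hypothetical, and the initial converter at Vsup is not modeled.

```python
def system_efficiency(v_load1, i_load1, v_load2, i_load2, eta_conv):
    """First-order system efficiency of two stacked loads.

    Assumption (illustrative, not from the paper): the converter only
    processes the mismatch current |I2 - I1| at the voltage of the
    heavier load, and its loss equals P_processed * (1/eta_conv - 1).
    """
    p_load = v_load1 * i_load1 + v_load2 * i_load2
    if p_load <= 0:
        raise ValueError("at least one load must draw power")
    # Voltage of the domain that draws the larger current
    v_heavy = v_load2 if i_load2 >= i_load1 else v_load1
    p_processed = abs(i_load2 - i_load1) * v_heavy
    p_loss = p_processed * (1.0 / eta_conv - 1.0)
    return p_load / (p_load + p_loss)


# Matched loads: no power is converted, so the converter adds no loss.
matched = system_efficiency(0.7, 0.02, 0.7, 0.02, eta_conv=0.8)
# One load off: all power is converted, so the system sees eta_conv.
one_off = system_efficiency(0.9, 0.0, 0.45, 0.02, eta_conv=0.8)
# Partial mismatch lands in between.
partial = system_efficiency(0.7, 0.015, 0.7, 0.02, eta_conv=0.8)
```

Under this assumed model, a conventional DVS system would sit at ηconv for every load condition, whereas the stacked system only reaches that value in the worst case.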
Fig. 1: Comparison of DVS-techniques. (a) Conventional DVS, with Vbus = Vload,max and converter currents I1 and I2. (b) Proposed DVS, with Vin = Vload1 + Vload2 and converter current |I2 - I1|.
Current state-of-the-art load stacking converters are usually not
DVS-supportive [5]–[7] or only offer a limited DVS-range
[8], notably limiting the possible power reductions in digital
blocks. Therefore, this paper presents a converter that supplies
two independent stacked loads with a wide DVS-range. As
explained in Section II, 7 topologies have been implemented
in a single gearbox to achieve the desired wide DVS-range
of 0.45V-0.9V for both loads. This is by far the best-in-class
DVS-range for stacked loads, as will be shown by a
state-of-the-art comparison of measurement results in Section
III, before the conclusions are drawn in Section IV.
II. IMPLEMENTATION DETAILS
A. Gearbox Topology
A switched capacitor (SC) converter can be modeled as
an ideal transformer (iVCR) and a series output impedance
(Rout) [9]. The output current causes a voltage drop over Rout,
resulting in losses and a discrepancy between iVCR and the
actual voltage conversion ratio (VCR). For efficient operation
over a wide voltage range, a gearbox-type converter is beneficial
because iVCR can be chosen close to VCR, avoiding large
voltage drops and associated losses. Additionally, depending
on the direction of the output current (sink or source), VCR
can be higher or lower than iVCR. As a result, two different
topologies are required for each VCR when loads are stacked.
$$\mathrm{VCR} = \frac{V_{out}}{V_{in}} = \frac{V_{load2}}{V_{load1} + V_{load2}} \tag{1}$$
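The ideal-transformer model and the need for two iVCRs per VCR can be sketched as follows. This is an idealized illustration, not the decoder implemented on the chip: it assumes Vout = iVCR·Vin − Iout·Rout with Iout positive when sourcing, and it simply picks the nearest iVCR above VCR when sourcing and the nearest one below when sinking, ignoring any margin the real region boundaries may include. The function names and the Rout value in the usage line are hypothetical.

```python
from fractions import Fraction

# The seven iVCRs named in the paper
IVCRS = [Fraction(1, 4), Fraction(1, 3), Fraction(2, 5), Fraction(1, 2),
         Fraction(3, 5), Fraction(2, 3), Fraction(3, 4)]


def output_voltage(ivcr, v_in, i_out, r_out):
    """Vout = iVCR * Vin - Iout * Rout (i_out > 0 sources, i_out < 0 sinks)."""
    return float(ivcr) * v_in - i_out * r_out


def pick_ivcr(v_load1, v_load2, sourcing):
    """Idealized choice: closest iVCR above VCR (source) or below VCR (sink)."""
    vcr = v_load2 / (v_load1 + v_load2)  # equation (1)
    if sourcing:
        candidates = [r for r in IVCRS if r > vcr]
        if not candidates:
            raise ValueError("no iVCR above the requested VCR")
        return min(candidates)
    candidates = [r for r in IVCRS if r < vcr]
    if not candidates:
        raise ValueError("no iVCR below the requested VCR")
    return max(candidates)


# Equal load voltages give VCR = 1/2, which needs one iVCR on each side.
src = pick_ivcr(0.7, 0.7, sourcing=True)    # Fraction(3, 5)
snk = pick_ivcr(0.7, 0.7, sourcing=False)   # Fraction(2, 5)
# Sourcing 20 mA through a hypothetical 1 ohm Rout drops Vout by 20 mV.
v_out = output_voltage(Fraction(1, 2), 1.4, 0.02, 1.0)
```

Because the voltage drop over Rout changes sign with the current direction, a single iVCR cannot serve both the sink and the source case for the same VCR, which is why the sketch returns a different ratio for each direction.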
Fig. 2: Used topologies and iVCRs with gray common phases. Panels show the φ1 and φ2 configurations of the flying capacitors r, g and b for iVCR = 2/3, 1/2, 1/3, 2/5, 1/4, 3/4 and 3/5.
Fig. 3: Implementation of the proposed gearbox SC core. Switch labels: TG1-TG4, BG1-BG4, TR1-TR4, BR1-BR4, TB1-TB2 and BB1-BB2.
The desired wide DVS-range of 0.45V-0.9V for both
stacked loads results, according to (1), in a VCR-range from
1/3 to 2/3. Since VDS requires two iVCRs for a single VCR,
the iVCR-range of the gearbox should be wider than [1/3, 2/3].
Seven different topologies with iVCRs ranging from 1/4 to
3/4 have been selected, as visualized in Fig. 2. Thanks to
the common phases, the number of switches could be limited
to 20 and all devices could be sized closer to their optimal
size in each topology. Additionally, the three flying capacitors
are placed in each topology such that the overall bottom-plate
and soft-charging losses could be minimized. Capacitor Cg,
for instance, has the largest bottom-plate swing in each
topology, and Cb transfers the most charge in each topology. Both
capacitors are sized appropriately (Cg < Cr and Cb ≈ 2Cr).
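The VCR-range quoted above follows directly from (1) when each load is swept over the full DVS-range. A short worked check, using exact fractions to avoid rounding, is given below; the helper names are illustrative only.

```python
from fractions import Fraction


def vcr(v_load1, v_load2):
    """Voltage conversion ratio from equation (1)."""
    return v_load2 / (v_load1 + v_load2)


def vcr_range(v_min, v_max):
    """Extreme VCRs when both loads span [v_min, v_max] independently."""
    lowest = vcr(v_max, v_min)   # Load1 at maximum, Load2 at minimum
    highest = vcr(v_min, v_max)  # Load1 at minimum, Load2 at maximum
    return lowest, highest


def vin_range(v_min, v_max):
    """Vin = Vload1 + Vload2 spans twice the load range."""
    return 2 * v_min, 2 * v_max


lo, hi = vcr_range(Fraction(45, 100), Fraction(9, 10))
print(lo, hi)  # prints: 1/3 2/3
vin_lo, vin_hi = vin_range(Fraction(45, 100), Fraction(9, 10))
```

The same sweep gives Vin between 0.9V and 1.8V, which agrees with the Vin range listed for this work in Table I.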
B. Power Switches
Gearbox converters with a wide VCR-range generally suffer
from high blocking voltages and varying overdrive (Vgs)
voltages for the switches. Moreover, the combination of DVS
with VDS provokes drain-source reversals. Whereas the high
blocking voltage is traditionally solved by combining thin-
and thick-oxide devices (see Fig. 3), the varying Vgs requires
custom solutions. To start, a low-power auxiliary fixed supply
(Vdd) is used for control logic and some bootstrapping circuits
to guarantee timing restrictions independently of Vout.
Furthermore, the overdrive voltage of the frequently used
PMOS switches BG1, BR1 and BB1 is always limited to Vout.
Fig. 4: Gate and bulk driving of power switch TR2, showing the bootstrap circuit, the multiplexer and the D/S-reversal compensation. Φ1 = TR2 open, Φ2 = TR2 closed, α = Vout > VTR.
To overcome the trade-off between losses due to the gate
capacitance and due to the on-resistance, these switches are
fragmented and parallelized according to Vout.
Drain-source reversals require extra attention for bulk and
gate contacts to avoid forward-biased junction diodes, unwanted
conductive switches or excessive inter-terminal voltages.
As an example of the precautions taken, the driving stage
of power switch TR2, between Vout and the top plate of Cr
(TR), is shown in Fig. 4. The switch is toggled with a bootstrap
circuit, where Vdd charges Cx during Φ1 through M3 and M1,
while M6 disconnects the top plate of Cx from the bootstrap
output (Nbs), which is grounded by M7. In Φ2, M4 connects
the bottom plate of Cx to Vout, resulting in a boosted voltage
(Vdd+Vout) at Nbs. A grounded gate (Ng) can be used to open
TR2 without any risk of excessive Vdg when VTR ≤ Vout. This
is, however, only the case when the converter is configured in
topologies 3/4, 2/5 or 3/5. Therefore, a pass-gate multiplexer
connects either Nbs or the switched bulk voltage (Nb) to the
gate of TR2 (Ng). These switched gate and bulk contacts prevent
excessive inter-terminal voltages and forward-biased diodes.
C. Control
The top-level architecture of the converter core with control
is shown in Fig. 5. VDS requires fast and proper detection of a
sink (SNK) or source (SRC) situation to avoid voltage drops.
Therefore, an 800MHz-clocked comparator with hysteresis
compares Vref with Vout and switches state (SNK ↔ SRC)
when Vout crosses the boundary [8]. This SRC-signal is then
used in a threefold control to ensure proper operation.
Two open-loop controls configure the converter core in the
most efficient way, as can be seen in Fig. 5. The iVCR
decoder (a) sets the proper topology, based on the SRC-
signal and the desired VCR. For this, clocked comparators
compare Vref with four fixed ratios of Vin, giving an idea
of the VCR-region (see also Fig. 10). The second open-loop
control (b) determines, in a similar way, when Vout
drops below 0.6V or 0.5V, to enable two or three of the
fragmented switches mentioned earlier, respectively. Finally, a
pulse-skipping hysteretic controller clocked at 800MHz (c)
closes the loop and ensures a stable output voltage. Either the
negative (SRC) or positive (SNK) terminal of the comparator
is applied to the phase generator, creating the switching signals
for each of the 8x2 fragments.
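The behavior of the control blocks described above can be summarized in a small behavioral sketch. It is an interpretation under stated assumptions, not the implemented circuit: the hysteresis band `hyst` is a hypothetical parameter, the direction in which the SNK/SRC detector flips is inferred from the text, and a single active fragment above 0.6V is assumed. All function names are illustrative.

```python
def active_fragments(v_out):
    """Open-loop control (b): enable more switch fragments at low Vout.

    The 0.6 V and 0.5 V thresholds come from the text; using one
    fragment above 0.6 V is an assumption.
    """
    if v_out < 0.5:
        return 3
    if v_out < 0.6:
        return 2
    return 1


def update_mode(mode, v_out, v_ref, hyst):
    """SNK/SRC detector with hysteresis (band `hyst` is hypothetical).

    Assumed behavior: while sourcing, an output that rises above the
    band means current must be sunk, and vice versa.
    """
    if mode == "SRC" and v_out > v_ref + hyst:
        return "SNK"
    if mode == "SNK" and v_out < v_ref - hyst:
        return "SRC"
    return mode


def fire_pulse(mode, v_out, v_ref):
    """Pulse-skipping decision (c), evaluated once per control clock edge.

    In SRC mode a switching pulse is issued when Vout is too low,
    in SNK mode when Vout is too high; otherwise the pulse is skipped.
    """
    if mode == "SRC":
        return v_out < v_ref
    return v_out > v_ref


# Example: one control step at Vref = 0.55 V while sourcing
mode = update_mode("SRC", v_out=0.54, v_ref=0.55, hyst=0.02)
pulse = fire_pulse(mode, v_out=0.54, v_ref=0.55)
fragments = active_fragments(0.54)
```

In this sketch the mode selects which comparator polarity drives the phase generator, mirroring the statement that either the negative (SRC) or positive (SNK) terminal is applied.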
Fig. 5: Top-level architecture, showing the open-loop topology control (a), the open-loop switch fragment control (b) and the closed-loop control (c) with phase generator around the 8 interleaved converter fragments.
Fig. 6: Chip photograph (795 μm x 610 μm), total active area of 0.484mm2. Labeled regions: decoupling capacitance, central control, fragments 1 (0°) and 1' (180°), Cb and Cg.
III. MEASUREMENT RESULTS
To validate the proposed design, a prototype chip was
fabricated in a 40nm bulk CMOS technology with a total active
area of 0.484mm2, control and decoupling included. The 8x2
fragments can be clearly distinguished in Fig. 6. All flying and
decoupling capacitance has been implemented with a MOS-
MOM stack wherein IO-devices are used to minimize leakage.
Also, the substrate coupling is reduced by Deep n-well biasing
with two front-to-front connected diodes [10].
As shown in Fig. 7, Vout follows Vref under both maximal
(20mA) sink and source loads. To distinguish the operation
of the topology selection and the hysteretic controller, the
frequency of fcontrol was decimated in the lower two plots.
The hysteretic controller maximizes fsw to 50MHz after a step
in Vref, decreasing the gap between Vref and Vout as much
as possible with the current topology. The controller updates
the topology instantaneously to the correct one when fcontrol
toggles and Vout rises/falls to Vref with up to 1.5V/μs. The
transient measurement in Fig. 8 demonstrates that the converter
detects and recovers from maximal load reversals (± 20mA)
while voltage droop is kept limited to 85mV.
Fig. 7: Transient measurements showing Vref-tracking capability and operation of both hysteretic and topology controller. Traces: Vout, Vref, fcontrol and new iVCR versus time (0.6V to 0.9V), for SNK (20mA) and SRC (20mA) loads, with transitions between iVCR = 1/3, 1/2 and 2/3.
Fig. 8: Detection and recovery of maximal load reversal. Traces: Vout-Vref [mV] and ILoad [mA] versus time for a SRC, SNK, SRC sequence, with iVCR changing from 1/2 to 1/3 and back to 1/2.
Fig. 9: System and converter efficiency for varying current imbalance and multiple voltages, all supplies included. Horizontal axis: current mismatch (Iload2-Iload1) [mA], from -50 to 50. Vertical axis: efficiency, from 0.5 to 1. Curves for ηconv and ηsystem at Vin/Vout = 1.35V/0.52V, 1.35V/0.75V, 1.35V/0.82V, 1.60V/0.90V and 1.80V/0.90V, with the fully independent DVS-range marked.
As a result of the voltage stacking capability, the best ηconv
is equal to the worst ηsystem, completely overcoming the low
efficiency for low-load conditions (Fig. 9). With a maximal
current drawn of 20mA in each load, the full DVS-range
(0.45V-0.9V) is guaranteed for both loads, even while one
load is completely turned off. However, when only a single
fixed conversion ratio is required (for instance from 1.6V to
0.9V), the maximal current drawn for stacked loads increases
to 39mA or even to 50mA for a single load. At this maximal
single loaded power density, the converter still achieves 76%
efficiency. Fig. 10 demonstrates the capability of providing the
full DVS-range, while achieving a peak system efficiency of
98.2%, including Vdd. Also, the boundaries for the topology
selection are marked with a dotted line.
TABLE I: Comparison with current state-of-the-art SC converters for stacked loads (VDS) and/or DVS

| | This Work | VLSI17 [8] | JSSC17 [6] | VLSI15 [5] | VLSI10 [7] | ISSCC15 [1] | ISSCC13 [2] |
|---|---|---|---|---|---|---|---|
| Category | DVS + VDS | DVS + VDS | VDS | VDS | VDS | DVS | DVS |
| Technology | 40nm G | 65nm G | 40nm G | 40nm G | 45nm SOI | 65nm LL | 130nm |
| Capacitors | MOM-MOS | MOM-MIM | N/A | MOS | Deep-Trench | MOS-MOM-MIM | Ferro-electric |
| Vin [V] | 0.9-1.8 | 1.5-2.2 | 2.2 | 3.6 | 2 | 1.6-2.2 | 1.5 |
| Vout [V] | 0.45-0.9 | 0.75-1.1 | 1.1 | 0.9; 1.8; 2.7+ | 1 | 0.6-1.2 | 0.4-1.1 |
| DVS-range (ΔVout/Vout,min) | 1 | 0.46 | 0 | 0 | 0 | 1 | 1.75 |
| Load stacking? | Yes | Yes | Yes | Yes | Yes | No | No |
| δImax | 100% | 100% | N/A | 15.7% | 60% | / | / |
| iVCRs | 3/4; 2/3; 3/5; 1/2; 2/5; 1/3; 1/4 | 2/3; 1/2; 1/3 | 1/2 | Ladder | 1/2 | 1; 2/3; 1/2; 1/3 | 3/4; 2/3; 1/2 |
| ηsystem,peak | 98.2% | 99.6% | 96% | 99.8% | 99% | 78.3% | 93% |
| ρ [mA/mm2] | 82.64*/161.15** | 59.8*/140** | 19.09 | N/A | 4600 | 125*/150** | 2.73 |

+ 4 stacked loads of 0.9V
* For the full reported DVS-working range
** Peak
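The current densities listed for this work appear consistent with dividing the total load current by the total active area of 0.484mm2. The sketch below reproduces that arithmetic under the assumption that both stacked loads draw the quoted maximum current (20mA each over the full DVS-range, 39mA each at a fixed conversion ratio); the helper name is illustrative.

```python
AREA_MM2 = 0.484  # total active area reported for the prototype


def current_density(total_current_ma, area_mm2=AREA_MM2):
    """Current density in mA/mm2 for a given total load current in mA."""
    return total_current_ma / area_mm2


# Two stacked loads at 20 mA each over the full DVS-range
rho_full_range = current_density(2 * 20)   # about 82.64 mA/mm2
# Two stacked loads at 39 mA each for a single fixed conversion ratio
rho_peak = current_density(2 * 39)         # about 161.16 mA/mm2
```

The second value rounds to 161.16, whereas Table I lists 161.15, so the tabulated number seems to be truncated rather than rounded.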
Fig. 10: ηsystem for identical load currents (20mA) over the whole DVS-range, all supplies included, and an indication of the iVCR-regions with dotted lines. Axes: VLoad1 [V] versus VLoad2 [V], each from 0.45 to 0.85. Contour values from 0.92 to 0.98, peak = 98.2%. Region labels: 2:1 (SRC), 3:1 (SNK), 5:3 (SRC), 5:2 (SNK), 3:2 (SRC), 2:1 (SNK), 5:2 (SRC), 4:1 (SNK), 4:3 (SRC), 5:3 (SNK).
Table I compares this converter with current state-of-the-art
converters for VDS and/or DVS. Just like other converters that
support VDS, the presented converter clearly outperforms traditional
converters regarding system efficiency. This converter,
however, allows one load to be completely turned off (δImax
= 100%) so that both loads can operate fully independently,
in contrast to the work reported in [5] and [7]. Moreover, the
highest current density is achieved compared to other bulk
CMOS implementations [6], [8], and even to DVS-converters
that do not enable load stacking [1], [2]. Above all, this work
achieves a DVS-range of 1 ((0.9V - 0.45V)/0.45V), which is by far the
widest reported DVS-range for stacked loads [5]–[8].
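The DVS-range metric used in Table I, ΔVout/Vout,min, can be checked with a few lines. The output voltage ranges below are the ones listed in the table; the function name is illustrative.

```python
def dvs_range(v_out_min, v_out_max):
    """Relative DVS-range: (Vout,max - Vout,min) / Vout,min."""
    return (v_out_max - v_out_min) / v_out_min


this_work = dvs_range(0.45, 0.9)   # 1.0
vlsi17 = dvs_range(0.75, 1.1)      # about 0.467, listed as 0.46
isscc15 = dvs_range(0.6, 1.2)      # 1.0
isscc13 = dvs_range(0.4, 1.1)      # 1.75
fixed_output = dvs_range(1.1, 1.1) # 0.0 for converters without DVS
```

Among the converters that support load stacking, the only other non-zero entry is roughly 0.47, which is where the factor of more than two relative to this work comes from.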
IV. CONCLUSION
This paper discussed the possibility of increasing the power
efficiency at both the power delivery and the functional level
by fusing VDS with DVS. The design and implementation of an
SC converter that fully supports DVS-loads in stacked voltage
domains was presented. By integrating 7 different topologies
in a single gearbox, together with a threefold regulation
mechanism and custom gate and bulk drive, the proposed
converter offers a wide DVS-range to two stacked loads while
one load can be completely turned off. Measurements of a
40nm bulk CMOS chip validated the integrated control loop
and the capability to supply two stacked independent loads
from 0.45V to 0.9V. This DVS-range is more than twice as
wide as that of current state-of-the-art converters for stacked loads.
The highest current densities in bulk CMOS were measured,
together with a peak ηsystem of 98.2%.
REFERENCES
[1] Y. Lu, J. Jiang, Wing-Hung Ki, C. P. Yue, Sai-Weng Sin, Seng-Pan U, and R. P. Martins, “A 123-phase DC-DC converter-ring with fast-DVS for microprocessors,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2015, pp. 1–3.
[2] D. El-Damak, S. Bandyopadhyay, and A. P. Chandrakasan, “A 93% efficiency reconfigurable switched-capacitor DC-DC converter using on-chip ferroelectric capacitors,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2013, pp. 374–375.
[3] L. G. Salem and P. P. Mercier, “An 85%-efficiency fully integrated 15-ratio recursive switched-capacitor DC-DC converter with 0.1-to-2.2V output voltage range,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2014, pp. 88–89.
[4] S. Rajapandian, Zheng Xu, and K. Shepard, “Implicit DC-DC downconversion through charge-recycling,” IEEE Journal of Solid-State Circuits, vol. 40, no. 4, pp. 846–852, Apr. 2005.
[5] S. K. Lee, T. Tong, X. Zhang, D. Brooks, and G.-Y. Wei, “A 16-core voltage-stacked system with an integrated switched-capacitor DC-DC converter,” in Symposium on VLSI Circuits, Jun. 2015, pp. C318–C319.
[6] K. Blutman, A. Kapoor, A. Majumdar, J. G. Martinez, J. Echeverri, L. Sevat, A. P. van der Wel, H. Fatemi, K. A. A. Makinwa, and J. P. de Gyvez, “A Low-Power Microcontroller in a 40-nm CMOS Using Charge Recycling,” IEEE Journal of Solid-State Circuits, vol. 52, no. 4, pp. 950–960, Apr. 2017.
[7] L. Chang, R. K. Montoye, B. L. Ji, A. J. Weger, K. G. Stawiasz, and R. H. Dennard, “A fully-integrated switched-capacitor 2:1 voltage converter with regulation capability and 90% efficiency at 2.3A/mm2,” in Symposium on VLSI Circuits, Jun. 2010, pp. 55–56.
[8] A. Sarafianos and M. Steyaert, “A true two-quadrant fully integrated switched capacitor DC-DC converter supporting vertically stacked DVS-loads with up to 99.6% efficiency,” in Symposium on VLSI Circuits, Jun. 2017, pp. C210–C211.
[9] M. D. Seeman and S. R. Sanders, “Analysis and Optimization of Switched-Capacitor DC-DC Converters,” IEEE Transactions on Power Electronics, vol. 23, no. 2, pp. 841–851, Mar. 2008.
[10] N. Butzen and M. Steyaert, “A 1.1W/mm2-power-density 82%-efficiency fully integrated 3:1 Switched-Capacitor DC-DC converter in baseline 28nm CMOS using Stage Outphasing and Multiphase Soft-Charging,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2017, pp. 178–179.