
A Capacitive DC-DC Converter for Stacked Loads With Wide Range DVS Achieving 98.2% Peak Efficiency in 40nm CMOS

Tim Thielemans, Nicolas Butzen, Athanasios Sarafianos, Michiel Steyaert, Filip Tavernier

E-mail: {Tim.Thielemans, Filip.Tavernier}@esat.kuleuven.be

Department of Electrical Engineering ESAT-MICAS - KU Leuven, Belgium

Abstract—This paper presents a fully integrated gearbox-type switched capacitor DC-DC converter that is able to provide two stacked loads with a wide voltage range for Dynamic Voltage Scaling (DVS). This combination offers significant efficiency improvements in both the power delivery and the functional blocks. To face the challenges of wide voltage range stacked loads, 7 topologies with custom gate and bulk drive were implemented, together with a threefold fully integrated control loop. This 8-phase interleaved converter has been fabricated in a 40nm CMOS technology. Measurements demonstrate a peak system efficiency of 98.2%, while a best-in-class DVS-range from 0.45V to 0.9V is guaranteed for two independent stacked loads.

Index Terms—DC-DC Converter, Switched Capacitor, Gearbox, Stacked Voltage Domains, Dynamic Voltage Scaling

I. INTRODUCTION

Dynamic Voltage Scaling (DVS) balances performance and

energy consumption in digital circuits and is a widely used

technique to improve power efficiency [1]. Although DVS is

a powerful technique, from a power management perspective,

this is far from ideal as each block requires its own highly

loaded power converter. Even assuming up to 93% efficient

converters for DVS [1]–[3], the conventional DVS-approach

still requires a large chip area and induces significant losses

in power delivery (PD) because the DVS-converters have to

convert all the power, as indicated by I1,2 in Fig. 1a.

Voltage domain stacking (VDS) significantly reduces the

overhead of the PD because, due to charge recycling, only

the mismatch current (I2-I1 in Fig. 1b) has to be converted.

Besides less chip area occupied by the PD, this leads to

a tremendous decrease in PD-losses and thus an increased

system efficiency (ηsystem). When loads are matched and

draw nearly identical currents, the converter barely converts

power, unlocking system efficiencies up to 100%, regardless of

the converter efficiency (ηconv) [4]. Only in the worst case, when

one load is completely turned off, does ηsystem drop to ηconv.

Moreover, as a consequence of Vbus < Vin, the initial power

converter at Vsup can achieve higher efficiencies as well

compared to conventional DVS. All these advantages of VDS

require a converter which is able to sink and source current as

the mismatch current can flow both into and out of the converter.

Ideally, the converter supports one load to be completely

turned off to guarantee proper operation while both loads can

work fully independently.
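The dependence of ηsystem on the current mismatch can be illustrated with a first-order sketch. It is not derived from the measurements in this paper: it assumes that the converter only processes the mismatch current, that it delivers this current to the heavier load, and that its loss equals the processed power times (1/ηconv − 1) with a constant ηconv. The function name and the example values are hypothetical.

```python
def stacked_system_efficiency(v_load1, v_load2, i_load1, i_load2, eta_conv):
    """First-order system efficiency of two stacked loads (illustrative model).

    Assumptions (not from the paper): the converter only processes the
    mismatch current |i_load2 - i_load1|, delivers it to the heavier load,
    and loses P_processed * (1/eta_conv - 1) while doing so.
    """
    p_loads = v_load1 * i_load1 + v_load2 * i_load2
    if p_loads == 0:
        raise ValueError("at least one load must draw power")
    i_mismatch = i_load2 - i_load1
    # Source mode (i_mismatch > 0) feeds the bottom load, sink mode the top load.
    v_heavier = v_load2 if i_mismatch >= 0 else v_load1
    p_processed = v_heavier * abs(i_mismatch)
    p_loss = p_processed * (1.0 / eta_conv - 1.0)
    return p_loads / (p_loads + p_loss)


if __name__ == "__main__":
    eta_conv = 0.80  # hypothetical converter efficiency
    for i1_ma in (20, 15, 10, 0):
        eta = stacked_system_efficiency(0.9, 0.9, i1_ma * 1e-3, 20e-3, eta_conv)
        print(f"I1 = {i1_ma:2d} mA, I2 = 20 mA -> eta_system = {eta:.3f}")
```

Under these assumptions, matched loads give ηsystem = 1 and a fully disabled load gives ηsystem = ηconv, which are the two limiting cases described in the text.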

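[Figure 1: block diagrams. (a) Conventional DVS: a DC-DC converter steps Vsup down to Vbus = Vload,max, and two further DC-DC converters supply Load1 (Vload1, I1) and Load2 (Vload2, I2). (b) Proposed DVS: a DC-DC converter steps Vsup down to Vin = Vload1 + Vload2 across the two stacked loads, and a single DC-DC converter at the intermediate node Vout handles the mismatch current |I2 - I1|.]

Fig. 1: Comparison of DVS-techniques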

Current state-of-the-art load-stacking converters are usually not DVS-supportive [5]–[7] or only offer a limited DVS-range [8], notably limiting the possible power reductions in digital blocks. Therefore, this paper presents a converter that supplies two independent stacked loads with a wide DVS-range. As explained in Section II, 7 topologies have been implemented in a single gearbox to achieve the desired wide DVS-range of 0.45V-0.9V for both loads. This is by far the best-in-class DVS-range for stacked loads, as shown by the state-of-the-art comparison of measurement results in Section III, before the conclusions are drawn in Section IV.

II. IMPLEMENTATION DETAILS

A. Gearbox Topology

A switched capacitor (SC) converter can be modeled as

an ideal transformer (iVCR) and a series output impedance

(Rout) [9]. The output current causes a voltage drop over Rout,

resulting in losses and a discrepancy between iVCR and the

actual voltage conversion ratio (VCR). For efficient operation

over a wide voltage range, a gearbox-type converter is beneficial

because iVCR can be chosen close to VCR, avoiding large

voltage drops and associated losses. Additionally, depending

on the direction of the output current (sink or source), VCR

can be higher or lower than iVCR. As a result, two different

topologies are required for each VCR when loads are stacked.

$$ \mathrm{VCR} = \frac{V_{out}}{V_{in}} = \frac{V_{load2}}{V_{load1} + V_{load2}} \tag{1} $$
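The ideal-transformer model with a series Rout described above can be sketched numerically. The snippet below is an illustration only: the Rout value is hypothetical, switching and gate-drive losses are ignored, and the efficiency expressions are the usual intrinsic bounds of this model (VCR/iVCR when sourcing, iVCR/VCR when sinking), not measured values.

```python
def sc_output_voltage(ivcr, v_in, i_out, r_out):
    """Output voltage of the ideal-transformer + R_out model.

    i_out > 0: converter sources current; i_out < 0: converter sinks current.
    """
    return ivcr * v_in - i_out * r_out


def intrinsic_efficiency(ivcr, v_in, i_out, r_out):
    """Efficiency bound set only by the drop over R_out (no switching losses)."""
    vcr = sc_output_voltage(ivcr, v_in, i_out, r_out) / v_in
    # Sourcing: power flows Vin -> Vout, so VCR < iVCR and eta = VCR / iVCR.
    # Sinking: power flows Vout -> Vin, so VCR > iVCR and eta = iVCR / VCR.
    return vcr / ivcr if i_out >= 0 else ivcr / vcr


if __name__ == "__main__":
    r_out = 2.0  # ohm, hypothetical output impedance
    for i_out in (0.02, -0.02):
        v_out = sc_output_voltage(0.5, 1.8, i_out, r_out)
        eta = intrinsic_efficiency(0.5, 1.8, i_out, r_out)
        print(f"Iout = {i_out * 1e3:+.0f} mA: Vout = {v_out:.3f} V, eta <= {eta:.3f}")
```

The sign of the output current moves Vout to opposite sides of iVCR·Vin, which is why a given VCR needs one topology for sourcing and another for sinking.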

[Figure 2: schematics of the seven two-phase (φ1, φ2) topologies built from the flying capacitors r, g and b between Vin and Vout, labeled with their iVCRs 3/4, 2/3, 3/5, 1/2, 2/5, 1/3 and 1/4.]

Fig. 2: Used topologies and iVCRs with gray common phases

[Figure 3: schematic of the SC core with flying capacitors Cr, Cg and Cb and the power switches TG1-TG4, BG1-BG4, TR1-TR4, BR1-BR4, TB1-TB2 and BB1-BB2 at the top-plate (T) and bottom-plate (B) nodes, connecting to Vin, Vout and Vdd.]

Fig. 3: Implementation of the proposed gearbox SC core

The desired wide DVS-range of 0.45V-0.9V for both stacked loads results, according to (1), in a VCR-range from 1/3 to 2/3. Since VDS requires two iVCRs for a single VCR, the iVCR-range of the gearbox should be wider than [1/3, 2/3]. Seven different topologies with iVCRs ranging from 1/4 to 3/4 have been selected, as visualized in Fig. 2. Thanks to the common phases, the number of switches could be limited to 20 and all devices could be sized closer to their optimal size in each topology. Additionally, the three flying capacitors are placed in each topology such that the overall bottom-plate and soft-charging losses are minimized. Capacitor Cg, for instance, has the largest bottom-plate swing in each topology, and Cb transfers the most charge in each topology. Both capacitors are sized accordingly (Cg < Cr and Cb ≈ 2Cr).
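The VCR-range and the need for iVCRs beyond it can be checked with exact fractions. This is a minimal sketch: the helper names are hypothetical, and the selection rule only encodes the statement that sourcing requires iVCR above VCR and sinking requires iVCR below VCR. Which of the usable ratios the iVCR decoder actually selects in each region is not modeled here.

```python
from fractions import Fraction

# The seven implemented iVCRs, as listed in the text and Fig. 2.
IVCRS = [Fraction(1, 4), Fraction(1, 3), Fraction(2, 5), Fraction(1, 2),
         Fraction(3, 5), Fraction(2, 3), Fraction(3, 4)]


def vcr(v_load1, v_load2):
    """Equation (1): VCR = Vload2 / (Vload1 + Vload2)."""
    return v_load2 / (v_load1 + v_load2)


def usable_ivcrs(target_vcr, sourcing):
    """iVCRs on the correct side of the target VCR.

    Sourcing lowers Vout below iVCR*Vin, so iVCR must exceed VCR;
    sinking raises Vout, so iVCR must stay below VCR.
    """
    if sourcing:
        return [r for r in IVCRS if r > target_vcr]
    return [r for r in IVCRS if r < target_vcr]


if __name__ == "__main__":
    v_min, v_max = Fraction(45, 100), Fraction(90, 100)
    vcr_min = vcr(v_max, v_min)  # Vload1 = 0.9 V, Vload2 = 0.45 V
    vcr_max = vcr(v_min, v_max)  # Vload1 = 0.45 V, Vload2 = 0.9 V
    print(vcr_min, vcr_max)
    print(usable_ivcrs(vcr_max, sourcing=True))
    print(usable_ivcrs(vcr_min, sourcing=False))
```

At the upper corner (VCR = 2/3, sourcing) only iVCR = 3/4 remains usable, and at the lower corner (VCR = 1/3, sinking) only iVCR = 1/4, which is consistent with the selected range of 1/4 to 3/4.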

B. Power Switches

Gearbox converters with wide VCR-range suffer in general

from high blocking voltages and varying overdrive (Vgs)

voltages for the switches. Moreover, the combination of DVS

with VDS provokes drain-source reversals. Whereas the high

blocking voltage is solved traditionally by combining thin

and thick-oxide devices (see Fig. 3), the varying Vgs requires

custom solutions. To start, a low-power auxiliary fixed supply

(Vdd) is used for control logic and some bootstrapping circuits

to guarantee timing restrictions independently of Vout.

Furthermore, the overdrive voltage of the frequently used

PMOS switches BG1, BR1 and BB1 is always limited to Vout.

[Figure 4: schematic of the TR2 driver, consisting of a bootstrap circuit (M1-M7, Cx, output Nbs), a multiplexer driving the gate node Ng, a drain-source reversal compensation stage driving the bulk node Nb (M8-M11), and the power switch TR2 between Vout and VTR. Legend: Φ1 = TR2 open, Φ2 = TR2 closed, α = Vout > VTR.]

Fig. 4: Gate and bulk driving of power switch TR2

To overcome the trade-off between losses due to the gate

capacitance and due to the on-resistance, these switches are

fragmented and parallelized according to Vout.

Drain-source reversals require extra attention for bulk and

gate contacts to avoid forward-biased junction diodes, unwanted

conductive switches or excessive inter-terminal voltages.

As an example of the precautions taken, the driving stage

of power switch TR2, between Vout and the top-plate of Cr

(TR), is shown in Fig. 4. The switch is toggled with a bootstrap

circuit, where Vdd charges Cx during Φ1 through M3 and M1,

while M6 disconnects the top plate of Cx from the bootstrap

output (Nbs) which is grounded by M7. In Φ2, M4 connects

the bottom plate of Cx to Vout, resulting in a boosted voltage

(Vdd+Vout) at Nbs. A grounded gate (Ng) can be used to open

TR2 without any risk of excessive Vdg when VTR ≤ Vout. This

is, however, only the case when the converter is configured in

topologies 3/4, 2/5 or 3/5. Therefore, a pass-gate multiplexer

connects either Nbs or the switched bulk voltage (Nb) to the

gate of TR2 (Ng). These switched gate and bulk contacts prevent

excessive inter-terminal voltages and forward-biased diodes.
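The two bootstrap phases described above can be summarized in idealized form. The relations below are a sketch that assumes ideal switches, no charge sharing between Cx and the gate capacitance of TR2, and that the multiplexer passes Nbs to Ng while TR2 conducts with its terminals settled near Vout. The last line is an inference from these assumptions, not a value stated in the text.

```latex
\begin{align}
\Phi_1:&\quad V_{C_x} = V_{dd}, \qquad V_{N_{bs}} = 0 \\
\Phi_2:&\quad V_{N_{bs}} = V_{out} + V_{C_x} = V_{dd} + V_{out} \\
       &\quad V_{gs,\mathrm{TR2}} \approx V_{N_{bs}} - V_{out} = V_{dd}
\end{align}
```

Under these assumptions the overdrive of the closed switch is set by the fixed supply Vdd and does not vary with Vout.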

C. Control

The top-level architecture of the converter core with control

is shown in Fig. 5. VDS requires fast and proper detection of a

sink (SNK) or source (SRC) situation to avoid voltage drops.

Therefore, an 800MHz-clocked comparator with hysteresis

compares Vref with Vout and switches state (SNK ↔ SRC)

when Vout crosses the boundary [8]. This SRC-signal is then

used in a threefold control to ensure proper operation.
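The sink/source detection can be sketched behaviorally as a clocked comparator with hysteresis. The class below is an illustration, not the implemented circuit: the hysteresis width and initial state are hypothetical, and the polarity assumes that Vout sags when the converter has to source current and rises when it has to sink current.

```python
class SinkSourceDetector:
    """Clocked comparator with hysteresis deciding between SRC and SNK.

    Behavioral sketch only: the hysteresis width and the initial state are
    hypothetical, and the polarity assumes that Vout sags when the converter
    has to source current and rises when it has to sink current.
    """

    def __init__(self, hysteresis=0.02, sourcing=True):
        self.hysteresis = hysteresis
        self.sourcing = sourcing

    def clock(self, v_ref, v_out):
        """Evaluate once per comparator clock edge; returns True for SRC."""
        if self.sourcing and v_out > v_ref + self.hysteresis:
            self.sourcing = False   # Vout pushed above the band: sink
        elif not self.sourcing and v_out < v_ref - self.hysteresis:
            self.sourcing = True    # Vout pulled below the band: source
        return self.sourcing


if __name__ == "__main__":
    detector = SinkSourceDetector(hysteresis=0.02)
    for v_out in (0.90, 0.91, 0.93, 0.91, 0.89, 0.87):
        state = "SRC" if detector.clock(0.90, v_out) else "SNK"
        print(f"Vout = {v_out:.2f} V -> {state}")
```

The hysteresis band keeps the SRC-signal from toggling on small ripple around Vref, so the topology decoder only switches sides after a real crossing.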

Two open-loop controls configure the converter core in the

most efficient way, as can be seen in Fig. 5. The iVCR

decoder (a) sets the proper topology, based on the SRC-

signal and the desired VCR. For this, clocked comparators

compare Vref with four fixed ratios of Vin, giving an idea

of the VCR-region (see also Fig. 10). The second open-loop

control (b) determines, similar to the previous one, when Vout

drops below 0.6V or 0.5V to incorporate two, respectively,

three of the fragmented switches mentioned earlier. Finally, a

pulse skipping hysteretic controller clocked at 800MHz (c),

closes the loop and ensures a stable output voltage. Either the

negative (SRC) or positive (SNK) terminal of the comparator

is applied to the phase generator, creating the switching signals

for each of the 8x2 fragments.
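The fragment selection (b) and the pulse-skipping decision (c) can be sketched as two small functions. The thresholds of 0.6V and 0.5V are taken from the text; the function names are hypothetical, and the comparator polarity in the pulse-skipping rule is an assumption about how the SRC and SNK cases differ.

```python
def active_switch_fragments(v_out):
    """Number of parallel fragments enabled for the Vout-driven PMOS switches.

    Thresholds follow the text: two fragments below 0.6 V, three below 0.5 V.
    """
    if v_out < 0.5:
        return 3
    if v_out < 0.6:
        return 2
    return 1


def fire_switching_pulse(v_ref, v_out, sourcing):
    """Pulse-skipping decision taken at each control clock edge.

    Assumed polarity: when sourcing, charge is transferred only while Vout is
    below Vref; when sinking, only while Vout is above Vref.
    """
    return v_out < v_ref if sourcing else v_out > v_ref


if __name__ == "__main__":
    for v_out in (0.45, 0.55, 0.75):
        print(f"Vout = {v_out:.2f} V -> {active_switch_fragments(v_out)} fragment(s)")
    print(fire_switching_pulse(0.75, 0.74, sourcing=True))
    print(fire_switching_pulse(0.75, 0.74, sourcing=False))
```

Adding fragments at low Vout compensates for the reduced overdrive of these switches, while skipping pulses lowers the effective switching frequency at light load.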

[Figure 5: block diagram. Open-loop control: (a) topology control, a comparator bank that feeds the iVCR select decoder together with the SRC-signal, and (b) switch fragment control, a second comparator bank. Closed-loop control: (c) a comparator on Vref and Vout, clocked at fclk, whose SRC and SNK outputs are multiplexed into the phase generator (fcontrol, fsw1 to fsw8). The converter consists of 8 interleaved SC cores, each split into a 0° and a 180° fragment with gate and bulk drive, NOCG and decoder, and is connected to Vin, Vout, Load1 and Load2.]

Fig. 5: Toplevel architecture

[Figure 6: chip photograph, 795 μm × 610 μm, showing the central control, the decoupling capacitance and the fragments 1 (0°) and 1' (180°) with their flying capacitors.]

Fig. 6: Chip photograph, total active area of 0.484mm²

III. MEASUREMENT RESULTS

To validate the proposed design, a prototype chip was

fabricated in a 40nm bulk CMOS technology with a total active

area of 0.484mm², control and decoupling included. The 8x2

fragments can be clearly distinguished in Fig. 6. All flying and

decoupling capacitance has been implemented with a MOS-MOM

stack wherein IO-devices are used to minimize leakage.

Also, the substrate coupling is reduced by deep n-well biasing

with two front-to-front connected diodes [10].

As shown in Fig. 7, Vout follows Vref in both maximal

(20mA) sink and source loads. To distinguish the operation

of the topology selection and the hysteretic controller, the

frequency of fcontrol was decimated in the lower two plots.

The hysteretic controller maximizes fsw to 50MHz after a step

in Vref, decreasing the gap between Vref and Vout as much

as possible with the current topology. The controller updates

the topology instantaneously to the correct one when fcontrol

toggles and Vout rises/falls to Vref with up to 1.5V/μs. The

transient measurement in Fig. 8 demonstrates that the converter

detects and recovers from maximal load reversals (± 20mA)

while voltage droop is kept limited to 85mV.

As a result of the voltage stacking capability, the best ηconv is equal to the worst ηsystem, completely overcoming the low efficiency for low-load conditions (Fig. 9). With a maximal current drawn of 20mA in each load, the full DVS-range (0.45V-0.9V) is guaranteed for both loads, even while one load is completely turned off. However, when only a single fixed conversion ratio is required (for instance from 1.6V to 0.9V), the maximal current drawn for stacked loads increases to 39mA or even to 50mA for a single load. At this maximal single-load power density, the converter still achieves 76% efficiency. Fig. 10 demonstrates the capability of providing the full DVS-range, while achieving a peak system efficiency of 98.2%, including Vdd. Also, the boundaries for the topology selection are marked with a dotted line.

[Figure 7: two transient plots of Vout, Vref, fcontrol and the newly selected iVCR versus time (vertical axis 0.6V to 0.9V), for 20mA SNK and SRC loads, with transitions between iVCR = 1/3, 1/2 and 2/3.]

Fig. 7: Transient measurements showing Vref-tracking capability and operation of both hysteretic and topology controller.

[Figure 8: Vout-Vref [mV] (-100 to 100) and ILoad [mA] (-20 to 20) versus time for a SRC, SNK, SRC sequence, with iVCR changing from 1/2 to 1/3 and back to 1/2.]

Fig. 8: Detection and recovery of maximal load reversal

[Figure 9: ηconv and ηsystem (0.5 to 1) versus current mismatch (Iload2-Iload1) [mA] (-50 to 50) for Vin/Vout = 1.35V/0.52V, 1.35V/0.75V, 1.35V/0.82V, 1.60V/0.90V and 1.80V/0.90V, with the fully independent DVS-range indicated.]

Fig. 9: System and converter efficiency for varying current imbalance and multiple voltages, all supplies included.

TABLE I: Comparison with current state-of-the-art SC converters for stacked loads (VDS) and/or DVS

| | This Work | VLSI17 [8] | JSSC17 [6] | VLSI15 [5] | VLSI10 [7] | ISSCC15 [1] | ISSCC13 [2] |
|---|---|---|---|---|---|---|---|
| Category | DVS + VDS | DVS + VDS | VDS | VDS | VDS | DVS | DVS |
| Technology | 40nm G | 65nm G | 40nm G | 40nm G | 45nm SOI | 65nm LL | 130nm |
| Capacitors | MOM-MOS | MOM-MIM | N/A | MOS | Deep-Trench | MOS-MOM-MIM | Ferro-electric |
| Vin [V] | 0.9-1.8 | 1.5-2.2 | 2.2 | 3.6 | 2 | 1.6-2.2 | 1.5 |
| Vout [V] | 0.45-0.9 | 0.75-1.1 | 1.1 | 0.9; 1.8; 2.7+ | 1 | 0.6-1.2 | 0.4-1.1 |
| DVS-range (ΔVout/Vout,min) | 1 | 0.46 | 0 | 0 | 0 | 1 | 1.75 |
| Load stacking? | Yes | Yes | Yes | Yes | Yes | No | No |
| δImax | 100% | 100% | N/A | 15.7% | 60% | / | / |
| iVCRs | 3/4; 2/3; 3/5; 1/2; 2/5; 1/3; 1/4 | 2/3; 1/2; 1/3 | 1/2 | Ladder | 1/2 | 1; 2/3; 1/2; 1/3 | 3/4; 2/3; 1/2 |
| ηsystem,peak | 98.2% | 99.6% | 96% | 99.8% | 99% | 78.3% | 93% |
| ρ [mA/mm²] | 82.64* / 161.15** | 59.8* / 140** | 19.09 | N/A | 4600 | 125* / 150** | 2.73 |

+ 4 stacked loads of 0.9V. * For the full reported DVS-working range. ** Peak.

[Figure 10: contour plot of ηsystem (contours from 0.92 to 0.98, peak = 98.2%) versus VLoad2 [V] and VLoad1 [V], both from 0.45 to 0.85, with region labels 2:1 (SRC), 3:1 (SNK), 5:3 (SRC), 5:2 (SNK), 3:2 (SRC), 2:1 (SNK), 5:2 (SRC), 4:1 (SNK), 4:3 (SRC) and 5:3 (SNK).]

Fig. 10: ηsystem for identical load currents (20mA) over the whole DVS-range, all supplies included, and an indication of the iVCR-regions with dotted lines

Table I compares this converter with current state-of-the-art

converters for VDS and/or DVS. Just like other converters that

support VDS, the presented converter clearly outperforms traditional

converters regarding system efficiency. This converter,

however, allows one load to be completely turned off (δImax

= 100%) so that both loads can operate fully independently

in contrast to work reported in [5] and [7]. Moreover, the

highest current density is achieved compared to other bulk

CMOS implementations [6], [8], and even to DVS-converters

that do not enable load stacking [1], [2]. Above all, this work

achieves a DVS-range of 1 ((0.9V − 0.45V)/0.45V), which is by far the

widest reported DVS-range for stacked loads [5]–[8].
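The figures of merit quoted here can be reproduced with a few lines of arithmetic. The DVS-range follows the definition in the Table I header. For the current density, the tabulated values are reproduced if the currents of both loads are summed over the total active area; that reading is an assumption, and the helper names are hypothetical.

```python
def dvs_range(v_out_min, v_out_max):
    """Normalized DVS-range as defined in Table I: (Vmax - Vmin) / Vmin."""
    return (v_out_max - v_out_min) / v_out_min


def current_density(i_load_ma, n_loads, area_mm2):
    """Total load current per active area in mA/mm^2."""
    return i_load_ma * n_loads / area_mm2


if __name__ == "__main__":
    area = 0.484  # mm^2, total active area from Section III
    print(f"DVS-range, this work: {dvs_range(0.45, 0.9):.2f}")
    print(f"DVS-range, [8]:       {dvs_range(0.75, 1.1):.2f}")
    print(f"Density, full DVS-range (2 x 20 mA): {current_density(20, 2, area):.2f} mA/mm^2")
    print(f"Density, peak (2 x 39 mA):           {current_density(39, 2, area):.2f} mA/mm^2")
```

With these definitions the DVS-range of this work is 1 against roughly 0.47 for [8] (listed as 0.46 in Table I), so the ratio between the two is slightly above two.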

IV. CONCLUSION

This paper discussed the possibility to increase the power

efficiency at both the power delivery and the functional level

by fusing VDS with DVS. The design and implementation of a

SC converter that fully supports DVS-loads in stacked voltage

domains was presented. By integrating 7 different topologies

in a single gearbox, together with a threefold regulation

mechanism and custom gate and bulk drive, the proposed

converter offers a wide DVS-range to two stacked loads while

one load can be completely turned off. Measurements of a

40nm bulk CMOS chip validated the integrated control loop

and the capability to supply two stacked independent loads

from 0.45V to 0.9V. This DVS-range is more than twice as

wide as that of current state-of-the-art converters for stacked loads.

The highest current densities in bulk CMOS were measured,

together with a peak ηsystem of 98.2%.

REFERENCES

[1] Y. Lu, J. Jiang, Wing-Hung Ki, C. P. Yue, Sai-Weng Sin, Seng-Pan U, and R. P. Martins, “A 123-phase DC-DC converter-ring with fast-DVS for microprocessors,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2015, pp. 1–3.

[2] D. El-Damak, S. Bandyopadhyay, and A. P. Chandrakasan, “A 93% efficiency reconfigurable switched-capacitor DC-DC converter using on-chip ferroelectric capacitors,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2013, pp. 374–375.

[3] L. G. Salem and P. P. Mercier, “An 85%-efficiency fully integrated 15-ratio recursive switched-capacitor DC-DC converter with 0.1-to-2.2V output voltage range,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2014, pp. 88–89.

[4] S. Rajapandian, Zheng Xu, and K. Shepard, “Implicit DC-DC downconversion through charge-recycling,” IEEE Journal of Solid-State Circuits, vol. 40, no. 4, pp. 846–852, Apr. 2005.

[5] S. K. Lee, T. Tong, X. Zhang, D. Brooks, and G.-Y. Wei, “A 16-core voltage-stacked system with an integrated switched-capacitor DC-DC converter,” in Symposium on VLSI Circuits, Jun. 2015, pp. C318–C319.

[6] K. Blutman, A. Kapoor, A. Majumdar, J. G. Martinez, J. Echeverri, L. Sevat, A. P. van der Wel, H. Fatemi, K. A. A. Makinwa, and J. P. de Gyvez, “A Low-Power Microcontroller in a 40-nm CMOS Using Charge Recycling,” IEEE Journal of Solid-State Circuits, vol. 52, no. 4, pp. 950–960, Apr. 2017.

[7] L. Chang, R. K. Montoye, B. L. Ji, A. J. Weger, K. G. Stawiasz, and R. H. Dennard, “A fully-integrated switched-capacitor 2:1 voltage converter with regulation capability and 90% efficiency at 2.3A/mm2,” in Symposium on VLSI Circuits, Jun. 2010, pp. 55–56.

[8] A. Sarafianos and M. Steyaert, “A true two-quadrant fully integrated switched capacitor DC-DC converter supporting vertically stacked DVS-loads with up to 99.6% efficiency,” in Symposium on VLSI Circuits, Jun. 2017, pp. C210–C211.

[9] M. D. Seeman and S. R. Sanders, “Analysis and Optimization of Switched-Capacitor DC-DC Converters,” IEEE Transactions on Power Electronics, vol. 23, no. 2, pp. 841–851, Mar. 2008.

[10] N. Butzen and M. Steyaert, “A 1.1W/mm2-power-density 82%-efficiency fully integrated 3:1 Switched-Capacitor DC-DC converter in baseline 28nm CMOS using Stage Outphasing and Multiphase Soft-Charging,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2017, pp. 178–179.
