Distributed Sleep Transistor Network for Power Reduction*

Post on 12-Jan-2016

Changbo Long, ECE Department, UW-Madison, clong@cae.wisc.edu

Lei He, EDA Research Group, EE Department, UCLA, lhe@ee.ucla.edu

*Partially sponsored by NSF CAREER Award 0093273, SRC grant HJ-1008 and Intel Corporation.

Outline

Motivation

Background

Distributed sleep transistor network (DSTN): structure, advantages, modeling and sizing algorithm

Experiment results

Conclusion and future work

Motivation

Leakage power will become the dominant power component
  Reduced feature size
  Increased system integration, hence more idle modules

Leakage reduction techniques

To reduce leakage for active modules
  Dual threshold voltage assignment for sub-threshold leakage [Mahesh et al., ICCAD’02]
  Pin reordering for gate leakage [Lee et al., DAC’03]

To reduce leakage for idle modules
  Input vector control [Johnson et al., DAC’99]
  Power gating [Kao et al., DAC’98] [Anis et al., DAC’02]

Motivation

System level: use a power management processor (PMP) to generate control signals [Mutoh et al., JSSC’96]
  The PMP can be distributed

Gate level: use sleep transistors to turn off the power supply
  The concerns are performance loss and area overhead

[Figure: gates g1 ... gn between Vdd and Virtual GND, with sleep transistors (Sleep tr.) controlled by the Sleep signal from the PMP.]

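Performance Loss

Performance loss: the increase in propagation delay

Performance loss is proportional to $V_{st}$, the voltage drop across the sleep transistor

$$V_{st} = i_{st} R_{st}$$

$$R_{st} = \left(\frac{L}{W}\right)_{st} \cdot \frac{1}{\mu_n C_{ox} (V_{dd} - V_{tH})}$$

In the worst case, $i_{st}$ is the Maximum Simultaneous Switching Current (MSSC)

[Figure: gates g1 ... gn supplied from Vdd discharge the current i_st through the sleep transistor.]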

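MSSC

MSSC: the maximum current over both the time domain and the input vector domain

[Figure: the current waveforms i_g1, i_g2 and i_g3 of gates g1, g2 and g3 over time t add up to i_total = i_g1 + i_g2 + i_g3; the axes are Time and Input vector, and MSSC is the peak of i_total.]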
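As a minimal sketch of this definition, the snippet below sums per-gate current waveforms into i_total for each input vector and takes the peak over both time and input vectors. The function name `mssc` and all waveform values are illustrative assumptions, not data from the slides.

```python
def mssc(waveforms_by_vector):
    """Return the peak total current over all time steps and input vectors.

    waveforms_by_vector: one entry per input vector; each entry is a list of
    per-gate waveforms, and each waveform is a list of current samples.
    """
    peak = 0.0
    for gate_waveforms in waveforms_by_vector:
        # i_total(t) = i_g1(t) + i_g2(t) + ... for this input vector
        total = [sum(samples) for samples in zip(*gate_waveforms)]
        peak = max(peak, max(total))
    return peak


# Hypothetical currents (mA) of gates g1, g2, g3 at four time steps.
vector_a = [[0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 1]]  # gates switch one by one
vector_b = [[0, 2, 0, 0], [0, 3, 0, 0], [0, 0, 1, 0]]  # g1 and g2 switch together

print(mssc([vector_a, vector_b]))
```

With these made-up waveforms the peak comes from `vector_b`, where g1 and g2 switch in the same time step; the same gates under `vector_a` never overlap and give a lower peak.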

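Area Overhead

Area overhead: the sleep transistor area and the routing area of the virtual ground wires

Design convention: given the performance loss $\delta$, minimize the area overhead

$$\text{Area} = \left(\frac{W}{L}\right)_{st} = k_c \cdot \mathrm{MSSC}$$

$$k_c = \frac{1}{\delta \mu_n C_{ox} (V_{dd} - V_{tL})(V_{dd} - V_{tH})}$$

[Figure: gates g1 ... gn supplied from Vdd discharge the MSSC through the sleep transistor.]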
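A small numeric sketch may help relate the sizing rule to the voltage drop. It evaluates k_c and W/L for a given MSSC and performance loss budget, then plugs the result back into the sleep transistor resistance formula; algebraically the drop at MSSC works out to δ(Vdd − VtL). All technology values and helper names below are hypothetical and chosen only for illustration.

```python
def k_c(delta, mu_n_cox, vdd, vtl, vth):
    """k_c = 1 / (delta * mu_n*C_ox * (Vdd - VtL) * (Vdd - VtH))."""
    return 1.0 / (delta * mu_n_cox * (vdd - vtl) * (vdd - vth))


def sleep_w_over_l(mssc, delta, mu_n_cox, vdd, vtl, vth):
    """Area = (W/L)_st = k_c * MSSC."""
    return k_c(delta, mu_n_cox, vdd, vtl, vth) * mssc


def r_st(w_over_l, mu_n_cox, vdd, vth):
    """R_st = (L/W)_st / (mu_n*C_ox * (Vdd - VtH))."""
    return 1.0 / (w_over_l * mu_n_cox * (vdd - vth))


# Hypothetical technology values, for illustration only.
tech = dict(mu_n_cox=100e-6, vdd=3.3, vtl=0.6, vth=0.9)  # A/V^2, V, V, V
mssc = 10e-3   # assumed 10 mA peak current
delta = 0.05   # assumed 5% performance loss budget

size = sleep_w_over_l(mssc, delta, **tech)
drop = mssc * r_st(size, tech["mu_n_cox"], tech["vdd"], tech["vth"])
print(f"W/L = {size:.1f}, V_st = {drop * 1e3:.1f} mV")
```

Because W/L scales linearly with MSSC and inversely with δ, halving the loss budget or doubling the peak current doubles the required transistor size.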

Related Work

Module-based design methodology [Mutoh et al., JSSC’95 ’96] [Kao et al., DAC’98]
  A single, large sleep transistor accommodates the entire module [JSSC’96]
  From manual sizing to automatic sizing considering discharge patterns [Kao et al., DAC’98]
  The voltage drop on long virtual ground wires is nontrivial and results in large area

Cluster-based design methodology [Anis et al., DAC’02]
  Group gates into clusters and minimize the peak current in each cluster with clustering algorithms
  Insert a sleep transistor for each cluster to avoid long virtual ground wires
  Clustering may conflict with timing-driven placement

Sleep transistor area

Area*: the sleep transistor area ignoring the resistance of virtual ground wires

$$\text{Area}^*_{module} = k_c \times \mathrm{MSSC}_{module}$$

$$\text{Area}^*_{cluster} = k_c \times \sum_i \mathrm{MSSC}_{cluster\_i}$$

$\mathrm{MSSC}_{module} < \sum_i \mathrm{MSSC}_{cluster\_i}$, hence $\text{Area}^*_{module} < \text{Area}^*_{cluster}$
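The comparison between the module and the sum of its clusters follows from the fact that the clusters' peak currents need not occur at the same time or under the same input vector. A short derivation, writing $i_k(t, v)$ for the current of cluster $k$ at time $t$ under input vector $v$ (notation assumed here, not taken from the slides):

```latex
\begin{aligned}
\mathrm{MSSC}_{module}
  &= \max_{t,v} \sum_k i_k(t,v)
   \le \sum_k \max_{t,v} i_k(t,v)
   = \sum_k \mathrm{MSSC}_{cluster\_k}, \\
\mathrm{Area}^*_{module}
  &= k_c \, \mathrm{MSSC}_{module}
   \le k_c \sum_k \mathrm{MSSC}_{cluster\_k}
   = \mathrm{Area}^*_{cluster}.
\end{aligned}
```

Equality would require every cluster to reach its peak at the same instant under the same input vector, which is why the inequality is strict in practice.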

Considering the resistance of virtual ground wires, $\text{Area}_{mod} > \text{Area}_{clu}$ [Anis et al., DAC’02]

DSTN has the smallest area: $\text{Area}_{DSTN} \approx \text{Area}^*_{mod}$

DSTN: Distributed Sleep Transistor Network

DSTN enhances cluster-based design by connecting clusters with extra virtual ground wires

[Figure: cluster-based design compared with DSTN.]

Current Discharging Balance Reduces Size

[Figure: cluster-based design compared with DSTN.]

Cluster-based design
  Current discharges through its private sleep transistor, which requires a large transistor size

DSTN
  Current discharges through both the private and the neighboring sleep transistors, which allows a small transistor size

Additional Advantages of DSTN

DSTN introduces no constraint on placement

The wire overhead of DSTN is small

[Figure: cluster-based design compared with DSTN; in DSTN, additional wires connect the sleep transistors (Sleep tr.) of the clusters.]

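Modeling of DSTN

The entire module is modeled as a resistance network plus current sources

[Figure: resistance network in which each cluster's switching current is a current source, R_i is the resistance of a virtual ground wire and R_st is the resistance of a sleep transistor.]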
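As a minimal sketch of this model, the snippet below solves the KCL equations for the virtual ground voltages, assuming the clusters are connected in a simple chain (the slides do not fix a topology here). The function name, the chain layout and all component values are assumptions for illustration. The system is tridiagonal, so it is solved with the Thomas algorithm.

```python
def virtual_ground_voltages(currents, r_st, r_wire):
    """Solve KCL on a chain of clusters for the virtual ground voltages.

    Node j sinks currents[j] through its own sleep transistor r_st[j] to
    ground and through wires of resistance r_wire to its chain neighbors.
    Use r_wire=float("inf") for isolated clusters (cluster-based design).
    """
    n = len(currents)
    g_wire = 0.0 if r_wire == float("inf") else 1.0 / r_wire
    # Tridiagonal conductance matrix: diag[j] on the diagonal, -g_wire beside it.
    diag = [1.0 / r_st[j] + g_wire * ((j > 0) + (j < n - 1)) for j in range(n)]
    rhs = list(currents)
    # Thomas algorithm: forward elimination, then back substitution.
    for j in range(1, n):
        m = -g_wire / diag[j - 1]
        diag[j] -= m * (-g_wire)
        rhs[j] -= m * rhs[j - 1]
    v = [0.0] * n
    v[-1] = rhs[-1] / diag[-1]
    for j in range(n - 2, -1, -1):
        v[j] = (rhs[j] + g_wire * v[j + 1]) / diag[j]
    return v


# Hypothetical case: three clusters, only the middle one is switching.
currents = [0.0, 6e-3, 0.0]      # A
r_st = [50.0, 50.0, 50.0]        # ohm, identical sleep transistors
isolated = virtual_ground_voltages(currents, r_st, float("inf"))
connected = virtual_ground_voltages(currents, r_st, 1.0)   # 1 ohm wires
print(max(isolated), max(connected))
```

In this toy case the connected network shows a much lower worst-case virtual ground voltage than the isolated clusters, because the neighboring sleep transistors carry part of the middle cluster's current.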

DSTN Sizing Problem

DSTN Sizing Problem (DSTN/SP): given the DSTN topology, DSTN/SP finds the size of every sleep transistor such that the total transistor area of the DSTN is minimized and the performance loss constraint is satisfied for every cluster

[Figure: DSTN resistance network with switching current sources; the width W and resistance R_st of every sleep transistor are unknown, and every cluster must meet the performance loss (PL) bound, expressed as V_st < ε.]

Difficulties of DSTN/SP

Primary challenge: the current sources
  The current sources depend on each other
  The current varies with respect to time

Secondary challenge: the resistance network
  Given the current sources, size R_st to minimize the transistor area while satisfying the performance loss constraints

Does any algorithm exist in the literature?
  No exact solution
  A closely related solution exists for power/ground network sizing [Boyd et al., ISPD’01]

We have developed an algorithm based on special properties of DSTN/SP

Properties of DSTN/SP Solutions

P1: Assuming $R_i = 0$,

$$\text{Area}_{DSTN} = \text{Area}^*_{mod} = k_c \cdot \mathrm{MSSC}_{module}$$

$$k_c = \frac{1}{\delta \mu_n C_{ox} (V_{dd} - V_{tL})(V_{dd} - V_{tH})}$$

$\delta$: performance loss constraint; MSSC: maximum current

P2: Given the current sources, $\text{Area}_{DSTN}$ increases when $R_i$ increases
  The increase is limited because $R_i \ll R_{st}$
  When $R_i = \infty$, $\text{Area}_{DSTN} = \text{Area}_{cluster}$

Properties of DSTN/SP Solutions

P3: Assuming the cluster currents and $\text{Area}_{DSTN}$ to be constant, the minimum performance loss is achieved when

$$\text{Area}_{cluster\_i} = \frac{\mathrm{MSSC}_{cluster\_i}}{\sum_j \mathrm{MSSC}_{cluster\_j}} \cdot \text{Area}_{DSTN}$$

Algorithm for DSTN/SP

P1, P2: The total sleep transistor area of DSTN is determined by

$$\text{Area}_{DSTN} = (1 + \beta) \cdot k_c \cdot \mathrm{MSSC}_{module}$$

  $\beta \in [0.05, 0.5]$ is an empirical parameter that increases when $R_i$ increases

P3: The size of each individual sleep transistor is

$$\text{Area}_{cluster\_i} = \frac{\mathrm{MSSC}_{cluster\_i}}{\sum_j \mathrm{MSSC}_{cluster\_j}} \cdot \text{Area}_{DSTN}$$

The key is to estimate $\mathrm{MSSC}_{module}$ and $\mathrm{MSSC}_{cluster}$
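The two sizing steps can be sketched in a few lines, with the cluster-based sizing (each transistor sized as k_c times its own cluster MSSC) alongside for comparison. The function names and all numeric inputs are hypothetical; in particular the values of k_c, β and the MSSC estimates are made up for illustration.

```python
def size_dstn(mssc_module, mssc_clusters, k_c, beta):
    """Sketch of the two-step DSTN sizing.

    Total area:     Area_DSTN = (1 + beta) * k_c * MSSC_module
    Per transistor: Area_i = MSSC_i / sum_j(MSSC_j) * Area_DSTN
    """
    if not 0.05 <= beta <= 0.5:
        raise ValueError("beta is expected in [0.05, 0.5]")
    total = (1.0 + beta) * k_c * mssc_module
    weight_sum = sum(mssc_clusters)
    sizes = [total * m / weight_sum for m in mssc_clusters]
    return total, sizes


def size_cluster_based(mssc_clusters, k_c):
    """Cluster-based sizing: each transistor handles its own cluster peak."""
    return [k_c * m for m in mssc_clusters]


# Hypothetical estimates: the module peak is below the sum of cluster peaks.
mssc_module = 12e-3                  # A
mssc_clusters = [6e-3, 5e-3, 4e-3]   # A
k_c = 30000.0                        # assumed W/L per ampere
total, sizes = size_dstn(mssc_module, mssc_clusters, k_c, beta=0.1)
baseline = size_cluster_based(mssc_clusters, k_c)
print(total, sizes, sum(baseline))
```

The proportional split keeps the sum of the individual sizes equal to the total, so β is the only knob that accounts for the wire resistance R_i.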

Maximum Current Estimation

Estimating $\mathrm{MSSC}_{module}$
  Circuit current strongly depends on the input vector
  The space of input vectors increases exponentially with the number of primary inputs
  A genetic algorithm (GA) based algorithm is used [Jiang et al., TVLSI’00]

An efficient algorithm to estimate $\mathrm{MSSC}_{cluster}$ has been proposed in the paper
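To illustrate why a search heuristic is used instead of enumerating all input vectors, here is a generic genetic algorithm sketch. It is not the algorithm of the cited work: the selection, crossover and mutation choices are assumptions, and `toy_current` is a hypothetical stand-in for a circuit simulation.

```python
import random

WEIGHTS = [3, 1, 4, 1, 5, 9, 2, 6]  # hypothetical per-input current contributions


def toy_current(bits):
    """Hypothetical stand-in for a simulator: inputs set to 1 add current,
    but two adjacent inputs set to 1 share a path and lose 2 units."""
    total = sum(w for w, b in zip(WEIGHTS, bits) if b)
    overlap = sum(2 for x, y in zip(bits, bits[1:]) if x and y)
    return total - overlap


def ga_max_current(current_of, n_inputs, pop_size=20, generations=40, seed=0):
    """Search input vectors for a large current with a simple GA.

    Returns the best value found, which is a lower bound on the true maximum.
    """
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_inputs))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=current_of, reverse=True)
        parents = pop[: pop_size // 2]          # elitism: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_inputs)    # one-point crossover
            child = list(a[:cut] + b[cut:])
            child[rng.randrange(n_inputs)] ^= 1  # single-bit mutation
            children.append(tuple(child))
        pop = parents + children
    return max(current_of(v) for v in pop)


print(ga_max_current(toy_current, len(WEIGHTS)))
```

A GA only evaluates a small sample of the input space, so its result can underestimate the true maximum; that is the trade-off for avoiding exponential enumeration.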

Base-line Case: Cluster-based Design

Cluster-based design without considering placement constraints

Given a circuit and a cluster size, partition the gates into clusters such that $\sum_i \mathrm{MSSC}_{cluster\_i}$ is minimized, which in turn minimizes $\text{Area}_{cluster}$

Clustering algorithm: Simulated Annealing (SA)

Sizing algorithm

Each individual sleep transistor:

$$\text{Area}_{cluster\_i} = k_c \cdot \mathrm{MSSC}_{cluster\_i}$$

Total area:

$$\text{Area}_{cluster} = k_c \cdot \sum_i \mathrm{MSSC}_{cluster\_i}$$
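A minimal simulated annealing sketch for the clustering step is shown below. It is an illustration under simplifying assumptions, not the authors' implementation: each gate has a single known current waveform (one input vector), moves are swaps between clusters so cluster sizes stay fixed, and the cooling schedule is linear.

```python
import math
import random


def sum_of_cluster_peaks(assign, waveforms, n_clusters):
    """Cost: sum over clusters of the peak of the cluster's total current."""
    steps = len(waveforms[0])
    totals = [[0.0] * steps for _ in range(n_clusters)]
    for gate, cluster in enumerate(assign):
        for t, current in enumerate(waveforms[gate]):
            totals[cluster][t] += current
    return sum(max(total) for total in totals)


def anneal_clusters(waveforms, n_clusters, steps=2000, t0=1.0, seed=0):
    """Return (best assignment, best cost) found by swap-based annealing."""
    rng = random.Random(seed)
    n = len(waveforms)
    assign = [g % n_clusters for g in range(n)]      # equal-size start
    cost = sum_of_cluster_peaks(assign, waveforms, n_clusters)
    best, best_cost = assign[:], cost
    for k in range(steps):
        temp = t0 * (1.0 - k / steps) + 1e-9         # linear cooling
        a, b = rng.sample(range(n), 2)
        if assign[a] == assign[b]:
            continue
        assign[a], assign[b] = assign[b], assign[a]  # swap keeps cluster sizes
        new_cost = sum_of_cluster_peaks(assign, waveforms, n_clusters)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = assign[:], cost
        else:
            assign[a], assign[b] = assign[b], assign[a]  # reject: undo the swap
    return best, best_cost


# Hypothetical waveforms: gates 0 and 2 switch first, gates 1 and 3 second.
waveforms = [[2, 0], [0, 2], [2, 0], [0, 2]]
start_cost = sum_of_cluster_peaks([0, 1, 0, 1], waveforms, 2)
best, best_cost = anneal_clusters(waveforms, n_clusters=2)
print(start_cost, best_cost, best)
```

In this toy case the initial grouping puts gates that switch together into the same cluster, and the search moves to a grouping where each cluster holds gates that switch at different times, which lowers the sum of cluster peaks.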

Experiment Setup

Gate level synthesis

Sizing
  Estimate the maximum current for the clusters and the entire module
  Apply the sizing algorithms

Verification
  Simulate the circuit and obtain the current sources for 10,000 random input vectors
  Obtain the performance loss by solving the resistance network with the circuit KCL and KVL equations
  Find the maximum performance loss among the performance losses of all input vectors

Custom layout
  Implement a four-bit CLA using 0.35 μm technology
  Determine the sizes by SPICE simulation

Result of Gate Level Synthesis

Cluster-based design: each cluster satisfies the performance loss constraint

DSTN: the entire module satisfies the performance loss constraint

On average, DSTN reduces total W/L by 49.8% with smaller performance loss

[Figure: two charts, "W/L of Sleep Transistors" and "Maximum Performance Loss", comparing cluster-based design and DSTN on C432, C499, C880, C1355, C1908, C2670, C3540, C5315, C6288 and C7552.]

Custom Layout in 0.35 μm

[Figure: two layouts. Cluster-based design: each cluster is accommodated by a sleep transistor. DSTN: sleep transistors are connected by virtual ground wires.]

Custom Layout Comparison

DSTN reduces runtime leakage by 50x and 5x compared to no sleep transistor and cluster-based design, respectively

DSTN reduces sleep transistor area by 6.83x with 6.6% smaller performance degradation compared to the cluster-based design

[Table: leakage current, delay, sleep transistor area and total area for no sleep transistor, cluster-based design and DSTN.]

Conclusion and Future Work

We have proposed DSTN and its sizing algorithm
  DSTN has reduced area, less leakage current and less supply voltage drop

Future work
  An ideal power/ground network is assumed in this paper
  Investigate the co-design of DSTN and the power/ground network