Regularity for Reduced Variability - Columbia University · 28 July 2006 Regularity for Reduced...

28 July 2006

www.c2s2.org

Regularity for Reduced Variability

Larry Pileggi

Carnegie Mellon [email protected]

CMU Collaborators

Andrzej Strojwas
Slava Rovner
Tejas Jhaveri
Thiago Hersan
Kim Yaw Tong
Sandeep Gupta
Xin Li
Norris Lui
Jon Proesel
Umut Arslan


Layout Dependent Variations

Layout dependent variations are having an increasingly dominant impact on functional and parametric yield

[Figure: percent of yield loss (0% to 100%) at the 130nm, 90nm, and 65nm nodes; legend: Pattern Dependent, Random, Parametric]

Source: PDF Solutions


Layout Dependencies

Example: Without SRAM-layout-specific SPICE models, design closure would be improbable for scaled CMOS

Statistical transistor models based on “all possible” patterns produce a very wide noise margin distribution

[Figure: SRAM static noise margin histogram (normalized static noise margin, 0.8 to 1.2, vs. number of samples, 0 to 250) from DR compliant SPICE models; σ = 0.060; 90nm bulk CMOS]


Layout Dependencies

[Figure: two SRAM static noise margin histograms (normalized static noise margin, 0.8 to 1.2, vs. number of samples, 0 to 250)]

DR compliant SPICE models: σ = 0.060
SRAM-layout-specific SPICE models: σ = 0.026

Based on 1000-simulation-run Monte Carlo
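One way to read the two spreads: if the layout-induced and purely random contributions to static noise margin were independent and additive (an assumption made here for illustration; the slides do not state it), their variances would add, and the layout-induced part could be backed out from the two reported sigmas. The sketch below does that arithmetic and checks it with a 1000-run Monte Carlo draw. The Gaussian shape, the seed, and all function and variable names are hypothetical.

```python
import math
import random
import statistics

SIGMA_ALL_PATTERNS = 0.060     # DR compliant SPICE models (from the slide)
SIGMA_LAYOUT_SPECIFIC = 0.026  # SRAM-layout-specific SPICE models (from the slide)
RUNS = 1000                    # matches the 1000-run Monte Carlo on the slide


def implied_layout_sigma(sigma_total, sigma_random):
    """Back out the layout component, assuming independent additive variances."""
    return math.sqrt(sigma_total**2 - sigma_random**2)


def simulate(sigma_layout, sigma_random, runs, seed=0):
    """Draw normalized SNM samples with and without a layout-dependent shift."""
    rng = random.Random(seed)
    specific, pooled = [], []
    for _ in range(runs):
        noise = rng.gauss(0.0, sigma_random)
        shift = rng.gauss(0.0, sigma_layout)  # hypothetical Gaussian layout effect
        specific.append(1.0 + noise)
        pooled.append(1.0 + noise + shift)
    return statistics.stdev(specific), statistics.stdev(pooled)


sigma_layout = implied_layout_sigma(SIGMA_ALL_PATTERNS, SIGMA_LAYOUT_SPECIFIC)
sd_specific, sd_pooled = simulate(sigma_layout, SIGMA_LAYOUT_SPECIFIC, RUNS)
print(f"implied layout sigma: {sigma_layout:.4f}")
print(f"simulated sigma, layout-specific: {sd_specific:.4f}")
print(f"simulated sigma, all patterns:    {sd_pooled:.4f}")
```

Under this assumed decomposition, the layout-dependent term is roughly twice the random term, which is consistent with the slide's point that models covering "all possible" patterns widen the distribution.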


Micro-Regularity

Grid of CD shapes with 500nm pitch

Frequency response suggests that simple RETs would be effective for controlling this set of patterns

[Figure: 2-D FFT plots of poly-Si patterns]

Highest peak at 2 “Hz” (2 objects per micron)
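The link between a 500nm pitch and a peak at 2 objects per micron can be reproduced with a one-dimensional cut through an idealized line grid. This is only a sketch: the slides use 2-D FFTs of real poly-Si layouts, while the line width, sampling step, and window length below are hypothetical values chosen for illustration.

```python
import cmath

PITCH_NM = 500    # grid pitch from the slide
LINE_NM = 250     # hypothetical line width (50% duty cycle)
STEP_NM = 50      # hypothetical sampling step
LENGTH_NM = 8000  # hypothetical window: 16 pitches


def sample_pattern():
    """Return 1 where a line is present and 0 in the spaces between lines."""
    count = LENGTH_NM // STEP_NM
    return [1 if (i * STEP_NM) % PITCH_NM < LINE_NM else 0 for i in range(count)]


def dft_magnitudes(samples):
    """Plain O(N^2) discrete Fourier transform, magnitudes for bins 0..N/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(samples))
        mags.append(abs(total))
    return mags


samples = sample_pattern()
mags = dft_magnitudes(samples)
# Skip bin 0 (the average level) and find the strongest spatial frequency.
peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
# Bin k corresponds to k cycles per window; convert to cycles per micron.
peak_cycles_per_um = peak_bin * 1000 / LENGTH_NM
print(f"strongest peak: {peak_cycles_per_um} cycles per micron")
```

A strictly periodic pattern concentrates its energy in a few sharp peaks like this one; less regular layouts spread energy across many spatial frequencies.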


Macro-Regularity

Less restriction is required for the patterns that can be encapsulated within macro-regular pattern groups

Can pre-qualify fundamental elements in silicon for known pattern neighborhoods

Pattern neighborhood is known, therefore cells can be reliably implemented to create manufacturable arrays

Macro-regularity for cell-to-cell variations
Micro-regularity for transistor-to-transistor variations


90nm Memory Array

Macro-regularity evident from repeated bit-cells

Spread of impulses due to lack of micro-regularity in bit-cells, but patterns validated in silicon via trial-and-error



Standard Logic Cells

Increasingly difficult to apply RETs and precisely print all patterns with a single optical setup for these 90nm std cells



Micro- and Macro-Variability Impact

Ex: identical min size transistors measured for three different physical environments on the same 65nm IC

Macro-regularity can provide for identical pattern environments for devices and “cells”

Micro-regularity is an area/performance vs. variability trade-off

[Figure: scatter plot of Ioff (log) vs. Idrive for Env I, Env II, and Env III]

Source: PDF Solutions


Full Adder Ring Oscillator

[Figure: ring oscillator built from a chain of full adder stages (bits 7 down to 0), each with inputs A_i and B_i, carry path Cin to Cout, and a sum output; oscillator output labeled osc]

Micro-regular layout expected to enhance printability
Expect reduced variability: tighter Tp and Idc variation

E.g. Tp ~ gate_length; Idc ~ exp(gate_length)
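The scaling note can be made concrete with a small numerical sketch: a delay that is linear in gate length inherits the gate-length spread roughly one-for-one, while a current that is exponential in gate length amplifies it. The 3% gate-length spread, the sensitivity constant K, the sign convention (shorter gates draw more static current), and the seed are all hypothetical; the slide gives only the proportionalities.

```python
import math
import random
import statistics

SIGMA_L = 0.03  # hypothetical relative spread of gate length
K = 10.0        # hypothetical sensitivity of Idc to normalized gate length
RUNS = 5000


def coeff_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


rng = random.Random(1)
lengths = [rng.gauss(1.0, SIGMA_L) for _ in range(RUNS)]  # normalized gate length

tp = list(lengths)                                  # Tp ~ gate_length
idc = [math.exp(-K * (l - 1.0)) for l in lengths]   # Idc ~ exp(gate_length)

cv_tp = coeff_of_variation(tp)
cv_idc = coeff_of_variation(idc)
print(f"CV of Tp:  {cv_tp:.3f}")
print(f"CV of Idc: {cv_idc:.3f}")
```

With these assumed numbers the relative spread of Idc comes out several times larger than that of Tp, which is why the static current is the more sensitive indicator of gate-length control.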


Layout Comparison

Std cell and regular mirror adders

Regular Logic Fabrics: 2.88 μm x 3.2 μm ≈ 9.2 μm²
Standard cell mirror adder: 3.6 μm x 2.6 μm ≈ 9.4 μm²

Regular adder based on SRAM FEOL-like “pushed rules” that are enabled by regularity to provide for comparable area design

Std cell layout based on wrong-way poly with multiple jogs, diffusion routing, and multiple metal routing directions


Extra Capacitance on CICO Path

Regular Logic Fabrics: total diffusion area at switching nodes = 0.57 μm²
Standard cell mirror adder: total diffusion area at switching nodes = 0.26 μm²

Larger diffusion areas due to on-grid placement of poly, contacts and metals, but identical transistor sizing for both adders

[Figure: the two adder layouts with input, internal, and output nodes labeled]
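The quoted dimensions can be checked with a few lines of arithmetic. The sketch below uses only the footprint dimensions from the layout comparison and the diffusion areas given here; the variable names are illustrative.

```python
# Footprints and diffusion areas as quoted on the slides (microns, square microns).
regular_footprint = 2.88 * 3.2
std_cell_footprint = 3.6 * 2.6
regular_diffusion = 0.57
std_cell_diffusion = 0.26

footprint_ratio = regular_footprint / std_cell_footprint
diffusion_ratio = regular_diffusion / std_cell_diffusion

print(f"regular fabric footprint: {regular_footprint:.2f} um^2")
print(f"standard cell footprint:  {std_cell_footprint:.2f} um^2")
print(f"footprint ratio (regular / std cell): {footprint_ratio:.3f}")
print(f"diffusion ratio (regular / std cell): {diffusion_ratio:.2f}")
```

The footprints are nearly equal, while the regular fabric carries roughly 2.2 times the diffusion area at switching nodes, which is the source of the extra parasitic capacitance on the CICO path.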


Wafer Probe Measurements

Commercial 65nm bulk process
Distributed ROs throughout wafer
Measured 1 wafer thus far (147 sites)
2 sites with failed measurements for standard cell adder
More measurements planned to show consistency of results across multiple wafers


Delay vs. Idc for CICO @ 1.2V

(normalized by min current)


Idc for CICO @ 1.2V

(values normalized by min current)


Tp for CICO @ 1.2V

Micro-regularity incurs delay penalty due to extra parasitic C

(values normalized by min current)


Mean, Std. Deviation, and Coeff. of Var.

Mean

\[ \mu = \frac{1}{N} \sum_{i=1}^{N} x_i \]

Std. Deviation

\[ \sigma = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \mu)^2} \]

assuming x_i are independent, identically distributed samples

Coefficient of Variation is

\[ \frac{\sigma}{\mu} \]
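These three statistics translate directly into code. The sketch below is a minimal version using the 1/(N-1) normalization for the standard deviation; the sample values are hypothetical and serve only to exercise the functions.

```python
import math


def mean(samples):
    """Arithmetic mean: (1/N) * sum of x_i."""
    return sum(samples) / len(samples)


def std_deviation(samples):
    """Sample standard deviation with the 1/(N-1) normalization."""
    mu = mean(samples)
    return math.sqrt(sum((x - mu) ** 2 for x in samples) / (len(samples) - 1))


def coeff_of_variation(samples):
    """Standard deviation divided by the mean."""
    return std_deviation(samples) / mean(samples)


# Hypothetical normalized measurements, for illustration only.
delays = [0.98, 1.01, 1.00, 1.03, 0.97, 1.01]
print(f"mean = {mean(delays):.4f}")
print(f"std  = {std_deviation(delays):.4f}")
print(f"CV   = {coeff_of_variation(delays):.4f}")
```

Because the coefficient of variation divides by the mean, it allows spreads to be compared between designs whose nominal values differ.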


Static Current Comparisons (CICO)

[Figure: three panels vs. Vdd (0.8 V to 1.4 V), each comparing Standard Cell and Regular Fabric: Mean Static Current (normalized static current), Standard Deviation (normalized standard deviation), and Coefficient of Variation σ/μ (normalized coefficient of variation)]


Propagation Delay Comparison (CICO)

[Figure: three panels vs. Vdd (0.8 V to 1.4 V): Mean Propagation Delay (normalized propagation delay), Standard Deviation (normalized standard deviation), and Coefficient of Variation (normalized coefficient of variation)]

Slightly higher nominal delay expected due to increased parasitics for pushed-rule regular fabric adder


Variations - Tp per gate @ Vdd = 1.2V

Standard cell mirror adder: μ = 1.000, σ = 0.0371
Regular mirror adder: μ = 1.025, σ = 0.0327

[Figure: wafermaps of abs(x - μ), normalized by max value (color scale 0 to 1), for both adders]


Variations - Idc per gate @ Vdd = 1.2V

Standard cell mirror adder: μ = 1.000, σ = 0.4224
Regular mirror adder: μ = 0.7213, σ = 0.2920

[Figure: wafermaps of abs(x - μ), normalized by max value (color scale 0 to 1), for both adders]
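The reported per-gate means and sigmas for Tp and Idc can be turned into coefficients of variation and relative sigma reductions. The sketch below uses only the normalized values quoted on the slides; the dictionary layout and names are illustrative.

```python
# (mean, sigma) per gate at Vdd = 1.2 V, normalized as reported on the slides.
REPORTED = {
    "Tp": {"standard cell": (1.000, 0.0371), "regular": (1.025, 0.0327)},
    "Idc": {"standard cell": (1.000, 0.4224), "regular": (0.7213, 0.2920)},
}

results = {}
for metric, adders in REPORTED.items():
    mu_std, sigma_std = adders["standard cell"]
    mu_reg, sigma_reg = adders["regular"]
    results[metric] = {
        "cv_std": sigma_std / mu_std,
        "cv_reg": sigma_reg / mu_reg,
        "sigma_reduction": 1.0 - sigma_reg / sigma_std,
    }

for metric, r in results.items():
    print(f"{metric}: CV {r['cv_std']:.2%} -> {r['cv_reg']:.2%}, "
          f"sigma reduced by {r['sigma_reduction']:.1%}")
```

The absolute spread shrinks much more for Idc than for Tp. Because the regular adder's mean Idc is also lower, its coefficient of variation for Idc improves by less than its sigma does.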


Outlier Count (Tp)

Count of outliers beyond 3σ

Non-Regular mirror adder (all paths): # of outliers = 20 (σ = 0.0371)
Regular mirror adder (all paths): # of outliers = 0 (σ = 0.0327)

[Figure: wafermaps of outlier count per site (scale 0 to 7) for both adders]
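Counting outliers beyond 3σ can be sketched as follows. The slides do not say exactly which mean and sigma were used as the reference (per design, per path, or per wafer), so this version simply uses the mean and sample standard deviation of the list it is given; the function name and data are hypothetical.

```python
import statistics


def count_outliers(samples, n_sigma=3.0):
    """Count samples farther than n_sigma standard deviations from the mean."""
    mu = statistics.mean(samples)
    sigma = statistics.stdev(samples)  # 1/(N-1) normalization
    return sum(1 for x in samples if abs(x - mu) > n_sigma * sigma)


# Hypothetical normalized delays: 30 nominal sites and one slow site.
delays = [1.0] * 30 + [10.0]
print(count_outliers(delays))
```

Note that extreme samples inflate the sigma they are compared against, so a count computed this way is conservative when outliers are large.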


Outlier Count (Idc)

Count of outliers beyond 3σ

Non-Regular mirror adder (all paths): # of outliers = 22 (σ = 0.4224)
Regular mirror adder (all paths): # of outliers = 12 (σ = 0.2920)

[Figure: wafermaps of outlier count per site (scale 0 to 7) for both adders]


Observations

Any difference is due solely to micro-regularity of layout

Both designs are macro-regular
Implementations have identical transistor topology and sizing

Difference in spread is most prominent in Idc
Exponentially dependent on gate length

We expect a greater variability impact when comparing macro-regularity differences


Macro-Regular Logic Bricks

Recently proposed macro-regular design via regular bricks

Less cell-to-cell variation, as in SRAM bit-cells
Total number of geometry patterns dramatically reduced

Provides known pattern neighborhood to adjacent bricks
Tighter characterization with known electrical environments

[Figure: array of micro-regular logic bricks with well-characterized, predictable pattern environments like memories]


Optimal Brick Size

Big enough to satisfy optical proximity constraints
Small enough to allow characterization and optimization
Specific enough to minimize wasted logic
Generic enough to allow reuse over multiple logic functions

[Figure: total area vs. brick size; annotations: small generic cells (micro-regularity penalty), big generic bricks (wasted logic), fewer patterns, logic efficiency]


Experimental Flow


ARM9 Implementation

65nm Low Power CMOS

Std Cell Spec Design:
16KB D cache, 32KB I cache
250MHz worst case
Area: 1.1323 mm²

Bricks derived from
7 fixed-size primitives
3 Flip Flop types
Various INV sizes for buffering
16 fixed-size application-specific bricks

[Figure: block floorplan with ICache, MMU, and DCache labeled]

Identical block footprint area for bricks and std cell designs


ARM9 Implementation Results

Std cells based on full sizing and resynthesis using complete library
40% more buffer area for bricks design due to sizing limitations
Does not measure potential improvement in parametric yield

Design                                     Silicon Whitespace (%)   Relative WC Timing (%)
Standard Cells (not on grid)               25.17                    100.00
Primitives (on grid)                       16.90                    105.40
Regular Bricks (using primitive mapping)   18.40                    99.47

Brick design has slightly less whitespace but fewer nets to route
Simulation results do not capture improvement in control of variations, or improvement with Brick-specific synthesis and flow


ACLV comparison

Normalized Leff comparison based on ACLV simulations at nominal process conditions for DFFs:

       Std Cells   FEOL push-rule Bricks
µ      1.000       0.9905
3σ     1.000       0.5692


Regularity-Friendly Circuits

Can further consider circuits and topologies which better match regular brick methodology and constraints

Example:

New DFF topology can reduce footprint, require only single clk polarity, and provide 20-40% improvement in speed


Statistical Optimization

Bricks can be statistically optimized for sizing w.r.t. variations
We expect that macro-regularity of bricks vs. standard cells will provide substantial improvement in predictability

[Figure: layout annotated with variation sources: adjacent stress, active corner, random dopant, STI-Poly distance stress, contact placement, poly corner]


Conclusions and Future Directions

Forms of Regular Fabrics appear to offer advantages beginning at 65nm node

Benefits of reduced design margins have yet to be fully measured
With limited number of bricks we can optimize them for better control and prediction of variations

Both systematic (those which we can model) and random (those which we cannot completely model) variations can be reduced
Can carefully design bricks to reduce sensitivity to random variations
