Arith Adder

download Arith Adder

of 40

Transcript of Arith Adder

  • 7/27/2019 Arith Adder

    1/40

    1

    Design Space Exploration forPower-Efficient Mixed-Radix LingAdders

    Chung-Kuan ChengComputer Science and Engineering Depart.

    University of California, San Diego

  • 7/27/2019 Arith Adder

    2/40

    2

    Outline Prefix Adder Problem

    Background & Previous Work

    Extensions: High-radix, Ling Our Work

    Area/Timing/Power Models

    Mixed-Radix (2,3,4) Adders ILP Formulation

    Experimental Results

    Future Work

  • 7/27/2019 Arith Adder

    3/40

    Prefix Adder Challenges

    Logical

    Levels

    Wire Tracks

    Fanouts

    Area

    Physical

    placement

    Detail routing

    New

    Design

    Scope

    Timing

    Gate Cap

    Wire Cap

    Gate sizing

    Buffer

    insertion

    Signal slope

    Input arrival

    time

    Output

    require time

    Increasing impact of physical design

    and concern of power.Power

    Static

    power

    Dynamic

    power

    Power

    gating

    Activity

    Probability

  • 7/27/2019 Arith Adder

    4/40

    4

    Binary Addition Input: two n-bit binary numbers

    and , one bit carry-in

    Output: n-bit sum and one bit carryout

    Prefix Addition: Carry generation &propagation

    011... aaan

    011... bbbn 0c

    011... sssn

    nc

    )(

    :Propagate

    :Generate

    1

    iiii

    iiii

    iii

    iii

    bacs

    cpgc

    bap

    bag

    =

    +=

    =

    =

    +

  • 7/27/2019 Arith Adder

    5/40

    5

    Prefix Addition Formulation

    iiiiii bapbag == Pre-processing:

    Post-

    processing:

    Prefix

    Computation:

    iii

    iii

    cps

    cPGc

    =

    +=+ 0]0:[]0:[1

    ]:1[]:[]:[

    ]:1[]:[]:[]:[

    kjjiki

    kjjijiki

    PPP

    GPGG

    =

    +=

  • 7/27/2019 Arith Adder

    6/40

    6

    1234

    12:13:14:1

    Prefix Adder Prefix StructureGraph

    gpi

    pi

    G[i:0]

    si

    biai

    GP[i, j] GP[j-1, k]

    GP[i, k]

    gp generator

    sum generator

    GP cell

    Pre-

    processing

    Post-

    processing

    Prefix

    Computation

  • 7/27/2019 Arith Adder

    7/40

    7

    Previous Works Classical prefix

    adders12345678

    12:13:14:16:17:18:1 5:1

    Brent-Kung:

    Logical levels: 2log2n1

    Max fanouts: 2

    Wire tracks: 1

    12345678

    12:13:14:16:17:18:1 5:1

    Kogge-Stone:

    Logical levels: log2n

    Max fanouts: 2

    Wire tracks: n/2

    12345678

    12:13:14:16:17:18:1 5:1

    Sklansky:

    Logical levels: log2n

    Max fanouts: n/2

    Wire tracks: 1

  • 7/27/2019 Arith Adder

    8/40

    8

    High-Radix Adders Each cell has more than two fan-ins

    Pros: less logic levels

    6 levels (radix-2) vs. 3 levels (radix-4) for64-bit addition

    Cons: larger delay and power in each

    cell

  • 7/27/2019 Arith Adder

    9/40

    9

    Radix-3 Sklansky & Kogge-

    Stone Adder

    David Harris, Logical Effort of Higher Valency Adders

  • 7/27/2019 Arith Adder

    10/40

    10

    Ling Adders

    iiiiii bapbag == Pre-processing:

    Post-

    processing:

    Prefix

    Computation:

    iii

    iii

    cps

    cPGc

    =

    += 0]0:1[]0:1[

    ]:1[]:[]:[

    ]:1[]:[]:[]:[

    kjjiki

    kjjijiki

    PPP

    GPGG

    =

    +=

    Prefix Ling

    * *

    [ 1:0] [ 1:0] 1( )

    i i i i i is G t G t p

    = +

    ,i i i i i i

    i i i

    g a b p a b

    t a b

    = = +

    =

    1

    *

    ]1:[1

    *

    ]1:[ , =+= iiiiiiii ppPggG

    * * * *

    [ : ] [ : ] [ 1: 1] [ 1: ]i k i j i j j k

    G G P G

    = +

    *

    ]:1[

    *

    ]:[

    *

    ]:[ kjjiki PPP =

    *

    ]0:1[1 = iii Gpc

  • 7/27/2019 Arith Adder

    11/40

    11

    An 8-bit Ling Adder

    H1

    H2

    H3

    H4

    H5

    H6

    H7

    H8

    12345678

    0 1

    si

    ai

    bi

    pi-1

    Hi

    gi

    pi-1

    pi-2

    G*i

    P*i-1

    gi-1

  • 7/27/2019 Arith Adder

    12/40

    12

    Area Model Distinguish physical placement from logical

    structure, but keep the bit-slice structure.

    Logical view Physical view

    Bit positionLogicalle

    vel

    Bit positionPhysicallevel

    Compact placement

    12345678 12345678

  • 7/27/2019 Arith Adder

    13/40

    13

    Timing Model Cell delay calculation:

    pfd +=

    Effort Delay Intrinsic Delay

    hgf =

    Logical EffortElectrical Effort = Cout/Cin= (fanouts+wirelength) / size

    Intrinsic properties of the cell

  • 7/27/2019 Arith Adder

    14/40

    14

    Power Model Total power consumption:

    Dynamic power + Static Power

    Static power: leakage current of devicePsta = *#cells

    Dynamic power: current switching capacitancePdyn = Cload

    is the switching probability = j (j is the logical level*)

    cellsCjPPP loadstadyntotal #+=+=

    * Vanichayobon S, etc, Power-speed Trade-off in Parallel Prefix Circuits

  • 7/27/2019 Arith Adder

    15/40

    15

    ILP Formulation Overview

    Structure variables:

    GP cells

    Connections (wires)

    Physical positions

    Capacitance variables:

    Gate cap

    Vertical wire cap

    Horizontal wire cap

    Timing variables:

    Input arrival time

    Output arrival time

    Power

    Objective

    ILPILOG CPLEX

    Optimal Solution

  • 7/27/2019 Arith Adder

    16/40

    16

    Integer Linear Programming(ILP)

    ILP: Linear Programming with integervariables.

    Difficulties and techniques: Constraints are not linear

    Linearize using pseudo linear constraints

    Search Space too large

    Reduce search space

    Search is slow

    Add redundant constraints to speedup

  • 7/27/2019 Arith Adder

    17/40

    17

    ILPInteger Linear Programming

    Linear Programming: linear constraints, linearobjective, fractional variables.

    Integer Linear Programming: Linear Programmingwith integer variables.

    LP Optimal

    Constraints

    ILP Optimal

  • 7/27/2019 Arith Adder

    18/40

    ILP Pseudo-Linear Constraint

    Minimize: x3Subject to: x1 300

    x2 500x3 = min(x1, x2)

    Minimize: x3Subject to: x1300

    x2500x3x1x3x2x3x1 1000 b1 (1)x3x2 1000 (1 b1) (2)

    b1 is binary

    Problem: ILP formulation:

    LP objective: 0

    ILP objective: 300

    A constraint is called pseudo-linear if its not effective

    until some integer variables are fixed.

    Pseudo-linear constraints mostly arise from IF/ELSE

    scenarios

    binary decision variables are introduced to indicate true or false.

  • 7/27/2019 Arith Adder

    19/40

  • 7/27/2019 Arith Adder

    20/40

    Interval Adjacency Constraint

    H1H2H3H4H5H6H7H8

    12345678

    (7,3): Interval [7,1]

    (3,2): Interval [3,1]

    (7,2): Interval [7,4]

    Must be adjacent,

    i.e. 4 = 3 + 1

    (column id, logic level)

  • 7/27/2019 Arith Adder

    21/40

    21

    Linearization for IntervalAdjacency Constraint

    (i, j)

    (i, h) (k1, l1) (k2, l2)

    wl wr1

    wr2

    ],[ ),(),(R

    hi

    L

    hi yy ],[ )1,1()1,1(R

    lk

    L

    lk yy ],[ )2,2()2,2(R

    lk

    L

    lk yy

    ],[),(),(

    R

    ji

    L

    ji

    yy

    11if1),(),( ==+= (i,j,k,l)wrwl(i,j,h)yyL

    lk

    R

    hi

    1if1),,,(1),(

    ),( =+= wl(i,j,h)lkjiwrkylk

    R

    hi

    11),,,(1),(

    ),( + wl(i,j,h))(nlkjiwrkylk

    R

    hi

    11),,,(1

    ),(

    ),( ++ wl(i,j,h))(nlkjiwrkylk

    R

    hi

    iyL

    ji =),(

    Linearize

    Pseudo Linear

    Left interval boundequal to column index

  • 7/27/2019 Arith Adder

    22/40

    22

    Search Space Reduction

    Lings adder:separate odd and

    even bits Double the bit-width

    we are able tosearch H1H2H3H4H5H6H7H8

    12345678

  • 7/27/2019 Arith Adder

    23/40

    23

    Redundant Constraints

    Cell (i,j)is known to have logic leveljbefore wireconnection

    Assume load is MinLoad(fanout=1 with minimum

    wire length):

    + MinLoadjP ji ),(

    Cell (i,j) has a path of lengthj-1

    Assume each cell along the path has MinLoad

    )(),( MinLoadLEPDjT ji +

  • 7/27/2019 Arith Adder

    24/40

    24

    Experiments16-bit Uniform Timing

  • 7/27/2019 Arith Adder

    25/40

    Experiments16-bit Uniform Timing

    25

  • 7/27/2019 Arith Adder

    26/40

    26

    Min-Power Radix-2 Adder(delay= 22, power = 45.5FO4 )

    1

    1

    2

    2

    3

    3

    4

    4

    5

    5

    6

    6

    7

    7

    8

    8

    9

    9

    10

    10

    11

    11

    12

    12

    13

    13

    14

    14

    15

    15

    16

    16

  • 7/27/2019 Arith Adder

    27/40

    27

    Min-Power Radix-2&4 Adder(delay=18, power = 29.75FO4 )

    1

    1

    2

    2

    3

    3

    4

    4

    5

    5

    6

    6

    7

    7

    8

    8

    9

    9

    10

    10

    11

    11

    12

    12

    13

    13

    14

    14

    15

    15

    16

    16

    Radix-2 Cell Radix-4 Cell

  • 7/27/2019 Arith Adder

    28/40

    28

    Min-Power Mixed-Radix Adder(delay=20, power = 28.0FO4)

    1

    1

    2

    2

    3

    3

    4

    4

    5

    5

    6

    6

    7

    7

    8

    8

    9

    9

    10

    10

    11

    11

    12

    12

    13

    13

    14

    14

    15

    15

    16

    16

    Radix-2 Cell Radix-4 CellRadix-3 Cell

  • 7/27/2019 Arith Adder

    29/40

    29

    Experiments16-bit Non-

    uniform Time (Mixed Radix)

    ILP is able to handle non-uniform timings

    Ling adders are most superior in increasing arrival timefaster carries

  • 7/27/2019 Arith Adder

    30/40

    Increasing Arrival Time(delay=35.5, power = 27.0FO4 )

    30

  • 7/27/2019 Arith Adder

    31/40

    Decreasing Arrival Time(delay=34.5, power = 30.5FO4)

    31

  • 7/27/2019 Arith Adder

    32/40

    Convex Arrival Time(delay=35.9, power = 32.4FO4 )

    32

  • 7/27/2019 Arith Adder

    33/40

    Increasing Required Time(delay=34.5, power = 30.5FO4)

    33

  • 7/27/2019 Arith Adder

    34/40

    Decreasing Required Time(delay=36.5, power = 32.5FO4)

    34

  • 7/27/2019 Arith Adder

    35/40

    Convex Required Time(delay=36.5, power = 32.5FO4)

    35

  • 7/27/2019 Arith Adder

    36/40

    36

    Experiments64-bit Hierarchical

    Structure (Mixed-Radix)

    Handle high bit-width applications

    16x4 and 8x8

    ILP Block ILP Block ILP Block ILP Block

    ILP Block

    a1b1a16b16a17b17a32b32a33b33a48b48a49b49a64b64

    ... ... ... ...

    Level 1

    Level 2

    ... ... ... ...

    ... ... ... ...

    GP*[64:50]

    GP*[48:34] GP*[32:18] GP*[16:2]

    GP*[1:1]

    GP*[17:17]

    GP*[33:33]GP*[49:49]

    ... ... ...H

    64H

    49H

    48H

    33H

    32H

    17H

    16H

    1

  • 7/27/2019 Arith Adder

    37/40

    37

    Experiments64-bit

    Hierarchical Structure

    TSL: a 64-bit high-radix three-stage Ling adder

    V. Oklobdzija and B. Zeydel, Energy-Delay Characteristics of CMOS Adders,

    in High-Performance Energy-Efficient Microprocessor Design, pp. 147-170, 2006

  • 7/27/2019 Arith Adder

    38/40

    38

    ASIC Implementation - Results

    64-bit hierarchical design (mixed-radix) byILP vs. fast carry look-ahead adder by

    Synopsys Design Compiler TSMC 90nm standard cell library was used

  • 7/27/2019 Arith Adder

    39/40

    39

    Future Work

    ILP formulation improvement

    Expected to handle 32 or 64 bit

    applications without hierarchical scheme

    Optimizing other computer arithmeticmodules

    Comparator, Multiplier

  • 7/27/2019 Arith Adder

    40/40

    40

    Q & A

    Thank You!