A Parallel Decimal Multiplier Using Hybrid Binary Coded ... · PDF file A Parallel Decimal...

Click here to load reader

  • date post

    30-May-2020
  • Category

    Documents

  • view

    9
  • download

    0

Embed Size (px)

Transcript of A Parallel Decimal Multiplier Using Hybrid Binary Coded ... · PDF file A Parallel Decimal...

  • A Parallel Decimal Multiplier Using Hybrid Binary Coded Decimal (BCD) Codes

    Xiaoping Cui, Weiqiang Liu* and Wenwen Dong

    College of Electronic and Information Engineering,

    Nanjing University of Aeronautics and Astronautics, Nanjing, China

    Fabrizio Lombardi

    Department of Electrical and Computer Engineering,

    Northeastern University, Boston, USA

  • Outline

    The Proposed Partial Product Tree

    Motivation

    Review of BCD Representations and Decimal Multiplier

    2

    The Proposed Partial Product Tree

    Evaluation and Comparison

    Conclusions

  • Motivation

    Why Decimal Arithmetic is Needed?

    � Binary arithmetic introduces conversion and rounding errors

    � Decimal arithmetic is highly demanded in many applications

    (financial, commercial and so on) that cannot tolerant errors.

    3

    (financial, commercial and so on) that cannot tolerant errors.

    � Decimal specification has been added to the revised IEEE

    754-2008 standard.

    � High performance decimal arithmetic circuits are required.

  • Introduction

    � Partial Production Generation

    Sign Digit (SD) Radix-10 recoding

    Redundant BCD excess-3 (XS-3)

    Overloaded decimal digit set (ODDS) code

    Double BCD recoding

    Generation of Multiplier

    5X 4X 3X 2X 1X

    XS-3 digits in [-3,12]

    Selection of Multiples (MUX-5)

    PP[0] … PP[k] … PP[d-1] PP[d]

    ODDS digits in [0,15]

    Radix-10

    recoder

    4d

    X(BCD)

    4d

    Y(BCD)

    6d

    .

    .

    .

    .

    .

    .Yb0

    YbK

    Ybd-1 6

    6

    6

    PPG

    Block

    4(d+1) 4(d+1) 4(d+1) 4(d+1) 4d

    4(d+1) 4(d+1) 4(d+1) 4d ... ...

    � Partial Product Compression

    Decimal 3:2 CSA

    Binary compressor

    � Final Decimal Adder

    Parallel prefix/carry select adder

    4

    (d+1):2 PPR tree

    ODDS digits in [0,15]

    A(BCD XS-6) B(BCD-8421)

    BCD Adder (2d digits)

    8d 8d

    8d

    P(BCD)

    [7] A. Vazquez, E. Antelo, and J. Bruguera, “Fast Radix-10 Multiplication

    Using Redundant BCD codes”, IEEE Transactions on Computers, vol. 63,

    no. 8, pp. 1902–1914, Aug. 2014.

  • Partial Production Generation

    � SD Radix-10 recoding scheme SD Radix-10 Recoding

    yi,3yi,2yi,1yi,0 y5i y4i y3iy2iy1i (ysi-1=0)

    ysi y5i y4i y3iy2iy1i (ysi-1=1)

    ysi

    0000 00000 0 00001 0

    0001 00001 0 00010 0 1

    5&& 5

    1 5&& 5

    i i i Y Y Y

    Y Y Y

    <

  • Partial Production Generation

    Redundant BCD Codes

    6

  • XS-3 Recoding (Redundant Odds [0, 15])

    Partial Production Generation

    N*Xd-1+3…………… N*Xi+3 N*Xi-1+3 ……… N*X0+3

    4444

    X0Xi-1XiXd-1

    D0T0Di-1Ti-1 Ti-2DiTiDd-1 Td-2Td-1

    ……………… ……………

    Digit-set [0,9]

    STEP 1: Digit Mappings

    [3,9N+3]

    Carry-out

    X(BCD)

    4d

    ╳5╳5 ╳4 ╳3 ╳2 Xi+3

    5X 4X 3X 2X 1X

    Digits in XS-3[-3,12]

    4(d+1) 4(d+1) 4(d+1) 4(d+1) 4d

    +

    D0T0Di-1Ti-1 Ti-2DiTiDd-1 Td-2Td-1

    ++

    444

    ……… 4…

    NX0NXi-1NXiNXd-1

    Carry-out

    STEP 2: Carry assimilation

    [-3,12]

    Digits in XS-3[-3,12]

    MUX-5

    4 4 4 4 4

    5Xi 4Xi 3Xi 2Xi 1Xi

    1 1 1 1

    1 1 1 1

    SD Radix-10 1digit encoding

    Y5kY4k Y3kY2kY1k

    4

    Yk(BCD)

    Ysk Ysk-1

    Ysk

    Ysk*

    Digit in XS-3[-3,12]

    Digits in ODDS[0,15]

    Convert the XS-3 digits to ODDS by

    adding pre-computed correction term:

    fc(16)=1032 +

    07407407407407417037037037037037

    Advantage of XS-3 Codes: difficult

    multiples (such as 3X) can be obtained in

    a carry-free manner

    7

  • Decimal PP Compression

    Using ODDS Generation of Multiplier

    5X 4X 3X 2X 1X

    XS-3 digits in [-3,12]

    Selection of Multiples (MUX-5)

    PP[0] … PP[k] … PP[d-1] PP[d]

    Radix-10

    recoder

    4d

    X(BCD)

    4d

    Y(BCD)

    6d

    .

    .

    .

    .

    .

    .Yb0

    YbK

    Ybd-1 6

    6

    6

    PPG

    Block

    4(d+1) 4(d+1) 4(d+1) 4(d+1) 4d

    Partial Product Compression

    The (d+1:2) PP Reduction (PPR):

    (1) A regular binary CSA tree

    (2) A binary counter is used to count PP[0] … PP[k] … PP[d-1] PP[d]

    (d+1):2 PPR tree

    ODDS digits in [0,15]

    A(BCD XS-6) B(BCD-8421)

    BCD Adder (2d digits)

    Yb0

    4(d+1) 4(d+1) 4(d+1) 4d ... ...

    8d 8d

    8d

    P(BCD)

    (3) The ODDS partial products in

    (1) and (2) are added by the

    binary CSA tree and the decimal

    digit 3:2 compressor

    (2) A binary counter is used to count

    carries generated between the digit

    columns in the binary CSA tree

    8

  • � Decimal 3:2 CSADecimal PP Compression

    Based on BCD-4221/5211

    ci,jbi,jai,j

    Partial Product (PP) Compression

    s i,jhi,j

    9

  • Proposed Design

    A

    B

    P

    Partial

    product compression

    Partial

    product

    generation

    Final

    decimal

    adder

    New Design of PPR

    Tree Block

    10

  • Proposed Design: A New PPR Tree

    Proposed PPR

    (reduction) Tree

    (d+1):2 Binary

    CSA Tree

    (ODDS to 4221)

    BCD-4221 Sum

    Correction Block

    (Decimal Counter)

    A Decimal Digit

    3:2 Compressor

    (BCD-4221)

    11

  • A PPR Tree FOR 16*16-digit multiplier

    17:2

    Binary

    PPR tree

    8-bit

    counter

    3-bit counter

    3-bit

    counter

    ·

    ·

    · 4

    18

    1

    3

    3

    2

    PP i[0] PP i[k] PPi[16]

    . . . . . .

    Ci-1[0]

    ·

    ·

    ·

    ·

    ·

    ·

    3

    1

    ui,3; ui,2; ui,1

    vi,1

    Ci[0]

    ·

    ·

    ·

    Ci[7]

    Ci[8]

    Ci[9]

    Ci[10]

    Ci[11]

    C [12]

    Ci-1[K]

    Ci-1[13]

    4

    2

    BCD-4221 Sum

    Correction Block

    p1 p2 p3 p4

    cout2 cin

    p1 p2 p3 p4

    cout2 cin

    p1 p2 p3 p4

    cout2 cin

    p1 p2 p3 p4

    cout2 cin

    +6

    c [13] ci-1[13]

    � The No. of PP rows in the 1st,

    2nd, 3rd and 4th stages are 17, 9,

    6 and 4, respectively.

    counter

    6:2 Decimal PPR Tree

    44

    2

    4221

    4221

    2*4221

    x6

    x6

    x6

    4221 4221 4221 4221 4*4221 55

    1

    1

    3

    1

    4 4

    Bi Ai

    zi,1

    Ci[12]

    Ci[13]

    Hi(4221) Si(4221)

    x2 x1

    ui-1,3; ui-1,2; ui-1,1

    vi-1,1

    zi-1,1

    cout2 cin

    cout1 sum

    cout2 cin

    cout1 sum

    cout2 cin

    cout1 sum

    cout2 cin

    cout1 sum

    (2) (1)(4) (2)(8) (4)(16) (8)

    si,3ci,3

    ci[13]

    si,2ci,2 si,1ci,1 si,0ci,0

    ci-1[13]

    12

    � 4-bit binary 4:2 compressor

    in last compression stage

  • the Proposed PPR Tree

    3:2 3:2HA

    3:23:2

    ui,3ui,2ui,1ui-1,1 ui,0ui,0ui-1,3ui-1,2

    3:2

    3:23:2

    C[4] C[0] C[5] C[6] C[7]C[1] C[2] C[3]

    C[8]C[9]C[10] C[11]C[12]C[13]

    4 4 4 4

    ui,3 1 ui,2

    1 ui,1

    1 ui,0

    1 vi,1

    1 vi,0

    1

    vi,0vi,0vi,1vi-1,1

    zi,1 1

    zi,0 1

    zi,0zi,0zi,1zi-1,1

    BCD-4221 Sum

    Correction Block

    F

    8-bit counter

    3-bit counter

    BCD-4221 8-bit and

    3-bit counter correction

    � The 8-bit BCD-4221 counter is faster

    than a binary counter (only two 3:2 CSA

    delay).

    � 3-bit counters are used to generate a

    BCD-4221 decimal correction digit by 3:2

    3:2

    x2

    3:2

    3:2

    x2

    x2

    42214221

    x2

    x2

    4 4

    6:2 Decimal

    PPR Tree Block

    Ai(4221) Bi(4*4221)

    Hi(4221) Si(4221)

    4 4 x2 x1

    F

    F

    F

    6:2 decimal PPR tree block

    13

    BCD-4221 decimal correction digit by

    using only one 3:2 compressor.

    � To balance the paths in the decimal 6:2

    PPR tree and reduce the critical path.

  • Using Hybrid (Multiple) BCD Codes

    4*4221

    4221

    BCD-4221

    PPR Tree

    BCD-

    8421

    PPG

    XS-3

    2*4221

    4221

    ODDS