  • Institutionen för systemteknik, Department of Electrical Engineering

    Examensarbete (degree thesis)

    VHDL Implementation of a Fast Adder Tree

    Master thesis in Electronics Systems at Department of Electrical Engineering, Linköping University

    Chen Dacheng

    LiTH-ISY-EX--05/3760--SE Linköping 2005

    TEKNISKA HÖGSKOLAN, LINKÖPINGS UNIVERSITET

    Department of Electrical Engineering, Linköping University, S-581 83 Linköping, Sweden

    Linköpings tekniska högskola, Institutionen för systemteknik, 581 83 Linköping

  • VHDL Implementation of a Fast Adder Tree

    Master thesis in Electronics Systems

    at Department of Electrical Engineering,

    Linköping University

    by

    Dacheng Chen

    LITH-ISY-EX--05/3760--SE

    Supervisor: Henrik Ohlsson

    Examiner: Lars Wanhammar

    Linköping, 1 June 2005.

  • Abstract

    This thesis discusses the design and implementation of a VHDL generator for Wallace trees built from (3:2) counter modules and (2:2) counter modules, with the aim of solving the fast addition problem.

    The basic research was carried out in the MATLAB programming environment, and the VHDL file is generated automatically from the results of the MATLAB simulation. MODELSIM was used to compile and simulate the VHDL file.

  • Acknowledgement

    I would like to thank my examiner, Professor Lars Wanhammar, for offering me the opportunity to do this thesis. I also appreciate the sincere help and advice that my supervisor, Henrik Ohlsson, gave me during the thesis work.

    Thanks also to my classmate Kangmin Chen for his motivating advice on my work, and to everyone who gave me help and support.

    Last but not least, thanks to my family, Gaoyang Chen and Yan Wang, for all their encouragement and support for my studies here in Sweden.

  • VHDL Implementation of Fast adder trees 1

    TABLE OF CONTENTS

    Chapter 1 Introduction
      1.1 Motivation
      1.2 Thesis Target
      1.3 Reading Guide

    Chapter 2 Adder Structures
      2.1 Adder Structures
        2.1.1 Two's Complement Representation
        2.1.2 Fixed Time Type
        2.1.3 Variable Time Type
        2.1.4 Carry-Propagate Adder
        2.1.5 Redundant Adders
        2.1.6 Multi-operand Addition

    Chapter 3 Design Flow
      3.1 System Specification
      3.2 Related MATLAB
        3.2.1 Basic MATLAB program language
        3.2.2 Multidimensional cell array
      3.3 Design Flow Structure
      3.4 Description of Cell Arrays
        3.4.1 Counter Block
        3.4.2 Input block
        3.4.3 Output block

    Chapter 4 VHDL Generator and Top Level Simulation
      4.1 MATLAB Program to Generate VHDL Code
      4.2 VHDL Code Description
        4.2.1 Related MODELSIM and VHDL language
        4.2.2 VHDL dataflow description
        4.2.3 VHDL structural RTL description
        4.2.4 VHDL code of each level
        4.2.5 VHDL code for top level
      4.3 Simulation Result

    Chapter 5 Conclusion and Future Work
      5.1 Conclusion
      5.2 Future Work

    References

    Appendices
      Appendix 1 MATLAB program for each level (each level.m)
      Appendix 2 MATLAB program for top level (top level.m)

    INDEX OF FIGURES

    Figure 1 Ripple-carry adder
    Figure 2 Carry out of carry lookahead adder
    Figure 3 Sum of carry lookahead adder
    Figure 4 Function of carry-save adder
    Figure 5 CSA used in n bit numbers
    Figure 6 CSA computation
    Figure 7 First step of signed-digit addition
    Figure 8 Second step of signed-digit addition
    Figure 9 Signed-digit addition
    Figure 10 A [p:2] adder
    Figure 11 Reduction by rows
    Figure 12 FA and HA as (3:2) counter and (2:2) counter
    Figure 13 Example of reduction by columns
    Figure 14 System requirement
    Figure 15 Component figure
    Figure 16 Explanation of the representation
    Figure 17 Example of first level's structure of Wallace tree
    Figure 18 Design flow of program
    Figure 19 Counter block
    Figure 20 FA position in adder tree
    Figure 21 HA and BP position in adder tree
    Figure 22 Input block
    Figure 23 FA inputs state
    Figure 24 HA and BP inputs state
    Figure 25 Output block
    Figure 26 FA's sum state
    Figure 27 HA's sum state
    Figure 28 BP's output state
    Figure 29 FA's carry out state
    Figure 30 HA's carry out state
    Figure 31 MATLAB code to VHDL code
    Figure 32 MODELSIM operation
    Figure 33 VHDL code description for each level
    Figure 34 VHDL code description for top level
    Figure 35 Adder tree structure described by MODELSIM
    Figure 36 Computation result

    INDEX OF TABLES

    Table 1 CSA computation
    Table 2 Computation procedure

    Chapter 1

    Introduction

    1.1 Motivation

    Computation operations such as fast parallel multiplication using adder trees are present in many parts of a digital system or digital computer, especially in signal processing, high-speed circuits, graphics, and scientific computation. Examples include graphics processors, digital signal processors, and communication or code compression systems. Speeding up addition is therefore a very important part of speeding up computation. There are many tree structures, such as the Wallace adder tree [1], the CSA tree, and the overturned-stairs tree [2]; other kinds of adder trees are described in [3]-[7]. The Wallace tree is used as the tree structure here because it is suitable for implementation.

    1.2 Thesis Target

    The target is a MATLAB program in two parts. The first part is built from blocks, each of which contains several cell arrays. The second part generates a VHDL file; all the information it needs is stored in the cell arrays. MODELSIM is then used to compile and simulate the VHDL file created by MATLAB.

    1.3 Reading guide

    This thesis is organized in five chapters. Chapter 2 mainly discusses different adders, multi-operand addition, and fast adder trees. Chapter 3 mainly discusses basic MATLAB programming, the design flow of the program, how the program describes the structure of each level of the adder tree, and the method used to solve this problem. It also describes how to generate VHDL code automatically with this program. Chapter 4 focuses on the top-level simulation, using MODELSIM to compile and simulate. Chapter 5 gives the conclusions of this work and the future work that remains to be done. The appendices show the MATLAB code that generates the VHDL code.

    Chapter 2

    Adder Structures

    2.1 Adder Structures

    Adders are used in many contexts [11], [12]. It is generally recognized that most of the time required by an adder is due to carry propagation, so reducing the propagation time is the focus of today's techniques. Different binary adder schemes have their own characteristics, such as area and energy dissipation. No adder scheme is best under every condition, so it is important to choose one that suits the specific requirements and constraints of the context. Because this thesis does not focus on analysing the delay of different adders, only the function of some commonly used adders is given here.

    2.1.1 Two's Complement Representation

    Two's complement representation uses the most significant bit as a sign bit, making it easy to test whether an integer is positive or negative. The range of the two's complement representation is from $-2^{n-1}$ to $2^{n-1}-1$. Consider an $n$-bit integer $A$ in two's complement representation. If $A$ is positive, then the sign bit $a_{n-1}$ is zero. The remaining bits represent the magnitude of the number, in the same fashion as for sign magnitude:

    $$A = \sum_{i=0}^{n-2} 2^i a_i \quad \text{for } A \ge 0$$

    The number zero is identified as positive and therefore has a 0 sign bit and a magnitude of all 0s. The range of non-negative integers that may be represented is thus from 0 to $2^{n-1}-1$. Any larger number would require more bits.
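
    The value formula can be checked with a short Python sketch. The helper names and the LSB-first bit ordering are assumptions made for illustration. The negative weight $-2^{n-1}$ of the sign bit is the standard extension to negative numbers; the text above only spells out the non-negative case.

```python
def twos_complement_value(bits):
    """Return the integer encoded by `bits` (LSB first) in two's complement.

    bits[i] is a_i; the last element is the sign bit a_{n-1}.
    """
    n = len(bits)
    # Magnitude part: sum of 2^i * a_i for i = 0 .. n-2.
    magnitude = sum(bit << i for i, bit in enumerate(bits[:-1]))
    # The sign bit carries weight -2^(n-1); it is 0 for non-negative numbers.
    return magnitude - (bits[-1] << (n - 1))


def twos_complement_range(n):
    """Return (smallest, largest) integer representable with n bits."""
    return -(1 << (n - 1)), (1 << (n - 1)) - 1
```

    For a non-negative number the sign bit contributes nothing, so the function reduces to the sum given above.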

    2.1.2 Fixed Time Type

    The fixed-time adder scheme is the most commonly implemented. Its characteristic is that no signal indicates when the addition is completed, so the worst-case delay has to be assumed.

    2.1.3 Variable Time Type

    In contrast to the fixed-time scheme, variable-time adders have a completion signal, so the result of the addition can be used as soon as the completion signal is asserted.

    2.1.4 Carry-Propagate Adder

    Carry-propagate adders (CPA) produce the result in a conventional number system, also called a fixed-radix system. The property of a fixed-radix system is that every number has a unique representation, so no two digit sequences have the same numerical value. The digit set ranges from 0 to $r-1$, where $r$ is the radix.

    2.1.4.1 Ripple-Carry Adder

    An $n$-bit adder used to add two $n$-bit binary numbers can be built by connecting $n$ full adders in series. Each full adder represents a bit position $i$ (from 0 to $n-1$). The carry out from the full adder at position $i$ is connected to the carry in of the full adder at the next higher position $i+1$. The sum output of a full adder at position $i$, as shown in Figure 1, is given by:

    $$S_i = X_i \oplus Y_i \oplus C_i$$

    The carry output of each FA, as shown in Figure 1, is given by:

    $$C_{i+1} = X_i Y_i + X_i C_i + Y_i C_i$$

    Figure 1 Ripple-carry adder

    In the expression for the sum, $C_i$ must be generated by the full adder at the lower position $i-1$. Let $t_c$ be the delay from the inputs of the full adder to the carry output and $t_s$ the delay from the inputs to the sum output. The worst-case delay is given by

    $$T_{CRA} = (n-1)\,t_c + \max(t_c, t_s)$$

    This adder is slow for large $n$. Its main advantage is the simplicity of its cells and of the connections among them.
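
    The sum and carry equations of the ripple-carry adder can be exercised with a small Python sketch. The function names and the LSB-first list convention are assumptions made for illustration; the loop models the carry passing from position $i$ to position $i+1$.

```python
def full_adder(x, y, c):
    """One-bit full adder: returns (sum, carry_out)."""
    s = x ^ y ^ c                        # S_i = X_i xor Y_i xor C_i
    c_out = (x & y) | (x & c) | (y & c)  # C_{i+1} = X_i Y_i + X_i C_i + Y_i C_i
    return s, c_out


def ripple_carry_add(x_bits, y_bits, c_in=0):
    """Add two equal-length bit lists (LSB first); returns (sum_bits, carry_out)."""
    if len(x_bits) != len(y_bits):
        raise ValueError("operands must have the same number of bits")
    sum_bits = []
    carry = c_in
    for x, y in zip(x_bits, y_bits):
        # Each position has to wait for the carry of the position below it.
        s, carry = full_adder(x, y, carry)
        sum_bits.append(s)
    return sum_bits, carry


def worst_case_delay(n, t_c, t_s):
    """T_CRA = (n - 1) * t_c + max(t_c, t_s)."""
    return (n - 1) * t_c + max(t_c, t_s)
```

    The sequential dependence on `carry` inside the loop is what makes the delay grow linearly with $n$.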

    2.1.4.2 Carry-Lookahead Adder

    The basic idea of the carry-lookahead adder is to compute the carries simultaneously, i.e. in this type of adder all the carries in the same group are computed at the same time. The carry-lookahead adder has two functions: first all the carries are computed, then the operation $S_i = X_i \oplus Y_i \oplus C_i$ is implemented by a simple 3-input XOR gate. The design of the lookahead carry generator involves two Boolean functions named Generate and Propagate. For each pair of input bits these functions are defined as:

    $$G_i = X_i \cdot Y_i$$

    $$P_i = X_i \oplus Y_i$$

    The carry bit $C_{i+1}$ generated when adding two bits $X_i$ and $Y_i$ is '1' when the function $G_i$ is '1', or when $C_i$ is '1' and the function $P_i$ is '1' simultaneously. In the first case, the carry bit is activated by the local conditions (the values of $X_i$ and $Y_i$). In the second case, the carry bit is received from the less significant elementary addition and is propagated further to the more significant elementary addition, depending on the function $P_i$. Therefore, the carry-out bit corresponding to a pair of bits $X_i$ and $Y_i$ is computed according to the equation:

    $$C_{i+1} = G_i + P_i C_i$$

    Hence, the carry signal can be computed from the carry in, Generate and Propagate signals. For example, consider a four-bit adder:

    $$C_1 = G_0 + P_0 C_{in}$$

    $$C_2 = G_1 + P_1 G_0 + P_1 P_0 C_{in}$$

    $$C_3 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{in}$$

    $$C_4 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_{in}$$
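
    The expanded carry equations can be evaluated directly from the Generate and Propagate signals, as in the following Python sketch. The function name and LSB-first ordering are assumptions for illustration, and Propagate is taken here as the XOR of the input bits. In hardware every carry is formed in parallel; the loops below only enumerate the product terms.

```python
def lookahead_carries(x_bits, y_bits, c_in=0):
    """Return [C_1, ..., C_n] for two equal-length bit lists (LSB first)."""
    g = [x & y for x, y in zip(x_bits, y_bits)]  # Generate:  G_i = X_i Y_i
    p = [x ^ y for x, y in zip(x_bits, y_bits)]  # Propagate: P_i = X_i xor Y_i (assumed form)
    carries = []
    for i in range(len(g)):
        # C_{i+1} = G_i + P_i G_{i-1} + ... + P_i ... P_1 G_0 + P_i ... P_0 C_in
        c = g[i]
        prefix = p[i]  # running product P_i P_{i-1} ... P_{j+1}
        for j in range(i - 1, -1, -1):
            c |= prefix & g[j]
            prefix &= p[j]
        c |= prefix & c_in
        carries.append(c)
    return carries
```

    Each carry depends only on the inputs and on $C_{in}$, never on another computed carry, which is the property the four-bit equations express.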

    Figure 2 illustrates how the carry-out signals are computed.

    Figure 2 Carry out of Carry Lookahead Adder

    The sum output of each column is given in Figure 3.

    $$\mathit{sum\_out}_i = X_i \oplus Y_i \oplus \mathit{carry\_in}_i$$

    Figure 3 Sum of Carry Lookahead Adder

    If the $n$-bit input vector is divided into groups of $m$ bits and the groups are connected like a ripple-carry adder, the worst-case delay is:

    $$T_{CLA} = \frac{n}{m}\,t_{groups} + t_s$$

    This worst-case delay is less than that of the ripple-carry adder because $t_{groups}$ is smaller than $m\,t_c$. Hence the carry-lookahead adder is faster than the ripple-carry adder.

    2.1.5 Redundant Adders

    The characteristic of redundant adders is that no carry propagation is required; in other words, the addition time is independent of the number of bits of the adder. The operands are represented using a redundant digit set. The main purpose of a redundant adder is to reduce the addition time. This kind of adder has some disadvantages, however. The first is the increase in the number of bits needed to represent a number, which depends on the degree of redundancy. Another is that some operations, such as magnitude comparison or sign detection, cannot be performed directly on redundant numbers.

    2.1.5.1 Carry-Save Adder

    A carry-save adder (CSA) has the same circuit as a full adder, as shown in Figure 4.

    Figure 4 Function of Carry-save adder

    The carry-in signal is considered an input of the CSA, and the carry-out signal is considered an output of the CSA. Figure 5 shows how $n$ carry-save adders are arranged to add three $n$-bit numbers $x$, $y$, $z$ into two numbers $c$ and $s$.

    Figure 5 CSA used in n bit numbers

    Note that all the full adders in Figure 5 are independent. Figure 6 shows the CSA computation flow, and Table 1 shows how the CSA works (based on binary numbers).

    Figure 6 CSA computation

    Table 1 CSA Computation

    The computation can be divided into two steps: first we compute $S$ and $C$ using a CSA, then we use a CPA to compute the total sum. This example shows that the carry signals and the sum signals can be computed independently, yielding only two $n$-bit numbers. A CPA is used for the last step of the computation, and carry propagation exists only in that last step.
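
    The two-step computation can be sketched in Python using integers as bit vectors. The function names are illustrative assumptions; bitwise operators act on all bit positions at once, mirroring the independent full adders, and Python's `+` stands in for the final CPA.

```python
def carry_save_add(x, y, z):
    """Reduce three non-negative integers to a (sum, carry) pair.

    Every bit position is handled by an independent full adder, so no carry
    ripples between positions.
    """
    s = x ^ y ^ z                           # per-bit sum outputs
    c = ((x & y) | (x & z) | (y & z)) << 1  # per-bit carries, weighted one position up
    return s, c


def add_three(x, y, z):
    """Two-step addition: CSA first, then one carry-propagate addition."""
    s, c = carry_save_add(x, y, z)
    return s + c  # the only step in which carries propagate
```

    The pair `(s, c)` is a redundant representation of the total: many different pairs encode the same sum.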

    2.1.5.2 Signed-Digit Adder (SDA)

    Signed-digit (SD) number representation systems have been defined for any radix $r$ with digit values ranging over the set $\{-\alpha, \ldots, -1, 0, 1, \ldots, \alpha\}$, where $\alpha$ is an arbitrary integer in the range $\frac{r-1}{2} \le \alpha \le r-1$. Such number representation systems possess sufficient redundancy to break carry or borrow chains and hence allow fast propagation-free addition and subtraction. The result of the addition uses the signed-digit representation, which is a fixed-radix representation with digit values taken from a signed-integer set:

    $$x = \sum_{i=0}^{n-1} x_i r^i$$

    with the digit set $\{-\alpha, \ldots, -1, 0, 1, \ldots, \alpha\}$. The addition algorithm is not described in detail here. The objective of the SDA is to eliminate carry propagation. A signed-digit addition is performed in two steps.

    Step 1: compute the sum ($w$) and the transfer ($t$); the transfers play a role similar to the carries in a CPA.

    $$x + y = w + t$$

    At the digit level this corresponds to

    $$x_i + y_i = w_i + r\,t_{i+1}$$

    Figure 7 shows the addition of the first two bits of $n$-bit numbers.

    Figure 7 First step of Signed-digit Addition

    Step 2: compute

    $$s = w + t$$

    At the digit level

    $$s_i = w_i + t_i$$

    We can compute $s_i$ without producing a carry, as shown in Figure 8.
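
    Because the text does not give the addition algorithm in detail, the following Python sketch fills in one common choice and should be read as an assumption, not as the method used in this thesis. It uses radix $r = 4$ with $\alpha = 3$ and a simple threshold rule for selecting the transfer, which requires $(r+1)/2 \le \alpha \le r-1$. All names are hypothetical.

```python
def sd_add(x_digits, y_digits, r=4, alpha=3):
    """Add two signed-digit numbers given as equal-length digit lists (LSD first).

    Digits lie in {-alpha, ..., alpha}. Returns n + 1 result digits.
    The transfer selection rule below is an assumed, commonly used choice;
    it needs (r + 1) / 2 <= alpha <= r - 1.
    """
    n = len(x_digits)
    w = [0] * n        # interim sums w_i
    t = [0] * (n + 1)  # transfers; t[0] = 0, t[i + 1] is produced by position i
    # Step 1: x_i + y_i = w_i + r * t_{i+1}, computed independently per digit.
    for i in range(n):
        z = x_digits[i] + y_digits[i]
        if z >= alpha:
            t[i + 1] = 1
        elif z <= -alpha:
            t[i + 1] = -1
        w[i] = z - r * t[i + 1]
    # Step 2: s_i = w_i + t_i. Since |w_i| <= alpha - 1, no new transfer arises.
    s = [w[i] + t[i] for i in range(n)]
    s.append(t[n])
    return s


def sd_value(digits, r=4):
    """Value of a signed-digit number: sum of x_i * r^i."""
    return sum(d * r**i for i, d in enumerate(digits))
```

    Both loops touch each digit position once and never look more than one position down, so the addition time does not depend on the number of digits.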

    Figure 8 Second step of Signed-digit Addition

    The complete SDA structure is shown in Figure 9.

    Figure 9 Signed-digit Addition

    A. Avizienis [13] proposed a redundant binary number (a radix-2 signed-digit number). With this type of number, carry propagation is absorbed by the redundancy, so the addition time is independent of the number of digits and the addition can be executed in only two steps. More detail on how to compute it and on the representation of the operands is given in [14].

    2.1.6 Multi-operand Addition

    A common structure for adding several operands is an adder tree, such as the Wallace tree, the Dadda tree, the carry-save adder tree, and so on. In this thesis, the carry-save adder tree and the Wallace tree are used. The primitive operation performed on the input bit array is reduction, which produces an output bit array with a smaller number of bits. Two methods are used: reduction by rows and reduction by columns. The carry-save adder tree belongs to the first method and the Wallace tree to the second. Modules that reduce rows are called adders, and modules that reduce columns are called counters.

    2.1.6.1 Carry Save Adder Tree

    The carry-save adder tree can be used to add three operands in two's complement representation, producing a result expressed as the sum of two vectors. A 3-to-2 reduction is called a [3:2] adder. Using this tree, we can build a [$p$:2] adder that reduces $p$ bit-vectors to 2 bit-vectors using CSAs.

    Figure 10 A [ p :2] adder

    In Figure 10, each column holds $k$ bits and there are $p$ levels. We can use [3:2] adders to reduce the rows and obtain 2 bit-vectors. No carry propagation is required except on the last two rows, which speeds up the computation.

    Figure 11 Reduction by rows

    In Figure 11, the number of input vectors is reduced row by row. Finally, the number of levels of the CSA tree can be estimated as

    $$\text{level} \approx \frac{\log(k/2)}{\log(3/2)}$$

    where $k$ is the number of input operands.
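
    The estimate can be compared with an exact count obtained by simulating the row reduction, as in the Python sketch below. The function names are assumptions for illustration, and the simulation assumes that every complete group of three rows is fed to a [3:2] adder at each level while leftover rows pass through.

```python
import math


def csa_tree_levels(k):
    """Count the levels of [3:2] adders needed to reduce k rows to 2 rows."""
    levels = 0
    rows = k
    while rows > 2:
        # Each group of three rows becomes two; leftover rows pass through.
        rows = 2 * (rows // 3) + rows % 3
        levels += 1
    return levels


def estimated_levels(k):
    """Estimate log(k / 2) / log(3 / 2) for k input operands."""
    return math.log(k / 2) / math.log(3 / 2)
```

    Because a level can shrink the row count by a factor of at most 3/2, the exact count is never below the estimate; the two agree exactly for $k = 3$.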

    2.1.6.2 Wallace tree

    Wallace tree structures are widely used for additions with several operands. Reduction by columns is similar to reduction by rows if the number of bits in each column of the array is the same, but this is not always the case. For example, in the partial products of a multiplier, the least-significant column cannot receive bits from other columns. Reduction by columns is therefore introduced. The basic concept is to reduce the number of bits in each column at each level, so full adders and half adders are used as (3:2) counters and (2:2) counters.

    Figure 12 FA and HA as (3:2) counter and (2:2) counter

    In Figure 12, the three nodes inside the pane represent the FA's three inputs, and the two nodes outside represent the FA's carry out and sum. The half adder has two inputs, one sum, and one carry out. Here is an example used in this thesis.

    Example: $a = [2\ 3]$ means 2 bits with weight $2^1$ and 3 bits with weight $2^0$. We can use a Wallace tree, as shown in Figure 13, to achieve fast addition. The basic modules in the Wallace tree are the (3:2) counter and the (2:2) counter.

    Figure 13 Example of Reduction by Columns

    The vector changes from $a = [2\ 3]$ to $a = [1\ 2\ 1]$, so the maximum column height drops from 3 bits to 2 bits. Carry propagation delay is eliminated except for the last row. The last step uses a CPA, such as a carry-lookahead adder, to compute the sum, and fast addition is achieved. In my view, the Dadda tree is a special case of the Wallace tree in which all the bits are collected and the Wallace tree is applied with a minimum number of counters and a minimum critical path. The Wallace tree was chosen as the basic algorithm of the program for this thesis.
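
    One level of reduction by columns can be sketched in Python as follows. The function name and the allocation rule (as many full adders as fit, a half adder for two leftover bits, a pass-through for one leftover bit) are assumptions chosen to reproduce the $a = [2\ 3]$ example; they are not taken from the MATLAB program of this thesis.

```python
def reduce_columns(heights):
    """Apply one level of (3:2)/(2:2) counters to a column-height vector.

    heights[0] is the most significant column, as in a = [2 3].
    Assumed rule: use as many full adders as fit; two leftover bits feed a
    half adder, one leftover bit is passed through unchanged.
    """
    # One extra leading column receives the carries out of the top column.
    result = [0] * (len(heights) + 1)
    for col, bits in enumerate(heights):
        full, rest = divmod(bits, 3)
        half = 1 if rest == 2 else 0
        passed = 1 if rest == 1 else 0
        result[col + 1] += full + half + passed  # sums stay in the same column
        result[col] += full + half               # carries move one column up
    if result[0] == 0:
        result = result[1:]  # drop the extra column when no carry reached it
    return result
```

    Calling `reduce_columns([2, 3])` returns `[1, 2, 1]`: the full adder in the weight-$2^0$ column and the half adder in the weight-$2^1$ column each leave one sum bit in place and send one carry to the next higher column.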

    Chapter 3

    Design Flow

    3.1 System specification

    This program mainly uses the Wallace tree structure, with (3:2) counter modules and (2:2) counter modules in the adder tree, to solve the fast addition problem. The program environment is MATLAB, and the MATLAB program generates VHDL code.

    Figure 14 System requirement

    As shown in Figure 14, the input of the system is an integer vector (in MATLAB an integer vector can represent a bit array) that gives the number of bits in each column, and the output of the program is the VHDL code for the adder tree.

    3.2 Related MATLAB

    MATLAB [8] is a high-level technical computing language and interactive environment for algorithm development, data visualization, data analysis, and numerical computation. Using MATLAB, you can solve technical computing problems faster than with traditional programming languages, such as C, C++, and Fortran.

    3.2.1 Basic MATLAB program language

    The key syntax used in my program is introduced here. First, if $a$ is a vector, length($a$) gives the length of the vector. Second, matrix addition and subtraction are written as in the C language. Third, control flow such as if ... end and for loops is used: in my program the number of tree levels depends on the input vector, so control flow is needed to determine the levels and the detailed information of each level.

    3.2.2 Multidimensional cell array

    The program stores the inputs, outputs, and component names of each level. For example, as shown in Figure 15, the component name full_1_1_1 means the first full adder in column one of the first level, and in_data_1_1_1(0) means input data in column one of the first level. All these names are variable character strings, because the level, column, and bit number are all variables. This information must be recorded for the next level, so an efficient way of storing it is needed. MATLAB provides a good structure for this problem: the cell array. Multidimensional cell arrays are powerful; they can store the variable character strings and all the information needed for each level. Two- and three-dimensional cell arrays [9] are used in my program.

    Figure 15 Component figure

    Figure 16 Explanation of the representation

    Consider Figure 16(a), a full adder's name: Full_1_1_1

    First create a three-dimensional cell array named cell($n$, $m$, $p$). Then we assign the information to the cell array: $p$ identifies the level within the whole system, $m$ identifies the column within that level, and $n$ identifies the full adder within that column.

    Consider Figure 16(b), an input data name: in_data_1_1(0)

    First create a three-dimensional cell array named cell($n$, $m$, $p$). Then we assign the information to the cell array: $p$ identifies the level within the whole system, $m$ identifies the column within that level, and $n$ identifies the bit within that column.

    For all the three-dimensional cell arrays in my program, $p$ and $m$ are defined as the level and the column, which makes the information easy to retrieve.
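
    The indexing scheme can be illustrated with a small Python stand-in for a three-dimensional cell array. The class, its methods, and the order of the indices inside the generated name are all hypothetical; the thesis program uses MATLAB cell arrays, and this sketch only mirrors the (n, m, p) addressing described above.

```python
def component_name(kind, n, m, p):
    """Build a name such as 'full_1_1_1' (assumed order: index, column, level)."""
    return f"{kind}_{n}_{m}_{p}"


class CellArray3D:
    """Sparse stand-in for a MATLAB three-dimensional cell array cell(n, m, p)."""

    def __init__(self):
        self._cells = {}

    def put(self, n, m, p, value):
        self._cells[(n, m, p)] = value

    def get(self, n, m, p):
        # Unset cells read back as None (MATLAB would return an empty array).
        return self._cells.get((n, m, p))

    def column(self, m, p):
        """All entries stored for column m of level p, ordered by n."""
        keys = sorted(k for k in self._cells if k[1] == m and k[2] == p)
        return [self._cells[k] for k in keys]
```

    Fixing $m$ and $p$ and varying $n$, as `column` does, retrieves everything recorded for one column of one level, which is the lookup the next level needs.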

    3.3 Design Flow Structure

    The design flow of the program is shown in Figure 18. The first step is to compute the total number of levels in the adder tree. The second step is to compute the integer vector of each level through the Wallace tree. The third step is to compute, from the result of the second step, how many columns each level has. The fourth step is to compute the total number of counters and bypasses in each column. There are three cases: the column contains only full adders; the column contains full adders and a half adder; or the column contains full adders and a bypass. After the fourth step we know how many counters and bypasses there are in each column at each level. The fifth step uses the numbers of full adders, half adders, and bypasses in each column at each level to derive the input and output positions of the counters and of the bypasses. The sixth step is to store these positions in cell arrays, which will be used to describe the hardware connections of the Wallace tree. The following example illustrates the procedure. Example: assume the input integer vector is $a = [6\ 4\ 5\ 6]$.

    Figure 17 Example of first level's structure of Wallace tree

    Figure 17 shows the first level of the tree structure. We discuss this example starting from the third step. The third step determines that this level has 4 columns. The fourth step computes that column 1 has two full adders, column 2 has one full adder and a bypass, column 3 has a full adder and a half adder, and column 4 has two full adders. This step therefore fixes the positions of the full adders and half adders. FA_4_1 denotes a full adder in column 4 of level 1, and so on; it is stored in the full adder position cell array, and likewise for the others. The fifth step derives the input and output positions from the positions of the full adders, half adders and bypasses. For example, given FA_4_1, its inputs are In_data_4(0), In_data_4(1) and In_data_4(2), and its outputs are Out_data_4(0) (sum) and Out_data_3(0) (carry out). The sixth step stores FA_4_1 in the full adder cell array; In_data_4(0), In_data_4(1) and In_data_4(2) in the full adder input cell array; Out_data_4(0) in the full adder sum output cell array; and Out_data_3(0) in the full adder carry-out cell array. Likewise, HA_3 is stored in the half adder cell array; In_data_3(3) and In_data_3(4) (the half adder inputs in Figure 17) in the half adder input cell array; Out_data_3(3) in the half adder sum output cell array; and Out_data_2(1) in the half adder carry-out cell array. A bypass is treated as a component like a full adder, so In_data_2(3) is stored in the bypass input cell array and Out_data_2(3) in the bypass output cell array. In total there are 4 cell arrays for the full adders, 4 for the half adders and 3 for the bypasses, because a bypass has no carry out. All information about this level is now stored in separate cell arrays and can be looked up by column and level when needed.
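    The fourth step amounts to dividing each column height by three. The following minimal Python sketch (the thesis itself uses MATLAB; the function name counters_in_column is hypothetical) reproduces the counts given for a = [6 4 5 6]:

```python
def counters_in_column(bits):
    """Number of full adders, half adders and bypasses for one column."""
    full_adders, rest = divmod(bits, 3)
    return {"FA": full_adders, "HA": int(rest == 2), "BP": int(rest == 1)}

for column, bits in enumerate([6, 4, 5, 6], start=1):
    print(column, counters_in_column(bits))
```

    A remainder of two is handled by a half adder and a remainder of one by a bypass, so a column never holds both.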

    Figure 18 shows the design flow of this program.

    Figure 18 Design Flow of program

    3.4 Description of cell arrays

    In Figure 18, three blocks are defined: the counter block, the input block and the output block. The counter block groups the cell arrays that store the positions of the full adders, half adders and bypasses (a bypass is treated as a component): the full adder position cell array, the half adder position cell array and the bypass position cell array. The input block groups the cell arrays that store input information: the full adder input cell array, the half adder input cell array and the bypass input cell array. The output block groups the cell arrays that store output information: the full adder sum cell array, the full adder carry-out cell array, the half adder sum cell array, the half adder carry-out cell array and the bypass output cell array.

    3.4.1 Counter Block

    The counter block is divided into three cell arrays: the (3:2) counter (FA) cell array, the (2:2) counter (HA) cell array and the bypass cell array. The FA cell array stores the position of each FA in each column of each level. The HA cell array stores the position of the HA in each column of each level, and the bypass cell array does the same for the bypasses. Figure 19 shows the counter block.

    Figure 19 Counter Block

    First we discuss how the FAs are stored in the cell array. The current level and column are already known from the level cell array and the column cell array, so we can obtain the number of FAs in the current column and create a three-dimensional cell array FA(max, column, level), where max is the maximum number of FAs in a column at the current level. Through this cell array we can easily find the position of the FA we want to use. Example: If the input bit array is a = [6 4 5 6], the FA cell array shows (Figure 20) that there are three levels and that, in the first level, column one needs two FAs.

    Figure 20 FA position in adder tree

    Column two needs one FA, column three needs one FA, column four needs two FAs, and so on for the other two levels. When the FA information is needed, it is simply looked up in the cell array. Next come the HA and bypass cell arrays. The principle is the same as for the FA cell array; the only difference is the size of the first dimension. Each column has at most one HA or bypass, so the cell arrays are HA(1, column, level) and Bypass(1, column, level). The bypass (BP) is treated as a component because this makes its signal flow easy to trace.

    Example: For the same input bit array a = [6 4 5 6], the HA and bypass cell arrays give the positions shown in Figure 21.

    Figure 21 HA and BP position in adder tree

    From Figure 21 we know that in level one, column two has a bypass and column three has an HA. The same applies to the other two levels. Because these cell arrays share the (column, level) dimensions with the FA cell array, all component position information is easy to retrieve when generating the VHDL code.
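    The contents of the counter block for one level can be sketched as below. This is a minimal Python sketch rather than the thesis's MATLAB; the function name component_names is hypothetical, while the name patterns full_adder_<level>_<column>_<index>, half_adder_<level>_<column> and bp_<level>_<column> are the ones used in Appendix 1.

```python
def component_names(heights, level):
    """Map each column (1-based) to the component names placed in it."""
    names = {}
    for column, bits in enumerate(heights, start=1):
        n_fa, rest = divmod(bits, 3)
        in_column = [f"full_adder_{level}_{column}_{q}" for q in range(1, n_fa + 1)]
        if rest == 2:
            in_column.append(f"half_adder_{level}_{column}")
        elif rest == 1:
            in_column.append(f"bp_{level}_{column}")  # bypass treated as a component
        names[column] = in_column
    return names

for column, names in component_names([6, 4, 5, 6], level=1).items():
    print(column, names)
```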

    3.4.2 Input block

    Once the positions of the counters are defined, input signals must be assigned to each FA, HA and BP. First consider the FA input cell array, FAinput(max, column, level). Column and level are as described above. Here max is the maximum number of FAs in a column multiplied by three, because each FA has three inputs. Figure 22 shows the input block.

    Figure 22 Input Block

    Example: For the same input bit array a = [6 4 5 6], the input state is shown in Figure 23.

    Figure 23 FA inputs state

    Because column one in the first level has two FAs, as mentioned above, column one has six FA inputs. The same applies to the other columns and levels. Next consider the inputs of the HA and BP. The principle is the same as for the FA inputs; the difference is that an HA has two inputs and a BP has only one. So the cell arrays can be created as HAinput(2, column, level) and BPinput(1, column, level). Example: For the same input bit array a = [6 4 5 6], the input state obtained from the HAinput and BPinput cell arrays is shown in Figure 24.
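    The assignment of input names can be sketched as follows: the FAs take the first 3*fn bits of a column, and the HA or BP takes the bits that remain. This is a minimal Python sketch under the naming of Appendix 1; the function name input_names is hypothetical and the thesis itself uses MATLAB.

```python
def input_names(heights, level):
    """Return FA, HA and BP input names per column (1-based) for one level."""
    fa_in, ha_in, bp_in = {}, {}, {}
    for column, bits in enumerate(heights, start=1):
        n_fa, rest = divmod(bits, 3)

        def sig(q, column=column):
            return f"in_data_{level}_{column}({q})"

        fa_in[column] = [sig(q) for q in range(3 * n_fa)]       # three inputs per FA
        ha_in[column] = [sig(3 * n_fa), sig(3 * n_fa + 1)] if rest == 2 else []
        bp_in[column] = [sig(3 * n_fa)] if rest == 1 else []
    return fa_in, ha_in, bp_in

fa_in, ha_in, bp_in = input_names([6, 4, 5, 6], level=1)
print(fa_in[1])
print(ha_in[3], bp_in[2])
```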

    Figure 24 HA and BP inputs state

    3.4.3 Output block

    The program has now stored the counter and input information; the last step is to store the output state. The output block contains the FA sum cell array, the FA carry-out cell array, the HA sum cell array, the HA carry-out cell array and the BP output cell array.

    Figure 25 Output Block

    Five cell arrays are used in the output block, as shown in Figure 25. We first introduce the FA sum cell array, the HA sum cell array and the BP output cell array. These three are discussed together because their outputs stay in the original column, unlike a carry out, which goes to the next column. The cell arrays have the forms FAsum(max, column, level), HAsum(1, column, level) and BPout(1, column, level). Because the number of FAs varies from column to column within a level, the maximum number of FAs in a column of that level is used as the first dimension of the FA cell array. Each column has at most one HA or BP output, so the first dimension of those cell arrays is one. Example: For the same input bit array a = [6 4 5 6], the FA sum, HA sum and BP output states are shown in Figures 26, 27 and 28.

    Figure 26 FA's sum state

    The cell array fos stores the full adder sum outputs. Column one has two FAs in level one, so there are two corresponding outputs. The same holds for the other columns and levels.

    Figure 27 HA's sum state

    The cell array hos, shown in Figure 27, stores the HA sum outputs. From the counter block we know that the first level has only one HA, in column three, and its output is stored at the position corresponding to that HA. The same holds for the other columns and levels.

    Figure 28 BP's output state

    The cell array bpos, shown in Figure 28, stores the BP outputs. From the counter block we know that the first level has only one BP, in column two, so its output is stored at the position corresponding to that BP; the same holds for the other columns and levels. Carry bits arriving from the neighbouring column shift the positions of a column's sum outputs, so I create a cell array that stores the number of carries sent to the next column. When storing the current sum output, this cell array is used to obtain the correct position of each bit. Next we discuss the carry-out cell arrays of the FA and HA. The carry-out bits of the FAs and HAs are stored in the same way as their sum values; the only difference is the column index used in the signal name. Example: For the same input bit array a = [6 4 5 6]:

    Figure 29 FA's carry out state

    From Figure 29, for example, in the first level the carry out of column two is out_data_1_1(0). Although this bit belongs to column one because it is a carry out, it is produced by the FA of column two, so it is stored under column two. This makes it easy to retrieve all information about a counter by column and level. The same applies to the HA carry-out cell array, shown in Figure 30.
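    The output naming, including the shift caused by incoming carries, can be sketched as below. This is a minimal Python sketch that follows the indexing of Appendix 1 (carries occupy the lowest bit positions of the receiving column, sums follow); the function names name and output_names are hypothetical and the thesis itself uses MATLAB.

```python
def name(level, column, bit):
    return f"out_data_{level}_{column}({bit})"

def output_names(heights, level):
    """Output names per column (1-based), stored under the producing column."""
    table = {}
    carries_in = 0                      # carries received from column + 1
    for column in range(len(heights), 0, -1):
        n_fa, rest = divmod(heights[column - 1], 3)
        entry = {
            "fa_sum": [name(level, column, q + carries_in) for q in range(n_fa)],
            "fa_carry": [name(level, column - 1, q) for q in range(n_fa)],
            "ha_sum": None, "ha_carry": None, "bp_out": None,
        }
        if rest == 2:
            entry["ha_sum"] = name(level, column, n_fa + carries_in)
            entry["ha_carry"] = name(level, column - 1, n_fa)
        elif rest == 1:
            entry["bp_out"] = name(level, column, n_fa + carries_in)
        table[column] = entry
        carries_in = n_fa + (1 if rest == 2 else 0)
    return table

table = output_names([6, 4, 5, 6], level=1)
print(table[2]["fa_carry"], table[3]["ha_sum"], table[3]["ha_carry"])
```

    For column two of level one the sketch gives the carry out out_data_1_1(0), the same signal discussed for Figure 29.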

    Figure 30 HA's carry out state

    Together, these cell arrays describe all the hardware connections of the adder tree, so that the Wallace tree built from them performs the fast addition.

    Chapter 4

    VHDL generator and Top level simulation

    4.1 MATLAB program to generate VHDL code

    An RTL description of the adder tree is suitable for describing the component structure. A MATLAB program is used to generate the RTL VHDL code. The main technique is MATLAB file input and output. Figure 31 explains how to create a VHDL file and the generation procedure.

    Figure 31 MATLAB code to VHDL code

    The first step in Figure 31 is to create a VHDL file; 'w' means that the file is opened for writing, and fid identifies the file being used. Here eachlevel.vhdl is used as the filename. The next step is writing to the file: MATLAB's fprintf(fid, format, A, ...) writes the data A to the file using the format specified by format. The format is a C language conversion specification, built from the % character and the conversion characters d, i, o, u, x, X, f, e, E, g, G, c and s. In this thesis, %s is used because all information is stored in the cell arrays as character strings.
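    The same open, write and close pattern can be sketched in Python (the thesis itself uses MATLAB's fopen and fprintf). The function name write_header and the temporary directory are assumptions made only for this illustration; the strings written are the ones emitted by the program in Appendix 1.

```python
import os
import tempfile

def write_header(path, level):
    """Write the first lines of one level's VHDL file."""
    with open(path, "w") as fid:                        # "w": open for writing
        fid.write("library ieee;\n")
        fid.write("use ieee.std_logic_1164.all;\n\n")
        fid.write("entity tree_level_%d is\n" % level)  # %d conversion, as in fprintf
        fid.write("%s\n" % "port(")                     # %s conversion for a stored string

path = os.path.join(tempfile.mkdtemp(), "eachlevel.vhdl")
write_header(path, 1)
with open(path) as f:
    header = f.read()
print(header)
```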

    The second statement in Figure 31 declares a library, and the third statement makes the std_logic_1164 package available through a use clause. The generated code must conform to VHDL syntax. The information stored in the cell arrays is used to represent the adder tree structure.

    4.2 VHDL code Description

    Because structural VHDL is used, the Wallace adder tree description is divided into one description per level and a top level. Each level records its current state, such as the number of counters and their positions, as well as the input and output state of each counter. The top level integrates all these levels to produce the final result of the fast adder tree.

    4.2.1 Related MODELSIM and VHDL language

    MODELSIM is a quick and handy VHDL/Verilog simulator. The VHDL code must be compiled into a VHDL library before it is simulated; the simulator itself cannot read VHDL source code. The procedure is shown in Figure 32:

    Figure 32 MODELSIM operation

    4.2.2 VHDL dataflow description

    In the data flow approach, circuits are described by indicating how the inputs and outputs of built in primitive components (for example AND gates) are connected together. In other words we describe how signals (data) flow through the circuit.
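    As an illustration of the dataflow view, the (3:2) and (2:2) counters can be modelled by the boolean equations that connect their inputs to their outputs. This is a minimal Python sketch that assumes the standard full adder and half adder equations; the thesis's own counter VHDL files are not reproduced in this section.

```python
from itertools import product

def full_adder(a, b, c):
    """(3:2) counter: three input bits, one sum bit and one carry bit."""
    s = a ^ b ^ c
    cout = (a & b) | (a & c) | (b & c)
    return s, cout

def half_adder(a, b):
    """(2:2) counter: two input bits, one sum bit and one carry bit."""
    return a ^ b, a & b

# Both counters preserve the arithmetic value of their inputs.
for a, b, c in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, c)
    assert a + b + c == s + 2 * cout
for a, b in product((0, 1), repeat=2):
    s, cout = half_adder(a, b)
    assert a + b == s + 2 * cout
```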

    4.2.3 VHDL structural RTL description

    A structural description [10] of a piece of hardware is a description of what its subcomponents are and how the subcomponents are connected to each other. Structural description is more concrete than behavioral description; that is the correspondence between a given portion of a structural description and a portion of the hardware is easier to see than for a behavioral description.

    4.2.3.1 Building Blocks

    If we want to make the design more understandable and maintainable, a system design should be decomposed into several blocks. These blocks are connected together to form a complete design. Every part of a VHDL design is considered as a block. A VHDL design may be completely described in a single block, or it may be decomposed into several blocks. Each block in VHDL is analogous to an off-the-shelf part and is called an entity. The entity describes the interface to that block and a separate part associated with the entity describes how that block operates. The interface description is like a pin description in a data book, specifying the inputs and outputs of the block. The description of the operation of the block is like a schematic.

    4.2.3.2 Connect Block

    Once we have defined the basic building blocks of our design using entities and their associated architectures, we can combine them to form the system. In my work, the top level is formed by connecting the levels. Each level is a basic block, and the connect block integrates those blocks to form the top level structure.

    4.2.4 VHDL code of each level

    The number of levels depends on the input bit array. Once the number of levels is known, a structural VHDL description is used to describe the adder tree. The VHDL code is generated automatically by the MATLAB program. Each level is described in structural VHDL, in which the ports and their connections are the central matter. The port information of each level is obtained from the counter, input and output cell array blocks. Example: This is a VHDL file generated by the MATLAB code, assuming the bit array a = [2 3]. Figure 33 shows the detailed information.

    Figure 33 VHDL code description for each level

    4.2.5 VHDL code for top level

    The top level integrates all the levels, using the connect block of structural VHDL. The inputs of the first level are the inputs of the top level and the outputs of the last level are the outputs of the top level; the signals between the levels are internal signals. The outputs of one level are used as the inputs of the next level. Example: Bit array a = [6 4 5 6]. The VHDL code is shown in Figure 34.

    Figure 34 VHDL code description for top level

    4.3 Simulation result

    We now have four types of VHDL files: the structural VHDL of each level, the structural VHDL of the top level, and the dataflow VHDL files of the (3:2) counter and the (2:2) counter. MODELSIM is used to compile and simulate the files, which gives the final result of the fast adder tree. MODELSIM also shows the hardware connections of the adder tree more clearly, as shown in Figure 35.

    Figure 35 Adder tree structure described by MODELSIM

    Example: Assume input bit array is a =[12 11 14 9 5]. The computation result is shown in Figure 36.

    Figure 36 Computation result

    From Figure 35, this Wallace adder tree has five levels. Let us consider the simulation result. The sum value at the output must be equal to the sum value at the input.

    Input, a = [12 11 14 9 5]:

    Weight   Bits              Width     Number of ones
    2^4      111111110011      12 bits   10
    2^3      11110011111       11 bits   9
    2^2      11010110101111    14 bits   10
    2^1      111110000         9 bits    5
    2^0      11010             5 bits    3

    Sum = 10*2^4 + 9*2^3 + 10*2^2 + 5*2^1 + 3*2^0 = 285

    Output:

    Weight   Bits   Width    Number of ones
    2^7      11     2 bits   2
    2^6      00     2 bits   0
    2^5      00     2 bits   0
    2^4      10     2 bits   1
    2^3      1      1 bit    1
    2^2      1      1 bit    1
    2^1      0      1 bit    0
    2^0      1      1 bit    1

    Sum = 2*2^7 + 0*2^6 + 0*2^5 + 1*2^4 + 1*2^3 + 1*2^2 + 0*2^1 + 1*2^0 = 285

    Table 2 Computation procedure

    From Table 2, the input and output sums are equal, and no column of the output bit array holds more than two bits, so carry propagation is required only at the output.
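    The check can be reproduced with a small bit-level model. This is a minimal Python sketch, not the MODELSIM simulation of the thesis: it assumes the standard counter equations and the reduction order of Appendix 1, and the names reduce_bits and value are hypothetical. The individual output bits depend on which input bits feed which counter, so only the invariant sum and the final column heights are compared.

```python
def full_adder(a, b, c):
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)

def half_adder(a, b):
    return a ^ b, a & b

def reduce_bits(columns):
    """One tree level; columns[0] is the most significant column."""
    out = [[] for _ in columns]
    carries = []                            # carries arriving from the right
    for k in range(len(columns) - 1, -1, -1):
        bits = columns[k]
        out[k].extend(carries)              # carries take the lowest positions
        carries = []
        n_fa, rest = divmod(len(bits), 3)
        for q in range(n_fa):
            s, c = full_adder(*bits[3 * q:3 * q + 3])
            out[k].append(s)
            carries.append(c)
        if rest == 2:
            s, c = half_adder(bits[-2], bits[-1])
            out[k].append(s)
            carries.append(c)
        elif rest == 1:
            out[k].append(bits[-1])         # bypass
    if carries:
        out.insert(0, carries)
    return out

def value(columns):
    n = len(columns)
    return sum(sum(col) << (n - 1 - k) for k, col in enumerate(columns))

columns = [[int(ch) for ch in s] for s in
           ("111111110011", "11110011111", "11010110101111", "111110000", "11010")]
total = value(columns)
levels = 0
while max(len(col) for col in columns) > 2:
    columns = reduce_bits(columns)
    levels += 1
    assert value(columns) == total          # every level preserves the sum
print(total, levels, [len(col) for col in columns])
# 285 5 [2, 2, 2, 2, 1, 1, 1, 1]
```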

    Chapter 5

    Conclusion and Future Work

    5.1 Conclusion

    In this work, the implementation of a fast adder tree in VHDL was considered. For inputs of large word length, a Wallace tree was used to perform the addition quickly, and VHDL files describing the hardware connections of the Wallace adder tree were generated.

    5.2 Future work

    Three programs are written in MATLAB: one stores the current state of each level, and the other two use that program to automatically generate the VHDL files for each level and for the top level. Future work is to add pipelining to each level and to consider the delay of each level. Furthermore, the Wallace adder tree structure may be replaced by another structure because of its irregular routing and large wiring area.

    References

    [1] C. S. Wallace, "A suggestion for a fast multiplier," IEEE Trans. Electron. Comput., pp. 14-17, Feb. 1964.
    [2] Z.-J. Mou, "'Overturned-Stairs' Adder Trees and Multiplier Design," F. IEEE Computer Society, http://csdl.computer.org/comp/trans/tc/1992/08/t0940abs.htm
    [3] A. Weinberger and J. L. Smith, "A logic for high-speed addition," National Bureau of Standards Circular 591, pp. 3-12, 1958.
    [4] T. Lynch and E. E. Swartzlander, Jr., "The redundant cell adder," in Proc. 10th Symp. Comput. Arithmetic, 1991, pp. 165-170.
    [5] V. Kantabutra, "Designing one-level carry-skip adders," IEEE Trans. Comput., vol. 42, no. 6, pp. 759-764, June 1993.
    [6] A. Weinberger and J. L. Smith, "A logic for high-speed addition," National Bureau of Standards Circular 591, pp. 3-12, 1958.
    [7] Y. Harata et al., "A high-speed multiplier using a redundant binary adder tree," IEEE J. Solid-State Circuits, vol. SC-22, pp. 28-34, Feb. 1987.
    [8] http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_prog/
    [9] E. Herniter, Programming in MATLAB, Northern Arizona University, 2001.
    [10] R. Lipsett, VHDL: Hardware Description and Design, Intermetrics Inc., 1993.
    [11] Peter Kornerup, Southern Danish University, IEEE Computer Society, http://csdl.computer.org/comp/proceedings/asap/2002/1712/00/17120218abs.htm
    [12] F. Ancona, S. Rovetta, R. Zumino, "High performance in tree-based parallel architectures," Genoa Univ., IEEE Computer Society, Hungary, p. 474.
    [13] A. Avizienis, "Signed-digit number representations for fast parallel arithmetic," IRE Transactions on Electronic Computers, vol. EC-10, pp. 389-400, Sep. 1961.
    [14] I. Koren, Computer Arithmetic Algorithms, Englewood Cliffs, N.J.: Prentice Hall, 1993.

    Appendices

    Appendix 1 MATLAB program for each level

    clear
    a = [6 4 5 6]; % input bit array; any bit array may be used (here the example from Section 3.3)
    % show full adder's outputs, divided into two cell arrays, one stores the output
    % sum data name, another stores the carry_out data name.(fos)(foc)
    % show half adder's outputs, divided into two cell arrays, one stores the output
    % sum data name, another stores the carry_out data name.(hos)(hoc)
    % another cell array stores the BP's output data name (bpos)
    % (cncolumn) is a cell array to show the total carry_out values to next
    % column in each level
    % define total levels
    final=a;
    model=a;
    result=a;
    page=0;
    while max(result)>2
      a=result;
      result = zeros(1, length(a));
      carry_out = 0;
      for k =length(a):-1:1 % k columns in vector
        b=a(k);% get the number of each column
        rem (b,3);% get rem of b/3
        c=fix(b./3);% # of full adder
        d=b-3.*fix(b./3);% remainder
        carry_in=carry_out;
        if d==0
          carry_out_HA=0;
          sum=0;
        elseif d==1
          carry_out_HA=0;
          sum=1;
        else

          carry_out_HA=1;
          sum=1;
        end
        carry_out = carry_out_HA + c;
        if k==length(a)
          totalsum=sum+c;
        else
          totalsum=c+carry_in+sum;
        end
        result(k)=totalsum;
      end
      if carry_out~=0
        result = [ carry_out, result ];
      end
      page=page+1;
      disp(result)
      a1=result;
    end
    level=page;
    rowcon=cell(1,level+1);
    a=model;
    % define a vector
    result1 = zeros(page, length( result))
    result = a;
    page = 1;
    while max(result)>2
      a=result;
      result = zeros(1, length(a));
      carry_out = 0;
      for k =length(a):-1:1 % k columnes in vector
        b=a(k);% get the number of each column
        rem (b,3);
        c=fix(b./3);

        d=b-3.*fix(b./3);
        carry_in=carry_out;
        if d==0
          carry_out_HA=0;
          sum=0;
        elseif d==1
          carry_out_HA=0;
          sum=1;
        else
          carry_out_HA=1;
          sum=1;
        end
        carry_out = carry_out_HA + c;
        if k==length(a)
          totalsum=sum+c;
        else
          totalsum=c+carry_in+sum;
        end
        result(k)=totalsum; % result= [result, totalsum]
      end
      if carry_out~=0
        result = [ carry_out, result ];
      end
      for intI = length(result):-1:1
        result1 ( page, intI ) = result(intI)
      end
      page = page + 1;
    end
    Fresult=result1;
    page1=level;
    page1=page1+1;
    lcolumn=cell(1,page1);
    for i=1:1:page1
      if i==1
        a=model;

        vcolumn=length(a);
      else
        a=Fresult(i-1,:);
        marix=a;
        for k=length(a):-1:1
          b=a(k);
          if b==0
            a(:,k)=[];
          else
            matrix=a;
          end
        end
        vcolumn=length(a);
      end
      lcolumn(1,i)={vcolumn};
    end
    page=level;
    page=page+1;
    tcolumns=0;
    carfull=cell(1,tcolumns, page);
    carhalf=cell(1,tcolumns, page);
    carbp =cell(1,tcolumns, page);
    for i=1:1:page
      if i==1
        a=model;
        tcolumns=lcolumn{1,1};
        % carfull=cell(1,tcolumns, 1);
        tnFA=0;
        tnHA=0;
        tnBP=0;
        for k1=tcolumns:-1:1
          b=a(k1);% get the number of each column

          rem (b,3);% get rem of b/3
          c=fix(b./3);% # of full adder
          d=b-3.*fix(b./3);% remainder
          if d==0
            nFA=c; nHA=0; nBP=0;
            temptnHA=nHA; temptnFA=nFA; temptnBP=nBP;
          elseif d==1
            nFA=c; nHA=0; nBP=1;
            temptnHA=nHA; temptnFA=nFA; temptnBP=nBP;
          else
            nFA=c; nHA=1; nBP=0;
            temptnHA=nHA; temptnFA=nFA; temptnBP=nBP;
          end
          carfull(1,k1,1)={temptnFA};
          carhalf(1,k1,1)={temptnHA};
          carbp(1,k1,1)={temptnBP};
        end
      else
        a=Fresult(i-1,:);
        tcolumns=lcolumn{1,i};
        tnFA=0;
        tnHA=0;
        tnBP=0;
        for k1=1:1:tcolumns

          b=a(k1);% get the number of each column
          rem (b,3);% get rem of b/3
          c=fix(b./3);% # of full adder
          d=b-3.*fix(b./3);% remainder
          if d==0
            nFA=c; nHA=0; nBP=0;
            temptnHA=nHA; temptnFA=nFA; temptnBP=nBP;
          elseif d==1
            nFA=c; nHA=0; nBP=1;
            temptnHA=nHA; temptnFA=nFA; temptnBP=nBP;
          else
            nFA=c; nHA=1; nBP=0;
            temptnHA=nHA; temptnFA=nFA; temptnBP=nBP;
          end
          carfull(1,k1,i)={temptnFA};
          carhalf(1,k1,i)={temptnHA};
          carbp(1,k1,i)={temptnBP};
        end
      end
    end
    % define maxf for show full_adder 's name
    page=level;
    b=[];

    maxf=cell(1,page);
    for i =1:1:page
      column=lcolumn{1,i};
      for k=1:1:column
        b=carfull{1,k,i};
        comp(1,k)=b;
        maxn=max(comp);
        maxf(1,i)={maxn};
      end
    end
    page=level;
    b=[];
    maxg=cell(1,page);
    for i =1:1:page
      column=lcolumn{1,i};
      for k=1:1:column
        b=carfull{1,k,i};
        comp(1,k)=b;
        maxn=max(comp);
        maxg(1,i)={3*maxn};
      end
    end
    % show all full adders name
    page=level;
    max=0;
    tcolumnf=0;
    c=1;
    funame=cell(max,tcolumnf,page);
    for i =1:1:page
      tcolumnf=lcolumn{1,i};
      max= maxf{1,i};
      for k1=1:1:tcolumnf
        fn=carfull{1,k1,i};
        c=1;
        for q=1:1:fn

          for p=c:1:fn
            funame(p,k1,i)={ strcat( 'full_adder_', num2str(i), '_', num2str(k1),'_',num2str(q) ) };
          end
          c=c+1;
        end
      end
    end
    % show all half adder's name
    page=level;
    tcolumnh=0;
    haname=cell(1,tcolumnh,page);
    for i=1:1:page
      tcolumnh=lcolumn{1,i};
      for k=1:1:tcolumnh
        hn=carhalf{1,k,i};
        if hn==1;
          haname(1,k,i)={ strcat( 'half_adder_', num2str(i), '_', num2str(k) ) };
        else
          haname(1,k,i)={[] };
        end
      end
    end
    % show BP 's name, although BP have no component, benefit for seek it's
    % input and output
    page=level;
    tcolumnbp=0;
    bpname=cell(1,tcolumnbp,page);
    for i=1:1:page
      tcolumnbp=lcolumn{1,i};
      for k=1:1:tcolumnbp

        bpn=carbp{1,k,i};
        if bpn==1;
          bpname(1,k,i)={ strcat( 'bp_', num2str(i), '_', num2str(k) ) };
        else
          bpname(1,k,i)={[] };
        end
      end
    end
    %store the full adder's input names of each level
    page=level;
    tcolumnfd=0;
    maxz=0;
    c=1;
    fuinput=cell(maxz, tcolumnfd, page);
    for i=1:1:page
      tcolumnfd=lcolumn{1,i};
      maxz= maxg{1,i};
      for k1=1:1:tcolumnfd
        fn=carfull{1,k1,i};
        c=1;
        for q=0:1:3*fn-1
          for p=c:1:3*fn
            fuinput(p,k1,i)={ strcat( 'in_data_', num2str(i), '_', num2str(k1),'(',num2str(q), ')') };
          end
          c=c+1;
        end
      end
    end
    %store the half adder's input names of each level
    page=level;
    tcolumnhd=0;

    hainput=cell(2, tcolumnhd, page);
    for i=1:1:page
      tcolumnhd=lcolumn{1,i};
      for k=1:1:tcolumnhd
        fn=carfull{1,k,i};
        hn=carhalf{1,k,i};
        if fn==0
          if hn~=0
            hainput(1,k,i)={ strcat( 'in_data_', num2str(i), '_', num2str(k),'(',num2str(0), ')') };
            hainput(2,k,i)={ strcat( 'in_data_', num2str(i), '_', num2str(k),'(',num2str(1), ')') };
          else
            hainput(1,k,i)={[]};
            hainput(2,k,i)={[]};
          end
        else
          if hn~=0
            b=3*fn;
            c=3*fn+1;
            hainput(1,k,i)={ strcat( 'in_data_', num2str(i), '_', num2str(k),'(',num2str(b), ')') };
            hainput(2,k,i)={ strcat( 'in_data_', num2str(i), '_', num2str(k),'(',num2str(c), ')') };
          else
            hainput(1,k,i)={[]};
            hainput(2,k,i)={[]};
          end

        end
      end
    end
    % show BP's input's names in each level although BP is not a component, we
    % look it as a component
    page=level;
    tcolumnbd=0;
    bpinput=cell(1, tcolumnbd, page)
    for i=1:1:page
      tcolumnbd=lcolumn{1,i};
      for k=1:1:tcolumnbd
        fn=carfull{1,k,i};
        bpn=carbp{1,k,i};
        if fn==0
          if bpn~=0
            bpinput(1,k,i)={ strcat( 'in_data_', num2str(i), '_', num2str(k),'(',num2str(0), ')') };
          else
            bpinput(1,k,i)={[]};
          end
        else
          if bpn~=0
            b=3*fn;
            bpinput(1,k,i)={ strcat( 'in_data_', num2str(i), '_', num2str(k),'(',num2str(b), ')') };
          else

            bpinput(1,k,i)={[]};
          end
        end
      end
    end
    page=level;
    max=0;
    foscolumn=0;
    hoscolumn=0;
    bposcolumn=0;
    foccolumn=0;
    hoccolumn=0;
    cncolumn=0;
    fos=cell(max, foscolumn ,page );
    hos=cell(1,hoscolumn,page);
    bpos=cell(1,bposcolumn, page);
    foc=cell(max,foccolumn,page);
    hoc=cell(1,hoccolumn,page);
    carrynum=cell(1,cncolumn,page);
    for i=1:1:page
      foscolumn=lcolumn{1,i};
      hoscolumn=lcolumn{1,i};
      bposcolumn=lcolumn{1,i};
      foccolumn=lcolumn{1,i};
      hoccolumn=lcolumn{1,i};
      cncolumn=lcolumn(1,i);
      scolumn=foscolumn;

      max=maxf{1,i};
      for k=scolumn:-1:1
        m=scolumn;
        if k==m
          fn=carfull{1,k,i};
          hn=carhalf{1,k,i};
          bpn=carbp{1,k,i};
          if (fn==0) & (hn==0) & (bpn~=0)
            bpos(1,k,i)= { strcat( 'out_data_', num2str(i), '_', num2str(k),'(',num2str(0), ')') };
            carrynum(1,k,i)={0};
          elseif (fn==0) & (hn~=0) & (bpn==0)
            hos(1,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k),'(',num2str(0), ')') };
            hoc(1,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k-1),'(',num2str(0), ')') };
            carrynum(1,k,i)={1};
          elseif (fn~=0) & (hn==0) & (bpn~=0)
            c=1; % show fos(1,k,i) and foc(1,k,i)
            for q=0:1:fn-1
              for p=c:1:fn
                fos(p,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k),'(',num2str(q), ')') };
                foc(p,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k-1),'(',num2str(q), ')') };

              end
              c=c+1;
            end
            bpos(1,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k),'(',num2str(fn), ')') };
            carrynum(1,k,i)={fn};
          elseif (fn~=0) & (hn==0) & (bpn==0)
            c=1; % show fos(1,k,i) and foc(1,k,i)
            for q=0:1:fn-1
              for p=c:1:fn
                fos(p,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k),'(',num2str(q), ')') };
                foc(p,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k-1),'(',num2str(q), ')') };
              end
              c=c+1;
            end
            carrynum(1,k,i)={fn};
          else % (fn~=0) & (hn~=0) & (bpn==0)
            c=1; % show fos(1,k,i) and foc(1,k,i)
            for q=0:1:fn-1
              for p=c:1:fn
                fos(p,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k),'(',num2str(q), ')') };
                foc(p,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k-1),'(',num2str(q), ')') };
              end
              c=c+1;
            end

            hos(1,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k),'(',num2str(fn), ')') };
            hoc(1,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k-1),'(',num2str(fn), ')') };
            carrynum(1,k,i)={fn+1};
          end
        else
          fn=carfull{1,k,i};
          hn=carhalf{1,k,i};
          bpn=carbp{1,k,i};
          if (fn==0) & (hn~=0)
            hoc(1,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k-1),'(',num2str(0), ')') };
            carrynum(1,k,i)={1};
          elseif (fn~=0) & (hn==0)
            c=1; % show fos(1,k,i) and foc(1,k,i)
            for q=0:1:fn-1
              for p=c:1:fn
                foc(p,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k-1),'(',num2str(q), ')') };
              end
              c=c+1;
            end
            carrynum(1,k,i)={fn};
          elseif (fn==0)& (hn==0)
            carrynum(1,k,i)={0};

          else % (fn~=0) & (hn~=0)
            c=1; % show fos(1,k,i) and foc(1,k,i)
            for q=0:1:fn-1
              for p=c:1:fn
                foc(p,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k-1),'(',num2str(q), ')') };
              end
              c=c+1;
            end
            hoc(1,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k-1),'(',num2str(fn), ')') };
            carrynum(1,k,i)={fn+1};
          end
          % second show fos and hos and bpos k=scolumn-1:-1:1
          fn=carfull{1,k,i};
          hn=carhalf{1,k,i};
          bpn=carbp{1,k,i};
          cn=carrynum{1,k+1,i};
          if (fn==0) & (hn==0) & (bpn~=0)
            bpos(1,k,i)= { strcat( 'out_data_', num2str(i), '_', num2str(k),'(',num2str(cn), ')') };
          elseif (fn==0) & (hn~=0) & (bpn==0)

            hos(1,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k),'(',num2str(cn), ')') };
          elseif (fn~=0) & (hn==0) & (bpn~=0)
            c=1;
            for q=0:1:fn-1
              for p=c:1:fn
                fos(p,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k),'(',num2str(q+cn), ')') };
              end
              c=c+1;
            end
            bpos(1,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k),'(',num2str(fn+cn), ')') };
          elseif (fn~=0) & (hn==0) & (bpn==0)
            c=1;
            for q=0:1:fn-1
              for p=c:1:fn
                fos(p,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k),'(',num2str(q+cn), ')') };

              end
              c=c+1;
            end
          else % (fn~=0) & (hn~=0) & (bpn==0)
            c=1;
            for q=0:1:fn-1
              for p=c:1:fn
                fos(p,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k),'(',num2str(q+cn), ')') };
              end
              c=c+1;
            end
            hos(1,k,i)={ strcat( 'out_data_', num2str(i), '_', num2str(k),'(',num2str(fn+cn), ')') };
          end
        end
      end
    end
    disp(funame)
    disp(haname)
    disp(bpname)
    disp(fuinput)
    disp(hainput)
    disp(bpinput)
    disp(fos);
    disp(hos);

    disp(bpos);
    disp(foc);
    disp(hoc);
    % In total 11 cell arrays are needed to store the input, output and component
    % information:
    % full_adder's name is stored in cell array ======> funame.
    % full_adder's name is like full_adder_4_3_2: 4 means level four, 3 means
    % column three, 2 means the second full_adder in this column.
    % half_adder's name is stored in cell array ======> haname
    % half_adder's name is like half_adder_4_3: level 4 and third column
    % Bypass is not a component, but for indexing it is treated as one and stored in cell
    % array =====> bpname; bp_1_3 means level 1 and column 3
    % full_adder's input data: in_data_1_1(1) means level 1, column 1, second input ======> fuinput
    % half_adder's input data: in_data_1_1(1) means level 1, column 1, second input ======> hainput
    % Bypass's input data: in_data_1_1(1) means level 1, column 1, second input ======> bpinput
    % output data is divided into five cell arrays:
    % full_adder's sum information ====> fos, full_adder's carry_out information =====> foc
    % half_adder's sum information ======> hos, half_adder's carry_out information ============> hoc
    % Bypass's output information ==============> bpos.
    page=level;
    a=final;
    fid = fopen('eachlevel.vhdl', 'w');
    for i=1:1:page
      if i==1
        fprintf(fid,'library ieee; \n');
        fprintf(fid,' \n');

        fprintf(fid,'use ieee.std_logic_1164.all; \n');
        fprintf(fid,' \n');
        fprintf(fid, 'entity tree_level_%d is \n',i);
        fprintf(fid, 'port( \n');
        for k=length(a):-1:1 % level 1 input
          fprintf(fid, 'in_data_%d_%d : in std_logic_vector(%d downto 0);\n',i, k, a(k)-1);
        end
        result = zeros(1, length(a));
        carry_out = 0;
        for k =length(a):-1:1 % k columnes in vector
          b=a(k);% get the number of each column
          rem (b,3);% get rem of b/3
          c=fix(b./3);% # of full adder
          d=b-3.*fix(b./3);% remainder
          carry_in=carry_out;
          if d==0
            carry_out_HA=0;
            sum=0;
          elseif d==1
            carry_out_HA=0;
            sum=1;
          else
            carry_out_HA=1;
            sum=1;
          end
          carry_out = carry_out_HA + c;
          if k==length(a)
            totalsum=sum+c;
          else
            totalsum=c+carry_in+sum;


                end
                result(k)=totalsum; % result= [result, totalsum]
            end
            if carry_out~=0
                result = [ carry_out, result ];
            end
            c=lcolumn{1,1};
            d=lcolumn{1,2};
            % level 1 output
            if d==c
                for m=length(result):-1:1
                    if m==1
                        fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, result(m)-1);
                    else
                        fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, result(m)-1);
                    end
                end
            else
                for m=length(result)-1:-1:0 % level output
                    if m==0
                        fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, result(m+1)-1);
                    else
                        fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, result(m+1)-1);
                    end
                end
            end
            fprintf(fid,'); \n');


            fprintf(fid,'end tree_level_%d; \n',i);
            fprintf(fid,' \n');
            fprintf(fid,'architecture tree_level_%d of tree_level_%d is \n',i,i);
            fprintf(fid,' \n');
            fprintf(fid,'component full_adder \n');
            fprintf(fid,'port( \n');
            fprintf(fid,'ain, bin, cin : in std_logic; \n');
            fprintf(fid,'sout,cout : out std_logic ); \n');
            fprintf(fid,'end component; \n');
            fprintf(fid,' \n');
            fprintf(fid,'component half_adder \n');
            fprintf(fid,'port( \n');
            fprintf(fid,'ain, bin : in std_logic; \n');
            fprintf(fid,'sout,cout: out std_logic ); \n');
            fprintf(fid,'end component; \n');
            fprintf(fid,' \n');
            fprintf(fid,'begin \n');
            % define the relation between each adder and its inputs and outputs
            column=lcolumn{1,i}; % how many columns in level 1
            for k=column:-1:1
                fn=carfull{1,k,i};
                hn=carhalf{1,k,i};
                bpn=carbp{1,k,i};
                if fn~=0
                    for m=1:1:fn
                        fprintf(fid,'%s : full_adder port map ( %s,%s,%s,%s,%s); \n', ...
                            funame{m,k,i}, fuinput{3*m-2,k,i}, fuinput{3*m-1,k,i}, ...
                            fuinput{3*m,k,i}, fos{m,k,i}, foc{m,k,i} );
                    end
                else
                    fprintf(fid,' \n');
                end


                if hn~=0;
                    fprintf(fid,'%s : half_adder port map ( %s,%s,%s,%s); \n', ...
                        haname{1,k,i}, hainput{1,k,i}, hainput{2,k,i}, hos{1,k,i}, hoc{1,k,i} );
                else
                    fprintf(fid,' \n');
                end
                if bpn~=0;
                    fprintf(fid,'%s


                else
                    a=matrix;
                end
            end
            for k1=length(a):-1:1
                fprintf(fid, 'in_data_%d_%d : in std_logic_vector(%d downto 0);\n',i, k1, a(k1)-1);
            end
            a=Fresult(i,:);
            matrix=a;
            for k=length(matrix):-1:1
                b=matrix(k);
                if b==0
                    matrix(:,k)=[];
                    a=matrix;
                else
                    a=matrix;
                end
            end
            e=i+1;
            d=lcolumn{1,e};
            c=lcolumn{1,e-1};
            if d==c
                for m=length(a):-1:1 % level output


                    if m==1
                        fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, a(m)-1);
                    else
                        fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, a(m)-1);
                    end
                end
            else
                for m=length(a)-1:-1:0 % level output
                    if m==0
                        fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, a(m+1)-1);
                    else
                        fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, a(m+1)-1);
                    end
                end
            end
            fprintf(fid,'); \n');
            fprintf(fid,'end tree_level_%d; \n',i);
            fprintf(fid,'\n');
            %% show architecture
            fprintf(fid,'architecture tree_level_%d of tree_level_%d is \n',i,i);
            fprintf(fid,'\n');
            fprintf(fid,'component full_adder \n');
            fprintf(fid,'port( \n');
            fprintf(fid,'ain, bin, cin : in std_logic; \n');
            fprintf(fid,'sout, cout : out std_logic ); \n');
            fprintf(fid,'end component; \n');
            fprintf(fid,'\n');
            fprintf(fid,'component half_adder \n');
            fprintf(fid,'port( \n');


            fprintf(fid,'ain, bin : in std_logic; \n');
            fprintf(fid,'sout, cout : out std_logic ); \n');
            fprintf(fid,'end component; \n');
            fprintf(fid,'\n');
            fprintf(fid,'begin \n');
            % set the relation between each adder and its inputs and outputs
            column=lcolumn{1,i}; % how many columns in this level
            for k=column:-1:1
                fn=carfull{1,k,i};
                hn=carhalf{1,k,i};
                bpn=carbp{1,k,i};
                if fn~=0
                    for m=1:1:fn
                        fprintf(fid,'%s : full_adder port map ( %s,%s,%s,%s,%s); \n', ...
                            funame{m,k,i}, fuinput{3*m-2,k,i}, fuinput{3*m-1,k,i}, ...
                            fuinput{3*m,k,i}, fos{m,k,i}, foc{m,k,i} );
                    end
                else
                    fprintf(fid,' \n');
                end
                if hn~=0;
                    fprintf(fid,'%s : half_adder port map ( %s,%s,%s,%s); \n', ...
                        haname{1,k,i}, hainput{1,k,i}, hainput{2,k,i}, hos{1,k,i}, hoc{1,k,i} );
                else
                    fprintf(fid,' \n');
                end
                if bpn~=0;
                    fprintf(fid,'%s
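
    The column-height bookkeeping that the level 1 branch of the listing performs before it writes the output ports can be looked at in isolation. The following is a minimal sketch of that step only, not a replacement for the generator. The variable names (heights, next, n_full, n_half, n_sum) and the example input are assumptions introduced here for illustration; the listing itself uses a, result, c, carry_out_HA and sum, the last of which shadows the MATLAB built-in of the same name.

```matlab
% Column heights of the current level, most significant column first.
% These example values are an assumption chosen for illustration.
heights = [3 4 5 2];

next = zeros(1, length(heights));
carry_out = 0;
for k = length(heights):-1:1       % start at the least significant column
    b = heights(k);
    n_full = fix(b / 3);           % (3:2) counters used in this column
    r = b - 3 * n_full;            % bits left over: 0, 1 or 2
    carry_in = carry_out;          % carries arriving from the column to the right
    n_half = double(r == 2);       % two leftover bits go through one (2:2) counter
    n_sum = double(r > 0);         % one leftover bit is bypassed; two give one sum bit
    carry_out = n_full + n_half;   % carries sent on to the next column
    next(k) = n_full + n_sum + carry_in;
end
if carry_out ~= 0
    next = [carry_out, next];      % carries out of the top column open a new column
end
disp(next);
```

    For the assumed input the 14 bits are reduced to 11, one bit fewer for each of the three (3:2) counters, while the (2:2) counters leave the bit count unchanged.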


    Appendix 2 MATLAB program for top level

    page=level;
    fid = fopen('toplevel.vhdl', 'w');
    fprintf(fid,'library ieee; \n');
    fprintf(fid,' \n');
    fprintf(fid,'use ieee.std_logic_1164.all; \n');
    fprintf(fid,' \n');
    fprintf(fid, 'entity top_level is \n');
    fprintf(fid, 'port( \n');
    for k=length(a):-1:1 % level 1 input
        fprintf(fid, 'in_data_%d_%d : in std_logic_vector(%d downto 0);\n',1, k, a(k)-1);
    end
    % last level output
    i=page;
    a=Fresult(i,:);
    matrix=a;
    for k=length(matrix):-1:1
        b=matrix(k);
        if b==0
            matrix(:,k)=[];
            a=matrix;
        else
            a=matrix;
        end
    end


    e=i+1;
    d=lcolumn{1,e};
    c=lcolumn{1,e-1};
    if d==c
        for m=length(a):-1:1 % level output
            if m==1
                fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, a(m)-1);
            else
                fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, a(m)-1);
            end
        end
    else
        for m=length(a)-1:-1:0 % level output
            if m==0
                fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, a(m+1)-1);
            else
                fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, a(m+1)-1);
            end
        end
    end
    fprintf(fid,'); \n');
    fprintf(fid,'end top_level; \n');
    fprintf(fid,' \n');
    fprintf(fid,'architecture top_level of top_level is \n');
    fprintf(fid,' \n');
    fprintf(fid,'component full_adder \n');
    fprintf(fid,'port( \n');
    fprintf(fid,'ain, bin, cin : in std_logic; \n');


    fprintf(fid,'cout, sout : out std_logic ); \n');
    fprintf(fid,'end component; \n');
    fprintf(fid,' \n');
    fprintf(fid,'component half_adder \n');
    fprintf(fid,'port( \n');
    fprintf(fid,'ain, bin : in std_logic; \n');
    fprintf(fid,'cout, sout: out std_logic ); \n');
    fprintf(fid,'end component; \n');
    fprintf(fid,' \n');
    for i=1:1:page
        fprintf(fid,'component tree_level_%d \n',i);
        fprintf(fid, 'port( \n');
        if i==1;
            a=final;
            for k =length(a):-1:1 % level 1 input
                fprintf(fid, 'in_data_%d_%d : in std_logic_vector(%d downto 0);\n',i, k, a(k)-1);
            end
            result = zeros(1, length(a));
            carry_out = 0;
            for k =length(a):-1:1 % k columns in vector
                b=a(k); % get the number of each column
                rem (b,3); % get rem of b/3
                c=fix(b./3); % # of full adder
                d=b-3.*fix(b./3); % remainder
                carry_in=carry_out;
                if d==0
                    carry_out_HA=0;
                    sum=0;
                elseif d==1
                    carry_out_HA=0;
                    sum=1;
                else


                    carry_out_HA=1;
                    sum=1;
                end
                carry_out = carry_out_HA + c;
                if k==length(a)
                    totalsum=sum+c;
                else
                    totalsum=c+carry_in+sum;
                end
                result(k)=totalsum;
            end
            if carry_out~=0
                result = [ carry_out, result ];
            end
            c=lcolumn{1,1};
            d=lcolumn{1,2};
            % level 1 output
            if d==c
                for m=length(result):-1:1
                    if m==1
                        fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, result(m)-1);
                    else
                        fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, result(m)-1);
                    end
                end
            else


                for m=length(result)-1:-1:0 % level output
                    if m==0
                        fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, result(m+1)-1);
                    else
                        fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, result(m+1)-1);
                    end
                end
            end
            fprintf(fid,' \n');
        else
            a=Fresult(i-1,:);
            matrix=a;
            for k=length(matrix):-1:1
                b=matrix(k);
                if b==0
                    matrix(:,k)=[];
                    a=matrix;
                else
                    a=matrix;
                end
            end
            for k1=length(a):-1:1
                fprintf(fid, 'in_data_%d_%d : in std_logic_vector(%d downto 0);\n',i, k1, a(k1)-1);
            end
            a=Fresult(i,:);
            % show outputs
