
EE382N

High Speed Computer Arithmetic

Fall 2006

Project Report

A Newton Raphson Divider Based on Improved Reciprocal Approximation Algorithm

Gaurav Agrawal

Ankit Khandelwal

Submitted On

Dec 4, 2006


Abstract

Newton Raphson Functional Approximation is an attractive division strategy as it

provides quadratic convergence and can be faster than digit recurrence methods if an

accurate initial approximation is available. In this project, we study and simulate several

table-lookup based initial approximation methods. Of particular interest is the Taylor

Series based reciprocal approximation method which uses a table lookup followed by a

multiplication for initial approximation and can provide a very accurate approximation

with a very small ROM size. We implemented and simulated a 24 bit divider based on

various methods published in literature and also proposed an improvement that retains

accuracy while using a much smaller ROM.


Table of Contents

1. Motivation
2. Problem Statement
3. Background
   3.1 The Division Problem
   3.2 Classification of Division Algorithms
       3.2.1 Digit Recurrence Algorithms (Slow Division)
       3.2.2 Functional Approximation Algorithms (Fast Division)
   3.3 Initial Approximation Techniques
       3.3.1 Linear Approximation
       3.3.2 Direct Table Lookup
       3.3.3 Table Lookup followed by Multiplication
4. Related Work
5. Design Implementation
6. Results
7. Conclusion
8. References
Appendix A: MATLAB Code
Appendix B: Table of ROM Values

1. Motivation

Floating point performance is a key determinant of overall performance for several applications, including those in the scientific, graphics and DSP domains.

High speed floating point hardware is a requirement to meet the ever increasing

computational demands of these applications. Modern applications comprise several

floating point operations including addition, multiplication, and division. In recent FPUs,

emphasis has been placed on designing ever faster adders and multipliers, with division

receiving less attention. Typically, the range for addition latency is two to four cycles,

and the range for multiplication is two to eight cycles. In contrast, the latency for double

precision division in modern FPUs ranges from less than eight cycles to over 60 cycles. A

common perception of division is that it is an infrequent operation whose implementation

need not receive high priority. However, it has been argued that ignoring its

implementation can result in significant system performance degradation for many

applications.

2. Problem Statement

In Newton-Raphson functional approximation based division algorithms, an accurate initial reciprocal approximation is highly desirable as it enables quick convergence to the final result. The problem studied in this project is that of determining the reciprocal approximation with high accuracy while using less area.

3. Background

3.1 The Division Problem:

The problem of arithmetic division can be formulated as below:

$$Q = \frac{N}{D}$$

Where:

Q = Quotient

N = Numerator (Dividend)

D = Denominator (Divisor)

In this project N and D are assumed to be of the form (as would be the case for the mantissa of a normalized floating point number)

$$N = 1.x_1 x_2 x_3 \ldots x_k$$

$$D = 1.y_1 y_2 y_3 \ldots y_k$$

3.2 Classification of division algorithms:

The division techniques suitable for VLSI implementation can be divided into two broad

categories:

3.2.1 Digit Recurrence Algorithms (Slow Division):

Digit recurrence algorithms use subtractive methods to calculate quotients one digit per

iteration. The basic recurrence relation used in these algorithms is as given below:

$$P_{j+1} = r P_j - q_{n-(j+1)} D$$

Where

$P_j$ = the partial remainder of the division

$r$ = the radix

$q_{n-(j+1)}$ = the digit of the quotient in position $n-(j+1)$, where the digit positions are numbered from least-significant 0 to most-significant $(n-1)$

$n$ = number of digits in the quotient

$D$ = the denominator
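As a rough illustration of the recurrence above, the following MATLAB sketch retires one quotient bit per iteration for radix r = 2 (restoring division). The operand values, the variable names and the choice of starting from P_0 = N/r are assumptions made for this example; they are not taken from any of the designs studied in this report.

```matlab
% Radix-2 restoring division sketch: P_{j+1} = r*P_j - q*D.
% N, D and n are illustrative values, not parameters from the report.
N = 1.6875;          % dividend in [1, 2)
D = 1.25;            % divisor  in [1, 2)
n = 24;              % number of quotient digits to retire
r = 2;               % radix

P = N / r;           % initial partial remainder, guaranteed to be < D
Q = 0;
for j = 0:n-1
    shifted = r * P;              % r * P_j
    q = double(shifted >= D);     % quotient digit (0 or 1)
    P = shifted - q * D;          % subtract D only when it fits
    Q = Q + q * 2^(-j);           % digits are produced MSB first
end
fprintf('Q = %.8f, N/D = %.8f\n', Q, N / D);
```

Each pass through the loop adds exactly one bit to Q, which is the linear convergence that motivates the multiplicative methods discussed next.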

Various techniques using digit-recurrence algorithms can be classified as below:

(i) Restoring Division
    a. Performing Restoring Division
    b. Non-Performing Restoring Division
(ii) Non Restoring Division
(iii) Radix-r SRT Division

3.2.2 Functional Approximation Algorithms (Fast Division):

Unlike digit recurrence division, division by functional iteration utilizes multiplication as

the fundamental operation. The primary difficulty with subtractive division is the linear

convergence to the quotient. Multiplicative division algorithms are able to take

advantage of high-speed multipliers to converge to a result quadratically. Rather than

retiring a fixed number of quotient bits in every cycle, multiplication-based algorithms

are able to double the number of correct quotient bits in every iteration. However, the

tradeoff between the two classes is not only latency in terms of the number of iterations,

but also the length of each iteration in cycles. Additionally, if the divider shares an

existing multiplier, the performance ramifications on regular multiplication operations

must be considered. It has been reported that in typical floating point applications, the

performance degradation due to a shared multiplier is small. Accordingly, if area must be

minimized, an existing multiplier may be shared with the division unit with only minimal

system performance degradation.


(i) Goldschmidt's Algorithm:

Goldschmidt's algorithm uses a series expansion to converge to the quotient. The strategy is to repeatedly multiply both the dividend and the divisor by a factor $R(i)$ so that the divisor converges to 1 while the dividend converges to the quotient Q.

$$Q = \frac{N \cdot R(0) R(1) R(2) \ldots R(K)}{D \cdot R(0) R(1) R(2) \ldots R(K)}$$

As $D \cdot R(0) R(1) R(2) \ldots R(K)$ converges to 1, $N \cdot R(0) R(1) R(2) \ldots R(K)$ converges to Q.
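The following MATLAB sketch illustrates this convergence. It assumes the commonly used factor R(i) = 2 - D_i, where D_i is the current value of the scaled divisor; the report does not state which factor is used, so this choice, the operand values and the variable names are assumptions.

```matlab
% Goldschmidt iteration sketch with the assumed factor R(i) = 2 - D_i.
N = 1.6875;          % illustrative dividend in [1, 2)
D = 1.25;            % illustrative divisor  in [1, 2)

num = N;             % running value of N*R(0)*R(1)*...
den = D;             % running value of D*R(0)*R(1)*...
for i = 0:5
    R   = 2 - den;   % assumed choice of R(i)
    num = num * R;
    den = den * R;
    fprintf('i=%d  numerator=%.12f  denominator=%.12f\n', i, num, den);
end
```

With this factor the distance of the denominator from 1 is squared in every step, so the numerator approaches N/D quadratically.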

(ii) Newton Raphson Division:

Newton Raphson iteration is a well-known iterative method to approximate the root of a non-linear function. Let $f(x)$ be a well behaved function and let $r$ be a root of the equation $f(x) = 0$. We start with $x_0$, which is a good estimate of $r$, and let $r = x_0 + h$. The number $h$ measures how far the estimate $x_0$ is from the truth. Since $h$ is small, the linear approximation can be used to conclude that

$$0 = f(r) = f(x_0 + h) \approx f(x_0) + h f'(x_0)$$

And therefore, unless $f'(x_0)$ is close to 0,

$$h \approx -\frac{f(x_0)}{f'(x_0)}$$

It follows that

$$r = x_0 + h \approx x_0 - \frac{f(x_0)}{f'(x_0)}$$

Our new improved estimate $x_1$ of $r$ is therefore given by

$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}$$

Continue in this way. If $x_i$ is the current estimate, then the next estimate $x_{i+1}$ is given by:

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \qquad (1)$$

The equation obtained above is called the Newton Raphson formula. In order to compute

the reciprocal, the following function and its derivative are used:

$$f(x) = \frac{1}{x} - X \qquad (2)$$

$$f'(x) = -\frac{1}{x^2} \qquad (3)$$

Substituting equations (2) and (3) into (1) yields

$$x_{i+1} = 2 x_i - X x_i^2 \qquad (4)$$

It can also be written as:

$$x_{i+1} = x_i (2 - X x_i) \qquad (5)$$

Above equations can be implemented in hardware in order to double the accuracy in each

iteration. Using the form in equation (4), one square, one multiplication, one shift and one

subtraction are required for computation of 1+ix .

Error Analysis:

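Let $\varepsilon_i = \frac{1}{X} - x_i$ be the error at the $i$-th iteration, then:

$$\varepsilon_{i+1} = \frac{1}{X} - x_{i+1} = \frac{1}{X} - x_i (2 - X x_i)$$

This can also be expressed as:

$$\varepsilon_{i+1} = X \left( \frac{1}{X} - x_i \right)^2 = X \varepsilon_i^2$$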

The above equation clearly shows that the absolute error decays quadratically in each

iteration.
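The quadratic decay can be checked numerically with a small MATLAB sketch. The value of X, the starting estimate and the number of iterations are illustrative assumptions; the loop applies the update in the form of equation (5) and prints the measured error next to the value predicted by the error relation above.

```matlab
% Newton Raphson reciprocal sketch: x_{i+1} = x_i * (2 - X * x_i).
X = 1.25;            % number whose reciprocal is wanted (illustrative)
x = 0.75;            % initial estimate x_0 of 1/X (illustrative)
for i = 1:5
    errPrev = 1/X - x;               % error before the update
    x       = x * (2 - X * x);       % one Newton Raphson step
    errNow  = 1/X - x;               % error after the update
    fprintf('i=%d  error=%.3e  predicted X*err^2=%.3e\n', ...
            i, errNow, X * errPrev^2);
end
```

The two printed columns should agree until the error reaches the rounding level of double precision arithmetic.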

3.3 Initial Approximation Techniques

Quadratic convergence techniques like Newton-Raphson require an initial approximation on which they iterate to improve the accuracy of the final result. The number of iterations required depends upon the accuracy of the initial approximation. Reducing the number of iterations not only decreases the area of the design but also reduces its delay and power. Thus it is desirable to have as accurate an initial approximation as possible with as small an area increment as possible. Various techniques are available to calculate the initial approximation and some of them are explained below.

3.3.1 Linear Approximation

This method is one of the simplest approaches used for calculating the initial approximation. It uses the equation $X_0 = 2.914 - 2D$, and can be easily implemented using an adder. But this approach does not provide a good initial approximation and hence is rarely used in real world designs.

3.3.2 Direct Table Look Up

This method uses a ROM, as a look up table, to calculate the initial approximation. The M most significant bits, excluding the leading 1, of the mantissa are used as the address bits for the table look up. The values stored in the table are calculated using the equation

$$D_{stored} = \frac{1}{D' + 2^{-(M+1)}} + 2^{-(M+2)}$$

Where $D' = [1.d_1 d_2 \ldots d_M]$ and (M+1) is the accuracy in bits desired in the initial approximation. The ROM size required by this approach is $2^M \times (M+1)$ bits.
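A small MATLAB sketch of such a table is given below. It follows the stored-value formula used by the rom function of the Design 3 listing in Appendix A; the small value M = 4, the word width W = M + 1 and the 12-bit test grid are assumptions chosen only to keep the example short.

```matlab
% Direct table look-up sketch (illustrative sizes, not the report's design).
M = 4;                                 % address bits
W = M + 1;                             % stored word width in fraction bits
addr = 0:2^M-1;                        % all ROM addresses
Dp   = 1 + addr / 2^M;                 % D' = 1.d1 d2 ... dM for each address
romTable = floor((1 ./ (Dp + 2^(-M-1)) + 2^(-M-2)) * 2^W) / 2^W;

D    = 1 + (0:2^12-1) / 2^12;          % test operands in [1, 2)
idx  = floor((D - 1) * 2^M) + 1;       % MATLAB indices start at 1
x0   = romTable(idx);                  % initial approximation of 1/D
worst = max(abs(1 ./ D - x0));
fprintf('ROM size: %d bits, worst-case accuracy: %.2f bits\n', ...
        2^M * W, -log2(worst));
```

Because the number of entries doubles with every additional address bit, this approach becomes expensive quickly when a more accurate starting value is needed.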

3.3.3 Table Look Up Followed By Multiplication

The ROM values are obtained by performing the Taylor series expansion of the reciprocal function. A Taylor series expansion for a general function $f(x)$ around point $a$ is given by

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n$$

So in order to obtain the Taylor series expansion of the reciprocal function $D^{-1}$, the operand D is split into two parts such that,

$$D_{m1} = 1.d_1 d_2 \ldots d_m, \qquad D_{m2} = [.d_{m+1} d_{m+2} \ldots d_k] \times 2^{-m}$$

and

$$D = D_{m1} + D_{m2} \qquad (1)$$

$D_{m2}$ can further be represented as $D_{m2} = d \times 2^{-m-1}$.

Substituting this value of $D_{m2}$ in equation (1) gives,

$$D = D_{m1} + d \times 2^{-m-1}$$

Expanding the Taylor series for $D^{-1}$ around $d = 1$ and taking the first two terms gives the following equation,

$$
\begin{aligned}
D^{-1} &\approx (D_{m1} + 2^{-m-1})^{-1} - (d - 1)\, 2^{-m-1}\, (D_{m1} + 2^{-m-1})^{-2} \\
       &= (D_{m1} + 2^{-m-1})^{-1} - (d\, 2^{-m-1} - 2^{-m-1})\, (D_{m1} + 2^{-m-1})^{-2} \\
       &= (D_{m1} + 2^{-m-1})^{-1} - (D_{m2} - 2^{-m-1})\, (D_{m1} + 2^{-m-1})^{-2} \\
       &= (D_{m1} + 2^{-m-1})^{-2}\, [(D_{m1} + 2^{-m-1}) - (D_{m2} - 2^{-m-1})] \\
       &= (D_{m1} + 2^{-m-1})^{-2}\, [D_{m1} + 2^{-m} - D_{m2}]
\end{aligned}
$$

The first term, $(D_{m1} + 2^{-m-1})^{-2}$, of this equation is read from the ROM using the m address bits of $D_{m1}$ excluding the leading 1. The second term, $D_{m1} + 2^{-m} - D_{m2}$, which can be represented as $1.d_1 d_2 \ldots d_m \tilde{d}_{m+1} \tilde{d}_{m+2} \ldots \tilde{d}_k$, is obtained from the operand modifier. The operand modifier keeps the first $(m+1)$ bits (including the leading 1) intact and inverts the rest of the bits to obtain the final output.

Multiplication of these terms provides an initial approximation of the inverse of denominator D, whose accuracy is $2m$ bits. The corresponding ROM size is $2^m \times (2m+2)$ bits. For example, if an accuracy of 14 bits is desired in the initial approximation, then a ROM of size $2^6 \times (2 \times 6 + 2) = 896$ bits needs to be designed. Since the ROM output is only $(2m+2)$ bits accurate, the output bit accuracy obtained from the operand modifier can be reduced to $(2m+2)$ bits. Finally, a multiplier of size $(2m+2) \times (2m+2)$ bits could be used and its output could be finally rounded off to $(2m+2)$ bits. This would help in reducing the area and the power consumption of the design.

In order to keep the ROM size minimum, the value of m needs to be determined

carefully as the ROM size depends exponentially on m .
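The two-term approximation can be tried out with the MATLAB sketch below. It evaluates the ROM term and the modified operand in full double precision, so it shows only the error of the truncated Taylor series and not the effect of finite word widths. The value m = 6 matches the example above, while k = 16 fraction bits and all variable names are assumptions chosen to keep the exhaustive sweep small.

```matlab
% Table look-up followed by multiplication: unquantized sketch.
m = 6;                                  % ROM address bits
k = 16;                                 % fraction bits of D (assumed here)
D   = 1 + (0:2^k-1) / 2^k;              % every k-bit operand in [1, 2)
Dm1 = floor(D * 2^m) / 2^m;             % leading 1 and first m fraction bits
Dm2 = D - Dm1;                          % remaining low-order bits

romTerm  = (Dm1 + 2^(-m-1)).^(-2);      % value the ROM would hold
modExact = Dm1 + 2^(-m) - Dm2;          % second term of the expansion

% Operand modifier: keep the first m+1 bits, invert the remaining bits.
lowBits  = round(Dm2 * 2^k);            % integer value of d_(m+1)..d_k
modInv   = Dm1 + ((2^(k-m) - 1) - lowBits) / 2^k;

x0  = romTerm .* modExact;              % initial approximation of 1/D
err = x0 - 1 ./ D;                      % signed error of the approximation
fprintf('worst-case accuracy      : %.2f bits\n', -log2(max(abs(err))));
fprintf('largest signed error     : %.3e\n', max(err));
fprintf('exact minus inverted term: %.3e\n', max(modExact - modInv));
```

The last line shows that plain bit inversion differs from the exact second term by one unit in the last place of D, which is why inverters are sufficient for the operand modifier.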


4. Related Work:

Rich literature exists describing various implementations of Newton Raphson reciprocal

approximation based dividers. Most of these implementations differ in the method used

to get the initial estimate of reciprocal of denominator and in the tradeoffs between speed

and area. Described below are some of the interesting implementations that were

considered as the baseline for this project.

The design proposed by Fowler et al. [3], shown in Fig. 1 (called Design 3), is one of the earliest NR techniques utilizing the direct table look-up method for the initial approximation of the reciprocal. The first $(m-1)$ mantissa bits (excluding the leading 1) were used to index into the ROM and an $m$-bit accurate initial approximation was obtained, thus making the ROM size $2^{m-1} \times m$ bits. Iteration steps were then used to improve upon this initial approximation. Each iteration step needed two multipliers and a set of inverters that compute the one's complement in place of the two's complement.

Figure: Fowler et al.'s Design [3]

Kucukkabak et al. [2] used the Taylor series expansion method to calculate the initial approximation for their iteration steps. The initial approximation was obtained by multiplying the ROM output with the modifier output, as shown in the figure below (called Design 2). A $2^m \times 2m$ bit ROM and a modifier output of $2m$ bits were used to calculate an initial approximation with $(2m-3)$ bits of accuracy. The multiplier used for calculating the initial approximation is also utilized by the iteration steps (twice by each iteration) to improve upon the intermediate results.

Figure: Kucukkabak et al.'s Design [2]

Chen et al. [1] proposed an improvement to the table look-up followed by a multiplication with the modifier output, as shown in the figure below (called Design 1), to obtain the initial approximation. The memory used in the design is $2^m \times (2m+2)$ bits and the modifier output is $(2m+1)$ bits wide. This provides an accuracy of $(2m-1)$ bits. Iteration steps improve upon the initial approximation thus obtained with the help of a squarer, a multiplier, a shifter and a subtractor.

Figure: Chen et al.'s Design [1] (block diagram with ROM, Modifier, Multiplier 1, Squarer, Multiplier 2, Subtractor 1, shift registers, pipeline registers, MUX1 and MUX2 with control inputs con1 and con2)

The three designs described above have their own advantages and disadvantages. Design 3 provides the best throughput but requires more hardware, thus increasing power and area. Design 2, on the other hand, uses the same multiplier for each multiplication step required. This reduces the throughput considerably but saves area. Design 1 uses different multipliers for the initial approximation and the iteration steps. Hence, it provides better throughput than Design 2 but occupies more area. In comparison to Design 3, it is better in terms of area but worse in terms of throughput. If an initial approximation of a particular accuracy is required from all the designs, then Design 1 requires the minimum ROM size.

5. Design Implementation:

We used an unsigned 24-bit divider as the design on which the various strategies were compared. The divider performs the following operation:

$$Q = \frac{N}{D}$$

$$N = 1.x_1 x_2 x_3 \ldots x_{23} x_{24}$$

$$D = 1.y_1 y_2 y_3 \ldots y_{23} y_{24}$$

N and D both have 24-bit significands and have been considered to be in the range $(1, 2)$.

Minimization of M was key to area minimization as the size of ROM depends

exponentially on M. As the accuracy of 2-term Taylor Series approximation itself

depends on the value of M, we first determined the minimum value of M for which

Taylor Series approximation itself will have sufficient accuracy.

Figure: Improved Design (block diagram with ROM, Modifier, Multiplier 1, Squarer, Multiplier 2, Subtractor, Shift Register, Multiplier 3 and pipeline registers; signal widths are labeled M, W, WX, WD and 24)

Key results from this analysis are shown in the figures below. They show the error in the Taylor series approximation for M = 3, 4, 5, 6, 7. The plots in the left column show the histogram of the error expressed in bits, while those in the right column show the signed error itself. It can be seen that the signed error is always negative, which means that truncating the Taylor series always results in an approximation that is smaller than the true result. This insight is used later for selecting ROM patterns.

Figure: Error in the 2-term Taylor series approximation for M = 3, 4, 5, 6 and 7. Left column: "Error in Quotient" histograms (x-axis: Error (in Bit), y-axis: Probability). Right column: error magnitude histograms (x-axis: Error Magnitude, y-axis: Probability).

Also looking at the histograms, we can see that as M increases, we get more and more

accuracy (visible from rightward shift of histograms as M increases). The increase in

accuracy of approximation as M increases is shown in the plot below. The accuracy

depends linearly on M as roughly (2M+1).

Figure: 2-term Taylor Approximation accuracy vs M (x-axis: M (Bits), y-axis: Quotient Precision (Bits))

It can be seen that M=6 is enough to give us desired accuracy of 13 bits in initial

approximation. M=5 does not provide enough accuracy while M=7 consumes

unnecessary ROM area. Therefore, we selected M=6 in our implementation.
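The dependence on M can be reproduced approximately with the MATLAB sketch below. It sweeps M from 3 to 7 using the unquantized two-term approximation and bounds the quotient error by the largest dividend times the worst reciprocal error. That bound, the use of k = 16 fraction bits instead of 24, and the variable names are assumptions of this sketch and not necessarily the procedure used to generate the plots in this report.

```matlab
% Worst-case quotient precision of the 2-term approximation versus M.
k    = 16;                               % fraction bits (assumed; report uses 24)
D    = 1 + (0:2^k-1) / 2^k;              % every k-bit divisor in [1, 2)
Nmax = 2 - 2^(-k);                       % largest k-bit dividend
quotientBits = zeros(1, 7);
for M = 3:7
    Dm1 = floor(D * 2^M) / 2^M;          % bits that address the ROM
    x0  = (Dm1 + 2^(-M-1)).^(-2) .* (Dm1 + 2^(-M) - (D - Dm1));
    worstRecip      = max(abs(1 ./ D - x0));
    quotientBits(M) = -log2(Nmax * worstRecip);
    fprintf('M=%d  quotient precision = %.2f bits\n', M, quotientBits(M));
end
```

Under these assumptions the printed values can be compared with the roughly (2M+1) trend and with the 13-bit target mentioned above.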

The next step was to select the width of the ROM word (W) and a strategy to fit the actual approximation to this finite word size. Again we simulated, for M=6, different values of W for 3 different strategies:

1. Truncating the approximation to W bits (used in [1])
2. Rounding the approximation to W bits
3. Ceiling the approximation to W bits

Figure: ROM Width vs Accuracy (M=6) (x-axis: ROM Width (Bits), y-axis: Quotient Precision; curves: Truncation, Rounding, Ceiling)

The results from simulation are shown above. We find that rounding gives better accuracy for smaller W, but ceiling gives superior results when W is sufficiently large. This can be understood by recalling that the Taylor series approximation by itself underestimates the result. Therefore, a ceiling operation tends to compensate for the error introduced by the Taylor series approximation. On the other hand, rounding and, even more so, truncation add to the Taylor series error and therefore do not perform as well as ceiling. This is visible in the error histograms below for (M=6, W=15), where the ceiling operation provides a symmetric behavior around 0. For this reason, we applied a ceiling to 15 bits to the approximation in our implementation.

Appendix B lists the contents of ROM as used in our design.

Figure: Error histograms for Floor, Round and Ceiling ROM values (M=6, W=15) (x-axis: Error in Quotient, y-axis: Probability)

At small W, however, the error introduced by finite word width dominates the total error

and therefore, rounding gives the best results due to its symmetric error around zero.
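The three strategies can be compared with a short MATLAB sketch. It quantizes only the ROM value to W bits and keeps the modified operand and the multiplication exact, so it isolates the effect of the ROM rounding mode. The parameters M = 6 and W = 15 follow the text, while k = 16 fraction bits, the unquantized modifier and the variable names are assumptions of this sketch.

```matlab
% Effect of truncating, rounding or ceiling the ROM word (M=6, W=15).
M = 6;  W = 15;  k = 16;                 % k is an assumption (report uses 24)
D       = 1 + (0:2^k-1) / 2^k;           % every k-bit divisor in [1, 2)
Dm1     = floor(D * 2^M) / 2^M;          % bits that address the ROM
modTerm = Dm1 + 2^(-M) - (D - Dm1);      % exact second term
c       = (Dm1 + 2^(-M-1)).^(-2);        % ideal ROM value

modes = {@floor, @round, @ceil};
names = {'truncation', 'rounding', 'ceiling'};
meanErr   = zeros(1, 3);
worstBits = zeros(1, 3);
for s = 1:3
    romValue     = modes{s}(c * 2^W) / 2^W;      % W-bit ROM word
    err          = romValue .* modTerm - 1 ./ D; % signed error of 1/D estimate
    meanErr(s)   = mean(err);
    worstBits(s) = -log2(max(abs(err)));
    fprintf('%-10s  mean error %+.3e  worst-case accuracy %.2f bits\n', ...
            names{s}, meanErr(s), worstBits(s));
end
```

A mean error closer to zero indicates that the quantization offsets the negative bias of the truncated Taylor series instead of adding to it.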


The next parameter to decide was the width of modified operand (WX). Given our

previous selection of M=6 and W=15, we simulated various values of WX and the result

is as given below.

Figure: XP Width vs Accuracy (M=6, W=15) (x-axis: XP Width (Bits), y-axis: Quotient Precision (Bits))

We find that WX=15 gives us more than 12.5 bits of accuracy, which still leaves some margin to account for finite word effects and error amplification. WX=14 is marginal at 12 bits, while WX=16 would result in an unnecessarily large multiplier. Therefore, we chose to use WX=15.

Selection of parameters M=6, W=15, WX=15 gives us an accurate enough

approximation. The next step was to decide the width of data path (WD) in Newton

Raphson iteration. We determined that WD=27 was the smallest value that gave us 24

bits of accuracy in final result. Therefore we chose WD=27.

For the other designs, we kept the various parameters as proposed in those designs and

made some modifications to convert their scheme to 24 bit wide multiplier.


6. Results:

The tables below list quantitative comparison of the three Newton-Raphson divider

implementations published in literature [1, 2, and 3] with our improved implementation.

Options Considered:

Design 1:    The design based on the work of Chen et al. [1]
Design 2:    The design based on the work of Kucukkabak et al. [2]
Design 3:    The design based on the work of Fowler et al. [3]
Our Design:  The design based on improved table lookup

Complexity:

Implementation   ROM Size      Logic Gates
Design 1         2 Kbits       19530
Our Design       0.94 Kbits    19222

Accuracy:

Implementation   Worst Accuracy (Bits)
Design 1         24.15
Design 2         24.31
Design 3         24.14
Our Design       24.27

Speed (Latency and Throughput):

Implementation   Latency (cycles)   Pipelining   Throughput
Design 1         5                  1+1+1+1+1    1/Tck
Design 2         5                  1+3+1        1/(3Tck)
Design 3         4                  1+1+1+1      1/Tck
Our Design       5                  1+1+1+1+1    1/Tck

As can be seen our design uses the smallest amount of ROM while still meeting the

desired accuracy. It has the same latency as [1] and can be fully pipelined.


The benefit that we achieved by using ceil instead of truncation can be seen in the error

histograms below. The first histogram is for our implementation while the second is the

histogram that would be achieved if we had used truncation instead. It can be seen that

truncated ROM does not meet the accuracy requirement of 24 bits.

Figure: Improved Design (M=6, W=15, 27 bit datapath) (x-axis: Error (in bit), y-axis: Probability)

Figure: Effect of truncated ROM values (M=6, W=15, 27 bit datapath) (x-axis: Error (in bit), y-axis: Probability; annotation: "Error in bit 24")

7. Conclusions:

In this project we studied, simulated and compared three divider implementations based on Newton Raphson reciprocal approximation. We also proposed an improved implementation that provides better accuracy while using a smaller ROM size than the published methods.

8. References:

[1] D. Chen, B. Zhou, Z. Guo, and P. Nilsson, "Design and implementation of reciprocal unit," 48th Midwest Symposium on Circuits and Systems, vol. 2, pp. 1318-1321, 7-10 Aug. 2005.

[2] U. Kucukkabak and A. Akkas, "Design and Implementation of Reciprocal Unit Using Table Look-up and Newton-Raphson Iteration," Proceedings of the EUROMICRO Symposium on Digital System Design (DSD'04), IEEE Computer Society, Washington, DC, pp. 249-253, Aug. 31 - Sep. 3, 2004.

[3] D. L. Fowler and J. E. Smith, "An accurate, high speed implementation of division by reciprocal approximation," Proceedings of the 9th Symposium on Computer Arithmetic, pp. 60-67, 6-8 Sep. 1989.

[4] S. F. Oberman and M. J. Flynn, "Division algorithms and implementations," IEEE Transactions on Computers, vol. 46, no. 8, pp. 833-854, Aug. 1997.

[5] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs, Oxford University Press, Oct. 1999.

Appendix A (Matlab Code)

%-----------------------------------------------------------------------
% 24bit / 24bit = 24 bit Newton Raphson Divider
%
% MATLAB code implementing the strategy as published in
% Dongdong Chen; Bintian Zhou; Zhan Guo; Nilsson, P.,
% "Design and implementation of reciprocal unit," Circuits and Systems,
% 2005. 48th Midwest Symposium on, Vol. 2, pp. 1318-1321,
% 7-10 Aug. 2005
%
% Ankit Khandelwal
% Gaurav Agrawal
%
% Worst Case Accuracy: 24.15 bits
%
% Hardware Needed:
%   ROM Size    : 2^7 x 16 bits
%   Multipliers : 4
%       16 x 15 = 16 truncated multiplier
%       16 x 16 = 27 truncated squarer
%       27 x 24 = 27 rounded multiplier
%       27 x 24 = 24 rounded multiplier
%   Adder       : 1
%       27 - 27 = 27 subtractor
%
% Performance: 5 cycles (1+1+1+1+1)
%-----------------------------------------------------------------------
function Design1()
    M = 7;
    W = 2*M + 2;
    WX = 2*M;
    WM = 2*M + 2;
    WD = 27;
    N = 24;
    NUM_SAMPLES = 70000;
    for i=1:NUM_SAMPLES
        Nm1 = round(rand*(2^N))/2^N;
        Dm1 = round(rand*(2^N))/2^N;
        Q = (1+Nm1)/(1+Dm1);
        Q_scale = 1;
        if (Q < 1.0)
            Q_scale = 2;
            Q = Q*2;
        end
        Qm1 = Q-1;
        %------------------------------------------------------------
        % Approximation by rom lookup followed by multiplier
        % generates WM bit wide approximation p0
        %------------------------------------------------------------
        % read rom - get fraction
        romvalue = rom(floor(Dm1*2^M), M, W);
        % determine 1.x1x2x3..xm+1'xm+2'
        xp = truncate((1 + (floor(Dm1*2^M))/2^M + (1 - ((Dm1 - ((floor(Dm1*2^M))/2^M))*2^M) - 1/2^(N-M))/2^M), WX);
        % get initial approx
        p0 = truncate(xp*romvalue, WM);          % 16 x 15 = 16 truncated multiplier
        %----------------------------------------------------------
        % Iteration of Newton Raphson Method
        %----------------------------------------------------------
        p0_squared = truncate(p0*p0, WD);        % 16 x 16 = 27 truncated squarer
        i0 = round2(p0_squared*(Dm1+1), WD);     % 27 x 24 = 27 rounded multiplier
        p1 = 2*p0-i0;                            % 27 - 27 = 27 subtractor
        %----------------------------------------------------------
        % Final Multiplication
        %----------------------------------------------------------
        QNR = round2(p1*(1+Nm1),24);             % 27 x 24 = 24 rounded multiplier
        Err_NR(i) = QNR - (Qm1+1)/Q_scale;
    end
    Err_NR = -log2(abs(Err_NR)+1e-50);
    minnr = min(Err_NR);
    X = 19.5:35.5;
    [N] = hist(Err_NR, X);
    N = N/length(Err_NR);
    bar(X, N);
    title('Design1');
    xlabel('Error (in bit)');
    ylabel('Probability');
    fprintf (1, 'Max Error is %f\n', minnr);
end
%------------------------------------------------------------
% Function: rom
%------------------------------------------------------------
function value = rom(add, M, W)
    if ((add < 0) || (add >= 2^M))
        fprintf (1, 'Error: Invalid address to rom: %d\n', add);
        error ('quiting');
    end
    x1 = 1 + add/2^M;
    c = 1/(x1 + 2^(-M-1))^2;
    value = floor(c*2^(W))/2^(W);
end
%------------------------------------------------------------
% Function: truncate
%------------------------------------------------------------
function value = truncate(X, N)
    value = floor(X*(2^N))/(2^N);
end
%------------------------------------------------------------
% Function: round2
%------------------------------------------------------------
function value = round2(X, N)
    value = floor(X*(2^N)+0.5)/(2^N);
end

%-----------------------------------------------------------------------
% 24bit / 24bit = 24 bit Newton Raphson Divider
%
% MATLAB code implementing the strategy as published in
% Kucukkabak, U. and Akkas, A. 2004. Design and Implementation of
% Reciprocal Unit Using Table Look-up and Newton-Raphson Iteration. In
% Proceedings of the Digital System Design, EUROMICRO Systems on (Dsd'04)
% - Volume 00 (August 31 - September 03, 2004). DSD. IEEE Computer
% Society, Washington, DC, 249-253.
%
% Ankit Khandelwal
% Gaurav Agrawal
%
% Worst Case Accuracy: 24.31 bits
%
% Hardware Needed:
%   ROM Size    : 2^10 x 20 bits
%   Multipliers : 2
%       27 x 27 = 27 rounded multiplier
%       27 x 24 = 24 rounded multiplier
%
% Performance: 5 cycles (1+3+1)
%-----------------------------------------------------------------------
function Design2()
    M = 10;
    W = 2*M;
    WX = 2*M;
    WD = 27;
    N = 24;
    NUM_SAMPLES = 70000;
    for i=1:NUM_SAMPLES
        Nm1 = round(rand*(2^N))/2^N;
        Dm1 = round(rand*(2^N))/2^N;
        Q = (1+Nm1)/(1+Dm1);
        Q_scale = 1;
        if (Q < 1.0)
            Q_scale = 2;
            Q = Q*2;
        end
        Qm1 = Q-1;
        %------------------------------------------------------------
        % Approximation by rom lookup followed by multiplier
        % generates WM bit wide approximation p0
        %------------------------------------------------------------
        % read rom - get fraction
        romvalue = rom(floor(Dm1*2^M), M, W);
        % determine 1.x1x2x3..xm+1'xm+2'
        xp = truncate((1 + (floor(Dm1*2^M))/2^M + (1 - ((Dm1 - ((floor(Dm1*2^M))/2^M))*2^M) - 1/2^(N-M))/2^M), WX);
        % get initial approx
        p0 = round2(xp*romvalue, WD);            % 27 x 27 = 27 rounded multiplier
        %----------------------------------------------------------
        % Iteration of Newton Raphson Method
        %----------------------------------------------------------
        p0D = round2(p0*(1+Dm1), WD);            % same multiplier
        p0Db = (2-p0D-(2^(-WD-1)));
        p1 = round2(p0Db*p0, WD);                % same multiplier
        %----------------------------------------------------------
        % Final Multiplication
        %----------------------------------------------------------
        QNR = round2(p1*(1+Nm1),24);             % 27 x 24 = 27 rounded multiplier
        Err_NR(i) = QNR - (Qm1+1)/Q_scale;
    end
    Err_NR = -log2(abs(Err_NR)+1e-50);
    minnr = min(Err_NR);
    X = 19.5:35.5;
    [N] = hist(Err_NR, X);
    N = N/length(Err_NR);
    bar(X, N);
    title('Design2');
    xlabel('Error (in bit)');
    ylabel('Probability');
    fprintf (1, 'Max Error is %f\n', minnr);
end
%------------------------------------------------------------
% Function: rom
%------------------------------------------------------------
function value = rom(add, M, W)
    if ((add < 0) || (add >= 2^M))
        fprintf (1, 'Error: Invalid address to rom: %d\n', add);
        error ('quiting');
    end
    x1 = 1 + add/2^M;
    c = 1/(x1 + 2^(-M-1))^2;
    value = floor(c*2^(W))/2^(W);
end
%------------------------------------------------------------
% Function: truncate
%------------------------------------------------------------
function value = truncate(X, N)
    value = floor(X*(2^N))/(2^N);
end
%------------------------------------------------------------
% Function: round2
%------------------------------------------------------------
function value = round2(X, N)
    value = floor(X*(2^N)+0.5)/(2^N);
end
%------------------------------------------------------------
% EOF
%------------------------------------------------------------

%----------------------------------------------------------------------- % 24bit / 24bit = 24 bit newton Raphson Divider % % MATLAB code implementing the strategy as published in % Fowler, D.L.; Smith, J.E., "An accurate, high speed implementation % of division by reciprocal approximation," Computer Arithmetic, 1989., % Proceedings of 9th Symposium on , vol., no.pp.60-67, 6-8 Sep 1989 % % Ankit Khandelwal % Gaurav Agrawal % % Worst Case Accuracy: 24.14 bits % % Hardware Needed: % ROM Size : 2^13 x 14 bits % Multipliers : 3 % 14 x 24 = 27 rounded multiplier % 14 x 27 = 27 rounded multiplier % 27 x 24 = 27 rounded multiplier % % Performance: 3 cycles (1+1+1) %----------------------------------------------------------------------- function Design3() M = 13; W = 14; WD = 27; N = 24; NUM_SAMPLES = 70000; for i=1:NUM_SAMPLES Nm1 = round(rand*(2^N))/2^N; Dm1 = round(rand*(2^N))/2^N; Q = (1+Nm1)/(1+Dm1); Q_scale = 1; if (Q < 1.0) Q_scale = 2; Q = Q*2; end Qm1 = Q-1; %------------------------------------------------------------ % Approximation by rom lookup followed by multiplier % generates WM bit wide approximation p0 %------------------------------------------------------------ % read rom - get fraction p0 = rom(floor(Dm1*2^M), M, W); %---------------------------------------------------------- % Iteration of Newton Raphson Method %---------------------------------------------------------- p0D = round2(p0*(1+Dm1), WD); % 14 x 24 = 27 rounded multiplier p0Db = (2-p0D-(2^(-WD-1))); p1 = round2(p0Db*p0, WD); % 14 x 27 = 27 rounded multiplier %---------------------------------------------------------- % Final Multiplication %----------------------------------------------------------

29

QNR = round2(p1*(1+Nm1),24); % 27 x 24 = 27 rounded multiplier Err_NR(i) = QNR - (Qm1+1)/Q_scale; end Err_NR = -log2(abs(Err_NR)+1e-50); minnr = min(Err_NR); X = 19.5:35.5; [N] = hist(Err_NR, X); N = N/length(Err_NR); bar(X, N); title('Design3'); xlabel('Error (in bit)'); ylabel('Probability'); fprintf (1, 'Max Error is %f\n', minnr); end %------------------------------------------------------------ % Function: rom %------------------------------------------------------------ function value = rom(add, M, W) if ((add < 0) || (add >= 2^M)) fprintf (1, 'Error: Invalid address to rom: %d\n', add); error ('quiting'); end x1 = 1 + add/2^M; c = (1/(x1 + 2^(-M-1))) + (2^(-M-2)); value = floor(c*2^(W))/2^(W); end %------------------------------------------------------------ % Function: truncate %------------------------------------------------------------ function value = truncate(X, N) value = floor(X*(2^N))/(2^N); end %------------------------------------------------------------ % Function: round2 %------------------------------------------------------------ function value = round2(X, N) value = floor(X*(2^N)+0.5)/(2^N); end %------------------------------------------------------------ % EOF %------------------------------------------------------------


%-----------------------------------------------------------------------
% 24bit / 24bit = 24 bit Newton Raphson Divider
%
% MATLAB code implementing our Newton Raphson based divider which uses
% smaller ROM size and gets better accuracy in less area
%
% Ankit Khandelwal
% Gaurav Agrawal
%
% Worst Case Accuracy: 24.27 bits
%
% Hardware Needed:
%   ROM Size    : 2^6 x 15 bits
%   Multipliers : 4
%       16 x 15 = 15 truncated multiplier
%       15 x 15 = 27 truncated squarer
%       27 x 24 = 27 rounded multiplier
%       27 x 24 = 24 rounded multiplier
%   Adder       : 1
%       27 - 27 = 27 subtractor
%
% Performance: 5 cycles (1+1+1+1+1)
%-----------------------------------------------------------------------
function Divider()
M  = 6;
W  = 2*M + 3;
WX = 2*M + 3;
WM = 2*M + 3;
WD = 27;
N  = 24;
NUM_SAMPLES = 70000;

for i=1:NUM_SAMPLES
    Nm1 = round(rand*(2^N))/2^N;
    Dm1 = round(rand*(2^N))/2^N;
    Q = (1+Nm1)/(1+Dm1);
    Q_scale = 1;
    if (Q < 1.0)
        Q_scale = 2;
        Q = Q*2;
    end
    Qm1 = Q-1;

    %------------------------------------------------------------
    % Approximation by rom lookup followed by multiplier
    % generates WM bit wide approximation p0
    %------------------------------------------------------------
    % read rom - get fraction
    romvalue = rom(floor(Dm1*2^M), M, W);
    % determine 1.x1x2x3..xm+1'xm+2'
    xp = truncate((1 + (floor(Dm1*2^M))/2^M + (1 - ((Dm1 - ((floor(Dm1*2^M))/2^M))*2^M) - 1/2^(N-M))/2^M), WX); % 9 Inv
    % get initial approx
    p0 = truncate(xp*romvalue, WM);         % 16 x 15 = 15 truncated multiplier

    %----------------------------------------------------------
    % Iteration of Newton Raphson Method
    %----------------------------------------------------------
    p0_squared = truncate(p0*p0, WD);       % 15 x 15 = 27 truncated squarer
    i0 = round2(p0_squared*(Dm1+1), WD);    % 27 x 24 = 27 rounded multiplier
    p1 = 2*p0-i0;                           % 27 - 27 = 27 subtractor

    %----------------------------------------------------------
    % Final Multiplication
    %----------------------------------------------------------
    QNR = round2(p1*(1+Nm1),24);            % 27 x 24 = 24 rounded multiplier
    Err_NR(i) = QNR - (Qm1+1)/Q_scale;
end

% convert absolute error to bits of accuracy and plot the histogram
Err_NR = -log2(abs(Err_NR)+1e-50);
minnr = min(Err_NR);
X = 19.5:35.5;
[N] = hist(Err_NR, X);
N = N/length(Err_NR);
bar(X, N);
title('Improved Design (M=6, W=15, 27 bit datapath)');
xlabel('Error (in bit)');
ylabel('Probability');
fprintf (1, 'Max Error is %f\n', minnr);
end

%------------------------------------------------------------
% Function: rom
%------------------------------------------------------------
function value = rom(add, M, W)
if ((add < 0) || (add >= 2^M))
    fprintf (1, 'Error: Invalid address to rom: %d\n', add);
    error ('quitting');
end
x1 = 1 + add/2^M;
c = 1/(x1 + 2^(-M-1))^2;
value = ceil(c*2^(W))/2^(W);
end

%------------------------------------------------------------
% Function: truncate
%------------------------------------------------------------
function value = truncate(X, N)
value = floor(X*(2^N))/(2^N);
end

%------------------------------------------------------------
% Function: round2
%------------------------------------------------------------
function value = round2(X, N)
value = floor(X*(2^N)+0.5)/(2^N);
end

%------------------------------------------------------------
% EOF
%------------------------------------------------------------
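The long expression assigned to xp in the Divider listing is easier to read once it is written out algebraically. The following is a sketch of that algebra, not a statement from the report: X denotes the divisor in [1, 2), X_1 its value truncated to M fraction bits, and x_2 the remaining low-order part. These symbols mirror the x1 and x2 variables of the Taylor series script in this appendix and are otherwise shorthand introduced here.

```latex
\begin{aligned}
X &= X_1 + x_2, \qquad 0 \le x_2 < 2^{-M}, \\
\frac{1}{X} &\approx \frac{1}{X_1 + 2^{-M-1}}
   - \frac{x_2 - 2^{-M-1}}{\left(X_1 + 2^{-M-1}\right)^{2}}
   = \frac{X_1 + 2^{-M} - x_2}{\left(X_1 + 2^{-M-1}\right)^{2}}, \\
x_p &= X_1 + \left(2^{-M} - 2^{-N}\right) - x_2 .
\end{aligned}
```

The rom function stores the factor 1/(X_1 + 2^(-M-1))^2, and xp supplies the numerator. The quantity (2^(-M) - 2^(-N)) - x_2 is the bitwise complement of the N - M low-order bits of the divisor, so xp differs from the exact numerator by only 2^(-N) and can be formed with inverters instead of a subtractor. The "9 Inv" comment in the listing presumably refers to the 9 complemented bits that survive truncation to WX = 15 fraction bits (15 - 6 = 9).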


%----------------------------------------------------------------------
% MATLAB code to determine error inherent in Taylor series expansion
%----------------------------------------------------------------------
M = 7;
N = 52; % practically infinite precision
NUM_SAMPLES = 50000;

for i=1:NUM_SAMPLES
    Nr = round((1+rand)*(2^N))/2^N;
    Dr = round((1+rand)*(2^N))/2^N;
    % precise Quotient
    Q = (Nr)/(Dr);

    %------------------------------------------------------------
    % Floating point division based on Taylor series
    %------------------------------------------------------------
    x1 = floor((Dr)*2^M)/2^M;
    x2 = Dr - floor(Dr*2^M)/2^M;
    % Quotient obtained from first two terms of Taylor series expansion
    Q_Taylor = 1/(x1 + 2^(-M-1)) - (1/(x1 + 2^(-M-1))^2)*(x2 - 2^(-M-1));
    % Error in Taylor series approximation
    Err(i) = Nr*Q_Taylor - Q;
end

figure(1);
plot(Err);
[N, X] = hist(Err, 20);
N = N/length(Err);
plot(X, N, 'o-');
axis([0.7*min(X) 0 0 1.5*max(N)]);
title('Error in Quotient M=7');
xlabel('Error Magnitude');
ylabel('Probability');
grid on;

% convert error to bits of accuracy (Err is negative, hence -Err)
Err = -log2(-Err+1e-30);
avg = mean(Err);
var = std(Err)^2;
maxm = max(Err);
minm = min(Err);
fprintf(1, ' M = %d\n', M);
fprintf(1, ' avg = %.4f\n var = %.4f\n max = %.4f\n min = %.4f\n\n', avg, var, maxm, minm);

figure(2);
X = 6.5:25.5;
[N] = hist(Err, X);
N = N/length(Err);
bar(X, N);
axis([5 25 0 1.5*max(N)]);
title('Error in Quotient M=7');
xlabel('Error (in Bit)');
ylabel('Probability');


%-----------------------------------------------------------------------
% MATLAB code to understand accuracy trade offs with ROM-size
%-----------------------------------------------------------------------
function ROMAccuracy()
M = 6;
N = 52;
LSB = 2*M+2;
NUM_SAMPLES = 10000;

for j=1:10
    W = 2*M+j-3;
    for i=1:NUM_SAMPLES
        Nm1 = round(rand*(2^N))/2^N;
        Dm1 = round(rand*(2^N))/2^N;
        Q = (1+Nm1)/(1+Dm1);
        Q_scale = 1;
        if (Q < 1.0)
            Q_scale = 2;
            Q = Q*2;
        end
        Qm1 = Q-1;

        %------------------------------------------------------------
        % Division based on ROM followed by a multiplier
        %------------------------------------------------------------
        % read rom - get fraction (floor, round and ceil variants)
        romvalue = rom(floor(Dm1*2^M), M, W);
        % determine 1.x1x2x3..xm+1'xm+2'...
        xp = 1 + (floor(Dm1*2^M))/2^M + (1 - ((Dm1 - ((floor(Dm1*2^M))/2^M))*2^M) - 1/2^(N-M))/2^M;
        %xp = floor((1 + (floor(Dm1*2^M))/2^M + (1 - ((Dm1 - ((floor(Dm1*2^M))/2^M))*2^M) - 1/2^(N-M))/2^M)*2^(W+3))/2^(W+3);
        % get initial approx
        p0 = xp*romvalue;
        % only 16 bits after decimal are considered
        %p0 = floor(p0*2^W)/(2^W);
        Q_rom = p0*(1+Nm1);
        Err_ROMf(i) = (Q_rom(1) - (1+Qm1)/Q_scale);
        Err_ROMr(i) = (Q_rom(2) - (1+Qm1)/Q_scale);
        Err_ROMc(i) = (Q_rom(3) - (1+Qm1)/Q_scale);
    end
    Err_ROMf = -log2(abs(Err_ROMf) + 1e-40);
    Err_ROMr = -log2(abs(Err_ROMr) + 1e-40);
    Err_ROMc = -log2(abs(Err_ROMc) + 1e-40);
    minf(j) = min(Err_ROMf);
    minr(j) = min(Err_ROMr);
    minc(j) = min(Err_ROMc);
end

X = 2*M + (1:10) -3;
plot(X,minf, 'ro-');
hold on;
plot(X,minr, 'gx-');
hold on;
plot(X,minc, 'b^-');
title('ROM Width vs Accuracy (M=6)');
xlabel('ROM Width (Bits)');
ylabel('Quotient Precision');
legend('Truncation', 'Rounding', 'Ceiling');
grid on;
end

%------------------------------------------------------------
% Function: rom
%------------------------------------------------------------
function value = rom(add, M, W)
if ((add < 0) || (add >= 2^M))
    fprintf (1, 'Error: Invalid address to rom: %d\n', add);
    error ('quitting');
end
x1 = 1 + add/2^M;
c = 1/(x1 + 2^(-M-1))^2;
value(1) = floor(c*2^(W))/2^(W);
value(2) = round(c*2^(W))/2^(W);
value(3) = ceil(c*2^(W))/2^(W);
end


Appendix B (ROM Pattern)

Address (6 bits)    ROM Contents (15 bits)
                    Binary             Hex

 0                  111111000000110    7E06
 1                  111101000110101    7A35
 2                  111011010001111    768F
 3                  111001100010010    7312
 4                  110111110111101    6FBD
 5                  110110010001011    6C8B
 6                  110100101111101    697D
 7                  110011010001111    668F
 8                  110001110111111    63BF
 9                  110000100001101    610D
10                  101111001110111    5E77
11                  101101111111010    5BFA
12                  101100110010111    5997
13                  101011101001011    574B
14                  101010100010101    5515
15                  101001011110101    52F5
16                  101000011101000    50E8
17                  100111011101111    4EEF
18                  100110100001000    4D08
19                  100101100110011    4B33
20                  100100101101110    496E
21                  100011110111001    47B9
22                  100011000010011    4613
23                  100010001111011    447B
24                  100001011110001    42F1
25                  100000101110100    4174
26                  100000000000100    4004
27                  011111010100000    3EA0
28                  011110101000111    3D47
29                  011101111111001    3BF9
30                  011101010110110    3AB6
31                  011100101111101    397D
32                  011100001001110    384E
33                  011011100100111    3727
34                  011011000001010    360A
35                  011010011110110    34F6
36                  011001111101001    33E9
37                  011001011100101    32E5
38                  011000111101000    31E8
39                  011000011110010    30F2
40                  011000000000011    3003
41                  010111100011011    2F1B
42                  010111000111010    2E3A
43                  010110101011111    2D5F
44                  010110010001010    2C8A
45                  010101110111010    2BBA
46                  010101011110001    2AF1
47                  010101000101100    2A2C
48                  010100101101101    296D
49                  010100010110011    28B3
50                  010011111111110    27FE
51                  010011101001110    274E
52                  010011010100010    26A2
53                  010010111111010    25FA
54                  010010101010111    2557
55                  010010010110111    24B7
56                  010010000011100    241C
57                  010001110000100    2384
58                  010001011110001    22F1
59                  010001001100000    2260
60                  010000111010100    21D4
61                  010000101001010    214A
62                  010000011000100    20C4
63                  010000001000001    2041
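
The table above can be regenerated from the rom function of the Divider listing in Appendix A. The sketch below assumes the same parameters as that listing (M = 6 address bits, W = 2*M + 3 = 15 data bits, ceiling rounding); the variable names add, x1 and c follow the listing, while word is a name introduced here for the integer ROM contents.

```matlab
% Regenerate the ROM pattern from the rom() formula of the Divider listing.
M = 6;                              % address width in bits
W = 2*M + 3;                        % data width in bits (15)

add  = (0:2^M-1)';                  % all 64 ROM addresses
x1   = 1 + add/2^M;                 % divisor truncated to M fraction bits
c    = 1 ./ (x1 + 2^(-M-1)).^2;     % 1/(interval midpoint)^2
word = ceil(c * 2^W);               % ceiling rounding, as in rom()

% print address, 15 bit binary pattern and hex value
for k = 1:numel(add)
    fprintf('%2d  %s  %04X\n', add(k), dec2bin(word(k), W), word(k));
end
```

Address 0 evaluates to 7E06 and address 63 to 2041, matching the first and last rows of the table.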
