Analysis of Parallel Prefix Adders - IJMETMR · ... Gizopoulos, M. Psarakis, A. Paschalis, and Y...
Transcript of Analysis of Parallel Prefix Adders - IJMETMR · ... Gizopoulos, M. Psarakis, A. Paschalis, and Y...
Page 1356
Analysis of Parallel Prefix Adders
T.Sravya
M.Tech (VLSI)
C.M.R Institute of Technology,
Hyderabad.
D. Chandra Mohan
Assistant Professor
C.M.R Institute of Technology,
Hyderabad.
Dr.M.Gurunadha Babu, M.Tech,
Ph.D, Professor of ECE,
C.M.R Institute of Technology,
Hyderabad.
Abstract
In Very Large Scale Integration (VLSI) designs,
Parallel prefix adders (PPA) have the better delay
performance. This paper investigates four types of
PPA’s (Kogge Stone Adder (KSA), Spanning Tree
Adder (STA), Brent Kung Adder (BKA) and Sparse
Kogge Stone Adder (SKA)). Additionally Ripple
Carry Adder (RCA), Carry Lookahead Adder (CLA)
and Carry Skip Adder (CSA) are also investigated.
These adders are implemented in verilog Hardware
Description Language (HDL) using Xilinx Integrated
Software Environment (ISE) 14.2 Design Suite.
These designs are implemented in Xilinx Spartan 3
Field Programmable Gate Arrays (FPGA) and delays
are measured,all these adder’s delay, power and area
are investigated and compared finally.
Key Words —parallel prefix adders; carry tree
adders; FPGA; logic analyzer; delay; power.
I. INTRODUCTION
The binary addition is the basic arithmetic operation in
digital circuits and it became essential in most of the
digital systems including Arithmetic and Logic Unit
(ALU), microprocessors and Digital Signal Processing
(DSP). At present, the research continues on increasing
the adder’s delay performance. In many practical
applications like mobile and telecommunications, the
Speed and power performance improved in FPGAs is
better than microprocessor and DSP’s based solutions.
Additionally, power is also an important aspect in
growing trend of mobile electronics, which makes
large-scale use of DSP functions. Because of the
Programmability, structure of configurable logic
blocks (CLB) and programming interconnects in
FPGAs, Parallel prefix adders have better
performance. The delays of the adders are discussed
[1]. In this paper, above mentioned PPA’s and RCA
and CSA are implemented and characterized on a
Xilinx virtex 5 FPGA. Finally, delay, power and area
for the designed adders are presented and compared.
II. DRAWBACKS OF RIPPLE CARRY AND
CARRY LOOKAHEAD ADDER In figure1, the first
sum bit should wait until input carry is given, the
second sum bit should wait until previous carry is
propagated and so on. Finally the output sum should
wait until all previous carries are generated. So it
results in delay.
Fig. 1. 4 bit ripple carry adder
In order to reduce the delay in RCA (or) to propagate
the carry in advance, we go for carry look ahead adder.
Basically this adder works on two operations called
propagate and generate The propagate and generate
equations are given by.
For 4 bit CLA, the propagated carry equations are
given as
Page 1357
Equations (3),(4),(5) and (6) are observed that, the
carry complexity increases by increasing the adder bit
width. So designing higher bit CLA becomes
complexity. In this way, for the higher bit of CLA’s,
the carry complexity increases by increasing the width
of the adder. So results in bounded fan-in rather than
unbounded fan-in, when designing wide width adders.
In order to compute the carries in advance without
delay and complexity, there is a concept called Parallel
prefix approach.
III. DIFFERENCE BETWEEN PARALLEL-
PREFIX ADDERS AND OTHERS
The PPA’s pre-computes generate and propagate
signals are presented in [2]. Using the fundamental
carry operator (fco), these computed signals are
combined in [3].The fundamental carry operator is
denoted by the symbol “ο”,
For example, 4 bit CLA carry equation is given by
For example, 4 bit PPA carry equation is given by
Equations (8) and (9) are observed that, the carry look
ahead adder takes 3 steps to generate the carry, but the
bit PPA takes 2 steps to generate the carry.
IV. PARALLEL-PREFIX ADDER STRUCTURE
Parallel-prefix structures are found to be common in
high performance adders because of the delay is
logarithmically proportional to the adder width [2].
PPA’s basically consists of 3 stages
• Pre computation
• Prefix stage
• Final computation
The Parallel-Prefix Structure is shown in figure 2.
A. Pre computation
In pre computation stage, propagates and generates are
computed for the given inputs using the given
equations (1) and (2).
B. Prefix stage
In the prefix stage, group generate/propagate signals
are computed at each bit using the given equations.
The black cell (BC) generates the ordered pair in
equation (7), the gray cell (GC) generates only left
signal, following [2].
(BC) generates the ordered pair in equation (7), the
gray cell (GC) generates only left signal, following [2].
More practically, the equations (10) and (11) can be
expressed using a symbol “o “denoted by Brent and
Kung. Its function is exactly the same as that of a
black cell i.e.
Page 1358
The "o" operation will help make the rules of building
prefix structures.
C. Final computation
In the final computation, the sum and carryout are the
final output.
Where “-1” is the position of carry-input. The
generate/propagate signals can be grouped in different
fashion to get the same correct carries. Based on
different ways of grouping the generate/propagate
signals, different prefix architectures can be created.
Figure 3 shows the definitions of cells that are used in
prefix structures, including BC and GC. For analysis
of various parallel prefix structures, see [2], [3] &[4].
The 16 bit SKA uses black cells and gray cells as well
as full adder blocks too. This adder computes the
carries using the BC’s and GC’s and terminates with 4
bit RCA’s. Totally it uses 16 full adders. The 16 bit
SKA is shown in figure 4. In this adder, first the input
bits (a, b) are converted as propagate and generate (p,
g). Then propagate and generate terms are given to
BC’s and GC’s. The carries are propagated in advance
using these cells. Later these are given to full adder
blocks. Another PPA is known as STA is also tested
[6]. Like the SKA, this adder also terminates with a
RCA. It also uses the BC’s and GC’s and full adder
blocks like SKA’s but the difference is the
interconnection between them [7].The 16 bit STA is
shown in the below figure 5.
Fig. 4. 16 bit sparse kogge-Stone adder
Fig. 5. 16 bit spanning tree adder
Page 1359
KSA is another of prefix trees that use the fewest logic
levels. A 16-bit KSA is shown in Figure 6. The 16 bit
kogge stone adder uses BC’s and GC’s and it won’t
use full adders. The 16 bit KSA uses 36 BC’s and 15
GC’s. And this adder totally operates on generate and
propagate blocks. So the delay is less when compared
to the previous SKA and STA. The 16 bit KSA is
shown in figure 6.In this KSA, there are no full adder
blocks like SKA and STA [5] & [6]. Another carry tree
known as BKA which also uses BC’s and GC’s but
less than the KSA. So it takes less area to implement
than KSA. The 16 bit BKA uses 14 BC’s and 11 GC’s
but kogge stone uses 36 BC’s and 15 GC’s. So BKA
has less architecture and occupies less area than KSA.
The 16 bit BKA is shown in the below figure 7.
Fig. 6. 16 bit kogge stone adder
BKA occupies less area than the other 3 adders called
SKA, KSA, STA. This adder uses limited number of
propagate and generate cells than the other 3 adders. It
takes less area to implement than the KSA and has less
wiring congestion.
The operation of the 16 bit brent kung adder is given
below [3]. This adder uses less BC’s and GC’s than
kogge stone adder and has the better delay
performance which is observed in agilent 1692A logic
analyzer. These adders are implemented in verilog
HDL in Xilinx 13.2 ISE design suite and then verified
using Xilinx virtex 5 FPGA through chip scope
analyzer [7], [8] and [9]. And these were tested using
Agilent 1692A logic analyzer. This allows to measure
the adder delays directly. The Agilent 1692A logic
analyzer is integrated to PC(Personal Computer)
through Xilinx virtex 5 FPGA [10]. The test setup is
depicted in the figure 10.
V. DISCUSSION OF RESULTS
The delays observed for adder designs from synthesis
reports in Xilinx ISE 14.2 synthesis reports are shown
in Figure11.
Sparse Tree Adder wave form,area & delay
Page 1360
Brent Kung Adder wave form,area & delay.
Spanning Tree Adder wave form, area & delay.
Kogge Stone Adder wave form, area & delay.
Sparse-Kogge Stone Adder wave form, area & delay.
The area of the adder designs is measured in terms of
look up tables (LUT) and input output blocks (IOB)
taken for Xilinx Spartan 6 FPGA is plotted in the
figure 13. As per reference [1], ISE software doesn’t
give exact delay of the adders because it is not able to
analyze the critical path over the adder [1]. From the
figure 11, the CSA has more delay when compared to
other adders. Out of all adders, RCA has less delay.
SKA adder and BKA has about the same delay, where
as KSA and STA has same delay. According to the
synthesis reports, out of four parallel prefix adders,
Sparse - KOGGE STONE adder has better delay.
VI. CONCLUSION
From the study of analysis done on area and power, we
have concluded that the efficiency is improved by 5.77
% in ours delay for RCA, when compared to [1] and
for KSA it is improved by 19.28 % when compared
with [1].
REFERENCES
[1] David H.K.Hoe, Chris Martinez and Sri Jyothsna
Vundavalli”, Design and Characterization of Parallel
Prefix Adders using FPGAs”, 2011 IEEE 43rd
Southeastern Symposium in pp. 168-172, 2011.
Page 1361
[2] N. H. E. Weste and D. Harris, CMOS VLSI
Design, 4th edition, Pearson–Addison-Wesley, 2011.
[3] R. P. Brent and H. T. Kung, “A regular layout for
parallel adders,” IEEE Trans. Comput., vol. C-31, pp.
260-264, 1982.
[4] D. Harris, “A Taxonomy of Parallel Prefix
Networks,” in Proc. 37th Asilomar Conf. Signals
Systems and Computers, pp. 2213–7, 2003.
[5] P. M. Kogge and H. S. Stone, “A Parallel
Algorithm for the Efficient Solution of a General Class
of Recurrence Equations,” IEEE Trans. on Computers,
Vol. C-22, No 8, August 1973.
[6] D. Gizopoulos, M. Psarakis, A. Paschalis, and Y.
Zorian, “Easily Testable Cellular Carry Lookahead
Adders,” Journal of Electronic Testing: Theory and
Applications 19, 285-298, 2003.
[7] T. Lynch and E. E. Swartzlander, “A Spanning
Tree Carry Lookahead Adder,” IEEE Trans. on
Computers, vol. 41, no. 8, pp. 931-939, Aug. 1992.
[8] Beaumont-Smith, A, Cheng-Chew Lim ,”Parallel
prefix adder design”, Computer Arithmetic, 2001.
Proceedings. 15th IEEE Symposium,pp. 218 –
225,2001.M. Young, The Technical Writer's
Handbook. Mill Valley, CA: University Science, 1989.
[9] K. Vitoroulis and A. J. Al-Khalili, “Performance of
Parallel Prefix Adders Implemented with FPGA
technology,” IEEE Northeast Workshop on Circuits
and Systems, pp. 498-501, Aug. 2007. 172.
[10] S. Xing and W. W. H. Yu, “FPGA Adders:
Performance Evaluation and Optimal Design,” IEEE
Design & Test of Computers, vol. 15, no. 1, pp.24-29,
Jan. 1998.