6 Jayaraj U Kidav, N.M Sivmangai, Nidhi Antony, Dr.M.P Pillai
International Journal of Electronics, Electrical and Computational System
IJEECS
ISSN 2348-117X
Volume 6, Issue 6
June 2017
Fixed and Floating Point Array Signal Processor Architecture
Implemented on FPGA and their performance Comparisons
Jayaraj U Kidav (1),
Research Scholar and Scientist 'D',
Karunya University Coimbatore, NIELIT Calicut,
Kerala, India.
N.M Sivmangai (2),
Associate Professor,
Karunya University Coimbatore,
Tamilnadu, India.
Nidhi Antony (3),
M.Tech Project Student,
NIELIT Calicut, Kerala, India.
Dr.M.P Pillai (4),
Exec. Director,
NIELIT Calicut, Kerala, India.
Abstract— Array Signal Processor or digital
beamformer is an essential processing block in many
antenna array signal processing applications, including
RADAR/SONAR, MIMO and medical imaging. In high
sampling rate applications such as imaging SONAR, the
digital beamformer must handle a high input data rate and
sampling frequency. Owing to advances in FPGA technology,
most digital beamformer implementations for high sampling
rate applications are now based on FPGAs. High speed I/Os,
parallel hardware structures, internal block RAMs and
coarse-grained processing blocks make the FPGA a natural
implementation choice for these applications. In a digital
beamformer, blocks such as the fractional delay unit,
apodization unit and summer unit involve mathematical
operations, so the choice of data representation is key to
obtaining accurate results in FPGA implementations. In this
paper we discuss the design and FPGA implementation of
fixed and floating point digital beamformer architectures
for high sampling rate applications such as imaging SONAR,
together with their merits and demerits. In most VLSI
signal processing architectures, fixed point arithmetic is
preferred because it is easier to implement. We used the
Virtex-6 ML605 evaluation board as the implementation
platform and the LogiCORE IP Floating-Point Operator v5.0
available for the Virtex-6 FPGA for floating point
processing. To assess the accuracy of the implementation,
we first modelled the beamformer in MATLAB. We also used
the ultrasound simulation program Field II to model the
imaging SONAR array and to generate echoes from software
phantoms. We designed and implemented both architectures
and compared them in terms of hardware resources and data
rate. The floating point architecture eases the output data
rate requirement of high resolution imaging, which needs a
larger number of channels, at the cost of additional
hardware resources. Compared with its fixed point
counterpart, the floating point architecture reduced the
output data rate by about 75%. We implemented a phased
array beamformer; the delay calculations for various
steering angles show that the centre transducer elements
require less delay than the leftmost and rightmost
elements. We therefore proposed and implemented a variable
delay line structure for each element, which reduced the
number of flip flops in the digital beamformer
architecture.
Index Terms— Data Rate, Digital Beamformer,
Floating point, FPGA, Imaging SONAR.
I. INTRODUCTION
Pulse-echo processing is the working principle
behind RADAR/SONAR, medical imaging and similar
applications. When a voltage is applied to the
transducer probe, pulses are produced by the
piezoelectric effect. These pulses hit the target
in the region of interest and produce echoes. The
echo signals are then processed by the array signal
processor, that is, the beamformer. The beamformer
acts as a spatial filter that handles the directional
transmission and reception of signals [1].
Early array systems involved a simple
implementation of beamformer functions without
focusing [2]. Later beamformer designs added
focusing, but with several restrictions, such as a
limited focal region and high side lobe levels.
These challenges were addressed by using a high
f-number and apodization. It
dramatically enhanced the performance. A high
f-number, however, brought drawbacks such as a
reduced aperture size. Dynamic focusing was therefore
introduced to reduce the f-number during receive
beamforming and to keep it constant until the
aperture runs out. Coarse and fine delays are
needed to implement dynamic focusing
in an analog beamformer. The cost variations
associated with these systems led to the digital
beamformer design.
At first, digital beamformers did not have a
significant impact, because they needed A/D
converters with a sufficiently large number of bits
and a high enough sampling rate. Another factor that
facilitated the change was the dramatic increase in
the gate counts of ASICs and the improvement of
their design tools. With these advances in
technology, digital beamformer design and
architecture have changed tremendously.
Conventional antenna array beamforming
implementations are mostly based on Digital Signal
Processors (DSPs). To handle the dynamic range of
ultrasound echoes from the body, imaging SONAR and
medical ultrasound beamformers usually require a
high ADC resolution (14 bit). In addition, generating
high quality clinical images requires a large number
of channels (64-128) and a high sampling frequency
(40-60 MHz). The beamformer must therefore handle a
high data rate, and real time processing becomes
difficult. In this paper we discuss an FPGA based
Digital Beamformer (DBF) architecture for fixed and
floating point processing: an FPGA can handle a high
input data rate through high speed serial I/Os, and
the flexibility of the FPGA structure eases the real
time implementation challenges.
Most existing DBF architecture implementations
[16], [17], [29]-[36] do not discuss the importance
of data representation. Floating point arithmetic on
FPGAs is discussed in [4]. Because floating point
arithmetic is difficult to implement and constrained
by resources, fixed point implementation has
generally been adopted for realizing VLSI signal
processing architectures on FPGAs. However, in
applications such as medical imaging and underwater
acoustic cameras, the number of bits that fixed
point arithmetic needs for high accuracy is very
large. This leads to a high output data rate, which
in turn requires high speed interfaces such as PCIe
or Gigabit Ethernet at the DBF output. Contemporary
FPGAs such as the Xilinx Virtex-6 provide the
LogiCORE IP Floating-Point Operator v5.0, which
eases the implementation of floating point
arithmetic on FPGAs.
In this work, a 32 channel DBF architecture has
been developed and implemented on an FPGA with both
fixed point and floating point processing. Like
[17], the architecture offers flexible beamforming
features such as receive dynamic focusing and
apodization. We also implemented a high accuracy
fractional delay unit by adopting a Minimum Mean
Square Error (MMSE) interpolator [18].
We used the Field II software scanner [22], [26] to
validate our architecture implementation on the FPGA.
II. FIXED POINT AND FLOATING POINT
REPRESENTATION
A beamformer architecture can be implemented using
either fixed point or floating point arithmetic, the
two formats commonly used to represent numbers.
A floating point architecture supports integer and
real arithmetic, whereas a fixed point architecture
supports only integer arithmetic and therefore
represents all numbers as integers. It uses binary
scaling to make every number fit into one of the
integer data types [5]:
8 bits (char, int8): [−128, 127]
16 bits (short, int16): [−32768, 32767]
32 bits (long, int32): [−2147483648, 2147483647]
In fixed-point representation, a real number y is
denoted by an integer Y with L = i + f + 1 bits,
where L is the wordlength, i is the number of
integer bits (excluding the sign bit) and f is the
number of fractional bits. In "Q-format" notation,
Y is called a Qi.f or Qf number.
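The Qi.f definition above can be made concrete with a small sketch. It is only an illustration: the function names `to_fixed` and `to_real`, the round-to-nearest choice and the saturation on overflow are assumptions made here, not details taken from the implementation described in this paper.

```python
def to_fixed(y: float, i_bits: int, f_bits: int) -> int:
    """Quantize real y to a signed Qi.f integer Y with L = i + f + 1 bits.

    Rounding to nearest and saturating on overflow are illustrative choices.
    """
    word_length = i_bits + f_bits + 1          # L = i + f + 1 (sign bit included)
    lowest = -(1 << (word_length - 1))         # most negative representable integer
    highest = (1 << (word_length - 1)) - 1     # most positive representable integer
    raw = round(y * (1 << f_bits))             # binary scaling by 2^f
    return max(lowest, min(highest, raw))


def to_real(Y: int, f_bits: int) -> float:
    """Recover the real value represented by the Qi.f integer Y."""
    return Y / (1 << f_bits)


if __name__ == "__main__":
    Y = to_fixed(0.7071, i_bits=1, f_bits=14)  # Q1.14, a 16-bit word
    print(Y, to_real(Y, 14))
```

With f fractional bits the quantization step is 2^-f, so the round trip through `to_fixed` and `to_real` differs from the original value by at most half a step unless the value saturates.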
A floating point number is represented approximately,
to a fixed number of significant digits (the
significand), and scaled using an exponent. A number
that can be represented exactly has the form
$\text{significand} \times \text{base}^{\text{exponent}}$.
The binary point is not fixed: it floats, depending on
the value of the exponent. To obtain the value of the
floating-point number, the significand is multiplied
by the base raised to the power of the exponent,
which is equivalent to shifting the radix point from
its implied position by a number of places equal to
the value of the exponent: to the right if the
exponent is positive, or to the left otherwise. The
length of the significand determines the precision
with which numbers can be represented.
The IEEE has standardized the computer
representation of binary floating-point numbers in
IEEE 754 (also known as IEC 60559). The IEEE
floating point formats include:
Half precision (16 bits)
Single precision (32 bits)
Double precision (64 bits)
Double extended precision (80 bits)
Quadruple precision (128 bits)
A. Half Precision (binary16)
The IEEE 754 standard specifies binary16 as
having the following format [8]:
Sign bit: 1 bit
Exponent width: 5 bits
Significand precision: 11 bits (10 explicitly
stored)
Fig. 1. Half Precision
The format assumes an implicit leading bit with
value 1 unless the exponent field is all zeros. Thus
only 10 bits of the significand appear in the memory
format, but the total precision is 11 bits. In IEEE
754 terms, there are 10 stored bits of significand
but 11 bits of significand precision
($\log_{10}(2^{11}) \approx 3.311$ decimal digits).
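To illustrate the field layout just described, the following sketch decodes a 16-bit pattern by hand. The function name `decode_binary16` is introduced here for illustration only; the exponent bias of 15 and the special cases for all-zero and all-one exponent fields follow the IEEE 754 binary16 definition.

```python
def decode_binary16(bits: int) -> float:
    """Decode a 16-bit IEEE 754 binary16 pattern into a Python float."""
    sign = (bits >> 15) & 0x1          # 1 sign bit
    exponent = (bits >> 10) & 0x1F     # 5 exponent bits, bias 15
    fraction = bits & 0x3FF            # 10 explicitly stored significand bits

    if exponent == 0x1F:               # all ones: infinity or NaN
        value = float("inf") if fraction == 0 else float("nan")
    elif exponent == 0:                # all zeros: no implicit leading 1 (subnormal)
        value = (fraction / 1024.0) * 2.0 ** -14
    else:                              # normal number: implicit leading 1
        value = (1.0 + fraction / 1024.0) * 2.0 ** (exponent - 15)
    return -value if sign else value


if __name__ == "__main__":
    for pattern in (0x3C00, 0xC000, 0x7BFF, 0x0001):
        print(hex(pattern), decode_binary16(pattern))
```

The pattern 0x7BFF is the largest finite binary16 value and 0x0001 the smallest positive subnormal, which together bound the range the format can represent.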
III. DIGITAL BEAMFORMER ARCHITECTURE
A 32 channel DBF for receive beamforming is
designed for implementation on a Virtex-6 FPGA. A
14 bit ADC with a sampling frequency of 40 MHz
supplies the digitized echo input to each channel.
The samples are delayed by channel-specific time
delays and then summed to form a beam. The delay
structure is a combination of coarse delay and fine
delay. The architecture of the DBF is shown in Fig. 2.
The digital beamformer can be implemented in either
a fixed point or a floating point architecture.
Fig. 2. High level architecture of 1 channel
The geometry of the linear phased array used to
derive the time delays for each element in the
delay-and-sum technique follows [13]. We performed a
phased array sweep from −30 degrees to +30 degrees
at a lateral resolution of 1 degree. These delay
calculations show a coarse delay difference of 30 to
50 sampling clocks between the centre elements and
the leftmost and rightmost elements. We therefore
adopted the delay structure shown in Fig. 3, which
saves on average 20 to 25 flip flops per delay line.
Fig. 3. Delay Line Structure
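As a rough illustration of how one channel delay divides into a coarse and a fine part, the sketch below expresses a delay in sampling periods and separates the integer number of clocks from the fractional remainder. The function name `split_delay` and the example delay of 1.03 microseconds are hypothetical; only the 40 MHz sampling frequency comes from the text.

```python
import math


def split_delay(delay_s: float, fs_hz: float):
    """Split a time delay into (coarse, fine) parts measured in sampling periods.

    coarse is a whole number of sampling clocks (realizable with flip flops);
    fine is the remaining fraction of a period in [0, 1), left to the
    fractional delay filter.
    """
    delay_in_samples = delay_s * fs_hz
    coarse = math.floor(delay_in_samples)
    fine = delay_in_samples - coarse
    return coarse, fine


if __name__ == "__main__":
    # Hypothetical delay of 1.03 microseconds at the 40 MHz sampling clock.
    coarse, fine = split_delay(1.03e-6, 40e6)
    print(coarse, round(fine, 3))
```

A channel whose coarse part never exceeds a small number of clocks over the whole sweep needs only that many flip flops in its delay line, which is the saving the variable delay line structure exploits.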
IV. REALIZATION OF FIXED POINT DIGITAL
BEAMFORMER
Because it is easy to implement, a digital
beamformer is generally realized in a fixed point
architecture. The basic building blocks of this
design are the FIFO, coarse delay structure, fine
delay structure and apodization unit. The transmit
beamformer output is connected to a digital to
analog converter, then to a passive low pass filter
and a high voltage amplifier. The high voltage
amplifier output is connected to the ultrasound
transducer array through an analog switch/mux and
energises the transducer array to generate
ultrasound waves. In the receive path the receive
switch is on and the signal reaches the amplifier
and filter stage. The conditioned signal is
converted to digital form and fed to the receive
beamformer for further processing. The IF signal,
generally in the range of 40 MHz to 60 MHz, is
converted into one-word digital data by a 14/16 bit
high speed ADC. The digital data is received at a
sampling clock of 40 MHz and then processed as
follows.
In the fixed point architecture the 16 bit input is
represented in 16.0 Q format. The coarse delay
values are also represented in 16.0 Q format, and
the coarse delayed output is 16 bits wide. It is
then multiplied by 2.14 Q format filter coefficients
and the products are added, giving a 32 bit fine
delayed output. The fine delayed output is
multiplied by 1.31 Q format apodization
coefficients, giving a 64 bit output. Finally the
outputs of all channels are summed to give the 64
bit beamformed output. So in the fixed point
architecture a 16 bit input yields a 64 bit output.
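The word growth along this fixed point chain can be traced with a short sketch. It assumes, as the widths quoted above imply, that a full-precision product is as wide as the sum of its operand widths; the helper names `fixed_point_chain` and `fits_signed` are hypothetical.

```python
def fits_signed(value: int, bits: int) -> bool:
    """True if value is representable as a two's complement word of `bits` bits."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


def fixed_point_chain(input_bits: int = 16,
                      filter_coeff_bits: int = 16,   # 2.14 Q format coefficient word
                      apod_coeff_bits: int = 32):    # 1.31 Q format coefficient word
    """Return (fine delayed width, apodized width) for full-precision products."""
    fine_delayed_bits = input_bits + filter_coeff_bits    # 16 + 16 = 32
    apodized_bits = fine_delayed_bits + apod_coeff_bits   # 32 + 32 = 64
    return fine_delayed_bits, apodized_bits


if __name__ == "__main__":
    print(fixed_point_chain())
    worst_case = (-2 ** 15) * (-2 ** 15)   # most negative 16-bit value squared
    print(fits_signed(worst_case, 31), fits_signed(worst_case, 32))
```

The worst-case product of two 16 bit operands does not fit in 31 bits but does fit in 32, which is why the product width is taken as the sum of the operand widths.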
Dynamic range refers to the range of echoes
processed and displayed by the ultrasound system.
In a fixed point architecture it is directly
proportional to the number of bits (about 6 dB per
bit), so as the number of bits decreases the dynamic
range also decreases. The dynamic range of a 16 bit
fixed point architecture is 96 dB. As the dynamic
range decreases, echoes at the weaker end of the
spectrum are lost. Dynamic range can be regarded as
a variable threshold for writing weaker signals. For
general imaging the dynamic range should be kept at
its maximum level to maximize the contrast
resolution potential. However, where low-level noise
or artifacts degrade image quality, the dynamic
range can be reduced to partially eliminate them
[15]. The 32 channel digital beamformer is
implemented in a single Virtex-6 FPGA.
The hardware blocks that perform digital
beamforming in one channel are depicted in Fig. 2.
A 32X16 bit FIFO buffers the incoming digital echo
from the ADC on its way to the delay structure.
Because a large memory is needed to store the input
data for beamforming, a FIFO is selected instead of
the FPGA internal block RAM.
In the delay-and-sum technique, a combination of
coarse and fine delays is used. An optimized delay
structure is designed to reduce the utilization of
FPGA resources. The design divides the channels of
the DBF into groups according to the time delays
they require. From a statistical study of the time
delay values required by each channel for any type
of transducer array, we concluded that a group of
channels in the centre always requires smaller time
delays than the channels at the left and right ends
of the array aperture. The delay line structure for
the centre channels therefore needs fewer delay flip
flops. The delay values depend on the probe
parameters and a number of I/O signals. The
calculated delay values for each probe are stored in
look up tables (LUTs), as shown in Fig. 4.
Fig. 4. Coarse delay structure of fixed point
architecture
The samples after coarse delay are the inputs to the
fine delay structure. A Farrow structure fractional
delay finite impulse response (FIR) filter with an
MMSE interpolator is used to generate the fine
delays [9]. The filter coefficients are stored in
LUTs, as depicted in Fig. 5.
Fig. 5. Fine delay structure of fixed point
architecture
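The idea of a two-coefficient fractional delay stage can be sketched as follows. This is not the MMSE interpolator used in the design: as a stand-in, the sketch uses plain linear interpolation, with assumed coefficients h1 = 1 − mu and h2 = mu for a fractional delay mu between 0 and 1 sampling periods. The function name `fractional_delay` is hypothetical.

```python
def fractional_delay(samples, mu: float):
    """Delay a sequence by a fraction mu of one sampling period.

    Implements y[n] = h1 * x[n] + h2 * x[n-1] with the linear-interpolation
    coefficients h1 = 1 - mu and h2 = mu (an illustrative substitute for
    coefficients that would be read from a LUT). x[-1] is taken as 0.
    """
    h1, h2 = 1.0 - mu, mu
    previous = 0.0                      # content of the one-sample delay register
    output = []
    for x in samples:
        output.append(h1 * x + h2 * previous)
        previous = x
    return output


if __name__ == "__main__":
    ramp = [0.0, 1.0, 2.0, 3.0, 4.0]
    print(fractional_delay(ramp, 0.25))
```

On a unit ramp the output settles to the input shifted by mu samples, which is an easy way to check a coefficient pair before storing it.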
The user can select the apodization technique used
to form the image. The weights of the Hanning,
Hamming and Kaiser windowing functions are stored in
LUTs, as illustrated in Fig. 6. A complex
multiplication must be computed for each of the
weights, which decide where the beam is formed. For
sixteen elements, forming one beam requires sixteen
weights, and N beams require N different sets of
sixteen weights. We assume the weights are fixed and
calculated offline.
Fig. 6. Apodization structure of fixed point
architecture
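A minimal sketch of offline weight calculation and of the weighted channel sum is given below. It uses the textbook symmetric Hanning and Hamming definitions and omits the Kaiser window, which needs a Bessel function; the table name `WINDOWS` and the function `apodize_and_sum` are illustrative and do not describe the LUT contents of the actual design.

```python
import math


def hanning(n: int):
    """Symmetric Hanning weights: 0.5 - 0.5*cos(2*pi*k/(n-1))."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]


def hamming(n: int):
    """Symmetric Hamming weights: 0.54 - 0.46*cos(2*pi*k/(n-1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]


def rectangular(n: int):
    """No apodization: every channel has weight 1."""
    return [1.0] * n


# Stand-in for the weight LUTs, keyed by the operator's selection.
WINDOWS = {"hanning": hanning, "hamming": hamming, "rectangular": rectangular}


def apodize_and_sum(channel_samples, window_name: str) -> float:
    """Weight one delayed sample per channel and sum across the aperture."""
    weights = WINDOWS[window_name](len(channel_samples))
    return sum(w * x for w, x in zip(weights, channel_samples))


if __name__ == "__main__":
    aligned = [1.0] * 32                 # 32 channels, already delay-aligned
    for name in WINDOWS:
        print(name, apodize_and_sum(aligned, name))
```

Tapering the edge channels lowers the side lobes at the cost of a wider main lobe, which is why the window is left as an operator choice.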
A ping pong memory stores the beamformed
output, which is then processed in ping pong
fashion.
In the fixed point architecture a 16 bit input
yields a 64 bit output, so the output data rate is
2.56 Gbps. This is because in fixed point arithmetic
the width of a product is the sum of the widths of
the operands being multiplied. We therefore turn to
a floating point architecture, in which the width of
a product equals the width of its operands. For a 16
bit input the output is then 16 bits wide and the
output data rate is 0.64 Gbps. For the same input
data rate, this reduces the output bit rate to one
fourth of the fixed point output bit rate without
affecting the accuracy of the output. The dynamic
range depends directly on the number of bits, so it
falls as the number of output bits decreases. For
the same word length, however, floating point has a
higher dynamic range than fixed point.
V. REALIZATION OF FLOATING POINT
DIGITAL BEAMFORMER
In this paper we designed and implemented a floating
point digital beamformer architecture that offers
better throughput, dynamic range and accuracy than
the fixed point architecture.
In the floating point architecture the 16 bit input
data is converted from fixed point to floating point
format using the floating point IP core version 5.0.
The coarse delay values are converted in the same
way. The coarse delayed output is 16 bits wide. It
is then multiplied by the filter coefficients and
the products are added using floating point
arithmetic, giving a 16 bit fine delayed output. The
fine delayed output is multiplied by a 16 bit
apodization coefficient using the floating point IP
core, giving a 16 bit output. Finally the outputs of
all channels are summed to give the 16 bit
beamformed output.
In the floating point architecture the width of a
product equals the width of its operands. Therefore
a 16 bit input yields a 16 bit output and the output
data rate is 0.64 Gbps, one fourth of the fixed
point output bit rate, without any deterioration in
the accuracy of the digital beamformer output. The
floating point architecture is described in detail
below.
A 32X16 bit FIFO buffers the digital echo from the
ADC on its way to the delay structure. The input
data is converted from fixed point to floating point
format using the floating point IP core version 5.0.
The FIFO buffers the input data; when 128 samples
are stored, Fifo_full goes high and the FIFO stops
writing. In the delay-and-sum technique, a
combination of coarse and fine delays is used. The
samples read from the FIFO are fed to a chain of D
flip flops, which apply the coarse delays to the
input data. The flip flops are positive edge
triggered and clocked at the sampling frequency,
i.e. 40 MHz, so the coarse delays are integer
multiples of the sampling period. The
output Qn of each D flip flop is fed to the
multiplexer MUX2, which selects one Qn according to
its select line data. The select line data is the
output of MUX1, which selects the delay value
required for the input data.
The pre-calculated delay values (expressed as a
number of clock periods) are stored in LUTs. MUX1
has the select lines probe_id, ‘θ’ and ‘r’. Probe_id
identifies the active probe inserted in the scanner,
‘θ’ gives the scan line angle and ‘r’ gives the
receive focus. The corresponding delay value is thus
selected from the LUT and applied as the select line
of MUX2. This delay value decides which Qn is
selected and passed to the fine delay structure, as
shown in Fig. 7. Each LUT stores the delay values
for one specific probe that the system supports.
Fig. 7. Coarse delay structure of floating point
architecture
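The behaviour of this coarse delay path can be modelled in software as a tapped shift register followed by a selector. The sketch below is a behavioural illustration only: the table `DELAY_LUT`, its keys and its delay values are hypothetical, and tap 0 is taken to be the undelayed input.

```python
from collections import deque

# Hypothetical LUT contents: (probe_id, theta_deg, r_mm) -> coarse delay in clocks.
DELAY_LUT = {
    ("probe_a", -30, 70): 3,
    ("probe_a", 0, 70): 1,
    ("probe_a", 30, 70): 2,
}


class CoarseDelayLine:
    """Tapped delay line: taps[k] is the sample that entered k clocks ago."""

    def __init__(self, depth: int) -> None:
        self.taps = deque([0] * depth, maxlen=depth)

    def clock(self, sample: int, delay_select: int) -> int:
        self.taps.appendleft(sample)       # shift on the clock edge, oldest sample drops out
        return self.taps[delay_select]     # role of MUX2: pick the tap named by the select value


def run_channel(samples, probe_id: str, theta_deg: int, r_mm: int, depth: int = 8):
    """Look up the delay (role of MUX1) and pass the samples through the delay line."""
    delay = DELAY_LUT[(probe_id, theta_deg, r_mm)]
    line = CoarseDelayLine(depth)
    return [line.clock(s, delay) for s in samples]


if __name__ == "__main__":
    echo = [10, 20, 30, 40, 50]
    print(run_channel(echo, "probe_a", 0, 70))
    print(run_channel(echo, "probe_a", -30, 70))
```

The depth of each line only needs to cover the largest delay stored for that channel, so channels with small delays can use shorter lines.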
The filter coefficients are converted from fixed
point to floating point format using the floating
point IP core version 5.0. The converted values are
then stored in LUTs.
Fig. 8. Fine delay structure of floating point
architecture
MUX3 and MUX4 select the filter coefficients h1 and
h2. Their select lines are probe_id, ‘θ’ and ‘r’.
The Farrow structure is built from D flip flops. The
samples are multiplied by h1 and h2 and added using
the floating point IP core version 5.0, as depicted
in Fig. 8. The sampling frequency is 40 MHz.
The apodization coefficients are converted from
fixed point to floating point format using the
floating point IP core version 5.0, and the
converted values of the different window functions
are stored in LUTs. Apo_select identifies the window
function selected by the operator. The samples after
coarse and fine delay are multiplied by the selected
window function using the floating point IP core
5.0, as shown in Fig. 9.
Fig. 9. Apodization unit of floating point
architecture
VI. RESULTS
We performed the simulations in MATLAB 2013a
and Xilinx 14.3 to verify the beamformed output.
The simulation results illustrate the performance
comparison between the fixed point and floating
point beamformer architectures.
A. MATLAB Simulation
1) Transducer Array Simulation
We modelled the transducer array using the Field II
toolbox in MATLAB. For the simulations, we used a 32
element phased transducer array with a 40 MHz
sampling frequency and a 3.5 MHz centre frequency.
The focus is set at a depth of 70 mm and the sector
angle at 64 degrees. The distance between element
centres (pitch) is set to 0.160 mm. The width of the
fill material between ceramic elements (kerf) is set to
0.025 mm. The element width is obtained by
subtracting the kerf from the pitch (0.135 mm), the
element height is 13 mm, the element subdivision in
the x direction is 1, and the element subdivision in
the y direction is 5. Fig. 10 shows the simulated
phased array transducer.
Fig. 10. Ultrasound probe model in MATLAB
2) Point Target Phantom Simulation
To evaluate the performance, we designed a
phantom of amplitude 10^(25/20) with 2 point
targets located at depths of 60 mm and 70 mm
using the Field II tool in MATLAB, as in Fig. 11.
Fig. 12 shows the images of the point targets
obtained with different window functions in the
beamforming process. As observed from Fig. 12, the
Kaiser window gives a better result than the other
windows. Echoes generated from the cyst phantom are
shown in Fig. 13.
Fig. 11. Point Target Phantom Model
Fig. 12. Images of simulated point target phantom
using different windows: (a) Hanning window, (b)
Hamming window, (c) Blackman window, (d)
Rectangular window, (e) Tukey window, (f)
Kaiser window.
Fig. 13. Generated echoes from phantom
Fig. 14 shows the MATLAB output of the digital
beamformer architecture.
Fig. 14. MATLAB output of DBF
B. FPGA Simulation
For the fixed point architecture we used the
Virtex-6 ML605 evaluation board as the
implementation platform. The input data and the
coarse and fine delay values are loaded as .coe
files into block memory generators. RAM based shift
registers implement the coarse delay.
Fig. 15. FPGA simulation blocks of fixed point
DBF
Fig. 16. FPGA simulation of fixed point DBF
Fig. 17. FPGA output of fixed point DBF
For the floating point architecture we also used the
Virtex-6 ML605 evaluation board as the
implementation platform. The input data and the
coarse and fine delay values are loaded as .coe
files into block memory generators, and RAM based
shift registers implement the coarse delay. The
LogiCORE IP Floating-Point Operator v5.0 available
for the Virtex-6 FPGA is used for floating point
processing. Three features of the floating point IP
core are used here: fixed to float conversion,
floating point addition and floating point
multiplication.
Fig. 18. FPGA simulation blocks of floating point
DBF
Fig. 19. FPGA simulation of floating point DBF
Fig. 20. FPGA output of floating point DBF
VII. DISCUSSION
The floating point architecture was compared with
the fixed point architecture on the basis of the
simulation results from MATLAB and the FPGA. The
parameters analysed in the comparison are explained
in the following subsections.
A. I/O Data Rate Estimation
For both architectures, the data rate is computed by
multiplying the number of bits, the sampling
frequency and the number of channels. For a 16 bit
input, 32 channel digital beamformer with a sampling
frequency of 40 MHz, the input data rate is 20480
Mbps, or 20.48 Gbps. The output of the fixed point
architecture is 64 bits wide, so its output data
rate is 2560 Mbps, or 2.56 Gbps.
The floating point architecture has the same input
data rate of 20480 Mbps (20.48 Gbps), but its output
is 16 bits wide, so its output data rate is 640
Mbps, or 0.64 Gbps.
For the same input data rate, the floating point
architecture reduces the output data rate to one
quarter of that of the fixed point architecture.
Performance improvement in output data rate
$= \dfrac{2.56 - 0.64}{2.56} = 0.75 = 75\%$.
Fig. 21. Output Data Rate
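The data rate figures in this subsection follow directly from the product of word width, sampling frequency and channel count, as the short sketch below reproduces. The function name `data_rate_bps` is introduced here for illustration; the single beamformed output stream is treated as one channel.

```python
def data_rate_bps(bits: int, fs_hz: int, channels: int = 1) -> int:
    """Data rate in bits per second: word width x sampling frequency x channels."""
    return bits * fs_hz * channels


FS_HZ = 40_000_000                                       # 40 MHz sampling clock

input_rate = data_rate_bps(16, FS_HZ, channels=32)       # 16 bit input, 32 channels
fixed_output_rate = data_rate_bps(64, FS_HZ)             # 64 bit beamformed output
float_output_rate = data_rate_bps(16, FS_HZ)             # 16 bit beamformed output
reduction = (fixed_output_rate - float_output_rate) / fixed_output_rate

if __name__ == "__main__":
    print(input_rate / 1e9, "Gbps input")
    print(fixed_output_rate / 1e9, "Gbps fixed point output")
    print(float_output_rate / 1e9, "Gbps floating point output")
    print(f"{reduction:.0%} reduction in output data rate")
```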
B. Hardware Utilization
Hardware utilization is higher in the floating point
architecture than in the fixed point architecture
because it requires more multipliers (DSP48E1s): 24%
and 19% respectively. Fig. 22 compares the hardware
utilization of the two architectures. LUT and FF
usage and the maximum frequency all fall as the
latency is reduced, so minimizing latency minimizes
resources. The floating point IP core offers
flexibility in the choice of latency values. For the
fixed to float conversion the latency ranges from 0
to 6. For floating point addition and multiplication
the latency ranges from 0 to 8. The minimum required
latency is 1 [10]. Table I and Table II show the
device utilization of the two architectures.
TABLE I
FLOATING POINT ARCHITECTURE DEVICE UTILIZATION (VIRTEX 6 FPGA)
Device Utilization Summary (estimated values)

Logic Utilization                   Used    Available  Utilization
Number of Slice Registers           7932    301440     2%
Number of Slice LUTs                26672   150720     17%
Number of fully used LUT-FF pairs   4816    29788      16%
Number of bonded IOBs               533     600        88%
Number of Block RAM/FIFO            64      416        15%
Number of BUFG/BUFGCTRLs            1       32         3%
Number of DSP48E1s                  188     768        24%
TABLE II
FIXED POINT ARCHITECTURE DEVICE UTILIZATION (VIRTEX 6 FPGA)
Device Utilization Summary (estimated values)

Logic Utilization                   Used    Available  Utilization
Number of Slice Registers           4210    301440     1%
Number of Slice LUTs                6279    150720     4%
Number of fully used LUT-FF pairs   4696    5793       81%
Number of bonded IOBs               239     600        39%
Number of Block RAM/FIFO            64      416        15%
Number of BUFG/BUFGCTRLs            2       32         3%
Number of DSP48E1s                  152     768        19%
Fig. 22. Hardware Utilization
C. Accuracy
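To analyze the accuracy of the DBF output obtained
from the floating point and fixed point
architectures, the deviation of the FPGA simulation
output from the MATLAB output is calculated. To find
the error between signals, the signals are first
normalized to zero mean and unit variance:
$s_1 = (\mathrm{dbf}_{\mathrm{MATLAB}} - \mathrm{mean}(\mathrm{dbf}_{\mathrm{MATLAB}})) / \mathrm{std}(\mathrm{dbf}_{\mathrm{MATLAB}})$
$s_2 = (\mathrm{dbf}_{\mathrm{float}} - \mathrm{mean}(\mathrm{dbf}_{\mathrm{float}})) / \mathrm{std}(\mathrm{dbf}_{\mathrm{float}})$
$s_3 = (\mathrm{dbf}_{\mathrm{fixed}} - \mathrm{mean}(\mathrm{dbf}_{\mathrm{fixed}})) / \mathrm{std}(\mathrm{dbf}_{\mathrm{fixed}})$
$\mathrm{Error}_{\mathrm{float}} = \max(s_1 - s_2) = 6.6061$
$\mathrm{Error}_{\mathrm{fixed}} = \max(s_1 - s_3) = 6.4251$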
From the above calculation it is clear that the
error is smaller for the 64 bit fixed point
architecture than for the 16 bit floating point
architecture, as shown in Fig. 23. As the number of
bits increases, the error decreases.
Fig. 23. Accuracy
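The error measure used in this subsection can be sketched as below. The sketch assumes the sample standard deviation (divisor n − 1), and it takes the maximum of the signed difference exactly as the expressions in this subsection are written; the function names `normalize` and `max_deviation` and the example signals are hypothetical.

```python
import statistics


def normalize(signal):
    """Scale a signal to zero mean and unit (sample) standard deviation."""
    mean = statistics.mean(signal)
    std = statistics.stdev(signal)       # sample standard deviation, divisor n - 1
    return [(x - mean) / std for x in signal]


def max_deviation(reference, candidate) -> float:
    """Largest signed difference between the normalized reference and candidate."""
    s_ref = normalize(reference)
    s_cand = normalize(candidate)
    return max(a - b for a, b in zip(s_ref, s_cand))


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0, 2.0, 1.0]        # hypothetical reference output
    candidate = [1.0, 2.1, 2.9, 2.0, 1.1]        # hypothetical hardware output
    print(max_deviation(reference, candidate))
```

Because both signals are normalized first, a pure gain or offset difference between the two outputs contributes nothing to the error; only differences in shape remain.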
D. Dynamic Range
Dynamic range is the ratio of the largest to the
smallest number that can be represented in a data
format. The floating point architecture has a higher
dynamic range than the fixed point architecture.
Fig. 24 illustrates the drastic increase in dynamic
range of floating point compared with fixed point.
Table III lists the dynamic range values of the
floating point and fixed point architectures,
computed as
$\mathrm{DR}_{\mathrm{fixed}} = 20\log_{10}(2^{N} - 1)$ dB, where $N$ is the number of bits,
$\mathrm{DR}_{\mathrm{float}} = 20\log_{10}(2^{2^{E}})$ dB, where $E$ is the number of exponent bits.
TABLE III
DYNAMIC RANGE

No. of bits   Fixed point dynamic range (dB)   Floating point dynamic range (dB)
16            96                               192
32            192                              1541
64            385                              12330
Fig. 24. Dynamic Range
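The entries of Table III can be reproduced with a few lines of code. The sketch assumes the IEEE 754 exponent widths of 5, 8 and 11 bits for the 16, 32 and 64 bit formats, writes N for the number of bits and E for the number of exponent bits, and truncates the results to whole decibels as the table appears to do; the function names are illustrative.

```python
import math

# IEEE 754 exponent field widths for the 16, 32 and 64 bit binary formats.
EXPONENT_BITS = {16: 5, 32: 8, 64: 11}


def fixed_dynamic_range_db(n_bits: int) -> float:
    """20*log10(2^N - 1) for an N bit fixed point word."""
    return 20.0 * math.log10(2 ** n_bits - 1)


def float_dynamic_range_db(exponent_bits: int) -> float:
    """20*log10(2^(2^E)), evaluated as 20 * 2^E * log10(2) to avoid huge numbers."""
    return 20.0 * (2 ** exponent_bits) * math.log10(2.0)


if __name__ == "__main__":
    for n_bits, e_bits in EXPONENT_BITS.items():
        print(n_bits,
              int(fixed_dynamic_range_db(n_bits)),
              int(float_dynamic_range_db(e_bits)))
```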
VIII. CONCLUSION
The digital beamformer is an essential processing
block in various antenna array signal processing
applications. In this paper we developed an FPGA
based floating point DBF architecture, since an FPGA
can handle a high input data rate through high speed
serial I/Os and the flexibility of the FPGA
structure eases the real time implementation
challenges of high sampling rate applications. We
implemented a phased array beamformer; the delay
calculations for various angles show that the centre
transducer elements require less delay than the
leftmost and rightmost elements. We proposed and
implemented a variable delay line structure for each
element, which reduced the number of flip flops in
the digital beamformer architecture. The performance
of the proposed floating point architecture was
compared with that of the fixed point architecture.
The accuracy of the DBF output is higher for the 64
bit fixed point architecture than for the 16 bit
floating point architecture, since the error
decreases as the number of bits increases. We used
the floating point IP core 5.0 available for the
Virtex-6 FPGA to perform floating point arithmetic.
Our implementation shows a 75% reduction in output
data rate. The dynamic range calculation shows a
drastic increase for the floating point architecture
over the fixed point architecture. By minimizing the
latency we minimized the resources used in the
floating point architecture; the floating point IP
core offers flexibility in the choice of latency
values. Hardware utilization is higher for the
floating point architecture because of its use of
multipliers in floating point arithmetic.
ACKNOWLEDGMENT
This work was supported by Microelectronics
division of the Ministry of Electronics and
Information Technology, Government of India, as
per order No 9(1)/2014 –MDD dated 15-12-2014.
REFERENCES
[1] B. D. Van Veen and K. M. Buckley, "Beamforming: A
versatile approach to spatial filtering", IEEE ASSP
Magazine, vol. 5, no. 2, pp. 4-24, 1988.
[2] K. E. Thomenius, "Evolution of Ultrasound
Beamformers", IEEE Ultrasonics Symposium,
pp. 1615-1622, 1996.
[3] C. A. Balanis, "Antenna Theory: Analysis and Design",
3rd edition, Wiley, 2005.
[4] S. Sahin, A. Kavak, Y. Becerikli, and H. E. Demiray,
"Implementation of floating point arithmetics using
an FPGA", Mathematical Methods in Engineering,
pp. 445-453, ISBN 978-1-4020-5677-2, Springer
Netherlands, January 2007.
[5] https://en.wikipedia.org/wiki/IEEE_floating_point
[6] S. W. Smith, Chapter 28, "Fixed versus Floating
Point", The Scientist and Engineer's Guide to Digital
Signal Processing, California Technical Pub., p. 514,
ISBN 0966017633, 1997. Retrieved December 31, 2012.
[7] Texas Instruments, Signal Processing Overview of
Ultrasound Systems for Medical Imaging, 2008.
[8] https://en.wikipedia.org/wiki/Half-precision_floating-point_format
[9] J. U. Kidav, B. A. Sujathakumari, C. A. Laseena,
"Ultrasound Array Modelling and Beamforming
using Field II", International Journal of Emerging
Research in Management Technology, ISSN: 2278-9359,
vol. 4, no. 6, 2015.
[10] Xilinx, LogiCORE IP Floating-Point Operator v5.0,
2011.
[11] M. S. Chaitra, B. G. Sudarshan, B. S.
Sathyanarayana and P. Kumar, "Ultrasound Imaging
System: A Review", International Journal of
Pharmacology and Pharmaceutical Technology
(IJPPT), ISSN: 2277 3436, vol. 1, no. 2, 2012.
[12] T. Haynes, "A Primer on Digital Beamforming",
Spectrum Signal Processing, March 26, 1998,
http://www.spectrumsignal.com.
[13] L. Azar, Y. Shi and S. C. Wooh, "Beam focusing
behavior of linear phased arrays", NDT&E
International, Elsevier, vol. 33, pp. 189-198, July
2000.
[14] J. A. Jensen, "Field: A program for simulating
ultrasound systems", Med. Biol. Eng. Comput., vol. 4,
pp. 351-353, 1996.
17 Jayaraj U Kidav, N.M Sivmangai, Nidhi Antony, Dr.M.P Pillai
International Journal of Electronics, Electrical and Computational System
IJEECS
ISSN 2348-117X
Volume 6, Issue 6
June 2017
[15]
http://www.wikiradiography.net/page/Ultrasound+P
hysics
[16] J.park, S.M.Wi, and J.S. Lee, “Computationally
efficient adaptive beamformer for ultrasound
imaging based on QR decomposition”, IEEE
Transactions on Ultrasonics, Ferroelectrics, and
Frequency Control, vol. 63, no. 2, February 2016.
[17] C.H. Hu, X.C. Xu, J.M.. Cannata, J.T. Yen, and
K.K. Shung, “Development of a Real-Time, High-
Frequency Ultrasound Digital Beamformer for High
Frequency Linear Array Transducers”, IEEE
Transactions on Ultrasonics, Ferroelectrics, and
Frequency Control, vol. 53, no. 2, February 2006.
[18] S. Sami Deeb and Robert A. LaTourette,”
Derivation of Beam Interpolation Coefficients with
Application to the K- Beamformer” NUWC-NPT
Technical Report 11,287 15 June 2001,IEEE Journal
On Very Large Scale Integration Systems .
[19] T.I. Laakso,V.Valimaki,M.Karjlainen and U.K.
Laine, “Splitting the unit delay: tools for fractional
delay filter design”, IEEE Signal Processing
Magazine, page 30 60 January 1996.
[20] P N T Wells, “Ultrasonic imaging of the human
body”,Rep. Prog. Phys.62 pp-671722,1999, Printed
in the UK
[21] P. Hoskins, K. Martin, A.Thrush,” Diagnostic
Ultrasound Physics and Equipment.”
[22] J. A. Jensen,” Users guide for the Field II program”,
Release 3.20, November 19 2010.
[23] W.Hua and L.Mei, “The Design of Delay Pulse
Circuit for Ultrasonic Phased Array System”,
Proceedings of 20th International Congress on
Acoustics, ICA 2010 23-27 August 2010, Sydney,
Australia.
[24] J.Y.Lu, “Transmit-Receive Dynamic Focusing with
Limited Diffraction Beams”, IEEE Ultrasonics
Symposium 1543,1997.
[25] http://www.signal-processing.com/us field.html
[26] J.A. Jensen, “Ultrasound imaging and its modeling”
Chapter in. Fink et al. (Eds.): Imaging of Complex
Media with Acoustic and Seismic Waves, Topics in
Applied Physics, vol. 84, pp. 135-165, Springer
Verlag,2002.
[27] http://www.tp-
link.in/resources/document/beamforming.pdf
[28] J.A.Jensen and P. Munk, “ Computer Phantoms For
Simulating Ultrasound B-Mode And CFM Images”,
Acoustical imaging, vol. 23, pp. 75-80, eds.: s. Lees
and l. A. Ferrari, plenum press,1997.
[29] I. Lie, M.E.Tanase, “A Compact FPGA Beamformer
Architecture”, WSEAS Int. Conf. On Dynamical
Systems and Control, Venice, Italy, November 2-4,
(Pp463-466),2005.
[30] B.G. Tomov and J.A. Jensen, “A new architecture
for a single-chip multi-channel beamformer based on
a standard FPGA”, IEEE Ultrasonics Symposium
,2001.
[31] J.Y. Um, E.W. Song, Y.J. Kim, S.E. Cho, M.K.
Chae, J. Song, B. Kim, S. Lee, J. Bang, Y.Kim, K.
Cho, B. Kim, J.Y. Sim, H.J.Park, “An Analog-
Digital-Hybrid Single-Chip RX Beamformer with
Non- Uniform Sampling for 2D-CMUT Ultrasound
Imaging to Achieve Wide Dynamic Range of Delay
and Small Chip Area”,IEEE International Solid-
State Circuits Conference,2014
[32] M.Almekkawy, J.Xu and M. Chirala, “An
Optimized Ultrasound Digital Beamformer with
Dynamic Focusing Implemented on FPGA”, IEEE
Conference proceedings IEEE Eng Med Biol Soc,
2014
[33] D.B. Casas, “Digital Beamforming Implementation
on an FPGA Platform”, SPCOM Group,July 2007.
[34] G.Meng, “Method and Apparatus for Multi-Beam
Beamformer Based On Real- Time Calculation of
Time Delay and Pipeline Design”, Patent
Application Publication, US 2011/0237950 A1, Sep.
29, 2011 Sheet 1 of 14
[35] G.I. Athanasopoulos, S.J. Carey, and J.V. Hatfield,
“Circuit Design and Simulation of a Transmit
Beamforming ASIC for High-Frequency Ultrasonic
Imaging Systems”, IEEE Transactions on
Ultrasonics, Ferroelectrics, and Frequency Control,
vol. 58, no. 7, July 2011
[36] V.N.Okorogu, G.C.Nwalozie, K.C.Okoli and
E.D.Okoye, “Design and Simulation of a Low Cost
Digital Beamforming (DBF) Receiver for Wireless
Communication”, International Journal of
Innovative Technology and Exploring Engineering
(IJITEE) ISSN: 2278-3075, Volume-2, Issue- 2,
January 2013.
[37] S. G. Dighe and M. T. Kanawade, “Field
Programmable Gate Array Technique’s” ,
International Journal of Computing and Technology,
Volume 2, Issue 12, December 2015 ISSN : 2348
6090.
[38] U.M.Baese, “Digital Signal Processing and Field
Programmable Gate Arrays”,3rd edition,Springer.
[39] M.M. Nguyen and J.T. Yen, “Performance
Improvement of Fresnel Beamforming Using Dual
Apodization with Cross-Correlation”, IEEE
Transactions on Ultrasonics, Ferroelectrics, and
Frequency Control, vol. 60, no. 3, March 2013
[40] S. A.Mohamed, E.D.Mohamed, M.F.Elshikh, and
M. A.Hassan, “Design of Digital Apodization
Technique for Medical Ultrasound Imaging”,
International Conference On Computing, Electrical
And Electronic Engineering(ICCEEE),2013
[41] J.Bhattacharyya, P.Mandal, R.Banerjee, S. Banerjee,
“ Real Time Dynamic Receive Apodization For An
Ultrasound Imaging System”, Proceedings of the
18 Jayaraj U Kidav, N.M Sivmangai, Nidhi Antony, Dr.M.P Pillai
International Journal of Electronics, Electrical and Computational System
IJEECS
ISSN 2348-117X
Volume 6, Issue 6
June 2017
19th International Conference on VLSI Design
(VLSID06)
[42] B.G.Tomov and J. A. Jensen, “ Compact
implementation of dynamic receive apodization in
ultrasound Scanners”, Proceedings of SPIE - The
International Society for Optical Engineering , April
2004
[43] M. A. Hassan, “ Comparison between windowing
Apodization functions techniques For medical
ultrasound imaging ” , American Journal of
Biomedical Science and Engineering 2015; 1(1): 1-
8, Published online January 30, 2015,
(http://www.aascit.org/journal/ajbse)
[44] I. Mahi P. and S. S. Kerur, “Design and Simulation
of Floating Point Pipelined ALU Using HDL and IP
Core Generator”, International Journal of Current
Engineering and Technology , ISSN 2277-4106,
2013, INPRESSCO,Available at
http://inpressco.com/category/ijcet.
[45] L.Gangwar and R.Chaudhary, “ Floating Point
Arithmetic Unit Using Verilog”, Advance in
Electronic and Electric Engineering.ISSN 2231-
1297, Volume 3, Number 8 (2013), pp. 1013-1018,
Research India Publications,
http://www.ripublication.com/aeee.htm
[46] IEEE Standard For Floating-Point Arithmetic,
Microprocessor Standards Committee of The IEEE
Computer Society,Approved 12 June 2008,IEEESA
Standards Board.