Parallel Computing Using FPGA (Field Programmable Gate Arrays)

15th May, 2009. Studies in Parallel & Distributed Systems – 159.735. Sohaib Ahmed.


Page 1: Parallel Computing Using FPGA (Field Programmable Gate Arrays)

15th May, 2009

Studies in Parallel & Distributed Systems – 159.735

Sohaib Ahmed

Page 2: Outline

FPGAs and their internal structures

Why use FPGAs for parallel computing?

Types of FPGAs

Application Examples and Processing in Applications

FPGAs in Parallel Computing

FPGA Limitations

Design Methods for FPGAs

Conclusion

Page 3: FPGAs - Introduction

Ross Freeman, one of the founders of Xilinx (www.xilinx.com), invented the FPGA in the mid-1980s

Other vendors include Altera, Actel, Lattice Semiconductor and Atmel

FPGAs support the notion of reconfigurable computing

Reconfigurable Computing

Uses multiple reconfigurable devices (such as FPGAs) together with multiple microprocessors

The processor(s) execute sequential and non-critical code, while the reconfigurable fabric (FPGAs) executes the code that can be mapped efficiently to hardware

Page 4: FPGA Internal Structure

An FPGA is a semiconductor device consisting of:

Configurable Logic Blocks (CLBs)

Input/Output (I/O) Blocks (IOBs)

Static RAM (SRAM) Blocks

Digital Signal Processing Blocks (DSPBs)

Page 5: Why Use FPGAs?

Speed-up

Hardware is faster than software [1]

FPGAs can support thousand-fold parallelism, especially for low-precision computations

Cost

Development cost is much lower than for an ASIC (application-specific integrated circuit) at low volumes

Flexibility

FPGAs are more flexible than ASICs because they can be reprogrammed

Technology           Clock Speed   Time Taken
XC2V6000 FPGA        66 MHz        0.36 ms
Optimized software   2.6 GHz       196.71 ms
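The table's figures can be turned into a speed-up with a little arithmetic. The sketch below uses only the clock speeds and times listed in the table; the variable names are illustrative, and the cycle ratio is a derived quantity rather than a number reported in the source.

```python
# Values copied from the table above.
fpga_clock_hz = 66e6      # 66 MHz
fpga_time_s = 0.36e-3     # 0.36 ms
cpu_clock_hz = 2.6e9      # 2.6 GHz
cpu_time_s = 196.71e-3    # 196.71 ms

# Wall-clock speed-up of the FPGA over the optimized software.
speedup = cpu_time_s / fpga_time_s

# Clock cycles each platform spends on the task.
fpga_cycles = fpga_clock_hz * fpga_time_s
cpu_cycles = cpu_clock_hz * cpu_time_s

# How many times fewer cycles the FPGA needs, i.e. how much more
# work it completes per clock cycle.
cycle_ratio = cpu_cycles / fpga_cycles

print(f"wall-clock speed-up: {speedup:.1f}x")
print(f"cycle ratio: {cycle_ratio:.1f}x")
```

The FPGA finishes roughly 546 times sooner even though its clock is about 39 times slower, so the gain comes from doing far more work per cycle, not from clock speed.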

Page 6: Types of FPGAs

CPLDs (Complex Programmable Logic Devices)

Require voltage levels that are not usually present in computer systems

Anti-fuse-based devices

Can be programmed only once

Static-RAM-based devices

Can be programmed while the device is running

Page 7: Application Examples

Xilinx devices

Virtex-II Pro

Virtex-4

Recent success of FPGAs in the Tsubame cluster in Tokyo

Improved performance by an additional 25%

Page 8: Processing in Applications [2]

Page 9: FPGAs in Parallel Computing

Dynamic matching of a node to the computational requirements of an application

Application-specific computers become more flexible

Enables support for multiple modes of parallel computing: MIMD, SIMD, etc.

Partial reconfiguration can allow better utilization of hardware resources

Dynamic task allocation schemes can be extended to allow dynamic hardware allocation

Support for variable grain size

Page 10: FPGA Limitations

Capacity

Logic blocks are a less dense representation of a computation than instructions

Conventional processors run the 90% of the code that takes 10% of the execution time

Reconfigurable logic runs the 10% of the code that takes 90% of the execution time

Tools

Compilers for reconfigurable logic are not yet mature

Some operations, such as random access and pointer-based data structures, are hard to implement on FPGAs

Page 11: Design Methods for FPGA [3]

Use an algorithm optimal for FPGAs

Systolic arrays for correlation are efficient

Use a computing mode appropriate for FPGAs

Streaming, systolic, and arrays of fine-grained automata are preferable

Searching biomedical databases for similar sequences

Use appropriate FPGA structures

Analyzing DNA or protein sequences

A straightforward systolic array
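The systolic idea behind sequence matching can be illustrated with a small software simulation. This is a minimal sketch, not the design from [3]: the function name and the example strings are hypothetical, and for simplicity the incoming character is broadcast to every processing element (PE) while only the partial sums move from neighbour to neighbour.

```python
def systolic_match_counts(pattern, text):
    """Simulate a linear array of PEs, one per pattern character.

    Each clock cycle one text character enters. PE k adds 1 to the
    partial sum handed over by PE k-1 if the character equals its
    stored pattern character. The last PE emits, for every alignment
    of the pattern against the text, the number of matching positions.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must not be empty")
    regs = [0] * m          # partial-sum register at the output of each PE
    results = []
    for cycle, ch in enumerate(text):
        prev = regs         # values latched at the end of the previous cycle
        # All PEs update in the same cycle, as they would in hardware.
        regs = [
            (prev[k - 1] if k > 0 else 0) + (pattern[k] == ch)
            for k in range(m)
        ]
        if cycle >= m - 1:  # the array is full: one valid result per cycle
            results.append(regs[-1])
    return results


counts = systolic_match_counts("ACG", "TACGACT")
print(counts)  # one score per alignment of "ACG" against the text
```

Once the array has filled, it produces one alignment score per clock cycle regardless of the pattern length, because every comparison for that cycle happens in parallel. In hardware the pattern length is limited by the number of PEs that fit on the device.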

Page 12: Design Methods for FPGA [3]

Living with Amdahl’s Law

Speeding up an application significantly through an enhancement requires most of the application to be enhanced

The NAMD & ProtoMol framework was designed for computational experimentation
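Amdahl's Law can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that 90% of the run time is moved to the FPGA; the function name and the kernel speed-ups tried are hypothetical.

```python
def amdahl_speedup(accelerated_fraction, kernel_speedup):
    """Overall speed-up when a fraction of the run time is accelerated.

    Amdahl's Law: S = 1 / ((1 - p) + p / s), where p is the fraction of
    the original run time that is accelerated and s is the speed-up
    achieved on that fraction.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / kernel_speedup)


# Illustrative assumption: 90% of the run time is offloaded.
for s in (10, 100, 1000):
    print(f"kernel {s}x faster -> application {amdahl_speedup(0.9, s):.2f}x faster")
```

With 90% of the run time accelerated, the application can never run more than 10 times faster, however fast the FPGA kernel is. This is why most of the application has to be enhanced before large overall gains appear.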

Hide latency of independent functions

Latency hiding is a basic technique for achieving high performance in parallel applications

Functions on the same chip can operate in parallel

Use rate-matching to remove bottlenecks

Function-level parallelism is built in
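Rate-matching can be pictured as replicating the slow stages of a pipeline until every stage keeps up with the fastest one. The sketch below is an illustration only: the stage names, the rates (in millions of items per second) and both function names are hypothetical, not taken from [3].

```python
import math


def replicas_for_rate_matching(stage_rates, target_rate):
    """Parallel copies of each stage needed to sustain target_rate."""
    return {
        name: math.ceil(target_rate / rate)
        for name, rate in stage_rates.items()
    }


def pipeline_throughput(stage_rates, replicas):
    """A pipeline runs at the aggregate rate of its slowest stage."""
    return min(rate * replicas[name] for name, rate in stage_rates.items())


# Hypothetical per-unit rates, in millions of items per second.
stages = {"load": 200.0, "compute": 50.0, "store": 100.0}

single = {name: 1 for name in stages}
print(pipeline_throughput(stages, single))    # limited by "compute"

matched = replicas_for_rate_matching(stages, target_rate=200.0)
print(matched)
print(pipeline_throughput(stages, matched))   # every stage now keeps up
```

On an FPGA the extra copies are additional logic placed on the same chip, so the cost of removing a bottleneck is area, not time.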


Page 13: Design Methods for FPGA [3]

Take advantage of FPGA-specific hardware

Hard-wired components such as integer multipliers and independently accessible BRAMs (Block RAMs)

The Xilinx VP100 has 400 independently accessible, 32-bit, quad-ported BRAMs, which can help achieve 20 terabytes per second at capacity

Use appropriate arithmetic precision

Use appropriate arithmetic mode

Minimize use of high-cost arithmetic operations
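Choosing an appropriate precision often means replacing floating point with fixed point and checking that the error is acceptable. The sketch below is a minimal illustration of that check; the helper names and the bit widths tried are assumptions, not values from [3].

```python
def to_fixed(x, frac_bits):
    """Quantize a real number to a fixed-point integer with frac_bits fractional bits."""
    return round(x * (1 << frac_bits))


def from_fixed(q, frac_bits):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def fixed_mul(a, b, frac_bits):
    """Multiply two fixed-point values using integer operations only.

    The raw product carries 2 * frac_bits fractional bits, so it is
    shifted right to return to frac_bits (the low bits are truncated).
    """
    return (a * b) >> frac_bits


x = 0.7071
for bits in (8, 12, 16):
    q = to_fixed(x, bits)
    err = abs(from_fixed(q, bits) - x)
    bound = 2.0 ** -(bits + 1)   # rounding error is at most half a step
    print(f"{bits} fractional bits: error {err:.2e} (bound {bound:.2e})")
```

Each extra fractional bit halves the worst-case rounding error, so the smallest width that meets the application's error budget can be found by trying a few values. Narrower operands generally need less logic per operation, which leaves room for more parallel units.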


Page 14: Current Progress in Hardware & Software

SRC-6 and SRC-7 are parallel architectures with a crossbar switch that can be stacked for scalability

High-performance computing vendors such as Silicon Graphics Inc. (SGI), Cray and Linux Networx have incorporated FPGAs into their parallel architectures [4]

VHDL and Verilog are used to create hardware kernels

High-level languages such as Carte C, Carte Fortran, Impulse C, Mitrion C and Handel-C are also used

Annapolis Micro Systems’ CoreFire, Starbridge Systems’ Viva, Xilinx System Generator and DSPlogic’s reconfigurable computing toolbox are high-level graphical development tools [5]

Page 15: Conclusion

Using FPGAs in parallel computing offers the following benefits:

Application acceleration

Flexibility in terms of application domain

Potential cost benefits over ASICs

The ability to exploit variable levels and modes of parallelism

More effective use of hardware resources

Page 16: References

[1] Todman, T.J., Constantinides, G.A., Wilton, S.J.E., Mencer, O., Luk, W. & Cheung, P.Y.K. (2005). Reconfigurable computing: architectures and design methods.

[2] Altera Corporation White Paper (2007). Accelerating high performance computing with FPGAs. October 2007.

[3] Herbordt, M.C., VanCourt, T., Gu, Y., Sukhwani, B., Conti, A., Model, J. & DiSabello, D. (2007). Achieving high performance with FPGA-based computing.

[4] Buell, D., El-Ghazawi, T., Gaj, K. & Kindratenko, V. (2007). High-performance reconfigurable computing. IEEE Computer Society, March 2007.

[5] El-Ghazawi, T., El-Araby, E., Huang, M., Gaj, K., Kindratenko, V. & Buell, D. (2008). The promise of high-performance reconfigurable computing. IEEE Computer Society, February 2008, pp. 69-76.

Any Questions?

Thank You