Measuring the Gap between FPGAs and ASICs
1
Measuring the Gap between FPGAs and ASICs
Authors: Ian Kuon, Jonathan Rose
(ECE, University of Toronto)
Presenter: Sang-Kyo Han
(ECE, University of Maryland)
Published in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2007
2
My Motivations for Paper Selection
I am interested in FPGA power reduction. I need reference data comparing the area, performance, and power consumption of FPGAs and ASICs.
3
Contents
1. Introduction
2. Historical Measurements
3. New Comparison Methodology
4. FPGA CAD Flow
5. ASIC CAD Flow
6. Comparison Metrics
7. Results
8. Conclusion
4
Introduction
Motivations of the Research
• Quantitative measurements make it easier for system architects to choose between FPGAs and ASICs as their implementation medium.
• FPGA makers seeking to improve FPGAs can gain insight from quantitative measurements.
Focus
• A comparison between a 90nm CMOS SRAM-programmable FPGA and a 90nm CMOS standard cell ASIC, to measure the area, performance, and power consumption gap.
• More meaningful than past comparisons: a wide range of benchmarks and real empirical experiments.
5
Historical Measurements
S.D. Brown [1992]
• Briefly reported the logic density gap between FPGAs and MPGAs (mask-programmable gate arrays).
P.S. Zuchowski [ICCAD, 2002]
• Found delay, gate density, and dynamic power consumption gaps between an FPGA lookup table (LUT) and an ASIC.
• The cause of the variability of the values across process generations is unclear.
S.J. Wilton [JSSC, 2005]
• Examined area and delay, but estimated the values for the ASIC.
• Considered only a single module.
6
New Comparison Methodology
To provide a more definitive measurement:
• Implemented a large set of benchmark circuits in both FPGAs and standard cells.
• Selected the benchmarks carefully (details on the next slide).
• FPGA: Altera Stratix II, based on TSMC's 90nm process.
• ASIC: standard cells based on STMicroelectronics's 90nm process.
7
New Comparison Methodology
Benchmark Selection
Considered a variety of benchmarks, because the choice can significantly impact the results.
Two critical factors for selection:
• The HDL RTL should be synthesized similarly by the different tools used for the FPGA and the ASIC. (The two synthesis tools were judged sufficiently similar by comparing the number of registers each one inferred.)
• The designs should be able to make use of the block memories and dedicated multipliers.
8
FPGA CAD Flow
Altera Quartus II Software
Synthesis: QIS (Quartus II Integrated Synthesis)
• Logic synthesis is the process by which RTL is turned into a design implementation in terms of logic gates.
PNR (Placement and Routing): Fitter
Static Timing Analysis (STA): Timing Analyzer
• STA measures the critical path, which determines the operating frequency of the design.
Repeated the entire CAD flow five times using five different seeds
• The final operating frequency of the design can vary depending on the random seed given to the placement tool.
[Flow diagram: RTL Design Description -> Synthesis (QIS) -> PNR (Fitter) -> STA (Timing Analyzer)]
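The seed sweep and the link between critical path delay and operating frequency can be sketched in a few lines of Python. This is only an illustration: the delay values are made up, the function names are hypothetical, and keeping the fastest of the five runs is an assumption about how the results are reduced to a single number.

```python
from typing import Dict, Tuple


def fmax_mhz(critical_path_ns: float) -> float:
    """Maximum clock frequency in MHz for a critical path delay given in ns."""
    return 1000.0 / critical_path_ns


def best_seed(delays_ns: Dict[int, float]) -> Tuple[int, float]:
    """Return (seed, fmax in MHz) for the run with the shortest critical path."""
    seed = min(delays_ns, key=delays_ns.get)
    return seed, fmax_mhz(delays_ns[seed])


# Hypothetical critical path delays (ns) reported by STA for five placement seeds.
example_delays_ns = {1: 5.0, 2: 4.0, 3: 4.5, 4: 5.2, 5: 4.8}

seed, fmax = best_seed(example_delays_ns)
print(f"best seed = {seed}, fmax = {fmax:.1f} MHz")
```

With these made-up numbers, seed 2 has the shortest critical path (4.0 ns), which corresponds to 250 MHz.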
9
ASIC CAD Flow: Synthesis
ASIC Synthesis
Tool: Synopsys Design Compiler
• Analyze the HDL sources and set the constraints for compilation.
• Gate-level optimization to improve performance.
• DFT (Design for Testability) insertion, to test for manufacturing defects.
• The design is first compiled with an unrealistic 0.5ns clock constraint; the desired clock period is then adjusted to the critical path delay obtained.
• The netlist and constraints are saved for the PNR tools.
10
ASIC CAD Flow: PNR
ASIC PNR
Tool: Cadence SOC Encounter
Floorplan and placement
• The target row utilization, the percentage of the row area occupied by the standard cells, was set to 85%.
Clock tree insertion and routing
Post-routing optimization to improve performance
DRC and final netlist
RC extraction: Synopsys StarRCXT
Final timing and power analysis: Synopsys PrimeTime, PrimePower
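To make the row utilization target concrete, the short Python sketch below estimates the core area needed for a given total standard cell area. It is a back-of-the-envelope approximation under the assumption that utilization is simply cell area divided by core area; the function name and the example area are hypothetical, and the final core area reported by the PNR tool can differ.

```python
def estimated_core_area_um2(total_cell_area_um2: float,
                            row_utilization: float = 0.85) -> float:
    """Estimate core area, assuming utilization = cell area / core area."""
    if not 0.0 < row_utilization <= 1.0:
        raise ValueError("row_utilization must be in (0, 1]")
    return total_cell_area_um2 / row_utilization


# Hypothetical design with 85,000 um^2 of standard cells at 85% utilization.
print(round(estimated_core_area_um2(85_000.0)))
```

A lower utilization target leaves more free space for routing and for cells added after placement, at the cost of a larger core.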
11
Comparison Metrics
Area
• FPGA: actual silicon area of the resources used by the design.
• ASIC: final core area of the placed and routed design.
Speed
• Static timing analysis was used to measure the critical path. (Timing analysis determines the maximum clock frequency.)
• FPGA: timing analysis tool in Quartus II.
• ASIC: Synopsys PrimeTime.
Power
• Preferred approach: simulate the placed and routed design with testbench vectors (not available for most designs).
• Otherwise: statistical vectorless estimation, which estimates toggle rates and probabilities at nodes.
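How estimated toggle rates turn into a dynamic power figure can be sketched with the standard CMOS switching-power relation, where each transition of a node dissipates about 0.5 * C * Vdd^2. This is a textbook approximation and not the internal model of Quartus II or PrimePower; the node list, supply voltage, and clock frequency below are made-up values.

```python
from typing import Iterable, Tuple


def dynamic_power_w(nodes: Iterable[Tuple[float, float]],
                    vdd_v: float,
                    clock_hz: float) -> float:
    """Sum switching power over nodes.

    Each node is (toggle_rate, capacitance_f), where toggle_rate is the
    average number of transitions per clock cycle and capacitance_f is the
    switched capacitance in farads.
    """
    return sum(0.5 * toggles * cap * vdd_v ** 2 * clock_hz
               for toggles, cap in nodes)


# Hypothetical nodes: (transitions per cycle, capacitance in farads).
example_nodes = [(0.2, 10e-15), (0.5, 4e-15)]
power = dynamic_power_w(example_nodes, vdd_v=1.2, clock_hz=100e6)
print(f"{power * 1e6:.3f} uW")
```

Vectorless estimation supplies the toggle rates when no testbench is available; with simulation vectors, the same sum would use measured toggle counts instead.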
12
Results: Area
Area ratio (FPGA to ASIC)
• The hard heterogeneous blocks significantly reduce the area gap.
• Heterogeneous blocks are fundamentally similar to their ASIC counterparts, except for the programmable interface.
• FPGAs take 40 times more area than ASICs when only logic is used.
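A per-benchmark gap and its average can be computed as in the Python sketch below. The slides do not state how the per-design ratios are averaged; the geometric mean is used here as an assumption because it is a common choice for averaging ratios. The area values and function names are hypothetical.

```python
import math
from typing import List, Sequence


def gap_ratios(fpga: Sequence[float], asic: Sequence[float]) -> List[float]:
    """Per-benchmark FPGA/ASIC ratio for one metric (area, delay, or power)."""
    if len(fpga) != len(asic):
        raise ValueError("fpga and asic must have the same length")
    return [f / a for f, a in zip(fpga, asic)]


def geometric_mean(ratios: Sequence[float]) -> float:
    """Geometric mean of positive ratios, computed in the log domain."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be non-empty and positive")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical areas (mm^2) for three benchmarks on each platform.
fpga_area = [4.0, 9.0, 1.5]
asic_area = [0.1, 0.3, 0.05]

ratios = gap_ratios(fpga_area, asic_area)
print(f"average area gap = {geometric_mean(ratios):.1f}x")
```

Unlike the arithmetic mean, the geometric mean gives the same answer whether the ratios are expressed as FPGA/ASIC or ASIC/FPGA (one is the reciprocal of the other).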
13
Results: Speed
Ratio of the FPGA's critical path delay to the ASIC's critical path delay for each module
• ASICs are designed for the worst-case process, so it is fairer to compare ASIC performance with that of the slowest FPGA speed grade.
• The slower speed grade parts show a larger performance gap.
• With the slowest speed grade, FPGAs are 4.3 times slower than ASICs when only logic is used.
14
Results: Power
Ratio of FPGA dynamic power consumption to ASIC dynamic power consumption
• FPGAs consume 12 times more dynamic power than ASICs when only logic is used.
• For static power, no conclusive results were obtained. (However, the static power gap and the area gap are correlated.)
15
Conclusion
• This paper has presented empirical measurements quantifying the gap between FPGAs and ASICs.
• For logic-only circuits, FPGAs are on average 40 times larger, 3.2 times slower, and consume 12 times more dynamic power than ASICs.
• The use of hard multipliers and dedicated memories enables a substantial reduction in area and power consumption but has a relatively minor impact on the delay differences.
16
The End
Thank You
Q & A