Thomas J Watson Research Center, IBM
IBM | 19-20 August 2008 | M. Kapur © 2008 IBM Corporation
FPGA-based acceleration platform for chip verification
RAMP Retreat, 19-20 August, 2008
Architecture & FPGA Logic Design, Library models, Validation, Partitioning, Synthesis, Serialization, System Control Firmware, Board Testing Code: Sameh Asaad, Mohit Kapur, Chuck Haymes, Daniel Littrell, Ben Parker, Bernard Brezzo, Jose Tierno
System packaging, PCB design, layout, mechanical, thermal: Todd Takken, Al Lanzetta, Randy Bickford, Shurong Tian, Christopher Surovic, Paul Coteus
Host Control Software: Ralph Bellofatto, Alda Ohmacht
This work is partly sponsored by U. of California Subcontract No. B554331. The prime contractor is LLNL.
Introduction
The design of multi-core processors poses many challenges:
• Architecture: How do we best organize many cores on a chip, given certain application requirements, power and area constraints, etc.? How do we architect the memory hierarchy? I/O?
• Software Development: Can we get a head start in software development before having hardware brought up in the lab?
• Verification: Up to 70% of the hardware design cycle is spent on verification. There is a pressing need to address the verification bottleneck.
Software-based simulators are slow and are not keeping up with the increase in processor complexity.
FPGA-based simulation acceleration offers a viable alternative to address the above challenges, due to:
• Massive parallelism on each FPGA, giving MHz-level simulation performance
• Flexible architecture that allows modeling of any digital circuit
• Relative ease in constructing large systems of FPGAs and SRAM/DDR memory
Our first target is cycle-accurate, chip-level verification acceleration of the DUT chip.
[Figure label: Power Estimation (Gate-Accurate Model)]
Motivation for FPGA-based logic verification
Logic verification poses a bottleneck in processor design, accounting for as much as 70% of the design cycle.

                  | Simulation | AWAN | FPGA
Underlying Tech   | Computers | Dedicated ASIC | FPGA
Simulation Engine | VHDL simulator (cyclesim) | Implements simulation engine in HW | Implements user design in HW
Speed             | Low (1-100 Hz) | Medium (1-50 kHz) | High (1-10 MHz)
Ease of use       | Easy | Medium | Hard
Cost              | Cheap | Expensive | Cheap
Challenges        | Hard to parallelize SW simulators | Cost of developing custom ASIC; machines become obsolete quickly | Multi-FPGA partitioning; reconciling processor design with FPGA flow; cycle-accurate models of arrays/custom cells

• Software-based verification is too slow and hard to parallelize.
• Dedicated hardware solutions are too expensive to develop.
• FPGA-based verification has the best price/performance if we overcome its challenges.
FPGA-friendly Library of Components
Start by developing a cycle-accurate, FPGA-friendly library of components for custom leaf cells:
• Memory (eDRAM) model, using FPGA + external SRAM
• Multi-port Register File model, using hyper-clocked Block RAM
• Latch/LCB model(s): function only
Component validation through Verity/6th Sense.
Chip VHDL should instantiate “wrappers” for these components that enable retargeting to chip and FPGA prototype flows.
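As an illustration of the hyper-clocked register file idea in the list above, the following is a minimal behavioural sketch in Python, not the library's actual model. It assumes a single-port memory clocked at (read ports + write ports) times the design clock, with reads sampled before the same cycle's writes; the class name `HyperClockedRegFile` and the read-before-write policy are assumptions made for the example.

```python
class HyperClockedRegFile:
    """Behavioural model of a multi-port register file on a single-port memory.

    Assumption: the memory runs at (read_ports + write_ports) times the
    design clock, so one design cycle is split into that many fast ticks.
    """

    def __init__(self, depth, read_ports, write_ports):
        self.mem = [0] * depth
        self.read_ports = read_ports
        self.write_ports = write_ports
        self.fast_ticks = 0  # number of fast (hyper) clock ticks consumed

    def design_cycle(self, read_addrs, writes):
        """Advance one design-clock cycle.

        read_addrs: one address per read port.
        writes: one (addr, data) tuple or None per write port.
        Returns the read data, sampled before this cycle's writes land.
        """
        assert len(read_addrs) == self.read_ports
        assert len(writes) == self.write_ports
        read_data = []
        for addr in read_addrs:      # one fast tick per read port
            read_data.append(self.mem[addr])
            self.fast_ticks += 1
        for w in writes:             # one fast tick per write port
            if w is not None:
                addr, data = w
                self.mem[addr] = data
            self.fast_ticks += 1
        return read_data


# Usage: a 2-read / 1-write register file with 32 entries.
rf = HyperClockedRegFile(depth=32, read_ports=2, write_ports=1)
first = rf.design_cycle([0, 1], [(1, 0xAB)])   # reads see the old contents
second = rf.design_cycle([1, 1], [None])       # reads see the written value
```

Real Block RAM is typically dual-ported, so an actual implementation would likely need a lower clock ratio than this single-port sketch suggests.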
Multi-FPGA partitioning flow:
• Build transparent serial communication channels between FPGAs to multiplex N design signals onto M physical traces between FPGAs, where N >> M (e.g. 100:1)
• Generate wrappers around the partitioned components to connect the virtualized IO signals, one wrapper for each FPGA in the system
• Synthesize/place/route each FPGA using the normal FPGA flow
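The signal multiplexing step above can be sketched as simple time-division multiplexing. This is an illustrative model under stated assumptions, not the platform's serializer: the function names `serialize` and `deserialize`, the round-robin slot order, and the zero padding of the last slot are all hypothetical.

```python
import math


def serialize(signals, m_traces):
    """Split N signal bits into time slots of M bits, one slot per link tick."""
    slots = math.ceil(len(signals) / m_traces)
    frames = []
    for s in range(slots):
        frame = list(signals[s * m_traces:(s + 1) * m_traces])
        frame += [0] * (m_traces - len(frame))  # pad the final slot
        frames.append(frame)
    return frames


def deserialize(frames, n_signals):
    """Rebuild the N signal bits on the receiving FPGA, dropping the padding."""
    flat = [bit for frame in frames for bit in frame]
    return flat[:n_signals]


# Usage: 200 design signals over 2 physical traces (a 100:1 ratio).
signals = [i % 2 for i in range(200)]
frames = serialize(signals, m_traces=2)
recovered = deserialize(frames, n_signals=len(signals))
```

In this model every design-clock cycle costs `len(frames)` link ticks, which is one way to see why the multiplexing ratio bounds the achievable design-clock rate.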
[Diagram: two User FPGAs, each with Logic1, Logic2 and Ex Control blocks; other labels: SER, Host Interface, Host SW, Control FPGA]
FPGA modeling methodology
Partitioning Flow
[Flow diagram; recoverable labels:]
• Hierarchical PCB design description (mainboard.nl, memory.nl, …)
• Netlister II compiler
• Bill of Materials, …
• Allegro files for layout
• Structural Verilog netlist of FPGA system
• DUT top-level netlist (VHDL)
• Portals VHDL compiler
• Portals Verilog compiler
• DADB
• Mapping file
• SerDes components
• Infrastructure components
• Partitioner
• FPGA wrapper file 1 … FPGA wrapper file N: DUT core instance, SerDes instances, infrastructure instances, synthesis directives
The tool automatically generates a top-level netlist for each FPGA in the system.
Each FPGA is synthesized, placed and routed separately, in parallel.
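To make the wrapper-generation step concrete, here is a small sketch of how a partitioner might decide which nets need SerDes channels. It is an assumption-laden illustration, not the actual tool: `build_wrappers`, the dictionary-based mapping and net formats, and the instance names are all hypothetical stand-ins for the mapping file and netlists named above.

```python
from collections import defaultdict


def build_wrappers(mapping, nets):
    """Group instances per FPGA and list the nets that cross FPGA boundaries.

    mapping: {instance_name: fpga_id}          (stand-in for the mapping file)
    nets:    {net_name: [instance_name, ...]}  (stand-in for the DUT netlist)
    """
    wrappers = defaultdict(lambda: {"instances": [], "serdes_signals": []})
    for inst, fpga in mapping.items():
        wrappers[fpga]["instances"].append(inst)
    for net, endpoints in nets.items():
        fpgas = {mapping[inst] for inst in endpoints}
        if len(fpgas) > 1:  # net is cut by the partition: route via SerDes
            for fpga in fpgas:
                wrappers[fpga]["serdes_signals"].append(net)
    return dict(wrappers)


# Usage with hypothetical instance and net names.
mapping = {"core0": 0, "l2": 0, "mem_ctrl": 1}
nets = {"req": ["core0", "l2"], "fill": ["l2", "mem_ctrl"]}
wrappers = build_wrappers(mapping, nets)
```

Because each resulting wrapper is self-contained, the per-FPGA synthesis, place and route jobs have no dependencies on each other and can be launched in parallel.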
FPGA Daughter Card
Al Lanzetta
• Xilinx V5 LX330 FPGA (65 nm tech)
• Total DDR memory capacity is 4 GB (2 GB per DIMM)
• Total SRAM memory capacity is 32 MB (4 MB per chip)
• 180 LVDS pairs (136 Gbps) to backplane through bottom edge connector
• 4 top connectors (2.4 + 2.4 Gbps each) can be used for point-to-point links between any two cards in the system
• Gigabit Ethernet connection to host
• Card can be used stand-alone or in-system
FPGA Platform Isometric View
Waveform Generation Process Flow
[Block diagram, divided into Hardware and Software halves. Labels: User FPGA LX330 with User Logic, ICAP VIRTEX5 (width = x32), CAPTURE VIRTEX5 and iCon; Control FPGA LX30; Clock Control Macro; 400 MHz XTAL; 12.5 MHz; 100 MHz; 32-bit bus (100 MHz); GBE/UDP; Host Control Machine]
[Flow labels, in order: Logic Allocation File (1); Preprocess to extract frames/offsets of interest; Setup File; Single-Step & Scan Frames of Interest; Scan File; Post-Process to create waveform file]
(1) The Logic Allocation file contains a cross-reference from every design latch to the corresponding bit location (frame:offset) in the readback stream.
• Pre-processing extracts the frame:offset pairs to be read from the device.
• After every clock step, software reads the frames of interest into the scan file.
• Post-processing converts the scan file to a waveform viewer file.
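The three software stages described above can be sketched as follows. This is a toy model under explicit assumptions: the dictionary `alloc` stands in for an already-parsed Logic Allocation file, each frame is modelled as a Python integer whose bits are addressed by offset, and the helper names, latch names, and frame numbers are all hypothetical. No real readback interface or waveform file format is used.

```python
def frames_of_interest(alloc, latches):
    """Pre-process: collect the distinct frames that hold the chosen latches."""
    return sorted({alloc[name][0] for name in latches})


def scan_step(read_frame, frames):
    """After one clock step: read only the frames of interest from the device."""
    return {frame: read_frame(frame) for frame in frames}


def post_process(alloc, latches, scan_file):
    """Post-process: turn per-step frame snapshots into per-latch value lists."""
    waves = {name: [] for name in latches}
    for snapshot in scan_file:
        for name in latches:
            frame, offset = alloc[name]
            waves[name].append((snapshot[frame] >> offset) & 1)
    return waves


# Usage with a fake device: each dict is the readback contents after one step.
alloc = {"pc_valid": (3, 0), "stall": (3, 5), "irq": (7, 2)}
latches = ["pc_valid", "stall", "irq"]
device_states = [{3: 0b000001, 7: 0b000}, {3: 0b100001, 7: 0b100}]

frames = frames_of_interest(alloc, latches)
scan_file = [scan_step(state.__getitem__, frames) for state in device_states]
waves = post_process(alloc, latches, scan_file)
```

Restricting each scan to the frames of interest keeps the per-step readback small, which matters because the scan is repeated after every single clock step.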
Single Core demo