Post on 20-Sep-2020
FPGA, a future proof programmable system fabric
Ivo Bolsens, CTO, Xilinx, March 2005
Georgia Tech 2
Outline
• State-of-the-art
• Alternatives
• Towards domain optimized programmable platforms
– Embedded Processing
– Connectivity
– DSP
• Domain specific system design flow
• University Interaction
• Conclusions
Where Xilinx Fits
Xilinx: From Inventor of the Programmable Logic Device to Leading Innovator in Programmable Digital System Design
Key components of a digital electronics system: Processor, Memory, Logic
Introducing Xilinx
• Leader in fastest growing semiconductor segment
– Invented programmable chip in 1984
– > 50,000 design starts/year
• Leader in semiconductor process technologies
– First to 180nm, 150nm, 130nm and 90nm
• Pioneer of fabless semiconductor model
– Focus on design, marketing, support
– Partner for everything else
• A well-managed company and great place to work
– #4, #5 in Fortune’s 2003, 2004 “Best 100 Places to Work”
Xilinx Product Portfolio
High Integration, High Performance
Lowest System Cost, High Volume
Lowest Cost Logic
Highest Volume, Lowest Cost, Lowest Power
Software and Development Tools
Support Services
Xilinx Revenue
[Chart: Xilinx revenue by fiscal year, 1993 to 2004; vertical axis ticks from 200 to 1600]
Xilinx Revenue Breakdown, Calendar Year 2004
[Pie chart: Revenue by Geography; segments: North America, Europe, Japan, Asia Pacific; values shown: 24%, 42%, 20%, 14%]
[Pie chart: Revenue by End Market; segments: Consumer & Other, Communications, Storage & Servers; values shown: 36%, 53%, 11%]
Source: Xilinx, Inc.
The Programmable Marketplace, Calendar Year 2004
[Pie charts: PLD Segment and FPGA Segment; labels: Xilinx, Altera, Lattice, Actel, QuickLogic: 2%, Other: 1%, All Others; values shown: 41%, 59%, 51%, 32%, 6%, 8%]
Xilinx revenues are greater than all other pure-play PLD companies combined.
Source: Company reports. Latest information available; computed on a 4-quarter rolling basis. Lattice revenues estimated based on company guidance for Q4CY03 results.
Chip Requirements
• Programmable – mass market of one
• Regular – manufacturability
• Scalable – future proof, ride Moore’s law
• Parallelism – best performance and power
• Distributed memory – solve data transfer bottleneck
• Cost optimized – low NRE
[Figure: regular array of Arithmetic/Logic and Memory blocks]
“If FPGAs wouldn’t exist, people would have to invent them…”
A Decade of Progress
• 200x More Logic
– Plus memory, µP, DSP, MGT
• 40x Faster
• 50x Lower Power
• 500x Lower Cost
[Chart: CLB Capacity, Speed, Power per MHz, Price and ITRS Roadmap by Year, '91 to '04, on a log scale from 1x to 1000x; device families marked: XC4000, XC4000 & Spartan, Spartan-2, Spartan-3, Virtex & Virtex-E, Virtex-II & Virtex-II Pro, Virtex-4]
State-of-the-art Platform FPGA
• 200,000 Flexible Logic Cells
• 0.6-11.1 Gbps Serial Transceivers
• 500 MHz Programmable DSP Execution Units
• 1 Gbps Source Synchronous I/O
• 500 MHz, 10 Mbits BRAM with FIFO & ECC
• 500 MHz Digital Clock Management
• 700 Mips PowerPC® Processor with Auxiliary Processing Unit and 10/100/1000 Ethernet MAC
• AES Design Encryption
Highest FPGA Performance
[Chart: Breakthrough Performance benchmark; categories: Logic Fabric Performance, High-speed Serial I/O, On-chip RAM Speed, Embedded Processing, DSP Performance, I/O LVDS Bandwidth, I/O Memory Bandwidth; values shown: 480 Gbps, 260 Gbps, 10 Gbps, 500 MHz, 702 DMIPS]
The Platform FPGA
[Figure labels: MGTs, I/Os, Memory, PowerPC, Logic; Emulation, DSP, Communication Port, Custom Logic, Internal Memory, External Memory Port, DSP Accelerator, µP]
Platform FPGAs: Digital System Design Simplified
[Figure: design concerns divided between System Design and Platform FPGA Embedded IP; labels: High-level synthesis; RTL; 0, 1 and delay; HW / SW partition; Timing; Standards and interfaces; Termination; Clock distribution; Noise Margin; Crosstalk; DFM; ATPG; IR drop; Repeaters; Startup init; Transmission lines; Clock generation]
System Platform Example
CDMA2000 “Converged” Base Station
Customer Benefits:
– High performance Gigabit Transceivers
– Integration of PowerPC processors
– HW/SW partitioning/flexibility
Benefits of Configurability
[Figure: Total Reconfigurations by Where and Why; labels: Design Time, Boot Time, Sample Rate, Event Rate; FPGA Vendor, System Designer, System Manufacturer, Field; Device Test, FPGA Economics, Design Verification, Edit/Compile/Debug, System Test, Power-up Self Test, Customization, Field Upgrade, Evolvable Systems]
The Processor Platform
Viktor Peng, MIPS
The ASIC/ASSP Platform
Building a next generation SoC ASIC (90nm*)?
Cost increases over previous generation*:
• 60% Design
• 100% Mask Costs
• 40% Manufacturing
• $30-50M Development Cost
• $80M Total Cost of bringing product to market
• $150M Revenues needed to break even in 2 yrs
* Source: Dataquest, IBS, Xilinx
• 2007 ASIC/ASSP forecast = ~$80B
• ~500 successful ASIC/ASSP design starts
– NRE $30M, R&D 20% of revenue, $150M revenue
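The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes that "R&D 20% of revenue" means the NRE has to be covered by 20% of a product's revenue; the function names are illustrative and do not come from the slides.

```python
def required_revenue(nre_dollars, rd_percent_of_revenue):
    """Revenue at which the R&D share of revenue just covers the NRE."""
    return nre_dollars * 100 // rd_percent_of_revenue


def supportable_design_starts(market_dollars, revenue_per_design):
    """Number of designs of that size a market of the given size can carry."""
    return market_dollars // revenue_per_design


# Slide figures: NRE $30M, R&D 20% of revenue, 2007 market forecast ~$80B.
revenue = required_revenue(30_000_000, 20)
starts = supportable_design_starts(80_000_000_000, revenue)
print(revenue, starts)
```

Under that reading, a $30M NRE needs $150M of revenue, and an $80B market carries a little over 500 such designs, which is in line with the "~500" on the slide.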
FPGA - ASIC/ASSP Crossover
[Chart: Cost versus Production Volume for 150nm / 200mm ASICs, 90nm / 300mm ASICs, 150nm / 200mm FPGAs and 90nm / 300mm FPGAs; regions marked FPGA Cost Advantage and ASIC Cost Advantage]
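A crossover of this kind can be sketched with a simple linear cost model, total cost = NRE + unit cost × volume. This model and the unit costs below are assumptions made for illustration only; the slides give no per-unit prices. Only the $30M NRE figure appears elsewhere in the deck.

```python
def total_cost(nre, unit_cost, volume):
    """Linear cost model: one-time NRE plus a fixed cost per unit."""
    return nre + unit_cost * volume


def crossover_volume(fpga_nre, fpga_unit, asic_nre, asic_unit):
    """Volume at which FPGA and ASIC total costs are equal."""
    if fpga_unit <= asic_unit:
        raise ValueError("model assumes the FPGA has the higher unit cost")
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)


# Hypothetical unit costs ($40 FPGA, $10 ASIC); NRE of $30M for the ASIC
# and zero NRE for the FPGA are simplifying assumptions.
volume = crossover_volume(fpga_nre=0, fpga_unit=40,
                          asic_nre=30_000_000, asic_unit=10)
print(volume)
```

In this model, raising the ASIC NRE (as at a newer process node) pushes the crossover to a higher production volume, widening the range where the FPGA is cheaper.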
Domain Optimized Platforms: One Family – Multiple Platforms
Column based features: Domain A, Domain B, ... map to Platform A, Platform B, ... Platform x
• Logic Domain (Virtex-4 LX): highest logic density
• DSP Domain (Virtex-4 SX): highest DSP performance
• Connectivity Domain (Virtex-4 FX): embedded processors, high-speed serial I/O
Resource mix per platform: Logic, Memory, DSP, Processing, High-speed I/O
Enables “Dial-In” hard IP mix: Logic, DSP, BRAM, I/O, MGT, DCM, PowerPC
Enabled by flip-chip packaging: I/O columns distributed throughout the device
Both Hard & Soft IP Necessary for Programmable Systems
Programmable hard IP
• Up to 10x less area
• Up to 10x lower power
• Up to 2x performance
Customizable soft IP
• Most flexible
• Widest selection
Connectivity: hard IP = PHY (ser./par.), timing critical I/O logic & clocking; soft IP = Protocols
DSP: hard IP = DSP slice (MAC); soft IP = Algorithms
Processing: hard IP = PowerPC; soft IP = Peripherals, Accelerators, Additional µPs
Example: Virtex-4 FX platform FPGA
From Soft to Hard IP
• Range of processing solutions
[Chart: Features versus Performance]
– Lowest Cost 8-bit Architecture Soft Core
– 32-bit General Purpose Architecture Soft Core with Acceleration
– Highest Performance 32-bit General Purpose Architecture With Acceleration
Plus: A broad range of common peripherals and IP
Only Dual PowerPC core architecture
Soft MicroBlaze: 150 DMIPS, 45 cents, 200 in single FPGA
Accelerate Performance Beyond the Processor Core
• New Auxiliary Processing Unit (APU)
– Direct interface from CPU pipeline to FPGA logic
– Simplifies integration of coprocessor and hardware accelerators
• Reduce number of bus cycles by factor of 10X
Close integration HW/SW
• CPU centric Paradigm:
– CPU is the brain of the system
– All the execution units are in the CPU
– The rest of the system resources are storage or IO for the processor and at its disposal.
– CPU resources are designed to be used only by the CPU
• New Paradigm:
– The programmable fabric claims some of the responsibility
• Some Execution Units are implemented in the fabric
– Soft blocks can use the resources in the CPU
– The execution flow in the fabric and the CPU have direct coupling
New APU interface facilitates this and is a step in this direction
Use model : Coprocessor
[Figure: Processor Block (Execution Unit, Register File, D-Cache, D-PLB, BRAM DSOCM) connected to a Soft FPU (Floating Point Unit); signals: Operands, Load Data, Result]
Comparison with Traditional Bus-based
APU interface (Processor Block to Soft Aux. Processor via APU I/F):
• Write Instruction and operands: 1 APU cycle
• Execution: NEX APU cycles
• Read Result and Status: 1 APU cycle + 1 CPU cycle
PLB bus (Processor Block to Soft Aux. Processor via PLB):
• Write Operand1: 5 PLB cycles + 2 CPU cycles
• Write Operand2 and Instruction: 5 PLB cycles + 2 CPU cycles
• Execution: NEX PLB cycles
• Read Status: 6 PLB cycles + 3 CPU cycles
• Read Result: 6 PLB cycles + 3 CPU cycles
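The per-step cycle counts on this slide can be added up to compare the transfer overhead of the two interfaces, leaving the execution time (NEX cycles) aside. The sketch below assumes the pairing of counts to steps as read from the slide (writes at 5 PLB + 2 CPU cycles, reads at 6 PLB + 3 CPU cycles), and the clock ratios are hypothetical parameters, since the slide does not state how APU and PLB cycles relate to CPU cycles.

```python
def apu_overhead(cpu_per_apu_cycle=1):
    """Non-execution cost of one operation over the APU, in CPU cycles."""
    write = 1 * cpu_per_apu_cycle        # write instruction and operands
    read = 1 * cpu_per_apu_cycle + 1     # read result and status
    return write + read


def plb_overhead(cpu_per_plb_cycle=1):
    """Non-execution cost of one operation over the PLB, in CPU cycles."""
    write_operand1 = 5 * cpu_per_plb_cycle + 2
    write_operand2_and_instruction = 5 * cpu_per_plb_cycle + 2
    read_status = 6 * cpu_per_plb_cycle + 3
    read_result = 6 * cpu_per_plb_cycle + 3
    return (write_operand1 + write_operand2_and_instruction
            + read_status + read_result)


# With all clocks equal (an assumption), compare the two overheads.
ratio = plb_overhead() / apu_overhead()
print(apu_overhead(), plb_overhead(), round(ratio, 1))
```

With equal clocks the bus-based path costs 32 cycles against 3 for the APU, roughly the factor of 10 quoted for the APU; a slower bus clock would widen the gap further.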
Use Model : Streaming
[Figure: Processor Block and Soft Auxiliary Processor exchanging Instructions, Status and Control, with a Stream of Data passing through the Operation stages]
Embedded Design Methodology
• Support HW, SW and mixed HW-SW design using domain-specific tool chains as well as a unified tool chain
[Figure: usage spectrum from Self Contained Microcontroller Applications to Complex Embedded Computing]
• Supporting both ends of the usage spectrum provides all points in between as well.
Single Environment For HW and SW Development
• HW and SW Platform generators based on the Xilinx Platform Specification Format for programmable systems
• Tight coupling enables a customized SW platform to be generated that matches the customized HW platform
Platform Studio Integrated Design Environment (Innovative Technology)
• Create Processor, Bus & Peripheral Subsystem, Software Drivers, BSP
Industry Standard SW Development Flow
• SW Development
• Download to Board
• SW Debug
Industry Standard HW Development Flow
• Logic Development
• Place and Route
• Download to FPGA
• HW Debug
Board Support Package (BSP)
• Initializes the processor system at power up
• Interfaces between RTOS and the peripheral device
• The drivers are designed to be portable across processor core and RTOS
• Allows reuse = higher quality
• Integrates the driver into RTOS
• Satisfies the "plug-in" requirements of RTOS
• Needs to be rewritten for each OS
• Initializes all parameters (e.g., MMU, int/ext reg)
[Figure: BSP layers; labels: Boot Code, Initialization Code; Ethernet 10/100 Device Driver, UART 16550 Device Driver, IIC Master & Slave Device Driver, ATM Utopia Level 2 Device Driver, Peripheral n, n+1… Device Drivers; RTOS Adaptation Layer; RTOS; Application SW]
Built for Debug
• FPGA fabric provides full internal visibility
• Debug occurs at system speeds
• Never too late in an FPGA
– Hardware problems can be fixed during development and after product deployment
• Enables on-chip co-verification
– As part of design process
[Figure: FPGA with IO Pads, Boundary Scan TAP Controller, Embedded System Bus, Memory Array, PPC405 Core, IP Core, Custom Core and Custom Logic, instrumented with ICON, ILA and IBA debug cores]
DSP Sweet Spot - Performance
[Chart: Performance axis from 1 MSPS to 300 MSPS; regions labeled DSP Processors, FPGA Sweet Spot and ASICs/ASSPs; annotations: Flexibility Most Important, Low Unit Cost Most Important]
The DSP ‘LUT’
[Figure: DSP slice with 18-bit inputs A0..A17 and B0..B17, an 18x18 Multiplier, Op Reg, a 36-bit product and a 48-bit accumulator with output O0..O47]
500 MHz. Available on all Virtex-4 Platforms.
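The multiply-accumulate behaviour suggested by this slide (18x18 multiplier feeding a 48-bit accumulator) can be modelled in a few lines. This is a behavioural sketch only: it assumes signed two's-complement operands and wraparound on accumulator overflow, and it ignores pipelining, the Op Reg and every other detail of the real slice. The function names are illustrative.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def mac(acc, a, b):
    """One multiply-accumulate step: 18x18 signed multiply, 48-bit accumulate."""
    product = to_signed(a, 18) * to_signed(b, 18)   # fits in 36 bits
    return to_signed(acc + product, 48)             # wraps at 48 bits


def dot(xs, ys):
    """Dot product computed as a chain of MAC steps, as in a FIR filter tap loop."""
    acc = 0
    for a, b in zip(xs, ys):
        acc = mac(acc, a, b)
    return acc


print(dot([1, 2, 3], [4, 5, 6]))
```

The 48-bit accumulator leaves 12 bits of headroom above the 36-bit product, so thousands of products can be summed before overflow becomes possible.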
Domain Specific Characteristics
• Networking perspective
– The racing track pit stop
• Lots of concurrent threads (= engineers), on individual packets (= cars)
• DSP perspective
– The manufacturing line
• Lots of data tokens (= cars), processed in a pipelined fashion (= dataflow)
• Processor perspective
– Human operator
• Central control (= human), accelerators (= tools)
Different application domains require different methodologies to exploit the capabilities of the hardware
DSP domain
• Library-based, visual data flow
• Seamlessly integrated with Simulink and MATLAB
• Automatic code generation
– Synthesizable VHDL
– IP cores
– HDL test bench
– Project and constraint files
Concepts that are familiar to the DSP designer
Networking: the Click front-end
Packet input at top
Packet output at bottom
Each box is a simple processing element
packets flow through
Object Oriented
(Click: MIT, 2001)
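The idea of simple processing elements with packets flowing through them can be sketched as a chain of small objects. The sketch below is not Click's API or its configuration language (Click itself is written in C++); the class names `Element`, `Counter`, `DropIf` and `Sink` are hypothetical and exist only to illustrate the object-oriented, box-and-arrow style.

```python
class Element:
    """A simple packet-processing box with one input and one output."""

    def __init__(self):
        self.next = None

    def connect(self, downstream):
        """Wire this element's output to downstream; return downstream for chaining."""
        self.next = downstream
        return downstream

    def process(self, packet):
        """Return the packet to forward, or None to stop it here."""
        return packet

    def push(self, packet):
        result = self.process(packet)
        if result is not None and self.next is not None:
            self.next.push(result)


class Counter(Element):
    """Counts every packet that passes through."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def process(self, packet):
        self.count += 1
        return packet


class DropIf(Element):
    """Drops packets for which the predicate is true."""

    def __init__(self, predicate):
        super().__init__()
        self.predicate = predicate

    def process(self, packet):
        return None if self.predicate(packet) else packet


class Sink(Element):
    """Collects packets at the bottom of the graph."""

    def __init__(self):
        super().__init__()
        self.received = []

    def process(self, packet):
        self.received.append(packet)
        return None


# Packets enter at the top (source) and leave at the bottom (sink).
source = Counter()
sink = Sink()
source.connect(DropIf(lambda p: p["ttl"] == 0)).connect(sink)
for ttl in (3, 0, 1):
    source.push({"ttl": ttl})
print(source.count, [p["ttl"] for p in sink.received])
```

Each box does one small job, so a new packet path is built by rewiring elements rather than rewriting them, which is the property that makes this style a natural front-end for mapping onto concurrent hardware.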
Future: API
[Figure: Design automation tools for system experts (entry, debug, ...) above programmable logic devices, linked by API access, Efficient mapping, a Soft platform template and Hooks for existing IP cores and software; annotations: Provide concurrency, interconnection and programmability; Exploit concurrency, interconnection and programmability]
Major API components
• Threads: lightweight concurrent message processing entities compiled to PLD implementations
• Hooks: wrappers for existing functional blocks with PLD implementations
• Interfaces: for moving messages into or out of the system perimeter
• Memories: for storage of messages, system state or system data
Future: Transparent HW/SW
[Figure: run-time reconfigurable system. Modules A, B, C, D and E (Bus Com 0 to 3, ID 0 to 3) attached to a Bus-Macro with Arbiter; Run-time Module Controller on a µController (MicroBlaze) with ICAP, Decompressor Unit (LZSS), Buffers and CAN-Interface; Flash-Memory holding Bootstream and Slotstreams (MA, MB, MC, MD; Start address, End address, Last Bus-Word, State - Data); Boot-CPLD; I/O (e.g. CAN). Steps: Search Slot, Backup state (Save State!), Start Reconfiguration, Restore state]
Future : The Programmable Systems Platform
[Figure labels: FPGA Users Base; Very High-Performance DSP; Embedded Processing; High-Speed Serial; Next Generation]
Innovation depends on people
• People and innovation are among our most important assets
• Xilinx believes that people and innovation only thrive when actively nurtured
• Our business actions reflect this belief
• We value long-term, strategic relationships with top academic institutions to support teaching and research programs
Xilinx Research Labs and Xilinx University Program (XUP)
• Xilinx University Program and Research Labs report directly to the CTO
• Seeking a good “impedance match” with university partners
• Xilinx University Program manages donations to universities
• Xilinx Research Labs manages external research partnerships
XUP supports education
• Logic design
– Integrated software environment (ISE)
• Embedded systems development
– Embedded development kit (EDK)
• Digital signal processing
– System Generator (SysGen)
• Free workshops for professors/students
• PLDs, boards and development systems
www.digilentinc.com
Digilent boards
General Features
Video via XSGA connector
10/100 Ethernet PHY + Connector
Buttons, Switches, and LEDs
Audio via AC97 codec and standard connectors
Keyboard and mouse
• 2 PS-2 Ports
RS-232
Designed with Education in Mind
Compact Flash card interface for individual project back-up, or IBM Microdrives with up to 8 Gbit capacity
USB port for FPGA configuration using standard USB cable
Support for supply current monitoring
Self-test / configuration
Flash memory
I/O under and over voltage protection
Memory Hierarchy
Virtex-II Pro XC2VP30 FPGA
• 2448 Kbits of BRAM
• 30,816 Logic Cells (Distributed RAM)
Expandable memory up to 2 Gigabytes
• DDR SDRAM DIMM Slot
Non-volatile Platform Flash PROM for configuration storage
Compact Flash card interface for individual project back-up, or IBM Microdrives with up to 8 Gbit capacity
System Expansion
Additional I/O via four user-supplied 60-pin headers
High-speed connectors compatible with Digilent module boards
High-speed Gigabit serial I/O
• User Supplied SMA
Low-speed connectors compatible with Digilent peripheral boards
High-speed Gigabit serial I/O
• Serial ATA connectors
Virtex-II Pro XC2VP30 FPGA
• 4 of 8 Rocket I/O Transceivers (MGTs)
• 120 max of 556 User I/O
Digilent System Expansion Modules
• Digilent low cost, low speed plug-in modules to expand the curriculum
• High speed plug-in modules coming soon: video module in Jan 2005
• Visit www.digilentinc.com
Modules shown: Digital I/O 4, Digital Breadboard, 1W Amplified Speaker, Digital I/O 5, 512K SRAM and Flash, 1MHz Analog Acquisition
Block Diagram
2VP30
Compact Flash Configuration
DDR SDRAM DIMM
USB Configuration
AC97 Audio CODEC & Stereo AMP
75 MHz SATA clock
10/100 Ethernet PHY
Three Serial ATA connectors
RS232
PS-2 (x2)
Buttons (5), LEDs (4), switches (4)
Platform Flash Configuration
High-speed and low-speed I/O expansion connectors
SVGA
Additional I/O via four user-supplied 60-pin headers
Internal Power Supplies3.3V, 2.5V, and 1.5V
External Power
100 MHz system clock
2 user supplied clocks
One 3.125 Gbps port via 4 user-supplied SMA connectors
http://www.xilinx.com/univ/
Conclusions
• Programmable FPGA platforms are at the forefront of electronic systems design
• Simple entry point is to treat system on chip as shrunken version of system on board
• Opportunities for innovative use of (programmable) functional IP units, memories and (configurable) networks on chip
• Alternatives require not just new tools and methodologies, but also new thinking