FPGAs in 2008 and beyond the future platform for transforming, transporting and computing TWEPP...
-
Upload
frederica-higgins -
Category
Documents
-
view
214 -
download
1
Transcript of FPGAs in 2008 and beyond the future platform for transforming, transporting and computing TWEPP...
FPGAs in 2008 and beyondthe future platform for transforming, transporting and computingTWEPP Topical Workshop on Electronics for Particle Physics September 15-19, 2008, Naxos, Greece
Peter AlfkePresented by Volker LindenstruthIncluding slides from Ivo Bolsens
2 TWEPP 2008
Agenda
• The FPGA Trends
• The Triple Play Opportunity
• The Platform Approach
• V5 News
• Conclusions
3 TWEPP 2008
Twenty Years of EvolutionTwenty Years of Evolution
• 1988: XC3090• 2008: XC5VLX330T
• 1000 times the number of LUTs• 2000 times the number of configuration bits = complexity• 20 times the speed• 500 times cheaper per function, not counting inflation
Moore’s Law has been good to all of us!
PA
4 TWEPP 2008
FPGA Status
More Logic Packing6-LUT architecture
Faster Time To MarketCross Platform Compatibility in
“T” devices
Optimized Serial IO Power & Performance
Low power 100Mbps to 3.2Gbps GTP150Mbps to 6.5Gbps GTX
Greater System integrationPowerPC 440 with crossbar and APU
Integrated ReliabilitySystem Monitor
TLTLTLTL DLDLDLDL PHYPHYPHYPHY
New PCI ExpressIntegrated x1, x4, x8
End-point blocks
Processor
Processor
CrossbarCrossbarCrossbarCrossbar
DMADMADMADMA
DMADMADMADMA
SPLB1SPLB1SPLB1SPLB1
DMADMADMADMA
DMADMADMADMA
MCIMCIMCIMCI
MPLBMPLBMPLBMPLB
DCRDCRDCRDCR
APUAPUControlControl
APUAPUControlControl
CPMCPMCPMCPM
PowerPCPowerPC 440440
PowerPCPowerPC 440440
SPLB0SPLB0
Clocking FlexibilityDCM (precision synthesis)
+ PLL (Low jitter)
DCMDCMPLL
IB
5 TWEPP 2008
1.00E+02
1.00E+03
1.00E+04
1.00E+05
1.00E+06
1.00E+07
1.00E+08
1.00E+09
1985 1990 1995 2000 2005 2010 2015 2020 2025
Year
Nu
mb
er
of
LC
sFPGA Capacity Trends
Historical Data
Largest Xilinx FPGA
ITRS 2013: 2.6M LCs(3.1B transistors)
IB
6 TWEPP 2008
10
100
1000
10000
1985 1990 1995 2000 2005 2010 2015 2020 2025
Sy
ste
m S
pe
ed
(M
Hz)
FPGA Performance Trends
Historical FPGA data
2007: 325 MHz typical;500 MHz max
2013: 500MHz typical, 750MHz max
IB
7 TWEPP 2008
0.1
1
10
100
1985 1990 1995 2000 2005 2010 2015 2020 2025
Po
we
r (W
)
0
1
2
3
4
5
6
Co
re P
ow
er
Su
pp
ly (
V)
[Lin
es
]
FPGA Power Trends
ITRS 2013: 25W
Total Power of Largest Device
ITRS LP Vdd
Vccint
?
IB
10 TWEPP 2008
Agenda
• The FPGA Trends
• The Triple Play Opportunity
• The Platform Approach
• V5 News
• Conclusions
IB
11 TWEPP 2008
The Triple Play Opportunity
• Global Internet traffic will reach 44bn gigabytes per month in 2012, compared to less than 7bn in 2007
• Video goes from 22% of consumer traffic in 2007 to 90% in 2012
• Mobile data traffic will roughly double each year from 2008-2012
11
Voice
Data
Video
Source: Cisco Visual Networking Index – Forecast and Methodology 2007-2012Source: Cisco Visual Networking Index – Forecast and Methodology 2007-2012
Backdrop: focus on reducing power consumption to reduce operating expenseBackdrop: focus on reducing power consumption to reduce operating expense
IB
12 TWEPP 2008
Consumer Internet Traffic Analysis 2007-2012
12
IP T
raffi
c (P
etaB
ytes
/mon
th)
Source: Cisco
All Video: 49% of IP
traffic
P2P is large portion of IP traffic: 33%
Fastest growth: Video to TV
Video Comm will be driver beyond 2012
VoIP Video Comm
IB
13 TWEPP 2008
Triple Play : Key Technologies
1. Digital Signal Processing– Transforming data
2. Packet Processing– Transporting data
3. Tera Computing– Analyzing data
Voice
Data
Video
IB
14 TWEPP 2008
Wired Networking Trends
• The line rate is what drives the processing and input/output challenge on each line card
• The challenge is growing in step with Internet traffic growth
Gb/
s
Peta
Byte
s/m
onth
Line rate increasing 2x every 2 years
OC-19210G Ethernet
OC-19210G Ethernet
OC-768OC-768
40/100G Ethernet40/100G Ethernet
OC-3072 (?)OC-3072 (?)
1T Ethernet1T Ethernet
500G optical500G optical
Sources: IP Traffic, Cisco; Line Rate, Standards/Nortel
IB
15 TWEPP 2008
Trend Towards Packetized Network-on-Board
15
Protocol layerProtocol layer
Transport layerTransport layer
Routing layerRouting layer
Link layerLink layer
Physical layerPhysical layer
QPI (processor-memory)QPI (processor-memory)
Transaction layerTransaction layer
Data link layerData link layer
Physical layerPhysical layer
PCI-Express(local interconnect)
PCI-Express(local interconnect)
MIPI (mobile peripherals)MIPI (mobile peripherals)
Layer 1 headerLayer 1 header Layer 2 headerLayer 2 header …… Layer n headerLayer n header Data payloadData payload
Packetssent over physical serial link
Packetssent over physical serial link
Transport layerTransport layer
Network layerNetwork layer
Data link layerData link layer
PHY adapter sub-layerPHY adapter sub-layer
Physical layerPhysical layer
ExamplesExamples
IB
16 TWEPP 2008
Wireless Networking Trends
Longer-term trends:• Multi-mode radio and
cognitive radio (increasing adaptability)
• Mobility as central feature of Internet (increasing demands)
Computation and programmability grow, power budgets shrink
(2001) (2005) (2008) (2011) (2015)
Wireless Coder-decoder Compute Requirements
Source : Nokia/NXP
IB
17 TWEPP 2008
Agenda
• The FPGA Trends
• The Triple Play Opportunity
• The Platform Approach
• V5 News
• Conclusions
IB
18 TWEPP 2008
Bridging the Gap
The full powerof silicon
Misery of details
Systems and
Applications
Circuits and
Architectures
Platform
Design Methods and
Software
The opportunities
IB
19 TWEPP 2008
Platform-based Methodology
• Use of high level language:• Describe top-level relationships between system components• Program processor-based components• Program some logic-based components• APIs hide all HW components• APIs hide external interface detail• Debugging in Soft terms
• Hides underlying platform detail: • ISE and EDK tools• OS device drivers• Board settings
• Use of high level language:• Describe top-level relationships between system components• Program processor-based components• Program some logic-based components• APIs hide all HW components• APIs hide external interface detail• Debugging in Soft terms
• Hides underlying platform detail: • ISE and EDK tools• OS device drivers• Board settings
Other input/outputOther input/output
Memory input/outputMemory input/outputAPIAPIAPIAPI
APIAPIAPIAPI
Streaming packetized data output
Streaming packetized data output
API
API
API
API
Streaming packetized data input
Streaming packetized data input
APIAPI
APIAPI
ProcessorProcessor
LogicLogic LogicLogic
LogicLogic
LogicLogic
LogicLogic
LogicLogic
LogicLogic
ProcessorProcessor
Function blocksFunction blocks
FPGA
APIAPIAPIAPI
Abstraction layerAbstraction layer
ISE inputs
ISE inputs
EDKinputs EDK
inputs Platformdetails
Platformdetails
Settingsfor
other ICs
Settingsfor
other ICs
DevicedriversDevicedrivers
Platform
IB
20 TWEPP 2008
Design Methodology Roadmap
HDL
IP block integration
HDL
IP block integration
Programming language and development environment
Programming language and development environment
Software + APIs
Platform-baseddesign and debug
Software + APIs
Platform-baseddesign and debug
Domain-specificmethodologies
Component-basedsoftware
engineering
Domain-specificmethodologies
Component-basedsoftware
engineering
Back end technologyBack end technology
Flatteningsynthesisand PAR
Flatteningsynthesisand PAR
Structuredsynthesisand PAR
Structuredsynthesisand PAR
HoursHours MinutesMinutes
Compilation timeCompilation time
User interfaceUser interface
EDA worldGUI
(e.g. ISE)
EDA worldGUI
(e.g. ISE)
IDE worldGUI
(e.g. Eclipse)
IDE worldGUI
(e.g. Eclipse)
HardwareHardware SoftwareSoftware
Look and feelLook and feel
IB
21 TWEPP 2008
High Performance Compute Platform
I/OHub
I/OHub
IDE, FDC,USB, Etc.
PCI Express
PCI ExpBridge
PCI ExpBridge
MemoryCtlr Hub(MCH)
MemoryCtlr Hub(MCH)
ProcessorProcessorProcessor
ProcessorProcessorProcessor
PCI-XPCI-XBridge
PCI-XBridge
PCI-XPCI-XBridge
PCI-XBridge
DDR
PCI
I / O FPGAProcess
FPGA I / O FPGAProcess
FPGA I / O FPGAProcess
FPGA
Local DRAM
I/OHub
I/OHub
IDE, FDC,USB, Etc.
PCI Express
PCI ExpBridge
PCI ExpBridge
MemoryCtlr Hub(MCH)
MemoryCtlr Hub(MCH)
ProcessorProcessorProcessor
ProcessorProcessorProcessor
PCI-XPCI-XBridge
PCI-XBridge
PCI-XPCI-XBridge
PCI-XBridge
DDR
PCI
I / O FPGAProcess
FPGA I / O FPGAProcess
FPGA I / O FPGAProcess
FPGA
Local DRAM
Co- Processing
Peer Processing
DDR
AMD-8132™PCI-X®Bridge
AMD-8132™PCI-X®Bridge
PCI-X
Host Base RAID 5 Host Base RAID 5
PCI-X
BMCBMC SIOSIOTPMTPMBIOSBIOS
LPC
DDR
IB
22 TWEPP 2008
Heterogeneous Multi-Processor
Intel Socket 604
CPU / FPGAConfigurationsCPU / FPGA
Configurations
Four Independent 1066MHz Front Side Buses
“Clarksboro”7300 MCH
Four Memory Channels
IntelXeonCPU
ACP Module• 1066Mhz I/O Characterization• FSB protocol on FPGA• 8.5 GByte/s Memory BW
34GB/Sec32GB/Sec
IB
23 TWEPP 2008
C/C++/FORTRAN Source
Serialapplication
onstandard
processor
The Software Solution
High-performanceinterconnect
Dynamic
library call
C Inner Loop
ParallelalgorithmOn FPGA
Compile directly from C to FPGA-ISA
IB
24 TWEPP 2008
Abstraction API
Heterogeneous Clusters:• Intel, AMD x86 Binaries• FPGA Bitstreams• Infiniband & 10GE Backplane
• Message Passing Interface (MPI)– HPC Industry Standard API
• Communication in heterogeneous systems– Processor-to-Processor– Processor-to-Hardware Engine – Hardware Engine-to-Hardware Engine
• Identical API in C & HDL– Node Discovery (ID)– Node Programming (.exe and .bit)– Data Send & Receive– Synchronization
• Abstract and isolate hardware changes– Promotes portability– Allows code reuse– Improves system scalability
MPI
MPIMPI
IB
25 TWEPP 2008
Measure It and Fix ItThe Engineering Innovation Process
Combining Xilinx FPGA Technologies with Software and Hardware to help the “Domain Expert”
LEGO Mindstorms NXT“the smartest, coolest toy
of the year”100’s thousands of students
CERN Large Hadron Collider“the most powerful instrument
on earth”Engineering Team
Standard Component
Hardware and Backend Software
Consistent User/Domain-level Software Tools and COTS
Hardware
+
Design Prototype Deploy
IB
26 TWEPP 2008
Agenda
• The FPGA Trends
• The Triple Play Opportunity
• The Platform Approach
• V5 News
• Conclusions
IB
27 TWEPP 2008
Virtex-5 Common FeaturesVirtex-5 Common Features
PCI-ExpressEndpoint Blocks
PCI-ExpressEndpoint Blocks
GTP3.75 Gbps
Transceivers
GTP3.75 Gbps
Transceivers
6-LUT+
Express Fabric
6-LUT+
Express Fabric36Kb
Dual-Port Block RAM / FIFO
with ECC
36KbDual-Port
Block RAM / FIFO with ECC
SelectIO with IDELAY/ODELAY
and SerDes
SelectIO with IDELAY/ODELAY
and SerDes
550 MHz Clock Management
550 MHz Clock Management
25x18 DSP Slices25x18 DSP Slices
Advanced Configuration
Options
Advanced Configuration
Options
10/100/1000 Mbps Ethernet
MAC
10/100/1000 Mbps Ethernet
MAC
IntegratedSystem MonitorA/D Converter
IntegratedSystem MonitorA/D Converter
PowerPC440Processors
PowerPC440Processors
GTX6.5 Gbps
Transceivers
GTX6.5 Gbps
Transceivers
30% Higher Performance
Precision+
low jitter
Higher Precision
Higher Bandwidth
Saves Logic
New Capability
New Capability
PA
29 TWEPP 2008
More Than Just a PPC440…More Than Just a PPC440…
External External DDR2 DDR2
MemoryMemory
External External DDR2 DDR2
MemoryMemory
I/O Bus (PLB V46)I/O Bus (PLB V46)
Processor BlockProcessor Block DMADMA
DMADMA
SPLB1SPLB1
DMADMA
DMADMA
MCIMCIMCIMCI
MPLBMPLBMPLBMPLB
DCRDCR
APUControl
APUControl
CPMCPM
PowerPCPowerPC 440440
PowerPCPowerPC 440440
SPLB0PPC440MC_
DDR2PPC440MC_
DDR2
Counter
Timer
Counter
Timer RS232RS232
TEMACTEMACTEMACTEMAC
TEMACTEMACTEMACTEMAC
Wrapper
Wrapper
Wrapper
Wrapper
Wrapper
Wrapper
Wrapper
Wrapper
Memory Controller InterfaceMemory Controller Interface
Interrupt ControllerInterrupt
ControllerGPIOGPIO
• Four built-in DMA channels provide high speed access to memory or I/O • Separate memory and I/O buses greatly improve system performance • External masters can access memory or I/O through the crossbar
I2CI2C GPIOGPIO TimerTimer Interrupt ControllerInterrupt
ControllerRS232RS232 Custom IPCustom IP
PLBPLB
Crossbar
PA
30 TWEPP 2008
PPC440+128-bit FPU via APUPPC440+128-bit FPU via APU
• Soft co-processor module, free-of-chargeaccessible by the 440 processor instruction pipeline
• Single- and double-precision IEEE-754 compliant• Speed-up 6 to 30 times, 200 MFLOPS sustained
Processor BlockProcessor Block
PowerPCPowerPC 440440
PowerPCPowerPC 440440
FCB I/F
Register
file
Pipeline Interlock controller
Multiply
Add
Div
Sqrt
Abs/Neg
Fused add
Load
FPSCR
Decode
Store
FP to Int
Cmp
Int to FP
INSTRUCTION
DATA (load)
DATA (store)
INST_VALID
DATA_VALID
DONE
128-bit(V4=32-bit)
Reg addresses
APUAPUControlControl
APUAPUControlControl
PA
31 TWEPP 2008
FPGAFabric
Interface
FPGAFabric
Interface
GTX Multi-Gigabit TransceiverGTX Multi-Gigabit Transceiver
PMAPMA PCSPCS
PMAPMA PCSPCSTxTx
RxRx
GTX Transceiver• 8 to 24 transceivers per device (40 and 48 in ‘TXT subfamily)
• Supporting data rates from 150 Mbps to 6.5 Gbps
• Power dissipation less than 250 mW per channel
• Programmable Tx pre-emphasis and Rx equalization
PA
32 TWEPP 2008
6.5 Gb/s Transmit Eye Opening6.5 Gb/s Transmit Eye Opening
http://www.xilinx.com/support/documentation/virtex-5.htm#19312
PA
33 TWEPP 2008
New Additions to the FamilyNew Additions to the Family
• XC5VTX150T and ‘TX240T• Like ‘FX130T and ‘FX200T
minus the PPC microprocessor.but with twice the number of GTX transceivers40 and 48 GTXs respectively
• Availability: ES 4Q08, Production 1Q09
When you need lots of fast transceivers
PA
34 TWEPP 2008
Conclusions
• FPGA the programmable platform for– Transforming, transporting and computing digital data
• A trend towards specialized HW and SW to support programmable system solutions
• A strategy of working with Academics to enable exploration of new system applications & research
• Multi-gigabit transceivers are very popular• Moore’s Law will give us much more logic and lower cost
– Speed, power consumption, packaging pose a difficult challenge• Users want and need to improve design productivity
– The pace is becoming faster and more competitive.
PA/IB