VLSI-SoC 2001 IFIP - LIRMM
Stream-based Arrays: Converging Design Flows for both, Reconfigurable and Hardwired ....
Reiner Hartenstein
University of Kaiserslautern
December 2-4, 2001, Montpellier, France
© 2001, [email protected] http://www.fpl.uni-kl.de
University of Kaiserslautern
Xputer Lab
>> Stream-based Computing
• Stream-based Computing
• Stream-based Compilation Techniques
• Use in Co-Design
• Now it’s up to You !
http://www.uni-kl.de
commercial rDPAs: rDPA (coarse grain) becoming important
XPU family (IP cores): PACT Corp., Munich (XPU128)
flexible array: MorphICs
CALISTO: Silicon Spice
CS2000 family: Chameleon Systems
MECA family: Malleable
FIPSOC: SIDSA
ACM: Quicksilver Tech
CHESS array: Elixent
MorphoSys: Morpho Tech
**) bought
http://pactcorp.com
Example: KressArray Family
SNN filter, array size: 10 x 16 = 160 rDPUs
[Figure: mapped array. Legend: rDPU not used; used for routing only (rout thru only); operator and routing; port location marker; backbus connect]
KressArray Xplorer: http://kressarray.de
You may use it on your Netscape
Rapidly toward the Break-through
• replace Concurrent Processes by more efficient parallelism: stream-based DPAs 1)
Kress: a generalization of systolic array synthesis: stream-based rDPAs 2)
Generalization of the Systolic Array: super systolic synthesis
1) systolic array* [1980]; [Brodersen] Bee Project, chip-on-a-day* [2000]
2) KressArray** [1995] and others [later]
*) hardwired
**) reconfigurable
terms:
DPU: datapath unit
DPA: data path array
rDPU: reconfigurable DPU
rDPA: reconfigurable DPA
compare Concurrent Computing
[Figure: several CPUs, each a DPU with its own instruction sequencer, .... connected by bus(es) or switch box]
extremely inefficient:
• massive bottleneck phenomena at run time
• control flow overhead
• instruction fetch / interpretation overhead
• address computation overhead - may be massive
... with Stream-based Computing:
(r)DPA for both,
• reconfigurable, and
• hardwired [Brodersen]
[Figure: 3 x 3 array of DPUs]
• transport-triggered execution driven by data stream from / to memory or from / to peripheral interface
• no instruction sequencer inside ! avoids run time overhead and bottleneck phenomena
rDPA: drastically reduced reconfigurability overhead
• „instruction fetch“: at compile time
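The transport-triggered idea can be illustrated with a small sketch. It assumes a linear pipe of three DPUs whose operators are fixed before any data flows (standing in for "instruction fetch at compile time") and which fire only when a data item arrives. The names `dpu` and `configure_pipe` and the chosen operators are illustrative assumptions, not part of the Xputer tool flow.

```python
from typing import Callable, Iterable, Iterator, List


def dpu(op: Callable[[int], int], stream: Iterable[int]) -> Iterator[int]:
    """One DPU: applies its fixed operator whenever a data item arrives."""
    for item in stream:
        yield op(item)


def configure_pipe(ops: List[Callable[[int], int]],
                   source: Iterable[int]) -> Iterator[int]:
    """Chain the DPUs once, before any data flows (the 'compile time' step)."""
    stream: Iterable[int] = source
    for op in ops:
        stream = dpu(op, stream)
    return iter(stream)


# Hypothetical configuration: scale by 3, add 1, shift right by one bit.
pipe_ops = [lambda v: v * 3, lambda v: v + 1, lambda v: v >> 1]

# The data stream alone drives execution; no per-item instruction is fetched.
result = list(configure_pipe(pipe_ops, [1, 2, 3, 4]))
print(result)  # [2, 3, 5, 6]
```

Nothing inside the pipe decides what to do next at run time; the only run-time event is the arrival of the next stream item.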
Soft rDPA ?
[Figure: Memory, soft CPU, miscellaneous, several soft DPU arrays; HLL Compiler]
• 50 million system gates soon
• even large rDPAs as soft IPs become feasible
• by >2005: don’t care about area efficiency ?
>> Stream-based Compilation Techniques
• Stream-based Computing
• Stream-based Compilation Techniques
• Use in Co-Design
• Now it’s up to You !
http://www.uni-kl.de
Systolic Stream-based Computing System
Systolic Array [H. T. Kung, 1980]: a DPA (Data Path Array)
The Mathematician’s Synthesis Method: equations, then linear projection or algebraic mapping, then placement (no routing !)
linear pipelines and uniform arrays only
DPU architecture: operators + and *, operands y, x, a
computing in space and computing in time: migration by re-timing and other transformations (systolic arrays etc.)
this dichotomy is completely ignored by our CS curricula
[Figure: placement of the coefficients a11, a12, a13, a21, a22, a23, a31, a32, a33; data streams x1, x2, x3; y10, y20, y30; y1, y2, y3]
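As an illustration of computing in space with skewed data streams, here is a minimal cycle-by-cycle sketch of a linear pipe of DPUs computing y = y0 + A·x. It rests on assumptions that the slide does not spell out: each DPU performs y_out = y_in + a * x, DPU j keeps x[j] resident, and the coefficients arrive as a skewed stream. The function name `systolic_matvec` is hypothetical.

```python
from typing import List, Optional, Tuple


def systolic_matvec(a: List[List[float]], x: List[float],
                    y0: List[float]) -> List[float]:
    """Cycle-by-cycle model of a linear pipe of len(x) DPUs.

    Assumption: DPU j keeps x[j] and computes y_out = y_in + a * x.
    Partial sum y_i enters DPU 0 at cycle i and advances one DPU per cycle,
    so DPU j consumes coefficient a[i][j] at cycle i + j (skewed stream).
    """
    rows, cols = len(a), len(x)
    # pipe[j]: (row index, partial sum) latched at the output of DPU j
    pipe: List[Optional[Tuple[int, float]]] = [None] * cols
    y = [0.0] * rows
    for cycle in range(rows + cols - 1):
        # Update from the last DPU backwards so each DPU reads last cycle's value.
        for j in range(cols - 1, -1, -1):
            if j > 0:
                incoming = pipe[j - 1]
            else:
                incoming = (cycle, y0[cycle]) if cycle < rows else None
            if incoming is None:
                pipe[j] = None
            else:
                i, partial = incoming
                pipe[j] = (i, partial + a[i][j] * x[j])
        if pipe[-1] is not None:        # a finished y_i leaves the pipe
            i, total = pipe[-1]
            y[i] = total
    return y


y = systolic_matvec([[1, 2], [3, 4]], [1, 1], [0, 10])
print(y)  # [3, 17]
```

The whole computation takes rows + cols - 1 cycles, and no DPU ever needs an instruction stream: its position in the pipe fixes what it does.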
General Stream-based Computing System
heterogeneous DPA or rDPA
Mapper: simultaneous placement & routing by simulated annealing
Mapper input: expression tree, DPU architectures
Mapper result: free form pipe network
Scheduler: data streams
[Figure: steps numbered 1 to 4; DPU with operators + and *, operands y, x, a; mapped array with operators +, *, sh, xf, -]
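A mapper based on simulated annealing can be sketched as follows. This is a deliberately reduced sketch, not the KressArray mapper: it covers placement only and uses total Manhattan wire length as a stand-in for routing cost, whereas the slide describes simultaneous placement and routing. The netlist, the grid size, the linear cooling schedule, and the names `wire_length` and `anneal_placement` are all assumptions made for illustration.

```python
import math
import random
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]


def wire_length(place: Dict[str, Cell], nets: List[Tuple[str, str]]) -> int:
    """Total Manhattan distance over all producer/consumer pairs."""
    return sum(abs(place[u][0] - place[v][0]) + abs(place[u][1] - place[v][1])
               for u, v in nets)


def anneal_placement(nodes: List[str], nets: List[Tuple[str, str]],
                     width: int, height: int, steps: int = 2000,
                     t0: float = 2.0, seed: int = 0):
    """Place operators on a width x height grid (needs at least 2 cells)."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(height) for c in range(width)]
    # slots[k] names the operator on cells[k]; None marks an unused rDPU.
    slots: List[Optional[str]] = list(nodes) + [None] * (len(cells) - len(nodes))
    rng.shuffle(slots)

    def placement() -> Dict[str, Cell]:
        return {n: cells[k] for k, n in enumerate(slots) if n is not None}

    cost = wire_length(placement(), nets)
    initial_cost, best, best_cost = cost, placement(), cost
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9      # linear cooling (assumed)
        i, j = rng.sample(range(len(slots)), 2)
        slots[i], slots[j] = slots[j], slots[i]      # propose a swap
        new_cost = wire_length(placement(), nets)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost                          # accept, possibly uphill
            if cost < best_cost:
                best, best_cost = placement(), cost
        else:
            slots[i], slots[j] = slots[j], slots[i]  # reject: undo the swap
    return best, best_cost, initial_cost


# Hypothetical expression tree: sh(y + a * x)
nodes = ["x", "a", "y", "mul", "add", "sh"]
nets = [("x", "mul"), ("a", "mul"), ("mul", "add"), ("y", "add"), ("add", "sh")]
best, best_cost, initial_cost = anneal_placement(nodes, nets, width=3, height=3)
print(initial_cost, "->", best_cost)
```

Accepting some cost-increasing swaps while the temperature is high is what lets the search escape poor local arrangements; the best placement seen so far is kept separately so the result is never worse than the starting point.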
Memory Communication Architecture …
• hot research topic in embedded systems
• storage context transformations [Catthoor, Herz, Kougia, Soudris]
• Synthesizable Memory Communication Architecture
• startups provide memory IPs or generators
• an example by Nageldinger’s KressArray Xplorer
[Figure: Optimized Parallel Memory Controller. Legend: application; not used; sequencers; memory ports]
GAG generic sequencer methodology available [Herz]
>> Use in Co-Design
• Stream-based Computing
• Stream-based Compilation Techniques
• Use in Co-Design
• Now it’s up to You !
http://www.uni-kl.de
Computer: the wrong Machine Paradigm (“von Neumann”)
[Figure: Compiler, Memory, Sequencer (program counter: state register), Datapath (hardwired)]
• tightly coupled by compact instruction code
• does not support soft data paths
Xputer: The Soft Machine Paradigm
[Figure: Scheduler, Compiler, Memory, (multiple) sequencer (data counter(s)), Datapath Array (reconfigurable)]
• loosely coupled by decision data bits only
• reconfigurable; also for hardwired [Brodersen]
enabling technology published 10 years ago
now a hot topic area
full day course last week at Tampere, Finland
Hardware / Software Co-Design turns to Configware / Software Co-Design
Jürgen Becker’s Co-DE-X Co-Compiler [ASP-DAC’95]
Co-Compilation
high level programming language source
partitioning compiler: Analyzer / Profiler, Partitioner, Resource Parameters (supporting different platforms)
GNU C compiler: Software running on Processor (Computer Machine Paradigm)
X-C compiler, DPSS: Configware running on KressArray, Reconfigurable Accelerators (Xputer “Soft” Machine Paradigm)
interface
Loop Transformation Examples
resource parameter driven Co-Compilation
sequential processes:
loop 1-16 body endloop
strip mining:
fork
loop 1-8 body endloop
loop 9-16 body endloop
join
host:
loop 1-8 trigger endloop
loop 1-4 trigger endloop
loop 1-2 trigger endloop
reconf. array:
loop 1-8 body body endloop
loop unrolling
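The two transformations named on this slide can be sketched in ordinary code. This is an illustrative sketch only: the loop body is hypothetical, the strips run one after the other here although the fork / join form would let them run on separate resources, and the sketch assumes the strip count and the unrolling factor divide the trip count evenly.

```python
from typing import List


def body(i: int, out: List[int]) -> None:
    """Hypothetical loop body: one independent operation per index."""
    out[i - 1] = i * i


def sequential(n: int) -> List[int]:
    out = [0] * n
    for i in range(1, n + 1):               # loop 1-16 body endloop
        body(i, out)
    return out


def strip_mined(n: int, strips: int) -> List[int]:
    """Split the index range into equal strips (fork ... join)."""
    out = [0] * n
    size = n // strips                       # assumes strips divides n
    for s in range(strips):                  # each strip could get its own resource
        for i in range(s * size + 1, (s + 1) * size + 1):
            body(i, out)
    return out


def unrolled(n: int, factor: int) -> List[int]:
    """Run `factor` bodies per trip, so the trip count drops to n // factor."""
    out = [0] * n
    for trip in range(n // factor):          # factor 2: loop 1-8 body body endloop
        for k in range(factor):              # in an array these bodies sit side by side
            body(trip * factor + k + 1, out)
    return out


reference = sequential(16)
same_after_strip_mining = strip_mined(16, 2) == reference
same_after_unrolling = unrolled(16, 2) == reference
print(same_after_strip_mining, same_after_unrolling)  # True True
```

Both transformations leave the result unchanged; what changes is how many bodies can be active at once, which is why the available resources steer the choice of strip count and unrolling factor.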
>> Now it’s up to You !
• Stream-based Computing
• Stream-based Compilation Techniques
• Use in Co-Design
• Now it’s up to You !
http://www.uni-kl.de
However, current CS Education …. is based on the Submarine Model
Hardware invisible: under the surface
Brain usage: procedural-only
[Figure: layers from top to bottom: Algorithm; procedural high level Programming Language; Assembly Language (Software); Hardware]
Software Faculty Colleagues shy away from the Paradigm Shift: their Brain hurts? - can’t be: this Half has been amputated
This model disables ...
... this model disables Hardware and Software as Alternatives
[Figure: Algorithm, then partitioning into: Software only; Software & Hardw/Configw; Hardw/Configw only]
Software: procedural
Hardware, Configware: structural
Brain Usage: both Hemispheres
The Dominance of the Submarine Model ...
... indicates that our CS education system produces zillions of mentally disabled Persons
(procedural) structurally disabled
… completely disabled to cope with solutions other than software only
It’s time to attack the software faculty dictatorship. Get involved!
>>> thank you
thank you for listening
It’s up to You !
>>> END
The Impact of Reconfigurable Logic
• Reconfigurable platforms bring a new dimension to digital system development and have a strong impact on SoC design.
• A rapidly growing large user base of HDL-savvy designers with FPGA experience.
• Flexibility promises turn-around times down to minutes instead of months for real time in-system debugging, profiling, verification, tuning, field-maintenance, and field upgrades
• A New Business Model (in-field debugging and upgrading ... )
• A Fundamental Paradigm Shift in Silicon Application
[Figure [T. Kean]: Revenue / month over Time / months (1, 10, 20, 30); curves: ASIC Product; reconfigurable Product with download (Product, Update 1, Update 2)]
The History of Paradigm Shifts
“Mainstream Silicon Application is switching every 10 Years”
Makimoto’s Wave (published in 1989)
“The Programmable System-on-a-Chip is the next wave“
[Figure: wave alternating between standard and custom over 1957, 1967, 1977, 1987, 1997, 2007; labels: TTL; LSI, MSI; µproc., memory; ASICs, accel’s; reconfigurable; ??]
How’s next Wave ?
Tredennick’s Paradigm Shifts:
hardwired: algorithm: fixed, resources: fixed
procedural programming: algorithm: variable, resources: fixed
structural programming: algorithm: variable, resources: variable
Hartenstein’s Curve: no further wave !
[Figure: wave alternating between standard and custom over 1957, 1967, 1977, 1987, 1997, 2007; labels: FPGAs; Coarse grain RAs]
The Impact of Makimoto’s Paradigm Shifts
Procedural personalization via RAM-based Machine Paradigm: Software Industry’s Secret of Success
structural personalization: RAM-based before run time: Configware Success Story by new Machine Paradigm
Dr. Makimoto: FPL 2000 keynote
[Figure: wave alternating between standard and custom over 1957, 1967, 1977, 1987, 1997, 2007; labels: TTL; LSI, MSI; µproc., memory; ASICs, accel’s; reconfigurable]
The History of Paradigm Shifts
“Mainstream Silicon Application is switching every 10 Years”
Makimoto’s Wave
[Figure: wave alternating between standard and custom over 1957, 1967, 1977, 1987, 1997, 2007; labels: TTL; LSI, MSI; µproc., memory; ASICs, accel’s; FPGAs; coarse grain]
KressArray Family generic Fabrics: a few examples
select Nearest Neighbour (NN) Interconnect: an example
Select mode, number, width of NNports
more NNports: rich Rout Resources
Select Function Repertory: route-through and function; route-through only
Examples of 2nd Level Interconnect: layouted over rDPU cell - no separate routing areas !
Wired by Abutment
[Figure: rDPU cells with NNport labels 16, 32, 8, 24, 4, 2]