Reiner Hartenstein (invited paper, invited book chapter): The von Neumann Syndrome; Stamatis Vassiliadis Memorial Symposium, Sep 28, 2007, Delft, Netherlands 1
Reiner Hartenstein, TU Kaiserslautern, Germany http://hartenstein.de
Reconfigurable Computing and the von Neumann Syndrome
Reiner Hartenstein
TU Delft, Sept 28, 2007
© 2007, [email protected] http://hartenstein.de
TU Kaiserslautern
(Preface:) it’s old stuff!
• Most of the enabling technologies of Reconfigurable Computing were published in the 1970s and 1980s; they are also the keys to coping with the von Neumann syndrome*
• The CS community has mostly ignored this, owing to the tunnel vision of a reductionist mind set.
• We need to think outside the box: R&D and education need a twin paradigm approach
*) this term was coined by C. V. Ramamoorthy
Collection of material
• vN critics Backus - Arvind
• Dataflow critics Dan Gajski
• Microprogramming Manchester
• HDL (Computer Mag)
• 2 brain hemispheres
• Overhead-based vN
• Manycore programming crisis
non-von-Neumann accelerators
• Reconfigurable Computing (RC) is found everywhere. See the one-click-per-area page: http://xputers.informatik.uni-kl.de/RCeducation07/pervasiveness.html
• Many papers report speed-ups of up to 4 orders of magnitude from software-to-configware migrations (onto FPGAs)
• The configware industry is the growing counterpart to the software industry. Microsoft is working intensively on a configware operating system, probably to be part of later releases of Windows.
• The basic machine paradigm underlying configware is not instruction-stream-driven. For this reason it is going to turn the entire traditional CS mind set upside down
• Together with the many-core crisis, the disruptive RC mind set is heavily shaking the foundations of Computer Science. See, for instance, (HPCwire:) Confronting Parallelism: The View from Berkeley: http://www.hpcwire.com/hpc/1288079.html
• The Landscape of Parallel Computing Research: A View From Berkeley: http://view.eecs.berkeley.edu/wiki/Main_Page
• FPGAs have been mainstream in embedded systems for years; with US$7 billion, they are the fastest-growing segment of the microchip market.
• There are masses of books on using FPGAs: http://www.fpl.uni-kl.de/FPGAbooks/
• A trailblazing book will appear toward the end of 2007 with Springer Verlag: Christophe Bobda: "Introduction to Reconfigurable Computing Systems - Architectures, Algorithms and Applications". For Reconfigurable Computing it should play a role similar to that played for VLSI design by the groundbreaking historical book (1979, 1980): "Introduction to VLSI Systems" by Carver Mead and Lynn Conway
• More than 170 international conference series cover Reconfigurable Computing: http://hartenstein.de/NewJournal.pdf
Moore disaster
• Moore said nothing about improved gigaFLOPS per $, per watt, or per square inch
• Increasing passive power has stalled the entire industry
Carpal tunnel syndrome
• The vN syndrome is also a tunnel syndrome: a tunnel vision syndrome, leading into a tunnel of horror
Feedback loop
• Wiener 1961
• Implemented in the ancient world
• Re-discovered for steam engines
• Digitalized and made programmable: Zuse / von Neumann
[Figure: “CPU” = sequencer (controller) + DPU; labels: instruction stream, evoke, decision data]
Feedback loop
[Figure: “rDPA” made of rDPUs, each with its reconfiguration code]
• We are embarking on a new computing age: the age of massive parallelism.
• Will everyone have multiple parallel computers at their disposal every day?
• Smith: Yes. Even mobile devices will exploit multicore processors, not only for better performance but also to extend battery life by replacing the relatively power-hungry serial processors used today.
• Are there prospects for global address space (GAS) languages?
• An increase in the population of HPC-competent people, according to Smith. He anticipates that the mainstream will adopt desktops as their own "personal supercomputers," while smart phones will be used as PDAs, MP3 players, and so on. The reinvention of the computing profession is a job not just for universities, but for companies such as Microsoft, which must make the developer community familiar with the new computing philosophy, Smith contends.
Arthur Schopenhauer
• Arthur Schopenhauer: "Approximately every 30 years, we declare the scientific, literary and artistic spirit of the age bankrupt. In time, the accumulation of errors collapses under the absurdity of its own weight."
• Reiner H.: "Mesmerized by the Gordon Moore Curve, we in computer science slowed down our own learning curve. Finally, after 60 years, we are witnessing the collapse of the spirit from the Mainframe Age, triggered by the runaway breakthrough of Reconfigurable Computing."
Outline
• The Pervasiveness of FPGAs
• The Reconfigurable Computing Paradox
• The Gordon Moore gap
• The von Neumann syndrome
• The Anti Machine
• We need a twin paradigm approach
• Conclusions
Tools etc.
• Look it up in the wiki
Configware: more compute power than from software
• 80% of all (micro)processors are embedded
• 25% of embedded µProcs are accelerated by FPGA(s) (very cautious estimate)
• -> Every 5th µProc is accelerated by FPGA(s)
• average acceleration factor > 5 (very pessimistic estimate)
• Conclusion: most compute power comes from configware
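The estimate on this slide can be retraced with a few lines of arithmetic. This is only a sketch under an added assumption that is not on the slide: every microprocessor contributes one normalized unit of compute power, and an FPGA-accelerated one contributes that unit times the acceleration factor. The variable names are illustrative.

```python
from fractions import Fraction

# Figures taken from the slide
embedded_share = Fraction(80, 100)     # 80% of all microprocessors are embedded
accelerated_share = Fraction(25, 100)  # 25% of embedded ones are FPGA-accelerated
acceleration_factor = 5                # slide says "> 5"; 5 is used as a lower bound

# Assumption (not from the slide): each processor alone delivers 1 unit of compute power
accelerated_fraction = embedded_share * accelerated_share   # every 5th processor
software_power = Fraction(1)                                 # all processors, unaccelerated
configware_power = accelerated_fraction * acceleration_factor

print(accelerated_fraction, configware_power >= software_power)
```

Under these assumptions the accelerated fifth already delivers as much as all processors together at a factor of exactly 5, so any factor above 5 tips the balance toward configware.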
FPGAs as accelerators found everywhere
Pervasiveness of RC
http://hartenstein.de/pervasiveness.html
mirror: http://www.fpl.uni-kl.de/RCeducation08/pervasiveness.html
One click per keyword on this list shows the number of Google hits
Outline
• The Pervasiveness of FPGAs
• The Reconfigurable Computing Paradox
• The Gordon Moore gap
• The von Neumann syndrome
• The Anti Machine
• We need a twin paradigm approach
• Conclusions
simple FPGAs
coarse-grained arrays
saving energy
Software-to-Configware (FPGA) Migration: some published speed-up factors [2003 – 2005]
[Figure: chart of speed-up factors on a logarithmic axis from 10^0 to 10^6]
DSP and wireless:
• Reed-Solomon decoding: 2400
• MAC: 1000
• Viterbi decoding: 400
• FFT: 100
Image processing, pattern matching, multimedia:
• real-time face detection: 6000
• video-rate stereo vision: 900
• pattern recognition: 730
• SPIHT wavelet-based image compression: 457
Bioinformatics:
• Smith-Waterman pattern matching: 288
• BLAST: 52
• protein identification: 40
Astrophysics:
• GRAPE: 20
Other:
• crypto: 1000
• molecular dynamics simulation: 88
Software-to-Configware (FPGA) Migration: the RC paradox
[Figure: the chart of published speed-up factors shown again, with one additional unlabeled value: 3000]
• deficiency factor: >10,000
• speed-up factor: 6,000
• total discrepancy: >60,000,000
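The "total discrepancy" is simply the product of the two factors quoted on the slide. A minimal check, with variable names chosen here for illustration:

```python
import math

deficiency_factor = 10_000   # slide: "> 10,000", used here as a lower bound
speed_up_factor = 6_000      # slide: highest published speed-up factor

total_discrepancy = deficiency_factor * speed_up_factor
orders_of_magnitude = math.log10(total_discrepancy)

print(total_discrepancy, round(orders_of_magnitude, 1))
```

So the gap between what the deficiency factor would suggest and what is actually measured spans almost eight orders of magnitude.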
“simple” FPGAs are only the beginning
• Less discrepancy for platform FPGAs and coarse-grained reconfigurable arrays
Hollerith
• Prototyped 1884 by Hollerith
The first Reconfigurable Computer
• Prototyped 1884 by Herman Hollerith
• A century before FPGA introduction
• Search for images: chickens + 2 oxen
Executive Summary doesn’t help
• 2 strong Reconfigurable Computing oxen vs. 1024 von Neumann chickens?
• manycore critics decades ago?
Outline
• The Pervasiveness of FPGAs
• The Reconfigurable Computing Paradox
• The Gordon Moore gap
• The von Neumann syndrome
• The Anti Machine
• We need a twin paradigm approach
• Conclusions
in & the multicore crisis
Moore’s law not applicable to all aspects of VLSI
What is the reason for the paradox?
• The Gordon Moore curve does not indicate performance
• The peak clock frequency does not indicate performance
• The law of Gates: astronomic code size causes massive overhead, due to the von Neumann syndrome
Rapid Decline of Computational Density
[Figure (BWRC, UC Berkeley, 2004): SPECfp2000/MHz/billion transistors (axis 0 to 200) versus year (1990 to 2005); curve labels: HP alpha: down by 100 in 6 yrs; IBM: down by 20 in 6 yrs]
• stolen from Bob Colwell
• memory wall, caches, ...
• primary design goal: avoiding a paradigm shift
• dramatic demo of the von Neumann Syndrome
Outline
• The Pervasiveness of FPGAs
• The Reconfigurable Computing Paradox
• The Gordon Moore gap
• The von Neumann syndrome
• The Anti Machine
• We need a twin paradigm approach
• Conclusions
the overhead-prone paradigm
refusing the paradigm shift
Avoiding the paradigm shift?
• “It is feared that domain scientists will have to learn how to design hardware. Can we avoid the need for hardware design skills and understanding?” (Tarek El-Ghazawi, panelist at SuperComputing 2006)
• “A leap too far for the existing HPC community” (panelist Allan J. Cantle)
• SuperComputing, Nov 11-17, 2006, Tampa, Florida: over 7000 registered attendees and 274 exhibitors
• We need a bridge strategy: developing advanced tools for training the software community to think in fine-grained parallelism and pipelining techniques.
The von Neumann Syndrome
• The instruction-stream-based von Neumann approach: has several von Neumann overhead phenomena per CPU!
• The data-stream-based anti machine approach: has no von Neumann bottlenecks
• the watering pot model [Hartenstein]
The Law of more
• 1000 processors running in parallel means 1000 instruction streams with all their overhead phenomena
Have to re-think basic assumptions
• Instead of physical limits, fundamental misconceptions of algorithmic complexity theory limit the progress and will necessitate new breakthroughs.
• It is not processing that is costly, but moving data and messages
• We have to re-think the basic assumptions behind computing
Refusing the paradigm shift leads to …
• an array of overhead phenomena
• waste of researcher capacity on “speculative” methods; the newest: “transactional memory”
• multithreading* is not the silver bullet
• highly disappointing computational density
• the multicore programming crisis
• massive programmer productivity decline
• massive software engineering problems
*) is nondeterministic
blind in one eye …
• Most “computer scientists” have mainly ignored the RC breakthrough
• Curriculum recommendations fail to address most of the IT job market
• instruction-stream-based only: blind in the other eye …
• … reductionist tunnel vision …
Outline
• The Pervasiveness of FPGAs
• The Reconfigurable Computing Paradox
• The Gordon Moore gap
• The von Neumann syndrome
• The Anti Machine
• We need a twin paradigm approach
• Conclusions
instruction-stream vs. data stream
history of systolic arrays
bridging the chasm: old hat
Von Neumann CPU
[Figure: CPU = DPU + program counter, attached to RAM memory]
term | program counter | execution triggered by | paradigm
CPU | yes | instruction fetch | instruction-stream-based
World of Software Engineering
Program source: Software
Data-stream-based
• in contrast to von Neumann, which is instruction-stream-based, the anti machine is data-stream-based (no instruction fetch at run time)
• Sequencing by one or multiple data counters (each located with an ASM*)
• The history of data streams …….
*) ASM = auto-sequencing memory block
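A toy model may make the data counter idea more concrete. The sketch below is not the author's implementation; the class and function names (`AutoSequencingMemory`, `data_counter`, `dpu_multiply_accumulate`) are hypothetical. It assumes that each ASM is configured before run time with a base address, a stride and a count, and that the DPU is a fixed datapath that only consumes the arriving data streams.

```python
from typing import Iterator, List


def data_counter(base: int, stride: int, count: int) -> Iterator[int]:
    """Hypothetical data counter: yields an address sequence fixed before run time."""
    for i in range(count):
        yield base + i * stride


class AutoSequencingMemory:
    """Toy ASM: a memory block that carries its own data counter."""

    def __init__(self, cells: List[int], base: int, stride: int, count: int) -> None:
        self.cells = cells
        self.counter = data_counter(base, stride, count)

    def stream(self) -> Iterator[int]:
        # The memory emits a data stream on its own; nothing is fetched as an instruction.
        for address in self.counter:
            yield self.cells[address]


def dpu_multiply_accumulate(a: Iterator[int], b: Iterator[int]) -> int:
    """Configured datapath: multiply-accumulate over two incoming data streams."""
    acc = 0
    for x, y in zip(a, b):
        acc += x * y
    return acc


memory = [1, 2, 3, 4, 5, 6, 7, 8]
asm_a = AutoSequencingMemory(memory, base=0, stride=2, count=4)  # streams 1, 3, 5, 7
asm_b = AutoSequencingMemory(memory, base=1, stride=2, count=4)  # streams 2, 4, 6, 8
result = dpu_multiply_accumulate(asm_a.stream(), asm_b.stream())
print(result)
```

The point of the sketch is the division of labour: sequencing lives with the memory (two data counters here), while the processing unit has no program counter at all.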
Here is the common model
[Figure: three generations of the machine model]
• mainframe age: CPU (DPU + program counter) with RAM memory, connected via the von Neumann bottleneck; von Neumann instruction-stream-based machine; software
• microprocessor age: CPU (instruction-stream-based, software) with hardwired accelerator co-processors (data-stream-based, hardware)
• configware age: CPU (software) with reconfigurable accelerator (configware) and hardwired accelerator; software/configware co-compiler
Overhead avoided by anti machine
# | feature | von Neumann machine | hardwired anti machine, reconfigurable anti machine
11 | state address computation overhead at run time | instruction stream | none
12 | data address computation overhead at run time | instruction stream | none
13 | inter-PU communication overhead at run time | instruction stream | none
14 | instruction fetch at run time | instruction stream | none
15 | data meet PU at run time | instruction stream | none
16 | synchronization overhead | instruction stream | none
Data meeting the Processing Unit (PU)
We have 2 choices:
• by Software: routing the data by memory-cycle-hungry instruction streams through shared memory
• by Configware: placement of the execution locality ... pipe network generated by configware compilation
... partly explaining the RC paradox
Outline
• The Pervasiveness of FPGAs
• The Reconfigurable Computing Paradox
• The Gordon Moore gap
• The von Neumann syndrome
• The Anti Machine
• We need a twin paradigm approach
• Conclusions
Dual paradigm: old hat
• Software mind set: instruction-stream-based: flow chart -> control instructions (FSM: state transition)
• Mapped into a hardware mind set: action box = flip-flop, decision box = (de)multiplexer
• -> Register Transfer Modules (DEC: mid 1970s); similar concept: Case Western Reserve Univ.
[Figure labels: FF, token bit, evoke]
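The flow-chart-to-hardware mapping can be imitated in software as a one-hot controller. This is an illustrative sketch only, not DEC's Register Transfer Modules: each action box is modelled as one flip-flop holding a token bit, the token evokes the action, and the decision box routes the token like a demultiplexer. The flow chart (sum the integers below `n`) and all names are chosen here for illustration.

```python
def run_token_controller(n: int):
    """Sum 0..n-1 with a one-hot token controller; returns (total, clock ticks)."""
    # One flip-flop per action box; exactly one of them holds the token bit.
    token = {"init": 1, "body": 0, "done": 0}
    i = total = clocks = 0
    while not token["done"]:
        nxt = {"init": 0, "body": 0, "done": 0}
        if token["init"]:        # token evokes the action "i := 0; total := 0"
            i, total = 0, 0
        elif token["body"]:      # token evokes the action "total += i; i += 1"
            total += i
            i += 1
        # Decision box = demultiplexer: the condition routes the token onward.
        if i < n:
            nxt["body"] = 1
        else:
            nxt["done"] = 1
        token = nxt
        clocks += 1
    return total, clocks


total, clocks = run_token_controller(4)
print(total, clocks)
```

Read as software, the dictionary is just a state variable; read as hardware, it is a ring of flip-flops with the condition wired to a demultiplexer, which is the duality the slide points at.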
Dual paradigm: old hat (2)
• “procedure call” or function call: call Module-name (parameters);
• Software: time domain
• Hardware Description Languages; hardware description: space domain
Old hat: we just need to accept it
We need a twin paradigm approach
• We need a duality of 2 cultures:
• a kind of transdisciplinary approach
• 1) the instruction-stream-based mind set = computing in time (procedural semantics)
• and 2) the data-stream-based mind set = computing in space (structural semantics)
We do not need a paradigm shift. We must adopt the second paradigm.
Why the two paradigms are twins
• Both paradigms have the same syntax rules
• Their sequencers use the same circuitry
• Their semantics differ only slightly
• But there is an external asymmetry:
• The location of the counter (with the CPU or with memory)
• The number of counters: single (program counter), multiple (data counters)
Similarity of Programming Language Paradigms
language category | instruction stream languages | data stream languages
sequencing | both: deterministic procedural sequencing: traceable, checkpointable
operation sequence driven by | read next instruction, goto (instr. addr.), jump (to instr. addr.), instr. loop, loop nesting; no parallel loops, escapes, instruction stream branching | read next data item, goto (data addr.), jump (to data addr.), data loop, loop nesting, parallel loops, escapes, data stream branching
state register | program counter | data counter(s)
address computation | massive memory cycle overhead | overhead avoided
instruction fetch | memory cycle overhead | overhead avoided
parallel memory bank access | interleaving only | no restrictions
Outline
• The Pervasiveness of FPGAs
• The Reconfigurable Computing Paradox
• The Gordon Moore gap
• The von Neumann syndrome
• The Anti Machine
• We need a twin paradigm approach
• Conclusions
Conclusions
• De facto performance of von Neumann computing systems is dramatically behind the expectations from the Gordon Moore curve
• Massive von Neumann parallelism causes a progressive decline of programmer productivity
• Trouble stems from a refused paradigm shift
• Reconfigurable Computing provides improvement by orders of magnitude
• We need a twin paradigm education
• Upgrading CS curriculum recommendations is overdue
RCeducation 2008
http://www.fpl.uni-kl.de/RCeducation08/
The 3rd International Workshop on Reconfigurable Computing Education
April 10, 2008, Montpellier, France
The Configware Age
• Mainframe age and microprocessor(-only) age are history
• We are living in the configware age right now!
• Attempts to avoid the paradigm shift will again create a disaster
FPGA experts needed
• Copy the job advertisement: FPGA expert sought
• Acutely scarce
thank you for your patience
Impact of Makimoto’s wave
[Figure: Makimoto’s wave, 1957 to 2007 in 10-year steps, alternating between standard and custom: TTL; LSI, MSI; µproc., memory; ASICs, accel’s]
• Procedural personalization via RAM-based Machine Paradigm: Software Industry’s Secret of Success
• Personalization (CAD) before fabrication
• Structural personalization: RAM-based, before run time: Configware Industry
• Repeat Success Story by new Machine Paradigm!
Computer: the wrong Machine Paradigm (“von Neumann”)
[Figure: compiler, RAM, instructions, sequencer with program counter (state register), hardwired datapath]
• Computer: tightly coupled by compact instruction code
• “von Neumann” does not support soft data paths
Xputer: The Soft Machine Paradigm (anti machine)
[Figure: scheduler, compiler, RAM, (multiple) sequencers with data counters, “instructions”, datapath array; reconfigurable, also for hardwired]
• Xputer: loosely coupled by decision data bits only
(© 2001, University of Kaiserslautern, Xputer Lab)
Reconfigurable semiconductor market
Top 4 PLD Manufacturers 2000 (total: $3.7 billion):
• Xilinx 42%
• Altera 37%
• Lattice 15%
• Actel 6%
Configware Market:
• [Dataquest] > $7 billion by 2003
• PLD vendors and their alliances provide libraries of “soft IPs”
• fastest growing semiconductor market segment
fine-grained: cLBs, rLBs: configurable logic blocks
coarse-grained: rDPUs: configurable functional blocks
• PACT AG, Munich, Germany http://pactcorp.com
• Quicksilver, San Jose http://quicksilver-tech.com
Semiconductor Revolutions
“Mainstream Silicon Application is switching every 10 Years”
[Figure: wave from 1957 to 2007 in 10-year steps, alternating between standard and custom: TTL; LSI, MSI; µproc., memory; ASICs, accel’s]
• 1st design crisis: hardware people, new breed (M&C)
• 2nd design crisis: software people, new breed needed
Semiconductor Revolutions
“Mainstream Silicon Application is switching every 10 Years”
[Figure: wave from 1957 to 2007 in 10-year steps, alternating between standard and custom: TTL; LSI, MSI; µproc., memory; ASICs, accel’s]
“The Programmable System-on-a-Chip is the next wave”
Tredennick’s Paradigm Shifts:
• hardwired: algorithm: fixed; resources: fixed
• procedural programming (vN machine paradigm): algorithm: variable; resources: fixed
• structural programming (anti machine paradigm): algorithm: variable; resources: variable
Impact of Makimoto’s wave
[Figure: Makimoto’s wave, 1957 to 2007 in 10-year steps, alternating between standard and custom: TTL; LSI, MSI; µproc., memory; ASICs, accel’s]
• Procedural personalization via RAM-based Machine Paradigm: Software Industry’s Secret of Success
Impact of Makimoto’s wave
[Figure: Makimoto’s wave, 1957 to 2007 in 10-year steps, alternating between standard and custom: TTL; LSI, MSI; µproc., memory; ASICs, accel’s]
• Structural personalization: RAM-based, before run time: Configware Industry
• Repeat Success Story by new Machine Paradigm!
Impact of Data-stream-based ...
[Figure: Makimoto’s wave, 1957 to 2007 in 10-year steps, alternating between standard and custom: TTL; LSI, MSI; µproc., memory; ASICs, accel’s]
• Structural personalization: hardwired, before fabrication: Embedded Hardware / Configware Industry
• Repeat Success Story by new Machine Paradigm!
The History of Paradigm Shifts
“Mainstream Silicon Application is switching every 10 Years”
[Figure: wave from 1957 to 2007 in 10-year steps, alternating between standard and custom: TTL; LSI, MSI; µproc., memory; ASICs, accel’s; marked: 1st Design Crisis, 2nd Design Crisis]
“The Programmable System-on-a-Chip is the next wave”
The Impact of Makimoto’s Paradigm Shifts
[Figure: Makimoto’s wave, 1957 to 2007 in 10-year steps, alternating between standard and custom: TTL; LSI, MSI; µproc., memory; ASICs, accel’s. Source: Dr. Makimoto, FPL 2000 keynote]
• Procedural personalization via RAM-based Machine Paradigm: Software Industry’s Secret of Success
• Personalization (CAD) before fabrication
• Structural personalization: RAM-based, before run time
• Repeat Success Story by new Machine Paradigm!
Makimoto’s 3rd wave: The next EDA Industry Revolution
EDA industry paradigm switching every 7 years (McKinsey curves) [Keutzer / Newton]:
• 1978 Transistor entry: Applicon, Calma, CV ...
• 1985 Schematics entry: Daisy, Mentor, Valid ...
• 1992 Synthesis (HDLs): Cadence, Synopsys ...
• 1999 (Co-)Compilation: data-stream-based DPAs [Hartenstein]
Von Neumann does not support Morphware: “The Programmable System-on-a-Chip is the next wave”
The anti universe
• Paul Dirac predicted a complete anti universe consisting of antimatter
• “There are regions in the universe which consist of antimatter ..... But there are asymmetries”
• When a particle hits its antiparticle, both are converted into energy: annihilation
• We are not aware that there is a new area in computing sciences which consists of the antimatter of computing
• Reconfigurable Computing is made from this antimatter: data-stream-based computing
anti particles
• 1928: Paul Dirac: “there should be an anti electron having positive charge” (Nobel Prize 1933)
• 1932: Carl David Anderson detected this “positron” in cosmic radiation (Nobel Prize 1936)
• 1954: new accelerators: cyclotron, like Berkeley’s Bevatron
• 1955: Owen Chamberlain et al. create anti proton on Bevatron
• 1956: anti neutron created on Bevatron
• 1965: creation of a deuterium anti nucleus at CERN
• 1995: hydrogen anti atom created at CERN by forcing positron and anti proton to merge at very low energy
[Figure: hydrogen and anti hydrogen]
Matter & Antimatter: Atom and Anti Atom
• The World of Matter: machine paradigm: the Atom
• Anti Matter: machine paradigm: Anti Atom
[Figure: an atom and an anti atom with opposite charges]
Matter & Antimatter of Informatics: Machine and Anti Machine
[Figure: atom labelled CPU (machine paradigm: “von Neumann”) and anti atom labelled DPU (anti machine paradigm)]
• 1936 1st electronic computer (Konrad Zuse)
• 1946 v. N. machine paradigm
• 1971 1st microprocessor (Ted Hoff)
• 1979 “data streams” (systolic array: Kung / Leiserson)
• 1990 anti machine paradigm published
• 1995 rDPA / DPSS (supersystolic: Rainer Kress)
• novel compilation techniques
Matter vs. antimatter: CPU vs. DPU
[Figure: CPU drawn as an atom: data path plus instruction sequencer, driven by an instruction stream; DPU (Data Path Unit) drawn as an anti atom, driven by data streams]
heavy anti atoms: DPA = DPU array
[Figure: a DPA drawn as a heavy anti atom made of several DPUs, with coherent data streams spinning around]
Parallelism by Concurrency
[Figure: several atoms (CPUs) side by side]
independent instruction streams difficult ...
>> Anti Machine and its Resources
• Microelectronics History
• fine grain and coarse grain Morphware
• Anti Matter of Computing
• Anti Machine and its Resources
• Problems to be solved
http://www.uni-kl.de
Dichotomy of machine paradigms
[Figure, left: CPU = DPU + instruction sequencer, fed by an instruction stream from memory (M). Right: (r)DPU or (r)DPU array ((r)DPA), fed by data streams from auto-sequencing memories (asM = memory M + address generator)]
© 2007, [email protected] http://hartenstein.de
TU Kaiserslautern
81
Terminology: DPU versus CPU ...
• DPU: data path unit
• DPA: DPU array
• GA: gate array
• rDPU: reconfigurable DPU
• rDPA: reconfigurable DPA
• rGA: reconfigurable GA
• DPU is no CPU: there is nothing central, like in a DPA
• CPU = DPU + instruction sequencer
82
What is the trend ?
• vN is needed for embedded systems, OS, compilers, Sauerkraut software, non-performance-critical applications, others ….
• vN is obsolete for massive parallelism, except some special application areas
• Anti machine is the way to go for massive parallelism, also data-intensive applications
• Morphware is the way for high performance with short product life cycles, unstable standards
•Data-stream-based Computing is heading for mainstream
–1979 „data streams“ (Kung / Leiserson)
–1997 SCCC (LANL) Streams-C Configurable Computing
–SCORE (UCB) Stream Computations Organized for Reconfigurable Execution
–ASPRC (UCB) Adapting Software Pipelining for Reconfigurable Computing
–2000 Bee (UCB), ...
–Most stream-based multimedia systems, etc.
–Many other areas ....
83
Conclusion: all knowledge needed is available
• machine paradigm
• anti architectural resources
• sequencing methodology: hw & sw
• parallel memory IP core and module generator vendors
• compilation techniques
• hw / sw partitioning methodology
• languages
• anything else needed
courses / embedded tutorials:
• DATE, Munich, 2001
• ASP-DAC, Yokohama, 2001
• SBCCI, Brasilia, 2001
full day courses:
• Univ. Montpellier, 1998
• Nokia / Univ. Tampere, Finland, 2002
• CNRS, Paris, France, 2002
• UnB, Brasilia, 2002
• 10 keynotes 2001 / 2002
• 5 invited talks 2001 / 2002
84
Main problems to be solved
computing in time vs. computing in space
migration by re-timing and other transformations (systolic arrays etc.)
this dichotomy is completely ignored by our CS curricula
• Each programmer should have a qualified awareness of this dichotomy and of morphware
• curricular innovations are urgently needed
• Lack of qualified users and implementers
85
CS education .....
software person: procedural
hardware person: structural
Hardware / Software Co-Design? Configware / Software Co-Design?
86
Annihilation?
avoidable by careful methodology
87
However, current CS Education …
… is based on the Submarine Model
Hardware invisible: under the surface
Brain usage: procedural-only
Layers: Algorithm, procedural high level Programming Language, Assembly Language, Hardware
This model disables ...
88
Hardware and Software as Alternatives
Algorithm → partitioning → Software (procedural) and Hardware, Configware (structural)
Possible results: Software only; Software & Hardw/Configw; Hardw/Configw only
Brain Usage: both Hemispheres
89
The Dominance of the Submarine Model ...
... indicates that our CS education system produces zillions of mentally disabled persons:
(procedural) structurally disabled
… completely unable to cope with solutions other than software only
It‘s time to attack the software faculty dictatorship. Get involved!
90
Antimatter Search ?
in EE & CS we do not need to search
91
Digital System Platforms clearly distinguished (2)
platform | program source running on it | machine paradigm
hardware (not programmable) | none | none
morphware, fine grain: rGA (FPGA) | configware | none
morphware, coarse grain: rDPU, rDPA (reconfigurable data stream processor) | flowware & configware | anti machine
data stream processor (hardwired) | flowware | anti machine
instruction stream processor | software | von Neumann machine
92
Matter & Antimatter
The World of Matter: machine paradigm: the Atom
The World of Anti Matter: machine paradigm: Anti Atom
93
Matter & Antimatter of Informatics :
CPU (+): machine paradigm
DPU (-): Anti Machine paradigm, nothing central !
94
heavy anti atoms: DPA = DPU array
[figure: a DPA drawn as a heavy anti atom made of DPUs (-); flowware: data streams (+) spinning around]
95
What are the Challenges ? (1) [ST microelectronics, MorphICs, Dataquest, eASIC]
[chart: factor (1 to 2) over months (0, 10, 12, 18); curve label: 4y]
96
What are the Challenges ? (3) [ST microelectronics, MorphICs, Dataquest, eASIC]
[chart: factor (1 to 2) over months (0, 10, 12, 18); curve labels: 3y, 4y, 10y, 30y]
*) Department of Trade and Industry, London
avoid application-specific silicon !
97
What are the Challenges ? (4) [ST microelectronics, MorphICs, Dataquest, eASIC]
[chart: factor (1 to 2) over months (0, 10, 12, 18); curve labels: 3y, 4y, 10y, 30y]
Battery capacity (1.03/year)
*) Department of Trade and Industry, London
98
What are the Challenges ? (5) [ST microelectronics, MorphICs, Dataquest, eASIC]
[chart: factor (1 to 2) over months (0, 10, 12, 18); curve labels: 2y, 3y, 4y, 5y, 10y, 30y]
Battery capacity (1.03/year)
*) Department of Trade and Industry, London
new compilation techniques needed ! supported by a new machine paradigm
99
Machine Paradigms
machine category | Computer (the Machine: “v. Neumann”) | The Anti Machine
driven by | instruction streams | data streams (no “dataflow”)
engine principles | instruction sequencing | sequencing data streams
state register | single program counter | (multiple) data counter(s)
communication path set-up | at run time (“instruction fetch”) | at load time
data path resource | DPU (e.g. single ALU) | DPU or DPA (DPU array) etc.
data path operation | sequential | parallel pipe network etc.
The Anti Machine also has hardwired implementations*
*) e.g. Bee project, Prof. Brodersen
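The difference between path set-up at run time and at load time can be illustrated with a minimal Python sketch. It is not taken from the talk: the helper names `dpu` and `configure_pipe_network` and the example operation `(a + b) * c` are assumptions chosen for illustration, and Python generators only mimic the behaviour of a pipe network.

```python
import operator

def dpu(op, *input_streams):
    """Model of one data path unit: it has no program counter and simply
    applies its fixed operation whenever a full set of operands arrives."""
    for operands in zip(*input_streams):
        yield op(*operands)

def configure_pipe_network(a, b, c):
    """'Load time': wire two DPUs into a pipe network computing (a + b) * c.
    Nothing is computed here; only the communication paths are set up."""
    sums = dpu(operator.add, a, b)
    return dpu(operator.mul, sums, c)

# 'Run time': the data streams drive the network; no instruction is fetched.
results = list(configure_pipe_network([1, 2, 3], [10, 20, 30], [2, 2, 2]))
print(results)
```

In this sketch the structure is fixed once by `configure_pipe_network`, and only data moves afterwards, which is the point of the "at load time" column.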
100
Throughput vs. Efficiency
[chart: MOPS / mW (0.001 to 1000, logarithmic) over µ feature size (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07); T. Claasen et al.: ISSCC 1999]
~1 Bit CLB: resources needed for reconfigurability vs. area used by application
Wiring by abutment: 32 Bit example*
*) R. Hartenstein: ISIS 1997
101
Throughput vs. Flexibility
[chart: MOPS / mW (0.001 to 1000, logarithmic) over µ feature size (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07); T. Claasen et al.: ISSCC 1999]
flexibility vs. throughput: hardwired, coarse grain, FPGAs, von Neumann
coarse grain goes far beyond bridging the gap
Wiring by abutment: 32 Bit example*
*) R. Hartenstein: ISIS 1997
102
PACT XPP: Reference Module: XPU128 Co-Processor
[figure: array of ALU-PAEs; each PAE core contains an ALU with Ctrl, surrounded by CFG blocks]
XPP128 ALU-Array
• 2 X PACs (Cluster)
• 128 X ALU-PAEs
• 32 X 1Kbyte RAM-PAEs
• 8X I/O Elements
• Full 32 or 24 Bit Design
• 2 Configuration Hierarchies
• Evaluation Board (2001)
• XDS Development Tool with Simulator
• PAE Core is 32- or 24-Bit ALU with DSP-Instruction Set and Controller
• Connections: Inputs + Outputs (Channels) + Events
[Jürgen Becker, Univ. Karlsruhe]
106
Impact of Makimoto’s wave
[figure: Makimoto’s wave, 1957 to 2007, alternating between standard (TTL; µproc., memory) and custom (LSI, MSI; ASICs, accel’s)]
Personalization (CAD) before fabrication
Procedural personalization via RAM-based Machine Paradigm: Software Industry’s Secret of Success
structural personalization: RAM-based before run time: Configware Industry
Repeat Success Story by new Machine Paradigm !
107
scalability
The Scalability Problem
The Routing congestion Problem grows with the size of the FPGA
108
Structured Configware Design (Mead & Conway Revival)
SNN filter KressArray Mapping Example
array size: 10 x 16 = 160 rDPUs
Legend: rDPU not used; used for routing only (rout thru only); operator and routing; port location marker; backbus connect
109
Nasty Matter
CPU (+): data path, instruction sequencer, RAM
Address Computation Overhead
Instruction Fetch Overhead
central von Neumann bottleneck
extremely power hungry and area inefficient
reconfigurable? the wrong machine paradigm
110
Matter vs. Antimatter: CPU vs. DPU
CPU (+): data path plus instruction sequencer
DPU (-): Data Path Unit without instruction sequencer, driven by data streams
111
RAM-based CPU (data path, instruction sequencer, RAM):
+ simple machine paradigm + scalability
+ relocatability + compatibility
= secret of success of software industry
112
Parallelism by Concurrency
independent instruction streams: several CPUs (each a data path with instruction sequencer) connected by bus(es) or switch box
difficult coordination
massive run time overhead
113
Semiconductor Revolutions
“Mainstream Silicon Application is switching every 10 Years”
[figure: Makimoto’s wave, 1957 to 2007, alternating between standard (TTL; µproc., memory) and custom (LSI, MSI; ASICs, accel’s)]
“The Programmable System-on-a-Chip is the next wave“
Tredennick’s Paradigm Shifts:
hardwired | algorithm: fixed | resources: fixed
procedural programming (vN machine paradigm) | algorithm: variable | resources: fixed
structural programming (anti machine paradigm) | algorithm: variable | resources: variable
114
Impact of Makimoto’s wave
[figure: Makimoto’s wave, 1957 to 2007, alternating between standard (TTL; µproc., memory) and custom (LSI, MSI; ASICs, accel’s)]
Procedural personalization via RAM-based Machine Paradigm: Software Industry’s Secret of Success
115
Impact of Makimoto’s wave
[figure: Makimoto’s wave, 1957 to 2007, alternating between standard (TTL; µproc., memory) and custom (LSI, MSI; ASICs, accel’s)]
structural personalization: RAM-based before run time: Configware Industry
Repeat Success Story by new Machine Paradigm !
116
Impact of Data-stream-based ...
[figure: Makimoto’s wave, 1957 to 2007, alternating between standard (TTL; µproc., memory) and custom (LSI, MSI; ASICs, accel’s)]
structural personalization: hardwired before fabrication: Embedded Hardware / Configware Industry
Repeat Success Story by new Machine Paradigm !
117
Rapidly growing CS education gap
• Our computing curricula are obsolete
• introduction is strictly „procedural-only“
• vN-only use of terms like „computer organisation“, „computer structures“, „computer architecture“
• graduates are not prepared for the real world
– most applications are for embedded systems (>90% by 2010)
• our graduates are unable to compete with EE graduates
• only a few % of the curricula need to be changed
• my mission: getting you involved
118
Synthesizable Memory Communication
Efficient Memory Communication should be directly supported by the Mapper Tools
Optimized Parallel Memory Controller: an example by Nageldinger’s KressArray Xplorer (http://kressarray.de)
Legend: sequencers; memory ports; application; not used
119
Data-Stream-based Soft Machine
[figure components: Compiler; Scheduler; “instructions”; Sequencers (data stream generator); Memory (data memory) consisting of memory banks; rDPA]
120
Terminology has been highly confusing
[chart: factor (1 to 2) over months (0, 10, 12, 18, 24, 36, 48); curve labels: 4y, 10y, 30y]
Battery capacity (1.03/year)
*) Department of Trade and Industry, London
121
The RC Rush is already running (RC = Reconfigurable Computing)
“Mainstream Silicon Application is switching every 10 Years”
[figure: Makimoto’s Wave, 1957 to 2007, alternating between standard (TTL; µproc., memory) and custom (LSI, MSI; ASICs, accel’s), followed by reconfigurable platforms; marked: 1st design crisis, 2nd design crisis]
roots of the M&C rush: professors took courses
the RC rush: rapidly growing no. of courses
http://FPL.org 216 submissions (DAC’02 had less than 500)
123
No vN bottleneck
The anti machine has no von Neumann bottleneck.
124
3 different mind sets
[figure: timeline 1957 to 2007: TTL; LSI, MSI; µproc., memory; ASICs, accel’s; FPGAs; coarse grain; soft CPUs]
hardware people; CS people; new breed needed
Common terminology needed
125
What‘s the problem ?
Traditional CS: programming is (control-)procedural, instruction-stream-based; sources: software
The typical programmer has problems understanding function evaluation without machine mechanisms .... by signals rippling through a network of transistors.
It‘s the gap between the procedural and the structural mind set
accelerators vs. µprocessor: Crossing the Hardware / Software Chasm [Mike Butts]
126
What‘s the problem ?
accelerators vs. µprocessor: Crossing the Hardware / Software Chasm [Mike Butts]
The brain hurts on paradigm shift ? no, it can‘t ...
Brain usage: procedural-only; structural hemisphere missing
127
Reconfigurable Computing: a second programming domain
Migration of programming to the structural domain
The structural domain has become RAM-based
The opportunity to introduce the structural domain to programmers ...
... to bridge the gap by clever abstraction mechanisms using a simple new machine paradigm
128
control-procedural vs. data-procedural
The structural domain is primarily data-stream-based: Flowware*
..... mostly not yet modelled that way: most flowware is hidden by its indirect instruction-stream-based implementation
Flowware provides a (data-)procedural abstraction from the (data-stream-based) structural domain
Flowware converts „procedural vs. structural“ into „control-procedural vs. data-procedural“ ...
... a Trojan horse to introduce the structural domain to the procedural mind set of programmers
*) explained later
129
How to achieve acceptance
No hardware description languages
Courses tailored for students not being hardware-savvy
Tools usable by users not being hardware designers
EDA tools based on term rewriting [Arvind] [Mauricio Ayala]
[Courtesy Richard Newton]
Your name here: your proposals
how to hide the ugliness from the user [Herman Schmit]
130
>> Why it’s time for a New CS
http://www.uni-kl.de
• Preface
• Terminology clean-up
• Why it’s time for a New CS
• Draft of a Roadmap
131
McKinsey Curve: dynamics of R&D disciplines
[chart: maturity of a discipline over the years]
phases: fundamental issues; consolidation; saturation: limitations met
new discipline on top of it by .... innovation
132
History of Computing
[chart: maturity over the years 1957 to 2007: classical CS (mainframes, PC, ?) followed by new CS (data streams ..., morphware); here?]
it´s already existing ... but awareness still missing ... still ignored by most CS curricula
133
EDA Industry Revolutions
EDA industry paradigm switching every 7 years (courtesy [Keutzer / Newton])
1978 Transistor entry: Applicon, Calma, CV ...
1985 Schematics entry: Daisy, Mentor, Valid ...
1992 Synthesis: Cadence, Synopsys ...
1999 HLLs, (Co-) Compilation
Data-Stream-based DPU arrays
2006 coming closer to programmers‘ mind set
134
it‘s time for a new CS
configware trend flowware trend
embedded systems: hw/cw/sw co-design
CS crisis: qualification
problems urging us
next EDA wave: high level languages
opportunities
135
all ingredients available
algorithmic cleverness: new directions for „algorithms and data structures“ specialists
morphware scalability / configware relocatability: achievable by EDA support
all ingredients available: published the past 30 years
136
On-chip memory
algorithmic cleverness: new directions for „algorithms and data structures“ specialists
RC: on chip distributed memory architecture
vN: code size of astronomic dimensions -> off-chip memory
137
Synthesizable Distributed Memory
Efficient Memory Communication should be directly supported by the Mapper Tools
Optimized Parallel Memory Controller: an example by Nageldinger’s KressArray Xplorer (http://kressarray.de)
Legend: sequencers; memory ports; application; not used
138
Software Industry
Procedural personalization via RAM-based Machine Paradigm: Software Industry’s Secret of Success
[figure: timeline 1957 to 2007: µprocessor, memory ICs go mainstream]
139
Configware Industry ?
structural personalization: RAM-based before run time: Configware Industry
Repeat Success Story by new Machine Paradigm !
[figure: timeline 1957 to 2007: µprocessor, memory ICs, then morphware goes mainstream]
140
KressArray Family generic Fabrics: a few examples
Select Function Repertory
Select Nearest Neighbour (NN) Interconnect: an example
Select mode, number, width of NNports (more NNports: rich Rout Resources)
rDPU modes: rout-through and function; rout-through only
[figure: rDPU with NNport options labelled 2, 4, 8, 16, 24, 32]
Examples of 2nd Level Interconnect: layouted over rDPU cell, no separate routing areas !
http://kressarray.de
141
stolen from Bob Colwell
processor/memory communication bottleneck
vN bottleneck: vN: unbalanced
142
>>> we need ... <<<<<
We need a Mead-&-Conway-like text book
We need undergraduate lab courses on HW / CW / SW partitioning
We need new courses with extended scope on parallelism and algorithmic cleverness for HW / CW / SW migration / partitioning
What else do we need ? Your proposals ?
143
>>> we need support <<<<<
We need the support of the open-minded members of the classical CS community
Let us assemble a list with e-mail addresses
144
>>> thank you <<<<<
thank you for your patience
145
>>> book <<<<<
146
>>> END <<<
147
Having introduced Data streams
[figure: systolic array as a DPA (pipe network), with input data streams and output data streams plotted as port # over time; H. T. Kung, ~1980]
systolic array research throughout the 80ies: Mathematicians‘ hobby
The road map to HPC: ignored for decades
execution transport-triggered
no memory wall
148
Who generates the Data Streams?
Mathematicians: it‘s not our job
[figure: systolic array with its input and output data streams]
(it‘s not algebraic)
149
Without a sequencer …
… it’s not a machine
reductionist approach: (it‘s not our job)
150
Synthesis Method?
of course algebraic (linear projection): only for applications with regular data dependencies
Mathematicians caught by their own paradigm trap
1995: Rainer Kress discarded their algebraic synthesis methods and replaced them by simulated annealing: rDPA
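To make the simulated annealing idea concrete, here is a minimal generic placement sketch in Python. It is not Rainer Kress's algorithm and not taken from the talk: the cost function (total Manhattan wire length), the cooling schedule, and all names (`wire_length`, `anneal`, the operator names `a` to `f`) are assumptions for illustration only.

```python
import math
import random

def wire_length(placement, nets):
    """Cost: total Manhattan distance over all two-point nets."""
    total = 0
    for a, b in nets:
        (ax, ay), (bx, by) = placement[a], placement[b]
        total += abs(ax - bx) + abs(ay - by)
    return total

def anneal(placement, nets, steps=2000, t_start=5.0, t_end=0.01, seed=0):
    """Swap two operators per step; accept worse placements with a
    probability that shrinks as the temperature falls."""
    rng = random.Random(seed)
    current = dict(placement)
    cost = wire_length(current, nets)
    best, best_cost = dict(current), cost
    names = sorted(current)
    for step in range(steps):
        t = t_start * (t_end / t_start) ** (step / steps)  # geometric cooling
        a, b = rng.sample(names, 2)
        current[a], current[b] = current[b], current[a]
        new_cost = wire_length(current, nets)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / t):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(current), cost
        else:
            current[a], current[b] = current[b], current[a]  # undo the swap
    return best, best_cost

# Six operators on a hypothetical 3 x 2 array, connected as a chain a-b-c-d-e-f.
cells = [(x, y) for y in range(2) for x in range(3)]
initial = dict(zip(["a", "d", "b", "e", "c", "f"], cells))
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
initial_cost = wire_length(initial, nets)
best, best_cost = anneal(initial, nets)
print(initial_cost, best_cost)
```

Unlike a linear projection, this approach does not require regular data dependencies: any set of nets can be given to the cost function.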
151
The counterpart of the von Neumann machine
[figure: (r)DPA surrounded by ASMs that deliver and receive its data streams; each ASM consists of a data counter (GAG) and RAM]
ASM: Auto-Sequencing Memory
data counters instead of a program counter
data counters: located at memory (not at data path)
coarse-grained
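A data counter located at the memory can be sketched as a small address generator that turns a scan pattern into a data stream. The Python sketch below is an illustration under assumptions, not the GAG design itself: the names `data_counter` and `AutoSequencingMemory`, the row-by-row window scan, and the memory layout are all hypothetical.

```python
def data_counter(base, row_stride, rows, cols):
    """Hypothetical address generator: emits the addresses of a
    rows x cols window stored row by row in a linear memory."""
    for y in range(rows):
        for x in range(cols):
            yield base + y * row_stride + x

class AutoSequencingMemory:
    """Toy ASM: a RAM block that is read out under control of a data counter."""
    def __init__(self, ram):
        self.ram = ram

    def stream(self, addresses):
        # No instruction fetch: the address sequence alone drives the stream.
        for address in addresses:
            yield self.ram[address]

ram = list(range(100, 116))  # 16 words, viewed as a 4 x 4 array
asm = AutoSequencingMemory(ram)
window = list(asm.stream(data_counter(base=5, row_stride=4, rows=2, cols=2)))
print(window)
```

The data path consuming `window` never computes an address, which is the sense in which the counters sit at the memory rather than at the data path.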
152
Coarse-grained Reconfigurable Array
SNN filter on (supersystolic) KressArray (mainly a pipe network)
array size: 10 x 16 rDPUs
rDPU: reconfigurable Data Path Unit, 32 bits wide; no CPU
Legend: rDPU not used; used for routing only (rout thru only); operator and routing; port location marker; backbus connect
note: software perspective without instruction streams: pipelining
compiled by Nageldinger‘s KressArray Xplorer with Juergen Becker‘s CoDe-X inside
153
Simple KressArray Configuration Example
154
C or FORTRAN ?
Gordon Bell: Computer scientists haven’t been interested in programming clusters. If putting the cluster on a chip is what excites them, fine. It will still have to run Fortran!
or C (X-C)
Reiner Hartenstein (conclusion of this talk): Classical programming languages, but with slightly different semantics (data-procedural), are good candidates for parallel programming.
Support tools* have been demonstrated by academia
*) like CoDe-X
155
thank you for your patience
156
END
157
Start-ups for Coarse-grained Platforms
• One company has failed
• Several companies have succeeded, but their technology disappeared through acquisition.
• Two companies are still available
158
Multi Core: Just more CPUs ?
The growth of complexity and clock frequency of single-core microprocessors is coming to an end
Without a paradigm shift, just more CPUs on a chip lead to the dead ends known from supercomputing
Multi-core microprocessor chips emerging: soon 32 cores on an AMD chip, and 80 on an Intel chip
Multi-threading is not the silver bullet
We have to re-think the basic assumptions behind computing
159
New Languages for Parallelism ?
Nick Tredennick: Efforts to extend standards-based, serial programming languages with features to describe parallel constructs are likely to fail. What’s more likely to succeed are languages that raise the level of abstraction in algorithm description
Mauricio Ayala-Rincón: Term Rewriting Systems may raise the abstraction level up to math formulae
160
C or FORTRAN ?
Gordon Bell: Computer scientists haven’t been interested in programming clusters. If putting the cluster on a chip is what excites them, fine. It will still have to run Fortran!
Reiner Hartenstein: Loop transformations from C or Fortran by automatically partitioning software/configware co-compilers* targeting coarse-grained reconfigurable arrays are quite promising.
*) like CoDe-X
161
Xplorer Plot: SNN Filter Example (http://kressarray.de)
[figure: rDPU with 3 vert. NNports, 32 bit, and 2 hor. NNports, 32 bit; operator + [13] with two operands and one result; route-thru-only rDPU]
Legend: rDPU not used; used for routing only (route thru); operator and routing; port location marker; backbus connect
162
Configware Compilation
Configware Engineering: source „program“ → configware compiler (mapper, scheduler)
mapper: placement & routing → configware code
scheduler: programming the data counters → flowware code
[figure: rDPA (pipe network) with its data streams, surrounded by ASMs (ASM: Auto-Sequencing Memories), each with data counter (GAG) and RAM]
configware compilation fundamentally different from software compilation
163
Symptom of the von Neumann Syndrome
SNN filter on (supersystolic) KressArray (mainly a pipe network)
array size: 10 x 16 = 160 rDPUs
rDPU: reconfigurable Data Path Unit, e. g. 32 bits wide; no CPU
Legend: rDPU not used; used for routing only (rout thru only); operator and routing; port location marker; backbus connect
note: software perspective without instruction streams
164
Hybrid Multi Core example
twin paradigm machine: 64 cores
each core can run CPU mode or rDPU mode
[figure: array of cores, most of them in rDPU mode, a few in CPU mode]
165
History
• 1987-1990: Xputer – anti machine
• 1990/1: GAG address generator & ASM
• 1994: Compilers for Xputers
• 1995: Kress Array ASP-DAC
• 1995: Compilation for coarse-grained arrays
166
Loop Transformation Examples
sequential processes: resource parameter driven Co-Compilation
[Figure labels: host; reconf. array]
• original: loop 1-16 body endloop
• strip mining: fork; loop 1-8 body endloop; loop 9-16 body endloop; join
• loop unrolling: loop 1-8 body body endloop
• trigger loops: loop 1-8 trigger endloop; loop 1-4 trigger endloop; loop 1-2 trigger endloop
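A minimal sketch of the two transformations named on this slide, applied to the 16-trip loop. The function `body` is a hypothetical stand-in for the loop body, and the thread pool is only one assumed way to model the fork/join of the strip-mining picture; the slide itself does not prescribe an implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def body(i):
    """Hypothetical loop body: any side-effect-free function of the index."""
    return i * i


def original():
    # loop 1-16 body endloop
    return [body(i) for i in range(1, 17)]


def strip_mined():
    # fork: two strips (1-8 and 9-16) run as independent loops, then join
    def strip(lo, hi):
        return [body(i) for i in range(lo, hi + 1)]

    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(strip, 1, 8)
        second = pool.submit(strip, 9, 16)
        return first.result() + second.result()


def unrolled():
    # loop 1-8 body body endloop: 8 trips, two copies of the body per trip
    out = []
    for k in range(1, 9):
        out.append(body(2 * k - 1))
        out.append(body(2 * k))
    return out


results = (original(), strip_mined(), unrolled())
```

All three variants visit the indices 1 to 16 exactly once, so they produce the same result list; only the control structure differs, which is what a resource-parameter-driven compiler would choose between.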
167
Deficiencies of reconfigurable fabrics (FPGA) (fine-grained)
general purpose “simple” FPGA
1st DeHon’s Law [1996: Ph.D. thesis, MIT]
[Figure: density (transistors / microchip, log scale from 10^0 to 10^9) over the years 1980 to 2010; curves: FPGA physical, FPGA logical, FPGA routed]
• reconfigurability overhead, wiring overhead, routing congestion: overhead >> 10 000
• immense area inefficiency
• power guzzler
• slow clock
deficiency factor: >10,000
168
Software-to-Configware (FPGA) Migration:
some published speed-up factors [2003 – 2005]
[Figure: speed-up factor, log scale from 10^0 to 10^6]
• DSP and wireless: Reed-Solomon Decoding 2400; MAC 1000; Viterbi Decoding 400; FFT 100
• Image processing, Pattern matching, Multimedia: real-time face detection 6000; video-rate stereo vision 900; pattern recognition 730; SPIHT wavelet-based image compression 457
• Bioinformatics: Smith-Waterman pattern matching 288; BLAST 52; protein identification 40
• Astrophysics: GRAPE 20
• molecular dynamics simulation 88
• crypto 1000
170
Software-to-Configware (FPGA) Migration:
some published speed-up factors [2000 – 2008]
[Figure: speed-up factor, log scale from 10^0 to 10^6]
• DSP and wireless: Reed-Solomon Decoding 2400; MAC 1000; Viterbi Decoding 400; FFT 100
• Image processing, Pattern matching, Multimedia: real-time face detection 6000; video-rate stereo vision 900; pattern recognition 730; SPIHT wavelet-based image compression 457
• Bioinformatics: Smith-Waterman pattern matching 288; BLAST 52; protein identification 40
• Astrophysics: GRAPE 20
• molecular dynamics simulation 88
• crypto 1000
• further labels: DES breaking, xputer; further values: 3000, 34000
171
“simple” FPGAs are only the beginning
• Less discrepancy for platform FPGAs and coarse-grained reconfigurable arrays
172
carpal • #
173
Software-to-Configware (FPGA) Migration: Oil and gas [2005]
[Figure: speed-up factor, log scale from 10^0 to 10^6]
oil and gas 17
side effect: slashing the electricity bill
by more than an order of magnitude
174
An accidentally discovered side effect
• Software to FPGA migration of an oil and gas application:
• only a speed-up factor of 17
• Electricity bill down to <10%
• Hardware cost down to <10%
• Saves > $10,000 in electricity bills per year (7¢ / kWh) per 64-processor 19" rack [Herb Riley, R. Associates]
• All other publications reporting speed-up did not report energy consumption. This will change.
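As a rough plausibility check of the quoted saving, the sketch below converts $10,000 per year at 7¢ / kWh into the average electrical power that would have to be saved. It assumes continuous operation (24 hours a day, 365 days a year), which the slide does not state; the function name is illustrative only.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: the rack runs continuously


def implied_power_kw(savings_usd, price_usd_per_kwh):
    """Average power (kW) whose yearly energy cost equals the given saving."""
    kwh_per_year = savings_usd / price_usd_per_kwh
    return kwh_per_year / HOURS_PER_YEAR


saved_kw = implied_power_kw(10_000, 0.07)
print(f"implied average saving: {saved_kw:.1f} kW per rack")
```

Under these assumptions the saving corresponds to roughly 16 kW of average power per rack.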
175
What’s Really Going On With Oil Prices? [BusinessWeek, Jan. 29, 2007]
• $52: price in Feb 2007 [NY Mercantile Exch.: Jan. 17]
• $200: minimum oil price in 2010, in a bet by investment banker Matthew Simmons [BusinessWeek]
176
Energy as a strategic issue
• Google’s annual electricity bill: $50,000,000
• Amsterdam: 25% of electricity goes into server farms
• NY city server farms: 1/4 km² floor area
[Mark P. Mills]
• Predicted for the USA in the year 2020: 30-50% of the entire national electricity consumption goes into cyber infrastructure
177
Energy: an important motivation
platform example                        W / Gflops   energy factor
MDgrape-3* (domain-specific, 2004)      0.2          1
Pentium 4                               14           70
Earth Simulator (supercomputer, 2003)   128          640
*) feasible also on reconfigurable platforms
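The energy factor column is simply each platform's W / Gflops figure divided by that of MDgrape-3. A small sketch that recomputes it from the slide's numbers (the dictionary and function names are illustrative, not from the original):

```python
# W / Gflops figures as given on the slide
W_PER_GFLOPS = {
    "MDgrape-3": 0.2,
    "Pentium 4": 14.0,
    "Earth Simulator": 128.0,
}


def energy_factors(table, baseline="MDgrape-3"):
    """Energy per Gflops relative to the baseline platform, rounded."""
    base = table[baseline]
    return {name: round(watts / base) for name, watts in table.items()}


factors = energy_factors(W_PER_GFLOPS)
for name, factor in factors.items():
    print(f"{name}: {factor}x")
```

This reproduces the factors 1, 70 and 640 listed in the table.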
178
Executive Summary doesn‘t help
We must first understand the nature of the paradigm
Understanding the Paradox ?
von Neumann chickens ?
179
Outline
• The Pervasiveness of FPGAs
• The Reconfigurable Computing Paradox
• The Gordon Moore gap
• The von Neumann syndrome
• The Anti Machine
• We need a twin paradigm approach
• Conclusions & the multicore crisis
180
Moore’s law not applicable to all aspects of VLSI
What is the reason of the paradox ?
The Gordon Moore curve does not indicate performance
The peak clock frequency does not indicate performance
the law of Gates
astronomic code size causes massive overhead, due to the von Neumann syndrome
181
Rapid Decline of Computational Density
[BWRC, UC Berkeley, 2004]
[Figure: SPECfp2000/MHz/Billion Transistors (scale 0 to 200) over the years 1990 to 2005; HP alpha: down by 100 in 6 yrs; IBM: down by 20 in 6 yrs]
stolen from Bob Colwell
memory wall, caches, ...
primary design goal: avoiding a paradigm shift
dramatic demo of the von Neumann Syndrome
182
END