However, we are far >> Outline - - TU Kaiserslautern · [email protected] 15 May 2014 Reiner...


Page 1: However, we are far >> Outline - - TU Kaiserslautern · reiner@hartenstein.de 15 May 2014 Reiner Hartenstein (invited presentation), HiPEAC Computing Systems Week, 15 May 2014, Barcelona,

reiner@hartenstein.de 15 May 2014

Reiner Hartenstein (invited presentation), HiPEAC Computing Systems Week, 15 May 2014, Barcelona, Spain 1

Reiner Hartenstein, TU Kaiserslautern, Germany KIT Karlsruhe Institute of Technology, Germany http://hartenstein.de

Computing Systems Week May 2014, Barcelona, Spain

Reiner Hartenstein

IEEE fellow

SDPS fellow

FPL fellow

How many dimensions has the space beyond Reconfigurable Computing?

Re-animation of Moore's Law by including a third dimension

http://fpl.org/s/Reiner_CSW14.pdf http://www.fpl.uni-kl.de/EIS/Reiner_CSW14.pdf © 2014, reiner@hartenstein.de

Memristor Multiple Crossbar Layers on top of CMOS

Sea of memory above the logic: CMOS compatible

[Figure: memristor crossbar layers stacked on top of the CMOS metal layers Metal1 to Metal7. Source: Kvatinsky et al.]

Made much more efficient by new technologies, will RC* be more essential than ever before?

… sensitizing you to watch more recent technology developments reaching far beyond traditional FPGAs …

Will new computing paradigms be so efficient that RC* is no longer needed to save power?

Who will win this technology race? USA? Europe?

In computing, RC* is everywhere, not only in FPGAs

Heterogeneous is everywhere!

Software, Flowware, Configware, PIMware

However, we are still far away from going beyond RC*

*) RC: Reconfigurable Computing or FPGAs

>> Outline <<

• Why FPGAs are important

• Traditional FPGA Operation

• Routability

• FPGA Technology trends

• New Computing Paradigm

• Conclusions

http://www.uni-kl.de

The History of Computing

The first electrical computer, prototyped and ready for mass production: which year, which company?

Do you know?

Prototype 1884: Herman Hollerith

the first reconfigurable computer !!

used in the 1890 US census

The LUT (lookup table)

datastream-based!

non-volatile!!

size: only 2 refrigerators!!! (much smaller than mainframes)


The biggest Mistake in the History of Computing

Spreading the von Neumann syndrome* pandemic … caused by the separation of processing and memory

ENIAC etc. (mid-1940s): paradigm shift from data streams to instruction streams

Nathan's Law

Patterson's Law

*) This term was coined by C. V. Ramamoorthy at San Diego. http://hartenstein.de/EIS2/#mistake

Even Hardware Design went von Neumann

Today: again reconfigurable ISA*

IBM 360 computer series: load different instruction sets from floppy discs

Dissertations on CMP** with reconfigurable ISA*: Dr.-Ing. Ralf König, Dr.-Ing. Timo Stripf

However, no nested von Neumann machines: [Günter Koch et al.: “The universal Bus considered harmful”; 1st EUROMICRO Symp., 1975, Nice, France] http://hartenstein.de/NIZZA/

*) Instruction Set Architecture

**) Chip Multi Processor

Future Supercomputers going heterogeneous

vN to RC migration within the E.I.S. Project

Speed-up by vN to RC migrations onto FPGAs

Speed-up reaches > 4 OoM*; energy saving factors are about 10% of the speed-ups

[Chart label: 4,300]

RC is absolutely essential to cope with the von Neumann Syndrome

http://hartenstein.de/Hartenstein-EDA-Innov-Europe-3.pdf http://xputers.informatik.uni-kl.de/staff/hartenstein/eishistory_en.html

*) OoM = Orders of Magnitude
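The rule of thumb on this slide (energy saving factor of roughly 10% of the speed-up) can be turned into a quick calculation. The sketch below is only an illustration of that arithmetic; the function names and the example speed-up of 10^4 are assumptions chosen for the example, not figures taken from the E.I.S. project data.

```python
import math


def orders_of_magnitude(factor):
    """Express a speed-up or saving factor in orders of magnitude (OoM)."""
    return math.log10(factor)


def estimated_energy_saving(speedup):
    """Slide's rule of thumb: energy saving factor is about one tenth of the speed-up."""
    return speedup / 10


# Hypothetical migration with a speed-up of 4 OoM (assumed example value).
speedup = 10 ** 4
saving = estimated_energy_saving(speedup)
print(f"speed-up: {speedup}x = {orders_of_magnitude(speedup):.1f} OoM")
print(f"estimated energy saving: {saving:.0f}x = {orders_of_magnitude(saving):.1f} OoM")
```

Under this rule of thumb, a speed-up of four orders of magnitude corresponds to an energy saving of about three orders of magnitude.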


The Dead FPGA Market

FPGA’2013 Panel: "Are FPGAs Suffering from the Innovator’s Dilemma?"

< 1.5% semiconductor market share for more than 10 years

[Figure: semiconductor market by segment. Labels: Optoelectronics, Discrete, Logic, Special Purpose/ASSPs, DRAM, Microprocessor, Analog, Flash Memory, PLDs, Xilinx; overlay: "almost"]

The FPGA Marketing Paradox

The Reconfigurable Computing Paradox

http://xputer.de/RCpx/#pdx

Reconfigurability Overhead* and much slower clock

Circumventing the von Neumann Syndrome

*) subject of my talk

We'll be bothered by an addiction to acronyms

acronym meaning

ANN Artificial Neural Network

AVFS Adaptive Voltage and Frequency Scaling

BE Back End

BiCMOS Bipolar CMOS

BJT Bipolar Junction Transistor

BLE Basic Logic Element

BRAM Block RAM

BRS Bipolar Resistive Switching

CB Connection Box

CC Cluster Computing

CNT Carbon Nano-Tubes

CoMAC Cache-only-Memory-Access

ccNUMA Cache coherent NUMA

CGM Coarse-grained Multicomputer

CREW Concurrent Read Exclusive Write

CRS Complementary Resistive Switch

CBRAM Conductive-Bridge Memory

CRAM Card RAM

DMA Direct memory access

DMS Data Systems Memory

DRA cell Dynamic Resource Assignment

DRAM Dynamic Random Access Memory

DSM Distributed Shared Memory

eNVM embedded NonVolatile Memory

EPIC Explicitly Parallel Instruction Computing

EREW Exclusive read exclusive write

FDSOI Fully-Depleted Silicon on-Insulator


FE Front End

FEOL Front End Of Line

FeRAM Ferroelectric RAM

FET Field-Effect Transistor

FF Flipflop

FIN FET Fin-Shaped Field Effect Transistor

FIT Failure-In-Time

GAA Gate-All-Around

FPCA Field-Programmable Counter Array

GeS2 Germanium Sulfide

GMR Giant MagnetoResistance

GMS Generic Memristic Structure

HCML Hybrid CMOS Memristor Logic

HDD Hard Disk Drive

hPRAM Hierarchical PRAM

HRS High Resistive State

IEDM International Electron Devices Meeting

IMPLY IMPLY-based memristor logic methodology

ISSC International Solid State Circuits Conference (ISSCC)

LB Logic Box

LUT Look-Up Table

MAGIC Memristor Aided Logic

MIMD Multiple instructions multiple data

ML Machine Learning

MLC Multi-Level Cell

MLCs in PCMs suffer from resistance drift phenomena [29]: we use SLCs

MoS2 Molybdenum Disulfide

MRAM Magnetic RAM

MRL Memristor-Ratioed Logic

MTJ Magnetic tunnel junction

NoRMA No-Remote-Memory-Access

NUMA Non-Uniform-Memory-Access


NV-RAM Non-volatile RAM

OxRRAM metal Oxide Resistive switching Memory

PC Peripheral Circuitry

PLMA Programmable Logic Memristor Array

PRAM Parallel Random Access Machine

qPRAM Queuing PRAM

PCM Phase Change Memory (PCRAM)

PDP Power Delay Product

PG Power Gating

ReRAM Resistive RAM

RH High Resistance state

RH/RL the ratio of High Resistance state to Low Resistance state

RL Low Resistance state

RRAM Resistive RAM

SB Switch Box

SB-GNRFET Schottky-Barrier-type Graphene Nano-Ribbon Field-Effect Transistor

SIMD Single instruction multiple data

SiNW Silicon NanoWires

SLC Single-Level Cell

SMP Symmetric Multi-Processing

SPMD Single Program Multiple Data

SRAM Static Random Access Memory

SReRAM Resistive RAM

STI Shallow Trench Isolation

STT-RAM Spin-Transfer Torque Magnetic RAM

SVM Support Vector Machine

TFET Tunnel FET

ULP Ultra-Low Power

UMA Uniform-Memory-Access

UPC Universal Parallel C

UTBB Ultra-Thin Body and Box

ZRAM Zero Capacitor RAM

>> Outline <<

• Why FPGAs are important

• Traditional FPGA Operation

• Routability

• FPGA Technology trends

• New Computing Paradigm

• Conclusions

Conventional FPGA Architecture

[Jason Cong et al.]

[Figure: array of Look-Up Blocks (LUB) embedded in a grid of Switch Blocks (SB)]

Acronyms:
LUT Look-Up Table
LUB Look-Up Block
CB Connect Block: connects LUB
SB Switch Block: routing resource
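To make the role of the look-up table concrete, here is a minimal sketch of a k-input LUT whose behaviour is defined entirely by its configuration bits. The class name `LookUpTable` and its methods are hypothetical names chosen for this illustration; real FPGA logic blocks add flipflops, carry chains and other details that are left out here.

```python
from itertools import product


class LookUpTable:
    """k-input LUT: 2**k configuration bits form the truth table."""

    def __init__(self, num_inputs, config_bits):
        if len(config_bits) != 2 ** num_inputs:
            raise ValueError("a k-input LUT needs exactly 2**k configuration bits")
        self.num_inputs = num_inputs
        self.config_bits = list(config_bits)

    @classmethod
    def from_function(cls, num_inputs, func):
        """'Configware': derive the configuration bits from a Boolean function."""
        bits = [int(bool(func(*inputs)))
                for inputs in product((0, 1), repeat=num_inputs)]
        return cls(num_inputs, bits)

    def evaluate(self, *inputs):
        """The inputs form an address (first input = most significant bit)."""
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of inputs")
        address = 0
        for bit in inputs:
            address = (address << 1) | (bit & 1)
        return self.config_bits[address]


# The same structure becomes XOR or AND purely by loading different bits.
xor2 = LookUpTable.from_function(2, lambda a, b: a ^ b)
and2 = LookUpTable.from_function(2, lambda a, b: a & b)
print(xor2.config_bits, and2.config_bits)
```

The point of the sketch is that no logic is "built": reconfiguration only rewrites the stored bits, which is why the configuration memory technology matters so much for FPGAs.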

Island-style FPGA: details

[Figure: Look-Up Blocks (LUB), a Connect Block (CB) and a Switch Block (SB) with their switches. Labels: SP Switch Point; CP Connect Point; FF Flipflop: part of configuration RAM; only here: 990 without LUBs]

Traditional Switch Block

[Figure: Switch Block with a Switch Point (SP). Labels: FF, part of configuration memory; 150]

>> Outline <<

• Why FPGAs are important

• Traditional FPGA Operation

• Routability

• FPGA Technology trends

• New Computing Paradigm

• Conclusions


Routing Congestion Example

Direct connection impossible: here solved by detour connections (route-through …)
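The detour idea can be illustrated with a breadth-first maze search of the kind often used in textbook routers. This is a simplified sketch under stated assumptions: the routing fabric is modelled as a plain grid in which `1` marks an occupied resource, and the function name `route` is hypothetical. It is not the routing algorithm of any particular FPGA tool.

```python
from collections import deque


def route(grid, source, target):
    """Breadth-first search over free cells; returns a shortest path or None."""
    rows, cols = len(grid), len(grid[0])
    previous = {source: None}          # also serves as the visited set
    queue = deque([source])
    while queue:
        cell = queue.popleft()
        if cell == target:
            path = []
            while cell is not None:    # walk back to the source
                path.append(cell)
                cell = previous[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            free = 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0
            if free and (nr, nc) not in previous:
                previous[(nr, nc)] = cell
                queue.append((nr, nc))
    return None                        # congestion: no route exists


# The straight connection along the middle row is blocked at (1, 2),
# so the router has to take a detour through a neighbouring row.
congested = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
detour = route(congested, (1, 0), (1, 4))
print(detour)
```

Without the blocked cell the connection would need 4 hops; with it the shortest detour needs 6. When every alternative is occupied as well, the function returns `None`, which corresponds to an unroutable net.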

A Patent on Routing Congestion

Suit filed by Avant! Corporation (2002*) against a patent on avoiding routing congestion

Quickturn Design Systems Inc., vendor of ASIC emulators; 1999: subsidiary of Cadence

As an expert appointed by the court*, I found, from 50 years earlier: W. A. Malthaner, Earle Vaughan: An Experimental Electronically Controlled Automatic Switching System; Bell System Technical Journal, May 1952

*) http://en.wikipedia.org/wiki/Bundesgerichtshof

Better Routability*

[source: Gomez-Prado et al.]

Why do Flexibility and Topology matter?

Higher Flexibility = better Routability

*) measures the number of circuits that can be routed

Different SB Architecture and Notation Examples

[Figure: switch block examples from [Gomez-Prado et al.], [X. Maet et al.], [Lemieux et al.] and [Hartenstein]. Labels: 24; 96; bidir.]

The very popular Wilton Switch Box*

[source: Masud et al.]

here 2 possible routes using disjoint blocks

*) S. Wilton: Architecture and Algorithms for Field-Programmable Gate Arrays with Embedded Memory. PhD thesis, University of Toronto, 1997

already 20 years ago: frequent literature on routing optimization

(non-planar layout)


>> Outline <<

• Why FPGAs are important

• Traditional FPGA Operation

• FPGA Technology trends

• New Computing Paradigm

• Conclusions


FPGA Improvement

The customizable resources in FPGAs account for more than 80% of the total area and delay. [Giovanni de Micheli et al.]

The architecture can be improved by working directly on the memories and on their efficient use in routing operations.

Much more efficient routing by replacing the pass transistors with memristors

FPGAs with non-volatile configuration memory

First MRAM FPGA

LIRMM: “World’s First MRAM-based FPGA!”

taped out in 2010

What MRAM Device?

Device-rel. experiences

The Memristor

Resistor with Memory (MRAM)

The 4th Fundamental Element: “missing connection among variables”

“predicted” by L. Chua 1971

Postulated by Karl Steinbuch 1962

Widrow’s Memistor Corp. 1963-1965

TiO2 semiconductor: high resistance and

Related technologies: RRAM, PCM, STT MRAM, PCM/CBRAM, Magnetic Tunneling Junctions, Programmable Metallization Cells, Carbon Nanotubes, DRAM, FIN FETs, FLASH, FRAM, OxRRAM, PRAM, SiNW, etc. (many other acronyms)

[Weisheng Zhao, Lionel Torres, LIRMM, 2011]

More Technology Efforts

Candidate vendors: Cypress, Everspin, Fujitsu, Honeywell, HRL, IBM, Infineon, Intel, LAPIS, Micromem, Panasonic, Qualcomm, Renesas, Ramtron, Samsung, ST, Toshiba … many others

Conferences and blogs: physics conferences; http://www.nanoarch.org/ http://www3.nd.edu/~cnna2014/

Technologies:
• FDSOI-FETs
• Tunnel FETs (TFETs)
• Silicon NanoWires (SiNW)
• Carbon Nano-Tubes (CNTs)
• Shallow Trench Isolation (STI)
• Ultra-Thin Body and Box (UTBB)
• FIN FETs (also called Trigate FETs)
• FETs with on-line controllable polarity
• Fully-Depleted Silicon on-Insulator (FDSOI)
• III-V nanowires on silicon by direct epitaxial growth
• Of course, MRAM, Graphene, etc. and many more …

3 ReRAM Technologies

[Figure labels: Panasonic ReRAM; dopant movement; Toshiba press release 2009; (ferromagnetic effects); [D. Strukov et al.]; “Spintronics”; phase-change RAM]

Processing in Memory (PIM): a Paradigm Shift

The Technology Challenge

Post-CMOS Nanocomputing

ICs in the 21st Century

Overcome the fundamental Limitations of CMOS

Involve new Paradigms for Manufacturing

Reanimation of Moore‘s Law

[NanoArch‘14 Call for Papers]


>> Outline <<

• Why FPGAs are important

• Traditional FPGA Operation

• FPGA Technology trends

• New Computing Paradigm

• Conclusions

Memristor People's strange Notations

Switch Point (SP)

Switch Block (SB)

No Standards yet

Logic within Memristor-based Crossbar Memory

[S. Kvatinsky et al.]

Ron: ‘1’, Roff: ‘0’

Where is the output pin?

A good tutorial is missing!

Unidirectional Routing Paths

Double Rail Required for non-volatile SB memory

Switch Point example

New Logic Design Methodology needed

This topic area has not yet arrived here, nor at similar conferences

Memristor Ratioed Logic (MRL)

[Kvatinsky et al.]

• „Enhancing Computation“

• „Similar to CMOS logic“

• „Using CMOS for inversion and amplification“

• „Memristors operate only as computational elements“

Ron: logical ‘1’

Roff: logical ‘0’

[Figure: current in one direction decreases the memristance (the device is set to ‘1’); current in the other direction increases the memristance (the device is set to ‘0’); no change below a threshold voltage. Source: Kvatinsky et al.]
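The switching behaviour described above (two resistance states, polarity-dependent switching, no change below a threshold) can be captured in a deliberately crude behavioural model. Everything numeric here is an assumption for illustration: the resistance values, the threshold of 1.0 V, the choice that positive voltage sets the device to Ron, and the abrupt (binary) switching. Real devices change resistance gradually and are described by more detailed models.

```python
from dataclasses import dataclass


@dataclass
class ThresholdMemristor:
    """Binary memristor sketch: Ron encodes logical 1, Roff encodes logical 0."""
    r_on: float = 1e3          # ohms (assumed value)
    r_off: float = 1e5         # ohms (assumed value)
    v_threshold: float = 1.0   # volts (assumed value)
    resistance: float = 1e5    # start in the Roff state

    def apply(self, voltage):
        """Apply a voltage pulse; the state only changes beyond the threshold."""
        if voltage >= self.v_threshold:
            self.resistance = self.r_on     # memristance decreases: set to '1'
        elif voltage <= -self.v_threshold:
            self.resistance = self.r_off    # memristance increases: set to '0'
        return self.resistance

    @property
    def logic_value(self):
        return 1 if self.resistance == self.r_on else 0


cell = ThresholdMemristor()
for pulse in (0.5, 1.2, -0.5, -1.2):
    cell.apply(pulse)
    print(f"pulse {pulse:+.1f} V -> stored value {cell.logic_value}")
```

Sub-threshold pulses leave the stored value untouched, which is what makes the device usable as non-volatile memory that can be read without being disturbed.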

Experts claim: Logic Design Methodology still missing

Memory computes?

Also see: F. Clermidy et al.: „Resistive Memories: Which Applications?“ DATE2014

[Yiyu Shi et al.] DATE 2014

[Chun Zhang et al.] DATE 2014

PIM: what‘s the consequence ?

Will Programmers be replaced by Designers ? Will there be a Programmability wall ?

a good tutorial needed !

do you understand this ?


The Memristor Challenge

Memristors for multiple Stacking of non-volatile Memory

> 1 petabyte* (> 10^15 bytes) in one square centimeter**

Prefix  Symbol  Factor
Kilo    k       10^3
Mega    M       10^6
Giga    G       10^9
Tera    T       10^12
Peta    P       10^15
Exa     E       10^18
Zetta   Z       10^21
Yotta   Y       10^24

New Methodologies: Text Book missing

*) 10^5 DVDs (1 DVD: 10 gigabytes, i.e. 10^10 bytes)

**) [Th. L. Sterling, H. P. Zima]
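The DVD comparison in the footnote is simple to check. The short sketch below only reproduces the slide's arithmetic, taking over its assumption of 10 gigabytes per DVD; the variable names are chosen for this example.

```python
PETABYTE = 10 ** 15            # bytes
DVD_CAPACITY = 10 * 10 ** 9    # bytes; the slide assumes 10 gigabytes per DVD

dvds_per_petabyte = PETABYTE // DVD_CAPACITY
print(dvds_per_petabyte)       # number of DVDs needed to hold one petabyte
```

The result is 10^5, i.e. one hundred thousand DVDs for the capacity claimed for a single square centimeter.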

New Directions beyond Simple Switches

Rethinking the Methodologies and Design Tools needed

New Logic, Memory Concepts and Circuit Styles

[NanoArch‘14 Call for Papers]


>> Outline <<

• Why FPGAs are important

• Traditional FPGA Operation

• FPGA Technology trends

• New Computing Paradigm

• Conclusions


History of Logic Design

“EE is notorious for its powerful mathematical methods which have nothing to do with electricity.”

“It’s not surprising, that the science of logically prescribed actions has yet no home of its own”

(remarks from a book review, 1959)

1958 lectures on logic design: a good translation, which translates relay contacts into tubes and transistors

Education Revolution: the M-&-C* VLSI Design Revolution

The E.I.S. Project [1980]: http://xputer.de/EIS/

By far the most effective project in the history of modern computer science

Designer Population missing

*) Mead-&-Conway

Education Reinvented: the M-&-C Design Revolution

Traditional division of specialization (fragmentation, with submit/reject hand-offs between the levels):
• Logic level
• Switching level
• Circuit level
• Register Transfer (RT) level
• Application level
• Layout level
• in-house technology

The Mead-&-Conway strategy: clearing out & intuitive models
• reduced width of specialization
• the tall thin man: coherence from Application to Silicon Foundry

the E.I.S. Project: http://xputer.de/EIS/

Future of Logic Design

From transistors to memristors: drastically more difficult

We need a massively new book, similar to Samuel H. Caldwell’s type of book!!

Memristor network design is notorious for its powerful results, which seem to have nothing to do with logic.

It’s not surprising that the science of logic design of memristor-based systems has yet no home of its own.

[Book cover mock-up: “Memristors and Logic Design”, labeled “memristic ?”]

Mead-&-Conway-II needed: a non-trivial task

Mead-&-Conway-I:
• Design Rules: reduced to only 2 pages
• only: resistor, p-transistor, n-transistor
• MPC infrastructure w. DRC and fabrication for student exercises
• the tall thin man

Mead-&-Conway-II:
• Design Rules: reduced to only 2 pages
• only: resistor, p-transistor, n-transistor, memristor
• MPC infrastructure w. DRC and fabrication for student exercises
• the tall thin man II

[Figure labels: classical CMOS; CMOS]

Designer Population missing

Booting not needed ?

However, memristors together with related technology progress mean that we must reinvent computing

Computing without booting will improve computing efficiency by several orders of magnitude

Europe needs an extraordinarily massive R&D funding program to avoid becoming incompetent

The Reanimation of Moore‘s Law

Will USA be here faster than Europe ?

Technology leadership: USA

Design leadership: not yet existing

Thank You! Questions?

Backup for Discussion

A huge design space (1)

Mike Flynn's Taxonomy (published in 1966): Single vs. Multiple, Instruction vs. Data

• SISD: Single Instruction Single Data
• SIMD: Single Instruction Multiple Data
• MIMD: Multiple Instruction Multiple Data
• MISD: Multiple Instruction Single Data
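The four classes follow mechanically from two counts, which the following sketch makes explicit. The function name `flynn_class` is a hypothetical helper for this illustration; it covers only the classic taxonomy with at least one instruction stream and one data stream.

```python
def flynn_class(instruction_streams, data_streams):
    """Classify a machine by its number of instruction and data streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("the classic taxonomy assumes at least one stream of each kind")
    instruction_part = "S" if instruction_streams == 1 else "M"
    data_part = "S" if data_streams == 1 else "M"
    return f"{instruction_part}I{data_part}D"


# Examples with assumed stream counts.
for streams in ((1, 1), (1, 8), (4, 1), (4, 4)):
    print(streams, "->", flynn_class(*streams))
```

Note that the helper deliberately rejects a machine with zero instruction streams: such a machine has no place in the classic two-by-two scheme.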


A huge design space (2)

extending Flynn‘s taxonomy by going heterogeneous

reconfigurable or not

Diana‘s Taxonomy

Diana Goehringer: Ph.D. thesis, KIT Karlsruhe


http://kressarray.de/

http://helios.informatik.uni-kl.de/papers/publications/phd_theses.html#goehringer

http://configware.org/


A huge design space (3)

Reiner‘s Taxonomy

datastream-based (FPGAs anti-machine)

noI versus SI or MI

Solving the programmability crisis is impossible without mastering the entire design space

http://data-streams.org/

http://anti-machine.org/

Mead-&-Conway-III …

even more … to cope with complex vN-typical storage hierarchies

heterogeneous platforms

tall thin man III: coherence

traditional division of specialization:
• Logic level
• Switching level
• Circuit level
• Register Transfer (RT) level
• application level
• Layout level
• inter-processor NoC
• pipe-networks
• arbiter
• multiplexing
• distribution mechanisms
• interconnect at all levels
• off-chip communication
• global and local memory
• distributed memory

not needed here

The Dead Supercomputer Society

Research 1985 to 1995 [Gordon Bell, keynote ISCA 2000]

• ACRI • Alliant • American Supercomputer • Ametek • Applied Dynamics • Astronautics • BBN • CDC • Convex • Cray Computer • Cray Research • Culler-Harris • Culler Scientific • Cydrome • Dana/Ardent/Stellar/Stardent • DAPP • Denelcor • Elexsi • ETA Systems • Evans and Sutherland Computer • Floating Point Systems • Galaxy YH-1 • Goodyear Aerospace MPP • Gould NPL • Guiltech • ICL • Intel Scientific Computers • International Parallel Machines • Kendall Square Research • Key Computer Laboratories • MasPar • Meiko • Multiflow • Myrias • Numerix • Prisma • Tera • Thinking Machines • Saxpy • Scientific Computer Systems (SCS) • Soviet Supercomputers • Supertek • Supercomputer Systems • Suprenum • Vitesse Electronics

A Coarse-Grained Reconfigurable Array

SNN filter on (supersystolic) KressArray (mainly a pipe network); no CPU

array size: 10 x 16 rDPUs (rDPU: reconfigurable Data Path Unit, 32 bits wide)

Legend: rDPU not used; used for routing only (route-through only); operator and routing; port location marker; backbus connect

Compiled by Ulrich Nageldinger's KressArray Xplorer (from a conference presentation)

Brick Wall in the Brain

After this talk*, a VIP jumped up from the floor: “But you can’t implement decisions!”

This statement was highly embarrassing, since it came from a top-level IT R&D manager of a very large worldwide electronics/computer industry group.

We immediately see the brick wall in his brain

*) RAW workshop, late 1990s, Orlando, Florida

… can‘t implement decisions?

S = R + (if C then A else B endif);

[Figure: section of a very large pipe network with inputs A, B, R, C; labels: =1, +]

decision box turns into a multiplexer*

in the year 1971**: “That’s so simple! Why did it take 30 years to find out?”

**) the hardware description languages community; C. G. Bell et al: IEEE Trans-C21/5, May 1972

W. A. Clark: 1967 SJCC, AFIPS Conf. Proc.
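The statement `S = R + (if C then A else B endif)` can be read structurally: both A and B are always present as data inputs, and C merely selects which of them reaches the adder. The sketch below models that reading in software; the names `mux`, `pipe_stage` and `run_stream` are hypothetical, and the list-based "stream" is only a stand-in for data flowing through a pipe network.

```python
def mux(select, when_true, when_false):
    """2-to-1 multiplexer: both data inputs are present, select picks one."""
    return when_true if select else when_false


def pipe_stage(a, b, r, c):
    """Structural form of S = R + (if C then A else B): a mux feeding an adder."""
    return r + mux(c, a, b)


def run_stream(samples):
    """Push a stream of (A, B, R, C) tuples through the stage, one result each."""
    return [pipe_stage(a, b, r, c) for (a, b, r, c) in samples]


results = run_stream([(1, 2, 10, True), (1, 2, 10, False), (5, 7, 0, True)])
print(results)
```

In this form the decision is not a jump in an instruction stream but a fixed piece of data path, which is the sense in which a "decision box turns into a multiplexer".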

http://wrongroadmap.com/


Dual paradigm mind set: an old hat - but still ignored

time to space mapping: procedural to structural

C. G. Bell et al: The Description and Use of Register-Transfer Modules (RTM's); IEEE Trans-C21/5, May 1972

[Figure: chain of flipflops (FF); labels: token bit, evoke]

von Neumann machine paradigm (program counter)

Anti-machine paradigm (data counters): http://anti-machine.org/