General Purpose Processors as Processor Arrays Peter Cappello UC, Santa Barbara.

VLSI Design Forces in 1986
“Nature, to be commanded, must be obeyed.” – Sir Francis Bacon
• High performance → parallelism
• Power is scarce → limit resistive delay → limit long communication
• Area is scarce → limit wire crossing
• $$ are scarce → design is expensive → reuse components
• In 2D systolic arrays, clock skew is an issue → wavefront arrays: islands of synchrony in an ocean of asynchrony

Processor Array Properties
1. Have multiple processors
2. Neighbors abut (no long wires)
3. Only neighbors communicate directly
4. Have a constant # of processor types
5. Scale: larger problems → larger arrays
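
As an illustration of these properties (not taken from the talk), the sketch below simulates odd-even transposition sort on a linear processor array. The function name and the choice of algorithm are assumptions made for the example: each cell holds one value, all cells are of one type, only abutting neighbors exchange data, and a larger problem simply uses a longer array.

```python
def odd_even_transposition_sort(values):
    """Sort on a simulated linear array of identical cells.

    Cell i holds one value and exchanges data only with cell i-1 or i+1.
    """
    cells = list(values)
    n = len(cells)
    for step in range(n):                 # n synchronous steps suffice
        start = step % 2                  # alternate even and odd neighbor pairs
        for i in range(start, n - 1, 2):
            # Compare-exchange between abutting neighbors i and i+1 only.
            if cells[i] > cells[i + 1]:
                cells[i], cells[i + 1] = cells[i + 1], cells[i]
    return cells


if __name__ == "__main__":
    print(odd_even_transposition_sort([5, 2, 9, 1, 7, 3]))
```

In hardware, the inner loop would run in parallel across the cells; the sequential loop here only models one synchronous step.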

No 3D PA Has Properties 1 - 5
Enclose 3D PA in minimal sphere of radius r.
Scale PA in all 3 dimensions.
1. Power consumption = Θ(r^3).
2. Heat dissipation via surface = Θ(r^2).
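
The two bounds can be checked numerically. The sketch below assumes a uniform power density inside the enclosing sphere (the `power_density` parameter is hypothetical, not from the slides) and computes the power that each unit of surface area would have to dissipate. Under that assumption the ratio grows linearly in r, which is why an array scaled in all three dimensions eventually cannot shed its heat.

```python
import math


def power_per_unit_surface(r, power_density=1.0):
    """Power to dissipate per unit of surface area for a sphere of radius r.

    Assumes power is generated uniformly throughout the volume.
    """
    volume = (4.0 / 3.0) * math.pi * r ** 3    # power consumption ~ r^3
    surface = 4.0 * math.pi * r ** 2           # heat dissipation ~ r^2
    return power_density * volume / surface    # simplifies to power_density * r / 3


if __name__ == "__main__":
    for radius in (1, 2, 4, 8):
        print(radius, power_per_unit_surface(radius))
```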

VLSI Design Forces in 2006
“Nature, to be commanded, must be obeyed.” – Sir Francis Bacon
• Power is scarce → limit clock frequency → parallelism
• Power is scarce → limit resistive delay → limit long communication

Trends in GPP in 2006
• Chip multiprocessors (CMP)
• Vector IRAM
• Cell
• TRIPS
• RAW

Trends in GPP in 2006
Chip Multiprocessors (CMP)
– Parallel processors
– Crossbar

Trends in GPP in 2006
Vector IRAM – Vector Intelligent RAM
• For mobile multimedia devices
• Stream data processing
• Combine GPP and DSP
– Parallel – linear array
– Crossbar

Trends in GPP in 2006
Cell processor
“The Department of Energy said Wednesday that it had awarded I.B.M. a contract to build a supercomputer capable of 1,000 trillion calculations a second, using an array of 16,000 Cell processor chips that I.B.M. designed for the coming PlayStation 3 video game machine.” Sept. 7, 2006. NY Times.
• Parallel processors
– BIU – Bus interface unit
– RMT – Replacement management table
– SL1 – 1st-level cache
– PPE – PowerPC Element
– SPE – Synergistic Processor Element
– Element interconnect bus

Trends in GPP in 2006
• TRIPS: Tera-op, Reliable, Intelligently adaptive Processing System
The following slides are taken from a talk: "The Design and Implementation of the TRIPS Prototype Chip," HotChips 17, Palo Alto, CA, August 2005.
• E – execution tile
• R – register bank
• D – 8KB data cache
• I – instruction cache
• G – global control

• Instructions execute as a data flow graph
– An instruction’s output is another instruction’s input.
– Minimize use of register/cache for intermediate values
• Register reads/writes access the register banks
• Loads/stores access the data cache banks
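
The data flow idea above can be sketched in a few lines. This is a toy model, not the TRIPS ISA or its scheduler: `run_dataflow`, the graph encoding, and the instruction names `t1` to `t3` are all hypothetical. An instruction fires as soon as its operands have arrived, and intermediate values pass directly from producer to consumer rather than through a named register.

```python
import operator


def run_dataflow(graph, inputs):
    """Evaluate a data flow graph.

    graph:  instruction name -> (operation, [operand names])
    inputs: name -> initial value (e.g. values read from registers)
    """
    values = dict(inputs)
    pending = dict(graph)
    while pending:
        # Instructions whose operands have all arrived may fire this step.
        ready = [name for name, (_, srcs) in pending.items()
                 if all(s in values for s in srcs)]
        if not ready:
            raise ValueError("cycle or missing input in data flow graph")
        for name in ready:
            op, srcs = pending.pop(name)
            values[name] = op(*(values[s] for s in srcs))
    return values


# (a + b) * c - (a + b): t1's output feeds both t2 and t3 directly.
GRAPH = {
    "t1": (operator.add, ["a", "b"]),
    "t2": (operator.mul, ["t1", "c"]),
    "t3": (operator.sub, ["t2", "t1"]),
}
result = run_dataflow(GRAPH, {"a": 2, "b": 3, "c": 4})
```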

Trends in GPP in 2006
RAW (MIT)
The following slides are taken from a RAW talk: "Evaluating The Raw Microprocessor: Scalability and Versatility," presented at the International Symposium on Computer Architecture, June 21, 2004.

[Figure: array of 16 ALUs with a single register file (RF)]
Replace the crossbar with a point-to-point, pipelined, routed network.
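
To make the idea of a routed point-to-point network concrete, the sketch below routes a message across a 2D grid of tiles one neighbor hop at a time. Dimension-ordered (x first, then y) routing is chosen here only for illustration; the slides do not say which routing scheme Raw uses, and `xy_route` and the coordinate encoding are assumptions.

```python
def xy_route(src, dst):
    """Return the tiles visited from src to dst on a 2D mesh.

    Moves along x first, then along y; every hop is between adjacent tiles.
    """
    (x, y), (dx, dy) = src, dst
    path = [(x, y)]
    while x != dx:
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:
        y += 1 if dy > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    route = xy_route((0, 0), (3, 2))
    print(route, "hops:", len(route) - 1)
```

Each hop uses only a short wire between abutting tiles, so wire length stays constant as the array grows; the cost is that latency now depends on the distance between tiles.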

Distribute the Register File
[Figure: 16 ALUs with the register file (RF) split into 16 local RFs]

Distribute the rest.
[Figure: 16 ALUs and 16 RFs; Control, Wide Fetch (16 inst), and the Unified Load/Store Queue are distributed as a PC, I$, and D$ per ALU] [ISCA99]

Tiles!
[Figure: 16 tiles, each with an ALU, RF, PC, I$, and D$]

Conclusions
• VLSI Scalable microprocessors are possible.
• Constant factors are beginning to give way to asymptotics:
  - 16 ALU Raw – Oct 2002
  - 64 ALU Raw – Now
  - 1,024 ALU Raw – 2010
  - 32,768 ALU Raw – If Moore’s Law makes it to 2 nm
• There is an opportunity to make processors more “versatile,” i.e., steal applications from custom chips.
• Tiled Processor Architectures are a promising approach and merit further research.

GPP Predictions: In 10 Years
• Encapsulate registers/cache/processor into an array (RAW)
• Partition off-chip memory: encapsulate memory & processor. Safely increase parallel access (concurrent programming)
• For non-recursive applications GPP (mobile multimedia):
– no bus; quasi-nearest neighbor networks.
• For recursive applications GPP (gaming, control):
– replace bus w/ lean on-chip short-diameter communication network.
– 1 network-on-chip routes register/cache/instruction/control.
– Need >= 1K processors/chip to justify network-on-chip.
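
A rough sense of what "short-diameter" buys at 1K processors can be had by comparing worst-case hop counts. The sketch below is an assumption-laden comparison, not a design from the talk: it contrasts a square nearest-neighbor mesh with a complete binary tree whose leaves are the processors, the tree standing in for any short-diameter network.

```python
import math


def mesh_diameter(n):
    """Worst-case hops between two processors in a sqrt(n) x sqrt(n) mesh."""
    side = math.isqrt(n)
    if side * side != n:
        raise ValueError("n must be a perfect square")
    return 2 * (side - 1)


def binary_tree_diameter(n_leaves):
    """Worst-case hops between two leaves of a complete binary tree."""
    depth = (n_leaves - 1).bit_length()   # ceil(log2(n_leaves)) for n_leaves >= 1
    return 2 * depth


if __name__ == "__main__":
    for n in (16, 64, 1024):
        print(n, mesh_diameter(n), binary_tree_diameter(n))
```

The mesh diameter grows as the square root of the processor count while the tree diameter grows logarithmically, so the gap widens as chips approach the 1K-processor range named above.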

Predictions
• Increasing complexity of:
– Applications
– Technology
→ Increasing specialization of labor
• Rate of change of increase in complexity is increasing over time → Increasing adaptability is important!

Yet another taxonomy!
Rows: ARCHITECTURAL SPECIFICITY. Columns: RECONFIGURABILITY.

| | STATIC | DYNAMIC |
|---|---|---|
| SPECIFIC | ASIC | PROTOTYPE ASIC |
| GENERAL | GPP | CCM |

[Figure: the taxonomy extended with a third axis. Axes: RECONFIGURABILITY (STATIC, DYNAMIC); APPLICATION SPECIFICITY (SPECIFIC, GENERAL); COMMUNICATION LATENCY (TP, DP). Entries: ASIC, PROTOTYPE ASIC, GPP, CCM.]

DP Communication Topology
[Figure: array of 16 FPGAs]
• EDGE ISA (2D VLIW)
• With cores: FFT, RISC
• High throughput (iterative) communication topology

TP Communication Topology
[Figure: array of 16 FPGAs]
• EDGE ISA (2D VLIW)
• With cores: RAM, RISC
• Low latency (recursive) communication topology

[Figure: layered stack annotated with DISCIPLINE and PROCESS columns]
Layers: General Purpose Language; Domain Specific Language; Computational Model; Compute Substrate; Communicate Substrate; Configurable Hardware; Static Hardware; Fabrication Technology
Processes: Application program (DE); Language design; Compiling (CS); Compute model design; Processor architecting; Processor layout (CE, EE); FPGA/Circuit design; Circuit layout; Fabrication process
Other discipline labels: CS, DE; CS, CE; CE, EE; EE; EE, ME; CS, DE

Conclusion
• Last 20 years witnessed dramatic advances.
• Next 20 years will witness even more dramatic advances.

Spare slides follow

Recursive Computation via a Tree of Meshes Network?

Quasi-Scalable
[Figure labels: RF, D$, GLOBAL, LOCAL, ADDRESS]

Interleave Memory & Processor Tiles
• Slightly more chips
• Compiler localizes memory accesses
• EDGE ISA deals with variable access times (TRIPS).

Cell architecture

Specialization of Labor
[Figure: layers paired with the roles responsible for them]
Layers: High Level / Domain-Specific Language; Computational Model (Exposes Comm. Topology); ISA / Network; FPGA; Fabrication
Roles: APPLICATION PROGRAMMER; COMPILER; COMPUTER ARCHITECT; COMPUTER ENGINEER; ELECTRICAL & COMPUTER ENGINEER