CUstom Built HEterogeneous Multi- Core...
Transcript of CUstom Built HEterogeneous Multi- Core...
![Page 1: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/1.jpg)
CUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH):
Breaking the Conventions
Nagarajan Venkateswaran
Director, Waran Research Foundation
Karthikeyan Palavedu Saravanan - Nachiappan Chidambaram Nachiappan
Research Trainees (2008 - 2010), Waran Research Foundation
Aravind Vasudevan - Balaji Subramaniam - Ravindhiran Mukundarajan
Former Research Trainees (2007 - 2009), Waran Research Foundation
1Chennai, India
![Page 2: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/2.jpg)
Motivation : Heterogeneity Redefined
• Cost Effective High Performance Custom Built Heterogeneous Multi-Core Node Design for wider class applications
– Inter and Intra core heterogeneity
• Breaking the Conventions
– Multiple User Multiple Application without Space-Time sharing in a Cluster : Cost sharing across users
– Single User Multiple Application without Space-Timer Sharing (non-multiprogramming) : Cost sharing across applications
2Chennai, India
![Page 3: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/3.jpg)
Overview
• Custom Built Heterogeneous Multi-Core Architectures (CUBEMACH)
• Design Space – Architectural Space
– Optimization Space – Customer Vendor Interaction
– Simulation Space
• CUBEMACH Design and Simulation Tool Framework
• Conclusion
3Chennai, India
![Page 4: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/4.jpg)
Custom Built Heterogeneous Multi-Core Architectures (CUBEMACH)
• CUBEMACH promises
– Increased Resource Utilization
– Multiple Application Flavored Architectures
– Elimination of Space Time Sharing at the Quantum Level during Multiple Application Execution
– Manufacturing and Operational Cost reduction
4Chennai, India
![Page 5: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/5.jpg)
Overview
• Custom Built Heterogeneous Multi-Core Architectures (CUBEMACH)
• Design Space
– Architectural Space– Optimization Space
– Customer Vendor Interaction
– Simulation Space
• CUBEMACH Design and Simulation Tool Framework
• Conclusion
5Chennai, India
![Page 6: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/6.jpg)
CUBEMACH Design Paradigm
6Chennai, India
![Page 7: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/7.jpg)
ONNET
CUBEMACH
Architectural Space
SCOSPCOS
Compiler-On-Silicon
ALFU
SRAMDRAM
MemoryALISA
Architectural Design Space - CUBEMACH
7Chennai, India
![Page 8: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/8.jpg)
Architectural Space
• Why ALU Why Not ALFU??
– Hardwired units
–Design : Homogeneously Structured
–Reduced Instruction Generation & Fetches : Employ a Higher Level ISA
–Reduced memory-functional unit interaction
–Helps execute multiple applications without space & time sharing
8Chennai, India
![Page 9: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/9.jpg)
Control
UnitAlgorithm
Size
Memory
In
Processor
HLFU
Characteristics
ALFU
RequirementsDelay
ALFU Types
Class of
Algorithms
Type of MIP
Cell
Number
of MIP
Cells
HLFU
Control
Centralize/
Decentralized
Grain SizeArchitecture
Class of
Units
Input Bits
Scalar
ALFU
Algorithm Level Functional Unit
9Chennai, India
![Page 10: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/10.jpg)
ALU vs ALFU Instruction Generation Results
10Chennai, India
![Page 11: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/11.jpg)
Sample Algorithm Level Functional Units
• Matrix Centric Units
• Matmul
• Matadd
• Chain Matadd
• Scalar Units
• Scalar Adder / Subtractor
• Scalar Multiplier
• Scalar Divider
• Comparator
• Sorter
• Multiple Operand Adder
• Min / Max Finder
• Vector Units
• Inner Product
• Graph Theoretic Units
• Graph Traversal Unit –
BFS, DFS
• KL Graph Partitioning
Architectural Space Contd…
11Chennai, India
![Page 12: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/12.jpg)
ALISA – Algorithm Level Instruction Set Architecture
• Algorithm Level Instructions
• Triggers ALFUS
• ALISA Multiple VLIWs
• ALISA for heterogeneous multi-cores
Architectural Space Contd…
12Chennai, India
![Page 13: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/13.jpg)
Hierarchical Compilation Scheme
• PCOS Partitions A Problem
Into Sub-Problems – Level 1
• SCOS Partitions The Sub-
Problems Into ALFU Level
Instruction – Level 2
PCOS
SCOS
Application
Sub - Application
Instruction
Architectural Space Contd…
13Chennai, India
![Page 14: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/14.jpg)
ALISA & Compiler On Silicon
SCOSPCOS
No. of parallel
Units
Rate of Output
GenerationScheduler
Processing rate
No. of Ports
BISA
Length
No. of I/O
Ports
Scheduler O/P
Generation rate
Compiler-On-
Silicon
Number of Instructions Per
ALISATypes of Instructions Per
ALISA
Types of Instructions in
ISA
ALISA
No of ALISA Fields
Decoding/ Encoding LogicField Length
14Chennai, India
![Page 15: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/15.jpg)
ON-Node-Network Architecture
2D - Torus
Sub-Local Router Local Router
ALFU
Population
Global Router
Core
Architectural Space Contd…
15Chennai, India
![Page 16: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/16.jpg)
ON-Node-Network Architecture
H- Tree Topology
Architectural Space Contd…
Sub-Local Router
Local Router
Global Router
16Chennai, India
![Page 17: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/17.jpg)
ONNET Conventional NOCs
Type of Switch MIN Crossbar
Number of Routers
N* log 2 (N) N2
Hierarchy Yes No
Switching Latency Log2(Number of Inputs) * Switch Delay
Number of Inputs* Switch Delay
Comparison of Conventional NOCs with ONNET
17Chennai, India
![Page 18: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/18.jpg)
ONNET
Routers
Organization
Packet
SwitchingInput/OutputPacketization
Address
Decoding
Destination
IDBuffer Size
Output Data
SizeHLFU ID Path
Latency
Destination
Router ID
Type of MIN
Input Traffic
HLFU
Count
Logical
Grouping
Buffer/
Stack Size
Route
LocationData Rate
I/O PortNo. of
DecodersDecoding
Rate
Length of
Stack
No. of Buffers
Packet Size
I/P Data Size
Word Length
On Node Network Architecture
18Chennai, India
![Page 19: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/19.jpg)
Overview
• Motivation• Custom Built Heterogeneous Multi-Core
Architectures (CUBEMACH)• Design Space
– Architectural Space– Optimization Space
–Customer Vendor Interaction– Simulation Space
• CUBEMACH Design and Simulation Tool Framework
• Conclusion
19Chennai, India
![Page 20: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/20.jpg)
CUBEMACH
Architectural
Space
Multiple ApplicationsMultiple
ApplicationsMultiple
Applications Input
CUBEMACH
Simulation
Space
Optimization Space
Core
Formation
Power Model
Performance Model
Desired Power to Performance Ratio
CUBEMACH
Optimization
Space
Simulated Annealing
Selected
ParametersFinal
CUBEMACH
Calculated Power to
Performance Ratio
Initial Candidate Architecture Parameters
Game Theory
1
2
3 4 5
5a
5b
9
7
6
8
20Chennai, India
![Page 21: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/21.jpg)
Optimization Space
• Generates Optimized CUBEMACH for input specifications such as,
– Power – Performance – Cost
– Initial Architecture
• Power and Performance Model
• Uses GT and SA for optimization of Power and performance
• Uses KL For Core Grouping
21Chennai, India
![Page 22: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/22.jpg)
Sample CUBEMACH Architecture
22
![Page 23: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/23.jpg)
CUBEMACH Design Implementation : Supercomputer
On Chip (SCOC) IP Cores
23Chennai, India
![Page 24: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/24.jpg)
SCOC IP Cores
• ALFUs designed as SCOC IP Cores
• Soft IP Core
• Coarse-grained Reusable Soft IP Cores
• Scalable IP Cores
24Chennai, India
![Page 25: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/25.jpg)
Customer Vendor Interaction
App 2App 3App 4
App 1CUBEMACH Node
Manufacturers/ System Vendors
Customers ApplicationRequirements –
Power & Performance
Initial Heterogeneous Multi Core Candidate Architecture
Final Architecture
Layout
Fabrication of IP Cores
Simultaneous Multiple Applications
Optimizer
Workload Generation
Intermediate Format
CUBEMACH Design Space
Simulator
SCOC IP CORES
Intermediate CUBEMACH
Optimized CUBEMACH`
Optimization Space Contd…
25Chennai, India
![Page 26: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/26.jpg)
Overview
• Motivation• Custom Built Heterogeneous Multi-Core
Architectures (CUBEMACH)• Design Space
– Architectural Space– Optimization Space
–Customer Vendor Interaction– Simulation Space
• CUBEMACH Design and Simulation Tool Framework
• Conclusion
26Chennai, India
![Page 27: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/27.jpg)
CUBEMACH Simulator
• pThread based Simulator
• Evaluates candidate CUBEMACH Architecture
• Feed results to CUBEMACH Optimizer
• CUBEMACH Optimization Engine (COE) produces Optimized Architecture
• Simulation & Optimization : An iterative process
• Consists of
ALFU Sub-Simulator COS Sub-Simulator
ONNET Sub-Simulator Memory Sub-Simulator
27Chennai, India
![Page 28: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/28.jpg)
CUBEMACH Simulator
28
![Page 29: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/29.jpg)
Integrated CUBEMACH Design Paradigm …
What we have seen . . .
29Chennai, India
![Page 30: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/30.jpg)
Core
Formation
ONNET
Routers
Organization
Packet
SwitchingInput/Output
Packetization
Address
Decoding
Destination
IDBuffer Size
Output Data
Size HLFU ID Path
Latency
Destination
Router ID
Type of
MIN
Input Traffic
HLFU
Count
Logical
Grouping
Buffer/
Stack
Size
Route
LocationData Rate
I/O Port
No. of
DecodersDecodin
g Rate
Length
of Stack
No. of BuffersPacket Size
I/P Data Size
Word Length
CUBEMACH
Architectural
Space
Multiple ApplicationsMultiple
ApplicationsMultiple
Applications Input
Power Model
Performance Model
Desired Power to
Performance Ratio
CUBEMACH
Optimization
Space
CUBEMACH
Simulation
Space
Game Theory
Simulated Annealing
Selected
Parameters
SimulatorSimulation
Results
Intermediate Architectural Parameters
Final CUBEMACH
Calculated Power to
Performance Ratio
Initial Candidate Architecture Parameters
SCOSPCOS
No. of
parallel Units
Rate of
Output
GenerationScheduler
Processing
rate
No. of
Ports
BISA
Length
No. of
I/O Ports
Scheduler O/P
Generation rate
Compiler-On-Silicon
Control
UnitAlgorithm
Size
Memory
In
Processor
HLFU
Characteristics
ALFU
RequirementsDelay
ALFU
Types
Class of
Algorithms
Type of MIP
Cell
Number
of MIP
Cells
HLFU
Control
Centralize/
Decentralized
Grain
Size Architecture
Class of
Units
Input Bits
Scalar
ALFU
SRAMDRAM
Mapping and
Replacement
HeuristicPacket
SizeNo of
Blocks
DRAM
Size
SRAM
Size
Cache
Line Size
Word Length
Memory
No Of
Ports
Number of
Instructions Per
ALISA
Types of
Instructions
Per ALISA
Types of
Instructions
in ISA
ALISA
No of ALISA
Fields
Decoding/
Encoding Logic
Field Length
30
![Page 31: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/31.jpg)
Sample CUBEMACH Architecture :
Simulation Results
Matrix Based Algorithms Graph Based Algorithms
31Chennai, India
![Page 32: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/32.jpg)
Sample CUBEMACH Architecture :
Simulation Results
Mixture of Algorithms Comparison of Performance
delivered by Optimized
Architectures for corresponding
types of Algorithms
32Chennai, India
![Page 33: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/33.jpg)
Overall Resource Utilization of :(i) Initial CUBEMACH Architecture : Mean = 59 %(ii) Optimized CUBEMACH Architecture : Mean = 74 %
Sample CUBEMACH Architecture :
Simulation Results
33Chennai, India
![Page 34: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/34.jpg)
In Initial Candidate CUBEMACH Architecture,
•Matrix ALFUS – low usage•Scalar ALFUS – average usage•Graph ALFUS – high usage
In Optimized Candidate CUBEMACH Architecture,
•Matrix ALFUS – high usage•Scalar ALFUS – high usage•Graph ALFUS – high usage
Sample CUBEMACH Architecture : Simulation Results
34Chennai, India
![Page 35: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/35.jpg)
Conclusion
• Custom Built Heterogeneous Multi-Core Architectures (CUBEMACH) promises,
– Increased Resource Utilization
– Multiple application flavored architectures
– Elimination of Space Time Sharing at the Quantum Level during Multiple Application Execution (without multiprogramming)
– Manufacturing and Running Cost reduction
35Chennai, India
![Page 36: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/36.jpg)
Thank You
Questions??
36Chennai, India
![Page 37: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/37.jpg)
Customizable Compiler-On-Silicon
• What Compiler-On-Silicon?
• Why do we need Compiler-On-Silicon ?
• Why go for Customizable Compiler-On-Silicon ?
Architectural Space Contd…
37Chennai, India
![Page 38: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/38.jpg)
ONNET
Architecture uses -
• Multistage Interconnect Network
• Hardware Packetization Unit
• ONNET Design Space
– H-Tree Structure within a Core
– 2D Torus Across Cores
– MIN Type
Architectural Space Contd…
ONNET
Routers
Organizatio
n
Packet
SwitchingInput/Outpu
t
Packetization
Address
Decoding
Destination
IDBuffer Size
Output Data
SizeHLFU ID Path
Latency
Destination
Router ID
Type of
MIN
Input
Traffic
HLFU
Count
Logical
Grouping
Buffer/
Stack
Size
Route
LocationData
RateI/O Port
No. of
DecodersDecodin
g Rate
Length
of Stack
No. of
Buffers
Packet Size
I/P Data Size
Word Length
38Chennai, India
![Page 39: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/39.jpg)
Architectural Design Space - CUBEMACH
• ALFU – Algorithm Level Functional Units
• BISA – Backbone Instruction Set Architecture
• COS – Compiler On Silicon
• ONNET – On Node Network
• Novel Cache Mapping Scheme
• SCOC IP Cores : Achieving cost effectiveness
( Super Computer On Chip - IP Cores)
39Chennai, India
![Page 40: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/40.jpg)
Features -
• Communication across heterogeneous multi-cores
• Data requirements of diverse ALFUs
• High bandwidth
• Scalable
• Hierarchical Network-On-Chip
On Node Network Architecture
40Chennai, India
![Page 41: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/41.jpg)
SRAMDRAM
Mapping and
Replacement HeuristicPacket Size
No of Blocks
DRAM Size
SRAM Size
Cache Line Size
Word Length
Memory
No Of Ports
Memory
41Chennai, India
![Page 42: CUstom Built HEterogeneous Multi- Core …graal.ens-lyon.fr/~abenoit/hcw2010/slides/saravanan.pdfCUstom Built HEterogeneous Multi-Core ArCHitectures (CUBEMACH): Breaking the Conventions](https://reader034.fdocuments.in/reader034/viewer/2022042217/5ec029cff68ad239a37dfb0e/html5/thumbnails/42.jpg)
Advantages of SCOC IP Cores
• Fully Customizable
• Greatly reduces Design-Turnaround-Time
• Physically Design Friendly
– Constraints of Area, Power and Performance
• Constrained & Rigid Design Methodology
42Chennai, India