Tilera’s Many-core Processor
A scalable architecture on a single chip.
J. Whitesell & S. Ladavich
Tuesday, May 14th, 2013
History of Tilera
Pros and Cons of Building a Manycore Architecture
The Tilera Approach
Tilera’s …
▪ Tile Architecture
▪ iMesh Network Topology
Applications …
▪ Server
▪ Media
▪ Cloud
Performance Analysis and Benchmarking
1990: Multi-processor made of single chips
▪ MIT’s Dr. Anant Agarwal leads the way for Tiled Manycore
1994: 32-node mesh-based cache-coherent processor
2002: DARPA pays the bill! Gives tens of millions supporting RAW
▪ MIT’s RAW architecture
2004: Tilera’s stealth launch
2007: Tilera’s corporate launch
▪ “Tilera has solved the multi-processor scalability problem!”
2011: Latest line, the Gx series, is released
Traditional Architectures Aren’t Scalable
Most Multi-Core Chips Stop Around 8 Cores
Bus Interconnect
▪ Creates a Bottleneck for Main Memory (MM) Access
▪ Consumes Chip Area & Power
On-Chip Memory Limits
Software Support
▪ Efficient API Development is Challenging
▪ Parallel Languages and Programmers are Needed
On-Chip Communication is Fast!
Reduced Overheads
Finer Grain Size
On-Chip Network Footprint is Small!
Natural Tiled Connections
2-D Mesh Suits 2-D Substrate
Create a Basic Modular Unit
Homogeneous Across Chip
Known as a Tile
▪ Full-Featured Processor Core
▪ Processor Engine
▪ Cache Engine
▪ Switch Engine
▪ Capable of Running an OS
Basic Look Inside a Tile
Processor Engine
64-bit VLIW Architecture
▪ 3 Execution Pipelines
ALU, Flow Control, LD/ST
Cache Engine
Dynamic Distributed Cache
▪ Shared L2 Caches (L3)
Switch Engine
Direct Neighbor Connections
I/O Connections on Periphery
Detailed Look Inside a Tile
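The switch engine's direct neighbor connections and periphery I/O can be sketched as a small 2-D mesh model. This is only an illustration: the 6x6 grid size and the function names `neighbors`, `on_periphery`, and `hops` are assumptions made here and do not come from the slides.

```python
# Sketch of a 2-D mesh of tiles; grid size and names are hypothetical.
WIDTH, HEIGHT = 6, 6  # assumed grid dimensions, for illustration only


def neighbors(x, y, width=WIDTH, height=HEIGHT):
    """Return the tiles directly wired to tile (x, y) in a 2-D mesh."""
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(cx, cy) for cx, cy in candidates
            if 0 <= cx < width and 0 <= cy < height]


def on_periphery(x, y, width=WIDTH, height=HEIGHT):
    """True for edge tiles, where the slides place the I/O connections."""
    return x in (0, width - 1) or y in (0, height - 1)


def hops(src, dst):
    """Minimum number of neighbor-to-neighbor links between two tiles."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])
```

In this model an interior tile has four neighbors and a corner tile has two, so adding tiles only adds short local links rather than lengthening a shared bus.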
Networks are easy!
Communication is cheap!
Leverage Multiple Independent Networks
1) How many networks are needed?
2) What functionalities do the networks have?
How are the message types and communications defined?

Implicit Message Passing (1)
Implicit Messages through…
▪ Tile-to-tile shared address space: non-uniform / distributed cache access (NUCA)
▪ Shared address space in off-chip / main memory: uniform memory access (UMA)

Explicit Message Passing
Messages (2)
Streaming Data
▪ Small Buffers (3a)
▪ Large Buffers (3b)
▪ Special Case: High Performance Streaming (3c)
Special Case: IO Messages, System Traffic (4)

Message Types:
1) Implicit
2) Message Passing
3) Streaming Data
a) Small stream
b) Large stream
c) Large/Continuous
4) System Level & IO

Dedicated Networks:
1) MDN
2) TDN
3) UDN
4) STN
5) IDN

5 Independent Hardware Networks:
▪ Memory Dynamic Network (MDN)
▪ Tile Dynamic Network (TDN)
▪ User Dynamic Network (UDN)
▪ Static Network (STN)
▪ I/O Dynamic Network (IDN)
Which minimize overheads for all desired forms of communication
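The decision tree on these slides can be written as a small classifier. This is a minimal sketch: the `Traffic` fields, the function name `classify`, and the order in which the checks run are assumptions made for illustration. Only the message-type labels (1, 2, 3a, 3b, 3c, 4) and the network names come from the slides.

```python
from dataclasses import dataclass

# Abbreviations and full names as listed on the slides.
NETWORKS = {
    "MDN": "Memory Dynamic Network",
    "TDN": "Tile Dynamic Network",
    "UDN": "User Dynamic Network",
    "STN": "Static Network",
    "IDN": "I/O Dynamic Network",
}


@dataclass(frozen=True)
class Traffic:
    """Hypothetical description of one communication pattern."""
    explicit: bool = False       # explicit vs. implicit message passing
    streaming: bool = False      # streaming data vs. discrete messages
    large_buffers: bool = False  # large vs. small stream buffers
    continuous: bool = False     # special case: high-performance streaming
    system_or_io: bool = False   # special case: I/O messages, system traffic


def classify(t):
    """Return the slide's message-type label for a traffic pattern."""
    if t.system_or_io:
        return "4"   # System Level & IO
    if not t.explicit:
        return "1"   # Implicit (shared address space)
    if not t.streaming:
        return "2"   # Message Passing
    if t.continuous:
        return "3c"  # Large/Continuous stream
    return "3b" if t.large_buffers else "3a"
```

For example, `classify(Traffic(explicit=True, streaming=True))` falls through to the small-stream case, label 3a.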
Parallel Processing in Embedded Domain
Network
▪ Lossless Packet Capture
▪ Intrusion Detection & Prevention
Multimedia
▪ Video Conferencing
▪ IP Surveillance
Cloud
▪ In-Memory Caching
▪ Server Load Balancing
Numerous Evaluations
Single-Core Performance
▪ CoreMark Score
Parallelized Performance
▪ Information Fusion
▪ Gaussian Elimination
▪ MemCached
Comparisons of SMPs & Many-Core
Evaluates Single-Core Performance
4 Algorithms
1 Final Score
[Figure: Single-Core Single Thread CoreMark Comparison; vertical axis: CoreMark Score]
Tilera’s Processors Feature:
VLIW Architecture
3 Pipelines
64-bit Instr. Words
All or None Exec.
Embedded Wireless Sensor Networks
Cluster Heads Receive from 10 Sensors
Head Node Performs Reduction
▪ Moving Average Filter
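The reduction step can be sketched as a moving average filter over incoming sensor readings. This is a minimal sketch, not the benchmark's actual code: the window size, the function names, and the choice to average across sensors before filtering are all assumptions made for illustration.

```python
from collections import deque


def moving_average(samples, window=3):
    """Yield the mean of the most recent `window` samples."""
    recent = deque(maxlen=window)  # old samples fall out automatically
    for sample in samples:
        recent.append(sample)
        yield sum(recent) / len(recent)


def fuse(sensor_streams, window=3):
    """Reduce several sensor streams to one filtered stream.

    Assumption: each time step is first averaged across sensors,
    then smoothed over time with the moving average filter.
    """
    per_step = (sum(step) / len(step) for step in zip(*sensor_streams))
    return list(moving_average(per_step, window))
```

A cluster head with 10 sensors would pass a list of 10 equal-length reading lists to `fuse`. Each cluster head's reduction is independent of the others, so the work can be spread across tiles.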
Results Vary Based on Application
Integer-Based Arithmetic
Floating-Point Intensive
[Figures: Information Fusion Application; Gaussian Elimination Application]
Why?
Tiles Lack a Dedicated Floating-Point Unit!
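To see why Gaussian elimination stresses floating-point hardware, a textbook version is sketched below. It is not the benchmark's implementation; the function name `gaussian_solve` and the use of partial pivoting are choices made here for illustration. The innermost loop is a floating-point multiply-subtract that runs on the order of n³ times for an n×n system.

```python
def gaussian_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]   # floating-point divide
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]  # multiply-subtract
    # Back substitution, from the last row upwards.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x
```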
Distributed Memory Caching System
Creates a Virtual Memory Pool
Used for Key-Value Stores
Designed to Alleviate Database Load
Currently Implemented by…
Social Media Giants
▪ Facebook, Twitter, and Zynga
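How an in-memory key-value cache alleviates database load can be sketched with a single-node toy model. This is not Memcached's API, and the real system is distributed across machines. The class `KeyValueCache` and the stand-in database function `slow_lookup` are hypothetical names used only for illustration.

```python
class KeyValueCache:
    """In-memory key-value cache placed in front of a slower database."""

    def __init__(self, load_from_database):
        self._store = {}
        self._load = load_from_database  # hypothetical slow lookup
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self._load(key)  # only misses reach the database
        self._store[key] = value
        return value


db_calls = []  # records every request that reaches the "database"


def slow_lookup(key):
    """Stand-in for a database query."""
    db_calls.append(key)
    return key.upper()


cache = KeyValueCache(slow_lookup)
results = [cache.get("user:1"), cache.get("user:1"), cache.get("user:2")]
```

Repeated reads of the same key are served from memory, so the database sees each key at most once in this sketch.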
For a Fixed Memory Footprint
▪ Tilera Achieves 3.35x Throughput @ Less Power
▪ Better Performance per Watt
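The performance-per-watt claim follows from the throughput figure. As a sketch, write T for throughput and P for power, with subscripts for Tilera and the baseline system. These symbols are introduced here for illustration; the slide gives only the 3.35x figure and "less power".

```latex
\frac{T_{\text{Tilera}} / P_{\text{Tilera}}}{T_{\text{base}} / P_{\text{base}}}
  = \underbrace{\frac{T_{\text{Tilera}}}{T_{\text{base}}}}_{3.35}
    \cdot \frac{P_{\text{base}}}{P_{\text{Tilera}}}
  > 3.35
  \quad \text{since } P_{\text{Tilera}} < P_{\text{base}}.
```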
The Tile Architecture Exhibits…
Superior Scalability
▪ Modular Design
▪ Low Cost of On-Chip Communication
▪ Exploiting a Variety of Task Grain Sizes
▪ ILP and TLP
High Performance per Watt
▪ Relatively Low Clock Speeds
▪ Idle Mode for Unused Tiles
▪ Reducing Costs of Web Datacenters