CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy...
![Page 1: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/1.jpg)
CS668 - Lecture 2 - Sept. 30
Today’s topics
• Parallel Architectures (Chapter 2)
• Memory Hierarchy
• Busses and Switched Networks
• Interconnection Network Topologies
• Multiprocessors / Multicomputers
• Flynn’s Taxonomy
• Analysis of Interconnection Networks
![Page 2: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/2.jpg)
Theoretic Computer Architectures
• Turing Machine
• Von Neumann Architecture
• Fetch/Execute Cycle
• Memory Models
• RAM model
• PRAM model extension
• Shared Memory vs. Distributed Shared Memory vs. Distributed Memory
![Page 3: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/3.jpg)
Processors and the Memory Hierarchy
• Registers (1 clock cycle, 100s of bytes)
• 1st level cache (3-5 clock cycles, 100s of KBytes)
• 2nd level cache (~10 clock cycles, MBytes)
• Main memory (~100 clock cycles, GBytes)
• Disk (milliseconds, 100 GB and up)
[Figure: CPU containing registers, separate 1st level instruction and data caches, and a unified 2nd level cache (instructions & data)]
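The latencies above can be combined into an expected cost per memory access. The sketch below is only illustrative: the cycle counts are taken from the slide (4 cycles is picked from the 3-5 range), while the hit rates passed in are hypothetical values, not measurements from the lecture.

```python
# Latencies in clock cycles, taken from the slide above.
L1_CYCLES = 4      # picked from the 3-5 cycle range
L2_CYCLES = 10
MEM_CYCLES = 100


def average_access_cycles(l1_hit_rate, l2_hit_rate):
    """Expected cycles per access for two cache levels in front of main memory."""
    # Cost paid on an L1 miss: look in L2, and go to main memory if L2 also misses.
    l1_miss_cost = L2_CYCLES + (1.0 - l2_hit_rate) * MEM_CYCLES
    return L1_CYCLES + (1.0 - l1_hit_rate) * l1_miss_cost


# Hypothetical hit rates (assumed for illustration): 95% in L1, 80% in L2.
print(average_access_cycles(0.95, 0.80))
```

With these assumed hit rates the average stays close to the 1st level cache latency, which is the point of the hierarchy.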
![Page 4: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/4.jpg)
IBM Dual Core
From Intel® 64 and IA-32 Architectures Optimization Reference Manual: http://www.intel.com/design/processor/manuals/248966.pdf
![Page 5: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/5.jpg)
Shared Memory Multiprocessor
• One or more memories
• Global address space (all system memory visible to all processors)
• Transfer of data between processors is usually implicit: just read from (write to) a given address (OpenMP)
• Complex cache-coherency protocols to maintain consistency between processors
[Figure: CPUs and memories connected through an interconnection network. (UMA) Uniform-memory-access shared-memory system]
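A minimal sketch of implicit communication through a shared address space. Python threads stand in for the processors here (an assumption made for illustration; the slide names OpenMP), and `parallel_sum` is a hypothetical helper. No data is sent anywhere: every thread simply reads `values` and updates `total` at the same addresses.

```python
import threading


def parallel_sum(values, num_threads=4):
    """Sum a list with several threads that all see the same memory."""
    total = [0]                  # shared location, visible to every thread
    lock = threading.Lock()      # serializes updates to the shared total

    def worker(tid):
        # Each thread reads its slice of the shared list directly.
        partial = sum(values[tid::num_threads])
        with lock:
            total[0] += partial  # write to the shared location

    threads = [threading.Thread(target=worker, args=(tid,))
               for tid in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


print(parallel_sum(list(range(100))))  # prints 4950
```

The lock is a software stand-in for the consistency problem; it does not model hardware cache coherency.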
![Page 6: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/6.jpg)
Distributed Shared Memory
• Single address space with implicit communication
• Hardware support for read/write to non-local memories, cache coherency
• Latency for a memory operation is greater when accessing non-local data than when accessing data within a CPU’s own memory
[Figure: CPU/memory pairs connected through an interconnection network. (NUMA) Non-uniform-memory-access shared-memory system]
![Page 7: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/7.jpg)
Distributed Memory / Message Passing
• Each processor has access to its own memory only
• Data transfer between processors is explicit: user calls message passing functions
• Common libraries for message passing
– MPI, PVM
• User has complete control/responsibility for data placement and management
[Figure: CPU/memory pairs connected through an interconnection network]
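The explicit send/receive style can be sketched without MPI or PVM. In the example below, threads and `queue.Queue` mailboxes simulate processors and the network; this is an assumption for illustration, not the MPI API, and `ring_sum` is a hypothetical name. Each simulated processor uses only its own value and whatever arrives in its mailbox.

```python
import queue
import threading


def ring_sum(local_values):
    """Pass a running sum around a ring of simulated processors."""
    p = len(local_values)
    inbox = [queue.Queue() for _ in range(p)]  # one mailbox per processor
    done = queue.Queue()                       # carries the final answer out

    def processor(rank):
        mine = local_values[rank]              # private data of this processor
        if rank == 0:
            inbox[(rank + 1) % p].put(mine)    # explicit send to the neighbor
            done.put(inbox[0].get())           # explicit receive of the full sum
        else:
            running = inbox[rank].get()        # explicit receive
            inbox[(rank + 1) % p].put(running + mine)

    threads = [threading.Thread(target=processor, args=(r,)) for r in range(p)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done.get()


print(ring_sum([1, 2, 3, 4]))  # prints 10
```

Data placement is entirely the programmer's job: nothing moves unless a `put` and a matching `get` are written.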
![Page 8: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/8.jpg)
Hybrid Systems
• Distributed memory system with multiprocessor shared memory nodes.
• Most common architecture for current generation of parallel machines
[Figure: three nodes attached to an interconnection network; each node has several CPUs sharing one memory and a network interface]
![Page 9: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/9.jpg)
Flynn’s Taxonomy (figure 2.20 from Quinn)
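• SISD (single instruction stream, single data stream): Uniprocessor
• SIMD (single instruction stream, multiple data streams): Processor arrays, pipelined vector processors
• MISD (multiple instruction streams, single data stream): Systolic array
• MIMD (multiple instruction streams, multiple data streams): Multiprocessors, multicomputers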
![Page 10: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/10.jpg)
Analysis of Switch Network Topologies
• View switched network as a graph
– n - Vertices = processors or switches
– m - Edges = communication paths
• Two kinds of topologies
– Direct - ratio of switches to processors is 1:1
– Indirect - ratio is d:1
![Page 11: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/11.jpg)
Evaluating Switch Topologies
• Diameter
• Bisection width
• Number of edges / node (d = degree)
• Constant edge length? (yes/no)
– Layout area/wire length
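Two of these criteria are easy to compute once the network is stored as a graph. The sketch below assumes a connected, undirected network given as an adjacency dictionary; the function names are hypothetical. Bisection width is left out because finding a minimum bisection of an arbitrary graph is expensive.

```python
from collections import deque


def eccentricity(adj, start):
    """Largest hop count from start to any other vertex (breadth-first search)."""
    dist = {start: 0}
    frontier = deque([start])
    while frontier:
        u = frontier.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                frontier.append(v)
    return max(dist.values())


def diameter(adj):
    """Largest shortest-path distance between any two vertices."""
    return max(eccentricity(adj, u) for u in adj)


def max_degree(adj):
    """Largest number of edges attached to a single vertex."""
    return max(len(neighbors) for neighbors in adj.values())


# Example: a ring of 6 switches.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(diameter(ring), max_degree(ring))  # prints 3 2
```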
![Page 12: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/12.jpg)
2-D Mesh Network
• Direct topology
• Switches arranged into a 2-D lattice
• Communication allowed only between neighboring switches
• Variants allow wraparound connections between switches on edge of mesh
![Page 13: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/13.jpg)
2-D Meshes
![Page 14: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/14.jpg)
Evaluating 2-D Meshes
• Diameter: Θ(n^{1/2})

• Bisection width: Θ(n^{1/2})
• Number of edges per switch: 4
• Constant edge length? Yes
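A small check of the square-root growth, assuming a square mesh without wraparound links and an even number of switches per side. The helper names are hypothetical. The cut counted here is the one between the two middle columns, which splits the mesh into equal halves.

```python
from itertools import product


def mesh_edges(side):
    """Edges of a side x side mesh without wraparound connections."""
    edges = []
    for r in range(side):
        for c in range(side):
            if c + 1 < side:
                edges.append(((r, c), (r, c + 1)))  # link to right neighbor
            if r + 1 < side:
                edges.append(((r, c), (r + 1, c)))  # link to lower neighbor
    return edges


def mesh_diameter(side):
    """Shortest paths in a mesh have Manhattan length, so take the largest one."""
    nodes = list(product(range(side), repeat=2))
    return max(abs(r1 - r2) + abs(c1 - c2)
               for (r1, c1) in nodes for (r2, c2) in nodes)


def middle_cut_size(side):
    """Number of edges crossing the cut between the two middle columns."""
    half = side // 2
    return sum(1 for (_, c1), (_, c2) in mesh_edges(side)
               if (c1 < half) != (c2 < half))


# n = 16 switches arranged 4 x 4.
print(mesh_diameter(4), middle_cut_size(4))  # prints 6 4
```

For n = 16 the diameter is 2(√n - 1) = 6 and the middle cut has √n = 4 edges, both proportional to n^{1/2}.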
![Page 15: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/15.jpg)
Binary Tree Network
• Indirect topology
• n = 2^d processor nodes, n - 1 switches
![Page 16: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/16.jpg)
Evaluating Binary Tree Network
• Diameter: 2 log n
• Bisection width: 1
• Edges / node: 3
• Constant edge length? Yes/No?
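The 2 log n diameter can be checked by brute force. The sketch assumes a heap-style numbering (an illustration choice, not from the slides): switches are numbered 1 to n - 1, processors are the leaves n to 2n - 1, and the parent of node x is x // 2.

```python
def tree_distance(a, b):
    """Hops between two nodes of a complete binary tree in heap numbering."""
    hops = 0
    while a != b:
        # The larger index is never shallower, so move it up to its parent.
        if a > b:
            a //= 2
        else:
            b //= 2
        hops += 1
    return hops


def tree_diameter(n):
    """Largest distance between two of the n processor (leaf) nodes."""
    leaves = range(n, 2 * n)
    return max(tree_distance(a, b) for a in leaves for b in leaves)


print(tree_diameter(8))  # prints 6, which is 2 * log2(8)
```

The longest routes run between leaves on opposite sides of the root, so they all cross the root's two links, which is why the bisection width is so small.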
![Page 17: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/17.jpg)
Hypertree Network
• Indirect topology
• Shares low diameter of binary tree
• Greatly improves bisection width
• From “front” looks like k-ary tree of height d
• From “side” looks like upside down binary tree of height d
![Page 18: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/18.jpg)
Hypertree Network
![Page 19: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/19.jpg)
Evaluating 4-ary Hypertree
• Diameter: log n
• Bisection width: n / 2
• Edges / node: 6
• Constant edge length? No
![Page 20: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/20.jpg)
Butterfly Network
• Indirect topology
• n = 2^d processor nodes connected by n(log n + 1) switching nodes
[Figure: butterfly network with processors 0-7 and switching nodes labeled (rank, column), ranks 0-3 and columns 0-7]
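A sketch that builds the switching nodes and checks the n(log n + 1) count. The wiring rule is an assumption, since the slide gives only the figure: node (rank, col) links to (rank - 1, col) and to (rank - 1, m), where m is col with its rank-th most significant bit inverted. This is one common textbook definition.

```python
def butterfly(d):
    """Switching nodes and links of a butterfly with n = 2**d columns."""
    n = 1 << d
    nodes = [(rank, col) for rank in range(d + 1) for col in range(n)]
    edges = []
    for rank in range(1, d + 1):
        flip = 1 << (d - rank)   # mask for the rank-th most significant bit
        for col in range(n):
            edges.append(((rank, col), (rank - 1, col)))         # straight link
            edges.append(((rank, col), (rank - 1, col ^ flip)))  # cross link
    return nodes, edges


nodes, edges = butterfly(3)
print(len(nodes))  # prints 32, which is n(log n + 1) for n = 8
```

Under this rule every switch on an interior rank has two links up and two links down, matching the 4 edges per node quoted for the butterfly.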
![Page 21: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/21.jpg)
Butterfly Network Routing
![Page 22: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/22.jpg)
Evaluating Butterfly Network
• Diameter: log n
• Bisection width: n / 2
• Edges per node: 4
• Constant edge length? No
![Page 23: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/23.jpg)
Hypercube
• Direct topology
• 2 x 2 x … x 2 mesh
• Number of nodes a power of 2
• Node addresses 0, 1, …, 2^k - 1
• Node i connected to k nodes whose addresses differ from i in exactly one bit position
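The one-bit rule maps directly onto XOR. A minimal sketch with hypothetical function names: flipping each of the k address bits gives the k neighbors, and the number of differing bits between two addresses gives the hop count between them.

```python
def hypercube_neighbors(i, k):
    """Addresses that differ from i in exactly one of the k bit positions."""
    return [i ^ (1 << bit) for bit in range(k)]


def hypercube_distance(a, b):
    """Minimum hops between two nodes: the number of differing address bits."""
    return bin(a ^ b).count("1")


print(hypercube_neighbors(0b0101, 4))      # prints [4, 7, 1, 13]
print(hypercube_distance(0b0000, 0b1111))  # prints 4
```

The farthest pair of nodes differs in all k bits, so the diameter is k = log n.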
![Page 24: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/24.jpg)
Hypercube Addressing
[Figure: 4-dimensional hypercube with its 16 nodes labeled 0000 through 1111]
![Page 25: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/25.jpg)
Evaluating Hypercube Network
• Diameter: log n
• Bisection width: n / 2
• Edges per node: log n
• Constant edge length? No
![Page 26: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/26.jpg)
Shuffle-exchange
• Direct topology
• Number of nodes a power of 2
• Nodes have addresses 0, 1, …, 2^k - 1
• Two outgoing links from node i
– Shuffle link to node LeftCycle(i)
– Exchange link to node [xor (i, 1)]
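Both link rules are short bit operations. The sketch below assumes LeftCycle means a left rotation of the k-bit address, which is the usual reading; `left_cycle` and `exchange` are names chosen for illustration.

```python
def left_cycle(i, k):
    """Rotate the k-bit address i left by one position (shuffle link)."""
    mask = (1 << k) - 1
    return ((i << 1) & mask) | (i >> (k - 1))


def exchange(i):
    """Flip the least significant address bit (exchange link)."""
    return i ^ 1


# k = 3: node 5 (101) shuffles to node 3 (011); node 6 exchanges with node 7.
print(left_cycle(5, 3), exchange(6))  # prints 3 7
```

Nodes 0 and 2^k - 1 shuffle to themselves, since rotating all zeros or all ones changes nothing.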
![Page 27: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/27.jpg)
Shuffle-exchange Illustrated
[Figure: shuffle-exchange network with 8 nodes numbered 0 through 7]
![Page 28: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/28.jpg)
Shuffle-exchange Addressing
[Figure: shuffle-exchange network with 16 nodes labeled 0000 through 1111]
![Page 29: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/29.jpg)
Evaluating Shuffle-exchange
• Diameter: 2 log n - 1
• Bisection width: n / log n
• Edges per node: 2
• Constant edge length? No
![Page 30: CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.](https://reader035.fdocuments.in/reader035/viewer/2022062314/56649e615503460f94b5ca8a/html5/thumbnails/30.jpg)
Comparing Networks
• All have logarithmic diameter except 2-D mesh
• Hypertree, butterfly, and hypercube have bisection width n / 2
• All have constant edges per node except hypercube
• Only 2-D mesh keeps edge lengths constant as network size increases
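The comparison can be tabulated for a concrete size. The sketch plugs n into the figures quoted on the evaluation slides. Two entries are assumptions: the mesh row uses the exact values for a square mesh without wraparound, 2(√n - 1) and √n, which are consistent with the Θ(n^{1/2}) on the slide, and the shuffle-exchange bisection width n / log n is treated as an approximate value. The function expects n to be a power of 4 so that every topology is well defined.

```python
import math


def metrics(n):
    """Diameter, bisection width and edges per node for n processors."""
    lg = int(math.log2(n))   # log n, base 2
    root = math.isqrt(n)     # side length of the square mesh
    return {
        "2-D mesh":         {"diameter": 2 * (root - 1), "bisection": root, "degree": 4},
        "binary tree":      {"diameter": 2 * lg, "bisection": 1, "degree": 3},
        "4-ary hypertree":  {"diameter": lg, "bisection": n // 2, "degree": 6},
        "butterfly":        {"diameter": lg, "bisection": n // 2, "degree": 4},
        "hypercube":        {"diameter": lg, "bisection": n // 2, "degree": lg},
        "shuffle-exchange": {"diameter": 2 * lg - 1, "bisection": n / lg, "degree": 2},
    }


for name, row in metrics(64).items():
    print(f"{name:17s} diameter={row['diameter']:3d} "
          f"bisection={row['bisection']:6.1f} degree={row['degree']}")
```

For n = 64 the mesh diameter (14) already exceeds every logarithmic one, and the hypercube is the only row whose degree depends on n.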