CSCI-455/522
Introduction to High Performance Computing
Lecture 2
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd Edition, by B. Wilkinson & M. Allen, © 2004 Pearson Education Inc. All rights reserved. 1.2
Types of Parallel Computers
Two principal types:
• Shared memory multiprocessor
• Distributed memory multicomputer
Shared vs. Distributed Memory
[Figure: shared memory (processors P attached to one Memory over a bus) vs. distributed memory (processor/memory pairs P, M attached to a network)]
Shared memory - single address space. All processors have access to a pool of shared memory. (Ex: SGI Origin, Sun E10000)
Distributed memory - each processor has its own local memory. Processors must pass messages to exchange data. (Ex: CRAY T3E, IBM SP, clusters)
Shared Memory Multiprocessor
Conventional Computer
Consists of a processor executing a program stored in a (main) memory:
[Figure: main memory and processor; instructions flow to the processor, data flows to or from the processor]
Each main memory location is located by its address. Addresses start at 0 and extend to 2^b - 1 when there are b bits (binary digits) in the address.
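As a quick check of the address-range statement above, here is a minimal sketch that computes the highest address for a given address width. The function name `max_address` is an illustrative assumption and does not come from the slides.

```python
def max_address(b: int) -> int:
    """Highest address reachable with b address bits (addresses start at 0)."""
    if b < 0:
        raise ValueError("number of address bits must be non-negative")
    return 2**b - 1


if __name__ == "__main__":
    for bits in (8, 16, 32):
        print(bits, "address bits -> addresses 0 ..", max_address(bits))
```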
Shared Memory Multiprocessor System
Natural way to extend the single processor model - have multiple processors connected to multiple memory modules, such that each processor can access any memory module:
[Figure: processors connected through an interconnection network to memory modules forming one address space]
Simplistic View of a Small Shared Memory Multiprocessor
Examples:
• Dual Pentiums
• Quad Pentiums
[Figure: processors and shared memory attached to a common bus]
Quad Pentium Shared Memory Multiprocessor
[Figure: four processors, each with L1 cache, L2 cache, and bus interface, attached to the processor/memory bus, together with a memory controller (shared memory) and an I/O interface (I/O bus)]
Programming Shared Memory Multiprocessors
• Threads - programmer decomposes program into individual parallel sequences (threads), each able to access variables declared outside the threads.
Example: Pthreads
• Sequential programming language with preprocessor compiler directives to declare shared variables and specify parallelism.
Example: OpenMP - industry standard - needs an OpenMP compiler
• Sequential programming language with added syntax to declare shared variables and specify parallelism.
Example: UPC (Unified Parallel C) - needs a UPC compiler.
• Parallel programming language with syntax to express parallelism - compiler creates executable code for each processor (not now common)
• Sequential programming language with a parallelizing compiler that converts it into parallel executable code (also not now common)
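The thread-based approach listed first can be sketched as follows. This is only an illustration of the programming model, with Python's `threading` module standing in for Pthreads; the names `data`, `partial_sums`, `worker`, and `parallel_sum` are assumptions made for the example. In standard CPython builds the global interpreter lock prevents a real speedup here, so the sketch shows how threads share variables rather than how fast they run.

```python
import threading

# Shared data: declared outside the threads, so every thread can access it.
data = list(range(1, 101))
NUM_THREADS = 4
# One result slot per thread, so no two threads ever write the same element.
partial_sums = [0] * NUM_THREADS


def worker(tid: int) -> None:
    """Sum a strided slice of the shared list and store it in this thread's slot."""
    partial_sums[tid] = sum(data[tid::NUM_THREADS])


def parallel_sum() -> int:
    """Start the threads, wait for all of them, then combine the partial sums."""
    threads = [threading.Thread(target=worker, args=(t,)) for t in range(NUM_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread before reading the results
    return sum(partial_sums)


if __name__ == "__main__":
    print(parallel_sum())
```

Each thread writes only its own slot of `partial_sums`, so this particular sketch needs no lock; threads that update the same variable would need one.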
Shared Memory: UMA vs. NUMA
[Figure: UMA - processors (P) sharing one memory over a bus; NUMA - two bus-based processor/memory groups joined by a network]
Uniform memory access (UMA): Each processor has uniform access to memory. Also known as symmetric multiprocessors, or SMPs (Sun E10000)
Non-uniform memory access (NUMA): Time for memory access depends on location of data. Local access is faster than non-local access. Easier to scale than SMPs (SGI Origin)
Distributed Memory / Message-Passing Multicomputers
Message-Passing Multicomputer
Complete computers connected through an interconnection network:
[Figure: computers, each a processor with local memory, exchanging messages through an interconnection network]
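The message-passing model on this slide can be imitated inside one process. The sketch below is a simulation under stated assumptions: threads play the role of complete computers, a `queue.Queue` stands in for the interconnection network, and the names `worker_node` and `distributed_sum` are hypothetical. Each simulated node touches only its own local data and communicates solely by sending a message.

```python
import queue
import threading


def worker_node(rank: int, local_data: list[int], to_root: queue.Queue) -> None:
    """Simulated computer: reads only its own local memory, then sends one message."""
    to_root.put((rank, sum(local_data)))


def distributed_sum(values: list[int], num_nodes: int = 4) -> int:
    """Scatter values across simulated nodes and gather their partial sums as messages."""
    to_root: queue.Queue = queue.Queue()  # stands in for the interconnection network
    nodes = []
    for rank in range(num_nodes):
        local_data = values[rank::num_nodes]  # slicing copies: this is the node's local memory
        t = threading.Thread(target=worker_node, args=(rank, local_data, to_root))
        nodes.append(t)
        t.start()
    total = 0
    for _ in range(num_nodes):
        _rank, partial = to_root.get()  # blocking receive of one message
        total += partial
    for t in nodes:
        t.join()
    return total


if __name__ == "__main__":
    print(distributed_sum(list(range(1, 101))))
```

On a real multicomputer the queue would be replaced by a message-passing library, and each node would be a separate machine with physically separate memory.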
Distributed Memory: MPPs vs. Clusters
• Processor-memory nodes are connected by some type of interconnect network
  – Massively Parallel Processor (MPP): tightly integrated, single system image.
  – Cluster: individual computers connected by software
[Figure: CPU/MEM nodes attached to an interconnect network]
Clusters
• Similar to MPPs
  – Commodity processors and memory
    • Processor performance must be maximized
  – Memory hierarchy includes remote memory
  – No shared memory - message passing
    • Communication overhead must be minimized
• Different from MPPs
  – All commodity, including interconnect and OS
  – Multiple independent systems: more robust
  – Separate I/O systems
Interconnection Networks
• Limited and exhaustive interconnections
• 2- and 3-dimensional meshes
• Hypercube (not now common)
• Using Switches:
  – Crossbar
  – Trees
  – Multistage interconnection networks
Communications Networks
• Custom
  – Many vendors have custom interconnects that provide high performance for their systems, especially MPPs
  – CRAY T3E interconnect is the fastest for MPPs: lowest latency, highest bandwidth
• Commodity
  – Used in some MPPs and all clusters
  – Myrinet, Gigabit Ethernet, Fast Ethernet, etc.
Types of Interconnects
• Fully connected
  – not feasible
• Array and torus
  – Intel Paragon (2D array), CRAY T3E (3D torus)
• Crossbar
  – IBM SP (8 nodes)
• Hypercube
  – SGI Origin 2000 (hypercube), Meiko CS-2 (fat tree)
• Combinations of some of the above
  – IBM SP (crossbar & fully connected for 80 nodes)
  – IBM SP (fat tree for > 80 nodes)
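One way to see why a fully connected network is listed as not feasible is to count links. The sketch below uses the standard link-count formulas for three topologies; the function names are illustrative assumptions, and the mesh is assumed to have no wraparound links.

```python
def fully_connected_links(n: int) -> int:
    """Every pair of n nodes gets its own link: n(n-1)/2."""
    return n * (n - 1) // 2


def mesh_2d_links(rows: int, cols: int) -> int:
    """Links in a rows x cols mesh without wraparound (horizontal plus vertical)."""
    return rows * (cols - 1) + cols * (rows - 1)


def hypercube_links(d: int) -> int:
    """A d-dimensional hypercube has 2^d nodes with d links each, every link shared by two nodes."""
    return d * 2 ** (d - 1) if d > 0 else 0


if __name__ == "__main__":
    # Compare the three topologies for 64 nodes.
    print("fully connected:", fully_connected_links(64))
    print("8 x 8 mesh:     ", mesh_2d_links(8, 8))
    print("6-d hypercube:  ", hypercube_links(6))
```

The fully connected count grows with the square of the node count, while the mesh and hypercube counts grow far more slowly.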
Two-dimensional Array (Mesh)
Also three-dimensional - used in some large high performance systems.
[Figure: two-dimensional mesh; labels: Links, Computer/processor]
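The mesh connectivity can be made concrete with a small helper that lists the nodes directly linked to a given node. This is a sketch under the assumption of a mesh without wraparound links (a torus would add them); `mesh_neighbors` is a hypothetical name.

```python
def mesh_neighbors(row: int, col: int, rows: int, cols: int) -> list[tuple[int, int]]:
    """Nodes directly linked to (row, col) in a rows x cols mesh without wraparound."""
    if not (0 <= row < rows and 0 <= col < cols):
        raise ValueError("node lies outside the mesh")
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    # Keep only the candidates that fall inside the mesh boundary.
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


if __name__ == "__main__":
    print(mesh_neighbors(0, 0, 4, 4))  # corner node
    print(mesh_neighbors(1, 1, 4, 4))  # interior node
```

Corner nodes have two links, edge nodes three, and interior nodes four.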
Three-dimensional Hypercube
[Figure: three-dimensional hypercube with nodes labeled 000 to 111]
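The binary node labels carry the structure of the hypercube: two nodes are linked exactly when their labels differ in one bit. The sketch below lists a node's neighbors and builds a path by correcting differing bits one dimension at a time. The routing order (lowest bit first) is an assumption chosen for the example, not something the slides specify, and the function names are hypothetical.

```python
def hypercube_neighbors(node: int, d: int) -> list[int]:
    """Neighbors of a node in a d-dimensional hypercube: flip each of the d bits once."""
    return [node ^ (1 << k) for k in range(d)]


def hypercube_route(src: int, dst: int, d: int) -> list[int]:
    """Path from src to dst that fixes differing bits from lowest to highest dimension."""
    path = [src]
    cur = src
    for k in range(d):
        if (cur ^ dst) & (1 << k):  # labels differ in bit k, so cross dimension k
            cur ^= 1 << k
            path.append(cur)
    return path


if __name__ == "__main__":
    print([format(n, "03b") for n in hypercube_neighbors(0b000, 3)])
    print([format(n, "03b") for n in hypercube_route(0b101, 0b110, 3)])
```

The path length equals the number of bit positions in which the two labels differ, so no route in a d-dimensional hypercube needs more than d links.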
Four-dimensional Hypercube
Hypercubes were popular in the 1980s but are not now.
[Figure: four-dimensional hypercube with nodes labeled 0000 to 1111]
Crossbar Switch
[Figure: crossbar switch; labels: Switches, Processors, Memories]
Tree
[Figure: tree network; labels: Root, Switch element, Links, Processors]
Multistage Interconnection Network
Example: Omega network
[Figure: Omega network with 8 inputs and 8 outputs, each labeled 000 to 111, built from 2 × 2 switch elements (straight-through or crossover connections)]
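To illustrate how a message finds its way through such a network, here is a minimal sketch that assumes the usual Omega construction: a perfect shuffle of the lines before each stage and destination-tag routing, where one bit of the destination address (most significant first) picks the upper or lower output of each 2 × 2 switch. The slide does not spell out the routing rule, so treat this as an assumption; `omega_route` is a hypothetical name.

```python
def omega_route(src: int, dst: int, n_bits: int = 3) -> list[str]:
    """Switch setting at each stage for a message from input src to output dst."""
    size = 1 << n_bits
    if not (0 <= src < size and 0 <= dst < size):
        raise ValueError("src and dst must fit in n_bits")
    pos = src
    settings = []
    for stage in range(n_bits):
        # Perfect shuffle before the stage: rotate the line number left by one bit.
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & (size - 1)
        # Destination-tag routing: the next destination bit (most significant
        # first) selects the upper (0) or lower (1) output of the switch.
        out_bit = (dst >> (n_bits - 1 - stage)) & 1
        # The low bit of pos is the switch input port; same port out means straight.
        settings.append("straight" if (pos & 1) == out_bit else "crossover")
        pos = (pos & ~1) | out_bit
    assert pos == dst  # after n_bits stages every bit has been replaced by a dst bit
    return settings


if __name__ == "__main__":
    print(omega_route(0b010, 0b110))
```

With 8 inputs there are three stages, so every route is described by three switch settings.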
Distributed Shared Memory
Making the main memory of a group of interconnected computers look as though it is a single memory with a single address space. Shared memory programming techniques can then be used.
[Figure: computers, each a processor with its part of the shared memory, exchanging messages through an interconnection network]
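The idea of one address space laid over several local memories can be sketched with a toy class. Everything here is an assumption for illustration: the class name, the block partitioning of addresses across nodes, and the direct list access that stands in for the messages a real system would send to the owning computer.

```python
class DistributedSharedMemory:
    """Toy single address space over per-node local memories (block partitioning assumed)."""

    def __init__(self, num_nodes: int, words_per_node: int) -> None:
        self.words_per_node = words_per_node
        # One private list per node plays the role of that node's local memory.
        self.local = [[0] * words_per_node for _ in range(num_nodes)]

    def locate(self, address: int) -> tuple[int, int]:
        """Map a global address to (owning node, offset in that node's local memory)."""
        if not 0 <= address < len(self.local) * self.words_per_node:
            raise IndexError("address outside the shared address space")
        return divmod(address, self.words_per_node)

    def write(self, address: int, value: int) -> None:
        node, offset = self.locate(address)
        self.local[node][offset] = value  # a real system would send a message to `node`

    def read(self, address: int) -> int:
        node, offset = self.locate(address)
        return self.local[node][offset]


if __name__ == "__main__":
    dsm = DistributedSharedMemory(num_nodes=4, words_per_node=8)
    dsm.write(19, 42)
    print(dsm.locate(19), dsm.read(19))
```

The caller uses plain addresses and never names a node, which is what allows shared memory programming techniques to be reused.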
Flynn’s Classifications
Flynn (1966) created a classification for computers based upon instruction streams and data streams:
– Single instruction stream-single data stream (SISD) computer
Single processor computer - single stream of instructions generated from program. Instructions operate upon a single stream of data items.
Single Instruction Stream-Multiple Data Stream (SIMD) Computer
• A specially designed computer - a single instruction stream from a single program, but multiple data streams exist. Instructions from program broadcast to more than one processor. Each processor executes same instruction in synchronism, but using different data.
• Developed because a number of important applications mostly operate upon arrays of data.
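The lockstep behavior described above can be imitated in software. The following sketch is a simulation only: a single list of instructions is broadcast to every data lane, and each lane applies the same instruction to its own value. The two-opcode instruction set and the name `simd_execute` are assumptions made for the example.

```python
def simd_execute(program: list[tuple[str, int]], lanes: list[int]) -> list[int]:
    """Broadcast each instruction to every lane; all lanes apply it in lockstep."""
    ops = {
        "add": lambda x, k: x + k,
        "mul": lambda x, k: x * k,
    }
    data = list(lanes)  # one data item per simulated processor
    for opcode, operand in program:  # the single instruction stream
        # Same instruction, different data: every lane executes this step together.
        data = [ops[opcode](x, operand) for x in data]
    return data


if __name__ == "__main__":
    print(simd_execute([("mul", 2), ("add", 1)], [1, 2, 3, 4]))
```

Real SIMD hardware performs each step on all lanes simultaneously; the list comprehension here only models that behavior sequentially.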
Multiple Instruction Stream-Multiple Data Stream (MIMD) Computer
General-purpose multiprocessor system - each processor has a separate program and one instruction stream is generated from each program for each processor. Each instruction operates upon different data.
Both the shared memory and the message-passing multiprocessors so far described are in the MIMD classification.
The Banking Analogy
• Tellers: Parallel Processors
• Customers: tasks
• Transactions: operations
• Accounts: data
Vector/Array
• Each teller/processor gets a very fine-grained task
• Use pipeline parallelism
• Good for handling batches when operations can be broken down into fine-grained stages
SIMD (Single-Instruction-Multiple-Data)
• All processors do the same things or idle
• Phase 1: data partitioning and distribution
• Phase 2: data-parallel processing
• Efficient for big, regular data-sets
Systolic Array
• Combination of SIMD and Pipeline parallelism
• 2-d array of processors with memory at the boundary
• Tighter coordination between processors
• Achieve very high speeds by circulating data among processors before returning to memory
MIMD (Multi-Instruction-Multiple-Data)
• Each processor (teller) operates independently
• Need synchronization mechanism
  – by message passing
  – or mutual exclusion (locks)
• Best suited for large-grained problems
• Less than data-flow parallelism
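In terms of the banking analogy, mutual exclusion keeps two tellers from updating the same account at once. The sketch below is a minimal illustration with Python threads as tellers; the names `Account` and `run_tellers` are hypothetical.

```python
import threading


class Account:
    """Shared data (an account) protected by a lock."""

    def __init__(self, balance: int = 0) -> None:
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount: int) -> None:
        # Only one teller at a time may run the read-modify-write on the balance.
        with self._lock:
            self.balance += amount


def run_tellers(num_tellers: int = 4, deposits_each: int = 1000) -> int:
    """Each teller (thread) independently makes deposits into one shared account."""
    account = Account()

    def teller() -> None:
        for _ in range(deposits_each):
            account.deposit(1)

    threads = [threading.Thread(target=teller) for _ in range(num_tellers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance


if __name__ == "__main__":
    print(run_tellers())
```

Without the lock, two tellers could read the same balance and one deposit could be lost; with it, the final balance always equals the number of deposits made.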
Networked Computers as a Computing Platform
• A network of computers became a very attractive alternative to expensive supercomputers and parallel computer systems for high-performance computing in the early 1990s.
• Several early projects. Notable:
– Berkeley NOW (network of workstations) project.
– NASA Beowulf project.
Key advantages:
• Very high performance workstations and PCs readily available at low cost.
• The latest processors can easily be incorporated into the system as they become available.
• Existing software can be used or modified.
Software Tools for Clusters
• Based upon Message Passing Parallel Programming:
• Parallel Virtual Machine (PVM) - developed in the late 1980s. Became very popular.
• Message-Passing Interface (MPI) - standard defined in the 1990s.
• Both provide a set of user-level libraries for message passing. Use with regular programming languages (C, C++, ...).
Beowulf Clusters*
• A group of interconnected “commodity” computers achieving high performance with low cost.
• Typically using commodity interconnects - high speed Ethernet, and Linux OS.
* Beowulf comes from the name given by the NASA Goddard Space Flight Center cluster project.
Cluster Interconnects
• Originally fast Ethernet on low cost clusters
• Gigabit Ethernet - easy upgrade path
More Specialized/Higher Performance
• Myrinet - 2.4 Gbits/sec - disadvantage: single vendor
• cLan
• SCI (Scalable Coherent Interface)
• QNet
• Infiniband - may be important as Infiniband interfaces may be integrated on next generation PCs
Dedicated cluster with a master node
[Figure: dedicated cluster; labels: User, External network, 2nd Ethernet interface, Master node, Switch, Up link, Compute nodes]