Lec 6 Chap. 13 Multiprocessors
13-1 Characteristics of Multiprocessors
Multiprocessor System = MIMD
An interconnection of two or more CPUs with memory and I/O equipment
» A single CPU with one or more IOPs is usually not counted as a multiprocessor system, unless the IOP has computational facilities comparable to a CPU
Computation can proceed in parallel in one of two ways
1) Multiple independent jobs can be made to operate in parallel
2) A single job can be partitioned into multiple parallel tasks
Classified by memory organization
1) Shared memory, or tightly coupled system
» Local memory + shared memory
» Supports a higher degree of interaction between tasks
2) Distributed memory, or loosely coupled system
» Local memory + message-passing scheme (packets or messages)
» Most efficient when the interaction between tasks is minimal
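The two memory organizations can be contrasted with a minimal Python sketch. It is an illustration under stated assumptions, not part of the lecture: threads stand in for processors, a lock-protected variable stands in for shared memory, and a `queue.Queue` stands in for the message-passing channel. The names `sum_shared_memory` and `sum_message_passing` are hypothetical.

```python
import queue
import threading


def _run_tasks(worker, chunks):
    """Start one thread per chunk and wait for all of them to finish."""
    threads = [threading.Thread(target=worker, args=(chunk,)) for chunk in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def sum_shared_memory(data, n_tasks):
    """Tightly coupled style: every task updates one shared total."""
    shared = {"total": 0}
    lock = threading.Lock()  # guards the shared writable data

    def worker(chunk):
        partial = sum(chunk)          # work on the task's own slice
        with lock:                    # interact through shared memory
            shared["total"] += partial

    _run_tasks(worker, [data[i::n_tasks] for i in range(n_tasks)])
    return shared["total"]


def sum_message_passing(data, n_tasks):
    """Loosely coupled style: each task sends one message with its result."""
    mailbox = queue.Queue()  # assumed stand-in for the message channel

    def worker(chunk):
        mailbox.put(sum(chunk))       # the only interaction is one message

    _run_tasks(worker, [data[i::n_tasks] for i in range(n_tasks)])
    return sum(mailbox.get() for _ in range(n_tasks))
```

Both functions partition a single job (summing a list) into `n_tasks` parallel tasks. The message-passing version exchanges only one message per task, which is the low-interaction case where a loosely coupled system works best.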
13-2 Interconnection Structure
Multiprocessor System Components
» CPU, IOP, Memory unit
Interconnection Components
1) Time-shared common bus
2) Multi-port memory
3) Crossbar switch
4) Multistage switching network
5) Hypercube system
Chap. 13 Multiprocessors, Computer System Architecture
Time-shared Common Bus
Time-shared single common bus system :
» Only one processor can communicate with the memory or another processor at any given time
» While one processor is communicating with the memory, all other processors are either busy with internal operations or must be idle waiting for the bus
Dual common bus system :
» System bus + Local bus
» Shared memory : the memory connected to the common system bus is shared by all processors
» System bus controller : links each local bus to the common system bus
[Fig: Time-shared common bus organization : memory unit, CPU 1, CPU 2, CPU 3, IOP 1, and IOP 2 on one common bus (tightly coupled system)]
Multi-port memory : multiple paths between processors and memory
» Advantage : high transfer rate can be achieved
» Disadvantage : expensive memory control logic / large number of cables & connectors
Crossbar Switch :
[Fig: Multi-port memory organization and crossbar switch : CPU 1 to CPU 4 connected to memory modules MM 1 to MM 4]
[Fig: Block diagram of crossbar switch : multiplexers and arbitration logic at each memory module select among the data, address, and control lines from CPU 1 to CPU 4 and drive the module's data, address, read/write, and memory enable inputs]
Crossbar Hierarchies
[Fig: Crossbar hierarchies : clusters interconnected by a crossbar; each cluster contains nodes; each node contains a PU, a CU, a network interface, I/O, and local memory]
Multistage Switching Network
Controls the communication between a number of sources and destinations
» Tightly coupled system : source is a PU, destination is an MM
» Loosely coupled system : source is a PU, destination is a PU
Basic components of a multistage switching network :
» Two-input, two-output interchange switch : Fig. 13-6
» 2 processors (P1 and P2) are connected through switches to 8 memory modules (000 - 111) : Fig. 13-7
» Omega Network : Fig. 13-8
» 2 x 2 interchange switches are used to build an N input x N output network topology
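How 2 x 2 interchange switches build an N x N network can be sketched in Python. This is a minimal sketch that assumes the usual omega-network layout: a perfect shuffle in front of every stage, and each stage steered by one bit of the destination address (0 selects the upper output, 1 the lower). The function name `omega_route` and the tuple layout are hypothetical, chosen only for this illustration.

```python
def omega_route(src, dst, n_bits):
    """Route input `src` to output `dst` in a 2**n_bits x 2**n_bits omega network.

    Returns one (stage, switch, output) tuple per stage, where output is
    0 for the upper switch output and 1 for the lower one.
    """
    size = 1 << n_bits
    if not (0 <= src < size and 0 <= dst < size):
        raise ValueError("src and dst must be valid port numbers")
    line = src
    hops = []
    for stage in range(n_bits):
        # Perfect shuffle between stages: rotate the line number left by one bit.
        line = ((line << 1) | (line >> (n_bits - 1))) & (size - 1)
        # Destination-tag routing: this stage uses one bit of dst, MSB first.
        out = (dst >> (n_bits - 1 - stage)) & 1
        line = (line & ~1) | out
        hops.append((stage, line >> 1, out))
    assert line == dst  # after n_bits stages the message sits on line dst
    return hops


# Example: processor 010 sends to memory module 101 in an 8 x 8 network.
path = omega_route(0b010, 0b101, 3)
```

The switch settings along the path are simply the destination bits 1, 0, 1. An 8 x 8 omega network needs 3 stages of 4 switches (12 switches), whereas an 8 x 8 crossbar needs 64 crosspoints.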
[Fig. 13-6: Operation of a 2 x 2 interchange switch : A connected to 0, A connected to 1, B connected to 0, B connected to 1]
[Fig. 13-7: 2 x 2 switches connecting P1 and P2 to memory modules 000 - 111]
[Fig. 13-8: Omega network]
Hypercube Interconnection : Fig. 13-9 (one-cube, two-cube, three-cube)
» Loosely coupled system
» Hypercube architecture : Intel iPSC ( n = 7, 128 nodes ; an n-cube has 2^n nodes )
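The hypercube structure can be sketched in a few lines of Python. The sketch assumes the standard addressing: each node of an n-cube has an n-bit binary address and is linked directly to the n nodes whose addresses differ from its own in exactly one bit. The function names are hypothetical, and correcting the differing bits from the lowest bit upward is only one of several valid routing orders.

```python
def hypercube_neighbors(node, n):
    """Return the n direct neighbors of `node` in an n-cube."""
    return [node ^ (1 << k) for k in range(n)]


def hypercube_route(src, dst, n):
    """Return the list of nodes visited on a path from src to dst.

    The XOR of the two addresses marks the axes on which they differ;
    each hop flips one of those bits (lowest bit first in this sketch).
    """
    path = [src]
    current = src
    for k in range(n):
        if (current ^ dst) & (1 << k):
            current ^= 1 << k
            path.append(current)
    return path


# Example in a three-cube: 010 XOR 001 = 011, so two hops are needed.
route = hypercube_route(0b010, 0b001, 3)   # [0b010, 0b011, 0b001]
```

The number of hops equals the number of 1 bits in `src XOR dst`, so no message in an n-cube needs more than n hops.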
13-3 Interprocessor Arbitration : Bus Control
Single Bus System : Address bus, Data bus, Control bus
Multiple Bus System : Memory bus, I/O bus, System bus
» System bus : the bus that connects CPUs, IOPs, and memory in a multiprocessor system (bus controller / arbiter)
Data transfer methods over the system bus
» Synchronous bus : each transfer is achieved by driving both units from a common clock source
» Asynchronous bus : each transfer is accompanied by handshaking control signals
© Korea Univ. of Tech. & Edu., Dept. of Info. & Comm.
System Bus : IEEE Standard 796 Multibus
» 86 signal lines
» Bus arbitration signals : BREQ, BUSY, …
Bus Arbitration Algorithm : Static / Dynamic
Static : priority fixed
» Serial (daisy-chain) arbitration
» Parallel arbitration : Fig. 13-11
Dynamic : priority flexible
» Time slice (fixed-length time)
» Polling
» LRU
» FIFO
» Rotating daisy-chain
* Bus busy line : if this line is inactive, no other processor is using the bus
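The difference between a static and a dynamic policy can be sketched in Python. This is a simplified model under stated assumptions, not a description of the Multibus signals: requests are a list of booleans, position 0 is the arbiter closest to the start of the daisy chain, and the function names `serial_grant`, `rotating_grant`, and `arbitrate` are hypothetical.

```python
def serial_grant(requests):
    """Static daisy chain: the requester nearest the head of the chain wins."""
    for position, requesting in enumerate(requests):
        if requesting:
            return position
    return None


def rotating_grant(requests, last_owner):
    """Rotating daisy chain: the search starts just after the last bus owner,
    so the unit that released the bus drops to the lowest priority."""
    n = len(requests)
    for step in range(1, n + 1):
        candidate = (last_owner + step) % n
        if requests[candidate]:
            return candidate
    return None


def arbitrate(requests, bus_busy, policy=serial_grant, **policy_args):
    """Grant the bus only when the bus busy line is inactive."""
    if bus_busy:
        return None
    return policy(requests, **policy_args)


# Units 0, 1, and 3 request the bus; unit 1 owned it last.
requests = [True, True, False, True]
fixed = arbitrate(requests, bus_busy=False)                              # unit 0
rotated = arbitrate(requests, False, rotating_grant, last_owner=1)       # unit 3
```

With fixed priority, unit 0 wins every time it requests and can starve the others. With the rotating scheme, the grant moves on to unit 3, and unit 1 must wait until the units after it have been served.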
13-4 Interprocessor Communication & Synchronization
Interprocessor Communication
Shared memory : tightly coupled system
» Accessible to all processors : common memory
» Acts as a message center similar to a mailbox
No shared memory : loosely coupled system
» Message passing through I/O channel communication
Interprocessor Synchronization
Enforces the correct sequence of processes and ensures mutually exclusive access to shared writable data
Mutual Exclusion
» Protects data from being changed simultaneously by two or more processors
Mutual Exclusion with Semaphore
» Critical section : once begun, it must complete execution before another processor accesses the same shared resource
» Semaphore : indicates whether or not a processor is executing a critical section
» Hardware lock : processor-generated signal that prevents other processors from using the system bus
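Mutual exclusion with a semaphore can be sketched in Python with an atomic test-and-set operation. The sketch rests on loud assumptions: threads stand in for processors, a `threading.Lock` stands in for the hardware lock that keeps the bus for two memory cycles, and the class `TestAndSetWord` and the function `run_counter` are hypothetical names.

```python
import threading
import time


class TestAndSetWord:
    """A semaphore word in shared memory with an indivisible test-and-set."""

    def __init__(self):
        self._sem = 0
        self._bus_lock = threading.Lock()  # assumed model of the hardware lock

    def tsl(self):
        """Read the old value (test), then write 1 (set), without interruption."""
        with self._bus_lock:
            r = self._sem      # R <- M[SEM]
            self._sem = 1      # M[SEM] <- 1
        return r               # 0: we got the semaphore, 1: somebody holds it

    def clear(self):
        """Release the semaphore at the end of the critical section."""
        with self._bus_lock:
            self._sem = 0


def run_counter(n_threads, n_iters):
    """Increment one shared counter from several threads under the semaphore."""
    sem = TestAndSetWord()
    counter = [0]

    def worker():
        for _ in range(n_iters):
            while sem.tsl() == 1:   # busy-wait until the test returns 0
                time.sleep(0)       # yield so the current holder can finish
            counter[0] += 1         # critical section
            sem.clear()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```

Because the test and the set happen as one indivisible step, two processors can never both read 0 and enter the critical section together. If the read and the write were separate bus operations, that race would be possible.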
Semaphore in shared memory
1) TSL SEM (Test and Set while Locked)
» The hardware lock is active while TSL SEM executes
» Takes 2 memory cycles : the first tests the semaphore, the second sets it
R ← M[SEM] (test semaphore)
M[SEM] ← 1 (set semaphore)
2) Result of the test
» R = 0 : shared memory is available
» R = 1 : processor cannot access shared memory (the semaphore was already set)
13-5 Cache Coherence
Conditions for Incoherence : Fig. 13-12, 13-13
Multiprocessor system with private caches
» Write through : the caches of P2 and P3 are incoherent
» Write back : the caches of P2 and P3 and main memory are incoherent
[Fig: Cache configuration after P1 changes X from 52 to 120 : (a) with write-through cache policy, main memory and the cache of P1 hold X = 120 while the caches of P2 and P3 hold X = 52; (b) with write-back cache policy, only the cache of P1 holds X = 120 while main memory and the caches of P2 and P3 hold X = 52]
Solution to the Cache Coherence Problem
Software
» 1) Shared writable data are non-cacheable
» 2) Writable data exists in only one cache : centralized global table
Hardware
» 1) Monitor possible write operations : snoopy cache controller
» "Synchronization, coherence, and event ordering in multiprocessors", IEEE Computer, Feb. 1988
» "A survey of cache coherence schemes for multiprocessors", IEEE Computer, June 1990
Snoopy Cache Controller
A snoopy cache is a type of memory cache that performs bus sniffing.
Such caches are used in systems where many processors or computers share the same memory and each has its own cache.
In such systems, processor A may read a value from memory, and then processor B does the same. If either processor then changes the value by writing to memory, the copy held in the other processor's cache becomes stale.
To prevent this and maintain cache coherence, snoopy caches monitor ("snoop on") the memory bus to detect any writes to values that they are holding, including changes coming from other processors or distributed computers.
» Watches the bus for write operations to the shared memory
» Invalidates the cache entry if the write address appears
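The snooping behavior can be sketched as a small Python model. It is a simplified illustration that assumes a write-through policy and invalidate-on-write snooping. The classes `SharedMemory` and `Cache` and the function `demo` are hypothetical names, and real controllers work on cache lines and bus signals rather than dictionaries.

```python
class SharedMemory:
    """Main memory plus the bus that every cache is attached to."""

    def __init__(self, contents):
        self.data = dict(contents)
        self.caches = []

    def attach_cache(self, snoop):
        cache = Cache(self, snoop)
        self.caches.append(cache)
        return cache

    def bus_write(self, addr, value, writer):
        self.data[addr] = value                  # write-through to memory
        for cache in self.caches:
            if cache is not writer:
                cache.observe_bus_write(addr)    # other caches see the write


class Cache:
    """Private cache of one processor."""

    def __init__(self, memory, snoop):
        self.memory = memory
        self.snoop = snoop
        self.lines = {}

    def read(self, addr):
        if addr not in self.lines:               # miss: fetch from memory
            self.lines[addr] = self.memory.data[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory.bus_write(addr, value, writer=self)

    def observe_bus_write(self, addr):
        if self.snoop:
            self.lines.pop(addr, None)           # invalidate the stale entry


def demo(snoop):
    """P1, P2, P3 all cache X = 52, then P1 stores X = 120."""
    memory = SharedMemory({"X": 52})
    p1, p2, p3 = (memory.attach_cache(snoop) for _ in range(3))
    for p in (p1, p2, p3):
        p.read("X")
    p1.write("X", 120)
    return p1.read("X"), p2.read("X"), p3.read("X"), memory.data["X"]
```

Without snooping, `demo(False)` leaves P2 and P3 reading the stale 52 although memory holds 120. With snooping, `demo(True)` invalidates their entries, so the next read misses and fetches 120.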
Parallel Machine
» SMP : single OS, shared memory, memory interconnect ; OpenMP API : http://www.openmp.org/
» MPP : multiple OS, distributed memory, processor interconnect ; MPI API : http://www.mpi-forum.org/
» Cluster : cluster of IA32 nodes (1 or 2 CPUs), node interconnect
» Constellation : cluster of SMP nodes, node interconnect
[Fig: CPUs and memory inside a node joined by the memory interconnect; nodes joined by the node interconnect]
Clusters in top500.org
» "Simple" cluster : 1 processor in each node
» Cluster of small SMPs : small number of processors per node
» Constellations : large number of processors per node
www.top500.org
MPP (Massively Parallel Processors)
» Loosely coupled system ; clusters are "rising"
Clusters
» "Simple" cluster (1 processor in each node)
» Cluster of small SMPs (small number of processors per node)
» Constellations (large number of processors per node)
Older Architectures
» SIMD (Single Instruction Multiple Data)
» Vector processors (old Cray machines)
Beowulf Clusters
» http://www.beowulf.org
» http://www.scyld.com
» http://linuxhpc.org
Cloud Computing
Internet-based computing, whereby shared resources, software, and information are provided to computers and other devices on demand.
The term "cloud" is used as a metaphor for the Internet, based on the cloud drawing used in the past to represent the telephone network, and later to depict the Internet in computer network diagrams. (Wikipedia)