MIMD Shared Memory
Uploaded by gareth-hayden
Multiprocessors
MIMD -- Shared Memory
Each processor has a full CPU
Each processor runs its own code
– can be the same program as other processors or a different one
All processors access the same memory
– Same address space for all processors
– UMA: Uniform Memory Access
» all memory accessible in the same time for every processor
– NUMA: Non-Uniform Memory Access
» memory is localized
» each processor can access some memory faster than other memory
MIMD - SM - UMA
[Figure: processors joined to memory modules by a connection]
Options for Connection -- UMA
Bus
– Sequential, can be used for one message at a time
Switching Network
– Can send many messages at once
» depends on connection scheme
– Crossbar
» Maximal connections
» expensive
– Omega (also called Butterfly, Banyan)
» several permutations of proc-mem possible
Bus
Needs smart local cache schemes to reduce bus traffic
Works for a low number of processors
Depending on technology, 20-50 processors overload the bus and performance degrades
Common on 4- and 8-processor SMP servers
[Figure: Bus. Labels: Processors, Cache, Bus, Memory]
Crossbar switch
Every permutation of processor to memory can work
Expensive: N*M switches, where
– N = number of processors
– M = number of memory modules
[Figure: Crossbar switch. Labels: Processors, Switches, Memory]
Omega Network
Every Processor Connects to Every Memory
Many, but not all, permutations possible
An extra stage adds redundancy and more permutations
Number of switches = (N/2) log2 N
» For N processors, N memory modules
Number of stages = log2 N (determines latency)
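The cost formulas on the crossbar and omega slides can be compared with a short calculation. This is a minimal sketch: the function names are made up for illustration, the omega network is assumed to be built from 2x2 switches with N a power of two, and a crossbar crosspoint is not the same hardware as a 2x2 switch, so the counts only indicate how each design scales.

```python
def crossbar_switches(n_procs: int, n_mems: int) -> int:
    """Crosspoint switches in an N x M crossbar."""
    return n_procs * n_mems


def omega_stages(n: int) -> int:
    """Stages in an omega network of 2x2 switches: log2(n)."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    return n.bit_length() - 1


def omega_switches(n: int) -> int:
    """2x2 switches in an n x n omega network: (n/2) * log2(n)."""
    return (n // 2) * omega_stages(n)


for n in (8, 64, 1024):
    print(f"N={n}: crossbar={crossbar_switches(n, n)}, "
          f"omega={omega_switches(n)}, stages={omega_stages(n)}")
# N=8: crossbar=64, omega=12, stages=3
# N=64: crossbar=4096, omega=192, stages=6
# N=1024: crossbar=1048576, omega=5120, stages=10
```

The crossbar count grows with N*N while the omega count grows with N log2 N, at the price of log2 N stages of latency and some blocked permutations.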
Omega Network
[Figure: Processors and Memory joined by an omega network]
Omega Network
[Figure: ports labeled 000 to 111 on each side; Destination = 101]
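One common way to route a message through an omega network is destination-tag routing: before each stage the lines are perfect-shuffled, and the 2x2 switch at stage k picks its upper or lower output from bit k of the destination, most significant bit first. The sketch below assumes that scheme and an 8-port network; the source port 010 and the function name `route` are chosen only for illustration, since the slide gives just the destination.

```python
def route(source: int, dest: int, n_bits: int = 3):
    """Trace a message through an omega network of 2x2 switches.

    Returns one (stage, switch, port, line) tuple per stage, where
    line is the output line the message occupies after that stage.
    """
    mask = (1 << n_bits) - 1
    line = source
    path = []
    for stage in range(n_bits):
        # Destination bit for this stage, most significant bit first.
        bit = (dest >> (n_bits - 1 - stage)) & 1
        # Perfect shuffle (rotate left), then the switch sets the low bit.
        line = ((line << 1) & mask) | bit
        path.append((stage, line >> 1, "lower" if bit else "upper", line))
    return path


for stage, switch, port, line in route(0b010, 0b101):
    print(f"stage {stage}: switch {switch}, {port} output, line {line:03b}")
# stage 0: switch 2, lower output, line 101
# stage 1: switch 1, upper output, line 010
# stage 2: switch 2, lower output, line 101
```

After the last stage the line number equals the destination whatever the source, because each stage shifts one source bit out and one destination bit in.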
Omega Network -- A Permutation
[Figure: ports labeled 000 to 111 on each side; Destination = 101]
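Whether a given permutation passes through the network in one go can be checked by routing every message and looking for two messages that need the same output line at some stage. This is a rough sketch assuming destination-tag routing over 2x2 switches with a perfect shuffle before each stage; the helper names are hypothetical.

```python
def output_lines(source: int, dest: int, n_bits: int):
    """Output line occupied after each stage under destination-tag routing."""
    mask = (1 << n_bits) - 1
    line = source
    lines = []
    for stage in range(n_bits):
        bit = (dest >> (n_bits - 1 - stage)) & 1
        line = ((line << 1) & mask) | bit  # shuffle, then select output
        lines.append(line)
    return lines


def is_conflict_free(perm, n_bits: int) -> bool:
    """True if perm (perm[source] = dest) routes without blocking."""
    paths = [output_lines(s, d, n_bits) for s, d in enumerate(perm)]
    for stage in range(n_bits):
        used = [p[stage] for p in paths]
        if len(set(used)) != len(used):  # two messages on one line
            return False
    return True


identity = list(range(8))
bit_reversal = [0, 4, 2, 6, 1, 5, 3, 7]  # 3-bit address read backwards
print(is_conflict_free(identity, 3))      # True
print(is_conflict_free(bit_reversal, 3))  # False
```

In the bit-reversal case, sources 000 and 100 both need output line 000 of the first stage, which is one concrete instance of "many, but not all, permutations possible".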
Omega Network with combining
Smart Switches
– combine two requests with same destination
– make memory accesses equivalent to serial sequence
– split return values appropriately
Time trade-off
Used in NYU Ultracomputer
– also in IBM RP3 experimental machine
Example: Fetch and Increment
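The three switch behaviours listed above can be sketched for fetch-and-increment. The model below is a simplification, not the Ultracomputer or RP3 design: it assumes all requests to one address arrive together and are merged pairwise in a tree of switches. Memory is a plain dict, and `combined_fetch_and_add` is a hypothetical name.

```python
def combined_fetch_and_add(memory: dict, addr: int, increments: list):
    """Serve simultaneous fetch-and-add requests with one memory access.

    Returns the old value seen by each requester, in request order.
    """
    def split(base: int, incs: list):
        # Return path: a switch hands `base` to its left input and
        # `base + (sum of left increments)` to its right input.
        if len(incs) == 1:
            return [base]
        mid = len(incs) // 2
        left, right = incs[:mid], incs[mid:]
        return split(base, left) + split(base + sum(left), right)

    # Forward path: the switches add the increments, so memory sees
    # a single request carrying the total.
    old = memory.get(addr, 0)
    memory[addr] = old + sum(increments)
    return split(old, increments)


memory = {7: 10}
print(combined_fetch_and_add(memory, 7, [1, 1, 1, 1]))  # [10, 11, 12, 13]
print(memory[7])                                        # 14
```

Each requester receives a distinct value, exactly as if the four increments had been applied one after another, yet the memory module handled only one access.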
Omega Network
[Figure: ports labeled 000 to 111 on each side; Destination = 101]
Options for Connection -- NUMA
Each Processor has a segment of memory closer than others
– Could be several different levels of access
All Processors still use same address space
Omega network with wrap around
– BBN Butterfly
Hierarchy of Rings (or other switches)
– Kendall Square Research KSR-1
– SGI Origin series
Hierarchical Rings
[Figure: Labels: Directory Nodes, Compute Node, To higher level ring]
Issues for MIMD Shared Memory
Memory Access
– Can reads be simultaneous?
– How to control multiple writes?
Synchronization mechanism needed
– semaphores
– monitors
Local caches need to be coordinated
– cache coherency protocols
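The multiple-writes issue and the semaphore remedy can be illustrated with threads that share one counter. This is only a sketch using Python's standard `threading` module as a stand-in for processors on a shared address space; the name `run_workers` and the thread and iteration counts are arbitrary choices.

```python
import threading


def run_workers(n_threads: int, iterations: int) -> int:
    """Increment one shared counter from several threads, guarded by a semaphore."""
    counter = 0
    sem = threading.Semaphore(1)  # binary semaphore: one writer at a time

    def worker():
        nonlocal counter
        for _ in range(iterations):
            with sem:         # acquire before the read-modify-write
                counter += 1  # released on leaving the with block

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run_workers(4, 10_000))  # 40000
```

Without the semaphore the read-modify-write in `counter += 1` is not guaranteed to be atomic, so concurrent updates could be lost.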