12.4 Memory Organization in Multiprocessor Systems
By: Melissa Jamili
CS 147, Section 1
December 2, 2003
Overview
  Shared Memory
    Usage
    Organization
  Cache Coherence
    Cache coherence problem
    Solutions
    Protocols for marking and manipulating data
Shared Memory
  Two purposes
    1. Message passing
    2. Semaphores
Message Passing
  Direct message passing without shared memory
    One processor sends a message directly to another processor
    Requires synchronization between processors or a buffer
Message Passing (cont.)
  Message passing with shared memory
    First processor writes a message to the shared memory and signals the second processor that it has a waiting message
    Second processor reads the message from shared memory, possibly returning an acknowledge signal to the sender
    Location of the message in shared memory is known beforehand or sent with the waiting message signal
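The exchange above can be sketched in Python, with threads standing in for processors. This is only an illustration under stated assumptions: the dictionary `shared_memory`, the address `MAILBOX_ADDR`, and the two `threading.Event` signals are hypothetical names, not part of the original slides.

```python
import threading

# Hypothetical model: a dict stands in for shared memory, and Event objects
# stand in for the "waiting message" and "acknowledge" signals.
shared_memory = {}
MAILBOX_ADDR = 0x100                  # message location agreed on beforehand (assumed)
message_waiting = threading.Event()
acknowledged = threading.Event()
received = []

def sender():
    shared_memory[MAILBOX_ADDR] = "hello"   # write the message to shared memory
    message_waiting.set()                   # signal the second processor
    acknowledged.wait()                     # wait for the optional acknowledge

def receiver():
    message_waiting.wait()                          # wait for the signal
    received.append(shared_memory[MAILBOX_ADDR])    # read the message
    acknowledged.set()                              # return an acknowledge

threads = [threading.Thread(target=receiver), threading.Thread(target=sender)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(received)
```

Here the mailbox address is fixed in advance; the alternative mentioned on the slide would be to pass the address along with the signal.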
Semaphores
  Store information about the current state
    Information on protection and availability of different portions of memory
  Can be accessed by any processor that needs the information
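One way to picture this is a small table with one binary semaphore per region of shared memory. The sketch below assumes Python's standard `threading.Semaphore`; the region names and the helpers `try_claim` and `release` are hypothetical.

```python
import threading

# Hypothetical table: one binary semaphore per region of shared memory.
region_semaphores = {
    "region_0": threading.Semaphore(1),
    "region_1": threading.Semaphore(1),
}

def try_claim(region):
    """Return True if the region was available and is now claimed."""
    return region_semaphores[region].acquire(blocking=False)

def release(region):
    """Mark the region as available again."""
    region_semaphores[region].release()

first = try_claim("region_0")    # processor A claims region_0
second = try_claim("region_0")   # processor B finds it unavailable
release("region_0")              # processor A releases it
third = try_claim("region_0")    # processor B can now claim it
print(first, second, third)
```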
Organization of Shared Memory
  Not organized into a single shared memory module
  Partitioned into several memory modules
Figure: Four-processor UMA architecture with Benes network
Interleaving
  Process used to divide the shared memory address space among the memory modules
  Two types of interleaving
    1. High-order
    2. Low-order
High-order Interleaving
  Shared address space is divided into contiguous blocks of equal size
  With four modules, the two high-order bits of an address determine the module in which that address resides
    Hence the name
Figure: Example of 64 Mb shared memory with four modules
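As a rough illustration of the address split, the sketch below assumes that the 64 Mb memory has 64M addressable locations, giving a 26-bit address; that width and the function name `high_order_module` are assumptions, not taken from the slides.

```python
ADDRESS_BITS = 26   # assumed: 64M addressable locations
MODULE_BITS = 2     # four modules need two select bits

def high_order_module(addr):
    """Split an address into (module, offset) using the high-order bits."""
    offset_bits = ADDRESS_BITS - MODULE_BITS
    module = addr >> offset_bits                  # top two bits pick the module
    offset = addr & ((1 << offset_bits) - 1)      # remaining bits locate the word
    return module, offset

# Consecutive addresses stay in the same module.
print([high_order_module(a)[0] for a in range(4)])
# The first address of the second contiguous block.
print(high_order_module(1 << 24))
```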
Low-order Interleaving
  Low-order bits of a memory address determine its module
Figure: Example of 64 Mb shared memory with four modules
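A matching sketch for the low-order case, again assuming four modules; the function name `low_order_module` is hypothetical.

```python
MODULE_BITS = 2   # four modules need two select bits

def low_order_module(addr):
    """Split an address into (module, offset) using the low-order bits."""
    module = addr & ((1 << MODULE_BITS) - 1)   # bottom two bits pick the module
    offset = addr >> MODULE_BITS               # remaining bits locate the word
    return module, offset

# Consecutive addresses rotate through the modules.
print([low_order_module(a)[0] for a in range(6)])
```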
Low-order Interleaving (cont.)
  Low-order interleaving was originally used to reduce delay in accessing memory
    CPU could output an address and read request to one memory module
    While that memory module decodes and accesses its data, CPU could output another request to a different memory module
    Results in pipelining its memory requests
  Low-order interleaving is not commonly used in modern computers, since cache memory now reduces memory access delay
Low-order vs. High-order Interleaving
  In a low-order interleaving system, consecutive memory locations reside in different memory modules
    Processor executing a program stored in a contiguous block of memory would need to access different modules simultaneously
    Simultaneous access is possible, but memory conflicts are difficult to avoid
Low-order vs. High-order Interleaving (cont.)
  In a high-order interleaving system, memory conflicts are easily avoided
    Each processor executes a different program
    Programs are stored in separate memory modules
    Interconnection network is set to connect each processor to its proper memory module
Cache Coherence
  Goal: retain consistency of data held in multiple caches
  Like cache memory in uniprocessors, cache memory in multiprocessors improves performance by reducing the time needed to access data from memory
  Unlike uniprocessors, multiprocessors have individual caches for each processor
Cache Coherence Problem
  Occurs when two or more caches hold the value of the same memory location simultaneously
    One processor stores a value to that location in its cache
    The other cache will then hold an invalid (stale) value for that location
  Write-through cache will not resolve this problem
    Updates main memory but not other caches
Figure: Cache coherence problem with four processors using a write-back cache
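A toy model can make the write-through case concrete. The sketch below is an illustration only; the class `WriteThroughCache` and the address `0x40` are hypothetical, and a dictionary stands in for main memory.

```python
class WriteThroughCache:
    """Toy private cache: writes go to the cache and to main memory."""

    def __init__(self, memory):
        self.memory = memory   # shared main memory (a dict)
        self.lines = {}        # this cache's private copies

    def read(self, addr):
        if addr not in self.lines:               # miss: load from main memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]                  # hit: use the cached copy

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value   # write-through updates main memory only

memory = {0x40: 1}
cache_a = WriteThroughCache(memory)
cache_b = WriteThroughCache(memory)

cache_a.read(0x40)             # both caches now hold the value 1
cache_b.read(0x40)
cache_a.write(0x40, 2)         # main memory becomes 2
stale = cache_b.read(0x40)     # cache B still hits on its old copy
print(memory[0x40], stale)
```

Main memory is up to date after the write, yet the second cache keeps returning its old copy because nothing told it to invalidate.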
Solutions to the Cache Coherence Problem
  Mark all shared data as non-cacheable
  Use a cache directory
  Use cache snooping
Non-Cacheable
  Mark all shared data as non-cacheable
    Forces accesses of data to be from shared memory
    Lowers cache hit ratio and reduces overall system performance
Cache Directory
  Use a cache directory
    Directory controller is integrated with the main memory controller to maintain the cache directory
    Cache directory is located in main memory
      Contains information on the contents of local caches
    Cache writes are sent to the directory controller to update the cache directory
      Controller invalidates other caches with the same data
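The directory bookkeeping can be sketched as follows. This is a simplified model under stated assumptions: the classes `Cache` and `DirectoryController` are hypothetical, the directory is a dictionary from address to the set of caches holding it, and writes are assumed to go through to main memory to keep the example short.

```python
class Cache:
    """Toy local cache identified by a name."""

    def __init__(self, name):
        self.name = name
        self.lines = {}

class DirectoryController:
    """Tracks which caches hold each address and invalidates on writes."""

    def __init__(self, memory):
        self.memory = memory
        self.sharers = {}   # addr -> set of caches holding that addr

    def read(self, cache, addr):
        if addr not in cache.lines:                       # miss: load and record
            cache.lines[addr] = self.memory[addr]
            self.sharers.setdefault(addr, set()).add(cache)
        return cache.lines[addr]

    def write(self, cache, addr, value):
        for other in self.sharers.get(addr, set()) - {cache}:
            other.lines.pop(addr, None)    # invalidate other copies
        self.sharers[addr] = {cache}       # update the directory entry
        cache.lines[addr] = value
        self.memory[addr] = value          # assumed write-through for simplicity

memory = {0x40: 1}
directory = DirectoryController(memory)
a, b = Cache("A"), Cache("B")
directory.read(a, 0x40)
directory.read(b, 0x40)
directory.write(a, 0x40, 7)        # B's copy is invalidated
print(directory.read(b, 0x40))     # B misses and reloads the new value
```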
Cache Snooping
  Each cache (snoopy cache) monitors memory activity on the system bus
  Appropriate action is taken when a memory request is encountered
Protocols for Marking and Manipulating Data
  MESI protocol is the most common
    Each cache entry can be in one of the following states:
    1. Modified: Cache contains the memory value, which is different from the value in shared memory
    2. Exclusive: Only one cache contains the memory value, which is the same as the value in shared memory
    3. Shared: Cache contains the memory value corresponding to shared memory; other caches can hold this memory location
    4. Invalid: Cache does not contain the memory location
How the MESI Protocol Works
  Four possible memory access scenarios:
    1. Read hit
    2. Read miss
    3. Write hit
    4. Write miss
MESI Protocol (cont.)
  Read hit
    Processor reads data
    State unchanged
MESI Protocol (cont.)
  Read miss
    Processor sends a read request to shared memory via the system bus
    1. No cache contains the data
        MMU loads data from main memory into the processor's cache
        Cache entry marked as E (exclusive)
    2. One cache contains the data, marked as E
        Data loaded into the requesting cache, marked as S (shared)
        Other cache changes from state E to S
    3. More than one cache contains the data, marked as S
        Data loaded into the requesting cache, marked as S
        Other cache states with data remain unchanged
    4. One cache contains the data, marked as M (modified)
        Cache with modified data temporarily blocks the memory read request and updates main memory
        Read request continues; both caches mark the data as S
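The four read-miss cases can be condensed into one state-update function for a single memory location. This is a sketch rather than a full protocol model: the function `read_miss`, the cache names, and the `wrote_back` flag are hypothetical, and data movement is not simulated.

```python
def read_miss(states, requester):
    """Return (new_states, wrote_back) after `requester` misses on a read.

    `states` maps a cache name to 'M', 'E', 'S', or 'I' for one location.
    """
    new = dict(states)
    holders = [c for c, s in states.items() if c != requester and s != "I"]
    wrote_back = False
    if not holders:
        new[requester] = "E"          # case 1: no other cache holds the data
    else:
        for c in holders:
            if states[c] == "M":
                wrote_back = True     # case 4: owner updates main memory first
            new[c] = "S"              # E or M becomes S; S stays S
        new[requester] = "S"          # cases 2, 3, and 4
    return new, wrote_back

print(read_miss({"P0": "I", "P1": "I"}, "P0"))
print(read_miss({"P0": "I", "P1": "M"}, "P0"))
```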
MESI Protocol (cont.)
  Write hit
    1. Cache contains the data in state M or E
        Processor writes data to cache
        State becomes M
    2. Cache contains the data in state S
        Processor writes data to cache, marked as M
        All other caches mark this data as I (invalid)
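The two write-hit cases can be sketched the same way, tracking only the per-cache state of one memory location. The function `write_hit` and the cache names are hypothetical.

```python
def write_hit(states, writer):
    """Return the new per-cache states after `writer` hits on a write.

    `states` maps a cache name to 'M', 'E', 'S', or 'I' for one location.
    """
    if states[writer] == "I":
        raise ValueError("an invalid entry is a write miss, not a write hit")
    new = dict(states)
    if states[writer] == "S":
        for c in new:
            if c != writer and new[c] == "S":
                new[c] = "I"          # case 2: other copies are invalidated
    new[writer] = "M"                 # cases 1 and 2: writer ends in M
    return new

print(write_hit({"P0": "E", "P1": "I"}, "P0"))
print(write_hit({"P0": "S", "P1": "S"}, "P0"))
```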
MESI Protocol (cont.)
  Write miss
    Begins by issuing a read with intent to modify (RWITM)
    1. No cache holds the data, one cache holds the data marked as E, or one or more caches hold the data marked as S
        Data loaded from main memory into the requesting cache, marked as M
        Processor writes new data to cache
        Other caches holding this data change states to I
    2. One other cache holds the data as M
        That cache temporarily blocks the request, writes its value back to main memory, and marks its data as I
        Requesting cache loads the data, marked as M
        Processor writes new value to cache
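A write miss can be sketched as a single state update for one memory location. As before, this is only an illustration: the function `write_miss`, the cache names, and the `wrote_back` flag are hypothetical, and the RWITM bus transaction itself is not modeled.

```python
def write_miss(states, writer):
    """Return (new_states, wrote_back) after `writer` misses on a write.

    `states` maps a cache name to 'M', 'E', 'S', or 'I' for one location.
    """
    new = dict(states)
    wrote_back = False
    for c, s in states.items():
        if c == writer or s == "I":
            continue
        if s == "M":
            wrote_back = True   # case 2: owner writes its value back first
        new[c] = "I"            # every other copy is invalidated
    new[writer] = "M"           # writer loads the data and modifies it
    return new, wrote_back

print(write_miss({"P0": "I", "P1": "S", "P2": "S"}, "P0"))
print(write_miss({"P0": "I", "P1": "M"}, "P0"))
```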
Figure: Four-processor system using cache snooping and the MESI protocol
Conclusion
  Shared memory
    Message passing
    Semaphores
    Interleaving
  Cache coherence
    Cache coherence problem
    Solutions
      Non-cacheable
      Cache directory
      Cache snooping
    MESI protocol