12.4 Memory Organization in Multiprocessor Systems

By: Melissa Jamili, CS 147, Section 1, December 2, 2003


Page 1: 12.4 Memory Organization in Multiprocessor Systems

12.4 Memory Organization in Multiprocessor Systems

By: Melissa Jamili
CS 147, Section 1
December 2, 2003

Page 2: 12.4 Memory Organization in Multiprocessor Systems

Overview

- Shared Memory
  - Usage
  - Organization
- Cache Coherence
  - Cache coherence problem
  - Solutions
  - Protocols for marking and manipulating data

Page 3: 12.4 Memory Organization in Multiprocessor Systems

Shared Memory

Two purposes:
1. Message passing
2. Semaphores

Page 4: 12.4 Memory Organization in Multiprocessor Systems

Message Passing

- Direct message passing without shared memory
  - One processor sends a message directly to another processor
  - Requires synchronization between processors or a buffer

Page 5: 12.4 Memory Organization in Multiprocessor Systems

Message Passing (cont.)

- Message passing with shared memory
  - First processor writes a message to the shared memory and signals the second processor that it has a waiting message
  - Second processor reads the message from shared memory, possibly returning an acknowledge signal to the sender
  - Location of the message in shared memory is known beforehand or sent with the waiting message signal
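The write, signal, read, acknowledge sequence above can be sketched in Python, with two threads standing in for the two processors. This is only an illustration under stated assumptions: the dictionary plays the role of shared memory, the two `threading.Event` objects play the role of the signals, and `MAILBOX` and `exchange` are hypothetical names chosen for this example.

```python
import threading

MAILBOX = 0x100  # hypothetical message location, agreed on beforehand


def exchange(text):
    """Pass one message from a sender thread to a receiver thread."""
    shared_memory = {}                    # stands in for the shared memory
    message_waiting = threading.Event()   # "you have a waiting message" signal
    acknowledged = threading.Event()      # optional acknowledge signal
    received = []

    def sender():
        shared_memory[MAILBOX] = text     # write the message to shared memory
        message_waiting.set()             # signal the second processor
        acknowledged.wait(timeout=5)      # wait for the acknowledge

    def receiver():
        message_waiting.wait(timeout=5)   # wait for the signal
        received.append(shared_memory[MAILBOX])  # read from shared memory
        acknowledged.set()                # return the acknowledge

    threads = [threading.Thread(target=receiver),
               threading.Thread(target=sender)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received[0], acknowledged.is_set()
```

Here the mailbox location is fixed in advance; the alternative named on the slide would pass the location along with the signal instead.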

Page 6: 12.4 Memory Organization in Multiprocessor Systems

Semaphores

- Stores information about current state
  - Information on protection and availability of different portions of memory
- Can be accessed by any processor that needs the information
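One way to picture such state information is a shared table of availability flags, one per memory region. The sketch below is an assumption-laden illustration rather than the mechanism described in the slides: `RegionSemaphores` and its methods are hypothetical names, and the internal lock merely models the atomic test-and-set that real hardware would provide.

```python
import threading


class RegionSemaphores:
    """Availability flags for memory regions, kept in one shared table."""

    def __init__(self, regions):
        self._owner = {r: None for r in regions}  # None means "available"
        self._guard = threading.Lock()            # models an atomic test-and-set

    def try_acquire(self, region, cpu):
        """Claim the region for cpu if it is available; report success."""
        with self._guard:
            if self._owner[region] is None:
                self._owner[region] = cpu
                return True
            return False

    def release(self, region, cpu):
        """Mark the region available again, but only if cpu holds it."""
        with self._guard:
            if self._owner[region] == cpu:
                self._owner[region] = None

    def is_available(self, region):
        with self._guard:
            return self._owner[region] is None
```

Any processor can consult the table through `is_available`, which matches the point that the information is open to every processor that needs it.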

Page 7: 12.4 Memory Organization in Multiprocessor Systems

Organization of Shared Memory

- Not organized into a single shared memory module
- Partitioned into several memory modules

Page 8: 12.4 Memory Organization in Multiprocessor Systems

Figure: Four-processor UMA architecture with Benes network

Page 9: 12.4 Memory Organization in Multiprocessor Systems

Interleaving

- Process used to divide the shared memory address space among the memory modules
- Two types of interleaving:
  1. High-order
  2. Low-order

Page 10: 12.4 Memory Organization in Multiprocessor Systems

High-order Interleaving

- Shared address space is divided into contiguous blocks of equal size, one block per module
- The high-order bits of an address determine the module in which that location resides (two bits for four modules)
  - Hence the name
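A minimal sketch of the address split for four modules follows. It assumes a 26-bit address space (64 M addressable locations); that width, and the names `ADDRESS_BITS`, `MODULE_BITS`, and `high_order_module`, are assumptions made for this illustration.

```python
ADDRESS_BITS = 26                          # assumption: 2**26 addressable locations
MODULE_BITS = 2                            # four modules need two select bits
OFFSET_BITS = ADDRESS_BITS - MODULE_BITS   # remaining bits address within a module


def high_order_module(address):
    """Return (module, offset within module) under high-order interleaving."""
    module = address >> OFFSET_BITS                # top two bits pick the module
    offset = address & ((1 << OFFSET_BITS) - 1)    # low 24 bits pick the location
    return module, offset
```

Under these assumptions every address from 0x0000000 to 0x0FFFFFF falls in module 0, the next contiguous block falls in module 1, and so on.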

Page 11: 12.4 Memory Organization in Multiprocessor Systems

Figure: Example of 64 Mb shared memory with four modules

Page 12: 12.4 Memory Organization in Multiprocessor Systems

Low-order Interleaving

- Low-order bits of a memory address determine its module
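The same kind of split, taken from the other end of the address, can be sketched as follows. Four modules are assumed, and `low_order_module` is a hypothetical helper name.

```python
MODULE_BITS = 2   # four modules need two select bits


def low_order_module(address):
    """Return (module, offset within module) under low-order interleaving."""
    module = address & ((1 << MODULE_BITS) - 1)   # bottom two bits pick the module
    offset = address >> MODULE_BITS               # remaining bits pick the location
    return module, offset
```

Consecutive addresses 0, 1, 2, 3, 4, 5 therefore map to modules 0, 1, 2, 3, 0, 1.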

Page 13: 12.4 Memory Organization in Multiprocessor Systems

Figure: Example of 64 Mb shared memory with four modules

Page 14: 12.4 Memory Organization in Multiprocessor Systems

Low-order Interleaving (cont.)

- Low-order interleaving was originally used to reduce delay in accessing memory
  - CPU could output an address and read request to one memory module
  - Memory module can decode and access its data
  - CPU could output another request to a different memory module
  - Results in pipelining its memory requests
- Low-order interleaving is not commonly used in modern computers, since cache memory now serves to reduce memory access delay
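The pipelining effect can be estimated with a deliberately simplified timing model. The assumptions are not from the slides: the CPU issues at most one request per cycle, in order; each module serves one request at a time; and every access takes `access_cycles` cycles. All function names are hypothetical.

```python
def low_order(address):
    return address & 0b11      # four modules, bottom two bits


def high_order(address):
    return address >> 24       # four modules, top two bits of a 26-bit address


def total_cycles(addresses, module_of, access_cycles=4):
    """Cycle at which the last read completes under the simplified model."""
    free_at = {}   # module -> cycle at which it becomes idle
    issue = 0      # earliest cycle at which the CPU may issue the next request
    finish = 0
    for address in addresses:
        module = module_of(address)
        start = max(issue, free_at.get(module, 0))  # wait for a busy module
        free_at[module] = start + access_cycles
        finish = max(finish, free_at[module])
        issue = start + 1
    return finish
```

For eight consecutive reads, `total_cycles(range(8), low_order)` overlaps accesses across the four modules, while `total_cycles(range(8), high_order)` sends every read to module 0 and serializes them.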

Page 15: 12.4 Memory Organization in Multiprocessor Systems

Low-order vs. High-order Interleaving

- In a low-order interleaving system, consecutive memory locations reside in different memory modules
  - Processor executing a program stored in a contiguous block of memory would need to access different modules simultaneously
  - Simultaneous access possible but difficult to avoid memory conflicts

Page 16: 12.4 Memory Organization in Multiprocessor Systems

Low-order vs. High-order Interleaving (cont.)

- In a high-order interleaving system, memory conflicts are easily avoided
  - Each processor executes a different program
  - Programs stored in separate memory modules
  - Interconnection network is set to connect each processor to its proper memory module
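The contrast can be made concrete by counting module conflicts when four processors each step through their own contiguous program. This sketch assumes 26-bit addresses, four modules, and programs that start at the base of each 2**24-location block; the starting addresses and the names `conflicts` and `bases` are assumptions for illustration, and other placements would give different low-order counts.

```python
from collections import Counter

OFFSET_BITS = 24   # assumption: 26-bit addresses, two module-select bits


def high_order(address):
    return address >> OFFSET_BITS


def low_order(address):
    return address & 0b11


def conflicts(program_bases, module_of, steps=8):
    """Count accesses that collide with another processor's access to the
    same module in the same step."""
    total = 0
    for t in range(steps):
        hits = Counter(module_of(base + t) for base in program_bases)
        total += sum(n - 1 for n in hits.values() if n > 1)
    return total


# One program per processor, each starting at the base of a different block.
bases = [m << OFFSET_BITS for m in range(4)]
```

With these bases, high-order interleaving keeps each processor on its own module, whereas low-order interleaving sends all four processors to the same module at every step.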

Page 17: 12.4 Memory Organization in Multiprocessor Systems

Cache Coherence

- Retain consistency
- Like cache memory in uniprocessors, cache memory in multiprocessors improves performance by reducing the time needed to access data from memory
- Unlike uniprocessors, multiprocessors have individual caches for each processor

Page 18: 12.4 Memory Organization in Multiprocessor Systems

Cache Coherence Problem

- Occurs when two or more caches hold the value of the same memory location simultaneously
  - One processor stores a value to that location in its cache
  - Other cache will have an invalid value in its location
- Write-through cache will not resolve this problem
  - Updates main memory but not other caches
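The problem is easy to reproduce in a toy model. In the sketch below, `WriteThroughCache` and `demo` are hypothetical names, a dictionary stands in for main memory, and each cache holds single values rather than whole lines.

```python
class WriteThroughCache:
    """Toy private cache that writes through to main memory."""

    def __init__(self, memory):
        self.memory = memory   # shared main memory (a dict)
        self.lines = {}        # this cache's private copies

    def read(self, addr):
        if addr not in self.lines:          # miss: fetch from main memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]             # hit: use the cached copy

    def write(self, addr, value):
        self.lines[addr] = value            # update own copy
        self.memory[addr] = value           # write through to main memory


def demo():
    memory = {0x40: 1}
    cache_a = WriteThroughCache(memory)
    cache_b = WriteThroughCache(memory)
    cache_a.read(0x40)          # both caches now hold the value 1
    cache_b.read(0x40)
    cache_a.write(0x40, 2)      # main memory is updated, cache_b is not
    return memory[0x40], cache_b.read(0x40)
```

`demo()` returns the up-to-date value in main memory alongside the value the second processor still reads from its own cache.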

Page 19: 12.4 Memory Organization in Multiprocessor Systems

Figure: Cache coherence problem with four processors using a write-back cache

Page 20: 12.4 Memory Organization in Multiprocessor Systems

Solutions to the Cache Coherence Problem

- Mark all shared data as non-cacheable
- Use a cache directory
- Use cache snooping

Page 21: 12.4 Memory Organization in Multiprocessor Systems

Non-Cacheable

- Mark all shared data as non-cacheable
  - Forces accesses of data to be from shared memory
  - Lowers cache hit ratio and reduces overall system performance

Page 22: 12.4 Memory Organization in Multiprocessor Systems

Cache Directory

- Use a cache directory
  - Directory controller is integrated with the main memory controller to maintain the cache directory
  - Cache directory located in main memory
    - Contains information on the contents of local caches
  - Cache writes sent to directory controller to update cache directory
    - Controller invalidates other caches with same data
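A directory scheme along these lines might be modeled as below. This is a sketch, not a description of any real controller: `DirectoryController` is a hypothetical class, the caches are plain dictionaries, and writes are assumed to go through to main memory to keep the example short.

```python
class DirectoryController:
    """Toy directory that tracks which caches hold each address."""

    def __init__(self, memory, num_caches):
        self.memory = memory
        self.caches = [dict() for _ in range(num_caches)]
        self.directory = {}    # addr -> set of cache ids holding a copy

    def read(self, cache_id, addr):
        cache = self.caches[cache_id]
        if addr not in cache:                      # miss: load and record holder
            cache[addr] = self.memory[addr]
            self.directory.setdefault(addr, set()).add(cache_id)
        return cache[addr]

    def write(self, cache_id, addr, value):
        # Invalidate every other cache that the directory lists for addr.
        for other in self.directory.get(addr, set()) - {cache_id}:
            self.caches[other].pop(addr, None)
        self.directory[addr] = {cache_id}          # writer is the only holder
        self.caches[cache_id][addr] = value
        self.memory[addr] = value                  # write-through (assumption)
```

After a write, any other cache that held the location misses on its next read and reloads the new value, which is what keeps the copies consistent.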

Page 23: 12.4 Memory Organization in Multiprocessor Systems

Cache Snooping

- Each cache (snoopy cache) monitors memory activity on the system bus
- Appropriate action is taken when a memory request is encountered
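A rough model of snooping follows, assuming the "appropriate action" is to invalidate the local copy whenever another cache's write appears on the bus. That write-invalidate choice, the write-through behavior, and the names `Bus` and `SnoopyCache` are assumptions made for this sketch.

```python
class Bus:
    """Toy system bus that lets every attached cache observe writes."""

    def __init__(self):
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def broadcast_write(self, source, addr):
        for cache in self.caches:
            if cache is not source:
                cache.snoop_write(addr)    # each other cache sees the write


class SnoopyCache:
    def __init__(self, bus, memory):
        self.bus = bus
        self.memory = memory
        self.lines = {}
        bus.attach(self)

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value          # write-through (assumption)
        self.bus.broadcast_write(self, addr)

    def snoop_write(self, addr):
        self.lines.pop(addr, None)         # invalidate the now-stale copy
```

Unlike the directory approach, no central table is kept; each cache reacts on its own to what it observes on the bus.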

Page 24: 12.4 Memory Organization in Multiprocessor Systems

Protocols for marking and manipulating data

- MESI protocol most common
- Each cache entry can be in one of the following states:
  1. Modified: Cache contains memory value, which is different from value in shared memory
  2. Exclusive: Only one cache contains memory value, which is same value in shared memory
  3. Shared: Cache contains memory value corresponding to shared memory, other caches can hold this memory location
  4. Invalid: Cache does not contain memory location
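The four states and the properties listed for them can be written down as a small enumeration. The helper names `holds_copy` and `matches_shared_memory` are hypothetical; they only restate the definitions above.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"    # cached value differs from shared memory
    EXCLUSIVE = "E"   # only this cache holds the value; same as shared memory
    SHARED = "S"      # same as shared memory; other caches may hold it too
    INVALID = "I"     # cache does not contain the location


def holds_copy(state):
    """True if the cache entry contains the memory location at all."""
    return state is not State.INVALID


def matches_shared_memory(state):
    """True if the cached value equals the value in shared memory."""
    return state in (State.EXCLUSIVE, State.SHARED)
```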

Page 25: 12.4 Memory Organization in Multiprocessor Systems

How the MESI Protocol Works

Four possible memory access scenarios:
1. Read hit
2. Read miss
3. Write hit
4. Write miss

Page 26: 12.4 Memory Organization in Multiprocessor Systems

MESI Protocol (cont.)

- Read hit
  - Processor reads data
  - State unchanged

Page 27: 12.4 Memory Organization in Multiprocessor Systems

MESI Protocol (cont.)

- Read miss
  - Processor sends read request to shared memory via system bus
    1. No cache contains data
       - MMU loads data from main memory into processor’s cache
       - Cache marked as E (exclusive)
    2. One cache contains data, marked as E
       - Data loaded into cache, marked as S (shared)
       - Other cache changes from state E to S
    3. More than one cache contains the data, marked as S
       - Data loaded into cache, marked as S
       - Other cache states with data remain unchanged
    4. One cache contains data, marked as M (modified)
       - Cache with modified data temporarily blocks memory read request and updates main memory
       - Read request continues, both caches mark data as S
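The four read-miss cases can be expressed as one state-transition function. In this sketch the state of a single memory location is tracked as a list of letters, one per cache; `read_miss` and its return convention are assumptions made for illustration.

```python
def read_miss(states, requester):
    """Apply a read miss by cache `requester` to one memory location.

    states: list of "M", "E", "S", or "I", one entry per cache.
    Returns (new_states, wrote_back), where wrote_back is True if a
    modified copy had to be written to main memory first.
    """
    new = list(states)
    wrote_back = False
    holders = [i for i, s in enumerate(states)
               if i != requester and s != "I"]
    if not holders:
        new[requester] = "E"         # case 1: no cache contains the data
    else:
        for i in holders:
            if states[i] == "M":     # case 4: holder updates main memory first
                wrote_back = True
            new[i] = "S"             # cases 2 and 4: E or M becomes S; case 3: S stays S
        new[requester] = "S"
    return new, wrote_back
```

For example, `read_miss(["I", "M", "I", "I"], 0)` models case 4: cache 1 writes its value back, and both caches end in state S.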

Page 28: 12.4 Memory Organization in Multiprocessor Systems

MESI Protocol (cont.)

- Write hit
  1. Cache contains data in state M or E
     - Processor writes data to cache
     - State becomes M
  2. Cache contains data in state S
     - Processor writes data, marked as M
     - All other caches mark this data as I (invalid)
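The two write-hit cases reduce to a short transition function. As before, the list-of-letters representation and the name `write_hit` are assumptions for this sketch.

```python
def write_hit(states, writer):
    """Apply a write hit by cache `writer` to one memory location.

    states: list of "M", "E", "S", or "I", one entry per cache.
    """
    if states[writer] == "I":
        raise ValueError("cache does not hold the data, so this is not a hit")
    new = list(states)
    if states[writer] == "S":
        # Case 2: every other cache marks this data as invalid.
        for i, s in enumerate(states):
            if i != writer and s == "S":
                new[i] = "I"
    new[writer] = "M"    # in both cases the writer ends in state M
    return new
```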

Page 29: 12.4 Memory Organization in Multiprocessor Systems

MESI Protocol (cont.)

- Write miss
  - Begins by issuing a read with intent to modify (RWITM)
    1. No cache holds data, one cache holds data marked as E, or one or more caches hold data marked S
       - Data loaded from main memory into cache, marked as M
       - Processor writes new data to cache
       - Caches holding this data change states to I
    2. One other cache holds data as M
       - Cache temporarily blocks request and writes its value back to main memory, marks data as I
       - Original cache loads data, marked as M
       - Processor writes new value to cache
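Both write-miss cases end with the writer in state M and every other cache in state I; they differ only in whether a modified copy is written back first. A sketch under the same list-of-letters assumption, with `write_miss` as a hypothetical name:

```python
def write_miss(states, writer):
    """Apply a write miss (RWITM) by cache `writer` to one memory location.

    states: list of "M", "E", "S", or "I", one entry per cache.
    Returns (new_states, wrote_back), where wrote_back is True if another
    cache had to write a modified value back to main memory first.
    """
    new = list(states)
    wrote_back = False
    for i, s in enumerate(states):
        if i == writer:
            continue
        if s == "M":
            wrote_back = True    # case 2: holder writes back before giving up the data
        if s != "I":
            new[i] = "I"         # every other holder invalidates its copy
    new[writer] = "M"            # writer loads the data and modifies it
    return new, wrote_back
```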

Page 30: 12.4 Memory Organization in Multiprocessor Systems

Figure: Four-processor system using cache snooping and the MESI protocol

Page 31: 12.4 Memory Organization in Multiprocessor Systems

Conclusion

- Shared memory
  - Message passing
  - Semaphores
  - Interleaving
- Cache coherence
  - Cache coherence problem
  - Solutions
    - Non-cacheable
    - Cache directory
    - Cache snooping
  - MESI protocol