Cache coherence problem and its solutions
-
Upload
majid-saleem -
Category
Devices & Hardware
-
view
6.797 -
download
0
Transcript of Cache coherence problem and its solutions
![Page 1: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/1.jpg)
Majid saleem 11-ntu-1232BSCS 6th(A)Submitted To: Sir Nasir MahmoodSubmission Date: 05-07-2014National Textile university Faisalabad
![Page 2: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/2.jpg)
CACHE COHERENCE:
………………………
![Page 3: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/3.jpg)
DEFINATION:
Cache coherence is the discipline that ensures that changes in the values of shared operands are propagated throughout the system in a timely fashion.
![Page 4: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/4.jpg)
EXPLANATION:
shared memory multiprocessor separate cache memory for each processor possible to have many copies of any one instruction operand one copy in the main memory and one in each cache memory. When one copy of an operand is changed, the other copies of the
operand must be changed also.
![Page 5: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/5.jpg)
THERE ARE THREE DISTINCT LEVELS OF CACHE COHERENCE:
every write operation appears to occur instantaneously all processors see exactly the same sequence of changes of values for each
separate operand different processors may see an operation and assume different sequences
of values; this is considered to be a noncoherent behavior
![Page 6: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/6.jpg)
CACHE COHERENCY PROBLEM •
update from a writing processor is not known to other processors
![Page 7: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/7.jpg)
TWO TYPES OF SOLUTIONS:
Software-based Hardware base
![Page 8: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/8.jpg)
SOFTWARE-BASED Compiler based or with run-time system support. With or without hardware assist. Tough problem because perfect information is needed in the
presence of memory aliasing and explicit parallelism.
![Page 9: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/9.jpg)
TWO TYPES OF HARDWARE_BASE SOLUTION:
Snooping Directory_Based
![Page 10: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/10.jpg)
SNOOPING: used with low-end MPs few processors centralized memory bus-based distributed implementation: responsibility for
maintaining coherence lies with each cache
![Page 11: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/11.jpg)
LOW END MP’S
![Page 12: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/12.jpg)
SNOOPING IMPLEMENTATION
A distributed coherency protocol coherency state associated with each cache block each snoop maintains coherency for its own cache
![Page 13: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/13.jpg)
CONTINUE…….HOW THE BUS IS USED?
broadcast medium entire coherency operation is atomic wrt other processors keep-the-bus protocol: master holds the bus until the entire
operation has completed
![Page 14: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/14.jpg)
CONTINUE……. SPLIT-TRANSACTION BUSES request & response are different phases state value that indicates that an operation is in
progress do not initiate another operation for a cache block that
has one in progress
![Page 15: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/15.jpg)
BUS IN SNOOPING
![Page 16: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/16.jpg)
AN EXAMPLE SNOOPING PROTOCOL: INVALIDATION-BASED COHERENCY PROTOCOL EACH CACHE BLOCK IS IN ONE OF THREE STATESshared: clean in all caches & up-to-date in memory block can be read by any processor exclusive: dirty in exactly one cache only that processor can write to it (it’s the owner of the block) invalid: block contains no valid data
![Page 17: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/17.jpg)
DIRECTORY-BASED• used with higher-end MPs more processors distributed memory multi-path interconnect centralized for each address: responsibility for
maintaining coherence lies with the directory for each address
![Page 18: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/18.jpg)
DIRECTORY IMPLEMENTATIONDistributed memory machine processor-memory pairs are connected via a multi-path interconnection
network point-to-point communication snooping with broadcasting is wasteful of the parallel communication
capability each processor (or cluster of processors) has its own memory a processor has fast access to its local memory & slower access to
“remote” memory located at other processors NUMA (non-uniform memory access) machines
![Page 19: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/19.jpg)
INTERCONNECTION NETWORK
![Page 20: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/20.jpg)
DIRECTORY IMPLEMENTATION• directory tracks state of cache blocks• shared: at least 1 processor has the data cached & memory is up- to-date block can be read by any processor • exclusive: 1 processor (the owner) has the data cached & memory is stale only that processor can write to it • invalid: no processor has the data cached & memory is up-to-date ctory
tracks state of cache blocks
![Page 21: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/21.jpg)
CONTINUE….. Directory blocks play different roles during a memory operation.
home node: the memory location of the requested data local node: where the memory request initiated remote node: an alternate location for the data if this processor has requested & cached it
![Page 22: Cache coherence problem and its solutions](https://reader036.fdocuments.in/reader036/viewer/2022070521/58f9aadd760da3da068b7d7e/html5/thumbnails/22.jpg)
ANY QUESTION.????Thank you…………………………………………….