CS 61C: Great Ideas in Computer Architecture (Machine Structures)
Caches Part 2
Instructors: John Wawrzynek & Vladimir Stojanovic
http://inst.eecs.berkeley.edu/~cs61c/
Typical Memory Hierarchy
[Figure: on-chip components (control, datapath, register file, instruction cache, data cache), second-level cache (SRAM), third-level cache (SRAM), main memory (DRAM), and secondary memory (disk or flash), ordered from fastest to slowest.]
Speed (cycles): ½’s, 1’s, 10’s, 100’s, 1,000,000’s
Size (bytes): 100’s, 10K’s, M’s, G’s, T’s
Cost/bit: highest to lowest
• Principle of locality + memory hierarchy presents the programmer with ≈ as much memory as is available in the cheapest technology at ≈ the speed offered by the fastest technology
Review: Adding Cache to Computer
[Figure: processor (control; datapath with PC, registers, and arithmetic & logic unit (ALU)) connected through a cache to memory (program, data, bytes) over the processor-memory interface (enable?, read/write, address, write data, read data); input and output attach through the I/O-memory interfaces.]
Memory Block-addressing example (10/20/15)
[Figure: the same memory addressed three ways: byte addresses; word addresses, whose 2 LSBs are 0; and 8-byte block addresses, whose 3 LSBs are 0. A byte address splits into a block # and a byte offset in block (0 to 7).]
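The addressing in the figure can be sketched in a few lines. This is a minimal illustration, assuming 4-byte words and 8-byte blocks as on the slide; the helper name `decompose` is hypothetical.

```python
WORD_BYTES = 4   # 32-bit words: a word address has its 2 LSBs equal to 0
BLOCK_BYTES = 8  # 8-byte blocks: a block address has its 3 LSBs equal to 0

def decompose(byte_addr):
    """Return (word address, block address, block number, byte offset in block)."""
    word_addr = byte_addr & ~(WORD_BYTES - 1)    # clear the 2 LSBs
    block_addr = byte_addr & ~(BLOCK_BYTES - 1)  # clear the 3 LSBs
    block_number = byte_addr // BLOCK_BYTES
    offset = byte_addr % BLOCK_BYTES
    return word_addr, block_addr, block_number, offset

print(decompose(13))  # (12, 8, 1, 5)
```

Byte 13 lives in the word starting at byte 12 and in block 1 (which starts at byte 8), at byte offset 5 within that block.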
Block number aliasing example
12-bit memory addresses, 16 Byte blocks

Address (binary)   Block #   Block # mod 8   Block # mod 2
010100100000       82        2               0
010100110000       83        3               1
010101000000       84        4               0
010101010000       85        5               1
010101100000       86        6               0
010101110000       87        7               1
010110000000       88        0               0
010110010000       89        1               1
010110100000       90        2               0
010110110000       91        3               1
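The aliasing table can be reproduced with a short sketch, assuming 16-byte blocks and direct-mapped placement by block number modulo the number of cache blocks; the function names are hypothetical.

```python
BLOCK_BYTES = 16  # 16-byte blocks: the low 4 address bits are the byte offset

def block_number(addr):
    return addr // BLOCK_BYTES

def cache_index(addr, num_cache_blocks):
    # Block number modulo the number of cache blocks; blocks that share
    # the result alias to the same cache location.
    return block_number(addr) % num_cache_blocks

for addr in range(0b010100100000, 0b010110110000 + 1, BLOCK_BYTES):
    print(f"{addr:012b}  block {block_number(addr)}  "
          f"mod 8 = {cache_index(addr, 8)}  mod 2 = {cache_index(addr, 2)}")
```

Blocks 82 and 90 both map to index 2 in an 8-block cache, so they alias; in a 2-block cache every other block aliases.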
Caches Review
• Principle of Locality
  – Temporal Locality and Spatial Locality
• Hierarchy of Memories (speed/size/cost per bit) to Exploit Locality
• Cache – copy of data in lower level of memory hierarchy
• Direct Mapped to find block in cache using Tag field and Valid bit for Hit
• Cache design organization choices:
  – Fully Associative, Set-Associative, Direct-Mapped
Cache Organizations
• “Fully Associative”: Block can go anywhere
  – First design in lecture
  – Note: No Index field, but 1 comparator/block
• “Direct Mapped”: Block goes one place
  – Note: Only 1 comparator
  – Number of sets = number of blocks
• “N-way Set Associative”: N places for a block
  – Number of sets = number of blocks / N
  – N comparators
  – Fully Associative: N = number of blocks
  – Direct Mapped: N = 1
Processor Address Fields used by Cache Controller
• Block Offset: Byte address within block
• Set Index: Selects which set
• Tag: Remaining portion of processor address
• Size of Index = log2(number of sets)
• Size of Tag = Address size – Size of Index – log2(number of bytes/block)
Processor Address (32-bits total): | Tag | Set Index | Block offset |
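A small sketch of the field formulas above, assuming power-of-two set counts and block sizes; `field_widths` and `split_address` are hypothetical helper names, not part of the lecture.

```python
def field_widths(address_bits, num_sets, block_bytes):
    """Return (tag bits, index bits, offset bits) for power-of-two parameters."""
    offset_bits = block_bytes.bit_length() - 1  # log2(number of bytes/block)
    index_bits = num_sets.bit_length() - 1      # log2(number of sets)
    tag_bits = address_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits

def split_address(addr, num_sets, block_bytes):
    """Return the (tag, set index, block offset) fields of a byte address."""
    offset = addr % block_bytes
    index = (addr // block_bytes) % num_sets
    tag = addr // (block_bytes * num_sets)
    return tag, index, offset

print(field_widths(32, 1024, 4))                          # (20, 10, 2)
print([hex(f) for f in split_address(0x12345678, 1024, 4)])
```

With 1024 sets of one-word (4-byte) blocks, a 32-bit address splits into a 20-bit tag, a 10-bit index, and a 2-bit offset.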
Direct-Mapped Cache Review
• One word blocks, cache size = 1K words (or 4KB)
[Figure: the 32-bit address splits into a 20-bit Tag (bits 31 to 12), a 10-bit Index (bits 11 to 2), and a 2-bit Byte offset (bits 1 to 0). The Index selects one of 1024 entries (0 to 1023), each holding Valid, Tag, and 32-bit Data; a comparator checks the stored Tag against the address Tag to produce Hit.]
• Valid bit ensures something useful in cache for this index
• Compare Tag with upper part of Address to see if a Hit
• Read data from cache instead of memory if a Hit
Four-Way Set-Associative Cache
• 2^8 = 256 sets each with four ways (each with one block)
[Figure: the 32-bit address splits into a 22-bit Tag, an 8-bit Index, and a 2-bit Byte offset. The Set Index selects one of entries 0 to 255 in each of Way 0, Way 1, Way 2, and Way 3; each entry holds V, Tag, and Data. The tag comparisons produce Hit and drive a 4x1 select that outputs the 32-bit Data.]
Handling Stores with Write-Through
• Store instructions write to memory, changing values
• Need to make sure cache and memory have same values on writes: 2 policies
1) Write-Through Policy: write cache and write through the cache to memory
  – Every write eventually gets to memory
  – Too slow, so include Write Buffer to allow processor to continue once data in Buffer
  – Buffer updates memory in parallel to processor
Write-Through Cache
• Write both values in cache and in memory
• Write buffer stops CPU from stalling if memory cannot keep up
• Write buffer may have multiple entries to absorb bursts of writes
• What if store misses in cache?
[Figure: Processor, Cache, and Memory connected by 32-bit Address and 32-bit Data paths, with a Write Buffer of Addr/Data entries between the cache and memory.]
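The write buffer's role can be sketched as a small queue model. This is an illustration under simplifying assumptions, not the lecture's design: the class and method names are hypothetical, cache placement and eviction are not modeled, and the slide's open question about store misses is sidestepped by always writing the cache.

```python
from collections import deque

class WriteThroughCache:
    def __init__(self, buffer_entries=4):
        self.cache = {}              # address -> data (placement not modeled)
        self.memory = {}             # address -> data
        self.write_buffer = deque()  # pending (address, data) writes to memory
        self.buffer_entries = buffer_entries

    def store(self, addr, data):
        """Write the cache and queue the write; return True if the CPU stalled."""
        stalled = len(self.write_buffer) == self.buffer_entries
        if stalled:
            self.drain_one()  # buffer full: wait for memory to accept one write
        self.cache[addr] = data
        self.write_buffer.append((addr, data))
        return stalled

    def drain_one(self):
        """Memory accepts the oldest buffered write (runs alongside the CPU)."""
        if self.write_buffer:
            addr, data = self.write_buffer.popleft()
            self.memory[addr] = data
```

A burst of stores no longer than `buffer_entries` completes without stalling; only a longer burst has to wait for memory, and every write still reaches memory once the buffer drains.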
Handling Stores with Write-Back
2) Write-Back Policy: write only to cache, then write the cache block back to memory when the block is evicted from cache
  – Writes collected in cache, only single write to memory per block
  – Include bit to see if wrote to block or not, and then only write back if bit is set
    • Called “Dirty” bit (writing makes it “dirty”)
Write-Back Cache
• Store/cache hit, write data in cache only & set dirty bit
  – Memory has stale value
• Store/cache miss, read data from memory, then update and set dirty bit
  – “Write-allocate” policy
• Load/cache hit, use value from cache
• On any miss, write back evicted block, only if dirty. Update cache with new block and clear dirty bit.
[Figure: Processor, Cache, and Memory connected by 32-bit Address and 32-bit Data paths; each cache entry carries a Dirty Bit (D).]
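The four rules above can be sketched as a tiny simulator. It assumes a direct-mapped cache with one-word blocks addressed by word number; the class name and counters are hypothetical and only illustrate the dirty-bit bookkeeping.

```python
class WriteBackCache:
    def __init__(self, num_blocks, memory):
        self.num_blocks = num_blocks
        self.memory = memory              # dict: word address -> value
        self.lines = [None] * num_blocks  # each line: [tag, data, dirty] or None
        self.mem_reads = 0
        self.mem_writes = 0

    def _fill(self, addr):
        """Make sure the block for addr is cached and return its line."""
        index, tag = addr % self.num_blocks, addr // self.num_blocks
        line = self.lines[index]
        if line is None or line[0] != tag:     # miss
            if line is not None and line[2]:   # write back only if dirty
                self.memory[line[0] * self.num_blocks + index] = line[1]
                self.mem_writes += 1
            self.mem_reads += 1
            line = [tag, self.memory.get(addr, 0), False]  # new block is clean
            self.lines[index] = line
        return line

    def load(self, addr):
        return self._fill(addr)[1]

    def store(self, addr, value):
        line = self._fill(addr)  # write-allocate: fetch the block on a store miss
        line[1] = value
        line[2] = True           # memory now holds a stale value
```

A cache access here costs 0 memory accesses on a hit, 1 on a miss that evicts a clean block, and 2 on a miss that evicts a dirty block.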
Write-Through vs. Write-Back
• Write-Through:
  – Simpler control logic
  – More predictable timing simplifies processor control logic
  – Easier to make reliable, since memory always has copy of data (big idea: Redundancy!)
• Write-Back:
  – More complex control logic
  – More variable timing (0, 1, 2 memory accesses per cache access)
  – Usually reduces write traffic
  – Harder to make reliable, sometimes cache has only copy of data
Administrivia
• Project 3-1 due date Wednesday 10/21.
• Project 3-2 due date now 10/28 (release 10/21)
• Midterm 1:
  – grades posted
Cache (Performance) Terms
• Hit rate: fraction of accesses that hit in the cache
• Miss rate: 1 – Hit rate
• Miss penalty: time to replace a block from lower level in memory hierarchy to cache
• Hit time: time to access cache memory (including tag comparison)
• Abbreviation: “$” = cache (A Berkeley innovation!)
Average Memory Access Time (AMAT)
• Average Memory Access Time (AMAT) is the average time to access memory considering both hits and misses in the cache
AMAT = Time for a hit + Miss rate × Miss penalty
Clickers/Peer instruction
AMAT = Time for a hit + Miss rate × Miss penalty
Given a 200 psec clock, a miss penalty of 50 clock cycles, a miss rate of 0.02 misses per instruction and a cache hit time of 1 clock cycle, what is AMAT?
☐ A: ≤200 psec
☐ B: 400 psec
☐ C: 600 psec
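The AMAT formula is easy to check numerically. The sketch below plugs in the numbers from the question, assuming the stated miss rate applies per memory access; the function name `amat` and the constant `CLOCK_PS` are hypothetical.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """AMAT = time for a hit + miss rate x miss penalty (all in the same unit)."""
    return hit_time + miss_rate * miss_penalty

CLOCK_PS = 200  # clock period in psec, from the question
cycles = amat(hit_time=1, miss_rate=0.02, miss_penalty=50)
print(cycles, "cycles =", cycles * CLOCK_PS, "psec")
```

Working in clock cycles first and converting to psec at the end keeps the units consistent: 1 + 0.02 × 50 = 2 cycles, which is 400 psec at a 200 psec clock.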
Example: Direct-Mapped Cache with 4 Single-Word Blocks, Worst-Case Reference String
• Consider the main memory address reference string of word numbers: 0 4 0 4 0 4 0 4
Start with an empty cache - all blocks initially marked as not valid

Reference   Result   Cache block 0 after the access (tag, data)
0           miss     00, Mem(0)
4           miss     01, Mem(4)
0           miss     00, Mem(0)
4           miss     01, Mem(4)
0           miss     00, Mem(0)
4           miss     01, Mem(4)
0           miss     00, Mem(0)
4           miss     01, Mem(4)

• 8 requests, 8 misses
• Ping-pong effect due to conflict misses - two memory locations that map into the same cache block
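The ping-pong effect can be reproduced with a minimal direct-mapped simulator, assuming one-word blocks and references given as word numbers; `direct_mapped_misses` is a hypothetical name.

```python
def direct_mapped_misses(refs, num_blocks):
    """Count misses for word-number references in a direct-mapped cache."""
    tags = [None] * num_blocks  # None means the block is not valid
    misses = 0
    for word in refs:
        index, tag = word % num_blocks, word // num_blocks
        if tags[index] != tag:  # invalid entry or tag mismatch
            misses += 1
            tags[index] = tag   # replace the block
    return misses

print(direct_mapped_misses([0, 4, 0, 4, 0, 4, 0, 4], 4))  # 8
```

Words 0 and 4 both map to cache block 0 (0 mod 4 = 4 mod 4 = 0), so each access evicts the block the next access needs.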
Alternative Block Placement Schemes
• DM placement: mem block 12 in 8 block cache: only one cache block where mem block 12 can be found: (12 modulo 8) = 4
• SA placement: four sets x 2-ways (8 cache blocks), memory block 12 in set (12 mod 4) = 0; either element of the set
• FA placement: mem block 12 can appear in any cache block
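All three placement schemes reduce to the same modulo computation with a different number of sets. A minimal sketch, with the hypothetical helper `set_index`:

```python
def set_index(mem_block, num_cache_blocks, ways):
    """Set that a memory block maps to; it may occupy any way of that set."""
    num_sets = num_cache_blocks // ways
    return mem_block % num_sets

print(set_index(12, 8, 1))  # direct mapped (8 sets of 1 way): 4
print(set_index(12, 8, 2))  # 2-way set associative (4 sets): 0
print(set_index(12, 8, 8))  # fully associative (1 set of 8 ways): 0
```

In the fully associative case the single set spans the whole cache, which is why the block can appear anywhere.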
Example: 2-Way Set Associative $ (4 words = 2 sets x 2 ways per set)
[Figure: a cache with 2 sets (Set 0, Set 1) of 2 ways (Way 0, Way 1), each entry holding V, Tag, and Data, next to a main memory of 16 one-word blocks at addresses 0000xx through 1111xx.]
• One word blocks. Two low order bits define the byte in the word (32b words)
Q: How do we find it?
Use next 1 low order memory address bit to determine which cache set (i.e., modulo the number of sets in the cache)
Q: Is it there?
Compare all the cache tags in the set to the high order 3 memory address bits to tell if the memory block is in the cache
Example: 4-Word 2-Way SA $ Same Reference String
• Consider the main memory address reference string 0 4 0 4 0 4 0 4
Start with an empty cache - all blocks initially marked as not valid

Reference   Result   Set 0 after the access (tag, data per way)
0           miss     000, Mem(0)
4           miss     000, Mem(0); 010, Mem(4)
0           hit      000, Mem(0); 010, Mem(4)
4           hit      000, Mem(0); 010, Mem(4)

• 8 requests, 2 misses
• Solves the ping-pong effect in a direct-mapped cache due to conflict misses since now two memory locations that map into the same cache set can co-exist!
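The same reference string can be run through a small set-associative simulator. This sketch assumes one-word blocks, word-number references, and LRU replacement within each set; `set_associative_misses` is a hypothetical name.

```python
def set_associative_misses(refs, num_blocks, ways):
    """Count misses in a set-associative cache with LRU replacement per set."""
    num_sets = num_blocks // ways
    sets = [[] for _ in range(num_sets)]  # tags per set, least recently used first
    misses = 0
    for word in refs:
        index, tag = word % num_sets, word // num_sets
        tags = sets[index]
        if tag in tags:
            tags.remove(tag)     # hit: re-appended below as most recently used
        else:
            misses += 1
            if len(tags) == ways:
                tags.pop(0)      # set is full: evict the least recently used tag
        tags.append(tag)
    return misses

refs = [0, 4, 0, 4, 0, 4, 0, 4]
print(set_associative_misses(refs, 4, 2))  # 2
print(set_associative_misses(refs, 4, 1))  # 8 (one way per set is direct mapped)
```

With two ways, words 0 and 4 share set 0 but occupy different ways, so only the first access to each word misses.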
Four-Way Set-Associative Cache
• 2^8 = 256 sets each with four ways (each with one block)
[Figure: 22-bit Tag, 8-bit Index, and 2-bit Byte offset; the Index selects one of entries 0 to 255 in each of Way 0 to Way 3 (V, Tag, Data); the tag comparisons produce Hit and drive a 4x1 select that outputs the 32-bit Data.]
Different Organizations of an Eight-Block Cache
Total size of $ in blocks is equal to number of sets × associativity. For fixed $ size and fixed block size, increasing associativity decreases number of sets while increasing number of elements per set. With eight blocks, an 8-way set-associative $ is same as a fully associative $.
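The possible organizations can be enumerated directly from sets × associativity = total blocks. A minimal sketch, assuming power-of-two associativities; `organizations` is a hypothetical name.

```python
def organizations(num_blocks):
    """List (number of sets, ways) pairs for a cache of num_blocks blocks."""
    result = []
    ways = 1
    while ways <= num_blocks:
        result.append((num_blocks // ways, ways))
        ways *= 2
    return result

print(organizations(8))  # [(8, 1), (4, 2), (2, 4), (1, 8)]
```

The first pair is the direct-mapped organization and the last, with a single set, is the fully associative one.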
Range of Set-Associative Caches
• For a fixed-size cache and fixed block size, each increase by a factor of two in associativity doubles the number of blocks per set (i.e., the number of ways) and halves the number of sets
  – decreases the size of the index by 1 bit and increases the size of the tag by 1 bit
Address fields: | Tag | Index | Word offset | Byte offset |
  – Tag: used for tag compare
  – Index: selects the set
  – Word offset: selects the word in the block
• Decreasing associativity: direct mapped (only one way)
  – Smaller tags, only a single comparator
• Increasing associativity: fully associative (only one set)
  – Tag is all the bits except block and byte offset
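The one-bit trade between index and tag can be tabulated with a short sketch. It assumes a 4 KB cache with 4-byte blocks and 32-bit addresses purely for illustration; `tag_and_index_bits` is a hypothetical name.

```python
def tag_and_index_bits(cache_bytes, block_bytes, ways, address_bits=32):
    """Return (tag bits, index bits) for power-of-two cache parameters."""
    num_sets = cache_bytes // (block_bytes * ways)
    index_bits = num_sets.bit_length() - 1      # log2(number of sets)
    offset_bits = block_bytes.bit_length() - 1  # log2(bytes per block)
    return address_bits - index_bits - offset_bits, index_bits

for ways in (1, 2, 4, 8, 1024):
    print(ways, tag_and_index_bits(4096, 4, ways))
```

Each doubling of the ways moves one bit from the index to the tag: (20, 10) for direct mapped, then (21, 9), (22, 8), (23, 7), down to (30, 0) when all 1024 blocks form a single fully associative set.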
Total Cache Capacity =
Associativity × # of sets × block_size
Bytes = blocks/set × sets × Bytes/block
C = N × S × B
Address fields: | Tag | Index | Byte Offset |
address_size = tag_size + index_size + offset_size
             = tag_size + log2(S) + log2(B)
Clickers/Peer Instruction
• For a cache with constant total capacity, if we increase the number of ways by a factor of 2, which statement is false:
• A: The number of sets could be doubled
• B: The tag width could decrease
• C: The block size could stay the same
• D: The block size could be halved
• E: Tag width must increase
Total Cache Capacity =
Associativity × # of sets × block_size
Bytes = blocks/set × sets × Bytes/block
C = N × S × B
Address fields: | Tag | Index | Byte Offset |
address_size = tag_size + index_size + offset_size
             = tag_size + log2(S) + log2(B)
Clicker Question: C remains constant, S and/or B can change such that C = 2N × (SB)’ => (SB)’ = SB/2
Tag_size = address_size – (log2(S) + log2(B)) = address_size – log2(SB)
Tag_size’ = address_size – log2((SB)’) = address_size – log2(SB/2) = address_size – (log2(SB) – 1) = Tag_size + 1
Costs of Set-Associative Caches
• N-way set-associative cache costs
  – N comparators (delay and area)
  – MUX delay (set selection) before data is available
  – Data available after set selection (and Hit/Miss decision). DM $: block is available before the Hit/Miss decision
    • In Set-Associative, not possible to just assume a hit and continue and recover later if it was a miss
• When miss occurs, which way’s block selected for replacement?
  – Least Recently Used (LRU): one that has been unused the longest (principle of temporal locality)
    • Must track when each way’s block was used relative to other blocks in the set
    • For 2-way SA $, one bit per set → set to 1 when a block is referenced; reset the other way’s bit (i.e., “last used”)
Cache Replacement Policies
• Random Replacement
  – Hardware randomly selects a cache entry to evict
• Least-Recently Used
  – Hardware keeps track of access history
  – Replace the entry that has not been used for the longest time
  – For 2-way set-associative cache, need one bit for LRU replacement
• Example of a Simple “Pseudo” LRU Implementation
  – Assume 64 Fully Associative entries
  – Hardware replacement pointer points to one cache entry
  – Whenever access is made to the entry the pointer points to:
    • Move the pointer to the next entry
  – Otherwise: do not move the pointer
  – (example of “not-most-recently used” replacement policy)
[Figure: Entry 0, Entry 1, ..., Entry 63, with the Replacement Pointer pointing at one entry.]
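The replacement-pointer scheme can be sketched in a few lines. The class and method names are hypothetical, and wrapping from the last entry back to entry 0 is an assumption the slide does not state.

```python
class ReplacementPointer:
    """Not-most-recently-used replacement with a single pointer."""

    def __init__(self, num_entries=64):
        self.num_entries = num_entries
        self.pointer = 0  # entry that would be replaced next

    def access(self, entry):
        # Only an access to the pointed-to entry moves the pointer, so the
        # pointer never rests on the entry that was just used.
        if entry == self.pointer:
            self.pointer = (self.pointer + 1) % self.num_entries  # assumed wrap

    def victim(self):
        return self.pointer

p = ReplacementPointer(4)
p.access(2)        # pointer stays at 0
p.access(0)        # pointer moves to 1
print(p.victim())  # 1
```

The hardware cost is one pointer for the whole cache instead of per-entry access history, at the price of only approximating LRU.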
Benefits of Set-Associative Caches
• Choice of DM $ versus SA $ depends on the cost of a miss versus the cost of implementation
• Largest gains are in going from direct mapped to 2-way (20%+ reduction in miss rate)
Understanding Cache Misses: The 3Cs
• Compulsory (cold start or process migration, 1st reference):
  – First access to block impossible to avoid; small effect for long running programs
  – Solution: increase block size (increases miss penalty; very large blocks could increase miss rate)
• Capacity:
  – Cache cannot contain all blocks accessed by the program
  – Solution: increase cache size (may increase access time)
• Conflict (collision):
  – Multiple memory locations mapped to the same cache location
  – Solution 1: increase cache size
  – Solution 2: increase associativity (may increase access time)
How to Calculate 3C’s using Cache Simulator
1. Compulsory: set cache size to infinity and fully associative, and count number of misses
2. Capacity: Change cache size from infinity, usually in powers of 2, and count misses for each reduction in size
  – 16 MB, 8 MB, 4 MB, … 128 KB, 64 KB, 16 KB
3. Conflict: Change from fully associative to n-way set associative while counting misses
  – Fully associative, 16-way, 8-way, 4-way, 2-way, 1-way
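The three steps above can be sketched as a toy classifier. This assumes LRU replacement, a trace of block numbers, and a single cache size rather than a sweep; `lru_misses` and `three_cs` are hypothetical names, and a real cache simulator would be far more detailed.

```python
from collections import OrderedDict

def lru_misses(block_refs, num_blocks, ways):
    """Count misses in a cache of num_blocks blocks with LRU within each set."""
    num_sets = num_blocks // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in block_refs:
        s = sets[block % num_sets]
        if block in s:
            s.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            if len(s) == ways:
                s.popitem(last=False)  # evict the least recently used block
            s[block] = True
    return misses

def three_cs(block_refs, num_blocks, ways):
    # Step 1: an infinite fully associative cache misses once per distinct block.
    compulsory = len(set(block_refs))
    # Step 2: shrink to the real size but stay fully associative.
    fully_assoc = lru_misses(block_refs, num_blocks, num_blocks)
    # Step 3: reduce associativity to the real number of ways.
    actual = lru_misses(block_refs, num_blocks, ways)
    return {"compulsory": compulsory,
            "capacity": fully_assoc - compulsory,
            "conflict": actual - fully_assoc}

print(three_cs([0, 4, 0, 4, 0, 4, 0, 4], num_blocks=4, ways=1))
```

For the ping-pong trace on a 4-block direct-mapped cache this reports 2 compulsory, 0 capacity, and 6 conflict misses. Because the categories are differences between separate simulations, the conflict count can come out negative on traces where LRU in a fully associative cache happens to do worse than the set-associative one.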
3Cs Analysis
• Three sources of misses (SPEC2000 integer and floating-point benchmarks)
  – Compulsory misses 0.006%; not visible
  – Capacity misses, function of cache size
  – Conflict portion depends on associativity and cache size
Improving Cache Performance
AMAT = Time for a hit + Miss rate × Miss penalty
• Reduce the time to hit in the cache
  – E.g., Smaller cache
• Reduce the miss rate
  – E.g., Bigger cache
• Reduce the miss penalty
  – E.g., Use multiple cache levels
Impact of Larger Cache on AMAT?
• 1) Reduces misses (what kind(s)?)
• 2) Longer Access time (Hit time): smaller is faster
  – Increase in hit time will likely add another stage to the pipeline
• At some point, increase in hit time for a larger cache may overcome the improvement in hit rate, yielding a decrease in performance
• Computer architects expend considerable effort optimizing organization of cache hierarchy – big impact on performance and power!
Clickers: Impact of longer cache blocks on misses?
• For fixed total cache capacity and associativity, what is effect of longer blocks on each type of miss:
  – A: Decrease, B: Unchanged, C: Increase
• Compulsory?
• Capacity?
• Conflict?
Clickers: Impact of longer blocks on AMAT
• For fixed total cache capacity and associativity, what is effect of longer blocks on each component of AMAT:
  – A: Decrease, B: Unchanged, C: Increase
• Hit Time?
• Miss Rate?
• Miss Penalty?
Clickers/Peer Instruction: For fixed capacity and fixed block size, how does increasing associativity affect AMAT?
Cache Design Space
• Several interacting dimensions
  – Cache size
  – Block size
  – Associativity
  – Replacement policy
  – Write-through vs. write-back
  – Write allocation
• Optimal choice is a compromise
  – Depends on access characteristics
    • Workload
    • Use (I-cache, D-cache)
  – Depends on technology / cost
• Simplicity often wins
[Figure: design-space sketch with axes Cache Size, Associativity, and Block Size, and a plot running between Good and Bad as Factor A and Factor B go from Less to More.]
And, In Conclusion…
• Name of the Game: Reduce AMAT
  – Reduce Hit Time
  – Reduce Miss Rate
  – Reduce Miss Penalty
• Balance cache parameters (Capacity, associativity, block size)