A Memory Model for RISC-V
Sizhuo Zhang, Muralidaran Vijayaraghavan, ...

Why not SC/TSO?

• They both have simple specifications, both axiomatically and operationally
• But simple implementations have low performance
  - Strict ordering requirements for memory instructions
  - To improve performance, one must
    * speculatively execute memory instructions
    * monitor coherence invalidation traffic, and
    * keep checkpoints and roll back on invalidation

Why not POWER/ARM?

• Their operational models expose too many microarchitectural details
  - Branch speculation, OOO execution, etc. are exposed in the memory model specification!
• Their axiomatic models are too complex, with no well-understood relation to microarchitecture
  - One cannot say with confidence whether a particular microarchitectural implementation obeys the model

Properties for a new memory model

• Simple specification without microarchitectural details like branch speculation
• But well-established correspondence to microarchitecture implementations
• Inclusion of sufficient fences to force SC-like behavior when necessary

Our proposal for RISC-V memory model: WMM

Simple operational specification like SC and TSO

[Figure: processors with instantaneous in-order execution, each connected directly to an instantaneous memory]

SC:
• Stores update memory instantly
• Loads read memory instantly
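
The SC rules can be sketched as an exhaustive search over interleavings. This is an illustrative Python model, not part of the proposal: the instruction encoding, the name `sc_outcomes`, and the store-buffering litmus test used as input are all assumptions made for the example.

```python
def sc_outcomes(threads):
    """Collect every final register state reachable under the SC rules.

    Each thread is a list of ("St", addr, value) or ("Ld", addr, reg)
    tuples (a hypothetical encoding). Memory locations start at 0.
    """
    results = set()

    def explore(pcs, mem, regs):
        if all(pc == len(t) for pc, t in zip(pcs, threads)):
            results.add(tuple(sorted(regs.items())))
            return
        for i, thread in enumerate(threads):
            if pcs[i] == len(thread):
                continue
            op, addr, arg = thread[pcs[i]]
            new_mem, new_regs = dict(mem), dict(regs)
            if op == "St":
                new_mem[addr] = arg               # store updates memory instantly
            else:
                new_regs[arg] = mem.get(addr, 0)  # load reads memory instantly
            explore(pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:], new_mem, new_regs)

    explore((0,) * len(threads), {}, {})
    return results


# Store-buffering litmus test: each thread stores to one location,
# then loads from the other.
store_buffering = [
    [("St", "x", 1), ("Ld", "y", "r1")],
    [("St", "y", 1), ("Ld", "x", "r2")],
]
sc_results = sc_outcomes(store_buffering)
```

In this sketch the outcome r1 = r2 = 0 never appears in `sc_results`, because every interleaving places at least one store before the other thread's load.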

TSO:
• Stores are dequeued in order
• When a store is dequeued from the store buffer, it updates memory instantly
• A load reads the youngest store from the store buffer, or (if not present) memory instantly

[Figure: processors with instantaneous in-order execution, each with a store buffer in front of an instantaneous memory]
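
The TSO rules can be explored the same way by adding a FIFO store buffer per processor. The sketch below is illustrative only; the encoding and the name `tso_outcomes` are assumptions. It also assumes that a load matches store-buffer entries by address, which the slide leaves implicit.

```python
def tso_outcomes(threads):
    """Collect every final register state reachable under the TSO rules.

    Threads use the hypothetical encoding ("St", addr, value) /
    ("Ld", addr, reg). Memory locations start at 0.
    """
    n = len(threads)
    results, seen = set(), set()

    def put(tup, i, item):
        # Copy of `tup` with position i replaced by `item`.
        return tup[:i] + (item,) + tup[i + 1:]

    def explore(pcs, mem, sbs, regs):
        state = (pcs, mem, sbs, regs)
        if state in seen:
            return
        seen.add(state)
        if all(pc == len(t) for pc, t in zip(pcs, threads)):
            results.add(regs)
            return
        m = dict(mem)
        for i in range(n):
            # Background step: dequeue the oldest store; memory updates instantly.
            if sbs[i]:
                (a, v), rest = sbs[i][0], sbs[i][1:]
                new_mem = tuple(sorted({**m, a: v}.items()))
                explore(pcs, new_mem, put(sbs, i, rest), regs)
            if pcs[i] == len(threads[i]):
                continue
            op, a, arg = threads[i][pcs[i]]
            nxt = put(pcs, i, pcs[i] + 1)
            if op == "St":
                # Store enters the processor's own store buffer.
                explore(nxt, mem, put(sbs, i, sbs[i] + ((a, arg),)), regs)
            else:
                # Load: youngest matching store-buffer entry, else memory.
                hits = [v for x, v in sbs[i] if x == a]
                val = hits[-1] if hits else m.get(a, 0)
                new_regs = tuple(sorted({**dict(regs), arg: val}.items()))
                explore(nxt, mem, sbs, new_regs)

    explore((0,) * n, (), ((),) * n, ())
    return results


store_buffering = [
    [("St", "x", 1), ("Ld", "y", "r1")],
    [("St", "y", 1), ("Ld", "x", "r2")],
]
tso_results = tso_outcomes(store_buffering)
```

Here r1 = r2 = 0 becomes reachable: both stores can sit in their store buffers while both loads read the initial memory values.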

WMM:
• Stores are dequeued in order only for the same address
• When a store is dequeued from the store buffer, it updates memory instantly and enters every other invalidation buffer instantly
• A load reads the youngest store from the store buffer, or (if not present) the oldest entry in the invalidation buffer, or (if not present) memory instantly
• The oldest invalidation buffer entry can be thrown out at any time

[Figure: processors with instantaneous in-order execution, each with a store buffer and an invalidation buffer in front of an instantaneous memory]

Fences in WMM

• Reconcile fence: clears the invalidation buffer
• Commit fence: flushes the store buffer, i.e. waits until the store buffer is empty before executing
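
The WMM rules and the two fences can be sketched as a small state-space explorer. Several details here are assumptions rather than statements from the slides: the entry placed in the other processors' invalidation buffers is taken to be the stale memory value being overwritten; "oldest" is interpreted per address; a store removes entries for its address from its own invalidation buffer; and a stale value is not inserted for a processor whose store buffer already holds that address. The encoding and the name `wmm_outcomes` are hypothetical.

```python
def wmm_outcomes(threads):
    """Final register states reachable under a sketch of the WMM rules.

    Instructions (hypothetical encoding): ("St", addr, value),
    ("Ld", addr, reg), ("Commit",), ("Reconcile",). Memory starts at 0.
    """
    n = len(threads)
    results, seen = set(), set()

    def put(tup, i, item):
        return tup[:i] + (item,) + tup[i + 1:]

    def oldest_per_addr(buf):
        # Indices of the entries that are oldest for their address.
        first = {}
        for k, (a, _) in enumerate(buf):
            first.setdefault(a, k)
        return sorted(first.values())

    def explore(pcs, mem, sbs, ibs, regs):
        state = (pcs, mem, sbs, ibs, regs)
        if state in seen:
            return
        seen.add(state)
        if all(pc == len(t) for pc, t in zip(pcs, threads)):
            results.add(regs)
            return
        m = dict(mem)
        for i in range(n):
            # Dequeue a store that is oldest for its address.
            for k in oldest_per_addr(sbs[i]):
                a, v = sbs[i][k]
                new_sbs = put(sbs, i, sbs[i][:k] + sbs[i][k + 1:])
                stale = (a, m.get(a, 0))  # assumption: overwritten value
                new_ibs = tuple(
                    ib if j == i or any(x == a for x, _ in new_sbs[j])
                    else ib + (stale,)
                    for j, ib in enumerate(ibs))
                new_mem = tuple(sorted({**m, a: v}.items()))
                explore(pcs, new_mem, new_sbs, new_ibs, regs)
            # Throw out an invalidation-buffer entry that is oldest for its address.
            for k in oldest_per_addr(ibs[i]):
                explore(pcs, mem, sbs, put(ibs, i, ibs[i][:k] + ibs[i][k + 1:]), regs)
            if pcs[i] == len(threads[i]):
                continue
            ins = threads[i][pcs[i]]
            nxt = put(pcs, i, pcs[i] + 1)
            if ins[0] == "St":
                _, a, v = ins
                ib = tuple(e for e in ibs[i] if e[0] != a)  # assumption, see lead-in
                explore(nxt, mem, put(sbs, i, sbs[i] + ((a, v),)), put(ibs, i, ib), regs)
            elif ins[0] == "Ld":
                _, a, r = ins
                in_sb = [v for x, v in sbs[i] if x == a]
                in_ib = [v for x, v in ibs[i] if x == a]
                val = in_sb[-1] if in_sb else in_ib[0] if in_ib else m.get(a, 0)
                new_regs = tuple(sorted({**dict(regs), r: val}.items()))
                explore(nxt, mem, sbs, ibs, new_regs)
            elif ins[0] == "Commit":
                if not sbs[i]:  # waits until the store buffer is empty
                    explore(nxt, mem, sbs, ibs, regs)
            elif ins[0] == "Reconcile":
                explore(nxt, mem, sbs, put(ibs, i, ()), regs)

    explore((0,) * n, (), ((),) * n, ((),) * n, ())
    return results


# Message-passing litmus test, without and with fences.
mp_plain = [[("St", "data", 1), ("St", "flag", 1)],
            [("Ld", "flag", "r1"), ("Ld", "data", "r2")]]
mp_fenced = [[("St", "data", 1), ("Commit",), ("St", "flag", 1)],
             [("Ld", "flag", "r1"), ("Reconcile",), ("Ld", "data", "r2")]]
stale_read = (("r1", 1), ("r2", 0))  # flag seen as set, data still old
```

Under these assumptions, `stale_read` is reachable for `mp_plain` and unreachable for `mp_fenced`: the Commit keeps the flag store behind the data store, and the Reconcile discards the stale data value before the second load.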
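
Axiomatic Definition of WMM

Preserved program order axiom:
$X <_{po} Y \;\land\; \mathit{order}(X, Y) \;\Rightarrow\; X <_{mo} Y$

Load value axiom:
$\mathrm{Ld}\ a\ v \;\Rightarrow\; v = \max_{mo} \{\, v' \mid \mathrm{St}\ a\ v' <_{po} \mathrm{Ld}\ a \;\lor\; \mathrm{St}\ a\ v' <_{mo} \mathrm{Ld}\ a \,\}$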

order(X, Y), with X as the row and Y as the column:

X \ Y      | Ld b  | St b v' | Reconcile | Commit
Ld a       | a = b | True    | True      | True
St a v     | False | a = b   | False     | True
Reconcile  | True  | True    | True      | True
Commit     | False | True    | True      | True

• St-St fence: Commit
• Ld-Ld fence: Reconcile
• St-Ld fence: Commit + Reconcile
• Ld-St fence: not needed
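
The order(X, Y) table and the fence recipes can be cross-checked mechanically. The sketch below encodes the table as a Python dictionary and chains its entries along program order. The encoding, the names `order` and `preserved`, and the use of a transitive chain to derive orderings through fences are assumptions made for illustration.

```python
SAME = "same-address"  # marks the table cells that read "a = b"

ORDER_TABLE = {
    "Ld":        {"Ld": SAME,  "St": True, "Reconcile": True,  "Commit": True},
    "St":        {"Ld": False, "St": SAME, "Reconcile": False, "Commit": True},
    "Reconcile": {"Ld": True,  "St": True, "Reconcile": True,  "Commit": True},
    "Commit":    {"Ld": False, "St": True, "Reconcile": True,  "Commit": True},
}


def order(x, y):
    """order(X, Y) from the table. Instructions are ("Ld", addr),
    ("St", addr), ("Reconcile",) or ("Commit",) (hypothetical encoding)."""
    entry = ORDER_TABLE[x[0]][y[0]]
    return x[1] == y[1] if entry == SAME else entry


def preserved(program, i, j):
    """True if program[i] is ordered before program[j] (i < j) through a
    chain of order(X, Y) pairs that follows program order."""
    reach = {i}
    for k in range(i + 1, j + 1):
        if any(order(program[a], program[k]) for a in reach):
            reach.add(k)
    return j in reach


st_ld_plain = [("St", "x"), ("Ld", "y")]
st_ld_fenced = [("St", "x"), ("Commit",), ("Reconcile",), ("Ld", "y")]
ld_st_plain = [("Ld", "x"), ("St", "y")]
```

With this encoding, `preserved(st_ld_plain, 0, 1)` is false while `preserved(st_ld_fenced, 0, 3)` is true, matching the St-Ld recipe of Commit + Reconcile. `preserved(ld_st_plain, 0, 1)` is true with no fence at all, which is why a Ld-St fence is not needed.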

Implementing WMM – Processor side

• A typical OOO implementation obeys WMM
• A load can execute as early as it wants without being squashed, as long as it doesn't overtake a reconcile, or a load or store to the same address
  - Local checks, no monitoring of coherence invalidations
  - Load address speculation allowed – squashed only if the predicted address is wrong
• All instructions are committed in order
  - Stores visible to other threads/cores only after commit
    * Stores cannot overtake loads
  - Prevents "out-of-thin-air" generation of values
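
The local check on early load execution can be sketched as a simple predicate. This is a simplified illustration, not a description of any real pipeline: the name `load_may_issue` and the instruction encoding are hypothetical, and all older addresses are assumed to be already resolved.

```python
def load_may_issue(load_addr, older_pending):
    """Return True if a load to `load_addr` may execute now.

    `older_pending` lists the older instructions (in program order) that
    have not executed yet, as (kind, addr) pairs; fences carry addr None.
    """
    for kind, addr in older_pending:
        if kind == "Reconcile":
            return False  # must not overtake a reconcile
        if kind in ("Ld", "St") and addr == load_addr:
            return False  # must not overtake a same-address load or store
    return True  # purely local check, no coherence monitoring involved
```

For example, `load_may_issue("x", [("St", "y"), ("Commit", None)])` returns True in this sketch, since neither a store to a different address nor a Commit blocks the load.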

Out-of-thin-air issue

• No processor can produce values out of thin air
  - But an incomplete set of axioms seemingly allows this
• Insisting on in-order commits and advertising stores to other threads/processors only after commit takes care of this issue

Thread 1     | Thread 2
Ld R1 = x    | Ld R2 = y
St y = R1    | St x = 42

Initially x = y = R1 = R2 = 0
Finally x = y = R1 = R2 = 42

Implementing WMM – Memory side

• Typical cache-coherent memory with write-back caches attached to typical OOO processors implements WMM
• If L1 is write-through, it still implements WMM unless the core is SMT
• SMT cores with L1 write-through caches implement a "non-multicopy-atomic" memory, which is not covered by WMM
  - Don't do it

Mapping C++11 to WMM

C++11             | WMM
Non-atomic Load   | Load
Load Relaxed      | Load
Load Consume      | Load; Reconcile
Load Acquire      | Load; Reconcile
Load SC           | Commit; Reconcile; Load; Reconcile
Non-atomic Store  | Store
Store Relaxed     | Store
Store Release     | Commit; Store
Store SC          | Commit; Store

Using the operational specification of WMM makes it straightforward
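
The mapping table can be written down directly as a lookup. The sketch below is illustrative; the key names and the helper `lower` are hypothetical, and only the instruction sequences come from the table.

```python
# (operation, memory order) -> WMM instruction sequence, copied from the table.
CPP11_TO_WMM = {
    ("load", "non-atomic"):  ["Load"],
    ("load", "relaxed"):     ["Load"],
    ("load", "consume"):     ["Load", "Reconcile"],
    ("load", "acquire"):     ["Load", "Reconcile"],
    ("load", "seq_cst"):     ["Commit", "Reconcile", "Load", "Reconcile"],
    ("store", "non-atomic"): ["Store"],
    ("store", "relaxed"):    ["Store"],
    ("store", "release"):    ["Commit", "Store"],
    ("store", "seq_cst"):    ["Commit", "Store"],
}


def lower(ops):
    """Translate a list of (operation, memory order) pairs into a flat
    list of WMM instructions, one table row at a time."""
    out = []
    for op in ops:
        out.extend(CPP11_TO_WMM[op])
    return out


release_then_acquire = lower([("store", "release"), ("load", "acquire")])
```

In this sketch `release_then_acquire` is `["Commit", "Store", "Load", "Reconcile"]`: the Commit precedes the releasing store and the Reconcile follows the acquiring load.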

Conclusion

• WMM is a memory model with a simple specification and high-performance implementations
  - Blends well with the RISC-V philosophy and should be used as the memory model for RISC-V

Thank you!