A Memory Model for RISC-V€¦ · A Memory Model for RISC-V SizhuoZhang, MuralidaranVijayaraghavan,...

15
A Memory Model for RISC-V Sizhuo Zhang, Muralidaran Vijayaraghavan, Arvind RISC-V Workshop, November 29, 2016

Transcript of A Memory Model for RISC-V€¦ · A Memory Model for RISC-V SizhuoZhang, MuralidaranVijayaraghavan,...

AMemoryModelforRISC-V

Sizhuo Zhang,Muralidaran Vijayaraghavan,Arvind

RISC-VWorkshop,November29,2016

WhynotSC/TSO?

Theybothhavesimplespecifications,bothaxiomaticallyandoperationallyButsimpleimplementationshavelowperformancen Strictorderingrequirementsformemory

instructionsn Toimproveperformance,onemust

w speculativelyexecutememoryinstructionsw monitorcoherenceinvalidationtraffic,andw keepcheckpointsandrollbackoninvalidation

WhynotPOWER/ARM?

Theiroperationalmodelsexposetoomuchmicroarchitectural detailsn Branchspeculation,OOOexecution,etc is

exposedinthememorymodelspecification!Theiraxiomaticmodelsaretoocomplexwithnowell-understoodrelationtomicroarchitecturen Onecannotsaywithconfidenceifaparticular

microarchitectural implementationobeysthemodel

Propertiesforanewmemorymodel

Simplespecificationwithoutmicroarchitectural detailslikeBranchspeculationButwellestablishedcorrespondencetomicroarchitectureimplementationsInclusionofsufficientfencestoforceSC-likebehaviorwhennecessary

OurproposalforRISC-Vmemorymodel:WMM

SimpleoperationalspecificationlikeSCandTSOProcessor …

InstantaneousMemory

SC:• Storesupdatememoryinstantly• Loadreadsmemoryinstantly

ProcessorInstantaneousInorderExecution

OurproposalforRISC-Vmemorymodel:WMM

SimpleoperationalspecificationlikeSCandTSOProcessor …

InstantaneousMemory

TSO:• Storesaredequeued inorder• Whenstoresaredequeued fromstorebuffer,itupdatesmemory

instantly• Loadreadstheyoungeststorefromstorebuffer,or(ifnotpresent)

memoryinstantly

StoreBuffer

Processor

StoreBuffer

InstantaneousInorderExecution

OurproposalforRISC-Vmemorymodel:WMM

SimpleoperationalspecificationlikeSCandTSOProcessor …

InstantaneousMemory

WMM:• Storesaredequeued inorderonlyforsameaddress• Whenstoresaredequeued fromstorebuffer,itupdatesmemory

instantlyandenterseveryotherinvalidationbufferinstantly• Loadreadstheyoungeststorefromstorebuffer,or(ifnotpresent)

oldestentryininvalidationbuffer,or(ifnotpresent)memoryinstantly• Oldestinvalidationbufferentrycanbethrownoutanytime

StoreBuffer

Processor

StoreBuffer

InvalidationBuffer

InvalidationBuffer

InstantaneousInorderExecution

FencesinWMM

ReconcileFence:ClearsInvalidationbufferCommitFence:FlushesStorebuffer,i.e.waittillstorebufferisemptybeforeexecuting

AxiomaticDefinitionofWMMPreservedprogramorderaxiom

𝑋 <#$ 𝑌 ∧ 𝑜𝑟𝑑𝑒𝑟 𝑋, 𝑌 ⇒ 𝑋 <.$ 𝑌

Loadvalueaxiom𝐿𝑑𝑎𝑣 ⇒ 𝑣 = 𝑚𝑎𝑥𝑚𝑜 𝑣5 𝑆𝑡𝑎𝑣5 <#$ 𝐿𝑑𝑎 ∨ 𝑆𝑡𝑎𝑣5 <.$ 𝐿𝑑𝑎

𝑜𝑟𝑑𝑒𝑟 𝑋, 𝑌 𝑌Ld b Stbv’ Reconcile Commit

𝑋

Ld a a=b True True True

Stav False a=b False True

Reconcile True True True True

Commit False True True True

St-StFence:CommitLd-Ld Fence:Reconcile

St-Ld Fence:Commit+ReconcileLd-StFence:Notneeded

ImplementingWMM– ProcessorsideAtypicalOOOimplementationobeysWMM

Aloadcanexecuteasearlyasitwantswithoutbeingsquashedaslongasitdoesn’tovertakeareconcileorloadorstoretosameaddressn Localchecks,nomonitoringofcoherence

invalidationsn Loadaddressspeculationallowed– squashedonlyif

predictedaddressiswrongAllinstructionsarecommittedinordern Storesvisibletootherthreads/coresonlyafter

commitw Storescannotovertakeloads

n Prevents“out-of-thin-air”generationofvalues

Out-of-thin-airissue

Noprocessorcanproducevaluesoutofthinairn ButincompletesetofaxiomsseeminglyallowsthisInsistingonin-ordercommitsandadvertisingstoresonlyaftercommittootherthreads/processorstakescareofthisissue

Thread1 Thread2Ld R1=x Ld R2=ySty=R1 Stx=42

Initiallyx=y=R1=R2=0Finallyx=y=R1=R2=42

ImplementingWMM– Memoryside

Typicalcache-coherentmemorywithwriteback cachesattachedtotypicalOOOprocessorsimplementsWMMIfL1iswrite-through,itstillimplementsWMMunlessthecoreisSMTSMTcoreswithL1write-throughcachesimplementa“non-multicopy-atomic”memory,whichisnotcoveredbyWMM

Don’tdoit

MappingC++11toWMMC++11 WMM

Non-atomic Load Load

Load Relaxed Load

LoadConsume Load;Reconcile

LoadAcquire Load;Reconcile

LoadSC Commit;Reconcile;Load;Reconcile

Non-atomicStore Store

StoreRelaxed Store

StoreRelease Commit;Store

StoreSC Commit;Store

UsingoperationalspecificationofWMMmakesitstraightforward

Conclusion

WMMisamemorymodelwithsimplespecificationandhighperformantimplementationn BlendswellwithRISC-Vphilosophyandshould

beusedasthememorymodelforRISC-V

Thankyou!

[email protected]@csail.mit.edu

Backup