SingleCycle - Computer Sciencecr4bd/3330/S2017/notes/20170209-slides-4up.pdfFeb 09, 2017 · •...
Transcript of SingleCycle - Computer Sciencecr4bd/3330/S2017/notes/20170209-slides-4up.pdfFeb 09, 2017 · •...
2/9/17
1
SamiraKhanUniversityofVirginia
Feb9,2017
IntrotoMicroarchitecture:Single-Cycle
CS3330
AGENDA• Reviewfromlastlecture
• ISAtradeoffs
• Single-cycleMicroarchitecture
2
Review:ISAvs.Microarchitecture• ISA(InstructionSetArchitecture)
• Agreeduponinterfacebetweensoftwareandhardware
• SW/compilerassumes,HWpromises• Whatthesoftwarewriterneedstoknowtowriteanddebugsystem/userprograms
• Microarchitecture• SpecificimplementationofanISA• Notvisibletothesoftware
• Microprocessor• ISA,uarch,circuits• “Architecture” =ISA+microarchitecture
Microarchitecture
Circuits
ISA
Problem
Algorithm
Program
Transistors
3
Review:ISA• Instructions
• Opcodes,AddressingModes,DataTypes• InstructionTypesandFormats• Registers,ConditionCodes
• Memory• Addressspace,Addressability,Alignment• Virtualmemorymanagement
• Call,Interrupt/ExceptionHandling• AccessControl,Priority/Privilege• I/O:memory-mappedvs.instr.• Task/threadManagement• PowerandThermalManagement• Multi-threadingsupport,Multiprocessorsupport
4
2/9/17
2
Microarchitecture• ImplementationoftheISAunderspecific designconstraintsandgoals• Anythingdoneinhardwarewithoutexposuretosoftware
• Pipelining(willseelater)• Clockgating• Caching?Levels,size,associativity,replacementpolicy• Prefetching?• Voltage/frequencyscaling?• Errorcorrection?
5
PropertyofISAvs.Uarch?• ADDinstruction’sopcode• Numberofgeneralpurposeregisters• Numberofportstotheregisterfile• NumberofcyclestoexecutetheMULinstruction• Whetherornotthemachineemployspipelinedinstructionexecution
• Remember• Microarchitecture:ImplementationoftheISAunderspecific designconstraintsandgoals
6
DesignPoint• Asetofdesignconsiderationsandtheirimportance
• leadstotradeoffsinbothISAanduarch• Considerations
• Cost• Performance• Maximumpowerconsumption• Energyconsumption(batterylife)• Availability• ReliabilityandCorrectness• TimetoMarket
• Designpointdeterminedbythe“Problem” space(applicationspace),theintendedusers/market
7
DesignPoint• Asetofdesignconsiderationsandtheirimportance
• leadstotradeoffsinbothISAanduarch• Considerations
• Cost• Performance• Maximumpowerconsumption• Energyconsumption(batterylife)• Availability• ReliabilityandCorrectness• TimetoMarket
• Designpointdeterminedbythe“Problem” space(applicationspace),theintendedusers/marketLookForward&Up
8
2/9/17
3
ROLEOFTHE(COMPUTER)ARCHITECT
from Yale Patt’s lecture notes9
ROLEOFTHE(COMPUTER)ARCHITECT• Lookbackward(tothepast)
• Understandtradeoffsanddesigns,upsides/downsides,pastworkloads.Analyzeandevaluatethepast
• Lookforward(tothefuture)• Bethedreamerandcreatenewdesigns.Listentodreamers• Pushthestateoftheart.Evaluatenewdesignchoices
• Lookup(towardsproblemsinthecomputingstack)• Understandimportantproblemsandtheirnature• Developarchitecturesandideastosolveimportantproblems
• Lookdown(towardsdevice/circuittechnology)• Understandthecapabilitiesoftheunderlyingtechnology• Predictandadapttothefutureoftechnology(youaredesigningforNyearsahead).Enablethefuturetechnology 10
ApplicationSpace
• Dream,andtheywillappear…
11
Tradeoffs:SoulofComputerArchitecture• ISA-leveltradeoffs
• Microarchitecture-leveltradeoffs
• SystemandTask-leveltradeoffs• Howtodividethelaborbetweenhardwareandsoftware
• Computerarchitectureisthescienceandartofmakingtheappropriatetrade-offstomeetadesignpoint
• Whyart?
12
2/9/17
4
ISAPrinciplesandTradeoffs
ManyDifferentISAsOverDecades• x86• PDP-x:ProgrammedDataProcessor(PDP-11)• VAX• IBM360• CDC6600• SIMDISAs: CRAY-1,ConnectionMachine• VLIWISAs:Multiflow,Cydrome,IA-64(EPIC)• PowerPC,POWER• RISCISAs:Alpha,MIPS,SPARC,ARM
• Whatarethefundamentaldifferences?• E.g.,howinstructionsarespecifiedandwhattheydo• E.g.,howcomplexaretheinstructions
14
MIPS
opcode6-bit
rs5-bit
rt5-bit
immediate16-bit
I-type
R-type06-bit
rs5-bit
rt5-bit
rd5-bit
shamt5-bit
funct6-bit
opcode6-bit
immediate26-bit
J-type
15
ARM
16
2/9/17
5
WhatAretheElementsofAnISA?• Instructions
• Opcode• Operandspecifiers(addressingmodes)
• Howtoobtaintheoperand?
• Datatypes• Definition:Representationofinformationforwhichthereareinstructionsthatoperateontherepresentation
• Integer,floatingpoint,character,binary,decimal,BCD• Doublylinkedlist,queue,string,bitvector,stack
• VAX:INSQUEUEandREMQUEUEinstructionsonadoublylinkedlistorqueue;FINDFIRST
• DigitalEquipmentCorp.,“VAX11780ArchitectureHandbook,” 1977.• X86:SCANopcodeoperatesoncharacterstrings;PUSH/POP
Why are there different addressing modes?
17
DataTypeTradeoffs• Whatisthebenefitofhavingmoreorhigh-leveldatatypesintheISA?• Whatisthedisadvantage?
• Thinkcompiler/programmervs.microarchitect
• Conceptofsemanticgap• Datatypescoupledtightlytothesemanticlevel,orcomplexityofinstructions
• Example:EarlyRISCarchitecturesvs.Intel432• EarlyRISC:Onlyintegerdatatype• Intel432:Objectdatatype,capabilitybasedmachine
18
Complexvs.SimpleInstructions• Complexinstruction:Aninstructiondoesalotofwork,e.g.manyoperations
• Insertinadoublylinkedlist• ComputeFFT• Stringcopy
• Simpleinstruction:Aninstructiondoessmallamountofwork,itisaprimitiveusingwhichcomplexoperationscanbebuilt
• Add• XOR• Multiply
19
Complexvs.SimpleInstructions• AdvantagesofComplexinstructions
+Denserencodingà smallercodesizeà bettermemoryutilization,savesoff-chipbandwidth,bettercachehitrate(betterpackingofinstructions)
+Simplercompiler:noneedtooptimizesmallinstructionsasmuch
• DisadvantagesofComplexInstructions- Largerchunksofworkà compilerhaslessopportunitytooptimize(limitedinfine-grainedoptimizationsitcando)
- Morecomplexhardwareà translationfromahighleveltocontrolsignalsandoptimizationneedstobedonebyhardware
20
2/9/17
6
ISA-levelTradeoffs:SemanticGap• WheretoplacetheISA? Semanticgap
• Closertohigh-levellanguage(HLL)à Smallsemanticgap,complexinstructions
• Closertohardwarecontrolsignals?à Largesemanticgap,simpleinstructions
• RISCvs.CISCmachines• RISC:Reducedinstructionsetcomputer• CISC:Complexinstructionsetcomputer
• FFT,QUICKSORT,POLY,FPinstructions?• VAXINDEXinstruction(arrayaccesswithboundschecking)
21
ISA-levelTradeoffs:SemanticGap• Sometradeoffs(foryoutothinkabout)
• Simplecompiler,complexhardwarevs.complexcompiler,simplehardware
• Burdenofbackwardcompatibility
• Performance?EnergyConsumption?• Optimizationopportunity:ExampleofVAXINDEXinstruction:who(compilervs.hardware)putsmoreeffortintooptimization?
• Instructionsize,codesize
22
SmallversusLargeSemanticGap• CISCvs.RISC
• Complexinstructionsetcomputerà complexinstructions• Initiallymotivatedby“notgoodenough” codegeneration
• Reducedinstructionsetcomputerà simpleinstructions• JohnCocke,mid1970s,IBM801
• Goal:enablebettercompilercontrolandoptimization
• RISCmotivatedby• Memorystalls(noworkdoneinacomplexinstructionwhenthereisamemorystall?)
• Whenisthiscorrect?• Simplifyingthehardwareà lowercost,higherfrequency• Enablingthecompilertooptimizethecodebetter
• Findfine-grainedparallelismtoreducestalls
23
ISA-levelTradeoffs:InstructionLength• Fixedlength:Lengthofallinstructionsthesame
+Easiertodecodesingleinstructioninhardware+Easiertodecodemultipleinstructionsconcurrently-- Wastedbitsininstructions(Whyisthisbad?)-- Harder-to-extendISA(howtoaddnewinstructions?)
• Variablelength:Lengthofinstructionsdifferent(determinedbyopcodeandsub-opcode)
+Compactencoding(Whyisthisgood?)Intel432:6to321bitinstructions.
-- Morelogictodecodeasingleinstruction-- Hardertodecodemultipleinstructionsconcurrently
• Tradeoffs• Codesize(memoryspace,bandwidth,latency)vs.hardwarecomplexity• ISAextensibilityandexpressivenessvs.hardwarecomplexity• Performance?Energy?Smallercodevs.easeofdecode
24
2/9/17
7
ISA-levelTradeoffs:UniformDecode• Uniformdecode:Samebitsineachinstructioncorrespondtothesamemeaning
• Opcodeisalwaysinthesamelocation• Dittooperandspecifiers,immediatevalues,…• Many“RISC” ISAs:Alpha,MIPS,SPARC+Easierdecode,simplerhardware+Enablesparallelism:generatetargetaddressbeforeknowingtheinstructionisabranch
-- Restrictsinstructionformat(fewerinstructions?)orwastesspace
• Non-uniformdecode• E.g.,opcodecanbethe1st-7thbyteinx86+Morecompactandpowerfulinstructionformat-- Morecomplexdecodelogic
25
ISA-levelTradeoffs:NumberofRegisters• Affects:
• Numberofbitsusedforencodingregisteraddress• Numberofvalueskeptinfaststorage(registerfile)• (uarch)Size,accesstime,powerconsumptionofregisterfile
• Largenumberofregisters:+Enablesbetterregisterallocation(andoptimizations)bycompileràfewersaves/restores
-- Largerinstructionsize-- Largerregisterfilesize
26
ISA-levelTradeoffs:AddressingModes• Addressingmodespecifieshowtoobtainanoperandofaninstruction
• Register• Immediate• Memory(displacement,registerindirect,indexed,absolute,memoryindirect,autoincrement,autodecrement,…)
• Moremodes:+helpbettersupportprogrammingconstructs(arrays,pointer-basedaccesses)
-- makeitharderforthearchitecttodesign-- toomanychoicesforthecompiler?
• Manywaystodothesamethingcomplicatescompilerdesign• Wulf,“CompilersandComputerArchitecture,” IEEEComputer1981
27
ANoteonRISCvs.CISC• Usually,…
• RISC• Simpleinstructions• Fixedlength• Uniformdecode• Fewaddressingmodes
• CISC• Complexinstructions• Variablelength• Non-uniformdecode• Manyaddressingmodes
28
2/9/17
8
FoodforThoughtforYou• HowwouldyoudesignanewISA?
• Wherewouldyouplaceit?• WhatdesignchoiceswouldyoumakeintermsofISAproperties?
• Whatwouldbethefirstquestionyouaskinthisprocess?
• “Whatismydesignpoint?”
LookForward&Up
29
Y86-64InstructionSet#1Byte
pushq rA A 0 rA F
jXX Dest 7 fn Dest
popq rA B 0 rA F
call Dest 8 0 Dest
cmovXX rA, rB 2 fn rA rB
irmovq V, rB 3 0 F rB V
rmmovq rA, D(rB) 4 0 rA rB D
mrmovq D(rB), rA 5 0 rA rB D
OPq rA, rB 6 fn rA rB
ret 9 0
nop 1 0
halt 0 0
0 1 2 3 4 5 6 7 8 9
30
NowThatWeHaveanISA• Howdoweimplementit?
• i.e.,howdowedesignasystemthatobeysthehardware/softwareinterface?
31
ImplementingtheISA:MicroarchitectureBasics
2/9/17
9
HowDoesaMachineProcessInstructions?• Whatdoesprocessinganinstructionmean?• RememberthevonNeumannmodel
AS=Architectural(programmervisible)statebeforeaninstructionisprocessed
Processinstruction
AS’ =Architectural(programmervisible)stateafteraninstructionisprocessed
• Processinganinstruction:TransformingAStoAS’ accordingtotheISAspecificationoftheinstruction
33
The“Processinstruction” Step• ISAspecifiesabstractlywhatAS’ shouldbe,givenaninstructionandAS
• Itdefinesanabstractfinitestatemachinewhere• State=programmer-visiblestate• Next-statelogic=instructionexecutionspecification
• FromISApointofview,thereareno“intermediatestates” betweenASandAS’duringinstructionexecution
• Onestatetransitionperinstruction
• MicroarchitectureimplementshowASistransformedtoAS’• Therearemanychoicesinimplementation• Wecanhaveprogrammer-invisiblestatetooptimizethespeedofinstructionexecution:multiplestatetransitionsperinstruction
• Choice1:ASà AS’ (transformAStoAS’ inasingleclockcycle)• Choice2:ASà AS+MS1à AS+MS2à AS+MS3à AS’ (takemultipleclockcyclesto
transformAStoAS’)
34
AVeryBasicInstructionProcessingEngine• Eachinstructiontakesasingleclockcycletoexecute• Onlycombinationallogicisusedtoimplementinstructionexecution
• Nointermediate,programmer-invisiblestateupdates
AS=Architectural(programmervisible)stateatthebeginningofaclockcycle
Processinstructioninoneclockcycle
AS’ =Architectural(programmervisible)stateattheendofaclockcycle
35
AVeryBasicInstructionProcessingEngine• Single-cyclemachine
• Whatistheclockcycletimedeterminedby?• Whatisthecriticalpath ofthecombinationallogicdeterminedby?
AS’ AS(State)CombinationalLogic
36
2/9/17
10
Assembly/MachineCodeView
Programmer-VisibleState• PC:Programcounter
• Addressofnextinstruction• Called“RIP”(x86-64)
• Registerfile• Heavilyusedprogramdata
• Conditioncodes• Storestatusinformationaboutmostrecentarithmeticorlogicaloperation
• Usedforconditionalbranching
CPU
PC
Registers
Memory
CodeDataStack
Addresses
Data
InstructionsConditionCodes
• Memory• Byteaddressablearray• Codeanduserdata
• Stacktosupportprocedures
Instructions(andprograms)specifyhowtotransformthevaluesofprogrammervisiblestate37
Single-cyclevs.Multi-cycleMachines• Single-cyclemachines
• Eachinstructiontakesasingleclockcycle• Allstateupdatesmadeattheendofaninstruction’sexecution• Bigdisadvantage:Theslowestinstructiondeterminescycletimeà longclockcycletime
• Multi-cyclemachines• Instructionprocessingbrokenintomultiplecycles/stages• Stateupdatescanbemadeduringaninstruction’sexecution• Architecturalstateupdatesmadeonlyattheendofaninstruction’sexecution• Advantageoversingle-cycle:Theslowest“stage” determinescycletime
n Bothsingle-cycleandmulti-cyclemachinesliterallyfollowthevonNeumannmodelatthemicroarchitecturelevel
38
InstructionProcessing“Stage”• Instructionsareprocessedunderthedirectionofa“controlunit” stepbystep.
• Instructionstage:Sequenceofstepstoprocessaninstruction• Fundamentally,therearefivephases:
• Fetch• Decode• EvaluateAddress/FetchOperands• Execute• StoreResult
• Notallinstructionsrequireallstages
39
InstructionProcessing“Cycle” vs.MachineClockCycle• Single-cyclemachine:
• Allphasesoftheinstructionprocessingcycletakeasinglemachineclockcycle tocomplete
• Multi-cyclemachine:• Allsixphasesoftheinstructionprocessingcyclecantakemultiplemachineclockcycles tocomplete
• Infact,eachphasecantakemultipleclockcyclestocomplete
40
2/9/17
11
InstructionProcessingViewedAnotherWay
• InstructionstransformData(AS)toData’ (AS’)• Thistransformationisdonebyfunctionalunits
• Unitsthat“operate” ondata
• Theseunitsneedtobetoldwhattodotothedata
• Aninstructionprocessingengineconsistsoftwocomponents• Datapath:Consistsofhardwareelementsthatdealwithandtransformdatasignals
• functionalunitsthatoperateondata• hardwarestructures(e.g.wiresandmuxes)thatenabletheflowofdataintothefunctionalunitsandregisters
• storageunitsthatstoredata(e.g.,registers)• Controllogic:Consistsofhardwareelementsthatdeterminecontrolsignals,i.e.,signalsthatspecifywhatthedatapath elementsshoulddotothedata
41
Single-cyclevs.Multi-cycle:Control&Data• Single-cyclemachine:
• Controlsignalsaregeneratedinthesameclockcycleastheoneduringwhichdatasignalsareoperatedon
• Everythingrelatedtoaninstructionhappensinoneclockcycle(serializedprocessing)
• Multi-cyclemachine:• Controlsignalsneededinthenextcyclecanbegeneratedinthecurrentcycle
• Latencyofcontrolprocessingcanbeoverlappedwithlatencyofdatapath operation(moreparallelism)
42
ManyWaysofDatapath andControlDesign• Therearemanywaysofdesigningthedatapathandcontrollogic
• Single-cycle,multi-cycle,pipelineddatapath andcontrol
• Hardwired/combinationalvs.microcoded/microprogrammedcontrol• Controlsignalsgeneratedbycombinationallogicversus• Controlsignalsstoredinamemorystructure
43
Flash-Forward:PerformanceAnalysis• Executiontimeofaninstruction
• {CPI}x{clockcycletime}
• Executiontimeofaprogram• Sumoverallinstructions[{CPI}x{clockcycletime}]• {#ofinstructions}x{AverageCPI}x{clockcycletime}
• Singlecyclemicroarchitectureperformance• CPI=1• Clockcycletime=long
• Multi-cyclemicroarchitectureperformance• CPI=differentforeachinstruction
• AverageCPIà hopefullysmall• Clockcycletime=short
Now, we have two degrees of freedomto optimize independently
44
2/9/17
12
ASingle-CycleMicroarchitectureACloserLook
Remember…• Single-cyclemachine
AS(State)CombinationalLogic
AS’
46
Let’sStartwiththeStateElements• Dataandcontrolinputs
Registerfile
A
B
W dstW
srcA
valA
srcB
valB
valW
ALU
Operation
A
B
MUX
0
1
PC
InstructionMem
InstrAddr Instruction
DataMem
AddressReadData
WriteData
RegWrite
MemWrite
MemRead
MUXSelect
47
ForNow,WeWillAssume• “Magic” memoryandregisterfile
• Synchronouswrite• theselectedregisterisupdatedonthepositiveedgeclocktransitionwhenwriteenableisasserted
• Cannotaffectreadoutputinbetweenclockedges
48
2/9/17
13
InstructionProcessing• 6(5)genericsteps
• Instructionfetch(IF)• Instructiondecodeandregisteroperandfetch(ID/RF)• Execute/Evaluatememoryaddress(EX/AG)• Memoryoperandfetch(MEM)• Store/writeback result(WB)• PCUpdate
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
IF ID/RF EX/AG WBMEM
newPC
49
InstructionProcessing• 6(5)genericsteps
• Instructionfetch(IF)• Instructiondecodeandregisteroperandfetch(ID/RF)• Execute/Evaluatememoryaddress(EX/AG)• Memoryoperandfetch(MEM)• Store/writeback result(WB)• PCUpdate
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
IF ID/RF EX/AG WBMEM
newPC
50
InstructionProcessing• 6(5)genericsteps
• Instructionfetch(IF)• Instructiondecodeandregisteroperandfetch(ID/RF)• Execute/Evaluatememoryaddress(EX/AG)• Memoryoperandfetch(MEM)• Store/writeback result(WB)• PCUpdate
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
IF ID/RF EX/AG WBMEM
newPC
51
InstructionProcessing• 6(5)genericsteps
• Instructionfetch(IF)• Instructiondecodeandregisteroperandfetch(ID/RF)• Execute/Evaluatememoryaddress(EX/AG)• Memoryoperandfetch(MEM)• Store/writeback result(WB)• PCUpdate
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
IF ID/RF EX/AG WBMEM
newPC
52
2/9/17
14
Single-CycleDatapathforArithmeticandLogical
Instructions
ExecutingArith./LogicalOperation
•Fetch• Read2bytes
•Decode• Readoperandregisters
•Execute• Performoperation
• Setconditioncodes
•Memory• Donothing
•Writeback• Updateregister
•PCUpdate• IncrementPCby2
OPq rA, rB 6 fn rA rB
54
StageComputation:Arith/Log.Ops
• Formulateinstructionexecutionassequenceofsimplesteps
• Usesamegeneralformforallinstructions
OPq rA,rBicode:ifun¬M1[PC]rA:rB¬M1[PC+1]
valP¬ PC+2
FetchReadinstructionbyteReadregisterbyte
ComputenextPCvalA¬ R[rA]valB¬ R[rB]
Decode ReadoperandAReadoperandB
valE¬ valBOPvalASetCC
Execute PerformALUoperationSetconditioncoderegisterMemory
R[rB]¬ valEWrite
backWritebackresult
PC¬ valPPCupdate UpdatePC
55
ALUDatapath
**Basedonoriginalfigurefrom[P&HCO&D,COPYRIGHT2004Elsevier.ALLRIGHTSRESERVED.]
ifMEM[PC]==OPq rA, rBR[rB] ¬ R[rB] opR[rA]PC¬ PC+2
Combinationalstateupdatelogic
IF ID EX MEM WB
Registerfile
ValB
rA
DestE
valAALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rB
ValE
ADD
RegWrite ALU
OP
PC
2
56
2/9/17
15
ALUDatapath
**Basedonoriginalfigurefrom[P&HCO&D,COPYRIGHT2004Elsevier.ALLRIGHTSRESERVED.]
ifMEM[PC]==OPq rA, rBR[rB] ¬ R[rB] opR[rA]PC¬ PC+2
Combinationalstateupdatelogic
IF ID EX MEM WB
Registerfile
ValB
rA
DestE
valAALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rB
ValE
ADD
RegWrite ALU
OP
PC
2
57
Wedidnotcovertheseslidesintheclass
WilllearnabouttheseinthenextclassTheyarehereforyourbenefit
Single-CycleDatapathforDataMovementInstructions
Executingmrmovq (LoadfromMemtoReg)
•Fetch• Read10bytes
•Decode• Readoperandregisters
•Execute• Computeeffectiveaddress
•Memory• Readfrommemory
•Writeback• WritetoRegister
•PCUpdate• IncrementPCby10
6 fn rA rB
mrmovq D(rB),rA
D
60
2/9/17
16
StageComputation:mrmovq
• UseALUforaddresscomputation
mrmovq D(rB),rAicode:ifun¬M1[PC]rA:rB¬M1[PC+1]valC¬M8[PC+2]valP¬ PC+10
Fetch
ReadinstructionbyteReadregisterbyteReaddisplacementDComputenextPC
valB¬ R[rB]Decode
ReadoperandBvalE¬ valB +valC
ExecuteComputeeffectiveaddress
valM¬ M8[valE]Memory WritevaluetomemoryR[rA]¬ valMWrite
backPC¬ valPPCupdate UpdatePC
61
Ld Datapath
ifMEM[PC]==mrmovq Disp (rB), rAEA=Disp +R[rB]R[rA]¬MEM[EA]PC¬ PC+10
Combinationalstateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem10
PC
62
Ld Datapath
ifMEM[PC]==mrmovq Disp (rB), rAEA=Disp +R[rB]R[rA]¬MEM[EA]PC¬ PC+10
Combinationalstateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem10
PC
63
Ld Datapath
ifMEM[PC]==mrmovq Disp (rB), rAEA=Disp +R[rB]R[rA]¬MEM[EA]PC¬ PC+10
Combinationalstateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem10
PC
64
2/9/17
17
Ld Datapath
ifMEM[PC]==mrmovq Disp (rB), rAEA=Disp +R[rB]R[rA]¬MEM[EA]PC¬ PC+10
Combinationalstateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem10
PC
65
Ld Datapath
ifMEM[PC]==mrmovq Disp (rB), rAEA=Disp +R[rB]R[rA]¬MEM[EA]PC¬ PC+10
Combinationalstateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem10
PC
66
Executingrmmovq (Stfromreg toMemory)
•Fetch• Read10bytes
•Decode• Readoperandregisters
•Execute• Computeeffectiveaddress
•Memory• Writetomemory
•Writeback• Donothing
•PCUpdate• IncrementPCby10
4 0 rA rB
rmmovq rA,D(rB)
D
67
StageComputation:rmmovq
• UseALUforaddresscomputation
rmmovq rA,D(rB)icode:ifun¬M1[PC]rA:rB¬M1[PC+1]valC¬M8[PC+2]valP¬ PC+10
Fetch
ReadinstructionbyteReadregisterbyteReaddisplacementDComputenextPC
valA¬ R[rA]valB¬ R[rB]
DecodeReadoperandAReadoperandB
valE¬ valB+valCExecute
Computeeffectiveaddress
M8[valE]¬ valAMemory WritevaluetomemoryWrite
backPC¬ valPPCupdate UpdatePC
68
2/9/17
18
StDatapath
ifMEM[PC]==rmmovq rA, Disp (rB)EA=Disp +R[rB]MEM[EA]¬ R[rA]PC¬ PC+10
Combinationalstateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
PC
10
MemWrite
69
StDatapath
ifMEM[PC]==rnmovq rA, Disp (rB)EA=Disp +R[rB]MEM[EA]¬ R[rA]PC¬ PC+10
Combinationalstateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
PC
10
MemWrite
70
StDatapath
ifMEM[PC]==rnmovq rA, Disp (rB)EA=Disp +R[rB]MEM[EA]¬ R[rA]PC¬ PC+10
Combinationalstateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
PC
10
MemWrite
71
StDatapath
ifMEM[PC]==rnmovq rA, Disp (rB)EA=Disp +R[rB]MEM[EA]¬ R[rA]PC¬ PC+10
Combinationalstateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
PC
10
MemWrite
72
2/9/17
19
Executingirmovq (Moveimm toReg)
•Fetch• Read10bytes
•Decode• Readoperandregisters
•Execute• Add0toV
•Memory• Donothing
•Writeback• WriteVtorB
•PCUpdate• IncrementPCby10
3 0 F rB
irmovq V, rB
V
73
StageComputation:immovq
• UseALUforaddresscomputation
irmovq V,rBicode:ifun¬M1[PC]rA:rB¬M1[PC+1]valC¬M8[PC+2]valP¬ PC+10
Fetch
ReadinstructionbyteReadregisterbyteReaddisplacementDComputenextPC
Decode
valE¬ 0 +valCExecute
Computeeffectiveaddress
R[rB]¬ valAMemory WritevaluetomemoryWrite
backPC¬ valPPCupdate UpdatePC
74
IRMov Datapath:Option1
ifMEM[PC]==irmovq V, rBR[rB]¬ VPC¬ PC+10 Combinational
stateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
MUX
0
PC
MemWrite
10
75
IRMov Datapath:Option1
ifMEM[PC]==irmovq V, rBR[rB]¬ V + 0PC¬ PC+10 Combinational
stateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
MUX
0
PC
MemWrite
10
76
2/9/17
20
IRMov Datapath:Option1
ifMEM[PC]==irmovq V, rBR[rB]¬ V + 0PC¬ PC+10 Combinational
stateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
MUX
0
PC
MemWrite
10
77
IRMov Datapath:Option1
ifMEM[PC]==irmovq V, rBR[rB]¬ V + 0PC¬ PC+10 Combinational
stateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
MUX
0
PC
MemWrite
10
78
IRMov Datapath:Option1
ifMEM[PC]==irmovq V, rBR[rB]¬ V + 0PC¬ PC+10 Combinational
stateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
MUX
0
PC
MemWrite
10
79
IRMov Datapath:Option2
ifMEM[PC]==irmovq V, rBR[rB]¬ VPC¬ PC+10 Combinational
stateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
PC
MemWrite
10
80
2/9/17
21
IRMov Datapath:Option2
ifMEM[PC]==irmovq V, rBR[rB]¬ VPC¬ PC+10 Combinational
stateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
PC
MemWrite
10
81
• Tradeoffsbetweenoption1andoption2?
82
Executingrrmovq (MovefromReg toReg)
•Fetch• Read2bytes
•Decode• ReadoperandregisterrA
•Execute• Add0toval rA
•Memory• Donothing
•Writeback• Writeval rA torB
•PCUpdate• IncrementPCby2
2 0 rA rB
rrmovq rA, rB
83
StageComputation:rrmovq
• UseALUforaddresscomputation
rrmovq rA,rBicode:ifun¬M1[PC]rA:rB¬M1[PC+1]
valP¬ PC+2
Fetch
ReadinstructionbyteReadregisterbyteReaddisplacementDComputenextPC
ValA¬ R[rA]Decode
valE¬ 0 +valAExecute
Computeeffectiveaddress
Memory WritevaluetomemoryR[rB]ß valEWrite
backPC¬ valPPCupdate UpdatePC
84
2/9/17
22
rrMov Datapath:Option1
ifMEM[PC]==rrmovq rA, rBR[rB]¬ R[rA]PC¬ PC+2 Combinational
stateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
MUX
0
PC
MemWrite
2
85
rrmov Datapath:Option1
ifMEM[PC]==rrmovq rA, rBR[rB]¬ R[rA]PC¬ PC+2 Combinational
stateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
MUX
0
PC
MemWrite
2
86
rrmov Datapath:Option1
ifMEM[PC]==rrmovq rA, rBR[rB]¬ R[rA]PC¬ PC+2 Combinational
stateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
MUX
0
PC
MemWrite
2
87
rrmov Datapath:Option1
ifMEM[PC]==rrmovq rA, rBR[rB]¬ R[rA]PC¬ PC+2 Combinational
stateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
MUX
0
PC
MemWrite
2
88
2/9/17
23
rrmov Datapath:Option2
ifMEM[PC]==rrmovq rA, rBR[rB]¬ R[rA]PC¬ PC+2 Combinational
stateupdatelogic
IF ID EX MEM WB
Registerfile
ValA
rB
DestE
valBALU
PC
InstructionMem
InstrAddr Instruction
DataMem
Address ReadData
WriteData
rA
ValE
ADD
RegWrite ALU
OP
MUX
MUXSelect
MUX
rB
rA
MUX
D
FromALU
FromMem
PC
MemWrite
10 ?89
SamiraKhanUniversityofVirginia
Feb9,2017
IntrotoMicroarchitecture:Single-Cycle
CS3330