SingleCycle - Computer Science (cr4bd/3330/S2017/notes/20170209-slides-4up.pdf), Feb 09, 2017


Samira Khan
University of Virginia
Feb 9, 2017

Intro to Microarchitecture: Single-Cycle
CS 3330

AGENDA
• Review from last lecture
• ISA tradeoffs
• Single-cycle Microarchitecture

Review: ISA vs. Microarchitecture
• ISA (Instruction Set Architecture)
  • Agreed upon interface between software and hardware
    • SW/compiler assumes, HW promises
  • What the software writer needs to know to write and debug system/user programs
• Microarchitecture
  • Specific implementation of an ISA
  • Not visible to the software
• Microprocessor
  • ISA, uarch, circuits
  • "Architecture" = ISA + microarchitecture

[Figure: the computing stack, top to bottom: Problem, Algorithm, Program, ISA, Microarchitecture, Circuits, Transistors]

Review: ISA
• Instructions
  • Opcodes, Addressing Modes, Data Types
  • Instruction Types and Formats
  • Registers, Condition Codes
• Memory
  • Address space, Addressability, Alignment
  • Virtual memory management
• Call, Interrupt/Exception Handling
• Access Control, Priority/Privilege
• I/O: memory-mapped vs. instr.
• Task/thread Management
• Power and Thermal Management
• Multi-threading support, Multiprocessor support


Microarchitecture
• Implementation of the ISA under specific design constraints and goals
• Anything done in hardware without exposure to software
  • Pipelining (will see later)
  • Clock gating
  • Caching? Levels, size, associativity, replacement policy
  • Prefetching?
  • Voltage/frequency scaling?
  • Error correction?

Property of ISA vs. Uarch?
• ADD instruction's opcode
• Number of general purpose registers
• Number of ports to the register file
• Number of cycles to execute the MUL instruction
• Whether or not the machine employs pipelined instruction execution
• Remember
  • Microarchitecture: Implementation of the ISA under specific design constraints and goals

Design Point
• A set of design considerations and their importance
  • leads to tradeoffs in both ISA and uarch
• Considerations
  • Cost
  • Performance
  • Maximum power consumption
  • Energy consumption (battery life)
  • Availability
  • Reliability and Correctness
  • Time to Market
• Design point determined by the "Problem" space (application space), the intended users/market
  (Look Forward & Up)


ROLE OF THE (COMPUTER) ARCHITECT
[Figure: from Yale Patt's lecture notes]

ROLE OF THE (COMPUTER) ARCHITECT
• Look backward (to the past)
  • Understand tradeoffs and designs, upsides/downsides, past workloads. Analyze and evaluate the past
• Look forward (to the future)
  • Be the dreamer and create new designs. Listen to dreamers
  • Push the state of the art. Evaluate new design choices
• Look up (towards problems in the computing stack)
  • Understand important problems and their nature
  • Develop architectures and ideas to solve important problems
• Look down (towards device/circuit technology)
  • Understand the capabilities of the underlying technology
  • Predict and adapt to the future of technology (you are designing for N years ahead). Enable the future technology

Application Space
• Dream, and they will appear…

Tradeoffs: Soul of Computer Architecture
• ISA-level tradeoffs
• Microarchitecture-level tradeoffs
• System and Task-level tradeoffs
  • How to divide the labor between hardware and software
• Computer architecture is the science and art of making the appropriate trade-offs to meet a design point
  • Why art?


ISA Principles and Tradeoffs

Many Different ISAs Over Decades
• x86
• PDP-x: Programmed Data Processor (PDP-11)
• VAX
• IBM 360
• CDC 6600
• SIMD ISAs: CRAY-1, Connection Machine
• VLIW ISAs: Multiflow, Cydrome, IA-64 (EPIC)
• PowerPC, POWER
• RISC ISAs: Alpha, MIPS, SPARC, ARM
• What are the fundamental differences?
  • E.g., how instructions are specified and what they do
  • E.g., how complex are the instructions

MIPS

R-type:  0 (6-bit)      | rs (5-bit) | rt (5-bit) | rd (5-bit) | shamt (5-bit) | funct (6-bit)
I-type:  opcode (6-bit) | rs (5-bit) | rt (5-bit) | immediate (16-bit)
J-type:  opcode (6-bit) | immediate (26-bit)
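The three MIPS formats above can be pulled apart with shifts and masks. The following is a minimal sketch, not a full decoder: the function name `decode_mips` is hypothetical, and treating every opcode other than 0, 2, and 3 as I-type is a simplifying assumption.

```python
def decode_mips(word: int) -> dict:
    """Split a 32-bit MIPS instruction word into named fields."""
    opcode = (word >> 26) & 0x3F            # bits 31..26
    if opcode == 0:                         # R-type: the opcode field is 0
        return {
            "format": "R",
            "rs": (word >> 21) & 0x1F,
            "rt": (word >> 16) & 0x1F,
            "rd": (word >> 11) & 0x1F,
            "shamt": (word >> 6) & 0x1F,
            "funct": word & 0x3F,
        }
    if opcode in (2, 3):                    # J-type: j and jal
        return {"format": "J", "opcode": opcode, "immediate": word & 0x03FFFFFF}
    # Assumption: everything else is treated as I-type in this sketch.
    return {
        "format": "I",
        "opcode": opcode,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "immediate": word & 0xFFFF,
    }


add_fields = decode_mips(0x012A4020)    # add  $t0, $t1, $t2
addi_fields = decode_mips(0x21280005)   # addi $t0, $t1, 5
```

Because rs and rt sit at the same bit positions in the R and I formats, the register file can be read before the format is known, which is the uniform-decode benefit discussed later in the deck.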

ARM
[Figure: not recoverable from the extracted text]


What Are the Elements of An ISA?
• Instructions
  • Opcode
  • Operand specifiers (addressing modes)
    • How to obtain the operand? (Why are there different addressing modes?)
• Data types
  • Definition: Representation of information for which there are instructions that operate on the representation
  • Integer, floating point, character, binary, decimal, BCD
  • Doubly linked list, queue, string, bit vector, stack
    • VAX: INSQUEUE and REMQUEUE instructions on a doubly linked list or queue; FINDFIRST
      • Digital Equipment Corp., "VAX11 780 Architecture Handbook," 1977.
    • x86: SCAN opcode operates on character strings; PUSH/POP

Data Type Tradeoffs
• What is the benefit of having more or high-level data types in the ISA?
• What is the disadvantage?
• Think compiler/programmer vs. microarchitect
• Concept of semantic gap
  • Data types coupled tightly to the semantic level, or complexity of instructions
• Example: Early RISC architectures vs. Intel 432
  • Early RISC: Only integer data type
  • Intel 432: Object data type, capability based machine

Complex vs. Simple Instructions
• Complex instruction: An instruction does a lot of work, e.g. many operations
  • Insert in a doubly linked list
  • Compute FFT
  • String copy
• Simple instruction: An instruction does a small amount of work; it is a primitive using which complex operations can be built
  • Add
  • XOR
  • Multiply

Complex vs. Simple Instructions
• Advantages of Complex Instructions
  + Denser encoding → smaller code size → better memory utilization, saves off-chip bandwidth, better cache hit rate (better packing of instructions)
  + Simpler compiler: no need to optimize small instructions as much
• Disadvantages of Complex Instructions
  - Larger chunks of work → compiler has less opportunity to optimize (limited in fine-grained optimizations it can do)
  - More complex hardware → translation from a high level to control signals and optimization needs to be done by hardware


ISA-level Tradeoffs: Semantic Gap
• Where to place the ISA? Semantic gap
  • Closer to high-level language (HLL) → Small semantic gap, complex instructions
  • Closer to hardware control signals? → Large semantic gap, simple instructions
• RISC vs. CISC machines
  • RISC: Reduced instruction set computer
  • CISC: Complex instruction set computer
    • FFT, QUICKSORT, POLY, FP instructions?
    • VAX INDEX instruction (array access with bounds checking)

ISA-level Tradeoffs: Semantic Gap
• Some tradeoffs (for you to think about)
• Simple compiler, complex hardware vs. complex compiler, simple hardware
  • Burden of backward compatibility
• Performance? Energy Consumption?
  • Optimization opportunity: Example of VAX INDEX instruction: who (compiler vs. hardware) puts more effort into optimization?
• Instruction size, code size

Small versus Large Semantic Gap
• CISC vs. RISC
  • Complex instruction set computer → complex instructions
    • Initially motivated by "not good enough" code generation
  • Reduced instruction set computer → simple instructions
    • John Cocke, mid 1970s, IBM 801
    • Goal: enable better compiler control and optimization
• RISC motivated by
  • Memory stalls (no work done in a complex instruction when there is a memory stall?)
    • When is this correct?
  • Simplifying the hardware → lower cost, higher frequency
  • Enabling the compiler to optimize the code better
    • Find fine-grained parallelism to reduce stalls

ISA-level Tradeoffs: Instruction Length
• Fixed length: Length of all instructions the same
  + Easier to decode single instruction in hardware
  + Easier to decode multiple instructions concurrently
  -- Wasted bits in instructions (Why is this bad?)
  -- Harder-to-extend ISA (how to add new instructions?)
• Variable length: Length of instructions different (determined by opcode and sub-opcode)
  + Compact encoding (Why is this good?)
    Intel 432: 6 to 321 bit instructions.
  -- More logic to decode a single instruction
  -- Harder to decode multiple instructions concurrently
• Tradeoffs
  • Code size (memory space, bandwidth, latency) vs. hardware complexity
  • ISA extensibility and expressiveness vs. hardware complexity
  • Performance? Energy? Smaller code vs. ease of decode


ISA-level Tradeoffs: Uniform Decode
• Uniform decode: Same bits in each instruction correspond to the same meaning
  • Opcode is always in the same location
  • Ditto operand specifiers, immediate values, …
  • Many "RISC" ISAs: Alpha, MIPS, SPARC
  + Easier decode, simpler hardware
  + Enables parallelism: generate target address before knowing the instruction is a branch
  -- Restricts instruction format (fewer instructions?) or wastes space
• Non-uniform decode
  • E.g., opcode can be the 1st-7th byte in x86
  + More compact and powerful instruction format
  -- More complex decode logic

ISA-level Tradeoffs: Number of Registers
• Affects:
  • Number of bits used for encoding register address
  • Number of values kept in fast storage (register file)
  • (uarch) Size, access time, power consumption of register file
• Large number of registers:
  + Enables better register allocation (and optimizations) by compiler → fewer saves/restores
  -- Larger instruction size
  -- Larger register file size
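The "larger instruction size" cost can be made concrete with a small calculation. This is an illustrative sketch; the helper names `register_specifier_bits` and `operand_field_bits` are hypothetical, and the register counts are example values rather than figures from the lecture.

```python
def register_specifier_bits(num_registers: int) -> int:
    """Bits needed to name one register out of num_registers."""
    return max(num_registers - 1, 0).bit_length()


def operand_field_bits(num_registers: int, operands: int) -> int:
    """Total encoding bits spent on register specifiers in one instruction."""
    return operands * register_specifier_bits(num_registers)


bits_16 = register_specifier_bits(16)   # 4 bits, one nibble per register
bits_32 = register_specifier_bits(32)   # 5 bits, as in the MIPS rs/rt/rd fields
cost_32 = operand_field_bits(32, 3)     # three-operand instruction, 32 registers
cost_64 = operand_field_bits(64, 3)     # same instruction with 64 registers
```

Doubling the register count from 32 to 64 costs one extra bit per specifier, so a three-operand instruction gives up three bits that could otherwise hold opcode or immediate bits.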

ISA-level Tradeoffs: Addressing Modes
• Addressing mode specifies how to obtain an operand of an instruction
  • Register
  • Immediate
  • Memory (displacement, register indirect, indexed, absolute, memory indirect, autoincrement, autodecrement, …)
• More modes:
  + help better support programming constructs (arrays, pointer-based accesses)
  -- make it harder for the architect to design
  -- too many choices for the compiler?
    • Many ways to do the same thing complicates compiler design
    • Wulf, "Compilers and Computer Architecture," IEEE Computer 1981
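A few of the memory addressing modes listed above differ only in how the effective address is formed. The sketch below illustrates that under stated assumptions: the function `effective_address`, the mode names as strings, and the dictionary-based register file and memory are all hypothetical, and autoincrement/autodecrement are left out because they also modify a register.

```python
def effective_address(mode, regs, mem, base=None, index=None, disp=0):
    """Return the memory address an operand would be fetched from."""
    if mode == "absolute":
        return disp                        # address is encoded in the instruction
    if mode == "register_indirect":
        return regs[base]                  # address is held in a register
    if mode == "displacement":
        return regs[base] + disp           # register plus constant offset
    if mode == "indexed":
        return regs[base] + regs[index]    # register plus register
    if mode == "memory_indirect":
        return mem[regs[base]]             # register points at a pointer in memory
    raise ValueError(f"unknown addressing mode: {mode}")


regs = {"r1": 100, "r2": 8}
mem = {100: 500}
```

Displacement mode fits struct fields and stack slots, indexed mode fits array accesses, and memory indirect fits pointer chasing, which is why more modes map more directly onto programming constructs.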

A Note on RISC vs. CISC
• Usually, …
• RISC
  • Simple instructions
  • Fixed length
  • Uniform decode
  • Few addressing modes
• CISC
  • Complex instructions
  • Variable length
  • Non-uniform decode
  • Many addressing modes


Food for Thought for You
• How would you design a new ISA?
• Where would you place it?
• What design choices would you make in terms of ISA properties?
• What would be the first question you ask in this process?
  • "What is my design point?" (Look Forward & Up)

Y86-64 Instruction Set #1

Byte offsets: 0 1 2 3 4 5 6 7 8 9

Instruction          Byte 0    Byte 1    Following bytes
halt                 0 0
nop                  1 0
cmovXX rA, rB        2 fn      rA rB
irmovq V, rB         3 0       F rB      V (bytes 2-9)
rmmovq rA, D(rB)     4 0       rA rB     D (bytes 2-9)
mrmovq D(rB), rA     5 0       rA rB     D (bytes 2-9)
OPq rA, rB           6 fn      rA rB
jXX Dest             7 fn      Dest (bytes 1-8)
call Dest            8 0       Dest (bytes 1-8)
ret                  9 0
pushq rA             A 0       rA F
popq rA              B 0       rA F
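In the encoding table above, the high nibble of byte 0 (the icode) fixes the instruction length, so a byte stream can be split into instructions by looking only at first bytes. The following is a minimal sketch of that idea; the names `Y86_LENGTH` and `split_instructions` are hypothetical.

```python
# Instruction length in bytes, keyed by icode (high nibble of byte 0).
Y86_LENGTH = {
    0x0: 1,   # halt
    0x1: 1,   # nop
    0x2: 2,   # cmovXX rA, rB
    0x3: 10,  # irmovq V, rB
    0x4: 10,  # rmmovq rA, D(rB)
    0x5: 10,  # mrmovq D(rB), rA
    0x6: 2,   # OPq rA, rB
    0x7: 9,   # jXX Dest
    0x8: 9,   # call Dest
    0x9: 1,   # ret
    0xA: 2,   # pushq rA
    0xB: 2,   # popq rA
}


def split_instructions(code: bytes) -> list:
    """Cut a Y86-64 byte sequence into one bytes object per instruction."""
    pc, out = 0, []
    while pc < len(code):
        icode = code[pc] >> 4
        length = Y86_LENGTH[icode]
        out.append(code[pc:pc + length])
        pc += length
    return out


# irmovq $5, %rax ; addq %rax, %rbx ; halt
program = bytes.fromhex("30f00500000000000000" "6003" "00")
parts = split_instructions(program)
```

This is the variable-length tradeoff from the earlier slides in miniature: the next instruction cannot be located until the current one has been at least partly decoded.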

Now That We Have an ISA
• How do we implement it?
• i.e., how do we design a system that obeys the hardware/software interface?

Implementing the ISA: Microarchitecture Basics


How Does a Machine Process Instructions?
• What does processing an instruction mean?
• Remember the von Neumann model

  AS = Architectural (programmer visible) state before an instruction is processed
        ↓ Process instruction
  AS' = Architectural (programmer visible) state after an instruction is processed

• Processing an instruction: Transforming AS to AS' according to the ISA specification of the instruction

The "Process instruction" Step
• ISA specifies abstractly what AS' should be, given an instruction and AS
  • It defines an abstract finite state machine where
    • State = programmer-visible state
    • Next-state logic = instruction execution specification
  • From ISA point of view, there are no "intermediate states" between AS and AS' during instruction execution
    • One state transition per instruction
• Microarchitecture implements how AS is transformed to AS'
  • There are many choices in implementation
  • We can have programmer-invisible state to optimize the speed of instruction execution: multiple state transitions per instruction
    • Choice 1: AS → AS' (transform AS to AS' in a single clock cycle)
    • Choice 2: AS → AS+MS1 → AS+MS2 → AS+MS3 → AS' (take multiple clock cycles to transform AS to AS')
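The two choices above can be illustrated with a toy model. Everything here is an assumption made for illustration: the one-instruction "ISA" with only `add`, the dictionary representation of AS, and the function names `step_single` and `step_multi` do not come from the lecture.

```python
def step_single(state, program):
    """Choice 1: AS -> AS' in one transition."""
    op, dst, src = program[state["pc"]]
    assert op == "add"                       # the only instruction in this toy ISA
    regs = dict(state["regs"])
    regs[dst] = regs[dst] + regs[src]
    return {"pc": state["pc"] + 1, "regs": regs}


def step_multi(state, program):
    """Choice 2: AS -> AS+MS1 -> AS+MS2 -> AS+MS3 -> AS'."""
    ms = {}                                  # programmer-invisible state
    ms["instr"] = program[state["pc"]]       # AS+MS1: fetched instruction
    op, dst, src = ms["instr"]
    assert op == "add"
    ms["a"] = state["regs"][dst]             # AS+MS2: operands latched
    ms["b"] = state["regs"][src]
    ms["sum"] = ms["a"] + ms["b"]            # AS+MS3: ALU result latched
    regs = dict(state["regs"])
    regs[dst] = ms["sum"]                    # AS': architectural state updated last
    return {"pc": state["pc"] + 1, "regs": regs}


program = [("add", "r1", "r2")]
before = {"pc": 0, "regs": {"r1": 3, "r2": 4}}
after_single = step_single(before, program)
after_multi = step_multi(before, program)
```

Both functions return the same AS', which is the point of the slide: the ISA only constrains the before and after states, and the `ms` dictionary never becomes visible to the program.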

A Very Basic Instruction Processing Engine
• Each instruction takes a single clock cycle to execute
• Only combinational logic is used to implement instruction execution
  • No intermediate, programmer-invisible state updates

  AS = Architectural (programmer visible) state at the beginning of a clock cycle
        ↓ Process instruction in one clock cycle
  AS' = Architectural (programmer visible) state at the end of a clock cycle

A Very Basic Instruction Processing Engine
• Single-cycle machine
• What is the clock cycle time determined by?
• What is the critical path of the combinational logic determined by?

[Figure: AS (State) feeds Combinational Logic, which produces AS']


Assembly/Machine Code View

Programmer-Visible State
• PC: Program counter
  • Address of next instruction
  • Called "RIP" (x86-64)
• Register file
  • Heavily used program data
• Condition codes
  • Store status information about most recent arithmetic or logical operation
  • Used for conditional branching
• Memory
  • Byte addressable array
  • Code and user data
  • Stack to support procedures

[Figure: CPU (PC, Registers, Condition Codes) connected to Memory (Code, Data, Stack) by Addresses, Data, and Instructions]

Instructions (and programs) specify how to transform the values of programmer visible state

Single-cycle vs. Multi-cycle Machines
• Single-cycle machines
  • Each instruction takes a single clock cycle
  • All state updates made at the end of an instruction's execution
  • Big disadvantage: The slowest instruction determines cycle time → long clock cycle time
• Multi-cycle machines
  • Instruction processing broken into multiple cycles/stages
  • State updates can be made during an instruction's execution
  • Architectural state updates made only at the end of an instruction's execution
  • Advantage over single-cycle: The slowest "stage" determines cycle time
• Both single-cycle and multi-cycle machines literally follow the von Neumann model at the microarchitecture level

Instruction Processing "Stage"
• Instructions are processed under the direction of a "control unit" step by step.
• Instruction stage: Sequence of steps to process an instruction
• Fundamentally, there are five phases:
  • Fetch
  • Decode
  • Evaluate Address/Fetch Operands
  • Execute
  • Store Result
• Not all instructions require all stages

Instruction Processing "Cycle" vs. Machine Clock Cycle
• Single-cycle machine:
  • All phases of the instruction processing cycle take a single machine clock cycle to complete
• Multi-cycle machine:
  • All phases of the instruction processing cycle can take multiple machine clock cycles to complete
  • In fact, each phase can take multiple clock cycles to complete


Instruction Processing Viewed Another Way
• Instructions transform Data (AS) to Data' (AS')
• This transformation is done by functional units
  • Units that "operate" on data
• These units need to be told what to do to the data
• An instruction processing engine consists of two components
  • Datapath: Consists of hardware elements that deal with and transform data signals
    • functional units that operate on data
    • hardware structures (e.g. wires and muxes) that enable the flow of data into the functional units and registers
    • storage units that store data (e.g., registers)
  • Control logic: Consists of hardware elements that determine control signals, i.e., signals that specify what the datapath elements should do to the data

Single-cycle vs. Multi-cycle: Control & Data
• Single-cycle machine:
  • Control signals are generated in the same clock cycle as the one during which data signals are operated on
  • Everything related to an instruction happens in one clock cycle (serialized processing)
• Multi-cycle machine:
  • Control signals needed in the next cycle can be generated in the current cycle
  • Latency of control processing can be overlapped with latency of datapath operation (more parallelism)

Many Ways of Datapath and Control Design
• There are many ways of designing the datapath and control logic
• Single-cycle, multi-cycle, pipelined datapath and control
• Hardwired/combinational vs. microcoded/microprogrammed control
  • Control signals generated by combinational logic versus
  • Control signals stored in a memory structure

Flash-Forward: Performance Analysis
• Execution time of an instruction
  • {CPI} x {clock cycle time}
• Execution time of a program
  • Sum over all instructions [{CPI} x {clock cycle time}]
  • {# of instructions} x {Average CPI} x {clock cycle time}
• Single cycle microarchitecture performance
  • CPI = 1
  • Clock cycle time = long
• Multi-cycle microarchitecture performance
  • CPI = different for each instruction
    • Average CPI → hopefully small
  • Clock cycle time = short
  • Now, we have two degrees of freedom to optimize independently
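The execution-time formula above can be evaluated for both designs. The numbers in this sketch are made up for illustration (the instruction mix, per-class CPI values, and cycle times are assumptions, not measurements), and `execution_time` is a hypothetical helper name.

```python
def execution_time(instr_counts, cpi, cycle_time):
    """Sum over all instructions of CPI x clock cycle time."""
    total_cycles = sum(count * cpi[kind] for kind, count in instr_counts.items())
    return total_cycles * cycle_time


# Hypothetical dynamic instruction mix for one program.
mix = {"alu": 50, "load": 30, "store": 20}

# Single-cycle: CPI = 1 for everything, but the cycle must fit the slowest instruction.
single = execution_time(mix, {"alu": 1, "load": 1, "store": 1}, cycle_time=10)

# Multi-cycle: CPI differs per instruction class, but the cycle is short.
multi = execution_time(mix, {"alu": 3, "load": 5, "store": 4}, cycle_time=2)
```

With these assumed values the single-cycle design takes 1000 time units and the multi-cycle design 760, even though its average CPI is 3.8. Different assumptions can reverse the outcome, which is why CPI and cycle time are treated as two separate knobs.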


A Single-Cycle Microarchitecture: A Closer Look

Remember…
• Single-cycle machine

[Figure: AS (State) feeds Combinational Logic, which produces AS']

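Let's Start with the State Elements
• Data and control inputs

[Figure: the building blocks and their signals:
  Register file: ports A, B, W; inputs srcA, srcB, dstW, valW; outputs valA, valB; control RegWrite
  ALU: inputs A, B; control Operation
  MUX: inputs 0, 1; control MUXSelect
  PC
  Instruction Mem: input InstrAddr; output Instruction
  Data Mem: inputs Address, WriteData; output ReadData; controls MemWrite, MemRead]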

For Now, We Will Assume
• "Magic" memory and register file
• Synchronous write
  • the selected register is updated on the positive edge clock transition when write enable is asserted
  • Cannot affect read output in between clock edges
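The synchronous-write assumption above can be modeled in a few lines. This is a behavioral sketch only: the class `RegisterFile`, its method names, and the idea of an immediate (combinational) read are assumptions made for illustration.

```python
class RegisterFile:
    """Register file with immediate reads and clock-edge writes."""

    def __init__(self, num_registers=16):
        self.regs = [0] * num_registers
        self._pending = None                 # (dstW, valW) waiting for the clock edge

    def read(self, src):
        # Assumed combinational read: reflects the current contents only.
        return self.regs[src]

    def set_write(self, dst, value, write_enable):
        # Drive the write port; nothing is stored until the clock edge.
        self._pending = (dst, value) if write_enable else None

    def clock_edge(self):
        # Positive clock edge: commit the write if write enable was asserted.
        if self._pending is not None:
            dst, value = self._pending
            self.regs[dst] = value
            self._pending = None


rf = RegisterFile()
rf.set_write(3, 42, write_enable=True)
value_before_edge = rf.read(3)   # still the old value
rf.clock_edge()
value_after_edge = rf.read(3)    # new value is visible after the edge
```

Because the write cannot change what `read` returns between edges, an instruction such as `OPq rA, rB` can read and write the same register within one cycle without a feedback loop.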


Instruction Processing
• 6 (5) generic steps
  • Instruction fetch (IF)
  • Instruction decode and register operand fetch (ID/RF)
  • Execute/Evaluate memory address (EX/AG)
  • Memory operand fetch (MEM)
  • Store/writeback result (WB)
  • PC Update

[Figure, repeated over four slides: datapath divided into IF, ID/RF, EX/AG, MEM, WB. Elements: PC and newPC; Instruction Mem (InstrAddr, Instruction); Register file (rA, rB, DestE, ValA, valB, ValE); ALU; Data Mem (Address, ReadData, WriteData)]


Single-Cycle Datapath for Arithmetic and Logical Instructions

Executing Arith./Logical Operation

Encoding: OPq rA, rB    6 fn | rA rB

• Fetch
  • Read 2 bytes
• Decode
  • Read operand registers
• Execute
  • Perform operation
  • Set condition codes
• Memory
  • Do nothing
• Write back
  • Update register
• PC Update
  • Increment PC by 2

Stage Computation: Arith/Log. Ops
• Formulate instruction execution as sequence of simple steps
• Use same general form for all instructions

Stage        OPq rA, rB                Description
Fetch        icode:ifun ← M1[PC]       Read instruction byte
             rA:rB ← M1[PC+1]          Read register byte
             valP ← PC+2               Compute next PC
Decode       valA ← R[rA]              Read operand A
             valB ← R[rB]              Read operand B
Execute      valE ← valB OP valA       Perform ALU operation
             Set CC                    Set condition code register
Memory
Write back   R[rB] ← valE              Write back result
PC update    PC ← valP                 Update PC
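The stage table above translates almost line for line into code. The sketch below assumes the standard Y86-64 function codes (0 addq, 1 subq, 2 andq, 3 xorq), models the register file as a Python list indexed by register ID, and tracks only the ZF and SF condition codes; the name `run_opq` and these simplifications are illustrative choices, not part of the lecture.

```python
MASK64 = (1 << 64) - 1

ALU_OPS = {
    0: lambda b, a: b + a,   # addq
    1: lambda b, a: b - a,   # subq
    2: lambda b, a: b & a,   # andq
    3: lambda b, a: b ^ a,   # xorq
}


def run_opq(pc, imem, regs):
    """Execute one OPq rA, rB instruction; returns (new PC, condition codes)."""
    # Fetch
    icode, ifun = imem[pc] >> 4, imem[pc] & 0xF
    rA, rB = imem[pc + 1] >> 4, imem[pc + 1] & 0xF
    valP = pc + 2
    assert icode == 0x6
    # Decode
    valA, valB = regs[rA], regs[rB]
    # Execute: valE <- valB OP valA, then set condition codes (OF omitted here)
    valE = ALU_OPS[ifun](valB, valA) & MASK64
    cc = {"ZF": valE == 0, "SF": (valE >> 63) == 1}
    # Memory: nothing to do
    # Write back
    regs[rB] = valE
    # PC update
    return valP, cc


regs = [0] * 16
regs[2], regs[3] = 3, 10                                # %rdx = 3, %rbx = 10
new_pc, cc = run_opq(0, bytes([0x61, 0x23]), regs)      # subq %rdx, %rbx
```

In hardware all of these steps happen as one combinational evaluation within the cycle; the sequential statements here only mirror the order in which values become available.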

ALU Datapath

if MEM[PC] == OPq rA, rB
    R[rB] ← R[rB] op R[rA]
    PC ← PC + 2

[Figure, repeated over two slides: combinational state update logic across IF, ID, EX, MEM, WB. Elements: PC with an ADD unit adding 2; Instruction Mem (InstrAddr, Instruction); Register file (rA, rB, DestE, ValA, ValB, ValE) with RegWrite; ALU with ALU OP; Data Mem (Address, ReadData, WriteData)]
**Based on original figure from [P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]

We did not cover these slides in the class.
Will learn about these in the next class. They are here for your benefit.

Single-Cycle Datapath for Data Movement Instructions

Executing mrmovq (Load from Mem to Reg)

Encoding: mrmovq D(rB), rA    5 0 | rA rB | D

• Fetch
  • Read 10 bytes
• Decode
  • Read operand registers
• Execute
  • Compute effective address
• Memory
  • Read from memory
• Write back
  • Write to register
• PC Update
  • Increment PC by 10


Stage Computation: mrmovq
• Use ALU for address computation

Stage        mrmovq D(rB), rA          Description
Fetch        icode:ifun ← M1[PC]       Read instruction byte
             rA:rB ← M1[PC+1]          Read register byte
             valC ← M8[PC+2]           Read displacement D
             valP ← PC+10              Compute next PC
Decode       valB ← R[rB]              Read operand B
Execute      valE ← valB + valC        Compute effective address
Memory       valM ← M8[valE]           Read value from memory
Write back   R[rA] ← valM              Write to register
PC update    PC ← valP                 Update PC
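The mrmovq stage table can be walked through in code the same way. In this sketch the instruction memory is a `bytes` object, the data memory is a `bytearray`, and registers are a list indexed by register ID; these representations and the name `run_mrmovq` are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1


def run_mrmovq(pc, imem, regs, dmem):
    """Execute one mrmovq D(rB), rA instruction; returns the new PC."""
    # Fetch
    icode = imem[pc] >> 4
    rA, rB = imem[pc + 1] >> 4, imem[pc + 1] & 0xF
    valC = int.from_bytes(imem[pc + 2:pc + 10], "little", signed=True)
    valP = pc + 10
    assert icode == 0x5
    # Decode
    valB = regs[rB]
    # Execute: the ALU computes the effective address
    valE = (valB + valC) & MASK64
    # Memory: read 8 bytes
    valM = int.from_bytes(dmem[valE:valE + 8], "little")
    # Write back
    regs[rA] = valM
    # PC update
    return valP


imem = bytes.fromhex("50120800000000000000")   # mrmovq 8(%rdx), %rcx
regs = [0] * 16
regs[2] = 16                                   # %rdx = 16
dmem = bytearray(64)
dmem[24:32] = (0xABCD).to_bytes(8, "little")
new_pc = run_mrmovq(0, imem, regs, dmem)
```

The load is the instruction that uses every stage, which is why it tends to set the clock cycle time of a single-cycle machine.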

Ld Datapath

if MEM[PC] == mrmovq Disp(rB), rA
    EA = Disp + R[rB]
    R[rA] ← MEM[EA]
    PC ← PC + 10

[Figure, repeated over five slides: combinational state update logic across IF, ID, EX, MEM, WB. Elements: PC with an ADD unit adding 10; Instruction Mem (InstrAddr, Instruction); Register file (rA, rB, DestE, ValA, valB, ValE) with RegWrite; MUXes selecting between rA and rB and selecting the displacement D; ALU with ALU OP; Data Mem (Address, ReadData, WriteData); write-back MUX (MUXSelect) choosing "From ALU" or "From Mem"]

Executing rmmovq (St from Reg to Memory)

Encoding: rmmovq rA, D(rB)    4 0 | rA rB | D

• Fetch
  • Read 10 bytes
• Decode
  • Read operand registers
• Execute
  • Compute effective address
• Memory
  • Write to memory
• Write back
  • Do nothing
• PC Update
  • Increment PC by 10

Stage Computation: rmmovq
• Use ALU for address computation

Stage        rmmovq rA, D(rB)          Description
Fetch        icode:ifun ← M1[PC]       Read instruction byte
             rA:rB ← M1[PC+1]          Read register byte
             valC ← M8[PC+2]           Read displacement D
             valP ← PC+10              Compute next PC
Decode       valA ← R[rA]              Read operand A
             valB ← R[rB]              Read operand B
Execute      valE ← valB + valC        Compute effective address
Memory       M8[valE] ← valA           Write value to memory
Write back
PC update    PC ← valP                 Update PC
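A store follows the same pattern as the table above, with the memory stage writing instead of reading and no register write-back. As before, this is a sketch under assumptions: `run_rmmovq` is a hypothetical name, data memory is a `bytearray`, and registers are a list indexed by register ID.

```python
MASK64 = (1 << 64) - 1


def run_rmmovq(pc, imem, regs, dmem):
    """Execute one rmmovq rA, D(rB) instruction; returns the new PC."""
    # Fetch
    icode = imem[pc] >> 4
    rA, rB = imem[pc + 1] >> 4, imem[pc + 1] & 0xF
    valC = int.from_bytes(imem[pc + 2:pc + 10], "little", signed=True)
    valP = pc + 10
    assert icode == 0x4
    # Decode
    valA, valB = regs[rA], regs[rB]
    # Execute: the ALU computes the effective address
    valE = (valB + valC) & MASK64
    # Memory: write 8 bytes (MemWrite asserted)
    dmem[valE:valE + 8] = valA.to_bytes(8, "little")
    # Write back: nothing to do (RegWrite not asserted)
    # PC update
    return valP


imem = bytes.fromhex("40120800000000000000")   # rmmovq %rcx, 8(%rdx)
regs = [0] * 16
regs[1], regs[2] = 0x1122, 16                  # %rcx = 0x1122, %rdx = 16
dmem = bytearray(64)
new_pc = run_rmmovq(0, imem, regs, dmem)
```

Compared with the load, only the control signals differ: the same address path is used, but MemWrite is asserted and RegWrite is not.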


St Datapath

if MEM[PC] == rmmovq rA, Disp(rB)
    EA = Disp + R[rB]
    MEM[EA] ← R[rA]
    PC ← PC + 10

[Figure, repeated over four slides: combinational state update logic across IF, ID, EX, MEM, WB. Elements: PC with an ADD unit adding 10; Instruction Mem (InstrAddr, Instruction); Register file (rA, rB, DestE, ValA, valB, ValE) with RegWrite; MUXes selecting between rA and rB and selecting the displacement D; ALU with ALU OP; Data Mem (Address, ReadData, WriteData) with MemWrite; write-back MUX (MUXSelect) choosing "From ALU" or "From Mem"]


Executing irmovq (Move imm to Reg)

Encoding: irmovq V, rB    3 0 | F rB | V

• Fetch
  • Read 10 bytes
• Decode
  • Do nothing (no register operand is read)
• Execute
  • Add 0 to V
• Memory
  • Do nothing
• Write back
  • Write V to rB
• PC Update
  • Increment PC by 10

Stage Computation: irmovq
• Use ALU to pass the constant through (add 0 to V)

Stage        irmovq V, rB              Description
Fetch        icode:ifun ← M1[PC]       Read instruction byte
             rA:rB ← M1[PC+1]          Read register byte
             valC ← M8[PC+2]           Read constant V
             valP ← PC+10              Compute next PC
Decode
Execute      valE ← 0 + valC           Pass constant through ALU
Memory
Write back   R[rB] ← valE              Write back result
PC update    PC ← valP                 Update PC
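For irmovq the only real work is moving the constant from the instruction bytes into a register. The sketch below follows the "add 0 to V" route through the ALU; the name `run_irmovq` and the list-based register file are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1


def run_irmovq(pc, imem, regs):
    """Execute one irmovq V, rB instruction; returns the new PC."""
    # Fetch
    icode = imem[pc] >> 4
    rB = imem[pc + 1] & 0xF                # the rA nibble is F (no register)
    valC = int.from_bytes(imem[pc + 2:pc + 10], "little")
    valP = pc + 10
    assert icode == 0x3
    # Decode: no register operand to read
    # Execute: pass the constant through the ALU by adding 0
    valE = (0 + valC) & MASK64
    # Memory: nothing to do
    # Write back
    regs[rB] = valE
    # PC update
    return valP


imem = bytes.fromhex("30f00500000000000000")   # irmovq $5, %rax
regs = [0] * 16
new_pc = run_irmovq(0, imem, regs)
```

Routing V through the ALU reuses the existing ALU-to-register path at the cost of a MUX that can feed 0 into the other ALU input.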

IRMov Datapath: Option 1

if MEM[PC] == irmovq V, rB
    R[rB] ← V + 0
    PC ← PC + 10

[Figure, repeated over five slides: combinational state update logic across IF, ID, EX, MEM, WB. Elements: PC with an ADD unit adding 10; Instruction Mem (InstrAddr, Instruction); Register file (rA, rB, DestE, ValA, valB, ValE) with RegWrite; MUXes selecting between rA and rB and selecting D; an additional MUX with a constant 0 input; ALU with ALU OP; Data Mem (Address, ReadData, WriteData) with MemWrite; write-back MUX (MUXSelect) choosing "From ALU" or "From Mem"]

IRMov Datapath: Option 2

if MEM[PC] == irmovq V, rB
    R[rB] ← V
    PC ← PC + 10

[Figure, repeated over two slides: the same datapath as Option 1 without the additional MUX and constant 0 input. Elements: PC with an ADD unit adding 10; Instruction Mem (InstrAddr, Instruction); Register file (rA, rB, DestE, ValA, valB, ValE) with RegWrite; MUXes selecting between rA and rB and selecting D; ALU with ALU OP; Data Mem (Address, ReadData, WriteData) with MemWrite; write-back MUX (MUXSelect) choosing "From ALU" or "From Mem"]

• Tradeoffs between option 1 and option 2?

Executing rrmovq (Move from Reg to Reg)

Encoding: rrmovq rA, rB    2 0 | rA rB

• Fetch
  • Read 2 bytes
• Decode
  • Read operand register rA
• Execute
  • Add 0 to valA
• Memory
  • Do nothing
• Write back
  • Write valA to rB
• PC Update
  • Increment PC by 2

Stage Computation: rrmovq
• Use ALU to pass valA through (add 0 to valA)

Stage        rrmovq rA, rB             Description
Fetch        icode:ifun ← M1[PC]       Read instruction byte
             rA:rB ← M1[PC+1]          Read register byte
             valP ← PC+2               Compute next PC
Decode       valA ← R[rA]              Read operand A
Execute      valE ← 0 + valA           Pass valA through ALU
Memory
Write back   R[rB] ← valE              Write back result
PC update    PC ← valP                 Update PC
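The register-to-register move has the same shape as the immediate move, except that the value passed through the ALU comes from the register file. This sketch again takes the "add 0" route; `run_rrmovq` is a hypothetical name and the list-based register file is an assumption.

```python
MASK64 = (1 << 64) - 1


def run_rrmovq(pc, imem, regs):
    """Execute one rrmovq rA, rB instruction; returns the new PC."""
    # Fetch
    icode, ifun = imem[pc] >> 4, imem[pc] & 0xF
    rA, rB = imem[pc + 1] >> 4, imem[pc + 1] & 0xF
    valP = pc + 2
    assert (icode, ifun) == (0x2, 0x0)     # cmovXX with fn = 0 is rrmovq
    # Decode
    valA = regs[rA]
    # Execute: pass valA through the ALU by adding 0
    valE = (0 + valA) & MASK64
    # Memory: nothing to do
    # Write back
    regs[rB] = valE
    # PC update
    return valP


regs = [0] * 16
regs[1] = 99                                        # %rcx = 99
new_pc = run_rrmovq(0, bytes([0x20, 0x13]), regs)   # rrmovq %rcx, %rbx
```

Since the instruction is only 2 bytes long, the PC adder must add 2 here rather than the 10 used by the three 10-byte move instructions.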


rrmov Datapath: Option 1

if MEM[PC] == rrmovq rA, rB
    R[rB] ← R[rA]
    PC ← PC + 2

[Figure, repeated over four slides: combinational state update logic across IF, ID, EX, MEM, WB. Elements: PC with an ADD unit adding 2; Instruction Mem (InstrAddr, Instruction); Register file (rA, rB, DestE, ValA, valB, ValE) with RegWrite; MUXes selecting between rA and rB and selecting D; an additional MUX with a constant 0 input; ALU with ALU OP; Data Mem (Address, ReadData, WriteData) with MemWrite; write-back MUX (MUXSelect) choosing "From ALU" or "From Mem"]


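rrmov Datapath: Option 2

if MEM[PC] == rrmovq rA, rB
    R[rB] ← R[rA]
    PC ← PC + 2

[Figure: the same datapath as Option 1 without the additional MUX and constant 0 input. Elements: PC with an ADD unit whose constant is marked "10 ?" in the figure; Instruction Mem (InstrAddr, Instruction); Register file (rA, rB, DestE, ValA, valB, ValE) with RegWrite; MUXes selecting between rA and rB and selecting D; ALU with ALU OP; Data Mem (Address, ReadData, WriteData) with MemWrite; write-back MUX (MUXSelect) choosing "From ALU" or "From Mem"]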
