Pipelined MIPS Processor - Access IC Lab (Prof. An-Yeu...

Post on 12-Nov-2018


ACCESS IC LAB

Graduate Institute of Electronics Engineering, NTU

Pipelined MIPS Processor

Lecturer: Chihhao Chao
Advisor: Prof. An-Yeu Wu

Date: 2009.5.6 Wednesday

• Adapted from Prof. Wu’s Computer Organization Lecture Note
• Review suggestion: Complexity Management and Improving Timing/Area/Power

Outline
• 6.1 An Overview of Pipelining
• 6.2 A Pipelined Datapath
• 6.3 Pipelined Control
• 6.4 Data Hazards and Forwarding
• 6.5 Data Hazards and Stalls
• 6.6 Branch Hazards
• 6.7(8) Exceptions

Pipelining is Natural!
• Pipelining provides a method for executing multiple instructions at the same time.
• Laundry example
  - Ann, Brian, Cathy, and Dave each have one load of clothes to wash, dry, and fold
  - Washer takes 30 minutes
  - Dryer takes 40 minutes
  - “Folder” takes 20 minutes

Sequential Laundry
• Sequential laundry takes 6 hours for 4 loads
• If they learned pipelining, how long would laundry take?
[Figure: task order A, B, C, D against time from 6 PM to midnight; each load takes 30 + 40 + 20 minutes, one load after another]

Pipelined Laundry: Start work ASAP
• Pipelined laundry takes 3.5 hours for 4 loads
[Figure: task order A, B, C, D against time from 6 PM; overlapped loads, with time segments of 30, 40, 40, 40, 40, 20 minutes]
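The 6-hour and 3.5-hour figures can be checked with a short calculation. The sketch below is only an illustration: the function names are made up here, and it assumes a load may wait between machines for as long as needed.

```python
def sequential_minutes(stage_minutes, loads):
    """Each load runs through every stage before the next load starts."""
    return loads * sum(stage_minutes)


def pipelined_minutes(stage_minutes, loads):
    """Each load enters a stage as soon as both the load and the machine are free."""
    stage_free_at = [0] * len(stage_minutes)  # time at which each machine is next idle
    finish = 0
    for _ in range(loads):
        ready = 0  # time at which this load has left the previous stage
        for s, minutes in enumerate(stage_minutes):
            start = max(ready, stage_free_at[s])
            ready = start + minutes
            stage_free_at[s] = ready
        finish = ready
    return finish


stages = [30, 40, 20]  # washer, dryer, "folder"
print(sequential_minutes(stages, 4) / 60, "hours sequential")
print(pipelined_minutes(stages, 4) / 60, "hours pipelined")
```

The pipelined total equals 30 + 4 × 40 + 20 minutes: the 40-minute dryer, the slowest stage, sets the rate.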

Pipelining Lessons
• Pipelining doesn’t help latency of a single task; it helps throughput of the entire workload
• Pipeline rate is limited by the slowest pipeline stage
• Multiple tasks operate simultaneously using different resources
• Potential speedup = number of pipe stages
• Unbalanced lengths of pipe stages reduce speedup
• Time to “fill” the pipeline and time to “drain” it reduce speedup
• Stall for dependences
[Figure: pipelined laundry, task order A, B, C, D against time from 6 PM to 9 PM; segments of 30, 40, 40, 40, 40, 20 minutes]

The 5 Stages of the Load Instruction
• IFetch: Instruction Fetch
  - Fetch the instruction from the Instruction Memory
• Reg/Dec: Registers Fetch and Instruction Decode
• Exec: Calculate the memory address
• Mem: Read the data from the Data Memory
• Wr: Write the data back to the register file

Cycle:  1       2        3     4    5
Load:   IFetch  Reg/Dec  Exec  Mem  Wr

Pipeline Execution
• On a processor, multiple instructions are in various stages at the same time.
• Assume each instruction takes five cycles

[Figure: six instructions; program flow runs top to bottom, time runs left to right]
IFetch  Dcd     Exec    Mem     WB
        IFetch  Dcd     Exec    Mem     WB
                IFetch  Dcd     Exec    Mem     WB
                        IFetch  Dcd     Exec    Mem     WB
                                IFetch  Dcd     Exec    Mem     WB
                                        IFetch  Dcd     Exec    Mem     WB
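The cycle count in such a diagram follows a simple formula. This is a minimal sketch with made-up function names; it assumes an ideal pipeline with no stalls and one instruction issued per cycle.

```python
def unpipelined_cycles(num_instructions, num_stages=5):
    """Every instruction uses all its cycles before the next one starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions, num_stages=5):
    """The first instruction fills the pipe; each later one finishes one cycle after the previous."""
    return num_stages + num_instructions - 1


def speedup(num_instructions, num_stages=5):
    return unpipelined_cycles(num_instructions, num_stages) / pipelined_cycles(num_instructions, num_stages)


for n in (6, 100, 1_000_000):
    print(n, pipelined_cycles(n), round(speedup(n), 3))
```

For the six instructions in the figure this gives 10 cycles instead of 30. The speedup approaches the number of stages only for long instruction streams, because the fill and drain time is amortized.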

Single Cycle, Multi-cycle, Pipelined
[Figure: clock timing of three implementations. Single-cycle implementation: Load in cycle 1 and Store in cycle 2, with wasted time in the Store cycle. Multiple-cycle implementation, cycles 1 to 10: Load (IFetch, Reg, Exec, Mem, Wr), then Store (IFetch, Reg, Exec, Mem), then R-type (IFetch). Pipeline implementation: Load, Store, and R-type overlapped, each passing through IFetch, Reg, Exec, Mem, Wr]

Why Pipeline? Because the Resources Are There!
[Figure: instruction order (Inst 0 to Inst 4) against time in clock cycles; each instruction passes through Im, Reg, ALU, Dm, Reg, starting one cycle after the previous instruction]

Pipelining MIPS Execution

Pipeline Hazards
• Structural hazard
  - An occurrence in which a planned instruction cannot execute in the proper clock cycle because the hardware cannot support the combination of instructions that are set to execute in the given clock cycle.
• Data hazard
  - Also called pipeline data hazard. An occurrence in which a planned instruction cannot execute in the proper clock cycle because data that is needed to execute the instruction is not yet available.
• Control hazard
  - Also called branch hazard. An occurrence in which the proper instruction cannot execute in the proper clock cycle because the instruction that was fetched is NOT the one that is needed; that is, the flow of instruction addresses is not what the pipeline expected.

Data Hazard
• Forwarding
  - Also called bypassing. A method of resolving a data hazard by retrieving the missing data element from internal buffers rather than waiting for it to arrive from programmer-visible registers or memory.
• Load-use data hazard
  - A specific form of data hazard in which the data requested by a load instruction has not yet become available when it is requested.
• Pipeline stall
  - Also called bubble. A stall initiated in order to resolve a hazard.

Control Hazard
• Untaken branch
  - One that falls through to the successive instruction. A taken branch is one that causes transfer to the branch target.
• Branch prediction
  - A method of resolving a branch hazard that assumes a given outcome for the branch and proceeds from that assumption rather than waiting to ascertain the actual outcome.

Pipeline Overview Summary
• Latency (pipeline)
  - The number of stages in a pipeline or the number of stages between two instructions during execution.
• Throughput (pipeline)
  - The number of instructions executed per unit time.

Designing a Pipelined Processor
• Examine the datapath and control diagram
  - Starting with single- or multi-cycle datapath?
  - Single- or multi-cycle control?
• Partition datapath into stages:
  - IF (instruction fetch), ID (instruction decode and register file read), EX (execution or address calculation), MEM (data memory access), WB (write back)
• Associate resources with states
• Ensure that flows do not conflict, or figure out how to resolve
• Assert control in appropriate stage

Use Multi-cycle Execution Steps
But use the single-cycle datapath (separate memory; why?)

Split Single-cycle Datapath
What to add to split the datapath into stages?

Add Pipeline Registers
• Use registers (flip-flops) between stages to carry data and control

Consider load
• IF: Instruction Fetch
  - Fetch the instruction from the Instruction Memory
• ID: Instruction Decode
  - Registers fetch and instruction decode
• EX: Calculate the memory address
• MEM: Read the data from the Data Memory
• WB: Write the data back to the register file

Cycle:  1       2        3     4    5
lw:     Ifetch  Reg/Dec  Exec  Mem  Wr

Pipelining load
• 5 functional units in the pipeline datapath are:
  - Instruction Memory for the IFetch stage
  - Register File’s Read ports (busA and busB) for the Reg/Dec stage
  - ALU for the Exec stage
  - Data Memory for the MEM stage
  - Register File’s Write port (busW) for the WB stage

Cycle:    1       2        3        4        5     6    7
1st lw:   Ifetch  Reg/Dec  Exec     Mem      Wr
2nd lw:           Ifetch   Reg/Dec  Exec     Mem   Wr
3rd lw:                    Ifetch   Reg/Dec  Exec  Mem  Wr

IF Stage of load word
• IF/ID = mem[PC]; PC = PC + 4

ID Stage of load word
• ID/EX(A) = Reg[IR[25-21]]; ID/EX(B) = Reg[IR[20-16]];
• ID/EX = sign-extension of IR[15-0]

EX Stage of load word
• EX/MEM = A + sign-ext(IR[15-0])   % address computation

MEM Stage of load word
• MEM/WB = mem[ALUout]

WB Stage of load
• Reg[IR[20-16]] = MEM/WB
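The five register transfers above can be traced in software. The following is a rough sketch, not the lab's reference model: the dictionary-based pipeline registers, the field names, and the word-keyed memories are assumptions made for illustration, and only a single lw is walked through the stages.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def run_lw(pc, imem, regs, dmem):
    """Walk one lw through IF, ID, EX, MEM, WB; returns the updated PC."""
    # IF: IF/ID = mem[PC]; PC = PC + 4
    if_id = {"IR": imem[pc]}
    pc += 4
    # ID: read both registers and sign-extend the immediate
    ir = if_id["IR"]
    rs = (ir >> 21) & 0x1F           # IR[25-21]
    rt = (ir >> 16) & 0x1F           # IR[20-16]
    id_ex = {"A": regs[rs], "B": regs[rt], "imm": sign_extend16(ir), "rt": rt}
    # EX: address computation
    ex_mem = {"ALUout": id_ex["A"] + id_ex["imm"], "rt": id_ex["rt"]}
    # MEM: MEM/WB = mem[ALUout]
    mem_wb = {"data": dmem[ex_mem["ALUout"]], "rt": ex_mem["rt"]}
    # WB: Reg[IR[20-16]] = MEM/WB
    regs[mem_wb["rt"]] = mem_wb["data"]
    return pc


# lw $10, 20($1): opcode 0x23, rs = 1, rt = 10, immediate = 20
LW_10_20_1 = (0x23 << 26) | (1 << 21) | (10 << 16) | 20
regs = [0] * 32
regs[1] = 100
next_pc = run_lw(0, {0: LW_10_20_1}, regs, {120: 42})
print(next_pc, regs[10])
```

Note that the destination register number travels down the pipeline with the data; the write in WB must use the rt of the instruction that is in WB, not of the one currently being decoded.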

Pipelined Datapath

The Four Stages of R-type
• IF: fetch the instruction from the Instruction Memory
• ID: registers fetch and instruction decode
• EX:
  - ALU operates on the two register operands
  - Update PC
• WB: write ALU output back to the register file

Cycle:   1       2        3     4
R-type:  Ifetch  Reg/Dec  Exec  Wr

Pipelining R-type and load
• We have a structural hazard:
  - Two instructions try to write to the register file at the same time!
  - Only one write port
[Figure: cycles 1 to 9; R-type, R-type, Load, R-type, R-type start in consecutive cycles. R-type: Ifetch, Reg/Dec, Exec, Wr. Load: Ifetch, Reg/Dec, Exec, Mem, Wr]
We have a problem!
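The collision can be located by computing the cycle in which each instruction reaches Wr. This is an illustrative sketch (the names are made up here), assuming one instruction is issued per cycle starting at cycle 1, in the order shown on the slide.

```python
from collections import defaultdict

# Stage in which each kind of instruction writes the register file.
WRITE_STAGE = {"R-type": 4, "Load": 5}


def write_port_conflicts(program, write_stage=WRITE_STAGE):
    """program: instruction kinds in issue order.
    Returns {cycle: [instruction indices]} for cycles with more than one register write."""
    writers = defaultdict(list)
    for index, kind in enumerate(program):
        issue_cycle = index + 1
        write_cycle = issue_cycle + write_stage[kind] - 1
        writers[write_cycle].append(index)
    return {cycle: idxs for cycle, idxs in writers.items() if len(idxs) > 1}


program = ["R-type", "R-type", "Load", "R-type", "R-type"]
print(write_port_conflicts(program))
```

Under these assumptions the Load (third instruction) and the R-type right after it both need the write port in cycle 7. Passing a table in which both kinds write in stage 5 makes the conflict disappear.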

Important Observation
• Each functional unit can only be used once per instruction
• Each functional unit must be used at the same stage for all instructions:
  - Load uses Register File’s write port during its 5th stage
  - R-type uses Register File’s write port during its 4th stage

Several ways to solve: 1) adding a pipeline bubble, 2) making instructions the same length

R-type:  Ifetch  Reg/Dec  Exec  Wr
Load:    Ifetch  Reg/Dec  Exec  Mem  Wr

Solution 1: Insert Bubble
• Insert a bubble into the pipeline to prevent two writes at the same cycle
  - The control logic can be complex
  - Lose instruction fetch and issue opportunity
• No instruction is started in Cycle 6
[Figure: cycles 1 to 9; R-type, R-type, Load, R-type, R-type, R-type, with a pipeline bubble inserted after the Load]

Solution 2: Delay R-type’s Write
• Delay R-type’s register write by one cycle:
  - R-type also uses Reg File’s write port at Stage 5
  - MEM is a NOP stage: nothing is being done.
[Figure: cycles 1 to 9; R-type, R-type, Load, R-type, R-type, R-type, each passing through Ifetch, Reg/Dec, Exec, Mem, Wr]

The Four Stages of store
• IF: fetch the instruction from the Instruction Memory
• ID: registers fetch and instruction decode
• EX: calculate the memory address
• MEM: write the data into the Data Memory

Add an extra stage:
• WB: NOP

Cycle:  1       2        3     4
Store:  Ifetch  Reg/Dec  Exec  Mem  (Wr)

The Three Stages of beq
• IF: fetch the instruction from the Instruction Memory
• ID: registers fetch and instruction decode
• EX:
  - compares the two register operands
  - select correct branch target address
  - latch into PC
• Add two extra stages:
  - MEM: NOP
  - WB: NOP
[Figure: cycles 1 to 4; stage sequence Ifetch, Reg/Dec, Exec, Mem, Wr]

Use Graphical Representation for Pipelined MIPS
• The graph can help to answer questions like:
  - How many cycles to execute this code?
  - What is the ALU doing during cycle 4?

Example 1: Cycles 1 to 6
[Figures: one pipeline diagram per clock cycle, cycle 1 through cycle 6]

Pipeline Control: Control Signals

Group Signals According to Stages
• Can use control signals of single-cycle CPU

Data Stationary Control
• Pass control signals along just like the data
• Main control generates control signals during ID

Data Stationary Control (cont.)
• Signals for EX (ExtOp, ALUSrc, ...) are used 1 cycle later
• Signals for MEM (MemWr, Branch) are used 2 cycles later
• Signals for WB (MemtoReg, RegWr) are used 3 cycles later
[Figure: Main Control in Reg/Dec generates ExtOp, ALUSrc, ALUOp, RegDst, MemWr, Branch, MemtoReg, RegWr. The ID/Ex register carries all of them to Exec; the Ex/Mem register carries MemWr, Branch, MemtoReg, RegWr to Mem; the Mem/Wr register carries MemtoReg, RegWr to Wr]
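Data-stationary control can be pictured as a control word that moves down the pipeline and sheds one group of signals per stage. The sketch below is an illustration only: the signal values chosen for lw and the helper name `advance` are assumptions, not taken from the slides.

```python
# Assumed control values for lw, grouped by the stage that consumes them.
LW_CONTROL = {
    "EX":  {"ExtOp": 1, "ALUSrc": 1, "ALUOp": "add", "RegDst": 0},
    "MEM": {"MemWr": 0, "Branch": 0},
    "WB":  {"MemtoReg": 1, "RegWr": 1},
}


def advance(control_from_id):
    """Yield (cycles_after_ID, stage, signals_used) as the control word moves down the pipe."""
    id_ex = dict(control_from_id)                   # ID/Ex holds the EX, MEM, and WB groups
    yield 1, "EX", id_ex["EX"]
    ex_mem = {k: id_ex[k] for k in ("MEM", "WB")}   # the EX group is dropped after use
    yield 2, "MEM", ex_mem["MEM"]
    mem_wb = {"WB": ex_mem["WB"]}                   # only the WB group reaches Mem/Wr
    yield 3, "WB", mem_wb["WB"]


for delay, stage, signals in advance(LW_CONTROL):
    print(delay, stage, signals)
```

Because the signals travel with the instruction, no central controller has to track which instruction is in which stage.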

Datapath with Control

Let’s Try it Out
Sample Assembly Program

    lw  $10, 20($1)
    sub $11, $2, $3
    and $12, $4, $5
    or  $13, $6, $7
    add $14, $8, $9

Example 2: Cycles 1 to 9
[Figures: one pipeline diagram per clock cycle, cycle 1 through cycle 9]

Summary of Pipeline Basics
• Pipelining is a fundamental concept
  - Multiple steps using distinct resources
• Utilize capabilities of datapath by pipelined instruction processing
  - Start next instruction while working on the current one
  - Limited by length of longest stage (plus fill/flush)
  - Need to detect and resolve hazards
• What makes it easy in MIPS?
  - All instructions are of the same length
  - Just a few instruction formats
  - Memory operands only in loads and stores
• What makes pipelining hard? Hazards

Data Hazards
• Order of operand accesses changed by pipeline
• Starting next instruction before first is finished
• Dependencies “go backward in time”

Handling Data Hazards
• Detect
• Resolve remaining ones
  - Compiler inserts NOP
  - Stall
  - Forward

Software Solution
• Have compiler guarantee no hazards
• Where do we insert the NOPs?

    sub $2, $1, $3
    and $12, $2, $5
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)

• Problem: not efficient enough!
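One way to answer "where do the NOPs go" is a small padding pass over the instruction list. This is a sketch under stated assumptions: there is no forwarding, and a consumer must be at least three instruction slots after its producer (which holds if the register file is written in the first half of a cycle and read in the second half; the slides do not state this). The tuple format and `insert_nops` are hypothetical.

```python
NOP = ("nop", None, ())


def insert_nops(program, min_distance=3):
    """program: list of (text, dest_reg, source_regs).
    Pads with NOPs so every reader is at least min_distance slots after its writer."""
    out = []
    for instr in program:
        _, _, sources = instr
        needed = 0
        # Look back over the most recent slots for a producer of any source register.
        for back in range(1, min_distance):
            if back > len(out):
                break
            dest = out[-back][1]
            if dest is not None and dest in sources:
                needed = max(needed, min_distance - back)
        out.extend([NOP] * needed)
        out.append(instr)
    return out


program = [
    ("sub $2, $1, $3", 2, (1, 3)),
    ("and $12, $2, $5", 12, (2, 5)),
    ("or $13, $6, $2", 13, (6, 2)),
    ("add $14, $2, $2", 14, (2, 2)),
    ("sw $15, 100($2)", None, (15, 2)),
]
for text, _, _ in insert_nops(program):
    print(text)
```

Under these assumptions two NOPs are placed between sub and and; the later readers of $2 are then far enough away. Those two wasted slots are the inefficiency the slide points out.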

Detecting Data Hazards
• Hazard conditions:
  - EX/MEM.RegisterRd = ID/EX.RegisterRs (1a)
  - EX/MEM.RegisterRd = ID/EX.RegisterRt (1b)
  - MEM/WB.RegisterRd = ID/EX.RegisterRs (2a)
  - MEM/WB.RegisterRd = ID/EX.RegisterRt (2b)
• Two optimizations:
  - Don’t forward if the instruction does not write a register => check if RegWrite is asserted
  - Don’t forward if the destination register is $0 => check if RegisterRd = 0

Detecting Data Hazards (cont.)
• Hazard conditions using control signals:
  - At EX stage (EX hazard):
      if (EX/MEM.RegWrite and (EX/MEM.RegRd ≠ 0)
          and (EX/MEM.RegRd = ID/EX.RegRs))
        ForwardA = 10
  - At MEM stage:
      MEM/WB.RegWrite and (MEM/WB.RegRd ≠ 0)
      and (MEM/WB.RegRd = ID/EX.RegRs)
• (substitute ID/EX.RegRt for ID/EX.RegRs for the other two conditions)

Resolving Hazards: Forwarding
• Use temporary results, e.g., those in pipeline registers; don’t wait for them to be written

Forwarding Logic
• Forwarding: input to ALU from any pipe registers
  - Add multiplexors to ALU input
  - Control forwarding in EX => carry Rs in ID/EX
• Control signals for forwarding:
  - If both WB and MEM forward, e.g., add $1,$1,$2; add $1,$1,$3; add $1,$1,$4; => let MEM forward
  - EX hazard:
      if (EX/MEM.RegWrite and (EX/MEM.RegRd ≠ 0)
          and (EX/MEM.RegRd = ID/EX.RegRs))
        ForwardA = 10
  - MEM hazard:
      if (MEM/WB.RegWrite and (MEM/WB.RegRd ≠ 0)
          and (EX/MEM.RegRd ≠ ID/EX.RegRs)
          and (MEM/WB.RegRd = ID/EX.RegRs))
        ForwardA = 01

(For ForwardB, replace ID/EX.RegRs with ID/EX.RegRt.)
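The forwarding conditions map directly onto a small selection function. This is a sketch rather than the lab's RTL: the dictionary fields and the function name are assumptions, and the MEM-hazard test is written as "no EX hazard for this operand", which is slightly broader than the EX/MEM.RegRd ≠ ID/EX.RegRs term on the slide.

```python
def forward_select(ex_mem, mem_wb, src_reg):
    """Return the 2-bit mux select for one ALU operand whose source register is src_reg.
    ex_mem / mem_wb: dicts with 'RegWrite' (bool) and 'RegRd' (register number)."""
    ex_hazard = ex_mem["RegWrite"] and ex_mem["RegRd"] != 0 and ex_mem["RegRd"] == src_reg
    if ex_hazard:
        return 0b10  # take the ALU result held in EX/MEM (the most recent value)
    mem_hazard = mem_wb["RegWrite"] and mem_wb["RegRd"] != 0 and mem_wb["RegRd"] == src_reg
    if mem_hazard:
        return 0b01  # take the value held in MEM/WB
    return 0b00      # no hazard: use the register file value


# Third instruction of: add $1,$1,$2; add $1,$1,$3; add $1,$1,$4
ex_mem = {"RegWrite": True, "RegRd": 1}   # second add
mem_wb = {"RegWrite": True, "RegRd": 1}   # first add
forward_a = forward_select(ex_mem, mem_wb, src_reg=1)   # Rs = $1
forward_b = forward_select(ex_mem, mem_wb, src_reg=4)   # Rt = $4
print(format(forward_a, "02b"), format(forward_b, "02b"))
```

Testing the EX hazard first gives the newer result priority when both pipeline registers hold a value for the same register, as in the three-add sequence.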

No Forwarding

With Forwarding

Pipeline with Forwarding

Example 3: Cycles 3 to 6
[Figures: one pipeline diagram per clock cycle, cycle 3 through cycle 6]

Can't Always Forward
• lw can still cause a hazard:
  - if it is followed by an instruction that reads the loaded register

Stalling
• Stall the pipeline by keeping instructions in the same stage and inserting a NOP instead

Handling Stalls
• Hazard detection unit in ID inserts a stall between a load instruction and its use:
    if (ID/EX.MemRead and
        ((ID/EX.RegisterRt = IF/ID.RegisterRs) or
         (ID/EX.RegisterRt = IF/ID.RegisterRt)))
      stall the pipeline for one cycle
  (ID/EX.MemRead = 1 indicates a load instruction)
• How to stall?
  - Stall the instructions in IF and ID: do not change PC and IF/ID => the stages re-execute the instructions
  - What to move into EX: insert a NOP by changing the EX, MEM, WB control fields of the ID/EX pipeline register to 0
    - as control signals propagate, all control signals to EX, MEM, WB are deasserted and no registers or memories are written
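The load-use check and its three effects can be sketched as a single function. The output names PCWrite, IFIDWrite, and ZeroControl are hypothetical labels for "hold PC", "hold IF/ID", and "zero the control fields entering ID/EX"; the slides describe the behaviour but do not name these signals.

```python
def hazard_unit(id_ex, if_id):
    """id_ex: {'MemRead', 'RegisterRt'} of the instruction in EX.
    if_id: {'RegisterRs', 'RegisterRt'} of the instruction in ID."""
    load_in_ex = id_ex["MemRead"]
    uses_loaded_reg = id_ex["RegisterRt"] in (if_id["RegisterRs"], if_id["RegisterRt"])
    stall = bool(load_in_ex and uses_loaded_reg)
    return {
        "PCWrite": not stall,      # keep PC unchanged while stalling
        "IFIDWrite": not stall,    # keep IF/ID unchanged so ID repeats
        "ZeroControl": stall,      # send a NOP (all-zero control) into EX
    }


# lw $2, 20($1) in EX, followed by and $4, $2, $5 in ID
print(hazard_unit({"MemRead": True, "RegisterRt": 2}, {"RegisterRs": 2, "RegisterRt": 5}))
```

After the one-cycle stall the load is in MEM/WB, so the dependent instruction can pick up the loaded value through the forwarding path.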

Pipeline with Stalling Unit
• Forwarding controls ALU inputs; hazard detection controls PC, IF/ID, and control signals

Example 4: Cycles 2 to 7
[Figures: one pipeline diagram per clock cycle, cycle 2 through cycle 7]

Outline
• 6.1 An Overview of Pipelining
• 6.2 A Pipelined Datapath
• 6.3 Pipelined Control
• 6.4 Data Hazards and Forwarding
• 6.5 Data Hazards and Stalls
• 6.6 Branch Hazards
• 6.8 Exceptions (optional)

Branch Hazards
• When we decide to branch, other instructions are already in the pipeline

Handling Branch Hazard
• Predict branch always not taken
  - Need to add hardware for flushing instructions if the prediction is wrong
  - Branch decision made at MEM => need to flush the instructions in IF, ID, EX by changing control values to 0
• Reduce delay of taken branch by moving branch execution earlier in the pipeline
  - Move up branch address calculation to ID
  - Check branch equality at ID (using XOR) by comparing the two registers read during ID
  - Branch decision made at ID => one instruction to flush
  - Add a control signal, IF.Flush, to zero the instruction field of IF/ID => making the instruction a NOP
• Dynamic branch prediction
• Compiler rescheduling, delayed branch
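The benefit of resolving the branch earlier can be estimated with a simple CPI model. The sketch is illustrative only: the branch frequency and taken rate below are made-up numbers, and the model assumes predict-not-taken, where only taken branches cost flushed instructions.

```python
def branch_cpi(base_cpi, branch_fraction, taken_fraction, flushed_per_taken):
    """Average cycles per instruction when each taken branch flushes a fixed number of slots."""
    return base_cpi + branch_fraction * taken_fraction * flushed_per_taken


# Hypothetical workload: 20% branches, half of them taken.
decide_at_mem = branch_cpi(1.0, 0.2, 0.5, flushed_per_taken=3)  # flush IF, ID, EX
decide_at_id = branch_cpi(1.0, 0.2, 0.5, flushed_per_taken=1)   # flush only IF
print(round(decide_at_mem, 2), round(decide_at_id, 2))
```

With these assumed numbers the branch penalty contributes three times as many lost cycles when the decision is made in MEM as when it is made in ID.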

Delayed Branch
• Predict-not-taken + branch decision at ID
  => the following instruction is always executed
  => branches take effect 1 cycle later
• 0 clock cycles lost per branch instruction if an instruction can be found to put in the slot (≅50% of the time)

Pipeline with Flushing

Example 5: Cycles 3 and 4
[Figures: one pipeline diagram per clock cycle, cycle 3 and cycle 4]

What about Exceptions?
• 5 instructions executing in a 5-stage pipeline
  - How to stop the pipeline? How to restart?
  - Who caused the interrupt?
  - Who to serve first, if multiple interrupts occur at the same time?
• Need to know in which stage an exception can occur

Stage  Problem interrupts occurring
IF     Page fault; misaligned memory access; memory-protection violation
ID     Undefined or illegal opcode
EX     Arithmetic exception
MEM    Page fault; misaligned memory access; memory error; memory-protection violation

Handling Exceptions
• Suppose overflow occurs at add $1,$2,$1
  - Disable writes of instructions until the trap hits WB, e.g., flush the following instructions using IF.Flush, ID.Flush, EX.Flush to cause multiplexors to zero control signals (overflow exception detected at EX => flush offending instruction)
  - Force trap instruction into IF, e.g., fetch from 4000 0040hex by adding 4000 0040hex to the PC input MUX
  - Save address of offending instruction in EPC
• Multiple interrupts: use priority hardware to choose the earliest instruction to interrupt
• External interrupts: flexible in when to interrupt
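The "earliest instruction first" rule can be expressed as a priority pick over the stages. This is a sketch under the assumption of an in-order five-stage pipeline in which several stages report an exception in the same clock cycle; the function and its input format are hypothetical.

```python
STAGE_ORDER = ["IF", "ID", "EX", "MEM"]  # youngest instruction first, oldest last


def select_exception(pending):
    """pending: {stage: cause} for exceptions raised in the same clock cycle.
    The instruction furthest down the pipe is earliest in program order, so it is served first."""
    for stage in reversed(STAGE_ORDER):
        if stage in pending:
            return stage, pending[stage]
    return None


print(select_exception({"IF": "page fault", "EX": "arithmetic overflow"}))
```

Here the overflow in EX wins over the page fault in IF, because the instruction in EX was fetched earlier.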

Pipeline with Exception