Lecture 16: Pipeline Controlsjtang/cs411.s20/lectures/L16... · 2020-03-26 · Lecture 16: Pipeline...

25
Lecture 16: Pipeline Controls Spring 2020 Jason Tang 1

Transcript of Lecture 16: Pipeline Controlsjtang/cs411.s20/lectures/L16... · 2020-03-26 · Lecture 16: Pipeline...

Lecture 16: Pipeline Controls

Spring 2020Jason Tang

�1

Topics

• Designing pipelined datapath

• Controlling pipeline operations

�2

Instruction Pipelining

• Goal is to start executing next instruction on every clock cycle

• Pipelining only works when there are no resource contentions

• If so, then either intentionally stall pipeline or duplicate those resources

�3

Fetch Decode Execute Memory Write Back

Fetch Decode Execute Memory Write Back

Fetch Decode Execute Memory Write Back

Fetch Decode Execute Memory Write Back

Fetch Decode Execute Memory Write Back

Time

Program Flow

Multicycle Datapath

�4

IF: Instruction Fetch ID: Instruction Decode / Register

File Read

EX: Execute / Address Calculation

MEM: Mem Access WB: Write Back

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

Pipelined Datapath

�5

IF: Instruction Fetch ID: Instruction Decode / Register

File Read

EX: Execute / Address Calculation

MEM: Mem Access WB: Write Back

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

IF /

ID

ID /

EX

EX /

MEM

MEM

/ W

B

Pipeline Registers

�6

IF: Instruction Fetch ID: Instruction Decode / Register

File Read

EX: Execute / Address Calculation

MEM: Mem Access WB: Write Back

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

PC

IR

PC

A

B

Ext

Out

B

M

Out

Pipeline Controls

�7

IF: Instruction Fetch ID: Instr Decode EX: Execute MEM: Mem Access WB: Write

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

PC

IR

PC

A

B

Ext

Out

B

Out

Decoder

EX Ctrls

MEM Ctrls

WB Ctrls

MEM CtrlsWB Ctrls

WB Ctrls

M

ID ControlsEX Controls MEM Controls WB Controls

Controls and Datapath

�8

A ← R[n]B ← R[m]

Imm* ← Extend(*)PC ← ALUOut

ALUOut ← A op B

R[d] ← ALUOut

ALUOut ← A + Imm9

MDR ← Mem[ALUOut]

R[d] ← MDR PC ← ALUOut

Mem[ALUOut] ← B

ALUOut ← PrevPC + Imm19

ALUOut ← A + 0

IR ← Mem[PC]ALUOut ← PC + 4

ID Controls

EX Controls

MEM Controls

WB Controls

IF Controls

Pipeline Control Generation

• In single and multicycle datapaths, a single instruction decoder takes instruction register and generates control signals

• In pipelined datapaths, either:

• Single instruction decoder decodes upfront, then writes controls into data stationary registers to be used on subsequent cycles, or

• Instruction register value itself is copied into data stationary registers, and then a local decoder generates signals for that portion of the datapath

�9

Pipelining Instructions

• Pipeline conflict (specifically, a structural hazard) when two instructions try to write to register file on same clock cycle

• Only one write port to register file

• Caused by uneven pipeline stages

�10

IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-Type WB CtrlsR-Type

R-Type IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-Type WB Ctrls

ClkCycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9 Cycle 10

load IF Ctrls loadID Ctrls

loadEX Ctrls

loadMEM Ctrl

loadWB Ctrls

R-Type IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-Type WB Ctrls

R-Type IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-Type WB Ctrls

Solution 1: Insert Bubble into Pipeline

• Insert bubble into pipeline to prevent simultaneous writes

• Control logic can be complex

• No instruction started on Cycle 7

�11

IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-Type WB CtrlsR-Type

R-Type IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-Type WB Ctrls

ClkCycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9 Cycle 10

load IF Ctrls loadID Ctrls

loadEX Ctrls

loadMEM Ctrl

loadWB Ctrls

R-Type IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-Type WB Ctrls

R-Type IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-Type WB Ctrls

R-Type IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-Type WB Ctrls

R-Type IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

bubble

bubble

bubble

bubble bubble bubble bubble

Solution 2: Delay Write by One Cycle

• Delay R-Type’s write by one cycle

• R-Type MEM Ctrl is a no-op; it generates no controls

• Now pipeline has same length for all instruction types

�12

IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-Type WB CtrlsR-Type

R-Type IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-Type WB Ctrls

ClkCycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9 Cycle 10

load IF Ctrls loadID Ctrls

loadEX Ctrls

loadMEM Ctrl

loadWB Ctrls

R-Type IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-Type WB Ctrls

R-Type IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-Type WB Ctrls

R-Type IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-Type WB Ctrls

R-Type IF Ctrls R-TypeID Ctrls

R-TypeEX Ctrls

R-TypeMEM Ctrl

R-TypeMEM Ctrl

R-TypeMEM Ctrl

R-TypeMEM Ctrl

R-TypeMEM Ctrl

R-TypeMEM Ctrl

Pipelined Operations

• IM = Instruction Memory, Reg = Register File, DM = Data Memory

• Shaded on right side = read, Shaded on left = write

�13

ldur X10, [X10, #40]

sub X11, X2, X3

add X12, X3, X4

stur X13, [X1, #48]

add X14, X5, X6

IM Reg DM RegALU

IM Reg DM RegALU

IM Reg DM RegALU

IM Reg DM RegALU

IM Reg DM RegALU

Time

Program Flow

Pipeline Controls

• PC is updated every cycle, so no need for PCWrite signal

• Pipeline registers (green boxes) also updated every cycle

�14

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

MemRead MemWrite

MemToReg

ALUSrcB

ALUSrcARegWrite ALUOp

Setting Pipeline Controls

• Instruction decoder calculates control signal values given instruction register

• Writes those values to pipeline registers, to be used by later cycles

�15

InstructionEX Ctrls MEM Ctrls WB Ctrls

ALUSrcA ALUSrcB ALUOp MemRead MemWrite MemToReg RegWrite

add A B add 0 0 Out 1

sub A B sub 0 0 Out 1

ldur A Ext (imm9) add 1 0 M 1

stur A Ext (imm9) add 0 1 X 0

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

PC

IR

PC

A

B

Ext

Out

B

Out

Decoder

EX Ctrls

MEM Ctrls

WB Ctrls

MEM CtrlsWB Ctrls

WB Ctrls

M

MemRead MemWrite

MemToReg

ALUSrcB

ALUSrcARegWrite ALUOp

IR

PC PC

A

BB

Out

Out

Ext

EXCtrlsMEMCtrls

MEMCtrlsWBCtrls

WBCtrls

WBCtrls

M00

Example of Pipeline: Cycle 0

• Initial state (Cycle 0): PC = 00

�16

00: ldur04: sub08: add0C: stur10: add

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

PC

IR

PC

A

B

Ext

Out

B

Out

Decoder

EX Ctrls

MEM Ctrls

WB Ctrls

MEM CtrlsWB Ctrls

WB Ctrls

M

MemRead MemWrite

MemToReg

ALUSrcB

ALUSrcARegWrite ALUOp

IR

PC PC

A

BB

Out

Out

Ext

EXCtrlsMEMCtrls

MEMCtrlsWBCtrls

WBCtrls

WBCtrls

M04

00

ldur

Example of Pipeline: After Cycle 1

• Fetch from 00h, then increment PC

�17

IF 00: ldur04: sub08: add0C: stur10: add

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

PC

IR

PC

A

B

Ext

Out

B

Out

Decoder

EX Ctrls

MEM Ctrls

WB Ctrls

MEM CtrlsWB Ctrls

WB Ctrls

M

MemRead MemWrite

MemToReg

ALUSrcB

ALUSrcARegWrite ALUOp

IR

PC PC

A

BB

Out

Out

Ext

EXCtrlsMEMCtrls

MEMCtrlsWBCtrls

WBCtrls

WBCtrls

M08

04

sub

00

X10

X

#40

A, Ext, add

1, 0

M, 1, X10

Example of Pipeline: After Cycle 2

• Fetch from 04h; decode ldur X10, [X10, #40] and generate controls

�18

ID 00: ldurIF 04: sub

08: add0C: stur10: add

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

PC

IR

PC

A

B

Ext

Out

B

Out

Decoder

EX Ctrls

MEM Ctrls

WB Ctrls

MEM CtrlsWB Ctrls

WB Ctrls

M

MemRead MemWrite

MemToReg

ALUSrcB

ALUSrcARegWrite ALUOp

IR

PC PC

A

BB

Out

Out

Ext

EXCtrlsMEMCtrls

MEMCtrlsWBCtrls

WBCtrls

WBCtrls

M0C

08

add

04

X2

X3

X

A, B, sub

0, 0

M, 1, X10

1, 0

A

Ext

add

X

sum

Out, 1, X11

Example of Pipeline: After Cycle 3

• Fetch from 08h; decode sub X11, X2, X3; execute ldur

�19

EX 00: ldurID 04: subIF 08: add

0C: stur10: add

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

PC

IR

PC

A

B

Ext

Out

B

Out

Decoder

EX Ctrls

MEM Ctrls

WB Ctrls

MEM CtrlsWB Ctrls

WB Ctrls

M

MemRead MemWrite

MemToReg

ALUSrcB

ALUSrcARegWrite ALUOp

IR

PC PC

A

BB

Out

Out

Ext

EXCtrlsMEMCtrls

MEMCtrlsWBCtrls

WBCtrls

WBCtrls

M10

0C

stur

08

X3

X4

X

A, B, add

0, 0

Out, 1, X11

0, 0

A

B

sub

X3

diff

Out, 1, X12

1 0

M, 1, X10

sum

D

Example of Pipeline: After Cycle 4

• Fetch 0Ah; decode add X12, X3, X4; execute sub; access ldur

�20

MEM 00: ldurEX 04: subID 08: addIF 0C: stur

10: add

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

PC

IR

PC

A

B

Ext

Out

B

Out

Decoder

EX Ctrls

MEM Ctrls

WB Ctrls

MEM CtrlsWB Ctrls

WB Ctrls

M

MemRead MemWrite

MemToReg

ALUSrcB

ALUSrcARegWrite ALUOp

14

10

add

0C

X1

X13

#48

A, Ext, add

0, 1

Out, 1, X12

0, 0

A

B

add

X4

sum

X, 0, X

0 0

Out, 1, X11

diff

1

X10

MX

Example of Pipeline: After Cycle 5

• Fetch 10h; decode stur X13, [X1, #48]; execute add; access sub (noop); write ldur

�21

WB 00: ldurMEM 04: subEX 08: addID 0C: sturIF 10: add

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

PC

IR

PC

A

B

Ext

Out

B

Out

Decoder

EX Ctrls

MEM Ctrls

WB Ctrls

MEM CtrlsWB Ctrls

WB Ctrls

MX

MemRead MemWrite

MemToReg

ALUSrcB

ALUSrcARegWrite ALUOp

IR

PC PC

A

BB

Out

Out

Ext

EXCtrlsMEMCtrls

MEMCtrlsWBCtrls

WBCtrls

WBCtrls

M18

14 10

X5

X6

X

A, B, add

0,0

X, 0, X

0, 1

A

Ext

add

X13

sum

Out, 1, X14

0 0

Out, 1, X12

diff

1

X11

OutX

Example of Pipeline: After Cycle 6

• Decode add X14, X5, X6; execute stur; access add (noop); write sub

�22

00: ldurWB 04: sub

MEM 08: addEX 0C: sturID 10: add

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

PC

IR

PC

A

B

Ext

Out

B

Out

Decoder

EX Ctrls

MEM Ctrls

WB Ctrls

MEM CtrlsWB Ctrls

WB Ctrls

MX

MemRead MemWrite

MemToReg

ALUSrcB

ALUSrcARegWrite ALUOp

IR

PC PC

A

BB

Out

Out

Ext

EXCtrlsMEMCtrls

MEMCtrlsWBCtrls

WBCtrls

WBCtrls

M1C

18 14

Out, 1, X14

0, 0

A

B

add

X6

sum

0 1

X, 0, X

sum

1

X12

OutX

Example of Pipeline: After Cycle 7

• Execute add; access stur; write add

�23

00: ldur04: sub

WB 08: addMEM 0C: sturEX 10: add

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

PC

IR

PC

A

B

Ext

Out

B

Out

Decoder

EX Ctrls

MEM Ctrls

WB Ctrls

MEM CtrlsWB Ctrls

WB Ctrls

MX

MemRead MemWrite

MemToReg

ALUSrcB

ALUSrcARegWrite ALUOp

IR

PC PC

A

BB

Out

Out

Ext

EXCtrlsMEMCtrls

MEMCtrlsWBCtrls

WBCtrls

WBCtrls

M20

1C 18

0 0

Out, 1, X14

sum

0

X

XX

Example of Pipeline: After Cycle 8

• Access add (noop); write stur (noop)

�24

00: ldur04: sub08: add

WB 0C: sturMEM 10: add

Register FileDataA Sel

B Sel

W Sel

ALU

ExtendData

Memory

addr readdata

write data

adder

PC

4

Instruction Memory

addr readdata

PC

IR

PC

A

B

Ext

Out

B

Out

Decoder

EX Ctrls

MEM Ctrls

WB Ctrls

MEM CtrlsWB Ctrls

WB Ctrls

MM

MemRead MemWrite

MemToReg

ALUSrcB

ALUSrcARegWrite ALUOp

IR

PC PC

A

BB

Out

Out

Ext

EXCtrlsMEMCtrls

MEMCtrlsWBCtrls

WBCtrls

WBCtrls

M24

20 1C1

X14

Out

Example of Pipeline: After Cycle 9

• Write add

�25

00: ldur04: sub08: add0C: stur

WB 10: add