Mohamed Younis CMCS 411, Computer Architecture 1
CMSC 411-101 Computer Architecture
Lecture 21
Handling Pipeline Hazards
April 16, 2001
www.csee.umbc.edu/~younis/CMSC411/CMSC411.htm
Lecture’s Overview
Previous Lecture:
• Designing a pipelined datapath
  - Standardized multi-stage instruction execution
  - Unique resources per stage
• Controlling pipeline operations
  - Simplified stage-based control
  - Detailed working example
This Lecture:
• Pipeline hazard detection
• Handling hazards in the pipeline design
• Supporting exceptions
Pipelined Datapath
[Figure: pipelined datapath. The PC, instruction memory, register file, sign-extend unit (16 to 32 bits), shift-left-2 unit, adders, ALU, data memory, and muxes are separated by the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers. The slide labels this organization "Data Stationary".]
Standardized Five Stages
Instruction   Cycle 1   Cycle 2   Cycle 3   Cycle 4   Cycle 5
Load          Ifetch    Reg/Dec   Exec      Mem       Wr
Store         Ifetch    Reg/Dec   Exec      Mem       Wr
R-type        Ifetch    Reg/Dec   Exec      Mem       Wr
Beq           Ifetch    Reg/Dec   Exec      Mem       Wr
Ori           Ifetch    Reg/Dec   Exec      Mem       Wr
Data Stationary Control
The Main Control generates the control signals during Reg/Dec:
• Control signals for Exec (ExtOp, ALUOp, RegDst, ALUSrc) are used 1 cycle later
• Control signals for Mem (MemWr, Branch) are used 2 cycles later
• Control signals for Wr (MemtoReg, RegWr) are used 3 cycles later
[Figure: the Main Control outputs (ExtOp, ALUOp, RegDst, ALUSrc, Branch, MemWr, MemtoReg, RegWr) are latched into the ID/Ex register. Branch, MemWr, MemtoReg, and RegWr continue into the Ex/Mem register. MemtoReg and RegWr continue into the Mem/Wr register.]
* Slide is courtesy of Dave Patterson
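The idea of control that travels with the instruction can be sketched in a few lines of Python. This is only an illustrative model, not the slide's hardware: the function names `decode_load` and `clock` and the specific control values for a load are assumptions, while the signal names and their grouping per stage come from the slide.

```python
# Signal groups per stage, as listed on the slide.
EX_SIGNALS = ("ExtOp", "ALUOp", "RegDst", "ALUSrc")
MEM_SIGNALS = ("Branch", "MemWr")
WR_SIGNALS = ("MemtoReg", "RegWr")


def decode_load():
    """Control word for a load; the specific values are illustrative assumptions."""
    return {"ExtOp": 1, "ALUOp": "add", "RegDst": 0, "ALUSrc": 1,
            "Branch": 0, "MemWr": 0, "MemtoReg": 1, "RegWr": 1}


def clock(pipe, new_word):
    """One clock edge: shift control words one pipeline register to the right.

    pipe is (ID/Ex, Ex/Mem, Mem/Wr); each register keeps only the signals
    that later stages still need.
    """
    id_ex, ex_mem, _ = pipe
    next_ex_mem = None if id_ex is None else {
        k: id_ex[k] for k in MEM_SIGNALS + WR_SIGNALS}
    next_mem_wr = None if ex_mem is None else {
        k: ex_mem[k] for k in WR_SIGNALS}
    return (new_word, next_ex_mem, next_mem_wr)


pipe = (None, None, None)
pipe = clock(pipe, decode_load())   # 1 cycle after Reg/Dec: Exec signals in ID/Ex
pipe = clock(pipe, None)            # 2 cycles after: Mem signals in Ex/Mem
pipe = clock(pipe, None)            # 3 cycles after: Wr signals in Mem/Wr
print(pipe[2])
```

Each pipeline register carries fewer signals than the one before it, because a signal is dropped once the stage that uses it has passed.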
Datapath + Data Stationary Control
[Figure: the pipelined datapath with the Control unit added in the decode stage. The control outputs are split into EX, M, and WB fields that are carried through the ID/EX, EX/MEM, and MEM/WB registers. Labeled signals: PCSrc, Branch, RegDst, ALUSrc, ALUOp, MemRead. The ALU control unit takes the 6-bit function field from Instruction[15-0]. Instruction[20-16] and Instruction[15-11] feed the mux that selects the write register.]
Data Hazards
• Generally caused by data dependences among instructions
• Can produce incorrect results if they go undetected and are not avoided
• Raised when an instruction depends on the result of a prior instruction that is still in the pipeline and attempts to use the item before it is ready
Example (program execution order):
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
Clock cycle:           CC1  CC2  CC3  CC4  CC5     CC6  CC7  CC8  CC9
Value of register $2:  10   10   10   10   10/-20  -20  -20  -20  -20
[Figure: pipeline diagram of the five instructions over CC 1 to CC 9 (IM, Reg, DM, Reg), time in clock cycles.]
Dependency Detection
[Figure: four consecutive instructions in program flow order, each passing through IFetch, Dec, Exec, Mem, WB, staggered by one cycle along the time axis.]
Read after write (RAW) data hazard
• A data hazard is caused by a backward dependency at the Dec stage
• The hazardous register is to be updated by an earlier instruction that is in the EX, MEM, or WB stage of the pipeline
• Register writing is assumed to take precedence over reading when both happen in the same clock cycle, so an instruction in the WB stage will not cause a data hazard
Data hazard conditions:
1a. EX/MEM.Register-Rd = ID/EX.Register-Rs
1b. EX/MEM.Register-Rd = ID/EX.Register-Rt
2a. MEM/WB.Register-Rd = ID/EX.Register-Rs
2b. MEM/WB.Register-Rd = ID/EX.Register-Rt
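The four conditions above can be checked mechanically. The sketch below is a minimal illustration; the function name `data_hazards` and the use of `None` for "no instruction in that pipeline register" are assumptions made for this example.

```python
def data_hazards(ex_mem_rd, mem_wb_rd, id_ex_rs, id_ex_rt):
    """Return the labels (1a, 1b, 2a, 2b) of the hazard conditions that hold."""
    found = set()
    if ex_mem_rd is not None and ex_mem_rd == id_ex_rs:
        found.add("1a")
    if ex_mem_rd is not None and ex_mem_rd == id_ex_rt:
        found.add("1b")
    if mem_wb_rd is not None and mem_wb_rd == id_ex_rs:
        found.add("2a")
    if mem_wb_rd is not None and mem_wb_rd == id_ex_rt:
        found.add("2b")
    return found


# sub $2, $1, $3 is in EX/MEM while and $12, $2, $5 is in ID/EX.
print(data_hazards(ex_mem_rd=2, mem_wb_rd=None, id_ex_rs=2, id_ex_rt=5))

# One cycle later: sub is in MEM/WB, and is in EX/MEM, or $13, $6, $2 is in ID/EX.
print(data_hazards(ex_mem_rd=12, mem_wb_rd=2, id_ex_rs=6, id_ex_rt=2))
```

The first call reports condition 1a and the second reports 2b, matching the `sub`/`and`/`or` sequence used in the earlier data hazard example.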
Data Forwarding
• Detecting data hazard conditions allows intermediate results to be used before they are written back to the register file
• Forward only when the earlier instruction actually writes a register in its WB stage (check the RegWrite signal)
Example (program execution order):
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
Clock cycle:           CC1  CC2  CC3  CC4  CC5     CC6  CC7  CC8  CC9
Value of register $2:  10   10   10   10   10/-20  -20  -20  -20  -20
Value of EX/MEM:       X    X    X    -20  X       X    X    X    X
Value of MEM/WB:       X    X    X    X    -20     X    X    X    X
[Figure: pipeline diagram of the five instructions over CC 1 to CC 9 (IM, Reg, DM, Reg), time in clock cycles.]
Forwarding Hardware
[Figure: EX stage with a forwarding unit. The unit receives Rs and Rt from ID/EX together with EX/MEM.RegisterRd and MEM/WB.RegisterRd, and drives the ForwardA and ForwardB selects of the muxes at the two ALU inputs. Each mux chooses among the register file value, the EX/MEM result, and the MEM/WB value.]
EX Hazard:
if (EX/MEM.RegWrite & (EX/MEM.Register-Rd = ID/EX.Register-Rs)) ForwardA = EX/MEM
if (EX/MEM.RegWrite & (EX/MEM.Register-Rd = ID/EX.Register-Rt)) ForwardB = EX/MEM
MEM Hazard:
if (MEM/WB.RegWrite & (MEM/WB.Register-Rd = ID/EX.Register-Rs)) ForwardA = MEM/WB
if (MEM/WB.RegWrite & (MEM/WB.Register-Rd = ID/EX.Register-Rt)) ForwardB = MEM/WB
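As a rough illustration of these conditions, the sketch below tests the EX and MEM hazards separately for one source register. The `PipeReg` type and the function names are hypothetical; only the RegWrite and Rd comparisons come from the slide.

```python
from typing import NamedTuple, Optional


class PipeReg(NamedTuple):
    """The fields of a pipeline register that the forwarding unit looks at."""
    reg_write: bool        # RegWrite control bit carried with the instruction
    rd: Optional[int]      # destination register number, None if unused


def ex_hazard(ex_mem: PipeReg, src: int) -> bool:
    """EX hazard: the instruction one ahead writes the register we read."""
    return ex_mem.reg_write and ex_mem.rd == src


def mem_hazard(mem_wb: PipeReg, src: int) -> bool:
    """MEM hazard: the instruction two ahead writes the register we read."""
    return mem_wb.reg_write and mem_wb.rd == src


sub = PipeReg(reg_write=True, rd=2)       # sub $2, $1, $3
sw = PipeReg(reg_write=False, rd=None)    # sw writes no register

print(ex_hazard(sub, 2))    # and $12, $2, $5 reads $2 as Rs
print(mem_hazard(sub, 2))   # or $13, $6, $2 reads $2 as Rt one cycle later
print(ex_hazard(sw, 2))     # no forwarding when RegWrite is not set
```

Calling each function once with Rs and once with Rt gives the ForwardA and ForwardB decisions respectively.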
Forwarding Register of Successive Use
[Figure: pipelined datapath with the forwarding unit. IF/ID.RegisterRs, IF/ID.RegisterRt, and IF/ID.RegisterRd are carried into ID/EX. The forwarding unit compares Rs and Rt with EX/MEM.RegisterRd and MEM/WB.RegisterRd and controls the muxes at the ALU inputs.]
Example:
add $1, $1, $2;
add $1, $1, $3;
add $1, $1, $4;
When both EX/MEM and MEM/WB hold a result for the same register, the more recent one (EX/MEM) must be used, so the MEM hazard conditions become:
if (MEM/WB.RegWrite & (EX/MEM.Register-Rd ≠ ID/EX.Register-Rs) &
(MEM/WB.Register-Rd = ID/EX.Register-Rs)) ForwardA = MEM/WB
if (MEM/WB.RegWrite & (EX/MEM.Register-Rd ≠ ID/EX.Register-Rt) &
(MEM/WB.Register-Rd = ID/EX.Register-Rt)) ForwardB = MEM/WB
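A minimal sketch of the combined selection, assuming a hypothetical helper `forward_source` that is called once per ALU operand. It follows the conditions as written on the slides; special handling of register $0 is not modeled here.

```python
def forward_source(ex_mem_regwrite, ex_mem_rd, mem_wb_regwrite, mem_wb_rd, src):
    """Pick where the ALU operand for register `src` comes from."""
    # EX hazard: the most recent result wins.
    if ex_mem_regwrite and ex_mem_rd == src:
        return "EX/MEM"
    # MEM hazard, only when EX/MEM does not already target the same register.
    if mem_wb_regwrite and ex_mem_rd != src and mem_wb_rd == src:
        return "MEM/WB"
    return "REG"   # no hazard: use the value read from the register file


# Third add of the example in EX: add $1, $1, $4.
# EX/MEM holds the second add (Rd = 1), MEM/WB holds the first add (Rd = 1).
print(forward_source(True, 1, True, 1, src=1))   # Rs = $1
print(forward_source(True, 1, True, 1, src=4))   # Rt = $4
```

For `$1` the newer value in EX/MEM is selected even though MEM/WB also matches, which is the case the extra condition exists to handle.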
Data Forwarding Example (Clock 3)
Stage   Instruction
IF      or $4, $4, $2
ID      and $4, $2, $5
EX      sub $2, $1, $3
MEM     before<1>
WB      before<2>
[Figure: datapath snapshot with the forwarding unit; sub reads $1 and $3 and writes $2.]
Data Forwarding Example (Clock 4)
Stage   Instruction
IF      add $9, $4, $2
ID      or $4, $4, $2
EX      and $4, $2, $5
MEM     sub $2, . . .
WB      before<1>
[Figure: datapath snapshot with the forwarding unit; the and instruction takes $2 from EX/MEM, where the result of sub is held.]
Data Forwarding Example (Clock 5)
Stage   Instruction
IF      after<1>
ID      add $9, $4, $2
EX      or $4, $4, $2
MEM     and $4, . . .
WB      sub $2, . . .
[Figure: datapath snapshot with the forwarding unit; the or instruction takes $4 from EX/MEM (result of and) and $2 from MEM/WB (result of sub).]
Data Forwarding Example (Clock 6)
Stage   Instruction
IF      after<2>
ID      after<1>
EX      add $9, $4, $2
MEM     or $4, . . .
WB      and $4, . . .
[Figure: datapath snapshot with the forwarding unit; the add instruction takes $4 from EX/MEM (the more recent result, from or) and $2 from the register file.]
Data Hazard Caused by Load
Example (program execution order):
lw $2, 20($1)
and $4, $2, $5
or $8, $2, $6
add $9, $4, $2
slt $1, $6, $7
[Figure: pipeline diagram of the five instructions over CC 1 to CC 9 (IM, Reg, DM, Reg), time in clock cycles.]
• Dependencies that go backward in time are hazards
• They cannot be solved with forwarding
• The instruction that depends on the load must be delayed (stalled)
Hazard Detection Unit
Example (program execution order):
lw $2, 20($1)
and $4, $2, $5
or $8, $2, $6
add $9, $4, $2
slt $1, $6, $7
[Figure: pipeline diagram of the five instructions over CC 1 to CC 10 (IM, Reg, DM, Reg). A bubble is inserted after lw, so the and instruction and everything behind it is delayed by one cycle.]
if (ID/EX.MemRead & ((ID/EX.Register-Rt = IF/ID.Register-Rs) |
(ID/EX.Register-Rt = IF/ID.Register-Rt))) stall the pipeline
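The stall condition translates directly into a predicate. The sketch below is illustrative only; `must_stall` and `total_cycles` are hypothetical names, and the cycle count uses the usual formula for an ideal pipeline (stages + instructions - 1) plus one cycle per bubble.

```python
def must_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in IF/ID needs the result of a load in ID/EX."""
    return bool(id_ex_mem_read and
                (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt))


def total_cycles(n_instructions, n_stages=5, bubbles=0):
    """Cycles to run n instructions through the pipeline, counting bubbles."""
    return n_stages + n_instructions - 1 + bubbles


# lw $2, 20($1) is in ID/EX (MemRead = 1, Rt = 2); and $4, $2, $5 is in IF/ID.
print(must_stall(True, 2, if_id_rs=2, if_id_rt=5))

# Five instructions: CC 9 without the bubble, CC 10 with it.
print(total_cycles(5), total_cycles(5, bubbles=1))
```

The counts 9 and 10 agree with the last clock cycle shown in the two pipeline diagrams for this instruction sequence.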
Supporting Pipeline Stall
[Figure: pipelined datapath with both a hazard detection unit and a forwarding unit. The hazard detection unit takes ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs, and IF/ID.RegisterRt as inputs and controls a mux that selects either the Control outputs or 0 for the control fields written into ID/EX.]
De-asserting all the control lines in the EX, MEM, and WB control fields of the ID/EX pipeline register inserts a bubble and stalls the pipeline
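One way to picture the stall is as a choice made at each clock edge in the decode stage. The sketch below is a simplified model with hypothetical names (`BUBBLE`, `decode_step`). Holding the PC and IF/ID during a stall is an assumption of this sketch; the slide itself only mentions zeroing the control fields.

```python
BUBBLE = {"EX": 0, "M": 0, "WB": 0}   # all control fields de-asserted


def decode_step(pc, if_id, fetched, control, stall):
    """Return (next PC, next IF/ID contents, control word written to ID/EX)."""
    if stall:
        # Assumption: PC and IF/ID keep their values so the same instruction
        # is decoded again next cycle, while a bubble enters ID/EX.
        return pc, if_id, dict(BUBBLE)
    return pc + 4, fetched, control


# Stalled cycle: nothing advances and a bubble is injected.
print(decode_step(8, "and $4, $2, $5", "or $8, $2, $6",
                  {"EX": 1, "M": 0, "WB": 1}, stall=True))

# Normal cycle: the fetched instruction moves into IF/ID.
print(decode_step(8, "and $4, $2, $5", "or $8, $2, $6",
                  {"EX": 1, "M": 0, "WB": 1}, stall=False))
```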
Control Hazards
Example (address and instruction, in program execution order):
40 beq $1, $3, 7
44 and $12, $2, $5
48 or $13, $6, $2
52 add $14, $2, $2
72 lw $4, 50($7)
[Figure: pipeline diagram of the five instructions over CC 1 to CC 9 (IM, Reg, DM, Reg), time in clock cycles.]
• Must be carefully detected and handled to avoid semantic errors
• Can be solved by stalling the pipeline, which is inefficient
• Two approaches: “delayed branching” and “branch prediction”
Delayed Branching
[Figure: pipelined datapath in which the register comparison (=) and the branch target adder (sign extend, shift left 2) are placed in the ID stage. An IF.Flush control line goes to the IF/ID register. The hazard detection unit and forwarding unit are present.]
• Moves the comparison to the ID stage to minimize worst-case pipeline flushing
• Interacts with the data hazard hardware to ensure the right register values are compared
• Flushes the pipeline by zeroing the control values and the instruction register
Delayed Branching Example (Clock 3)
Stage   Instruction
IF      and $12, $2, $5
ID      beq $1, $3, 7
EX      sub $10, $4, $8
MEM     before<1>
WB      before<2>
[Figure: datapath snapshot. In the ID stage the offset 7 is sign-extended and shifted left by 2 to give 28, which is added to 44 to give the branch target 72. Registers $1 and $3 are compared, and IF.Flush is available to clear IF/ID.]
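The numbers in this example (7, 28, 44, 72) follow from the MIPS branch target calculation, sketched below. The function names are hypothetical, and the register values passed to `resolve_beq_in_id` are made up for illustration, since the slide does not give the contents of $1 and $3.

```python
def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_target(branch_pc, imm16):
    """Target = (PC + 4) + (sign-extended offset shifted left by 2)."""
    return branch_pc + 4 + (sign_extend16(imm16) << 2)


def resolve_beq_in_id(branch_pc, rs_value, rt_value, imm16):
    """Return (next PC, IF.Flush) when beq is resolved in the ID stage."""
    if rs_value == rt_value:
        return branch_target(branch_pc, imm16), True    # taken: flush IF/ID
    return branch_pc + 8, False   # not taken: fetch continues after PC + 4


print(branch_target(40, 7))               # beq at 40 with offset 7
print(resolve_beq_in_id(40, 5, 5, 7))     # equal registers: branch taken
print(resolve_beq_in_id(40, 5, 6, 7))     # unequal registers: fall through
```

With the branch at address 40, the target is 44 + 28 = 72, which is the address of the `lw` instruction in the control hazard example.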
Delayed Branching Example (Clock 4)
Stage   Instruction
IF      lw $4, 50($7)
ID      bubble (nop)
EX      beq $1, $3, 7
MEM     sub $10, . . .
WB      before<1>
[Figure: datapath snapshot. The instruction fetched after the branch has been turned into a bubble (nop), and the lw at the branch target address 72 is being fetched; the PC advances to 76.]
Compiler Optimization of Delays
Three ways the compiler can fill the branch delay slot:
a. From before
   add $s1, $s2, $s3
   if $s2 = 0 then
      [delay slot]
   becomes
   if $s2 = 0 then
      add $s1, $s2, $s3
b. From target
   sub $t4, $t5, $t6
   …
   add $s1, $s2, $s3
   if $s1 = 0 then
      [delay slot]
   becomes
   add $s1, $s2, $s3
   if $s1 = 0 then
      sub $t4, $t5, $t6
c. From fall through
   add $s1, $s2, $s3
   if $s1 = 0 then
      [delay slot]
   sub $t4, $t5, $t6
   becomes
   add $s1, $s2, $s3
   if $s1 = 0 then
      sub $t4, $t5, $t6
Dynamic Branch Prediction
• Uses a history buffer to track branch instructions by their addresses
• The simplest approach uses one bit per branch instruction to record the last outcome
• Branching differently from the prediction flips the history bit
• A one-bit buffer still mispredicts at least twice for a loop (at the loop exit, and again on the first iteration when the loop is re-entered)
• A two-bit buffer better captures the history of the branch instruction
[Figure: two-bit prediction state machine with two "Predict taken" states and two "Predict not taken" states, connected by "Taken" and "Not taken" transitions.]
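The difference between the two schemes shows up on a simple loop. The sketch below counts mispredictions for a single branch; the function names and the outcome sequence are illustrative assumptions. The two-bit version is a saturating counter, which is one common two-bit scheme and may differ in detail from the state machine drawn on the slide.

```python
def mispredictions_1bit(outcomes, last_taken=True):
    """One history bit: predict whatever the branch did last time."""
    misses = 0
    for taken in outcomes:
        if last_taken != taken:
            misses += 1
            last_taken = taken      # a wrong prediction flips the bit
    return misses


def mispredictions_2bit(outcomes, counter=3):
    """Two-bit saturating counter: values 2 and 3 predict taken."""
    misses = 0
    for taken in outcomes:
        if (counter >= 2) != taken:
            misses += 1
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return misses


# A loop branch taken 9 times and then not taken, with the loop run twice.
loop_twice = ([True] * 9 + [False]) * 2
print(mispredictions_1bit(loop_twice))
print(mispredictions_2bit(loop_twice))
```

With both predictors starting in a "taken" state, the one-bit scheme misses three times on this sequence (both exits plus the re-entry), while the two-bit counter misses only on the two exits.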
Exception/Interrupts
• 5 instructions executing in a 5-stage pipeline
• How to stop the pipeline? Restart? Who caused the interrupt?
Stage   Problem interrupts occurring
IF      Page fault on instruction fetch; memory-protection violation
ID      Undefined or illegal op-code
EX      Arithmetic exception
MEM     Page fault on data fetch; memory-protection violation; memory error
• Load with data page fault, Add with instruction page fault?
• Solution: interrupt vector/instruction, interrupt ASAP, restart everything incomplete
• Faults (within instr., can be restarted): force a trap instruction into IF; disable writes until the trap hits WB; must save multiple PCs or PC + state
• External interrupts: allow the pipeline to drain; load the PC with the interrupt address
Supporting Exception Handling
[Figure: pipelined datapath extended for exceptions. A Cause register and an Except PC register are added, the constant 40000040 is an extra input to the PC mux, and ID.Flush and EX.Flush lines join IF.Flush to zero the control fields of instructions in the ID and EX stages.]
Example for Exception Handling (Clock 5)
Stage   Instruction
IF      lw $16, 50($7)
ID      slt $15, $6, $7
EX      add $1, $2, $1
MEM     or $13, . . .
WB      and $12, . . .
[Figure: datapath snapshot with the Cause and Except PC registers, the IF.Flush, ID.Flush, and EX.Flush lines, and the handler address 40000040 at the PC mux.]
Add instruction causes overflow
Example for Exception Handling (Clock 6)
Stage   Instruction
IF      sw $25, 1000($0)
ID      bubble (nop)
EX      bubble
MEM     bubble
WB      or $13, . . .
[Figure: datapath snapshot. The instruction at 40000040 is being fetched and the PC advances to 40000044.]
• The pipeline is drained after the address of the instruction is saved
• The first instruction of the handler is fetched
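The sequence in this example can be summarized as a small state update. This is a rough sketch only: the dictionary layout, the function name `take_exception`, the cause code, and the address used for the faulting instruction are all assumptions for illustration. The handler address 40000040 (hex), the Cause and Except PC registers, and the three flush signals come from the slides.

```python
HANDLER_ADDRESS = 0x40000040   # handler entry point shown on the slides


def take_exception(state, faulting_pc, cause):
    """Record the exception, flush younger instructions, redirect the PC."""
    state["ExceptPC"] = faulting_pc   # saved so execution can be restarted
    state["Cause"] = cause            # reason for the exception
    # Turn the instructions in IF, ID, and EX into bubbles.
    state["IF.Flush"] = state["ID.Flush"] = state["EX.Flush"] = 1
    state["PC"] = HANDLER_ADDRESS     # next fetch is the handler's first instruction
    return state


# Hypothetical values: an add at address 0x4C overflows, cause code 1.
state = take_exception({}, faulting_pc=0x4C, cause=1)
print(hex(state["PC"]))        # address fetched in the next cycle
print(hex(state["PC"] + 4))    # PC value after that fetch
```

The second printed value, 0x40000044, corresponds to the PC shown in the clock 6 snapshot after the first handler instruction has been fetched.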
Conclusion
Summary
• Pipeline hazard detection
  - Data hazard
  - Control hazard
• Handling hazards in the pipeline design
  - Data forwarding and branch prediction
  - Coordination between control and data hazards
• Supporting exceptions
  - Pipeline flushing
  - Invoking exception handling routines
Next Lecture
• The basics of cache memory
• Measuring and improving cache performance
Reading assignment includes sections 6.4 to 6.8 in the textbook