UC Regents Fall 2005 © UCB    CS 152 L9: Pipelining III
2005-9-27 John Lazzaro
(www.cs.berkeley.edu/~lazzaro)
CS 152 Computer Architecture and Engineering
Lecture 9 – Pipelining III
www-inst.eecs.berkeley.edu/~cs152/
Congrats on Lab 2!
TAs: David Marquardt and Udam Saini
Last time: A Hazard Taxonomy
Structural Hazards
Data Hazards (RAW, WAR, WAW)
Control Hazards (taken branches and jumps)
On each clock cycle, we must detect the presence of all of these hazards, and resolve them before they break the “contract with the programmer”.
Last Time: Hazard Resolution Toolkit
Stall earlier instructions in pipeline.
Kill earlier instructions in pipeline.
Forward results computed in later pipeline stages to earlier stages.
Add new hardware or rearrange hardware design to eliminate hazard.
Make hardware handle concurrent requests to eliminate hazard.
Change ISA to eliminate hazard.
Today: Putting it All Together
Specifications for Lab 3
Preferred hazard resolution tools.
At-risk hazards for Lab 3
Tips for control design
Lab 3: ISA Specifications
No load “delay slot”
Single branch “delay slot”
Also: RESET signal, BREAK release signal, etc ...
CS 152 L1: The MIPS ISA UC Regents Fall 2005 © UCB
The level of detail needed for a pipelined design can only be found in this document.
Remember: Online MIPS documentation
42 MIPS32™ Architecture For Programmers Volume II, Revision 2.00
Copyright © 2001-2003 MIPS Technologies Inc. All rights reserved.
AND
Format: AND rd, rs, rt MIPS32
Purpose:
To do a bitwise logical AND
Description: rd ← rs AND rt
The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical AND operation. The result is
placed into GPR rd.
Restrictions:
None
Operation:
GPR[rd] ← GPR[rs] and GPR[rt]
Exceptions:
None
Encoding:
Bits:   31-26    25-21  20-16  15-11  10-6   5-0
Field:  SPECIAL  rs     rt     rd     0      AND
Value:  000000   rs     rt     rd     00000  100100
Width:  6        5      5      5      5      6
Hazard Diagnosis
Data Hazards: Read After Write
Read After Write (RAW) hazards. Instruction I2 expects to read a data value written by an earlier instruction, but I2 executes “too early” and reads the wrong copy of the data.
Lab 3 solution: use forwarding heavily, fall back on stalling when forwarding won’t work or slows down the critical path too much.
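Full bypass network ...
[Figure: datapath for the ID (Decode), EX, MEM and WB stages, showing the RegFile (rs1, rs2, ws, wd, WE, rd1, rd2), Ext, the 32-bit ALU, Data Memory (Addr, Din, Dout, WE), MemToReg, and the pipeline registers IR, A, B, M, Y, R. "Mux, Logic" blocks in ID sit between the register file and the A/B registers, with bypass inputs that include a "From WB" path.]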
Common bug: Multiple forwards ...
[Figure: datapath for the ID (Decode), EX, MEM and WB stages, showing the RegFile, Ext, the 32-bit ALU, Data Memory, MemToReg, and the pipeline registers IR, A, B, M, Y, R, with "Mux, Logic" bypass blocks in ID and a "From WB" path.]
Instructions shown on the datapath: ADD R4,R3,R2 | OR R2,R3,R1 | AND R2,R2,R1
Which do we forward from?
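One way to think about the question above: when more than one in-flight instruction writes the register being read, the bypass logic should pick the youngest writer, since that is the value the program expects to see. The sketch below is a minimal Python model of that priority rule, not the Lab 3 design. The function name, the stage-to-register dictionary, and the placement of OR in EX and AND in MEM are assumptions made for illustration.

```python
from typing import Optional

# Stages that can supply a forwarded value, youngest producer first.
# Assumption: EX holds the most recently issued instruction, then MEM, then WB.
FORWARD_PRIORITY = ("EX", "MEM", "WB")


def select_forward_source(src_reg: int, in_flight: dict) -> Optional[str]:
    """Return the stage to forward src_reg from, or None to use the regfile.

    in_flight maps a stage name to the destination register number of the
    instruction in that stage (None if that instruction writes no register).
    """
    if src_reg == 0:
        return None  # MIPS R0 always reads as zero, so it is never forwarded
    for stage in FORWARD_PRIORITY:
        if in_flight.get(stage) == src_reg:
            return stage  # first match is the youngest writer
    return None


# ADD R4,R3,R2 in ID; OR R2,R3,R1 assumed in EX; AND R2,R2,R1 assumed in MEM.
in_flight = {"EX": 2, "MEM": 2, "WB": None}
print(select_forward_source(2, in_flight))  # EX
print(select_forward_source(3, in_flight))  # None
```

A fixed priority order keeps the mux select logic simple; the common bug is a design where two matches both assert a select line, or where the older match wins.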
Data Hazards: WAR and WAW ...
Write After Read (WAR) hazards. Instruction I2 expects to write over a data value after an earlier instruction I1 reads it. But instead, I2 writes too early, and I1 sees the new value.
Write After Write (WAW) hazards. Instruction I2 writes over data an earlier instruction I1 also writes. But instead, I1 writes after I2, and the final data value is incorrect.
WAR and WAW not possible in our 5-stage pipeline. However, TA test code checks for these, and every semester a few WAR/WAWs are found. Why?
LW and Hazards
Questions about LW and forwarding
[Figure: datapath for the ID (Decode), EX, MEM and WB stages, showing the RegFile, Ext, the 32-bit ALU, Data Memory, MemToReg, and the pipeline registers IR, A, B, M, Y, R, with "Mux, Logic" bypass blocks in ID and a "From WB" path.]
Instructions shown on the datapath: ADDIU R1,R1,24 | LW R1,128(R29) | OR R3,R3,R2
Will this work as shown?
Questions about LW and forwarding
[Figure: datapath for the ID (Decode), EX, MEM and WB stages, showing the RegFile, Ext, the 32-bit ALU, Data Memory, MemToReg, and the pipeline registers IR, A, B, M, Y, R, with "Mux, Logic" bypass blocks in ID and a "From WB" path.]
Instructions shown on the datapath: ADDIU R1,R1,24 | LW R1,128(R29) | OR R1,R3,R1
Will this work as shown?
Resolving a RAW hazard by stalling
[Figure: first three pipeline stages (Stage #1 Instr Fetch, Stage #2 Decode & Reg Fetch, Stage #3), showing the PC with its +0x4 adder, Instr Mem, RegFile, Ext, and the IR, A, B, M pipeline registers.]
Sample program
ADD R4,R3,R2
OR R5,R4,R2
Let ADD proceed to WB stage, so that R4 is written to regfile.
Keep executing OR instruction until R4 is ready. Until then, send NOPs to IR 2/3.
Freeze PC and IR until stall is over.
New datapath hardware
(1) Mux into IR 2/3 to feed in NOP.
(2) Write enable on PC and IR 1/2
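The stall mechanism above can be modeled in a few lines. The following Python sketch is a behavioral model under stated assumptions, not the Lab 3 hardware: the `Instr` tuple, `must_stall`, `step`, and `count_stalls` are hypothetical names, there is no forwarding, and the model conservatively keeps stalling while the producer is still in WB (it assumes a register read in the same cycle as the write does not see the new value).

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    name: str
    dest: Optional[int]          # register written (None for a NOP)
    srcs: Tuple[int, ...] = ()   # registers read


NOP = Instr("NOP", None)


def must_stall(decoding: Instr, downstream: List[Instr]) -> bool:
    """True if the instruction in decode reads a register that an
    instruction further down the pipeline has not finished writing."""
    pending = {i.dest for i in downstream if i.dest not in (None, 0)}
    return any(src in pending for src in decoding.srcs)


def step(pc: int, ir12: Instr, downstream: List[Instr],
         fetch: Callable[[int], Instr]):
    """Advance one clock. downstream[0] models IR 2/3; the last entry is
    the instruction in WB, which leaves the pipeline at the clock edge."""
    if must_stall(ir12, downstream):
        # Write enables on PC and IR 1/2 are off; the mux feeds a NOP to IR 2/3.
        return pc, ir12, [NOP] + downstream[:-1]
    return pc + 4, fetch(pc), [ir12] + downstream[:-1]


def count_stalls(program: dict, cycles: int) -> int:
    """Run the model for a fixed number of clocks and count stall cycles."""
    fetch = lambda addr: program.get(addr, NOP)
    pc, ir12, downstream = 0, NOP, [NOP, NOP, NOP]
    stalls = 0
    for _ in range(cycles):
        stalls += must_stall(ir12, downstream)
        pc, ir12, downstream = step(pc, ir12, downstream, fetch)
    return stalls


program = {
    0: Instr("ADD R4,R3,R2", dest=4, srcs=(3, 2)),
    4: Instr("OR R5,R4,R2", dest=5, srcs=(4, 2)),
}
print(count_stalls(program, cycles=8))  # 3
```

With the sample program, OR waits in decode for three clocks while ADD moves through the three downstream registers. A register file that writes in the first half of the cycle and reads in the second would allow one fewer stall cycle.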
Branches and Hazards
Recall: Control hazard and hardware
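[Figure: first three pipeline stages (Stage #1 Instr Fetch, Stage #2 Decode & Reg Fetch, Stage #3), showing the PC with its +0x4 adder, Instr Mem, RegFile, Ext, and the IR, A, B, M pipeline registers. An "==" comparator in the decode stage sends its result to the branch control logic.]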
Recall: After more hardware, change ISA
[Figure: five-stage pipeline, IF (Fetch), ID (Decode), EX (ALU), MEM, WB, showing the PC with its +0x4 adder, Instr Mem, and the IR pipeline registers.]
Sample Program (ISA w/o branch delay slot)
I1: BEQ R4,R3,25
I2: SUB R1,R9,R8
I3: AND R6,R5,R4
Time:  t1  t2  t3  t4  t5  t6  t7  t8
I1:    IF  ID  EX  MEM WB
I2:        IF
ID stage computes if branch is taken.
If branch is taken, this instruction (I2) MUST NOT complete!
If we change the ISA, we can always let I2 complete (“branch delay slot”) and eliminate the control hazard.
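To make the ISA change concrete, the sketch below compares which instructions complete with and without a branch delay slot. It is a toy Python model, not a MIPS simulator: the `trace` function and the `(label, target)` program encoding are hypothetical, the branch is assumed to be taken, and the branch target is a made-up index rather than the offset 25 from the sample program.

```python
def trace(program, delay_slot: bool, max_steps: int = 20):
    """Return the labels of the instructions that complete, in order.

    program is a list of (label, target) pairs. A non-None target means
    "taken branch to that list index"; None means fall through.
    """
    done, pc, pending_target = [], 0, None
    while pc < len(program) and len(done) < max_steps:
        label, target = program[pc]
        done.append(label)
        if pending_target is not None:
            # The delay-slot instruction has completed; now redirect the PC.
            # (A branch sitting in the delay slot is ignored in this model.)
            pc, pending_target = pending_target, None
        elif target is not None and delay_slot:
            pending_target, pc = target, pc + 1
        elif target is not None:
            pc = target
        else:
            pc += 1
    return done


program = [
    ("BEQ", 3),      # I1: taken branch to index 3 (hypothetical target)
    ("SUB", None),   # I2: the instruction after the branch
    ("AND", None),   # I3
    ("TARGET", None),
]
print(trace(program, delay_slot=False))  # ['BEQ', 'TARGET']
print(trace(program, delay_slot=True))   # ['BEQ', 'SUB', 'TARGET']
```

Without the delay slot, hardware has to kill I2 after fetching it. With the delay slot, I2 is defined to complete, so the instruction already in IF is never wasted and no kill logic is needed for this case.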
Questions about branch and forwards
[Figure: datapath for the ID (Decode), EX, MEM and WB stages, showing the RegFile, Ext, the 32-bit ALU, Data Memory, MemToReg, and the pipeline registers IR, A, B, M, Y, R, with "Mux, Logic" bypass blocks in ID. An "==" comparator in ID sends its result to the branch control logic.]
Instructions shown on the datapath: BEQ R1,R3,label | OR R3,R3,R1
Will this work as shown?
Lessons learned
Pipelining is hard
Write test code in advance
Study every instruction
Think about interactions ...
Control Implementation
Recall: What is single cycle control?
[Figure: single-cycle datapath (Instr Mem, RegFile, Ext, 32-bit ALU, Data Memory) annotated with its control signals: RegDest, RegWr, ExtOp, ALUsrc, ALUctr, MemWr, MemToReg. A control block takes the fetched instruction and Equal as inputs and drives RegDest, RegWr, ExtOp, ALUsrc, MemWr, MemToReg and PCSrc.]
Combinational Logic (Only Gates, No Flip Flops)
Just specify logic functions!
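Because single-cycle control has no state, it can be written as a pure function from instruction bits to control signals. The Python sketch below illustrates that idea for a handful of instructions. The signal names come from the slide, and the opcode values are the MIPS32 encodings, but the polarity chosen for each signal (for example RegDest = 1 meaning "write rd") is an assumption, and ALUctr is left out to keep the example short.

```python
# Symbolic constants for MIPS32 opcodes (bits 31..26 of the instruction).
OP_RTYPE = 0x00   # SPECIAL: ADD, AND, OR, ...
OP_BEQ = 0x04
OP_ADDIU = 0x09
OP_ORI = 0x0D
OP_LW = 0x23
OP_SW = 0x2B


def control(opcode: int, equal: bool) -> dict:
    """Combinational control: the outputs depend only on the current inputs."""
    return {
        # Assumed polarity: True selects rd as the destination, False selects rt.
        "RegDest": opcode == OP_RTYPE,
        "RegWr": opcode in (OP_RTYPE, OP_ADDIU, OP_ORI, OP_LW),
        # Assumed polarity: True sign-extends the immediate, False zero-extends.
        "ExtOp": opcode in (OP_ADDIU, OP_LW, OP_SW),
        # True feeds the extended immediate to the ALU instead of rd2.
        "ALUsrc": opcode in (OP_ADDIU, OP_ORI, OP_LW, OP_SW),
        "MemWr": opcode == OP_SW,
        "MemToReg": opcode == OP_LW,
        # Take the branch only when the comparator reports equality.
        "PCSrc": opcode == OP_BEQ and equal,
    }


print(control(OP_LW, equal=False)["MemToReg"])  # True
print(control(OP_BEQ, equal=True)["PCSrc"])     # True
```

Writing the function with named opcodes and named signals makes a wrong "0" or "1" easier to spot in review than a raw truth table would.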
In pipelines, all IR registers are used
[Figure: the IR registers of the ID (Decode), EX, MEM and WB stages, together with Equal, feed a control block that drives RegDest, RegWr, ExtOp, MemToReg, PCSrc and the other control signals.]
Combinational Logic (Only Gates, No Flip Flops)
(add extra state outside!)
A “conceptual” design -- for shortest critical path, IR registers may hold decoded info, not the complete 32-bit instruction.
Two goals when specifying control logic
Bug-free: One “0” that should be a “1” in the control logic function breaks contract with the programmer.
Efficient: Logic function specification should map to hardware with good performance properties: fast, small, low power, etc.
Should be easy for humans to read and understand: sensible signal names, symbolic constants ...
Midterm week begins on Thursday ...
HW graded on effort.
Midterm (6-9, 310 Soda), no class that day.
Thursday review session. Will cover format, material, and ground rules for test.
And concurrently, Lab 3 deadlines ...
Lab 2 team evals
Lab 3 design document
Weekend: start design work
Admin: Team Evaluations due Thursday
Admin: Design Document Deadlines
Lab 3 deadlines after the mid-term ...
Lab 3 design doc + Xilinx checkoff later in week ...
Are pipelined CPUs a “solved problem”?
Embedded CPU -- runs one program its entire life.
Embedded CPU -- “embedded into products”. Unlike desktop/laptop/server, where customer knows he is buying a “computer”.
Theme: don’t do hardware in a vacuum
Optimize the architecture, the compilers for code that runs on the hardware, and the CAD tools that create the hardware, all together.
Example: Customize “ALU” for the app
Designing a new CPU just for one product is hard. Why?
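$$\frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$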
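The performance equation shows why customizing the ALU is a trade-off rather than a free win: a new operation can lower the instruction count while lengthening the clock period. The Python sketch below works through one such case. All of the numbers are hypothetical and chosen only to illustrate the arithmetic; they do not describe any real product.

```python
def execution_time(instructions: float, cpi: float, seconds_per_cycle: float) -> float:
    """Seconds/Program = Instructions/Program * Cycles/Instruction * Seconds/Cycle."""
    return instructions * cpi * seconds_per_cycle


# Hypothetical baseline: 1e9 instructions, CPI of 1.5, 10 ns clock period.
baseline = execution_time(1e9, 1.5, 10e-9)

# Hypothetical custom ALU op: 20% fewer instructions, 10% longer clock period.
custom = execution_time(0.8e9, 1.5, 11e-9)

print(f"baseline {baseline:.1f} s, custom {custom:.1f} s")
```

In this made-up case the program time drops from about 15 seconds to about 13.2 seconds. Had the clock period grown by 25% or more, the custom operation would have been no gain at all, since 0.8 × 1.25 = 1.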
Example: Specialize memory for the app
Teaching compilers and operating systems about unusual memory systems is hard.
Example: Energy & operating systems
Teaching compilers and operating systems about E = 1/2 C V^2 is hard.
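The quadratic term is what makes supply voltage so attractive to software that manages energy. The short Python sketch below evaluates E = 1/2 C V^2 at two supply voltages. The function names, the capacitance and both voltages are hypothetical values picked for the arithmetic.

```python
def switching_energy(capacitance_f: float, voltage_v: float) -> float:
    """E = 1/2 * C * V^2, in joules."""
    return 0.5 * capacitance_f * voltage_v ** 2


def energy_ratio(v_new: float, v_old: float) -> float:
    """Energy at v_new relative to v_old for the same capacitance."""
    return (v_new / v_old) ** 2


# Hypothetical node: 1 pF of switched capacitance, supply lowered from 1.2 V to 0.9 V.
e_old = switching_energy(1e-12, 1.2)
e_new = switching_energy(1e-12, 0.9)
print(f"{e_new / e_old:.4f}")  # 0.5625
```

A 25% reduction in voltage cuts switching energy by roughly 44%, which is why an operating system that can pick the voltage for a workload has a large lever to pull.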