.1 1999 ©UCB
CS 161 Computer Architecture
Chapter 5, Lecture 11
Instructor: L.N. Bhuyan (www.cs.ucr.edu/~bhuyan)
Adapted from notes by Dave Patterson (http.cs.berkeley.edu/~patterson)
.2 1999 ©UCB
Implementing Main Control
[Figure: Main Control block. Input: op. Outputs: RegDst, Branch, MemRead, MemtoReg, ALUOp (2 bits), MemWrite, ALUSrc, RegWrite.]
Main Control has one 6-bit input (op) and 9 output bits (7 one-bit signals, plus the 2-bit ALUOp)
To build Main Control as a sum of products:
(1) Construct a minterm for each instruction (the R-type instructions share one); each minterm is true for exactly one opcode, e.g., M_R-format, M_lw
(2) Form each main control output as the logical OR of the minterms of the instructions that assert it, e.g., RegWrite = M_R-format OR M_lw
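The two steps above can be sketched in Python as a rough illustration. The opcode values are the standard MIPS encodings, which the slide does not list; the output equations follow the control-line settings of Fig. 5.18, with every don't-care (X) taken as 0. The helper names are hypothetical.

```python
# Standard MIPS opcodes (an assumption here: they are not given on the slide).
OPCODES = {"R-format": 0b000000, "lw": 0b100011, "sw": 0b101011, "beq": 0b000100}


def op_bits(op):
    """Split a 6-bit opcode into its bits; index i holds Op_i."""
    return [(op >> i) & 1 for i in range(6)]


def minterm(op, pattern):
    """Step (1): AND of all six opcode bits, each true or complemented per `pattern`."""
    return int(all(b == w for b, w in zip(op_bits(op), op_bits(pattern))))


def main_control(op):
    """Step (2): each output is the OR of the minterms that assert it."""
    m = {name: minterm(op, code) for name, code in OPCODES.items()}
    return {
        "RegDst": m["R-format"],
        "ALUSrc": m["lw"] | m["sw"],
        "MemtoReg": m["lw"],
        "RegWrite": m["R-format"] | m["lw"],
        "MemRead": m["lw"],
        "MemWrite": m["sw"],
        "Branch": m["beq"],
        "ALUOp1": m["R-format"],
        "ALUOp0": m["beq"],
    }


if __name__ == "__main__":
    for name, code in OPCODES.items():
        print(name, main_control(code))
```

Because exactly one minterm is 1 for any supported opcode, each output reduces to a plain OR with no further gating.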
.3 1999 ©UCB
Single-Cycle MIPS-lite CPU
[Figure: single-cycle MIPS-lite datapath. Components: PC, Imem, Regs, Sign Extend, ALU (with Zero output), ALU Control, Dmem, Main Control, an adder for PC + 4, an adder for the branch target fed by a shift-left-2, and four muxes. Instruction fields: op = [31:26], 25:21, 20:16, 15:11, 15:0, funct 5:0. Control signals: RegDst, Branch, MemRead, MemtoReg, ALUOp (2 bits), MemWrite, ALUSrc, RegWrite, PCSrc.]
.4 1999 ©UCB
Fig. 5.17 Datapath with Control Signals
.5 1999 ©UCB
Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-format     1       0       0         1         0        0         0       1       0
lw           0       1       1         1         1        0         0       0       0
sw           X       1       X         0         0        1         0       0       0
beq          X       0       X         0         0        0         1       0       1
Fig. 5.18 Setting of the control lines depends on the opcode
.6 1999 ©UCB
Control Design
° Simple combinational logic (truth tables)
[Figure: ALU control block. Inputs: ALUOp (ALUOp1, ALUOp0) and the funct field F(5-0), with F3-F0 shown. Outputs: Operation (Operation2, Operation1, Operation0).]
[Figure: main control logic array. Inputs: Op5-Op0. Product terms: R-format, lw, sw, beq. Outputs: RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0.]
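The ALU control block is the same kind of truth table, mapping ALUOp and the funct field to a 3-bit Operation. A minimal Python sketch follows. The specific encodings come from the Patterson and Hennessy textbook that these notes follow and are not spelled out on the slide, so treat them as an assumption; `alu_control` is a hypothetical name.

```python
# 3-bit ALU operation codes (textbook encodings, assumed rather than taken from the slide).
ALU_OPERATION = {"and": 0b000, "or": 0b001, "add": 0b010, "subtract": 0b110, "slt": 0b111}

# funct field values of the R-type instructions in the MIPS-lite subset.
FUNCT = {
    0b100000: "add",
    0b100010: "subtract",
    0b100100: "and",
    0b100101: "or",
    0b101010: "slt",
}


def alu_control(alu_op, funct=0):
    """Return the 3-bit Operation for a 2-bit ALUOp and a 6-bit funct field."""
    if alu_op == 0b00:  # lw / sw: add to form the address
        return ALU_OPERATION["add"]
    if alu_op == 0b01:  # beq: subtract and test Zero
        return ALU_OPERATION["subtract"]
    if alu_op == 0b10:  # R-format: the funct field decides
        return ALU_OPERATION[FUNCT[funct]]
    raise ValueError("unsupported ALUOp")


if __name__ == "__main__":
    print(format(alu_control(0b10, 0b101010), "03b"))
```

Only R-format instructions consult the funct field; for loads, stores and branches ALUOp alone fixes the operation.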
.7 1999 ©UCB
Fig. 5.19 R-type operation, add $t1, $t2, $t3. Active parts are highlighted.
.8 1999 ©UCB
Fig. 5.20 Active parts for a Load instruction
.9 1999 ©UCB
Fig. 5.21 Active parts for a beq instruction
.10 1999 ©UCB
Fig. 5.24 Extension for Jump instruction
.11 1999 ©UCB
Single-Cycle Machine: Appraisal
° All instructions complete in one clock cycle (CPI = 1)
° Some instructions take more steps than others
• lw is most expensive (5 steps, vs. 4 for R-type and sw, 3 for beq)
° Clock cycle must cover the longest instruction, which is inefficient
• suppose mul is added?
• its 32 shift/add steps would delay every other instruction
.12 1999 ©UCB
Example
° Assume 2 ns for instruction/data memory, 1 ns for decode/register read, 2 ns for the ALU and 1 ns for register write.
° Single-cycle datapath clock period = 8 ns (set by lw: 2 + 1 + 2 + 2 + 1).
° Assume an instruction mix of 24% loads, 12% stores, 44% R-format, 18% branches, and 2% jumps.
° Assuming a variable-cycle datapath, average clock period = 6.3 ns.
° Possible speed-up = 8 / 6.3 = 1.27
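The arithmetic behind these figures can be checked with a short Python sketch. The per-class step lists are an assumption derived from the unit delays above (lw uses every unit, sw skips the register write, R-format skips data memory, beq stops after the ALU); the 2 ns jump, counting instruction fetch only, is likewise an assumption and not stated on the slide.

```python
# Unit delays in ns, as given in the example.
MEM, REG_READ, ALU, REG_WRITE = 2, 1, 2, 1

# Units used by each instruction class (assumed breakdown, see lead-in).
LATENCY_NS = {
    "lw": MEM + REG_READ + ALU + MEM + REG_WRITE,   # 8
    "sw": MEM + REG_READ + ALU + MEM,               # 7
    "R-format": MEM + REG_READ + ALU + REG_WRITE,   # 6
    "beq": MEM + REG_READ + ALU,                    # 5
    "jump": MEM,                                    # 2, assumption: fetch only
}

# Instruction mix from the example.
MIX = {"lw": 0.24, "sw": 0.12, "R-format": 0.44, "beq": 0.18, "jump": 0.02}

single_cycle_period = max(LATENCY_NS.values())
average_period = sum(MIX[k] * LATENCY_NS[k] for k in MIX)
speedup = single_cycle_period / average_period

if __name__ == "__main__":
    print(f"single-cycle period: {single_cycle_period} ns")
    print(f"average variable period: {average_period:.2f} ns")
    print(f"speed-up: {speedup:.2f}")
```

Unrounded, the average is 6.34 ns and the ratio 8 / 6.34 is about 1.26; the 1.27 quoted above comes from dividing by the rounded 6.3 ns.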
.13 1999 ©UCB
Multicycle Implementation (MIPS-lite v.2)
° Want more efficient implementation
° Each step will take one clock cycle (not each instruction) [CPI > 1]
• shorter clock cycle: cycle time constrained by longest step, not longest instruction
° Simpler instructions take fewer cycles
• higher overall performance
° Complex control: finite state machine
° Versatile (can extend for new instructions: add3, swap, etc.)
.14 1999 ©UCB
Recap: Clocking: single-cycle vs. multicycle
[Figure: clock timing of add $t0,$t1,$t2 followed by beq $t0,$t1,L under the single-cycle implementation and under the multicycle implementation, with the wasted time marked.]
• Multicycle Implementation: less waste = higher performance
.15 1999 ©UCB
Recap: How fast can we run the clock?
° Depends on how much we want done per clock cycle
• Can do: several “inexpensive” datapath operations per clock
- simple gates (AND, OR, …)
- single datapath registers (PC)
- sign extender, left shifter, multiplexor
• PLUS: exactly one “expensive” datapath operation per clock
- ALU operation
- Register File access (2 reads, or 1 write)
- Memory access (read or write)
.16 1999 ©UCB
Multicycle Datapath (overview)
[Figure: MIPS-lite multicycle datapath. Components: PC, Memory (instruction or data), Instruction Register, Memory Data Register, Registers, A, B, ALU, ALUOut.]
• One ALU (no extra adders)
• One Memory (no separate Imem, Dmem)
• New Temporary Registers (“clocked”/require clock input)
.17 1999 ©UCB
Multicycle Implementation
° Datapath changes
• one memory: both instructions and data (because they can be accessed on separate steps)
• one ALU (eliminate extra adders)
• extra “invisible” registers to capture intermediate (per-step) datapath results
° Controller changes
• controller must fire control lines in the correct sequence and at the correct time
• controller must remember the current execution step and advance to the next step
.18 1999 ©UCB
Multicycle Datapath: Add Multiplexors
[Figure: multicycle datapath with multiplexors added. Components: PC, Mem, IR, MDR, Regs, Sgn Extend, <<2, A, B, ALU (with zero output), ALUOut. Instruction fields: 25:21, 20:16, 15:11, 15:0. One mux has four inputs (0-3); the constant 4 is among the mux inputs.]
Note inputs to multiplexors
.19 1999 ©UCB
Datapath + Control Points
[Figure: multicycle datapath with control points. Components: PC, Mem, IR, MDR, Regs, Sgn Extend, <<2, A, B, ALU (with zero output), ALUOut, ALU Control (input: funct 5:0). Control points: IorD, MemRead, MemWrite, IRWrite, RegDst, RegWrite, ALUSrcA, ALUSrcB, MemtoReg, ALUOp, PCSrc, PCWrite, PCWriteCond.]
.20 1999 ©UCB
Multicycle Instruction Execution
° All instructions execute in 3-5 cycles
• 3 cycles: beq
• 4 cycles: R-type, sw
• 5 cycles: lw
° 1: fetch instruction, PC = PC + 4
° 2: decode, fetch registers, compute branch target
° 3: execute / compute address / complete branch
° 4: access memory / complete R-type
° 5: (lw) write loaded data to register
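With per-class cycle counts, the average CPI follows from an instruction mix. The Python sketch below reuses the mix from the earlier timing example (24% loads, 12% stores, 44% R-format, 18% branches, 2% jumps), which is an assumption about how the two slides combine. The 3-cycle jump is also an assumption, since jump is not listed above.

```python
# Cycles per instruction class on the multicycle datapath.
CYCLES = {
    "lw": 5,
    "sw": 4,
    "R-type": 4,
    "beq": 3,
    "jump": 3,  # assumption: not listed on the slide
}

# Instruction mix borrowed from the single-cycle timing example (assumption).
MIX = {"lw": 0.24, "sw": 0.12, "R-type": 0.44, "beq": 0.18, "jump": 0.02}


def average_cpi(cycles, mix):
    """Weighted average of cycle counts; the mix fractions must sum to 1."""
    assert abs(sum(mix.values()) - 1.0) < 1e-9
    return sum(mix[k] * cycles[k] for k in mix)


if __name__ == "__main__":
    print(f"average CPI: {average_cpi(CYCLES, MIX):.2f}")
```

Under these assumptions the CPI comes to about 4.04, so the multicycle design pays off only if its clock period is correspondingly shorter than the single-cycle period.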
.21 1999 ©UCB
Cycle 1 Datapath: IR=Mem[PC]; PC=PC+4
[Figure: multicycle datapath]
IR = Mem[PC]; PC = PC + 4
.22 1999 ©UCB
Cycle 2: A=Reg[IR25:21]; ALUOut= PC + sgn-ext(IR15:0) << 2
[Figure: multicycle datapath]
A = Reg[IR25:21]; B = Reg[IR20:16]; ALUOut = PC + (sgn-ext(IR15:0) << 2)
.23 1999 ©UCB
Cycle 3: R-format: ALUOut = A op B
[Figure: multicycle datapath]
ALUOut = A op B
.24 1999 ©UCB
Cycle 4 R-format: Reg[IR15:11] = ALUOut
[Figure: multicycle datapath]
Reg[IR15:11] = ALUOut
• How many times is the ALU used?
.25 1999 ©UCB
Cycle 3 beq: if (A == B) PC = ALUOut
[Figure: multicycle datapath]
if (A == B) PC = ALUOut
.26 1999 ©UCB
Cycle 3 lw: ALUOut = A + sgn-ext(IR15:0)
[Figure: multicycle datapath with control points. Settings shown: IorD=x, RegDst=x, ALUSrcA=1, ALUSrcB=2, MemtoReg=x, ALUOp=0, PCSrc=x.]
ALUOut = A + sgn-ext(IR15:0)
.27 1999 ©UCB
Cycle 4 lw: MDR = Mem[ALUOut]
[Figure: multicycle datapath with control points. Settings shown: IorD=1, RegDst=x, ALUSrcA=x, ALUSrcB=x, MemtoReg=x, ALUOp=x, PCSrc=x.]
MDR = Mem[ALUOut]
.28 1999 ©UCB
Cycle 5 lw: Reg[IR20:16] = MDR
[Figure: multicycle datapath with control points. Settings shown: IorD=x, RegDst=0, ALUSrcA=x, ALUSrcB=x, MemtoReg=1, ALUOp=x, PCSrc=x.]
Reg[IR20:16] = MDR (RegDst=0 selects the rt field, IR20:16, as the destination of a load)
.29 1999 ©UCB
Cycle 4 (sw): Mem[ALUOut] = B
[Figure: multicycle datapath with control points. Setting shown: IorD=1.]
Mem[ALUOut] = B
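The per-cycle register transfers on the preceding slides can be strung together into a small behavioral model. The Python sketch below is only an illustration under stated assumptions: memory is a dict keyed by byte address, values are plain Python integers with no 32-bit wraparound, only add and sub are modeled for R-format, and all function names are hypothetical. For lw, the destination is the rt field (IR20:16), which is what RegDst=0 selects.

```python
def sign_extend16(x):
    """Interpret the low 16 bits of x as a signed value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def encode_i(op, rs, rt, imm):
    """Build an I-format word (lw, sw, beq)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_r(rs, rt, rd, funct):
    """Build an R-format word (op = 0)."""
    return (rs << 21) | (rt << 16) | (rd << 11) | funct


def run_instruction(cpu):
    """Execute one instruction step by step; return the number of cycles used."""
    mem, reg = cpu["Mem"], cpu["Reg"]
    # Cycle 1: IR = Mem[PC]; PC = PC + 4
    ir = mem[cpu["PC"]]
    cpu["PC"] += 4
    # Cycle 2: A = Reg[IR25:21]; B = Reg[IR20:16]; ALUOut = PC + (sgn-ext(IR15:0) << 2)
    op, rs, rt, rd = ir >> 26, (ir >> 21) & 31, (ir >> 16) & 31, (ir >> 11) & 31
    imm = sign_extend16(ir)
    a, b = reg[rs], reg[rt]
    alu_out = cpu["PC"] + (imm << 2)
    if op == 0x04:  # beq, cycle 3: if (A == B) PC = ALUOut
        if a == b:
            cpu["PC"] = alu_out
        return 3
    if op == 0x00:  # R-format, cycle 3: ALUOut = A op B (add and sub only)
        funct = ir & 0x3F
        if funct not in (0x20, 0x22):
            raise ValueError("unsupported funct")
        alu_out = a + b if funct == 0x20 else a - b
        reg[rd] = alu_out  # cycle 4: Reg[IR15:11] = ALUOut
        return 4
    if op in (0x23, 0x2B):  # lw / sw, cycle 3: ALUOut = A + sgn-ext(IR15:0)
        alu_out = a + imm
        if op == 0x2B:
            mem[alu_out] = b  # sw, cycle 4: Mem[ALUOut] = B
            return 4
        mdr = mem[alu_out]  # lw, cycle 4: MDR = Mem[ALUOut]
        reg[rt] = mdr       # lw, cycle 5: Reg[IR20:16] = MDR
        return 5
    raise ValueError("unsupported opcode")


if __name__ == "__main__":
    cpu = {"PC": 0, "Reg": [0] * 32, "Mem": {}}
    cpu["Reg"][8] = 100
    cpu["Mem"][104] = 7
    cpu["Mem"][0] = encode_i(0x23, 8, 9, 4)  # lw $9, 4($8)
    print(run_instruction(cpu), cpu["Reg"][9])
```

Each instruction class returns as soon as its last step completes, which is where the 3-, 4- and 5-cycle counts come from.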