Chapter 5: Datapath and Control (Part 3)


CS 447, Jason Bakos

Single-Cycle CPU

• The single-cycle CPU from the last lecture had a CPI of 1
  – Clock cycle is determined by the longest possible path in the machine
    • Loads are the worst: they use 5 functional units in series
  – Performance, utilization, and efficiency are not going to be good, because most instructions don’t need such a long clock cycle
  – A variable-speed clock could be used to solve this problem, but hinders parallelism
• Pipelining overlaps instruction executions
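
To make the critical-path argument concrete, here is a minimal sketch in Python. The unit latencies are illustrative assumptions, not figures from the lecture, and the names `LATENCY_PS` and `PATHS` are hypothetical; only the idea that the clock must cover the slowest series of functional units comes from the slide.

```python
# Assumed latencies in picoseconds; illustrative values, not from the lecture.
LATENCY_PS = {
    "instr_mem": 200,
    "reg_read": 50,
    "alu": 100,
    "data_mem": 200,
    "reg_write": 50,
}

# Functional units each instruction class passes through in series.
PATHS = {
    "lw":     ["instr_mem", "reg_read", "alu", "data_mem", "reg_write"],
    "sw":     ["instr_mem", "reg_read", "alu", "data_mem"],
    "r_type": ["instr_mem", "reg_read", "alu", "reg_write"],
    "beq":    ["instr_mem", "reg_read", "alu"],
}

def path_delay(instr):
    """Total delay of the units an instruction uses in series."""
    return sum(LATENCY_PS[unit] for unit in PATHS[instr])

def single_cycle_period():
    """The single-cycle clock must cover the slowest instruction."""
    return max(path_delay(instr) for instr in PATHS)

if __name__ == "__main__":
    for instr in PATHS:
        print(instr, path_delay(instr), "ps")
    print("clock period:", single_cycle_period(), "ps")
```

With these assumed numbers the load path sets the clock period, so a branch that needs far less time still occupies a full load-length cycle.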

Multicycle Implementation

• Break instructions into steps, where each step requires one clock cycle
• We want to reuse functional units within an instruction instead of just across instructions
  – Reduces hardware
    • Use a single memory for instructions and data
    • Single ALU instead of one ALU and two adders
• Add registers to functional units to hold intermediate results (state data) for future cycles
  – Used within instruction executions
• Register file and memory hold state data to be used across instruction executions
  – These are programmer-visible
• We will need an FSM to control the CPU

Registers

• Locations of registers are determined by the following:
  – What combinational units will fit in one clock cycle
    • Assume memory access, regfile access (two reads or one write), or ALU operation
    • Any data needed by these operations must be stored in a temporary register
      – Instruction Register, Memory Data Register, A, B, and ALUOut registers added to design
      – All these except IR only need to hold data between two adjacent clock cycles
  – What data are needed in later cycles implementing the instruction

Multiplexors

• Need to add extra multiplexors (or expand existing muxes) to facilitate the reuse of the ALU within instructions
  – Add mux to first ALU input
  – Expand mux to second ALU input

Multicycle CPU

Breaking Instruction Execution into Clock Cycles

• Goal is to balance the latency of the operations performed during each clock cycle
  – Each clock cycle can contain at most one of the following in series:
    • One ALU operation
    • One register file access (or multiple in parallel)
    • One memory access (this is a joke, but we’ll accept this for now)

Execution Stages

• In order to clearly define the CPU operation for each step in the operation, we’ll use RTL (register transfer language)
• Architecture research has defined 5 standard phases of instruction execution
  – Instruction fetch
  – Decode
    • Fetch register values from register file
  – Execute
    • Perform arithmetic/logic operation
  – Memory
    • Load/Store memory
  – Write back
    • Write register result back to register file

Execution Stages

• Fetch
  – IR=Memory[PC]
  – PC=PC+4
• Decode
  – A=Reg[IR[25..21]]
  – B=Reg[IR[20..16]]
  – ALUOut=PC+(sign_extend(IR[15..0]) << 2)
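
The fetch and decode RTL above can be sketched in Python as follows. This is only an illustration: the `state` dictionary, the dictionary-based memory, and the `make_state` helper are assumptions introduced here, not part of the lecture.

```python
def bits(word, hi, lo):
    """Return word[hi..lo] as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def sign_extend16(value):
    """Interpret a 16-bit field as a signed two's-complement number."""
    return value - 0x10000 if value & 0x8000 else value

def fetch(state):
    # IR = Memory[PC]; PC = PC + 4
    state["IR"] = state["mem"][state["PC"]]
    state["PC"] = (state["PC"] + 4) & 0xFFFFFFFF

def decode(state):
    ir = state["IR"]
    # A = Reg[IR[25..21]]; B = Reg[IR[20..16]]
    state["A"] = state["regs"][bits(ir, 25, 21)]
    state["B"] = state["regs"][bits(ir, 20, 16)]
    # ALUOut = PC + (sign_extend(IR[15..0]) << 2)
    # Computed for every instruction; only a branch ends up using it.
    offset = sign_extend16(bits(ir, 15, 0)) << 2
    state["ALUOut"] = (state["PC"] + offset) & 0xFFFFFFFF

def make_state():
    """Hypothetical machine state: memory maps byte address -> 32-bit word."""
    regs = [0] * 32
    regs[8], regs[9] = 7, 7
    # beq $8, $9, -2  (opcode 4, rs=8, rt=9, immediate 0xFFFE)
    beq = (4 << 26) | (8 << 21) | (9 << 16) | 0xFFFE
    return {"PC": 0x100, "IR": 0, "A": 0, "B": 0, "ALUOut": 0,
            "regs": regs, "mem": {0x100: beq}}

if __name__ == "__main__":
    s = make_state()
    fetch(s)
    decode(s)
    print(hex(s["PC"]), s["A"], s["B"], hex(s["ALUOut"]))
```

Note that decode uses the already incremented PC, so the branch target is relative to the instruction after the branch.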

Execution Stages

• Execute
  – Memory access
    • ALUOut=A+sign_extend(IR[15..0])
  – R-type
    • ALUOut=A op B
  – Branch (beq)
    • if (A==B) PC=ALUOut
  – Jump
    • PC=PC[31..28] || (IR[25..0]<<2)
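
As a rough illustration of the execute step, the four cases can be written as one Python function. The `kind` tag and the `op` callback are hypothetical conveniences for this sketch; in hardware the control unit and ALU control select the behavior.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    """Interpret a 16-bit field as a signed two's-complement number."""
    return value - 0x10000 if value & 0x8000 else value

def execute(kind, pc, ir, a, b, alu_out, op=None):
    """Return (new_pc, new_alu_out) for the third clock cycle.

    kind is a hypothetical tag: "mem", "r_type", "beq", or "jump".
    op is a two-argument function standing in for the R-type ALU operation.
    """
    if kind == "mem":
        # ALUOut = A + sign_extend(IR[15..0])
        return pc, (a + sign_extend16(ir & 0xFFFF)) & MASK32
    if kind == "r_type":
        # ALUOut = A op B
        return pc, op(a, b) & MASK32
    if kind == "beq":
        # if (A == B) PC = ALUOut  (ALUOut holds the target from decode)
        return (alu_out if a == b else pc), alu_out
    if kind == "jump":
        # PC = PC[31..28] || (IR[25..0] << 2)
        return (pc & 0xF0000000) | ((ir & 0x03FFFFFF) << 2), alu_out
    raise ValueError(f"unknown instruction kind: {kind}")

if __name__ == "__main__":
    # Taken branch: PC is replaced by the target computed during decode.
    print([hex(v) for v in execute("beq", 0x104, 0, 7, 7, 0xFC)])
```

Branches and jumps finish in this cycle, which is why they need only three cycles in total.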

Execution Stages

• Memory Access/Write Back
  – Load
    • MDR=Memory[ALUOut]
  – Store
    • Memory[ALUOut]=B
  – R-type
    • Reg[IR[15..11]]=ALUOut
• Memory Read Completion
  – Load
    • Reg[IR[20..16]]=MDR
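
The last two steps can be sketched the same way. The `state` dictionary layout and the default value of 0 for unwritten memory are assumptions made for this example.

```python
def bits(word, hi, lo):
    """Return word[hi..lo] as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def memory_or_writeback(kind, state):
    """Fourth cycle: a load reads memory, a store writes it, an R-type writes rd."""
    if kind == "lw":
        # MDR = Memory[ALUOut]  (unwritten locations read as 0: an assumption)
        state["MDR"] = state["mem"].get(state["ALUOut"], 0)
    elif kind == "sw":
        # Memory[ALUOut] = B
        state["mem"][state["ALUOut"]] = state["B"]
    elif kind == "r_type":
        # Reg[IR[15..11]] = ALUOut
        state["regs"][bits(state["IR"], 15, 11)] = state["ALUOut"]
    else:
        raise ValueError(f"no fourth cycle for: {kind}")

def memory_read_completion(state):
    """Fifth cycle, loads only: Reg[IR[20..16]] = MDR."""
    state["regs"][bits(state["IR"], 20, 16)] = state["MDR"]

def blank_state(ir, alu_out, b=0, mem=None):
    """Hypothetical helper that builds a state as it looks after the execute step."""
    return {"IR": ir, "ALUOut": alu_out, "B": b, "MDR": 0,
            "regs": [0] * 32, "mem": dict(mem or {})}

if __name__ == "__main__":
    # lw $10, 0($8) with the effective address 0x2000 already in ALUOut
    s = blank_state((35 << 26) | (8 << 21) | (10 << 16), 0x2000, mem={0x2000: 42})
    memory_or_writeback("lw", s)
    memory_read_completion(s)
    print(s["regs"][10])
```

The load needs the extra cycle because the memory read and the register file write cannot both fit in one clock cycle under the one-unit-per-cycle rule.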

Control Signals

• Control Unit signals
  – Refer to figure 5.34 (pg. 384) in the book
• ALU Control signals
  – Provide an appropriate ALUOp signal based on what the ALU is being used for (if for an R-type, perform lookup based on function code)
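
A lookup-table view of the ALU control might look like the sketch below. The 2-bit ALUOp values and 4-bit ALU control codes follow the encoding commonly used in textbook MIPS designs; they are assumed here, since the slides do not list them.

```python
# 4-bit ALU control codes (assumed, following the common textbook encoding).
ALU_AND, ALU_OR, ALU_ADD, ALU_SUB, ALU_SLT = 0b0000, 0b0001, 0b0010, 0b0110, 0b0111

# R-type function codes (IR[5..0]) mapped to ALU operations.
FUNCT_TABLE = {
    0b100000: ALU_ADD,  # add
    0b100010: ALU_SUB,  # sub
    0b100100: ALU_AND,  # and
    0b100101: ALU_OR,   # or
    0b101010: ALU_SLT,  # slt
}

def alu_control(alu_op, funct=0):
    """Map the control unit's ALUOp (plus funct for R-types) to an ALU code."""
    if alu_op == 0b00:
        return ALU_ADD            # address arithmetic and PC + 4
    if alu_op == 0b01:
        return ALU_SUB            # beq compares by subtracting
    if alu_op == 0b10:
        return FUNCT_TABLE[funct]  # R-type: look up the function code
    raise ValueError(f"unsupported ALUOp: {alu_op:#04b}")

if __name__ == "__main__":
    print(format(alu_control(0b10, 0b101010), "04b"))
```

Keeping the function-code lookup out of the main control unit means the FSM only has to say what the ALU is being used for, not which exact operation to perform.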

Control Signals

• All that’s left is for us to build the control unit as a FSM and the ALU control as a lookup table

Control Unit

• The fetch and decode stages are the same for every instruction...

Control Unit

• Here are the states and transitions for the memory-reference instructions

Control Unit

• Here are the states and transitions for R-type, branch, and jump instructions

Control Unit

• Final control unit FSM...
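
Since the state diagram itself is a figure, here is one possible next-state function reconstructed from the RTL steps on the earlier slides. The state names and numbering are assumptions of this sketch and may not match the figure; the opcodes are the standard MIPS encodings for lw, sw, R-type, beq, and j.

```python
# Hypothetical state names, one per RTL step described on the earlier slides.
(FETCH, DECODE, MEM_ADDR, MEM_READ, MEM_READ_DONE,
 MEM_WRITE, R_EXEC, R_DONE, BRANCH, JUMP) = range(10)

# Standard MIPS opcodes (IR[31..26]).
OP_RTYPE, OP_J, OP_BEQ, OP_LW, OP_SW = 0x00, 0x02, 0x04, 0x23, 0x2B

def next_state(state, opcode):
    """Next-state function of the control FSM for the five instruction classes."""
    if state == FETCH:
        return DECODE                      # same for every instruction
    if state == DECODE:
        return {OP_LW: MEM_ADDR, OP_SW: MEM_ADDR, OP_RTYPE: R_EXEC,
                OP_BEQ: BRANCH, OP_J: JUMP}[opcode]
    if state == MEM_ADDR:
        return MEM_READ if opcode == OP_LW else MEM_WRITE
    if state == MEM_READ:
        return MEM_READ_DONE               # load writes the register next
    if state == R_EXEC:
        return R_DONE                      # R-type writes rd next
    return FETCH                           # every final state returns to fetch

def cycles_for(opcode):
    """Count clock cycles from fetch until the FSM returns to fetch."""
    state, cycles = FETCH, 0
    while True:
        state = next_state(state, opcode)
        cycles += 1
        if state == FETCH:
            return cycles

if __name__ == "__main__":
    for name, op in [("lw", OP_LW), ("sw", OP_SW), ("R-type", OP_RTYPE),
                     ("beq", OP_BEQ), ("j", OP_J)]:
        print(name, cycles_for(op))
```

Walking the FSM this way gives the per-instruction cycle counts directly: the number of states visited before control returns to fetch.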

Problems to Think About

• How could we add bne, blt, and bgez instructions to our CPU?

• How do you calculate CPI for our CPU if we are given instruction-type distributions?
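
As a starting point for the CPI question, one approach is a weighted average of the cycles each instruction class needs. The cycle counts below follow from the RTL steps on the earlier slides (a load uses all five steps, a store and an R-type use four, a branch and a jump use three); the instruction mix is made up purely for illustration.

```python
# Cycles per instruction class, counted from the RTL steps.
CYCLES = {"lw": 5, "sw": 4, "r_type": 4, "beq": 3, "j": 3}

def average_cpi(mix):
    """Weighted-average CPI for a mix given as fractions that sum to 1."""
    total = sum(mix.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"instruction mix sums to {total}, expected 1.0")
    return sum(fraction * CYCLES[kind] for kind, fraction in mix.items())

# Hypothetical instruction mix, not measured data.
example_mix = {"lw": 0.25, "sw": 0.10, "r_type": 0.50, "beq": 0.10, "j": 0.05}

if __name__ == "__main__":
    print(f"CPI = {average_cpi(example_mix):.2f}")
```

For this assumed mix the sum is 0.25·5 + 0.10·4 + 0.50·4 + 0.10·3 + 0.05·3 = 4.10 cycles per instruction.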