Chapter · State (sequential) elements Store ... Chapter 4 — The Processor — 21. ALU Control...

Chapter 4 The Processor



Chapter 4 — The Processor — 2

§4.1 Introduction

CPU performance factors
  Instruction count
    Determined by ISA and compiler
  CPI and Cycle time
    Determined by CPU hardware
We will examine two MIPS implementations
  A simplified version
  A more realistic pipelined version
Simple subset, shows most aspects
  Memory reference: lw, sw
  Arithmetic/logical: add, sub, and, or, slt
  Control transfer: beq, j
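The performance factors listed on this slide combine in the standard equation CPU time = instruction count × CPI × cycle time. A minimal sketch of that arithmetic follows; the function name, parameter names, and the numbers in the usage line are illustrative assumptions, not values from the slides.

```python
def cpu_time_seconds(instruction_count, cpi, cycle_time_s):
    """CPU time = instruction count x cycles per instruction x seconds per cycle."""
    return instruction_count * cpi * cycle_time_s


# Hypothetical workload: one million instructions, CPI of 1, 800 ps clock period.
example_time = cpu_time_seconds(1_000_000, 1.0, 800e-12)
```

The ISA and compiler set the first argument; the CPU hardware sets the other two.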


Chapter 4 — The Processor — 3

Instruction Execution
PC → instruction memory, fetch instruction
Register numbers → register file, read registers
Depending on instruction class
  Use ALU to calculate
    Arithmetic result
    Memory address for load/store
    Branch target address
  Access data memory for load/store
  PC ← target address or PC + 4


Chapter 4 — The Processor — 4

CPU Overview


Chapter 4 — The Processor — 5

Multiplexers
Can’t just join wires together
  Use multiplexers


Chapter 4 — The Processor — 6

Control


Chapter 4 — The Processor — 7

§4.2 Logic Design Conventions

Logic Design Basics
Information encoded in binary
  Low voltage = 0, High voltage = 1
  One wire per bit
  Multi-bit data encoded on multi-wire buses
Combinational element
  Operate on data
  Output is a function of input
State (sequential) elements
  Store information


Chapter 4 — The Processor — 8

Combinational Elements
AND-gate: Y = A & B
Multiplexer: Y = S ? I1 : I0
Adder: Y = A + B
Arithmetic/Logic Unit: Y = F(A, B)
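The first three elements can be modeled as pure functions, since a combinational output depends only on the current inputs. This is a minimal sketch: the function names and the 32-bit default width are assumptions made for illustration.

```python
def and_gate(a, b):
    """Y = A & B"""
    return a & b


def mux(s, i0, i1):
    """Y = S ? I1 : I0 (select I1 when S is 1, otherwise I0)."""
    return i1 if s else i0


def adder(a, b, width=32):
    """Y = A + B, truncated to `width` bits as a fixed-width hardware adder would."""
    return (a + b) & ((1 << width) - 1)
```

No function keeps state between calls, which is what separates these elements from the registers on the next slides.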


Chapter 4 — The Processor — 9

Sequential Elements
Register: stores data in a circuit
  Uses a clock signal to determine when to update the stored value
  Edge-triggered: update when Clk changes from 0 to 1
[Figure: register with inputs D and Clk and output Q; timing diagram of Clk, D, Q]


Chapter 4 — The Processor — 10

Sequential Elements
Register with write control
  Only updates on clock edge when write control input is 1
  Used when stored value is required later
[Figure: register with inputs D, Clk, and Write and output Q; timing diagram of Clk, Write, D, Q]
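The edge-triggered behavior with write control can be sketched as a small class that remembers the previous clock level. The class name, the `tick` method, and its signature are hypothetical and chosen only for this illustration.

```python
class Register:
    """Edge-triggered register with a write-control input (illustrative model)."""

    def __init__(self, value=0):
        self.q = value          # stored value, visible on output Q
        self._prev_clk = 0      # last clock level seen, used to detect the 0 -> 1 edge

    def tick(self, clk, d, write=1):
        """Present inputs Clk, D, and Write; return the current output Q."""
        rising_edge = self._prev_clk == 0 and clk == 1
        if rising_edge and write:
            self.q = d          # update only on a rising edge with Write = 1
        self._prev_clk = clk
        return self.q
```

Holding `clk` at 1 across several calls does not change `q`. Only a 0 to 1 transition with `write` set captures `d`.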


Chapter 4 — The Processor — 11

Clocking Methodology
Combinational logic transforms data during clock cycles
  Between clock edges
  Input from state elements, output to state element
  Longest delay determines clock period


Chapter 4 — The Processor — 12

§4.3 Building a Datapath

Datapath
  Elements that process data and addresses in the CPU
    Registers, ALUs, muxes, memories, …
We will build a MIPS datapath incrementally
  Refining the overview design (top down)


Chapter 4 — The Processor — 13

Instruction Fetch
[Figure: PC, instruction memory, and adder]
PC: 32-bit register
Adder: increment by 4 for next instruction


Chapter 4 — The Processor — 14

R-Format Instructions
Read two register operands
Perform arithmetic/logical operation
Write register result


Chapter 4 — The Processor — 15

Load/Store Instructions
Read register operands
Calculate address using 16-bit offset
  Use ALU, but sign-extend offset
Load: Read memory and update register
Store: Write register value to memory
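The address calculation for loads and stores can be sketched as follows. The helper names are assumptions for illustration; the base value stands for the contents of the base register read from the register file.

```python
def sign_extend16(offset):
    """Interpret the low 16 bits as a two's-complement value."""
    offset &= 0xFFFF
    return offset - 0x10000 if offset & 0x8000 else offset


def effective_address(base, offset16):
    """Base register value plus sign-extended 16-bit offset, wrapped to 32 bits."""
    return (base + sign_extend16(offset16)) & 0xFFFFFFFF
```

For example, an offset field of 0xFFFC sign-extends to -4, so a base of 0x1000 gives address 0x0FFC.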


Chapter 4 — The Processor — 16

Branch Instructions
Read register operands
Compare operands
  Use ALU, subtract and check Zero output
Calculate target address
  Sign-extend displacement
  Shift left 2 places (word displacement)
  Add to PC + 4
    Already calculated by instruction fetch
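The branch steps listed above can be sketched in a few lines. The function names are hypothetical, and the register operands are passed in as plain integers rather than read from a register file.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits as a two's-complement value."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, displacement16):
    """Sign-extend the displacement, shift left 2 (words to bytes), add to PC + 4."""
    return (pc + 4 + (sign_extend16(displacement16) << 2)) & MASK32


def next_pc_beq(pc, a, b, displacement16):
    """Subtract the operands; take the branch only when the Zero output is set."""
    zero = ((a - b) & MASK32) == 0
    return branch_target(pc, displacement16) if zero else (pc + 4) & MASK32
```

With `pc = 0x100` and a displacement of 3, the target is 0x104 + 12 = 0x110. A displacement of 0xFFFF (that is, -1) branches back to 0x100.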


Chapter 4 — The Processor — 17

Branch Instructions
[Figure: branch datapath]
Shift left 2: just re-routes wires
Sign-extend: sign-bit wire replicated


Chapter 4 — The Processor — 18

Composing the Elements
First-cut data path does an instruction in one clock cycle
  Each datapath element can only do one function at a time
  Hence, we need separate instruction and data memories
Use multiplexers where alternate data sources are used for different instructions


Chapter 4 — The Processor — 19

R-Type/Load/Store Datapath


Chapter 4 — The Processor — 20

Full Datapath


Chapter 4 — The Processor — 21

§4.4 A Simple Implementation Scheme

ALU Control
ALU used for
  Load/Store: F = add
  Branch: F = subtract
  R-type: F depends on funct field

ALU control  Function
0000         AND
0001         OR
0010         add
0110         subtract
0111         set-on-less-than
1100         NOR

All logic can be done with a NOR.
Equal and not-equal can also be done with an XOR (that is faster), but you need a bit more hardware.
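A minimal sketch of an ALU driven by the six 4-bit control codes listed on this slide follows. The function names and the returned `(result, zero)` pair are illustrative assumptions; set-on-less-than is modeled as a signed comparison, as it is for the MIPS slt instruction.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Reinterpret a 32-bit pattern as a signed two's-complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def alu(control, a, b):
    """Return (result, zero) for the given 4-bit ALU control code."""
    a &= MASK32
    b &= MASK32
    if control == 0b0000:
        result = a & b                                   # AND
    elif control == 0b0001:
        result = a | b                                   # OR
    elif control == 0b0010:
        result = (a + b) & MASK32                        # add
    elif control == 0b0110:
        result = (a - b) & MASK32                        # subtract
    elif control == 0b0111:
        result = 1 if to_signed(a) < to_signed(b) else 0 # set-on-less-than
    elif control == 0b1100:
        result = ~(a | b) & MASK32                       # NOR
    else:
        raise ValueError("unsupported ALU control code")
    return result, int(result == 0)
```

The second element of the pair plays the role of the Zero output that the branch logic checks after a subtract.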


Chapter 4 — The Processor — 22

ALU Control
Assume 2-bit ALUOp derived from opcode
  Combinational logic derives ALU control

opcode  ALUOp  Operation         funct   ALU function      ALU control
lw      00     load word         XXXXXX  add               0010
sw      00     store word        XXXXXX  add               0010
beq     01     branch equal      XXXXXX  subtract          0110
R-type  10     add               100000  add               0010
               subtract          100010  subtract          0110
               AND               100100  AND               0000
               OR                100101  OR                0001
               set-on-less-than  101010  set-on-less-than  0111
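The mapping from ALUOp and funct to the 4-bit ALU control can be written as a small lookup. This is a sketch under the assumption that only the rows shown on the slide are supported; the names are hypothetical.

```python
# funct field -> ALU control, used only when ALUOp is 10 (R-type)
FUNCT_TO_CONTROL = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # subtract
    0b100100: 0b0000,  # AND
    0b100101: 0b0001,  # OR
    0b101010: 0b0111,  # set-on-less-than
}


def alu_control(alu_op, funct):
    """Derive the 4-bit ALU control from the 2-bit ALUOp and 6-bit funct field."""
    if alu_op == 0b00:            # lw / sw: funct is a don't-care, always add
        return 0b0010
    if alu_op == 0b01:            # beq: funct is a don't-care, always subtract
        return 0b0110
    if alu_op == 0b10:            # R-type: operation selected by funct
        return FUNCT_TO_CONTROL[funct & 0x3F]
    raise ValueError("ALUOp value not used in this table")
```

For ALUOp 00 and 01 the funct argument is ignored, matching the XXXXXX entries in the table.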


Chapter 4 — The Processor — 23

The Main Control Unit
Control signals derived from instruction

R-type      0         rs     rt     rd     shamt  funct
            31:26     25:21  20:16  15:11  10:6   5:0
Load/Store  35 or 43  rs     rt     address
            31:26     25:21  20:16  15:0
Branch      4         rs     rt     address
            31:26     25:21  20:16  15:0

Bits 31:26: opcode
rs: always read
rt: read, except for load
Destination register: write for R-type and load
address: sign-extend and add
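The bit ranges in the three formats translate directly into shifts and masks. The sketch below extracts every field from a 32-bit instruction word; the dictionary layout and function names are assumptions for illustration. The destination choice (rd for R-type, rt for load) reflects the standard MIPS encoding.

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "opcode": (instr >> 26) & 0x3F,   # bits 31:26
        "rs": (instr >> 21) & 0x1F,       # bits 25:21
        "rt": (instr >> 16) & 0x1F,       # bits 20:16
        "rd": (instr >> 11) & 0x1F,       # bits 15:11 (R-type only)
        "shamt": (instr >> 6) & 0x1F,     # bits 10:6  (R-type only)
        "funct": instr & 0x3F,            # bits 5:0   (R-type only)
        "address": instr & 0xFFFF,        # bits 15:0  (load/store/branch)
    }


def destination_register(fields):
    """Register written by the instruction, or None if it writes no register."""
    if fields["opcode"] == 0:     # R-type writes rd
        return fields["rd"]
    if fields["opcode"] == 35:    # lw writes rt
        return fields["rt"]
    return None                   # sw and beq write no register
```

For instance, `decode(0x00221820)` yields opcode 0, rs 1, rt 2, rd 3, and funct 32, which is an R-type add.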


Chapter 4 — The Processor — 24

Datapath With Control


Chapter 4 — The Processor — 25

R-Type Instruction


Chapter 4 — The Processor — 26

Load Instruction


Chapter 4 — The Processor — 27

Branch-on-Equal Instruction


Chapter 4 — The Processor — 28

Implementing Jumps

Jump   2      address
       31:26  25:0

Jump uses word address
Update PC with concatenation of
  Top 4 bits of old PC
  26-bit jump address
  00
Need an extra control signal decoded from opcode
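The concatenation can be sketched with a mask, a shift, and an OR. The function name is hypothetical. The `pc` argument is whichever PC value the datapath takes the top 4 bits from; PC and PC + 4 give the same result unless the increment crosses a 256 MB boundary.

```python
def jump_target(pc, address26):
    """Concatenate top 4 bits of pc, the 26-bit jump address, and two zero bits."""
    upper = pc & 0xF0000000                  # top 4 bits of the PC value
    word_address = address26 & 0x03FFFFFF    # 26-bit field from bits 25:0
    return upper | (word_address << 2)       # shifting left by 2 appends the 00
```

A jump address field of 0x10 from a PC in the 0x1000_0000 region lands at 0x10000040.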


Chapter 4 — The Processor — 29

Datapath With Jumps Added


Chapter 4 — The Processor — 30

Performance Issues
Longest delay determines clock period
  Critical path: load instruction
  Instruction memory → register file → ALU → data memory → register file
Not feasible to vary period for different instructions
Violates design principle
  Making the common case fast
We will improve performance by pipelining
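To see why the load instruction sets the clock period, the sketch below sums assumed element delays along each instruction's path and takes the maximum. The delay values and the paths for sw, R-type, and beq are illustrative assumptions, not figures from these slides; only the load path is given above.

```python
# Assumed element delays in picoseconds (illustrative values only).
DELAY_PS = {"imem": 200, "reg_read": 100, "alu": 200, "dmem": 200, "reg_write": 100}

# Elements each instruction class passes through in a single-cycle datapath.
PATHS = {
    "lw": ["imem", "reg_read", "alu", "dmem", "reg_write"],
    "sw": ["imem", "reg_read", "alu", "dmem"],
    "R-type": ["imem", "reg_read", "alu", "reg_write"],
    "beq": ["imem", "reg_read", "alu"],
}


def path_delay_ps(instruction):
    """Total delay along the path used by one instruction class."""
    return sum(DELAY_PS[element] for element in PATHS[instruction])


def clock_period_ps():
    """A single fixed clock period must cover the slowest instruction."""
    return max(path_delay_ps(name) for name in PATHS)
```

Under these assumed delays the load path totals 800 ps, so every instruction, including a 500 ps branch, pays for an 800 ps cycle. That gap is the inefficiency pipelining is meant to address.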