The Processor (1) - SKKU

Post on 01-Nov-2021

SWE3005: Introduction to Computer Architectures, Fall 2019, Jinkyu Jeong (jinkyu@skku.edu)

The Processor (1)

Jinkyu Jeong (jinkyu@skku.edu)
Computer Systems Laboratory
Sungkyunkwan University
http://csl.skku.edu

Introduction

• CPU performance factors
  – Instruction count
    • Determined by ISA and compiler
  – CPI and cycle time
    • Determined by CPU hardware

• We will examine two MIPS implementations
  – A simplified version
  – A more realistic pipelined version

• Simple subset, shows most aspects
  – Memory reference: lw, sw
  – Arithmetic/logical: add, sub, and, or, slt
  – Control transfer: beq, j

Outline

Textbook: P&H 4.1-4.4

• Logic Design Basics & Implementation Overview

• Building a Datapath

• Control Logic Design

Logic Design Basics & Implementation Overview

Logic Design Basics

• Information encoded in binary
  – Low voltage = 0, High voltage = 1
  – One wire per bit
  – Multi-bit data encoded on multi-wire buses

• Combinational elements
  – Operate on data
  – Output is a function of input

• State (sequential) elements
  – Store information

Combinational Elements

• AND-gate
  – Y = A & B

• Adder
  – Y = A + B

• Multiplexer
  – Y = S ? I1 : I0

• Arithmetic/Logic Unit
  – Y = F(A, B)
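The four combinational elements above can be modeled as pure functions, since each output depends only on the current inputs. This is a minimal sketch: the 32-bit width (`MASK32`) and the choice to pass `F` as a Python callable are assumptions made for illustration, not something the slides specify.

```python
MASK32 = 0xFFFFFFFF  # assumed bus width of 32 bits


def and_gate(a: int, b: int) -> int:
    """Y = A & B (bitwise over the whole bus)."""
    return a & b


def adder(a: int, b: int) -> int:
    """Y = A + B, wrapping around at the assumed 32-bit width."""
    return (a + b) & MASK32


def mux(s: int, i0: int, i1: int) -> int:
    """Y = S ? I1 : I0."""
    return i1 if s else i0


def alu(f, a: int, b: int) -> int:
    """Y = F(A, B), where F is the selected two-input function."""
    return f(a, b) & MASK32
```

For example, `mux(1, 5, 9)` selects the `I1` input, and `alu(adder, 2, 3)` applies the adder as the selected function.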

Sequential Elements

• Register: stores data in a circuit
  – Uses a clock signal to determine when to update the stored value
  – Edge-triggered: update when Clk changes from 0 to 1

[Figure: register with inputs D and Clk and output Q, with a timing diagram of Clk, D, and Q]

Sequential Elements

• Register with write control
  – Only updates on clock edge when write control input is 1
  – Used when stored value is required later

[Figure: register with inputs D, Clk, and Write and output Q, with a timing diagram of Clk, Write, D, and Q]
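The edge-triggered behavior with write control can be sketched as a small class that remembers the previous clock level and only copies D to Q on a 0-to-1 transition with Write asserted. The class name `Register` and the method name `tick` are hypothetical, chosen for this illustration.

```python
class Register:
    """Edge-triggered register with a write-control input (illustrative model)."""

    def __init__(self, value: int = 0):
        self.q = value        # stored value, visible on output Q
        self._prev_clk = 0    # previous clock level, used to detect the rising edge

    def tick(self, clk: int, d: int, write: int = 1) -> int:
        """Present clock level, D, and Write; return Q after this step."""
        rising_edge = self._prev_clk == 0 and clk == 1
        if rising_edge and write:
            self.q = d        # update only on a 0 -> 1 edge with Write = 1
        self._prev_clk = clk
        return self.q
```

Holding `clk` at 1 across several calls does not update the value again, because only the transition from 0 to 1 counts as an edge.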

Clocking Methodology

• Combinational logic transforms data during clock cycles
  – Between clock edges
  – Input from state elements, output to state element
  – Longest delay determines clock period

Instruction Execution

• PC → instruction memory, fetch instruction

• Register numbers → register file, read registers

• Depending on instruction class
  – Use ALU to calculate
    • Arithmetic result
    • Memory address for load/store
    • Branch target address
  – Access data memory for load/store
  – PC ← target address or PC + 4

CPU Overview

Multiplexers

• Can’t just join wires together

– Use multiplexers

Control

Building a Datapath

Building a Datapath

• Datapath
  – Elements that process data and addresses in the CPU
    • Registers, ALUs, muxes, memories, …

• We will build a MIPS datapath incrementally
  – Refining the overview design

Instruction Fetch

[Figure: instruction fetch datapath. The PC is a 32-bit register; an adder increments it by 4 for the next instruction.]

R-Format Instructions

• Read two register operands

• Perform arithmetic/logical operation

• Write register result

Register File Read

• Two register numbers select two register outputs

Register File Write

• A register number and a write signal enable a state element (D flip-flop) to update its value
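The read and write behavior of the register file can be sketched as follows. Reads are combinational (two register numbers select two outputs), while a write happens only at a clock edge with the write signal asserted. The class and method names are hypothetical; keeping register 0 fixed at zero follows the MIPS convention for `$zero` and is not stated on these slides.

```python
class RegisterFile:
    """32 x 32-bit register file with two read ports and one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_addr1: int, read_addr2: int):
        """Combinational read: two register numbers select two outputs."""
        return self.regs[read_addr1], self.regs[read_addr2]

    def clock_edge(self, reg_write: int, write_addr: int, write_data: int) -> None:
        """Model one rising clock edge: write only if the write signal is 1."""
        # Register 0 stays at zero (MIPS $zero convention, assumed here).
        if reg_write and write_addr != 0:
            self.regs[write_addr] = write_data & 0xFFFFFFFF
```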

Arithmetic/Logic Unit

Function   Ainvert  Binvert  Operation
a AND b    0        0        00
a OR b     0        0        01
a NOR b    1        1        00
a + b      0        0        10
a - b      0        1        10
slt a, b   0        1        11
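The table can be read as a recipe: optionally invert each input, then select AND, OR, the adder output, or the "set on less than" bit. Below is a minimal 32-bit sketch. Two details are assumptions not shown in the table: Binvert also drives the adder's carry-in (so that a + ~b + 1 = a - b), and the slt result is corrected for signed overflow.

```python
MASK32 = 0xFFFFFFFF


def alu(a: int, b: int, ainvert: int, binvert: int, operation: int) -> int:
    """32-bit ALU driven by the Ainvert, Binvert, and 2-bit Operation lines."""
    a_in = (~a & MASK32) if ainvert else (a & MASK32)
    b_in = (~b & MASK32) if binvert else (b & MASK32)
    carry_in = 1 if binvert else 0            # assumed: Binvert doubles as carry-in
    total = (a_in + b_in + carry_in) & MASK32

    if operation == 0b00:                     # AND (NOR when both inputs are inverted)
        return a_in & b_in
    if operation == 0b01:                     # OR
        return a_in | b_in
    if operation == 0b10:                     # add, or subtract when Binvert = 1
        return total
    if operation == 0b11:                     # slt: sign of a - b, overflow-corrected
        sign_a, sign_b, sign_sum = a_in >> 31, b_in >> 31, total >> 31
        overflow = sign_a == sign_b and sign_sum != sign_a
        return sign_sum ^ int(overflow)
    raise ValueError("operation must be a 2-bit value")
```

With `ainvert=1, binvert=1, operation=0b00` the AND of the inverted inputs yields NOR by De Morgan's law, which is why the table needs no separate NOR operation code.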

R-Format Instructions

• R-format instructions (add, sub, slt, and, or)
  – Perform the operation (op and funct) on the values in rs and rt
  – Store the result back into the Register File (into location rd)
  – Note that the Register File is not written every cycle (e.g. sw), so we need an explicit write control signal for the Register File

R-type:
  bits:   31:26  25:21  20:16  15:11  10:6   5:0
  field:  op     rs     rt     rd     shamt  funct

[Figure: R-format datapath. The instruction supplies Read Addr 1, Read Addr 2, and Write Addr to the Register File; Read Data 1 and Read Data 2 feed the ALU (outputs: overflow, zero), whose result returns as Write Data. Control signals: ALU control, RegWrite.]
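As an illustration of the R-type field layout, the sketch below extracts the six fields from a 32-bit instruction word with shifts and masks. The function name `decode_r_type` is hypothetical.

```python
def decode_r_type(instr: int) -> dict:
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "op":    (instr >> 26) & 0x3F,  # bits 31:26
        "rs":    (instr >> 21) & 0x1F,  # bits 25:21
        "rt":    (instr >> 16) & 0x1F,  # bits 20:16
        "rd":    (instr >> 11) & 0x1F,  # bits 15:11
        "shamt": (instr >> 6) & 0x1F,   # bits 10:6
        "funct": instr & 0x3F,          # bits 5:0
    }
```

For instance, the word `0x02324020` encodes `add $t0, $s1, $s2`: op 0, rs 17, rt 18, rd 8, shamt 0, funct 0x20.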

Load/Store Instructions

• Read register operands
• Calculate address using 16-bit offset
  – Use ALU, but sign-extend offset
• Load: Read memory and update register
• Store: Write register value to memory

Load/Store Instructions

• Load and store instructions involve
  – Computing the memory address by adding the base register (read from the Register File during decode) to the 16-bit sign-extended offset field in the instruction
  – For a store: the value (read from the Register File during decode) is written to the Data Memory
  – For a load: the value read from the Data Memory is written to the Register File

[Figure: load/store datapath. Register File and ALU as in the R-format datapath, plus a Sign Extend unit (16 → 32 bits) feeding the ALU and a Data Memory (Address, Write Data, Read Data) controlled by MemWrite and MemRead.]
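The address calculation for loads and stores can be sketched as a sign extension followed by a 32-bit add. Here the data memory is modeled as a Python dict keyed by byte address and the register file as a plain list; both are simplifying assumptions, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16: int) -> int:
    """Extend a 16-bit value to 32 bits by replicating bit 15."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def effective_address(base: int, offset16: int) -> int:
    """ALU add of the base register and the sign-extended offset."""
    return (base + sign_extend16(offset16)) & MASK32


def store_word(regs: list, mem: dict, rt: int, rs: int, offset16: int) -> None:
    """sw rt, offset(rs): write the register value to data memory."""
    mem[effective_address(regs[rs], offset16)] = regs[rt]


def load_word(regs: list, mem: dict, rt: int, rs: int, offset16: int) -> None:
    """lw rt, offset(rs): read data memory and update the register."""
    regs[rt] = mem.get(effective_address(regs[rs], offset16), 0)
```

An offset field of `0xFFFC` sign-extends to -4, so with a base of `0x1000` the access goes to address `0x0FFC`.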

Branch Instructions

• Read register operands
• Compare operands
  – Use ALU, subtract and check Zero output
• Calculate target address
  – Sign-extend displacement
  – Shift left 2 places (word displacement)
  – Add to PC + 4
    • Already calculated by instruction fetch

Branch Instructions

[Figure: branch datapath. Annotations: the shift-left-2 unit just re-routes wires; the sign-extend unit replicates the sign-bit wire.]

Branch Instructions

• Branch instructions involve
  – Comparing the operands read from the Register File during decode for equality (zero ALU output)
  – Computing the branch target address by adding the updated PC (PC + 4) to the 16-bit sign-extended offset field in the instruction, shifted left by 2

[Figure: branch datapath. The Register File outputs feed the ALU, whose zero output goes to the branch control logic. A separate adder computes PC + 4; a second adder adds the sign-extended (16 → 32 bits), shifted-left-by-2 offset to produce the branch target address.]
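The two branch steps, comparison by subtraction and target calculation, can be sketched together as a function that returns the next PC for a beq. The function name `beq_next_pc` is hypothetical, and the 32-bit wraparound is an assumption of this model.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16: int) -> int:
    """Extend a 16-bit value to 32 bits by replicating bit 15."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def beq_next_pc(pc: int, rs_val: int, rt_val: int, offset16: int) -> int:
    """Return the next PC for beq: branch target if equal, else PC + 4."""
    pc_plus_4 = (pc + 4) & MASK32
    # Word displacement: sign-extend, then shift left 2 places.
    byte_offset = (sign_extend16(offset16) << 2) & MASK32
    target = (pc_plus_4 + byte_offset) & MASK32
    # The ALU subtracts the operands; Zero = 1 means they are equal.
    zero = ((rs_val - rt_val) & MASK32) == 0
    return target if zero else pc_plus_4
```

An offset of 3 from PC `0x400000` gives the target `0x400004 + 12 = 0x400010`, while an offset field of `0xFFFF` (that is, -1) branches back to the word before PC + 4.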

Composing the Elements

• First-cut datapath does an instruction in one clock cycle
  – Each datapath element can only do one function at a time
  – Hence, we need separate instruction and data memories

• Use multiplexers where alternate data sources are used for different instructions

R-Type/Load/Store Datapath

Full Datapath

Control Logic Design

Datapath With Control

ALU Control

• ALU used for
  – Load/Store: F = add
  – Branch: F = subtract
  – R-type: F depends on funct field

ALU control  Function
0000         AND
0001         OR
0010         add
0110         subtract
0111         set-on-less-than
1100         NOR

ALU Control

• Assume 2-bit ALUOp derived from opcode
  – Combinational logic derives ALU control

opcode  ALUOp  Operation         funct   ALU function      ALU control
lw      00     load word         XXXXXX  add               0010
sw      00     store word        XXXXXX  add               0010
beq     01     branch equal      XXXXXX  subtract          0110
R-type  10     add               100000  add               0010
               subtract          100010  subtract          0110
               AND               100100  AND               0000
               OR                100101  OR                0001
               set-on-less-than  101010  set-on-less-than  0111
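The table above maps directly onto a small lookup: ALUOp alone decides the ALU control for loads, stores, and branches, and the funct field is consulted only for R-type. The sketch below is one way to write that combinational logic; the names `alu_control` and `FUNCT_TO_CONTROL` are hypothetical.

```python
# funct field -> 4-bit ALU control, used only when ALUOp = 10 (R-type)
FUNCT_TO_CONTROL = {
    0b100000: 0b0010,  # add
    0b100010: 0b0110,  # subtract
    0b100100: 0b0000,  # AND
    0b100101: 0b0001,  # OR
    0b101010: 0b0111,  # set-on-less-than
}


def alu_control(alu_op: int, funct: int) -> int:
    """Derive the 4-bit ALU control from the 2-bit ALUOp and the funct field."""
    if alu_op == 0b00:      # lw / sw: add, funct is a don't-care
        return 0b0010
    if alu_op == 0b01:      # beq: subtract, funct is a don't-care
        return 0b0110
    if alu_op == 0b10:      # R-type: operation depends on funct
        return FUNCT_TO_CONTROL[funct]
    raise ValueError("unsupported ALUOp")
```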

The Main Control Unit

• Control signals derived from instruction

              31:26     25:21  20:16  15:11  10:6   5:0
R-type        0         rs     rt     rd     shamt  funct

              31:26     25:21  20:16  15:0
Load/Store    35 or 43  rs     rt     address
Branch        4         rs     rt     address

  – Bits 31:26: opcode
  – rs (25:21): always read
  – rt (20:16): read, except for load
  – rd (15:11) for R-type, rt (20:16) for load: write for R-type and load
  – address (15:0): sign-extend and add
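One way to picture the main control unit is as a lookup from the opcode field to a set of control signals. The transcript names RegWrite, MemRead, MemWrite, and ALUOp; the remaining signal names (RegDst, ALUSrc, MemtoReg, Branch) and the per-instruction values below follow the control table in the P&H textbook and should be treated as assumptions here, since the corresponding figure did not survive in this transcript. `None` marks a don't-care.

```python
# opcode -> control signals (None = don't care); values assumed from the P&H textbook
CONTROL = {
    0:  dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
             MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),        # R-type
    35: dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
             MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),        # lw
    43: dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
             MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),        # sw
    4:  dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
             MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),        # beq
}


def main_control(instr: int) -> dict:
    """Look up the control signals from opcode bits 31:26 of the instruction."""
    return CONTROL[(instr >> 26) & 0x3F]
```

The pattern matches the field notes above: RegWrite is 1 only for R-type and load, and the destination comes from rd for R-type (RegDst = 1) but from rt for load (RegDst = 0).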

R-Type Instruction

Load Instruction

Branch-on-Equal Instruction

Implementing Jumps

• Jump uses word address
• Update PC with concatenation of
  – Top 4 bits of old PC
  – 26-bit jump address
  – 00
• Need an extra control signal decoded from opcode

Jump:  31:26  25:0
       2      address
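The concatenation for the jump target can be sketched with a mask and a shift. One assumption to flag: this sketch takes the top 4 bits from PC + 4, as in the textbook's datapath; that differs from taking them from the current PC only when the jump sits in the last word of a 256 MB region. The function name `jump_target` is hypothetical.

```python
def jump_target(pc: int, instr: int) -> int:
    """New PC for a jump: top 4 PC bits, 26-bit address field, then 00."""
    addr26 = instr & 0x03FFFFFF                 # bits 25:0 of the instruction
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF           # assumed source of the top 4 bits
    return (pc_plus_4 & 0xF0000000) | (addr26 << 2)
```

For example, the word `0x08100008` (opcode 2, address field `0x100008`) fetched at PC `0x00400000` jumps to `0x00400020`.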

Datapath With Jumps Added

Performance Issues

• Longest delay determines clock period
  – Critical path: load instruction
  – Instruction memory → register file → ALU → data memory → register file

• Not feasible to vary period for different instructions

• Violates design principle
  – Making the common case fast

• We will improve performance by pipelining
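To make the critical-path argument concrete, the sketch below sums per-stage delays along each instruction's path and takes the maximum as the single-cycle clock period. The delay values in `DELAY_PS` are hypothetical numbers picked for illustration; the slides do not give any.

```python
# Hypothetical stage delays in picoseconds (illustrative values only)
DELAY_PS = {
    "instr_mem": 200,
    "reg_read": 100,
    "alu": 200,
    "data_mem": 200,
    "reg_write": 100,
}

# Stages each instruction class passes through in the single-cycle datapath
PATHS = {
    "lw":     ["instr_mem", "reg_read", "alu", "data_mem", "reg_write"],
    "sw":     ["instr_mem", "reg_read", "alu", "data_mem"],
    "R-type": ["instr_mem", "reg_read", "alu", "reg_write"],
    "beq":    ["instr_mem", "reg_read", "alu"],
}


def latency_ps(instr_class: str) -> int:
    """Total delay along the path used by one instruction class."""
    return sum(DELAY_PS[stage] for stage in PATHS[instr_class])


# Every instruction takes one cycle, so the slowest path sets the period.
clock_period_ps = max(latency_ps(name) for name in PATHS)
critical_instr = max(PATHS, key=latency_ps)
```

Under these assumed delays the load path is the longest, so every instruction, including a much shorter beq, pays the full load latency each cycle.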