
Post on 02-Mar-2021


The Evolution of Microprocessors

Per Stenström

[Figure: microprocessor chip with three processor cores, three L1 caches and an L2 cache, connected to memory.]

Evolution of Microprocessors

[Figure: timeline from 1970 to 2020 (ticks at 1970, 1980, 1990, 2000, 2010, 2020): Multicycle, Pipelined, Superscalar, Multicore ... and beyond.]

Multicycle Computers

Fetch next instruction

Decode/Fetch data

Execute

Store result

Opcode                      Assembly code    Meaning                            Comments
ADD, SUB                    ADD R1,R2,R3     R1=(R2)+(R3)                       Integer operations
ADDI, SUBI                  ADDI R1,R2,#3    R1=(R2)+3                          Immediate
AND, OR, XOR                AND R1,R2,R3     R1=(R2).AND.(R3)                   Bit-wise logical AND, OR, XOR
SLT                         SLT R1,R2,R3     If (R2)<(R3) then R1=1 else R1=0   Test R2, R3; outcome in R1
ADD.D, SUB.D, MUL.D, DIV.D  ADD.D F1,F2,F3   F1=(F2)+(F3)                       Floating-point operations

Opcode          Assembly code    Meaning            Comments
LB, LH, LW, LD  LW R1,#20(R2)    R1=MEM[20+(R2)]    For bytes, half-words, words and double words
SB, SH, SW, SD  SW R1,#20(R2)    MEM[20+(R2)]=R1
L.S, L.D        L.S F0,#20(R2)   F0=MEM[20+(R2)]    Single/double float load
S.S, S.D        S.S F0,#20(R2)   MEM[20+(R2)]=F0    Single/double float store

Opcode      Assembly code     Meaning                      Comments
BEQZ, BNEZ  BEQZ R1,Label     If (R1)=0 then PC=Label      Conditional branch, equal 0/not equal 0
BEQ, BNE    BEQ R1,R2,Label   If (R1)=(R2) PC=Label        Conditional branch, equal/not equal
J           J Target          PC=Target                    Target is an immediate field
JR          JR R1             PC=(R1)                      Target is in register
JAL         JAL Target        R31 = PC + 4; PC = Target    Jump to target and save return address in R31
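The "Meaning" column of the tables can be tried out with a small interpreter. The following is only a sketch under stated assumptions: the tuple encoding of instructions, the `run` function, the dictionary used as memory and the use of instruction indices as branch targets are all illustrative choices, not part of the slides.

```python
# Sketch of a fetch/decode/execute loop for a few opcodes from the tables.
# Assumptions (not from the slides): instructions are tuples, memory is a
# dict keyed by byte address, and branch targets are instruction indices.

def run(program, regs, mem, max_steps=10_000):
    """Execute `program` until the PC moves past its last instruction."""
    pc = 0
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, *args = program[pc]              # fetch and decode
        next_pc = pc + 1                     # default: fall through
        if op in ("ADD", "SUB"):             # Rd = (Rs) op (Rt)
            rd, rs, rt = args
            regs[rd] = regs[rs] + regs[rt] if op == "ADD" else regs[rs] - regs[rt]
        elif op in ("ADDI", "SUBI"):         # Rd = (Rs) op immediate
            rd, rs, imm = args
            regs[rd] = regs[rs] + imm if op == "ADDI" else regs[rs] - imm
        elif op == "SLT":                    # Rd = 1 if (Rs) < (Rt) else 0
            rd, rs, rt = args
            regs[rd] = 1 if regs[rs] < regs[rt] else 0
        elif op == "LW":                     # Rd = MEM[disp + (Rs)]
            rd, disp, rs = args
            regs[rd] = mem.get(disp + regs[rs], 0)
        elif op == "SW":                     # MEM[disp + (Rs)] = (Rt)
            rt, disp, rs = args
            mem[disp + regs[rs]] = regs[rt]
        elif op in ("BEQZ", "BNEZ"):         # branch if (R) == 0 / != 0
            r, target = args
            if (regs[r] == 0) == (op == "BEQZ"):
                next_pc = target
        else:
            raise ValueError(f"unsupported opcode: {op}")
        pc = next_pc                         # store result, move on
    return regs, mem


# Element-wise addition of two 3-element arrays (addresses are made up).
program = [
    ("LW", 2, 0, 10),      # R2 = MEM[0 + (R10)]
    ("LW", 3, 0, 11),      # R3 = MEM[0 + (R11)]
    ("ADD", 1, 2, 3),      # R1 = (R2) + (R3)
    ("SW", 1, 0, 12),      # MEM[0 + (R12)] = (R1)
    ("SUBI", 4, 4, 1),     # R4 = (R4) - 1
    ("ADDI", 10, 10, 4),
    ("ADDI", 11, 11, 4),
    ("ADDI", 12, 12, 4),
    ("BNEZ", 4, 0),        # loop while (R4) != 0
]
regs = [0] * 32
regs[4], regs[10], regs[11], regs[12] = 3, 0, 100, 200
mem = {0: 1, 4: 2, 8: 3, 100: 10, 104: 20, 108: 30}
run(program, regs, mem)
```

After the run, `mem[200]`, `mem[204]` and `mem[208]` hold the three sums. Floating-point and jump instructions are left out to keep the sketch short.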

[Figure: datapath, built up over several slides. Instruction fetch: the PC supplies the read address of the instruction memory, which delivers Instruction[31..0]. ALU instructions: instruction bits [25..21], [20..16] and [15..11] provide the Rs, Rt and Rd addresses of the register file; Rs data and Rt data feed the ALU, MUXes select between alternative inputs such as bits [15..0], and the result returns to the register file as write data. Slide labels: ALU, Control, Memory transfer.]

[Figure: datapath for memory transfer instructions. Instruction bits [15..0] pass through a sign-extend unit and a MUX into the ALU, together with the Rs data.]

Effective address = Displacement + (Rs)
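The effective address computation can be sketched in a few lines. The function names are illustrative, and the example instruction word assumes the MIPS encoding of LW (opcode 0x23 in bits [31..26]), which the slides do not state; only bits [25..21] and [15..0] matter for the address.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of `value` as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(instruction, regs):
    """Displacement + (Rs) for a 32-bit load/store instruction word."""
    rs = (instruction >> 21) & 0x1F            # bits [25..21]: Rs address
    displacement = sign_extend16(instruction)  # bits [15..0], sign-extended
    return (regs[rs] + displacement) & 0xFFFFFFFF


regs = [0] * 32
regs[2] = 1000

# LW R1,#20(R2); the opcode value 0x23 is an assumption (MIPS encoding).
lw_plus = (0x23 << 26) | (2 << 21) | (1 << 16) | 20
# Same instruction with displacement -4, stored as 0xFFFC.
lw_minus = (0x23 << 26) | (2 << 21) | (1 << 16) | 0xFFFC

addr_plus = effective_address(lw_plus, regs)
addr_minus = effective_address(lw_minus, regs)
```

Sign extension is what lets a 16-bit field express negative displacements: `addr_plus` lies 20 bytes above `(R2)` and `addr_minus` 4 bytes below it.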

[Figure: the datapath extended with a data memory (read address, write address, write data, read data), followed on the next slide by a further MUX after the data memory.]

[Figure: the datapath extended with an adder (ADD, +4) on the PC and a further MUX.]

Fetching the next instruction

[Figure: the datapath extended for branches: the sign-extended bits [15..0] pass through a Shift Left 2 unit into a second adder (ADD), next to the +4 adder, and a MUX selects the next PC. Slide annotation: BEQ R1,R2,offset, with the ALU output ZERO.]
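A minimal sketch of the next-PC selection for BEQ, assuming the usual reading of the figure: the sign-extended offset is shifted left by 2 and added to PC + 4, and the branch is taken when the ALU reports ZERO for (Rs) - (Rt). The function names are illustrative.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of `value` as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset16):
    """(PC + 4) + (sign-extended offset shifted left by 2)."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


def next_pc_beq(pc, offset16, rs_value, rt_value):
    """Next PC for BEQ: branch target if the ALU result is zero, else PC + 4."""
    zero = (rs_value - rt_value) == 0      # ALU subtracts and raises ZERO
    return branch_target(pc, offset16) if zero else pc + 4


taken_forward = next_pc_beq(0x100, 3, 7, 7)        # equal: 3 words forward
taken_backward = next_pc_beq(0x100, 0xFFFE, 7, 7)  # equal: offset -2 words
not_taken = next_pc_beq(0x100, 3, 7, 8)            # not equal: fall through
```

The shift by 2 converts an offset counted in 4-byte instructions into a byte offset, so a 16-bit field reaches four times as far.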

[Figure: the complete datapath: PC, instruction memory, register file, sign extend, shift left 2, ALU, data memory, the +4 and branch adders, and the MUXes. The next slide adds the control signals.]

Control signals: RegDst, RegWrite, ALUSrc, ALUOp, PCSrc, MemToReg, MemWrite, MemRead
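How the control signals might be set per instruction class can be written as a small table. This is a sketch under assumptions: the slides only name the signals, so the 0/1 polarities follow the common textbook convention (RegDst=1 selects bits [15..11], ALUSrc=1 selects the sign-extended immediate, MemToReg=1 selects the data memory output). `Branch` is a hypothetical helper signal, `None` marks a don't-care, and the ALUOp values are descriptive strings rather than real encodings.

```python
# Assumed control settings per instruction class (textbook convention,
# not taken from the slides). None means "don't care".
CONTROL = {
    "alu": dict(RegDst=1, RegWrite=1, ALUSrc=0, ALUOp="by function",
                MemRead=0, MemWrite=0, MemToReg=0, Branch=0),
    "load": dict(RegDst=0, RegWrite=1, ALUSrc=1, ALUOp="add",
                 MemRead=1, MemWrite=0, MemToReg=1, Branch=0),
    "store": dict(RegDst=None, RegWrite=0, ALUSrc=1, ALUOp="add",
                  MemRead=0, MemWrite=1, MemToReg=None, Branch=0),
    "branch": dict(RegDst=None, RegWrite=0, ALUSrc=0, ALUOp="subtract",
                   MemRead=0, MemWrite=0, MemToReg=None, Branch=1),
}


def pc_src(instruction_class, alu_zero):
    """PCSrc = Branch AND ZERO: 1 selects the branch target, 0 selects PC + 4."""
    return int(CONTROL[instruction_class]["Branch"] and alu_zero)


beq_equal = pc_src("branch", True)      # branch with equal operands
beq_differ = pc_src("branch", False)    # branch with different operands
add_zero = pc_src("alu", True)          # ALU result zero, but not a branch
```

Only a branch whose comparison succeeds redirects the PC; a zero result from an ordinary ALU instruction leaves PCSrc at 0.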

[Figure: the complete datapath divided into five stages.]

INSTRUCTION FETCH

DECODE/OPERAND FETCH

EXECUTE

MEMORY ACCESS

WRITE BACK

Pipelined "Single-Cycle" Computer

The Conveyor Belt

[Figure: the five-stage datapath as a conveyor belt, with one instruction in each stage: LD R2,0(R10); LD R3,0(R11); ADD R1,R2,R3; SD 0(R12),R1; SUBI R4,R4,#1.]

LABEL: LD R2, 0(R10)

LD R3, 0(R11)

ADD R1,R2,R3

SD 0(R12), R1

SUBI R4,R4,#1

ADDI R10,R10,#4

ADDI R11,R11,#4

ADDI R12,R12,#4

BNEZ R4, LABEL
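The conveyor-belt picture can be sketched as a schedule for an ideal pipeline, that is, one that ignores all hazards. The stage abbreviations IF, ID, EX, MEM, WB and the function name are assumptions made for this illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]   # assumed abbreviations


def pipeline_schedule(instructions, stages=STAGES):
    """Return {cycle: {stage: instruction}} for an ideal pipeline without stalls."""
    schedule = {}
    for i, instruction in enumerate(instructions):
        for s, stage in enumerate(stages):
            # Instruction i enters the first stage in cycle i + 1
            # and advances one stage per cycle.
            schedule.setdefault(i + s + 1, {})[stage] = instruction
    return schedule


loop_body = [
    "LD R2,0(R10)",
    "LD R3,0(R11)",
    "ADD R1,R2,R3",
    "SD 0(R12),R1",
    "SUBI R4,R4,#1",
    "ADDI R10,R10,#4",
    "ADDI R11,R11,#4",
    "ADDI R12,R12,#4",
    "BNEZ R4,LABEL",
]
schedule = pipeline_schedule(loop_body)
total_cycles = max(schedule)          # n + k - 1 cycles for n instructions, k stages

for stage in STAGES:                  # contents of the pipeline in cycle 5
    print(f"{stage:>3}: {schedule[5][stage]}")
```

In cycle 5 all five stages are busy, with the first LD in the last stage and SUBI in the first. One pass over the nine instructions takes 9 + 5 - 1 cycles in this ideal case, compared with 9 x 5 if each instruction had to finish before the next one started.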

Structural hazards

Data hazards

Control hazards

[Figure: two further pipeline snapshots in which stages are left UNUSED. First: LD R3,0(R11); ADD R1,R2,R3; SD 0(R12),R1; SUBI R4,R4,#1; UNUSED. Then: ADD R1,R2,R3; SD 0(R12),R1; SUBI R4,R4,#1; UNUSED; UNUSED.]
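One way to make data hazards concrete is to list the read-after-write dependences between nearby instructions of the loop. This is a sketch only: the destination and source registers are written out by hand, the `window` parameter is a hypothetical stand-in for how close two instructions must be to interfere, and whether a given pair actually costs UNUSED slots depends on details such as forwarding that the slides do not give.

```python
# (text, destination register or None, source registers), written out by hand.
LOOP = [
    ("LD R2,0(R10)",     "R2",  ["R10"]),
    ("LD R3,0(R11)",     "R3",  ["R11"]),
    ("ADD R1,R2,R3",     "R1",  ["R2", "R3"]),
    ("SD 0(R12),R1",     None,  ["R12", "R1"]),
    ("SUBI R4,R4,#1",    "R4",  ["R4"]),
    ("ADDI R10,R10,#4",  "R10", ["R10"]),
    ("ADDI R11,R11,#4",  "R11", ["R11"]),
    ("ADDI R12,R12,#4",  "R12", ["R12"]),
    ("BNEZ R4,LABEL",    None,  ["R4"]),
]


def raw_dependences(instructions, window):
    """Pairs (producer, consumer) where the consumer reads a register that an
    instruction at most `window` positions earlier writes. Simplified: it does
    not check for another write to the same register in between."""
    pairs = []
    for j, (consumer, _, sources) in enumerate(instructions):
        for i in range(max(0, j - window), j):
            producer, destination, _ = instructions[i]
            if destination is not None and destination in sources:
                pairs.append((producer, consumer))
    return pairs


hazards = raw_dependences(LOOP, window=2)
for producer, consumer in hazards:
    print(f"{consumer} needs the result of {producer}")
```

With a window of two instructions the sketch reports three pairs: ADD depends on both loads, and SD depends on ADD.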

[Figure: three pipeline snapshots around a branch. First: BEQ R1,offset; UNUSED. Then: UNUSED; UNUSED; BEQ R1,offset. Finally: UNUSED; UNUSED; BEQ R1,offset; NEXT INST.]

Superscalar Microprocessors

IF ID IS EX WB

IF: Instruction Fetch

ID: Instruction Decode/Dispatch

IS: Instruction Issue/Read Operands

EX: Execute

WB: Write Back

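[Figure: superscalar pipeline with IF and ID stages, issue queues IQ 1, IQ 2, ..., IQ N, execution units EX 1, EX 2, ..., EX N, and a WB stage. Stage labels: Fetch, Decode/Dispatch, Read Operands, Execute, Write Back.]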
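Dispatching into separate issue queues can be sketched as a classification by opcode. The queue names I-Q, FP-Q and M-Q follow the later slides; the opcode sets and the choice to send branches to the integer queue are assumptions made for illustration.

```python
from collections import deque

# Assumed opcode classes; anything not listed goes to the integer queue.
MEMORY_OPCODES = {"LB", "LH", "LW", "LD", "SB", "SH", "SW", "SD",
                  "L.S", "L.D", "S.S", "S.D"}
FP_OPCODES = {"ADD.S", "ADD.D", "SUB.D", "MUL.D", "DIV.D"}


def queue_for(instruction):
    """Name of the issue queue an instruction is dispatched to."""
    opcode = instruction.split()[0]
    if opcode in MEMORY_OPCODES:
        return "M-Q"
    if opcode in FP_OPCODES:
        return "FP-Q"
    return "I-Q"


loop = [
    "L.S F0,0(R1)",
    "L.S F1,0(R2)",
    "ADD.S F2,F1,F0",
    "S.S F2,0(R1)",
    "ADDI R1,R1,#4",
    "ADDI R2,R2,#4",
    "SUBI R3,R3,#1",
    "BNEZ R3,Loop",
]
queues = {"I-Q": deque(), "FP-Q": deque(), "M-Q": deque()}
for instruction in loop:              # dispatch in program order
    queues[queue_for(instruction)].append(instruction)
```

Because each queue issues on its own, an instruction at the head of one queue need not wait for an earlier instruction stuck in another queue, which is what allows execution out of program order.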

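[Figure: superscalar pipeline with IF and ID stages, three issue queues (I-Q, FP-Q, M-Q) feeding the INT, FP and MEM execution units, and a WB stage. The slides step through the loop below cycle by cycle; the instructions in flight are listed for each cycle.]

Loop: L.S F0,0(R1)

L.S F1,0(R2)

ADD.S F2,F1,F0

S.S F2,0(R1)

ADDI R1,R1,#4

ADDI R2,R2,#4

SUBI R3,R3,#1

BNEZ R3,Loop

Cycle 1: L.S F0,0(R1)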

Cycle 2: L.S F0,0(R1); L.S F1,0(R2)

Cycle 3: L.S F0,0(R1); L.S F1,0(R2); ADD.S F2,F1,F0

Cycle 4: L.S F0,0(R1); L.S F1,0(R2); ADD.S F2,F1,F0; S.S F2,0(R1)

Cycle 5: L.S F0,0(R1); L.S F1,0(R2); ADD.S F2,F1,F0; S.S F2,0(R1); ADDI R1,R1,#4

Cycle 6: L.S F1,0(R2); ADD.S F2,F1,F0; S.S F2,0(R1); ADDI R1,R1,#4; ADDI R2,R2,#4

Cycle 7: ADD.S F2,F1,F0; S.S F2,0(R1); ADDI R1,R1,#4; ADDI R2,R2,#4; SUBI R3,R3,#1

Cycle 8: ADD.S F2,F1,F0; S.S F2,0(R1); ADDI R1,R1,#4; ADDI R2,R2,#4; SUBI R3,R3,#1; BNEZ R3,Loop

The ADDI R1,R1,#4 is being executed out of program order with respect to the S.S F2,0(R1)!
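Why this reordering deserves the exclamation mark can be shown with a small check: the store reads R1 for its address and the later ADDI overwrites R1, a situation commonly called a write-after-read (WAR) dependence. The helper name and the sample value of R1 are illustrative assumptions.

```python
def registers_at_risk(earlier_reads, later_writes):
    """Registers that an earlier instruction reads and a later one overwrites."""
    return sorted(set(earlier_reads) & set(later_writes))


store_reads = {"F2", "R1"}    # S.S F2,0(R1): data register and address register
addi_writes = {"R1"}          # ADDI R1,R1,#4
at_risk = registers_at_risk(store_reads, addi_writes)

# Store address with and without the reordering, for an assumed (R1) = 100.
r1 = 100
address_in_order = 0 + r1             # store reads R1 before ADDI writes it
address_if_overtaken = 0 + (r1 + 4)   # store would see the updated R1
```

To keep the program's meaning, the store must still use the value R1 had before the ADDI, so the hardware has to preserve that old value in some way when it lets the ADDI run ahead.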

Cycle 9: ADD.S F2,F1,F0; S.S F2,0(R1); ADDI R1,R1,#4; ADDI R2,R2,#4; SUBI R3,R3,#1; BNEZ R3,Loop; L.S F0,0(R1) (next iteration)

Cycle 10: ADD.S F2,F1,F0; S.S F2,0(R1); ADDI R2,R2,#4; SUBI R3,R3,#1; BNEZ R3,Loop; L.S F0,0(R1) and L.S F1,0(R2) (next iteration)

Cycle 11: ADD.S F2,F1,F0; S.S F2,0(R1); SUBI R3,R3,#1; BNEZ R3,Loop; L.S F0,0(R1) and L.S F1,0(R2) (next iteration)

Cycle 12: ADD.S F2,F1,F0; S.S F2,0(R1); BNEZ R3,Loop; L.S F0,0(R1), L.S F1,0(R2) and ADD.S F2,F1,F0 (next iteration)

Multicore Computers

[Figure: microprocessor chip with three processor cores, three L1 caches and an L2 cache, connected to memory.]

Blended Learning

Lectures on YouTube

9 interactive sessions, flipped classroom style

5 problem solution sessions

1 design exploration project

Textbook: Parallel Computer Organization and Design (Dubois, Annavaram, Stenström)