CS352H: Computer Systems Architecture

37
University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell CS352H: Computer Systems Architecture Topic 8: MIPS Pipelined Implementation September 29, 2009

description

CS352H: Computer Systems Architecture. Topic 8: MIPS Pipelined Implementation September 29, 2009. MIPS Pipeline. Five stages, one step per stage IF: Instruction fetch from memory ID: Instruction decode & register read EX: Execute operation or calculate address MEM: Access memory operand - PowerPoint PPT Presentation

Transcript of CS352H: Computer Systems Architecture

Page 1: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell

CS352H: Computer Systems Architecture

Topic 8: MIPS Pipelined Implementation

September 29, 2009

Page 2: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 2

MIPS Pipeline

Five stages, one step per stageIF: Instruction fetch from memory

ID: Instruction decode & register read

EX: Execute operation or calculate address

MEM: Access memory operand

WB: Write result back to register

Page 3: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 3

Pipeline Performance

Assume time for stages is100ps for register read or write

200ps for other stages

Compare pipelined datapath with single-cycle datapath

Instr Instr fetch Register read

ALU op Memory access

Register write

Total time

lw 200ps 100 ps 200ps 200ps 100 ps 800ps

sw 200ps 100 ps 200ps 200ps 700ps

R-format 200ps 100 ps 200ps 100 ps 600ps

beq 200ps 100 ps 200ps 500ps

j 200ps 200ps

Page 4: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 4

Pipeline Performance

Single-cycle (Tc= 800ps)

Pipelined (Tc= 200ps)

Page 5: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 5

Pipeline Speedup

If all stages are balancedi.e., all take the same time

Time between instructionspipelined

= Time between instructionsnonpipelined

Number of stages

If not balanced, speedup is less

Speedup due to increased throughputLatency (time for each instruction) does not decrease

Page 6: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 6

Pipelining and ISA Design

MIPS ISA designed for pipeliningAll instructions are 32-bits

Easier to fetch and decode in one cyclec.f. x86: 1- to 17-byte instructions

Few and regular instruction formatsCan decode and read registers in one step

Load/store addressingCan calculate address in 3rd stage, access memory in 4th stage

Alignment of memory operandsMemory access takes only one cycle

Page 7: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 7

Hazards

Situations that prevent starting the next instruction in the next cycleStructure hazards

A required resource is busy

Data hazardNeed to wait for previous instruction to complete its data read/write

Control hazardDeciding on control action depends on previous instruction

Page 8: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 8

Structure Hazards

Conflict for use of a resource

In MIPS pipeline with a single memoryLoad/store requires data access

Instruction fetch would have to stall for that cycleWould cause a pipeline “bubble”

Hence, pipelined datapaths require separate instruction/data memories

Or separate instruction/data caches

Page 9: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 9

Data Hazards

An instruction depends on completion of data access by a previous instruction

add $s0, $t0, $t1sub $t2, $s0, $t3

Page 10: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 10

Forwarding (aka Bypassing)

Use result when it is computedDon’t wait for it to be stored in a register

Requires extra connections in the datapath

Page 11: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 11

Load-Use Data Hazard

Can’t always avoid stalls by forwardingIf value not computed when needed

Can’t forward backward in time!

Page 12: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 12

Code Scheduling to Avoid Stalls

Reorder code to avoid use of load result in the next instruction

C code for A = B + E; C = B + F;

lw $t1, 0($t0)lw $t2, 4($t0)add $t3, $t1, $t2sw $t3, 12($t0)lw $t4, 8($t0)add $t5, $t1, $t4sw $t5, 16($t0)

stall

stall

lw $t1, 0($t0)lw $t2, 4($t0)lw $t4, 8($t0)add $t3, $t1, $t2sw $t3, 12($t0)add $t5, $t1, $t4sw $t5, 16($t0)

11 cycles13 cycles

Page 13: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 13

Control Hazards

Branch determines flow of controlFetching next instruction depends on branch outcomePipeline can’t always fetch correct instruction

Still working on ID stage of branch

In MIPS pipelineNeed to compare registers and compute target early in the pipelineAdd hardware to do it in ID stage

Page 14: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 14

Stall on Branch

Wait until branch outcome determined before fetching next instruction

Page 15: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 15

Branch Prediction

Longer pipelines can’t readily determine branch outcome early

Stall penalty becomes unacceptable

Predict outcome of branchOnly stall if prediction is wrong

In MIPS pipelineCan predict branches not taken

Fetch instruction after branch, with no delay

Page 16: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 16

MIPS with Predict Not Taken

Prediction correct

Prediction incorrect

Page 17: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 17

More-Realistic Branch Prediction

Static branch predictionBased on typical branch behavior

Example: loop and if-statement branchesPredict backward branches taken

Predict forward branches not taken

Dynamic branch predictionHardware measures actual branch behavior

e.g., record recent history of each branch

Assume future behavior will continue the trendWhen wrong, stall while re-fetching, and update history

Page 18: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 18

Pipeline Summary

Pipelining improves performance by increasing instruction throughput

Executes multiple instructions in parallel

Each instruction has the same latency

Subject to hazardsStructure, data, control

Instruction set design affects complexity of pipeline implementation

Page 19: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 19

MIPS Pipelined Datapath

WB

MEM

Right-to-left flow leads to hazards

Page 20: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 20

Pipeline registers

Need registers between stagesTo hold information produced in previous cycle

Page 21: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 21

Pipeline Operation

Cycle-by-cycle flow of instructions through the pipelined datapath

“Single-clock-cycle” pipeline diagramShows pipeline usage in a single cycle

Highlight resources used

c.f. “multi-clock-cycle” diagramGraph of operation over time

We’ll look at “single-clock-cycle” diagrams for load & store

Page 22: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 22

IF for Load, Store, …

Page 23: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 23

ID for Load, Store, …

Page 24: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 24

EX for Load

Page 25: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 25

MEM for Load

Page 26: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 26

WB for Load

Wrongregisternumber

Page 27: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 27

Corrected Datapath for Load

Page 28: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 28

EX for Store

Page 29: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 29

MEM for Store

Page 30: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 30

WB for Store

Page 31: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 31

Multi-Cycle Pipeline Diagram

Form showing resource usage

Page 32: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 32

Multi-Cycle Pipeline Diagram

Traditional form

Page 33: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 33

Single-Cycle Pipeline Diagram

State of pipeline in a given cycle

Page 34: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 34

Pipelined Control (Simplified)

Page 35: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 35

Pipelined Control

Control signals derived from instructionAs in single-cycle implementation

Page 36: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 36

Pipelined Control

Page 37: CS352H: Computer Systems Architecture

University of Texas at Austin CS352H - Computer Systems Architecture Fall 2009 Don Fussell 37

Concluding Remarks

ISA influences design of datapath and control

Datapath and control influence design of ISA

Pipelining improves instruction throughputusing parallelism

More instructions completed per second

Latency for each instruction not reduced

Hazards: structural, data, control