Computer Architecture

Computer Architecture CS423: Lecture 12 Dynamic Scheduling Jahangir Ikram


Page 1: Computer Architecture

Computer Architecture

CS423: Lecture 12

Dynamic Scheduling

Jahangir Ikram

Page 2: Computer Architecture

COMPARISON

FP PIPELINE VS SCOREBOARD

Page 3: Computer Architecture

Revision

Page 4: Computer Architecture

Multiple Cycle Floating Point Pipeline

[Figure: pipeline with stages IF, ID, one of four execution paths, then Mem, WB]
Execution paths: EX; A1, A2, A3, A4; M1, M2, ..., M7; Divide

Function Unit      Latency   Initiation/Re-Issue Interval
Integer ALU        0         1
Load/Store         1         1
FP Add             3         1
FP/Int Multiply    6         1
FP/Int Divide      24        25
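The two columns of the table can be read as scheduling constraints. The sketch below assumes the usual textbook reading, which the slide itself does not state: latency is the number of stall cycles between a producer and a dependent instruction, and the initiation interval is the minimum gap between two operations entering the same unit. The names `UNITS`, `earliest_dependent_issue` and `earliest_same_unit_issue` are illustrative only.

```python
# (latency, initiation interval) per functional unit, copied from the table.
UNITS = {
    "Integer ALU": (0, 1),
    "Load/Store": (1, 1),
    "FP Add": (3, 1),
    "FP/Int Multiply": (6, 1),
    "FP/Int Divide": (24, 25),
}

def earliest_dependent_issue(unit, issue_cycle):
    """Earliest cycle in which a consumer of the result can issue.

    Assumption: latency counts the stall cycles between producer and consumer.
    """
    latency, _ = UNITS[unit]
    return issue_cycle + 1 + latency

def earliest_same_unit_issue(unit, issue_cycle):
    """Earliest cycle in which another operation can enter the same unit."""
    _, interval = UNITS[unit]
    return issue_cycle + interval

# A consumer of an FP add issued in cycle 1, and the next divide after one
# issued in cycle 1.
print(earliest_dependent_issue("FP Add", 1))
print(earliest_same_unit_issue("FP/Int Divide", 1))
```

Under these assumptions the divider stands out: a second divide has to wait 25 cycles, while the other units accept a new operation every cycle.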

Page 5: Computer Architecture
Page 6: Computer Architecture

Scoreboard of CDC 6600

[Figure: scoreboard pipeline with the following labelled stages]
ISSUE: check for WAW, FU
Read Operands (one per functional unit): check for RAW, read values from the Register File when free
Functional units: EX, Mem; A1, A2, A3, A4; M1, M2, ..., M7; Divide
WB: check for WAR, then write to the Register File
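The three checks named in the figure can be sketched as predicates over the list of instructions that have issued but not yet written back. This is a simplified illustration, not the CDC 6600 bookkeeping: the `Active` record, the function names and the unit names are assumptions, and `active` is assumed to be kept in issue order.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Active:
    """An instruction that has issued but not yet written its result."""
    unit: str                  # functional unit it occupies
    dest: Optional[str]        # destination register, None if there is none
    srcs: Tuple[str, ...]      # source registers
    read_done: bool = False    # True once it has read its operands

def can_issue(unit, dest, active):
    """ISSUE: the unit must be free and no active instruction may write dest (WAW)."""
    return all(a.unit != unit and (dest is None or a.dest != dest) for a in active)

def can_read_operands(i, active):
    """READ OPERANDS: no earlier active instruction still has to write a source (RAW)."""
    return all(a.dest not in active[i].srcs for a in active[:i])

def can_write_result(i, active):
    """WRITE: every earlier active reader of dest has already read it (WAR)."""
    me = active[i]
    return all(a.read_done or me.dest not in a.srcs for a in active[:i])

# DIV.D F3,F1,F2 followed by SUB.D F5,F6,F3, both still active.
active = [
    Active("divide", "F3", ("F1", "F2")),
    Active("adder", "F5", ("F6", "F3")),
]
print(can_issue("adder", "F3", active))   # adder busy and F3 has a pending writer
print(can_read_operands(1, active))       # SUB.D must wait for DIV.D to write F3
```

Only earlier instructions are inspected in the RAW and WAR checks; looking at later ones as well would make an earlier reader and a later writer wait on each other.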

Page 7: Computer Architecture

Instruction          IS  RO  EX  WR    IS  RO  EX  WR
L.D    F0,0(R2)
L.D    F4,0(R3)
MUL.D  F0,F0,F4
ADD.D  F2,F0,F2
DADDUI R3,R3,8
DADDUI R3,R3,8
DSUBU  R5,R4,R2
BNEZ   R5,Loop

Ask students to fill this in and compare it with the slide two pages back.

Page 8: Computer Architecture

Instruction          IS  RO  EX  WR    IS  RO  EX  WR
L.D    F0,0(R2)
L.D    F4,0(R3)
MUL.D  F0,F0,F4
ADD.D  F2,F0,F2
DADDUI R3,R3,8
DADDUI R3,R3,8
DSUBU  R5,R4,R2
BNEZ   R5,Loop

Page 9: Computer Architecture

Data Hazards

RAW Hazard
ADD.D F3, F1, F2
SUB.D F5, F6, F3

WAW Hazard
DIV.D F3, F1, F2
SUB.D F3, F6, F5

WAR Hazard
DIV.D F3, F1, F2
SUB.D F5, F6, F3
ADD.D F3, F6, F7

Page 10: Computer Architecture

TRUE and False Dependencies

Find the dependencies in this code:

DIV.D F0,F2,F4
ADD.D F6,F0,F8
S.D   F6,0(R1)
SUB.D F8,F10,F14
MUL.D F6,F10,F8

Page 11: Computer Architecture

1. DIV.D F0,F2,F4
2. ADD.D F6,F0,F8
3. S.D   F6,0(R1)
4. SUB.D F8,F10,F14
5. MUL.D F6,F10,F8

WAR and WAW Data Dependencies

Type         Between   Register/FU
RAW          1,2       F0
RAW          2,3       F6
RAW          4,5       F8
WAW          2,5       F6
WAR          2,4       F8
Structural   2,4       ADDER
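The register entries of the table above can be checked mechanically. The sketch below is a naive pairwise scan over the five instructions; the tuple encoding and the name `find_dependencies` are assumptions made for illustration. It ignores structural hazards and does not account for an intervening write that would end a dependence in longer code.

```python
# Each entry: (opcode, destination register or None, source registers).
# S.D writes memory, so it has no destination register; F6 and R1 are sources.
PROGRAM = [
    ("DIV.D", "F0", ("F2", "F4")),
    ("ADD.D", "F6", ("F0", "F8")),
    ("S.D",   None, ("F6", "R1")),
    ("SUB.D", "F8", ("F10", "F14")),
    ("MUL.D", "F6", ("F10", "F8")),
]

def find_dependencies(program):
    """Return (type, earlier, later, register) for every ordered pair."""
    found = []
    for i, (_, dest_i, srcs_i) in enumerate(program):
        for j in range(i + 1, len(program)):
            _, dest_j, srcs_j = program[j]
            if dest_i is not None and dest_i in srcs_j:
                found.append(("RAW", i + 1, j + 1, dest_i))  # j reads what i wrote
            if dest_j is not None and dest_j in srcs_i:
                found.append(("WAR", i + 1, j + 1, dest_j))  # j overwrites what i reads
            if dest_i is not None and dest_i == dest_j:
                found.append(("WAW", i + 1, j + 1, dest_i))  # both write the same register
    return found

for dep in find_dependencies(PROGRAM):
    print(*dep)
```

Besides the five register dependencies in the table, the scan also reports a WAR between instructions 3 and 5 on F6, because S.D reads F6 before MUL.D overwrites it.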

Page 12: Computer Architecture

Name Dependencies

WAW and WAR dependencies are also called name dependencies: they do not carry a value between two instructions

Can be removed by avoiding use of the same name: rename the destination register whenever a new value is created

Both compiler (statically) and processor (dynamically) can do that

Page 13: Computer Architecture

Register Renaming: Compiler

1. DIV.D F0,F2,F4

2. ADD.D F6,F0,F8

3. S.D F6,0(R1)

4. SUB.D F20,F10,F14

5. MUL.D F21,F10,F20

Only RAW or structural hazards are left.
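One way a compiler pass could arrive at the renamed listing is sketched below. It is an assumption-laden illustration rather than a real compiler algorithm: a destination gets a fresh name only when that name was already read or written earlier in the block, and `free_names` is a hypothetical pool of registers known to be unused (F20, F21 here, matching the slide).

```python
PROGRAM = [
    ("DIV.D", "F0", ("F2", "F4")),
    ("ADD.D", "F6", ("F0", "F8")),
    ("S.D",   None, ("F6", "R1")),     # store: no destination register
    ("SUB.D", "F8", ("F10", "F14")),
    ("MUL.D", "F6", ("F10", "F8")),
]

def rename(program, free_names):
    """Give a fresh name to every destination that reuses an earlier name."""
    free = iter(free_names)
    mapping = {}    # original name -> name currently holding its value
    seen = set()    # original names read or written so far
    out = []
    for op, dest, srcs in program:
        # Sources must follow earlier renames so the RAW dependences survive.
        new_srcs = tuple(mapping.get(s, s) for s in srcs)
        seen.update(srcs)
        new_dest = dest
        if dest is not None:
            if dest in seen:               # would create a WAR or WAW hazard
                new_dest = next(free)
                mapping[dest] = new_dest
            seen.add(dest)
        out.append((op, new_dest, new_srcs))
    return out

for op, dest, srcs in rename(PROGRAM, ["F20", "F21"]):
    print(op, dest, srcs)
```

The sketch stops with `StopIteration` if the pool runs out, and it does not copy renamed values back to their original registers at the end of the block, which a real compiler would have to handle for live-out values.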

Page 14: Computer Architecture

Dynamic Register Renaming

Use architecturally invisible registers, called rename registers, to hold new results; this avoids WAW hazards.

Read and keep a copy of each available operand at issue time; this avoids WAR hazards. The copies are stored in the reservation station.
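A minimal sketch of these two ideas follows. It assumes that reservation-station tags play the role of rename registers, and the register values, tags (`Div1`, `Add1`, `Add2`) and function names are made up for illustration. Forwarding a result to stations that wait on a tag is not modelled here.

```python
regs = {"F1": 8.0, "F2": 2.0, "F3": 0.0, "F6": 10.0, "F7": 5.0}  # made-up values
producer = {}   # register -> tag of the station that will write it last

def issue(tag, dest, srcs):
    """Build a reservation-station entry for one instruction at issue time."""
    operands = []
    for s in srcs:
        if s in producer:
            operands.append(("wait", producer[s]))   # not ready: remember the tag
        else:
            operands.append(("value", regs[s]))      # ready: keep a private copy
    producer[dest] = tag                              # dest now belongs to this tag
    return {"tag": tag, "operands": operands}

def write_back(tag, value):
    """Write the register file only where this tag is still the latest producer."""
    for reg, t in list(producer.items()):
        if t == tag:
            regs[reg] = value
            del producer[reg]

div = issue("Div1", "F3", ("F1", "F2"))   # DIV.D F3,F1,F2
sub = issue("Add1", "F5", ("F6", "F3"))   # SUB.D F5,F6,F3
add = issue("Add2", "F3", ("F6", "F7"))   # ADD.D F3,F6,F7

write_back("Add2", 15.0)   # ADD.D finishes first and writes F3
write_back("Div1", 4.0)    # the late DIV.D result no longer owns F3 (no WAW)

print(sub["operands"])     # SUB.D waits on Div1, not on register F3 (no WAR)
print(regs["F3"])
```

Because SUB.D holds a tag or a copied value rather than a register name, the early write of F3 by ADD.D cannot disturb it.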

Page 15: Computer Architecture

Tomasulo’s Algorithm

[Figure: Tomasulo pipeline with the following labelled parts]
ISSUE / Rename to RS, with the checks "Check for RAW" and "Check for RS"
Reservation stations for Integer, LD/ST, FP ADD, multiply and Divide, each labelled "Wait for Operands"
Execution: EX; TAC, MemAccess; A1, A2, A3, A4; M1, M2, ..., M7; Divide
CDB carrying Tag and DATA back to the reservation stations and the Register File

Page 16: Computer Architecture

MIPS FP Unit Using Tomasulo’s Algorithm

[Figure: block diagram with the following labelled parts]
From Instruction Unit; Instruction Queue; FP registers
Load / Store Unit: Address unit, Store Buffers, Address and Data lines to the Memory unit
FP Operations; Operand Busses
Reservation Stations in front of the FP Adders and FP multipliers
Common Data Bus (CDB)

Page 17: Computer Architecture

Structure of Reservation Station

Busy, OpCode

Qj, Qk: like the scoreboard

Vj, Vk: contain the values of the two operands. Vj is valid when Qj is zero, and Vk is valid when Qk is zero

A: holds the target address (TA) or the Imm value

Registers have a Qi field as before
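The fields listed above map naturally onto a small record. The sketch below is an illustration under stated assumptions: station numbers start at 1 so that 0 in Qj or Qk can mean "value present", as on the slide, and the methods `ready` and `capture` are hypothetical helpers, not part of the slide.

```python
from dataclasses import dataclass

@dataclass
class ReservationStation:
    number: int          # station number that other stations put in Qj/Qk
    busy: bool = False
    op: str = ""         # OpCode
    vj: float = 0.0      # value of the first operand, valid when qj == 0
    vk: float = 0.0      # value of the second operand, valid when qk == 0
    qj: int = 0          # station that will produce Vj, 0 if Vj is valid
    qk: int = 0          # station that will produce Vk, 0 if Vk is valid
    a: int = 0           # target address or immediate value

    def ready(self):
        """The operation can start once both operands are valid."""
        return self.busy and self.qj == 0 and self.qk == 0

    def capture(self, number, value):
        """Take a result that station `number` broadcasts on the CDB."""
        if self.qj == number:
            self.vj, self.qj = value, 0
        if self.qk == number:
            self.vk, self.qk = value, 0

# MUL.D F0,F2,F4 where F2 is still being loaded by station 2 (values made up).
mul = ReservationStation(number=3, busy=True, op="MUL.D", vk=2.5, qj=2)
print(mul.ready())       # still waiting for station 2
mul.capture(2, 4.0)      # station 2 broadcasts its result
print(mul.ready(), mul.vj)
```

A broadcast from any other station leaves the entry unchanged, which is why each waiting operand needs only the number of its own producer.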

Page 18: Computer Architecture

Tomasulo’s Example

Instruction status

Instruction   j     k     Issue   Execution complete   Write Result
L.D    F6     34+   R2    1       3                    4
L.D    F2     45+   R3    2       4                    5
MUL.D  F0     F2    F4    3       15                   16
SUB.D  F8     F6    F2    4       7                    8
DIV.D  F10    F0    F6    5       55                   57
ADD.D  F6     F8    F2    6       10                   11