Out of Order (OoO) Execution
![Page 1: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/1.jpg)
EE457
Out of Order (OoO) Execution
Introduction to Dynamic Scheduling of Instructions (The Tomasulo Algorithm)
By Gandhi Puvvada
![Page 2: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/2.jpg)
References
• EE557 Textbook
• Prof. Dubois’ EE557 Classnotes
• Prof. Annavaram’s slides
• Prof. Patterson’s Lecture slides
![Page 3: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/3.jpg)
Programs often have several small fragments of code, which can be executed in any order.
![Page 4: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/4.jpg)
OoO (Out of Order) execution
Io = In order
"Execution" here means producing the results. Completion means committing results (writing into register file or memory).
IoI (IoD) OoE IoC: In order Issue/Dispatch, Out of order Execution, and finally In order completion/commitment
![Page 5: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/5.jpg)
IoC or OoC?
IoI (IoD) OoE IoC: IoC (In order completion) is necessary to support exceptions (ex: page fault).
Here we present first IoI (IoD) OoE OoC and then (at the end) IoI (IoD) OoE IoC
![Page 6: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/6.jpg)
OoC? But branches ..
OoC? Hope you are not executing instructions beyond a branch and committing them!
Well, we dispatch a branch, suspend dispatching, and wait until the branch is resolved. Then we resume dispatching instructions beyond the branch at either the fall-through area or the target area.
![Page 7: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/7.jpg)
Instruction Scheduling (Re-ordering of instructions)
• Basic block = a straight-line code sequence with no branches.
• Compiler can perform static instruction scheduling.
• Tomasulo Algorithm lets us schedule instructions dynamically (in hardware).
• Branch prediction and speculative execution beyond a branch (of course with ability to flush wrong-path instructions on misprediction) will be covered later (and implemented on FPGA in EE560).
![Page 8: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/8.jpg)
Register renaming to allow later instructions to proceed
Before renaming:
lw $8, 40($2);
add $8, $8, $8;
sw $8, 40($2);
lw $8, 60($3);
add $8, $8, $8;
sw $8, 60($3);
After renaming:
lw $8, 40($2);
add $8, $8, $8;
sw $8, 40($2);
lw $48, 60($3);
add $48, $48, $48;
sw $48, 60($3);
![Page 9: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/9.jpg)
Static Scheduling (based on Prof. Dubois slide)
• Strengths
-- Hardware simplicity
-- Compiler has a global view of the code (does not help the hardware much)
• Weaknesses
-- can not be CPU-implementation specific
-- can not foresee dynamic events
   -- cache misses
   -- data-dependent delays
   -- conditional branches
      can only reschedule instructions in a basic block (basic block = a straight-line code sequence with no branches)
-- can not pre-compute memory addresses
![Page 10: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/10.jpg)
![Page 11: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/11.jpg)
Simple 5-stage pipeline
In-order execution
RAW dependency: Solve it by forwarding; if not, by stalling
Dependent instructions are stalled in the ID stage
[Figure: pipeline stages IF ID EX M WB, with IM and DM]
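The stall decision described above can be sketched in a few lines of Python. This is only an illustrative model, not the course's hardware: the `Instr` record, the function name `must_stall_in_id`, and the example instructions are assumptions made for this sketch. It assumes that forwarding covers ALU results, so the only case left for stalling is a dependent instruction sitting in ID right behind a `lw` in EX.

```python
from typing import NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    op: str                 # mnemonic, e.g. "lw", "add", "and"
    dest: Optional[int]     # register written, or None
    srcs: Tuple[int, ...]   # registers read

def must_stall_in_id(in_id: Instr, in_ex: Optional[Instr]) -> bool:
    """Return True when the instruction in ID has to wait one cycle.

    Assumption of this sketch: ALU results are forwarded, so only a
    lw in EX followed by a dependent instruction in ID needs a stall
    (the loaded value does not exist until the M stage is done).
    """
    if in_ex is None or in_ex.op != "lw":
        return False
    # $0 is hardwired to zero in MIPS, so it never creates a dependency.
    return in_ex.dest is not None and in_ex.dest != 0 and in_ex.dest in in_id.srcs

lw = Instr("lw", 8, (2,))            # lw  $8, 40($2)
dep = Instr("and", 9, (8, 7))        # and $9, $8, $7   (reads $8)
indep = Instr("add", 10, (5, 6))     # add $10, $5, $6  (independent)
alu = Instr("add", 8, (5, 6))        # ALU producer of $8

print(must_stall_in_id(dep, lw))     # dependent on the load
print(must_stall_in_id(indep, lw))   # independent of the load
print(must_stall_in_id(dep, alu))    # producer is an ALU op, forwarding suffices
```

The check is made in ID because that is the last stage where the dependent instruction can still be held back without disturbing the instruction ahead of it.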
![Page 12: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/12.jpg)
Simple 5-stage pipeline: Dependent instructions are stalled in the ID stage
[Figure: pipeline holding an and instruction and a lw instruction]
![Page 13: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/13.jpg)
Simple 5-stage pipeline: Dependent instructions can not be stalled in the EX stage. Why?
[Figure: pipeline holding an and instruction and a lw instruction]
![Page 14: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/14.jpg)
Provide multiple functional units
(for simplicity, we avoid talking about floating point execution unit and floating point register file)
Stall, after decoding, in queues
[Figure: IM, IF, ID, then queues and functional units (Integer, Multiply, Divide, Load/Store with DM), then WB]
![Page 15: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/15.jpg)
Why do junior instructions carry their source register IDs into the EX stage? Well, they need to get help from Senior #1 or Senior #2 in the EX stage under the control of the FU.
No more of that. There may be 40 seniors in front of you. So I, the dispatch unit, will tell you from which senior you need to get help for which source register.
[Figure annotation: rs, rt (IDs) are carried into EX]
![Page 16: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/16.jpg)
Tomasulo’s plan
• OoO: Out of order execution
• Multiple functional units (say, Integer, DM, Multiplier, Divider)
• Queues between ID and EX stages (in place of ID/EX register)
![Page 17: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/17.jpg)
Out of order execution ?! Problems all over ??!!
• For the time being, no branch prediction and no speculative execution beyond branches; just stall on a conditional branch
• No support for precise exceptions for the time being
Even then, …
![Page 18: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/18.jpg)
RAW, WAR, and WAW
RAW = Read After Write
lw $8, 40($2);
add $9, $8, $7;
WAR = Write after Read
add $9, $8, $6;
lw $8, 40($2);
WAW = Write after Write
add $9, $8, $6;
lw $9, 40($2);
WAW? How is it possible? Why would anyone produce some result in $9 and, without utilizing that result, overwrite it with another result?
How is it possible? Consider a printer or a FIFO
![Page 19: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/19.jpg)
WAW can easily occur!
WAW? How is it possible?
In out of order execution, instructions before the branch and instructions after the branch can co-exist.
For example, multiple iterations of this loop can coexist in the execution area. So, what?
Loop: LW $2, 40($1);
MULT $4, $2, $3;
SW $4, 40($1);
ADDI $1, $1, -4;
BNE $1, $0, Loop;
![Page 20: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/20.jpg)
Say a company gives a standard bonus to most of the employees and a higher bonus to the managers.
So you load into $3 the standard bonus from the stdbonus location in memory. Then you check to see if it is the case of a manager, and if so load into $3 again (overwriting the earlier $3) the special bonus from the special location in memory.
LW $3, stdbonus($0)
BNE $1, $2, SKIP
LW $3, special($0)
![Page 21: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/21.jpg)
RAW, WAR, and WAW (some terminology to remember)
RAW = Read After Write
lw $8, 40($2);
add $9, $8, $7;
RAW: A true dependency
WAR = Write after Read
add $9, $8, $6;
lw $8, 40($2);
WAR: An anti-dependency
WAW = Write after Write
add $9, $8, $6;
lw $9, 40($2);
WAW: An output dependency
WAR and WAW are Name Dependences.
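The three definitions can be checked mechanically. The following Python sketch classifies the dependencies between two instructions given in program order; the `(dest, srcs)` tuple encoding and the function name `classify` are assumptions made for illustration, not part of the slides.

```python
def classify(first, second):
    """Classify register dependencies of `second` on the earlier `first`.

    Each instruction is a (dest, srcs) pair: dest is the register number
    written (or None), srcs is the set of register numbers read.
    """
    d1, s1 = first
    d2, s2 = second
    found = set()
    if d1 is not None and d1 in s2:
        found.add("RAW")   # true dependency: second reads what first writes
    if d2 is not None and d2 in s1:
        found.add("WAR")   # anti-dependency: second overwrites what first reads
    if d1 is not None and d1 == d2:
        found.add("WAW")   # output dependency: both write the same register
    return found

# lw $8, 40($2); add $9, $8, $7;
print(classify((8, {2}), (9, {8, 7})))   # {'RAW'}
# add $9, $8, $6; lw $8, 40($2);
print(classify((9, {8, 6}), (8, {2})))   # {'WAR'}
# add $9, $8, $6; lw $9, 40($2);
print(classify((9, {8, 6}), (9, {2})))   # {'WAW'}
```

Only the RAW case carries data from one instruction to the other; WAR and WAW arise purely from reusing a register name.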
![Page 22: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/22.jpg)
RAW, WAR, and WAW
• In-order execution: We need to deal with RAW only.
• Out of order execution: Now we need to deal with WAR and WAW besides RAW.
![Page 23: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/23.jpg)
![Page 24: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/24.jpg)
Limited Architectural Registers
More Physical Registers
Register Renaming
lw $8, 40($2);
add $8, $8, $8;
sw $8, 40($2);
lw $8, 60($3);
add $8, $8, $8;
sw $8, 60($3);
It is clear that the compiler is using $8 as a temporary register.
If there is a delay in obtaining $2, the first part of the code can not proceed.
Unfortunately, the second part of the code can not proceed because of name dependency for $8.
![Page 25: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/25.jpg)
If we had 64 registers instead of 32 registers, then perhaps the compiler might have used $48 instead of $8 and we could have executed the second part of the code before the first part!
lw $8, 40($2);
add $8, $8, $8;
sw $8, 40($2);
lw $48, 60($3);
add $48, $48, $48;
sw $48, 60($3);
This is an example of name dependency.
![Page 26: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/26.jpg)
Four different temporary registers can be used here as shown: $8, $18, $28, and $48 (or called with coded names, LION, TIGER, CAT, and ANT).
With register numbers:
lw $8, 40($2);
add $18, $8, $8;
sw $18, 40($2);
lw $28, 60($3);
add $48, $28, $28;
sw $48, 60($3);
With coded names:
lw LION, 40($2);
add TIGER, LION, LION;
sw TIGER, 40($2);
lw CAT, 60($3);
add ANT, CAT, CAT;
sw ANT, 60($3);
![Page 27: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/27.jpg)
Can a later implementation provide 64 registers (instead of 32) while maintaining binary compatibility with previously compiled codes?
Answer: Yes / No
Why?
![Page 28: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/28.jpg)
Answer: Can not change the number of Architectural Registers
Register Renaming Through Tagging Registers
This solves name dependency problems (WAR and WAW) while attending to true dependency (RAW) through waiting in queues.
![Page 29: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/29.jpg)
[Figure: RST and RF side by side, each with entries $1 through $31]
square_root $2, $10;
lw $8, 40($2);
add $8, $8, $8;
sw $8, 40($2);
lw $8, 60($3);
add $8, $8, $8;
sw $8, 60($3);
RST = Register Status Table
RF = Register File
[Figure annotations: dependent destination, dependent source]
![Page 30: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/30.jpg)
![Page 31: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/31.jpg)
![Page 32: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/32.jpg)
![Page 33: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/33.jpg)
square_root $2, $10;
lw $8, 40($2);
add $8, $8, $8;
sw $8, 40($2);
lw $8, 60($3);
add $8, $8, $8;
sw $8, 60($3);
Dispatch unit decodes and dispatches instructions.
For destination operand, an instruction carries a TAG (but not the actual register name)!
For source operands, an instruction carries either the values or TAGs of the operands (but not the actual register names)!
![Page 34: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/34.jpg)
Register Renaming
![Page 35: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/35.jpg)
TAGs for destinations or sources or for both?
• A new tag is assigned to the destination register of the instruction being dispatched.
• For each of the source registers (source operands) of the instruction being dispatched, either the value of the source register (if it has not been previously tagged) or the existing tag associated with the source register (if it has been tagged already) is conveyed to the instruction.
• If a tag is conveyed for a source, then the instruction needs to wait for the original instruction with that destination tag to go on to the CDB and announce the value.
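The tagging rules above can be modeled with a small Python sketch. It is a simplification under stated assumptions, not the EE457 design itself: the class name `Dispatcher`, its method names, the plain list used as the free-tag pool, and returning a tag to the pool as soon as its value appears on the CDB are all choices made for illustration.

```python
class Dispatcher:
    """Toy model of the dispatch unit with an RST, an RF, and free tags."""

    def __init__(self, num_regs=32, num_tags=64):
        self.rf = [0] * num_regs        # RF: committed register values
        self.rst = [None] * num_regs    # RST: pending tag per register, or None
        self.free_tags = list(range(num_tags))

    def dispatch(self, dest, srcs):
        """Return (dest_tag, operands) for one instruction.

        Sources are looked up BEFORE the destination is re-tagged, so
        add $8, $8, $8 picks up the tag of the previous producer of $8.
        """
        operands = []
        for r in srcs:
            tag = self.rst[r]
            operands.append(("tag", tag) if tag is not None else ("val", self.rf[r]))
        dest_tag = None
        if dest is not None:
            # A real dispatch unit would stall when no tag is free;
            # this sketch simply raises IndexError in that case.
            dest_tag = self.free_tags.pop(0)
            self.rst[dest] = dest_tag
        return dest_tag, operands

    def cdb_broadcast(self, tag, value):
        """A finished instruction announces (tag, value) on the CDB."""
        for r, pending in enumerate(self.rst):
            if pending == tag:          # write RF only if this is the latest producer
                self.rf[r] = value
                self.rst[r] = None
        self.free_tags.append(tag)      # assumption: the tag is reusable right away

d = Dispatcher()
d.rf[2], d.rf[3] = 100, 200
i1 = d.dispatch(8, [2])        # lw  $8, 40($2)
i2 = d.dispatch(8, [8, 8])     # add $8, $8, $8
i3 = d.dispatch(None, [8, 2])  # sw  $8, 40($2)
i4 = d.dispatch(8, [3])        # lw  $8, 60($3)
print(i1, i2, i3, i4)
```

In this model the second `lw` receives its own tag and a ready operand, so it does not have to wait for the first group even though both groups name $8. A value broadcast for an older tag of $8 is not written into the RF because the RST has already moved on to the newer tag, which is how the WAW case is kept harmless.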
![Page 36: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/36.jpg)
Unique TAG
• Like SSN, we need a unique TAG
• SSNs are reused.
• Similarly TAGs can be reused.
• TAGs are similar to the number TOKENs.
![Page 37: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/37.jpg)
Take a number vs. Take a token
Take a number: Helps to create a Virtual Queue. We do not need that here!
Take a token: In State Bank of India, the cashier issues brass tokens to customers trying to draw money as an identification (and not at all to put them in any virtual queue). Token numbers are in random order. The cashier verifies the signature in the records room, returns with the money, calls the token number, and issues the money. Tokens are reclaimed and reused.
![Page 38: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/38.jpg)
TAGs (= Tokens)
• How many Tokens should the bank cashier have to start with?
• What happens if the tokens run out?
• Does he need to have any order in holding tokens and issuing tokens?
• Does he have to collect tokens back?
![Page 39: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/39.jpg)
![Page 40: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/40.jpg)
TAG FIFO (FIFOs are taught in EE560)
• To issue and collect Tokens (TAGs), use a circular FIFO (First-in-First-Out) unit. While the FIFO-order is not important here, a FIFO is the easiest to implement in hardware compared to a random order in a pile.
• Filled with (say) 64 tokens (in any order) initially on reset.
• Tokens return out of order anyway.
• Put tokens back in the FIFO and issue.
[Figure: circular FIFO with locations 0 to 63, write pointer wp and read pointer rp, shown in three states: Full; 2 tokens issued; 1 token returned]
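A circular FIFO of tags can be sketched in Python as below. The class name `TagFifo`, the method names, and the use of an occupancy counter to tell full from empty are assumptions of this sketch (hardware FIFOs often use an extra pointer bit instead); only the `wp`/`rp` naming and the depth of 64 come from the slide.

```python
class TagFifo:
    """Circular FIFO that hands out and reclaims tags (tokens)."""

    def __init__(self, depth=64):
        self.depth = depth
        self.slots = list(range(depth))  # filled with all tokens on reset
        self.rp = 0                      # read pointer: next token to issue
        self.wp = 0                      # write pointer: next slot for a returned token
        self.count = depth               # number of tokens currently in the FIFO

    def issue(self):
        """Take a token out; None means dispatch must stall."""
        if self.count == 0:
            return None
        tag = self.slots[self.rp]
        self.rp = (self.rp + 1) % self.depth
        self.count -= 1
        return tag

    def give_back(self, tag):
        """Return a token; the order of return does not matter."""
        if self.count == self.depth:
            raise RuntimeError("FIFO is already full")
        self.slots[self.wp] = tag
        self.wp = (self.wp + 1) % self.depth
        self.count += 1

fifo = TagFifo()          # state "Full": rp == wp == 0
a = fifo.issue()          # state "2 tokens issued": rp == 2 after both issues
b = fifo.issue()
fifo.give_back(b)         # state "1 token returned": wp == 1
print(a, b, fifo.rp, fifo.wp, fifo.count)
```

Because a returned token simply lands in the next free slot, tokens coming back out of order end up stored in a different order than on reset, which is harmless since only uniqueness matters.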
![Page 41: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/41.jpg)
Simplified Block Diagram for EE457, provided by Prof. Dubois
[Figure: block diagram including the TAG FIFO, the Integer, Multiplier, and Divider units, and the Issue Unit]
CDB = Common Data Bus (compare it to a Public Announcing System)
![Page 42: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/42.jpg)
IoI-OoE-OoC with RST: Block Diagram
Adapted from Prof. Michel Dubois (Simplified for EE 457)
[Figure: blocks shown are I-Cache, Instruc. Queue, Dispatch, Register Status Table, Reg. File, TAG FIFO, Int. Queue, L/S Queue, Div Queue, Mult. Queue, Issue Unit, Integer / Branch, D-Cache, Load Buffer, Div, Mul, and CDB]
![Page 43: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/43.jpg)
Front-End & Back-End
• IFQ: Instruction Fetch Queue (a FIFO structure)
• Dispatch unit (including RST, RF, Tag FIFO)
• Load Store and other Issue Queues
• Issue Unit
• Functional units
• CDB (Common Data Bus)
![Page 44: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/44.jpg)
![Page 45: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/45.jpg)
Bottleneck in the design
• CDB = Common Data Bus
Do all instructions use CDB?
• sw ?
• j (jump)?
• beq
![Page 46: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/46.jpg)
load store queue
• Address calculation
• Memory disambiguation
Mr. Bruin: Let me take a guess! You will now propose to have a MST (Memory Status Table) (like the RST). And you will rename memory locations to solve WAW and WAR problems among memory locations, right?!
![Page 47: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/47.jpg)
MST (Memory Status Table)? No way! It is too big!
We will just ask the junior to stall and wait to solve his WAR and WAW problems with his seniors.
[Figure: a hypothetical MST alongside Memory (locations 0, 1, ...), compared with the RST alongside the RF ($1 to $31)]
![Page 48: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/48.jpg)
Address calculation for lw and sw
EE557 approach for address calculation
EE457/560 approach for address calculation: Dedicated adder, to compute address, attached to the load-store queue.
![Page 49: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/49.jpg)
Memory Disambiguation
EE557
![Page 50: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/50.jpg)
Memory Disambiguation
RAW
sw $2, 2000($0);
lw $8, 2000($0);
WAW
sw $2, 2000($0);
sw $8, 2000($0);
WAR
lw $2, 2000($0);
sw $8, 2000($0);
![Page 51: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/51.jpg)
![Page 52: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/52.jpg)
Memory Disambiguation
RAW
sw $2, 2000($0);
lw $8, 2000($0);
This later lw can proceed only if there is no store ahead of it with the same address.
WAW
sw $2, 2000($0);
sw $8, 2000($0);
This later sw can proceed only if there is no store ahead of it with the same address.
WAR
lw $2, 2000($0);
sw $8, 2000($0);
This later sw can proceed only if there is no load ahead of it with the same address.
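The three rules can be combined into a single check over the load-store queue. The Python sketch below is illustrative only: the function name `can_proceed` and the `(op, address)` tuple encoding are assumptions, and so is the conservative choice of making an instruction wait while a relevant older instruction's address is still unknown (`None`).

```python
def can_proceed(queue, index):
    """Decide whether the memory instruction at `index` may go ahead.

    `queue` lists (op, address) pairs in program order, oldest first;
    op is "lw" or "sw", address is an int or None if not yet computed.
    """
    op, addr = queue[index]
    if addr is None:
        return False                      # own address not computed yet
    for older_op, older_addr in queue[:index]:
        if op == "lw" and older_op == "lw":
            continue                      # load after load never conflicts
        # lw waits for older sw (RAW); sw waits for older sw (WAW) and older lw (WAR).
        if older_addr is None or older_addr == addr:
            return False                  # assumption: unknown address means wait
    return True

print(can_proceed([("sw", 2000), ("lw", 2000)], 1))  # RAW case
print(can_proceed([("sw", 2000), ("sw", 2000)], 1))  # WAW case
print(can_proceed([("lw", 2000), ("sw", 2000)], 1))  # WAR case
print(can_proceed([("sw", 2000), ("lw", 2004)], 1))  # different addresses
```

The check only needs to look at older entries, which is one reason for keeping the load-store queue in program order.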
![Page 53: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/53.jpg)
Maintaining instructions in the order of arrival (issue order/program order) in a queue
Is it necessary or is it desirable?
In the case of L-S Queue?
In the case of Integer and other queues (mult queue, div queue)?
![Page 54: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/54.jpg)
Maintaining instructions in the order of arrival (issue order/program order) in a queue
Is it necessary or is it desirable?
In the case of L-S Queue?
NECESSARY, to enforce memory disambiguation rules.
In the case of Integer and other queues (mult queue, div queue)?
DESIRABLE, so that an earlier instruction gets executed whenever possible, thereby perhaps reducing the number of instructions waiting on it.
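One way to picture the "desirable" case is an oldest-first pick from a queue kept in arrival order. The sketch below is an assumption-laden illustration: `select_oldest_ready` and the `(instruction, operands_ready)` pairs are hypothetical names, not part of the course design.

```python
from typing import List, Optional, Tuple


def select_oldest_ready(queue: List[Tuple[str, bool]]) -> Optional[str]:
    """Pick the earliest-arrived instruction whose operands are ready.

    queue holds (instruction, operands_ready) pairs, oldest first.
    """
    for instruction, operands_ready in queue:
        if operands_ready:
            return instruction
    return None  # nothing is ready this cycle


# The oldest entry is still waiting, so the next-oldest ready one is chosen.
int_queue = [("MULT $4, $2, $3", False),
             ("ADDI $1, $1, -4", True),
             ("ADD $6, $7, $8", True)]
print(select_oldest_ready(int_queue))
```

Because the queue is in arrival order, a simple front-to-back scan is enough to favor the earlier instruction.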
![Page 55: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/55.jpg)
Priority (based on the order of arrival) among instructions ready to execute
• Is it necessary or is it desirable?
• Local priority within the queues
• Global priority across the queues
![Page 56: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/56.jpg)
Issue Unit
• CDB availability constraint
• Pipelined functional unit vs. multi-cycle functional unit
• Conflict resolution
Round-robin priority adequate? Well, …
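To make the conflict-resolution question concrete, here is a minimal sketch of a round-robin arbiter granting a single shared resource (for example, one CDB slot) among several requesting queues. The class name `RoundRobinArbiter` and its interface are hypothetical, and the sketch deliberately ignores instruction age, which is one reason round-robin may not be fully adequate.

```python
from typing import List, Optional


class RoundRobinArbiter:
    """Grant one requester per cycle, rotating priority after each grant."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.last = n - 1  # so requester 0 has top priority on the first cycle

    def grant(self, requests: List[bool]) -> Optional[int]:
        # Start just after the most recent winner and wrap around.
        for offset in range(1, self.n + 1):
            candidate = (self.last + offset) % self.n
            if requests[candidate]:
                self.last = candidate
                return candidate
        return None  # no requester this cycle


arbiter = RoundRobinArbiter(3)
# Queues 0 and 1 keep requesting; grants alternate between them.
print([arbiter.grant([True, True, False]) for _ in range(3)])
```

Rotation guarantees that no requester starves, but it says nothing about which ready instruction is oldest in program order.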
![Page 57: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/57.jpg)
Conditional branches
• Dispatch unit stops dispatching until the branch is resolved.
• CDB broadcasts the result of the branch.
• Dispatching continues thereafter either at the fall-through instruction or at the target instruction.
• Successful branch shall cause flushing of IFQ, very much like a jump.
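The stall-and-resume behavior in the bullets above can be sketched as a small model of the dispatch unit. Everything named here (`DispatchUnit`, `dispatch_one`, `branch_resolved`, the `BRANCH_OPCODES` set) is a hypothetical illustration under simplifying assumptions: instructions are plain strings, and the instructions at the branch target are handed in directly instead of being fetched.

```python
from collections import deque
from typing import Iterable, Optional

BRANCH_OPCODES = {"BNE", "BEQ"}  # assumed set of conditional branches


class DispatchUnit:
    def __init__(self, ifq: Iterable[str]) -> None:
        self.ifq = deque(ifq)  # instruction fetch queue, oldest first
        self.stalled = False

    def dispatch_one(self) -> Optional[str]:
        """Dispatch the next instruction unless waiting on a branch."""
        if self.stalled or not self.ifq:
            return None
        instr = self.ifq.popleft()
        if instr.split()[0] in BRANCH_OPCODES:
            self.stalled = True  # stop until the branch result is broadcast
        return instr

    def branch_resolved(self, taken: bool, target_instrs: Iterable[str] = ()) -> None:
        """Called when the CDB broadcasts the branch outcome."""
        if taken:
            self.ifq.clear()  # flush fall-through instructions, as for a jump
            self.ifq.extend(target_instrs)
        self.stalled = False


unit = DispatchUnit(["ADDI $1, $1, -4", "BNE $1, $0, Loop", "SUB $5, $6, $7"])
unit.dispatch_one()          # ADDI
unit.dispatch_one()          # BNE, dispatch now stalls
print(unit.dispatch_one())   # nothing is dispatched while the branch is pending
```

If the branch is not taken, `branch_resolved(False)` simply lifts the stall and dispatch resumes at the fall-through instruction already in the queue.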
![Page 58: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/58.jpg)
Conditional branches
• Since we stop dispatching instructions after a branch, does it mean that this branch is the last instruction to be executed in the back-end?
• Is it possible that the back-end holds simultaneously (a) some instructions dispatched before the branch and (b) some instructions issued after the branch was resolved?
![Page 59: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/59.jpg)
Tomasulo Loop Example
Loop: LW $2, 40($1);
MULT $4, $2, $3;
SW $4, 40($1);
ADDI $1, $1, -4;
BNE $1, $0, Loop;
• Assume Multiply takes 4 clocks
• Assume first load takes 8 clocks (cache miss), second load takes 1 clock (hit)
Based on Prof. Annavaram's lecture slide
![Page 60: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/60.jpg)
How could Tomasulo overlap iterations of loops?
Loop: LW $2, 40($1);
MULT $4, $2, $3;
SW $4, 40($1);
ADDI $1, $1, -4;
BNE $1, $0, Loop;
The destination registers bear different TAGs in different iterations. These tags were given in place of the source operands to the dependent instructions following them.
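The tagging idea can be sketched by renaming two back-to-back iterations of the loop. This is a simplified illustration, not the actual dispatch logic: the function `rename`, the tag names `T0`, `T1`, ... and the tuple encoding of instructions are all hypothetical, and the sketch assumes no result has been broadcast yet, so every pending producer is still identified by its tag.

```python
from typing import Dict, List, Optional, Tuple

Instr = Tuple[str, Optional[str], List[str]]  # (opcode, destination, sources)


def rename(program: List[Instr]) -> List[Instr]:
    """Give each destination a fresh tag and pass tags to dependent sources."""
    latest_tag: Dict[str, str] = {}  # register -> tag of its newest producer
    renamed: List[Instr] = []
    for n, (opcode, dest, sources) in enumerate(program):
        # A source with a pending producer receives that producer's tag.
        new_sources = [latest_tag.get(reg, reg) for reg in sources]
        tag = None
        if dest is not None:
            tag = f"T{n}"  # fresh tag per dispatched instruction
            latest_tag[dest] = tag
        renamed.append((opcode, tag, new_sources))
    return renamed


body: List[Instr] = [
    ("LW", "$2", ["$1"]),
    ("MULT", "$4", ["$2", "$3"]),
    ("SW", None, ["$4", "$1"]),
    ("ADDI", "$1", ["$1"]),
    ("BNE", None, ["$1", "$0"]),
]
for instr in rename(body + body):  # two iterations, dispatched in order
    print(instr)
```

The two LW instructions write the same architectural register $2 but carry different tags, so the second MULT waits on the second load only, which is what lets the iterations overlap.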
![Page 61: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/61.jpg)
Say, only two iterations. Let us unroll the two iterations.
Loop: LW $2, 40($1);
MULT $4, $2, $3;
SW $4, 40($1);
ADDI $1, $1, -4;
BNE $1, $0, Loop;
Loop: LW $2, 40($1);
MULT $4, $2, $3;
SW $4, 40($1);
ADDI $1, $1, -4;
BNE $1, $0, Loop;
(Diagram labels: destination register; dependent source register(s))
![Page 63: Out of Order (OoO) ExecutionDependent instructions are stalled in the ID stage IM DM 11 IF ID EX M WB. Simple 5-stage pipeline: Dependent instructions areDependent instructions are](https://reader033.fdocuments.in/reader033/viewer/2022052101/603bae25905d4d1a15760e02/html5/thumbnails/63.jpg)
Because there is no reorder buffer.
Note: Your EE560 project will use a reorder buffer and much more!