Lec Jan19 2009
Anshul Kumar, CSE IITD
CSL718 : Pipelined Processors
Pipeline Hazards
Handling Structural Hazards
19th Jan, 2009

Types of Pipelines
• Degree of overlap
  – Serial, Overlapped, Pipelined, Super-pipelined
• Depth
  – Shallow, Deep
• Structure
  – Linear, Non-linear
• Scheduling of operations
  – Static, Dynamic

Degree of overlap, Depth
[Figure: timing diagrams contrasting Serial, Overlapped and Pipelined execution (degree of overlap), and Shallow and Deep pipelines (depth)]

Pipeline Structure
• Linear pipeline: stages A, B, C
• Non-linear pipeline: stages A, B, C
  – Sequence: A, B, C, B, C, A, C, A

Scheduling/timing alternatives
• Decisions : Static / Dynamic
  – order in which instructions enter the pipeline
  – sequence of stages through which different instructions pass
  – detection of hazards and introduction of stall cycles
• Static
  – if one instruction stalls, all subsequent instructions are delayed
• Dynamic
  – higher throughput is achieved

Dynamic Scheduling
• type 1 : beginnings (decode) and endings (put away) in order
• type 2 : only beginnings in order
• type 3 : no order restrictions except dependencies
• type 1 extended : beginnings in order, references that affect memory state are in order
  [note that a memory reference may lead to page fault]

Pipelining and CPI
Type                          CPI
Serial                        5 – 6
Overlapped                    3
Pipelined (static)            1.5 – 2
Pipelined (dynamic)           1.2 – 1.5
Multiple instruction issue    < 1.0

Hazards in Pipelining
• Data dependencies => Data hazards
  – RAW (read after write)
  – WAR (write after read)
  – WAW (write after write)
• Resource conflicts => Structural hazards
  – use of same resource in different stages
• Procedural dependencies => Control hazards
  – conditional and unconditional branches, calls/returns
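The three data hazard types can be told apart by comparing the registers read and written by two instructions. A minimal sketch, assuming each instruction is given as a hypothetical (reads, writes) pair of register-name sets; the function name and the representation are illustrative only and not from the lecture.

```python
def classify_hazards(first, second):
    """Return the data hazards of `second` with respect to the earlier `first`.

    Each instruction is a (reads, writes) pair of sets of register names
    (an assumed representation, chosen only for this sketch).
    """
    reads1, writes1 = first
    reads2, writes2 = second
    hazards = set()
    if writes1 & reads2:
        hazards.add("RAW")  # second reads a register that first writes
    if reads1 & writes2:
        hazards.add("WAR")  # second writes a register that first reads
    if writes1 & writes2:
        hazards.add("WAW")  # both write the same register
    return hazards


# add r1, r2, r3 followed by sub r4, r1, r5
add = ({"r2", "r3"}, {"r1"})
sub = ({"r1", "r5"}, {"r4"})
print(classify_hazards(add, sub))
```

Swapping the two instructions turns the same register overlap into a WAR hazard, which is why program order matters in the classification.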

Handling hazards
• Data hazards
  – detect instructions with data dependence
  – introduce nop instructions (bubbles) in the pipeline
  – more complex: data forwarding
• Control hazards
  – detect branch instructions
  – flush inline instructions if branching occurs
  – more complex: branch prediction

Data Hazards
[Figure: previous instr and current instr in the pipeline, each with its read/write point marked; delay = 3]

Handling Data Hazards
• Data Forwarding
  [Figure: previous instr and current instr, with labels W, R, EX, EX]
• Instruction Reordering
  [Figure: previous instr and current instr, with labels W, R, 1, 2]

Are there software solutions?
• Separate dependent instructions by reordering code
• Insert nop instructions in worst case
• Treat branches as delayed branches and insert suitable instructions in delay slots
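The worst-case software fix, inserting nops, can be sketched as follows. This is only an illustration under stated assumptions: instructions are hypothetical (dest, sources) pairs, only RAW dependences are considered, and `min_distance` (an assumed parameter, defaulting to 3) is the number of slots a consumer must sit behind its producer. Reordering independent instructions into those slots, as in the first bullet, would replace some of the nops.

```python
def insert_nops(program, min_distance=3):
    """Pad a program with nops (None) so RAW-dependent instructions are far enough apart.

    program: list of (dest, sources) pairs; dest may be None.
    min_distance: assumed minimum slot distance between producer and consumer.
    """
    out = []
    last_write = {}  # register -> slot in `out` of its most recent writer
    for dest, sources in program:
        # earliest slot that respects min_distance from every producer
        earliest = len(out)
        for reg in sources:
            if reg in last_write:
                earliest = max(earliest, last_write[reg] + min_distance)
        out.extend([None] * (earliest - len(out)))  # the bubbles
        if dest is not None:
            last_write[dest] = len(out)
        out.append((dest, sources))
    return out


dependent = [("r1", ["r2"]), ("r3", ["r1"])]
print(insert_nops(dependent))
```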

Control Hazards
[Figure: branch instr, next inline instr and target instr in the pipeline, with cond eval and target addr gen marked; delay = 5 and delay = 2]
• the order of cond eval and target addr gen may be different
• cond eval may be done in previous instruction

Structural Hazards
Caused by Resource Conflicts
• Use of a hardware resource in more than one cycle
  – e.g. successive instructions each using stages A B A C
• Different sequences of resource usage by different instructions
  – e.g. A B C D and A C B D
• Non-pipelined multi-cycle resources
  – e.g. F D X X followed by F D X X

Structural hazards
• Structural hazards can possibly be removed by design
  – separate instruction and data memories
  – adders for PC increment and offset addition to PC separate from main ALU
  – each instruction uses ALU at most in one cycle
  – one instruction can read from RF while other can write into it in the same cycle

Analysis of Structural Hazards
Non-linear pipeline: stages A, B, C
Reservation Table for X (". " = stage not used)
    1  2  3  4  5  6  7  8
A   X  .  .  .  .  X  .  X
B   .  X  .  X  .  .  .  .
C   .  .  X  .  X  .  X  .

Analysis of Structural Hazards
Multi-functional pipeline: stages A, B, C
Reservation Table for X (". " = stage not used)
    1  2  3  4  5  6  7  8
A   X  .  .  .  .  X  .  X
B   .  X  .  X  .  .  .  .
C   .  .  X  .  X  .  X  .
Reservation Table for Y
[Figure: the same table with the cells used by Y marked as well]

Collisions with Initiation Interval = 2
    1    2    3    4    5    6    7    8    9    10   11
A   1    .    2    .    3    1    4    1,2  5    2,3  6
B   .    1    .    1,2  .    2,3  .    3,4  .    4,5  .
C   .    .    1    .    1,2  .    1-3  .    2-4  .

Collisions with Initiation Interval = 5
    1    2    3    4    5    6    7    8    9    10   11
A   1    .    .    .    .    1,2  .    1    .    .    2,3
B   .    1    .    1    .    .    2    .    2    .    .
C   .    .    1    .    1    .    1    2    .    2    .
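The collision tables can be reproduced by overlaying shifted copies of the reservation table. A minimal sketch, assuming the reservation table for X has stage A in cycles 1, 6, 8, stage B in cycles 2, 4 and stage C in cycles 3, 5, 7 (read off the sequence A, B, C, B, C, A, C, A given earlier); `RESERVATION_X` and `collisions` are illustrative names.

```python
# Assumed reservation table for X: stage -> cycles in which it is used.
RESERVATION_X = {"A": [1, 6, 8], "B": [2, 4], "C": [3, 5, 7]}


def collisions(table, interval, initiations=6):
    """Return {(stage, cycle): [initiation numbers]} for every cell used more than once.

    Initiation n (numbered from 1) starts (n - 1) * interval cycles after the first.
    """
    usage = {}
    for n in range(initiations):
        start = n * interval
        for stage, cycles in table.items():
            for cycle in cycles:
                usage.setdefault((stage, start + cycle), []).append(n + 1)
    return {cell: users for cell, users in usage.items() if len(users) > 1}


for interval in (2, 3, 5):
    print(interval, sorted(collisions(RESERVATION_X, interval)))
```

Under this assumed table, intervals 2 and 5 both produce shared cells, while a constant interval of 3 produces none.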

Latency Sequences and Cycles
1, 8, 1, 8, ….  (1, 8)  avg = 4.5
3, 3, 3, 3, ….  (3)     avg = 3
6, 6, 6, 6, ….  (6)     avg = 6
Minimum Average Latency (MAL) ?

Collision Free Scheduling
Collision vector for X
  Cycle no. (future):  m …. 2 1
  Value:               1 0 1 1 0 1 0
1 : collision
0 : no collision
Starting with the current state, in which cycles can the next instance of X be scheduled?

Computing collision vector
[Figure: two initiations of X, marked 1 and 2, overlaid on stages A, B, C over cycles 1 to 11]
  Cycle no.:         7 6 5 4 3 2 1
  Collision vector:  1 0 1 1 0 1 0
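The collision vector can be computed from the distances between marks in each row of the reservation table. A minimal sketch, assuming the table for X has A in cycles 1, 6, 8, B in cycles 2, 4 and C in cycles 3, 5, 7 (read off the sequence A, B, C, B, C, A, C, A); the function names are illustrative.

```python
# Assumed reservation table for X: stage -> cycles in which it is used.
RESERVATION_X = {"A": [1, 6, 8], "B": [2, 4], "C": [3, 5, 7]}


def forbidden_latencies(table):
    """Distances between any two marks in the same row: these latencies collide."""
    latencies = set()
    for cycles in table.values():
        for earlier in cycles:
            for later in cycles:
                if later > earlier:
                    latencies.add(later - earlier)
    return latencies


def collision_vector(table):
    """Bit string for positions m .. 1, where m is the largest forbidden latency."""
    forbidden = forbidden_latencies(table)
    if not forbidden:
        return ""  # no stage is used twice, so no latency collides
    m = max(forbidden)
    return "".join("1" if k in forbidden else "0" for k in range(m, 0, -1))


print(sorted(forbidden_latencies(RESERVATION_X)), collision_vector(RESERVATION_X))
```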

State Transitions
1 0 1 1 0 1 0
   (clock cycle)
0 1 0 1 1 0 1
   (clock cycle)
0 0 1 0 1 1 0
   (clock cycle)
0 0 0 1 0 1 1
   (schedule X)
1 0 1 1 0 1 1
In short: 1 0 1 1 0 1 0 --3--> 1 0 1 1 0 1 1
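One transition amounts to shifting the state right once per clock cycle and then OR-ing in the initial collision vector when X is scheduled. A minimal sketch with states held as bit strings; `next_state` is an illustrative name.

```python
def next_state(state, latency, initial):
    """State reached by scheduling the next X `latency` cycles after the previous one.

    state, initial: bit strings for positions m .. 1 (same length).
    """
    m = len(initial)
    # Position `latency` counted from the right must be 0, otherwise X collides.
    if latency <= m and state[m - latency] == "1":
        raise ValueError(f"latency {latency} collides in state {state}")
    shifted = int(state, 2) >> latency  # `latency` clock cycles pass
    return format(shifted | int(initial, 2), f"0{m}b")  # schedule X


print(next_state("1011010", 3, "1011010"))
```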
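
Collision Free Scheduling for X
Collision vector for X (cycle no. m …. 2 1): 1 0 1 1 0 1 0
1 : collision
0 : no collision
State diagram (edges labelled with latencies):
1 0 1 1 0 1 0 --1--> 1 1 1 1 1 1 1
1 0 1 1 0 1 0 --3, 6--> 1 0 1 1 0 1 1
1 0 1 1 0 1 0 --8+--> 1 0 1 1 0 1 0
1 0 1 1 0 1 1 --3, 6--> 1 0 1 1 0 1 1
1 0 1 1 0 1 1 --8+--> 1 0 1 1 0 1 0
1 1 1 1 1 1 1 --8+--> 1 0 1 1 0 1 0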
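The state diagram can be generated by applying the shift-and-OR step to every permissible latency of every reachable state. A minimal sketch; `state_diagram` is an illustrative name, and the single key m + 1 stands for the "8+" style edge (any latency longer than the vector).

```python
def state_diagram(initial):
    """Map each reachable state to {latency: next state}, all as bit strings."""
    m, init = len(initial), int(initial, 2)

    def as_bits(value):
        return format(value, f"0{m}b")

    diagram, todo = {}, [init]
    while todo:
        state = todo.pop()
        if as_bits(state) in diagram:
            continue
        edges = {}
        for latency in range(1, m + 2):  # m + 1 stands for "m + 1 or more"
            if latency <= m and (state >> (latency - 1)) & 1:
                continue  # bit is 1: this latency collides
            target = (state >> latency) | init
            edges[latency] = as_bits(target)
            todo.append(target)
        diagram[as_bits(state)] = edges
    return diagram


for state, edges in state_diagram("1011010").items():
    print(state, edges)
```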

Collision Free Scheduling for Y
Collision vector for Y (cycle no. m …. 2 1): 1 0 1 0
1 : collision
0 : no collision
State diagram (edges labelled with latencies):
1 0 1 0 --1--> 1 1 1 1
1 0 1 0 --3--> 1 0 1 1
1 0 1 0 --5+--> 1 0 1 0
1 0 1 1 --3--> 1 0 1 1
1 0 1 1 --5+--> 1 0 1 0
1 1 1 1 --5+--> 1 0 1 0

Latency Cycles from State Diagram
Latency Cycles
  (1, 8) (1, 8, 6, 8) (3) (6) (3, 8) (3, 6, 3)
Simple Latency Cycles (no figure repeats)
  (1, 8) (3) (6) (3, 8) (6, 8)
Greedy Latency Cycles
  (1, 8) (3) - from different starting states
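Greedy cycles can be found by always taking the smallest permissible latency from each reachable state until a state repeats. A minimal sketch; the function names are illustrative, and an "8+" edge is followed with its smallest latency (m + 1).

```python
def allowed_latencies(state, m):
    """Latencies whose bit is 0, plus m + 1 standing for 'm + 1 or more'."""
    return [k for k in range(1, m + 1) if not (state >> (k - 1)) & 1] + [m + 1]


def greedy_cycles(initial):
    """Set of greedy latency cycles, each as a tuple in a fixed rotation."""
    m, init = len(initial), int(initial, 2)

    def step(state, latency):
        return (state >> latency) | init

    # All states reachable from the initial collision vector.
    states, todo = set(), [init]
    while todo:
        s = todo.pop()
        if s not in states:
            states.add(s)
            todo.extend(step(s, k) for k in allowed_latencies(s, m))

    cycles = set()
    for start in states:
        path, seen, s = [], {}, start
        while s not in seen:  # always take the smallest latency
            seen[s] = len(path)
            k = allowed_latencies(s, m)[0]
            path.append(k)
            s = step(s, k)
        cycle = path[seen[s]:]  # the part of the walk that repeats
        # Rotate so the same cycle found from different states compares equal.
        cycles.add(min(tuple(cycle[i:] + cycle[:i]) for i in range(len(cycle))))
    return cycles


for cycle in sorted(greedy_cycles("1011010")):
    print(cycle, sum(cycle) / len(cycle))
```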

Minimum Average Latency (MAL)
MAL ≥ max no. of check marks in any row
MAL ≤ avg latency of any greedy cycle
avg latency of any greedy cycle ≤ no. of 1's in initial collision vector + 1

Upper Bound on MAL
• Consider a greedy cycle (k1, k2, .., kn)
• Let p = no. of 1's in initial collision vector
⇒ k1 ≤ p + 1
   k2 ≤ 2p - k1 + 2   {k1 - 1 1's removed, p 1's added}
   k3 ≤ 3p - k1 - k2 + 3
   ….
   kn ≤ np - k1 - k2 … - kn-1 + n
⇒ k1 + k2 … + kn ≤ np + n
⇒ MAL ≤ p + 1
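The bounds can be checked numerically on the X example. A minimal sketch, assuming the reservation table for X has A in cycles 1, 6, 8, B in cycles 2, 4 and C in cycles 3, 5, 7, and taking the greedy cycles (1, 8) and (3) from the earlier slide; `mal_bounds` is an illustrative name.

```python
# Assumed inputs for X (reservation table reconstructed from the stage sequence).
RESERVATION_X = {"A": [1, 6, 8], "B": [2, 4], "C": [3, 5, 7]}
COLLISION_VECTOR_X = "1011010"
GREEDY_CYCLES_X = [(1, 8), (3,)]


def mal_bounds(table, initial_vector, greedy_cycles):
    """Return (lower bound, best greedy average, upper bound) for the MAL."""
    lower = max(len(cycles) for cycles in table.values())  # max marks in any row
    best_greedy = min(sum(c) / len(c) for c in greedy_cycles)
    upper = initial_vector.count("1") + 1  # p + 1
    return lower, best_greedy, upper


print(mal_bounds(RESERVATION_X, COLLISION_VECTOR_X, GREEDY_CYCLES_X))
```

Here the lower bound and the best greedy average coincide, so under these assumptions the MAL for X is pinned at 3, well inside the p + 1 = 5 ceiling.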

Reference (Structural Hazards)
1. K. Hwang, "Advanced Computer Architecture : Parallelism, Scalability, Programmability", McGraw Hill, 1993.