Instr Sched 1

download Instr Sched 1

of 20

Transcript of Instr Sched 1

  • 7/28/2019 Instr Sched 1

    1/20

    Instruction Scheduling - Part 1

    Y.N. Srikant

    Department of Computer ScienceIndian Institute of Science

    Bangalore 560 012

    NPTEL Course on Compiler Design

    Y.N. Srikant Instruction Scheduling

    http://find/
  • 7/28/2019 Instr Sched 1

    2/20

    Outline

    Instruction Scheduling

    Simple Basic Block Scheduling

    Automaton-based SchedulingInteger programming based schedulingOptimal Delayed-load Scheduling (DLS) for treesTrace, Superblock and Hyperblock scheduling

    Y.N. Srikant Instruction Scheduling

    http://find/
  • 7/28/2019 Instr Sched 1

    3/20

    Instruction Scheduling

    Reordering of instructions so as to keep the pipelines of

    functional units full with no stalls

    NP-Complete and needs heuristcs

    Applied on basic blocks (local)

    Global scheduling requires elongation of basic blocks

    (super-blocks)

    Y.N. Srikant Instruction Scheduling

    http://find/
  • 7/28/2019 Instr Sched 1

    4/20

    Instruction Scheduling - Motivating Example

    time: load - 2 cycles, op - 1 cycle

    This code has 2 stalls, at i3 and at i5,

    due to the loads

    i1 i2 i4

    i3

    i5

    i7

    i6

    load load load

    add

    sub

    st

    mult

    Y.N. Srikant Instruction Scheduling

    http://goforward/http://find/http://goback/
  • 7/28/2019 Instr Sched 1

    5/20

    Scheduled Code - no stalls

    There are no stalls, but dependences are indeed satisfied

    Y.N. Srikant Instruction Scheduling

    http://find/http://goback/
  • 7/28/2019 Instr Sched 1

    6/20

    Definitions - Dependences

    Consider the following code:

    i1 : r1 load(r2)i2 : r3 r1 + 4i3 : r1 r4 + r5

    The dependences are

    i1 i2 (flow dependence) i2 i3 (anti-dependence)

    i1 o i3 (output dependence)

    anti- and ouput dependences can be eliminated by register

    renaming

    Y.N. Srikant Instruction Scheduling

    G

    http://find/
  • 7/28/2019 Instr Sched 1

    7/20

    Dependence DAG

    full line: flowdependence, dash line: anti-dependence

    dash-dot line: outputdependence

    some anti- and output dependences are because memorydisambiguation could not be done

    st

    add

    mult st

    add

    ldld

    add sub

    i1

    i3 i4

    i7

    i8

    i6

    i2

    i5

    i9

    Y.N. Srikant Instruction Scheduling

    B i Bl k S h d li

    http://find/
  • 7/28/2019 Instr Sched 1

    8/20

    Basic Block Scheduling

    Basic block consists of micro-operation sequences (MOS),

    which are indivisible

    Each MOS has several steps, each requiring resources

    Each step of an MOS requires one cycle for executionPrecedence constraints and resource constraints must besatisfied by the scheduled program

    PCs relate to data dependences and execution delaysRCs relate to limited availability of shared resources

    Y.N. Srikant Instruction Scheduling

    Th B i Bl k S h d li P bl

    http://find/
  • 7/28/2019 Instr Sched 1

    9/20

    The Basic Block Scheduling Problem

    Basic block is modelled as a digraph, G = (V, E)

    R: number of resourcesNodes (V): MOS; Edges (E): PrecedenceLabel on node v

    resource usage functions, v(i) for each step of the MOSassociated with v

    length l(v) of node vLabel on edge e: Execution delay of the MOS, d(e)

    Problem: Find the shortest schedule : V N such thate= (u, v) E, (v) (u) d(e) and

    i,

    vVv(i (v)) R, where

    length of the schedule is maxvV

    {(v) + l(v)}

    Y.N. Srikant Instruction Scheduling

    I t ti S h d li P d d R

    http://goforward/http://find/http://goback/
  • 7/28/2019 Instr Sched 1

    10/20

    Instruction Scheduling - Precedence and Resource

    Constraints

    Y.N. Srikant Instruction Scheduling

    A Si l Li t S h d li Al ith

    http://find/
  • 7/28/2019 Instr Sched 1

    11/20

    A Simple List Scheduling Algorithm

    Find the shortest schedule : V N, such that precedenceand resource constraints are satisfied. Holes are filled with

    NOPs.FUNCTION ListSchedule (V,E)

    BEGIN

    Ready = root nodes of V; Schedule= ;

    WHILE Ready= DO

    BEGINv = highest priority node in Ready;

    Lb = SatisfyPrecedenceConstraints (v, Schedule, );(v) = SatisfyResourceConstraints (v, Schedule, , Lb);

    Schedule = Schedule + {v};Ready = Ready {v} + {u | NOT (u Schedule)AND (w, u) E, w Schedule};

    END

    RETURN ;

    ENDY.N. Srikant Instruction Scheduling

    List Scheduling Ready Queue Update

    http://find/
  • 7/28/2019 Instr Sched 1

    12/20

    List Scheduling - Ready Queue Update

    Y.N. Srikant Instruction Scheduling

    Constraint Satisfaction Functions

    http://find/
  • 7/28/2019 Instr Sched 1

    13/20

    Constraint Satisfaction Functions

    FUNCTION SatisfyPrecedenceConstraint(v, Sched, )

    BEGINRETURN ( max

    uSched(u) + d(u, v))

    END

    FUNCTION SatisfyResourceConstraint(v, Sched, , Lb)BEGIN

    FOR i := Lb TO DO

    IF 0 j < l(v), v(j) +uSched

    u(i + j (u)) R THEN

    RETURN (i);END

    Y.N. Srikant Instruction Scheduling

    Precedence Constraint Satisfaction

    http://find/
  • 7/28/2019 Instr Sched 1

    14/20

    Precedence Constraint Satisfaction

    Y.N. Srikant Instruction Scheduling

    http://find/
  • 7/28/2019 Instr Sched 1

    15/20

    List Scheduling - Priority Ordering for Nodes

  • 7/28/2019 Instr Sched 1

    16/20

    List Scheduling - Priority Ordering for Nodes

    1 Height of the node in the DAG (i.e., longest path from the

    node to a terminal node2 Estart, and Lstart, the earliest and latest start times

    Violating Estartand Lstartmay result in pipeline stallsEstart(v) = max

    i=1, ,k(Estart(ui) + d(ui, v))

    where u1, u2, , uk are predecessors of v. Estart value ofthe source node is 0.Lstart(u) = min

    i=1, ,k(Lstart(vi) d(u, vi))

    where v1, v2, , vk are successors of u. Lstart value of thesink node is set as its Estart value.

    Estart and Lstart values can be computed using atop-down and a bottom-up pass, respectively, eitherstatically (before scheduling begins), or dynamically duringscheduling

    Y.N. Srikant Instruction Scheduling

    Estart and Lstart Computation

    http://find/http://goback/
  • 7/28/2019 Instr Sched 1

    17/20

    Estart and Lstart Computation

    Y.N. Srikant Instruction Scheduling

    List Scheduling - Slack

    http://find/
  • 7/28/2019 Instr Sched 1

    18/20

    List Scheduling Slack

    1 A node with a lower Estart (or Lstart) value has a higher

    priority2 Slack

    =Lstart

    Estart

    Nodes with lower slack are given higher priorityInstructions on the critical path may have a slack value ofzero and hence get priority

    Y.N. Srikant Instruction Scheduling

    http://find/
  • 7/28/2019 Instr Sched 1

    19/20

    Simple List Scheduling - Example - 2

  • 7/28/2019 Instr Sched 1

    20/20

    Simple List Scheduling Example 2

    latenciesadd,sub,store: 1 cycle; load: 2 cycles; mult: 3 cycles

    path lengthand slackare shown on the left side and right

    side of the pair of numbers in parentheses

    ld

    sub

    mult st

    add

    st

    add

    ld

    add

    5(3, 3)0

    6(2, 2)0

    8(0, 0)0

    3(2, 5)3 1(2, 7)5

    7(0, 1)1

    0(8, 8)0

    2(6, 6)0

    1(7, 7)0

    i1

    i3 i4

    i7

    i8

    i6

    i2

    i5

    i9

    Y.N. Srikant Instruction Scheduling

    http://find/