Online Computation of Critical Paths for Multithreaded Languages

Page 1: Online Computation of Critical Paths for Multithreaded Languages

May/01/2000 HIPS 2000 1

Online Computation of Critical Paths for Multithreaded Languages

Yoshihiro Oyama, Kenjiro Taura, Akinori Yonezawa

University of Tokyo

Page 2: Online Computation of Critical Paths for Multithreaded Languages

Presentation Outline

• What is a critical path?
• Background & Overview
• Our work
  – Target language
  – Critical path computation algorithm
  – Instrumentation scheme
• Experimental results
• Related work

Page 3: Online Computation of Critical Paths for Multithreaded Languages

What is a Critical Path (CP)?

• The longest execution path
  – Nodes: sequential program parts
  – Edges: fork/sync points

[Figure: example execution DAG with weighted nodes. CP length: 31]
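To make the definition concrete, here is a minimal Python sketch of the offline view: given a DAG whose nodes carry execution costs, the critical path is the heaviest path through it. The graph, the node names, and the function `critical_path_length` are illustrative assumptions and are not the DAG from the slide's figure. The talk's own method deliberately avoids building such a graph.

```python
from functools import lru_cache

def critical_path_length(weights, succs):
    """Return the cost of the heaviest path in a DAG with weighted nodes.

    weights: dict mapping node -> cost of that sequential program part
    succs:   dict mapping node -> list of successor nodes (fork/sync edges)
    """
    @lru_cache(maxsize=None)
    def longest_from(node):
        # Heaviest path that starts at `node`.
        tails = [longest_from(s) for s in succs.get(node, ())]
        return weights[node] + (max(tails) if tails else 0)

    return max(longest_from(n) for n in weights)

# Hypothetical four-node DAG: a forks into b and c, which both sync at d.
weights = {"a": 5, "b": 2, "c": 8, "d": 1}
succs = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
cp = critical_path_length(weights, succs)
print(cp)  # 14, via a -> c -> d
```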

Page 4: Online Computation of Critical Paths for Multithreaded Languages

Benefits of Getting CPs (1/2)

• CP info gives us
  – Performance upper bound
    = Exec. time lower bound = $\lim_{PE \to \infty}$ {exec. time}
  – Important parts in need of tuning

Page 5: Online Computation of Critical Paths for Multithreaded Languages

Benefits of Getting CPs (2/2)

• CP info is useful for
  – Tuning
    • CP is short → Overhead should be reduced
    • Otherwise → CP should be shortened
  – Performance prediction
    • $T_P = T_1 / P + T_\infty$ (by Cilk group)
    • Exec. time is close to CP length → More processors: futile
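A small worked example of the prediction formula, reading T_1 as the one-processor execution time and T_∞ as the CP length. The numbers are hypothetical and the function name `predicted_time` is ours. The point is only to show how the estimate flattens out near the CP length as processors are added.

```python
def predicted_time(t1: float, t_inf: float, p: int) -> float:
    """Cilk-style estimate T_P = T_1 / P + T_inf."""
    return t1 / p + t_inf

# Hypothetical measurements (msec): total work and critical path length.
T1, T_INF = 1000.0, 50.0

for p in (1, 10, 100, 1000):
    # The estimate approaches T_INF; extra processors stop paying off.
    print(p, predicted_time(T1, T_INF, p))
```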

Page 7: Online Computation of Critical Paths for Multithreaded Languages

This Work

• Computing critical paths
  – Primary targets:
    • Multithreaded languages
    • Shared-memory machines
  – On-the-fly
    • Not using tracefiles
  – Source code instrumentation

Page 8: Online Computation of Critical Paths for Multithreaded Languages

Background (Shortcoming of Existing Work)

• Cilk [Frigo et al. 98]
  – Provides online computation of CPs
  – Supports fork-join synchronization only
  – Unrealistic setting
    • Fork: zero cost
    • Join: zero cost

Page 9: Online Computation of Critical Paths for Multithreaded Languages

Contribution

• Developed algorithm for computing CPs
  – It deals with languages with threads and synchronization via first-class data
    • Not limited to fork-join model
  – It takes fork / communication cost into account
  – It gives length of each subpath in a CP
    • Helps us “pinpoint” important program parts
• Demonstrated its usefulness through experiments using SMP

Page 10: Online Computation of Critical Paths for Multithreaded Languages

CP Info Example

• Displaying a sequence of all subpaths in a CP

frame entry point       frame exit point                    time
=================================================================
main()              --- move_mols(mols,100)              741 usec
spawn                                                     10 usec
move_mols(mols,n)   --- spawn move_one_mol(mols[i])       39 usec
spawn                                                     10 usec
move_one_mol(molp)  --- return                          4982 usec
communication                                             15 usec
v = recv(r)         --- send(s, v*2)                     128 usec
communication                                             15 usec
u = recv(s)         --- die                             1207 usec
=================================================================
critical path length                                    7147 usec
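As a quick check of how such a report can be read, the sketch below re-enters the subpath rows from the example as plain Python data, sums them, and finds the dominant subpath. The list layout and variable names are our own. Only the labels and times come from the slide.

```python
# (subpath label, time in usec), copied from the example report.
subpaths = [
    ("main() --- move_mols(mols,100)", 741),
    ("spawn", 10),
    ("move_mols(mols,n) --- spawn move_one_mol(mols[i])", 39),
    ("spawn", 10),
    ("move_one_mol(molp) --- return", 4982),
    ("communication", 15),
    ("v = recv(r) --- send(s, v*2)", 128),
    ("communication", 15),
    ("u = recv(s) --- die", 1207),
]

total = sum(usec for _, usec in subpaths)
label, usec = max(subpaths, key=lambda row: row[1])
share = usec / total

print(total)                    # 7147, matching the reported CP length
print(label, f"{share:.1%}")    # the subpath to look at first when tuning
```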

Page 12: Online Computation of Critical Paths for Multithreaded Languages

Target Language

• Sequential language (C, Scheme, …)
  + Threads: spawn f(x1,…,xn)
  + Channels
    • are first-class sync. media
    • can express locks, barriers, and monitors

[Figure: thread th1 executes send(r,8) and thread th2 executes v = recv(r); the value 8 passes through channel r]
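The claim that channels can express locks can be illustrated with a short Python sketch. It is not code in the talk's target language: the `Channel` class is a hypothetical stand-in built on the standard `queue.Queue`, and the lock is simply a channel that holds one token, so that recv acquires and send releases.

```python
import queue
import threading

class Channel:
    """Hypothetical stand-in for a first-class channel with send/recv."""
    def __init__(self):
        self._q = queue.Queue()

    def send(self, value):
        self._q.put(value)

    def recv(self):
        return self._q.get()   # blocks until a value is available

def run_counter(n_threads=4, n_incr=1000):
    lock = Channel()
    lock.send("token")          # the lock starts out free
    state = {"count": 0}

    def worker():
        for _ in range(n_incr):
            token = lock.recv()     # acquire: take the only token
            state["count"] += 1     # critical section
            lock.send(token)        # release: put the token back

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]

print(run_counter())
```

Because synchronization here goes through a value passed between arbitrary threads, the resulting dependency graph is not restricted to fork-join shapes, which is the case the talk sets out to handle.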

Page 13: Online Computation of Critical Paths for Multithreaded Languages

Sample Program

main(){
  spawn sum(r,vec);
  ...
  v = recv(r);
  ...
  die;
}

sum(r,vec){
  ...
  ...
  send(r,ans);
}

[Figure labels: Beginning of Program, End of Program]

Page 15: Online Computation of Critical Paths for Multithreaded Languages

Behavior of Sample Program

• DAG-structured execution
  – Nodes: fork & sync. points
  – Edges: inter-node dependencies

[Figure: execution DAG of the sample program. Thread main: spawn sum(r,vec), v=recv(r), die. Thread sum(r,vec): send(r,ans).]

Page 16: Online Computation of Critical Paths for Multithreaded Languages

Three Kinds of Edges (Dependencies)

• Arithmetic edges
• Spawn edges
• Communication edges

[Figure: the DAG of the sample program (main: spawn sum(r,vec), v=recv(r), die; sum(r,vec): send(r,ans)) with edge values 83, 5, 14, and 29]

Page 17: Online Computation of Critical Paths for Multithreaded Languages

CP Computation Algorithm: Basic Idea

• DAG not constructed
  – Each thread keeps only the longest path up to the current program point

[Figure: Path1 and Path2 meet at a recv in main; Path1 is thrown away]

Page 18: Online Computation of Critical Paths for Multithreaded Languages

Key Questions

• How to determine edge values?
• How to compute CP without constructing DAG?
  – How to manage CP info?
  – How to keep the longest path?

Page 19: Online Computation of Critical Paths for Multithreaded Languages

Determining Edge Values

• Computing the amount of time that elapsed after leaving the previous node

[Figure: nodes X, Y, Z with timestamps t1=time(), t2=time(), t3=time(); edge X→Y has value 8 and edge Y→Z has value 6]
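The timestamp differencing can be sketched in Python as follows. The `EdgeTimer` class and its injectable `clock` parameter are our own illustrative choices (the experiments in the talk use gethrtime()); the fake clock in the usage lines reproduces the edge values 8 and 6 from the figure.

```python
import time

class EdgeTimer:
    """Measures the time elapsed since the previous node was left."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._last = clock()        # timestamp taken at the first node

    def leave_node(self):
        """Return the value of the edge that ends at the current node."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now            # the current node becomes "previous"
        return elapsed

# Usage with a fake clock that yields t1=0, t2=8, t3=14.
ticks = iter([0, 8, 14])
timer = EdgeTimer(clock=lambda: next(ticks))
edges = [timer.leave_node(), timer.leave_node()]
print(edges)  # [8, 6]
```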

Page 20: Online Computation of Critical Paths for Multithreaded Languages

Extending CP with Arithmetic Edge

• CP info = a sequence of edge info
• The amount of time in nodes: NOT accounted

[Figure: nodes X (L1), Y (L2), Z (L3); edge X→Y has value 8 and edge Y→Z has value 6]

At X: CP=({…},{…},{…})
At Y: CP=({…},{…},{…}, {L1,L2,8})
At Z: CP=({…},{…},{…}, {L1,L2,8}, {L2,L3,6})
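One possible representation of "a sequence of edge info" is sketched below. The `Edge` and `CP` types and the `extend` helper are hypothetical names chosen for illustration, and caching the total length alongside the edges is our own design choice so that later comparisons do not need to re-sum the sequence.

```python
from typing import NamedTuple, Tuple

class Edge(NamedTuple):
    src: str      # label of the node where the edge starts
    dst: str      # label of the node where the edge ends
    cost: float   # measured time between the two nodes

class CP(NamedTuple):
    edges: Tuple[Edge, ...] = ()
    length: float = 0

def extend(cp: CP, src: str, dst: str, cost: float) -> CP:
    """Return a new CP with one more edge appended."""
    return CP(cp.edges + (Edge(src, dst, cost),), cp.length + cost)

# The two arithmetic edges from the slide: L1 -> L2 (8) and L2 -> L3 (6).
cp = CP()
cp = extend(cp, "L1", "L2", 8)
cp = extend(cp, "L2", "L3", 6)
print(cp.length)  # 14
```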

Page 21: Online Computation of Critical Paths for Multithreaded Languages

Extending CP with Spawn Edge

[Figure: X executes spawn Y and then continues to Z. CP labels in the figure: CP=({…},{…},{…}) before the spawn; CP=({…},{…},{…}) at Z; CP=({…},{…},{…}, {…,…,Cspawn}) along the spawn edge]
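A minimal sketch of the spawn rule as we read the slide: the spawned thread starts from the parent's CP extended by a spawn edge of cost Cspawn, while the parent keeps its own CP. The function `on_spawn`, the list-of-pairs representation, and the constant value (10 usec, borrowed from the CP info example) are assumptions for illustration.

```python
C_SPAWN = 10  # usec; assumed constant, value taken from the example report

def on_spawn(parent_cp):
    """Return (cp kept by the parent, cp handed to the spawned thread)."""
    child_cp = parent_cp + [("spawn", C_SPAWN)]   # new list, parent untouched
    return list(parent_cp), child_cp

def length(cp):
    return sum(cost for _, cost in cp)

parent, child = on_spawn([("L1 -> L2", 8)])
print(length(parent), length(child))  # 8 18
```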

Page 22: Online Computation of Critical Paths for Multithreaded Languages

Extending CP with Communication Edge

• Piggyback a sent value with CP

[Figure: send passes [v, CPsend] to recv. At send: CPsend=({…},{…}). At recv: CPsend=({…},{…}, {…,…,Ccomm})]

Page 23: Online Computation of Critical Paths for Multithreaded Languages

Keeping the Longest Path (Throwing Shorter Paths Away)

[Figure: send passes [v, CPsend] to recv, where CPsend=({…},{…}, {…,…,Ccomm}); CPsend = … and CPrecv = … meet at recv]

CP=max( CPsend, CPrecv )
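The piggybacking and the max step can be sketched together in Python. Here `queue.Queue` stands in for a channel, `send` and `recv` are hypothetical wrappers, and Ccomm is assumed to be a constant 15 usec as in the CP info example. Timing of arithmetic edges is left out to keep the sketch short.

```python
import queue

C_COMM = 15  # usec; assumed constant, value taken from the example report

def length(cp):
    return sum(cost for _, cost in cp)

def send(channel, value, sender_cp):
    # Piggyback the sender's CP on the value being sent.
    channel.put((value, list(sender_cp)))

def recv(channel, receiver_cp):
    value, cp_send = channel.get()
    cp_send = cp_send + [("communication", C_COMM)]   # add the comm. edge
    # Keep the longer path; the shorter one is thrown away.
    cp = max(cp_send, receiver_cp, key=length)
    return value, cp

q = queue.Queue()
send(q, 8, [("sender work", 100)])
v, cp = recv(q, [("receiver work", 50)])
print(v, length(cp))  # 8 115
```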

Page 25: Online Computation of Critical Paths for Multithreaded Languages

Instrumentation

• Source-to-source transformation
  – Independent of the implementation details
    • Ex. management of activation frames
  – Instrumentation code is inserted into
    • Sends, recvs, spawns
    • Entry/exit points of functions

Page 26: Online Computation of Critical Paths for Multithreaded Languages

Transformation Rule Example

l: v = recv(r);

is transformed into

t = time() - et;                        /* Compute CP up to recv */
[v, cp'] = recv(r);                     /* Receive a value piggybacked with CP */
cp'' = addCommEdge(cp');                /* Extend CP with comm. edge */
if (t + length(cp) < length(cp'')) {    /* Compare the two CPs */
  cp = cp'';                            /* Use the sender's CP */
  el = l;
  et = time();
} else {                                /* Use the receiver's CP */
  et = time() - t;
}
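The bookkeeping around et and t in this rule can be traced with a small Python model. It is a simplification under stated assumptions: only CP lengths are tracked (not edge sequences), Ccomm is a constant 15, `ThreadState` and `instrumented_recv` are hypothetical names, and the clock is injected so the run is deterministic. As we read the rule, the else branch resets et so that the time t already elapsed stays charged to the receiver's pending edge while the time spent inside recv is excluded.

```python
import queue

C_COMM = 15  # assumed constant cost of a communication edge

class ThreadState:
    def __init__(self, clock, label="start"):
        self.clock = clock
        self.cp_len = 0        # length of the path up to the last node
        self.el = label        # label of the last node on the path
        self.et = clock()      # time at which that node was left

def instrumented_recv(st, chan, label):
    t = st.clock() - st.et                 # arithmetic time up to the recv
    value, sender_len = chan.get()         # value piggybacked with CP length
    sender_len += C_COMM                   # extend with the comm. edge
    if t + st.cp_len < sender_len:         # sender's path is longer
        st.cp_len = sender_len
        st.el = label
        st.et = st.clock()                 # start timing from this node
    else:                                  # receiver's path is longer
        st.et = st.clock() - t             # keep t pending, skip recv time
    return value

def run(sender_len):
    ticks = iter([0, 20, 30])              # fake clock readings
    st = ThreadState(clock=lambda: next(ticks))
    chan = queue.Queue()
    chan.put((8, sender_len))
    value = instrumented_recv(st, chan, "l")
    return value, st.cp_len, st.el, st.et

print(run(100))  # (8, 115, 'l', 30): sender's CP adopted
print(run(1))    # (8, 0, 'start', 10): receiver's CP kept
```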

Page 27: Online Computation of Critical Paths for Multithreaded Languages

Discussion (1/2) -- Nondeterminism --

• DAG shape varies between different runs
  – The amounts of time for each part vary (e.g., cache effects)
  – Comm. edges may connect different pairs

[Figure: the edge X→Y has value 28 in one run and 5 in another; the same sends and recvs are paired differently in different runs]

Page 28: Online Computation of Critical Paths for Multithreaded Languages

Discussion (2/2) -- What We Compute as CP --

• CP of a DAG created in an actual run
  – Programs may give different CPs in different runs
  – Other reasonable ways?

Page 30: Online Computation of Critical Paths for Multithreaded Languages

Experiments

• Schematic: concurrent OO language [Taura et al. 96]
• Sun Ultra Enterprise 10000
  – UltraSPARC x 64
• Apps:
  – Prime
  – Natural Language Parser
  – Raytrace
• Timer function: gethrtime()

Page 31: Online Computation of Critical Paths for Multithreaded Languages

Purpose of Experiments

• Checking that execution times get close to computed CPs
• Identifying how large instrumentation overhead is

Page 32: Online Computation of Critical Paths for Multithreaded Languages

Raytrace

[Figure: time (msec) vs. number of processors (0 to 60); series: Instrumented, CP, Org]

• We could predict the best performance by using only one processor

Page 33: Online Computation of Critical Paths for Multithreaded Languages

Prime

[Figure: time (msec) vs. number of processors (0 to 60); series: Instrumented, CP, Org]

• Small (< 5%) difference between the actual execution time and the predicted execution time

Page 34: Online Computation of Critical Paths for Multithreaded Languages

Information Useful for Future Tuning of Prime

• Gathering primes into a list → 95% of CP
• Dividing prime candidates by smaller primes → 5% of CP

Page 35: Online Computation of Critical Paths for Multithreaded Languages

Natural Language Parser

[Figure: time (msec) vs. number of processors (0 to 60); series: Instrumented, CP, Org]

Page 36: Online Computation of Critical Paths for Multithreaded Languages

Information Useful for Future Tuning of NL Parser

• Application of lexical rules → 4% of CP
• Application of production rules → 96% of CP

Page 37: Online Computation of Critical Paths for Multithreaded Languages

Instrumentation Overhead (Execution Time on One Processor)

[Figure: bar chart of normalized time (axis 0 to 15), Org vs. Instrumented, for Prime, NL Parser, and Raytrace. Data labels: 9.9, 4.7, 3.7]

Page 39: Online Computation of Critical Paths for Multithreaded Languages

Related Work (1/2)

• Cilk
  – Breakdown of CP not shown
    • CP info: not detailed enough for tuning

% foo -nproc 10 20
result: 524288
Running time on 10 procs: 416.33 ms
Total work = 3.94 s
Critical path = 1.08 ms
Parallelism = 2800.92%

Which function should we tune???

Page 40: Online Computation of Critical Paths for Multithreaded Languages

Related Work (2/2)

• Paradyn [Hollingsworth 98]
  – Main target is message-passing programs
  – It does not display all subpaths in CP
• Tracefile-based offline scheme (Dimemas [Pallas] etc.)
  – Tracefile contains the parameters and the timings of all communication operations
  – Required memory/storage is very large

Page 41: Online Computation of Critical Paths for Multithreaded Languages

Summary (1/2)

• Scheme for online CP computation
  – Supports synchronization via first-class data
    • Piggybacking communicated values with CP info
    • Keeping the maximum of two paths in receives
  – Takes spawn/communication cost into account
  – Shows all subpaths in CP
    • Attaching subpath info in each CP update

Page 42: Online Computation of Critical Paths for Multithreaded Languages

Summary (2/2)

• CP info we compute
  – Helps predict the MP performance
    • Small (< 10%) difference between
      – Actual execution time
      – Predicted execution time
  – Gives a useful guide to tuning
    • Prime: Tune list construction part!
    • Parser: Tune production rule application part!

Page 43: Online Computation of Critical Paths for Multithreaded Languages

Future Work

• More precise performance prediction
  – Taking thread mapping into account
• Adaptive optimization using CP info
  – Time-consuming optimizations are applied to the parts included in CP

Page 44: Online Computation of Critical Paths for Multithreaded Languages

Any Comments?