Workload Design: Selecting Representative Program-Input Pairs

Lieven Eeckhout, Hans Vandierendonck, Koen De Bosschere
Ghent University, Belgium
PACT 2002, September 23, 2002



Introduction

• Microprocessor design: simulation of workload = set of programs + inputs
  – constrained in size due to time limitation
  – taken from suites, e.g., SPEC, TPC, MediaBench
• Workload design:
  – which programs?
  – which inputs?
  – representative: large variation in behavior
  – benchmark-input pairs should be “different”


Main idea

• Workload design space is p-D space
  – with p = # relevant program characteristics
  – p is too large for understandable visualization
  – correlation between p characteristics
• Idea: reduce p-D space to q-D space
  – with q small (typically 2 to 4)
  – without losing important information
  – no correlation
  – achieved by multivariate data analysis techniques: PCA and cluster analysis


Goal

• Measuring impact of input data sets on program behavior
  – “far away” or weak clustering: different behavior
  – “close” or strong clustering: similar behavior
• Applications:
  – selecting representative program-input pairs
    • e.g., one program-input pair per cluster
    • e.g., take program-input pair with smallest dynamic instruction count
  – getting insight into influence of input data sets
  – profile-guided optimization
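The selection rule on this slide (one program-input pair per cluster, preferring the smallest dynamic instruction count) could be sketched roughly as follows. The helper name `pick_representatives`, the pair names, the cluster ids and the instruction counts are all hypothetical illustration values, not data from the talk.

```python
def pick_representatives(cluster_of, insn_count):
    """Return, per cluster, the program-input pair with the fewest instructions.

    cluster_of: dict mapping program-input pair -> cluster id
    insn_count: dict mapping program-input pair -> dynamic instruction count
    """
    best = {}
    for pair, cluster in cluster_of.items():
        current = best.get(cluster)
        # Keep the cheapest pair seen so far for this cluster.
        if current is None or insn_count[pair] < insn_count[current]:
            best[cluster] = pair
    return best


# Hypothetical example data (illustrative only).
cluster_of = {"progA.small": 0, "progA.large": 0, "progB.large": 1}
insn_count = {"progA.small": 3e9, "progA.large": 90e9, "progB.large": 60e9}

representatives = pick_representatives(cluster_of, insn_count)
print(representatives)
```

Choosing the shortest-running member of each cluster keeps the reduced workload representative while limiting simulation time, which is the trade-off the slide describes.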


Overview

• Introduction
• Workload characterization
• Data analysis
  – Principal components analysis (PCA)
  – Cluster analysis
• Evaluation
• Discussion
• Conclusion


Workload characterization (1)

• Instruction mix
  – int, logic, shift&byte, load/store, control
• Branch prediction accuracy
  – bimodal (8K*2 bits), gshare (8K*2 bits) and hybrid (meta: 8K*2 bits) branch predictor
• Data and instruction cache miss rates
  – Five caches with varying size and associativity
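As an illustration of how one of these characteristics might be measured, below is a minimal sketch of a bimodal predictor with 8K two-bit saturating counters, run over a branch trace to get a prediction accuracy. The indexing by branch address modulo the table size, the initial counter value and the toy trace are assumptions; the slides give only the table size.

```python
class BimodalPredictor:
    """Table of 2-bit saturating counters (0..3); >= 2 means predict taken."""

    def __init__(self, entries=8 * 1024):
        self.entries = entries
        self.counters = [1] * entries  # assumed initial state: weakly not-taken

    def predict(self, pc):
        return self.counters[pc % self.entries] >= 2

    def update(self, pc, taken):
        i = pc % self.entries  # assumed indexing scheme
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(trace, predictor):
    """trace: list of (branch_address, taken) tuples in execution order."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(trace)


# Toy trace: one loop branch, taken 9 times then not taken, repeated 10 times.
trace = [(0x400, i % 10 != 9) for i in range(100)]
bimodal_accuracy = accuracy(trace, BimodalPredictor())
print(bimodal_accuracy)
```

The resulting accuracy is one number per program-input pair, which is the form needed for a characteristic in the p-dimensional workload space.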


Workload characterization (2)

• Number of instructions between two taken branches
• Instruction-Level Parallelism
  – IPC of an infinite-resource machine with only read-after-write dependencies
• In total: p = 20 variables
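The ILP characteristic can be read as a dataflow limit: with unlimited resources, an instruction issues as soon as the values it reads have been produced. A minimal sketch follows, assuming unit latency for every instruction and a trace given as (destination register, source registers) tuples; both the trace format and the function name `dataflow_ipc` are hypothetical.

```python
def dataflow_ipc(instructions):
    """IPC when only read-after-write dependencies delay an instruction.

    instructions: list of (dest_reg, [src_regs]) in program order.
    Assumes every instruction takes one cycle (an assumption, not from the talk).
    """
    ready = {}  # register -> cycle in which its most recent value is produced
    last_cycle = 0
    for dest, srcs in instructions:
        # Issue one cycle after the latest producer among the source registers.
        cycle = 1 + max((ready.get(r, 0) for r in srcs), default=0)
        if dest is not None:
            ready[dest] = cycle
        last_cycle = max(last_cycle, cycle)
    return len(instructions) / last_cycle if last_cycle else 0.0


# Toy trace: r3 needs r1 and r2, r4 needs r3, r5 is independent.
toy_trace = [
    ("r1", []),
    ("r2", []),
    ("r3", ["r1", "r2"]),
    ("r4", ["r3"]),
    ("r5", []),
]
ipc = dataflow_ipc(toy_trace)
print(ipc)
```

In the toy trace the longest dependency chain is three instructions long, so five instructions complete in three cycles.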


PCA

• Many program characteristics (variables) are correlated
• PCA computes new variables
  – p principal components PCi
  – linear combination of original characteristics
  – uncorrelated
  – contain same total variance over all benchmarks
  – Var[PC1] > Var[PC2] > Var[PC3] > …
  – most have near-to-zero variance (constant)
  – reduce dimension of workload space to q = 2 to 4
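The properties listed on this slide can be checked on a small example. The sketch below is dependency-free: it uses power iteration with deflation in place of a library eigen-solver, and it only centers the data (the slides do not say whether each characteristic is also scaled). The data matrix is made up, and the function names are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Center the data and return (centered rows, sample covariance matrix)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    return centered, cov


def top_eigenpair(m, iters=1000):
    """Largest eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    p = len(m)
    v = [1.0 + i for i in range(p)]  # arbitrary non-uniform start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(m[i][j] * v[j] for j in range(p)) for i in range(p))
    return lam, v


def pca(rows, q):
    """Return (variances, components, scores) for the first q components."""
    centered, cov = covariance_matrix(rows)
    p = len(cov)
    variances, components = [], []
    for _ in range(q):
        lam, v = top_eigenpair(cov)
        variances.append(lam)
        components.append(v)
        # Deflation: remove the component just found from the matrix.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(p)]
               for i in range(p)]
    scores = [[sum(c[j] * v[j] for j in range(p)) for v in components]
              for c in centered]
    return variances, components, scores


# Made-up data: 5 program-input pairs, 3 characteristics, the first two correlated.
data = [[1.0, 2.0, 0.5], [2.0, 4.1, 0.4], [3.0, 5.9, 0.6],
        [4.0, 8.0, 0.5], [5.0, 10.1, 0.45]]
_, cov = covariance_matrix(data)
total_variance = sum(cov[i][i] for i in range(len(cov)))
variances, components, scores = pca(data, q=2)
print([v / total_variance for v in variances])
```

Because the first two made-up characteristics are strongly correlated, almost all variance ends up in the first component, which mirrors the slide's point that most components have near-zero variance.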


PCA: Interpretation

• Interpretation
  – Principal Components (PC) along main axes of ellipse
  – Var(PC1) > Var(PC2) > ...
  – PC2 is less important to explain variation over program-input pairs
• Reduce No. of PCs
  – throw out PCs with negligible variance

[Figure: data ellipse plotted against Variable 1 and Variable 2, with PC 1 and PC 2 marked.]
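One common way to decide which components have "negligible variance" is to keep the smallest number q of components whose cumulative share of the total variance reaches a threshold. The sketch below assumes that rule; the 90% threshold, the function name `choose_q` and the variance values are hypothetical, and the slides do not state which criterion was actually used.

```python
def choose_q(variances, min_fraction=0.9):
    """Smallest q such that the q largest variances cover min_fraction of the total."""
    total = sum(variances)
    running = 0.0
    for q, var in enumerate(sorted(variances, reverse=True), start=1):
        running += var
        if running / total >= min_fraction:
            return q
    return len(variances)


# Made-up component variances (illustrative only).
example_variances = [6.0, 2.5, 1.0, 0.3, 0.2]
q = choose_q(example_variances, min_fraction=0.9)
print(q)
```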


Cluster analysis

• Hierarchic clustering

• Based on distance between program-input pairs

• Can be represented by a dendrogram
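A minimal sketch of hierarchical (agglomerative) clustering on points in the reduced PC space is given below. It assumes Euclidean distance and complete linkage; the slides do not name the distance measure or linkage rule, and the point names and coordinates are made up. The returned merge list, with one linkage distance per merge, is the information a dendrogram displays.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(points):
    """points: dict name -> coordinates.

    Returns the merges as (cluster_a, cluster_b, linkage_distance) tuples,
    in the order they happen. Complete linkage is an assumption here.
    """
    clusters = [(name,) for name in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance between the two farthest members.
                d = max(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical program-input pairs in a 2-D PC space (illustrative only).
points = {
    "a.in1": (0.0, 0.0),
    "a.in2": (0.5, 0.0),
    "b.in1": (4.0, 3.0),
    "c.in1": (4.0, 4.0),
}
merges = agglomerate(points)
for left, right, dist in merges:
    print(left, right, round(dist, 3))
```

Pairs that merge at a small linkage distance behave similarly; a merge at a large distance separates groups with different behavior.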


Methodology

• Benchmarks
  – SPECint95
    • Inputs from SPEC: train and ref
    • Inputs from the web (ijpeg)
    • Reduced inputs (compress)
  – TPC-D on postgres v6.3
  – Compiled with -O4 on Alpha
  – 79 program-input pairs
• ATOM
  – Instrumentation
  – Measuring characteristics
• STATISTICA
  – Statistical analysis


GCC: principal components

[Figure: bar chart of the weight in PC (y-axis, -1 to 1) of each workload characteristic (x-axis) for Principal Component 1 and Principal Component 2. Characteristics shown: ILP, BIMODAL, GSHARE, HYBRID, LD_ST, INT_ARIT, INT_LOGI, INT_SHIF, CTRL, BREAK, I_8KB, I_16KB, I_32KB, I_64KB, I_128KB, D_8KB, D_16KB, D_32KB, D_64KB, D_128KB.]

2 PCs: 96.9% of total variance


GCC

[Figure: scatter plot of the gcc inputs, principal component 1 (x-axis, -3 to 3) versus principal component 2 (y-axis, -2 to 5). Labeled inputs: emit-rtl, insn-emit, protoize, varasm, explow, recog, reload1, expr, cp-decl, insn-recog, print-tree, dbxout, toplev. Annotations: "High branch prediction accuracy", "High I-cache miss rates", "High D-cache miss rates", "Many control & shift insn", "Many LD/STs and ILP", "7 inputs".]


Workload space: 4 PCs -> 93.1%

[Figure: dendrogram of the program-input pairs, linkage distance axis from 0 to 14. Leaf labels: compress, go, gcc.emit-rtl + gcc.insn-recog, gcc, gcc.explow, Q6+Q12+Q13+Q15, vortex, li, Q16, Q5, m88k.ref, Q10, Q8, perl.jumble, Q3+Q7+Q9+Q11+Q14+Q17, m88ksim.train, perl.scrabbl, li.takr+li.browse+li.boyer, Q2+Q4, ijpeg, compress.100,000.]

• ijpeg, compress and go are isolated
  – Go: low branch prediction accuracy
  – Compress: high data cache miss rate
  – Ijpeg: high LD/STs rate, low ctrl ops rate


Workload space

[Figure: dendrogram of the program-input pairs by linkage distance (0 to 14), with the same leaf labels as on the preceding dendrogram slide and an annotation "strong clustering".]


Small versus large inputs

• Vortex:
  – Train: 3.2B insn
  – Ref: 92.5B insn
  – Similar behavior: linkage distance ~ 1.4
• Not for m88ksim
  – Linkage distance ~ 4
• Reference input for compress can be reduced without significantly impacting behavior: 2B vs. 60B instructions


Impact of input on behavior

• For TPC-D queries:
  – Weak clustering
  – Large impact
  – I-cache behavior
• In general: variation between programs is larger than the variation between input sets for the same program
  – However: there are exceptions where input has large impact on behavior, e.g., TPC-D and perl


Conclusion

• Workload design
  – representative
  – not long running
• Principal Components Analysis (PCA) and cluster analysis help in detecting input data sets resulting in similar or different behavior of a program
• Applications:
  – workload design: representativeness while taking into account simulation time
  – impact of input data sets on program behavior
  – profile-guided optimizations