Factorial Experiments 2^k



Page 1: Factorial Experiments2^k


Statistics

CSE 807

Page 2: Factorial Experiments2^k

Experimental Design and Analysis

How to:
• Design a proper set of experiments for measurement or simulation.
• Develop a model that best describes the data obtained.
• Estimate the contribution of each alternative to the performance.
• Isolate the measurement errors.
• Estimate confidence intervals for model parameters.
• Check if the alternatives are significantly different.
• Check if the model is adequate.

Page 3: Factorial Experiments2^k

Example

• Personal workstation design.
• Processor: 68000, Z80, or 8086.
• Memory size: 512K, 2M, or 8M bytes.
• Number of disks: one, two, three, or four.
• Workload: secretarial, managerial, or scientific.
• User education: high school, college, or post-graduate level.

Page 4: Factorial Experiments2^k

Terminology

• Response Variable: The outcome of an experiment.
  E.g., throughput, response time.
• Factors: Variables that affect the response variable.
  E.g., CPU type, memory size, number of disk drives, workload used, and user’s educational level.
  Also called predictor variables or predictors.
• Levels: The values that a factor can assume.
  E.g., the CPU type has three levels: 68000, Z80, or 8086. The number of disk drives has four levels.
  Also called treatments.

Page 5: Factorial Experiments2^k

Terminology (cont’d)

• Primary Factors: The factors whose effects need to be quantified.
  E.g., CPU type, memory size, and number of disk drives only.
• Secondary Factors: Factors that affect performance but whose impact need not be quantified.
  E.g., the workloads.
• Replication: Repetition of all or some experiments.

Page 6: Factorial Experiments2^k

Terminology (cont’d)

• Design: The number of experiments, the factor-level combination for each experiment, and the number of replications of each experiment.
  E.g., a full factorial design with 5 replications: 3 × 3 × 4 × 3 × 3 = 324 experiments, each repeated five times.
• Experimental Unit: Any entity that is used for experiments.
  E.g., users. Generally, there is no interest in comparing the units.
  Goal: minimize the impact of variation among the units.

Page 7: Factorial Experiments2^k

Terminology (cont’d)

• Interaction => The effect of one factor depends upon the level of the other.

Non-interacting factors:

         A1   A2
   B1     3    5
   B2     6    8

Interacting factors:

         A1   A2
   B1     3    5
   B2     6    9
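The two tables can be checked numerically: the factors do not interact when the change in response from A1 to A2 is the same at every level of B. The following is a minimal Python sketch of that check. The function names `effect_of_a` and `has_interaction` are hypothetical, chosen only for illustration.

```python
# Responses indexed as table[b_level][a_level], copied from the two tables.
non_interacting = {"B1": {"A1": 3, "A2": 5}, "B2": {"A1": 6, "A2": 8}}
interacting = {"B1": {"A1": 3, "A2": 5}, "B2": {"A1": 6, "A2": 9}}


def effect_of_a(table):
    """Change in response when A goes from A1 to A2, at each level of B."""
    return {b: row["A2"] - row["A1"] for b, row in table.items()}


def has_interaction(table):
    """True when the effect of A differs across the levels of B."""
    return len(set(effect_of_a(table).values())) > 1


print(effect_of_a(non_interacting), has_interaction(non_interacting))
print(effect_of_a(interacting), has_interaction(interacting))
```

In the first table the effect of A is 2 at both levels of B. In the second it is 2 at B1 and 3 at B2, which is what the slide means by interaction.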

Page 8: Factorial Experiments2^k

Common Mistakes in Experimentation

1. The variation due to experimental error is ignored.
2. Important parameters are not controlled.
3. Effects of different factors are not isolated.
4. Simple one-factor-at-a-time designs are used.
5. Interactions are ignored.
6. Too many experiments are conducted.
   Better: two phases.

Page 9: Factorial Experiments2^k

Types of Experimental Designs

• Simple Designs: Vary one factor at a time.

  – Number of experiments: $n = 1 + \sum_{i=1}^{k} (n_i - 1)$, where factor $i$ has $n_i$ levels.
  – Not statistically efficient.
  – Wrong conclusions if the factors have interaction.
  – Not recommended.

Page 10: Factorial Experiments2^k

Types of Experimental Designs (cont’d)

• Full Factorial Design: All combinations.

  – Number of experiments: $n = \prod_{i=1}^{k} n_i$
  – Can find the effect of all factors.
  – Too much time and money.
  – May try a 2^k design first.
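As a rough illustration of the two counting formulas, the sketch below compares a simple (one-factor-at-a-time) design with a full factorial design for the workstation example. The level counts come from the example slide. The function names are assumptions made for this sketch.

```python
from math import prod

# Number of levels per factor in the workstation example:
# CPU (3), memory size (3), number of disks (4), workload (3), user education (3).
levels = [3, 3, 4, 3, 3]


def simple_design_runs(levels):
    """One baseline run plus (n_i - 1) extra runs for each factor varied alone."""
    return 1 + sum(n - 1 for n in levels)


def full_factorial_runs(levels):
    """Every combination of every level of every factor."""
    return prod(levels)


print(simple_design_runs(levels))   # 12
print(full_factorial_runs(levels))  # 324
```

The simple design needs far fewer runs, but as noted above it cannot reveal interactions between factors.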

Page 11: Factorial Experiments2^k

Types of Experimental Designs (cont’d)

• Fractional Factorial Designs:
  – Save time and expense.
  – Less information.
  – May not get all interactions.
  – Not a problem if the interactions are negligible.

Page 12: Factorial Experiments2^k

A Sample Fractional Factorial Design

   Experiment   CPU     Memory   Workload      Educational
   Number               Level    Type          Level
   1            68000   512K     Managerial    High School
   2            68000   2M       Scientific    Post-graduate
   3            68000   8M       Secretarial   College
   4            Z80     512K     Scientific    College
   5            Z80     2M       Secretarial   High School
   6            Z80     8M       Managerial    Post-graduate
   7            8086    512K     Secretarial   Post-graduate
   8            8086    2M       Managerial    College
   9            8086    8M       Scientific    High School
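One way to see why these nine runs are a sensible fraction of the 3 × 3 × 3 × 3 = 81 possible combinations: for any two factors, every pair of levels appears exactly once. The Python sketch below checks that property for the table above. The helper names `pair_counts` and `is_balanced` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Rows of the sample design: (CPU, memory level, workload type, educational level).
design = [
    ("68000", "512K", "Managerial", "High School"),
    ("68000", "2M", "Scientific", "Post-graduate"),
    ("68000", "8M", "Secretarial", "College"),
    ("Z80", "512K", "Scientific", "College"),
    ("Z80", "2M", "Secretarial", "High School"),
    ("Z80", "8M", "Managerial", "Post-graduate"),
    ("8086", "512K", "Secretarial", "Post-graduate"),
    ("8086", "2M", "Managerial", "College"),
    ("8086", "8M", "Scientific", "High School"),
]


def pair_counts(design, i, j):
    """Count how often each (level of factor i, level of factor j) pair occurs."""
    return Counter((row[i], row[j]) for row in design)


def is_balanced(design):
    """True if, for every pair of factors, all level pairs occur equally often."""
    for i, j in combinations(range(len(design[0])), 2):
        counts = pair_counts(design, i, j)
        n_pairs = len({r[i] for r in design}) * len({r[j] for r in design})
        if len(counts) != n_pairs or len(set(counts.values())) != 1:
            return False
    return True


print(is_balanced(design))
```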

Page 13: Factorial Experiments2^k

Exercise

• The performance of a system being designed depends upon the following three factors:
  a. CPU type: 68000, 8086, 80286
  b. Operating system type: CPM, MS-DOS, UNIX
  c. Disk drive type: A, B, C

How many experiments are required to analyze the performance if:
  a. There is significant interaction among factors.
  b. There is no interaction among factors.
  c. The interactions are small compared to main effects.

Page 14: Factorial Experiments2^k

2^k Factorial Designs

• k factors, each at two levels.
• Easy to analyze.
• Helps in sorting out the impact of factors.
• Good at the beginning of a study.
• Valid only if the effect is unidirectional.
  E.g., memory size, the number of disk drives.

Page 15: Factorial Experiments2^k

2^2 Factorial Designs

• Two factors, each at two levels.

Performance in MIPS:

   Cache     Memory size
   size      4M bytes   16M bytes
   1K        15         45
   2K        25         75

xA = -1 if 4M bytes memory
      1 if 16M bytes memory

xB = -1 if 1K bytes cache
      1 if 2K bytes cache

Page 16: Factorial Experiments2^k

Model

y = q0 + qA xA + qB xB + qAB xA xB

Substituting the four observations:

15 = q0 - qA - qB + qAB
45 = q0 + qA - qB - qAB
25 = q0 - qA + qB - qAB
75 = q0 + qA + qB + qAB

Solving for the four unknowns:

y = 40 + 20 xA + 10 xB + 5 xA xB

Interpretation:
• Mean performance = 40 MIPS
• Effect of memory = 20 MIPS
• Effect of cache = 10 MIPS
• Interaction between memory and cache = 5 MIPS
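A quick way to confirm the fitted model is to plug the four coded level combinations back in and compare with the measured MIPS values. The Python sketch below does that. The function name `predict` and the dictionary layout are illustrative assumptions.

```python
def predict(x_a, x_b, q0=40, q_a=20, q_b=10, q_ab=5):
    """Evaluate y = q0 + qA*xA + qB*xB + qAB*xA*xB for coded levels -1 or +1."""
    return q0 + q_a * x_a + q_b * x_b + q_ab * x_a * x_b


# Observed MIPS keyed by (xA, xB): xA codes memory size, xB codes cache size.
observed = {(-1, -1): 15, (1, -1): 45, (-1, 1): 25, (1, 1): 75}

for (x_a, x_b), y in observed.items():
    print(x_a, x_b, predict(x_a, x_b), y)
```

Because the model has four parameters and there are four observations, it reproduces every observation exactly and leaves no residual.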

Page 17: Factorial Experiments2^k

Computation of Effects

   Experiment    A    B    y
   1            -1   -1    y1
   2             1   -1    y2
   3            -1    1    y3
   4             1    1    y4

Model: y = q0 + qA xA + qB xB + qAB xA xB

Substitution:
y1 = q0 - qA - qB + qAB
y2 = q0 + qA - qB - qAB
y3 = q0 - qA + qB - qAB
y4 = q0 + qA + qB + qAB

Page 18: Factorial Experiments2^k

Computation of Effects (cont’d)

Solution:
q0  = 1/4 (y1 + y2 + y3 + y4)
qA  = 1/4 (-y1 + y2 - y3 + y4)
qB  = 1/4 (-y1 - y2 + y3 + y4)
qAB = 1/4 (y1 - y2 - y3 + y4)

Notice that the effects are linear combinations of the responses.
For qA, qB, and qAB the sum of the coefficients is zero => contrasts.

Notice (each right-hand side is the sum of the elementwise products of the named columns, divided by 4):
qA  = Column A x Column y
qB  = Column B x Column y
qAB = Column A x Column B x Column y

Page 19: Factorial Experiments2^k

Sign Table Method

     I     A     B    AB      y
     1    -1    -1     1     15
     1     1    -1    -1     45
     1    -1     1    -1     25
     1     1     1     1     75
   160    80    40    20     Total
    40    20    10     5     Total/4
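The sign table method can be sketched in a few lines of Python: multiply each sign column elementwise by the response column, sum, and divide by the number of runs. The variable names below are illustrative and not part of the original slides.

```python
# Sign columns of a 2^2 design in the order used by the table above.
col_a = [-1, 1, -1, 1]
col_b = [-1, -1, 1, 1]
y = [15, 45, 25, 75]  # measured MIPS for the four runs

columns = {
    "I": [1] * len(y),
    "A": col_a,
    "B": col_b,
    "AB": [a * b for a, b in zip(col_a, col_b)],  # AB column = A column times B column
}

# "Total" row: inner product of each sign column with the response column.
totals = {name: sum(s * v for s, v in zip(col, y)) for name, col in columns.items()}
# "Total/4" row: the effects q0, qA, qB, qAB.
effects = {name: total / len(y) for name, total in totals.items()}

print(totals)
print(effects)
```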

Page 20: Factorial Experiments2^k

Allocation of Variation

• Importance of a factor = proportion of the variation explained.
• Sample variance of y:
  $s_y^2 = \frac{\sum_{i=1}^{2^2} (y_i - \bar{y})^2}{2^2 - 1}$
• Variation of y = numerator of the sample variance = sum of squares total (SST):
  $SST = \sum_{i=1}^{2^2} (y_i - \bar{y})^2$

Page 21: Factorial Experiments2^k

Allocation of Variation (cont’d)

For a 2^2 design:

$SST = 2^2 q_A^2 + 2^2 q_B^2 + 2^2 q_{AB}^2$

• Variation due to A: $SSA = 2^2 q_A^2$
• Variation due to B: $SSB = 2^2 q_B^2$
• Variation due to interaction: $SSAB = 2^2 q_{AB}^2$

SST = SSA + SSB + SSAB

Fraction of variation explained by A = SSA / SST

Note: variation ≠ variance.

Page 22: Factorial Experiments2^k

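Derivation

Model: $y_i = q_0 + q_A x_{Ai} + q_B x_{Bi} + q_{AB} x_{Ai} x_{Bi}$

Notice:
1. The sum of entries in each column is zero:
   $\sum_{i=1}^{4} x_{Ai} = 0; \quad \sum_{i=1}^{4} x_{Bi} = 0; \quad \sum_{i=1}^{4} x_{Ai} x_{Bi} = 0$
2. The sum of the squares of entries in each column is 4:
   $\sum_{i=1}^{4} x_{Ai}^2 = 4$
   $\sum_{i=1}^{4} x_{Bi}^2 = 4$
   $\sum_{i=1}^{4} (x_{Ai} x_{Bi})^2 = 4$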

Page 23: Factorial Experiments2^k

Derivation (cont’d)

3. The columns are orthogonal (the inner product of any two columns is zero):
   $\sum_{i=1}^{4} x_{Ai} x_{Bi} = 0$
   $\sum_{i=1}^{4} x_{Ai} (x_{Ai} x_{Bi}) = 0$
   $\sum_{i=1}^{4} x_{Bi} (x_{Ai} x_{Bi}) = 0$
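The three properties used in the derivation (zero column sums, sums of squares equal to 4, and pairwise orthogonality) can be verified directly on the sign columns of the 2^2 design. This is a small Python sketch, and the helper `dot` is an assumption introduced for it.

```python
from itertools import combinations

x_a = [-1, 1, -1, 1]
x_b = [-1, -1, 1, 1]
x_ab = [a * b for a, b in zip(x_a, x_b)]
columns = {"A": x_a, "B": x_b, "AB": x_ab}


def dot(u, v):
    """Inner product of two equal-length columns."""
    return sum(a * b for a, b in zip(u, v))


# Property 1: each column sums to zero.
column_sums = {name: sum(col) for name, col in columns.items()}
# Property 2: the sum of squares of each column is 4.
sums_of_squares = {name: dot(col, col) for name, col in columns.items()}
# Property 3: the inner product of any two distinct columns is zero.
inner_products = {
    (m, n): dot(columns[m], columns[n]) for m, n in combinations(columns, 2)
}

print(column_sums)
print(sums_of_squares)
print(inner_products)
```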

Page 24: Factorial Experiments2^k

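Derivation (cont’d)

Sample mean:

$\bar{y} = \frac{1}{4} \sum_{i=1}^{4} y_i$
$= \frac{1}{4} \sum_{i=1}^{4} (q_0 + q_A x_{Ai} + q_B x_{Bi} + q_{AB} x_{Ai} x_{Bi})$
$= \frac{1}{4} \sum_{i=1}^{4} q_0 + \frac{1}{4} q_A \sum_{i=1}^{4} x_{Ai} + \frac{1}{4} q_B \sum_{i=1}^{4} x_{Bi} + \frac{1}{4} q_{AB} \sum_{i=1}^{4} x_{Ai} x_{Bi}$
$= q_0$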

Page 25: Factorial Experiments2^k

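Derivation (cont’d)

Variation of y:

$\sum_{i=1}^{4} (y_i - \bar{y})^2 = \sum_{i=1}^{4} (q_A x_{Ai} + q_B x_{Bi} + q_{AB} x_{Ai} x_{Bi})^2$
$= \sum_{i=1}^{4} (q_A x_{Ai})^2 + \sum_{i=1}^{4} (q_B x_{Bi})^2 + \sum_{i=1}^{4} (q_{AB} x_{Ai} x_{Bi})^2 + \text{product terms}$
$= q_A^2 \sum_{i=1}^{4} x_{Ai}^2 + q_B^2 \sum_{i=1}^{4} x_{Bi}^2 + q_{AB}^2 \sum_{i=1}^{4} (x_{Ai} x_{Bi})^2 + 0$
$= 4 q_A^2 + 4 q_B^2 + 4 q_{AB}^2$

The product terms vanish because the columns are orthogonal.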

Page 26: Factorial Experiments2^k

Example

Memory-cache study:

$\bar{y} = \frac{1}{4} (15 + 45 + 25 + 75) = 40$

Total variation:

$\sum_{i=1}^{4} (y_i - \bar{y})^2 = 25^2 + 5^2 + 15^2 + 35^2 = 2100$
$= 4 \times 20^2 + 4 \times 10^2 + 4 \times 5^2$

Total variation = 2100
Variation due to memory = 1600 (76%)
Variation due to cache = 400 (19%)
Variation due to interaction = 100 (5%)
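The allocation of variation for the memory-cache study can be reproduced with the short Python sketch below. It computes SST directly from the responses and again from the effects, then reports the fraction explained by each term. The variable names are illustrative.

```python
y = [15, 45, 25, 75]                   # responses of the memory-cache study
effects = {"A": 20, "B": 10, "AB": 5}  # qA (memory), qB (cache), qAB (interaction)

mean = sum(y) / len(y)
# SST computed directly as the sum of squared deviations from the mean.
sst_direct = sum((v - mean) ** 2 for v in y)

# SSA, SSB, SSAB: number of runs (2^2 = 4) times the squared effect.
ss = {name: len(y) * q ** 2 for name, q in effects.items()}
sst_from_effects = sum(ss.values())

fractions = {name: value / sst_from_effects for name, value in ss.items()}

print(sst_direct, sst_from_effects)
for name, frac in fractions.items():
    print(name, ss[name], f"{100 * frac:.0f}%")
```

Both routes give the same total, which is the identity SST = SSA + SSB + SSAB derived on the preceding slides.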

Page 27: Factorial Experiments2^k

Case Study: Interconnection Networks

Memory interconnection networks: Omega and Crossbar.
Memory reference patterns: Random and Matrix.

Fixed factors:
1. Number of processors was fixed at 16.
2. Queued requests were not buffered but blocked.
3. Circuit switching instead of packet switching.
4. Random arbitration instead of round robin.
5. Infinite interleaving of memory => no memory bank contention.

Page 28: Factorial Experiments2^k

2^2 Design for Interconnection Networks

Factors Used in the Interconnection Network Study

   Symbol   Factor                  Level -1    Level 1
   A        Type of the network     Crossbar    Omega
   B        Address pattern used    Random      Matrix

Responses

    A    B    Throughput T   90% Transit N   Response R
   -1   -1    0.6041         3               1.655
    1   -1    0.4220         5               2.378
   -1    1    0.7922         2               1.262
    1    1    0.4717         4               2.190

Page 29: Factorial Experiments2^k

Interconnection Network Study (cont’d)

               Mean Estimate                 Variation Explained
   Parameter   T         N      R            T       N     R
   q0           0.5725    3.5    1.871
   qA           0.0595   -0.5   -0.145       17.2%   20%   10.9%
   qB          -0.1257    1.0    0.413       77.0%   80%   87.8%
   qAB         -0.0346    0.0    0.051        5.8%    0%    1.3%

Page 30: Factorial Experiments2^k

Interpretation of Results

• Average throughput = 0.5725.
• Most effective factor = B = reference pattern
  => The address patterns chosen are very different.
• Effect of reference pattern = -0.1257; it explains 77% of the variation.
• Effect of network type = 0.0595:
  Omega networks = Average + 0.0595
  Crossbar networks = Average - 0.0595
  Difference between the two = 0.119
• Slight interaction (qAB = -0.0346) between reference pattern and network type.
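The estimates and percentages of this case study can be recomputed with the sign table method, as in the Python sketch below. Two assumptions are made and labeled in the code. First, the throughput of the first run is taken as 0.6041, the value implied by the reported mean of 0.5725. Second, the sign columns are assigned so that the output matches the reported estimates (A constant over the first two runs, B alternating); reading the columns the other way round swaps the A and B estimates. The function name `analyze` is hypothetical.

```python
# ASSUMPTION: column assignment chosen to reproduce the reported estimates.
x_a = [-1, -1, 1, 1]   # network type: -1 = Crossbar, +1 = Omega
x_b = [-1, 1, -1, 1]   # address pattern: -1 = Random, +1 = Matrix

responses = {
    # ASSUMPTION: first throughput value is 0.6041 (consistent with mean 0.5725).
    "T": [0.6041, 0.4220, 0.7922, 0.4717],
    "N": [3, 5, 2, 4],
    "R": [1.655, 2.378, 1.262, 2.190],
}


def analyze(y):
    """Return (effects, percent of variation explained) for one response."""
    n = len(y)
    effects = {
        "q0": sum(y) / n,
        "qA": sum(a * v for a, v in zip(x_a, y)) / n,
        "qB": sum(b * v for b, v in zip(x_b, y)) / n,
        "qAB": sum(a * b * v for a, b, v in zip(x_a, x_b, y)) / n,
    }
    ss = {k: n * effects[k] ** 2 for k in ("qA", "qB", "qAB")}
    sst = sum(ss.values())
    percent = {k: 100 * v / sst for k, v in ss.items()}
    return effects, percent


for name, y in responses.items():
    effects, percent = analyze(y)
    print(name, {k: round(v, 4) for k, v in effects.items()},
          {k: round(v, 1) for k, v in percent.items()})
```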

Page 31: Factorial Experiments2^k

General 2^k Factorial Designs

k factors at two levels each.
2^k experiments.
2^k effects (counting the mean q0):
   k main effects
   $\binom{k}{2}$ two-factor interactions
   $\binom{k}{3}$ three-factor interactions
   ...
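The count of effects follows from the binomial coefficients. A brief Python sketch (the function name `effect_counts` is hypothetical) tabulates them for k = 3 and confirms that, together with the mean, they add up to 2^k.

```python
from math import comb


def effect_counts(k):
    """Number of j-factor effects in a 2^k design, for j = 1..k."""
    return {j: comb(k, j) for j in range(1, k + 1)}


counts = effect_counts(3)
print(counts)                    # {1: 3, 2: 3, 3: 1}
print(1 + sum(counts.values()))  # 8, that is 2^3 when the mean q0 is counted
```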

Page 32: Factorial Experiments2^k

2^k Design Example

Three factors in designing a machine:
• Cache size
• Memory size
• Number of processors

   Factor                      Level -1   Level 1
   A   Memory size             4MB        16MB
   B   Cache size              1kB        2kB
   C   Number of processors    1          2

Page 33: Factorial Experiments2^k

2^k Design Example (cont’d)

                 4M Bytes             16M Bytes
   Cache size    1 Proc    2 Proc     1 Proc    2 Proc
   1K Byte       14        46         22        58
   2K Byte       10        50         34        86

     I     A     B     C    AB    AC    BC   ABC      y
     1    -1    -1    -1     1     1     1    -1     14
     1     1    -1    -1    -1    -1     1     1     22
     1    -1     1    -1    -1     1    -1     1     10
     1     1     1    -1     1    -1    -1    -1     34
     1    -1    -1     1     1    -1    -1     1     46
     1     1    -1     1    -1     1    -1    -1     58
     1    -1     1     1    -1    -1     1    -1     50
     1     1     1     1     1     1     1     1     86
   320    80    40   160    40    16    24     8     Total
    40    10     5    20     5     2     3     1     Total/8

Page 34: Factorial Experiments2^k

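Analysis

$SST = 2^3 (q_A^2 + q_B^2 + q_C^2 + q_{AB}^2 + q_{AC}^2 + q_{BC}^2 + q_{ABC}^2)$
$= 8 (10^2 + 5^2 + 20^2 + 5^2 + 2^2 + 3^2 + 1^2)$
$= 800 + 200 + 3200 + 200 + 32 + 72 + 8 = 4512$

Fractions of variation explained, in the same order (A, B, C, AB, AC, BC, ABC):
18% + 4% + 71% + 4% + 1% + 2% + 0% = 100%

Number of Processors (C) is the most important factor.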
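The same analysis generalizes to any 2^k design. The Python sketch below builds the sign table for k factors, computes all effects, and allocates the variation. It is a minimal illustration under the assumption that the responses are listed in the standard order used above (A alternating fastest). The function names `sign_table` and `analyze` are hypothetical.

```python
from itertools import combinations, product
from math import prod


def sign_table(factors):
    """Return {effect name: sign column} for a 2^k design, first factor alternating fastest."""
    k = len(factors)
    # product() varies its last element fastest, so reverse each row.
    rows = [levels[::-1] for levels in product((-1, 1), repeat=k)]
    table = {"I": [1] * len(rows)}
    for size in range(1, k + 1):
        for idx in combinations(range(k), size):
            name = "".join(factors[i] for i in idx)
            # An interaction column is the elementwise product of its factor columns.
            table[name] = [prod(row[i] for i in idx) for row in rows]
    return table


def analyze(factors, y):
    """Return (effects, sums of squares, SST) for responses y in standard order."""
    n = len(y)
    table = sign_table(factors)
    effects = {name: sum(s * v for s, v in zip(col, y)) / n
               for name, col in table.items()}
    ss = {name: n * q ** 2 for name, q in effects.items() if name != "I"}
    return effects, ss, sum(ss.values())


y = [14, 22, 10, 34, 46, 58, 50, 86]
effects, ss, sst = analyze("ABC", y)
print(effects)
for name, value in ss.items():
    print(name, value, f"{100 * value / sst:.0f}%")
```

For this example the largest share of the variation goes to C, in line with the conclusion on the slide.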

Page 35: Factorial Experiments2^k

Exercise

Analyze the 2^3 design:

            A1             A2
         C1     C2      C1     C2
   B1    100    15      120    10
   B2    40     30      20     50

a. Quantify main effects and all interactions.
b. Quantify percentages of variation explained.
c. Sort the variables in the order of decreasing importance.