
1

Design of Experiments

2

DESIGN OF EXPERIMENTS

Purposeful changes of the inputs (factors) to a process in order to observe corresponding changes in the output (response).

Inputs → Process → Outputs

Douglas Montgomery, Design and Analysis of Experiments

3

Why use DOE ?

• Provides a basis for action -- allows purposeful changes.

• An analytic study -- one in which action will be taken on a cause-and-effect system to improve performance of a product or process in the future.

• Follows the scientific approach to problem solving.

• Provides a way to measure natural variation.

• Permits the clear analysis of complex effects.

• The most efficient way to derive the required information with the least expenditure of resources.

Moen, Nolan and Provost, Improving Quality Through Planned Experimentation

4

Interactions

Varying factors together vs. one at a time.

[Figure: a two-factor design region with - and + levels, contrasting one-factor-at-a-time moves with varying both factors together]

George Box, Do Interactions Really Matter, Quality Engineering, 1990.

5

[Figure: the same two-factor design region, now showing the combination found by varying both factors together]

Voila!

George Box, Do Interactions Really Matter, Quality Engineering, 1990.

6

Industry Example

• Experiment run at SKF -- the largest producer of rolling bearings in the world.

• Looked at three factors: heat treatment, outer ring osculation and cage design.

• Results:

• choice of cage design did not matter (contrary to previously accepted folklore -- considerable savings)

• life of the bearing increased fivefold if osculation and heat treatment are increased together -- saved millions of dollars!

George Box, Do Interactions Really Matter, Quality Engineering, 1990.

7

• Bearings like this have been made for decades. Why did it take so long to discover this improvement? One-factor-at-a-time vs. interaction effects!

[Cube plot: relative bearing life at the eight combinations of Osculation, Cage and Heat -- values 128, 16, 19, 21, 26, 85, 17 and 25]

George Box, Do Interactions Really Matter, Quality Engineering, 1990.

8

[Two-factor plot: average bearing life for Osculation × Heat -- values 106, 21, 18 and 23]

The Power of Interactions!

George Box, Do Interactions Really Matter, Quality Engineering, 1990.

9

2² Design Example

Consider an investigation into the effect of the concentration of the reactant and the amount of catalyst on the reaction time of a chemical process.

                        L        H

reactant (factor A)     15%      25%

catalyst (factor B)     1 bag    2 bags

Douglas Montgomery, Design and Analysis of Experiments

10

Design Matrix for 2²

 A     B     AB
 -     -     +
 +     -     -
 -     +     -
 +     +     +

Main effects: A, B     Interaction: AB

11

                      Replicates
Factor
Settings        I     II    III    Total

A- B-          28     25     27      80
A+ B-          36     32     32     100
A- B+          18     19     23      60
A+ B+          31     30     29      90

Douglas Montgomery, Design and Analysis of Experiments

12

An effect is the difference in the average response at one level of the factor versus the other level of the factor.

Treatment totals (over the 3 replicates):

          A-       A+
B+        60       90
B-        80      100

A effect = ([90 + 100] - [60 + 80]) / (2 × 3) = 50 / 6 = 8.33

Douglas Montgomery, Design and Analysis of Experiments
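For readers following along in software, here is a minimal sketch of the same arithmetic in Python (plain standard library; the data values are from the slide, the variable names are ours):

    # A effect: average response at A+ minus average response at A-,
    # computed from the treatment totals with n = 3 replicates each.
    n = 3
    a_plus_totals = [100, 90]    # totals for (A+, B-) and (A+, B+)
    a_minus_totals = [60, 80]    # totals for (A-, B+) and (A-, B-)

    effect_A = (sum(a_plus_totals) - sum(a_minus_totals)) / (2 * n)
    print(effect_A)              # 8.333...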

13

Use a matrix to find the effects of each factor, including the interaction effect between the two factors.

 A     B     AB    Total    Average
 -     -     +       80      26.7
 +     -     -      100      33.3
 -     +     -       60      20
 +     +     +       90      30

           A
Avg +     31.7
Avg -     23.3
Effect     8.4

Douglas Montgomery, Design and Analysis of Experiments

14

Completing the matrix with the effect calculations:

 A     B     AB    Total    Average
 -     -     +       80      26.7
 +     -     -      100      33.3
 -     +     -       60      20
 +     +     +       90      30

            A        B       AB
Avg +     31.7      25      28.3
Avg -     23.3      30      26.7
Effect     8.4      -5       1.7

Douglas Montgomery, Design and Analysis of Experiments
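A minimal Python sketch (plain standard library; names are ours) that reproduces the Avg +, Avg - and Effect rows from the treatment totals, using each column's signed contrast divided by 2n:

    n = 3
    totals = {"(1)": 80, "a": 100, "b": 60, "ab": 90}   # totals over 3 replicates

    signs = {                  # (A, B, AB) signs for each treatment combination
        "(1)": (-1, -1, +1),
        "a":   (+1, -1, -1),
        "b":   (-1, +1, -1),
        "ab":  (+1, +1, +1),
    }

    for k, name in enumerate(["A", "B", "AB"]):
        contrast = sum(signs[t][k] * totals[t] for t in totals)
        print(name, round(contrast / (2 * n), 2))
    # A 8.33, B -5.0, AB 1.67 (the slide shows 8.4 and 1.7 from rounded averages)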

15

Dot Diagram

[Dot diagram of the three effects on a scale from -10 to +10: B near -5, AB near 1.7, A near 8.3]

Douglas Montgomery, Design and Analysis of Experiments

16

Response Plots

[Main-effect plots: average response (20 to 35) at the - and + levels of A, and at the - and + levels of B]

Douglas Montgomery, Design and Analysis of Experiments

17

Interaction Response Plot

Cell averages:

          B-       B+
A-       26.7      20
A+       33.3      30

[Plot: average response (20 to 35) at the - and + levels of A, with separate lines for B- and B+]

Douglas Montgomery, Design and Analysis of Experiments

18

Normal Probability Plots

• Effects are the differences between two averages.

• As we know, the distribution of averages is approximately normal.

• NPP can be used to identify the effects that are different from noise.

Soren Bisgaard, A Practical Introduction to Experimental Design

19

Construction of NPP

• Can be constructed with effects on horizontal and cumulative percentages on vertical -- but this requires normal probability paper.

• Can also be constructed using the inverse standard normal of the plotting point ( (i - .5) / n ).

• Look for effects that are different from plotted ‘vertical’ reference line.

Soren Bisgaard, A Practical Introduction to Experimental Design

20

Steps in constructing NPP

1. Compute effects.

2. Order effects from smallest to largest.

3. Let i be the order number (1 to n).

4. Calculate probability plotting position of the ordered effect using the formula ( p = [i - .5]/n).

5. Using a standard normal table determine the Z value corresponding to each left tail probability of step 4.

6. Plot the effects on horizontal axis and Z on vertical.

7. Fit a line through the most points.

8. Those ‘off the line’ are significant effects.

Soren Bisgaard, A Practical Introduction to Experimental Design
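A minimal Python sketch of steps 2 through 6 above (standard library only, Python 3.8+; the effect values are those of the 2² example, the names are ours):

    # Order the effects, compute plotting positions p = (i - 0.5)/n, and
    # convert them to Z with the inverse standard-normal CDF.
    from statistics import NormalDist

    effects = {"A": 8.33, "B": -5.0, "AB": 1.67}
    ordered = sorted(effects.items(), key=lambda kv: kv[1])
    n = len(ordered)

    for i, (name, eff) in enumerate(ordered, start=1):
        p = (i - 0.5) / n                     # plotting position
        z = NormalDist().inv_cdf(p)           # left-tail Z value
        print(f"{name:>3}  effect={eff:6.2f}  p={p:.2f}  z={z:5.2f}")

With seven effects (the 2³ exercise later in the deck), the same loop reproduces the i / P / Z table shown on the normal probability plot slide.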

21


22

Plot the reference line through the majority of points. Look for effects which are off this line.

23

Prediction Equation

The ‘intercept’ in the equation is the overall average of all observations.

The coefficients of the factors in the model are 1/2 the effect.

Y = 27.5 + (8.33/2) A - (5/2) B + (1.7/2) AB

or

Y = 27.5 + 4.165 A - 2.5 B + 0.85 AB

note: A and B will be values between -1 and +1.
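A minimal Python sketch of the prediction equation in coded units (the function name is ours):

    def predict(A, B):
        # A and B are coded factor settings between -1 and +1.
        return 27.5 + 4.165 * A - 2.5 * B + 0.85 * (A * B)

    print(predict(+1, -1))   # ~33.3, the observed A+ B- cell average
    print(predict(-1, +1))   # ~20.0, the observed A- B+ cell average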

24

Analysis of Variance

Source of     Sum of      Degrees of     Mean
Variation     Squares     Freedom        Square       F

A             208.33       1             208.33       53.15 *
B              75.00       1              75.00       19.13 *
AB              8.33       1               8.33        2.13
Error          31.34       8               3.92

Total         323.00      11

* = significant at 1% (see F table)

25

Calculating SS, df and MS for Effects and Interactions

Source of     Sum of      Degrees of     Mean
Variation     Squares     Freedom        Square       F

A             208.33       1             208.33       53.15 *

SS = Effect² × n = 8.33² × 3, where n = number of replicates

df = 1 (always 1 for this type of design)

MS = SS / df

Use this same process for A, B and AB.

26

Calculating total sum of squares and total degrees of freedom

Source of     Sum of      Degrees of     Mean
Variation     Squares     Freedom        Square       F

Total         323.00      11

This is found by adding up every squared observation and then subtracting what is called a correction factor (sum all the observations, square this amount, then divide by the number of observations).

SST = 28² + 25² + 27² + ... + 29² - (330² / 12) = 9398.0 - 9075.0 = 323.0

Total df = n - 1 = 12 - 1 = 11

27

Calculating error sum of squares, df and mean square

Source of     Sum of      Degrees of     Mean
Variation     Squares     Freedom        Square       F

Error          31.34       8              3.92

SS found by subtraction: Total SS - SS(A) - SS(B) - SS(AB) = 323 - 208.33 - 75.0 - 8.33 = 31.34

df found by subtraction: Total df - A df - B df - AB df = 11 - 1 - 1 - 1 = 8

MS = SS / df
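A minimal Python sketch (plain standard library; names are ours) that reproduces the sums of squares on these slides directly from the raw data of the 2² example with n = 3 replicates:

    data = {
        (-1, -1): [28, 25, 27],
        (+1, -1): [36, 32, 32],
        (-1, +1): [18, 19, 23],
        (+1, +1): [31, 30, 29],
    }
    n = 3
    obs = [y for ys in data.values() for y in ys]

    def effect(col):                 # col maps (a, b) to the column's sign
        contrast = sum(col(a, b) * sum(ys) for (a, b), ys in data.items())
        return contrast / (2 * n)

    ss = {name: effect(col) ** 2 * n         # SS = effect^2 * n for a 2^2 design
          for name, col in [("A", lambda a, b: a),
                            ("B", lambda a, b: b),
                            ("AB", lambda a, b: a * b)]}

    correction = sum(obs) ** 2 / len(obs)                      # 330^2 / 12
    ss["Total"] = sum(y ** 2 for y in obs) - correction        # 323.0
    ss["Error"] = ss["Total"] - ss["A"] - ss["B"] - ss["AB"]   # 31.33
    df_error = (len(obs) - 1) - 3                              # 11 - 1 - 1 - 1 = 8

    print({k: round(v, 2) for k, v in ss.items()})
    print("MS error =", round(ss["Error"] / df_error, 2))      # 3.92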

28

Calculating F ratios

Source of     Mean
Variation     Square       F

A             208.33       53.15 *
B              75.00       19.13 *
AB              8.33        2.13
Error           3.92

F = MS (A or B or AB) / MS (error):

208.33 / 3.92 = 53.15
 75.00 / 3.92 = 19.13
  8.33 / 3.92 =  2.13

Compare to F table.

29

Interpreting F ratios

F table at numerator df = 1 and denominator df = 8:

F.25  =  1.54
F.10  =  3.46
F.05  =  5.32
F.025 =  7.57
F.01  = 11.26

• The F ratios confirm that factors A and B are significant at the 1% level.

• The F ratio for AB shows there is not a significant interaction.
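A minimal sketch, assuming SciPy is available, that turns the mean squares into F ratios and reproduces the tabled critical values for numerator df = 1 and denominator df = 8:

    from scipy.stats import f

    ms = {"A": 208.33, "B": 75.00, "AB": 8.33}
    ms_error = 3.92

    for name, value in ms.items():
        print(name, round(value / ms_error, 2))          # 53.15, 19.13, 2.13

    for alpha in (0.25, 0.10, 0.05, 0.025, 0.01):
        print(alpha, round(f.ppf(1 - alpha, 1, 8), 2))   # 1.54, 3.46, 5.32, 7.57, 11.26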

30

Exercise

You will conduct a 2³ experiment with 2 replicates.

Factors:              L    H

A -- Tower            3    5

B -- Front Stop       0    2

C -- Back Stop        5    7

31

Requirements:

1. Collect data -- total of 16 observations (random order).

2. Fill in matrix and compute effects.

3. Put averages on a cube plot.

4. Plot effects on dot plot and normal probability plot.

5. Create appropriate response plots for significant interactions and main effects.

6. Interpret results and make recommendations to management.

32

Design Matrix

                                         Replicates
Run   A   B   C   AB   AC   BC   ABC     I    II    Total    Average

1     -   -   -   +    +    +    -
2     +   -   -   -    -    +    +
3     -   +   -   -    +    -    +
4     +   +   -   +    -    -    -
5     -   -   +   +    -    -    +
6     +   -   +   -    +    -    -
7     -   +   +   -    -    +    -
8     +   +   +   +    +    +    +

Avg +
Avg -
Effect
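A minimal Python sketch (plain standard library; names are ours) that generates the sign columns of this 2³ matrix in the same run order (A varying fastest), with each interaction column obtained by multiplying its parent sign columns:

    sign = {+1: "+", -1: "-"}
    runs = [(a, b, c) for c in (-1, +1) for b in (-1, +1) for a in (-1, +1)]

    print("Run  A B C AB AC BC ABC")
    for run, (a, b, c) in enumerate(runs, start=1):
        cols = [a, b, c, a * b, a * c, b * c, a * b * c]
        print(run, *[sign[x] for x in cols])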

33

Cube Plot

34

Response Plots

[Blank response-plot template: average response at the - and + levels of a factor]

35

Normal Probability Plot (Z vs. Effect)

i      P       Z
1     0.07    -1.5
2     0.21    -0.8
3     0.36    -0.4
4     0.50     0
5     0.64     0.4
6     0.79     0.8
7     0.93     1.5

36

ANOVA table

Source of     Sum of      Degrees of     Mean
Variation     Squares     Freedom        Square       F

A

B

C

AB

AC

BC

ABC

Error

Total

note: for a 2³ design, SS = effect² × 2n

37

Why use 2ᵏ designs?

• Easy to use and data analysis can be performed using graphical methods.

• Relatively few runs required.

• 2ᵏ designs have been found to meet the majority of the experimental needs of those involved in the improvement of quality.

• 2ᵏ designs are easy to use in sequential experimentation.

• Fractions of the 2ᵏ (fractional factorials) can be used to further reduce the experiment size.

Moen, Nolan and Provost, Improving Quality Through Planned Experimentation

38

A review of the concepts behind Analysis of Variance

39

Analysis of Variance

• ANOVA is used to compare the means of two or more populations.

• The procedure is based on comparing the spread (variance) between the sample averages with the spread within the samples.

• Possibly the most widely used procedure across disciplines.

40

Example

Consider a cereal manufacturer who wants to evaluate the impact on sales of four package designs. Ten stores are randomly assigned to one of the designs and sales data are collected for a given period.

This type of design is called a Completely Randomized Design.

Package            Store sales                          Number
Design         1      2      3      Total    Mean     of stores

1             12     18              30       15          2
2             14     12     13       39       13          3
3             19     17     21       57       19          3
4             24     30              54       27          2

All designs                         180       18         10

41

It’s called analysis of VARIANCE!

Recall that variance is the "almost average" (the divisor is n - 1 rather than n) of the squared differences of a set of data around its mean.

For this set of data then, we have:

(12 - 18)² + (14 - 18)² + ... + (21 - 18)² = 304 units of variation
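A minimal Python sketch (plain standard library; names are ours) of this total-variation arithmetic:

    sales = [12, 18, 14, 12, 13, 19, 17, 21, 24, 30]   # all ten stores
    grand_mean = sum(sales) / len(sales)               # 18.0
    sst = sum((y - grand_mean) ** 2 for y in sales)
    print(sst)                                         # 304.0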

42

Variation

What can account for this variation?

• the type of package design (SSB) -- the treatment

• everything else (SSE) -- error

The total variation can be expressed in this relationship:

SST = SSB + SSE

43

Why didn’t we sell the same amount of each package type?

Why not the same at each store for package 1? (or 2?, or 3?, or 4?)

----- thousands of extraneous factors!

SSE = (12 - 15)² + (18 - 15)²                        package 1

    + (14 - 13)² + (12 - 13)² + (13 - 13)²           package 2

    + (19 - 19)² + (17 - 19)² + (21 - 19)²           package 3

    + (24 - 27)² + (30 - 27)²                        package 4

    = 46 total units of variation

44

What number best represents the long-term average for package 1?

The package 1 average (just as the package 2 average does for package 2, and so forth).

How do these package averages vary from the overall average?

(15 - 18)² + (13 - 18)² + (19 - 18)² + (27 - 18)² = 116

But we must weight each one by the number of observations in that average -- that gives us 2(9) + 3(25) + 3(1) + 2(81) = 258

(note: 304 = 258 + 46)

45

OK -- we’ve got some ‘sum of squares’

But this procedure is called ‘analysis of variance’

FACT: variance = sum of squares divided by appropriate df

Let's organize our results so far in an ANOVA table:

Sources of variation     Sum of squares     df     Variance

SSB                           258             3      86
SSE                            46             6      7.67

Total                         304             9
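A minimal Python sketch (plain standard library; names are ours) that reproduces this table by decomposing the 304 units of total variation into SSB and SSE and dividing each by its degrees of freedom:

    groups = {1: [12, 18], 2: [14, 12, 13], 3: [19, 17, 21], 4: [24, 30]}
    all_obs = [y for ys in groups.values() for y in ys]
    grand_mean = sum(all_obs) / len(all_obs)                        # 18

    sse = sum((y - sum(ys) / len(ys)) ** 2
              for ys in groups.values() for y in ys)                # 46
    ssb = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
              for ys in groups.values())                            # 258

    df_b = len(groups) - 1                                          # 3
    df_e = len(all_obs) - len(groups)                               # 6
    print("SSB", ssb, "variance", round(ssb / df_b, 2))             # 258, 86.0
    print("SSE", sse, "variance", round(sse / df_e, 2))             # 46, 7.67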

46

degrees of freedom

As the name implies, this is the number of things that are free to vary and still give the same result. For example, if I told you the average of five numbers is 7, you could pick any four of the numbers freely, and I could always pick the fifth so that the average is 7.

Generally speaking, the df will be one less than the number of things being compared. For example,

SSB df = 4(package designs) - 1 = 3

SSE df = (2 - 1) + (3 - 1) + (3 - 1) + (2 - 1) = 6

Total df = 10 - 1 = 9

47

A ratio of variances

We next form a ratio of variances = 86 / 7.67 = 11.2

We need a reference distribution to evaluate this -- we compare it to the Fisher distribution (the F distribution).

A Short Fisher Table

df for           Degree of          df for Numerator
Denominator      Confidence        1         2         3

2                   95           18.51     19.00     19.16
                    99           98.50     99.00     99.17

6                   95            5.99      5.14      4.76
                    99           13.75     10.92      9.78
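A minimal sketch, assuming SciPy is available, of this comparison of the variance ratio to the F distribution with 3 and 6 degrees of freedom:

    from scipy.stats import f

    ratio = 86 / 7.67                           # about 11.2
    print(round(f.ppf(0.95, 3, 6), 2))          # 4.76  (95% point)
    print(round(f.ppf(0.99, 3, 6), 2))          # 9.78  (99% point)
    print(ratio > f.ppf(0.99, 3, 6))            # True: significant at the 1% level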

48

Making a decision

• We are really carrying out a hypothesis test.

Our H₀ is that the package design means are equal.

H₀: μ₁ = μ₂ = μ₃ = μ₄

Our Hₐ is that at least one mean is different from the rest.

• We can make two decisions with our data:

1. There is no difference in means, and this is one of those fewer-than-1-in-100 chances of getting a ratio this large.

2. The null is false -- we reject the null and accept Hₐ.

49

Summary of ANOVA concept

1. Decompose the total sum of squares.

2. Convert sum of squares into variances.

3. Compute variance ratio and compare to F table.

50

Assumptions!

1. Populations being compared are normally distributed -- moderate departures OK -- “robust” in this regard.

2. Variances of the populations are equal -- can be tested -- if this assumption is not met there is “trouble in River City.”

3. Observations are statistically independent (use randomization).