

Stat 470-1: Introduction to the Design of Experiments

Instructor: Derek Bingham

Office: West Hall 451

Contact Information:

Email: [email protected]

Phone: (734) 763-9294

Office hours: West Hall 443

Tuesday & Thursday, 12:30-2:00; others by appointment

Text: Experiments: Planning, Analysis, and Parameter Design Optimization by Wu and Hamada


Stat 470 – Overview/Syllabus

• Coverage: Review of linear regression; most of Chapters 1-4, additional topics as needed, and Chapter 9 if time permits

• Course notes will be available on the web by 12:00 on the day of class; otherwise they will be given as handouts in class (www.stat.lsa.umich.edu/~dbingham/Stat470)

• Term Project: design, conduct, analyze, and report on an experiment

– Details will be given out later


Computing

• Computing

– SPSS

– Any other package you like


What is Experimental Design?

• Experiments are performed in almost all fields of study

• An experiment is conducted to learn something about a process or system

• A designed experiment is a series of tests (experiments) in which changes are made to the inputs in order to observe and identify the impact on the output

• A better understanding of how the factors impact the system allows the experimenter to predict future values or optimize the process


• Can consider a process as:

Inputs → Process → Output

• The input variables (usually called factors) will be denoted by x1, x2,…, xp.

• The output variable (often called the response variable) will be denoted by y.


Example (Tomato Fertilizer)

• Experiment was conducted by a horticulturist

• Has 2 types of fertilizer available for tomato production (A and B)

• Objective: Is one fertilizer better than the other (higher yield, on average)?

• Has 11 tomato plots

• Experiment procedure: specify the amount of each fertilizer; decide upon the number of plots to receive each fertilizer; randomly assign fertilizer to plots

• Response: yield – pounds of tomatoes
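The random-assignment step of this procedure can be sketched in Python. The split of 5 plots for fertilizer A and 6 for fertilizer B, the plot numbering 1 to 11, the function name and the seed are all assumptions made for illustration; the slides do not specify them.

```python
import random

def assign_fertilizers(n_plots=11, n_a=5, seed=470):
    """Randomly assign fertilizer A to n_a plots and B to the remaining plots.

    n_a=5 and seed=470 are illustrative assumptions, not values from the course.
    """
    rng = random.Random(seed)              # fixed seed so the plan can be reproduced
    labels = ["A"] * n_a + ["B"] * (n_plots - n_a)
    rng.shuffle(labels)                    # the chance mechanism
    # Map each plot number (1-based) to the fertilizer it receives
    return {plot: fert for plot, fert in enumerate(labels, start=1)}

plan = assign_fertilizers()
for plot, fert in plan.items():
    print(f"plot {plot:2d}: fertilizer {fert}")
```

Fixing the number of plots per fertilizer before shuffling guarantees the intended group sizes; only the choice of which plots get which fertilizer is left to chance.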


Some Definitions

• Factor: variable whose influence upon a response variable is being studied in the experiment

• Factor Level: numerical values or settings for a factor

• Treatment or level combination: set of values for all factors in a trial

• Experimental unit: object to which a treatment is applied

• Trial: application of a treatment to an experimental unit

• Replicates: repetitions of a trial

• Randomization: using a chance mechanism to assign treatments to experimental units


What is an Experiment Design?

• Suppose you are going to conduct an experiment with 8 factors

• Suppose each factor has only two possible settings

• How many possible treatments are there?

• Suppose you have enough resources for 32 trials. Which treatments are you going to perform?

• Design: specifies the treatments, replication, randomization, and conduct of the experiment
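The number of possible treatments in this setup can be checked by enumeration. The sketch below lists every level combination of 8 two-level factors; coding the two settings as -1 and +1 is a common convention assumed here, not something the slide specifies.

```python
from itertools import product

n_factors = 8
levels = (-1, +1)   # assumed coding of the two settings of each factor

# A treatment is one choice of level for each of the 8 factors
treatments = list(product(levels, repeat=n_factors))

print(len(treatments))        # 256, i.e. 2**8 treatments in the full set
print(32 / len(treatments))   # 0.125: 32 trials cover one eighth of them
```

With resources for only 32 trials, at most one eighth of the 256 treatments can be run, which is why the choice of which treatments to perform matters.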


Types of Experiments

• Treatment Comparisons: Purpose is to compare several treatments of a factor (e.g., there are 3 diets and we would like to see whether they differ in effectiveness)

• Variable Screening: Have a large number of factors, but only a few are important. Experiment should identify the important few. (We will focus on these!)

• Response Surface Exploration: After important factors have been identified, their impact on the system is explored to optimize response


Types of Experiments (cont.)

• System Optimization: Often interested in determining the optimum conditions (e.g., experimenters often wish to maximize the yield of a process or minimize defects)

• System Robustness: Often wish to optimize a system and also reduce the impact of uncontrollable (noise) factors. (e.g., would like a fridge to cool to a set temperature…but the fridge must work in Florida, Alaska and Michigan!)


Systematic Approach to Experimentation

• State the objective of the study

• Choose the response variable…should correspond to the purpose of the study

• Choose factors and levels

• Choose experiment design (purpose of this course)

• Perform the experiment

• Analyze data (the design should be selected to meet the objective and so that the analysis is efficient and easy)

• Draw conclusions


Observation vs. Experimentation

• Data collection is not experimentation

• By observation, you can learn that lightning can cause fires

• By experimentation, you can learn that friction between certain materials can cause fires

• By more experimentation, we have learned how to make fire-starting reliable, cheap, easy, …

• By experimentation, we learn more and we learn faster


The Need for Experimentation

• A doctor has the impression that patients given Medicine A recover more quickly than patients given Medicine B.

• But the apparent difference could be due to:

– luck, random variation, small sample sizes, …

– bias in choice of medicine for patients

– physical differences in people receiving A vs. B (age, weight, sex, prior health, …)

• Clinical trials (experiments that control extraneous sources of variation and bias) are required to get a scientific assessment of Medicine A vs. Medicine B


Three Principles

• Replication – each treatment is applied to experimental units that are representative of the population of interest

– independent repetition of a trial

– provides a measure of “noise,” meaning experimental error: the variability of experimental units which receive the same treatment

– experimental error is the yardstick against which we compare different treatments

– increasing the number of replicates decreases the variance of estimated treatment effects and increases the power to detect significant differences

– Replication provides a measure of experimental “noise” and the means for controlling the level of that noise. (More replication means less noise in averages.)
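The statement that more replication means less noise in averages can be made explicit under an assumption the slide does not state: the replicates of a treatment are independent and share a common variance \sigma^2. Here n, n_A and n_B (the numbers of replicates) and \bar{y}, \bar{y}_A, \bar{y}_B (the treatment averages) are notation introduced for this sketch.

```latex
\operatorname{Var}(\bar{y}) = \frac{\sigma^2}{n},
\qquad
\operatorname{Var}(\bar{y}_A - \bar{y}_B) = \frac{\sigma^2}{n_A} + \frac{\sigma^2}{n_B}
```

Under this assumption, doubling the number of replicates halves the variance of a treatment average, so its standard deviation shrinks by a factor of \sqrt{2}.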


Warning! Sometimes what looks like replication is not!

• Repeated measurements on one experimental unit are not replication

• Measurements on multiple samples from one experimental unit are not replication

• Example: two cake recipes.

– Make one batch by recipe A; one batch by recipe B

– Bake 12 cupcakes from each batch; measure fluffiness

– The experimental unit is a batch, so:

• there has been no replication of either recipe A or recipe B

• there is no valid comparison of recipe A to recipe B; the apparent difference could be random batch differences


Principle 2 - Randomization

2. Randomization -- use of a chance mechanism (e.g., random number generator) to assign treatments to experimental units or to the sequence of experiments

– provides protection against unknown lurking variables

– helps justify the assumption of “independence” that will underlie many analyses


Principle 3 - Blocking

3. Blocking -- run groups of treatments on homogeneous units (blocks) to reduce the variability of effect estimates and make comparisons fairer

– Example: To compare 4 varieties of corn, an experimenter could consider blocks of land of various soil types and terrain, subdivide each block into plots, and randomly assign the 4 varieties to plots in a block.

– Blocking:

• controls variability due to soil types and terrain and allows varieties to be compared within blocks

• broadens the scope of conclusions, e.g., by including a variety of soil types and terrain in the experiment
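The corn example combines blocking with randomization: every variety appears once in each block, and the assignment of varieties to plots is randomized separately within each block. A minimal Python sketch follows; the variety labels, the block names, the function name and the seed are hypothetical and chosen only for illustration.

```python
import random

def randomized_complete_block(varieties, blocks, seed=470):
    """Return {block: varieties in plot order}, shuffled independently per block."""
    rng = random.Random(seed)          # fixed seed so the layout can be reproduced
    layout = {}
    for block in blocks:
        order = list(varieties)        # each block receives every variety once
        rng.shuffle(order)             # randomize only within the block
        layout[block] = order
    return layout

varieties = ["V1", "V2", "V3", "V4"]                    # hypothetical variety labels
blocks = ["sandy flat", "clay slope", "loam valley"]    # hypothetical blocks of land

layout = randomized_complete_block(varieties, blocks)
for block, order in layout.items():
    print(f"{block}: {order}")
```

Because each block contains all 4 varieties, differences between soil types and terrain affect every variety equally and drop out of within-block comparisons.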


Case Study: Reliability of Wire Bonding on Integrated Circuits

• Process monitoring:

– Sample ICs selected, pull-tested

• Available data:

– pull strengths from 1000s of pull tests

• Success criterion:

– pull strength > 2.5g

• Planned analysis:

– fit a distribution to all the data

– estimate reliability


Designed Experiment to the Rescue

• Experimental Design:

– 3 bonding operators

– 3 bonding machines

– 3 pull-test operators

– 2 IC packages per combination

– 48 wires per IC package

– All combinations = 2,592 observations (!)

Note: Unusually large experiment, but feasible in this case because many defective ICs are available and processing time is short
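The observation count follows directly from multiplying the sizes of the design components. The short Python check below uses the numbers from the slide; the dictionary keys are just descriptive labels.

```python
from math import prod

# Size of each design component, as listed on the slide
design = {
    "bonding operators": 3,
    "bonding machines": 3,
    "pull-test operators": 3,
    "IC packages per combination": 2,
    "wires per IC package": 48,
}

total = prod(design.values())
print(total)      # 2592 observations in all

# Observations available for each (bonding operator, pull-test operator) pair:
# machines, packages and wires are all averaged over
per_pair = total // (design["bonding operators"] * design["pull-test operators"])
print(per_pair)   # 288
```

The 288 observations per operator pair match the number of observations behind each average pull strength reported in the analysis findings.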


Case Study (cont.)

• Analysis Findings:

– No appreciable difference among bonding machines

– Large and independent effects of bonding and pull-test operators. NOT GOOD!

Average pull strength (grams); each entry is the average of 288 observations (noise std dev = 1.5 grams)

                   Pull-Test Operator
Bonding Operator   A      B      C
A                  8.4    6.3    7.0
B                  9.0    6.8    7.6
C                  7.1    5.3    5.8

• Further conjectures and experiments led to improved consistency of manufacturing and testing techniques
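One way to read the table of average pull strengths is to split each cell into a grand mean, a bonding-operator effect, a pull-test-operator effect, and a remainder. The Python sketch below does this with the values from the table. Interpreting "independent effects" as "the remainder is small once both operator effects are removed" is an assumption of this sketch, not a statement from the slides.

```python
from math import sqrt

ops = ["A", "B", "C"]
# Average pull strength (grams); rows = bonding operator, columns = pull-test operator
means = [
    [8.4, 6.3, 7.0],
    [9.0, 6.8, 7.6],
    [7.1, 5.3, 5.8],
]

grand = sum(sum(row) for row in means) / 9
# Effect of each bonding operator: row average minus grand mean
row_eff = [sum(row) / 3 - grand for row in means]
# Effect of each pull-test operator: column average minus grand mean
col_eff = [sum(means[i][j] for i in range(3)) / 3 - grand for j in range(3)]

# Remainder after removing both operator effects from each cell
resid = [[means[i][j] - grand - row_eff[i] - col_eff[j] for j in range(3)]
         for i in range(3)]
max_resid = max(abs(r) for row in resid for r in row)

# Standard error of one cell average: noise sd 1.5 g over 288 observations
se_cell = 1.5 / sqrt(288)

for name, eff in zip(ops, row_eff):
    print(f"bonding operator {name}: {eff:+.2f} g")
for name, eff in zip(ops, col_eff):
    print(f"pull-test operator {name}: {eff:+.2f} g")
print(f"largest remainder: {max_resid:.2f} g")
print(f"standard error of a cell average: {se_cell:.3f} g")
```

Both sets of operator effects span well over a gram, while the standard error of a cell average is under a tenth of a gram, which is consistent with the slide's conclusion that who bonds and who tests matter a great deal.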


Case Study -- Messages

• People and procedures can have more of an influence on quality than machines.

• Think about possible sources of variability

• Use designed experiments to control and evaluate these sources of variability


Assignment

• Review t-tests (2-sample and paired)

• Review linear regression

• Review ANOVA

• These will be the fundamental analysis tools for this course.
