Date posted: 21-Dec-2015
Statistics: The Science of Learning from Data
• Data Collection
• Data Analysis
• Interpretation
• Prediction
• Take Action
• W.E. Deming: “The value of statistics and statisticians is to make predictions that form the basis for action”
Data Collection
• Observational Studies – to study correlation in variables. Prediction is OK; inferring causation is not!
• Sampling Surveys – to estimate population totals, ratios, etc.
• Experimental Designs – to study cause-and-effect relationships: “If you want to predict what will happen in the future when you do something”
The only way to find out what happens when you manipulate a variable is to go ahead and manipulate it, then observe the result!
Typical Purposes for Experimentation
• To determine principal causes of variation in a measured response
• To find conditions that give rise to a maximum or minimum response
• To determine whether there is a difference (and how big that difference is) between responses achieved at different settings of controllable variables
• To obtain a mathematical model in order to predict future responses, when controllable variables are changed
Goals of This Class
• Students should be able to choose an experimental design plan that is appropriate for the research problem at hand
• Students should be able to construct the design (including performing proper randomization and determining the required number of replicates)
• Execute the plan to collect the data (or advise a researcher to do it)
• Determine the appropriate model to fit the data
• Fit the model to the data and check the appropriateness of the model
• Interpret and explain the results in a meaningful way to answer the research question
Some Basic Definitions
• Experiment or Run – the experimenter changes at least one of the items under study and observes the effect of that action
• Experimental Unit – the “material” under study upon which something is changed
• Treatment Factor or Independent Variable – a variable under study which is controlled at some level during a given experiment and varied from experiment to experiment, at the will of the experimenter
• Treatment Factor Levels – the different settings of the treatment factor that will be used throughout the course of experimentation
• Background or Lurking Variable – a variable the experimenter is unaware of, or cannot control, that may affect the outcome
• Response or Dependent Variable – measurements of experimental units that depend upon the settings of the factors
• Effect – Change in the response caused by a change in the factor level
• Replication – more than one experimental unit assigned to the same combination of treatment factor levels
• Repeated measurements (Duplicates) – more than one measurement of the same characteristic of an experimental unit
• Subsamples – observational units; random subsamples taken from the larger experimental unit
• Experimental Design – Collection of experiments or runs to be made
• Confounded Factors – two or more factors are changed at the same time resulting in confused effects
• Biased Factor – Background variable changes when factor is changed resulting in confused effect.
• Experimental Error – the difference between the response for a given experiment and the long run average of all potential experiments that could be made at the same factor settings. This is usually caused by inherent differences in experimental units
• Sources of noise – anything that could cause the response for one experiment to be different than another (treatment factors, nuisance factors – variation in experimental units)
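The definitions of effect and experimental error can be written compactly. The notation here is an assumption for illustration and does not come from the slides: y_{ij} is the response of the j-th experiment run at factor setting i, and \mu_i is the long-run average of all potential experiments at that setting.

```latex
% Experimental error: response minus the long-run average at the same factor setting
\epsilon_{ij} = y_{ij} - \mu_i
\qquad\Longleftrightarrow\qquad
y_{ij} = \mu_i + \epsilon_{ij}

% Effect of changing the factor from level 1 to level 2
\text{effect} = \mu_2 - \mu_1
```

In practice \mu_i is unknown and is estimated from replicate runs at setting i, which is why replication matters for judging whether an observed difference is larger than the experimental error.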
Examples of Experimental Units
Medical Experiments – human subjects
Agriculture – individual plots of land
Manufacturing – batch of raw materials
If an experiment has to be run over a period of time with observations collected sequentially over time, the time of the run (or conditions that exist at the time of the run) or trial may be regarded as the experimental unit
Experimental units should be representative of the material and conditions to which the conclusions of the experiment are applied
Blocking
• The act of grouping the experimental units together into similar groups or Blocks
• Each treatment factor level will be tested on at least one experimental unit within each Block
Purpose of Blocking
• Increase precision of treatment factor level comparisons by comparing treatment factor levels within homogeneous groups of experimental units
• Broaden the scope of the results by including blocks which are representative of all conditions where conclusions are to be applied
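One way to picture blocking is a layout in which every block contains one experimental unit per treatment factor level, with the levels assigned at random inside each block. The Python sketch below assumes that layout; the function name, block names, unit labels, and level names are all hypothetical.

```python
import random


def randomized_complete_block_design(blocks, levels, seed=None):
    """Assign every treatment level once within each block, in random order.

    `blocks` maps a block name to the list of experimental units in that
    block; this sketch requires exactly len(levels) units per block.
    """
    rng = random.Random(seed)
    plan = {}
    for block, units in blocks.items():
        if len(units) != len(levels):
            raise ValueError(f"block {block!r} needs {len(levels)} units")
        shuffled = list(levels)
        rng.shuffle(shuffled)  # a separate randomization inside each block
        for unit, level in zip(units, shuffled):
            plan[(block, unit)] = level
    return plan


# Hypothetical blocks of similar experimental units
blocks = {
    "block1": ["u1", "u2", "u3"],
    "block2": ["u4", "u5", "u6"],
}
rcb_plan = randomized_complete_block_design(
    blocks, ["low", "medium", "high"], seed=7
)
for (block, unit), level in rcb_plan.items():
    print(block, unit, level)
```

Because every level appears in every block, comparisons between levels can be made within a block, where the units are similar.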
Randomization
• The act of assigning treatment factor levels to experimental units in a random manner (utilizing a table of random numbers or randomization computer algorithm)
Purpose of Randomization
• Prevent experimenter bias
• Prevent systematic bias
• Ensure independence of experimental error
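As a minimal sketch of a randomization computer algorithm, the Python below shuffles a balanced list of treatment factor levels over the experimental units. It assumes a completely randomized layout with no blocking; the function name, unit labels, level names, and seed are illustrative only.

```python
import random


def completely_randomized_design(units, levels, seed=None):
    """Randomly assign treatment factor levels to experimental units.

    Levels are used as equally often as possible; replication is exactly
    equal when len(units) is a multiple of len(levels).
    """
    rng = random.Random(seed)
    # One level per unit, cycling through the levels to keep the design balanced
    assignments = [levels[i % len(levels)] for i in range(len(units))]
    rng.shuffle(assignments)  # the random step: which unit gets which level
    return dict(zip(units, assignments))


# Hypothetical example: 8 units, 2 levels, so 4 replicates per level
plan = completely_randomized_design(
    units=[f"unit{i}" for i in range(1, 9)],
    levels=["A", "B"],
    seed=1,
)
for unit, level in plan.items():
    print(unit, level)
```

Fixing the seed makes the plan reproducible for record keeping; the assignment is still not chosen by the experimenter, which is the point of randomizing.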
Types of Experimental Designs
Classify Sources of Variation
Screen important factors
Constrained optimization
Unconstrained optimization
Mechanistic modeling
Planning Experiments
• Define objectives
• Identify experimental units
• Define meaningful and measurable response
• List independent and lurking variables
• Run pilot tests
• Make flow diagram of the experimental procedure for each run
• Choose experimental design
• Determine number of replicates
• Randomize experimental conditions to experimental units
• Describe the method of data analysis
• Provide timetable and budget
Recipe is the same for both chocolate and orange cookies up to the point of adding the syrup
Baking time and temperature are the same for both
What is the Purpose for Experimenting?
Hypothesis: Maybe the baking temperature must be modified for the orange cookie recipe
Plan: Vary the oven temperature from one sheet of orange cookies to the next, and measure the diameter of each cookie and calculate the average for each tray of cookies.
What is the response?
What is an experiment?
What is the experimental design?
Are there any replicates or repeated measurements?
Could blocking or randomization help?
What is the treatment factor?
What is the experimental unit?
What other sources of noise exist (besides treatment factor levels)?
Example 2
Problem: Want to increase average flight time of paper helicopters made from one 8.5×11 sheet of paper
Hypothesis: Changing wing-length and wing-width should affect average flight time, and if so an optimal combination should exist
Plan: Construct four different prototypes to test, test each repeatedly, compare the average flight time
What are the treatment factor(s)?
What is the experimental unit?
What other sources of variation exist (besides treatment factor levels)?
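The plan of four prototypes can be read as every combination of two wing lengths and two wing widths, although the slides do not say so explicitly. The Python sketch below builds that layout under this assumption and puts the test flights in random order. The dimensions, the number of flights per prototype, and the seed are hypothetical values chosen only for illustration.

```python
import itertools
import random

# Hypothetical factor levels in inches; the slides give no actual values.
wing_lengths = [3.0, 4.0]
wing_widths = [1.5, 2.0]
flights_per_prototype = 3  # hypothetical number of test flights

# Assumed layout: four prototypes = all length-by-width combinations
prototypes = list(itertools.product(wing_lengths, wing_widths))

# One entry per (prototype, flight), then flown in random order
runs = [
    (length, width, flight)
    for (length, width) in prototypes
    for flight in range(1, flights_per_prototype + 1)
]
rng = random.Random(2024)
rng.shuffle(runs)

for length, width, flight in runs:
    print(f"length={length} width={width} flight={flight}")
```

Whether the repeated flights count as replicates or as repeated measurements depends on how they are made: dropping the same helicopter several times gives repeated measurements on one experimental unit, while building a new helicopter for each flight gives replicates.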
Tomato Experiment, Box, Hunter and Hunter (1978)
• Why only 11 plants?
• Will plants be planted far enough apart to prevent fertilizer bleeding over?
• How will yields be measured?
Analysis 1 - Plot the Data!
[Figure: Tomato Example. Yield (lbs.), axis from 0 to 35, plotted against position along the row, axis from 0 to 12; points labeled A and B.]
Observations:
• Quite a bit of variation (a factor of 2-3 in yield, low to high)
• One possible outlier
• Evidence of a trend toward decreasing yield along the row from position 1 to 11
Conclusion: It matters more where you plant than what fertilizer you use! Has the positional trend been noted before?
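The two comparisons behind these observations, yield by treatment label and yield by position, can be checked numerically as well as by eye. The Python sketch below defines two small helper functions for that purpose; the function names are hypothetical, and no data are included because the slide shows the values only as a plot.

```python
from statistics import mean


def mean_yield_by_treatment(treatments, yields):
    """Average yield for each treatment label (for example "A" and "B")."""
    groups = {}
    for treatment, y in zip(treatments, yields):
        groups.setdefault(treatment, []).append(y)
    return {treatment: mean(ys) for treatment, ys in groups.items()}


def position_slope(positions, yields):
    """Least-squares slope of yield on position.

    A negative slope means yield tends to decrease along the row.
    """
    x_bar, y_bar = mean(positions), mean(yields)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(positions, yields))
    sxx = sum((x - x_bar) ** 2 for x in positions)
    return sxy / sxx
```

Called with the recorded positions, treatment labels, and yields, the first function summarizes the difference between treatments and the second summarizes the positional trend. A plot should still come first, since a single outlier can dominate either number.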