1 Experimental Statistics - week 4 Chapter 8: 1-factor ANOVA models Using SAS.
1
Experimental Statistics - week 4
Chapter 8: 1-factor ANOVA models
Using SAS
2
EXAM SCHEDULE:
Exam I – Take-home exam (handed out Thursday, March 3, due 8:00 AM Tuesday, March 8)
Exam II – Take-home exam (handed out Thursday, April 14, due 8:00 AM Tuesday, April 19)
Final Exam – optional (scheduled for 8:00 AM – 11:00 AM Friday, May 6)
GRADE COMPUTATION:
Exam Grades (75%)
Daily Assignments (25%)
3
ANOVA Table Output - hostility data - calculations done in class
Source            SS       df   MS       F      p-value
Between samples   767.17    2   383.58   16.7   <.001
Within samples    205.74    9    22.86
Totals            972.91   11
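As a check on the table, the arithmetic can be redone from the sums of squares and degrees of freedom. This is only a sketch of the usual mean-square and F-ratio calculations; the subscripts B and W (between and within samples) are labels chosen here.

```latex
\begin{align*}
MS_B &= \frac{SS_B}{df_B} = \frac{767.17}{2} = 383.585 \quad (\text{reported as } 383.58) \\
MS_W &= \frac{SS_W}{df_W} = \frac{205.74}{9} = 22.86 \\
F &= \frac{MS_B}{MS_W} = \frac{383.585}{22.86} \approx 16.78 \quad (\text{reported as } 16.7) \\
SS_{Total} &= 767.17 + 205.74 = 972.91, \qquad df_{Total} = 2 + 9 = 11
\end{align*}
```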
4
SPSS ANOVA Table for Hostility Data
5
ANOVA Models
Consider the random sample $y_1, y_2, \ldots, y_n$, where $y_1 = 5.5$, $y_2 = 3.8$, $y_3 = 6.0$, etc.
Population has mean $\mu$.
If $y_1, y_2, \ldots, y_n$ is a sample from a population that is normal with mean $\mu$ and variance $\sigma^2$, then we can write
$y_i = \mu + \varepsilon_i, \quad i = 1, \ldots, n$
6
For 1-factor ANOVA we can write
$y_{11} = \mu_1 + \varepsilon_{11}$
$y_{12} = \mu_1 + \varepsilon_{12}$
etc.
7
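Alternative form of the 1-Factor ANOVA Model
General Form of Model: $y_{ij} = \mu_i + \varepsilon_{ij}$, where the $\varepsilon_{ij}$'s are $NID(0, \sigma^2)$
(pages 394-395)
- random errors follow a Normal (N) distribution, are independently distributed (ID), and have zero mean and constant variance
-- i.e. variability does not change from group to group
Alternative form: $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, where
$\mu = \frac{1}{t} \sum_{i=1}^{t} \mu_i$ and $\alpha_i = \mu_i - \mu$
Note: $\sum_{i=1}^{t} \alpha_i = 0$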
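The zero-sum note for the treatment effects follows directly from the definitions. A short derivation, assuming $\mu$ is the unweighted average of the $t$ group means and $\alpha_i = \mu_i - \mu$:

```latex
\begin{align*}
\sum_{i=1}^{t} \alpha_i
  &= \sum_{i=1}^{t} (\mu_i - \mu)
   = \sum_{i=1}^{t} \mu_i - t\mu \\
  &= t\mu - t\mu = 0,
  \qquad \text{since } \mu = \frac{1}{t}\sum_{i=1}^{t} \mu_i .
\end{align*}
```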
8
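Testing the hypotheses:
$H_0: \mu_1 = \mu_2 = \cdots = \mu_t$
$H_a:$ at least 2 means are unequal
is equivalent to testing the hypotheses:
$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_t = 0$
$H_a:$ at least one $\alpha_i \neq 0$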
9
Analysis of Variance Table
Recall: In our model:
$s_W^2$ estimates $\sigma^2$
$s_B^2$ estimates $\sigma^2$ plus a nonnegative constant that is zero when there are no factor effects
- if no factor effects, we expect $F \approx 1$;
- if factor effects, we expect $F > 1$
We reject $H_0$ at significance level $\alpha$ if
$F = \frac{s_B^2}{s_W^2} > F_{\alpha}(t - 1, n_T - t)$
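To make the comparison of the two variance estimates concrete, the expected mean squares can be written out. This sketch assumes a balanced design with $n$ observations in each of the $t$ groups (the symbol $n$ and the equal group sizes are assumptions made here for simplicity) and treatment effects $\alpha_i$ that sum to zero.

```latex
\begin{align*}
E(s_W^2) &= \sigma^2 \\
E(s_B^2) &= \sigma^2 + \frac{n}{t-1} \sum_{i=1}^{t} \alpha_i^2
\end{align*}
```

With no factor effects every $\alpha_i$ is zero, so both mean squares estimate $\sigma^2$ and the ratio $F = s_B^2 / s_W^2$ should be near 1. Any nonzero $\alpha_i$ inflates only the numerator.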
Introduction to SAS Programming Language
11
Recall CAR DATA
For this analysis, 5 gasoline types (A - E) were to be tested. Twenty cars were selected for testing and were assigned randomly to the groups (i.e. the gasoline types). Thus, in the analysis, each gasoline type was tested on 4 cars. A performance-based octane reading was obtained for each car, and the question is whether the gasolines differ with respect to this octane reading.
A
91.7 91.2 90.9 90.6
B
91.7 91.9 90.9 90.9
C
92.4 91.2 91.6 91.0
D
91.8 92.2 92.0 91.4
E
93.1 92.9 92.4 92.4
12
The CAR data set as SAS needs to see it:
A 91.7
A 91.2
A 90.9
A 90.6
B 91.7
B 91.9
B 90.9
B 90.9
C 92.4
C 91.2
C 91.6
C 91.0
D 91.8
D 92.2
D 92.0
D 91.4
E 93.1
E 92.9
E 92.4
E 92.4
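If the data are typed in the table layout instead (one row per gasoline with its four readings), a DATA step along the following lines could produce the one-record-per-car layout. This is a sketch rather than part of the course file: it relies on the trailing @ line-hold specifier of the INPUT statement, and the index variable car is a hypothetical name introduced here.

```sas
* Sketch: read one row per gasoline and write one observation per car;
DATA one;
  INPUT gas $ @;          /* read the gasoline label and hold the line */
  DO car = 1 TO 4;        /* car is a hypothetical index variable */
    INPUT octane @;       /* read the next octane value on the same line */
    OUTPUT;               /* write one observation for this car */
  END;
  DATALINES;
A 91.7 91.2 90.9 90.6
B 91.7 91.9 90.9 90.9
C 92.4 91.2 91.6 91.0
D 91.8 92.2 92.0 91.4
E 93.1 92.9 92.4 92.4
;
PROC PRINT DATA=one;
RUN;
```

The PROC PRINT step is there only to confirm that the result has 20 observations, each with a gas label and a single octane value.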
13
SAS file for CAR data
Case 1: Data within SAS FILE:

DATA one;
  INPUT gas $ octane;
  DATALINES;
A 91.7
A 91.2
. . .
E 92.4
E 92.4
;
PROC GLM;   /* or PROC ANOVA */
  CLASS gas;
  MODEL octane = gas;
  TITLE 'Gasoline Example - Completely Randomized Design';
  MEANS gas / DUNCAN;
RUN;
PROC MEANS mean var;
RUN;
PROC MEANS mean var;
  CLASS gas;
RUN;
14
Brief Discussion of Components of the SAS File:
DATA Step
DATA STATEMENT - the first DATA statement names the data set whose variables are defined in the INPUT statement -- in the above, we create data set 'one'
INPUT STATEMENT - 2 forms
1. Freefield - can be used when data values are separated by 1 or more blanks
INPUT NAME $ AGE SEX $ SCORE; ($ indicates character variable)
2. Formatted - data occur in fixed columns
INPUT NAME $ 1-20 AGE 22-24 SEX $ 26 SCORE 28-30;
DATALINES STATEMENT - used to indicate that the next records in the file contain the actual data and the semicolon after the data indicates the end of the data itself
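The two INPUT forms can be compared side by side. The sketch below reuses the variable names from the INPUT statements above; the data set names free and fixed and all of the data values are hypothetical, made up only to show the layout.

```sas
* Free-field input: values separated by one or more blanks;
DATA free;
  INPUT name $ age sex $ score;
  DATALINES;
Alice 34 F 88
Bob 41 M 92
;

* Formatted input: each value sits in fixed columns;
DATA fixed;
  INPUT name $ 1-20 age 22-24 sex $ 26 score 28-30;
  DATALINES;
Alice Smith           34 F  88
Bob Jones             41 M  92
;
PROC PRINT DATA=fixed;
RUN;
```

The formatted form lets a character value contain blanks (the full names here), because the column range, not a blank, marks where the value ends.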
15
SPECIFYING THE ANALYSIS -- PROC STATEMENTS
GENERAL FORM
PROC xxxxx; implies procedure is to be run on most recently created data set
PROC xxxxx DATA = data set name;
Note: I did not have to specify DATA=one in the above example
Example PROCs:
PROC REG - regression analysis
PROC ANOVA - analysis of variance
PROC GLM - general linear model
PROC MEANS - basic statistics, t-test for $H_0: \mu = 0$
PROC PLOT - plotting
PROC TTEST - t-tests
PROC UNIVARIATE - descriptive stats, box-plots, etc.
PROC BOXPLOT - boxplots
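The two general forms of the PROC statement can be tried on a small data set. In this sketch the data set holds only the first two readings for gasolines A and B from the CAR data, shortened here for illustration.

```sas
DATA one;
  INPUT gas $ octane;
  DATALINES;
A 91.7
A 91.2
B 91.7
B 91.9
;
PROC MEANS MEAN VAR;            /* no DATA= option: uses the most recently created data set */
RUN;
PROC MEANS DATA=one MEAN VAR;   /* same analysis with the data set named explicitly */
RUN;
```

Naming the data set explicitly is the safer habit once a program creates more than one data set.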
16
PROC GLM
• Proc GLM data = fn ;
– Class … ; List all the factors.
– Model … / options; e.g., model octane = gas;
– Means … / options;
– Run;
17
SAS Syntax
• Every command MUST end with a semicolon
– Commands can continue over two or more lines
• Variable names are 1-8 characters (letters and numerals, beginning with a letter or underscore), but no blanks or special characters
– Note: values for character variables can exceed 8 characters
• Comments – Begin with *, end with ;
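A short program can illustrate these syntax rules together. This is a sketch: the data set name demo is hypothetical, and the /* ... */ comment style is a second SAS comment form beyond the one listed above.

```sas
* This is a comment statement. It begins with an asterisk and ends with a semicolon;
DATA demo;
  INPUT gas $
        octane;              /* one statement continued over two lines */
  DATALINES;
A 91.7
B 91.9
;
PROC PRINT DATA=demo; RUN;   /* two statements on one line, each ending with a semicolon */
```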
18
Titles and Labels
• TITLE '…' ;
– Up to 10 title lines: TITLE 'include your title here';
– Can be placed in Data Steps or Procs
• LABEL name = '…' ;
– Can be in a DATA STEP or PROC PRINT
– Include ALL labels, then a single ;
Note: For class assignments, place descriptive titles and labels on the output. Print the data to the output file.
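One way to meet this requirement is sketched below. The label text and title wording are hypothetical, and the data lines are a two-row excerpt of the CAR data.

```sas
DATA one;
  INPUT gas $ octane;
  LABEL gas    = 'Gasoline type'
        octane = 'Octane reading';   /* all labels first, then a single semicolon */
  DATALINES;
A 91.7
B 91.9
;
TITLE  'Gasoline Example';
TITLE2 'Listing of the input data';
PROC PRINT DATA=one LABEL;           /* LABEL option asks PROC PRINT to show the labels */
RUN;
```

TITLE2 sets the second title line. The PROC PRINT step is what prints the data to the output file.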
19
Case 2: Data in an External File
FILENAME f1 'complete directory/file specification';

FILENAME f1 'a:car.data';
DATA one;
  INFILE f1;
  INPUT gas $ octane;
PROC GLM;   /* or PROC ANOVA */
  CLASS gas;
  MODEL octane = gas;
  TITLE 'Gasoline Example - Completely Randomized Design';
RUN;
PROC MEANS mean var;
RUN;
PROC MEANS mean var;
  CLASS gas;
RUN;
20
The SAS Output for CAR data:

Gasoline Example - Completely Randomized Design
General Linear Models Procedure
Dependent Variable: OCTANE

                              Sum of          Mean
Source             DF        Squares        Square   F Value   Pr > F
Model               4     6.10800000    1.52700000      6.80   0.0025
Error              15     3.37000000    0.22466667
Corrected Total    19     9.47800000

R-Square        C.V.     Root MSE    OCTANE Mean
0.644440    0.516836    0.4739902      91.710000

Source   DF     Type I SS   Mean Square   F Value   Pr > F
GAS       4    6.10800000    1.52700000      6.80   0.0025

Source   DF   Type III SS   Mean Square   F Value   Pr > F
GAS       4    6.10800000    1.52700000      6.80   0.0025
21
Text Format for ANOVA Table Output - car data
Source            SS      df   MS      F      p-value
Between samples   6.108    4   1.527   6.80   0.0025
Within samples    3.370   15   0.225
Totals            9.478   19
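The entries in this table, and the R-Square and Root MSE values in the SAS output, can be reproduced from the sums of squares and degrees of freedom. A sketch of the arithmetic, with values rounded:

```latex
\begin{align*}
MS_{\text{Between}} &= \frac{6.108}{4} = 1.527 \\
MS_{\text{Within}} &= \frac{3.370}{15} \approx 0.2247 \\
F &= \frac{1.527}{0.2247} \approx 6.80 \\
R^2 &= \frac{SS_{\text{Between}}}{SS_{\text{Total}}} = \frac{6.108}{9.478} \approx 0.6444 \\
\text{Root MSE} &= \sqrt{0.2247} \approx 0.474
\end{align*}
```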
22
PC SAS on Campus
Library
BIC
Student Center
http://support.sas.com/rnd/le/index.html
SAS Learning Edition $125
23
“Lab” Assignment
Using CAR Data, run the following in this order with one set of code:
1. Calculate the average, standard deviation, minimum, and maximum for the 20 octane readings. CS pp. 25 - 32
2. Graph a histogram of OCTANE. CS p. 37
3. Calculate descriptive statistics in (1) above for OCTANE for each of the 5 gasolines. CS pp. 32-34
4. Run a t-test to test $H_0: \mu_A = \mu_B$ using GAS types A and B. CS pp. 138-141
5. Plot side-by-side box plots for OCTANE for the 5 levels of the variable GAS
6. Compute a 1-factor ANOVA for the CAR data using only the first 3 GAS types. CS pp. 150-155