SPSS Guide for MMI 409
by John Wong
March 2012
Preface
This document is intended to give MMI 409 students some guidance on using SPSS
to solve many of the problems covered in the D’Agostino book. To keep the document
small, the images have been reduced to very small sizes, so readers should view it
electronically at maximum zoom. Good luck.
John Wong
LBJOHN99 at Yahoo dot com
D’Agostino, R.B., Sullivan, L.M., & Beiser, A.S. (2006). Introductory applied biostatistics. Belmont, CA:
Brooks/Cole, Cengage Learning.
S P S S G u i d e F o r M M I 4 0 9 P a g e | i
Table of Contents
Descriptive Statistics
  Explore, Percentile, 95% CI, Boxplot, Mean, Median, Histogram
T Test
  One Sample T Test
T Test
  2 Independent Samples T-test
T Test
  Paired Samples T-test
Chi-Square Test
  Goodness of Fit (Aggregated Data)
Chi-Square Test
  Test of Independence (Aggregated Data)
ANOVA
  ANOVA (One Way ANOVA)
ANOVA
  ANOVA with Eta
ANOVA
  Repeated Measures ANOVA
Correlation Analysis
  Scatter Diagram
Correlation Analysis
  Pearson r correlation coefficient
Correlation Analysis
  Linear Regression
Non-parametric Test
  2 Dependent Samples
  Sign Test (Legacy Dialog)
Non-parametric Test
  2 Dependent Samples
  Wilcoxon Signed Rank (Legacy Dialog)
Non-parametric Test
  2 Dependent Samples
  Wilcoxon Signed Rank (New Dialog)
Non-parametric Test
  2 Independent Samples
  Wilcoxon Rank Sum (Mann-Whitney U)
Non-parametric Test
  k Independent Samples
  Kruskal-Wallis Test (Legacy Dialog)
Non-parametric Test
  k Independent Samples
  Kruskal-Wallis Test (New Dialog)
Non-parametric Test
  Spearman Correlation (Correlation Between Variables)
Descriptive Statistics
Explore, Percentile, 95% CI, Boxplot, Mean, Median, Histogram
1) Setup Data
2) Descriptive Statistics -> Explore
3) Select Dependent List
4) Select Factor List (Independent Variable)
5) Click Statistics
6) Check Descriptives, Outliers, Percentiles
7) Click Continue
8) Click Plots
9) Check Normality Plots with tests
10) Click Continue
11) Click OK
Histogram Output 1
Histogram Output 2
Boxplot Output
Descriptives

      TRT                                               Statistic   Std. Error
HDL   0     Mean                                            58.60        9.338
            95% Confidence Interval     Lower Bound         37.48
            for Mean                    Upper Bound         79.72
            5% Trimmed Mean                                 58.78
            Median                                          64.00
            Variance                                      872.044
            Std. Deviation                                 29.530
            Minimum                                            19
            Maximum                                            95
            Range                                              76
            Interquartile Range                                64
            Skewness                                        -.373         .687
            Kurtosis                                       -1.587        1.334
      1     Mean                                            55.20        7.422
            95% Confidence Interval     Lower Bound         38.41
            for Mean                    Upper Bound         71.99
            5% Trimmed Mean                                 55.44
            Median                                          56.00
            Variance                                      550.844
            Std. Deviation                                 23.470
            Minimum                                            18
            Maximum                                            88
            Range                                              70
            Interquartile Range                                47
            Skewness                                        -.220         .687
            Kurtosis                                       -1.121        1.334
Percentiles

                                                      Percentiles
                               TRT       5      10      25      50      75      90      95
Weighted Average        HDL    0     19.00   19.10   21.50   64.00   85.50   94.20       .
(Definition 1)                 1     18.00   19.00   31.75   56.00   78.25   87.10       .
Tukey's Hinges          HDL    0                     22.00   64.00   85.00
                               1                     33.00   56.00   78.00
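As a cross-check on the Descriptives output, the standard error and the 95% confidence interval for the mean can be recomputed by hand. The sketch below uses the TRT = 0 row. The group size of 10 is an assumption (it is consistent with the ratio of the reported standard deviation to the reported standard error), and the critical value 2.262 is taken from a standard t table for 9 degrees of freedom rather than from the SPSS output.

```python
import math

# Summary values read from the Descriptives table (HDL, TRT = 0).
n = 10          # assumed group size; not printed in the Descriptives table
mean = 58.60
sd = 29.530

# Two-sided 95% critical value for df = n - 1 = 9, from a standard t table.
T_CRIT_DF9 = 2.262

se = sd / math.sqrt(n)              # standard error of the mean
lower = mean - T_CRIT_DF9 * se      # lower bound of the 95% CI
upper = mean + T_CRIT_DF9 * se      # upper bound of the 95% CI

print(f"Std. Error = {se:.3f}")
print(f"95% CI for mean = ({lower:.2f}, {upper:.2f})")
```

The same arithmetic applies to the TRT = 1 row with its own mean and standard deviation.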
T Test
One Sample T Test
1) Setup Data
2) Compare Means -> One Sample T Test
3) Select Test Variable
4) Enter Test Value
5) Click OK
One-Sample Statistics

        N    Mean   Std. Deviation   Std. Error Mean
AGE    20   48.85            9.466             2.117

One-Sample Test (Test Value = 50)

                                                        95% Confidence Interval
                                                           of the Difference
           t   df   Sig. (2-tailed)   Mean Difference      Lower        Upper
AGE    -.543   19              .593            -1.150      -5.58         3.28
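The t statistic and the confidence interval in the One-Sample Test table follow directly from the One-Sample Statistics row. A minimal sketch, assuming the critical value 2.093 for 19 degrees of freedom from a standard t table (the p-value itself needs a t distribution function and is not recomputed here):

```python
import math

# Values from the One-Sample Statistics table.
n, mean, sd = 20, 48.85, 9.466
test_value = 50.0

# Two-sided 95% critical value for df = 19, from a standard t table.
T_CRIT_DF19 = 2.093

se = sd / math.sqrt(n)                    # Std. Error Mean
mean_diff = mean - test_value             # Mean Difference
t_stat = mean_diff / se                   # t
ci = (mean_diff - T_CRIT_DF19 * se,       # 95% CI of the difference
      mean_diff + T_CRIT_DF19 * se)

print(f"t = {t_stat:.3f}, df = {n - 1}")
print(f"95% CI of the difference = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Because the interval contains 0, it agrees with the non-significant two-tailed p-value of .593 in the table.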
T Test
2 Independent Samples T-test
1) Setup Data
2) Compare Means -> Independent Samples T Test
3) Select Test Variable
4) Select Grouping Variable
5) Click Define Groups
6) Enter Groups range for Analysis
7) Click Continue
8) Click OK
9) In the output, check Levene’s test. With a p-value > 0.05 (equal variances are not rejected), use the t test row for equal variances assumed; otherwise use the row for equal variances not assumed.
Group Statistics

      TRT    N     Mean   Std. Deviation   Std. Error Mean
LDL   0     10   283.90          108.189            34.212
      1     10   227.10           70.245            22.213

Independent Samples Test

                                   Levene's Test for
                                   Equality of Variances   t-test for Equality of Means
                                                                                                                   95% Confidence Interval
                                                                         Sig.          Mean        Std. Error      of the Difference
                                        F       Sig.        t       df   (2-tailed)    Difference  Difference      Lower       Upper
LDL   Equal variances assumed       3.651       .072    1.392       18   .181          56.800      40.791        -28.899     142.499
      Equal variances not assumed                       1.392   15.443   .184          56.800      40.791        -29.927     143.527
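Both rows of the Independent Samples Test table can be rebuilt from the Group Statistics. The sketch below computes the pooled-variance t (the "equal variances assumed" row) and the Welch-Satterthwaite degrees of freedom (the "equal variances not assumed" row). It does not reproduce Levene's test, which needs the raw data.

```python
import math

# Values from the Group Statistics table (LDL by TRT).
n1, m1, s1 = 10, 283.90, 108.189   # TRT = 0
n2, m2, s2 = 10, 227.10, 70.245    # TRT = 1

diff = m1 - m2                     # Mean Difference

# Equal variances assumed: pooled variance, df = n1 + n2 - 2.
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
se_pooled = math.sqrt(sp2 * (1 / n1 + 1 / n2))
t_pooled = diff / se_pooled
df_pooled = n1 + n2 - 2

# Equal variances not assumed: Welch standard error and Satterthwaite df.
v1, v2 = s1**2 / n1, s2**2 / n2
se_welch = math.sqrt(v1 + v2)
t_welch = diff / se_welch
df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print(f"pooled: t = {t_pooled:.3f}, df = {df_pooled}, SE = {se_pooled:.3f}")
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.3f}, SE = {se_welch:.3f}")
```

With equal group sizes the two standard errors coincide, which is why both rows show the same t and only the degrees of freedom differ.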
T Test
Paired Samples T-test
1) Setup Data
2) Compare Means -> Paired Samples T Test
3) Select Variables into Variable 1 and Variable 2
Paired Samples Correlations

                            N   Correlation   Sig.
Pair 1   Before & After    20          .313   .180

Paired Samples Test

                            Paired Differences
                                                                95% Confidence Interval
                                       Std.       Std. Error    of the Difference                     Sig.
                            Mean       Deviation  Mean          Lower       Upper        t      df    (2-tailed)
Pair 1   Before - After     -8.050     24.752     5.535         -19.634     3.534        -1.454 19    .162
Chi-Square Test
Goodness of Fit (Aggregated Data)
1) Enter Data
2) Define Labels (Optional)
3) Weight Cases to use the aggregated data
4) Check Weight Cases by
5) Select the aggregated variable
6) Non-parametric test -> Legacy Dialogs -> Chi-square
7) Select Test Variable
8) Define Expected Values (Order must match p1,p2,p3,p4)
9) Click OK
Chi-Square Test
Frequencies

Topic Issue
              Observed N   Expected N   Residual
Drugs                 52         48.0        4.0
Sex                   38         30.0        8.0
Stress                21         30.0       -9.0
Education              9         12.0       -3.0
Total                120

Test Statistics

               Topic Issue
Chi-Square        5.917 (a)
df                    3
Asymp. Sig.        .116

a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 12.0.
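The goodness-of-fit statistic is the sum of (observed - expected)^2 / expected over the four categories. A short sketch using the counts from the Frequencies table follows. The p-value uses a closed form of the chi-square upper tail that holds only for 3 degrees of freedom. That shortcut is chosen here to avoid a statistics library; it is not something SPSS documents.

```python
import math

# Observed and expected counts from the Frequencies table.
observed = {"Drugs": 52, "Sex": 38, "Stress": 21, "Education": 9}
expected = {"Drugs": 48.0, "Sex": 30.0, "Stress": 30.0, "Education": 12.0}

chi_sq = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
df = len(observed) - 1

# Upper-tail probability of a chi-square variable with exactly 3 df:
# P(X > x) = erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
assert df == 3, "closed form below is valid for df = 3 only"
p_value = (math.erfc(math.sqrt(chi_sq / 2))
           + math.sqrt(2 * chi_sq / math.pi) * math.exp(-chi_sq / 2))

print(f"Chi-Square = {chi_sq:.3f}, df = {df}, Asymp. Sig. = {p_value:.3f}")
```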
Chi-Square Test
Test of Independence (Aggregated Data)
1) Setup Data
2) Define Labels for Variable 1 (Optional)
3) Define Labels for Variable 2
4) Weight Cases to use aggregated data
5) Click Weight cases by
6) Select measurement variable
7) Click OK
8) Select Descriptive Statistics -> Crosstabs
9) Select Row and Column Variables
10) Click Statistics
11) Check Chi-square
12) Click Continue
13) Click Cells
14) Check Expected
Crosstabs

Site * Treatment Crosstabulation

                                   Treatment
                                   Diet and     Oral
                                   Exercise     Hypoglycemics   Insulin     Total
Site    HMO    Count                    294          827           579       1700
               Expected Count         292.0        774.3         633.8     1700.0
        UTH    Count                    132          288           352        772
               Expected Count         132.6        351.6         287.8      772.0
        IPA    Count                    189          516           404       1109
               Expected Count         190.5        505.1         413.4     1109.0
Total          Count                    615         1631          1335       3581
               Expected Count         615.0       1631.0        1335.0     3581.0

Chi-Square Tests

                                  Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square               34.629 (a)    4   .000
Likelihood Ratio                 34.498        4   .000
Linear-by-Linear Association      1.744        1   .187
N of Valid Cases                   3581

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 132.58.
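The expected counts in the crosstabulation are row total times column total divided by the grand total, and the Pearson chi-square sums (observed - expected)^2 / expected over all nine cells. The sketch below recomputes both from the observed counts. The p-value line uses a closed form that is valid only for 4 degrees of freedom, chosen here to keep the example free of external libraries.

```python
import math

# Observed counts from the Site * Treatment table.
# Columns: Diet and Exercise, Oral Hypoglycemics, Insulin.
observed = [
    [294, 827, 579],   # HMO
    [132, 288, 352],   # UTH
    [189, 516, 404],   # IPA
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Upper-tail probability for exactly 4 df: P(X > x) = exp(-x/2) * (1 + x/2)
assert df == 4, "closed form below is valid for df = 4 only"
p_value = math.exp(-chi_sq / 2) * (1 + chi_sq / 2)

print(f"Pearson Chi-Square = {chi_sq:.3f}, df = {df}, p = {p_value:.2e}")
print(f"Minimum expected count = {min(min(r) for r in expected):.2f}")
```

The p-value is far below .001, which SPSS displays as .000.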
ANOVA
ANOVA (One Way ANOVA)
1) Setup Data
2) Compare Means -> One Way ANOVA
3) Select Dependent List
4) Select Factor
5) Click Post Hoc
6) Select Scheffe
7) Select Tukey
8) Click Continue
ANOVA
Time To Relief in Minutes

                  Sum of Squares   df   Mean Square        F   Sig.
Between Groups           423.333    2       211.667   10.160   .003
Within Groups            250.000   12        20.833
Total                    673.333   14
Post Hoc Tests
Multiple Comparisons
Dependent Variable: Time to Relief in Minutes

                                           Mean                              95% Confidence Interval
             (I) Drug Type  (J) Drug Type  Difference (I-J)  Std. Error  Sig.  Lower Bound  Upper Bound
Tukey HSD    1              2                   7.0000         2.8868    .076       -.701       14.701
                            3                  13.0000*        2.8868    .002       5.299       20.701
             2              1                  -7.0000         2.8868    .076     -14.701         .701
                            3                   6.0000         2.8868    .136      -1.701       13.701
             3              1                 -13.0000*        2.8868    .002     -20.701       -5.299
                            2                  -6.0000         2.8868    .136     -13.701        1.701
Scheffe      1              2                   7.0000         2.8868    .091      -1.047       15.047
                            3                  13.0000*        2.8868    .003       4.953       21.047
             2              1                  -7.0000         2.8868    .091     -15.047        1.047
                            3                   6.0000         2.8868    .158      -2.047       14.047
             3              1                 -13.0000*        2.8868    .003     -21.047       -4.953
                            2                  -6.0000         2.8868    .158     -14.047        2.047

*. The mean difference is significant at the 0.05 level.
Homogeneous Subsets
Time to Relief in Minutes

                                   Subset for alpha = 0.05
                 Drug Type    N         1          2
Tukey HSD (a)    3            5    20.000
                 2            5    26.000     26.000
                 1            5               33.000
                 Sig.                .136       .076
Scheffe (a)      3            5    20.000
                 2            5    26.000     26.000
                 1            5               33.000
                 Sig.                .158       .091

Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 5.000.
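The standard error and the Scheffe confidence limits in the Multiple Comparisons table can be reproduced from the within-groups mean square of the ANOVA table. This is a sketch under two assumptions that are not printed in the output: each drug group has 5 subjects (as the Homogeneous Subsets table suggests), and the critical value F(0.05; 2, 12) = 3.885 is read from a standard F table.

```python
import math

# From the ANOVA table: within-groups sum of squares and df.
ss_within, df_within = 250.0, 12
k = 3                    # number of drug types
n_per_group = 5          # assumed equal group size

ms_within = ss_within / df_within

# Standard error of a difference between two group means.
se_diff = math.sqrt(ms_within * (1 / n_per_group + 1 / n_per_group))

# Scheffe half-width: sqrt((k - 1) * F_crit) * SE.
F_CRIT_2_12 = 3.885      # F(0.05; 2, 12) from a standard F table
half_width = math.sqrt((k - 1) * F_CRIT_2_12) * se_diff

# Drug 1 vs drug 2: mean difference 7.0 from the table.
mean_diff_1_2 = 7.0
lower = mean_diff_1_2 - half_width
upper = mean_diff_1_2 + half_width

print(f"Std. Error = {se_diff:.4f}")
print(f"Scheffe 95% CI (1 vs 2) = ({lower:.3f}, {upper:.3f})")
```

The interval for drug 1 versus drug 2 includes 0, matching the non-significant Scheffe p-value of .091 for that pair.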
ANOVA
ANOVA with Eta
1) Setup Data
2) Select Compare Means -> Means
3) Select Dependent List
4) Select Independent List
5) Click Options
6) Check Anova Table with Eta
7) Click Continue
ANOVA Table

                                              Sum of Squares   df   Mean Square        F   Sig.
Time to Relief in       Between Groups               423.333    2       211.667   10.160   .003
Minutes * Drug Type     Within Groups                250.000   12        20.833
                        Total                        673.333   14

Measures of Association

                                          Eta   Eta Squared
Time to Relief in Minutes * Drug Type    .793          .629
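The F ratio and the eta values are simple ratios of the sums of squares in the ANOVA table: F is the between-groups mean square over the within-groups mean square, and eta squared is the between-groups share of the total sum of squares. A minimal sketch:

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table.
ss_between, df_between = 423.333, 2
ss_within, df_within = 250.000, 12
ss_total = ss_between + ss_within

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

eta_squared = ss_between / ss_total   # share of variance explained by drug type
eta = math.sqrt(eta_squared)

print(f"F = {f_ratio:.3f}")
print(f"Eta = {eta:.3f}, Eta Squared = {eta_squared:.3f}")
```

An eta squared of about .63 means that roughly 63% of the variation in time to relief is associated with drug type in this sample.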
ANOVA
Repeated Measures ANOVA
1) Setup Data
2) General Linear Model -> Repeated Measures
3) Enter Within Subject Name
4) Type Number of Levels
5) Enter Measure Name
6) Click Define
7) Select Variables to Within Subjects
8) Click OK
Tests of Within-Subjects Effects
Measure: Time

                                            Type III Sum
Source                                      of Squares        df   Mean Square        F   Sig.
Course               Sphericity Assumed        476.467         2       238.233   15.601   .000
[Between Course      Greenhouse-Geisser        476.467     1.270       375.146   15.601   .001
(Treatment)]         Huynh-Feldt               476.467     1.384       344.184   15.601   .001
                     Lower-bound               476.467     1.000       476.467   15.601   .003
Error(Course)        Sphericity Assumed        274.867        18        15.270
[Within Course       Greenhouse-Geisser        274.867    11.431        24.046
(Treatment)]         Huynh-Feldt               274.867    12.459        22.062
                     Lower-bound               274.867     9.000        30.541

Tests of Within-Subjects Contrasts
Measure: Time

                                 Type III Sum
Source           Course          of Squares        df   Mean Square        F   Sig.
Course           Linear             470.450         1       470.450   43.628   .000
                 Quadratic            6.017         1         6.017     .305   .594
Error(Course)    Linear              97.050         9        10.783
                 Quadratic          177.817         9        19.757

Tests of Between-Subjects Effects
Measure: Time
Transformed Variable: Average

                                 Type III Sum
Source                           of Squares        df   Mean Square          F   Sig.
Intercept                        624386.133         1    624386.133   2526.137   .000
Error [Between Subjects]           2224.533         9       247.170
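The rows of the Tests of Within-Subjects Effects table are related by simple arithmetic. The F ratio is the Course mean square over the Error(Course) mean square, and each corrected row multiplies both degrees of freedom by an epsilon factor, which leaves F unchanged. The sketch below recovers the Greenhouse-Geisser epsilon from the printed degrees of freedom; estimating epsilon from scratch would need the raw data.

```python
# Sphericity Assumed rows of the Tests of Within-Subjects Effects table.
ss_course, df_course = 476.467, 2
ss_error, df_error = 274.867, 18

f_ratio = (ss_course / df_course) / (ss_error / df_error)

# Greenhouse-Geisser epsilon implied by the corrected df in the table.
df_course_gg = 1.270
epsilon_gg = df_course_gg / df_course

# The same epsilon scales the error df; mean squares are recomputed from it.
df_error_gg = df_error * epsilon_gg
ms_error_gg = ss_error / df_error_gg

print(f"F = {f_ratio:.3f}")
print(f"GG epsilon = {epsilon_gg:.3f}, corrected error df = {df_error_gg:.2f}")
print(f"GG error mean square = {ms_error_gg:.2f}")
```

Small differences in the last digit against the SPSS table come from the rounding of the printed degrees of freedom.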
Correlation Analysis
Scatter Diagram
1) Enter Data
2) Legacy Dialogs -> Scatter Plot
3) Select Simple Scatter
4) Click Define
5) Select Dependent Variable into Y Axis
6) Select Independent Variable into X Axis
7) Click OK
Scatter Plot Output
Correlation Analysis
Pearson r correlation coefficient
1) Enter Data
2) Correlate -> Bivariate
3) Select Variables for correlation
4) Click Options
5) Select Cross-product Deviations and Covariances
Correlations

                                                                     Body Mass Index       Systolic Blood Pressure
Body Mass Index           Pearson Correlation                        1                     .860**
                          Sig. (2-tailed)                                                  .001
                          Sum of Squares and Cross-products          286.669               1036.950
                          Covariance                                 Var(X)=31.852         Cov(X,Y)=115.217
                          N                                          10                    10
Systolic Blood Pressure   Pearson Correlation                        .860**                1
                          Sig. (2-tailed)                            .001
                          Sum of Squares and Cross-products          1036.950              5072.500
                          Covariance                                 115.217               Var(Y)=563.611
                          N                                          10                    10

**. Correlation is significant at the 0.01 level (2-tailed).
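The Var(X), Var(Y) and Cov(X,Y) annotations in the Correlations table are enough to recover Pearson's r, since r = Cov(X,Y) / sqrt(Var(X) * Var(Y)). The sketch below also shows that the sums of squares and cross-products give the same answer, because each is just the corresponding variance or covariance multiplied by n - 1.

```python
import math

# Annotated values from the Correlations table (X = BMI, Y = systolic BP).
var_x, var_y, cov_xy = 31.852, 563.611, 115.217
ss_x, ss_y, sp_xy = 286.669, 5072.500, 1036.950
n = 10

r_from_cov = cov_xy / math.sqrt(var_x * var_y)
r_from_ss = sp_xy / math.sqrt(ss_x * ss_y)

# Sums of squares are (n - 1) times the variances.
var_x_check = ss_x / (n - 1)

print(f"r from covariance     = {r_from_cov:.3f}")
print(f"r from sums of squares = {r_from_ss:.3f}")
print(f"Var(X) from SS        = {var_x_check:.3f}")
```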
Correlation Analysis
Linear Regression
1) Define Measure Variables as Scale; Enter Data
2) Regression -> Linear
3) Select Dependent and Independent variables
Model Summary

Model      R        R Square   Adjusted R Square   Std. Error of the Estimate
1          .860 (a)     .739                .707                       12.853

a. Predictors: (Constant), Body Mass Index

Coefficients (a)

                            Unstandardized Coefficients   Standardized Coefficients
Model                       B          Std. Error         Beta                          t       Sig.
1    (Constant)             40.786     21.112                                           1.932   .089
     Body Mass Index         3.617       .759             .860                          4.765   .001

a. Dependent Variable: Systolic Blood Pressure
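For a single predictor, the regression coefficients tie back to the correlation output of the previous section. The slope is Cov(X,Y) / Var(X), its standard error is the standard error of the estimate divided by the square root of the sum of squares of X, and t is their ratio. The sketch below assumes that the covariance and sum-of-squares values from the Pearson section describe the same 10 cases used in this regression. The intercept is not recomputed because the variable means are not shown.

```python
import math

# From the Pearson correlation output (X = BMI, Y = systolic BP).
var_x, cov_xy, ss_x = 31.852, 115.217, 286.669
# From the Model Summary table.
r, se_estimate = 0.860, 12.853

slope = cov_xy / var_x                      # unstandardized B for BMI
se_slope = se_estimate / math.sqrt(ss_x)    # Std. Error of B
t_slope = slope / se_slope                  # t for BMI
r_squared = r ** 2                          # R Square (limited by rounding of r)

print(f"B = {slope:.3f}, Std. Error = {se_slope:.3f}, t = {t_slope:.3f}")
print(f"R Square is approximately {r_squared:.2f}")
```

With one predictor, the standardized coefficient Beta equals Pearson's r, which is why both tables show .860.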
Non-parametric Test
2 Dependent Samples
Sign Test (Legacy Dialog)
1) Define Measure Variables as Scale; Enter Data
2) Nonparametric tests -> Legacy Dialogs ->2 Related Samples
3) Select Before and After variables
4) Check Sign
Sign Test

Frequencies

                                                     N
Postprogram - Baseline   Negative Differences (a)    2
                         Positive Differences (b)    6
                         Ties (c)                    0
                         Total                       8

a. Postprogram < Baseline
b. Postprogram > Baseline
c. Postprogram = Baseline

Test Statistics (b)

                          Postprogram - Baseline
Exact Sig. (2-tailed)     .289 (a)

a. Binomial distribution used.
b. Sign Test
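The exact two-tailed significance of the sign test comes from a Binomial(n, 0.5) distribution on the non-tied pairs: it is twice the probability of seeing as few as the smaller count of signs. A minimal sketch using the counts from the Frequencies table:

```python
from math import comb

# Counts from the Frequencies table; ties are excluded from n.
negative, positive = 2, 6
n = negative + positive
k = min(negative, positive)

# One-tailed probability of k or fewer of the rarer sign under p = 0.5.
p_one_tailed = sum(comb(n, i) for i in range(k + 1)) / 2**n
# Two-tailed exact p-value, capped at 1.
p_two_tailed = min(1.0, 2 * p_one_tailed)

print(f"Exact Sig. (1-tailed) = {p_one_tailed:.3f}")
print(f"Exact Sig. (2-tailed) = {p_two_tailed:.3f}")
```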
Non-Parametric Test
2 Dependent Samples
Wilcoxon Signed Rank (Legacy Dialog)
1) Define Measure Variables as Scale; Enter Data
2) Nonparametric tests -> Legacy Dialogs -> 2 Related Samples
3) Select Before and After Variables
4) Check Wilcoxon
Wilcoxon Signed Ranks Test

Ranks

                                            N       Mean Rank   Sum of Ranks
Postprogram - Baseline   Negative Ranks     2 (a)        5.25          10.50
                         Positive Ranks     6 (b)        4.25          25.50
                         Ties               0 (c)
                         Total              8

a. Postprogram < Baseline
b. Postprogram > Baseline
c. Postprogram = Baseline

Test Statistics (b)

                           Postprogram - Baseline
Z                          -1.053 (a)
Asymp. Sig. (2-tailed)     .292

a. Based on negative ranks.
b. Wilcoxon Signed Ranks Test
Non-parametric Test
2 Dependent Samples
Wilcoxon Signed Rank (New Dialog)
1) Define Measure Variables as Scale; Setup Data
2) Nonparametric tests -> Related Samples
3) Click Customize Analysis
4) Click Fields
5) Select Variables
6) Click Settings
7) Check Sign test
8) Check Wilcoxon
9) Click Run
Hypothesis Output (double-click to drill down to the analysis)
Sign Test Output
Wilcoxon Test Output
Non-Parametric Test
2 Independent Samples
Wilcoxon Rank Sum (Mann Whitney U)
1) Define Treatment type as Ordinal, Measure Variable as Scale, Setup Data
2) Nonparametric tests -> Independent Samples
3) Select Customize analysis
4) Select Fields tab
5) Select Measure variable in Test Fields
6) Select Treatment type variable in Groups
7) Select Settings Tab
8) Check Mann-Whitney U
9) Click Run
Hypothesis Output (one tail test): double-click to drill down to the analysis
Mann-Whitney U Output: use Exact Sig. for one-tail analysis; use Asymp. Sig. for 2-tail analysis
Non-Parametric Test
k Independent Samples
Kruskal-Wallis Test (Legacy Dialog)
1) Define Treatment type as Ordinal, Measure Variable as Scale, Setup Data
2) Nonparametric tests -> Legacy Dialogs -> K Independent Samples
3) Select Test Variable
4) Select Grouping Variable
5) Click Define Range
6) Define range of grouping
7) Click Continue
8) Click OK
Kruskal-Wallis Test

Ranks

        Treatment    N    Mean Rank
Time    0            5         9.70
        15           5        17.40
        40           5        11.20
        50           5         3.70
        Total       20

Test Statistics (a,b)

               Time
Chi-Square     13.692
df             3
Asymp. Sig.    .003

a. Kruskal Wallis Test
b. Grouping Variable: Treatment
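The Kruskal-Wallis statistic can be approximated from the Ranks table alone with H = 12 / (N(N+1)) * sum(n_i * Rbar_i^2) - 3(N+1), where Rbar_i is the mean rank of group i. The sketch below gives the version without a tie correction. It comes out slightly below the 13.692 reported above, which is consistent with a correction for tied ranks being applied to the reported value; that correction cannot be reproduced here because it needs the raw data.

```python
# Group sizes and mean ranks from the Ranks table, keyed by Treatment value.
groups = {
    0: (5, 9.70),
    15: (5, 17.40),
    40: (5, 11.20),
    50: (5, 3.70),
}

n_total = sum(n for n, _ in groups.values())

# Kruskal-Wallis H without a tie correction.
weighted_sq = sum(n * mean_rank**2 for n, mean_rank in groups.values())
h_uncorrected = 12 / (n_total * (n_total + 1)) * weighted_sq - 3 * (n_total + 1)
df = len(groups) - 1

# Sanity check: mean ranks must average to (N + 1) / 2 when weighted by n.
overall_mean_rank = sum(n * r for n, r in groups.values()) / n_total

print(f"H (no tie correction) = {h_uncorrected:.3f}, df = {df}")
print(f"Overall mean rank = {overall_mean_rank:.2f}")
```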
Non Parametric Test
k Independent Samples
Kruskal-Wallis Test (New Dialog)
1) Define Treatment type as Ordinal, Measure Variable as Scale, Setup Data
2) Nonparametric tests -> Independent Samples
3) Select Customize analysis
4) Select Fields tab
5) Select Measure variable in Test Fields
6) Select Treatment type variable in Groups
7) Select Settings
8) Select Customize tests
9) Select Kruskal-Wallis
10) Click Run
Hypothesis Output (one tail): double-click to drill down to the analysis
At the bottom of the output, click the down arrow and select pairwise comparison
n) Pairwise Comparison. Use the “Sig.” column value for significance
Non-Parametric Test
Spearman Correlation (Correlation Between Variables)
1) Define Treatment type as Ordinal, Measure Variable as Scale, Setup Data
2) Correlate -> Bivariate
3) Select Variables
4) Check Spearman
Nonparametric Correlations
[DataSet0]

Correlations

                                                                      Number of         Number of Hours of
                                                                      Cigarettes        Exercise
                                                                      Per Day           Per Day
Spearman's rho   Number of Cigarettes    Correlation Coefficient      1.000             -.454
                 Per Day                 Sig. (2-tailed)              .                 .139
                                         N                            12                12
                 Number of Hours of      Correlation Coefficient      -.454             1.000
                 Exercise Per Day        Sig. (2-tailed)              .139              .
                                         N                            12                12