Inferential Statistics
• Explore relationships between variables
• Test hypotheses
– Research hypothesis: a statement of the relationship between variables.
• An increase in the number of stressors as measured by the LE scale will correspond to an increase in the number of illness incidents as measured by self-report.
– Statistical hypothesis: mathematical statement implying a relationship between variables
– Null hypothesis: mathematical statement implying no relationship
Examples
• You are looking at the relationship between GPA and gender
– Null hypothesis: H0: μf = μm
– Statistical hypothesis: H1: μf ≠ μm
• You are looking at the relationship between stressors and illness
– Statistical hypothesis: H1: ρ ≠ 0
– Null hypothesis: H0: ρ = 0
Tests
• Crosstab
• Difference of Means
• ANOVA
• Bivariate correlation
Crosstab
• Used for 2 categorical variables
• Like a sort box, only you are using frequencies instead of objects
[Figure: sort box with rows blue and orange and columns triangles and circles]
• I want to use a crosstab to see if there is a difference in referral patterns by school
• SPSS: Analyze – Descriptive Statistics – Crosstabs
SCHOOL * Q14. Food Crosstabulation (Count)
                       Q14. Food
SCHOOL                 Yes     No    Total
Madison cross roads     21     55       76
Rainbow                  4     85       89
Rolling hills           14     96      110
New Hope                22     78      100
Morris                  42    168      210
Total                  103    482      585
This is the raw sorted data. It gives you a sense of what's going on, but let's add percentages…
• To read this table, first search for the 100% and then follow the line up or down.
SCHOOL * Q14. Food Crosstabulation
                                             Q14. Food
SCHOOL                                       Yes       No      Total
Madison cross roads   Count                   21       55         76
                      % within SCHOOL      27.6%    72.4%     100.0%
                      % within Q14. Food   20.4%    11.4%      13.0%
Rainbow               Count                    4       85         89
                      % within SCHOOL       4.5%    95.5%     100.0%
                      % within Q14. Food    3.9%    17.6%      15.2%
Rolling hills         Count                   14       96        110
                      % within SCHOOL      12.7%    87.3%     100.0%
                      % within Q14. Food   13.6%    19.9%      18.8%
New Hope              Count                   22       78        100
                      % within SCHOOL      22.0%    78.0%     100.0%
                      % within Q14. Food   21.4%    16.2%      17.1%
Morris                Count                   42      168        210
                      % within SCHOOL      20.0%    80.0%     100.0%
                      % within Q14. Food   40.8%    34.9%      35.9%
Total                 Count                  103      482        585
                      % within SCHOOL      17.6%    82.4%     100.0%
                      % within Q14. Food  100.0%   100.0%     100.0%
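The percentages in the crosstab can be recomputed from the raw counts. The sketch below is a minimal illustration in plain Python, not the SPSS procedure itself; the function name `crosstab_percentages` and the dictionary layout are assumptions made for this example.

```python
# Counts from the SCHOOL * Q14. Food crosstab, stored as (Yes, No) per school.
counts = {
    "Madison cross roads": (21, 55),
    "Rainbow": (4, 85),
    "Rolling hills": (14, 96),
    "New Hope": (22, 78),
    "Morris": (42, 168),
}


def crosstab_percentages(table):
    """Return '% within row' and '% within column' for each row of a crosstab."""
    n_cols = len(next(iter(table.values())))
    # Column totals are the denominators for the "% within Q14. Food" lines.
    col_totals = [sum(row[j] for row in table.values()) for j in range(n_cols)]
    result = {}
    for label, row in table.items():
        # The row total is the denominator for the "% within SCHOOL" lines.
        row_total = sum(row)
        result[label] = {
            "within_row": [100 * c / row_total for c in row],
            "within_col": [100 * c / t for c, t in zip(row, col_totals)],
        }
    return result


pcts = crosstab_percentages(counts)
for school, p in pcts.items():
    row_pct = [round(v, 1) for v in p["within_row"]]
    col_pct = [round(v, 1) for v in p["within_col"]]
    print(school, row_pct, col_pct)
```

Row percentages answer "what share of this school's cases said Yes?", while column percentages answer "what share of all Yes answers came from this school?".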
• The significance test for a crosstab is the chi-square (χ²) test
Chi-Square Tests
                       Value      df    Asymp. Sig. (2-sided)
Pearson Chi-Square     19.778a     4    .001
Likelihood Ratio       22.855      4    .000
N of Valid Cases       585
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 13.38.
The p-value for this hypothesis test is 0.001, which is below 0.05, so you would reject the null hypothesis of no association between school and Q14. Food.
• Pearson's chi-square is by far the most common type of chi-square significance test. If simply "chi-square" is mentioned, it is probably Pearson's chi-square. This statistic is used to test the hypothesis of no association of columns and rows in tabular data. It can be used even with nominal data. Note that chi square is more likely to establish significance to the extent that (1) the relationship is strong, (2) the sample size is large, and/or (3) the number of values of the two associated variables is large. A chi-square probability of .05 or less is commonly interpreted by social scientists as justification for rejecting the null hypothesis that the row variable is unrelated (that is, only randomly related) to the column variable.
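As a rough check on the SPSS output, Pearson's chi-square can be recomputed by hand from the observed counts: each expected count is (row total × column total) / N, and the statistic sums (observed − expected)² / expected over all cells. The sketch below uses only the Python standard library. The function names are assumptions for this example, and the p-value helper relies on a closed form that holds only for even degrees of freedom (here df = 4).

```python
import math

# Observed (Yes, No) counts per school from the crosstab.
observed = [[21, 55], [4, 85], [14, 96], [22, 78], [42, 168]]


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom, minimum expected count)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df, min_expected


def chi_square_p_even_df(stat, df):
    """Upper-tail p-value; this closed form is valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this shortcut only handles positive, even df")
    half = stat / 2
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


stat, df, min_expected = pearson_chi_square(observed)
p_value = chi_square_p_even_df(stat, df)
print(round(stat, 3), df, round(min_expected, 2), round(p_value, 3))
```

The recomputed statistic, degrees of freedom, and minimum expected count should agree with the SPSS table above; for general degrees of freedom a statistics library would be the safer choice for the p-value.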
Difference of means test
• One variable is dichotomous (2 categories only) and the other is continuous
• t-test for significance of the difference of means
– Can look at means within one sample and means between two samples.
• I want to look at the difference in depression symptoms between women who retained custody of their children vs. women who did not after an allegation of child sexual abuse.
• Analyze – compare means – independent samples t test
Independent Samples Test (DEPTOT)
                                          Equal variances assumed   Equal variances not assumed
Levene's Test for Equality of Variances
  F                                       .042
  Sig.                                    .837
t-test for Equality of Means
  t                                       -2.316                    -2.225
  df                                      116                       21.105
  Sig. (2-tailed)                         .022                      .037
  Mean Difference                         -3.3058                   -3.3058
  Std. Error Difference                   1.4274                    1.4856
  95% Confidence Interval of the Difference
    Lower                                 -6.1330                   -6.3944
    Upper                                 -.4786                    -.2172
Group Statistics (DEPTOT)
You kept custody of your child(ren)    N      Mean       Std. Deviation   Std. Error Mean
no                                     17     6.8824     5.7105           1.3850
Yes                                    101    10.1881    5.4013           .5375
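I can use the p-value of 0.022 (the "equal variances assumed" row) because the standard deviations are close enough to assume equal variances. SPSS also tests this with Levene's test for equality of variances in the table above: its significance of .837 is well above 0.05, so the assumption of equal variances is not rejected.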
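The t values and degrees of freedom in the SPSS output can be reproduced from the group statistics alone. The sketch below is an illustration in plain Python, with a hypothetical function name `t_from_summary`; it computes both the pooled (equal variances assumed) and the Welch (equal variances not assumed) versions. It stops short of the p-value, which needs the t distribution and is not in the standard library.

```python
import math


def t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistics from group sizes, means, and standard deviations."""
    diff = mean1 - mean2

    # Equal variances assumed: pool the two sample variances.
    df_pooled = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df_pooled
    se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

    # Equal variances not assumed (Welch): keep the variances separate.
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se_welch = math.sqrt(v1 + v2)
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

    return {
        "mean_difference": diff,
        "pooled": {"t": diff / se_pooled, "df": df_pooled, "se": se_pooled},
        "welch": {"t": diff / se_welch, "df": df_welch, "se": se_welch},
    }


# Group statistics from the slide: "no" custody group first, then "Yes".
result = t_from_summary(17, 6.8824, 5.7105, 101, 10.1881, 5.4013)
print(result)
```

The two standard errors differ only slightly here, which is another way of seeing why the choice between the two rows matters little for this data.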
ANOVA
• One variable is categorical with 3 or more categories and the other is continuous
• Looks for a difference between and within groups.
– Takes into account the mean and the variability
• The ANOVA uses the F-test for significance.
– F is the between-groups mean square divided by the within-groups mean square
Descriptives: Q11. Grade (Lower Bound and Upper Bound are the 95% Confidence Interval for Mean)
                        N     Mean   Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
Madison cross roads     40    2.42   1.63             .26          1.90          2.95          0         5
Rainbow                 73    2.90   1.68             .20          2.51          3.30          0         6
Rolling hills           74    2.76   1.87             .22          2.32          3.19          0         11
New Hope                62    3.68   2.42             .31          3.06          4.29          0         8
Morris                  165   2.18   1.74             .14          1.91          2.45          0         5
Total                   414   2.66   1.92             9.44E-02     2.47          2.85          0         11
ANOVA: Q11. Grade
                  Sum of Squares   df    Mean Square   F       Sig.
Between Groups    109.159          4     27.290        7.883   .000
Within Groups     1415.819         409   3.462
Total             1524.978         413
I am trying to find out whether the schools serve children of different grades.
The p-value is less than 0.001 (SPSS displays it as .000), so I can reject the null hypothesis that the mean grade is the same across schools.
Analyze – compare means – one way ANOVA
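To make the F ratio concrete, the sketch below computes a one-way ANOVA from raw values in plain Python and then re-derives the reported F from the sums of squares in the ANOVA table. The function name `one_way_anova` and the three small groups are hypothetical, chosen only so the arithmetic is easy to follow; they are not data from the study.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of raw values."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups: how far each group mean sits from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-groups: how far each value sits from its own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical toy data: three groups of three values each.
f_toy, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_toy, df_b, df_w)

# Re-deriving F from the sums of squares and df reported in the ANOVA table.
reported_f = (109.159 / 4) / (1415.819 / 409)
print(round(reported_f, 3))
```

A large F means the group means differ by more than the spread inside the groups would lead you to expect.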
Bivariate Correlations
• Both variables are continuous
• Measure of the association between the two variables
• Pearson's r is the usual measure of correlation, sometimes called product-moment correlation. It is a measure of association which varies from -1 to +1, with 0 indicating no relationship (random pairing of values) and 1 indicating perfect relationship, taking the form, "The more the x, the more the y, and vice versa." A value of -1 is a perfect negative relationship, taking the form "The more the x, the less the y, and vice versa."
Analyze – correlate - bivariate
Correlations
                                                   Q5. child age first abuse   SEVERE2
Q5. child age first abuse   Pearson Correlation    1.000                       .205**
                            Sig. (2-tailed)        .                           .000
                            N                      412                         412
SEVERE2                     Pearson Correlation    .205**                      1.000
                            Sig. (2-tailed)        .000                        .
                            N                      412                         418
**. Correlation is significant at the 0.01 level (2-tailed).
The Pearson r is 0.205, which shows a weak association between the two variables. The p-value is less than 0.001, so it is significant. If you remember, the proportion of variance explained is r-squared (0.205² ≈ 0.04), which means that child age at first abuse explains about 4% of the variance in abuse severity.
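Pearson's r and r-squared are simple to compute by hand. The sketch below is a minimal plain-Python version; the function name `pearson_r` and the five (x, y) pairs are hypothetical, not the study data. The last lines square the reported r of 0.205 to show where the "about 4%" figure comes from.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy data with a strong positive relationship.
r_toy = pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(r_toy, 3), round(r_toy ** 2, 3))

# Share of variance explained by the correlation reported on the slide.
r_squared_reported = 0.205 ** 2
print(round(r_squared_reported, 3))
```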
Scatterplot: age of child at first abuse and abuse severity
[Figure: scatterplot with Q5. child age first abuse on the x-axis (ticks from -10 to 20) and SEVERE2 on the y-axis (ticks from 0 to 12)]
What does the p-value really mean?
• Based on the idea of the sampling distribution.
– If you have a population and repeatedly sample that population, the sample means will form an approximately normal distribution (provided the samples are reasonably large)
– If SPSS reports a p-value of less than 0.05 for a result, that means that if there were no relationship between the variables in the population and you took 100 samples from that population, you would expect to find a relationship at least as strong as the one SPSS found in fewer than 5 of the 100 samples.
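The sampling-distribution idea can be illustrated with a small simulation. The sketch below draws many samples from a deliberately skewed, hypothetical population (exponential with mean 1) and looks at how the sample means behave. The population, sample size, number of samples, and seed are all assumptions made for this example.

```python
import math
import random
import statistics

SAMPLE_SIZE = 50      # values per sample (assumed for the example)
N_SAMPLES = 2000      # number of repeated samples (assumed)
POP_MEAN = 1.0        # mean of an exponential population with rate 1
POP_SD = 1.0          # standard deviation of that same population


def simulate_sample_means(sample_size, n_samples, seed=0):
    """Draw repeated samples and return the mean of each one."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(sum(sample) / sample_size)
    return means


means = simulate_sample_means(SAMPLE_SIZE, N_SAMPLES)

# Theory says the sample means cluster around the population mean
# with a spread of about sd / sqrt(n), the standard error.
standard_error = POP_SD / math.sqrt(SAMPLE_SIZE)
mean_of_means = statistics.mean(means)
sd_of_means = statistics.stdev(means)

# Share of samples whose mean lands more than 1.96 standard errors away:
# roughly 5 in 100, which is what a 0.05 cutoff refers to.
extreme_share = sum(
    abs(m - POP_MEAN) > 1.96 * standard_error for m in means
) / len(means)

print(round(mean_of_means, 3), round(sd_of_means, 3), round(extreme_share, 3))
```

Even though the individual values are heavily skewed, the sample means pile up symmetrically around the population mean, and only a small share of samples produce a mean far out in the tails purely by chance.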