Lecture 16 Correlation and Coefficient of Correlation By Aziza Munir.
1 Inferences About The Pearson Correlation Coefficient.
1
Inferences About The Pearson Correlation Coefficient
2
STUDENTS  Y (GPA)  X (SAT)  (Y - Ȳ)  (X - X̄)  (X - X̄)(Y - Ȳ)
A         1.6      400      -0.97    -145.80   141.43
B         2.0      350      -0.57    -195.80   111.61
C         2.2      500      -0.37     -45.80    16.95
D         2.8      400       0.23    -145.80   -33.53
E         2.8      450       0.23     -95.80   -22.03
F         2.6      550       0.03       4.20     0.13
G         3.2      550       0.63       4.20     2.65
H         2.0      600      -0.57      54.20   -30.89
I         2.4      650      -0.17     104.20   -17.71
J         3.4      650       0.83     104.20    86.49
K         2.8      700       0.23     154.20    35.47
L         3.0      750       0.43     204.20    87.81
Sum       30.80    6550.0                      378.33
Mean      2.57     545.80
S.D.      0.54     128.73
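The deviation and product columns of the table can be rebuilt with a few lines of Python. This is a minimal sketch, not part of the original slides; the variable names are illustrative. It uses unrounded means, so individual rows may differ from the slide in the second decimal place, while the sum of products should still come out at about 378.33.

```python
# GPA (Y) and SAT (X) scores for students A-L, copied from the table above.
gpa = [1.6, 2.0, 2.2, 2.8, 2.8, 2.6, 3.2, 2.0, 2.4, 3.4, 2.8, 3.0]
sat = [400, 350, 500, 400, 450, 550, 550, 600, 650, 650, 700, 750]

n = len(gpa)
mean_y = sum(gpa) / n
mean_x = sum(sat) / n

# Deviation of each score from its mean, and the product of the two deviations.
dev_y = [y - mean_y for y in gpa]
dev_x = [x - mean_x for x in sat]
products = [dx * dy for dx, dy in zip(dev_x, dev_y)]
sum_products = sum(products)

for name, dy, dx, p in zip("ABCDEFGHIJKL", dev_y, dev_x, products):
    print(f"{name}  {dy:6.2f}  {dx:8.2f}  {p:8.2f}")
print(f"Sum of products: {sum_products:.2f}")
```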
3
Calculation of Covariance & Correlation
4
The development of a sampling distribution of sample r values
Population of visual acuity and neck size “scores”: ρ = 0
Sample 1: r = -0.8
Sample 2: r = +.15
Sample 3: r = +.02
Etc.
[Figure: relative frequency distribution of the sample r values, centered at µr = 0]
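The idea of a sampling distribution of r can be illustrated by simulation. The sketch below is an assumption-laden illustration, not the slide's own procedure: it draws many samples of two independent normal variables (so ρ = 0), computes r for each, and averages them. The sample size of 12, the 2000 replications, the seed, and the helper name `pearson_r` are all arbitrary choices.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation computed from sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)   # fixed seed so the run is repeatable
sample_size = 12         # assumed size of each sample
n_samples = 2000         # assumed number of repeated samples

rs = []
for _ in range(n_samples):
    # X and Y are drawn independently, so the population correlation is 0.
    xs = [rng.gauss(0, 1) for _ in range(sample_size)]
    ys = [rng.gauss(0, 1) for _ in range(sample_size)]
    rs.append(pearson_r(xs, ys))

mean_r = sum(rs) / n_samples
print(f"mean of sample r values: {mean_r:.3f}")
print(f"smallest r: {min(rs):.2f}, largest r: {max(rs):.2f}")
```

Individual samples can show sizable positive or negative r values even though ρ = 0; only the distribution as a whole is centered near zero.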
5
Steps in Test of Hypothesis
1. Determine the appropriate test
2. Establish the level of significance: α
3. Determine whether to use a one tail or two tail test
4. Calculate the test statistic
5. Determine the degree of freedom
6. Compare computed test statistic against a tabled/critical value
Same as Before
6
1. Determine the Appropriate Test
Check assumptions:
• Both the independent and dependent variables (X, Y) are measured on an interval or ratio level.
• Pearson’s r is suitable for detecting linear relationships between two variables and is not appropriate as an index of curvilinear relationships.
• The variables are bivariate normal (scores for variable X are normally distributed for each value of variable Y, and vice versa).
• Scores must be homoscedastic (for each value of X, the variability of the Y scores must be about the same).
Pearson’s r is robust with respect to the last two, especially when the sample size is large.
7
2. Establish Level of Significance
α is a predetermined value
The convention:
• α = .05
• α = .01
• α = .001
8
3. Determine Whether to Use a One or Two Tailed Test
H0 : ρXY = 0
Ha : ρXY ≠ 0
Ha : ρXY > or < 0
Two Tailed Test if no direction is specified
One Tailed Test if direction is specified
9
4. Calculating Test Statistics
10
5. Determine Degrees of Freedom
For Pearson’s r df = N – 2
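Steps 4 and 5 can be sketched together in code. The formula below, t = r·√(N − 2) / √(1 − r²), is the standard t statistic for testing ρ = 0 and is assumed here because the slide's own formula did not survive extraction at this point; the function name `t_from_r` is illustrative.

```python
import math

def t_from_r(r, n):
    """Return (t, df) for testing H0: rho = 0, given sample r and sample size n."""
    df = n - 2                                  # degrees of freedom for Pearson's r
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df

# The same r yields a larger t as the sample size grows.
for n in (12, 30, 100):
    t, df = t_from_r(0.50, n)
    print(f"n = {n:3d}  df = {df:3d}  t = {t:.2f}")
```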
11
6. Compare the Computed Test Statistic Against a Tabled Value
α = .05
Identify the Region (s) of Rejection.
Look up tα corresponding to degrees of freedom
12
Example of Correlations Between SAT and GPA scores
Formulate the Statistical Hypotheses.
Ho : ρXY = 0
Ha : ρXY ≠ 0
α = 0.05
Collect a sample of data, n = 12
13
Data
14
Calculation of Difference of Y and mean of Y
15
Calculation of Difference of X and Mean of X
16
Calculation of Product of Differences
17
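Covariance & Correlation
$s_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n - 1} = \frac{378.33}{11} = 34.39$
$r_{XY} = \frac{s_{XY}}{s_X s_Y} = \frac{34.39}{(0.54)(128.73)} = 0.50$
18
Calculate t-statistics
$s_r = \sqrt{\frac{1 - r^2}{n - 2}} = \sqrt{\frac{1 - 0.25}{10}} = 0.2738$
$t_c = \frac{r}{s_r} = \frac{0.50}{0.2738} = 1.82$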
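The covariance, correlation, and t statistic can be checked against the raw SAT and GPA scores from the data table. This is a sketch with illustrative variable names, working at full precision rather than with the rounded values on the slides. As a result r comes out slightly below 0.50 and t slightly below the slide's 1.82, which is based on r rounded to 0.50.

```python
import math
from statistics import mean, stdev

# GPA (Y) and SAT (X) scores for students A-L, from the data table.
gpa = [1.6, 2.0, 2.2, 2.8, 2.8, 2.6, 3.2, 2.0, 2.4, 3.4, 2.8, 3.0]
sat = [400, 350, 500, 400, 450, 550, 550, 600, 650, 650, 700, 750]

n = len(gpa)
mean_y, mean_x = mean(gpa), mean(sat)

# Sample covariance: sum of cross-products of deviations divided by n - 1.
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sat, gpa)) / (n - 1)

# Sample standard deviations (also with n - 1 in the denominator).
s_y, s_x = stdev(gpa), stdev(sat)

r = cov_xy / (s_x * s_y)                 # Pearson correlation
s_r = math.sqrt((1 - r ** 2) / (n - 2))  # standard error of r under H0
t_c = r / s_r                            # computed t statistic, df = n - 2

print(f"covariance = {cov_xy:.2f}")
print(f"s_Y = {s_y:.2f}, s_X = {s_x:.2f}")
print(f"r = {r:.4f}, s_r = {s_r:.4f}, t = {t_c:.3f}")
```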
19
Check Significance
Identify the Region(s) of Rejection: tα = 2.228
Make Statistical Decision and Form Conclusion.
tc = 1.82 < tα = 2.228: Fail to reject Ho
p-value = 0.095 > α = 0.05: Fail to reject Ho
Or use Table B-6: rc = 0.50 < rα = .576: Fail to reject Ho
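The decision rule can also be checked without a printed table. The sketch below is one possible approach, not the slides' method: it integrates the Student t density numerically with Simpson's rule to get a two-tailed p-value, using only the standard library. The helper names are illustrative. The critical value 2.228 and the computed t of 1.82 are taken from the slides. The p-value should come out at about 0.10, in the same range as the 0.095 reported above, with small differences due to rounding of t.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on the density from 0 to |t|."""
    b = abs(t)
    h = b / steps                        # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3         # P(0 < T < |t|)
    return 1 - 2 * central_area

t_c, df, alpha = 1.82, 10, 0.05          # values from the worked example
t_crit = 2.228                           # tabled two-tailed critical value

p_value = two_tailed_p(t_c, df)
reject = abs(t_c) > t_crit

print(f"p-value = {p_value:.3f}")
print("Reject Ho" if reject else "Fail to reject Ho")
```

Both comparisons lead to the same decision: the computed t falls short of the critical value, and the p-value exceeds α.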
20
Practical Significance in Pearson r
Judge the practical significance or the magnitude of r within the context of what you would expect to find, based on reason and prior studies.
The magnitude of r is expressed in terms of r² or the coefficient of determination.
In our example, r² is .50² = .25 (the proportion of variance that is shared by the two variables).
21
Intuitions about Percent of Variance Explained
22
Sample Size in Pearson r
To estimate the minimum sample size needed for r, you need to do a power analysis. For example, given α = .05, an effect size (population r or ρ) of 0.20, and a power of .80, 197 subjects would be needed (refer to Table 9-1).
Note: ρ = .10 (small), ρ = .30 (medium), ρ = .50 (large)
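A rough version of this power analysis can be sketched with the Fisher z approximation, n ≈ ((z for α/2 + z for power) / atanh(ρ))² + 3. This is a swapped-in technique, not the method behind Table 9-1, so it lands in the same neighborhood as the tabled 197 without necessarily matching it. The function name and the two-tailed default are assumptions.

```python
import math
from statistics import NormalDist

def sample_size_for_r(rho, alpha=0.05, power=0.80):
    """Approximate n needed to detect a population correlation rho
    with a two-tailed test, using the Fisher z transformation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical z
    z_power = NormalDist().inv_cdf(power)          # z matching the desired power
    fisher_z = math.atanh(rho)                     # Fisher z of the effect size
    return math.ceil(((z_alpha + z_power) / fisher_z) ** 2 + 3)

for rho in (0.10, 0.20, 0.30, 0.50):
    print(f"rho = {rho:.2f}  approximate n = {sample_size_for_r(rho)}")
```

Smaller effect sizes require far larger samples, which is why the small, medium, and large benchmarks matter when planning a study.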
23
Magnitude of Correlations
ρ = .10 (small)
ρ = .30 (medium)
ρ = .50 (large)
24
Factors Influencing the Pearson r
Linearity. To the extent that a bivariate distribution departs from linearity, correlation will be lower.
Outliers. Discrepant data points affect the magnitude of the correlation.
Restriction of Range. Restricted variation in either Y or X will result in a lower correlation.
Unreliable Measures will result in a lower correlation.
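Restriction of range is easy to see in a simulation. The sketch below is an illustration under assumed settings, not data from the lecture: it generates pairs with a population correlation of 0.6, then recomputes r using only the cases whose X value lies in a narrow band around the mean. The seed, the sample size of 5000, and the cut-off of 0.5 are arbitrary.

```python
import math
import random

def pearson_r(pairs):
    """Pearson correlation for a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)   # fixed seed so the run is repeatable
rho = 0.6                # assumed population correlation

pairs = []
for _ in range(5000):
    x = rng.gauss(0, 1)
    # Y shares rho of its standardized variation with X; the rest is noise.
    y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
    pairs.append((x, y))

full_r = pearson_r(pairs)

# Keep only cases with X close to its mean: a restricted range of X.
restricted = [(x, y) for x, y in pairs if abs(x) < 0.5]
restricted_r = pearson_r(restricted)

print(f"full range:       r = {full_r:.2f} (n = {len(pairs)})")
print(f"restricted range: r = {restricted_r:.2f} (n = {len(restricted)})")
```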
25
Take Home Lesson
How to calculate correlation and test if it is different from a constant