Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or...

30
Correlation and Regression A BRIEF overview

Transcript of Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or...

Page 1: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Correlation and Regression

A BRIEF overview

Page 2: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Correlation Coefficients

Continuous IV & DV or dichotomous variables (code as 0-1)

mean interpreted as proportion Pearson product moment correlation

coefficient range -1.0 to +1.0

Page 3: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Interpreting Correlations

1.0, + or - indicates perfect relationship 0 correlations = no association between

the variables in between - varying degrees of

relatedness r2 as proportion of variance shared by two

variables which is X and Y doesn’t matter

Page 4: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Positive Correlation

regression line is the line of best fit With a 1.0 correlation, all points fall

exactly on the line 1.0 correlation does not mean values

identical the difference between them is

identical

Page 5: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Negative Correlation

If r=-1.0 all points fall directly on the regression line

slopes downward from left to right sign of the correlation tells us the

direction of relationship number tells us the size or magnitude

Page 6: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Zero correlation

no relationship between the variables a positive or negative correlation gives

us predictive power

Page 7: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Direction and degree

Page 8: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Direction and degree (cont.)

Page 9: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Direction and degree (cont.)

Page 10: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Correlation Coefficient r = Pearson Product-Moment Correlation Coefficient zx = z score for variable x

zy = z score for variable y N = number of paired X-Y values Definitional formula (below)

N

zzr yx )(

Page 11: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Raw score formula

])(][)([ 2222 YYNXXN

YXXYNr

Page 12: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Interpreting correlation coefficients comprehensive description of relationship direction and strength need adequate number of pairs more than 30 or so same for sample or population population parameter is Rho (ρ) scatterplots and r more tightly clustered around line=higher correlation

Page 13: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Examples of correlations

-1.0 negative limit -.80 relationship between juvenile street

crime and socioeconomic level .43 manual dexterity and assembly line

performance .60 height and weight 1.0 positive limit

Page 14: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Describing r’s Effect size index-Cohen’s guidelines:

Small – r = .10, Medium – r = .30, Large – r = .50

Very high = .80 or more Strong = .60 - .80 Moderate = .40 - .60 Low = .20 - .40 Very low = .20 or less

small correlations can be very important

Page 15: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Correlation as causation??

Page 16: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Nonlinearity and range restriction

if relationship doesn't follow a linear pattern Pearson r useless

r is based on a straight line function if variability of one or both variables is

restricted the maximum value of r decreases

Page 17: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Linear vs. curvilinear relationships

Page 18: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Linear vs. curvilinear (cont.)

Page 19: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Range restriction

Page 20: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Range restriction (cont.)

Page 21: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Understanding r

Page 22: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Simple linear regression

enables us to make a “best” prediction of the value of a variable given our knowledge of the relationship with another variable

generate a line that minimizes the squared distances of the points in the plot

no other line will produce smaller residuals or errors of estimation

least squares property

Page 23: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Regression line

The line will have the form Y'=A+BX Where: Y' = predicted value of Y A = Y intercept of the line B = slope of the line X = score of X we are using to predict Y

Page 24: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Ordering of variables

which variable is designated as X and which is Y makes a difference

different coefficients result if we flip them generally if you can designate one as

the dependent on some logical grounds that one is Y

Page 25: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Moving to prediction

statistically significant relationship between college entrance exam scores and GPA

how can we use entrance scores to predict GPA?

Page 26: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Best-fitting line (cont.)

Page 27: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Best-fitting line (cont.)

Page 28: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Calculating the slope (b)

N=number of pairs of scores, rest of the terms are the sums of the X, Y, X2, Y2, and XY columns we’re already familiar with

22 )()(

))(()(

XXN

YXXYNb

Page 29: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Calculating Y-intercept (a)

b = slope of the regression line the mean of the Y values the mean of the X values

Y

X

XbYa )(

Page 30: Correlation and Regression A BRIEF overview Correlation Coefficients l Continuous IV & DV l or dichotomous variables (code as 0-1) n mean interpreted.

Let’s make up a small example

SAT – GPA correlation How high is it generally? Start with a scatter plot Enter points that reflect the relationship

we think exists Translate into values Calculate r & regression coefficients