Smith/Davis (c) 2005 Prentice Hall Chapter Eight Correlation and Prediction PowerPoint Presentation...

Smith/Davis (c) 2005 Prentice Hall

Chapter Eight

Correlation and Prediction

PowerPoint Presentation created by Dr. Susan R. Burns, Morningside College


Correlation

Correlation is the extent to which two variables are related.

If the two variables are highly related, then knowing the value of one of them will allow you to predict the other variable with considerable accuracy.

The less strongly the two variables are related, the less accurately you can predict one when you know the other.


The Nature of Correlation

Often used as a means of prediction, correlation tells us how strongly two variables are related.

However, note that even though two variables may be highly correlated, you should not assume that one variable causes the other.

CORRELATION DOES NOT IMPLY CAUSATION.

– For example, there is the third-variable possibility (i.e., one or more additional variables may be causing the two variables you are investigating to be related to each other).

“There’s a significant NEGATIVE correlation between the number of mules and the number of academics in a state, but remember, correlation is not causation”


The Scatterplot: Graphing Correlations

Also known as the scatter diagram, the scatterplot allows us to visually see the relation between two variables.

One variable is plotted on the ordinate (the vertical, Y axis) and the other on the abscissa (the horizontal, X axis).

– Although you can list either variable on either axis, it is common to place the variable you are attempting to predict on the ordinate.

– Positive correlations – occur when both variables move in the same direction (e.g., as SAT scores increase, so too do GPAs).

– Negative correlations – occur when, as one variable increases, the other decreases (e.g., as age increases, the number of speeding tickets decreases).


The Scatterplot: Graphing Correlations


The Pearson Product-Moment Correlation Coefficient

The correlation coefficient is the single number that represents the degree of relation between two variables.

The Pearson Product-Moment Correlation Coefficient (symbolized by r) is the most common measure of correlation; researchers calculate it when both the X variable and the Y variable are interval or ratio scale measurements. Mathematically, it can be defined as the average of the cross-products of z-scores.
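The z-score definition can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's procedure: the function name `pearson_r_from_z` and the five pairs of scores are hypothetical, and the z-scores are assumed to use the population standard deviation (dividing by N) so that the plain average of the cross-products equals r.

```python
from math import sqrt

def pearson_r_from_z(xs, ys):
    """Pearson r as the average cross-product of z-scores.

    Assumption: z-scores use the population SD (divide by N), so that
    averaging the cross-products over N gives r.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    z_x = [(x - mean_x) / sd_x for x in xs]
    z_y = [(y - mean_y) / sd_y for y in ys]
    # Average of the cross-products of paired z-scores
    return sum(zx * zy for zx, zy in zip(z_x, z_y)) / n

# Hypothetical scores for five participants
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r_from_z(x, y)
print(round(r, 3))
```

For these made-up scores the function returns a positive r of roughly .77, reflecting that higher X scores tend to go with higher Y scores.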

The raw score formula for r is:
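The formula itself is not reproduced in this transcript. A commonly used raw-score form is given here as a reference; the notation (N for the number of X–Y pairs) is an assumption and may differ from the original slide:

```latex
r = \frac{N\sum XY - \left(\sum X\right)\left(\sum Y\right)}
         {\sqrt{\left[N\sum X^{2} - \left(\sum X\right)^{2}\right]
                \left[N\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}
```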


The Range of r Values

The Range of r – correlation coefficients can range in value from -1.00 to +1.00.

– A correlation of -1.00 indicates a perfect negative correlation between the two variables of interest. That is, whenever there is an increase of one unit in one variable, there is always the same proportional decrease in the other variable.



– A zero correlation means there is little or no linear relation between the two variables. That is, as scores on one variable increase, scores on the other variable may increase, decrease, or not change at all.



– Perfect positive correlation occurs when you have a value of +1.00: as we see an increase of one unit in one variable, we always see a proportional increase in the other variable.

– The existence of a perfect correlation indicates there are no other factors present that influence the relation we are measuring. This situation rarely occurs in real life.
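The three landmark values of r can be illustrated with a short Python sketch. The helper `pearson_r_raw` is a hypothetical name, it uses the standard raw-score computation, and all of the data are made up: two exact straight-line relations and one U-shaped pattern.

```python
from math import sqrt

def pearson_r_raw(xs, ys):
    """Pearson r computed from raw-score sums (X, Y, XY, X^2, Y^2)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data: Y rises or falls by exactly 2 units per unit of X
x = [1, 2, 3, 4, 5]
perfect_positive = pearson_r_raw(x, [2 * v + 3 for v in x])
perfect_negative = pearson_r_raw(x, [-2 * v + 3 for v in x])

# Hypothetical U-shaped data: Y falls and then rises as X increases
centered = [-2, -1, 0, 1, 2]
zero = pearson_r_raw(centered, [v * v for v in centered])

print(perfect_positive, perfect_negative, zero)
```

The U-shaped case gives r = 0 even though Y is completely determined by X, because r measures only the straight-line component of a relation.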


Interpreting Correlation Coefficients

A statistically significant result is one that would occur only rarely by chance alone.

If the correlation you calculate is sufficiently large that it would occur rarely by chance, then you have reason to believe that these two variables are related.

The conventional standard for significance in psychology is the .05 level.

That is, a result is significant when it would occur by chance 5 or fewer times out of a hundred.

Researchers who are more cautious may choose to adopt a .01 level of significance.
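One way to make "occurs rarely by chance" concrete is a permutation test, sketched below in Python. This is a swapped-in illustration rather than the table-of-critical-values approach a textbook would normally use: the Y scores are shuffled many times to break any real pairing, and we count how often a shuffled correlation is at least as large as the observed one. The function names, the data, the number of shuffles, and the seed are all hypothetical choices.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from deviation scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

def permutation_p_value(xs, ys, n_shuffles=5000, seed=0):
    """Proportion of shuffles whose |r| is at least the observed |r|."""
    rng = random.Random(seed)      # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)      # break the X-Y pairing at random
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    return hits / n_shuffles

# Hypothetical scores for ten participants
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
p = permutation_p_value(x, y)
print("significant at .05:", p < 0.05)
print("significant at .01:", p < 0.01)
```

A proportion below .05 means that chance pairings produced a correlation this large fewer than 5 times in a hundred, which matches the verbal definition above.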


Effect Size

Even though statistical significance is an important component of psychological research, it may not tell us very much about the magnitude of our results.

Effect size refers to the size or magnitude of the effect an independent variable (IV) produced in an experiment or the size or magnitude of a correlation.

Effect size calculation is important because, unfortunately, a research result can be significant and yet the effect size may be quite small.

This situation arises because, as sample size gets larger, the critical value needed to achieve significance becomes smaller.
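The shrinking critical value can be sketched numerically. The Python below is a rough illustration under stated assumptions: it uses the relation r = t / sqrt(t² + df) with df = N − 2, and it substitutes the normal quantile for the t quantile, so it is only a large-sample approximation and not a replacement for a table of critical values. The function name `approx_critical_r` and the sample sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_critical_r(n, alpha=0.05):
    """Approximate two-tailed critical value of r for n pairs.

    Assumption: the normal quantile stands in for the t quantile, which
    is reasonable only when n is large.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z / sqrt(z ** 2 + n - 2)

for n in (50, 500, 5000):
    r_crit = approx_critical_r(n)
    # r_crit ** 2 is the proportion of variance such a correlation explains
    print(n, round(r_crit, 3), round(r_crit ** 2, 4))
```

Under this approximation the critical value falls from about .27 with 50 pairs to about .03 with 5000 pairs, so with a very large sample a correlation can be significant while accounting for well under 1% of the variance.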


Effect Size

To calculate the effect size for the Pearson product-moment correlation, all you have to do is square the correlation coefficient.

r² is known as the coefficient of determination. Multiply the coefficient of determination by 100 and you will see what percentage of the variance is accounted for by the correlation.

The higher r² becomes, the more variance is accounted for by the relation between the two variables under study.

Lower r² values indicate that factors other than the two variables of interest are influencing the relation in which we are interested.
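The arithmetic is simple enough to show directly. In this minimal Python sketch the function name `variance_accounted_for` and the three example correlations are hypothetical:

```python
def variance_accounted_for(r):
    """Percentage of variance accounted for by a correlation r (r squared times 100)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.00 and +1.00")
    return (r ** 2) * 100

# Hypothetical correlations; the sign of r does not affect r squared
for r in (0.10, 0.50, -0.80):
    pct = variance_accounted_for(r)
    print(f"r = {r:+.2f}  r^2 = {r ** 2:.2f}  variance accounted for = {pct:.0f}%")
```

Note how quickly the percentage drops: a correlation of .50 accounts for only 25% of the variance, and a correlation of .10 for only 1%.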


Prediction

Generally speaking, regression refers to the prediction of one variable from our knowledge of another variable.

We label the variable that is being predicted as the Y variable and refer to it as the criterion variable.

We label the variable that we are predicting from as the X variable and refer to it as the predictor variable.

In other words, we use X to predict Y.


Prediction

The Regression Equation – the regression equation is the statistical basis of prediction:
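The equation itself is not reproduced in this transcript. The standard straight-line form is shown here, with a as the Y intercept and b as the slope, matching how those terms are used in the rest of the chapter. The hat notation for the predicted value of Y is an assumption; some texts write Y′ instead.

```latex
\hat{Y} = a + bX
```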


The Regression Equation

The regression line is a graphical display of the relation between the values on the predictor variable and predicted values on the criterion variable. It is similar to the scatterplot used to display correlations.

The calculation of b can be done in a couple of ways:



A second way to calculate b involves the following formula:



The calculation of a is as follows:
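The formulas for b and a are not reproduced in this transcript. Two standard, equivalent ways of computing the slope, followed by the usual formula for the intercept, are given here as a reference. Which form the original slide presented first is not known, and the symbols s_X and s_Y (standard deviations of X and Y) and N (number of pairs) are assumed notation; M_X and M_Y are the means of X and Y.

```latex
\begin{gather*}
b = r\,\frac{s_Y}{s_X} \\[4pt]
b = \frac{N\sum XY - \left(\sum X\right)\left(\sum Y\right)}
         {N\sum X^{2} - \left(\sum X\right)^{2}} \\[4pt]
a = M_Y - b\,M_X
\end{gather*}
```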


Constructing the Regression Line

There are two ways to construct the regression line:

– First, you could calculate several predicted values, plot these values, connect the points, and then extend the line to the Y intercept. This procedure works, but carries with it the potential for calculation errors and inaccurate points.

– The second procedure is less likely to have calculation errors:

Locate the Y intercept (a).

Plot the point defined by Mx and My.

The line that passes through these points is the regression line.

– Remember, the steepness of the regression line is known as the slope, whereas the point at which this line crosses the vertical axis is called the Y intercept.
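The second procedure works because the regression line always passes through both the Y intercept and the point of means. A minimal Python sketch, assuming the usual least-squares slope and intercept, checks this on made-up data; the function name `fit_regression_line` and the scores are hypothetical.

```python
def fit_regression_line(xs, ys):
    """Return (a, b, mean_x, mean_y) for the least-squares line Y-hat = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of cross-products of deviations over sum of squared X deviations
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    # Intercept: where the line crosses the vertical axis (X = 0)
    a = mean_y - b * mean_x
    return a, b, mean_x, mean_y

# Hypothetical predictor (X) and criterion (Y) scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b, mean_x, mean_y = fit_regression_line(x, y)

def predict(x_value):
    """Predicted Y for a given X, using the fitted line."""
    return a + b * x_value

print("intercept a:", round(a, 2), "slope b:", round(b, 2))
print("line at X = 0:", round(predict(0), 2))            # equals a
print("line at Mx:", round(predict(mean_x), 2), "My:", mean_y)
```

Because two points determine a straight line, plotting (0, a) and (Mx, My) and joining them gives the same line as plotting many predicted values, with fewer calculations to get wrong.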


Constructing the Regression Line
