Correlation analysis notes

COMPILED BY MUTHAMA, JAPHETH MUTINDA

CORRELATION

INTRODUCTION

Objectives of the presentation

After going through this presentation, the listener is expected to:
1. Be able to present the results of analysed research data.
2. Make effective interpretation of the relationship between research variables.
3. Draw implications or inferences from the variables in the study model.

Definition

Correlation (r) is the statistical measure of how two variables move in relation to each other.

It measures the relative strength of the relationship between two variables.

Correlation is expressed as a single number known as the correlation coefficient, which ranges between -1 and +1.

Coefficient of correlation

The coefficient of correlation is a technique for determining the degree of correlation between two or more variables across the observed values of the study variables.

The correlation, if any, found through this approach is then applied in a statistical method (regression analysis) that formulates a mathematical model of the relationship among the variables. That model can be used to predict the values of the dependent variable, given the values of the independent variable.

The sample correlation coefficient (r) measures the degree of linearity in the relationship between X and Y.

-1 ≤ r ≤ +1

r = 0: indicates no linear relationship between the research variables.

The + and - signs indicate positive and negative linear correlations respectively.

[Figure: Coefficient of Correlation Analysis, a scale running from a strong negative relationship to a strong positive relationship]

Interpreting Correlation Coefficient (r)

1) Strong correlation: r > 0.70 or r < -0.70
2) Moderate correlation: r is between 0.30 and 0.70, or between -0.30 and -0.70
3) Weak correlation: r is between 0 and 0.30, or between 0 and -0.30
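The cut-offs above can be turned into a small lookup. This is only a sketch: the function name `correlation_strength` is made up for illustration, and because the notes do not say how values of exactly 0.30 or 0.70 should be labelled, treating them as "moderate" is an assumption.

```python
def correlation_strength(r: float) -> str:
    """Label r as strong, moderate or weak using the 0.30 and 0.70 cut-offs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # the sign gives the direction, the size gives the strength
    if size > 0.70:
        return "strong"
    if size >= 0.30:  # assumption: the boundary values count as moderate
        return "moderate"
    return "weak"


for value in (-0.92, 0.82, 0.50, -0.10):
    direction = "negative" if value < 0 else "positive"
    print(f"r = {value:+.2f}: {correlation_strength(value)} {direction} correlation")
```

The sign of r is reported separately from the label because the strength categories depend only on the size of r.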

Methods of studying Correlation

Correlation can be determined using the following methods:

1. Scatter Diagram Method
2. Karl Pearson's Coefficient of Correlation Method
3. Spearman's Rank Correlation Method

SCATTER DIAGRAMS

This is a graph in which the individual data points are plotted in two dimensions, as presented below:

[Figure: scatter diagrams labelled "Very good fit" and "Moderate fit"]

Points clustered closely around a line show a strong correlation. The line is a good predictor of (a good fit to) the data. The more spread out the points, the weaker the correlation and the poorer the fit.

The line is a REGRESSION line (Y = a + bX).

A strong relationship simply means a good linear fit.
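To make the line Y = a + bX concrete, here is a minimal sketch that estimates the intercept a and the slope b. The notes do not say how the line is fitted, so ordinary least squares is an assumption here, and `fit_line` is a hypothetical helper name. The data are the X and Y values from the Karl Pearson illustration in these notes.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: the line passes through the means
    return a, b


xs = [6, 2, 10, 4, 8]
ys = [9, 11, 5, 8, 7]
a, b = fit_line(xs, ys)
print(f"Y = {a:.2f} + ({b:.2f})X")  # Y = 11.90 + (-0.65)X
```

The negative slope says that Y tends to fall as X rises, which is what a scatter diagram of these five points would show.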

Coefficient of determination and the regression line

NOTE:

1. The coefficient of determination (r²) is a measure of how well the regression line represents the data. It gives the proportion of the total variation in Y that is explained by the line of best fit.
2. If the regression line passed exactly through every point on the scatter plot, it would explain all of the variation.
3. The further the line is from the points, the less of the variation it is able to explain.

Coefficient of determination and the regression line (cont.)

For example, in the case of variables X and Y: if r = 0.922, then r² = 0.850.

This means that 85% of the total variation in Y can be explained by the linear relationship between X and Y (as described by the regression equation).

The other 15% of the total variation in Y remains unexplained.
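The arithmetic in this example can be checked in a few lines. This is just a sketch of the calculation described above; the variable names are illustrative.

```python
r = 0.922
r_squared = r ** 2                  # coefficient of determination
explained = round(r_squared * 100)  # percent of variation in Y explained
unexplained = 100 - explained       # percent left unexplained

print(f"r^2 = {r_squared:.3f}")
print(f"explained: {explained}%, unexplained: {unexplained}%")
```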

Karl Pearson's coefficient of correlation (or simple correlation)

This is the most widely used method of measuring the degree of relationship between two variables. It is defined as the measure of the strength of the linear relationship between two variables, computed as the (sample) covariance of the variables divided by the product of their (sample) standard deviations.

This coefficient assumes the following:
(i) that there is a linear relationship between the two variables;
(ii) that the two variables are causally related, which means that one of the variables is independent and the other one is dependent;
(iii) that a large number of independent causes are operating in both variables so as to produce a normal distribution.

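Karl Pearson's coefficient of correlation can be worked out thus:

\[ r_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}} \]

- Shared variability of the X and Y variables: on the top
- Individual variability of the X and Y variables: at the bottom

OR

\[ r = \frac{\operatorname{cov}(x, y)}{\sigma_x \cdot \sigma_y} \]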
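The two forms of the formula on this slide (raw sums, and covariance over the product of the standard deviations) give the same number. The sketch below implements both so they can be compared. The function names are hypothetical, and using the sample (n - 1) divisor in the covariance form is an assumption; the divisor cancels, so a divisor of n would give the same r. The data are the X and Y values from the illustration that follows in these notes.

```python
from math import sqrt


def pearson_raw_sums(xs, ys):
    """r from the raw-sum form: (nΣXY - ΣXΣY) / sqrt([nΣX² - (ΣX)²][nΣY² - (ΣY)²])."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def pearson_cov(xs, ys):
    """r from the covariance form: cov(x, y) / (sd_x * sd_y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)


xs = [6, 2, 10, 4, 8]
ys = [9, 11, 5, 8, 7]
print(round(pearson_raw_sums(xs, ys), 2))  # -0.92
print(round(pearson_cov(xs, ys), 2))       # -0.92
```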

Illustration

From the following data, find the coefficient of correlation by the Karl Pearson method.

X: 6, 2, 10, 4, 8
Y: 9, 11, 5, 8, 7

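Solution (cont.)

\[ \bar{X} = \frac{\sum X}{N} = \frac{30}{5} = 6, \qquad \bar{Y} = \frac{\sum Y}{N} = \frac{40}{5} = 8 \]

\[ r = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}} = \frac{-26}{\sqrt{40 \times 20}} = \frac{-26}{\sqrt{800}} = -0.92 \]

where \( x = X - \bar{X} \) and \( y = Y - \bar{Y} \) are the deviations from the means.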
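The intermediate sums in this solution can be reproduced step by step. The sketch below builds the deviations from the means for the illustration data and prints the three sums that enter the formula; the variable names are illustrative only.

```python
from math import sqrt

xs = [6, 2, 10, 4, 8]
ys = [9, 11, 5, 8, 7]
n = len(xs)

mean_x = sum(xs) / n  # 30 / 5
mean_y = sum(ys) / n  # 40 / 5

# Deviations of each observation from its mean
dx = [x - mean_x for x in xs]
dy = [y - mean_y for y in ys]

sum_dxdy = sum(a * b for a, b in zip(dx, dy))  # sum of cross-products
sum_dx2 = sum(a * a for a in dx)               # sum of squared X deviations
sum_dy2 = sum(b * b for b in dy)               # sum of squared Y deviations

r = sum_dxdy / sqrt(sum_dx2 * sum_dy2)

print(f"means: {mean_x}, {mean_y}")
print(f"sums: {sum_dxdy}, {sum_dx2}, {sum_dy2}")
print(f"r = {r:.2f}")
```

The cross-product sum is negative because the larger X values are paired with the smaller Y values, which is why r comes out negative.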

Spearman's rank coefficient

This is the technique of determining the degree of correlation between two variables in the case of ordinal data, where ranks are given to the different values of the variables.

The main objective of the coefficient is to determine the extent to which the two sets of rankings are similar or dissimilar.

This method is mainly used when the data are not available in numerical form, although numerical values can also be converted to ranks.

Thus, when the values of the two variables are converted to their ranks and the correlation is obtained from those ranks, the correlation is known as rank correlation.

Computation of Rank Correlation

Spearman's rank correlation coefficient (ρ, written R in the formula below) can be calculated when:

• Actual ranks are given
• Ranks are not given, but grades are given and not repeated
• Ranks are not given, and grades are given and repeated

\[ R = 1 - \frac{6\sum D^2}{N(N^2 - 1)} \]

where \( D = R_x - R_y \), \( R_x \) = rank of X and \( R_y \) = rank of y.

Illustration

Calculate Spearman's rank correlation coefficient between advertisement cost and sales from the following data.

Advertisement cost: 39, 65, 62, 90, 82, 75, 25, 98, 36, 78
Sales (Shs): 47, 53, 58, 86, 62, 68, 60, 91, 51, 84

X     Y     R-x   R-y   D     D²
39    47    8     10    -2    4
65    53    6     8     -2    4
62    58    7     7     0     0
90    86    2     2     0     0
82    62    3     5     -2    4
75    68    5     4     1     1
25    60    10    6     4     16
98    91    1     1     0     0
36    51    9     9     0     0
78    84    4     3     1     1

ΣD² = 30

Illustration (cont.)

\[ R = 1 - \frac{6\sum D^2}{N^3 - N} = 1 - \frac{6(30)}{10^3 - 10} = 1 - \frac{180}{990} = 0.82 \]
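The ranking and the final calculation can be reproduced with a short script. This is a sketch under two assumptions: rank 1 goes to the largest value (which matches the ranks in the illustration), and no value is repeated, so the simple formula applies without any correction for tied ranks. The helper names are hypothetical.

```python
def rank_descending(values):
    """Rank each value, with rank 1 for the largest. Assumes no repeated values."""
    if len(set(values)) != len(values):
        raise ValueError("repeated values need a tie correction, not handled here")
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman_r(xs, ys):
    """R = 1 - 6 * sum(D^2) / (N^3 - N), where D is the difference in ranks."""
    n = len(xs)
    rank_x = rank_descending(xs)
    rank_y = rank_descending(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


cost = [39, 65, 62, 90, 82, 75, 25, 98, 36, 78]
sales = [47, 53, 58, 86, 62, 68, 60, 91, 51, 84]

print(rank_descending(cost))   # [8, 6, 7, 2, 3, 5, 10, 1, 9, 4]
print(rank_descending(sales))  # [10, 8, 7, 2, 5, 4, 6, 1, 9, 3]
print(round(spearman_r(cost, sales), 2))  # 0.82
```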

Nonlinear Relationships

In correlation analysis, not all relationships are linear.

In cases where there is clear evidence of a nonlinear relationship, DO NOT use Pearson's Product Moment Correlation (r) to summarize the strength of the relationship between Y and X.
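A small made-up example shows why. In the sketch below, Y is completely determined by X (Y = X²), yet Pearson's r is zero because the relationship is not linear. The data are hypothetical and chosen only for illustration, as is the helper name `pearson_r`.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical data: a perfect U-shaped (quadratic) relationship
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]  # [4, 1, 0, 1, 4]

r = pearson_r(xs, ys)
print(f"r = {r:.2f}")  # r = 0.00
```

Reading r = 0 here as "no relationship" would be wrong, which is why a scatter diagram should be checked before summarizing with r.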

[Figure: Nonlinear correlation scatter graph]

Conclusions

Correlation is the linear association between two numeric variables, e.g. variables X and Y.

The correlation (r) ranges from -1 to +1, where -1 ≤ r ≤ 1.

If r < 0 then there is a negative correlation between X and Y, i.e. as X increases Y generally decreases.

If r > 0 then there is a positive correlation between X and Y, i.e. as X increases Y generally increases.

The closer r is to 0, the weaker the linear association between X and Y.

A diagram explaining different strengths of correlation

The value of r ranges between -1 and +1.

The value of r denotes the strength of the association, as illustrated by the following diagram.

[Figure: scale of r marked at -1, -0.75, -0.25, 0, 0.25, 0.75 and 1. Perfect correlation at -1 and +1; no relation at 0; weak between 0 and ±0.25, intermediate between ±0.25 and ±0.75, strong between ±0.75 and ±1. Negative values are labelled indirect and positive values direct.]

Example of graphs and their interpretation

Negative and positive correlations

No Relationship (r = .00)

Information about Explanatory Flexibility tells you nothing about Emotional Insight.

[Figure: scatter plot of ASIS - Emotional Insight (vertical axis, 1 to 8) against Explanatory Flexibility (horizontal axis, -0.5 to 3.5)]

THANK YOU