Section 9.6 Linear Correlation


Objectives:

1. To see how the method of least squares determines the best-fit line through a set of data points.

2. To calculate the correlation and the coefficient of determination.

Generation   Population
1            1
2            3
3            2
4            3
5            5
6            4
7            5

[Figure: scatter plot of population P (vertical axis, 1 to 6) against generation (horizontal axis, 1 to 7). Caption: Population of a bacteria culture after every generation.]

Finding the line of best fit is called linear regression.

Correlation measures the strength of the relationship between two variables.


Suppose $y = mx + b$ is the equation of the best-fit line. For each data point $(x_i, y_i)$, you can calculate the predicted y-value, $\hat{y}_i$ ("y-sub-i hat"), from the line: $\hat{y}_i = mx_i + b$.

To have a good model, the $\hat{y}_i$ on the best-fit line should be close to the $y_i$ of the original data for each $x_i$.

Positive and negative deviations $y_i - \hat{y}_i$ cancel (for the best-fit line their sum is zero), so instead we minimize the sum of the squared deviations.

This method is called the method of least squares. Since the sum of the squared deviations represents the error between the line and actual data, SSE is used as an abbreviation for the sum of squares error.

$$\text{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} (y_i - mx_i - b)^2$$
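As a minimal sketch of the SSE formula, the Python function below evaluates the sum of squared deviations for any candidate line. The function name `sse` and the two candidate lines are illustrative assumptions, chosen only for comparison; the data are the bacteria populations from the start of the section.

```python
def sse(xs, ys, m, b):
    """Sum of squared deviations between the data and the line y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

# Bacteria culture data: generation and population.
generations = [1, 2, 3, 4, 5, 6, 7]
populations = [1, 3, 2, 3, 5, 4, 5]

# Two hypothetical candidate lines (not the best-fit line).
sse_sloped = sse(generations, populations, 0.5, 1.0)   # y = 0.5x + 1
sse_flat = sse(generations, populations, 0.0, 23 / 7)  # horizontal line at the mean

print(sse_sloped)
print(round(sse_flat, 2))
```

Here the sloped candidate gives an SSE of 4.0, compared with about 13.43 for the horizontal line, so it fits the data better. The method of least squares looks for the line whose SSE is as small as possible.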

You can also compute the sum of squared deviations for the x and y variables separately, and the sum of the products of their deviations.

$$SS_x = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

$$SS_y = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

$$SS_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

Theorem 9.6: Linear Regression

Using the method of least squares, the best-fit line $y = mx + b$ has slope $m = \dfrac{SS_{xy}}{SS_x}$ and y-intercept $b = \bar{y} - m\bar{x}$.
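The theorem translates directly into a few lines of Python. The sketch below is one possible implementation, not a library routine: the function name `best_fit_line` and the sample points are hypothetical, and the check for identical x-values is an added safeguard because the slope formula divides by $SS_x$.

```python
def best_fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if ss_x == 0:
        raise ValueError("all x-values are equal; the slope is undefined")
    m = ss_xy / ss_x       # slope: SSxy / SSx
    b = y_bar - m * x_bar  # intercept: y-bar minus m times x-bar
    return m, b

# Hypothetical data lying exactly on y = 2x + 1.
m, b = best_fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(m, b)
```

Because the sample points lie exactly on a line, the function recovers that line's slope 2 and intercept 1.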

EXAMPLE 1 Give the equation of the line for the bacteria population. Predict the population after the eighth generation.

$\sum x_i = 1+2+3+4+5+6+7 = 28$, so $\bar{x} = 28/7 = 4$

$\sum y_i = 1+3+2+3+5+4+5 = 23$, so $\bar{y} = 23/7 \approx 3.29$

$x_i$   $y_i$   $x_i - \bar{x}$   $y_i - \bar{y}$   $(x_i - \bar{x})^2$   $(y_i - \bar{y})^2$   $(x_i - \bar{x})(y_i - \bar{y})$
1       1       -3                -2.29             9                     5.22                  6.86
2       3       -2                -0.29             4                     0.08                  0.57
3       2       -1                -1.29             1                     1.65                  1.29
4       3       0                 -0.29             0                     0.08                  0.00
5       5       1                 1.71              1                     2.94                  1.71
6       4       2                 0.71              4                     0.51                  1.43
7       5       3                 1.71              9                     2.94                  5.14
Total                                               28                    13.43                 17.00

$SS_x = 28$, $SS_y = 13.43$, $SS_{xy} = 17$

$m = \dfrac{SS_{xy}}{SS_x} = \dfrac{17}{28} \approx 0.61$

$b = \bar{y} - m\bar{x} \approx 3.29 - (0.61)(4) \approx 0.86$

$y = mx + b = 0.61x + 0.86$

$f(8) = 0.61(8) + 0.86 \approx 5.71$

The predicted population after the eighth generation is about 5.71. (The values 0.86 and 5.71 come from carrying unrounded values of $m$ and $\bar{y}$ through the calculation.)
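The arithmetic in Example 1 can be checked with a short Python sketch. It keeps full precision until the final rounding, which is why the intercept and the prediction round to 0.86 and 5.71. The variable names are illustrative.

```python
generations = [1, 2, 3, 4, 5, 6, 7]
populations = [1, 3, 2, 3, 5, 4, 5]

n = len(generations)
x_bar = sum(generations) / n
y_bar = sum(populations) / n

# Column totals from the table of deviations.
ss_x = sum((x - x_bar) ** 2 for x in generations)
ss_y = sum((y - y_bar) ** 2 for y in populations)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(generations, populations))

m = ss_xy / ss_x        # slope
b = y_bar - m * x_bar   # y-intercept
prediction = m * 8 + b  # predicted population after the eighth generation

print(round(ss_x, 2), round(ss_y, 2), round(ss_xy, 2))  # 28.0 13.43 17.0
print(round(m, 2), round(b, 2))                         # 0.61 0.86
print(round(prediction, 2))                             # 5.71
```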

Definition

Correlation: A measure of the strength of the relation between two variables, given by the formula

$$r = \frac{SS_{xy}}{\sqrt{SS_x\,SS_y}}$$

Definition

Coefficient of determination: The square of the correlation, $r^2$.

The ranges for these measures are $0 \le r^2 \le 1$ and $-1 \le r \le 1$. When all the data fall exactly on the least squares line, the model has no error and SSE = 0. This means that $r^2 = 1$ (and $r = 1$ or $-1$). If the model does not help at all, and there is no reduction in error, then SSE = $SS_y$, making $r^2 = 0$ (and $r = 0$).

A correlation of 0 means the model is worthless, and a correlation of ±1 means that it is perfect.
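One way to see why these endpoints hold is to substitute the slope and intercept from Theorem 9.6 into SSE. The derivation below is a sketch that uses only the definitions in this section and assumes $SS_x$ and $SS_y$ are both nonzero.

```latex
\begin{align*}
\text{SSE} &= \sum_{i=1}^{n} (y_i - mx_i - b)^2
            = \sum_{i=1}^{n} \bigl[(y_i - \bar{y}) - m(x_i - \bar{x})\bigr]^2
            && \text{since } b = \bar{y} - m\bar{x} \\
           &= SS_y - 2m\,SS_{xy} + m^2\,SS_x \\
           &= SS_y - \frac{SS_{xy}^2}{SS_x}
            && \text{since } m = \frac{SS_{xy}}{SS_x} \\
\frac{\text{SSE}}{SS_y} &= 1 - \frac{SS_{xy}^2}{SS_x\,SS_y} = 1 - r^2
\end{align*}
```

So $r^2 = 1 - \text{SSE}/SS_y$: when SSE = 0 this gives $r^2 = 1$, and when SSE = $SS_y$ it gives $r^2 = 0$.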

EXAMPLE 2 Find the correlation between generation and population size for bacteria.

$$r = \frac{SS_{xy}}{\sqrt{SS_x\,SS_y}} = \frac{17}{\sqrt{28(13.43)}} \approx 0.88$$

Since $r > 0$, the positive correlation tells us that the slope of the best-fit line is positive. Since $r^2 \approx 0.77$, using the line provides a 77% reduction in error over using the average (the horizontal line $y = \bar{y}$).
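The "reduction in error" reading of $r^2$ can be checked numerically. The following Python sketch, with illustrative variable names, computes $r$ for the bacteria data and compares $r^2$ with the fraction of $SS_y$ that the best-fit line removes, $1 - \text{SSE}/SS_y$.

```python
import math

generations = [1, 2, 3, 4, 5, 6, 7]
populations = [1, 3, 2, 3, 5, 4, 5]

n = len(generations)
x_bar = sum(generations) / n
y_bar = sum(populations) / n

ss_x = sum((x - x_bar) ** 2 for x in generations)
ss_y = sum((y - y_bar) ** 2 for y in populations)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(generations, populations))

# Correlation and coefficient of determination.
r = ss_xy / math.sqrt(ss_x * ss_y)
r_squared = r ** 2

# SSE of the least-squares line, computed directly from its deviations.
m = ss_xy / ss_x
b = y_bar - m * x_bar
sse = sum((y - (m * x + b)) ** 2 for x, y in zip(generations, populations))

# Fraction of the error about the mean that the line removes.
reduction = 1 - sse / ss_y

print(round(r, 2), round(r_squared, 2), round(reduction, 2))  # 0.88 0.77 0.77
```

The last two printed values agree, which matches the interpretation of $r^2$ as the proportional reduction in error.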

Homework

pp. 477-479


Given $SS_x = 100$, $SS_y = 25$, $SS_{xy} = -50$, $\bar{y} = 4$, and $\bar{x} = 6$, find

1. the slope of the best-fit line.

2. the intercept of the best-fit line.

3. the equation of the best-fit line.

4. the correlation $r$ and its meaning.

5. the error SSE of the model.

If $y = 4x + 3$ is the best-fit line by the method of least squares, $SS_x = 2$, and $SS_y = 71$, then

6. predict $y$ when $x$ is 8.

7. find $SS_{xy}$.

8. find $r$.

9. interpret $r$.

10. find SSE.

■ Cumulative Review:

Consider the function $f(x) = x^4 + 2x^3 - 35x^2 - 36x + 180$.

31. Find the zeros of the function.

32. Is the function even? odd? Identify any symmetry.

33. Graph the function.

34. Solve the equation $x^3 + 125 = 0$.

35. Solve the system using Cramer’s rule.

$4x - 5y = 8$
$3x + 2y = 4$
