Post on 24-Dec-2015
Exploring relationships between variables
Ch. 10 Scatterplots, Associations, and Correlations
Scatterplots
• Show change over time
• Show patterns
• Show trends
• Show relationships
• Show outlier values
Scatterplots
• The association can be positive or negative
• Show the relationship between 2 variables
• Can be examined in more depth through the z-scores of both variables (ZX, ZY)
Z-scores
• ZX = (X - MeanX) / Standard Deviation (SX)
• ZY = (Y - MeanY) / Standard Deviation (SY)
• Calculate the standard deviation in the same way as before.
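As a minimal sketch of the z-score formula, the snippet below standardizes a small list of values. The data and the function name `z_scores` are made up for illustration, and the sample standard deviation (dividing by n - 1) is assumed.

```python
from statistics import mean, stdev

def z_scores(values):
    """Return (value - mean) / sample standard deviation for each value."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation, divides by n - 1
    return [(v - m) / s for v in values]

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
zx = z_scores(x)
print([round(z, 3) for z in zx])
```

Z-scores always sum to zero, which is a quick way to check the calculation.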
Ratio
• Correlation coefficient (r)
• r = Sum of (ZX * ZY) / (n - 1)
• Correlation measures the strength of the linear association between 2 variables
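The correlation coefficient can be sketched as the sum of the products of paired z-scores divided by n - 1. The data and the function name `correlation` are hypothetical, chosen only to illustrate the arithmetic.

```python
from statistics import mean, stdev

def correlation(x, y):
    """Correlation coefficient r = sum(zx * zy) / (n - 1)."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sx, sy = stdev(x), stdev(y)  # sample standard deviations
    zx = [(v - mx) / sx for v in x]
    zy = [(v - my) / sy for v in y]
    return sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Hypothetical paired data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation(x, y)
print(round(r, 4))
```

A value of r near +1 or -1 indicates a strong linear association, and a value near 0 indicates a weak one.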
Variables
• Explanatory variable: X
• Response variable: Y
Least-Squares Line
• y = a + bx
• a = y-intercept
• b = slope
• a = ybar - b * xbar
• b = SSxy / SSx
• SSx = sum of squares of x
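A minimal sketch of fitting the least-squares line, assuming the slope is SSxy / SSx and the intercept is ybar - b * xbar. The data and the function name `least_squares_line` are hypothetical.

```python
from statistics import mean

def least_squares_line(x, y):
    """Return (a, b) for the line y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b = ss_xy / ss_x         # slope
    a = y_bar - b * x_bar    # y-intercept
    return a, b

# Hypothetical paired data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = least_squares_line(x, y)
print(round(a, 3), round(b, 3))
```

Because a is computed from the means, the fitted line always passes through the point (xbar, ybar).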
SSx
• First obtain the sum of each squared x value (Sum of x^2)
• Then subtract the square of the sum of x, divided by n: SSx = Sum of x^2 - (Sum of x)^2 / n
• You can also get SSx on the calculator by squaring the standard deviation, then multiplying it by (n - 1)
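Both routes to SSx can be checked against each other in a short sketch. The data is hypothetical, and the sample standard deviation is assumed, so its square is the sample variance.

```python
from statistics import variance

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
n = len(x)

# Computational formula: sum of squared x minus (sum of x) squared over n.
ss_x_shortcut = sum(v * v for v in x) - sum(x) ** 2 / n

# Calculator route: squared sample standard deviation times (n - 1).
ss_x_from_sd = variance(x) * (n - 1)

print(ss_x_shortcut, ss_x_from_sd)
```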
SSxy
• Sum of cross-products of x and y
• Take the sum of each x value times each paired y value (Sum of xy)
• Then subtract from that total (Sum of x) * (Sum of y) / n: SSxy = Sum of xy - (Sum of x) * (Sum of y) / n
SSxy
• The SSxy formula is a more efficient way of computing the sum of each (x - xbar) * (y - ybar)
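The sketch below computes SSxy both ways, with the computational formula and with the deviations from the means, to show that they agree. The data is hypothetical.

```python
from statistics import mean

# Hypothetical paired data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# Computational formula: sum of x*y minus (sum of x)(sum of y) over n.
ss_xy_shortcut = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

# Definition: sum of (x - xbar) * (y - ybar).
x_bar, y_bar = mean(x), mean(y)
ss_xy_definition = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

print(ss_xy_shortcut, ss_xy_definition)
```

The computational form needs only running totals of x, y, and xy, so the means do not have to be found first.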
Complete Guided Ex. #3 page 566
Standard Error of Estimate
• Se = square root of [ Sum of (y - yp)^2 / (n - 2) ]
• How to calculate: Se = square root of [ (SSy - b * SSxy) / (n - 2) ], where SSy is the sum of squares of y, found the same way as SSx
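As a sketch, the standard error of estimate can be computed from the residuals directly and from the shortcut (SSy - b * SSxy) / (n - 2). The data is hypothetical, and the shortcut form is an assumption about what the slide intends.

```python
import math
from statistics import mean

# Hypothetical paired data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = mean(x), mean(y)
ss_x = sum((xi - x_bar) ** 2 for xi in x)
ss_y = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b = ss_xy / ss_x
a = y_bar - b * x_bar

# Definition: based on the squared differences between y and predicted yp.
se_definition = math.sqrt(
    sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
)

# Shortcut: uses only the sums of squares and the slope.
se_shortcut = math.sqrt((ss_y - b * ss_xy) / (n - 2))

print(round(se_definition, 4), round(se_shortcut, 4))
```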
Residuals
• You can graph the residuals of the equation to see how well the regression line fits the data
• A residual is the difference between the observed value and the predicted value
• Residual = observed - predicted
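A short sketch of computing residuals for a fitted line. The data is hypothetical, and the line is fitted by least squares so that the residuals can be listed (or plotted against x) to look for patterns.

```python
from statistics import mean

# Hypothetical paired data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Fit the least-squares line y = a + b * x.
x_bar, y_bar = mean(x), mean(y)
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
a = y_bar - b * x_bar

# Residual = observed y minus predicted y.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
print([round(r, 2) for r in residuals])
```

For a least-squares line the residuals sum to zero, so a clear curve or funnel shape in their plot, rather than their total, is what signals a poor fit.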
Confidence Intervals
• yp - E < y < yp + E
• yp = predicted value of y
• E = margin of error
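The slides do not give a formula for E. The sketch below assumes the usual textbook form for predicting a single y at x0, E = t_c * Se * sqrt(1 + 1/n + (x0 - xbar)^2 / SSx), with the critical value t_c read from a t table for n - 2 degrees of freedom. The data, the function name `prediction_interval`, and the value of t_c are illustrative assumptions.

```python
import math
from statistics import mean

def prediction_interval(x, y, x0, t_c):
    """Return (yp - E, yp + E) for a new y at x0 (assumed textbook formula)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = ss_xy / ss_x
    a = y_bar - b * x_bar
    # Standard error of estimate from the residuals.
    se = math.sqrt(sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2))
    yp = a + b * x0
    # Margin of error; t_c must be supplied by the caller from a t table.
    e = t_c * se * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_x)
    return yp - e, yp + e

# Hypothetical data; t_c = 3.182 is the 95% value for 3 degrees of freedom.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
low, high = prediction_interval(x, y, x0=3, t_c=3.182)
print(round(low, 3), round(high, 3))
```

The interval is narrowest when x0 equals xbar and widens as x0 moves away from the mean of x.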
What does this mean (better understanding)
Types of data
• Outlier
• Leverage
• Influential Point
• Lurking Variable
Outlier
• Any data point that stands away from the others
Leverage
• Data points with X-values that are far from the mean of X
• Can alter the least-squares regression line
Influential Point
• Omitting this point can drastically alter the regression model
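One way to see whether a point is influential is to fit the line with and without it and compare the slopes. The sketch below does this for a hypothetical data set in which the last point has an x-value far from the mean (high leverage) and does not follow the trend of the others.

```python
from statistics import mean

def slope(x, y):
    """Least-squares slope b = SSxy / SSx."""
    x_bar, y_bar = mean(x), mean(y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    return ss_xy / ss_x

# Hypothetical data; the last point (20, 0) is the suspect point.
x = [1, 2, 3, 4, 5, 20]
y = [2, 4, 5, 4, 5, 0]

b_with = slope(x, y)
b_without = slope(x[:-1], y[:-1])  # refit after omitting the last point
print(round(b_with, 3), round(b_without, 3))
```

Here omitting the single point changes the sign of the slope, which is the kind of drastic change that marks a point as influential.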
Lurking Variable
• A variable that is hidden from the equation
• It is not explicitly part of the model but affects the way the variables in the model appear to be related