Uploaded by anna-osborne
CHAPTER 5
Summarizing Bivariate Data
What conclusions can be made when considering the effect of one treatment on another?
5.1 SCATTERPLOTS
What is a scatterplot, and what can be determined from one?
TYPES OF DATA
Univariate—one list
Bivariate—two lists
Multivariate—multiple lists
SCATTERPLOT
The most important graphical representation of bivariate data
Plotted on a Cartesian coordinate system
[Figure: example scatterplot graphs]
5.1 HOMEWORK
Pages 150-151: 2, 4, 6, 8
5.2 CORRELATION
What is meant by correlation?
Strong Negative Correlation
As x increases, y decreases
Strong Positive Correlation
As x increases, y increases
No Correlation
x and y do not appear to be related
Correlation coefficient: indicates the strength of the linear relationship in bivariate data.
Pearson’s correlation coefficient is the most commonly used and is often called simply THE correlation coefficient.
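To calculate Pearson’s Correlation Coefficient by hand
Find $\bar{x}$, $s_x$ (the average and standard deviation of x)
Find $\bar{y}$, $s_y$ (the average and standard deviation of y)
Find $z_x$ (calculate the z-score for each $x_i$)
Find $z_y$ (calculate the z-score for each $y_i$)
Multiply $z_x z_y$ (multiply each $z_x$ by its $z_y$)
Calculate $r = \dfrac{\sum z_x z_y}{n - 1}$
Remember $-1 \le r \le 1$
Use the chart to help, with columns: X, Y, $z_x$, $z_y$, $z_x z_y$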
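The by-hand steps can be sketched in Python as a rough check on the arithmetic. The function name `pearson_r` and the five data points are illustrative assumptions, not part of the slides; the standard deviations divide by n - 1, matching the n - 1 in the formula for r.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r as the sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # z-score for each x_i and each y_i.
    z_x = [(x - x_bar) / s_x for x in xs]
    z_y = [(y - y_bar) / s_y for y in ys]
    # Multiply the paired z-scores, add them up, divide by n - 1.
    return sum(zx * zy for zx, zy in zip(z_x, z_y)) / (n - 1)

# Illustrative data only (not taken from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))
```

Each intermediate list (`z_x`, `z_y`, and their products) corresponds to one column of the chart.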
To calculate Pearson’s Correlation Coefficient by calculator
Enter the data in L1, L2
Turn on the diagnostics
Find the linear regression for the data
Strong Negative Correlation
As x increases, y decreases
Strong Positive Correlation
As x increases, y increases
No Correlation
x and y do not appear to be related
Correlation values (same slide as before, with an addition):
-1 to -.8 and .8 to 1: strong
-.8 to -.5 and .5 to .8: moderate
-.5 to .5: weak
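As a small illustration, the cutoffs above can be written as a lookup. The name `describe_strength` is hypothetical, and assigning the boundary values .5 and .8 to the stronger category is an assumption, since the slide does not say which side they belong to.

```python
def describe_strength(r):
    """Label a correlation coefficient using the slide's cutoffs."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    size = abs(r)  # strength ignores the sign of r
    if size >= 0.8:  # assumption: exactly .8 counts as strong
        return "strong"
    if size >= 0.5:  # assumption: exactly .5 counts as moderate
        return "moderate"
    return "weak"

labels = [describe_strength(r) for r in (-0.9, 0.6, 0.2)]
print(labels)
```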
EXAMPLE 1

| observation | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| crisis management score | 20 | 13 | 27 | 18 | 19 | 21 | 0 | 21 | 21 | 11 |
| family strength score | 50 | 60 | 67 | 57 | 49 | 72 | 50 | 68 | 60 | 58 |

Find the correlation coefficient for crisis management vs. family strength, using both the calculator and Excel.
Repeat, switching L1 and L2 on the calculator. What does this indicate?
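Alternate method:
$r = \dfrac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sqrt{\sum x^2 - \frac{(\sum x)^2}{n}}\,\sqrt{\sum y^2 - \frac{(\sum y)^2}{n}}}$
Listed on formula sheet
Will only be used if they give you summary statistics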
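A minimal sketch of the alternate (summary-statistics) formula, applied to the Example 1 scores. The function name `r_from_sums` is an assumption for illustration. Running it with the two lists swapped mirrors the "switch L1 and L2" step.

```python
from math import sqrt

def r_from_sums(xs, ys):
    """Pearson's r from the sums: n, sum x, sum y, sum xy, sum x^2, sum y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt(sum_x2 - sum_x ** 2 / n) * sqrt(sum_y2 - sum_y ** 2 / n)
    return numerator / denominator

# Example 1 data from the slide.
crisis = [20, 13, 27, 18, 19, 21, 0, 21, 21, 11]
family = [50, 60, 67, 57, 49, 72, 50, 68, 60, 58]

r_xy = r_from_sums(crisis, family)
r_yx = r_from_sums(family, crisis)  # lists switched
print(round(r_xy, 3), round(r_yx, 3))
```

The two printed values agree, which is the point of the switching exercise: r does not depend on which variable is labeled x.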
Properties of r
Does not depend on the unit of measurement
Does not depend on which variable is labeled x
Is always between -1 and 1
A value near 1 indicates a strong positive correlation
A value near 0 indicates no linear correlation
A value near -1 indicates a strong negative correlation
Measures the extent to which x and y have a linear relationship
r is the correlation coefficient for the sample
Correlation DOES NOT imply causation
Often two items have a high correlation not because they impact each other but because they are both strongly related to a third item.
EX. Among elementary students, there is a strong positive correlation between vocabulary size and the number of cavities. WHY?
They are both related to age.
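Spearman’s Rank Correlation Coefficient
Not as affected by “outliers”
Rank the x’s from low to high
Rank the y’s from low to high
Keep each original x and y together
Use the calculator as before, entering the ranks, OR use the formula
$r_s = \dfrac{\sum (\text{rank of } x)(\text{rank of } y) - \frac{n(n+1)^2}{4}}{\frac{n(n-1)(n+1)}{12}}$
EX.

| rank of x | 2 | 1 | 3 | 4 |
| --- | --- | --- | --- | --- |
| X | 3 | -2 | 5 | 7 |
| Y | 6 | 9 | 4 | 12 |
| rank of y | 2 | 3 | 1 | 4 |

$-1 \le r_s \le 1$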
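The rank formula can be checked on the four-point example with a short sketch. The helper names `ranks` and `spearman_rs` are hypothetical, and the ranking helper assumes there are no tied values (ties would need averaged ranks, which this sketch does not handle).

```python
def ranks(values):
    """Rank from 1 (lowest) to n (highest); assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rs(xs, ys):
    """Spearman's r_s from the rank-product formula on the slide."""
    n = len(xs)
    rank_x, rank_y = ranks(xs), ranks(ys)
    numerator = sum(a * b for a, b in zip(rank_x, rank_y)) - n * (n + 1) ** 2 / 4
    denominator = n * (n - 1) * (n + 1) / 12
    return numerator / denominator

# Example data from the slide.
xs = [3, -2, 5, 7]
ys = [6, 9, 4, 12]
rs = spearman_rs(xs, ys)
print(ranks(xs), ranks(ys), rs)
```

Here the rank products sum to 26, n(n+1)²/4 is 25, and the denominator is 5, so r_s works out to 0.2.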
5.2 HOMEWORK
Page 163: 5.9, 5.10, 5.12, 5.13, 5.14, 5.16, 5.18, 5.22
5.3 FITTING A LINE TO BIVARIATE DATA
How do you fit a line to linear data?
Activation:
Given the following points, find the equation of the line through them

| X | Y |
| --- | --- |
| -2 | 2 |
| 0 | -2 |
| 2 | -6 |
VARIABLES DEFINED
X = the independent or explanatory variable
Y = the dependent or response variable
Stat version of the linear regression (#8): y = a + bx
Algebra and calculus version (#4): y = ax + b
The slope and y-intercept are the same in both versions, but statistics prefers the y = a + bx setup
REGRESSION LINE FORMED BY THE PRINCIPLE OF LEAST SQUARES
Determine the vertical distance from each point to the line that is supposed to represent the overall pattern of the data
if y = a + bx then
the predicted points are $(x_1, \hat{y}_1), (x_2, \hat{y}_2), (x_3, \hat{y}_3)$, etc., where $\hat{y}_i = a + bx_i$
the vertical distance is
$y_i - (a + bx_i)$
if this is positive, $y_i$ is above the prediction line
if this is negative, $y_i$ is below the prediction line
LEAST SQUARES REGRESSION LINE
The least squares regression line is the one that minimizes $\sum \left(y_i - (a + bx_i)\right)^2$
The formula for the least squares line is $\hat{y} = a + bx$
a and b can be calculated by $b = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$ and $a = \bar{y} - b\bar{x}$
(on the AP STAT formula sheet)
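A minimal sketch of these two formulas in Python; the function name `least_squares_line` is an assumption. It is applied to the three points from the activation slide, which happen to lie exactly on a line, so the fitted line passes through all of them.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: sum of (x - x_bar)(y - y_bar) over sum of (x - x_bar)^2.
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    # Intercept: a = y_bar - b * x_bar.
    a = y_bar - b * x_bar
    return a, b

# Points from the activation slide.
xs = [-2, 0, 2]
ys = [2, -2, -6]
a, b = least_squares_line(xs, ys)
print(a, b)
```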
CALCULATING BY HAND
$b = \dfrac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$
$a = \bar{y} - b\bar{x}$
These values can be calculated straight from the data. This formula is not on the formula sheet and is only used when the summary values are given.
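When only summary values are given, the shortcut formula might be applied as in the sketch below. The function name `line_from_summaries` and the summary values are hypothetical, chosen only to show the arithmetic.

```python
def line_from_summaries(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (a, b) for y-hat = a + b*x using only summary values."""
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    a = sum_y / n - b * (sum_x / n)  # a = y_bar - b * x_bar
    return a, b

# Hypothetical summary values (not from the slides):
# n = 5, sum x = 15, sum y = 20, sum xy = 66, sum x^2 = 55.
a, b = line_from_summaries(n=5, sum_x=15, sum_y=20, sum_xy=66, sum_x2=55)
print(round(a, 4), round(b, 4))
```

With these inputs the slope is (66 - 60) / (55 - 45) = 0.6 and the intercept is 4 - 0.6(3) = 2.2.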
LEAST SQUARES REGRESSION LINE
USE for INTERPOLATION, not EXTRAPOLATION
Interpolation: data values between the given values
Extrapolation: data values beyond the given values
If you are asked to extrapolate, always state that the values may not be accurate due to extrapolation
EXAMPLE
Age in months, Height in inches
19 22
21 23
23
24 25
27 28
29 31
31 28
34 32
38 34
43 39
50 45
72 48
84 54
58
120 62
128
Find the linear regression line for the given data, then find the values for the missing data
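MINITAB INFO
$\hat{y} = 61.54 + 3.407x$
The regression equation is
Chollevl = 61.5 + 3.41 Perchgwt

| Predictor | Coef | Stdev | t-ratio | p |
| --- | --- | --- | --- | --- |
| Constant | 61.537 | 2.268 | 27.13 | 0.000 |
| Perchgwt | 3.407 | 1.028 | 3.31 | 0.007 |

The Constant coefficient is the value of a; the Perchgwt coefficient is the value of b (the slope).
[Figure: scatterplot with % weight change on the x-axis and cholesterol level on the y-axis]
Should only be used to predict cholesterol level from percent weight change, and only weight changes from -5 to 3 should be used with any certainty.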
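The caution about the usable range can be made concrete with a small sketch. The coefficients come from the Minitab output and the range -5 to 3 from the slide; the function name `predict_cholesterol` and the idea of returning an extrapolation flag are assumptions for illustration.

```python
A = 61.537   # Constant coefficient (value of a) from the Minitab output
B = 3.407    # Perchgwt coefficient (value of b, the slope)
X_MIN, X_MAX = -5.0, 3.0  # range of % weight change stated on the slide

def predict_cholesterol(pct_weight_change):
    """Return (predicted level, True if the prediction is an extrapolation)."""
    y_hat = A + B * pct_weight_change
    is_extrapolation = not (X_MIN <= pct_weight_change <= X_MAX)
    return y_hat, is_extrapolation

inside = predict_cholesterol(2.0)    # within the observed range
outside = predict_cholesterol(10.0)  # beyond it, so flag the result
print(inside, outside)
```

A flagged result should be reported with the statement that it may not be accurate due to extrapolation.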
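USING PEARSON’S CORRELATION COEFFICIENT AND ALGEBRAIC MANIPULATION:
Given $b = r\dfrac{s_y}{s_x}$ and $\hat{y} = \bar{y} + r\dfrac{s_y}{s_x}(x - \bar{x})$
1) If $x = \bar{x}$
Then $\hat{y} = \bar{y}$
2) If r = 1, then $\hat{y} = \bar{y} + \dfrac{s_y}{s_x}(x - \bar{x})$
if $x = \bar{x} + 1s_x$ then $\hat{y} = \bar{y} + s_y$
if $x = \bar{x} + 2s_x$ then $\hat{y} = \bar{y} + 2s_y$
3) If it is not a perfect correlation, let r = .5, so $\hat{y} = \bar{y} + .5\dfrac{s_y}{s_x}(x - \bar{x})$
Then substituting $x = \bar{x} + 1s_x$ gives $\hat{y} = \bar{y} + .5s_y$
This means that $\hat{y}$ will be r standard deviations from $\bar{y}$ for each standard deviation that x is from $\bar{x}$
Hence it pulls (regresses) y back into the line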
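The "r standard deviations" statement can be checked numerically. This is a sketch on illustrative data (not from the slides): it builds the line from $b = r\,s_y/s_x$ and measures how many $s_y$ the prediction moves when x moves k standard deviations from its mean.

```python
from math import sqrt

# Illustrative data only (not taken from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)

b = r * s_y / s_x       # slope written in terms of r
a = y_bar - b * x_bar   # intercept

def predict(x):
    return a + b * x

# For x that is k standard deviations above x_bar, express the
# prediction's distance from y_bar in units of s_y.
shifts = {k: (predict(x_bar + k * s_x) - y_bar) / s_y for k in (0, 1, 2)}
print(round(r, 4), {k: round(v, 4) for k, v in shifts.items()})
```

Each shift equals k times r, so with |r| < 1 the prediction sits closer to its mean (in standard deviations) than x sits to its own.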
5.3 HOMEWORK
Pages 174-176: 26, 27, 28, 31, 32, 34
5.4 ASSESSING THE FIT OF A LINE
How do you assess how well a line fits the data?
3 CHECKS FOR FIT
1) Is a line an appropriate way to summarize the data (does the shape appear to be linear)?
2) Are there any unusual aspects of the data that need to be considered before making predictions?
3) How accurate can we expect these predictions to be?
FINDING RESIDUALS
The distance from the actual or observed value to the predicted value (HINT: this is an AP class; a residual is Actual - Predicted)
residual = $y_i - \hat{y}_i$
Using the calculator to find residuals
L1 = x
L2 = y
L3 = predicted values: in L3, press VARS, 5 (Statistics), EQ, RegEQ, then replace the X in RegEQ with L1
L4 = residuals: in L4, type L2 - L3
PLOTTING RESIDUALS: $(x, y - \hat{y})$ OR $(\hat{y}, y - \hat{y})$
There are two types of residual plots. Each gives us a picture that can be examined.
Residuals for a good fit should have no particular pattern and should lie in a band not too far from zero.
WHAT TO LOOK FOR
Removing the data point that causes a single large residual has a minimal impact on the regression line.
Removing a single influential point has a large impact on the regression line.
An influential point is one whose x value is not in the same group as the rest of the x values.
THE COEFFICIENT OF DETERMINATION
Gives the proportion of variation in y that is attributed to the approximate linear relationship between x and y.
$r^2 = 1 - \dfrac{\text{SSResid}}{\text{SSTo}}$
$r^2$: amount actually attributed to the linear relationship
SSTo: possible amount explained by a linear relationship
SSResid: amount not attributed to a linear relationship
SSTo AND SSResid CALCULATIONS
SSTo: the total sum of squares, the squared variation from the mean of y
$\text{SSTo} = \sum (y_i - \bar{y})^2$
SSResid: the amount of variation not attributed to a linear relationship, referred to as the error sum of squares
$\text{SSResid} = \sum (y_i - \hat{y}_i)^2$
SSResid ≤ SSTo
Easy Computational Formulas
$\text{SSTo} = \sum y^2 - \dfrac{(\sum y)^2}{n}$
$\text{SSResid} = \sum y^2 - a\sum y - b\sum xy$
All items can be obtained from the regression line and the 2-variable stats function, including the coefficient of determination
$r^2 = 1 - \dfrac{\text{SSResid}}{\text{SSTo}}$
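The definitions and the computational formulas can be compared side by side in a short sketch. The data are illustrative (not from the slides), and the variable names are assumptions.

```python
# Illustrative data only (not taken from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

# Definitions.
ss_total = sum((y - y_bar) ** 2 for y in ys)
ss_resid = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
r_squared = 1 - ss_resid / ss_total

# Computational formulas.
sum_y = sum(ys)
sum_y2 = sum(y * y for y in ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
ss_total_short = sum_y2 - sum_y ** 2 / n
ss_resid_short = sum_y2 - a * sum_y - b * sum_xy

print(round(ss_total, 4), round(ss_resid, 4), round(r_squared, 4))
```

For this data set SSTo is 6 and SSResid is 2.4, so r² is 0.6; the shortcut versions give the same sums up to rounding.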
STANDARD DEVIATION ABOUT THE LEAST SQUARES LINE
Denoted $s_e$, meaning the standard deviation of the error
$s_e = \sqrt{\dfrac{\text{SSResid}}{n - 2}}$
n - 2 relates to degrees of freedom, to be discussed later
For a truly good fit, $r^2$ must be larger than .5 and $s_e$ should be low
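A minimal sketch of the $s_e$ calculation, again on illustrative data (not from the slides); the function name `std_dev_about_line` is hypothetical.

```python
from math import sqrt

def std_dev_about_line(xs, ys):
    """s_e = sqrt(SSResid / (n - 2)) for the least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    ss_resid = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(ss_resid / (n - 2))

# Illustrative data only (not taken from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
s_e = std_dev_about_line(xs, ys)
print(round(s_e, 4))
```

Here SSResid is 2.4 and n - 2 is 3, so $s_e$ is the square root of 0.8.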
MINITAB AND CORRELATION
Page 179
EXAMPLE
Use data from 5.44
1) Use the calculator to:
a)draw a scatterplot
b) find the regression line
c) find the correlation coefficient
d) calculate the predicted values
e) calculate the residuals
f) graph the residuals
X Y
92 1.7
92 2.3
96 1.9
100 2.0
102 1.5
102 1.7
106 1.6
106 1.8
121 1.0
143 0.3
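For comparison with the calculator, parts (b) through (e) can be sketched in Python using the X and Y values listed above. The variable names are assumptions, and the plots in parts (a) and (f) are left to the calculator.

```python
from math import sqrt

# Data from the example slide.
xs = [92, 92, 96, 100, 102, 102, 106, 106, 121, 143]
ys = [1.7, 2.3, 1.9, 2.0, 1.5, 1.7, 1.6, 1.8, 1.0, 0.3]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)

# (b) regression line y-hat = a + b*x
b = s_xy / s_xx
a = y_bar - b * x_bar
# (c) correlation coefficient
r = s_xy / sqrt(s_xx * s_yy)
# (d) predicted values and (e) residuals (actual - predicted)
predicted = [a + b * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

print(round(a, 4), round(b, 4), round(r, 4))
print([round(e, 3) for e in residuals])
```

The residuals from a least squares line always sum to zero (up to rounding), which is a quick sanity check on the arithmetic.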
5.4 HOMEWORK
Pages 188-191: 37, 38, 39, 41, 42, 43, 48, 51 c&d
5.5 NONLINEAR RELATIONSHIPS AND TRANSFORMATION
How are nonlinear relationships explained?
TRANSFORMATIONS
DO NOT mean moving the graph away from the parent function (an algebraic transformation)
DO mean adjusting the x and/or y values so that the new points appear linear
Common transformations are square roots, logs, and reciprocals
[Figure: an original graph and its algebraic transformation]
QUADRATIC AND CUBIC FUNCTIONS
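Use a graphing calculator or a STAT package such as Minitab or Fathom
Quadratic equations can be done by hand, although it is not recommended
Sum of squared residuals: $\sum (y - \hat{y})^2$
$R^2 = 1 - \dfrac{\text{SSResid}}{\text{SSTo}} = 1 - \dfrac{\sum (y - \hat{y})^2}{\text{SSTo}}$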
UNDOING A TRANSFORMATION
$y' = 1.14 - 1.92x$ where $y' = \log(y)$
$\log y = 1.14 - 1.92x$
$10^{\log y} = 10^{1.14 - 1.92x}$
$y = 10^{1.14 - 1.92x}$
$y = (10^{1.14})(10^{-1.92x})$
$y = 13.8038\,(10^{-1.92x})$
Undoing a transformation yields a curve that fits the data, but it is not a least squares line.
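The algebra can be checked numerically with a short sketch; the function names are hypothetical, and the x values used for the check are arbitrary.

```python
from math import log10

def y_from_log_model(x):
    """Undo y' = log(y): y = 10^(1.14 - 1.92x)."""
    return 10 ** (1.14 - 1.92 * x)

def y_from_expanded_form(x):
    """Same curve written as (10^1.14)(10^(-1.92x))."""
    return (10 ** 1.14) * (10 ** (-1.92 * x))

leading_constant = 10 ** 1.14  # the 13.8038 on the slide
checks = [(x, y_from_log_model(x), y_from_expanded_form(x)) for x in (0.0, 0.5, 1.0)]
print(round(leading_constant, 4))
for x, y1, y2 in checks:
    # Taking log10 of the result should give back 1.14 - 1.92x.
    print(x, round(y1, 6), round(y2, 6), round(log10(y1), 6))
```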
DETERMINING WHICH TRANSFORMATION TO USE
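[Figure: four numbered curve shapes (1, 2, 3, 4) arranged around axes labeled -x, +x, -y, +y]
If the curve resembles one of the numbered curves, then to achieve a linear transformation move up (+) or down (-) the power chart as indicated by the closest part of the x or y axis.

| Power | Function | Name |
| --- | --- | --- |
| 3 | $x^3$ | Cube |
| 2 | $x^2$ | Square |
| 1 | $x$ | No transformation |
| 1/2 | $\sqrt{x}$ | Square root |
| 1/3 | $\sqrt[3]{x}$ | Cube root |
| 0 | $\log x$ | Log |
| -1 | $1/x$ | Reciprocal |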
EXAMPLE
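| frying time (x) | moisture (y) |
| --- | --- |
| 5 | 16.3 |
| 10 | 9.7 |
| 15 | 8.1 |
| 20 | 4.2 |
| 25 | 3.4 |
| 30 | 2.9 |
| 45 | 1.9 |
| 60 | 1.3 |

#3 curve, therefore move x and/or y down

| frying time (x) | moisture (y) | transformation log(y) |
| --- | --- | --- |
| 5 | 16.3 | 1.212187604 |
| 10 | 9.7 | 0.986771734 |
| 15 | 8.1 | 0.908485019 |
| 20 | 4.2 | 0.62324929 |
| 25 | 3.4 | 0.531478917 |
| 30 | 2.9 | 0.462397998 |
| 45 | 1.9 | 0.278753601 |
| 60 | 1.3 | 0.113943352 |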
Is the transformed data linear?
Find the linear regression on the transformation
Check the residual pattern. Try a different transformation. Plot this residual pattern. Which one looks better? Which has a better r value?
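One way to compare the raw and log-transformed fits is sketched below with the frying-time data from the table. The helper names `fit_line` and `correlation` are assumptions, and only the log(y) transformation is tried here; other choices from the power chart could be swapped in the same way.

```python
from math import log10, sqrt

def fit_line(xs, ys):
    """Least squares (a, b) for y-hat = a + b*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

def correlation(xs, ys):
    """Pearson's r."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Data from the example slide.
frying_time = [5, 10, 15, 20, 25, 30, 45, 60]
moisture = [16.3, 9.7, 8.1, 4.2, 3.4, 2.9, 1.9, 1.3]
log_moisture = [log10(y) for y in moisture]

r_raw = correlation(frying_time, moisture)
r_log = correlation(frying_time, log_moisture)
a_log, b_log = fit_line(frying_time, log_moisture)
# Residuals on the transformed scale, for the residual plot.
residuals_log = [ly - (a_log + b_log * x) for x, ly in zip(frying_time, log_moisture)]

print(round(r_raw, 3), round(r_log, 3))
print([round(e, 3) for e in residuals_log])
```

The log transformation gives an r value closer to -1 than the raw data does, but the residual pattern should still be inspected before settling on it.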
5.5 HOMEWORK
Pages 206-207: 52, 53, 59
5.6 INTERPRETING THE RESULTS OF STATISTICAL ANALYSIS
Read pages 208-209
REVIEW
Pages 210-213: 61, 63, 64, 66, 68, 69