Stats Review Chapters 3-4
Created by Teri Johnson Math Coordinator, Mary Stangler Center for Academic Success
Examples are taken from Statistics 4 E by Michael Sullivan, III
And the corresponding Test Generator from Pearson
Revised 12/13
Note:
This review is composed of questions from the textbook and the test generator. It is meant to highlight basic concepts from the course and does not cover all concepts presented by your instructor. Refer back to your notes, unit objectives, handouts, etc. to further prepare for your exam. A copy of this review can be found at www.sctcc.edu/cas.
The final answers are displayed in red and the chapter/section number is in the corner.
Find the Mean, Median, Mode
Data Set: 71, 74, 67, 64, 72, 71, 65, 66, 69, 70
Mean: Add up all the numbers and divide by how many numbers there are.
(71 + 74 + 67 + 64 + 72 + 71 + 65 + 66 + 69 + 70) / 10 = 689 / 10 = 68.9
Median: Arrange the numbers from smallest to largest, then find the middle number.
64, 65, 66, 67, 69, 70, 71, 71, 72, 74
With 10 values, the middle falls between 69 and 70. The mean of these two numbers is the median, which is 69.5.
Mode: The most repeated number (a data set can have none, one, or more than one).
The mode is 71.
3.1
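As a quick check of the three answers, here is a minimal Python sketch using the standard `statistics` module. The variable names are illustrative and not part of the original review.

```python
import statistics

data = [71, 74, 67, 64, 72, 71, 65, 66, 69, 70]

mean = statistics.mean(data)        # sum of the values divided by the count
median = statistics.median(data)    # averages the two middle values when the count is even
modes = statistics.multimode(data)  # list of every most-repeated value

print("mean:", mean)
print("median:", median)
print("mode(s):", modes)
```

`multimode` returns a list, which matches the note that a data set can have more than one mode.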
Find the Range and Sample Standard Deviation
Data Set: 71, 74, 67, 64, 72, 71, 65, 66, 69,70
Range: The highest number – the smallest number
74-64=10
Sample Standard Deviation: Use Excel or a calculator.
The result is s = 3.28 (rounded to two decimal places).
3.2
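If Excel or a calculator is not at hand, the same two values can be checked with a short Python sketch. It assumes the standard `statistics` module, whose `stdev` function computes the sample standard deviation (dividing by n - 1).

```python
import statistics

data = [71, 74, 67, 64, 72, 71, 65, 66, 69, 70]

data_range = max(data) - min(data)   # largest value minus smallest value
sample_sd = statistics.stdev(data)   # sample standard deviation (n - 1 in the denominator)

print("range:", data_range)
print("sample standard deviation:", round(sample_sd, 2))
```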
Find the Five-Number Summary: Part 1 of 2
Data Set: 64, 65, 66, 67, 69, 70, 71, 71, 72, 74
Min: the smallest number 64
Max: the largest number 74
Median: the middle number 69.5
Q1: Find the median of the numbers from the minimum up to the median (do not include the median).
These numbers are 64, 65, 66, 67, 69. The median of these numbers is 66, so Q1 is 66.
3.4
Find the Five-Number Summary: Part 2 of 2
Q3: Find the median of the numbers from the median (not included) to the maximum.
These numbers are 70, 71, 71, 72, 74. The median of these numbers is 71, so Q3 is 71.
The five-number summary is
64 66 69.5 71 74
3.4
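The quartile method used on these slides (median of the lower half and median of the upper half, excluding the overall median) can be sketched in Python as follows. The function name `five_number_summary` is hypothetical, and other software may use a different quartile convention and give slightly different values.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]  # values below the median position
    # For an odd count, skip the middle value so the median is not included.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (
        ordered[0],
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
        ordered[-1],
    )

data = [71, 74, 67, 64, 72, 71, 65, 66, 69, 70]
summary = five_number_summary(data)
print(summary)
```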
Find the Upper and Lower Fences
Data Set: 71, 74, 67, 64, 72, 71, 65, 66, 69, 70
1) Find the IQR. IQR=Q3-Q1=71-66=5
2) Multiply the IQR by 1.5: 5*1.5=7.5
3) Upper Fence: Add 7.5 to Q3; 7.5+71=78.5
4) Lower Fence: Subtract 7.5 from Q1; 66-7.5=58.5
Any numbers outside this range would be considered outliers.
3.4
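The four fence steps can be written out as a short Python sketch. It takes Q1 = 66 and Q3 = 71 from the five-number summary found earlier, and the variable names are illustrative.

```python
data = [71, 74, 67, 64, 72, 71, 65, 66, 69, 70]
q1, q3 = 66, 71                 # quartiles from the five-number summary

iqr = q3 - q1                   # step 1: interquartile range
step = 1.5 * iqr                # step 2: multiply the IQR by 1.5
upper_fence = q3 + step         # step 3
lower_fence = q1 - step         # step 4

# Any value outside the fences is flagged as an outlier.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

For this data set the list of outliers comes back empty, since every value lies between the fences.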
Construct a Boxplot
Construct a box plot given the five-number summary:
64 66 69.5 71 74
[Figure: box plot drawn from the five-number summary, with 64, 66, 69.5, 71, and 74 marked]
3.5
Relationship between Mean, Median and Shape
Match the shape to the relationship between mean and median.
Choices: Mean > Median, Mean = Median, Mean < Median
Skewed left: Mean < Median
Symmetric: Mean = Median
Skewed right: Mean > Median
If data is skewed, the median is more representative of the typical observation.
3.1
Empirical Rule to Find the Probability: Part 1 of 2
At a tennis tournament, a statistician keeps track of every serve. She reported that the mean serve speed of a particular player was 104 mph and the standard deviation of the serve speeds was 8 mph. Assume that the statistician also gave us the information that the distribution of the serve speeds was bell shaped. Using the Empirical Rule, what proportion of the player's serves are expected to be between 112 mph and 120 mph?
3.2
Empirical Rule to Find the Probability: Part 2 of 2
Start by drawing the bell curve.
Compare this to the empirical rule curve on page 149, or find the percents as described on page 149. Since 112 mph is one standard deviation above the mean (104 + 8) and 120 mph is two standard deviations above the mean (104 + 2(8)), we can see that the probability is 13.5% or .135.
*You could also find the z-scores, then find the probability.
3.2
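The arithmetic behind the 13.5% can be sketched in Python. This assumes the usual Empirical Rule percentages (about 68% of values within one standard deviation and about 95% within two), and it uses `statistics.NormalDist` only as a cross-check against an exact normal curve; neither the variable names nor the cross-check come from the original slides.

```python
from statistics import NormalDist

mean, sd = 104, 8

z_low = (112 - mean) / sd    # 1.0: one standard deviation above the mean
z_high = (120 - mean) / sd   # 2.0: two standard deviations above the mean

# Empirical Rule: the band between 1 and 2 standard deviations on one side
# holds half of the difference between the 95% and 68% regions.
empirical = (0.95 - 0.68) / 2

# Cross-check with an exact normal distribution (slightly different from the rule).
serve_speeds = NormalDist(mean, sd)
exact = serve_speeds.cdf(120) - serve_speeds.cdf(112)

print("Empirical Rule:", round(empirical, 3))
print("Normal curve:", round(exact, 4))
```

The two values differ a little because the Empirical Rule percentages are rounded.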
Find the Mean: Part 1 of 2
Days    Frequency
1-2     2
3-4     21
5-6     20
7-8     10
9-10    30
Step 1: Find the midpoint of each class (days)
Days    Class Midpoint, xi     Frequency, fi
1-2     (1 + 2)/2 = 1.5        2
3-4     3.5                    21
5-6     5.5                    20
7-8     7.5                    10
9-10    9.5                    30
3.3
Find the Mean: Part 2 of 2
Step 2: Multiply the class midpoint by the frequency
Days    Class Midpoint, xi     Frequency, fi    xifi
1-2     (1 + 2)/2 = 1.5        2                1.5*2 = 3
3-4     3.5                    21               73.5
5-6     5.5                    20               110
7-8     7.5                    10               75
9-10    9.5                    30               285
Step 3: Find the sum of the xifi column: 3 + 73.5 + 110 + 75 + 285 = 546.5
Step 4: Find the sum of the frequency column: 83
Step 5: Divide the number from Step 3 by the number from Step 4: 546.5/83 = 6.6
The mean is 6.6.
3.3
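Steps 1 through 5 can be sketched in Python as below. The class limits and frequencies are taken from the table; the variable names are illustrative.

```python
# Each entry is ((lower limit, upper limit), frequency) for one class of days.
classes = [((1, 2), 2), ((3, 4), 21), ((5, 6), 20), ((7, 8), 10), ((9, 10), 30)]

midpoints = [(low + high) / 2 for (low, high), _ in classes]    # step 1
freqs = [f for _, f in classes]

products = [x * f for x, f in zip(midpoints, freqs)]            # step 2
total_xf = sum(products)                                        # step 3
n = sum(freqs)                                                  # step 4
grouped_mean = total_xf / n                                     # step 5

print("sum of xi*fi:", total_xf)
print("sum of fi:", n)
print("mean:", round(grouped_mean, 1))
```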
Find the Sample Standard Deviation: Part 1 of 2 (from previous problem)
Step 1: Find the mean (see previous 2 slides). Mean = 6.6
Step 2: Complete the following table (the first three columns were done in the previous 2 slides and column 4 is the mean)
Days    Class Midpoint, xi    Frequency, fi    x̄      xi - x̄              (xi - x̄)^2 * fi
1-2     (1 + 2)/2 = 1.5       2                6.6    1.5 - 6.6 = -5.1    (-5.1)^2 * 2 = 52.02
3-4     3.5                   21               6.6    -3.1                201.81
5-6     5.5                   20               6.6    -1.1                24.2
7-8     7.5                   10               6.6    .9                  8.1
9-10    9.5                   30               6.6    2.9                 252.3
3.3
Find the Sample Standard Deviation: Part 2 of 2 (from previous problem)
Days    Class Midpoint, xi    Frequency, fi    x̄      xi - x̄              (xi - x̄)^2 * fi
1-2     (1 + 2)/2 = 1.5       2                6.6    1.5 - 6.6 = -5.1    (-5.1)^2 * 2 = 52.02
3-4     3.5                   21               6.6    -3.1                201.81
5-6     5.5                   20               6.6    -1.1                24.2
7-8     7.5                   10               6.6    .9                  8.1
9-10    9.5                   30               6.6    2.9                 252.3
Step 3: Find the total of the (xi - x̄)^2 * fi column. Total = 538.43
Step 4: Find the total of the frequency column. Total = 83. Take this number and subtract 1: 83 - 1 = 82
Step 5: Take the number from Step 3 and divide by the number from Step 4: 538.43/82 = 6.56622
Step 6: Take the square root of the number from Step 5: √6.56622 = 2.6
3.3
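The six steps can be sketched in Python as follows. To match the table, the sketch uses the rounded mean 6.6; using the unrounded mean would change the intermediate total slightly but gives the same answer to one decimal place. The variable names are illustrative.

```python
import math

midpoints = [1.5, 3.5, 5.5, 7.5, 9.5]   # class midpoints xi
freqs = [2, 21, 20, 10, 30]             # frequencies fi
mean = 6.6                              # rounded mean from the previous problem (step 1)

# Step 2: (xi - mean)^2 * fi for each class
weighted_sq = [(x - mean) ** 2 * f for x, f in zip(midpoints, freqs)]

total_sq = sum(weighted_sq)             # step 3
denominator = sum(freqs) - 1            # step 4: n - 1
variance = total_sq / denominator       # step 5
sample_sd = math.sqrt(variance)         # step 6

print("total:", round(total_sq, 2))
print("variance:", round(variance, 5))
print("sample standard deviation:", round(sample_sd, 1))
```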
Correlation Coefficient
Match the Correlation with the Graph
Choices: r=-.6, r=0, r=.99, should not use correlation
[Figure: four scatter plots; the answers shown on the plots are r=.999, r=0, r=-.6, and "Should not use correlation"]
4.1
Least-Squares Regression Line: Part 1 of 3
The data are the average one-way commute times (in minutes) for selected students and the number of absences for those students during the term.
a) Find the equation of the regression line for the given data. Round the regression line values to the nearest hundredth.
b) What would be the predicted number of absences if the commute time was 40 minutes? Is this a reasonable question?
c) Interpret the slope
Commute time (x)    Number of absences (y)
72                  3
85                  7
91                  10
90                  10
88                  8
98                  15
75                  4
100                 15
80                  5
4.2
Least-Squares Regression Line: Part 2 of 3
a) Find the regression line
With Calculator: ŷ = .45x - 30.3
By Hand: First find the mean of x and y (labeled x̄ and ȳ), the sample standard deviation of x and y (labeled sx and sy), and the correlation (r).
x̄ = 86.556, ȳ = 8.556, sx = 9.593, sy = 4.39, r = .98
Slope (b1) = r(sy/sx) = .98(4.39/9.593) = .4485, which rounds to .45
Y-Intercept (b0) = ȳ - b1x̄ = 8.556 - .4485(86.556) = -30.3 (use the unrounded slope here; the rounded slope .45 would give -30.39)
The regression line is ŷ = b1x + b0, so ours is ŷ = .45x - 30.3
4.2
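The by-hand route (means, sample standard deviations, correlation, then slope and intercept) can be sketched in Python with the standard `statistics` module. The variable names are illustrative; the correlation is computed from its definition rather than from a library call.

```python
import statistics

x = [72, 85, 91, 90, 88, 98, 75, 100, 80]   # commute time (minutes)
y = [3, 7, 10, 10, 8, 15, 4, 15, 5]         # number of absences

n = len(x)
x_bar, y_bar = statistics.mean(x), statistics.mean(y)
s_x, s_y = statistics.stdev(x), statistics.stdev(y)   # sample standard deviations

# Linear correlation coefficient r from its definition
r = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)

b1 = r * s_y / s_x        # slope
b0 = y_bar - b1 * x_bar   # y-intercept, using the unrounded slope

print("r:", round(r, 2))
print("slope b1:", round(b1, 2))
print("intercept b0:", round(b0, 2))
```

Carrying the unrounded slope into the intercept gives about -30.27, which is the -30.3 used on the slides.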
Least-Squares Regression Line: Part 3 of 3
b) What would be the predicted number of absences if the commute time was 40 minutes? Is this a reasonable question?
Time is 40 minutes, or x = 40. Put this value into our least-squares regression line: ŷ = .45(40) - 30.3 = -12.3. This means that when the commute time is 40 minutes, the predicted number of absences is -12.3. This is not a reasonable question, since 40 is outside the scope of the model (i.e. 40 is not within the given range of x values, 72 to 100), and a negative number of absences is impossible.
c) Interpret the slope
The slope is .45. This means that for every additional minute of commute time, the predicted number of absences increases by .45.
4.2
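A small Python sketch of part b) is shown here: it plugs x = 40 into the rounded regression line from the slides and checks whether 40 lies within the observed commute times. The function name `predict_absences` is hypothetical.

```python
x_observed = [72, 85, 91, 90, 88, 98, 75, 100, 80]   # commute times in the data

def predict_absences(commute_minutes):
    """Predicted absences from the rounded regression line y-hat = .45x - 30.3."""
    return 0.45 * commute_minutes - 30.3

x_new = 40
y_hat = predict_absences(x_new)

# A prediction is only reasonable when x lies within the range of the observed x values.
in_scope = min(x_observed) <= x_new <= max(x_observed)

print("predicted absences:", round(y_hat, 1))
print("within scope of the model:", in_scope)
```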
Sum of Squared Residuals: Part 1 of 2
Find the sum of squared residuals
Step 1: Find the least-squares regression line (see previous slides)
Step 2: Find the predicted y values (ŷ) for each x
Commute time (x)    Number of absences (y)    Predicted ŷ
72                  3                         .45(72) - 30.3 = 2.1
85                  7                         7.95
91                  10                        10.65
90                  10                        10.2
88                  8                         9.3
98                  15                        13.8
75                  4                         3.45
100                 15                        14.7
80                  5                         5.7
4.3
Sum of Squared Residuals: Part 2 of 2
Step 3: Calculate the residuals: observed - predicted, or y - ŷ
Step 4: Calculate the residuals squared: (observed - predicted)^2, or (y - ŷ)^2
Step 5: Find the sum of the numbers in the column from Step 4. This is the sum of squared residuals, which equals 6.1875.
Commute time (x)    Number of absences (y)    Predicted ŷ    Step 3: y - ŷ    Step 4: (y - ŷ)^2
72                  3                         2.1            .9               .81
85                  7                         7.95           -.95             .9025
91                  10                        10.65          -.65             .4225
90                  10                        10.2           -.2              .04
88                  8                         9.3            -1.3             1.69
98                  15                        13.8           1.2              1.44
75                  4                         3.45           .55              .3025
100                 15                        14.7           .3               .09
80                  5                         5.7            -.7              .49
4.3
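The residual table can be reproduced with a few lines of Python. The sketch uses the rounded regression line ŷ = .45x - 30.3 from the slides, so the total matches 6.1875; the variable names are illustrative.

```python
x = [72, 85, 91, 90, 88, 98, 75, 100, 80]   # commute time (minutes)
y = [3, 7, 10, 10, 8, 15, 4, 15, 5]         # number of absences

predicted = [0.45 * xi - 30.3 for xi in x]            # step 2: predicted y values
residuals = [yi - pi for yi, pi in zip(y, predicted)] # step 3: observed - predicted
squared = [e ** 2 for e in residuals]                 # step 4: residuals squared
sum_sq_residuals = sum(squared)                       # step 5

for xi, yi, pi, e, e2 in zip(x, y, predicted, residuals, squared):
    print(xi, yi, round(pi, 2), round(e, 2), round(e2, 4))
print("sum of squared residuals:", round(sum_sq_residuals, 4))
```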
Coefficient of Determination (R^2)
If the coefficient of determination (R^2) is 86.44% and the data shows a negative association, what is the linear correlation coefficient (r)?
The size of r is the square root of R^2: √.8644 = .9297
Since the data has a negative association, r = -.9297.
Interpret R^2 = 86.44%: 86.44% of the variability in y (the response variable) is explained by the least-squares regression line.
4.3
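A minimal Python sketch of this step follows: take the square root of R^2 and attach the sign that matches the direction of the association. The `association` variable is an illustrative stand-in for what the problem statement tells us.

```python
import math

r_squared = 0.8644          # coefficient of determination as a decimal
association = "negative"    # direction given in the problem statement

r = math.sqrt(r_squared)    # size of the correlation coefficient
if association == "negative":
    r = -r                  # r takes the sign of the association (and of the slope)

print("r:", round(r, 4))
```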
Residual Plots
• What does the residual plot to the right suggest?
– There is an outlier
• Removing the outlier, what does the residual plot suggest?
– No pattern, linear model is appropriate
[Figure: residual plot; horizontal axis from 0 to 10, vertical axis (residuals) from -14 to 6]
4.3
Contingency Tables: Part 1 of 3
Is there an association between party affiliation and gender? The following represents the gender and party affiliation of registered voters based on a random sample of 802 adults.
               Female    Male
Republican     105       115
Democratic     150       103
Independent    150       179
a) Construct a frequency marginal distribution
b) Construct a relative frequency marginal distribution
c) Construct a conditional distribution of party affiliation by gender
d) Is gender associated with party affiliation? If so, how?
4.4
Contingency Tables: Part 2 of 3
a) Construct a frequency marginal distribution
Party                                     Female                     Male    Frequency marginal distribution
Republican                                105                        115     105 + 115 = 220
Democratic                                150                        103     253
Independent                               150                        179     329
Frequency marginal distribution           105 + 150 + 150 = 405      397     802
To do: Find the total for each row and column
b) Construct a relative frequency marginal distribution
Party                                     Female                     Male    Relative frequency marginal distribution
Republican                                105                        115     .274
Democratic                                150                        103     .315
Independent                               150                        179     .41
Relative frequency marginal distribution  405/802 = .505             .495    1
To do: Divide the row/column total by the sample size
4.4
Contingency Tables: Part 3 of 3
c) Construct a conditional distribution of party affiliation by gender
Party          Female              Male
Republican     105/405 = .259      115/397 = .290
Democratic     .370                .259
Independent    .370                .451
Total          1                   1
To do: Divide each cell by its column total
d) Is gender associated with party affiliation? If so, how?
Yes; compared with females, males are more likely to be Independents and less likely to be Democrats.
4.4
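Parts a) through c) can be sketched in Python with plain dictionaries. The counts come from the table in Part 1; the variable names are illustrative.

```python
counts = {
    "Republican":  {"Female": 105, "Male": 115},
    "Democratic":  {"Female": 150, "Male": 103},
    "Independent": {"Female": 150, "Male": 179},
}
genders = ["Female", "Male"]

# a) Frequency marginal distributions: row totals and column totals
row_totals = {party: sum(row.values()) for party, row in counts.items()}
col_totals = {g: sum(row[g] for row in counts.values()) for g in genders}
n = sum(row_totals.values())

# b) Relative frequency marginal distributions: each total divided by the sample size
row_relative = {party: total / n for party, total in row_totals.items()}
col_relative = {g: total / n for g, total in col_totals.items()}

# c) Conditional distribution of party by gender: each cell divided by its column total
conditional = {
    party: {g: row[g] / col_totals[g] for g in genders}
    for party, row in counts.items()
}

print(row_totals, col_totals, n)
for party in counts:
    print(party, [round(conditional[party][g], 3) for g in genders])
```

Comparing the Female and Male columns of the conditional distribution is what supports the answer to part d).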