1.2 Describing Distributions with Numbers
-
Upload
mantreh-laleh -
Category
Documents
-
view
39 -
download
3
description
Transcript of 1.2 Describing Distributions with Numbers
1.2 Describing Distributions with NumbersAP Statistics
Fuel Economy for 2004 car models
2-seater Cars Compact cars
Model City HWY Model City Hwy
Acura NSX 17 24 Ashton Mar 12 19
Audi 20 28 Audi TT 21 29
BMW z4 20 28 BMW 325 19 27
Cadillac XLR 17 25 BMW 330 19 28
Corvette 18 25 BMWM3 16 23
Miata 22 28 Jaguar XK8 18 26
Viper 12 20 Jaguar XKR 16 23
Ferrari 360 11 16 Lexus SC 18 23
Ferrari M 10 16 Mini Cooper 25 32
Honda I 60 66 Mits Eclipse 23 31
Thunderbird 17 23 Porsche 911 14 22
Lotus 15 22 Mits Spyder 20 29
Mean and Median
Mean: The Average
Median: The Middle
Mean Highway mileage for two seaters
1 2 ... 51824.7
21 21nx x x
x
Caution!!!
The mean is sensitive to the influence of a few extreme observations. It is not a resistant measure of center.
Measuring Spread
The simplest useful numerical description of a distribution includes a description of both the center and the spread.
Range: simplest measure of spread. The highest value minus the lowest value.
Pth percentile of a distribution is the percent of observations that fall below it.
The Quartiles
Calculating Quartiles
The highway mileage of 20 2-seaters arranged in numerical order are:
13 15 16 16 17 19 20 22 23 23 |23 24 25 25 26 28 28 28 29 32
The median is marked by |.
The 1st Quartile (Q1) is the median of the 10 numbers to the left of the median.
The 3rd Quartile (Q1) is the median of the 10 numbers to the right of the median.
If there are an odd number of observations, the median is the middle number and it is excluded from calculations of Q1 and Q3.
The Five–Number Summary and Boxplots
The Five–Number Summary and Boxplots
Boxplots of the highway and city gas mileages for cars classified as two-seaters and minicompacts.
Outliers
Lower Bound: Q1 – 1.5(IQR)•If an observation falls below the lower bound it is considered an outlier.
Upper Bound: Q3 + 1.5(IQR)•If an observation falls above the upper bound it is considered an outlier.
Measuring Spread: Variance and Standard Deviation
Some FAQs Why do we square the deviations?
If not, they would all add to zero. Why do we emphasize the standard deviation rather
than the variance?The standard deviation measures spread about the mean
in the original scale. Why do we average by dividing by n-1 rather than n
in calculating the variance? Because the sum of the deviations is always zero, the last
deviation can be found once we know the other n-1. The number n-1 is called the degrees of freedom.
Choosing Measures of Center and Spread