Means & Medians
Uploaded by clarke-mcbride
Chapter 4
Parameter -
• Fixed value about a population
• Typically unknown
Suppose we want to know the MEAN length of fish in
Lake Lewisville . . .
Statistic -
• Value calculated from a sample
Measures of Central Tendency
• Median - the middle of the data; 50th percentile
–Observations must be in numerical order
–Is the single middle value if n is odd
–The average of the middle two values if n is even
NOTE: n denotes the sample size
• Mean - the arithmetic average
–Use \( \mu \) to represent a population mean (parameter)
–Use \( \bar{x} \) to represent a sample mean (statistic)
Formula: \( \bar{x} = \frac{\sum x}{n} \)
\( \Sigma \) is the capital Greek letter sigma; it means to sum the values that follow
• Mode – the observation that occurs the most often
–Can be more than one mode
–If all values occur only once – there is no mode
–Not used as often as mean & median
Suppose we are interested in the number of lollipops that are bought at a certain store. A sample of 5 customers buys the following number of lollipops. Find the median.
2 3 4 8 12
The numbers are in order & n is odd – so
find the middle observation.
The median is 4 lollipops!
Suppose we have a sample of 6 customers that buy the following number of lollipops. The median is …
2 3 4 6 8 12
The numbers are in order & n is even – so find the middle two observations (4 and 6).
Now, average these two values.
\( \frac{4 + 6}{2} = 5 \)
The median is 5 lollipops!
Suppose we have a sample of 6 customers that buy the following number of lollipops. Find the mean.
2 3 4 6 8 12
To find the mean number of lollipops, add the observations and divide by n.
\( \bar{x} = \frac{2 + 3 + 4 + 6 + 8 + 12}{6} = 5.833 \)
Using the calculator . . .
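In place of a calculator, the same two results can be checked with a few lines of Python. This is only a sketch: it assumes Python's standard `statistics` module, and the variable names are illustrative rather than taken from the slides.

```python
import statistics

# Lollipop counts for the sample of 6 customers.
lollipops = [2, 3, 4, 6, 8, 12]

# n is even, so the median is the average of the middle two values (4 and 6).
sample_median = statistics.median(lollipops)

# Mean: add the observations and divide by n.
sample_mean = sum(lollipops) / len(lollipops)

print(sample_median)          # 5.0
print(round(sample_mean, 3))  # 5.833
```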
What would happen to the median & mean if the 12 lollipops were 20?
2 3 4 6 8 20
The median is . . .
5
The mean is . . .
\( \bar{x} = \frac{2 + 3 + 4 + 6 + 8 + 20}{6} = 7.17 \)
What happened?
What would happen to the median & mean if the 20 lollipops were 50?
2 3 4 6 8 50
The median is . . .
5
The mean is . . .
\( \bar{x} = \frac{2 + 3 + 4 + 6 + 8 + 50}{6} = 12.17 \)
What happened?
Resistant -
• Statistics that are not affected by outliers
• Is the median resistant? YES
• Is the mean resistant? NO
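The lollipop example can be replayed in code to see resistance directly. The sketch below assumes Python's standard `statistics` module and uses illustrative names; it changes only the largest observation and recomputes both measures.

```python
import statistics

# The five smallest lollipop counts stay fixed; only the largest value changes.
fixed_counts = [2, 3, 4, 6, 8]

results = {}
for largest in (12, 20, 50):
    sample = fixed_counts + [largest]
    results[largest] = (
        statistics.median(sample),
        round(statistics.mean(sample), 2),
    )

for largest, (median, mean) in results.items():
    print(f"largest = {largest}: median = {median}, mean = {mean}")
```

The median stays at 5 in every case, while the mean moves from 5.83 to 7.17 to 12.17.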
Look at the following data set. Find the mean.
22 23 24 25 25 26 29 30
\( \bar{x} = 25.5 \)
Now find how each observation deviates from the mean.
What is the sum of the deviations from the mean?
\( \sum (x - \bar{x}) = 0 \), where \( x - \bar{x} \) is the deviation from the mean.
Will this sum always equal zero?
YES
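A short derivation shows why the sum of the deviations is zero for any data set, not just this one. It uses only the definition of the sample mean, \( \bar{x} = \frac{\sum x}{n} \):

```latex
\sum (x - \bar{x})
  = \sum x - n\bar{x}
  = \sum x - n \cdot \frac{\sum x}{n}
  = \sum x - \sum x
  = 0
```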
Look at the following data set. Find the mean & median.
21 23 23 24 25 25 26 26 26 27
27 27 27 28 30 30 30 31 32 32
Create a histogram with the data (use an x-scale of 2). Then find the mean and median.
Mean = 27
Median = 27
Look at the placement of the mean and median in this symmetrical distribution.
Look at the following data set. Find the mean & median.
22 29 28 22 24 25 28 21 25
23 24 23 26 36 38 62 23
Create a histogram with the data (use an x-scale of 8). Then find the mean and median.
Mean = 28.176
Median = 25
Look at the placement of the mean and median in this right skewed distribution.
Look at the following data set. Find the mean & median.
21 46 54 47 53 60 55 55 60
56 58 58 58 58 62 63 64
Create a histogram with the data. Then find the mean and median.
Mean = 54.588
Median = 58
Look at the placement of the mean and median in this left skewed distribution.
Recap:
• In a symmetrical distribution, the mean and median are equal.
• In a skewed distribution, the mean is pulled in the direction of the skewness.
• In a symmetrical distribution, you should report the mean!
• In a skewed distribution, the median should be reported as the measure of center!
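The recap can be checked against the three data sets from the histogram slides. The following is a sketch, assuming Python's standard `statistics` module; the dictionary keys and variable names are illustrative.

```python
import statistics

datasets = {
    "symmetric": [21, 23, 23, 24, 25, 25, 26, 26, 26, 27,
                  27, 27, 27, 28, 30, 30, 30, 31, 32, 32],
    "right skewed": [22, 29, 28, 22, 24, 25, 28, 21, 25,
                     23, 24, 23, 26, 36, 38, 62, 23],
    "left skewed": [21, 46, 54, 47, 53, 60, 55, 55, 60,
                    56, 58, 58, 58, 58, 62, 63, 64],
}

# Store (mean, median) for each data set, with the mean rounded to 3 places.
summary = {
    name: (round(statistics.mean(data), 3), statistics.median(data))
    for name, data in datasets.items()
}

for name, (mean, median) in summary.items():
    print(f"{name}: mean = {mean}, median = {median}")
```

In the right skewed set the mean (28.176) sits above the median (25); in the left skewed set the mean (54.588) sits below the median (58); in the symmetrical set both are 27.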
Example calculations
• During a two week period 10 houses were sold in Fancytown.
House Price in Fancytown (x)
231,000 313,000 299,000 312,000 285,000 317,000 294,000 297,000 315,000 287,000
\( \sum x \) = 2,950,000
\( \bar{x} = \frac{\sum x}{n} \) = 2,950,000 / 10 = 295,000
The “average” or mean price for this sample of 10 houses in Fancytown is $295,000.
• During a two week period 10 houses were sold in Lowtown.
House Price in Lowtown (x)
97,000 93,000 110,000 121,000 113,000 95,000 100,000 122,000 99,000 2,000,000 (outlier)
\( \sum x \) = 2,950,000
\( \bar{x} = \frac{\sum x}{n} \) = 2,950,000 / 10 = 295,000
The “average” or mean price for this sample of 10 houses in Lowtown is $295,000.
• Looking at the dotplots of the samples for Fancytown and Lowtown, we can see that the mean, $295,000, appears to accurately represent the “center” of the data for Fancytown, but it is not representative of the Lowtown data.
• Clearly, the mean can be greatly affected by the presence of even a single outlier.
Outlier
1. In the previous example of the house prices in the sample of 10 houses from Lowtown, the mean was affected very strongly by the one house with the extremely high price.
2. The other 9 houses had selling prices around $100,000.
3. This illustrates that the mean can be very sensitive to a few extreme values.
SOOOO……
Describing the Center of a Data Set with the median
The sample median is obtained by first ordering the n observations from smallest to largest (with any repeated values included, so that every sample observation appears in the ordered list). Then
\[ \text{sample median} = \begin{cases} \text{the single middle value} & \text{if } n \text{ is odd} \\ \text{the mean of the middle two values} & \text{if } n \text{ is even} \end{cases} \]
Consider the Fancytown data. First, we put the data in numerical increasing order to get
231,000 285,000 287,000 294,000 297,000 299,000 312,000 313,000 315,000 317,000
Since there are 10 (even) data values, the median is the mean of the two values in the middle.
median = \( \frac{297000 + 299000}{2} \) = $298,000
Example of Median Calculation
Consider the Lowtown data. We put the data in numerical increasing order to get
93,000 95,000 97,000 99,000
100,000 110,000 113,000 121,000
122,000 2,000,000
Since there are 10 (even) data values, the median is the mean of the two values in the middle.
median = \( \frac{100000 + 110000}{2} \) = $105,000
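The odd/even rule for the sample median translates directly into code. The function below is a minimal sketch written from the definition; the name `sample_median` is illustrative, and in practice Python's built-in `statistics.median` does the same job.

```python
def sample_median(values):
    """Return the sample median using the odd/even rule."""
    ordered = sorted(values)  # order the n observations, repeats included
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # mean of the middle two


fancytown = [231_000, 313_000, 299_000, 312_000, 285_000,
             317_000, 294_000, 297_000, 315_000, 287_000]
lowtown = [97_000, 93_000, 110_000, 121_000, 113_000,
           95_000, 100_000, 122_000, 99_000, 2_000_000]

print(sample_median(fancytown))  # 298000.0
print(sample_median(lowtown))    # 105000.0
```

Both samples have the same mean of $295,000, but the medians differ widely because the single $2,000,000 sale does not move the middle of the ordered Lowtown list.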
• Typically,
1. when a distribution is skewed positively, the mean is larger than the median,
2. when a distribution is skewed negatively, the mean is smaller than the median, and
3. when a distribution is symmetric, the mean and the median are equal.
The Trimmed Mean
• A trimmed mean is computed by first ordering the data values from smallest to largest, deleting a selected number of values from each end of the ordered list, and finally computing the mean of the remaining values.
• The trimming percentage is the percentage of values deleted from each end of the ordered list.
Example of Trimmed Mean
House Price in Fancytown
231,000 285,000 287,000 294,000 297,000 299,000 312,000 313,000 315,000 317,000
\( \sum x \) = 2,950,000
\( \bar{x} \) = 295,000
median = 298,000
Sum of the eight middle values is 2,402,000
Divide this value by 8 to obtain the 10% trimmed mean.
10% Trim Mean = 300,250
Example of Trimmed Mean
House Price in Lowtown
93,000 95,000 97,000 99,000 100,000 110,000 113,000 121,000 122,000 2,000,000
\( \sum x \) = 2,950,000
\( \bar{x} \) = 295,000
median = 105,000
Sum of the eight middle values is 857,000
Divide this value by 8 to obtain the 10% trimmed mean.
10% Trim Mean = 107,125
Trimmed mean:
Purpose is to remove outliers from a data set
To calculate a trimmed mean:
• Multiply the % to trim by n
• Truncate that many observations from BOTH ends of the distribution (when listed in order)
• Calculate the mean with the shortened data set
Find a 10% trimmed mean with the following data.
12 14 19 20 22 24 25 26 26 35
10%(10) = 1
So remove one observation from each side!
10% trimmed mean = \( \frac{14 + 19 + 20 + 22 + 24 + 25 + 26 + 26}{8} = 22 \)
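The three-step recipe for a trimmed mean can be sketched as a small Python function. The name `trimmed_mean` and its `trim_percent` parameter are illustrative choices, not something defined in the slides; the count to remove is computed with integer arithmetic so that any fractional part is truncated, as the recipe describes.

```python
def trimmed_mean(values, trim_percent):
    """Mean after removing trim_percent % of the values from EACH end."""
    ordered = sorted(values)
    n = len(ordered)
    k = (trim_percent * n) // 100   # multiply the % by n, then truncate
    kept = ordered[k:n - k]         # drop k observations from both ends
    if not kept:
        raise ValueError("trimming removed every observation")
    return sum(kept) / len(kept)


data = [12, 14, 19, 20, 22, 24, 25, 26, 26, 35]
fancytown = [231_000, 285_000, 287_000, 294_000, 297_000,
             299_000, 312_000, 313_000, 315_000, 317_000]
lowtown = [93_000, 95_000, 97_000, 99_000, 100_000,
           110_000, 113_000, 121_000, 122_000, 2_000_000]

print(trimmed_mean(data, 10))       # 22.0
print(trimmed_mean(fancytown, 10))  # 300250.0
print(trimmed_mean(lowtown, 10))    # 107125.0
```

For Lowtown, trimming one value from each end removes the $2,000,000 outlier, which is why the 10% trimmed mean lands close to the median instead of the untrimmed mean.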
Example:
The Highway Loss Data Institute publishes data on repair costs resulting from a 5-mph crash test of a car moving forward into a flat barrier. The following table gives data for 10 midsize luxury cars:

Model             Repair Cost in dollars
Audi A6           0
BMW 328i          0
Cadillac Catera   900
Jaguar X          1254
Lexus ES300       234
Lexus IS 300      979
Mercedes C320     707
Saab 9-5          670
Volvo S60         769
Volvo S80         4194

Compute the mean and median values. Why are these values so different? Which of the mean and median do you think is more representative of the data set? Why?
Test day: The average for Mrs. Goins’ 4th period Algebra test was 88% for those 24 students. The average for 6th period was 85% for those 20 students. What was the average for the combined classes?
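One way to approach the combined-class question is to weight each class average by its number of students, since the two classes are different sizes. The helper below is a sketch under that assumption; `combined_mean` is an illustrative name, not something from the slides.

```python
def combined_mean(groups):
    """Combine group means, given (mean, group_size) pairs."""
    total = sum(mean * size for mean, size in groups)  # rebuild each group's sum
    count = sum(size for _, size in groups)            # total number of students
    return total / count


# 4th period: average 88 with 24 students; 6th period: average 85 with 20 students.
combined = combined_mean([(88, 24), (85, 20)])
print(round(combined, 2))  # 86.64
```

Simply averaging 88 and 85 would give 86.5, which ignores that 4th period has more students.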