Means & Medians Chapter 4. Parameter - Fixed value about a population Typical unknown.


Page 1:

Means & Medians

Chapter 4

Page 2:

Parameter -

• Fixed value about a population

• Typically unknown

Page 3:

Statistic -

• Value calculated from a sample

Page 4:

Measures of Central Tendency

• Median - the middle of the data; 50th percentile

–Observations must be in numerical order

–Is the single middle value if n is odd

–The average of the middle two values if n is even

NOTE: n denotes the sample size
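The median rule above can be sketched in a few lines of Python. This is only an illustration: the function name `median_of` and the two sample lists are assumptions made for the example, not part of the slides.

```python
def median_of(values):
    """Return the median of a non-empty list of numbers."""
    if not values:
        raise ValueError("median_of needs at least one observation")
    ordered = sorted(values)      # observations must be in numerical order
    n = len(ordered)              # n denotes the sample size
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]       # n is odd: the single middle value
    # n is even: the average of the middle two values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([2, 3, 4, 8, 12]))     # n = 5 (odd)
print(median_of([2, 3, 4, 6, 8, 12]))  # n = 6 (even)
```

Sorting inside the function means the caller does not have to order the data first.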

Page 5:

Measures of Central Tendency

• Mean - the arithmetic average

–Use \(\mu\) to represent a population mean (a parameter)

–Use \(\bar{x}\) to represent a sample mean (a statistic)

Formula: \(\bar{x} = \frac{\sum x}{n}\)

\(\Sigma\) is the capital Greek letter sigma – it means to sum the values that follow
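As a rough sketch, the sample mean formula (sum the values, then divide by n) looks like this in Python. The name `sample_mean` and the data list are assumptions chosen for illustration.

```python
def sample_mean(values):
    """Sample mean: the sum of the observations divided by n."""
    n = len(values)               # n is the sample size
    if n == 0:
        raise ValueError("sample_mean needs at least one observation")
    return sum(values) / n        # sigma (sum the values), then divide by n


print(sample_mean([2, 3, 4, 6, 8, 12]))
```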

Page 6:

Measures of Central Tendency

• Mode – the observation that occurs the most often

–Can be more than one mode

–If all values occur only once – there is no mode

–Not used as often as mean & median
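One possible way to express the mode rules in Python is sketched below. The helper name `modes` is hypothetical, and returning an empty list for "no mode" is a choice made for this example.

```python
from collections import Counter


def modes(values):
    """Return all modes in sorted order; an empty list means there is no mode."""
    if not values:
        return []
    counts = Counter(values)            # how often each observation occurs
    highest = max(counts.values())
    if highest == 1:
        return []                       # all values occur only once: no mode
    # There can be more than one mode, so collect every value with the top count.
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([21, 23, 23, 24]))
print(modes([1, 1, 2, 2, 3]))
print(modes([1, 2, 3]))
```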

Page 7:

Suppose we are interested in the number of lollipops that are bought at a certain store. A sample of 5 customers buys the following number of lollipops. Find the median.

2 3 4 8 12

The numbers are in order & n is odd – so find the middle observation.

The median is 4 lollipops!

Page 8:

Suppose we have a sample of 6 customers that buy the following number of lollipops. The median is …

2 3 4 6 8 12

The numbers are in order & n is even – so find the middle two observations: 4 and 6.

Now, average these two values.

\(\frac{4 + 6}{2} = 5\)

The median is 5 lollipops!

Page 9:

Suppose we have a sample of 6 customers that buy the following number of lollipops. Find the mean.

2 3 4 6 8 12

To find the mean number of lollipops add the observations and divide by n.

\(\bar{x} = \frac{2 + 3 + 4 + 6 + 8 + 12}{6} \approx 5.833\)

Page 10:

Using the calculator . . .

Page 11:

What would happen to the median & mean if the 12 lollipops were 20?

2 3 4 6 8 20

The median is . . . 5

The mean is . . . \(\bar{x} = \frac{2 + 3 + 4 + 6 + 8 + 20}{6} \approx 7.17\)

What happened?

Page 12:

What would happen to the median & mean if the 20 lollipops were 50?

2 3 4 6 8 50

The median is . . . 5

The mean is . . . \(\bar{x} = \frac{2 + 3 + 4 + 6 + 8 + 50}{6} \approx 12.17\)

What happened?
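A short Python check of what happened, using the standard library's `statistics` module. The loop and variable names are illustrative; only the largest value changes between the three lollipop samples.

```python
from statistics import mean, median

base = [2, 3, 4, 6, 8]            # the five smallest observations never change
results = {}
for largest in (12, 20, 50):      # the sixth observation grows each time
    data = base + [largest]
    results[largest] = (median(data), round(mean(data), 2))
    print(f"largest = {largest}: median = {results[largest][0]}, "
          f"mean = {results[largest][1]}")
```

The median stays the same in all three runs while the mean keeps growing with the largest value.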

Page 13:

Resistant -

• Statistics that are not affected by outliers

• Is the median resistant? YES

• Is the mean resistant? NO

IMPORTANT: The median is resistant to outliers. The mean is NOT resistant to outliers.

Page 14:

Look at the following data set. Find the mean.

22 23 24 25 25 26 29 30

\(\bar{x} = 25.5\)

Now find how each observation deviates from the mean. \(x - \bar{x}\) is the deviation from the mean.

What is the sum of the deviations from the mean?

\(\sum (x - \bar{x}) = 0\)

Will this sum always equal zero? YES
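The sum of the deviations is always zero because subtracting the mean from each of the n observations removes n times the mean in total, and n times the mean is exactly the sum of the observations. A small numeric check in Python follows; the variable names are illustrative.

```python
data = [22, 23, 24, 25, 25, 26, 29, 30]

x_bar = sum(data) / len(data)             # the sample mean
deviations = [x - x_bar for x in data]    # how far each observation is from the mean
total = sum(deviations)

print("mean:", x_bar)
print("deviations:", deviations)
print("sum of deviations:", total)
```

With other data sets, floating-point rounding can leave a tiny nonzero remainder, so compare the sum against a small tolerance rather than exactly zero.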

Page 15:

Look at the following data set. Create a histogram with the data (use x-scale of 2). Then find the mean and median.

21 23 23 24 25 25 26 26 26 27

27 27 27 28 30 30 30 31 32 32

Mean = 27

Median = 27

Look at the placement of the mean and median in this symmetrical distribution.

Page 16:

Look at the following data set. Create a histogram with the data (use x-scale of 8). Then find the mean and median.

22 29 28 22 24 25 28 21 25

23 24 23 26 36 38 62 23

Mean = 28.176

Median = 25

Look at the placement of the mean and median in this right skewed distribution.

Page 17:

Look at the following data set. Create a histogram with the data. Then find the mean and median.

21 46 54 47 53 60 55 55 60

56 58 58 58 58 62 63 64

Mean = 54.588

Median = 58

Look at the placement of the mean and median in this skewed left distribution.
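The three data sets from the symmetrical, right skewed, and skewed left slides can be compared side by side with a short Python sketch. The dictionary keys and variable names are assumptions for this example; the numbers are the slide data.

```python
from statistics import mean, median

datasets = {
    "symmetrical": [21, 23, 23, 24, 25, 25, 26, 26, 26, 27,
                    27, 27, 27, 28, 30, 30, 30, 31, 32, 32],
    "skewed right": [22, 29, 28, 22, 24, 25, 28, 21, 25,
                     23, 24, 23, 26, 36, 38, 62, 23],
    "skewed left": [21, 46, 54, 47, 53, 60, 55, 55, 60,
                    56, 58, 58, 58, 58, 62, 63, 64],
}

summary = {}
for shape, data in datasets.items():
    summary[shape] = (round(mean(data), 3), median(data))
    print(f"{shape}: mean = {summary[shape][0]}, median = {summary[shape][1]}")
```

In each skewed case the mean lands on the same side of the median as the long tail.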

Page 18:

Recap:

• In a symmetrical distribution, the mean and median are equal.

• In a skewed distribution, the mean is pulled in the direction of the skewness.

• In a symmetrical distribution, you should report the mean!

• In a skewed distribution, the median should be reported as the measure of center!

Page 19:

Example calculations

• During a two-week period 10 houses were sold in Fancytown.

House Price in Fancytown (x):

231,000
313,000
299,000
312,000
285,000
317,000
294,000
297,000
315,000
287,000

\(\sum x = 2,950,000\)

\(\bar{x} = \frac{\sum x}{n} = \frac{2,950,000}{10} = 295,000\)

The “average” or mean price for this sample of 10 houses in Fancytown is $295,000

Page 20:

• During a two-week period 10 houses were sold in Lowtown.

House Price in Lowtown (x):

97,000
93,000
110,000
121,000
113,000
95,000
100,000
122,000
99,000
2,000,000 (Outlier)

\(\sum x = 2,950,000\)

\(\bar{x} = \frac{\sum x}{n} = \frac{2,950,000}{10} = 295,000\)

The “average” or mean price for this sample of 10 houses in Lowtown is $295,000

Page 21:

• Looking at the dotplots of the samples for Fancytown and Lowtown, we can see that the mean, $295,000, appears to accurately represent the “center” of the data for Fancytown, but it is not representative of the Lowtown data.

• Clearly, the mean can be greatly affected by the presence of even a single outlier.

[Dotplots of the Fancytown and Lowtown samples, with the outlier labeled]

Page 22:

1. In the previous example of the house prices in the sample of 10 houses from Lowtown, the mean was affected very strongly by the one house with the extremely high price.

2. The other 9 houses had selling prices around $100,000.

3. This illustrates that the mean can be very sensitive to a few extreme values.

SOOOO……

Page 23:

Describing the Center of a Data Set with the median

The sample median is obtained by first ordering the n observations from smallest to largest (with any repeated values included, so that every sample observation appears in the ordered list). Then

\[
\text{sample median} =
\begin{cases}
\text{the single middle value} & \text{if } n \text{ is odd} \\
\text{the mean of the middle two values} & \text{if } n \text{ is even}
\end{cases}
\]

Page 24:

Example of Median Calculation

Consider the Fancytown data. First, we put the data in numerical increasing order to get

231,000 285,000 287,000 294,000 297,000 299,000 312,000 313,000 315,000 317,000

Since there are 10 (even) data values, the median is the mean of the two values in the middle.

\(\text{median} = \frac{297,000 + 299,000}{2} = \$298,000\)

Page 25:

Consider the Lowtown data. We put the data in numerical increasing order to get

93,000 95,000 97,000 99,000 100,000 110,000 113,000 121,000 122,000 2,000,000

Since there are 10 (even) data values, the median is the mean of the two values in the middle.

\(\text{median} = \frac{100,000 + 110,000}{2} = \$105,000\)
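To see the two towns side by side, here is a minimal Python sketch using the standard library's `statistics` module. The variable names are illustrative; the prices are the ones listed on the slides.

```python
from statistics import mean, median

fancytown = [231_000, 313_000, 299_000, 312_000, 285_000,
             317_000, 294_000, 297_000, 315_000, 287_000]
lowtown = [97_000, 93_000, 110_000, 121_000, 113_000,
           95_000, 100_000, 122_000, 99_000, 2_000_000]

for town, prices in (("Fancytown", fancytown), ("Lowtown", lowtown)):
    # median() orders the data itself, so the lists can stay in sale order
    print(f"{town}: mean = {mean(prices):,.0f}, median = {median(prices):,.0f}")
```

Both towns share the same mean, but only the median separates the typical Lowtown price from the single $2,000,000 sale.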

Page 27:

• Typically,

1. when a distribution is skewed positively, the mean is larger than the median,

2. when a distribution is skewed negatively, the mean is smaller than the median, and

3. when a distribution is symmetric, the mean and the median are equal.

Page 28:

Trimmed mean:

Purpose is to remove outliers from a data set

To calculate a trimmed mean:

• Multiply the % to trim by n

• Truncate that many observations from BOTH ends of the distribution (when listed in order)

• Calculate the mean with the shortened data set
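The three steps above could be sketched in Python as follows. The function name `trimmed_mean` is hypothetical, and rounding the trim count down to a whole number when the % times n is not an integer is an assumption; the slides do not say how to handle that case.

```python
def trimmed_mean(values, percent):
    """Mean after dropping `percent` % of the observations from EACH end."""
    ordered = sorted(values)               # list the observations in order
    n = len(ordered)
    k = int(n * percent / 100)             # step 1: multiply the % to trim by n
                                           # (assumption: round down if not whole)
    kept = ordered[k:n - k]                # step 2: truncate k values from BOTH ends
    if not kept:
        raise ValueError("trimming removed every observation")
    return sum(kept) / len(kept)           # step 3: mean of the shortened data set


print(trimmed_mean([12, 14, 19, 20, 22, 24, 25, 26, 26, 35], 10))
```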

Page 29:

Find a 10% trimmed mean with the following data.

12 14 19 20 22 24 25 26 26 35

10%(10) = 1

So remove one observation from each side!

\(\text{10\% trimmed mean} = \frac{14 + 19 + 20 + 22 + 24 + 25 + 26 + 26}{8} = 22\)

Page 30:

Example of Trimmed Mean

House Price in Fancytown

231,000
285,000
287,000
294,000
297,000
299,000
312,000
313,000
315,000
317,000

\(\sum x = 2,950,000\)

Sum of the eight middle values is 2,402,000

Divide this value by 8 to obtain the 10% trimmed mean.

\(\bar{x} = 295,000\)

median = 298,000

10% Trim Mean = 300,250

Page 31:

Example of Trimmed Mean

House Price in Lowtown

93,000
95,000
97,000
99,000
100,000
110,000
113,000
121,000
122,000
2,000,000

\(\sum x = 2,950,000\)

Sum of the eight middle values is 857,000

Divide this value by 8 to obtain the 10% trimmed mean.

\(\bar{x} = 295,000\)

median = 105,000

10% Trim Mean = 107,125
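The two house-price trimmed means can be reproduced with a short Python sketch. The dictionary layout and variable names are assumptions for illustration; the prices are the slide data.

```python
from statistics import mean

towns = {
    "Fancytown": [231_000, 285_000, 287_000, 294_000, 297_000,
                  299_000, 312_000, 313_000, 315_000, 317_000],
    "Lowtown": [93_000, 95_000, 97_000, 99_000, 100_000,
                110_000, 113_000, 121_000, 122_000, 2_000_000],
}

trimmed = {}
for town, prices in towns.items():
    ordered = sorted(prices)
    k = len(ordered) // 10                 # 10% of n = 10 is 1 observation per end
    middle = ordered[k:len(ordered) - k]   # the eight middle values
    trimmed[town] = (sum(middle), mean(middle))
    print(f"{town}: sum of middle values = {trimmed[town][0]:,}, "
          f"10% trimmed mean = {trimmed[town][1]:,}")
```

Dropping one value from each end removes the $2,000,000 Lowtown sale, which is why the Lowtown trimmed mean sits close to its median.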