Unit 23 : Measures of Central tendency and...


Page 1: Unit 23 : Measures of Central tendency and dispersionlwsua.vtc.edu.hk/~cim/CMV6120_FMaths/Student_note/Unit 20... · Web viewUnit 20: Measures of Central Tendency and Dispersion Learning

CMV6120 Mathematics

Unit 20 : Measures of Central Tendency and Dispersion

Learning Objectives

The students should be able to:

determine the mean, median and mode from ungrouped data

determine the mean, median and mode from grouped data

determine the range, inter quartile range and standard deviation.

Activities

Teacher demonstration and student hands-on exercise.

Use MS Excel spreadsheet, internal functions and data analysis to measure central tendency and dispersion.

Reference

Suen, S.N. (1998) “Mathematics for Hong Kong 5A”; Rev. Ed.; Canotta

Unit 20 Central tendency & dispersion Page 1 of 13


Measures of Central Tendency and Dispersion

1. Measure of central tendency: mean, median and mode from grouped and ungrouped data

To summarise a whole set of data, we determine a single representative quantity, termed a measure of central tendency. The most commonly used measures are the mean, the median and the mode.

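1.1 Mean

For ungrouped data $x_1, x_2, \dots, x_n$,

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

Example 1

Find the mean of the set of data: 3, 7, 2, 1, 7

Solution

$$\bar{x} = \frac{3 + 7 + 2 + 1 + 7}{5} = \frac{20}{5} = 4$$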
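The calculation in Example 1 can be checked with a few lines of Python. This is only a minimal sketch; the function name `mean` is chosen here for illustration and does not come from the original notes.

```python
def mean(data):
    """Return the arithmetic mean: the sum of the data divided by their count."""
    return sum(data) / len(data)


example_1 = [3, 7, 2, 1, 7]
print(mean(example_1))  # prints 4.0
```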

For grouped data with class marks $x$ and corresponding frequencies $f$,

$$\bar{x} = \frac{\sum f x}{\sum f}$$

Example 2

a) Find the mean of the set of data: 25, 36, 42, 38, 36
b) Find the mean from the set of grouped data

Class mark   10.5   30.5   50.5   70.5   90.5   110.5
Frequency    19     6      3      2      1      2

Solution

a) mean =


b)

x        f     fx
10.5     19    199.5
30.5     6
50.5     3
70.5     2
90.5     1
110.5    2
sum      33

mean =
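For grouped data, the mean is the sum of the fx column divided by the total frequency. The sketch below applies that to the class marks and frequencies of Example 2 b); the helper name `grouped_mean` is an assumption made for illustration.

```python
def grouped_mean(class_marks, frequencies):
    """Mean of grouped data: sum of f*x divided by sum of f."""
    total_fx = sum(f * x for x, f in zip(class_marks, frequencies))
    return total_fx / sum(frequencies)


class_marks = [10.5, 30.5, 50.5, 70.5, 90.5, 110.5]
frequencies = [19, 6, 3, 2, 1, 2]

# The fx column of the table, row by row.
fx_column = [f * x for x, f in zip(class_marks, frequencies)]
print(fx_column)
print(sum(frequencies))                                  # 33
print(round(grouped_mean(class_marks, frequencies), 2))  # 29.89
```

The first entry of `fx_column` reproduces the 199.5 already filled in for the first row of the table.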

Example 3

The HK Consumer Price Index B from 1997 to 2003 was as follows:

Year   Index
1996   99.7
1997   105.5
1998   108.5
1999   103.4
2000   99.4
2001   97.7
2002   94.7
2003   92.1

Calculate the average Consumer Price Index B:
a) For the first 4 years (1997 – 2000).
b) For the next 3 years (2001 – 2003).
c) For all 7 years.
d) Suppose the original data were lost, and only the 4- and 3-year averages in a) and b) were available. Would it still be possible to calculate the overall 7-year average? How?

Solution

a) From 1997 – 2000, n = 4.

The average price index = (105.5 + 108.5 + 103.4 + 99.4) ÷ 4 =

b) From 2001 – 2003, n = 3.

The average price index = (97.7 + 94.7 + 92.1) ÷ 3 =

c) From 1997 – 2003, n = 7.

The average price index = (105.5 + 108.5 + 103.4 + 99.4 + 97.7 + 94.7 + 92.1) ÷ 7 =

d) Weight each average by its number of years:

The average price index over 7 years = (4 × ______ + 3 × ______) ÷ (4 + 3) =
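Part d) relies on the fact that an overall mean can be recovered from sub-group means when each is weighted by its group size. The following sketch checks this with the index values of Example 3; the variable names are illustrative only.

```python
first_four = [105.5, 108.5, 103.4, 99.4]   # 1997 - 2000
next_three = [97.7, 94.7, 92.1]            # 2001 - 2003

mean_a = sum(first_four) / len(first_four)
mean_b = sum(next_three) / len(next_three)

# c) Direct calculation from all seven values.
overall_direct = sum(first_four + next_three) / 7

# d) Recovered from the two averages alone, weighted by the number of years.
overall_from_parts = (4 * mean_a + 3 * mean_b) / (4 + 3)

print(round(mean_a, 2), round(mean_b, 2))
print(round(overall_direct, 2), round(overall_from_parts, 2))
```

Both routes give the same 7-year average, which is why the sub-period averages are enough as long as the number of years in each is known.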


1.2 Median

For ungrouped data arranged in order,
Median = the middle datum, when n is odd.
Median = the mean of the two middle data, when n is even.

e.g. 1 For the set of data 2, 4, 7, 9, 21, the middle datum is 7.

median = 7

e.g. 2 For the set of data 3, 5, 7, 7, the two middle data are 5 and 7.

median = (5 + 7) ÷ 2 = 6
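The odd and even cases of the rule can be sketched in Python as follows. The function name `median` is chosen here for illustration; the data are those of e.g. 1 and e.g. 2.

```python
def median(data):
    """Median of ungrouped data: sort first, then take the middle."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n is odd: a single middle datum exists.
        return ordered[mid]
    # n is even: take the mean of the two middle data.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([2, 4, 7, 9, 21]))  # 7
print(median([3, 5, 7, 7]))      # 6.0
```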

For grouped data,

Step 1: Draw the cumulative frequency polygon.
Step 2: The median is the datum corresponding to the middle value of the cumulative frequency.

Example 4

a) Find the median of 2, 3, 10, 12, 999.
b) Find the median of 2, 3, 10, 12, 22, 123.
c) The cumulative frequency polygon for the maths marks of a class is given below. Find the median mark.

Solution

a) Median =

b) Median =

c) Total frequency = 40

The rank of median = 40/2 =

From the cumulative frequency polygon, median =

[Figure: Cumulative frequency polygon for marks in maths. Horizontal axis: Marks, with class boundaries from 9.5 to 99.5; vertical axis: frequency, from 0 to 40.]

Example 5

The provisional figures on the population by age group in Hong Kong as at 9/2001 are tabulated below. Draw a cumulative frequency polygon and determine the median age for the population.

Age group           0 – 9   10 – 19   20 – 29   30 – 39   40 – 49   50 – 59   60 – 69   ≥ 70
Population ('000)   676     885       1000      1267      1208      677       503       499

Solution

Age group x     Population ('000)   Cumulative population ('000 000)
x < 10          676                 0.676
10 ≦ x < 20     885                 1.561
20 ≦ x < 30     1000
30 ≦ x < 40     1267
40 ≦ x < 50     1208
50 ≦ x < 60     677
60 ≦ x < 70     503
70 ≦ x          499

The rank of median = 6.715/2 =

The median age =
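Reading the median off a cumulative frequency polygon amounts to linear interpolation between two neighbouring class boundaries. The sketch below does this numerically for the population data of Example 5. It assumes class boundaries at ages 0, 10, ..., 70 and straight-line segments between them; the helper `value_at_rank` is a hypothetical name, not something from the original notes.

```python
from itertools import accumulate


def value_at_rank(boundaries, cumulative, rank):
    """Interpolate linearly along a less-than cumulative frequency polygon."""
    for i in range(1, len(boundaries)):
        if cumulative[i] >= rank:
            lower, upper = boundaries[i - 1], boundaries[i]
            below, up_to = cumulative[i - 1], cumulative[i]
            return lower + (rank - below) / (up_to - below) * (upper - lower)
    raise ValueError("rank lies beyond the last class boundary")


population = [676, 885, 1000, 1267, 1208, 677, 503, 499]  # in thousands
boundaries = [0, 10, 20, 30, 40, 50, 60, 70]              # assumed boundaries

total = sum(population)                                   # 6715
# Cumulative population below each boundary (the open-ended 70+ class
# has no upper boundary, so it is left out of the polygon).
cumulative = [0] + list(accumulate(population[:-1]))

median_rank = total / 2
median_age = value_at_rank(boundaries, cumulative, median_rank)
print(round(median_age, 1))  # 36.3
```

A value read from a hand-drawn polygon should be close to this figure, though not necessarily identical.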

1.3 Mode

For ungrouped data, the mode is the datum that has the highest frequency.
For grouped data, the modal class is the class that has the highest frequency.

Example 6

a) Find the mode of the data: 1, 2, 2, 2, 3, 3, 9

b) Find the modal class

Class       10 - 14   15 - 19   20 - 24   25 - 29
Frequency   2         8         7         3

Solution

a) The mode is

b) The modal class is
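Both parts of Example 6 come down to finding the highest frequency. A minimal Python sketch follows; the function name `modes` is illustrative, and it returns a list because a data set can have more than one value sharing the highest frequency.

```python
from collections import Counter


def modes(data):
    """Return every datum that attains the highest frequency."""
    counts = Counter(data)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


# a) Ungrouped data
print(modes([1, 2, 2, 2, 3, 3, 9]))  # [2]

# b) Grouped data: pick the class with the largest frequency.
classes = ["10 - 14", "15 - 19", "20 - 24", "25 - 29"]
frequencies = [2, 8, 7, 3]
modal_class = classes[frequencies.index(max(frequencies))]
print(modal_class)  # 15 - 19
```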

[Figure for Example 5: Less-than cumulative frequency polygon for population. Horizontal axis: Age group (10, 20, 30, 40, 50, 60, 70, 70+); vertical axis: Population ('000 000), from 0 to 8.0.]

Example 7

The temperatures in degrees Celsius recorded each day over a three-week period were as follows: 17, 18, 20, 21, 19, 16, 15, 18, 20, 21, 21,, 22, 21, 19, 20, 19, 17, 16, 16, 17. Compute the mean, median and mode of these raw data by using two-degree intervals starting with 15 – 16. Draw a cumulative frequency polygon.

Solution

Temperature (℃)   Tally   Frequency f   Class mark x   fx
15 – 16
17 – 18
19 – 20
21 – 22
23 – 24
Sum

Temperature (℃)   Cumulative frequency
< 14.5
< 16.5
< 18.5
< 20.5
< 22.5
< 24.5

The mean temperature =

The modal class of temperature is

The median temperature is

Remark

The mean seems to be the most commonly used (and often misused) measure of central tendency. If the distribution of the data set is strongly skewed, the mean is not a reliable measure because it is strongly affected by extreme values. In this case, the median may be a better choice. The mode is used when there is reason to choose the most commonly occurring data value as the representative of the whole data set.
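The effect of an extreme value on the mean can be seen with the data of Example 4 a), where 999 is far from the other values. The sketch below uses Python's standard `statistics` module; comparing with the same data minus the extreme value is an illustration added here, not part of the original example.

```python
import statistics

data = [2, 3, 10, 12, 999]
without_extreme = [2, 3, 10, 12]

# One extreme value pulls the mean far from the bulk of the data.
print(statistics.mean(data))               # 205.2
print(statistics.mean(without_extreme))    # 6.75

# The median barely moves.
print(statistics.median(data))             # 10
print(statistics.median(without_extreme))  # 6.5
```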

[Figure: Cumulative frequency polygon for temperature. Horizontal axis: Temperature (℃), with class boundaries from 14.5 to 24.5; vertical axis: Frequency, from 0 to 24.]

2. Measure of dispersion: Range, Inter-quartile range and Standard deviation

Apart from using a measure of central tendency to summarise a set of data, we need a quantity to measure the degree of dispersion of the set of data (so that we can determine the reliability of the set of data). Range is a measure that is very simple to use but it provides relatively little information on dispersion. Quartile deviation is used in association with the median whereas standard deviation goes with the mean.

2.1 Range

For ungrouped data, the range is the difference between the largest datum and the smallest datum.
For grouped data, the range is the difference between the highest class boundary and the lowest class boundary.

Example 8

a) Find the range of the data: 1, 2, 2, 2, 3, 3, 9

b) Find the range of the grouped data

Class       10 - 14   15 - 19   20 - 24   25 - 29
Frequency   2         8         7         3

Solution

a) The range =

b) The range =
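The two definitions of range can be sketched as follows. For the grouped case the code assumes the data are whole numbers, so that each class boundary lies half a unit beyond the class limit (for example 9.5 below the class 10 - 14); the function names and the `unit` parameter are illustrative assumptions.

```python
def ungrouped_range(data):
    """Largest datum minus smallest datum."""
    return max(data) - min(data)


def grouped_range(class_limits, unit=1):
    """Highest class boundary minus lowest class boundary.

    class_limits is a list of (lower limit, upper limit) pairs in
    ascending order; boundaries are assumed to sit half a unit outside.
    """
    lowest_boundary = class_limits[0][0] - unit / 2
    highest_boundary = class_limits[-1][1] + unit / 2
    return highest_boundary - lowest_boundary


print(ungrouped_range([1, 2, 2, 2, 3, 3, 9]))                       # 8
print(grouped_range([(10, 14), (15, 19), (20, 24), (25, 29)]))      # 20.0
```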

2.2 Inter-quartile range

Inter-quartile range = Q3 – Q1

where Q1, Q2, Q3 are called quartiles, which divide the data (which have been ranked, i.e. arranged in order) into four equal parts.

Moreover,
Q2 is the median of the whole set of data,
Q1 is the median of the lower half,
Q3 is the median of the upper half.

Quartile deviation, Q.D. = ½ (Q3 – Q1)


Example 9

Find the inter-quartile range of
a) 1, 2, 3, 5, 11, 12, 13.
b) 1, 2, 3, 4, 11, 12, 13, 14.

Solution

a) inter-quartile range =

b) inter-quartile range =
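The "median of each half" description of the quartiles can be sketched in Python. One detail is not stated in the notes: when the number of data is odd, this sketch assumes the middle datum belongs to neither half. Other conventions exist and can give slightly different quartiles. The function names are illustrative.

```python
def median(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median of the lower and upper halves."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: for odd n the middle datum is excluded from both halves.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)


for data in ([1, 2, 3, 5, 11, 12, 13], [1, 2, 3, 4, 11, 12, 13, 14]):
    q1, q2, q3 = quartiles(data)
    print(q1, q2, q3, "inter-quartile range =", q3 - q1)
```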

Example 10

The following frequency distribution gives the life hours of a sample of 50 light bulbs:

Life hours ('000)   Frequency   Life hours up to ('000)   Cumulative frequency
                                0.6
0.6 to under 0.7    2           0.7
0.7 to under 0.8    4           0.8
0.8 to under 0.9    6           0.9
0.9 to under 1.0    14          1.0
1.0 to under 1.1    13          1.1
1.1 to under 1.2    7           1.2
1.2 to under 1.3    4           1.3

Find the median and the inter-quartile range of the data.

The rank of median = ½ × 50 =

The median of life hours is ______ hrs.

The rank of upper quartile = ¾ × 50 = 37.5 ≈ 38, to the nearest integer

The upper quartile Q3 is ______ hrs.

The rank of lower quartile = ¼ × 50 = 12.5 ≈ 13, to the nearest integer

The lower quartile Q1 is ______ hrs.

The inter-quartile range = Q3 – Q1 =

Quartile deviation = ½ (Q3 – Q1) =
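The values read from the cumulative frequency polygon in Example 10 can be approximated by linear interpolation between class boundaries. The sketch below uses the ranks stated in the example (25, 38 and 13) and a hypothetical helper `value_at_rank`; results are in thousands of hours, and a reading from a hand-drawn graph may differ slightly.

```python
from itertools import accumulate


def value_at_rank(boundaries, cumulative, rank):
    """Interpolate linearly along a less-than cumulative frequency polygon."""
    for i in range(1, len(boundaries)):
        if cumulative[i] >= rank:
            lower, upper = boundaries[i - 1], boundaries[i]
            below, up_to = cumulative[i - 1], cumulative[i]
            return lower + (rank - below) / (up_to - below) * (upper - lower)
    raise ValueError("rank lies beyond the last class boundary")


boundaries = [0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3]   # life hours ('000)
frequencies = [2, 4, 6, 14, 13, 7, 4]
cumulative = [0] + list(accumulate(frequencies))         # ends at 50

median_life = value_at_rank(boundaries, cumulative, 25)  # rank 1/2 x 50
q3 = value_at_rank(boundaries, cumulative, 38)           # rank as stated
q1 = value_at_rank(boundaries, cumulative, 13)           # rank as stated

print(round(median_life, 3))     # 0.993
print(round(q1, 3), round(q3, 3))
print(round(q3 - q1, 3))         # inter-quartile range
print(round((q3 - q1) / 2, 3))   # quartile deviation
```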

[Figure: Less-than cumulative frequency polygon for life hours of 50 sample light bulbs. Horizontal axis: Life hours ('000), from 0.6 to 1.3; vertical axis: Frequency, from 0 to 56.]

Example 11

Find the range, inter-quartile range and quartile deviation for the data in Example 4 and Example 7 respectively.

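2.3 Standard deviation

For ungrouped data $x_1, x_2, \dots, x_n$ with mean $\bar{x}$, the standard deviation ($\sigma$) is

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$$

For grouped data with class marks $x_1, x_2, \dots, x_n$, corresponding frequencies $f_1, f_2, \dots, f_n$ and mean $\bar{x}$, the standard deviation ($\sigma$) is

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^2}{\sum_{i=1}^{n} f_i}}$$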

Example 12

Find the standard deviation for
a) the ungrouped data 8, 9, 10, 10, 11
b) the grouped data

x   17   22   27   32   37   42   47
f   2    4    7    8    7    4    2

Solution

a) mean =                standard deviation =

Calculator key-in method:

                     Model 3600   Model 3900   Model 506R
Set statistic mode   MODE 3       MODE 2       2ndF MODE 3 0
Clear memory         KAC          KAC          2ndF CA
Key-in data          8 DATA       8 DATA       8 DATA
                     9 DATA       9 DATA       9 DATA
                     10 DATA      10 DATA      10 DATA
                     10 DATA      10 DATA      10 DATA
                     11 DATA      11 DATA      11 DATA
mean                 SHIFT 1      SHIFT 4      2ndF 4
s.d.                 SHIFT 2      SHIFT 5      2ndF 6


b) mean =                standard deviation =

Calculator key-in method:

                     Model 3600   Model 3900   Model 506R
Set statistic mode   MODE 3       MODE 2       2ndF MODE 3 0
Clear memory         KAC          KAC          2ndF CA
Key-in data          17 2 DATA    17 2 DATA    17 , 2 DATA
                     22 4 DATA    22 4 DATA    22 , 4 DATA
                     27 7 DATA    27 7 DATA    27 , 7 DATA
                     32 8 DATA    32 8 DATA    32 , 8 DATA
                     37 7 DATA    37 7 DATA    37 , 7 DATA
                     42 4 DATA    42 4 DATA    42 , 4 DATA
                     47 2 DATA    47 2 DATA    47 , 2 DATA
mean                 SHIFT 1      SHIFT 4      2ndF 4
s.d.                 SHIFT 2      SHIFT 5      2ndF 6
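As an alternative to the calculator, the standard deviations in Example 12 can be computed in Python. This sketch assumes the form of the standard deviation that divides by n (or by the total frequency for grouped data), which corresponds to `statistics.pstdev` in the standard library; `statistics.stdev` divides by n − 1 and gives a slightly larger value. The function names are illustrative.

```python
from math import sqrt


def std_dev(data):
    """Standard deviation of ungrouped data, dividing by n."""
    n = len(data)
    mean = sum(data) / n
    return sqrt(sum((x - mean) ** 2 for x in data) / n)


def grouped_std_dev(class_marks, frequencies):
    """Standard deviation of grouped data, dividing by the total frequency."""
    total_f = sum(frequencies)
    mean = sum(f * x for x, f in zip(class_marks, frequencies)) / total_f
    variance = sum(f * (x - mean) ** 2
                   for x, f in zip(class_marks, frequencies)) / total_f
    return sqrt(variance)


# a) Ungrouped data
print(round(std_dev([8, 9, 10, 10, 11]), 2))   # 1.02

# b) Grouped data
x = [17, 22, 27, 32, 37, 42, 47]
f = [2, 4, 7, 8, 7, 4, 2]
print(round(grouped_std_dev(x, f), 2))         # 7.76
```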

Example 13

The life hours of 50 light bulbs have the following frequency distribution. Complete the table with class marks. Calculate the mean and standard deviation.

Life hours ('000)   Class mark   Frequency
0.6 to under 0.7    0.65         2
0.7 to under 0.8    0.75         4
0.8 to under 0.9    0.85         6
0.9 to under 1.0    0.95         14
1.0 to under 1.1    1.05         13
1.1 to under 1.2    1.15         7
1.2 to under 1.3    1.25         4

Solution

mean = standard deviation =


Example 14

The heights of the Brazil team members at the 2002 FIFA World Cup are listed as follows:

Player         Height (m)
Marcos         1.93
Cafu           1.76
Lucio          1.88
Roque Junior   1.86
Edmilson       1.85
Carlos         1.68
Richardino     1.76
Silva          1.85
Ronaldo        1.83
Rivaldo        1.86
Ronaldinho     1.80

Calculate the average height and the standard deviation.

Solution

Example 15

Find the mean and standard deviation for the data given below:

Age group x   Population ('000)
5             676
15            885
25            1000
35            1267
45            1208
55            677
65            503
75            499

Solution


Practice

1. The Hong Kong unemployment rate from 5/2003 to 5/2004 was as follows:

Month     Rate (%)
5/2003    8.3
6/2003    8.6
7/2003    8.7
8/2003    8.6
9/2003    8.3
10/2003   8.0
11/2003   7.5
12/2003   7.3
1/2004    7.3
2/2004    7.2
3/2004    7.2
4/2004    7.1
5/2004    7.0

Calculate the average, median, mode and standard deviation of the unemployment rate:
a) For 5/2003 – 12/2003
b) For 1/2004 – 5/2004
c) For all 13 months.

2. Find the mean, median and mode of the following: 10, 13, 14, 14, 14, 15, 15, 16, 17, 22

3. Which student has the highest average mark?

Student       A    B    C
English       78   63   55
Chinese       80   85   72
Mathematics   59   71   95

4. The frequency distribution of the lengths of 100 leaves from a certain species of plant is given below:

Length (mm)   Frequency
20 – 24       6
25 – 29       10
30 – 34       18
35 – 39       25
40 – 44       22
45 – 49       15
50 – 54       4


5. The following table shows the distribution of heights of 50 students:

Height (cm)   Frequency
160 – 164     8
165 – 169     12
170 – 174     14
175 – 179     7
180 – 184     6
185 – 189     3

Find the range and standard deviation of heights.

6. The mean of one set of six numbers is 9 and the mean of a second set of eight numbers is 12.5. Calculate the mean of the combined set of fourteen numbers.

7. The mean of the numbers a, b, c, d is 8 and the mean of the numbers a, b, c, d, e, f, g is 11. What is the mean of the numbers e, f, g?

8. Find the mean and standard deviation of the 5 numbers in terms of x: x – 5, x – 3, x – 2, x + 1, x + 4.

9. The mean of the five numbers 6, 9, 2, x, y is 5 and the standard deviation is $\sqrt{6}$. Find the values of x and y.

Answer:
1. a) 8.16; 8.3; undefined; 0.52
   b) 7.16; 7.2; 7.2; 0.11
   c) 7.78; 7.5; undefined; 0.65
2. 15; 14.5; 14
3. 72; 73; 74
4. 37.4
5. 30; 7.14
6. 11
7. 15
8. x – 1; 3.16
9. (3, 5); (5, 3)
