Measures of Central Tendency

MEASURES OF CENTRAL TENDENCY

• A single value that summarizes a set of data.

The following are the important measures of central location:

• The Arithmetic Mean
• The Median
• The Mode
• The Geometric Mean
• The Harmonic Mean

The Arithmetic Mean

• THE POPULATION MEAN:

• Many studies involve all the values in a population. For ungrouped data, the population mean is

$$\mu = \frac{\sum X}{N}$$

where N = number of items in the population and $\sum X$ = sum of the X values.

• Parameter: A characteristic of a population.

• Example: There are 30 companies in the city of Dhaka. Their profits (in lakh taka) in the year 2001-2002 are given below:

• 20, 22, 35, 42, 37, 42, 48, 53, 49, 65, 39, 48, 67, 18, 16, 23, 37, 35, 49, 63, 65, 55, 45, 58, 57, 69, 25, 29, 58, 65.

• What is the average profit of the companies?

• THE SAMPLE MEAN:

• Frequently we select a sample from the population in order to find something about a specific characteristic of the population. It may be expensive and time consuming to collect data from all the companies under consideration. Therefore, a sample of 5 companies might be selected and the mean of the five companies calculated in order to estimate the mean income of all the companies.

• The mean of a sample and the mean of a population are computed in the same way.

• The formula for the mean of a sample is:

$$\bar{X} = \frac{\sum X}{n}$$

where n = sample size.

• Statistic: A characteristic of a sample.

• Example: From our previous example, we take a sample of 5 companies as below: 65, 22, 48, 55, 29. Find the average income of the companies from the sample data.
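The two examples above (the 30 company profits and the sample of five) can be checked with a short Python sketch. The function and variable names are illustrative and not part of the original slides.

```python
# Profits (lakh taka) of all 30 companies, treated here as the population.
population = [20, 22, 35, 42, 37, 42, 48, 53, 49, 65,
              39, 48, 67, 18, 16, 23, 37, 35, 49, 63,
              65, 55, 45, 58, 57, 69, 25, 29, 58, 65]

# The five companies picked as the sample in the slides.
sample = [65, 22, 48, 55, 29]


def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


population_mean = arithmetic_mean(population)  # uses N = 30
sample_mean = arithmetic_mean(sample)          # uses n = 5

print(round(population_mean, 2))  # 44.47
print(round(sample_mean, 2))      # 43.8
```

The same function serves both cases, which mirrors the point that the population mean and the sample mean are computed in the same way.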

• Arithmetic Mean for Grouped Data:

• Quite often data on incomes, ages, and so on are grouped and presented in the form of a frequency distribution.

• The mean of a sample of data organized in a frequency distribution is computed by the following formula:

• Arithmetic Mean of Grouped data

$$\bar{X} = \frac{\sum fX}{n}$$

Where, f = frequency of each class

X = mid-point of each class

n = total number of frequencies

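Example:

• We organize the raw data from our previous example and present it in the form of a frequency distribution (we consider the data as a sample). Each mid-point is the average of the two class limits, for example (15 + 24)/2 = 19.5.

Profits (lakh taka)   Frequency, f   Mid-point, X   fX
15-24                 5              19.5           97.5
25-34                 2              29.5           59.0
35-44                 7              39.5           276.5
45-54                 6              49.5           297.0
55-64                 5              59.5           297.5
65-74                 5              69.5           347.5
Total                 30                            1375.0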
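The grouped-data formula can be sketched in Python using the frequency distribution of profits above. The class mid-point is assumed to be the average of the two stated class limits (for example 19.5 for the class 15-24); the names used are illustrative.

```python
# (lower limit, upper limit, frequency) for each profit class, in lakh taka.
classes = [(15, 24, 5), (25, 34, 2), (35, 44, 7),
           (45, 54, 6), (55, 64, 5), (65, 74, 5)]


def grouped_mean(classes):
    """Mean of grouped data: sum of f * X divided by n, X being the class mid-point."""
    n = sum(f for _, _, f in classes)
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fx / n


mean_grouped = grouped_mean(classes)
print(round(mean_grouped, 2))  # 45.83
```

The result differs slightly from the mean of the 30 raw values because every observation in a class is represented by the class mid-point.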

The Properties of the Arithmetic Mean:

• The arithmetic mean is a widely used measure of the central location. It has several important properties:

• Every set of interval-level data has a mean.

• All the values are included in computing the mean.

• A set of data has only one mean. The mean is unique.

• The mean is a useful measure for comparing two or more populations.

• The arithmetic mean is the only measure of central tendency where the sum of the deviations of each value from the mean will always be zero.

Expressed symbolically,

$$\sum (X - \bar{X}) = 0$$

As an example, the mean of 3, 4 and 8 is 5.

$$\sum (X - \bar{X}) = (3-5) + (4-5) + (8-5) = -2 - 1 + 3 = 0.$$
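The zero-sum property of the deviations can be verified for the values 3, 4 and 8 with a few lines of Python. This is only a numerical check of the example, not a proof.

```python
values = [3, 4, 8]

# Arithmetic mean of the values.
mean = sum(values) / len(values)

# Deviation of each value from the mean.
deviations = [x - mean for x in values]
total_deviation = sum(deviations)

print(deviations)       # [-2.0, -1.0, 3.0]
print(total_deviation)  # 0.0
```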

• The arithmetic mean has some disadvantages.

• The mean uses the value of every item in a sample, or population, in its computations. If one or two of these values are either extremely large or extremely small, the mean might not be an appropriate average to represent the data.

• Example: Suppose the annual profits of a small group of companies (in taka) are: 62900, 61600, 62500, 60800, and 1.2 million.

• The mean profit is 289560 taka. Obviously, it is not representative of this group, because all but one of the companies have a profit in the 60000 to 63000 taka range. The single profit of 1.2 million taka is unduly affecting the mean.

• The mean is also inappropriate if there is an open-ended class for data tallied into a frequency distribution.
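The effect of the extreme value in the profit example above can be illustrated by computing the mean with and without the 1.2 million figure. The cut-off of one million used to separate the extreme value is an arbitrary choice made only for this sketch.

```python
# Annual profits in taka; the last value is the extreme one.
profits = [62900, 61600, 62500, 60800, 1_200_000]

mean_all = sum(profits) / len(profits)

# Drop the extreme value (threshold chosen only for illustration).
typical = [p for p in profits if p < 1_000_000]
mean_typical = sum(typical) / len(typical)

print(mean_all)      # 289560.0
print(mean_typical)  # 61950.0
```

One value out of five moves the mean from about 62 thousand taka to almost 290 thousand taka.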

THE MEDIAN:

• For data containing one or two very large or very small values, the arithmetic mean may not be representative. The centre point of such data can be better described using a measure of central tendency called the median.

• Median: The midpoint of the values after they have been ordered from the smallest to the largest, or the largest to the smallest. Fifty percent of the observations are above the median and fifty percent below the median.

• Median for ungrouped data:

• When n is an odd number,

• Order the values in an ascending or descending manner.

• Median = $\frac{n+1}{2}$th item in the ordered series.

• Example: Find the median of the following values:

• 11, 9, 13, 4, 7

• Solution: First we arrange the data in ascending order as follows:

• 4, 7, 9, 11, 13

• Median = $\frac{n+1}{2}$th item = $\frac{5+1}{2}$th item = 3rd item. In the series the third item is 9. So the median value is 9.

• When n is an even number,

• Order the values in an ascending or descending manner.

• Median = Arithmetic mean of the $\frac{n}{2}$th item and the $\left(\frac{n}{2}+1\right)$th item in the ordered series.

• Example: Find the median of the following values: 11, 9, 13, 4, 7, 15

• Solution: First we arrange the data in ascending order as follows: 4, 7, 9, 11, 13, 15

• Median = Arithmetic mean of the $\frac{n}{2}$th item and the $\left(\frac{n}{2}+1\right)$th item in the ordered series

= Arithmetic mean of the $\frac{6}{2}$th item and the $\left(\frac{6}{2}+1\right)$th item

= Arithmetic mean of the 3rd item and the 4th item [3rd item = 9 and 4th item = 11]

$= \frac{9+11}{2} = 10$

So the median value is 10.
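The odd and even cases can be combined in one small Python function. This is a minimal sketch of the rule stated above; the function name is illustrative, and Python's standard library also offers `statistics.median` for the same purpose.

```python
def median(values):
    """Middle item for odd n; mean of the two middle items for even n."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Index mid (0-based) is the (n+1)/2-th item (1-based).
        return ordered[mid]
    # Indexes mid-1 and mid are the n/2-th and (n/2 + 1)-th items (1-based).
    return (ordered[mid - 1] + ordered[mid]) / 2


median_odd = median([11, 9, 13, 4, 7])
median_even = median([11, 9, 13, 4, 7, 15])

print(median_odd)   # 9
print(median_even)  # 10.0
```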

• The major properties of the median are:

• The median is unique; that is, like the mean, there is only one median for a set of data.

• It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values do occur.

• It can be computed for a frequency distribution with an open-ended class if the median does not lie in an open-ended class.

• It can be computed for ratio-level, interval-level, and ordinal-level data. Suppose five people rated the service of a commercial bank as follows: excellent, very good, good, fair and poor. The median response is "good". Half the responses are above "good" and the other half are below it.

MEDIAN FOR GROUPED DATA

• EXAMPLE: VEHICLE SELLING PRICES.

THE MODE

• Mode - The value of the observation that appears most frequently.

• The mode is especially useful in describing nominal and ordinal levels of measurement. We can determine the mode for all levels of data: nominal, ordinal, interval, and ratio.

• The mode also has the advantage of not being affected by extremely high or low values. Like the median, it can be used as a measure of central tendency for distributions with open-ended classes.

• The mode does have a number of disadvantages: For many sets of data, there is no mode because no value appears more than once. For example, there is no mode for this set of price data: tk.19, tk.21, tk.23, tk.20, and tk.18.

• For some data sets there is more than one mode. Suppose the ages of a group of people are 22, 27, 26, 27, 31, 35, and 35. Both the ages 27 and 35 are modes. This grouping of ages is referred to as bimodal (having two modes).
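Both situations described above (no mode and two modes) can be reproduced with a short Python sketch. Returning an empty list when no value repeats is a convention chosen here to match the slides; it is not how every library behaves.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); an empty list if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value appears once, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)


price_modes = modes([19, 21, 23, 20, 18])          # prices in taka
age_modes = modes([22, 27, 26, 27, 31, 35, 35])    # ages

print(price_modes)  # []
print(age_modes)    # [27, 35]
```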

Mode for grouped data:

• The mode is defined as the value that occurs most often. For data grouped into a frequency distribution, the mode can be approximated by the midpoint of the class with the largest frequency.

• Example: Done on the board.
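The board example is not reproduced in these slides. As a stand-in, the following Python sketch applies the midpoint rule to the frequency distribution of company profits used earlier for the grouped mean; reusing that table here is an assumption made only for illustration.

```python
# (lower limit, upper limit, frequency) for each profit class, in lakh taka.
classes = [(15, 24, 5), (25, 34, 2), (35, 44, 7),
           (45, 54, 6), (55, 64, 5), (65, 74, 5)]

# The modal class is the class with the largest frequency.
modal_class = max(classes, key=lambda c: c[2])
lower, upper, frequency = modal_class

# The mode is approximated by the midpoint of the modal class.
approx_mode = (lower + upper) / 2

print(modal_class)  # (35, 44, 7)
print(approx_mode)  # 39.5
```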

• Self-Review 3-3, page 75. Exercises: 16, 17, 18, 20, page 75.

• Self-Review 3-5, page 81. Exercises: 30, 31, 32, 33, 34, page 82.

• Self-Review 3-6, page 85.

• Exercises: 43, 45, 51, 52, 61, 63, pages 90-94.

THE GEOMETRIC MEAN

• The geometric mean is useful in finding the average of percentages, ratios, indexes, or growth rates. It has a wide application in business and economics because we are often interested in finding the percentage changes in sales, salaries, or economic figures, such as the Gross National Product.

• The geometric mean of a set of n positive numbers is defined as the nth root of the product of n values.

The formula is:

$$GM = \sqrt[n]{(X_1)(X_2)\cdots(X_n)}$$

All the data values must be positive to determine the geometric mean.

Example: The profits earned by a construction company on four recent projects were 3 percent, 2 percent, 4 percent, and 6 percent. What is the geometric mean profit?

$$GM = \sqrt[n]{(X_1)(X_2)\cdots(X_n)} = \sqrt[4]{(3)(2)(4)(6)} = \sqrt[4]{144} = 3.46$$

• The geometric mean profit is 3.46 percent.
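The calculation for the four project profits can be reproduced with a short Python sketch. The function name is illustrative, and `math.prod` requires Python 3.8 or later.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    return math.prod(values) ** (1 / len(values))


gm = geometric_mean([3, 2, 4, 6])  # profits in percent
print(round(gm, 2))  # 3.46
```

The explicit check for non-positive values reflects the requirement that all data values be positive.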

• A second application of the geometric mean is to find average percent increase over a period of time.

• The formula is: [ From the board.]

• Example: The population of a small village in 1990 was 2 persons, by 2000 it was 22. What is the average annual rate of percentage increase during the period?

• Solution: Done on the board.
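The formula and solution from the board are not reproduced in these slides. The sketch below assumes the commonly used form, rate = (value at end / value at start)^(1/n) - 1 with n the number of years in the period, which may differ in notation from what was written on the board. The function name is hypothetical.

```python
def average_annual_increase(start_value, end_value, years):
    """Average annual rate of increase, assuming compound growth over `years` periods."""
    return (end_value / start_value) ** (1 / years) - 1


# Village population: 2 persons in 1990, 22 persons in 2000 (10 years).
rate = average_annual_increase(2, 22, 2000 - 1990)
print(round(rate * 100, 1))  # 27.1 (percent per year)
```

Under this assumption, growing by about 27.1 percent each year for ten years takes the population from 2 to roughly 22.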

• Self-Review 3-4, page 79. Exercises: 21, 23, 25, 27, page 79.

POSITIONAL MEASURES OF LOCATION

• There are measures of location that divide a set of observations into equal parts. These measures include quartiles, deciles, and percentiles.

• Quartiles divide a set of observations into four equal parts. We call the middle value of a set of data arranged from smallest to largest the median. That is, 50 percent of the observations are larger than the median and 50 percent are smaller. The median is a measure of location because it pinpoints the center of the data.

• In a similar fashion quartiles divide a set of observations into four equal parts. The first quartile, usually labeled Q1, is the value below which 25 percent of the observations occur and the third quartile, usually labeled Q3, is the value below which 75 percent of the observations occur. Q2 is the median. The values corresponding to Q1, Q2, and Q3 divide a set of data into four equal parts.

• In a similar fashion deciles divide a set of observations into 10 equal parts and percentiles into 100 equal parts.

Computational Procedure:

• Let $L_p$ refer to the location of a desired percentile. So if we wanted to find the 33rd percentile we would use $L_{33}$, and if we wanted the median, the 50th percentile, then $L_{50}$.

• The number of observations is n, so if we want to locate the middle observation, its position is at (n+1)/2, or we could write this as (n+1)(P/100), where P is the desired percentile (here P = 50).

LOCATION OF A PERCENTILE,

$$L_p = (n+1)\frac{P}{100}$$

Example:

• Listed below are the commissions earned last month by a sample of 15 brokers:

$2,038 $1,758 $1,721 $1,637 $2,097 $2,047 $2,205 $1,787 $2,287 $1,940 $2,311 $2,054 $2,406 $1,471 $1,460.

• Locate the first quartile and the third quartile for the commissions earned.

• Solution: The first step is to organize the data from the smallest to the largest.

• (in $): 1460, 1471, 1637, 1721, 1758, 1787, 1940, 2038, 2047, 2054, 2097, 2205, 2287, 2311, 2406.

• Quartiles divide a set of observations into four equal parts. Hence 25 percent of the observations will be less than the first quartile (Q1).

• 75 percent of the observations will be less than the third quartile (Q3).

• To locate the first quartile, we use the location formula, where n = 15 and P = 25:

$$L_{25} = (n+1)\frac{P}{100} = (15+1)\frac{25}{100} = 4$$

• and to locate the third quartile, n = 15 and P = 75:

$$L_{75} = (n+1)\frac{P}{100} = (15+1)\frac{75}{100} = 12$$

Therefore the first and third quartile values are located at positions 4 and 12.

• The fourth value in the ordered array is $1,721 and the twelfth is $2,205. Q1 = $1,721 and Q3 = $2,205.

When n is an even number:

• Example: Suppose a data set contained the six values: 91, 75, 61, 101, 43, and 104.

• Locate the first quartile.

• Solution: We order the values from the smallest to the largest:

• 43, 61, 75, 91, 101, 104.

• The first quartile is located at

$$L_{25} = (n+1)\frac{P}{100} = (6+1)\frac{25}{100} = 1.75$$

• The position formula tells us that the first quartile is located between the first and the second value and that it is .75 of the distance between the first and the second values.

• The first value is 43 and the second is 61. So the distance (61-43) between these two values is 18. To locate the first quartile, we need to move .75 of the distance between the first and second values, so .75(18) = 13.5. We add 13.5 to the first value and the first quartile is: 43 + 13.5 = 56.5.
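The location formula and the interpolation step can be put together in one Python sketch that reproduces both the broker commissions example and the six-value example. The function name is illustrative, and clamping a location that falls before the first or after the last observation is an assumption added here; the slides do not discuss that case.

```python
def percentile(values, p):
    """Value at location (n + 1) * p / 100, interpolating between neighbours."""
    ordered = sorted(values)
    n = len(ordered)
    location = (n + 1) * p / 100      # 1-based position in the ordered data
    whole = int(location)             # position of the lower neighbour
    fraction = location - whole       # share of the distance to the next value
    if whole < 1:                     # assumption: clamp to the smallest value
        return ordered[0]
    if whole >= n:                    # assumption: clamp to the largest value
        return ordered[-1]
    lower = ordered[whole - 1]
    upper = ordered[whole]
    return lower + fraction * (upper - lower)


commissions = [2038, 1758, 1721, 1637, 2097, 2047, 2205, 1787,
               2287, 1940, 2311, 2054, 2406, 1471, 1460]

q1_commissions = percentile(commissions, 25)
q3_commissions = percentile(commissions, 75)
q1_six_values = percentile([91, 75, 61, 101, 43, 104], 25)

print(q1_commissions)  # 1721.0
print(q3_commissions)  # 2205.0
print(q1_six_values)   # 56.5
```

When the location is a whole number, as with positions 4 and 12 for the commissions, the fraction is zero and the function simply returns the observation at that position.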

• Self - Review: 4 - 8. Page - 124. Exercises: 35 - 38 Page-124.5.
