Measures of Dispersion


Transcript of Measures of Dispersion

Page 1: Measures of Dispersion

1

Measures of Dispersion or Variation

Dr. Vijay Kumar

Page 2: Measures of Dispersion

2

Dispersion: The degree to which data tend to spread about an average value is called the variation or dispersion of the data.

Measures of Dispersion: Techniques used to measure the extent of variation, that is, the deviation of each value in the data set from a measure of central tendency (mean, median).

Page 3: Measures of Dispersion

3

Why do we need measures of variation?

• To determine the reliability of an average.

• To serve as a basis for the control of variability.

• To compare two or more series with regard to their variability.

• To facilitate the use of other statistical measures

(correlation analysis, the testing of hypothesis, the

analysis of fluctuations, techniques of production

control, cost control etc.)

Page 4: Measures of Dispersion

4

Properties of a good measure of dispersion

• It should be simple to understand.

• It should be easy to compute.

• It should be rigidly defined.

• It should be based on each and every observation of the distribution.

• It should be capable of further algebraic treatment.

• It should have sampling stability.

• It should not be unduly affected by extreme values.

Page 5: Measures of Dispersion

5

Methods of studying variation / dispersion

1. The Range

2. The Interquartile Range

3. The Average Deviation

4. The Standard Deviation

5. The Lorenz Curve

Page 6: Measures of Dispersion

6

Absolute measures of variation

Absolute measures of variations are expressed in the

same statistical unit in which the original data are

given such as rupees, kilograms, meters, tonnes etc.

These values may be used to compare the variation

in two or more than two distributions provided the

variables are expressed in the same units and have

almost the same average value.

Page 7: Measures of Dispersion

7

Relative measures of variations

When the two data sets are expressed in different

units, for example – quintals of sugar versus tonnes

of sugarcane, or if the average value is very much

different, such as manager’s salary versus worker’s

salary, then relative measures of variations are

used.

A measure of relative variation is the ratio of a measure of absolute variation to an average. It is sometimes called a coefficient of variation, and it is a pure number, independent of the unit of measurement.

Page 8: Measures of Dispersion

8

1. The Range

Range = L - S

L = Largest Value and

S = Smallest Value

The relative measure corresponding to the range is the coefficient of range:

Coefficient of range = $\frac{L - S}{L + S}$
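The range L - S and its relative counterpart (L - S)/(L + S) can be sketched in a few lines of Python. The function names and the sample prices below are illustrative assumptions, not part of the original slides.

```python
def value_range(values):
    """Absolute range: largest value minus smallest value (L - S)."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Relative range: (L - S) / (L + S). Undefined when L + S is zero."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


# Hypothetical daily share prices, in rupees.
daily_prices = [40, 42, 45, 47, 50]
print(value_range(daily_prices))           # 10
print(coefficient_of_range(daily_prices))  # 10 / 90, a unit-free ratio
```

Only the two extreme values enter either calculation, which is why the range is quick to compute but says nothing about the values in between.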

Page 9: Measures of Dispersion

9

Merits/Advantage of Range

• Range is the simplest to understand and the

easiest to compute, among all the methods of

studying variation.

• It takes minimum time to calculate the value of

range.

• For getting a quick rather than a very accurate

picture of variability, one may compute range.

Page 10: Measures of Dispersion

10

Demerits/Disadvantages of Range

• Range is not based on each and every observation

of the distribution.

• It is subject to fluctuations of considerable

magnitude from sample to sample.

• Range cannot be computed in the case of open-end distributions.

• Range tells us nothing about the character of the distribution between the two extreme observations.

Page 11: Measures of Dispersion

11

Uses of Range

• Quality Control

• Fluctuations in the share market

• Weather Forecasts

Page 12: Measures of Dispersion

12

2. Interquartile Range

The range which includes the middle 50% of the observations.

Interquartile range = $Q_3 - Q_1$

Semi-interquartile range or quartile deviation (Q.D.) = $\frac{Q_3 - Q_1}{2}$

The relative measure corresponding to this measure:

Coefficient of Q.D. = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$
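As a rough illustration, the interquartile range Q3 - Q1, the quartile deviation (Q3 - Q1)/2 and the coefficient of Q.D. (Q3 - Q1)/(Q3 + Q1) can be computed as below. The slides do not say how the quartiles are located; this sketch assumes the default "exclusive" method of Python's `statistics.quantiles`, and other conventions can give slightly different quartiles. The function name and data are hypothetical.

```python
from statistics import quantiles


def quartile_measures(values):
    """Return the IQR, the quartile deviation and the coefficient of Q.D."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return {
        "iqr": iqr,                    # Q3 - Q1
        "qd": iqr / 2,                 # (Q3 - Q1) / 2
        "coeff_qd": iqr / (q3 + q1),   # (Q3 - Q1) / (Q3 + Q1)
    }


# Hypothetical observations.
marks = [10, 20, 30, 40, 50, 60, 70]
print(quartile_measures(marks))
```

Replacing the largest value 70 with 700 leaves all three results unchanged, which illustrates why Q.D. is not affected by extreme observations.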

Page 13: Measures of Dispersion

13

Merits of Q.D.

• It is superior to range.

• Q.D. can be calculated in the case of open-end distributions.

• It is not affected by the presence of extreme observations.

Page 14: Measures of Dispersion

14

Demerits of Q.D.

• It ignores 50% of the observations.

• It is not capable of further mathematical manipulation.

• Its value is very much affected by sampling fluctuations.

• It is not really a measure of variation, as it does not show the scatter around an average but rather a distance on the scale; i.e. Q.D. is not itself measured from an average, but it is a positional average.

Page 15: Measures of Dispersion

15

3. The Mean Absolute Deviation / Average Deviation

A.D. is obtained by calculating the absolute deviations of each observation from the median or mean, and then averaging these deviations by taking their arithmetic mean.

Computation of Mean Absolute Deviation – Ungrouped Data

MAD (Mean) = $\frac{\sum |X - \bar{X}|}{N}$

Or MAD (Median) = $\frac{\sum |X - \text{Med.}|}{N}$
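A minimal sketch of the ungrouped calculation, averaging the absolute deviations taken either from the mean or from the median. The function name, the `center` parameter and the sample data are assumptions made for illustration.

```python
from statistics import mean, median


def mean_absolute_deviation(values, center="mean"):
    """Average of |X - c|, where c is the mean or the median of the data."""
    c = mean(values) if center == "mean" else median(values)
    return sum(abs(x - c) for x in values) / len(values)


# Hypothetical data with one fairly large value.
data = [1, 2, 3, 4, 10]
mad_mean = mean_absolute_deviation(data, center="mean")      # deviations from 4
mad_median = mean_absolute_deviation(data, center="median")  # deviations from 3
print(mad_mean, mad_median)
```

For this data the deviation about the median comes out smaller than the one about the mean, in line with the general result that the sum of absolute deviations is smallest when taken from the median.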

Page 16: Measures of Dispersion

16

Computation of Mean Absolute Deviation – Grouped Data

MAD (Mean) = $\frac{\sum f |X - \bar{X}|}{N}$

Or MAD (Median) = $\frac{\sum f |X - \text{Med.}|}{N}$

Where N is the sum of frequencies, i.e. $N = \sum f$

The relative measure corresponding to this measure:

Coefficient of MAD (Mean) = $\frac{\text{MAD}}{\bar{X}}$

Coefficient of MAD (Median) = $\frac{\text{MAD}}{\text{Med.}}$
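For grouped data each absolute deviation is weighted by its class frequency. The sketch below assumes that X stands for the class mid-point and takes deviations from the mean; the helper names and the frequency table are hypothetical.

```python
def grouped_mean(midpoints, freqs):
    """Arithmetic mean of a frequency distribution: sum(f * X) / N."""
    return sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)


def grouped_mad(midpoints, freqs, center):
    """Sum of f * |X - center| divided by N, where N = sum of frequencies."""
    n = sum(freqs)
    return sum(f * abs(x - center) for x, f in zip(midpoints, freqs)) / n


# Hypothetical classes 0-10, 10-20, 20-30, 30-40, represented by mid-points.
midpoints = [5, 15, 25, 35]
freqs = [2, 3, 4, 1]

x_bar = grouped_mean(midpoints, freqs)        # 190 / 10 = 19
mad = grouped_mad(midpoints, freqs, x_bar)    # 80 / 10 = 8
coefficient = mad / x_bar                     # relative measure, MAD / mean
print(x_bar, mad, coefficient)
```

Passing the median of the distribution as `center` gives MAD (Median) in the same way.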

Page 17: Measures of Dispersion

17

Merits of A.D.

• It is simple to understand and easy to compute.

• It is based on each and every observation of

the data.

• A.D. is less affected by the values of extreme

observations.

Page 18: Measures of Dispersion

18

Demerits of A.D.

• The greatest drawback is that algebraic signs are ignored.

• This method may not give very accurate results, because A.D. gives the best results when deviations are taken from the median.

• It is not capable of further algebraic treatment.

• It is rarely used in sociological and business studies.

Page 19: Measures of Dispersion

19

4. The Standard Deviation

It is the most widely used measure for studying variation. Its significance lies in the fact that it is free from the defects from which the earlier methods suffer, and it satisfies most of the properties of a good measure of variation.

It is a measure of how much spread or variability is

present in the sample.

Page 20: Measures of Dispersion

20

Standard Deviation is also known as Root Mean Square Deviation because it is the square root of the mean of the squared deviations from the arithmetic mean. Standard deviation is denoted by the small Greek letter $\sigma$ (read as sigma) and is defined as

$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$

If we square the standard deviation, we get the variance.

Hence variance = $\sigma^2$, or $\sigma = \sqrt{\text{var.}}$

Page 21: Measures of Dispersion

21

Calculation of SD- Ungrouped Data:-

(a) Deviations taken from Actual Mean:

$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$

Or

$\sigma = \sqrt{\frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2}$
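The two forms of the ungrouped formula, one built from deviations about the actual mean and one from the raw sums of x and x squared, are algebraically the same. The sketch below checks this numerically; the function names and the data are illustrative assumptions.

```python
from math import sqrt


def sd_from_deviations(values):
    """sigma = sqrt( sum((x - mean)^2) / N )."""
    n = len(values)
    x_bar = sum(values) / n
    return sqrt(sum((x - x_bar) ** 2 for x in values) / n)


def sd_from_raw_sums(values):
    """sigma = sqrt( sum(x^2)/N - (sum(x)/N)^2 )."""
    n = len(values)
    return sqrt(sum(x * x for x in values) / n - (sum(values) / n) ** 2)


# Hypothetical observations with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sd_from_deviations(data))  # squared deviations sum to 32, 32/8 = 4, root = 2
print(sd_from_raw_sums(data))    # 232/8 - 5^2 = 29 - 25 = 4, root = 2
```

Both functions divide by N, as the slides do. The raw-sums form avoids computing each deviation, but in floating-point arithmetic it can lose precision when the mean is large relative to the spread.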

Page 22: Measures of Dispersion

22

(b) Deviations taken from Assumed Mean:

$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$

Where $d = x - A$ and A is the assumed mean or arbitrary point.
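A short sketch of the assumed-mean shortcut, with d = x - A. The point of the example is that any choice of A gives the same standard deviation; the function name and data are hypothetical.

```python
from math import sqrt


def sd_assumed_mean(values, assumed_mean):
    """sigma = sqrt( sum(d^2)/N - (sum(d)/N)^2 ), with d = x - A."""
    n = len(values)
    d = [x - assumed_mean for x in values]
    return sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)


data = [2, 4, 4, 4, 5, 5, 7, 9]

# A = 4: sum(d) = 8, sum(d^2) = 40, so 40/8 - (8/8)^2 = 4 and sigma = 2.
print(sd_assumed_mean(data, assumed_mean=4))

# A different arbitrary point leads to the same sigma.
print(sd_assumed_mean(data, assumed_mean=6))
```

Choosing A close to the centre of the data keeps the deviations small, which is what makes the method convenient for hand calculation.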

Page 23: Measures of Dispersion

23

Calculation of SD – Grouped Data

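Deviations taken from Actual Mean:

$\sigma = \sqrt{\frac{\sum f (x - \bar{x})^2}{N}}$

$= \sqrt{\frac{\sum f x^2}{N} - \left(\frac{\sum f x}{N}\right)^2}$

Where f is the frequency and $N = \sum f$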
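For a frequency distribution each squared deviation is weighted by its frequency f, and N is the sum of the frequencies. The sketch below assumes that x is the class mid-point; the function names and the table are illustrative.

```python
from math import sqrt


def grouped_sd_deviations(midpoints, freqs):
    """sigma = sqrt( sum(f * (x - mean)^2) / N ), with N = sum(f)."""
    n = sum(freqs)
    x_bar = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return sqrt(sum(f * (x - x_bar) ** 2 for x, f in zip(midpoints, freqs)) / n)


def grouped_sd_raw_sums(midpoints, freqs):
    """sigma = sqrt( sum(f * x^2)/N - (sum(f * x)/N)^2 )."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))
    return sqrt(sum_fx2 / n - (sum_fx / n) ** 2)


# Hypothetical classes 0-10, 10-20, 20-30, 30-40.
midpoints = [5, 15, 25, 35]
freqs = [2, 3, 4, 1]

# Mean = 190/10 = 19; sum f(x - 19)^2 = 840; variance = 84.
print(grouped_sd_deviations(midpoints, freqs))
print(grouped_sd_raw_sums(midpoints, freqs))   # 4450/10 - 19^2 = 84 as well
```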

Page 24: Measures of Dispersion

24

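Deviations taken from Assumed Mean:

$\sigma = h \times \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}$

Where $d = \frac{x - A}{h}$, A is the assumed mean or arbitrary point, h is the class interval, and $N = \sum f$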
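The step-deviation shortcut for grouped data can be sketched as follows: deviations are measured from A in units of the class interval h, and the result is multiplied back by h. The function name and the table are assumptions for illustration, and equal class intervals are assumed.

```python
from math import sqrt


def grouped_sd_step_deviation(midpoints, freqs, assumed_mean, class_interval):
    """sigma = h * sqrt( sum(f*d^2)/N - (sum(f*d)/N)^2 ), with d = (x - A) / h."""
    n = sum(freqs)
    d = [(x - assumed_mean) / class_interval for x in midpoints]
    sum_fd = sum(f * v for v, f in zip(d, freqs))
    sum_fd2 = sum(f * v * v for v, f in zip(d, freqs))
    return class_interval * sqrt(sum_fd2 / n - (sum_fd / n) ** 2)


# Hypothetical classes 0-10, 10-20, 20-30, 30-40, so h = 10.
midpoints = [5, 15, 25, 35]
freqs = [2, 3, 4, 1]

# With A = 15: d = -1, 0, 1, 2; sum(fd) = 4; sum(fd^2) = 10.
# sigma = 10 * sqrt(10/10 - (4/10)^2) = 10 * sqrt(0.84) = sqrt(84).
print(grouped_sd_step_deviation(midpoints, freqs, assumed_mean=15, class_interval=10))
```

The value agrees with what the actual-mean formula gives for the same table, whichever mid-point is chosen as A.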

Page 25: Measures of Dispersion

25

Coefficient of Variation

The relative measure corresponding to S.D. is known as the coefficient of variation. It is the most commonly used measure of relative variation.

It is used in problems where we want to compare the variability of two or more series.

The coefficient of variation, denoted by C.V., is obtained as follows:

C.V. = $\frac{\sigma}{\bar{x}} \times 100$
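A small sketch of how C.V. = (sigma / mean) x 100 is used to compare two series. It relies on `statistics.pstdev`, the population standard deviation that divides by N as in the slides; the function name and both series are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """C.V. = (standard deviation / arithmetic mean) * 100, a unit-free percentage."""
    return pstdev(values) / mean(values) * 100


# Two hypothetical series with very different averages.
series_a = [2, 4, 4, 4, 5, 5, 7, 9]   # mean 5, sigma 2
series_b = [20, 22, 24, 26, 28]       # mean 24, sigma = sqrt(8)

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)
print(cv_a, cv_b)
```

Series B has the larger standard deviation, yet its C.V. is lower, so relative to its average it is the less variable (more consistent) series.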

Page 26: Measures of Dispersion

26

Mathematical Properties of S.D.

• We can obtain a combined S.D. for two or more groups.

• The sum of the squares of the deviations of all the observations from their arithmetic mean is a minimum. In other words, the sum of the squares of the deviations taken from any value other than the A.M. would always be greater.

• S.D. is independent of a change of origin but not of scale.
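These properties can be checked numerically. The slides do not give the combined S.D. formula; the sketch below assumes the standard two-group form, sigma_12 = sqrt( (N1(s1^2 + d1^2) + N2(s2^2 + d2^2)) / (N1 + N2) ), where d1 and d2 are the differences between each group mean and the combined mean. Function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def combined_sd(group_a, group_b):
    """Combined S.D. of two groups from their sizes, means and S.D.s."""
    n1, n2 = len(group_a), len(group_b)
    m1, m2 = mean(group_a), mean(group_b)
    s1, s2 = pstdev(group_a), pstdev(group_b)
    combined_mean = (n1 * m1 + n2 * m2) / (n1 + n2)
    d1, d2 = m1 - combined_mean, m2 - combined_mean
    return sqrt((n1 * (s1 ** 2 + d1 ** 2) + n2 * (s2 ** 2 + d2 ** 2)) / (n1 + n2))


def sum_of_squares_about(values, point):
    """Sum of squared deviations of the observations from an arbitrary point."""
    return sum((x - point) ** 2 for x in values)


a = [2, 4, 4, 4, 5, 5, 7, 9]
b = [10, 12, 14]

# Combined S.D. matches the S.D. of the pooled observations.
print(combined_sd(a, b), pstdev(a + b))

# The sum of squares is smallest about the arithmetic mean (5 for group a).
print(sum_of_squares_about(a, 5), sum_of_squares_about(a, 4), sum_of_squares_about(a, 6))

# Change of origin leaves S.D. unchanged; change of scale multiplies it.
print(pstdev([x + 100 for x in a]), pstdev([3 * x for x in a]))
```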

Page 27: Measures of Dispersion

27

Merits of S.D.

• S.D. is the best measure of variation because of its mathematical characteristics.

• It is based on every observation of the distribution.

• It is capable of further algebraic treatment.

• It is less affected by sampling fluctuations.

• For comparing the variability of two or more distributions, the coefficient of variation is considered most appropriate, and this measure is based on the mean and S.D.

Page 28: Measures of Dispersion

28

• S.D. is most prominently used in further

statistical work (Skewness, Correlation etc.)

• It is a key-note in sampling and provides a unit

of measurement for the Normal Distribution.

Page 29: Measures of Dispersion

29

Demerits of S.D.

• Compared to other measures, it is difficult to compute. However, this does not reduce the importance of the measure, because of the high degree of accuracy of the results it gives.

• It gives more weight (importance) to extreme

values and less to those which are near the mean.

Page 30: Measures of Dispersion

30

5. Lorenz Curve

• It is a graphic method of studying variation. It was devised by Max O. Lorenz (Economic Statistician). This curve was used by him for the first time to measure the distribution of wealth and income.

• Now the curve is also used to study the distribution of profits, wages, turnovers etc.

• The most common use of this curve is in the study of the degree of inequality in the distribution of income and wealth between countries or between different periods of time.

Page 31: Measures of Dispersion

31

It is a cumulative percentage curve in which the percentage of items is combined with the percentage of other things such as wealth, profits, turnovers etc.
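The points of such a cumulative percentage curve can be computed as in the sketch below. It assumes individual (ungrouped) incomes sorted from smallest to largest, and pairs the cumulative percentage of items with the cumulative percentage of total income; the function name and the data are hypothetical, and plotting is left out.

```python
def lorenz_points(incomes):
    """Return (cumulative % of items, cumulative % of income) pairs, starting at (0, 0)."""
    ordered = sorted(incomes)
    total = sum(ordered)
    count = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, value in enumerate(ordered, start=1):
        running += value
        points.append((100 * i / count, 100 * running / total))
    return points


# Hypothetical incomes of four persons.
for pct_items, pct_income in lorenz_points([40, 10, 30, 20]):
    print(pct_items, pct_income)
```

With perfectly equal incomes every point would lie on the diagonal where the two percentages are equal; the further the plotted curve bends away from that line of equal distribution, the greater the inequality.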