Measure of Variability (Dispersion, Spread) 1. Range 2. Inter-Quartile Range 3. Variance, standard deviation 4. Pseudo-standard deviation


Measure of Variability (Dispersion, Spread)

1. Range

2. Inter-Quartile Range

3. Variance, standard deviation

4. Pseudo-standard deviation


Measure of Central Location

1. Mean

2. Median


1. Range

R = Range = max - min

2. Inter-Quartile Range (IQR)

Inter-Quartile Range = IQR = Q3 - Q1


Example

The Verbal IQ data for n = 23 students, arranged in increasing order, are:

80 82 84 86 86 89 90 94 94 95 95 96 99 99 102 102 104 105 105 109 111 118 119

min = 80, Q1 = 89, Q2 = 96, Q3 = 105, max = 119


Range and IQR

Range = max – min = 119 – 80 = 39

Inter-Quartile Range = IQR = Q3 - Q1 = 105 - 89 = 16
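The range and IQR for the Verbal IQ data can be checked with a short Python sketch. It assumes the median-of-halves rule for the quartiles (Q1 is the median of the values below the middle position, Q3 the median of those above it); other quartile conventions exist and can give slightly different values. The helper name `quartiles` is illustrative only.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule (an assumed convention)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # values below the middle position
    upper = data[half + n % 2:]   # values above it (the median itself is skipped when n is odd)
    return median(lower), median(data), median(upper)

verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96,
             99, 99, 102, 102, 104, 105, 105, 109, 111, 118, 119]

q1, q2, q3 = quartiles(verbal_iq)
data_range = max(verbal_iq) - min(verbal_iq)
iqr = q3 - q1

print(q1, q2, q3)        # 89 96 105
print(data_range, iqr)   # 39 16
```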


3. Sample Variance

Let x1, x2, x3, … xn denote a set of n numbers.

Recall the mean of the n numbers is defined as:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$$

The numbers

$$d_1 = x_1 - \bar{x}, \quad d_2 = x_2 - \bar{x}, \quad d_3 = x_3 - \bar{x}, \quad \ldots, \quad d_n = x_n - \bar{x}$$

are called deviations from the mean.

The sum

$$\sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

is called the sum of squares of deviations from the mean.

Writing it out in full:

$$d_1^2 + d_2^2 + d_3^2 + \cdots + d_n^2$$

or

$$(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2$$

The Sample Variance

is defined as the quantity:

$$\frac{\sum_{i=1}^{n} d_i^2}{n-1} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$

and is denoted by the symbol $s^2$.

The Sample Standard Deviation s

Definition: The Sample Standard Deviation is defined by:

$$s = \sqrt{\frac{\sum_{i=1}^{n} d_i^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$$

Hence the Sample Standard Deviation, s, is the square root of the sample variance.

Example

Let x1, x2, x3, x4, x5 denote the set of 5 numbers in the following table.

i 1 2 3 4 5

xi 10 15 21 7 13

Then

$$\sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5 = 10 + 15 + 21 + 7 + 13 = 66$$

and

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{66}{5} = 13.2$$

The deviations from the mean d1, d2, d3, d4, d5 are given in the following table.

i                        1       2       3       4       5
x_i                      10      15      21      7       13
d_i = x_i - x̄            -3.2    1.8     7.8     -6.2    -0.2
d_i^2 = (x_i - x̄)^2      10.24   3.24    60.84   38.44   0.04

The sum

$$\sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 = (-3.2)^2 + (1.8)^2 + (7.8)^2 + (-6.2)^2 + (-0.2)^2$$

$$= 10.24 + 3.24 + 60.84 + 38.44 + 0.04 = 112.80$$

and

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{112.8}{4} = 28.2$$

Also the standard deviation is:

$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{112.8}{4}} = \sqrt{28.2} = 5.31$$
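As a check on the arithmetic of this example, here is a minimal Python sketch that follows the same steps: mean, deviations, sum of squared deviations, sample variance (divisor n - 1) and standard deviation. Variable names are illustrative.

```python
from math import sqrt

x = [10, 15, 21, 7, 13]
n = len(x)

mean = sum(x) / n                          # 66 / 5 = 13.2
deviations = [xi - mean for xi in x]       # d_i = x_i - mean
sum_sq = sum(d ** 2 for d in deviations)   # sum of squares of deviations
variance = sum_sq / (n - 1)                # sample variance s^2
s = sqrt(variance)                         # sample standard deviation

print(round(mean, 2), round(sum_sq, 2), round(variance, 2), round(s, 2))
# 13.2 112.8 28.2 5.31
```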


Interpretations of s

• In Normal distributions

– Approximately 2/3 of the observations will lie within one standard deviation of the mean

– Approximately 95% of the observations lie within two standard deviations of the mean

– In a histogram of the Normal distribution, the standard deviation is approximately the distance from the mode to the inflection point

[Figure: Normal density curve with the mode and the inflection point marked; the horizontal distance between them is s.]

[Figure: Normal density curve with the region within s on either side of the mean marked as 2/3 of the area.]

[Figure: Normal density curve marked at a distance 2s from the mean.]

Example

A researcher collected data on 1500 males aged 60-65.

The variables measured were cholesterol and blood pressure.

– The mean blood pressure was 155 with a standard deviation of 12.

– The mean cholesterol level was 230 with a standard deviation of 15

– In both cases the data was normally distributed


Interpretation of these numbers

• Blood pressure levels vary about the value 155 in males aged 60-65.

• Cholesterol levels vary about the value 230 in males aged 60-65.

• Approximately 2/3 of males aged 60-65 have blood pressure within 12 of 155, i.e. between 155 - 12 = 143 and 155 + 12 = 167.

• Approximately 2/3 of males aged 60-65 have cholesterol within 15 of 230, i.e. between 230 - 15 = 215 and 230 + 15 = 245.

• Approximately 95% of males aged 60-65 have blood pressure within 2(12) = 24 of 155, i.e. between 155 - 24 = 131 and 155 + 24 = 179.

• Approximately 95% of males aged 60-65 have cholesterol within 2(15) = 30 of 230, i.e. between 230 - 30 = 200 and 230 + 30 = 260.
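The intervals above, and the "2/3" and "95%" figures themselves, can be reproduced with a small Python sketch. It assumes an exactly Normal distribution and uses `statistics.NormalDist` from the standard library (Python 3.8+); the helper names `interval` and `coverage` are illustrative.

```python
from statistics import NormalDist

def interval(mean, sd, k):
    """Return the interval mean +/- k standard deviations."""
    return mean - k * sd, mean + k * sd

def coverage(k):
    """Proportion of a Normal distribution lying within k standard deviations of its mean."""
    z = NormalDist()  # standard Normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

print(interval(155, 12, 1), interval(155, 12, 2))   # blood pressure: (143, 167) (131, 179)
print(interval(230, 15, 1), interval(230, 15, 2))   # cholesterol:    (215, 245) (200, 260)
print(round(coverage(1), 3), round(coverage(2), 3)) # 0.683 0.954
```

The exact Normal proportions are about 68% and 95.4%, which is why "2/3" and "95%" are described as approximate.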

A Computing formula for:

Sum of squares of deviations from the mean:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2$$

The difficulty with this formula is that $\bar{x}$ will often have many decimals.

The result will be that each term in the above sum will also have many decimals.

The sum of squares of deviations from the mean can also be computed using the following identity:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$

To use this identity we need to compute:

$$\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n$$

and

$$\sum_{i=1}^{n} x_i^2 = x_1^2 + x_2^2 + \cdots + x_n^2$$

Then:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$

and

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1}$$

and

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1}}$$

Example

The Verbal IQ data for n = 23 students, arranged in increasing order, are:

80 82 84 86 86 89 90 94

94 95 95 96 99 99 102 102

104 105 105 109 111 118 119

$$\sum_{i=1}^{n} x_i = 80 + 82 + 84 + 86 + 86 + 89 + 90 + 94 + 94 + 95 + 95 + 96 + 99 + 99 + 102 + 102 + 104 + 105 + 105 + 109 + 111 + 118 + 119 = 2244$$

$$\sum_{i=1}^{n} x_i^2 = 80^2 + 82^2 + 84^2 + 86^2 + 86^2 + 89^2 + 90^2 + 94^2 + 94^2 + 95^2 + 95^2 + 96^2 + 99^2 + 99^2 + 102^2 + 102^2 + 104^2 + 105^2 + 105^2 + 109^2 + 111^2 + 118^2 + 119^2 = 221494$$

Then:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} = 221494 - \frac{(2244)^2}{23} = 2557.652$$

You will obtain exactly the same answer if you use the left hand side of the equation.

and

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1} = \frac{221494 - \frac{(2244)^2}{23}}{22} = \frac{2557.652}{22} = 116.26$$

Also

$$s = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1}} = \sqrt{\frac{2557.652}{22}} = \sqrt{116.26} = 10.782$$
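The claim that the computing formula and the definition give the same sum of squares can be checked numerically. The following Python sketch computes both for the Verbal IQ data; variable names are illustrative. One caveat worth keeping in mind: in floating-point arithmetic the shortcut can lose precision when the mean is very large relative to the spread, so the deviation form is usually the safer choice in software.

```python
from math import sqrt

verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96,
             99, 99, 102, 102, 104, 105, 105, 109, 111, 118, 119]
n = len(verbal_iq)

sum_x = sum(verbal_iq)                    # 2244
sum_x2 = sum(v * v for v in verbal_iq)    # 221494

# Right hand side of the identity (computing formula)
ss_shortcut = sum_x2 - sum_x ** 2 / n

# Left hand side of the identity (definition)
mean = sum_x / n
ss_direct = sum((v - mean) ** 2 for v in verbal_iq)

variance = ss_shortcut / (n - 1)
s = sqrt(variance)

print(sum_x, sum_x2)
print(round(ss_shortcut, 3), round(ss_direct, 3))   # 2557.652 2557.652
print(round(variance, 2), round(s, 3))              # 116.26 10.782
```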


A quick (rough) calculation of s

$$s \approx \frac{\text{Range}}{4}$$

The reason for this is that approximately all (95%) of the observations are between

$$\bar{x} - 2s \quad \text{and} \quad \bar{x} + 2s$$

Thus

$$\max \approx \bar{x} + 2s \quad \text{and} \quad \min \approx \bar{x} - 2s$$

and

$$\text{Range} = \max - \min \approx (\bar{x} + 2s) - (\bar{x} - 2s) = 4s$$

Hence

$$s \approx \frac{\text{Range}}{4}$$

Example

Verbal IQ on n = 23 students: min = 80 and max = 119

$$s \approx \frac{119 - 80}{4} = \frac{39}{4} = 9.75$$

This compares with the exact value of s which is 10.782. The rough method is useful for checking your calculation of s.

The Pseudo Standard Deviation (PSD)

Definition: The Pseudo Standard Deviation (PSD) is defined by:

$$PSD = \frac{IQR}{1.35} = \frac{\text{InterQuartile Range}}{1.35}$$

Properties

• For Normal distributions the magnitude of the pseudo standard deviation (PSD) and the standard deviation (s) will be approximately the same value

• For leptokurtic distributions the standard deviation (s) will be larger than the pseudo standard deviation (PSD)

• For platykurtic distributions the standard deviation (s) will be smaller than the pseudo standard deviation (PSD)
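The divisor 1.35 can be motivated by the first property: for an exactly Normal distribution the interquartile range is about 1.349 standard deviations, so dividing the IQR by 1.35 approximately recovers the standard deviation. The Python sketch below checks this with `statistics.NormalDist` (Python 3.8+); the helper name `psd` is illustrative, and the quartiles 89 and 105 are those of the Verbal IQ data given earlier.

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal: mean 0, standard deviation 1

# IQR of the standard Normal distribution, in units of its standard deviation
iqr_in_sd_units = z.inv_cdf(0.75) - z.inv_cdf(0.25)

def psd(q1, q3):
    """Pseudo standard deviation: IQR / 1.35."""
    return (q3 - q1) / 1.35

print(round(iqr_in_sd_units, 3))   # 1.349
print(round(psd(89, 105), 2))      # 11.85
```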


Example

Verbal IQ on n = 23 students

Inter-Quartile Range = IQR = Q3 - Q1 = 105 - 89 = 16

Pseudo standard deviation

$$PSD = \frac{IQR}{1.35} = \frac{16}{1.35} = 11.85$$

This compares with the standard deviation

$$s = 10.782$$

• An outlier is a “wild” observation in the data

• Outliers occur because

– of errors (typographical and computational)

– Extreme cases in the population

• We will now consider the drawing of box-plots where outliers are identified

Box-whisker Plots showing outliers


To Draw a Box Plot we need to:

• Compute the Hinge (Median, Q2) and the Mid-hinges (first & third quartiles – Q1 and Q3 )

• To identify outliers we will compute the inner and outer fences


The fences are like the fences at a prison. We expect the entire population to be within both sets of fences.

If a member of the population is between the inner and outer fences it is a mild outlier.

If a member of the population is outside of the outer fences it is an extreme outlier.


Lower outer fence

F1 = Q1 - (3)IQR

Upper outer fence

F2 = Q3 + (3)IQR


Lower inner fence

f1 = Q1 - (1.5)IQR

Upper inner fence

f2 = Q3 + (1.5)IQR

• Observations that are between the lower and upper inner fences are considered to be non-outliers.

• Observations that are outside the inner fences but not outside the outer fences are considered to be mild outliers.

• Observations that are outside the outer fences are considered to be extreme outliers.
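The fence rules above can be sketched in Python as follows. The function names `fences` and `classify` are illustrative, the quartiles in the usage lines (Q1 = 10, Q3 = 20) are made-up values chosen only to make the arithmetic easy, and treating a value that lies exactly on a fence as being inside it is an assumption, since the slides do not say how ties are handled.

```python
def fences(q1, q3):
    """Return the inner and outer fences for the given quartiles."""
    iqr = q3 - q1
    return {
        "inner": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "outer": (q1 - 3 * iqr, q3 + 3 * iqr),
    }

def classify(x, q1, q3):
    """Classify one observation as a non-outlier, mild outlier or extreme outlier."""
    f = fences(q1, q3)
    lower_inner, upper_inner = f["inner"]
    lower_outer, upper_outer = f["outer"]
    if x < lower_outer or x > upper_outer:
        return "extreme outlier"
    if x < lower_inner or x > upper_inner:
        return "mild outlier"
    return "non-outlier"

# Hypothetical quartiles: IQR = 10, inner fences (-5, 35), outer fences (-20, 50)
print(fences(10, 20))
for value in (30, 40, 60):
    print(value, classify(value, 10, 20))
# 30 non-outlier
# 40 mild outlier
# 60 extreme outlier
```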


• mild outliers are plotted individually in a box-plot using the symbol

• extreme outliers are plotted individually in a box-plot using the symbol

• non-outliers are represented with the box and whiskers with

– Max = largest observation within the fences

– Min = smallest observation within the fences

[Figure: box-whisker plot with the inner fences and an outer fence marked, showing mild outliers, an extreme outlier, and the box-whisker plot representing the data that are not outliers.]


Example

Data collected on n = 109 countries in 1995.

Data collected on k = 25 variables.


The variables

1. Population Size (in 1000s)

2. Density = Number of people/Sq kilometer

3. Urban = percentage of population living in cities

4. Religion

5. lifeexpf = Average female life expectancy

6. lifeexpm = Average male life expectancy


7. literacy = % of population who read

8. pop_inc = % increase in popn size (1995)

9. babymort = Infant mortality (deaths per 1000)

10. gdp_cap = Gross domestic product/capita

11. Region = Region or economic group

12. calories = Daily calorie intake.

13. aids = Number of aids cases

14. birth_rt = Birth rate per 1000 people


15. death_rt = death rate per 1000 people

16. aids_rt = Number of aids cases/100000 people

17. log_gdp = log10(gdp_cap)

18. log_aidsr = log10(aids_rt)

19. b_to_d = birth to death ratio

20. fertility = average number of children in family

21. log_pop = log10(population)


22. cropgrow = ??

23. lit_male = % of males who can read

24. lit_fema = % of females who can read

25. Climate = predominant climate


The data file as it appears in SPSS


Consider the data on infant mortality

Stem-Leaf diagram: stem = 10s, leaf = unit digit

 0 | 4455555666666666777778888899
 1 | 0122223467799
 2 | 0001123555577788
 3 | 45567999
 4 | 135679
 5 | 011222347
 6 | 03678
 7 | 4556679
 8 | 5
 9 | 4
10 | 1569
11 | 0022378
12 | 46
13 | 7
14 |
15 |
16 | 8

Summary Statistics

median = Q2 = 27

Quartiles

Lower quartile = Q1 = the median of lower half

Upper quartile = Q3 = the median of upper half

$$Q_1 = \frac{12 + 12}{2} = 12, \qquad Q_3 = \frac{66 + 67}{2} = 66.5$$

Interquartile range (IQR)

IQR = Q3 - Q1 = 66.5 - 12 = 54.5

The Outer Fences

lower = Q1 - 3(IQR) = 12 - 3(54.5) = -151.5

upper = Q3 + 3(IQR) = 66.5 + 3(54.5) = 230.0

No observations are outside of the outer fences

The Inner Fences

lower = Q1 - 1.5(IQR) = 12 - 1.5(54.5) = -69.75

upper = Q3 + 1.5(IQR) = 66.5 + 1.5(54.5) = 148.25

Only one observation (168, Afghanistan) is outside of the inner fences (mild outlier)

[Figure: Box-Whisker Plot of Infant Mortality; horizontal axis: Infant Mortality, 0 to 200.]

Example 2

In this example we are looking at the weight gains (grams) for rats under six diets differing in level of protein (High or Low) and source of protein (Beef, Cereal, or Pork).

– Ten test animals for each diet

Table: Gains in weight (grams) for rats under six diets differing in level of protein (High or Low) and source of protein (Beef, Cereal, or Pork)

Level       High Protein               Low Protein
Source      Beef     Cereal   Pork     Beef     Cereal   Pork
Diet        1        2        3        4        5        6
            73       98       94       90       107      49
            102      74       79       76       95       82
            118      56       96       90       97       73
            104      111      98       64       80       86
            81       95       102      86       98       81
            107      88       102      51       74       97
            100      82       108      72       74       106
            87       77       91       90       67       70
            117      86       120      95       89       61
            111      92       105      78       58       82
Median      103.0    87.0     100.0    82.0     84.5     81.5
Mean        100.0    85.9     99.5     79.2     83.9     78.7
IQR         24.0     18.0     11.0     18.0     23.0     16.0
PSD         17.78    13.33    8.15     13.33    17.04    11.85
Variance    229.11   225.66   119.17   192.84   246.77   273.79
Std. Dev.   15.14    15.02    10.92    13.89    15.71    16.55
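The summary rows of this table can be recomputed from the raw weight gains. The Python sketch below does so with the standard `statistics` module, assuming the median-of-halves rule for the quartiles and PSD = IQR / 1.35 as defined earlier; the helper name `iqr` and the dictionary layout are illustrative. Under those assumptions the PSD for diet 6 comes out as 16 / 1.35, about 11.85.

```python
from statistics import mean, median, variance, stdev

diets = {
    1: [73, 102, 118, 104, 81, 107, 100, 87, 117, 111],   # High protein, Beef
    2: [98, 74, 56, 111, 95, 88, 82, 77, 86, 92],         # High protein, Cereal
    3: [94, 79, 96, 98, 102, 102, 108, 91, 120, 105],     # High protein, Pork
    4: [90, 76, 90, 64, 86, 51, 72, 90, 95, 78],          # Low protein, Beef
    5: [107, 95, 97, 80, 98, 74, 74, 67, 89, 58],         # Low protein, Cereal
    6: [49, 82, 73, 86, 81, 97, 106, 70, 61, 82],         # Low protein, Pork
}

def iqr(values):
    """Interquartile range using the median-of-halves rule (an assumed convention)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    return median(data[half + n % 2:]) - median(data[:half])

summary = {}
for diet, gains in diets.items():
    summary[diet] = {
        "median": median(gains),
        "mean": mean(gains),
        "iqr": iqr(gains),
        "psd": iqr(gains) / 1.35,
        "variance": variance(gains),   # sample variance, divisor n - 1
        "sd": stdev(gains),
    }

for diet, row in summary.items():
    print(diet, {name: round(value, 2) for name, value in row.items()})
```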

[Figure: Box Plots: Weight Gains for Six Diets. Vertical axis: Weight Gain, 40 to 130. Horizontal axis: Diet 1 to 6, grouped as High Protein and Low Protein, each with sources Beef, Cereal and Pork. Legend: Non-Outlier Max, Non-Outlier Min; 75%, 25%; Median.]

Conclusions

• Weight gain is higher for the high protein meat diets

• Increasing the level of protein increases weight gain, but only if the source of protein is a meat source


Measures of Shape

Measures of Shape

• Skewness

• Kurtosis

[Figure: six density curves. Skewness panels: Positively skewed, Symmetric, Negatively skewed. Kurtosis panels: Platykurtic, Normal (mesokurtic), Leptokurtic.]

• Measure of Skewness: based on the sum of cubes

$$\sum_{i=1}^{n} (x_i - \bar{x})^3$$

• Measure of Kurtosis: based on the sum of 4th powers

$$\sum_{i=1}^{n} (x_i - \bar{x})^4$$

The Measure of Skewness

$$g_1 = \frac{\sqrt{n}\, \sum_{i=1}^{n} (x_i - \bar{x})^3}{\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right]^{3/2}}$$

The Measure of Kurtosis

$$g_2 = \frac{n \sum_{i=1}^{n} (x_i - \bar{x})^4}{\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right]^{2}} - 3$$

The 3 is subtracted so that g2 is zero for the normal distribution
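A minimal Python sketch of the two shape measures follows. It assumes the moment-based forms g1 = sqrt(n) * S3 / S2^(3/2) and g2 = n * S4 / S2^2 - 3, where Sk denotes the sum of k-th powers of the deviations from the mean; statistical packages often apply additional small-sample corrections, so their output may differ slightly. The function name `shape_measures` and the example data are illustrative.

```python
from math import sqrt

def shape_measures(values):
    """Return (g1, g2): moment-based skewness and excess kurtosis (assumed forms)."""
    n = len(values)
    mean = sum(values) / n
    s2 = sum((v - mean) ** 2 for v in values)   # sum of squared deviations
    s3 = sum((v - mean) ** 3 for v in values)   # sum of cubed deviations
    s4 = sum((v - mean) ** 4 for v in values)   # sum of 4th powers of deviations
    g1 = sqrt(n) * s3 / s2 ** 1.5
    g2 = n * s4 / s2 ** 2 - 3
    return g1, g2

symmetric = [1, 2, 3, 4, 5]        # made-up symmetric data
right_skewed = [1, 1, 1, 10]       # made-up data with a long right tail

print(shape_measures(symmetric))      # g1 is 0 for symmetric data; g2 is negative here
print(shape_measures(right_skewed))   # g1 is positive
```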

Interpretations of Measures of Shape

• Skewness: g1 > 0 (positively skewed), g1 = 0 (symmetric), g1 < 0 (negatively skewed)

• Kurtosis: g2 < 0 (platykurtic), g2 = 0 (Normal, mesokurtic), g2 > 0 (leptokurtic)

[Figure: the six density curves shown earlier, labelled with the corresponding signs of g1 and g2.]

Descriptive techniques for Multivariate data

In most research situations data is collected on more than one variable (usually many variables)


Graphical Techniques

• The scatter plot

• The two dimensional Histogram


The Scatter Plot

For two variables X and Y we will have a measurement for each variable on each case:

xi, yi

xi = the value of X for case i

and

yi = the value of Y for case i.


To Construct a scatter plot we plot the points:

(xi, yi)

for each case on the X-Y plane.

[Figure: the point (xi, yi) plotted on the X-Y plane, with xi marked on the horizontal axis and yi on the vertical axis.]

Data Set #3

The following table gives data on Verbal IQ, Math IQ, Initial Reading Achievement Score, and Final Reading Achievement Score for 23 students who have recently completed a reading improvement program.

Student   Verbal IQ   Math IQ   Initial Reading Achievement   Final Reading Achievement
1         86          94        1.1                           1.7
2         104         103       1.5                           1.7
3         86          92        1.5                           1.9
4         105         100       2.0                           2.0
5         118         115       1.9                           3.5
6         96          102       1.4                           2.4
7         90          87        1.5                           1.8
8         95          100       1.4                           2.0
9         105         96        1.7                           1.7
10        84          80        1.6                           1.7
11        94          87        1.6                           1.7
12        119         116       1.7                           3.1
13        82          91        1.2                           1.8
14        80          93        1.0                           1.7
15        109         124       1.8                           2.5
16        111         119       1.4                           3.0
17        89          94        1.6                           1.8
18        99          117       1.6                           2.6
19        94          93        1.4                           1.4
20        99          110       1.4                           2.0
21        95          97        1.5                           1.3
22        102         104       1.7                           3.1
23        102         93        1.6                           1.9

[Figure: Scatter Plot of Math IQ (vertical axis) against Verbal IQ (horizontal axis), both axes running from 0 to 140.]

[Figure: the same Scatter Plot with the point (84, 80) labelled.]

[Figure: Scatter Plot of Math IQ against Verbal IQ with both axes rescaled to run from 60 to 130.]

Some Scatter Patterns

[Figure: two scatter plots; horizontal axis 40 to 140, vertical axis -100 to 250.]

• Circular

• No relationship between X and Y

• Unable to predict Y from X

[Figure: two scatter plots; horizontal axis 40 to 140, vertical axis 0 to 160.]

• Ellipsoidal

• Positive relationship between X and Y

• Increases in X correspond to increases in Y (but not always)

• Major axis of the ellipse has positive slope

[Figure: scatter plot; horizontal axis 40 to 140, vertical axis 0 to 160.]

Example

Verbal IQ, Math IQ

[Figure: Scatter Plot of Math IQ (vertical axis) against Verbal IQ (horizontal axis), both axes running from 60 to 130.]

Some More Patterns

[Figure: two scatter plots; horizontal axis 40 to 140, vertical axis 0 to 140.]

• Ellipsoidal (thinner ellipse)

• Stronger positive relationship between X and Y

• Increases in X correspond to increases in Y (more frequently)

• Major axis of the ellipse has positive slope

• Minor axis of the ellipse much smaller

[Figure: scatter plot; horizontal axis 40 to 140, vertical axis 0 to 140.]

• Increased strength in the positive relationship between X and Y

• Increases in X correspond to increases in Y (almost always)

• Minor axis of the ellipse extremely small in relationship to the Major axis of the ellipse.

[Figure: two scatter plots; horizontal axis 40 to 140, vertical axis 0 to 140.]

• Perfect positive relationship between X and Y

• Y perfectly predictable from X

• Data falls exactly along a straight line with positive slope

[Figure: two scatter plots; horizontal axis 40 to 140, vertical axis 0 to 140.]

• Ellipsoidal

• Negative relationship between X and Y

• Increases in X correspond to decreases in Y (but not always)

• Major axis of the ellipse has negative slope

[Figure: scatter plot; horizontal axis 40 to 140, vertical axis 0 to 140.]

• The strength of the relationship can increase until changes in Y can be perfectly predicted from X

[Figure: five scatter plots; horizontal axis 40 to 140, vertical axis 0 to 140.]

Some Non-Linear Patterns

Page 107: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

[Figure: plot of a non-linear pattern]

Page 108: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

[Figure: plot of a non-linear pattern]

Page 109: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

• In a linear pattern, Y increases with respect to X at a constant rate

• In a non-linear pattern, the rate at which Y increases with respect to X is variable
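The distinction can be checked numerically. The sketch below uses two made-up series (a straight line and a parabola, neither taken from the slides) and a hypothetical helper `rates_of_change` that computes the slope between consecutive points.

```python
def rates_of_change(x, y):
    """Slope between each pair of consecutive points."""
    return [(y[i + 1] - y[i]) / (x[i + 1] - x[i]) for i in range(len(x) - 1)]

# Illustrative data only (not from the slides).
x = [0, 1, 2, 3, 4]
linear = [3 + 2 * xi for xi in x]    # a linear pattern
quadratic = [xi ** 2 for xi in x]    # one example of a non-linear pattern

linear_rates = rates_of_change(x, linear)        # the same slope at every step
quadratic_rates = rates_of_change(x, quadratic)  # the slope changes from step to step

print(linear_rates)
print(quadratic_rates)
```

For the linear series every step has the same slope, while for the quadratic series the slope grows as X grows.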

Page 110: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

Growth Patterns

Page 111: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

[Figure: plot of a growth pattern]

Page 112: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

[Figure: two plots of growth patterns]

Page 113: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

• Growth patterns frequently follow a sigmoid curve

• Growth at the start is slow

• It then speeds up

• It slows down again as it reaches its limiting size

[Figure: sigmoid growth curve]
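The slides do not say which sigmoid curve is meant. As a hedged illustration, the sketch below assumes the logistic curve, one common sigmoid, with made-up parameter values (`limit`, `rate`, `midpoint` are all assumptions), and looks at how much growth occurs in each time step.

```python
import math

def logistic(t, limit=100.0, rate=0.3, midpoint=25.0):
    """Logistic growth curve: one common sigmoid, assumed here for illustration."""
    return limit / (1.0 + math.exp(-rate * (t - midpoint)))

times = list(range(0, 51, 5))
sizes = [logistic(t) for t in times]

# Growth over each 5-unit step: small at the start, largest near the
# midpoint, and small again as the size approaches its limit.
increments = [later - earlier for earlier, later in zip(sizes, sizes[1:])]

for t, inc in zip(times[1:], increments):
    print(f"growth up to t = {t:2d}: {inc:6.2f}")
```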

Page 114: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

Measures of strength of a relationship (Correlation)

• Pearson’s correlation coefficient (r)

• Spearman’s rank correlation coefficient (rho, ρ)

Page 115: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

Assume that we have collected data on two variables X and Y. Let

(x1, y1) (x2, y2) (x3, y3) … (xn, yn)

denote the pairs of measurements on the two variables X and Y for n cases in a sample (or population).

Page 116: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

From this data we can compute summary statistics for each variable.

The means

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

and

$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}$$

Page 117: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

The standard deviations

$$s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

and

$$s_y = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}}$$

Page 118: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

These statistics ($\bar{x}$, $\bar{y}$, $s_x$, $s_y$):

• give information for each variable separately

but

• give no information about the relationship between the two variables

Page 119: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

Consider the statistics:

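$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

$$S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$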
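As a minimal sketch, the three statistics can be computed directly from their definitions. The helper name `sums_of_squares` and the five data points are illustrative assumptions, not part of the slides.

```python
def sums_of_squares(x, y):
    """Return (S_xx, S_yy, S_xy) computed directly from the definitions."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xx, s_yy, s_xy

# Illustrative data only (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

s_xx, s_yy, s_xy = sums_of_squares(x, y)
print(s_xx, s_yy, s_xy)
```

Note that S_xx and S_yy are sums of squares and so can never be negative, while S_xy is a sum of products and can take either sign.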

Page 120: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

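The first two statistics:

$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 \quad \text{and} \quad S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

• are used to measure variability in each variable

• they are used to compute the sample standard deviations

$$s_x = \sqrt{\frac{S_{xx}}{n - 1}} \qquad s_y = \sqrt{\frac{S_{yy}}{n - 1}}$$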

Page 121: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

The third statistic:

$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

• is used to measure correlation

• If two variables are positively related, the sign of $x_i - \bar{x}$ will agree with the sign of $y_i - \bar{y}$

Page 122: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

• When $x_i - \bar{x}$ is positive, $y_i - \bar{y}$ will be positive.

• When $x_i$ is above its mean, $y_i$ will be above its mean

• When $x_i - \bar{x}$ is negative, $y_i - \bar{y}$ will be negative.

• When $x_i$ is below its mean, $y_i$ will be below its mean

The product $(x_i - \bar{x})(y_i - \bar{y})$ will be positive for most cases.

Page 123: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

This implies that the statistic

$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

• will be positive

• Most of the terms in this sum will be positive

Page 124: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

On the other hand

• If two variables are negatively related, the sign of $x_i - \bar{x}$ will be opposite to the sign of $y_i - \bar{y}$

Page 125: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

• When $x_i - \bar{x}$ is positive, $y_i - \bar{y}$ will be negative.

• When $x_i$ is above its mean, $y_i$ will be below its mean

• When $x_i - \bar{x}$ is negative, $y_i - \bar{y}$ will be positive.

• When $x_i$ is below its mean, $y_i$ will be above its mean

The product $(x_i - \bar{x})(y_i - \bar{y})$ will be negative for most cases.

Page 126: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

This again implies that the statistic

$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

• will be negative

• Most of the terms in this sum will be negative
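The sign argument for both the positive and the negative case can be checked on a small example. The sketch below uses made-up data and a hypothetical helper `product_signs`. It counts how many of the products of deviations are positive or negative and returns their sum.

```python
def product_signs(x, y):
    """Count positive and negative terms (x_i - x_bar)(y_i - y_bar) and return their sum."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    products = [(xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)]
    positive = sum(1 for p in products if p > 0)
    negative = sum(1 for p in products if p < 0)
    return positive, negative, sum(products)

# Illustrative data only (not from the slides).
x = [1, 2, 3, 4, 5, 6]
y_up = [2, 1, 4, 3, 6, 5]    # tends to rise with x
y_down = [5, 6, 3, 4, 1, 2]  # tends to fall with x

pos_up, neg_up, s_xy_up = product_signs(x, y_up)
pos_down, neg_down, s_xy_down = product_signs(x, y_down)

print(pos_up, neg_up, s_xy_up)
print(pos_down, neg_down, s_xy_down)
```

In the rising series most, but not all, products are positive and the sum is positive. In the falling series the counts and the sign of the sum are reversed.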

Page 127: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

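Pearson’s correlation coefficient is defined as below:

$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$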
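A minimal sketch of this definition in Python follows. The function name `pearson_r` and the sample data are assumptions made for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson's r = S_xy / sqrt(S_xx * S_yy), computed from the definitions."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / math.sqrt(s_xx * s_yy)

# Illustrative data only (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 4))
```

The function assumes that neither variable is constant. If every x value (or every y value) were the same, the denominator would be zero and r would be undefined.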

Page 128: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

The denominator:

$$\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}$$

is always positive

Page 129: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

The numerator:

$$\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

• is positive if there is a positive relationship between X and Y and

• negative if there is a negative relationship between X and Y.

• This property carries over to Pearson’s correlation coefficient r

Page 130: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

Properties of Pearson’s correlation coefficient r

1. The value of r is always between –1 and +1.

2. If the relationship between X and Y is positive, then r will be positive.

3. If the relationship between X and Y is negative, then r will be negative.

4. If there is no linear relationship between X and Y, then r will be zero or close to zero.

5. The value of r will be +1 if the points (xi, yi) lie on a straight line with positive slope.

6. The value of r will be –1 if the points (xi, yi) lie on a straight line with negative slope.
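Properties 5 and 6 can be checked with a short sketch. The helper `pearson_r` and all data below are illustrative assumptions. The last example is an addition to the slides: it shows that r can be zero even when Y is completely determined by X, because r measures only linear association.

```python
import math

def pearson_r(x, y):
    """Pearson's r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / math.sqrt(s_xx * s_yy)

# Illustrative data only (not from the slides).
x = [1, 2, 3, 4, 5]
r_rising = pearson_r(x, [2 * xi + 1 for xi in x])     # points on a line with positive slope
r_falling = pearson_r(x, [10 - 3 * xi for xi in x])   # points on a line with negative slope

# A symmetric parabola: Y depends on X exactly, but not linearly.
x_sym = [-2, -1, 0, 1, 2]
r_parabola = pearson_r(x_sym, [xi ** 2 for xi in x_sym])

print(r_rising, r_falling, r_parabola)
```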

Page 131: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

[Figure: scatter plot with r = 1]

Page 132: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

[Figure: scatter plot with r = 0.95]

Page 133: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

[Figure: scatter plot with r = 0.7]

Page 134: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

[Figure: scatter plot with r = 0.4]

Page 135: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

[Figure: scatter plot with r = 0]

Page 136: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

[Figure: scatter plot with r = -0.4]

Page 137: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

[Figure: scatter plot with r = -0.7]

Page 138: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

[Figure: scatter plot with r = -0.8]

Page 139: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

[Figure: scatter plot with r = -0.95]

Page 140: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

[Figure: scatter plot with r = -1]

Page 141: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

Computing formulae for the statistics:

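$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

$$S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$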

Page 142: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

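$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$

$$S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$$

$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$$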

Page 143: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

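To compute $S_{xx}$, $S_{yy}$, $S_{xy}$

first compute

$$A = \sum_{i=1}^{n} x_i \qquad B = \sum_{i=1}^{n} y_i \qquad C = \sum_{i=1}^{n} x_i^2 \qquad D = \sum_{i=1}^{n} y_i^2 \qquad E = \sum_{i=1}^{n} x_i y_i$$

Then

$$S_{xx} = C - \frac{A^2}{n} \qquad S_{yy} = D - \frac{B^2}{n} \qquad S_{xy} = E - \frac{A B}{n}$$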
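The recipe with the five sums A, B, C, D and E can be sketched as follows. The function names and the data are illustrative assumptions. The second function recomputes the same statistics from the definitions so that the two routes can be compared.

```python
def shortcut_sums(x, y):
    """(S_xx, S_yy, S_xy) from the five running sums A, B, C, D, E."""
    n = len(x)
    A = sum(x)
    B = sum(y)
    C = sum(xi * xi for xi in x)
    D = sum(yi * yi for yi in y)
    E = sum(xi * yi for xi, yi in zip(x, y))
    return C - A * A / n, D - B * B / n, E - A * B / n

def definition_sums(x, y):
    """(S_xx, S_yy, S_xy) from the deviations about the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return (sum((xi - x_bar) ** 2 for xi in x),
            sum((yi - y_bar) ** 2 for yi in y),
            sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)))

# Illustrative data only (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

shortcut = shortcut_sums(x, y)
direct = definition_sums(x, y)
print(shortcut, direct)
```

The shortcut needs only one pass over the data, which made it convenient for hand or calculator work. In floating-point arithmetic it can lose precision when the values are large relative to their spread, so the definitional form is often preferred in software.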

Page 144: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

Example

Verbal IQ, Math IQ

Page 145: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

Data Set #3

The following table gives data on Verbal IQ, Math IQ, Initial Reading Achievement Score, and Final Reading Achievement Score for 23 students who have recently completed a reading improvement program.

Student  Verbal IQ  Math IQ  Initial Reading Achievement  Final Reading Achievement
 1        86         94      1.1                          1.7
 2       104        103      1.5                          1.7
 3        86         92      1.5                          1.9
 4       105        100      2.0                          2.0
 5       118        115      1.9                          3.5
 6        96        102      1.4                          2.4
 7        90         87      1.5                          1.8
 8        95        100      1.4                          2.0
 9       105         96      1.7                          1.7
10        84         80      1.6                          1.7
11        94         87      1.6                          1.7
12       119        116      1.7                          3.1
13        82         91      1.2                          1.8
14        80         93      1.0                          1.7
15       109        124      1.8                          2.5
16       111        119      1.4                          3.0
17        89         94      1.6                          1.8
18        99        117      1.6                          2.6
19        94         93      1.4                          1.4
20        99        110      1.4                          2.0
21        95         97      1.5                          1.3
22       102        104      1.7                          3.1
23       102         93      1.6                          1.9

Page 146: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

Scatter Plot

[Figure: scatter plot of Math IQ (vertical axis) against Verbal IQ (horizontal axis)]

Page 147: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

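Now

$$\sum_{i=1}^{n} x_i = 2244 \qquad \sum_{i=1}^{n} y_i = 2307$$

$$\sum_{i=1}^{n} x_i^2 = 221494 \qquad \sum_{i=1}^{n} y_i^2 = 234363 \qquad \sum_{i=1}^{n} x_i y_i = 227199$$

Hence

$$S_{xx} = 221494 - \frac{(2244)^2}{23} = 2557.652$$

$$S_{yy} = 234363 - \frac{(2307)^2}{23} = 2960.87$$

$$S_{xy} = 227199 - \frac{(2244)(2307)}{23} = 2116.043$$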

Page 148: Measure of Variability (Dispersion, Spread) 1.Range 2.Inter-Quartile Range 3.Variance, standard deviation 4.Pseudo-standard deviation.

Thus Pearson’s correlation coefficient is:

$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{2116.043}{\sqrt{(2557.652)(2960.87)}} = 0.769$$
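The whole calculation can be reproduced in a few lines of Python. The sketch below takes the Verbal IQ and Math IQ columns as transcribed from Data Set #3. The variable names are assumptions, and x is taken to be Verbal IQ and y to be Math IQ.

```python
import math

# Verbal IQ (x) and Math IQ (y) for the 23 students, transcribed from Data Set #3.
verbal = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
          82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]
math_iq = [94, 103, 92, 100, 115, 102, 87, 100, 96, 80, 87, 116,
           91, 93, 124, 119, 94, 117, 93, 110, 97, 104, 93]

n = len(verbal)
A = sum(verbal)                                   # sum of x
B = sum(math_iq)                                  # sum of y
C = sum(x * x for x in verbal)                    # sum of x squared
D = sum(y * y for y in math_iq)                   # sum of y squared
E = sum(x * y for x, y in zip(verbal, math_iq))   # sum of x times y

s_xx = C - A * A / n
s_yy = D - B * B / n
s_xy = E - A * B / n
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"S_xx = {s_xx:.3f}, S_yy = {s_yy:.3f}, S_xy = {s_xy:.3f}, r = {r:.3f}")
```

The five sums and the resulting r agree with the values on the slides to the precision shown there.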