Measures of Variability
Transcript of Measures of Variability
![Page 1: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/1.jpg)
Measures of Variability
[Figure: density curve illustrating variability; horizontal axis from 0 to 25, vertical axis from 0 to 0.14]
![Page 2: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/2.jpg)
Measures of Variability (Dispersion, Spread)
• Variance, standard deviation
• Range
• Inter-Quartile Range
• Pseudo-standard deviation
![Page 3: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/3.jpg)
Range
![Page 4: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/4.jpg)
Range
Definition
Let min = the smallest observation
Let max = the largest observation
Then Range = max - min
[Figure: density curve with the Range marked along the horizontal axis (0 to 25)]
![Page 5: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/5.jpg)
Inter-Quartile Range (IQR)
![Page 6: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/6.jpg)
Inter-Quartile Range (IQR)
Definition
Let Q1 = the first quartile,
Q3 = the third quartile
Then the Inter-Quartile Range
= IQR = Q3 - Q1
![Page 7: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/7.jpg)
[Figure: density curve with Q1 and Q3 marked; 25% of the area lies below Q1, 50% between Q1 and Q3, and 25% above Q3. The Inter-Quartile Range is the distance from Q1 to Q3.]
![Page 8: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/8.jpg)
Example
The Verbal IQ scores of n = 23 students, arranged in increasing order, are:
80 82 84 86 86 89 90 94 94 95 95 96 99 99 102 102 104 105 105 109 111 118 119
![Page 9: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/9.jpg)
Example
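The Verbal IQ scores of n = 23 students, arranged in increasing order, are:
80 82 84 86 86 89 90 94 94 95 95 96 99 99 102 102 104 105 105 109 111 118 119
min = 80, Q1 = 89, Q2 = 96, Q3 = 105, max = 119
With n = 23, the median Q2 is the 12th observation; Q1 and Q3 are the medians of the 11 observations below and above it (the 6th and 18th observations).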
![Page 10: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/10.jpg)
Range
Range = max – min = 119 – 80 = 39
Inter-Quartile Range = IQR = Q3 - Q1 = 105 – 89 = 16
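The range and IQR above can be checked with a short Python sketch. It assumes the standard library's `statistics.quantiles` with its default "exclusive" method, which for these 23 values picks the 6th, 12th and 18th observations; other quartile conventions can give slightly different values on other data sets. The variable names are illustrative only.

```python
import statistics

# Verbal IQ scores of the n = 23 students from the example
verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96,
             99, 99, 102, 102, 104, 105, 105, 109, 111, 118, 119]

# Range = max - min
data_range = max(verbal_iq) - min(verbal_iq)

# Quartiles: n=4 splits the data into four groups and returns Q1, Q2, Q3.
# The default method is "exclusive"; this is one convention among several.
q1, q2, q3 = statistics.quantiles(verbal_iq, n=4)

# Inter-Quartile Range = Q3 - Q1
iqr = q3 - q1

print("Range:", data_range)
print("Q1, Q2, Q3:", q1, q2, q3)
print("IQR:", iqr)
```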
![Page 11: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/11.jpg)
Some Comments
• The range and the inter-quartile range are relatively easy to compute.
• The range is slightly easier to compute than the inter-quartile range.
• The range is very sensitive to outliers (extreme observations).
![Page 12: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/12.jpg)
Variance and Standard Deviation
![Page 13: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/13.jpg)
Sample Variance
Let x1, x2, x3, … xn denote a set of n numbers.
Recall the mean of the n numbers is defined as:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + x_3 + \dots + x_{n-1} + x_n}{n}$$
![Page 14: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/14.jpg)
The numbers

$$d_1 = x_1 - \bar{x}$$
$$d_2 = x_2 - \bar{x}$$
$$d_3 = x_3 - \bar{x}$$
$$\vdots$$
$$d_n = x_n - \bar{x}$$

are called deviations from the mean.
![Page 15: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/15.jpg)
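The sum

$$\sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

is called the sum of squares of deviations from the mean. Writing it out in full:

$$d_1^2 + d_2^2 + d_3^2 + \dots + d_n^2$$

or

$$(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2$$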
![Page 16: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/16.jpg)
The Sample Variance

The sample variance is defined as the quantity:

$$\frac{\sum_{i=1}^{n} d_i^2}{n-1} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$

and is denoted by the symbol $s^2$.

Comment
One might think that the divisor in the variance should be n. A divisor of n - 1 is used instead because it yields an estimator with a particular desirable property: unbiasedness.
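The unbiasedness property can be illustrated on a toy case. The sketch below is an assumption-laden illustration, not part of the original slides: it treats five numbers as a hypothetical complete population, enumerates every ordered sample of size 2 drawn with replacement, and averages the two candidate variance estimates. Exact fractions are used so the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import product


def variance(values, divisor_offset):
    """Sum of squared deviations from the mean, divided by (n - divisor_offset)."""
    n = len(values)
    mean = Fraction(sum(values), n)
    sum_sq = sum((Fraction(v) - mean) ** 2 for v in values)
    return sum_sq / (n - divisor_offset)


# Hypothetical tiny population (illustrative values only)
population = [10, 15, 21, 7, 13]
pop_var = variance(population, 0)  # population variance uses divisor N

# Every ordered sample of size 2, drawn with replacement (25 samples)
samples = list(product(population, repeat=2))

# Average of the estimates over all equally likely samples
avg_n_minus_1 = sum(variance(s, 1) for s in samples) / len(samples)
avg_n = sum(variance(s, 0) for s in samples) / len(samples)

print("Population variance:       ", float(pop_var))
print("Average with divisor n - 1:", float(avg_n_minus_1))
print("Average with divisor n:    ", float(avg_n))
```

Under these assumptions the divisor n - 1 averages out to the population variance, while the divisor n averages to a smaller value.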
![Page 17: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/17.jpg)
Example
Let x1, x2, x3, x4, x5 denote the set of 5 numbers in the following table.

| i | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| $x_i$ | 10 | 15 | 21 | 7 | 13 |
![Page 18: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/18.jpg)
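Then

$$\sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5 = 10 + 15 + 21 + 7 + 13 = 66$$

and

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + x_3 + \dots + x_{n-1} + x_n}{n} = \frac{66}{5} = 13.2$$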
![Page 19: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/19.jpg)
The deviations from the mean d1, d2, d3, d4, d5 are given in the following table.
| i | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| $x_i$ | 10 | 15 | 21 | 7 | 13 |
| $d_i$ | -3.2 | 1.8 | 7.8 | -6.2 | -0.2 |
![Page 20: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/20.jpg)
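The sum

$$\sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 = (-3.2)^2 + (1.8)^2 + (7.8)^2 + (-6.2)^2 + (-0.2)^2$$

$$= 10.24 + 3.24 + 60.84 + 38.44 + 0.04 = 112.80$$

and

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{112.8}{4} = 28.2$$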
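The steps of this example (mean, deviations, sum of squares, variance) can be traced in a few lines of Python. This is a minimal sketch with illustrative variable names; the final line cross-checks the result against the standard library's `statistics.variance`, which also uses the divisor n - 1.

```python
import statistics

x = [10, 15, 21, 7, 13]
n = len(x)

# Mean of the n numbers
mean = sum(x) / n

# Deviations from the mean: d_i = x_i - mean
deviations = [xi - mean for xi in x]

# Sum of squares of deviations from the mean
sum_sq = sum(d ** 2 for d in deviations)

# Sample variance: divide by n - 1
s2 = sum_sq / (n - 1)

print("mean:", mean)
print("deviations:", [round(d, 1) for d in deviations])
print("sum of squares:", round(sum_sq, 2))
print("sample variance:", round(s2, 2))
print("statistics.variance:", statistics.variance(x))
```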
![Page 21: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/21.jpg)
The Sample Standard Deviation s
Definition: The Sample Standard Deviation is defined by:

$$s = \sqrt{\frac{\sum_{i=1}^{n} d_i^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$$

Hence the Sample Standard Deviation, s, is the square root of the sample variance.
![Page 22: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/22.jpg)
In the last example

$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{112.8}{4}} = \sqrt{28.2} = 5.31$$
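As a quick check of this value, the sketch below takes the square root of the sample variance and compares it with the standard library's `statistics.stdev`. Variable names are illustrative.

```python
import math
import statistics

x = [10, 15, 21, 7, 13]

# Sample variance (divisor n - 1), then its square root
s2 = statistics.variance(x)
s = math.sqrt(s2)

# statistics.stdev computes the sample standard deviation directly
s_direct = statistics.stdev(x)

print("s from variance:", round(s, 2))
print("s from stdev:   ", round(s_direct, 2))
```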
![Page 23: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/23.jpg)
Interpretations of s
• In Normal distributions
– Approximately 2/3 of the observations will lie within one standard deviation of the mean
– Approximately 95% of the observations lie within two standard deviations of the mean
– In a histogram of the Normal distribution, the standard deviation is approximately the distance from the mode to the inflection point
![Page 24: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/24.jpg)
[Figure: Normal curve (horizontal axis 0 to 25) with the mode and the inflection point marked; s is the distance between them]
![Page 25: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/25.jpg)
[Figure: Normal curve; about 2/3 of the area lies within a distance s on either side of the mean]
![Page 26: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/26.jpg)
[Figure: Normal curve with a distance of 2s marked]
![Page 27: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/27.jpg)
Example
A researcher collected data on 1500 males aged 60-65. The variables measured were cholesterol and blood pressure.
– The mean blood pressure was 155 with a standard deviation of 12.
– The mean cholesterol level was 230 with a standard deviation of 15.
– In both cases the data were normally distributed.
![Page 28: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/28.jpg)
Interpretation of these numbers
• Blood pressure levels vary about the value 155 in males aged 60-65.
• Cholesterol levels vary about the value 230 in males aged 60-65.
![Page 29: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/29.jpg)
• About 2/3 of males aged 60-65 have blood pressure within 12 of 155, i.e. between 155 - 12 = 143 and 155 + 12 = 167.
• About 2/3 of males aged 60-65 have cholesterol within 15 of 230, i.e. between 230 - 15 = 215 and 230 + 15 = 245.
![Page 30: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/30.jpg)
• About 95% of males aged 60-65 have blood pressure within 2(12) = 24 of 155, i.e. between 155 - 24 = 131 and 155 + 24 = 179.
• About 95% of males aged 60-65 have cholesterol within 2(15) = 30 of 230, i.e. between 230 - 30 = 200 and 230 + 30 = 260.
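The interval arithmetic in this example can be sketched as a small helper. The function name `interval_about_mean` is hypothetical, and the 2/3 and 95% labels apply only under the Normal-distribution assumption stated in the example.

```python
def interval_about_mean(mean, sd, k):
    """Return the interval (mean - k*sd, mean + k*sd)."""
    return (mean - k * sd, mean + k * sd)


# Blood pressure: mean 155, standard deviation 12
bp_one_sd = interval_about_mean(155, 12, 1)    # about 2/3 of observations
bp_two_sd = interval_about_mean(155, 12, 2)    # about 95% of observations

# Cholesterol: mean 230, standard deviation 15
chol_one_sd = interval_about_mean(230, 15, 1)  # about 2/3 of observations
chol_two_sd = interval_about_mean(230, 15, 2)  # about 95% of observations

print("Blood pressure:", bp_one_sd, bp_two_sd)
print("Cholesterol:   ", chol_one_sd, chol_two_sd)
```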
![Page 31: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/31.jpg)
A computing formula for the sum of squares of deviations from the mean:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2$$

The difficulty with this formula is that $\bar{x}$ will often have many decimals. As a result, each term in the above sum will also have many decimals.
![Page 32: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/32.jpg)
The sum of squares of deviations from the mean can also be computed using the following identity:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$
![Page 33: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/33.jpg)
To use this identity we need to compute:

$$\sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n$$

and

$$\sum_{i=1}^{n} x_i^2 = x_1^2 + x_2^2 + \dots + x_n^2$$
![Page 34: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/34.jpg)
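Then:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$

and

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1}$$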
![Page 35: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/35.jpg)
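and

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1}}$$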
![Page 36: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/36.jpg)
Example
The Verbal IQ scores of n = 23 students, arranged in increasing order, are:
80 82 84 86 86 89 90 94 94 95 95 96 99 99 102 102 104 105 105 109 111 118 119
![Page 37: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/37.jpg)
$$\sum_{i=1}^{n} x_i = 80 + 82 + 84 + 86 + 86 + 89 + 90 + 94 + 94 + 95 + 95 + 96 + 99 + 99 + 102 + 102 + 104 + 105 + 105 + 109 + 111 + 118 + 119 = 2244$$

$$\sum_{i=1}^{n} x_i^2 = 80^2 + 82^2 + 84^2 + 86^2 + 86^2 + 89^2 + 90^2 + 94^2 + 94^2 + 95^2 + 95^2 + 96^2 + 99^2 + 99^2 + 102^2 + 102^2 + 104^2 + 105^2 + 105^2 + 109^2 + 111^2 + 118^2 + 119^2 = 221494$$
![Page 38: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/38.jpg)
Then:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} = 221494 - \frac{2244^2}{23} = 2557.652$$
![Page 39: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/39.jpg)
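and

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1} = \frac{221494 - \frac{2244^2}{23}}{22} = \frac{2557.652}{22} = 116.26$$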
![Page 40: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/40.jpg)
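Also

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1}} = \sqrt{\frac{221494 - \frac{2244^2}{23}}{22}} = \sqrt{\frac{2557.652}{22}} = \sqrt{116.26} = 10.782$$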
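The computing formula applied to the Verbal IQ data can be sketched as follows. Variable names are illustrative. One caveat not discussed in the slides: in floating-point arithmetic this shortcut subtracts two large, nearly equal numbers, so for data with very large values the deviation-based form may be numerically safer.

```python
import math

verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96,
             99, 99, 102, 102, 104, 105, 105, 109, 111, 118, 119]
n = len(verbal_iq)

# The two sums required by the identity
sum_x = sum(verbal_iq)
sum_x2 = sum(v * v for v in verbal_iq)

# Sum of squares of deviations via the computing formula
sum_sq_dev = sum_x2 - sum_x ** 2 / n

# Sample variance and sample standard deviation
s2 = sum_sq_dev / (n - 1)
s = math.sqrt(s2)

print("sum of x:", sum_x)
print("sum of x squared:", sum_x2)
print("sum of squared deviations:", round(sum_sq_dev, 3))
print("s^2:", round(s2, 2))
print("s:", round(s, 3))
```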
![Page 41: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/41.jpg)
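A quick (rough) calculation of s

$$s \approx \frac{\text{Range}}{4}$$

The reason for this is that approximately all (95%) of the observations are between $\bar{x} - 2s$ and $\bar{x} + 2s$. Thus

$$\max \approx \bar{x} + 2s \quad \text{and} \quad \min \approx \bar{x} - 2s$$

and

$$\text{Range} = \max - \min \approx (\bar{x} + 2s) - (\bar{x} - 2s) = 4s$$

Hence

$$s \approx \frac{\text{Range}}{4}$$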
![Page 42: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/42.jpg)
Example

Verbal IQ on n = 23 students: min = 80 and max = 119

$$s \approx \frac{119 - 80}{4} = \frac{39}{4} = 9.75$$

This compares with the exact value of s, which is 10.782. The rough method is useful for checking your calculation of s.
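A minimal sketch of this check, comparing the rough Range/4 estimate with the exact sample standard deviation from the standard library. The helper name `rough_sd` is hypothetical.

```python
import statistics


def rough_sd(values):
    """Rough estimate of s: Range / 4."""
    return (max(values) - min(values)) / 4


verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96,
             99, 99, 102, 102, 104, 105, 105, 109, 111, 118, 119]

rough = rough_sd(verbal_iq)
exact = statistics.stdev(verbal_iq)

print("rough s:", rough)
print("exact s:", round(exact, 3))
```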
![Page 43: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/43.jpg)
The Pseudo Standard Deviation (PSD)
![Page 44: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/44.jpg)
The Pseudo Standard Deviation (PSD)
Definition: The Pseudo Standard Deviation (PSD) is defined by:

$$\text{PSD} = \frac{\text{IQR}}{1.35} = \frac{\text{Inter-Quartile Range}}{1.35}$$
![Page 45: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/45.jpg)
Properties
• For Normal distributions the magnitude of the pseudo standard deviation (PSD) and the standard deviation (s) will be approximately the same value
• For leptokurtic distributions the standard deviation (s) will be larger than the pseudo standard deviation (PSD)
• For platykurtic distributions the standard deviation (s) will be smaller than the pseudo standard deviation (PSD)
![Page 46: Measures of Variability](https://reader036.fdocuments.in/reader036/viewer/2022062305/56815aa2550346895dc828c6/html5/thumbnails/46.jpg)
Example
Verbal IQ on n = 23 students

Inter-Quartile Range = IQR = Q3 - Q1 = 105 - 89 = 16

Pseudo standard deviation

$$\text{PSD} = \frac{\text{IQR}}{1.35} = \frac{16}{1.35} = 11.85$$

This compares with the standard deviation

$$s = 10.782$$
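The comparison can be sketched in Python. The helper name `pseudo_sd` is hypothetical, and the quartiles come from `statistics.quantiles` with its default "exclusive" method, which matches Q1 = 89 and Q3 = 105 for this data set; a different quartile convention could change the PSD slightly on other data.

```python
import statistics


def pseudo_sd(values):
    """Pseudo standard deviation: IQR / 1.35."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return (q3 - q1) / 1.35


verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96,
             99, 99, 102, 102, 104, 105, 105, 109, 111, 118, 119]

psd = pseudo_sd(verbal_iq)
s = statistics.stdev(verbal_iq)

print("PSD:", round(psd, 2))
print("s:  ", round(s, 3))
```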