MEASURES OF VARIABILITY
• Variance
  – Population variance
  – Sample variance
• Standard Deviation
  – Population standard deviation
  – Sample standard deviation
• Coefficient of Variation (CV)
  – Sample CV
  – Population CV
MEASURES OF VARIABILITY: POPULATION VARIANCE
• The population variance is the mean squared deviation from the population mean:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
• Where
  – $\sigma^2$ stands for the population variance
  – $\mu$ is the population mean
  – $N$ is the total number of values in the population
  – $x_i$ is the value of the i-th observation
  – $\sum$ represents a summation
MEASURES OF VARIABILITY: SAMPLE VARIANCE
• The sample variance is defined as follows:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
• Where
  – $s^2$ stands for the sample variance
  – $\bar{x}$ is the sample mean
  – $n$ is the total number of values in the sample
  – $x_i$ is the value of the i-th observation
  – $\sum$ represents a summation
MEASURES OF VARIABILITY: SAMPLE VARIANCE
• A sample of monthly advertising expenses (in 000$) is taken. The data for five months are as follows: 2.5, 1.3, 1.4, 1.0 and 2.0. Compute the sample variance.
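The computation asked for above can be sketched in a few lines of Python. This is only an illustration of the definition (sum of squared deviations divided by n - 1); the function name `sample_variance` is an assumption made for this sketch, not part of the slides.

```python
# Monthly advertising expenses (in 000$), taken from the example above.
advertising = [2.5, 1.3, 1.4, 1.0, 2.0]


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


s2_advertising = sample_variance(advertising)
print(round(s2_advertising, 4))  # 0.363
```

The sample mean is 1.64, the squared deviations sum to 1.452, and dividing by n - 1 = 4 gives 0.363.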
MEASURES OF VARIABILITY: SAMPLE VARIANCE
• Notice that the sample variance is defined as the sum of the squared deviations divided by n-1.
• Sample variance is computed to estimate the population variance.
• An unbiased estimate of the population variance may be obtained by defining the sample variance as the sum of the squared deviations divided by n-1 rather than by n.
• Defining sample variance as the mean squared deviation from the sample mean tends to underestimate the population variance.
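The claim about dividing by n-1 rather than n can be checked exactly on a toy case. The sketch below assumes a hypothetical population {1, 2, 3} (not from the slides) and enumerates every ordered sample of size 2 drawn with replacement, then averages the two candidate estimators over all samples.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]  # hypothetical toy population, chosen for illustration
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance


def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean."""
    mean = Fraction(sum(sample), len(sample))
    return sum((x - mean) ** 2 for x in sample)


n = 2
samples = list(product(population, repeat=n))  # all ordered samples, with replacement

# Average of each estimator over every possible sample.
avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)

print(sigma2, avg_div_n_minus_1, avg_div_n)  # 2/3 2/3 1/3
```

On average, the n-1 version matches the population variance (2/3), while the divide-by-n version comes out too small (1/3).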
MEASURES OF VARIABILITY: SAMPLE VARIANCE
• A shortcut formula for the sample variance:
$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$$
• Where
  – $s^2$ is the sample variance
  – $n$ is the total number of values in the sample
  – $x_i$ is the value of the i-th observation
  – $\sum$ represents a summation
MEASURES OF VARIABILITY: SAMPLE VARIANCE
• A sample of monthly sales (in 000 units) is taken. The data for five months are as follows: 264, 116, 165, 101 and 209. Compute the sample variance using the shortcut formula.
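As a rough sketch of this exercise, the shortcut formula needs only two running totals: the sum of the values and the sum of their squares. The function name `sample_variance_shortcut` is an assumption made for illustration.

```python
# Monthly sales (in 000 units), taken from the example above.
sales = [264, 116, 165, 101, 209]


def sample_variance_shortcut(values):
    """Shortcut formula: (sum of squares - (sum)^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)                   # 855
    sum_x2 = sum(x * x for x in values)   # 164259
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


s2_sales = sample_variance_shortcut(sales)
print(s2_sales)  # 4513.5
```

Here the sum of squares is 164259 and (855)^2 / 5 = 146205, so the numerator is 18054 and the variance is 18054 / 4 = 4513.5.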
MEASURES OF VARIABILITY: SAMPLE VARIANCE
• The shortcut formula for the sample variance:
$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$$
• If you have the sum of the measurements, $\sum_{i=1}^{n} x_i$, already computed, the above formula is a shortcut because you need only compute the sum of the squares, $\sum_{i=1}^{n} x_i^2$.
MEASURES OF VARIABILITY: POPULATION/SAMPLE STANDARD DEVIATION
• The standard deviation is the positive square root of the variance:
  – Population standard deviation: $\sigma = \sqrt{\sigma^2}$
  – Sample standard deviation: $s = \sqrt{s^2}$
• Compute the standard deviations of advertising and sales.
MEASURES OF VARIABILITY: POPULATION/SAMPLE STANDARD DEVIATION
• Compute the sample standard deviation of advertising data: 2.5, 1.3, 1.4, 1.0 and 2.0
• Compute the sample standard deviation of sales data: 264, 116, 165, 101 and 209
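A minimal sketch of both computations, assuming the sample (n - 1) definition of variance used on the earlier slides; the helper name `sample_std` is introduced here for illustration only.

```python
import math

advertising = [2.5, 1.3, 1.4, 1.0, 2.0]  # in 000$
sales = [264, 116, 165, 101, 209]        # in 000 units


def sample_std(values):
    """Positive square root of the sample variance (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance)


s_advertising = sample_std(advertising)
s_sales = sample_std(sales)
print(f"{s_advertising:.4f}")  # 0.6025
print(f"{s_sales:.4f}")        # 67.1826
```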
MEASURES OF VARIABILITY: POPULATION/SAMPLE CV
• The coefficient of variation is the standard deviation divided by the mean:
  – Population coefficient of variation: $CV = \dfrac{\sigma}{\mu}$
  – Sample coefficient of variation: $cv = \dfrac{s}{\bar{x}}$
MEASURES OF VARIABILITY: POPULATION/SAMPLE CV
• Compute the sample coefficient of variation of advertising data: 2.5, 1.3, 1.4, 1.0 and 2.0
• Compute the sample coefficient of variation of sales data: 264, 116, 165, 101 and 209
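These two exercises can be sketched with Python's standard `statistics` module, whose `stdev` function uses the sample (n - 1) definition. The helper name `sample_cv` is an assumption made for this sketch.

```python
import statistics

advertising = [2.5, 1.3, 1.4, 1.0, 2.0]  # in 000$
sales = [264, 116, 165, 101, 209]        # in 000 units


def sample_cv(values):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(values) / statistics.mean(values)


cv_advertising = sample_cv(advertising)
cv_sales = sample_cv(sales)
print(f"{cv_advertising:.4f}")  # 0.3674
print(f"{cv_sales:.4f}")        # 0.3929
```

Because the CV is unit-free, the two series can be compared directly even though one is in dollars and the other in units sold: relative to their means, the two series show similar variability.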
MEASURES OF ASSOCIATION
• A scatter diagram provides a graphical description of a positive/negative, linear/non-linear relationship.
• Numerical descriptions of a positive/negative, linear/non-linear relationship are obtained by:
  – Covariance
    • Population covariance
    • Sample covariance
  – Coefficient of correlation
    • Population coefficient of correlation
    • Sample coefficient of correlation
MEASURES OF ASSOCIATION: EXAMPLE
• A sample of monthly advertising and sales data is collected and shown below:

Month   Sales (000 units)   Advertising (000 $)
1       264                 2.5
2       116                 1.3
3       165                 1.4
4       101                 1.0
5       209                 2.0

• What is the relationship between sales and advertising? Is it linear or non-linear, positive or negative?
POPULATION COVARIANCE
• The population covariance is the mean of the products of deviations from the population means:
$$COV(X,Y) = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$$
• Where
  – $COV(X,Y)$ is the population covariance
  – $\mu_x$, $\mu_y$ are the population means of X and Y respectively
  – $N$ is the total number of values in the population
  – $x_i$, $y_i$ are the values of the i-th observations of X and Y respectively
  – $\sum$ represents a summation
SAMPLE COVARIANCE
• The sample covariance is the sum of the products of deviations from the sample means, divided by n-1:
$$cov(X,Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
• Where
  – $cov(X,Y)$ is the sample covariance
  – $\bar{x}$, $\bar{y}$ are the sample means of X and Y respectively
  – $n$ is the total number of values in the sample
  – $x_i$, $y_i$ are the values of the i-th observations of X and Y respectively
  – $\sum$ represents a summation
SAMPLE COVARIANCE

Month   Advertising (in 000$)   Sales (in 000 units)
1       2.5                     264
2       1.3                     116
3       1.4                     165
4       1                       101
5       2                       209
Mean    1.64                    171
SD      0.602495                67.18258703

Total =
cov =
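The blank entries in the worksheet above can be worked out with a short sketch of the sample covariance formula. The function name `sample_cov` is an assumption for illustration; the data are the five months from the table.

```python
advertising = [2.5, 1.3, 1.4, 1.0, 2.0]  # X, in 000$
sales = [264, 116, 165, 101, 209]        # Y, in 000 units


def sample_cov(xs, ys):
    """Sum of products of deviations from the sample means, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


cov_xy = sample_cov(advertising, sales)
print(round(cov_xy, 4))  # 39.65
```

The products of deviations sum to 158.6, and dividing by n - 1 = 4 gives a covariance of 39.65. The positive sign indicates that advertising and sales tend to move together.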
POPULATION/SAMPLE COVARIANCE
• If two variables increase/decrease together, covariance is a large positive number and the relationship is called positive.
• If the relationship is such that when one variable increases, the other decreases and vice versa, then covariance is a large negative number and the relationship is called negative.
• If two variables are unrelated, the covariance may be a small number.
• How large is large? How small is small?
POPULATION/SAMPLE COVARIANCE
• How large is large? How small is small? A drawback of covariance is that it is usually difficult to give a guideline for how large a covariance must be to indicate a strong relationship, or how small it must be to indicate no relationship.
• Coefficient of correlation can overcome this drawback to a certain extent.
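One way to see the drawback is that covariance changes with the units of measurement, while the coefficient of correlation does not. The sketch below re-expresses the sales figures in single units instead of thousands (a rescaling chosen here purely for illustration) and compares both measures; the helper names are assumptions.

```python
import math

advertising = [2.5, 1.3, 1.4, 1.0, 2.0]            # in 000$
sales_thousands = [264, 116, 165, 101, 209]        # in 000 units
sales_units = [1000 * y for y in sales_thousands]  # same data, in single units


def sample_cov(xs, ys):
    """Sample covariance (divisor n - 1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def sample_corr(xs, ys):
    """Sample covariance divided by the product of the sample standard deviations."""
    return sample_cov(xs, ys) / math.sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))


cov_thousands = sample_cov(advertising, sales_thousands)
cov_units = sample_cov(advertising, sales_units)
r_thousands = sample_corr(advertising, sales_thousands)
r_units = sample_corr(advertising, sales_units)

print(round(cov_thousands, 2), round(cov_units, 2))  # 39.65 39650.0
print(round(r_thousands, 4), round(r_units, 4))      # 0.9796 0.9796
```

The covariance grows by a factor of 1000 although the relationship itself is unchanged, so its size alone says little about strength.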
POPULATION COEFFICIENT OF CORRELATION
• The population coefficient of correlation is the population covariance divided by the product of the population standard deviations of X and Y:
$$\rho = \frac{COV(X,Y)}{\sigma_x \sigma_y}$$
• Where
  – $\rho$ is the population coefficient of correlation
  – $COV(X,Y)$ is the population covariance
  – $\sigma_x$, $\sigma_y$ are the population standard deviations of X and Y respectively
SAMPLE COEFFICIENT OF CORRELATION
• The sample coefficient of correlation is the sample covariance divided by the product of the sample standard deviations of X and Y:
$$r = \frac{cov(X,Y)}{s_x s_y}$$
• Where
  – $r$ is the sample coefficient of correlation
  – $cov(X,Y)$ is the sample covariance
  – $s_x$, $s_y$ are the sample standard deviations of X and Y respectively
SAMPLE COEFFICIENT OF CORRELATION

Month   Advertising (in 000$)   Sales (in 000 units)
1       2.5                     264
2       1.3                     116
3       1.4                     165
4       1                       101
5       2                       209
Mean    1.64                    171
SD      0.602495                67.18258703

Total =
cov =
r =
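For the advertising and sales data, the sample coefficient of correlation can be sketched as the sample covariance divided by the product of the two sample standard deviations. The variable names below are assumptions made for this illustration.

```python
import math

advertising = [2.5, 1.3, 1.4, 1.0, 2.0]  # X, in 000$
sales = [264, 116, 165, 101, 209]        # Y, in 000 units

n = len(advertising)
mean_x = sum(advertising) / n
mean_y = sum(sales) / n

# Sample covariance and sample standard deviations, all with divisor n - 1.
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(advertising, sales)) / (n - 1)
s_x = math.sqrt(sum((x - mean_x) ** 2 for x in advertising) / (n - 1))
s_y = math.sqrt(sum((y - mean_y) ** 2 for y in sales) / (n - 1))

r = cov_xy / (s_x * s_y)
print(f"{r:.4f}")  # 0.9796
```

With cov = 39.65, s_x of about 0.6025 and s_y of about 67.18, r is close to +1, which points to a strong positive linear relationship between advertising and sales in this sample.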
POPULATION/SAMPLE COEFFICIENT OF CORRELATION
• The coefficient of correlation is always between -1 and +1.
  – Values near -1 or +1 show a strong linear relationship
  – Values near 0 show no linear relationship
  – Values near +1 show a strong positive linear relationship
  – Values near -1 show a strong negative linear relationship
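The three cases can be illustrated with small made-up data sets (hypothetical values, not from the slides): an exact increasing line, an exact decreasing line, and a symmetric curve. The last case is a reminder that r near 0 rules out a linear relationship only, not every relationship.

```python
import math


def sample_corr(xs, ys):
    """Sample coefficient of correlation: cov(X, Y) / (s_x * s_y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - mx) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - my) ** 2 for y in ys) / (n - 1))
    return cov / (s_x * s_y)


x = [-2, -1, 0, 1, 2]                          # hypothetical X values
r_up = sample_corr(x, [2 * v + 1 for v in x])  # exact increasing line
r_down = sample_corr(x, [-3 * v for v in x])   # exact decreasing line
r_curve = sample_corr(x, [v * v for v in x])   # y = x^2: related, but not linearly

print(round(r_up, 6), round(r_down, 6), round(r_curve, 6))
```

The first two values come out at +1 and -1 (up to floating-point rounding), and the third at 0 even though y is completely determined by x.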
EXAMPLE
• Salary and expenses for cultural activities and sports-related activities are collected from 100 households. Data for only 5 households are shown below.
• What are the relationships (linear/non-linear, positive/negative) between (i) salary and culture, (ii) salary and sports, and (iii) sports and culture?

Salary and expenses data for 100 households

Salary    Culture   Sports
$54,600   $1,020    $990
$57,500   $1,100    $460
$53,300   $900      $780
$43,500   $570      $860
$57,200   $900      $1,390
SALARY-CULTURE
[Scatter diagram: expenses for cultural activities ($0 to $1,600) plotted against salary ($35,000 to $95,000)]
cov = 1094787, r = 0.5065 (positive, linear)
SPORTS-CULTURE
[Scatter diagram: expenses for cultural activities (0 to 1600) plotted against expenses for sports related activities ($500 to $2,000)]
cov = -33608, r = -0.5201 (negative, linear)
SALARY-SPORTS
[Scatter diagram: expenses for sports related activities ($400 to $1,900) plotted against salary ($35,000 to $95,000)]
cov = -219026, r = -0.08122 (no linear relationship)