Descriptive Statistics Measures of Variation. Essentials: Measures of Variation (Variation – a...

Page 1: Descriptive Statistics Measures of Variation. Essentials: Measures of Variation (Variation – a must for statistical analysis.) Know the types of measures.

Descriptive Statistics

Measures of Variation

Essentials: Measures of Variation (Variation – a must for statistical analysis.)

• Know the types of measures used to look at variation and the type of data to which they apply.

• Be able to calculate the range, standard deviation and interquartile range.

• Be able to determine the distance away from the mean a given value lies in terms of standard deviations (think z-score).

• Be able to apply the Empirical Rule and Chebychev’s Theorem to specific situations.

Measures of Variation

• Range

• Variance

• Standard Deviation

• Interquartile Range (IQR; see Measures of Position)

Range

• The Range of a data set is the difference between the highest value and the lowest value.

• Example: Given the following data values, identify the range of the distribution.

• Values: 2, 4, 6, 8, 10

• Range = 10 – 2 = 8
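
The range calculation above can be sketched in a few lines of Python. This is a minimal illustration; the function name `data_range` is an assumption made for this example, not something from the slides.

```python
def data_range(values):
    """Return the range: highest value minus lowest value."""
    return max(values) - min(values)


values = [2, 4, 6, 8, 10]  # data values from the slide example
print(data_range(values))  # 8
```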

Variance

• For a sample the variance is a measure of variation equal to the sum of the squared deviation scores divided by n-1. It is also the square of the standard deviation.

Sample Variance: $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$
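
As a rough sketch, the sample variance definition (sum of squared deviations divided by n - 1) can be written out step by step. The function name `sample_variance` is hypothetical, and the data reuse the values 2, 4, 6, 8, 10 from the range example purely for illustration.

```python
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


values = [2, 4, 6, 8, 10]           # illustrative data, mean = 6
print(sample_variance(values))      # 10.0
print(statistics.variance(values))  # standard-library result for comparison
```

The standard library's `statistics.variance` also divides by n - 1, so the two results should agree.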

Sample Standard Deviation

• Standard deviation is a measure of the typical amount an entry deviates (or varies) from the mean.

• The more the entries are spread out, the greater the standard deviation.

• Sample Standard Deviation (s):

Definition Formula: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$

Calculation Formula: $s = \sqrt{\dfrac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}}$
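
The definition formula and the calculation formula for the sample standard deviation are algebraically equivalent. The sketch below, with hypothetical function names and illustrative data, computes both so the agreement can be checked numerically.

```python
import math


def sd_definition(values):
    """Definition formula: sqrt( sum((x - mean)^2) / (n - 1) )."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def sd_calculation(values):
    """Calculation formula: sqrt( (n*sum(x^2) - (sum x)^2) / (n(n - 1)) )."""
    n = len(values)
    sum_x = sum(values)
    sum_x_squared = sum(x * x for x in values)
    return math.sqrt((n * sum_x_squared - sum_x ** 2) / (n * (n - 1)))


values = [2, 4, 6, 8, 10]  # illustrative data
print(round(sd_definition(values), 4))   # 3.1623
print(round(sd_calculation(values), 4))  # 3.1623
```

The calculation formula avoids computing the mean first, which is convenient by hand, but for data with very large values and small spread it can lose precision in floating-point arithmetic.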

Interpreting Standard Deviation

• Standard deviation is a measure of the typical amount an entry deviates from the mean.

• The more the entries are spread out, the greater the standard deviation.

Source: Larson/Farber 4th ed.

[Figure: frequency histogram, horizontal axis marked from 112 to 127. Mean = 120, Standard Deviation = 2, n = 500]

[Figure: frequency histogram, horizontal axis marked from 80 to 180. Mean = 120, Standard Deviation = 20, n = 500]

Anatomy of the Standard Deviation

The value of the Standard Deviation tells us how closely the values of observations for a data set are clustered around the mean. A small value of the Standard Deviation indicates that the values of the data set are spread over a relatively small range around the mean. A large value of the Standard Deviation indicates that the values of the data set are spread over a relatively large range around the mean.

The Standard Deviation is the most used measure of dispersion (how spread out the data are from one another).
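
To make the idea concrete, here is a small sketch comparing two data sets that share the same mean but differ in spread. Both data sets are hypothetical, chosen only so that each has a mean of 120.

```python
import statistics

tight = [118, 119, 120, 121, 122]  # hypothetical: clustered near the mean
wide = [100, 110, 120, 130, 140]   # hypothetical: spread far from the mean

for name, data in (("tight", tight), ("wide", wide)):
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    print(name, mean, round(sd, 2))
# tight 120 1.58
# wide 120 15.81
```

Both sets center on 120, but the second has a standard deviation ten times larger because each value sits ten times farther from the mean.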

NOTATION

When we refer to the Population Standard Deviation, it is denoted by σ.

When we refer to the Sample Standard Deviation, it is denoted by s.

Interpreting Standard Deviation: Empirical Rule (68 – 95 – 99.7 Rule)

For data with a (symmetric) bell-shaped distribution, the standard deviation has the following characteristics:

• About 68% (more precisely, 68.27%) of the data lie within one standard deviation of the mean.

• About 95% (more precisely, 95.45%) of the data lie within two standard deviations of the mean.

• About 99.7% (more precisely, 99.73%) of the data lie within three standard deviations of the mean.
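
The Empirical Rule percentages come from the normal distribution. As a hedged sketch, the proportion of a normal distribution within k standard deviations of the mean equals erf(k/√2), which Python's standard library can evaluate directly. The function name `normal_within` is an assumption for this example.

```python
import math


def normal_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * normal_within(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73
```

These figures hold only for an exactly normal distribution; for real bell-shaped data the rule is an approximation.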

Interpreting Standard Deviation: Empirical Rule (68 – 95 – 99.7 Rule)

Source: Larson/Farber 4th ed.

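[Figure: bell-shaped curve with the horizontal axis marked at $\bar{x} - 3s$, $\bar{x} - 2s$, $\bar{x} - s$, $\bar{x}$, $\bar{x} + s$, $\bar{x} + 2s$, $\bar{x} + 3s$]

• 68% within 1 standard deviation (34% on each side of the mean)

• 95% within 2 standard deviations (a further 13.5% on each side)

• 99.7% within 3 standard deviations (a further 2.35% on each side)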

Example: Using the Empirical Rule

Example: In a survey conducted by the National Center for Health Statistics, the sample mean height of women in the United States (ages 20-29) was 64 inches, with a sample standard deviation of 2.71 inches. Estimate the percent of the women whose heights are between 64 inches and 69.42 inches.

Source: Larson/Farber 4th ed.

Solution: Using the Empirical Rule

Source: Larson/Farber 4th ed.

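[Figure: bell-shaped curve with the horizontal axis marked at $\bar{x} - 3s = 55.87$, $\bar{x} - 2s = 58.58$, $\bar{x} - s = 61.29$, $\bar{x} = 64$, $\bar{x} + s = 66.71$, $\bar{x} + 2s = 69.42$, $\bar{x} + 3s = 72.13$; the region from 64 to 66.71 is labeled 34% and the region from 66.71 to 69.42 is labeled 13.5%]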

• Because the distribution is bell-shaped, you can use the Empirical Rule.

• 69.42 inches is two standard deviations above the mean: 64 + 2.71 = 66.71 and 66.71 + 2.71 = 69.42 (all in inches).

• 34% + 13.5% = 47.5% of women are between 64 and 69.42 inches tall.
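
The reasoning in this solution can be sketched as a z-score calculation followed by adding up Empirical Rule bands. The names `z_score` and `ONE_SIDED_BANDS` are assumptions introduced for this example; the band percentages are the ones shown in the Empirical Rule figure.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


mean, sd = 64.0, 2.71  # sample mean and standard deviation from the example

z_upper = z_score(69.42, mean, sd)
print(round(z_upper, 2))  # 2.0

# Empirical Rule percentages on ONE side of the mean:
# 0 to 1 s, 1 to 2 s, 2 to 3 s
ONE_SIDED_BANDS = [34.0, 13.5, 2.35]

# From the mean (z = 0) up to z = 2 covers the first two bands.
percent = sum(ONE_SIDED_BANDS[: round(z_upper)])
print(percent)  # 47.5
```

This band-summing shortcut only works when the endpoints fall on whole numbers of standard deviations, as they do here.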

ADDITIONAL TOPICS

Range Rule of Thumb

• To obtain a rough estimate of the standard deviation, s, divide the range by 4: $s \approx \dfrac{\text{range}}{4}$

• Conversely, the “minimum” value would be approximately equal to the mean – 2*(standard deviation). The “maximum” value would be approximately equal to the mean + 2*(standard deviation).
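
A minimal sketch of the Range Rule of Thumb in both directions follows. The function names are hypothetical, and the height figures (mean 64, standard deviation 2.71) are borrowed from the Empirical Rule example purely as illustration.

```python
def estimate_sd_from_range(minimum, maximum):
    """Rough estimate of the standard deviation: range / 4."""
    return (maximum - minimum) / 4


def usual_min_max(mean, sd):
    """Rough 'minimum' and 'maximum': mean - 2*sd and mean + 2*sd."""
    return mean - 2 * sd, mean + 2 * sd


# Estimating s from the smallest and largest of the values 2, 4, 6, 8, 10
print(estimate_sd_from_range(2, 10))  # 2.0

# Estimating the "minimum" and "maximum" from a known mean and s
low, high = usual_min_max(64, 2.71)
print(round(low, 2), round(high, 2))  # 58.58 69.42
```

The estimate is deliberately rough: for the values 2, 4, 6, 8, 10 the rule gives 2.0, while the sample standard deviation computed from the definition formula is about 3.16.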

Population Variance & Standard Deviation

• The population variance, σ² (sigma-squared), is a measure of variation equal to the sum of the squared deviation scores divided by N. It is also the square of the population standard deviation, σ.

Population Variance: $\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$

Population Standard Deviation: $\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$
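
The only difference from the sample formulas is the divisor: N instead of n - 1. The sketch below, using a hypothetical function name and illustrative data, shows the population versions next to the standard library's `statistics.pvariance` and `statistics.pstdev`.

```python
import math
import statistics


def population_variance(values):
    """Sum of squared deviations from the population mean, divided by N."""
    N = len(values)
    mu = sum(values) / N
    return sum((x - mu) ** 2 for x in values) / N


values = [2, 4, 6, 8, 10]  # illustrative data, treated as a whole population

sigma_squared = population_variance(values)
sigma = math.sqrt(sigma_squared)
print(sigma_squared)    # 8.0
print(round(sigma, 4))  # 2.8284

# Standard-library equivalents for comparison
print(statistics.pvariance(values), round(statistics.pstdev(values), 4))
```

For the same five values the sample variance (dividing by n - 1 = 4) is 10.0, so treating data as a sample always gives a slightly larger result than treating it as a population.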

Interquartile Range (IQR)

• The Interquartile Range is a measure of variation. It is the difference between the third quartile, Q3 (75th percentile), and the first quartile, Q1 (25th percentile): $\text{IQR} = Q_3 - Q_1$.

• The Interquartile Range enables us to determine the existence of outliers.

• Outliers exist in a data set if any of the values are

– Less than $Q_1 - 1.5(\text{IQR})$, or

– Greater than $Q_3 + 1.5(\text{IQR})$
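
The outlier test can be sketched as follows. Two assumptions are worth flagging: the data set is hypothetical, and the quartiles are computed with the median-of-halves convention (Q1 is the median of the lower half, Q3 the median of the upper half). Other quartile conventions exist and can give slightly different fences.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q3) using the median-of-halves convention (an assumption).

    When the count is odd, the middle value is left out of both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: n // 2]
    upper_half = ordered[(n + 1) // 2:]
    return statistics.median(lower_half), statistics.median(upper_half)


def find_outliers(values):
    """Values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]


data = [2, 4, 6, 8, 10, 50]  # hypothetical data set
q1, q3 = quartiles(data)
print(q1, q3, q3 - q1)      # 4 10 6
print(find_outliers(data))  # [50]
```

Here the fences are 4 - 1.5(6) = -5 and 10 + 1.5(6) = 19, so only the value 50 is flagged.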

Chebyshev’s Theorem

• The Empirical Rule applies if the distribution of the data is approximately bell-shaped.

• Chebyshev’s Theorem applies to distributions regardless of shape. It states that the proportion (fraction) of data lying within K standard deviations of the mean is always at least $1 - \dfrac{1}{K^2}$, where K is any number greater than 1.

– When K = 2: $1 - \dfrac{1}{2^2} = \dfrac{3}{4}$, so at least 75% of all values lie within 2 standard deviations of the mean.

– When K = 3: $1 - \dfrac{1}{3^2} = \dfrac{8}{9}$, so at least 88.9% of all values lie within 3 standard deviations of the mean.
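
As a small sketch, the bound 1 - 1/K² can be evaluated for any K greater than 1. The function name `chebyshev_lower_bound` is an assumption for this example.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


for k in (2, 3):
    print(k, f"{chebyshev_lower_bound(k):.1%}")
# 2 75.0%
# 3 88.9%
```

K does not have to be a whole number; for example, K = 1.5 gives a bound of about 55.6%.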

Example: Using Chebychev’s Theorem

The age distribution for Florida is shown in the histogram. Apply Chebychev’s Theorem to the data using k = 2. What can you conclude?

Source: Larson/Farber 4th ed.

Solution: Using Chebychev’s Theorem

Given k = 2 (with μ = 39.2 and σ = 24.8):

• Two S.D. below the mean = μ – 2σ = 39.2 – 2(24.8) = -10.4 (use 0 since age can’t be negative)

• Two S.D. above the mean = μ + 2σ = 39.2 + 2(24.8) = 88.8

At least 75% of the population of Florida is between 0 and 88.8 years old.

Source: Larson/Farber 4th ed.
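
The Florida calculation can be sketched in code, using the mean (39.2) and standard deviation (24.8) from the solution. The function name `chebyshev_interval` and its `floor` parameter are hypothetical; the floor stands in for the observation that an age cannot be negative.

```python
def chebyshev_interval(mean, sd, k, floor=None):
    """Interval mean ± k*sd, optionally clipped below at a floor value."""
    low = mean - k * sd
    high = mean + k * sd
    if floor is not None:
        low = max(low, floor)  # e.g. ages cannot fall below 0
    return low, high


k = 2
low, high = chebyshev_interval(39.2, 24.8, k, floor=0.0)
min_fraction = 1 - 1 / k ** 2

print(f"At least {min_fraction:.0%} of ages lie between {low:g} and {high:.1f}")
# At least 75% of ages lie between 0 and 88.8
```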

End of Slides