Variability

Range and Interquartile Range Standard Deviation and Variance

Transcript of Variability

Page 1: Variability

Range and Interquartile Range

Standard Deviation and Variance

Page 2: Variability

Variability is a measure of how different scores are from one another within a set of data.

Synonyms: spread, dispersion.

Does the amount of variability (spread, dispersion) make a difference? Do we care?

How could we measure the amount of variability?

Bookhaven by waffler at http://www.flickr.com/photos/adrian_s/23441729/

Page 3: Variability

Julian and Delia ask for help

Their mean quiz score is the same: M = 15 (out of 25, on 20 quizzes)

Do we know enough to help them?

Let’s look at the actual scores for each student

Page 4: Variability

Best score = 18

Worst score = 13

Range = 18.5 - 12.5 = 6

Range of middle 50% is IQR = 16.5 - 13.5 = 3

Scores are pretty similar across all 20 quizzes

Julian’s Quiz Scores (Mean = 15.0)

16 14 15 14 16 14 14 17 14 13 15 15 16 15 18 15 14 16 14 15

Page 5: Variability

Delia’s Quiz Scores (Mean = 15.0)

15 22 10 11 16 13 20 13 17 8 18 16 12 14 9 19 18 13 21 15

Best score = 22

Worst score = 8

Range = 22.5 - 7.5 = 15

Range of middle 50% is IQR = 18.5 - 12.5 = 6

Scores seem to differ quite a bit from quiz to quiz
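The range figures on the two students' slides can be reproduced with a short sketch. It assumes whole-number scores, so each real limit lies half a unit beyond the observed value; the function name and the `unit` parameter are illustrative, not from the slides.

```python
# Quiz scores copied from the slides.
julian = [16, 14, 15, 14, 16, 14, 14, 17, 14, 13,
          15, 15, 16, 15, 18, 15, 14, 16, 14, 15]
delia = [15, 22, 10, 11, 16, 13, 20, 13, 17, 8,
         18, 16, 12, 14, 9, 19, 18, 13, 21, 15]


def range_with_real_limits(scores, unit=1.0):
    """Distance from the upper real limit of the highest score
    to the lower real limit of the lowest score."""
    half = unit / 2  # assumption: scores are measured in steps of `unit`
    return (max(scores) + half) - (min(scores) - half)


julian_range = range_with_real_limits(julian)  # 18.5 - 12.5
delia_range = range_with_real_limits(delia)    # 22.5 - 7.5
print(julian_range, delia_range)  # 6.0 15.0
```

Both students have the same mean, but Delia's range is two and a half times Julian's.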

Page 6: Variability

Range: Distance from highest number to lowest number (may require real limits)

Interquartile Range: Distance from 25% point to 75% point (range of middle 50% of scores)

• Quartiles: the 25th, 50th (median), and 75th percentiles

• The 25th and 75th percentiles are the midpoints (medians) of the lower and upper halves of the ordered scores

• May be an exact value, or may be between two values, using the same rules as the median.
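The "midpoint of each half" rule can be sketched as follows. This is one common convention, not necessarily the exact procedure behind the slides: it takes the median of each half and leaves out the overall median when the number of scores is odd. It does not apply real limits, so the IQR figures quoted on the students' slides (3 and 6) come out somewhat larger than the values computed here, apparently because those figures extend the quartiles to real limits.

```python
from statistics import median

# Quiz scores copied from the slides.
julian = [16, 14, 15, 14, 16, 14, 14, 17, 14, 13,
          15, 15, 16, 15, 18, 15, 14, 16, 14, 15]
delia = [15, 22, 10, 11, 16, 13, 20, 13, 17, 8,
         18, 16, 12, 14, 9, 19, 18, 13, 21, 15]


def quartiles(scores):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    ordered = sorted(scores)
    n = len(ordered)
    lower_half = ordered[: n // 2]        # scores below the median position
    upper_half = ordered[(n + 1) // 2:]   # scores above the median position
    return median(lower_half), median(ordered), median(upper_half)


julian_q1, julian_med, julian_q3 = quartiles(julian)
delia_q1, delia_med, delia_q3 = quartiles(delia)

print(julian_q1, julian_q3, julian_q3 - julian_q1)  # 14.0 16.0 2.0
print(delia_q1, delia_q3, delia_q3 - delia_q1)      # 12.5 18.0 5.5
```

Julian's quartiles fall on exact score values, while Delia's first quartile falls between two values (12 and 13), which illustrates the last bullet above.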

Page 7: Variability

Essentials of Statistics for Behavioral Science, 6th Edition by Frederick Gravetter and Larry Wallnau Copyright 2008 Wadsworth Publishing, a division of Thomson Learning. All rights reserved.

Page 8: Variability

Boxplot: developed by John Tukey to display central tendency & variability efficiently

Box = middle 50% of cases
Top = 75th percentile
Bottom = 25th percentile
Height = Interquartile Range (IQR)

Line inside the box = MEDIAN
If the line is not centered, the data are not perfectly symmetric.

Lines (“whiskers”) extend to the minimum and maximum values that lie within 1.5 IQR of the box

Page 9: Variability
Page 10: Variability

Outliers: Beyond 1.5 IQR from edge of box

Extremes: More than 3 IQRs from edge of box

http://web.anglia.ac.uk/numbers/common_folder/graphics/fig6_single_box.jpg
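The outlier and extreme rules can be sketched as a small classifier. The data set below is hypothetical (it is not from the slides), the function name is illustrative, and the quartiles use the median-of-halves convention, which is an assumption; statistical packages may compute quartiles slightly differently.

```python
from statistics import median


def classify_by_fences(values):
    """Label each value 'inside', 'outlier', or 'extreme' relative to the box."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = median(ordered[: n // 2])        # bottom edge of the box
    q3 = median(ordered[(n + 1) // 2:])   # top edge of the box
    iqr = q3 - q1                         # height of the box

    labels = []
    for v in ordered:
        # Distance beyond the nearer box edge (0 if the value is within the box).
        beyond = max(q1 - v, v - q3, 0)
        if beyond > 3 * iqr:
            labels.append((v, "extreme"))
        elif beyond > 1.5 * iqr:
            labels.append((v, "outlier"))
        else:
            labels.append((v, "inside"))

    # Whiskers reach the most extreme values still within 1.5 IQR of the box.
    inside = [v for v, label in labels if label == "inside"]
    return labels, (min(inside), max(inside))


# Hypothetical data, chosen so that one value of each kind appears.
sample = [10, 12, 13, 14, 15, 16, 17, 18, 30, 45]
labels, whiskers = classify_by_fences(sample)
print(labels)
print(whiskers)  # (10, 18)
```

Here the box runs from 13 to 18 (IQR = 5), so 30 is more than 7.5 above the box and counts as an outlier, while 45 is more than 15 above it and counts as an extreme.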

Page 11: Variability
Page 12: Variability

Range: Distance from highest number to lowest number (may require real limits)

Interquartile Range: Distance from 25% point to 75% point (range of middle 50% of scores)

Average deviation: Sum of deviations of scores from M, divided by N = $\sum (X - M) / N$

Page 13: Variability

Range: Distance from highest number to lowest number (may require real limits)

Interquartile Range: Distance from 25% point to 75% point (range of middle 50% of scores)

Average deviation: Sum of deviations of scores from M, divided by N = $\sum (X - M) / N$

DOESN’T WORK: ALWAYS EQUALS 0.
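The reason is algebraic: the deviations sum to ΣX minus N times M, and M is ΣX / N, so the positive and negative deviations cancel exactly. A quick check with both students' quiz scores (variable names are illustrative):

```python
# Quiz scores copied from the slides.
julian = [16, 14, 15, 14, 16, 14, 14, 17, 14, 13,
          15, 15, 16, 15, 18, 15, 14, 16, 14, 15]
delia = [15, 22, 10, 11, 16, 13, 20, 13, 17, 8,
         18, 16, 12, 14, 9, 19, 18, 13, 21, 15]


def average_deviation(scores):
    """Mean of the signed deviations from the mean."""
    m = sum(scores) / len(scores)
    return sum(x - m for x in scores) / len(scores)


julian_avg_dev = average_deviation(julian)
delia_avg_dev = average_deviation(delia)
print(julian_avg_dev, delia_avg_dev)  # 0.0 0.0
```

The result is zero for both students even though their scores are spread very differently, so the average signed deviation carries no information about variability. With non-integer data, floating-point rounding may give a value that is extremely close to zero rather than exactly zero.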

Page 14: Variability

Square the deviations so that all values are positive

Sum of Squares (SS) used in most inferential statistics

SS is in the numerator of a fraction for both Variance and Standard Deviation

           X     Mean     (X - Mean)   (X - Mean)²
Jim        48    138.75     -90.75      8235.5625
Orlend     27    138.75    -111.75     12488.0625
Ellen     189    138.75      50.25      2525.0625
Steve     136    138.75      -2.75         7.5625
Jose      250    138.75     111.25     12376.5625
Tabia     218    138.75      79.25      6280.5625
Kaleb     151    138.75      12.25       150.0625
Lisa      201    138.75      62.25      3875.0625
Pavlik     78    138.75     -60.75      3690.5625
Kris      163    138.75      24.25       588.0625
Emma      106    138.75     -32.75      1072.5625
Michael    98    138.75     -40.75      1660.5625

Mean = 138.75    Sum of (X - Mean) = 0    Sum of (X - Mean)² = 52950.25

The last sum is called SS or “Sum of Squares,” which means Sum of Squared Deviations from the Mean
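The table can be recomputed in a few lines. This sketch uses the twelve X values from the slide; the dictionary and variable names are illustrative.

```python
# X values copied from the slide's table.
scores = {
    "Jim": 48, "Orlend": 27, "Ellen": 189, "Steve": 136,
    "Jose": 250, "Tabia": 218, "Kaleb": 151, "Lisa": 201,
    "Pavlik": 78, "Kris": 163, "Emma": 106, "Michael": 98,
}

mean = sum(scores.values()) / len(scores)

# One deviation per person, then square each and add them up.
deviations = {name: x - mean for name, x in scores.items()}
ss = sum(d ** 2 for d in deviations.values())

print(mean)                      # 138.75
print(sum(deviations.values()))  # 0.0
print(ss)                        # 52950.25
```

Squaring removes the signs, so the large negative deviations (Orlend, Jim) and the large positive ones (Jose, Tabia) both add to SS instead of cancelling.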

Page 15: Variability

Julian (X)   Mean   X - M   (X - M)²
16           15      1      1
14           15     -1      1
15           15      0      0
14           15     -1      1
16           15      1      1
14           15     -1      1
14           15     -1      1
17           15      2      4
14           15     -1      1
13           15     -2      4
15           15      0      0
15           15      0      0
16           15      1      1
15           15      0      0
18           15      3      9
15           15      0      0
14           15     -1      1
16           15      1      1
14           15     -1      1
15           15      0      0

Sum of X = 300    Sum of (X - M) = 0    SS = 28
Mean = 15         Variance = 1.474      Std Dev = 1.214

$SS = \sum (X - M)^2$
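The variance and standard deviation reported for Julian match SS divided by n - 1 (28 / 19), that is, the sample formulas. A minimal sketch that reproduces them by hand and cross-checks against Python's standard `statistics` module:

```python
import math
import statistics

# Julian's quiz scores, copied from the slides.
julian = [16, 14, 15, 14, 16, 14, 14, 17, 14, 13,
          15, 15, 16, 15, 18, 15, 14, 16, 14, 15]

n = len(julian)
m = sum(julian) / n

# Definitional formula: SS is the sum of squared deviations from the mean.
ss = sum((x - m) ** 2 for x in julian)

variance = ss / (n - 1)      # sample variance
std_dev = math.sqrt(variance)

print(ss)                  # 28.0
print(round(variance, 3))  # 1.474
print(round(std_dev, 3))   # 1.214

# statistics.variance and statistics.stdev also divide by n - 1.
assert math.isclose(variance, statistics.variance(julian))
assert math.isclose(std_dev, statistics.stdev(julian))
```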

Page 16: Variability

Delia's Scores: X and X²

15 225

22 484

10 100

11 121

16 256

13 169

20 400

13 169

17 289

8 64

18 324

16 256

12 144

14 196

9 81

19 361

18 324

13 169

21 441

15 225

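Sum of X = 300    Sum of X² = 4798

N = 20

$SS = \sum X^2 - \frac{(\sum X)^2}{N}$

SS = 4798 - 300² / 20 = 4798 - 4500 = 298

Variance = SS / (N - 1) = 298 / 19 = 15.684

Std Dev = 3.960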
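The computational formula for SS needs only the two column totals, ΣX and ΣX². The sketch below evaluates it for the scores listed on this slide and checks it against the definitional formula (sum of squared deviations from the mean). For these scores it gives 4798 - 300² / 20 = 298. The variance here divides by N - 1, which is an assumption based on the Julian slide, where 28 / 19 = 1.474.

```python
import math

# Delia's quiz scores, copied from the slides.
delia = [15, 22, 10, 11, 16, 13, 20, 13, 17, 8,
         18, 16, 12, 14, 9, 19, 18, 13, 21, 15]

n = len(delia)
sum_x = sum(delia)                   # column total of X
sum_x2 = sum(x ** 2 for x in delia)  # column total of X squared

# Computational formula: SS = sum(X^2) - (sum(X))^2 / N
ss_computational = sum_x2 - sum_x ** 2 / n

# Definitional formula, for comparison.
m = sum_x / n
ss_definitional = sum((x - m) ** 2 for x in delia)

variance = ss_computational / (n - 1)  # sample variance (assumed denominator)
std_dev = math.sqrt(variance)

print(sum_x, sum_x2)                      # 300 4798
print(ss_computational, ss_definitional)  # 298.0 298.0
print(round(variance, 3), round(std_dev, 3))  # 15.684 3.96
```

Both formulas give the same SS; the computational version is convenient when working by hand from a table of X and X².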

Page 17: Variability
Page 18: Variability

Range: Distance from highest number to lowest number (may require real limits)

Interquartile Range: Distance from 25% point to 75% point (range of middle 50% of scores)

Average deviation: Sum deviations from M, divide by N. Doesn’t work. Always = 0.

Variance: Average of squared deviation scores

$\sigma^2 = \frac{SS}{n} = \frac{\sum (X - M)^2}{n}$

Page 19: Variability

Variance: Average of squared deviation scores.

Standard Deviation: Square root of Variance

$\sigma^2 = \frac{SS}{n} = \frac{\sum (X - M)^2}{n}$

$\sigma = \sqrt{\frac{SS}{n}} = \sqrt{\frac{\sum (X - M)^2}{n}}$

Page 20: Variability

STANDARD DEVIATION FOR A SAMPLE

The data we use are from a randomly selected sample

Numerator of fraction is Sum of Squares (SS)

Denominator of fraction is n – 1

Symbols: s or SD

Excel function: STDEV

STANDARD DEVIATION FOR A POPULATION

The data we use are from all members of population

Numerator of fraction is Sum of Squares (SS)

Denominator of fraction is n

Symbols: σ (Greek sigma = s)

Excel function: STDEVP

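Sample: $s = \sqrt{\frac{SS}{n - 1}} = \sqrt{\frac{\sum (X - M)^2}{n - 1}}$

Population: $\sigma = \sqrt{\frac{SS}{n}} = \sqrt{\frac{\sum (X - M)^2}{n}}$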
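The only difference between the two formulas is the denominator. As a sketch, Python's standard `statistics` module offers the same pair of choices as the Excel functions named above: `stdev` divides by n - 1 (like STDEV) and `pstdev` divides by n (like STDEVP). Julian's quiz scores are used as example data.

```python
import math
import statistics

# Julian's quiz scores, copied from the slides.
julian = [16, 14, 15, 14, 16, 14, 14, 17, 14, 13,
          15, 15, 16, 15, 18, 15, 14, 16, 14, 15]

n = len(julian)
m = sum(julian) / n
ss = sum((x - m) ** 2 for x in julian)  # SS = 28

s = math.sqrt(ss / (n - 1))  # sample standard deviation
sigma = math.sqrt(ss / n)    # population standard deviation

print(round(s, 3), round(sigma, 3))  # 1.214 1.183

# The library functions agree with the hand calculations.
assert math.isclose(s, statistics.stdev(julian))
assert math.isclose(sigma, statistics.pstdev(julian))
```

Dividing by n - 1 always gives the larger value; the gap shrinks as n grows.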

Page 21: Variability

Range and Interquartile Range use only a few scores.

Standard deviation and variance use all scores

Measure of Variability | Can be used with … | Best or most commonly used for …
Percentages | Any type of data | Categorical / Nominal data
Range or Interquartile Range | Data with order (low to high) and equal intervals (Interval or Ratio data) | Open-ended categories; indeterminate values; extreme or skewed values
Boxplot | Interval or Ratio data; good for graphs | Comparing variability in two or more groups (Ch 8-14)
Standard Deviation or Variance | Interval or Ratio data with no open-ended or indeterminate values | Any situation in which it can be appropriately computed; inferential statistics

Page 22: Variability

Range and Interquartile Range

Standard Deviation and Variance