Variance and Standard Deviation (3) Frequency Distributions.


Page 1: Variance and Standard Deviation (3) Frequency Distributions.

Variance and Standard Deviation (3)

Frequency Distributions

Page 2: Variance and Standard Deviation (3) Frequency Distributions.

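Standard Deviation

$\text{Standard Deviation} = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n}}$

Standard Deviation can more conveniently be written

$s_x = \sqrt{\dfrac{\sum x_i^2 - n\bar{x}^2}{n}}$ or $s_x = \sqrt{\dfrac{\sum x_i^2}{n} - \bar{x}^2}$

… this makes manual calculations much simpler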
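A quick check that the two ways of writing the standard deviation agree may help here. This is only a sketch: the function names and the sample values are made up for illustration, and the result is compared against Python's built-in population standard deviation.

```python
import math
import statistics


def sd_definition(xs):
    """Population SD from the definition: sqrt(sum((x - mean)^2) / n)."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / n)


def sd_shortcut(xs):
    """Population SD from the rearranged form: sqrt(sum(x^2) / n - mean^2)."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum(x * x for x in xs) / n - mean ** 2)


# Hypothetical sample, chosen only for illustration (not from the slides).
sample = [2, 4, 4, 4, 5, 5, 7, 9]

print(sd_definition(sample))       # definition form
print(sd_shortcut(sample))         # rearranged form
print(statistics.pstdev(sample))   # library cross-check
```

All three lines should print the same value, up to floating-point rounding. The rearranged form needs only the running totals of x and x squared, which is why it is easier by hand.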

Page 3: Variance and Standard Deviation (3) Frequency Distributions.

Frequency Distributions

Visits to the doctors: 0 1 5 2 1 5 3 3 2 4 3 6 2 3 1 0 1 0 3 2

Visits to doctors   Frequency   Calculation   Total visits
0                   3           0 x 3         0
1                   4           1 x 4         4
2                   4           2 x 4         8
3                   5           3 x 5         15
4                   1           4 x 1         4
5                   2           5 x 2         10
6                   1           6 x 1         6
TOTAL               20                        47

Mean = 47 / 20 = 2.35
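The table can be rebuilt from the raw list of visits in a few lines. A minimal sketch, assuming the 20 values listed on the slide; the variable names are illustrative only.

```python
from collections import Counter

# Raw data from the slide: visits to the doctors for 20 people.
visits = [0, 1, 5, 2, 1, 5, 3, 3, 2, 4, 3, 6, 2, 3, 1, 0, 1, 0, 3, 2]

freq = Counter(visits)                        # value -> frequency
n = sum(freq.values())                        # total frequency
total = sum(x * f for x, f in freq.items())   # sum of x * f

for x in sorted(freq):
    print(x, freq[x], x * freq[x])            # one table row per value

mean = total / n
print(n, total, mean)
```

The last line gives the same totals as the table (20 and 47) and the same mean.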

Page 4: Variance and Standard Deviation (3) Frequency Distributions.

Boys | Girls (all values in cm)

23 32 1 32 13 10 16 30 246 5 2 23 3 31 16 26 207 2 3 32 20 15 18 33 2

18 12 7 25 6 1 4 27 131 6 8 32 32 11 19 15 218 8 32 15 23 14 18 12 267 10 2 32 16 14 12 14 183 19 32 17 11 20 3 25 48 2 26 16 11 7 11 10 191 6 3 10 13 10 6 17 18

13 17 23 1 25 26 211 10 28 3 16 15 311 8 23 14 17 15 2710 4 24 15 12 12 2510 2 7 19 4 1 1923 23 8 22 29 17 287 13 6 12 5 30 8

21 18 11 14 10 16 1832 19 9 10 18 17 2332 9 12 4 34 20 19

Ruler Experiment

Page 5: Variance and Standard Deviation (3) Frequency Distributions.

Ruler Experiment - Mean

Boys

Reaction (cm) Tally Frequency Mid Point Calculation Total

0 <= R < 5 17 2.5 2.5 x 17 42.5

5 <= R < 10 24 7.5 7.5 x 24 180

10 <= R < 15 16 12.5 12.5 x 16 200

15 <= R < 20 10 17.5 17.5 x 10 175

20 <= R < 25 11 22.5 22.5 x 11 247.5

25 <= R < 30 2 27.5 27.5 x 2 55

30 <= R < 35 10 32.5 32.5 x 10 325

Total 90 1225

Mean 13.6

Girls

Reaction (cm) Tally Frequency Mid Point Calculation Total

0 <= R < 5 10 2.5 2.5 x 10 25

5 <= R < 10 7 7.5 7.5 x 7 52.5

10 <= R < 15 14 12.5 12.5 x 14 175

15 <= R < 20 21 17.5 17.5 x 21 367.5

20 <= R < 25 6 22.5 22.5 x 6 135

25 <= R < 30 9 27.5 27.5 x 9 247.5

30 <= R < 35 3 32.5 32.5 x 3 97.5

Total 70 1100

Mean 15.7

Estimate of mean = total (based on mid-points) / total frequency
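The two tables above can be reproduced with a short function. This is a sketch only: `grouped_mean` and the `(lower, upper, frequency)` layout are assumptions made for illustration, while the class boundaries and frequencies are taken from the tables.

```python
def grouped_mean(classes):
    """Estimate the mean from (lower, upper, frequency) classes using mid-points."""
    n = sum(f for _, _, f in classes)
    total = sum((lower + upper) / 2 * f for lower, upper, f in classes)
    return total / n


# Class boundaries (cm) and frequencies from the Ruler Experiment tables.
boys = [(0, 5, 17), (5, 10, 24), (10, 15, 16), (15, 20, 10),
        (20, 25, 11), (25, 30, 2), (30, 35, 10)]
girls = [(0, 5, 10), (5, 10, 7), (10, 15, 14), (15, 20, 21),
         (20, 25, 6), (25, 30, 9), (30, 35, 3)]

print(round(grouped_mean(boys), 1))   # 1225 / 90
print(round(grouped_mean(girls), 1))  # 1100 / 70
```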

Page 6: Variance and Standard Deviation (3) Frequency Distributions.

Mean within Frequency Distributions

Within a frequency distribution, the mean is defined as ...

$\bar{x} = \dfrac{\sum x_i f_i}{n}$

where $f_i$ means frequency and $n = \sum f_i$ is the total frequency.

Where data is provided in ranges, the $x_i$ values are the mid-points of the ranges.

This gives an estimate of the mean, since it assumes that values are evenly distributed within each range.
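To see why the grouped figure is only an estimate, compare it with the exact mean of some raw values. The data below is hypothetical, invented purely for illustration, and the class width of 5 simply mirrors the ruler tables.

```python
# Hypothetical raw measurements (cm), invented purely for illustration.
raw = [1, 2, 4, 6, 7, 8, 12, 13, 14, 14]

exact_mean = sum(raw) / len(raw)

# Group the values into classes 0 <= R < 5, 5 <= R < 10, 10 <= R < 15.
width = 5
freq = {}
for value in raw:
    lower = (value // width) * width          # lower boundary of the class
    freq[lower] = freq.get(lower, 0) + 1

# Mid-point estimate: each value is treated as if it sat at the class mid-point.
estimated_mean = sum((lower + width / 2) * f for lower, f in freq.items()) / len(raw)

print(exact_mean)       # mean of the raw values
print(estimated_mean)   # mean estimated from the grouped table
```

The two results are close but not equal, because the values are not spread perfectly evenly within each class.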

Page 7: Variance and Standard Deviation (3) Frequency Distributions.

Standard Deviation with Frequency Distributions

Previously, we arrived at the formula:

$s_x = \sqrt{\dfrac{\sum x_i^2}{n} - \bar{x}^2}$

… now $\bar{x} = \dfrac{\sum x_i f_i}{n}$, where $f_i$ means frequency

The $\sum x_i^2$ part can also be calculated from the tables

Page 8: Variance and Standard Deviation (3) Frequency Distributions.

Boys

Reaction (cm) Tally Frequency Mid Point Mid-Point Squared Mid-Point Squared x Frequency

0 <= R < 5 17 2.5 6.25 106.25

5 <= R < 10 24 7.5 56.25 1350

10 <= R < 15 16 12.5 156.25 2500

15 <= R < 20 10 17.5 306.25 3062.5

20 <= R < 25 11 22.5 506.25 5568.75

25 <= R < 30 2 27.5 756.25 1512.5

30 <= R < 35 10 32.5 1056.25 10562.5

Total 90 Sum 24662.5

Girls

Reaction (cm) Tally Frequency Mid Point Mid-Point Squared Mid-Point Squared x Frequency

0 <= R < 5 10 2.5 6.25 62.5

5 <= R < 10 7 7.5 56.25 393.75

10 <= R < 15 14 12.5 156.25 2187.5

15 <= R < 20 21 17.5 306.25 6431.25

20 <= R < 25 6 22.5 506.25 3037.5

25 <= R < 30 9 27.5 756.25 6806.25

30 <= R < 35 3 32.5 1056.25 3168.75

Total 70 Sum 22087.5

Page 9: Variance and Standard Deviation (3) Frequency Distributions.

Standard Deviation with Frequency Distributions

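Previously, we arrived at the formula:

$s_x = \sqrt{\dfrac{\sum x_i^2}{n} - \bar{x}^2}$

… with a frequency distribution, it becomes

$s_x = \sqrt{\dfrac{\sum x_i^2 f_i}{n} - \bar{x}^2}$

… where $\bar{x} = \dfrac{\sum x_i f_i}{n}$ and $f_i$ means frequency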

Page 10: Variance and Standard Deviation (3) Frequency Distributions.


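$s_x = \sqrt{\dfrac{\sum x_i^2 f_i}{n} - \bar{x}^2}$

Boys: $\sum x_i^2 f_i = 24662.5$, $\bar{x} = 13.6$, $n = 90$

$s_x = \sqrt{\dfrac{24662.5}{90} - 13.6^2} = 9.44$

Girls: $\sum x_i^2 f_i = 22087.5$, $\bar{x} = 15.7$, $n = 70$

$s_x = \sqrt{\dfrac{22087.5}{70} - 15.7^2} = 8.31$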
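The same calculation can be sketched in code. `grouped_sd` is an illustrative name, not something from the slides; its optional `mean` argument lets the calculation use the mean rounded to one decimal place, as the working above does, or the unrounded mean.

```python
import math


def grouped_sd(classes, mean=None):
    """Estimate the SD from (lower, upper, frequency) classes using mid-points.

    If `mean` is given (for example a rounded value), it is used as-is;
    otherwise the mid-point estimate of the mean is computed first.
    """
    n = sum(f for _, _, f in classes)
    mids = [((lower + upper) / 2, f) for lower, upper, f in classes]
    if mean is None:
        mean = sum(x * f for x, f in mids) / n
    sum_x2f = sum(x * x * f for x, f in mids)   # sum of x_i^2 * f_i
    return math.sqrt(sum_x2f / n - mean ** 2)


boys = [(0, 5, 17), (5, 10, 24), (10, 15, 16), (15, 20, 10),
        (20, 25, 11), (25, 30, 2), (30, 35, 10)]
girls = [(0, 5, 10), (5, 10, 7), (10, 15, 14), (15, 20, 21),
         (20, 25, 6), (25, 30, 9), (30, 35, 3)]

# With the means rounded to 1 d.p., as in the working above.
print(round(grouped_sd(boys, mean=13.6), 2), round(grouped_sd(girls, mean=15.7), 2))

# With unrounded means the answers shift slightly in the second decimal place.
print(round(grouped_sd(boys), 2), round(grouped_sd(girls), 2))
```

The small difference between the two lines comes only from rounding the mean before squaring it, so it is worth keeping extra decimal places until the final step.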

Page 11: Variance and Standard Deviation (3) Frequency Distributions.

$s_x = \sqrt{\dfrac{\sum x_i^2 f_i}{n} - \bar{x}^2}$

Mean = 47 / 20 = 2.35

$s_x = \sqrt{\dfrac{167}{20} - 2.35^2} = \sqrt{8.35 - 5.5225} = \sqrt{2.8275} = 1.68$

Visits to doctors (x)   Frequency   Calculation   Total visits   x squared   Freq times x squared
0                       3           0 x 3         0              0           0
1                       4           1 x 4         4              1           4
2                       4           2 x 4         8              4           16
3                       5           3 x 5         15             9           45
4                       1           4 x 1         4              16          16
5                       2           5 x 2         10             25          50
6                       1           6 x 1         6              36          36
TOTAL                   20                        47                         167
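Because the visits data is not grouped into ranges, the frequency-table answer should match the standard deviation of the raw list exactly. A minimal sketch of that cross-check, assuming the 20 raw values given earlier for the visits to the doctors; the variable names are illustrative.

```python
import math
import statistics
from collections import Counter

# Raw visits data (20 people), as listed earlier in the slides.
visits = [0, 1, 5, 2, 1, 5, 3, 3, 2, 4, 3, 6, 2, 3, 1, 0, 1, 0, 3, 2]

freq = Counter(visits)
n = sum(freq.values())
sum_xf = sum(x * f for x, f in freq.items())        # 47 in the table
sum_x2f = sum(x * x * f for x, f in freq.items())   # 167 in the table

mean = sum_xf / n
sd_from_table = math.sqrt(sum_x2f / n - mean ** 2)

print(sum_xf, sum_x2f)
print(round(sd_from_table, 2))
print(round(statistics.pstdev(visits), 2))   # same data, library calculation
```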

Page 12: Variance and Standard Deviation (3) Frequency Distributions.

The right average? In a 5 person office:

The boss makes 50K
The 2 secretaries make 14K
The sales rep makes 25K
The trainee sales rep gets 16K

Pay in order: 14, 14, 16, 25, 50

The median pay is 16K

The modal pay is 14K

The mean pay is 119K / 5 = 23.8K

… which represents the ‘best average’?

The boss says “on average you earn over 23K in my office”

The sales rep says “on average you only get 16K in my office”
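The three averages for the office can be checked with Python's standard `statistics` module. A small sketch; the list simply holds the five salaries in thousands.

```python
import statistics

# Salaries in the 5 person office, in thousands (K).
pay = [50, 14, 14, 25, 16]

print(sorted(pay))               # ordered list used to find the median
print(statistics.median(pay))    # middle value
print(statistics.mode(pay))      # most common value
print(statistics.mean(pay))      # total divided by the number of people
```

The boss is quoting the mean and the sales rep is quoting the median. Both are correct 'averages' of the same data.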

Page 13: Variance and Standard Deviation (3) Frequency Distributions.

Boys

Reaction (cm) Tally Frequency

0 <= R < 5 17

5 <= R < 10 24

10 <= R < 15 16

15 <= R < 20 10

20 <= R < 25 11

25 <= R < 30 2

Missed 10

Total 90

Girls

Reaction (cm) Tally Frequency

0 <= R < 5 9

5 <= R < 10 7

10 <= R < 15 14

15 <= R < 20 18

20 <= R < 25 6

25 <= R < 30 9

Missed 7

Total 70

Boys | Girls (all values in cm)

23 miss 1 miss 13 10 16 30 246 5 2 23 3 miss 16 26 207 2 3 miss 20 15 18 miss 218 12 7 25 6 1 4 27 131 6 8 miss miss 11 19 15 miss8 8 miss 15 23 14 18 miss miss7 10 2 miss 16 14 12 14 miss3 19 miss 17 11 20 3 25 218 2 26 16 11 7 11 10 261 6 3 10 13 10 6 17 1213 17 23 1 25 26 211 10 28 3 16 15 311 8 23 14 17 15 2710 4 24 15 12 12 2510 2 7 19 4 1 1923 23 8 22 29 17 287 13 6 12 5 30 821 18 11 14 10 16 18

miss 19 9 10 18 17 23miss 9 12 4 miss 20 19

Suppose this had been our experiment

We cannot calculate a mean and standard deviation, since not all data values are known

Mean, median, spread?

Page 14: Variance and Standard Deviation (3) Frequency Distributions.

[Figure: Reaction Distances. Cumulative frequency curves for Boys and Girls; x-axis: cm (0 to 35), y-axis: cumulative frequency (0 to 90).]

You can still estimate the median and inter-quartile ranges

90 boys tested

70 girls tested

Median boy = 11 cm

Median girl = 17 cm

Page 15: Variance and Standard Deviation (3) Frequency Distributions.

[Figure: Reaction Distances. Cumulative frequency curves for Boys and Girls; x-axis: cm (0 to 35), y-axis: cumulative frequency (0 to 90).]

You can still estimate the median and inter-quartile ranges

90 boys tested

70 girls tested

Boy IQR = 20 - 6 = 14 cm

Girl IQR = 25 - 11 = 14 cm
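Reading the median and quartiles off a cumulative frequency curve can be imitated with straight-line interpolation between the plotted points. The sketch below assumes the frequency tables that include the 'Missed' row; `grouped_quantile` is an illustrative name, and its answers will differ slightly from values read by eye from a smooth hand-drawn curve.

```python
def grouped_quantile(bounds, freqs, n_total, q):
    """Estimate the q-th quantile (0 < q < 1) from grouped data.

    bounds:  class boundaries, one more entry than freqs
    freqs:   frequency of each recorded class
    n_total: everyone tested, including unrecorded 'missed' values
    Uses linear interpolation along the cumulative frequency polygon.
    """
    target = q * n_total
    cumulative = 0
    for lower, upper, f in zip(bounds, bounds[1:], freqs):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("quantile falls among the unrecorded (missed) values")


bounds = [0, 5, 10, 15, 20, 25, 30]     # cm
boys = [17, 24, 16, 10, 11, 2]          # 10 more boys missed, 90 tested
girls = [9, 7, 14, 18, 6, 9]            # 7 more girls missed, 70 tested

boy_median = grouped_quantile(bounds, boys, 90, 0.5)
boy_iqr = grouped_quantile(bounds, boys, 90, 0.75) - grouped_quantile(bounds, boys, 90, 0.25)
girl_median = grouped_quantile(bounds, girls, 70, 0.5)

print(round(boy_median, 1), round(boy_iqr, 1), round(girl_median, 1))
```

The missed values only need to be counted, not measured, because the median and quartiles both fall inside the recorded classes.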

Page 16: Variance and Standard Deviation (3) Frequency Distributions.

Pros and Cons of different averages (mean and median) and measures of spread (inter-quartile range and standard deviation)

•Median and inter-quartile range are unaffected by extreme values

•therefore the most suitable measures when extreme values occur

•Median and inter-quartile range can be calculated with some data missing (in the end ranges)

•Mean and standard deviation include all values

•Mean and standard deviation are more ‘sensitive’ measures

•they provide a better picture of the whole data

•You can therefore choose the values that bias the interpretation in your favour!

Page 17: Variance and Standard Deviation (3) Frequency Distributions.

"There are three kinds of lies: lies,

damned lies and statistics."

Mark Twain

Page 18: Variance and Standard Deviation (3) Frequency Distributions.

Activity

Page 29 of your Statistics 1 book. Read and make a memory map

Page 19: Variance and Standard Deviation (3) Frequency Distributions.

Puzzle Solving - cumulative frequency

Time, t           Frequency   Time, t     Cumulative Frequency
0 < t <= 20       6           t <= 20     6
20 < t <= 40      13          t <= 40     19
40 < t <= 60      23          t <= 60     42
60 < t <= 80      18          t <= 80     60
80 < t <= 100     10          t <= 100    70
100 < t <= 120    6           t <= 120    76
120 < t <= 140    4           t <= 140    80

[Figure: Puzzle Solving - cumulative frequency. Cumulative frequency curve; x-axis: time t (0 to 150), y-axis: cumulative frequency CF (0 to 90).]

Median = 58 sec

IQR = 80 - 41 = 39 sec

Page 20: Variance and Standard Deviation (3) Frequency Distributions.

Puzzle Solving - cumulative frequency

Time, t           Frequency   Time, t     Cumulative Frequency
0 < t <= 20       6           t <= 20     6
20 < t <= 40      13          t <= 40     19
40 < t <= 60      23          t <= 60     42
60 < t <= 80      18          t <= 80     60
80 < t <= 100     10          t <= 100    70
100 < t <= 120    6           t <= 120    76
120 < t <= 150    4           t <= 150    80

[Figure: Puzzle Solving - cumulative frequency. Cumulative frequency curve; x-axis: time t (0 to 160), y-axis: cumulative frequency CF (0 to 90).]

Median = 58 sec

IQR = 80 - 41 = 39 sec

Median and IQR are unaffected by a change in the upper range
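This can be checked by interpolating on the cumulative frequencies with each version of the last class. A sketch under the assumption that straight lines join the plotted points; `quantile_from_cf` is an illustrative name, and times are in seconds.

```python
def quantile_from_cf(upper_bounds, cum_freqs, q, start=0):
    """Interpolate the q-th quantile on a cumulative frequency polygon."""
    target = q * cum_freqs[-1]
    prev_bound, prev_cf = start, 0
    for bound, cf in zip(upper_bounds, cum_freqs):
        if cf >= target and cf > prev_cf:
            fraction = (target - prev_cf) / (cf - prev_cf)
            return prev_bound + fraction * (bound - prev_bound)
        prev_bound, prev_cf = bound, cf
    raise ValueError("q must be between 0 and 1")


cum_freqs = [6, 19, 42, 60, 70, 76, 80]
upper_140 = [20, 40, 60, 80, 100, 120, 140]   # last class 120 < t <= 140
upper_150 = [20, 40, 60, 80, 100, 120, 150]   # last class 120 < t <= 150

for upper in (upper_140, upper_150):
    median = quantile_from_cf(upper, cum_freqs, 0.5)
    iqr = quantile_from_cf(upper, cum_freqs, 0.75) - quantile_from_cf(upper, cum_freqs, 0.25)
    print(round(median), round(iqr))
```

Both passes print the same pair of numbers, because the median and both quartiles lie below 120 and never touch the class that changed.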

Page 21: Variance and Standard Deviation (3) Frequency Distributions.

Time, t Frequency Mid point Freq x Mid-pt

0 < t <= 20 6 10 60

20 < t <= 40 13 30 390

40 < t <= 60 23 50 1150

60 < t <= 80 18 70 1260

80 < t <= 100 10 90 900

100 < t <= 120 6 110 660

120 < t <= 140 4 130 520

TOTAL 80 4940

Estimate of mean = 4940 / 80 = 61.75 sec

Page 22: Variance and Standard Deviation (3) Frequency Distributions.

$s_x = \sqrt{\dfrac{\sum x_i^2 f_i}{n} - \bar{x}^2}$

Time, t           Frequency   Mid point   Mid-Point Squared   Mid-Point Squared x Frequency
0 < t <= 20       6           10          100                 600
20 < t <= 40      13          30          900                 11700
40 < t <= 60      23          50          2500                57500
60 < t <= 80      18          70          4900                88200
80 < t <= 100     10          90          8100                81000
100 < t <= 120    6           110         12100               72600
120 < t <= 140    4           130         16900               67600
TOTAL             80                                          379200

$\bar{x} = 61.75$

$s_x = \sqrt{\dfrac{379200}{80} - 61.75^2} = 30.4$

Page 23: Variance and Standard Deviation (3) Frequency Distributions.

$s_x = \sqrt{\dfrac{\sum x_i^2 f_i}{n} - \bar{x}^2}$

Time, t           Frequency   Mid point   Freq x Mid-pt   Mid-Point Squared   Mid-Point Squared x Frequency
0 < t <= 20       6           10          60              100                 600
20 < t <= 40      13          30          390             900                 11700
40 < t <= 60      23          50          1150            2500                57500
60 < t <= 80      18          70          1260            4900                88200
80 < t <= 100     10          90          900             8100                81000
100 < t <= 120    6           110         660             12100               72600
120 < t <= 150    4           135         540             18225               72900
TOTAL             80                      4960                                384500

$\bar{x} = 62.00$ (61.75)

$s_x = \sqrt{\dfrac{384500}{80} - 62.00^2} = 31.0$ (30.4)

Mean 62.00, SD 31.0

Mean and SD are changed slightly by a change in the upper range
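The effect of the upper range on the mean and SD can be recomputed directly. This sketch squares each mid-point first and then multiplies by its frequency, which gives a standard deviation of roughly 30 sec for times that all lie between 0 and 150 sec. The function name and data layout are illustrative assumptions.

```python
import math


def grouped_mean_sd(classes):
    """Mid-point estimates of mean and SD from (lower, upper, frequency) classes."""
    n = sum(f for _, _, f in classes)
    mids = [((lower + upper) / 2, f) for lower, upper, f in classes]
    mean = sum(x * f for x, f in mids) / n
    sum_x2f = sum(x * x * f for x, f in mids)   # square the mid-point, then times f
    return mean, math.sqrt(sum_x2f / n - mean ** 2)


common = [(0, 20, 6), (20, 40, 13), (40, 60, 23), (60, 80, 18),
          (80, 100, 10), (100, 120, 6)]
upper_140 = common + [(120, 140, 4)]   # last class 120 < t <= 140
upper_150 = common + [(120, 150, 4)]   # last class 120 < t <= 150

mean_140, sd_140 = grouped_mean_sd(upper_140)
mean_150, sd_150 = grouped_mean_sd(upper_150)

print(round(mean_140, 2), round(sd_140, 1))
print(round(mean_150, 2), round(sd_150, 1))
```

Only the last mid-point changes (130 to 135), yet both the mean and the SD move, because every class contributes to them.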