Histograms, Frequency Distributions and Related Topics
These are constructions that will allow us to represent large sets of data in ways that may be more meaningful to the reader.
Histograms provide a graphical representation of data with bars whose heights indicate the number of data values in a certain range.
A frequency table shows the distribution of data in classes (intervals). The classes are constructed so that each data value falls into exactly one class, and the class frequency is the number of data values in the class.
How long does the 1161 mile Iditarod take? (p. 47, problem 7).
261 271 236 244 279 296 284 299 288 288 247 256
338 360 341 333 261 266 287 296 313 311 307 307
299 303 277 283 304 305 288 290 288 289 297 299
332 330 309 328 307 328 285 291 295 298 306 315
310 318 318 320 333 321 323 324 327
Can you easily see what the maximum and minimum times are?
Is it easy to tell how the times are distributed?
In a frequency table, divide the data range into classes of equal width. To find the class width:
First compute: (largest value - smallest value) / (desired number of classes)
Increase the computed value to the next highest whole number, even if it was already a whole number. This ensures that the classes cover the data.
The lower class limit of a class is the lowest data value that can fit into the class; the upper class limit is the highest data value that can fit into the class. The class width is the difference between the lower class limits of adjacent classes.
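The class-width rule can be sketched in a few lines of Python. This is only an illustration: the function name class_width is made up here, and the integer arithmetic assumes whole-number data such as the race times above.

```python
def class_width(values, num_classes):
    """Class width for whole-number data: divide the range by the number
    of classes, then move up to the next whole number."""
    data_range = max(values) - min(values)
    # Floor division followed by +1 gives the next whole number strictly
    # above the quotient, so an exact whole-number quotient is bumped up too.
    return data_range // num_classes + 1


# Only the smallest and largest values affect the result.
print(class_width([236, 360], 5))   # (360 - 236) / 5 = 24.8, so 25
print(class_width([100, 208], 8))   # (208 - 100) / 8 = 13.5, so 14
```

For data with decimals the same idea applies, but the rounding would have to be done with care for floating-point error.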
Class Boundaries
Class boundaries cannot belong to any class. The class boundary between two adjacent classes is the midpoint between the upper limit of the lower class and the lower limit of the higher class.
The difference between the upper and lower boundaries of a given class is the class width.
The midpoint of a class (class mark) is the average of its upper and lower boundaries, which is also the average of its upper and lower limits.
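As a rough sketch of these definitions, the helpers below build class limits, boundaries and marks. The function names are hypothetical, and the code assumes whole-number data, so that adjacent limits differ by 1 and each boundary lies 0.5 away from the limits.

```python
def class_limits(lowest, width, num_classes):
    """(lower limit, upper limit) pairs for whole-number data."""
    return [(lowest + i * width, lowest + (i + 1) * width - 1)
            for i in range(num_classes)]


def boundaries_and_mark(lower, upper):
    """Lower boundary, upper boundary and class mark for one class."""
    # With whole-number data, the midpoint between one class's upper limit
    # and the next class's lower limit is half a unit beyond each limit.
    return lower - 0.5, upper + 0.5, (lower + upper) / 2


for lower, upper in class_limits(236, 25, 5):
    print(lower, upper, *boundaries_and_mark(lower, upper))
```

The difference between the two boundaries of each class comes out equal to the class width (25 in this call), as stated above.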
It is easier to make the histogram if the data is sorted:
236 244 247 256 261 261 266 271 277 279 283 284
285 287 288 288 288 288 289 290 291 295 296 296
297 298 299 299 299 303 304 305 306 307 307 307
309 310 311 313 315 318 318 320 321 323 324 327
328 328 330 332 333 333 338 341 360
The class width is computed as (360-236)/5 which is 24.8. Hence the class width is 25.
Lower Limit  Upper Limit  Lower Boundary  Upper Boundary  Mark  Frequency
236          260          235.5           260.5           248   4
261          285          260.5           285.5           273   9
286          310          285.5           310.5           298   25
311          335          310.5           335.5           323   16
336          360          335.5           360.5           348   3
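The frequency column can be checked by tallying the race times directly. The sketch below is one way to do it, assuming whole-number data; the variable names are illustrative.

```python
times = [
    261, 271, 236, 244, 279, 296, 284, 299, 288, 288, 247, 256,
    338, 360, 341, 333, 261, 266, 287, 296, 313, 311, 307, 307,
    299, 303, 277, 283, 304, 305, 288, 290, 288, 289, 297, 299,
    332, 330, 309, 328, 307, 328, 285, 291, 295, 298, 306, 315,
    310, 318, 318, 320, 333, 321, 323, 324, 327,
]
lowest, width, num_classes = 236, 25, 5

# Each value belongs to exactly one class: the class index is the number
# of whole class widths between the value and the first lower limit.
freqs = [0] * num_classes
for t in times:
    freqs[(t - lowest) // width] += 1

for i, f in enumerate(freqs):
    lower = lowest + i * width
    print(lower, lower + width - 1, f)
```

The tallies agree with the table: 4, 9, 25, 16 and 3, for a total of 57 times.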
Histograms
A histogram is a bar graph that can be constructed using a frequency table:
Put the class boundaries on the horizontal axis.
The bars have the same width and always touch, and the edges of the bars are on class boundaries.
The height of each bar is the class frequency.
Histogram for Iditarod Data
[Figure: "Time to Complete Iditarod". Horizontal axis: Hours, marked at the class boundaries 235.5, 260.5, 285.5, 310.5, 335.5, 360.5. Vertical axis: Frequency, 0 to 30.]
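Where no plotting tool is at hand, a rough text version of the histogram can be printed from the frequency table. This is a sketch only: the bars run sideways rather than upward, and the function name text_histogram is made up for this example.

```python
def text_histogram(boundaries, freqs):
    """One line per class: the class boundaries, then a bar of '#'
    characters whose length is the class frequency."""
    lines = []
    for low, high, f in zip(boundaries, boundaries[1:], freqs):
        lines.append(f"{low:6.1f} - {high:6.1f} | {'#' * f} {f}")
    return lines


boundaries = [235.5, 260.5, 285.5, 310.5, 335.5, 360.5]
freqs = [4, 9, 25, 16, 3]
for line in text_histogram(boundaries, freqs):
    print(line)
```

Each bar spans one class from its lower boundary to its upper boundary, so consecutive bars share an edge, just as the bars of a drawn histogram touch.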
Relative Frequencies
The relative frequency of a class is f/n, where f is the frequency of the class and n is the total of all frequencies.
Relative frequency tables are like frequency tables, except that the relative frequency of each class is given.
Relative frequency histograms are like frequency histograms, except that the heights of the bars represent relative frequencies.
Systolic blood pressures of 50 subjects. Make a histogram with 8 classes.
100 102 104 108 108 110 110 112 112 112
115 116 116 118 118 118 118 120 120 126
126 126 128 128 128 130 130 130 130 130
132 132 134 134 136 136 138 140 140 146
148 152 152 152 156 160 190 200 208 208
Systolic blood pressures of 50 subjects: class width = (208-100)/8 = 13.5, thus use 14.
Lower Boundary  Upper Boundary  Lower Limit  Upper Limit  Mark   Frequency  Relative Frequency  Cumulative Frequency
99.5            113.5           100          113          106.5  10         0.20                10
113.5           127.5           114          127          120.5  12         0.24                22
127.5           141.5           128          141          134.5  17         0.34                39
141.5           155.5           142          155          148.5  5          0.10                44
155.5           169.5           156          169          162.5  2          0.04                46
169.5           183.5           170          183          176.5  0          0.00                46
183.5           197.5           184          197          190.5  1          0.02                47
197.5           211.5           198          211          204.5  3          0.06                50
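The frequency, relative frequency and cumulative frequency columns can be reproduced from the raw pressures. The following is a minimal sketch, assuming whole-number data and the class width of 14 found above; the variable names are illustrative.

```python
from itertools import accumulate

pressures = [
    100, 102, 104, 108, 108, 110, 110, 112, 112, 112,
    115, 116, 116, 118, 118, 118, 118, 120, 120, 126,
    126, 126, 128, 128, 128, 130, 130, 130, 130, 130,
    132, 132, 134, 134, 136, 136, 138, 140, 140, 146,
    148, 152, 152, 152, 156, 160, 190, 200, 208, 208,
]
lowest, width, num_classes = 100, 14, 8

# Tally each value into the class whose limits contain it.
freqs = [0] * num_classes
for p in pressures:
    freqs[(p - lowest) // width] += 1

n = sum(freqs)
rel_freqs = [f / n for f in freqs]      # relative frequency f/n
cum_freqs = list(accumulate(freqs))     # running total of frequencies

for f, r, c in zip(freqs, rel_freqs, cum_freqs):
    print(f, f"{r:.2f}", c)
```

The last cumulative frequency equals n (50 here), and the relative frequencies add up to 1, which makes a quick check on any table of this kind.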
Frequency Histogram for Blood Pressure Data
[Figure: Histogram. Horizontal axis: Systolic Blood Pressure, marked at the class boundaries 99.5, 113.5, 127.5, 141.5, 155.5, 169.5, 183.5, 197.5, 211.5. Vertical axis: Frequency, 0 to 18.]
Relative Frequency Histogram for Blood Pressure Data
[Figure: Relative Frequency Histogram. Horizontal axis: Systolic Pressure, marked at the class boundaries 99.5, 113.5, 127.5, 141.5, 155.5, 169.5, 183.5, 197.5, 211.5. Vertical axis: Relative Frequency, 0 to 0.4.]
Cumulative Frequencies & Ogives
The cumulative frequency of a class is the frequency of the class plus the frequencies for all previous classes.
An ogive is a line graph that displays cumulative frequencies.
Constructing Ogives
Make a frequency table showing class boundaries and cumulative frequencies.
For each class, put a dot over the upper class boundary at the height of the cumulative class frequency.
Place a dot on the horizontal axis at the lower class boundary of the first class.
Connect the dots.
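The steps above can be sketched as a function that returns the dots of the ogive. The name ogive_points is hypothetical, and the example uses the blood pressure class boundaries and frequencies.

```python
from itertools import accumulate


def ogive_points(boundaries, freqs):
    """(x, y) dots for an ogive: height 0 at the first lower boundary,
    then the cumulative frequency over each upper class boundary."""
    cumulative = [0] + list(accumulate(freqs))
    return list(zip(boundaries, cumulative))


boundaries = [99.5, 113.5, 127.5, 141.5, 155.5, 169.5, 183.5, 197.5, 211.5]
freqs = [10, 12, 17, 5, 2, 0, 1, 3]
for x, y in ogive_points(boundaries, freqs):
    print(x, y)
```

Joining consecutive dots with straight line segments in any plotting tool then gives the ogive.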
Ogive for Blood Pressure Data
[Figure: "Blood Pressures of 50 Subjects". Horizontal axis: Systolic Pressure, labeled at 99.5, 127.5, 155.5, 183.5, 211.5. Vertical axis: Cumulative Frequency, 0 to 60.]
(a) What number, and what percentage, of winning times are under 2:07.15?
(b) Estimate the number, and the percentage, of winning times between 2:05.15 and 2:11.15.
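[Figure: Ogive, "Winning Times for Kentucky Derby". Horizontal axis: Seconds over 2 Minutes, at the class boundaries -0.85, 1.15, 3.15, 5.15, 7.15, 9.15, 11.15, 13.15. Vertical axis: Cumulative Frequency, 0 to 120. Data labels on the points, left to right: 0, 12, 48, 75, 85, 94, 100, 101.]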
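Questions (a) and (b) can be worked from the cumulative counts labeled on the Kentucky Derby ogive. The sketch below assumes those labels are 0, 12, 48, 75, 85, 94, 100 and 101, paired left to right with the class boundaries; that pairing is a reading of the figure, not something stated in the text.

```python
# Class boundaries in seconds over 2 minutes, and the cumulative
# frequency read off the ogive at each boundary (assumed pairing).
boundaries = [-0.85, 1.15, 3.15, 5.15, 7.15, 9.15, 11.15, 13.15]
cumulative = [0, 12, 48, 75, 85, 94, 100, 101]


def count_below(boundary):
    """Number of winning times below a given class boundary."""
    return cumulative[boundaries.index(boundary)]


total = cumulative[-1]

# (a) Times under 2:07.15, i.e. below 7.15 seconds over 2 minutes.
under = count_below(7.15)
print(under, round(100 * under / total, 1))        # 85 84.2

# (b) Times between 2:05.15 and 2:11.15: a difference of two readings.
between = count_below(11.15) - count_below(5.15)
print(between, round(100 * between / total, 1))    # 25 24.8
```

A value that falls between two boundaries would have to be estimated from the line segment joining the neighboring dots; this lookup only handles the boundaries themselves.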
Distribution Shapes
Symmetrical
Uniform (it has a rectangular histogram)
Skewed left – the longer tail is on the left side.
Skewed right – the longer tail is on the right side.
Bimodal (the two classes with the largest frequencies are separated by at least one class)
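The bimodal criterion is mechanical enough to check in code. The function below is a rough sketch under that one definition (the name is_bimodal is made up, and ties between frequencies are resolved in favor of the earlier class); it does not attempt to detect the other shapes.

```python
def is_bimodal(freqs):
    """True if the two classes with the largest frequencies are
    separated by at least one class."""
    # Class indices ordered from largest frequency to smallest.
    order = sorted(range(len(freqs)), key=lambda i: freqs[i], reverse=True)
    first, second = order[0], order[1]
    # Adjacent classes differ by 1 in index, so a gap of at least one
    # class means the indices differ by 2 or more.
    return abs(first - second) >= 2


print(is_bimodal([10, 12, 17, 5, 2, 0, 1, 3]))  # blood pressure frequencies
print(is_bimodal([4, 9, 25, 16, 3]))            # Iditarod frequencies
print(is_bimodal([2, 9, 3, 1, 8, 2]))           # made-up frequencies with two peaks
```

In both data sets from this section the two largest classes are adjacent, so neither is bimodal by this test; only the made-up third list is.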