Understanding Basic Statistics Chapter One Organizing Data.
Understanding Basic Statistics
Chapter One
Organizing Data
Statistics is
The study of how to:
• collect
• organize
• analyze
• interpret
numerical information from data
Types of Data
• Quantitative data are numerical measurements. Example: number of siblings
• Qualitative data involve non-numerical observations. Example: brand of computer
Population
all measurements or observations of interest
Example: incomes of all residents of a county
Sample
part of a population used to represent the population
Example: incomes of selected residents
Methods of Producing Data
• Sampling: drawing subsets from the population
• Experimentation: impose a change and measure the result
• Simulation: numerical facsimile of real-world phenomena
• Census: using measurements from entire population
Potential Problems
• Strong opinions may be overrepresented if responses are voluntary.
• A hidden bias may exist in the way data is collected.
• There may be hidden effects of other variables.
• There is no guarantee that results can be generalized.
Levels of Measurement
• Nominal
• Ordinal
• Interval
• Ratio
Nominal Measurement
Data is put into categories only.
Example: eye color
Ordinal Measurement
Data can be ordered. Differences cannot be calculated or interpreted.
Example: class rank
Interval Measurement
Data can be ordered. Differences between data values can be compared.
Example: temperature in degrees Celsius or Fahrenheit
Ratio Measurement
Data can be ordered. Differences and ratios between data values can be compared.
Example: time
Simple Random Sample of n measurements:
• every sample of size n has equal chance of being selected
• every item in the population has equal chance of being included
Not random sampling:
• asking for volunteers to respond to a survey
• choosing the first five customers in a store
Random sampling:
• drawing names “from a hat”
• using a random number table to select sample
• using a random number generator
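The random number generator approach can be sketched in a few lines of Python. This is only an illustration: the function name simple_random_sample, the seed argument, and the population of account numbers 1 through 100 are assumptions made for the example, not part of the slides.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct items so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # the seed only makes the draw repeatable
    return rng.sample(list(population), n)

# Hypothetical population: account numbers 1 through 100.
accounts = range(1, 101)
chosen = simple_random_sample(accounts, 10, seed=1)
print(chosen)
```

Sampling without replacement, as random.sample does, matches the idea of drawing names from a hat: no item can be picked twice.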
Sampling techniques
• Simple Random Sampling
• Stratified Sampling
• Systematic Sampling
• Cluster Sampling
• Convenience Sampling
Stratified Sampling
• Population is divided into groups
• Random samples are drawn from each group
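A minimal sketch of the two steps above, assuming the groups are already known. The homeroom names, the student labels, and the function name stratified_sample are hypothetical.

```python
import random

def stratified_sample(strata, n_per_group, seed=None):
    """Draw a simple random sample of n_per_group items from every group."""
    rng = random.Random(seed)
    return {name: rng.sample(list(members), n_per_group)
            for name, members in strata.items()}

# Hypothetical groups: three homerooms of five students each.
homerooms = {
    "Room A": [f"A{i}" for i in range(1, 6)],
    "Room B": [f"B{i}" for i in range(1, 6)],
    "Room C": [f"C{i}" for i in range(1, 6)],
}
picked = stratified_sample(homerooms, 2, seed=0)
print(picked)
```

Every group contributes to the sample, which is what separates this technique from cluster sampling.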
Systematic Sampling
• Population is arranged in sequential order.
• Select a random starting point.
• Select every “kth” item.
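The three steps above might look like this in Python. The list of 40 students and the name systematic_sample are assumptions for illustration, and the random start is taken among the first k positions.

```python
import random

def systematic_sample(ordered_items, k, seed=None):
    """Pick a random start among the first k items, then every kth item after it."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # index 0 .. k-1
    return ordered_items[start::k]

# Hypothetical registration line of 40 students, surveying every eighth person.
line = [f"student{i}" for i in range(1, 41)]
surveyed = systematic_sample(line, 8, seed=3)
print(surveyed)
```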
Cluster Sampling
• Population is divided into sections
• Some sections are randomly selected
• Every item in selected sections is included in sample
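A short sketch of cluster sampling under the same assumptions as the bullets above. The majors, the student labels, and the name cluster_sample are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole sections, then include every item in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [item for name in chosen for item in clusters[name]]
    return chosen, sample

# Hypothetical sections: students grouped by major.
majors = {
    "Biology": ["b1", "b2", "b3"],
    "History": ["h1", "h2"],
    "Math": ["m1", "m2", "m3", "m4"],
    "Physics": ["p1", "p2"],
}
chosen_majors, surveyed = cluster_sample(majors, 2, seed=5)
print(chosen_majors, surveyed)
```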
Convenience Sampling
• Use whatever data is readily available.
• Risk severe bias.
Which sampling technique is described?
College students are waiting in line for registration. Every eighth person in line is surveyed.
Answer: Systematic sampling
Which sampling technique is described?
College students are waiting in line for registration. Students are asked to volunteer to respond to a survey.
Answer: Convenience sampling
Which sampling technique is described?
In a large high school, students from every homeroom are randomly selected to participate in a survey.
Answer: Stratified sampling
Which sampling technique is described?
An accountant uses a random number generator to select ten accounts for audit.
Answer: Simple random sampling
Which sampling technique is described?
To determine students’ opinions of a new registration method, a college randomly selects five majors. All students in the selected majors are surveyed.
Answer: Cluster sampling
Bar Graph
• bars of uniform width
• uniformly spaced
• may be vertical or horizontal
• lengths represent quantities being compared
[Figure: bar graph titled "Reasons for Returns"; categories: Color, Size, Didn't like, Quality; vertical scale from 0 to 60]
Pareto Chart
• tool of quality control
• start with a bar chart
• arrange bars in decreasing order of frequency
• frequently used to investigate causes of problems
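[Figure: Pareto chart titled "Reasons for returns"; bars in decreasing order with heights 50, 20, 10, and 5; categories include Size, Didn't Like, and Quality; vertical scale from 0 to 60]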
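As a rough illustration of the ordering step, the sketch below sorts a set of counts from largest to smallest and prints a text version of the bars. The category names are hypothetical placeholders, the counts reuse the bar heights 50, 20, 10, and 5 from the chart, and pareto_order is an illustrative name.

```python
def pareto_order(counts):
    """Return (category, count) pairs sorted by decreasing count."""
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical category names; the counts are the bar heights from the chart.
returns = {"Reason A": 10, "Reason B": 50, "Reason C": 5, "Reason D": 20}
ordered = pareto_order(returns)
for reason, count in ordered:
    # One '#' per five returns gives a quick text bar.
    print(f"{reason:<10}{'#' * (count // 5)} {count}")
```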
Circle Graph (Pie Chart)
• shows division of whole into component parts
• label parts with appropriate percentages of the whole
[Figure: circle graph titled "Conventions held"; sectors labeled 49%, 27%, 19%, and 5%; legend: Florida, California, Virginia, Texas]
Time Plot
• Shows data values in chronological order
• time on horizontal scale
• other variable on vertical scale
• connect data points with line segments
[Figure: time plot of sales (in thousands of dollars); vertical scale from 0 to 300 in steps of 50]
Histogram
Differences from a bar chart:
• bars touch
• width of bars represents quantity
To construct a histogram from raw data:
• Decide on the number of classes (5 to 15 is customary) and find a convenient class width.
• Organize the data into a frequency table.
• Find the class boundaries and the class midpoints.
• Tally the data and determine the frequency of each class.
• Sketch the histogram.
Computing the class width
1. Compute:
(largest data value – smallest data value) / (desired number of classes)
2. Increase the value computed to the next highest whole number.
Class Width
Raw Data:
10.2 18.7 22.3 20.0
6.3 17.8 17.1 5.0
2.4 7.9 0.3 2.5
8.5 12.5 21.4 16.5
0.4 5.2 4.1 14.3
19.5 22.5 0.0 24.7
11.4
Use 5 classes.
(24.7 – 0.0) / 5 = 4.94
Round class width up to 5.
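The calculation above can be checked with a short Python sketch. The function name class_width is an assumption, and "next highest whole number" is read literally here as floor plus one, which gives 5 for the computed value 4.94.

```python
import math

def class_width(data, n_classes):
    """(largest - smallest) / number of classes, raised to the next whole number."""
    raw = (max(data) - min(data)) / n_classes
    return math.floor(raw) + 1  # literal reading of "next highest whole number"

miles = [10.2, 18.7, 22.3, 20.0, 6.3, 17.8, 17.1, 5.0, 2.4, 7.9, 0.3, 2.5,
         8.5, 12.5, 21.4, 16.5, 0.4, 5.2, 4.1, 14.3, 19.5, 22.5, 0.0, 24.7, 11.4]
width = class_width(miles, 5)
print(width)  # prints 5
```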
Computing Class Width
difference between the lower class limit of one class and the lower class limit of the next class
Finding Class Widths
# of miles     class width
0.0 - 4.9      5
5.0 - 9.9      5
10.0 - 14.9    5
15.0 - 19.9    5
20.0 - 24.9    5
Class Boundaries
(upper limit of one class + lower limit of next class) / 2
Finding Class Boundaries
# of miles     f    class boundaries
0.0 - 4.9      6    -0.05 - 4.95
5.0 - 9.9      5    4.95 - 9.95
10.0 - 14.9    4    9.95 - 14.95
15.0 - 19.9    5    14.95 - 19.95
20.0 - 24.9    5    19.95 - 24.95
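The boundary rule can be sketched in Python as follows. The function name class_boundaries is an assumption, and the outermost boundaries are obtained by extending one class width beyond the first and last interior boundary, which places the lower boundary of the first class at -0.05.

```python
def class_boundaries(limits):
    """limits: (lower, upper) class limits in increasing order."""
    # Interior boundary = (upper limit of one class + lower limit of next) / 2.
    inner = [round((limits[i][1] + limits[i + 1][0]) / 2, 2)
             for i in range(len(limits) - 1)]
    width = limits[1][0] - limits[0][0]  # class width from the lower limits
    edges = ([round(inner[0] - width, 2)] + inner
             + [round(inner[-1] + width, 2)])
    return list(zip(edges[:-1], edges[1:]))

limits = [(0.0, 4.9), (5.0, 9.9), (10.0, 14.9), (15.0, 19.9), (20.0, 24.9)]
boundaries = class_boundaries(limits)
for (lo, hi), (b_lo, b_hi) in zip(limits, boundaries):
    print(f"{lo} - {hi}: {b_lo} - {b_hi}")
```

Rounding to two decimals keeps floating-point noise out of values such as 4.95.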
Computing Class Midpoints
class midpoint = (lower class limit + upper class limit) / 2
Finding Class Midpoints
# of miles     f    class midpoint
0.0 - 4.9      6    2.45
5.0 - 9.9      5    7.45
10.0 - 14.9    4    12.45
15.0 - 19.9    5    17.45
20.0 - 24.9    5    22.45
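The midpoint formula is easy to verify in Python. The function name class_midpoint is an assumption, and the result is rounded to two decimals to suppress floating-point noise.

```python
def class_midpoint(lower, upper):
    """Midpoint of a class: (lower class limit + upper class limit) / 2."""
    return round((lower + upper) / 2, 2)

limits = [(0.0, 4.9), (5.0, 9.9), (10.0, 14.9), (15.0, 19.9), (20.0, 24.9)]
midpoints = [class_midpoint(lo, hi) for lo, hi in limits]
print(midpoints)  # prints [2.45, 7.45, 12.45, 17.45, 22.45]
```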
Tallying the Data
# of miles     tally      frequency
0.0 - 4.9      ||||| |    6
5.0 - 9.9      |||||      5
10.0 - 14.9    ||||       4
15.0 - 19.9    |||||      5
20.0 - 24.9    |||||      5
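The tallying step amounts to counting how many raw data values fall within each pair of class limits. A minimal sketch, with frequency_table as an assumed name:

```python
def frequency_table(data, limits):
    """Count the data values that fall inside each (lower, upper) class."""
    counts = [0] * len(limits)
    for x in data:
        for i, (lo, hi) in enumerate(limits):
            if lo <= x <= hi:
                counts[i] += 1
                break  # each value belongs to exactly one class
    return counts

miles = [10.2, 18.7, 22.3, 20.0, 6.3, 17.8, 17.1, 5.0, 2.4, 7.9, 0.3, 2.5,
         8.5, 12.5, 21.4, 16.5, 0.4, 5.2, 4.1, 14.3, 19.5, 22.5, 0.0, 24.7, 11.4]
limits = [(0.0, 4.9), (5.0, 9.9), (10.0, 14.9), (15.0, 19.9), (20.0, 24.9)]
freqs = frequency_table(miles, limits)
print(freqs)  # prints [6, 5, 4, 5, 5]
```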
Constructing the Histogram
[Figure: histogram of frequency f (vertical scale from 0 to 6) against miles; class boundaries -0.05, 4.95, 9.95, 14.95, 19.95, and 24.95 marked on the horizontal axis]
Grouped Frequency Table
# of miles     f
0.0 - 4.9      6
5.0 - 9.9      5
10.0 - 14.9    4
15.0 - 19.9    5
20.0 - 24.9    5
Class limits: lower - upper
Relative Frequency
relative frequency = f / n = (class frequency) / (total of all frequencies)
First class: f / n = 6 / 25 = 0.24
Second class: f / n = 5 / 25 = 0.20
# of miles f relative frequency
0.0 - 4.9 6 0.24
5.0 - 9.9 5 0.20
10.0 - 14.9 4 0.16
15.0 - 19.9 5 0.20
20.0 - 24.9 5 0.20
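The relative frequency column can be reproduced with a few lines of Python, shown here as a sketch using the class frequencies from the table.

```python
freqs = [6, 5, 4, 5, 5]   # class frequencies f from the table
n = sum(freqs)            # total of all frequencies

# Relative frequency of each class is f / n.
relative = [round(f / n, 2) for f in freqs]
print(n, relative)  # prints 25 [0.24, 0.2, 0.16, 0.2, 0.2]
```

The relative frequencies add up to 1, which is a quick check on the arithmetic.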
Relative Frequency Histogram
[Figure: histogram of relative frequency f/n (vertical scale from 0 to .24 in steps of .04) against miles; class boundaries -0.05, 4.95, 9.95, 14.95, 19.95, and 24.95 marked on the horizontal axis]
Common Shapes of Histograms
Symmetric
When folded vertically, both sides are (more or less) the same.
[Figures: two examples of symmetric histograms]
Uniform
[Figure: uniform histogram]
Non-Symmetric Histograms
These histograms are skewed.
• Skewed left: the longer tail is on the left
• Skewed right: the longer tail is on the right
Bimodal
The two largest rectangles are separated by at least one class.
Stem and Leaf Display
Raw Data:
35, 45, 42, 45, 41, 32, 25, 56, 67, 76, 65, 53, 53, 32, 34, 47, 43, 31
Building the Stem and Leaf Display
• The tens digits 2 through 7 are the stems.
• Each data value is entered in the order given; its ones digit is the leaf, written beside its stem.
First data value = 35: leaf 5 goes beside stem 3.
Second data value = 45: leaf 5 goes beside stem 4.
Third data value = 42: leaf 2 goes beside stem 4.
The remaining values (45, 41, 32, 25, 56, 67, 76, 65, 53, 53, 32, 34, 47, 43, 31) are entered the same way.
Finished Stem and Leaf Display
stem | leaves
2 | 5
3 | 5 2 2 4 1
4 | 5 2 5 1 7 3
5 | 6 3 3
6 | 7 5
7 | 6
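The same display can be built programmatically. The sketch below assumes two-digit whole numbers, so the tens digit is the stem and the ones digit is the leaf; stem_and_leaf is an illustrative name.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group ones digits (leaves) under tens digits (stems), in data order."""
    leaves = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    return {stem: leaves[stem] for stem in sorted(leaves)}

data = [35, 45, 42, 45, 41, 32, 25, 56, 67, 76, 65, 53, 53, 32, 34, 47, 43, 31]
display = stem_and_leaf(data)
for stem, row in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in row))
```

Leaves appear in the order the data were entered, matching the display above; sorting each row would give an ordered stem and leaf display instead.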