Essential Statistics Chapter 1 1
Chapter 1
Picturing Distributions with Graphs
Statistics
Statistics is a science that involves the extraction of information from numerical data obtained during an experiment or from a sample. It involves the design of the experiment or sampling procedure, the collection and analysis of the data, and the making of inferences (statements) about the population based upon information in a sample.
Individuals and Variables
Individuals
– the objects described by a set of data
– may be people, animals, or things
Variable
– any characteristic of an individual
– can take different values for different individuals
Variables
Categorical
– places an individual into one of several groups or categories
Quantitative (Numerical)
– takes numerical values for which arithmetic operations such as adding and averaging make sense
Case Study
The Effect of Hypnosis on the Immune System
Reported in Science News, Sept. 4, 1993, p. 153
Case Study
The Effect of Hypnosis on the Immune System
Objective: To determine if hypnosis strengthens the disease-fighting capacity of immune cells.
Case Study
65 college students
– 33 easily hypnotized
– 32 not easily hypnotized
White blood cell counts measured
All students viewed a brief video about the immune system
Case Study
Students randomly assigned to one of three conditions
– subjects hypnotized, given mental exercise
– subjects relaxed in sensory deprivation tank
– control group (no treatment)
Case Study
White blood cell counts re-measured after one week
The two white blood cell counts are compared for each group
Results
– hypnotized group showed larger jump in white blood cells
– “easily hypnotized” group showed largest immune enhancement
Case Study
Variables measured
Easy or difficult to achieve hypnotic trance (categorical)
Group assignment (categorical)
Pre-study white blood cell count (quantitative)
Post-study white blood cell count (quantitative)
Case Study
Weight Gain Spells Heart Risk for Women
“Weight, weight change, and coronary heart disease in women.” W.C. Willett, et al., vol. 273(6), Journal of the American Medical Association, Feb. 8, 1995.
(Reported in Science News, Feb. 4, 1995, p. 108)
Case Study
Weight Gain Spells Heart Risk for Women
Objective: To recommend a range of body mass index (a function of weight and height) in terms of coronary heart disease (CHD) risk in women.
Case Study
Study started in 1976 with 115,818 women aged 30 to 55 years and without a history of previous CHD.
Each woman’s weight (body mass) was determined.
Each woman was asked her weight at age 18.
Case Study
The cohort of women was followed for 14 years.
The number of CHD cases (fatal and nonfatal) was counted (1292 cases).
Case Study
Variables measured
Age in 1976 (quantitative)
Weight in 1976 (quantitative)
Weight at age 18 (quantitative)
Incidence of coronary heart disease (categorical)
Smoker or nonsmoker (categorical)
Family history of heart disease (categorical)
Distribution
Tells what values a variable takes and how often it takes these values
Can be a table, graph, or function
Displaying Distributions
Categorical variables
– Pie charts
– Bar graphs
Quantitative variables
– Histograms
– Stemplots (stem-and-leaf plots)
Data Table: Class Make-up on First Day
Year         Count   Percent
Freshman        18     41.9%
Sophomore       10     23.3%
Junior           6     14.0%
Senior           9     20.9%
Total           43    100.1%
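The Percent column can be reproduced from the Count column. The sketch below is one way to do it, using the counts from the table; the variable names are illustrative only.

```python
# Counts taken from the "Class Make-up on First Day" data table.
counts = {"Freshman": 18, "Sophomore": 10, "Junior": 6, "Senior": 9}

total = sum(counts.values())

# Percent of the class in each category, rounded to one decimal place.
percents = {year: round(100 * n / total, 1) for year, n in counts.items()}

for year, pct in percents.items():
    print(f"{year:<10} {counts[year]:>3} {pct:>6.1f}%")

# Each percent is rounded separately, so the rounded values
# need not add up to exactly 100.
print(f"{'Total':<10} {total:>3} {sum(percents.values()):>6.1f}%")
```

Because each percent is rounded on its own, the total of the rounded column comes out as 100.1% rather than 100%.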
Pie Chart: Class Make-up on First Day
[Pie chart: Freshman 41.9%, Sophomore 23.3%, Junior 14.0%, Senior 20.9%]
Bar Graph: Class Make-up on First Day
[Bar graph: horizontal axis Year in School (Freshman, Sophomore, Junior, Senior); vertical axis Percent, 0.0% to 45.0%; bar heights 41.9%, 23.3%, 14.0%, 20.9%]
Example: U.S. Solid Waste (2000)
Data Table
Material                    Weight (million tons)   Percent of total
Food scraps                          25.9                 11.2%
Glass                                12.8                  5.5%
Metals                               18.0                  7.8%
Paper, paperboard                    86.7                 37.4%
Plastics                             24.7                 10.7%
Rubber, leather, textiles            15.8                  6.8%
Wood                                 12.7                  5.5%
Yard trimmings                       27.7                 11.9%
Other                                 7.5                  3.2%
Total                               231.9                100.0%
Example: U.S. Solid Waste (2000)
Pie Chart
Example: U.S. Solid Waste (2000)
Bar Graph
Examining the Distribution of Quantitative Data
Overall pattern of graph
Deviations from overall pattern
Shape of the data
Center of the data
Spread of the data (Variation)
Outliers
Shape of the Data
Symmetric
– bell shaped
– other symmetric shapes
Asymmetric
– right skewed
– left skewed
Unimodal, bimodal
Symmetric: Bell-Shaped
Symmetric: Mound-Shaped
Symmetric: Uniform
Asymmetric: Skewed to the Left
Asymmetric: Skewed to the Right
Outliers
Extreme values that fall outside the overall pattern
– May occur naturally
– May occur due to error in recording
– May occur due to error in measuring
– Observational unit may be fundamentally different
Histograms
For quantitative variables that take many values
Divide the possible values into class intervals (we will only consider equal widths)
Count how many observations fall in each interval (may change to percents)
Draw picture representing distribution
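The counting step can be sketched in a few lines of code. This is a minimal illustration, not a prescribed method: the function name `frequency_table`, its parameters, and the ten weights are all hypothetical, and each interval is assumed to include its left endpoint but not its right endpoint.

```python
def frequency_table(values, start, width, n_intervals):
    """Count how many values fall in each equal-width interval [left, right)."""
    counts = [0] * n_intervals
    for v in values:
        index = int((v - start) // width)  # which interval v belongs to
        if not 0 <= index < n_intervals:
            raise ValueError(f"{v} falls outside the class intervals")
        counts[index] += 1
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]


# Hypothetical weights in pounds (not data from the case study).
weights = [104, 119, 120, 133, 139, 140, 158, 160, 175, 199]

table = frequency_table(weights, start=100, width=20, n_intervals=5)
for left, right, count in table:
    percent = 100 * count / len(weights)
    # A row of asterisks gives a rough text picture of the distribution.
    print(f"{left} - <{right}: {count:2d} ({percent:3.0f}%) " + "*" * count)
```

A value that sits exactly on a boundary, such as 120, is counted in the interval that starts there (120 - <140), so every observation lands in exactly one interval.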
Histograms: Class Intervals
How many intervals?
– One rule is to calculate the square root of the sample size, and round up.
Size of intervals?
– Divide range of data (max - min) by number of intervals desired, and round to convenient number
Pick intervals so each observation can only fall in exactly one interval (no overlap)
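The two rules above can be written out as a short calculation. This is a sketch under stated assumptions: the function name `class_intervals` is hypothetical, "round to convenient number" is simplified to rounding to the nearest whole number, and the inputs (53 observations ranging from 100 to 260) are those of the weight data case study.

```python
import math


def class_intervals(n, minimum, maximum):
    """Apply the square-root rule, then derive a class width."""
    k = math.ceil(math.sqrt(n))          # number of intervals, rounded up
    raw_width = (maximum - minimum) / k  # range divided by number of intervals
    width = round(raw_width)             # assumption: nearest whole number is "convenient"
    return k, width


k, width = class_intervals(n=53, minimum=100, maximum=260)
print(f"intervals: {k}, class width: {width}")

# Build left-inclusive, right-exclusive intervals starting at the minimum.
# Keep adding edges until the maximum lies strictly below the last edge.
edges = [100]
while edges[-1] <= 260:
    edges.append(edges[-1] + width)
print(edges)
```

Because each interval excludes its right endpoint, the maximum value (260) needs one more interval than the rule suggests, which is why the list of edges runs to 280.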
Case Study
Weight Data
Introductory Statistics class
Spring, 1997
Virginia Commonwealth University
Weight Data
Weight Data: Frequency Table
Weight Group   Count
100 - <120         7
120 - <140        12
140 - <160         7
160 - <180         8
180 - <200        12
200 - <220         4
220 - <240         1
240 - <260         0
260 - <280         1
sqrt(53) ≈ 7.3, rounded up to 8 intervals; range (260 - 100 = 160) / 8 = 20 = class width
Weight Data: Histogram
[Histogram: horizontal axis Weight, 100 to 280 in steps of 20; vertical axis Frequency (number of students), 0 to 14]
* Left endpoint is included in the group, right endpoint is not.
Stemplots (Stem-and-Leaf Plots)
For quantitative variables
Separate each observation into a stem (first part of the number) and a leaf (the remaining part of the number)
Write the stems in a vertical column; draw a vertical line to the right of the stems
Write each leaf in the row to the right of its stem; order leaves if desired
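The steps above can be sketched as a small function. This is an illustration only: the function name `stemplot` and the seven weights are hypothetical, and the stem is assumed to be everything but the last digit (the tens), with the last digit as the leaf.

```python
from collections import defaultdict


def stemplot(values):
    """Return the rows of a stemplot for whole-number data."""
    leaves = defaultdict(list)
    for v in sorted(values):           # sorting first puts the leaves in order
        stem, leaf = divmod(v, 10)     # e.g. 203 -> stem 20, leaf 3
        leaves[stem].append(leaf)

    rows = []
    # List every stem from smallest to largest, including empty ones,
    # so that gaps in the data remain visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = "".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(f"{stem:>2} | {row}".rstrip())
    return rows


# Hypothetical weights in pounds (not data from the case study).
weights = [135, 152, 128, 150, 131, 128, 160]
for row in stemplot(weights):
    print(row)
```

Stems with no observations are still printed, which keeps the vertical scale even and makes gaps and outliers easy to spot.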
Weight Data
Weight Data: Stemplot (Stem & Leaf Plot)
[Partially completed stemplot: stems 10 through 26, with the first few leaves entered]
Key: 20|3 means 203 pounds
Stems = 10’s, Leaves = 1’s
Weight Data: Stemplot (Stem & Leaf Plot)
10 | 0166
11 | 009
12 | 0034578
13 | 00359
14 | 08
15 | 00257
16 | 555
17 | 000255
18 | 000055567
19 | 245
20 | 3
21 | 025
22 | 0
23 |
24 |
25 |
26 | 0
Key: 20|3 means 203 pounds
Stems = 10’s, Leaves = 1’s
Extended Stem-and-Leaf Plots
If there are very few stems (when the
data cover only a very small range of
values), then we may want to create
more stems by splitting the original
stems.
Extended Stem-and-Leaf Plots
Example: if all of the data values were between 150 and 179, then we may choose to use the following stems:
15
15
16
16
17
17
Leaves 0-4 would go on each upper stem (first “15”), and leaves 5-9 would go on each lower stem (second “15”).
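Splitting the stems can be sketched in code as follows. The function name `split_stemplot` and the nine data values are hypothetical, chosen only to lie between 150 and 179 as in the example.

```python
def split_stemplot(values):
    """Return stemplot rows with each stem split in two (leaves 0-4, then 5-9)."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1   # 0 = upper stem, 1 = lower stem
        rows.setdefault((stem, half), []).append(leaf)

    first_stem = min(rows)[0]
    last_stem = max(rows)[0]

    lines = []
    for stem in range(first_stem, last_stem + 1):
        for half in (0, 1):            # upper stem first, then lower stem
            leaves = "".join(str(leaf) for leaf in rows.get((stem, half), []))
            lines.append(f"{stem} | {leaves}".rstrip())
    return lines


# Hypothetical data values between 150 and 179.
data = [151, 158, 163, 164, 166, 170, 172, 179, 155]
for line in split_stemplot(data):
    print(line)
```

With three original stems the plain stemplot would have only three rows; splitting doubles that to six and shows more of the shape of the distribution.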
Time Plots
A time plot shows behavior over time.
Time is always on the horizontal axis, and the variable being measured is on the vertical axis.
Look for an overall pattern (trend), and deviations from this trend. Connecting the data points by lines may emphasize this trend.
Look for patterns that repeat at known regular intervals (seasonal variations).
Class Make-up on First Day (Fall Semesters: 1985-1993)
[Time plot: horizontal axis Year of Fall Semester, 1985 to 1993; vertical axis Percent of Class That Are Freshmen, 0% to 70%]
Average Tuition (Public vs. Private)