Statistics
Uploaded by igor-guerra
Statistics: Graphic Distributions
What is Statistics?
Statistics is a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data.
Uses of Statistics
“Some students choose it because it is required, but increasing numbers do so voluntarily because they recognize its value and application to whatsoever field they plan to pursue. Because employers love to see a statistics course on the transcript of a job applicant, you will have an advantage….”
Mario F. Triola
Abuses of Statistics
• Small samples
• Precise numbers
• Guesstimates
• Distorted percentages
• Partial pictures
• Deliberate distortion
More Abuses
• Loaded questions
• Pictographs
• Bad samples
• Pollster pressure
• Misleading graphs
Example 1 of Misleading Graphs
Example 2 of Misleading Graphs
Exploratory Data Analysis
Just as an explorer crossing unknown lands tells what he sees, we will be describing the data that we find.
• Examine each variable
• Describe relationship
• Begin with a graph
Nature of Data
• Quantitative Data (QUANTITY): numbers representing counts or measurements. EX:
• Qualitative or Categorical Data (QUALITY): data separated into different categories that are distinguished by non-numeric characteristics. EX:
M&M Experiment
Method of collecting data: weigh candies using a digital scale, check color, and record.
Weights in grams of a sample of M&M candies
.887 .923 .906 .923 .848 .911
.931 .783 .978 .942 .875 .930
.908 .942 .868 .922 .882 .949
.785 .898 .920 .923 .921 .959
.882 .942 .912 .975 .920
.791 .902 .892 .922
• What variables are recorded here?
• What type of variables are they?
Data
Categorical
Binary
Quantitative
Types of Graphic Representations
• Frequency Distribution
• Bar Graph
• Stacked Bar Graph
• Pie Charts
• Dot Plots
• Histograms
• Stem and Leaf Plot
• …
• Box and Whisker
• Time Plot
• Scatter Plot
• Cumulative Plots
• Normality Plot
• Normal Distribution
Frequency Distribution
• Pattern of variation
• The distribution tells what values a variable takes and how often
• Raw Data
• Frequency Distribution: list of categories along with counts
Colors in a bag of Skittles
Red 14
Yellow 21
Blue 15
Green 21
Purple 17
Orange 15
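As a rough illustration of how raw data becomes a frequency distribution, here is a minimal Python sketch. The counts come from the Skittles table above; the raw list of color labels is an assumption, rebuilt from those counts because the slides give only the totals. The function name `frequency_distribution` is made up for this example.

```python
from collections import Counter

# Counts taken from the Skittles table above.
table_counts = {"Red": 14, "Yellow": 21, "Blue": 15,
                "Green": 21, "Purple": 17, "Orange": 15}

# Assumption: the raw data is one color label per candy. The slides give only
# the counts, so the raw list is rebuilt from them for illustration.
raw_data = [color for color, n in table_counts.items() for _ in range(n)]


def frequency_distribution(observations):
    """Tally how often each category occurs in the raw observations."""
    return Counter(observations)


distribution = frequency_distribution(raw_data)
for color, count in distribution.items():
    print(f"{color:<7}{count:>3}")
print(f"{'Total':<7}{sum(distribution.values()):>3}")
```

The tally recovers the table's counts, and the total shows how many candies were in the bag.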
Bar Graph
• Uses categorical data
• Attractive
• Heights show counts
• More flexible than pie charts
• Vertical and horizontal
• Can distort values
BAR GRAPH EXAMPLE
Methods of Travel (bar graph; categories: Boats, Cars, Planes, Trains; vertical axis: number in thousands)
Stacked Bar Graph
• Used to distinguish two or more categories of the same variable
• Great for comparing/contrasting two variables
• Can be a little difficult to distinguish size
Number of Toys Purchased (stacked bar graph; categories: Board Games, Bikes, Sports Equipment, Game cube; series: Adults, Girls, Boys)
Pie Charts
• Visual
• Attractive
• Uses categorical data
• Easy to interpret
• Difficult to make precise
• Must use percents
• Close values difficult to differentiate
PIE CHART EXAMPLE
Flavors of Ice Cream (pie chart; slices: Vanilla, Chocolate, Strawberry, Others)
Guess what percentages these slices represent…
Were you close?
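Because a pie chart must use percents, the counts have to be converted before drawing. The sketch below shows one way to do that. The ice cream percentages are not recoverable from the transcript, so the Skittles counts from the frequency distribution slide are reused here as a stand-in; `pie_slices` is a hypothetical helper name.

```python
def pie_slices(counts):
    """Convert category counts to (percent, degrees) for each pie slice."""
    total = sum(counts.values())
    return {name: (100 * n / total, 360 * n / total)
            for name, n in counts.items()}


# Stand-in data: the Skittles counts from the frequency distribution slide.
skittles = {"Red": 14, "Yellow": 21, "Blue": 15,
            "Green": 21, "Purple": 17, "Orange": 15}

slices = pie_slices(skittles)
for name, (percent, degrees) in slices.items():
    print(f"{name:<7}{percent:5.1f}%  {degrees:5.1f} degrees")
```

Blue and Orange come out with identical slices, and Red and Purple are only a few percentage points away from them, which is the "close values difficult to differentiate" problem in practice.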
Dot Plots
• Good visual
• Quantitative data
• Check for overall pattern
• Difficult with large amounts of data
DOT PLOT EXAMPLE
Theme Park Attendance Per Day (dot plots for East Coast Resorts and West Coast Resorts, per thousand; axis from 35 to 105)
Tools for Interpretation
Don’t forget your socks – SOCS
S – Shape
O – Check for outliers
C – Describe the center
S – Describe the spread
S – Shape
• Symmetric?
• Skewed to the left?
• Skewed to the right?
• Bimodal?
O – Check for outliers
• Values that fall outside the normal range
• Exact details later
C – Describe the center
Values of central tendency:
• Mean
• Median
• Mode
• (Range)
S – Describe the spread
• Wide spread?
• Narrow spread?
• Uniform?
– IQR
– Range
– Standard Deviation
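To make the center and spread measures concrete, here is a minimal sketch that computes them for the M&M weights listed earlier, using Python's standard `statistics` module. The quartiles use that module's default ("exclusive") method, which is an assumption: textbooks differ on how quartiles are computed, so the IQR may not match a hand calculation exactly. `center_and_spread` is a name invented for this example.

```python
import statistics

# Weights in grams of the M&M sample from the slides.
weights = [
    0.887, 0.923, 0.906, 0.923, 0.848, 0.911,
    0.931, 0.783, 0.978, 0.942, 0.875, 0.930,
    0.908, 0.942, 0.868, 0.922, 0.882, 0.949,
    0.785, 0.898, 0.920, 0.923, 0.921, 0.959,
    0.882, 0.942, 0.912, 0.975, 0.920,
    0.791, 0.902, 0.892, 0.922,
]


def center_and_spread(data):
    """Return the C and S of SOCS: measures of center and of spread."""
    # Assumption: quartiles by the module's default "exclusive" method.
    q1, _, q3 = statistics.quantiles(data, n=4)
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "modes": statistics.multimode(data),
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "stdev": statistics.stdev(data),  # sample standard deviation
    }


summary = center_and_spread(weights)
for name, value in summary.items():
    print(name, value)
```

For this sample the mean comes out below the median, which is one rough hint that the distribution is skewed to the left; a graph is still the better way to judge shape.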
Stem and Leaf Plot
• Sometimes data is too spread out to make a reasonable dot plot
• Five stems is a good minimum
• More flexible by rounding
• Easy to construct
• Hard with large data sets
Home Run Hits Comparison: Barry Bonds vs. Hank Aaron
      Bonds | Stem | Aaron
        9 6 |  1   | 3
      5 5 4 |  2   | 0 4 6 7 9
7 7 4 4 3 3 |  3   | 0 2 4 4 8 9 9
    9 6 2 0 |  4   | 0 0 4 4 4 4 5 7
            |  5   |
            |  6   |
          3 |  7   |
Key: 1 | 7 = 17 hits
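Since a stem and leaf plot is easy to construct, a short sketch may help show the mechanics. The values below are my reading of the Hank Aaron side of the plot above (stem times ten plus leaf), so treat them as an assumption rather than verified season totals. `stem_and_leaf` is a hypothetical helper and handles only whole numbers with a tens-digit stem.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group whole numbers by tens (stem), keeping sorted ones digits (leaves)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)


# Assumption: values read back from the Aaron side of the plot above.
aaron = [13, 20, 24, 26, 27, 29, 30, 32, 34, 34, 38, 39, 39,
         40, 40, 44, 44, 44, 44, 45, 47]

plot = stem_and_leaf(aaron)
# Print every stem in the range, including empty ones, so gaps stay visible.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

Printing empty stems matters: dropping them would hide gaps such as the empty 5 and 6 rows on the Bonds side.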
Histogram
• Quantitative variables
• Divides data into classes of equal size
• Visual may distort understanding
HISTOGRAM EXAMPLE
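The slide's histogram image is not available, so here is a minimal sketch of the "classes of equal size" step using the M&M weights from earlier. The class start (0.78 g), width (0.04 g), and number of classes (5) are choices made for this example, not values from the slides. Weights are converted to whole milligrams so the class boundaries are exact.

```python
# Weights in grams of the M&M sample from the slides.
weights = [
    0.887, 0.923, 0.906, 0.923, 0.848, 0.911,
    0.931, 0.783, 0.978, 0.942, 0.875, 0.930,
    0.908, 0.942, 0.868, 0.922, 0.882, 0.949,
    0.785, 0.898, 0.920, 0.923, 0.921, 0.959,
    0.882, 0.942, 0.912, 0.975, 0.920,
    0.791, 0.902, 0.892, 0.922,
]


def class_counts(data_grams, start_mg, width_mg, n_classes):
    """Count values falling in equal-width classes [start, start + width), ...

    Values outside the covered range are ignored.
    """
    counts = [0] * n_classes
    for grams in data_grams:
        mg = round(grams * 1000)  # whole milligrams avoid float boundary issues
        index = (mg - start_mg) // width_mg
        if 0 <= index < n_classes:
            counts[index] += 1
    return counts


# Assumed classes: five classes of 0.04 g starting at 0.78 g.
counts = class_counts(weights, start_mg=780, width_mg=40, n_classes=5)
for i, count in enumerate(counts):
    low = 780 + 40 * i
    print(f"{low / 1000:.2f} to {(low + 40) / 1000:.2f} g | {'*' * count}")
```

Changing the class width changes the picture noticeably, which is one way the visual may distort understanding.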
Box and Whisker Plots
• Easy to compare quartiles
• Outliers seen on modified boxplot
• Side by side = best comparison
• Difficult to determine size of data
• Can be misleading
• Show less detail
Weights of children to age 10
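A box and whisker plot is drawn from the five-number summary, and a modified boxplot also marks outliers. The children's weight data is not in the transcript, so this sketch uses the M&M weights from earlier instead. Two assumptions are worth flagging: quartiles use the `statistics` module's default method, and outliers are flagged with the common 1.5 × IQR convention, which the slides do not state (they defer the exact details).

```python
import statistics

# Stand-in data: the M&M weights in grams from the earlier slide.
weights = [
    0.887, 0.923, 0.906, 0.923, 0.848, 0.911,
    0.931, 0.783, 0.978, 0.942, 0.875, 0.930,
    0.908, 0.942, 0.868, 0.922, 0.882, 0.949,
    0.785, 0.898, 0.920, 0.923, 0.921, 0.959,
    0.882, 0.942, 0.912, 0.975, 0.920,
    0.791, 0.902, 0.892, 0.922,
]


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, median, q3 = statistics.quantiles(data, n=4)
    return min(data), q1, median, q3, max(data)


def outliers_by_iqr(data, k=1.5):
    """Values more than k * IQR beyond the quartiles (assumed convention)."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return sorted(x for x in data if x < low_fence or x > high_fence)


print(five_number_summary(weights))
print(outliers_by_iqr(weights))
```

Under these assumptions the three lightest candies are flagged as outliers, which a plain boxplot would hide inside the lower whisker.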
Time Plot
• Variables observed over time
• Horizontal axis has the time scale
• Check for overall pattern
• Does not show what happens WITHIN that time period!
Number of blankets sold each year
Scatter Plot
• Shows relationship of two variables
• Can determine overall tendencies
• Can determine strength of relationship
• Not all relationships are linear
Wife’s Age VS Husband’s Age
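One common way to put a number on the strength of a linear relationship is the Pearson correlation coefficient. The slides do not name it, so treat this as an illustrative sketch rather than the method the deck intends. The ages below are hypothetical values invented for the example, not the data behind the figure.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: strength and direction of a LINEAR relationship."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical ages, invented for illustration; not data from the slides.
wife_age = [22, 27, 31, 35, 44, 52]
husband_age = [25, 28, 30, 38, 47, 53]
r = pearson_r(wife_age, husband_age)
print(round(r, 3))

# A perfect but non-linear relationship: y depends entirely on x,
# yet the linear correlation is zero.
xs = [-2, -1, 0, 1, 2]
print(pearson_r(xs, [x ** 2 for x in xs]))
```

The second example is the point of the last bullet: a value near zero does not mean "no relationship", only "no linear relationship", so the scatter plot should always be looked at first.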
Cumulative Plots
• Also known as an ogive (“oh-jive”)
• Adds onto each progressive column
• Commonly confused with bar graphs
Rabbits born in a month (cumulative plot; horizontal axis: Week, 1 to 5; vertical axis: Rabbits, 0 to 60)
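The "adds onto each progressive column" idea is a running total. A minimal sketch follows; the weekly counts are hypothetical, invented for this example, because the plotted values cannot be recovered from the transcript.

```python
from itertools import accumulate

# Hypothetical weekly counts, invented for illustration; the slide's values
# are not recoverable from the transcript.
born_per_week = [8, 12, 10, 15, 11]


def cumulative_counts(counts):
    """Running total: each entry adds the current count to all earlier ones."""
    return list(accumulate(counts))


running_total = cumulative_counts(born_per_week)
for week, total in enumerate(running_total, start=1):
    print(f"Week {week}: {total}")
```

The last cumulative value always equals the overall total, and the columns can never decrease, which is a quick way to tell a cumulative plot from an ordinary bar graph.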
Normal Distribution
Normality Plot
Questions????
Graph Exploration
Find five or more different graphs and identify the type you think each one is.