Topic 1 Organizing Information Pictorially Using Charts and Graphs.
Date posted: 21-Dec-2015
Topic 1
Organizing Information Pictorially Using Charts and Graphs
• Characteristics of the individuals under study are called variables
– Some variables have values that are attributes or characteristics … those are called qualitative or categorical variables
– Some variables have values that are numeric measurements … those are called quantitative variables
• The suggested approaches to analyzing problems vary by the type of variable
• Examples of categorical variables
– Gender
– Zip code
– Blood type
– States in the United States
– Brands of televisions
• Categorical variables have category values … those values cannot be added, subtracted, etc.
• Examples of quantitative variables
– Temperature
– Height and weight
– Sales of a product
– Number of children in a family
– Points achieved playing a video game
• Quantitative variables have numeric values … those values can be added, subtracted, etc.
• A simple data set is
blue, blue, green, red, red, blue, red, blue
• A frequency table for this qualitative data is
Color    Frequency
Blue     4
Green    1
Red      3
• The most commonly occurring color is blue
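The tally above can be reproduced with a few lines of code. This is a minimal sketch using Python's standard collections.Counter; the variable names are illustrative and not part of the original slides.

```python
from collections import Counter

# The simple data set from the slide
colors = ["blue", "blue", "green", "red", "red", "blue", "red", "blue"]

# Count how many times each category occurs
frequency = Counter(colors)

# Print the frequency table in alphabetical order of category
for color, count in sorted(frequency.items()):
    print(f"{color.capitalize():<8}{count}")

# The most commonly occurring category and its count
most_common_color, most_common_count = frequency.most_common(1)[0]
print("Most common:", most_common_color)
```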
• The relative frequencies are the proportions (or percents) of the observations out of the total
• A relative frequency distribution lists
– Each of the categories
– The relative frequency for each category
• A relative frequency table for this qualitative data is
Color    Relative Frequency
Blue     .500
Green    .125
Red      .375
• A relative frequency table can also be constructed with percents (50%, 12.5%, and 37.5% for the above table)
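Each relative frequency is just the category count divided by the total number of observations (8 here). A short sketch of that arithmetic, again with illustrative variable names:

```python
from collections import Counter

colors = ["blue", "blue", "green", "red", "red", "blue", "red", "blue"]
total = len(colors)  # 8 observations in all

# Relative frequency = count for the category / total number of observations
relative = {color: count / total for color, count in Counter(colors).items()}

for color, proportion in sorted(relative.items()):
    # Show both the proportion and the equivalent percent
    print(f"{color.capitalize():<8}{proportion:.3f}  ({proportion:.1%})")
```

The relative frequencies always add up to 1 (or 100%), which is a quick check on the table.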
Bar graphs for categorical data
• Bar graphs for our simple data (using Excel)
– Frequency bar graph
– Relative frequency bar graph
Comparative Bar Graph
• An example side-by-side bar graph comparing educational attainment in 1990 versus 2003
Pie Chart
• An example of a pie chart
Histogram for quantitative data
• Quantitative data sometimes cannot be put directly into frequency tables since they do not have any obvious categories
• Categories are created using classes, or intervals of numbers
• The data is then put into the classes
• For ages of adults, a possible set of classes is
20 – 29
30 – 39
40 – 49
50 – 59
60 and older
• For the class 30 – 39
– 30 is the lower class limit
– 39 is the upper class limit
• The class width is the difference between the lower class limits of two consecutive classes (not the upper limit minus the lower limit of a single class)
• For the class 30 – 39, the next class starts at 40, so the class width is 40 – 30 = 10
• All the classes have the same widths, except for the last class
• The class “60 and older” is an open-ended class because it has no upper limit
• Classes with no lower limits are also called open-ended classes
• The classes and the number of values in each can be put into a frequency table
Age             Number (frequency)
20 – 29         533
30 – 39         1147
40 – 49         1090
50 – 59         493
60 and older    110
• In this table, there are 1147 subjects between 30 and 39 years old
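Putting data into classes can be sketched in code. The example below assigns each age to one of the classes above and counts the results. The sample ages are invented for illustration (they are not the data behind the table), and age_class is a hypothetical helper name.

```python
from collections import Counter

def age_class(age):
    """Return the class label for a whole-number adult age (20 or above)."""
    if age < 20:
        raise ValueError("the classes on the slide start at age 20")
    if age >= 60:
        return "60 and older"  # open-ended class: no upper limit
    lower = (age // 10) * 10   # lower class limit, e.g. 37 -> 30
    return f"{lower}-{lower + 9}"

# Hypothetical sample of ages, made up for this example
sample_ages = [23, 35, 31, 47, 62, 39, 58, 30]

class_counts = Counter(age_class(age) for age in sample_ages)
for label in ["20-29", "30-39", "40-49", "50-59", "60 and older"]:
    print(f"{label:<14}{class_counts[label]}")
```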
• Good practices for constructing tables for continuous variables
– The classes should not overlap
– The classes should not have any gaps between them
– The classes should have the same width (except for possible open-ended classes at the extreme low or extreme high ends)
– The class boundaries should be “reasonable” numbers
– The class width should be a “reasonable” number
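The first three practices can be checked mechanically. The sketch below assumes whole-number data (so the class after 20 – 29 must start at 30) and allows an open-ended class only at the high end; check_classes is a hypothetical helper written for this example, not a standard function.

```python
def check_classes(classes):
    """Check a list of (lower, upper) class limits, given in ascending order.

    An upper limit of None marks an open-ended class, which this sketch
    only accepts as the last class. Returns a list of problems found.
    """
    problems = []
    widths = set()
    for (lo, hi), (next_lo, _) in zip(classes, classes[1:]):
        if hi is None:
            problems.append(f"open-ended class starting at {lo} is not last")
            continue
        if next_lo <= hi:
            problems.append(f"overlap: {lo}-{hi} and the class starting at {next_lo}")
        elif next_lo > hi + 1:
            problems.append(f"gap between {hi} and {next_lo}")
        # Class width = difference between consecutive lower class limits
        widths.add(next_lo - lo)
    if len(widths) > 1:
        problems.append(f"unequal class widths: {sorted(widths)}")
    return problems

age_classes = [(20, 29), (30, 39), (40, 49), (50, 59), (60, None)]
print(check_classes(age_classes))  # an empty list means no problems were found
```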
• Just as for discrete data, a histogram can be created from the frequency table
• Instead of individual data values, the categories are the classes – the intervals of data
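As a rough illustration of how the frequency table becomes a histogram, the sketch below draws one text bar per class from the age table. The scale of one mark per 50 subjects is an arbitrary choice made for this example, and text_histogram is a hypothetical helper name.

```python
# Frequencies taken from the age table on the slide
age_frequencies = {
    "20-29": 533,
    "30-39": 1147,
    "40-49": 1090,
    "50-59": 493,
    "60 and older": 110,
}

SUBJECTS_PER_MARK = 50  # assumed scale: one '#' for every 50 subjects

def text_histogram(frequencies, per_mark):
    """Return one text row per class, with bar length proportional to frequency."""
    rows = []
    for label, count in frequencies.items():
        bar = "#" * (count // per_mark)
        rows.append(f"{label:<12} {bar} ({count})")
    return rows

for row in text_histogram(age_frequencies, SUBJECTS_PER_MARK):
    print(row)
```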
Stemplots
• A stemplot is a different way to represent data that is similar to a histogram
• To draw a stem-and-leaf plot, each data value must be broken up into two components
– The stem consists of all the digits except for the rightmost one
– The leaf consists of the rightmost digit
– For the number 173, for example, the stem would be “17” and the leaf would be “3”
• In the stem-and-leaf plot below
– The smallest value is 56
– The largest value is 180
– The second largest value is 178
• To draw a stemplot
– Write all the values in ascending order
– Find the stems and write them vertically in ascending order
– For each data value, write its leaf in the row next to its stem
– The resulting leaves will also be in ascending order
• The list of stems with their corresponding leaves is the stem-and-leaf plot
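The steps for drawing a stemplot can be sketched as follows. The function assumes non-negative whole numbers and, as a convention chosen for this example, also lists stems that have no leaves. The sample values are made up for illustration and are not the data from the slide's plot.

```python
from collections import defaultdict

def stemplot(values):
    """Return the rows of a stem-and-leaf plot for non-negative whole numbers."""
    leaves = defaultdict(list)
    for value in sorted(values):          # ascending order, so leaves end up sorted
        stem, leaf = divmod(value, 10)    # 173 -> stem 17, leaf 3
        leaves[stem].append(leaf)
    rows = []
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(f"{stem:>2} | {row}".rstrip())
    return rows

# Hypothetical data for illustration
for row in stemplot([173, 178, 180, 165, 171]):
    print(row)
```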
Comparative Stemplots
If we wanted to compare two sets of data, we could draw two stem-and-leaf plots using the same stem, with leaves going left (for one set of data) and right (for the other set)
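A back-to-back layout like this can be sketched in code as well. The function below assumes both data sets are non-empty lists of non-negative whole numbers; the name back_to_back and the sample values are hypothetical. Leaves on the left are written in reverse so that they read outward from the shared stem.

```python
def back_to_back(left_values, right_values):
    """Return rows of a comparative stemplot sharing one column of stems."""
    def group(values):
        groups = {}
        for value in sorted(values):
            stem, leaf = divmod(value, 10)
            groups.setdefault(stem, []).append(leaf)
        return groups

    left, right = group(left_values), group(right_values)
    all_stems = list(left) + list(right)
    rows = []
    for stem in range(min(all_stems), max(all_stems) + 1):
        # Left leaves are reversed so the smallest leaf sits next to the stem
        left_text = " ".join(str(leaf) for leaf in reversed(left.get(stem, [])))
        right_text = " ".join(str(leaf) for leaf in right.get(stem, []))
        rows.append((left_text, stem, right_text))
    width = max(len(left_text) for left_text, _, _ in rows)
    return [f"{l:>{width}} | {stem} | {r}".rstrip() for l, stem, r in rows]

# Hypothetical data sets for illustration
for row in back_to_back([52, 58, 61], [55, 63, 67]):
    print(row)
```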
• A useful way to describe a variable is by the shape of its distribution
• Some common distribution shapes are
– Uniform
– Bell-shaped (or normal)
– Skewed right
– Skewed left
• A variable has a uniform distribution when
– Each of the values tends to occur with the same frequency
– The histogram looks flat
• A variable has a bell-shaped distribution when
– Most of the values fall in the middle
– The frequencies tail off to the left and to the right
– It is symmetric
• A variable has a skewed right distribution when
– The distribution is not symmetric
– The tail to the right is longer than the tail to the left
– The arrow from the middle to the long tail points right
• A variable has a skewed left distribution when
– The distribution is not symmetric
– The tail to the left is longer than the tail to the right
– The arrow from the middle to the long tail points left
• The two graphs show the same data … the difference seems larger for the graph on the left
• The vertical scale of the graph on the left is truncated, which exaggerates the difference between the bars
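The effect of a truncated scale can be put into numbers. In the sketch below, the two values (50 and 55) and the axis starting point (45) are invented for illustration and do not come from the graphs on the slide; apparent_ratio is a hypothetical helper name.

```python
def apparent_ratio(smaller, larger, axis_start=0.0):
    """Ratio of the drawn bar heights when the vertical axis starts at axis_start."""
    return (larger - axis_start) / (smaller - axis_start)

# Hypothetical values: the true ratio is 55 / 50 = 1.1
honest = apparent_ratio(50, 55)                   # axis starts at 0
truncated = apparent_ratio(50, 55, axis_start=45) # axis starts at 45

print(f"Axis from 0:  second bar looks {honest:.2f} times as tall")
print(f"Axis from 45: second bar looks {truncated:.2f} times as tall")
```

With the axis starting at 45, a 10% difference is drawn as a bar twice as tall.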
• The gazebo on the right is twice as large in each dimension as the one on the left
• However, it looks much more than twice as large as the one on the left: doubling both the width and the height makes the picture's area four times as large
Original “Twice” as large
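The arithmetic behind the gazebo example is that area grows with the square of the linear scale factor, and volume with the cube. A small sketch, with scale_factors as a hypothetical helper name:

```python
def scale_factors(linear_factor):
    """How length, area, and volume grow when every dimension is scaled."""
    return {
        "length": linear_factor,
        "area": linear_factor ** 2,    # width and height both scale
        "volume": linear_factor ** 3,  # width, height, and depth all scale
    }

# Doubling every dimension, as in the gazebo picture
for quantity, factor in scale_factors(2).items():
    print(f"{quantity:<7} x{factor}")
```

So a picture meant to show "twice as much" by doubling each dimension actually covers four times the area on the page.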