Describing Data: Frequency Tables, Frequency Distributions, and Graphic Presentation

©The McGraw-Hill Companies, Inc. 2008 McGraw-Hill/Irwin Describing Data: Frequency Tables, Frequency Distributions, and Graphic Presentation Chapter 2


GOALS

• Organize qualitative data into a frequency table.
• Present a frequency table as a “bar chart” (in Excel these are called column charts) or a pie chart.
• Organize quantitative data into a frequency distribution.
• Present a frequency distribution for quantitative data using histograms, frequency polygons, and cumulative frequency polygons.


Mutually Exclusive

An individual, object, or measurement is included in only one category
– It can’t be in two categories
– Example: A particular phone call cannot originate with both AT&T and MCI

Frequency Table

Frequency Table: A grouping of qualitative data into mutually exclusive classes (categories) showing the number of observations in each class


Sales data from Auto Dealership

Price        Price ($000)   Age of Customer   F or D     Car Type
24,624.00    24.624         50                Domestic   GM
23,032.00    23.032         50                Domestic   Ford
27,556.00    27.556         59                Domestic   GM
20,384.00    20.384         32                Domestic   GM
20,953.00    20.953         29                Domestic   Ford
37,270.00    27.270         35                Foreign    Mercedes
21,006.00    21.006         57                Domestic   GM
27,594.00    27.594         43                Domestic   GM
29,636.00    29.636         51                Domestic   GM
26,357.00    26.357         31                Foreign    Honda
38,262.00    28.262         39                Foreign    Mercedes
38,910.00    38.910         25                Foreign    Honda
23,947.00    23.947         43                Domestic   Ford

We would like a Frequency Table that shows how many of each Car Type we sold last month from the Auto Dealership data (counting).

Count of Car Type
Car Type       Total
Ford              28
GM                22
Honda             13
Mercedes          10
Toyota             7
Grand Total       80
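The counting step can be sketched in Python. This is only an illustration: it assumes the 13 sample rows listed above rather than the full 80-vehicle data set, so its counts are smaller than the totals in the table, and the variable names are hypothetical.

```python
from collections import Counter

# Car Type column of the 13 sample rows listed above (not the full 80 sales).
car_types = [
    "GM", "Ford", "GM", "GM", "Ford", "Mercedes", "GM",
    "GM", "GM", "Honda", "Mercedes", "Honda", "Ford",
]

# Each sale falls into exactly one class, so the classes are mutually exclusive.
frequency_table = Counter(car_types)

for car_type, count in frequency_table.most_common():
    print(f"{car_type:<12}{count:>3}")
print(f"{'Grand Total':<12}{sum(frequency_table.values()):>3}")
```

Applied to all 80 sales, the same counting would give the frequency table shown above.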

Relative Class Frequencies

Class frequencies can be converted to relative class frequencies to show the fraction of the total number of observations in each class.

A relative frequency captures the relationship between a class total and the total number of observations.

Car Type     Frequency   Relative Frequency
GM                  22               27.50%
Ford                28               35.00%
Mercedes            10               12.50%
Honda               13               16.25%
Toyota               7                8.75%
Total               80              100.00%
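As a quick check of the relative frequencies, the division can be sketched in Python. The class frequencies are taken from the table; the variable names are assumptions made for this illustration.

```python
# Class frequencies from the Car Type frequency table (80 vehicles in total).
frequencies = {"GM": 22, "Ford": 28, "Mercedes": 10, "Honda": 13, "Toyota": 7}

total = sum(frequencies.values())

# Relative frequency = class frequency / total number of observations.
relative = {car: count / total for car, count in frequencies.items()}

for car, share in relative.items():
    print(f"{car:<10}{frequencies[car]:>4}{share:>9.2%}")
print(f"{'Total':<10}{total:>4}{sum(relative.values()):>9.2%}")
```

The relative frequencies always sum to 1 (100%), which is a useful sanity check on the table.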


Textbook: Bar Charts Excel: Column Chart

In Excel, this is a Column chart. Column charts are good for Nominal Level Data. Notice that the columns do not touch.

Pie Charts

Frequency Distribution

A Frequency distribution is a grouping of data into mutually exclusive categories showing the number of observations in each class.

• The raw data are more easily interpreted if organized into a frequency distribution
• The resulting frequency distribution helps a person to quickly see the “shape” of the data
• Although the frequency distribution will result in the loss of some detail, seeing patterns in the data can help a person to make better decisions


5 Steps To Organize Raw Data Into A Frequency Distribution

Step 1: Decide on Number of Classes
Step 2: Determine The Class Interval
Step 3: Set The Individual Class Limits
Step 4: Tally The Data Into Classes
Step 5: Count The Tallies in Each Class & Present the Frequency Distribution


Step 1: Determining The Number Of Classes

Goal is to use just enough classes so you can see the “shape” of the data.

You must use professional judgment.

Useful recipe to determine the number of classes: choose the smallest k such that 2^k ≥ n

n = total observations

k = number of classes

Best to use 5 < k < 15

These are general guidelines that are not always possible to follow. Thus, making Frequency Distributions is often referred to as an “art”.
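The recipe for the number of classes can be sketched as a small Python function. The function name is hypothetical, and the result is only a starting point; professional judgment still applies.

```python
def number_of_classes(n: int) -> int:
    """Return the smallest k such that 2**k >= n (the "2 to the k" recipe)."""
    if n < 1:
        raise ValueError("n must be a positive number of observations")
    k = 0
    while 2 ** k < n:
        k += 1
    return k


# For n = 80 observations: 2**6 = 64 is too small, 2**7 = 128 is enough.
print(number_of_classes(80))
```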


Definitions

Class Interval
– Distance between lower limit of class and lower limit of the next class
– The class interval is obtained by subtracting the lower limit of a class from the lower limit of the next class (also midpoint to midpoint)

Class Midpoint (Class Mark)
– The midpoint can be thought of as the “typical value” for the class
– This is the average of the upper and lower class limits: (Lower class limit + upper class limit)/2
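Both definitions are one-line calculations. A minimal Python sketch follows, assuming a class that runs from $18,000 up to $21,000 and is followed by a class starting at $21,000; the function names are hypothetical.

```python
def class_interval(lower_limit: float, next_lower_limit: float) -> float:
    """Lower limit of the next class minus lower limit of this class."""
    return next_lower_limit - lower_limit


def class_midpoint(lower_limit: float, upper_limit: float) -> float:
    """Average of the lower and upper class limits (the class mark)."""
    return (lower_limit + upper_limit) / 2


# Class "$18,000 up to $21,000", followed by a class starting at $21,000.
print(class_interval(18_000, 21_000))
print(class_midpoint(18_000, 21_000))
```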


Step 2: Determine The Class Interval Or Width

Class interval should be the same for every class
– If they are not equal, graphs may be misleading & calculations may be problematic
– In some cases, where there is a potential for many empty classes, unequal class intervals may be necessary

The classes all taken together must cover at least the distance from the lowest value in the raw data up to the highest value:

i ≥ (H − L) / k

i = Class Interval
H = Highest Value
L = Lowest Value
k = Number of Classes


EXAMPLE – Creating a Frequency Distribution Table

Ms. Kathryn Ball of AutoUSA wants to develop tables, charts, and graphs to show the typical selling price on various dealer lots. The table on the right reports only the price of the 80 vehicles sold last month at Whitner Autoplex.


Constructing a Frequency Table - Example

Step 1: Decide on the number of classes. A useful recipe to determine the number of classes (k) is the “2 to the k rule”: choose the smallest k such that 2^k > n. There were 80 vehicles sold, so n = 80. If we try k = 6, which means we would use 6 classes, then 2^6 = 64, which is less than 80. Hence, 6 is not enough classes. If we let k = 7, then 2^7 = 128, which is greater than 80. So the recommended number of classes is 7.

Step 2: Determine the class interval or width. The formula is i ≥ (H − L)/k, where i is the class interval, H is the highest observed value, L is the lowest observed value, and k is the number of classes.

($35,925 − $15,546)/7 = $2,911

Round up to some convenient number, such as a multiple of 10 or 100. Use a class width of $3,000.
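The Step 2 arithmetic can be reproduced with a short Python sketch. The helper name and the `round_to` parameter are assumptions made for this illustration; rounding up to the next multiple of 100 happens to give the $3,000 used in the example.

```python
import math


def class_width(highest: float, lowest: float, k: int, round_to: int) -> int:
    """Raw interval (H - L) / k, rounded UP to the next multiple of round_to."""
    raw = (highest - lowest) / k
    return math.ceil(raw / round_to) * round_to


raw_width = (35_925 - 15_546) / 7          # about 2,911
width = class_width(35_925, 15_546, 7, round_to=100)

print(round(raw_width), width)
```

Rounding up rather than to the nearest value matters here: a width smaller than the raw interval would leave the highest prices outside the last class.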


Step 3: Set The Individual Class Limits

Classes must be mutually exclusive

Avoid overlapping or unclear class limits:
– Include lower limit
– Exclude upper limit

Example of class limits:
– $12,000 up to $15,000 and $15,000 up to $18,000
– $12,000 & $14,999 belong in the first class
– $15,000 belongs in the second class

Avoid open ended classes (problems with graphing)

The lower limit of the first class should be a multiple of the class interval (not always possible)

Convenient multiples of ten are useful

You must compare the actual range to the range implied by the number of classes & class interval

These are general guidelines that are not always possible to follow. Thus, making Frequency Distributions is often referred to as an “art”.
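The "include the lower limit, exclude the upper limit" convention can be sketched in Python. This is a minimal illustration using the Whitner Autoplex figures (lowest price $15,546, highest $35,925, 7 classes of width $3,000); the function names are hypothetical, and starting the first class at a multiple of the class interval is the guideline above, not a requirement.

```python
def class_limits(lowest: int, width: int, k: int) -> list[tuple[int, int]]:
    """Build k classes of equal width, starting at a multiple of the width."""
    start = (lowest // width) * width
    return [(start + i * width, start + (i + 1) * width) for i in range(k)]


def find_class(value: int, limits: list[tuple[int, int]]):
    """Return the (lower, upper) class containing value, or None if outside."""
    for lower, upper in limits:
        if lower <= value < upper:  # lower limit included, upper limit excluded
            return (lower, upper)
    return None


limits = class_limits(lowest=15_546, width=3_000, k=7)

# Compare the actual range with the range implied by k and the class interval.
covers_data = limits[0][0] <= 15_546 and 35_925 < limits[-1][1]

print(limits[0], limits[-1], covers_data)
print(find_class(17_999, limits), find_class(18_000, limits))
```

Because each value satisfies `lower <= value < upper` for at most one class, a price that sits exactly on a boundary, such as $18,000, is counted once, in the higher class.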


Step 3: Set the individual class limits

Constructing a Frequency Table - Example


Step 4: Tally the vehicle selling prices into the classes.

Step 5: Count the number of items in each class.

Constructing a Frequency Table

Observed Patterns:

Range: about $15,000 to about $36,000

Concentration between $18,000 & $27,000

Largest concentration is in $18,000 - $21,000 class

– Typical Value = (18+21)/2 = 19.5 K.

Two sold for $33,000 or more

8 sold for less than $18,000


Relative Frequency Distribution

To convert a frequency distribution to a relative frequency distribution, each of the class frequencies is divided by the total number of observations.


Graphic Presentation of a Frequency Distribution

The three commonly used graphic forms are:

Histograms
Frequency polygons
Cumulative frequency distributions


Histogram

A histogram for a frequency distribution based on quantitative data is very similar to the column chart (the book says: bar chart) showing the distribution of qualitative data. The classes are marked on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are represented by the heights of the bars. The columns must touch in order to show visually that each class interval spans from its lower class limit to its upper class limit.


Other Notes About Histogram

Histograms constructed from Relative Frequency Distributions look the same (have the same shape), but instead, the vertical axis would show percentages

Histograms must have the columns touching:
– The columns must touch in order to visually articulate that the class interval spans from lower class limit to upper class limit (a continuous variable)
– For nominal or ordinal level data, the columns are not drawn adjacent to each other. The category labels are usually words


Frequency Polygon

A frequency polygon also shows the shape of a distribution and is similar to a histogram.

It consists of line segments connecting the points formed by the intersections of the class midpoints and the class frequencies.
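The points of a frequency polygon can be computed as follows. This Python sketch assumes the class frequencies from the second selling-price example later in this chapter (classes of width 3, in $ thousands, starting at 12); the variable names are hypothetical.

```python
# Lower class limits in $ thousands and the class frequencies.
lower_limits = [12, 15, 18, 21, 24, 27, 30]
width = 3
frequencies = [8, 23, 17, 18, 8, 4, 2]

# Each point sits at (class midpoint, class frequency).
midpoints = [lower + width / 2 for lower in lower_limits]
points = list(zip(midpoints, frequencies))

for x, y in points:
    print(f"midpoint {x:>5.1f}  frequency {y:>3}")
```

Connecting consecutive points with straight line segments gives the polygon.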

Cumulative Frequency Distribution

Second Example of a Cumulative Frequency Distribution (prices of vehicles are lower)

Cumulative Frequency Distribution for Vehicles Selling Price

Selling Prices     Number of Vehicles     Cumulative
($ thousands)      Sold (Frequency)       Frequency
12 up to 15                 8              8
15 up to 18                23             31 = 8 + 23
18 up to 21                17             48 = 31 + 17
21 up to 24                18             66 = 48 + 18
24 up to 27                 8             74 = 66 + 8
27 up to 30                 4             78 = 74 + 4
30 up to 33                 2             80 = 78 + 2
Total                      80
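The cumulative column is a running total of the class frequencies. A minimal Python sketch, using the frequencies from the table (the variable names are assumptions):

```python
from itertools import accumulate

classes = ["12 up to 15", "15 up to 18", "18 up to 21", "21 up to 24",
           "24 up to 27", "27 up to 30", "30 up to 33"]
frequencies = [8, 23, 17, 18, 8, 4, 2]

# Running total: each entry adds the class frequency to the previous total.
cumulative = list(accumulate(frequencies))

for label, freq, cum in zip(classes, frequencies, cumulative):
    print(f"{label:<12}{freq:>4}{cum:>5}")
```

The last cumulative frequency must equal the total number of observations, here 80.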


Cumulative Frequency Polygon

[Figure: cumulative frequency polygon. X-axis: Selling Price ($000), from 9 to 33. Left Y-axis: Number of Vehicles Sold, from 0 to 80. Right Y-axis: cumulative percent, 25% to 100%.]

Cumulative Frequency Polygon

Plot line on coordinate system

X-axis = Upper limit of class

Y-axis (Left) = Cumulative Frequency

Y-axis (Right) = %

First point on graph is: (lower limit of first class, 0)

x     y (left)
12     0
15     8
18    31
21    48
24    66
27    74
30    78
33    80
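The plotted points, including the values for the right-hand percent axis, can be built as in the Python sketch below. The variable names are hypothetical; the numbers are the class limits and cumulative frequencies from the table.

```python
first_lower_limit = 12
upper_limits = [15, 18, 21, 24, 27, 30, 33]
cumulative = [8, 31, 48, 66, 74, 78, 80]
total = cumulative[-1]

# First point is (lower limit of first class, 0); the rest use upper limits.
points = [(first_lower_limit, 0)] + list(zip(upper_limits, cumulative))

# Right-hand axis: cumulative frequency as a percentage of the total.
percents = [100 * y / total for _, y in points]

for (x, y), pct in zip(points, percents):
    print(f"x = {x:>2}  y = {y:>2}  ({pct:.2f}%)")
```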


Cumulative Frequency Polygon

[Figure: cumulative frequency polygon, Selling Price ($000) against Number of Vehicles Sold and cumulative percent.]

50% of the vehicles sold for less than about $19,500

Cumulative Frequency Polygon

[Figure: cumulative frequency polygon, Selling Price ($000) against Number of Vehicles Sold and cumulative percent.]

25 of the vehicles sold for less than about $17,500

Cumulative Frequency Polygon

[Figure: cumulative frequency polygon, Selling Price ($000) against Number of Vehicles Sold and cumulative percent.]

80% of the vehicles sold for less than about $24,000
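Reading a value off the cumulative frequency polygon amounts to linear interpolation along one of its line segments. The Python sketch below illustrates this with the plotted points; the function name is hypothetical, and it assumes that "25 of the vehicles" means a count of 25 vehicles. The results land near the approximate readings quoted on the slides, which were taken by eye from the graph.

```python
# (selling price in $ thousands, cumulative number of vehicles sold)
points = [(12, 0), (15, 8), (18, 31), (21, 48),
          (24, 66), (27, 74), (30, 78), (33, 80)]


def price_below(count: float, points: list[tuple[int, int]]) -> float:
    """Price below which `count` vehicles sold, interpolated on the polygon."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= count <= y1 and y1 > y0:
            return x0 + (x1 - x0) * (count - y0) / (y1 - y0)
    raise ValueError("count is outside the range of the polygon")


total = points[-1][1]

half = price_below(0.50 * total, points)          # 50% of 80 = 40 vehicles
twenty_five = price_below(25, points)             # 25 vehicles (a count)
eighty_percent = price_below(0.80 * total, points)  # 80% of 80 = 64 vehicles

print(round(half, 1), round(twenty_five, 1), round(eighty_percent, 1))
```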


End of Chapter 2