Frequency Distributio2

  • 7/31/2019 Frequency Distributio2

    Frequency distribution

    A frequency distribution is a tool for organizing data. We use it to group data into

    categories and show the number of observations in each category. Here are some

    test scores from a math class.

    65 91 85 76 85 87 79 93

    82 75 100 70 88 78 83 59

    87 69 89 54 74 89 83 80

    94 67 77 92 82 70 94 84

    96 98 46 70 90 96 88 72

    It's hard to get a feel for the data in this format because it is unorganized. To
    construct a frequency distribution, first identify the lowest and highest values in
    the list, so that each value in the list fits into one of the categories. The low
    value here is 46, and the high is 100. A set of categories that would work here is
    41-50, 51-60, 61-70, 71-80, 81-90, and 91-100. Here is the finished product:

    Class Frequency

    41-50 1

    51-60 2

    61-70 6

    71-80 8

    81-90 14

    91-100 9

    We can now see that the biggest number of tests were between 81 and 90, and

    most of the tests were between 71 and 100.
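    The table above can be reproduced with a short script. This is a minimal sketch,
    not part of the original handout: the function name frequency_distribution and the
    (lower, upper) tuple representation of a class are illustrative assumptions.

```python
# Test scores from the example above.
scores = [
    65, 91, 85, 76, 85, 87, 79, 93,
    82, 75, 100, 70, 88, 78, 83, 59,
    87, 69, 89, 54, 74, 89, 83, 80,
    94, 67, 77, 92, 82, 70, 94, 84,
    96, 98, 46, 70, 90, 96, 88, 72,
]

# Classes as (lower limit, upper limit) pairs; both limits are inclusive.
classes = [(41, 50), (51, 60), (61, 70), (71, 80), (81, 90), (91, 100)]


def frequency_distribution(values, class_limits):
    """Count how many values fall into each inclusive (lower, upper) class."""
    counts = {limits: 0 for limits in class_limits}
    for value in values:
        for lower, upper in class_limits:
            if lower <= value <= upper:
                counts[(lower, upper)] += 1
                break  # classes do not overlap, so stop at the first match
    return counts


table = frequency_distribution(scores, classes)
print("Class      Frequency")
for (lower, upper), count in table.items():
    label = f"{lower}-{upper}"
    print(f"{label:<11}{count}")
```

    The counts sum to 40, the number of scores, which is a quick check that every
    value landed in exactly one class.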

    The low number in each category (or class) is called the lower class limit, and the

    high number is called the upper class limit.

    Now for some guidelines for constructing a frequency distribution.

    1. Each value should fit into a category. The classes should be collectively
       exhaustive.
    2. No value should fit into more than one category. The classes should be mutually
       exclusive: there should be no overlapping of classes.
    3. Make the classes of equal size if possible. This makes it easier to compare the
       frequency in one class to another.
    4. Avoid open-ended classes, such as "75 and over", if possible.
    5. Try to use between 5 and 20 classes if possible. With fewer than 5 classes you
       are not really breaking up the data, and with more than 20 classes you will
       probably have information overload.
    6. It is usually convenient to use class sizes of 5 or 10, in other words, to have
       each class contain 5 or 10 possible values.
    7. It is usually convenient to make the lower limit of the first category a
       multiple of the class size.

    After the first two rules above, the rest are merely suggestions. Each set of data

    may require you to violate some of these suggestions. The best advice is to try and

    follow them whenever possible.
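    The first two rules and the equal-size suggestion can be checked mechanically.
    The sketch below assumes integer-valued data and classes given as inclusive
    (lower, upper) pairs sorted by lower limit; the function name check_classes and
    the keys of the returned dictionary are hypothetical.

```python
def check_classes(class_limits, low, high):
    """Check inclusive integer classes, sorted by lower limit, against the guidelines.

    low and high are the smallest and largest values in the data.
    """
    pairs = list(zip(class_limits, class_limits[1:]))
    # Every value from low to high must land in some class:
    # the classes span the data range and leave no gaps between neighbours.
    spans_range = class_limits[0][0] <= low and class_limits[-1][1] >= high
    no_gaps = all(nxt[0] <= prev[1] + 1 for prev, nxt in pairs)
    # No value may land in two classes: each class starts after the previous one ends.
    no_overlap = all(nxt[0] > prev[1] for prev, nxt in pairs)
    # Class size = number of possible integer values in the class.
    sizes = {upper - lower + 1 for lower, upper in class_limits}
    return {
        "exhaustive": spans_range and no_gaps,
        "mutually_exclusive": no_overlap,
        "equal_size": len(sizes) == 1,
    }


test_score_classes = [(41, 50), (51, 60), (61, 70), (71, 80), (81, 90), (91, 100)]
result = check_classes(test_score_classes, low=46, high=100)
print(result)

# A deliberately flawed set: 50 belongs to both classes.
overlapping = check_classes([(40, 50), (50, 60)], low=45, high=55)
print(overlapping)
```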

    Terms to know for frequency distributions:

    A. Qualitative Data: Data that are measured by either nominal or ordinal

    scales of measurement. Each value serves as a name or

    label for identifying an item.

    B. Quantitative Data: Data that are measured by interval or ratio scales of

    measurement. Quantitative data are numerical values

    on which mathematical operations can be performed.

    C. Bar Graph: A graphical method of presenting qualitative data that

    have been summarized in a frequency distribution or a

    relative frequency distribution.

    D. Pie Chart: A graphical device for presenting qualitative data by

    subdividing a circle into sectors that correspond to the

    relative frequency of each class.

    E. Frequency Distribution: A tabular presentation of data, which shows the
    frequency of the appearance of data elements in several nonoverlapping classes.
    The purpose of the frequency distribution is to organize masses of data elements
    into smaller and more manageable groups. The frequency distribution can present
    both qualitative and quantitative data.

    F. Relative Frequency Distribution: A tabular presentation of a set of data which
    shows the frequency of each class as a fraction of the total frequency. The
    relative frequency distribution can present both qualitative and quantitative data.

    G. Percent Frequency Distribution: A tabular presentation of a set of data which
    shows the percentage of the total number of items in each class. The percent
    frequency of a class is simply the relative frequency multiplied by 100.

    H. Class: A grouping of data elements in order to develop a

    frequency distribution.

    I. Class Width: The length of the class interval. Each class has two

    limits. The lowest value is referred to as the lower

    class limit, and the highest value is the upper class limit.

    The difference between the upper and the lower class

    limits represents the class width.

    J. Class Midpoint: The point in each class that is halfway between the

    lower and the upper class limits.

    K. Cumulative Frequency Distribution: A tabular presentation of a set of
    quantitative data which shows for each class the total number of data elements
    with values less than or equal to the upper class limit.

    L. Cumulative Relative Frequency Distribution: A tabular presentation of a set of
    quantitative data which shows for each class the fraction of the total frequency
    with values less than or equal to the upper class limit.

    M. Cumulative Percent Frequency Distribution: A tabular presentation of a set of
    quantitative data which shows for each class the percentage of the total frequency
    with values less than or equal to the upper class limit.

    N. Dot Plot: A graphical presentation of data, where the horizontal axis shows
    the range of data values and each observation is plotted as a dot above the axis.

    O. Histogram: A graphical method of presenting a frequency or a

    relative frequency distribution.

    P. Ogive: A graphical method of presenting a cumulative

    frequency distribution or a cumulative relative

    frequency distribution.

    IMPORTANT FORMULAS

    \[ \text{Relative frequency of a class} = \frac{\text{Frequency of the class}}{n} \]

    where n = total number of observations

    \[ \text{Approximate class width} = \frac{\text{Largest data value} - \text{Smallest data value}}{\text{Number of classes}} \]
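    As a worked illustration, the two formulas can be applied to the test-score
    example (40 scores, lowest 46, highest 100, six classes, 14 scores in the 81-90
    class). The function names below are assumptions made for this sketch.

```python
def relative_frequency(class_frequency, n):
    """Frequency of the class divided by the total number of observations."""
    return class_frequency / n


def approximate_class_width(largest, smallest, number_of_classes):
    """(Largest data value - smallest data value) / number of classes."""
    return (largest - smallest) / number_of_classes


rel_81_90 = relative_frequency(14, 40)
width = approximate_class_width(100, 46, 6)
print(rel_81_90)  # 0.35
print(width)      # 9.0
```

    The approximate width of 9 is then rounded up to a convenient class size, which
    is how the example arrives at classes holding 10 possible values each.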

    CUMULATIVE DISTRIBUTION

    One further extension to the frequency distribution is to look at the percentage

    of values that show up in each category. This is called a relative frequency

    distribution or percent frequency distribution.

    The final frequency distribution that we will discuss is the cumulative frequency
    distribution. The word cumulative generally refers to some sort of running total. A
    cumulative frequency distribution lists how many values fit into the first class,
    the first 2 classes, the first 3 classes, etc., or into the last class, the last 2
    classes, etc.
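    Both directions of accumulation can be sketched with running totals over the
    test-score frequencies from the first example. The variable names are
    illustrative; itertools.accumulate is from the Python standard library.

```python
from itertools import accumulate

class_labels = ["41-50", "51-60", "61-70", "71-80", "81-90", "91-100"]
frequencies = [1, 2, 6, 8, 14, 9]

# Running totals from the first class upward:
# first class, first 2 classes, first 3 classes, ...
cumulative_from_first = list(accumulate(frequencies))

# Running totals from the last class downward, re-aligned with the class order:
# ..., last 2 classes, last class.
cumulative_from_last = list(accumulate(reversed(frequencies)))[::-1]

print("Class    f   from first   from last")
for label, f, up, down in zip(
    class_labels, frequencies, cumulative_from_first, cumulative_from_last
):
    print(f"{label:<9}{f:<4}{up:<13}{down}")
```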

    Frequency distribution tables

    o Example 1: Constructing a frequency distribution table
    o Example 2: Constructing a cumulative frequency distribution table
    o Class intervals
    o Example 3: Constructing a frequency distribution table for large numbers of
      observations
    o Relative frequency and percentage frequency

    The frequency (f) of a particular observation is the number of times the
    observation occurs in the data. The distribution of a variable is the pattern of
    frequencies of the observation. Frequency distributions are portrayed as frequency
    tables, histograms, or polygons.

    Frequency distributions can show either the actual number of observations falling
    in each range or the percentage of observations. In the latter instance, the
    distribution is called a relative frequency distribution.

    Frequency distribution tables can be used for both categorical and numeric
    variables. Continuous variables should only be used with class intervals, which
    will be explained shortly.

    Example 1: Constructing a frequency distribution table

    A survey was taken on Maple Avenue. In each of 20 homes, people were asked how

    many cars were registered to their households. The results were recorded as

    follows:

    1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0

    Use the following steps to present this data in a frequency distribution table.

    1. Divide the results (x) into intervals, and then count the number of results in
       each interval. In this case, the intervals would be the number of households
       with no car (0), one car (1), two cars (2) and so forth.

    2. Make a table with separate columns for the interval numbers (the number of cars
       per household), the tallied results, and the frequency of results in each
       interval. Label these columns Number of cars, Tally and Frequency.

    3. Read the list of data from left to right and place a tally mark in the
       appropriate row. For example, the first result is a 1, so place a tally mark in
       the row beside where 1 appears in the interval column (Number of cars). The next
       result is a 2, so place a tally mark in the row beside the 2, and so on. When
       you reach your fifth tally mark, draw a tally line through the preceding four
       marks to make your final frequency calculations easier to read.

    4. Add up the number of tally marks in each row and record them in the final
       column entitled Frequency.

    Your frequency distribution table for this exercise should look like this:

    Table 1. Frequency table for the number of cars registered in each household

    Number of cars (x)    Tally      Frequency (f)
    0                     ||||       4
    1                     ||||| |    6
    2                     |||||      5
    3                     |||        3
    4                     ||         2

    By looking at this frequency distribution table quickly, we can see that out of

    20 households surveyed, 4 households had no cars, 6 households had 1 car, etc.
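    The tallying steps of Example 1 can be sketched with collections.Counter from the
    Python standard library. The printed layout is only an approximation of Table 1:
    each tally is shown as a plain run of bars, without the strike-through used for
    every fifth mark.

```python
from collections import Counter

# Survey results from Example 1: cars registered in each of 20 households.
cars = [1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0]

# Counter does the tallying: it maps each distinct value to its frequency.
frequency = Counter(cars)

print("Number of cars (x)  Tally     Frequency (f)")
for x in sorted(frequency):
    f = frequency[x]
    print(f"{x:<20}{'|' * f:<10}{f}")
```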

    Example 2: Constructing a cumulative frequency distribution table

    A cumulative frequency distribution table is a more detailed table. It looks almost
    the same as a frequency distribution table, but it has added columns that give the
    cumulative frequency and the cumulative percentage of the results as well.

    At a recent chess tournament, all 10 of the participants had to fill out a form that

    gave their names, address and age. The ages of the participants were recorded as

    follows:

    36, 48, 54, 92, 57, 63, 66, 76, 66, 80

    Use the following steps to present these data in a cumulative frequency

    distribution table.

    1. Divide the results into intervals, and then count the number of results in each
       interval. In this case, intervals of 10 are appropriate. Since 36 is the lowest
       age and 92 is the highest age, start the intervals at 35 to 44 and end the
       intervals with 85 to 94.

    2. Create a table similar to the frequency distribution table but with three extra
       columns.

       o In the first column, the Lower value column, list the lower value of the
         result intervals. For example, in the first row, you would put the number 35.

       o The next column is the Upper value column. Place the upper value of the
         result intervals. For example, you would put the number 44 in the first row.

       o The third column is the Frequency column. Record the number of times a result
         appears between the lower and upper values. In the first row, place the
         number 1.

       o The fourth column is the Cumulative frequency column. Here we add the
         cumulative frequency of the previous row to the frequency of the current row.
         Since this is the first row, the cumulative frequency is the same as the
         frequency. However, in the second row, the frequency for the 35-44 interval
         (i.e., 1) is added to the frequency for the 45-54 interval (i.e., 2). Thus,
         the cumulative frequency is 3, meaning we have 3 participants in the 35 to 54
         age group.

             1 + 2 = 3

       o The next column is the Percentage column. In this column, list the percentage
         of the frequency. To do this, divide the frequency by the total number of
         results and multiply by 100. In this case, the frequency of the first row is
         1 and the total number of results is 10. The percentage would then be 10.0.

             (1 ÷ 10) × 100 = 10.0

       o The final column is Cumulative percentage. In this column, divide the
         cumulative frequency by the total number of results and then, to make a
         percentage, multiply by 100. Note that the last number in this column should
         always equal 100.0. In the first row of this example, the cumulative
         frequency is 1 and the total number of results is 10, therefore the
         cumulative percentage of the first row is 10.0.

             (1 ÷ 10) × 100 = 10.0

    3. The cumulative frequency distribution table should look like this:

    Table 2. Ages of participants at a chess tournament

    Lower value   Upper value   Frequency (f)   Cumulative frequency   Percentage   Cumulative percentage
    35            44            1               1                      10.0         10.0
    45            54            2               3                      20.0         30.0
    55            64            2               5                      20.0         50.0
    65            74            2               7                      20.0         70.0
    75            84            2               9                      20.0         90.0
    85            94            1               10                     10.0         100.0
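    The column-by-column procedure for Table 2 can be sketched as a single loop over
    the intervals. The starting value 35 and the width 10 come from step 1; the
    variable names and the tuple layout of each row are assumptions of this sketch.

```python
ages = [36, 48, 54, 92, 57, 63, 66, 76, 66, 80]
width = 10        # interval width chosen in step 1
first_lower = 35  # lower value of the first interval
n = len(ages)

rows = []
cumulative = 0
lower = first_lower
while lower <= max(ages):
    upper = lower + width - 1
    # Frequency: number of ages between the lower and upper values, inclusive.
    f = sum(lower <= age <= upper for age in ages)
    # Cumulative frequency: previous row's cumulative frequency plus this frequency.
    cumulative += f
    rows.append((lower, upper, f, cumulative, 100 * f / n, 100 * cumulative / n))
    lower += width

print("Lower  Upper  f   Cum. f  %      Cum. %")
for lo, up, f, cum, pct, cum_pct in rows:
    print(f"{lo:<7}{up:<7}{f:<4}{cum:<8}{pct:<7.1f}{cum_pct:.1f}")
```

    The last cumulative percentage comes out as 100.0, as the text says it always
    should.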

    For more information on how to make cumulative frequency tables, see the section
    on Cumulative frequency and Cumulative percentage.

    Class intervals

    If a variable takes a large number of values, then it is easier to present and handle

    the data by grouping the values into class intervals. Continuous variables are more

    likely to be presented in class intervals, while discrete variables can be grouped

    into class intervals or not.

    To illustrate, suppose we set out age ranges for a study of young people, while

    allowing for the possibility that some older people may also fall into the scope of

    our study.

    The frequency of a class interval is the number of observations that occur in a
    particular predefined interval. So, for example, if 20 people aged 5 to 9 appear in
    our study's data, the frequency for the 5-9 interval is 20.

    The endpoints of a class interval are the lowest and highest values that a variable
    can take. So, the intervals in our study are 0 to 4 years, 5 to 9 years, 10 to 14
    years, 15 to 19 years, 20 to 24 years, and 25 years and over. The endpoints of the
    first interval are 0 and 4 if the variable is discrete, and 0 and 4.999 if the
    variable is continuous. The endpoints of the other class intervals would be
    determined in the same way.

    Class interval width is the difference between the lower endpoint of an interval
    and the lower endpoint of the next interval. Thus, if our study's continuous
    intervals are 0 to 4, 5 to 9, etc., the width of the first five intervals is 5, and
    the last interval is open, since no higher endpoint is assigned to it. The
    intervals could also be written as 0 to less than 5, 5 to less than 10, 10 to less
    than 15, 15 to less than 20, 20 to less than 25, and 25 and over.
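    The "0 to less than 5, ..., 25 and over" form maps directly onto a lookup by
    lower endpoint. The sketch below uses bisect from the Python standard library;
    the sample ages are hypothetical values invented for illustration and do not
    come from the study described in the text.

```python
from bisect import bisect_right

# Lower endpoints of the class intervals; the last interval (25 and over) is open.
lower_endpoints = [0, 5, 10, 15, 20, 25]
labels = [
    "0 to less than 5",
    "5 to less than 10",
    "10 to less than 15",
    "15 to less than 20",
    "20 to less than 25",
    "25 and over",
]


def class_index(age):
    """Return the index of the interval whose lower endpoint is the largest one <= age."""
    if age < lower_endpoints[0]:
        raise ValueError("age is below the first class interval")
    return bisect_right(lower_endpoints, age) - 1


# Hypothetical ages, including continuous values near the endpoints.
sample_ages = [3, 4.999, 5, 9.5, 12, 19, 24.9, 25, 40]

frequencies = [0] * len(labels)
for age in sample_ages:
    frequencies[class_index(age)] += 1

for label, f in zip(labels, frequencies):
    print(f"{label:<20}{f}")

# Width of each closed interval: difference between consecutive lower endpoints.
widths = [b - a for a, b in zip(lower_endpoints, lower_endpoints[1:])]
print(widths)
```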

    Rules for data sets that contain a large number of observations

    In summary, follow these basic rules when constructing a frequency distribution

    table for a data set that contains a large number of observations:

    o find the lowest and highest values of the variables
    o decide on the width of the class intervals
    o include all possible values of the variable.

    In deciding on the width of the class intervals, you will have to find a compromise

    between having intervals short enough so that not all of the observations fall in the

    same interval, but long enough so that you do not end up with only one observation

    per interval.

    It is also important to make sure that the class intervals are mutually exclusive.

    Example 3: Constructing a frequency distribution table for large numbers of
    observations

    Thirty AA batteries were tested to determine how long they would last. The

    results, to the nearest minute, were recorded as follows:

    423, 369, 387, 411, 393, 394, 371, 377, 389, 409, 392, 408, 431, 401, 363, 391,

    405, 382, 400, 381, 399, 415, 428, 422, 396, 372, 410, 419, 386, 390

    Use the steps in Example 1 and the above rules to help you construct a frequency

    distribution table.

    Answer

    The lowest value is 363 and the highest is 431.

    Using the given data and a class interval of 10, the interval for the first class is

    360 to 369 and includes 363 (the lowest value). Remember, there should always be

    enough class intervals so that the highest value is included.

    The completed frequency distribution table should look like this:

    Table 3. Life of AA batteries, in minutes

    Battery life, minutes (x)    Tally       Frequency (f)
    360-369                      ||          2
    370-379                      |||         3
    380-389                      |||||       5
    390-399                      ||||| ||    7
    400-409                      |||||       5
    410-419                      ||||        4
    420-429                      |||         3
    430-439                      |           1
    Total                                    30
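    Table 3 can be rebuilt by mapping each result to the lower endpoint of its class
    interval. This is a sketch under the answer's choices (interval width 10, first
    interval starting at 360); the variable names are illustrative.

```python
from collections import Counter

battery_minutes = [
    423, 369, 387, 411, 393, 394, 371, 377, 389, 409, 392, 408, 431, 401, 363,
    391, 405, 382, 400, 381, 399, 415, 428, 422, 396, 372, 410, 419, 386, 390,
]
width = 10

# Integer division drops the last digit, so 363 -> 360, 369 -> 360, 431 -> 430.
counts = Counter((minutes // width) * width for minutes in battery_minutes)

print("Battery life, minutes (x)  Frequency (f)")
for lower in range(min(counts), max(counts) + width, width):
    label = f"{lower}-{lower + width - 1}"
    print(f"{label:<27}{counts[lower]}")
print(f"{'Total':<27}{sum(counts.values())}")
```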

    Relative frequency and percentage frequency

    An analyst studying these data might want to know not only how long batteries last,

    but also what proportion of the batteries falls into each class interval of battery

    life.

    The relative frequency of a particular observation or class interval is found by
    dividing the frequency (f) by the number of observations (n): that is, (f ÷ n).
    Thus:

    \[ \text{Relative frequency} = \frac{\text{frequency}}{\text{number of observations}} = \frac{f}{n} \]

    The percentage frequency is found by multiplying each relative frequency value by
    100. Thus:

    \[ \text{Percentage frequency} = \text{relative frequency} \times 100 = \frac{f}{n} \times 100 \]
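    Applied to the battery data, the two formulas give the proportion and percentage
    of batteries in each class interval. The dictionary below restates the
    frequencies from Table 3; the variable names are assumptions of this sketch.

```python
# Frequencies (f) from Table 3, keyed by class interval.
battery_frequencies = {
    "360-369": 2, "370-379": 3, "380-389": 5, "390-399": 7,
    "400-409": 5, "410-419": 4, "420-429": 3, "430-439": 1,
}
n = sum(battery_frequencies.values())  # number of observations

# Relative frequency = f / n; percentage frequency = relative frequency x 100.
relative = {label: f / n for label, f in battery_frequencies.items()}
percentage = {label: rel * 100 for label, rel in relative.items()}

print("Interval   f   Relative  Percentage")
for label, f in battery_frequencies.items():
    print(f"{label:<11}{f:<4}{relative[label]:<10.3f}{percentage[label]:.1f}")
```

    The relative frequencies sum to 1 and the percentages to 100, apart from
    floating-point rounding.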