Form 5 Statistics 2A
Chapter 3: Statistics:
• Revision of calculation of the mean, median, range for discrete and continuous data. Also for grouped data in 2A.
• Revision of bar charts, pie charts, simple frequency distribution & histograms with equal intervals.
• Cumulative Frequency curves: Interpreting/ box plots/ estimate median, lower and upper quartile/ interquartile range.
• Independent Events and probability tree diagrams
• Understand and use histograms with unequal intervals
Core (2A & 2B)
• Collect, classify and tabulate statistical data (e.g. gather data from Information and Communication Technology (ICT) sources).
• Read, interpret and draw simple inferences from tables and statistical diagrams.
• Understand, use and construct, by both pencil and paper and ICT methods, bar charts, pie charts, simple frequency distributions and histograms with equal intervals.
• Calculate and interpret the range, mean, median and mode for discrete and continuous data.
• Use appropriate statistical functions on a calculator and a spreadsheet to calculate these statistics.
Extension (2A)
• Understand and use histograms with unequal intervals.
• Interpret and construct cumulative frequency curves.
• Interpret and construct box plots to illustrate or compare distributions with large data sets.
• Estimate the median, the lower and upper quartiles and the interquartile range from cumulative frequency curves.
• Calculate the mean, median and mode for grouped data.
• Identify the modal class from a grouped frequency distribution.
4.1: SEC Syllabus (2015): Mathematics
Section 3.1: The mean, mode, median and range.
The idea of an average is extremely useful, because it enables you to compare one set of data with another set by comparing just two values – their averages. There are several ways of expressing an average, but the most commonly used averages are the mean, mode and median. The range is not an average; it measures how spread out the data is.
The mean
The mean of a set of data is the sum of all the values in the set divided by the total number of values in the set. That is:
Mean value = total amount ÷ number of figures
Example: The ages of 11 players in a football team are:
21 23 20 27 25 24 25 30 21 22 28
What is the mean age of the team?
Sum of all ages = 266
Total number in team = 11
Therefore, Mean age = 266 ÷ 11 = 24.2 (to 1 decimal place)
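The calculation in the example can be checked with a short Python sketch. The function name `mean` is chosen here for illustration only; it simply applies the rule "total amount ÷ number of figures" to the ages listed above.

```python
# Ages of the 11 players from the example above.
ages = [21, 23, 20, 27, 25, 24, 25, 30, 21, 22, 28]


def mean(values):
    """Return the sum of the values divided by the number of values."""
    return sum(values) / len(values)


total = sum(ages)       # sum of all ages
mean_age = mean(ages)   # total divided by the number of players

print(total, len(ages), round(mean_age, 1))  # prints: 266 11 24.2
```

The same function could be used to check answers to the consolidation questions, rounding to whatever accuracy the question asks for.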
Consolidation: Find the mean for each set of data:
a) 7 8 3 6 7 3 8 5 4 9
b) 47 3 23 19 30 22
c) 1.53 1.51 1.64 1.55 1.48 1.62 1.58 1.65
The Mode
The mode is the value that occurs most often in a set of data. That is, it is the value with the highest frequency. The mode is a useful average because it is very easy to find and it can be applied to non-numerical data. For example, you can find the modal birthday month of the class.
Example: What is the mode of the following?
1, 1, 3, 7, 10, 13
Mode = 1
Consolidation: What is the mode of the following?
a) 3 4 7 3 2 4 5 3 4 6 8 4
b) 100 10 1000 10 100 1000 10 1000 100 1000 100 10
c) The frequency table shows the colours of eyes of the students in a class.
        Blue   Brown   Green
Boys    4      8       1
Girls   8      5       2
a. How many students are in class? ________________________________
b. What is the modal eye colour for:
i. boys
ii. girls
iii. the whole class
c. After two students join the class the modal eye colour for the whole class is blue. Which of the following statements are true?
• Both students had green eyes
• Both students had brown eyes
• Both students had blue eyes
• You cannot tell what their eye colours were
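As a rough sketch, the mode can be found in Python by counting how often each value occurs. The function name `modes` is an illustrative choice; it returns a list because a data set can have more than one value with the highest frequency. The first call uses the worked example from this section, and the list of birthday months is made-up data, included only to show that the mode also works for non-numerical values.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


# Worked example from this section.
print(modes([1, 1, 3, 7, 10, 13]))  # prints: [1]

# Hypothetical non-numerical data (not taken from the notes).
months = ["May", "June", "May", "March"]
print(modes(months))  # prints: ['May']
```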
The Median
The median is the middle number in a set of ordered numbers. To find the median:
1. Arrange the numbers in order from smallest to largest
2. Find the middle number
3. If you have two middle numbers, find the mean of those two numbers
OR you can use the formula to find the position of the median, where n is the number of values:
Position of median = ½(n + 1)th value
Example: Find the median of the following numbers:
1, 3, 7, 10, 13
There are 5 numbers, so the median is the ½(5 + 1) = 3rd number. Therefore the median is 7.
Example: Let us find the median of the following set of numbers:
18, 19, 21, 25, 27, 28
There are 6 numbers, so the position of the median is ½(6 + 1) = 3.5, halfway between the 3rd and 4th numbers. If we have two middle numbers, all we have to do is find the mean of the two numbers. Therefore:
Sum = 21 + 25 = 46
Mean = 46 ÷ 2 = 23
As a result, 23 is the median.
Consolidation: Find the median of the following numbers:
a) 14 8 6 16 4 12 10 4 18 16 6
b) 10 6 5 7 13 11 14 6 13 15 4 15
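The three steps for finding the median can be sketched in Python as follows. The function name `median` is an illustrative choice, and the two calls use the worked examples from this section rather than the consolidation questions.

```python
def median(values):
    """Return the middle value of the ordered data.

    With an even number of values, return the mean of the two middle values.
    """
    ordered = sorted(values)          # step 1: arrange in order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # step 2: a single middle number
        return ordered[middle]
    # step 3: two middle numbers, so take their mean
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([1, 3, 7, 10, 13]))         # prints: 7
print(median([18, 19, 21, 25, 27, 28]))  # prints: 23.0
```

Sorting first matters: the consolidation lists are not in order, so the middle position only makes sense after step 1.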
The Range
The range for a set of data is the highest value of the set minus the lowest value. It shows the spread of the data. It is, therefore, used when comparing two or more sets of similar data. You can also use it to comment on the consistency of two or more sets of data.
Range = Highest value – Lowest value
Consolidation: Find the range of the following set of data:
a) 3 8 7 4 5 9 10 6 7 4
b) 1 0 4 5 3 2 5 4 2 1 0
c) In a golf tournament, the club chairperson had to choose either Maria or Fay to play in the first round. In the previous eight rounds, their scores were as follows.
Maria’s scores: 75 92 80 73 72 88 86 90
Fay’s scores: 80 87 85 76 85 79 84 88
i. Calculate the mean score for each golfer
ii. Find the range of the scores for each golfer
iii. Which golfer would you choose to play in the tournament? Explain why.
Support Exercise Pg 263 Exercise 17A Nos 5, 7, 8, 9, 10 (Mean), 1, 2, 3, 4, 6
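This section gives no worked example for the range, so the sketch below reuses the football team ages from the mean example in Section 3.1 purely for illustration. The function name `data_range` is an assumption made here.

```python
# Ages of the 11 football players from the mean example in Section 3.1.
ages = [21, 23, 20, 27, 25, 24, 25, 30, 21, 22, 28]


def data_range(values):
    """Return the highest value minus the lowest value."""
    return max(values) - min(values)


print(max(ages), min(ages), data_range(ages))  # prints: 30 20 10
```

A smaller range suggests more consistent data, which is the idea needed when comparing two sets of results.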
Section 3.2: Using frequency tables to find the mean, mode, median and range.
When a lot of information has been gathered, it is often convenient to put it together in a frequency table. From this table you can then find the values of the three averages and the range.
Example 1: A survey was done on the number of people in each car leaving a shopping centre. The results are summarized in the table below.
Number of people in each car   1    2     3     4    5    6
Frequency                      45   198   121   76   52   13
For the number of people in a car, calculate:
a) the mode
b) the median
c) the mean
a) The modal number of people in a car is easy to spot. It is the number of people with the largest frequency, which is 198. Hence, the modal number of people in a car is 2.
b) The median number of people in a car is found by working out where the middle of the set of numbers is located.
First, add up the frequencies to get the total number of cars surveyed. This equals 505. Next, calculate the middle position: (505 + 1) ÷ 2 = 253. Now add the frequencies across the table to find which group contains the 253rd item. The 243rd item is the end of the group with 2 in a car (45 + 198 = 243). Therefore, the 253rd item must be in the group with 3 in a car. Hence, the median number of people in a car is 3.
c) To calculate the mean number of people in a car, multiply the number of people in the car by the frequency. This is best done in an extra column. Add these to find the total number of people and divide by the total frequency (the number of cars surveyed).
Number in Car   Frequency   Number in these cars
1               45          1 × 45 = 45
2               198         2 × 198 = 396
3               121         3 × 121 = 363
4               76          4 × 76 = 304
5               52          5 × 52 = 260
6               13          6 × 13 = 78
TOTAL           505         1446
Hence, the mean number of people in a car is: 1446 ÷ 505 = 2.9 (to 1 decimal place)
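The three parts of Example 1 can be reproduced with a short Python sketch. The variable names are illustrative. The median step walks along the running total of frequencies, as described in part b); it assumes the middle position falls inside a single group, which is the case here because the total of 505 is odd.

```python
# Frequency table from Example 1.
people = [1, 2, 3, 4, 5, 6]
frequency = [45, 198, 121, 76, 52, 13]

total_cars = sum(frequency)

# a) Mode: the value with the largest frequency.
mode = people[frequency.index(max(frequency))]

# b) Median: find the group containing the (n + 1) / 2 th item.
#    Assumption: the middle position lies within one group (true when n is odd).
position = (total_cars + 1) / 2
running_total = 0
for value, f in zip(people, frequency):
    running_total += f
    if running_total >= position:
        median = value
        break

# c) Mean: total number of people divided by total number of cars.
total_people = sum(value * f for value, f in zip(people, frequency))
mean = total_people / total_cars

print(total_cars, position, total_people)  # prints: 505 253.0 1446
print(mode, median, round(mean, 1))        # prints: 2 3 2.9
```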
Consolidation: Find the i. mode, ii. median and iii. mean from each frequency table below.
1. A survey of the shoe size of all boys in one year of a school gave these results.
Shoe Size            4    5    6    7    8    9   10
Number of Students   12   30   34   35   23   8   3
Mode:
Median:
Mean:
Shoe Size   Frequency   Total Shoe Size
4
5
6
7
8
9
10
TOTAL
2. A school did a survey on how many times in a week students arrived late at school. These are the findings.
Number of times late   0     1    2    3    4   5
Frequency              481   34   23   15   3   4
Mode:
Median:
Mean:
Number of times late   Frequency   Total times late
0
1
2
3
4
5
TOTAL
Support Exercise Pg 267 Exercise 17B Nos 1 – 6
Section 3.3: Frequency Tables with Grouped Data
Normally, grouped data are continuous data, which is data that can have any value within a range of values (e.g. height, mass, time). In these situations, the mean can only be estimated, as you do not have all the information. Discrete data is data that consists of separate numbers, for example, goals scored, marks in a test, number of children and shoe size. In both cases, when using a grouped table to estimate the mean, first find the midpoint of each interval by adding the two end-values and then dividing by two.
Example 1
Pocket money, p ($)   0 < p ≤ 1   1 < p ≤ 2   2 < p ≤ 3   3 < p ≤ 4   4 < p ≤ 5
Number of students    2           5           5           9           15
a) Write down the modal class.
b) Calculate an estimate of the mean weekly pocket money.
a) The modal class is easy to pick out, since it is simply the one with the largest frequency. Here the modal class is 4 < p ≤ 5, that is, over $4 and up to $5.
b) To estimate the mean, assume that each person in each class has the ‘midpoint’ amount, then build up the following table.
To find the midpoint value, the two end-values are added together and then divided by two.
Pocket money, p ($)   Frequency (f)   Midpoint (m)   f × m
0 < p ≤ 1             2               0.5            2 × 0.5 = 1
1 < p ≤ 2             5               1.5            5 × 1.5 = 7.5
2 < p ≤ 3             5               2.5            5 × 2.5 = 12.5
3 < p ≤ 4             9               3.5            9 × 3.5 = 31.5
4 < p ≤ 5             15              4.5            15 × 4.5 = 67.5
Totals                Σf = 36                        Σ(f × m) = 120
The estimated mean will be 120 ÷ 36 = $3.33 (rounded to the nearest cent)
Note: You cannot find the exact median or range from a grouped table since you do not know the actual values; they can only be estimated.
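The midpoint method used in the table can be sketched in Python. Each class is written here as a pair of end-values (an illustrative representation), and the estimate assumes, as the notes do, that every student in a class has the midpoint amount.

```python
# Grouped table from Example 1: (lower end-value, upper end-value) in dollars.
classes = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
frequency = [2, 5, 5, 9, 15]

# Midpoint of each class: add the two end-values and divide by two.
midpoints = [(low + high) / 2 for low, high in classes]

sum_f = sum(frequency)                                      # Σf
sum_fm = sum(f * m for f, m in zip(frequency, midpoints))   # Σ(f × m)
estimated_mean = sum_fm / sum_f

# Modal class: the class with the largest frequency.
modal_class = classes[frequency.index(max(frequency))]

print(midpoints)                              # prints: [0.5, 1.5, 2.5, 3.5, 4.5]
print(modal_class, sum_f, sum_fm)             # prints: (4, 5) 36 120.0
print(round(estimated_mean, 2))               # prints: 3.33
```

The result is only an estimate because the actual amounts inside each class are unknown.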
Example 2: For the table of values given below, find:
i) the modal group
ii) an estimate for the mean.
x           0 < x ≤ 10   10 < x ≤ 20   20 < x ≤ 30   30 < x ≤ 40   40 < x ≤ 50
Frequency   4            6             11            17            9
i) The modal group:
ii) Mean:
x   Frequency (f)   Midpoint (m)   f × m

Total   Σf =   Σ(f × m) =
Mean:
Consolidation
1) Jason brought 100 pebbles back from the beach and found their masses, recording each mass to the nearest gram. His results are summarized in the table below:
Mass (m)    40 < m ≤ 60   60 < m ≤ 80   80 < m ≤ 100   100 < m ≤ 120   120 < m ≤ 140   140 < m ≤ 160
Frequency   5             9             22             27              26              11
i) Mode:
ii) Mean:
Mass   Frequency (f)   Midpoint (m)   f × m

Total
2) A gardener measured the heights of all his roses to the nearest centimeter and summarized his results as follows:
Height (cm)   10-14   15-18   19-22   23-26   27-40
Frequency     21      57      65      52      12
a) How many roses did the gardener have?
b) What is the modal class of the roses?
c) What is the estimated mean height of the roses?
Height (cm)   Frequency (f)   Midpoint (m)   f × m
10-14
15-18
19-22
23-26
27-40
Total
Support Exercise Pg 275 Exercise 17D Nos 1 – 5
Section 3.4: Drawing and Interpreting Bar Charts
A bar chart consists of a series of bars or blocks of the same width, drawn either vertically or horizontally from an axis. The heights (or lengths, for horizontal bars) of the bars always represent frequencies.
Example 1: The grouped frequency table below shows the marks of 24 students in a test. Draw a bar chart for the data.
Marks       1-10   11-20   21-30   31-40   41-50
Frequency   2      3       5       8       6
[Bar chart: Frequency (0 to 8) on the vertical axis against Mark (1-10, 11-20, 21-30, 31-40, 41-50) on the horizontal axis]
Note:
• Both axes are labeled
• The class intervals are written under the middle of each bar
• The bars are separated by equal spaces
By using a dual bar chart, it is easy to compare two sets of related data, as Example 2 shows.
Example 2: This dual bar chart shows the average daily maximum temperature for England and Turkey over a five-month period.
[Dual bar chart: Temperature (°F), 0 to 100, on the vertical axis against Month (April, May, June, July, August) on the horizontal axis. Key: England, Turkey]
In which month was the difference between temperatures in England and Turkey the greatest?
Note: You must always include a key to identify the two different sets of data.
Consolidation
1) For her survey on fitness, Samina asked a sample of people, as they left a sports centre, which activity they had taken part in. She then drew a bar chart to show her data.
[Bar chart: Frequency (0 to 20) on the vertical axis against Activity (Squash, Weight Training, Badminton, Aerobics, Basketball, Swimming) on the horizontal axis]
a. Which was the most popular activity?
b. How many took part in Samina’s survey?
2) The frequency table below shows the grades achieved by 100 students in their A levels.
Grade       F    E    D    C    B    A
Frequency   12   22   24   25   15   2
a. Draw a suitable bar chart to illustrate the data.
b. What fraction of the students achieved a Grade C or Grade B?
3) This table shows the number of points Mark and Joseph were each awarded in eight rounds of a general knowledge quiz.
Round    1   2   3   4   5   6   7   8
Mark     7   8   7   6   8   6   9   4
Joseph   6   7   6   9   6   8   5   6
a. Draw a dual bar chart to illustrate the data.
b. Comment on how well each of them did in the quiz.
Support Exercise Handout
Section 3.5: Interpreting Pie Charts & drawing pie charts
Pictograms, bar charts and line graphs are easy to draw but they can be difficult to interpret when there is a big difference between the frequencies or there are only a few categories. In these cases, it is often more convenient to illustrate the data on a pie chart.
In a pie chart the whole of the data is represented by a circle (the ‘pie’) and each category is represented by a sector of the circle. The angle of each sector is proportional to the frequency of the category it represents. So, unlike a bar chart, a pie chart does not show individual frequencies directly; it shows only proportions.
Calculating the frequency that each sector represents:
1) Measure angles from the pie chart
2) Find the fraction of the whole circle
3) Multiply this fraction by the total number of items to calculate the frequency.
Example 1: A chocolate firm asked 1440 students which type of chocolate they preferred. The pie chart showed the following results:
Milk Chocolate - 150°
White Chocolate - 120°
Fruit and nut - 90°
This information can be recorded in a table and the frequency for each type calculated.
Chocolate       Angle   Working           Frequency
Milk            150°    150/360 × 1440    600 students
White           120°    120/360 × 1440    480 students
Fruit and Nut   90°     90/360 × 1440     360 students
TOTAL           360°                      1440 students
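The working column of Example 1 can be sketched in Python. The dictionary layout and variable names are illustrative choices; the calculation is the one described above, the fraction of the full circle multiplied by the total number of students.

```python
# Sector angles (in degrees) from Example 1.
total_students = 1440
angles = {"Milk": 150, "White": 120, "Fruit and nut": 90}

# Frequency = (angle / 360) × total, written as angle × total / 360.
frequencies = {name: angle * total_students / 360 for name, angle in angles.items()}

print(frequencies)
# prints: {'Milk': 600.0, 'White': 480.0, 'Fruit and nut': 360.0}

# Check: the angles make a full circle and the frequencies add up to the total.
print(sum(angles.values()), sum(frequencies.values()))  # prints: 360 1440.0
```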
Consolidation
1) 300 passengers have boarded a train at Waterloo Station in London. The following angles were given on a pie chart:
Southampton - 120°
Bournemouth - 90°
Parkstone - 36°
Branksome - 54°
Poole - 60°
Town   Angle   Working   Frequency

TOTAL
We are not always given the angle of the sector; we could be given the frequency and asked to find the angle of the sector.
Calculating the angle that each frequency represents:
1) Read the frequency of a particular category from the table
2) Find the fraction of the whole frequency
3) Multiply this fraction by the total number of degrees (360°).
Example 2: The following table shows the eye colours of a group of 36 people. Find the angle of the sector that represents each eye colour.
Colour of Eyes   Number of people   Working         Angle
Brown            12                 12/36 × 360°    120°
Blue             15                 15/36 × 360°    150°
Green            6                  6/36 × 360°     60°
Other            3                  3/36 × 360°     30°
Total            36                                 360°
Example 3: 20 people were surveyed about their preferred drink. The replies are shown in the table below:
Drink Tea Coffee Milk Cola
Frequency 6 7 4 3
Show the results on a pie chart.
• First, find the total number of people surveyed, if you are not told it in the question.
6 + 7 + 4 + 3 = 20 people
• Second, work out the size of the angle which will represent each drink choice.
Tea: 6/20 × 360° = 108°
Coffee: 7/20 × 360° = 126°
Milk: 4/20 × 360° = 72°
Cola: 3/20 × 360° = 54°
• Third, draw the pie chart sector by sector.
Note:
• You should always label the sectors of the pie chart
• You should always write the angle on each sector of the pie chart
[Pie chart with labelled sectors: Tea, 108°; Coffee, 126°; Milk, 72°; Cola, 54°]
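The angle calculation in Example 3 can be sketched in Python. The names below are illustrative; the rule is the one used in the working, the fraction of the total frequency multiplied by 360°.

```python
# Frequencies from Example 3.
drinks = {"Tea": 6, "Coffee": 7, "Milk": 4, "Cola": 3}

total = sum(drinks.values())  # total number of people surveyed

# Angle = (frequency / total) × 360, written as frequency × 360 / total.
sector_angles = {name: f * 360 / total for name, f in drinks.items()}

print(total)          # prints: 20
print(sector_angles)
# prints: {'Tea': 108.0, 'Coffee': 126.0, 'Milk': 72.0, 'Cola': 54.0}

# Check: the sector angles should make a full circle.
print(sum(sector_angles.values()))  # prints: 360.0
```

Checking that the angles add up to 360° before drawing is a quick way to catch an arithmetic slip.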
Consolidation
1. Joseph asked 180 boys what their favourite sport was. Here are the results.
Sport       Football   Rugby   Cricket   Basketball   Other
Frequency   74         25      18        37           26
a) Draw a pie chart for these results with radius 6 cm.
Joseph also asked 90 girls for their favourite sport. In a pie chart showing the results, the angle for Tennis was 84°.
b) How many of the girls said that Tennis was their favourite sport?
Support Exercise Handout
Section 3.6: Frequency density and histograms
Section 3.4 introduced bar charts. All the bar charts drawn in Section 3.4 had class intervals of equal width and so the bars were of equal width. Histograms are used to represent data with unequal class intervals. The vertical axis is labeled frequency density, where
Frequency density = frequency ÷ class width
Example 1: The table gives some information about the ages of the audience at a concert.
Age (x) in years   Frequency
0 < x ≤ 15         12
15 < x ≤ 25        66
25 < x ≤ 35        90
35 < x ≤ 40        45
40 < x ≤ 70        60
To draw a histogram to represent this information:
• Work out the width of each class interval (the class width)
• Divide the frequency by the class width to find the frequency density, which gives the height of each bar.
Age (x) in years   Frequency   Class width     Frequency density = frequency ÷ class width
0 < x ≤ 15         12          15 – 0 = 15     0.8
15 < x ≤ 25        66          25 – 15 = 10    6.6
25 < x ≤ 35        90          35 – 25 = 10    9
35 < x ≤ 40        45          40 – 35 = 5     9
40 < x ≤ 70        60          70 – 40 = 30    2
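The two steps for the histogram table can be sketched in Python. Each class is written as a pair of end-values (an illustrative representation), and the variable names are assumptions made for this sketch.

```python
# Grouped table from Example 1: (lower end-value, upper end-value) in years.
classes = [(0, 15), (15, 25), (25, 35), (35, 40), (40, 70)]
frequency = [12, 66, 90, 45, 60]

# Step 1: class width = upper end-value minus lower end-value.
widths = [high - low for low, high in classes]

# Step 2: frequency density = frequency ÷ class width (the height of each bar).
densities = [f / w for f, w in zip(frequency, widths)]

print(widths)     # prints: [15, 10, 10, 5, 30]
print(densities)  # prints: [0.8, 6.6, 9.0, 9.0, 2.0]
```

Notice that the classes 25 < x ≤ 35 and 35 < x ≤ 40 have the same bar height even though their frequencies differ, because the second class is half as wide.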
Rearranging Frequency density = frequency ÷ class width gives
Frequency = Frequency density × class width
For each bar the ‘width’ is the class width and the ‘height’ is the frequency density. So the area of each bar gives the frequency.
Consolidation: Pg 349, Ex. 21 E
Section 3.7: Range and Interquartile Range
In Section 3.1 the range was already described as a measure of how spread out numerical data is. To find the range of a set of numbers, work out the difference between the highest and the lowest number.
The median is the value that is halfway through the data: the ½(n + 1)th number
The lower quartile is the value that is a quarter of the way through the data: the ¼(n + 1)th number
The upper quartile is the value that is three quarters of the way through the data: the ¾(n + 1)th number
Interquartile range = upper quartile – lower quartile
For the 11 heights in order,
1.68 1.74 1.78 1.80 1.81 1.82 1.83 1.88 1.88 1.97 2.05
The lower quartile is the ¼(11 + 1) = 3rd number = 1.78
The upper quartile is the ¾(11 + 1) = 9th number = 1.88
Interquartile range = 1.88 – 1.78 = 0.1 metres
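The position rules for the quartiles can be sketched in Python using the 11 heights above. The helper `value_at` is a hypothetical name; it handles whole-number positions and positions ending in ½ (by averaging the two neighbouring values), which covers the cases used in these notes. Other fractional positions are not handled by this sketch.

```python
# The 11 heights (in metres), already in order.
heights = [1.68, 1.74, 1.78, 1.80, 1.81, 1.82, 1.83, 1.88, 1.88, 1.97, 2.05]


def value_at(ordered, position):
    """Return the value at a 1-based position in ordered data.

    Assumption: the position is a whole number or ends in .5; a position
    ending in .5 gives the mean of the two values either side of it.
    """
    lower = int(position)
    if position == lower:
        return ordered[lower - 1]
    return (ordered[lower - 1] + ordered[lower]) / 2


n = len(heights)
lower_quartile = value_at(heights, (n + 1) / 4)      # 3rd number
median = value_at(heights, (n + 1) / 2)              # 6th number
upper_quartile = value_at(heights, 3 * (n + 1) / 4)  # 9th number
interquartile_range = round(upper_quartile - lower_quartile, 2)

print(lower_quartile, median, upper_quartile)  # prints: 1.78 1.82 1.88
print(interquartile_range)                     # prints: 0.1
```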
Example 1: The table shows information about the ages, in years, of junior members of a tennis club.
a) Find the range of their ages
b) Find the interquartile range of their ages.
Age in years   Frequency
9              30
10             40
11             19
12             38
13             11
14             18
15             13
a) Range = 15 – 9 = 6 years
b) There are 169 members in total.
Lower quartile is the ¼(169 + 1) = 42½th number. The 42nd and 43rd ages are both 10. Therefore, lower quartile = 10
Upper quartile is the ¾(169 + 1) = 127½th number. The 127th age is 12 and the 128th age is 13. Therefore, upper quartile = 12.5
Interquartile Range = 12.5 – 10 = 2.5 years
Consolidation: Pg 272, Ex. 17C, nos 1-5
Section 3.8: Cumulative Frequency
The grouped frequency table shows information about the amount of time 160 students spent doing homework one evening.
Time (x minutes)   Frequency
0 < x ≤ 10         4
10 < x ≤ 20        12
20 < x ≤ 30        46
30 < x ≤ 40        68
40 < x ≤ 50        20
50 < x ≤ 60        10
Here is the complete cumulative frequency table.
Time (x minutes)   Cumulative Frequency
0 < x ≤ 10         4
10 < x ≤ 20        (4 + 12) = 16
20 < x ≤ 30        (16 + 46) = 62
30 < x ≤ 40        (62 + 68) = 130
40 < x ≤ 50        (130 + 20) = 150
50 < x ≤ 60        (150 + 10) = 160
The last number in the cumulative frequency column is 160, the total number of students. A cumulative frequency table can be used to draw a cumulative frequency graph. The cumulative frequency for the interval 0 < x ≤ 10 is plotted at (10,4), that is, at the top end of the interval to ensure that all 4 students have been included. The remaining points are plotted (20, 16), (30, 62), (40, 130), (50, 150), (60, 160).
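The cumulative frequency column and the points to plot can be generated with a short Python sketch. The variable names are illustrative; each point pairs the top end of an interval with the running total up to that interval, as described above.

```python
from itertools import accumulate

# Homework-time table: upper end of each interval (minutes) and its frequency.
upper_ends = [10, 20, 30, 40, 50, 60]
frequency = [4, 12, 46, 68, 20, 10]

# Running totals of the frequencies give the cumulative frequencies.
cumulative = list(accumulate(frequency))

# Each point is plotted at (top end of interval, cumulative frequency).
points = list(zip(upper_ends, cumulative))

print(cumulative)  # prints: [4, 16, 62, 130, 150, 160]
print(points)
# prints: [(10, 4), (20, 16), (30, 62), (40, 130), (50, 150), (60, 160)]
```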
The position of the median is the ½(n + 1)th student = ½(160 + 1) = 80.5th student. When finding an estimate for the median from a cumulative frequency graph it is acceptable to use the ½nth value. Thus, to find an estimate of the median in this example the 80th student is used.
Estimates of the lower quartile and the upper quartile can also be read off a cumulative frequency graph.
Lower quartile = ¼(160) = 40th value
Upper quartile = ¾(160) = 120th value
The estimates of the lower quartile and the upper quartile from a cumulative frequency graph can be used to find an estimate for the interquartile range. Reading from the graph:
• Lower quartile = 26
• Median = 32
• Upper quartile = 38
Interquartile Range = 38 – 26 = 12
Thus, when finding estimates from a cumulative frequency graph, use:
• ½n for the median
• ¼n for the lower quartile
• ¾n for the upper quartile
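Reading values off the graph can be imitated in Python by joining the plotted points with straight lines and interpolating. This is only a sketch under two assumptions that are not in the notes: the graph is taken to start at (0, 0), and straight-line segments are used in place of a smooth hand-drawn curve. The function name `read_off` is hypothetical. Because of these assumptions the results (roughly 25.2, 32.6 and 38.5 minutes) differ slightly from the graph readings of 26, 32 and 38 given above.

```python
# Plotted points for the homework-time data, with an assumed start at (0, 0).
points = [(0, 0), (10, 4), (20, 16), (30, 62), (40, 130), (50, 150), (60, 160)]


def read_off(points, target):
    """Estimate the x-value where the cumulative frequency reaches target.

    Assumption: neighbouring points are joined by straight lines.
    """
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target is outside the cumulative frequency range")


n = points[-1][1]                              # total frequency, 160
lower_quartile = read_off(points, n / 4)       # 40th value
median = read_off(points, n / 2)               # 80th value
upper_quartile = read_off(points, 3 * n / 4)   # 120th value
interquartile_range = upper_quartile - lower_quartile

print(round(lower_quartile, 1), round(median, 1), round(upper_quartile, 1))
print(round(interquartile_range, 1))
```

Small differences like these are expected, since any reading from a graph is itself an estimate.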
Box plots (sometimes called box and whisker diagrams) are diagrams that show the spread of a set of data. The median, lower and upper quartiles along with the minimum and maximum value are used to draw a box plot. For the above example:
• minimum value = 8
• maximum value = 57
• lower quartile = 26
• median = 32
• upper quartile = 38
Here is the box plot for this data.
Consolidation: The grouped frequency table gives information about the number of minutes 60 music students practiced last week.
Minutes (x)      Frequency
0 < x ≤ 15       3
15 < x ≤ 30      8
30 < x ≤ 45      12
45 < x ≤ 60      18
60 < x ≤ 75      8
75 < x ≤ 90      5
90 < x ≤ 105     4
105 < x ≤ 120    2
a) Complete the cumulative frequency table.
Minutes (x)      Cumulative Frequency
0< x ≤ 15
15< x ≤ 30
30< x ≤ 45
45< x ≤ 60
60< x ≤ 75
75< x ≤ 90
90< x ≤ 105
105< x ≤ 120
b) Draw a cumulative frequency graph for your table.
c) Use your graph to find an estimate for the number of music students who practiced for
i. Less than 40 minutes
ii. More than 40 minutes
d) Use the cumulative frequency graph to find estimates for the median, lower quartile and upper quartile. Hence, find the interquartile range.
e) Draw a box plot to show this information.
Consolidation: Pg 337, Ex. 21B, Pg 341, Ex. 21C, Pg 344 Ex. 21D
Section 3.9: Comparing Distributions
Box plots are useful for comparing the distribution of data sets.
Example: 80 seedlings were divided into 2 groups. Group A were grown in a greenhouse. Group B were grown outside.
                          Group A   Group B
Shortest seedling (cm)    1.6       0.3
Tallest seedling (cm)     4.4       3.8
After a period of time the heights of the seedlings were measured. The heights were used to draw two cumulative frequency graphs.
a. Use the information provided in the table and the cumulative frequency graphs to draw a box plot of the heights of seedlings in group A and a box plot of the heights of the seedlings in group B.
                          Group A   Group B
Shortest seedling (cm)    1.6       0.3
Tallest seedling (cm)     4.4       3.8
Lower quartile (cm)       2.6       2.1
Median (cm)               3.3       2.4
Upper quartile (cm)       3.7       2.8
b. Compare the heights of the seedlings in the two groups.
The heights of the seedlings in group B are more spread out than the heights of the seedlings in group A (range 3.5 cm against 2.8 cm).
The seedlings in group A are generally taller than the seedlings in group B (median 3.3 cm against 2.4 cm).
The middle 50% of the seedlings in group A have a wider spread than the middle 50% of the seedlings in group B (interquartile range 1.1 cm against 0.7 cm).
Consolidation: Pg 344, 21D
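The comparison can be backed up with the two measures of spread, computed here in a short Python sketch from the values in the table for this example. The dictionary keys and the function name `spread` are illustrative choices.

```python
# Five values used to draw each box plot (heights in cm), from the table.
groups = {
    "A": {"min": 1.6, "lq": 2.6, "median": 3.3, "uq": 3.7, "max": 4.4},
    "B": {"min": 0.3, "lq": 2.1, "median": 2.4, "uq": 2.8, "max": 3.8},
}


def spread(summary):
    """Return (range, interquartile range), rounded to 1 decimal place."""
    data_range = round(summary["max"] - summary["min"], 1)
    iqr = round(summary["uq"] - summary["lq"], 1)
    return data_range, iqr


for name, summary in groups.items():
    data_range, iqr = spread(summary)
    print(name, summary["median"], data_range, iqr)
# prints: A 3.3 2.8 1.1
# prints: B 2.4 3.5 0.7
```

Group B has the larger range, while group A has the larger median and the larger interquartile range, which matches the three statements in the comparison.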