te tauanga
statistics
2013/1
MS3081
introduction to time series
ncea level 3
© te aho o te kura pounamu
Expected time to complete work
This work will take you about 10 hours to complete.
You will work towards the following standards: Achievement Standard 91580 (Version 1) Mathematics and Statistics 3.8 Investigate time series data, Level 3, External, 4 credits
In this booklet you will focus on this learning outcome: • investigating time series data with odd point moving means.
You will continue to work towards this standard in booklets MS3082 and MS3083.
Copyright © 2013 Board of Trustees of Te Aho o Te Kura Pounamu, Private Bag 39992, Wellington Mail Centre, Lower Hutt 5045,
New Zealand. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means without
the written permission of Te Aho o Te Kura Pounamu.
contents
1 Introduction to time series
2 First steps in smoothing the raw data
3 Using a spreadsheet to find the trend
4 Linear trend lines
5 First steps of writing a statistical report
6 Review activity
7 Answer guide
how to do the work
When you see:
1A Complete the activity.
Check your answers.
Your teacher will assess this work.
Check the website.
Contact your teacher.
You will need:
• access to the Internet to get on to the MS3000 OTLE website where you will find the data sets you need and the spreadsheet answers.
• any type of appropriate mathematical and statistical technology such as graphing calculator, net book or tablet that has a spreadsheet.
Resource overview
In this booklet you will learn about how to comment on the features of time series graphs. You will learn how to smooth the raw data using odd point moving means and then find the equation of the trend line. It is expected that you will use a spreadsheet for this work. In the following booklet MS3082 you will learn about even point moving means.
1 introduction to time series
learning intentions
In this lesson you will learn to:
• identify the features in a time series
• relate the features to the context.
introduction
Do you:
• measure your heartbeat?
• log your sports performance?
• watch the sharemarket?
If you do, then you’re already into time series.
features of time series
A time series is a set of observations or measurements made at regular intervals over a period of time. Daily temperatures, commodity prices, records of growth, population patterns and exchange rates are all examples of time series.
The graph of an electrocardiogram (ECG) is a time series graph.
Throughout the booklets on time series, you’ll be looking at various types of time series, learning the skills and techniques needed to interpret what has happened and forecast what may happen next.
You’ll have a chance to apply the skills as you learn them, then in the last lesson you will put it all together and work through the assessment task.
features of time series graphs
Time series are made up of four components:
two long-term:
• T, the overall trend
• C, the long-term cycle
and two short-term:
• S, the seasonal variation
• R (or I), the residual; that is, any random irregularity.
Any or all of these components may be present in a time series. Let’s have a look at some time series graphs and identify the components.
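To make the components concrete, here is a minimal Python sketch that builds an artificial series from a trend (T), a seasonal variation (S) and a residual (R). It assumes the components simply add together, which this booklet does not state, and it leaves out the long-term cycle (C). The starting level, slope, season length and noise size are all made-up values chosen for illustration.

```python
import math
import random

def synthetic_series(n_periods, season_length=4, seed=0):
    """Build an illustrative series as trend + seasonal + residual (additive, assumed)."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    series = []
    for t in range(n_periods):
        trend = 100 + 2.0 * t                                       # T: steady rise (made up)
        seasonal = 10 * math.sin(2 * math.pi * t / season_length)   # S: repeats every season
        residual = rng.uniform(-3, 3)                               # R: small random noise
        series.append(trend + seasonal + residual)
    return series

values = synthetic_series(12)
for t, value in enumerate(values):
    print(t, round(value, 1))
```

Graphing the printed values should show a rising line with a regular wave and some small irregular wobbles on top of it.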
the trend
T, the trend of a series is sometimes called the secular trend (secular means ‘over a long period of time’). The easiest way to discover a trend is to use a spreadsheet to graph the raw data and trend line. The trend may be a straight line or a curve. Try to think what the trend implies about the data.
You can see from the graph that the trend shows a gradual increase in the cost of living from March 1992 to December 1994.
The steeper slope of the graph line from June 1994 shows that the rate of increase was greater during the second half of 1994 than it was during the first half.
The secular trend tells you that the cost of living is rising.
[Figure: New Zealand consumer price index (quarterly); y-axis: Price ($), 960 to 1 020; x-axis: quarters from Mar-92 to Dec-94]
the cycle
Looking at the graph of a series is also the best way to pick up any long-term cycles, C, which are hard to spot from the data. For a cycle to be considered long-term, it has to be over two years in length. Anything shorter is called a seasonal variation.
You’ll be able to work out the length of a cycle by finding the time between peaks or dips in the graph line.
Here, you can see that the graph line peaks roughly every seven years.
The Sound and Sight Company share price time series has a seven-year cycle. There’s also a long-term or secular trend showing an overall increase in the share price.
[Figure: Sound and Sight Company share price; y-axis: Price (cents), 0 to 200; x-axis: Year, 1965 to 2000]
seasonal variation
S, the seasonal variations, are often known about before the data is collected. With their regular and recurring patterns, they are easy to spot from the graph. Look at the time axis (horizontal) to work out the period of the variations. The period is the length of time between two peaks (or troughs). It may be a week, a month or any fraction of a year.
Even regular daily variations over the period of a day are classed as seasonal features, as in this example.
[Figure: Patient temperature chart, May 13–16; y-axis: Temperature (°C), 35 to 40, with normal marked at 36.8; x-axis: Time (6.00 am, 2.00 pm and 9.00 pm each day)]
You can see that the patient’s temperature rises to a peak at 9.00 pm, then falls steadily during the night to an early-morning low. This marked seasonal pattern diminishes as she recovers.
Seasonal variations are often present in time series connected with climate or season.
residual effect
R, the residual effect, describes the irregular or random variations that occur in all kinds of time series. They show up as ‘spikes’ or ‘dips’ in the graph line and are often described as noise in a time series.
They can be caused by unpredictable rare events (earthquake, fire, etc.), measurement error, or just plain random human behaviour. The residual components make it impossible to forecast future values with total accuracy. Who knows what spanner may get thrown in the works!
[Figure: Price of audio component ($); y-axis: Price ($), 0 to 350; x-axis: Months, Mar-90 to Jul-93]
The spike in the graph shows a temporary rise in price. A factory fire halted production of the component. This led to a shortage, which in turn pushed prices up temporarily.
Other irregular features you may see from time to time are:
• steps
• ramps.
[Figures: Steps; Ramps]
These cause the trend line to be displaced.
They can be caused by:
• a change in the time period
• a change in collection method or reporting procedure
• opening or closing of a sales outlet
• a change in price
• a change in the law
• and so on.
In the July 1986 budget, there was a large increase in the tax on tobacco products. This is evident in the step feature indicating a dramatic rise in the price index recorded over the next quarter (three months).
[Figure: Quarterly price index for tobacco products (Base Dec 93, index 1000); y-axis: Price index, 200 to 600; x-axis: Quarter, Dec-82 to Dec-88]
reading time scales on graphs
Sometimes it can be confusing to know exactly when changes are occurring, especially when data points are not graphed separately, but joined by lines.
For most data measured or collected, the time point stated usually means ‘up to the end of the period’.
This is a small part of a data set on mackerel caught in New Zealand waters.
Year     1977     1978    1979
Tonnes   18 492   6 605   7 589
Between the end of 1977 and the end of 1978, there were 6 605 tonnes of mackerel caught. That means that during 1978, there was a decrease of almost 12 000 tonnes. You can see this represented on the graph in activity 1B, question 3.
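The size of the decrease can be checked with a short calculation. The Python sketch below uses the three catch figures from the table; the function name `yearly_change` is only an illustrative choice.

```python
# Mackerel catch in tonnes, copied from the table above.
catch_tonnes = {1977: 18492, 1978: 6605, 1979: 7589}

def yearly_change(data):
    """Change in catch from each year to the next (negative means a decrease)."""
    years = sorted(data)
    return {year: data[year] - data[prev] for prev, year in zip(years, years[1:])}

changes = yearly_change(catch_tonnes)
print(changes)
```

The change for 1978 is 6 605 − 18 492 = −11 887 tonnes, which is the ‘almost 12 000 tonnes’ decrease described above.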
identifying features in the data
Look at the time series graph of the raw data to identify features.
Consider all four components of time series, although you may find that all four are not necessarily present. Often the long-term cycle, C, is not apparent. Make sure that you write your comments in context. This means that you will need to identify the variables and relate these to your comments. Your comments should be about the data set that you are investigating and not generic comments alone. You need to be able to interpret what you see in the graphs and any statistical summaries that you create from the spreadsheet. You are a data detective and your mission is to unravel the story that is hidden within and between data points.
An ECG graph is made by recording the electrical impulses made while the heart is beating and printing these on a paper strip where the side of each small square represents 1 mm (each larger square is 5 mm in length). Each larger square represents 0.2 seconds. Look at the bottom time series in the ECG graph below. This is the rhythm strip. The data was collected over approximately five seconds and the voltage is measured on the vertical axis, with 10 mm being equal to 1 mV. Comment on the features of this time series by filling in the gaps in the passage below.
1A
The variables for the ECG graph are 1. ________________ measured in mV and time measured in 2. ________________.
There is a clear seasonal 3. ________________ that begins with a small wave, a large spike and a second wave that is 4. _______________ than the first. This indicates that the resting 5. ____________ recorded from the heart makes a small 6. ____________ and decays, then rapidly spikes by about 1 mV followed by an equally rapid 7.____________ that drops below the resting voltage. Finally there is a second slower increase of just under 8.________________ mV. The trend is neither increasing nor 9._______________ so we say that the trend for voltage is constant over this 10.__________________ period.
As this data covers only a few seconds we are not able to determine the 11. _____________________ effect.
The third spike is slightly lower than the others recorded on this graph but is probably not significant enough to consider as 12.__________________.
To find out more about the ECG refer to the links on the MS3000 OTLE website.
Describe the features you see in these time series graphs. Make sure that you write your comments in context.
1. [Figure: Yearly cigarette consumption in New Zealand; y-axis: Number (millions), 3 000 to 6 500; x-axis: Year, 1980 to 1993]
2. [Figure: Quarterly retail chemist sales in New Zealand; y-axis: Sales ($m), 200 to 280; x-axis: quarters from Mar-91 to Sep-94]
1B
3. [Figure: Mackerel and barracouta catches in New Zealand waters; y-axis: Tonnes, 0 to 45 000; x-axis: Year, 1971 to 1991; series: mackerel, barracouta]
4. Graph matching: choose the correct title from the list below and write it above the appropriate graph.
1. No trend with random variation
2. Trend plus seasonal variation plus random variation
3. Perfect trend
4. Spike
5. Perfect trend plus seasonal variation
Work through the PPT ‘Belle’s Dairy Farm’ on the MS3000 OTLE site and use this to check your answers.
Check your answers.
[Figures: graphs for the matching exercise, including four graphs of milk production (Milk, 10 000 L) against Time period]
2 first steps in smoothing the raw data
learning intentionIn this lesson you will learn to:
• smooth the raw data using odd point moving means.
introduction
Often it is hard to see the secular trend in a time series graph because of the fluctuations in the graph line. Smoothing the data makes the trend more evident.
moving means
It is easy to use a spreadsheet to calculate moving means. The advantages of this method are:
• it gives a smoother curve than the raw data
• it spreads the effect of extreme values.
We will use moving means in our analysis of time series. In this example, the average or mean of every three data values is found.
Death rate per thousand of young American males from 1912 to 1922
Year   Rate   Mean 3 years
1912   4.5
1913   4.7    4.5
1914   4.4    4.4
1915   4.2    4.4
1916   4.5    4.6
1917   5.0    7.2
1918   12.2   7.5
1919   5.3    7.4
1920   4.8    4.6
1921   3.8    4.1
1922   3.8
The smoothed rate for 1918 is the mean of the rates for 1917, 1918 and 1919. See how the effect of the ‘spike’ has been spread over the three years.
[Figure: Death rate per thousand of young American males; y-axis: Rate per thousand, 0.0 to 14.0; x-axis: Year, 1912 to 1922; series: raw data, moving means (3)]
Smoothed rate for 1918: (5.0 + 12.2 + 5.3) ÷ 3 = 7.5
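The same three-point moving means can be produced outside a spreadsheet. The Python sketch below is one possible way to do it, using the death rate figures from the table in this lesson; the function name `moving_means` is an illustrative choice, not something from the MS3000 OTLE files.

```python
def moving_means(values, order):
    """Moving means of an odd order: one mean for each full window of values."""
    if order < 1 or order % 2 == 0:
        raise ValueError("order must be a positive odd number")
    return [sum(values[i:i + order]) / order
            for i in range(len(values) - order + 1)]

years = list(range(1912, 1923))
rates = [4.5, 4.7, 4.4, 4.2, 4.5, 5.0, 12.2, 5.3, 4.8, 3.8, 3.8]

smoothed = moving_means(rates, 3)
half = 3 // 2  # each mean belongs to the middle year of its window
for year, mean in zip(years[half:], smoothed):
    print(year, round(mean, 1))
```

Each mean is printed beside the middle year of its group of three, so the first mean lines up with 1913 and the last with 1921, just as in the table.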
number of points – series without a seasonal component
The number of values the data is averaged over is often referred to as the number of points. So ‘a three-point moving average’ tells you that the means of groups of three values have been used to smooth the data. We say that we calculate a moving mean of order three.
For time series without a seasonal pattern, the number of time periods chosen is not critical.
But remember, the larger the number of time periods or points chosen for averaging, the more values you lose off the ends. With three points you lose one value from each end. With five points you lose two, and with seven you lose three. This can be a drawback if you don't have many data values.
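The pattern of lost values can be written as a small rule: an odd order k loses (k − 1) ÷ 2 values from each end. The Python sketch below simply applies that rule; the function names are illustrative only.

```python
def values_lost_per_end(order):
    """Number of values with no moving mean at each end, for an odd order."""
    if order < 1 or order % 2 == 0:
        raise ValueError("order must be a positive odd number")
    return (order - 1) // 2

def smoothed_length(n_values, order):
    """How many moving means you get from n_values raw values."""
    return n_values - 2 * values_lost_per_end(order)

for order in (3, 5, 7):
    print(order, "points: lose", values_lost_per_end(order), "from each end")

print(smoothed_length(11, 3))  # 11 raw values leave 9 three-point means
```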
If the number of periods averaged is too small, the graph may still have too many variations for the trend to be evident. If the number of periods is too large, many features of the data may be lost.
You'll get the idea by looking at the example shown for the value of exports. The 12-point moving average smooths the data well, while the six-point moving average leaves quite large variations remaining.
series with a seasonal pattern
For time series with a seasonal pattern, the number of time periods chosen for averaging is usually the same as is present in one cycle of the raw data. For quarterly raw data, one cycle has four pieces of data so a moving mean of order four (four-point moving mean) is used. For data collected on each of seven days of a week, one cycle is made up of seven pieces of data so a seven-point moving mean is used. We would say that we calculate a moving mean of order seven.
[Figure: Value of exports: fish, crustaceans, dairy and other animal products; y-axis: Value, 100 to 500; x-axis: Mar–85 to Mar–95; key: monthly, 6-point moving mean, 12-point moving mean]
In the chart below, means of five adjacent data values have been found. (The means have been rounded to the nearest whole number.)
Complete the chart by calculating the missing five-point means, d, e and f.
Thickness of the ozone layer above New Zealand (measured in Dobson Units in July each year)
Year   Data   Mean 5
1974   289
1975   305
1976   300    305
1977   326    309
1978   307    d
1979   308    309
1980   313    308
1981   292    311
1982   320    312
1983   322    311
1984   313    306
1985   306    300
1986   271    e
1987   290    290
1988   292    287
1989   292    288
1990   289    f
1991   276
1992   291
Check your answers.
2A
3 using a spreadsheet to find the trend
learning intention
In this lesson you will learn to:
• use a spreadsheet to find the trend.
introduction
You will need access to the MS3000 OTLE website. The data sets can be downloaded from the Time Series section.
spreadsheets
It is expected that you will complete this work by using a statistical package or a spreadsheet programme (such as Excel) on a net book, tablet or any other computer or on a graphing or CAS calculator. You can use the spreadsheet you are familiar with for processing the time series data. The files on the MS3000 OTLE website are Excel files, so if you are not using Excel you will need to copy and paste the data into the file type suitable for the spreadsheet you are using. Make it a habit to add your name to the beginning of the file name because your teacher needs to know that it is yours.
Make sure that you save your work on your spreadsheet regularly.
Write your answers to each exercise in the spaces in this workbook, print the spreadsheet work you have completed, making sure that all work is clearly labelled, and post your work back to your Mathematics and Statistics teacher either with this booklet or before you complete this booklet to provide evidence of your learning. It is important that you send a cover sheet with any work you send in by post as the bar code is necessary for Te Kura to track your work return.
It is important to post work regularly to your Mathematics and Statistics teacher so that it can be recorded as engagement.
If you need help with your spreadsheet contact your Statistics teacher.
the trend
One of the main uses for time series analysis is to make a forecast. We need to graph the raw data and the moving means and this is easier when using a spreadsheet. Once the raw data and the moving means have been graphed on the same axes, it is a small step to find the equation of the trend line. The trend equation is based on the smoothed data only so it is vital to ensure that only the graph of the moving means is highlighted.
In Excel once the moving means is highlighted a dropdown menu gives a choice of models to fit as the trend and gives the option to find its equation. This booklet will focus on a linear trend so you will choose the linear model from the dropdown menu. You will find help for using Excel on the MS3000 OTLE site. Contact your Mathematics and Statistics teacher if you need help with this.
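If you are curious about what a linear trend line involves, the Python sketch below fits a least-squares straight line to a set of smoothed values, which is the usual method behind a linear trend line. The periods and moving means here are hypothetical numbers made up for illustration; they are not taken from any MS3000 OTLE data set, and a spreadsheet remains the expected tool for this booklet.

```python
def fit_line(xs, ys):
    """Least-squares straight line through the points; returns (gradient, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept

# Hypothetical smoothed values (moving means), NOT from the booklet's data sets.
periods = [3, 4, 5, 6, 7, 8]
moving_means = [16.2, 16.0, 16.4, 15.8, 16.0, 15.6]

# Fit the line to the smoothed values only, never to the raw data.
gradient, intercept = fit_line(periods, moving_means)
print(f"y = {gradient:.4f}x + {intercept:.3f}")
```

Note that only the smoothed values are passed to `fit_line`, which mirrors the instruction to highlight only the graph of the moving means.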
Absences data
1. Save the data set Absences data from the MS3000 OTLE site to your computer. Add your name to the beginning of the file name. Open the file by double-clicking on the left of the file name.
This data is a time series of the number of absences by Year 12 students over a four-week period. As the data were collected over a five-day period we calculate moving means of order five. We say that the data has seasonality of order five.
Using your spreadsheet, you are to:
• calculate odd point moving means to smooth the data
• draw the graph of the raw data and the smoothed data on the same graph
• draw the trend line on the graph (this must be based on the smoothed data not the raw)
• comment on the features of this time series data
• find the equation of the trend line
• check the MS3000 OTLE website to mark and correct your work.
Name all work clearly.
3A
Name your files by adding your name to the beginning of the file name.
Make sure that you save your work regularly. (Press the save icon or use CTRL + S on Excel.)
2. Write your comments on the features of this time series here. Make sure that you write your comments in context.
3. For the Absences data, write your equation of the trend line in the space provided.
y =
Check your answers.
Check the website.
4. If your answer is not correct, comment on the changes that you think you need to make to your spreadsheet so that you get the correct answer.
It is best to correct your work before you send it to your teacher.
Name your marked spreadsheet work. Print out your work and send it in to your teacher with this booklet.
naming files
When you save a file to your computer, it’s best to personalise it. Make sure that the file name begins with your name so that your teacher can read your name without opening the file. Then write the name of the task and whether it is an assignment or assessment. Your Te Kura ID number is useful too. For example: Jo Brown-10339906-TimeSeriesAssignment.doc
Remember that your teacher gets lots of these and if they are all named ‘Retail data’ it would be a real problem to try to identify who each file belongs to.
How to name files (*for some computers):
• left-click to highlight the file name
• left-click again so that the box around the file name appears
• then hover over the name of the file with the mouse to the position you want and click again to place a cursor in the name box, then you can type what you want.
*NB: For Mac users don’t left/right-click.
4 linear trend lines
learning intentionIn this lesson you will learn to:
• write comments about the gradient in context.
introduction
A line of best fit can be drawn through the smoothed data, enabling us to see the trend clearly. This line is used for the first step when calculating a forecast. This lesson will focus on describing linear trends.
linear trend lines
Before we attempt to describe the gradient we note:
• the trend (it can often be described as linear in this course)
• the units on the y-axis – number of absences
• the units on the x-axis – day
• the numerical value of the gradient (rounding is usually sensible)
• whether the gradient is positive or negative
  – if positive, use increasing to describe the trend
  – if negative, use decreasing to describe the trend
• all comments need to be in context. This means that you need to write a sentence using the variables and units given with the data – in this case number of absences per day.
using the absences data
Number of absences by day
I notice that:
• the trend is linear
• the units on the y-axis – number of absences
• the units on the x-axis – days of a five-day week
• the numerical value of the gradient is –0.0618
• the gradient is negative so use decreasing to describe the trend.
[Figure: Number of absences by day, Monday to Friday over four weeks; y-axis: Number of absences, 0 to 35; series: Raw, cmm, Linear (cmm); trend line equation: y = –0.0618x + 16.424]
This means that the quantitative description of the gradient is:
Absences for Year 12 students at this school are decreasing by approximately 0.0618 absences each day of a five-day week.
A secondary school is required by law to be open 190 days. We could multiply 0.0618 by 190 to give a yearly approximation (190 × 0.0618 = 11.74).
Absences are decreasing by approximately 12 absences per year for a school year of 190 days.
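The yearly figure is just the daily gradient scaled up by the number of school days. A short Python sketch of that arithmetic, using the gradient and the 190-day year quoted above (the variable names are illustrative):

```python
gradient_per_day = -0.0618    # gradient of the trend line, absences per school day
school_days_per_year = 190    # days a secondary school is required to be open

change_per_year = gradient_per_day * school_days_per_year
print(round(change_per_year, 2))  # about -11.74 absences over a school year
print(round(change_per_year))     # rounded sensibly: about -12
```

The negative sign matches the word decreasing in the written description.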
Remember the units must be correct and it is wise to round your answer sensibly.
Sleep data
1. Save the file Sleep data from the MS3000 OTLE website to your computer. Add your name to the beginning of the file name. This time series data records the number of hours of sleep had by a Year 13 student over a four-week period (moving means of order 7). Note that in this case you will have to leave three empty cells at the top of the moving average column and three empty cells at the bottom.
You are to:
• calculate odd point moving means
• draw the graph of the raw data and the moving means
• draw the trend line on the graph
• comment on the features of this time series data
• find the equation of the trend line
• write a quantitative comment about the gradient.
Make sure that you save your work regularly.
2. Write your comments on the features of this time series here:
Your answers must be written in context and rounded sensibly.
4A
3. For the Sleep data, write your equation of the trend line in the space provided.
y =
4. Write your description of the gradient here:
5. If your answer is not correct, comment on the changes you need to make:
Name your final marked spreadsheet work. Send it to your teacher.
Check your answers.
Your teacher will be interested in reading your answer.
significant figures and rounding numbers
Computers and calculators will give us many more decimal places for a number than is sensible for the problem. If we are working with data from an experiment that has been recorded to three significant figures, is it sensible to give the answer to more than three significant figures? No, it is not. The third significant figure has itself arisen as a result of rounding and could contain errors. It is perhaps more sensible to round the final answer to two significant figures. Think carefully about the number of decimal places you will use in your spreadsheets and always look at the number of significant figures in the original data as a guide. It is wise to consider how the data were gathered and why a given number of significant figures or decimal places was selected.
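As an illustration of rounding to a chosen number of significant figures, here is a minimal Python sketch. The helper `round_sig` is a hypothetical function written for this example (Python's built-in `round` works in decimal places, not significant figures), and the numbers are the gradient and intercept from the absences trend line in lesson 4.

```python
import math

def round_sig(value, sig_figs):
    """Round value to the given number of significant figures."""
    if value == 0:
        return 0.0
    # Position of the first significant digit, e.g. -2 for 0.0618 and 1 for 16.424.
    magnitude = math.floor(math.log10(abs(value)))
    return round(value, sig_figs - 1 - magnitude)

print(round_sig(-0.0618, 2))  # gradient to two significant figures
print(round_sig(16.424, 3))   # intercept to three significant figures
```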
5 first steps of writing a statistical report
learning intentions
In this lesson you will learn to:
• present your report in a structured manner.
introduction
In order that the report has cohesiveness it is advisable to follow the structure outlined below:
• Introduction: the purpose of the investigation
• Data source: a brief description of the data and where it comes from
• Results from your data including analysis
• Discussion of your findings.
Remember the statistical enquiry cycle (PPDAC). You may wish to check the PPDAC poster on the MS3000 OTLE site.
the report introduction
1. Begin your report with a statement describing the purpose of your investigation.
The following could be used as an introduction for the absences data used in chapter three. Complete the paragraph by filling in the gaps:
‘The purpose of this 1. _______________ is to determine the features of the 2. ___________
series formed by the number of 3. _____________________ by Year 12 students over a
four-week period. I will find the 4. ____________ of the trend line after smoothing the raw
5. _________________ .’
5A
the data
2. Describe briefly the data, where it comes from and state the variables used.
Here’s an example that could be used for the Absences data. Complete the paragraph by filling in the gaps:
This data set I am using is of the 1. _________________________ of absences by Year 12 2. ________________ over a four 3. _____________________ period for the five 4. ________________________ of a school week. The 5. ___________________ are days of the week for the independent variable and number of absences for the
6. _____________________________ variable. I found this data on the Te Kura MS3000 7. ________________________ website.
Check your answers.
6 review activity
learning intention
In this lesson you will learn to:
• investigate time series data with odd point moving means.
the hot stuff bakery
Save the file Hot Stuff Bakery data from the MS3000 OTLE website to your computer. Add your name and ID number to the beginning of the file name.
task
The Hot Stuff Bakery kept a record of bread sales for a four-week period. The Hot Stuff Bakery is a small bakery situated on the main highway on the outskirts of a small rural town. It specialises in bread, but also sells other bakery products.
Carry out a time series statistical investigation using the Hot Stuff bakery data.
statistical investigation process
Undertake an investigation to determine what pattern there is to bread sales and use this to determine the trend.
• Select and use appropriate displays in your analysis and discuss these.
• Identify features in the data and relate these to the context.
• Find an appropriate model.
report structure
• Introduction: the purpose of the investigation.
• Data source: a brief description of the data and where it comes from.
• Results from your data including analysis.
• Discussion of your findings.
the results
Include all of the analysis you have carried out on the data set. Use the statistical graphs and numerical statistics you have obtained from the time-series graph, and make sure that you include clear headings in the spreadsheet and that your work is set out in a logical way.
Refer to lessons 1–4 to review the concepts of:
• smoothing a time-series graph using odd point moving means
• using a spreadsheet to view the overall trend
• adding linear trend lines
• evaluating the value of the correlation coefficient and other features evident in the data set.
discussion
Make sure that you answer the questions that you pose about the data set and that you make statements, based on your findings, which relate to the situation being investigated.
Again, refer back to lessons 1–4 to review the concepts.
Read your introduction to ensure that you have addressed all the points you raised initially about the data set. Justify statements you make by referring to data, trends, seasonal patterns or other features of the visual displays you have produced. Well-supported answers need references back to the data, visual and calculated trends, any seasonal patterns noted or other features evident in the visual displays.
Relate your findings to the context of the data set – your statistical thinking should be evident in your report.
Name and print out all of your files.
Remember to fill in the self-assessment page when you have completed the work in this booklet. You’ll find it inside the back cover sheet. Return this booklet and a printout (hard copy) of your self-marked Excel answers to the exercises along with your answers to this review activity to your Mathematics and Statistics teacher to assess.
Your ability to relate your findings to the context and the depth of thinking evident in your report will determine your overall grade.
Your teacher will assess this work.
28 © te aho o te kura pounamuMS3081
review activity
what to do nowComplete the cover sheet and self-assessment form and send with this booklet to your Te Kura Mathematics and Statistics teacher. Remember to include a printout of your spreadsheet with all of your working for this review activity.
presenting material for assessment
Your teacher needs to see all the evidence of your selected method, so you must print out all files and include these with any other related material. Send these all together to your Te Kura Statistics teacher with the cover sheet (MS3081) attached.
You could email your assessment files to your teacher, but you must also post the completed task. Ideally, you should do both. This means you should post a printout of your complete Review activity and email the files to your Statistics teacher.
Post: print out all files related to the assessment. Write your name and MS3081 at the top of each piece of paper. Send to your Statistics teacher with the cover sheet attached (MS3081).
Email: ensure that all files for the assessment are named correctly before you email them to your teacher. Begin the file name with your name. Then write the name of the task and whether it is an assignment or assessment. Your Te Kura ID number is useful too. For example: Jo Brown-10339906-TimeSeriesAssignment.docx
Remember – clearly name your electronic files.
7 answer guide
1. introduction to time series
1. voltage
2. seconds
3. effect
4. larger
5. voltage
6. increase
7. decrease
8. 0.5 mV
9. decreasing
10. time
11. cyclical
12. unusual
Your answers could be different from these.
1. Until about 1984 the overall trend appears to be increasing slightly. After 1984 the trend is decreasing and it appears that it could be linear. This means that the consumption of cigarettes was decreasing from 1984 to 1993. There appears to be some kind of pattern with peaks and troughs and, as the time period is approximately three years, this indicates that there could be a cyclical effect, but it does not appear to be strong. There are no obvious spikes.
2. There appears to be a linear trend that is increasing. A clear seasonal pattern is present indicating that there is a strong seasonal effect. The peak of the last cycle could be higher than expected indicating that some kind of noise is present.
3. The large spike or irregularity in the barracouta catches was caused by foreign fishing vessels catching as much as possible before the introduction of the 200 nautical mile Exclusive Economic Zone (EEZ) to New Zealand waters in 1978. The effect is visible in the marked drop in mackerel catches during 1978, too. Neither graph has any obvious cyclical or seasonal component. The trend in the barracouta catches shows a slight overall increase since 1978, and mackerel catches are on the rise since 1980 and at a faster rate than the barracouta catch. More mackerel than barracouta were caught in 1991.
4. Check out the file Belle's Dairy Farm on the MS3000 OTLE website.
1A
1B
2. first steps in smoothing the raw data
Year   Data   Mean 5
1974   289
1975   305
1976   300    305
1977   326    309
1978   307    311
1979   308    309
1980   313    308
1981   292    311
1982   320    312
1983   322    311
1984   313    306
1985   306    300
1986   271    294
1987   290    290
1988   292    287
1989   292    288
1990   289    288
1991   276
1992   291
d = 311
That is the mean of the five data values centred on 1978: (300 + 326 + 307 + 308 + 313) ÷ 5 = 310.8 (311 to the nearest whole number).
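If you want to check the Mean 5 column without a spreadsheet, the calculation can be sketched in a few lines of Python. This is only an illustration: the function name moving_means and the variable names are assumptions made for this example, and the booklet itself expects you to use a spreadsheet.

```python
# Data values for 1974 to 1992, copied from the table above.
years = list(range(1974, 1993))
data = [289, 305, 300, 326, 307, 308, 313, 292, 320, 322,
        313, 306, 271, 290, 292, 292, 289, 276, 291]


def moving_means(values, points=5):
    """Return odd point moving means, rounded to the nearest whole number."""
    if points % 2 == 0:
        raise ValueError("this sketch only handles odd point moving means")
    # Each mean uses `points` consecutive values; the window slides one step at a time.
    return [round(sum(values[i:i + points]) / points)
            for i in range(len(values) - points + 1)]


means = moving_means(data)

# A 5 point mean is centred on the middle year of its window,
# so the first two and last two years have no moving mean.
half = 5 // 2
centred = dict(zip(years[half:len(years) - half], means))

print(centred[1978])  # 311, the value of d
```

The first and last two years are left without a moving mean because there are not enough values on both sides of them to fill a window of five.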
3. using the spreadsheet to find the trend
1. Check the Absences data spreadsheet answers on the MS3000 OTLE site.
2. There is a linear trend that is decreasing slightly. This means that over this three-week period, the number of absences is decreasing slightly.
Seasonal variation is present, with a peak in the number of absences for Year 12 students on Fridays and a decrease in the number of absences on Wednesdays.
Random variation (residual effect) is present and is shown by the irregular pattern in the peaks and troughs. The data for the third week shows a trough that is deeper than expected and a peak that is higher than expected. This means that on the third Wednesday there were fewer absences than expected whereas on the Friday there were more absences than expected.
3. y = –0.0618x + 16.424
4. Your comments.
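To see what "decreasing slightly" means for the trend line in answer 3, you could evaluate the equation at the start and end of the period. The sketch below assumes that x counts school days from 1 to 15 (three five-day weeks); that numbering is an assumption for illustration, so check it against the spreadsheet. The function name absences_trend is also made up for this example.

```python
def absences_trend(x):
    """Trend line from answer 3: y = -0.0618x + 16.424."""
    return -0.0618 * x + 16.424


# Assumption: x = 1 is the first school day and x = 15 is the last.
first_day = absences_trend(1)
last_day = absences_trend(15)
change = last_day - first_day

print(round(first_day, 2), round(last_day, 2), round(change, 2))
```

Under that assumption the trend value falls by less than one absence over the whole period, which is why the decrease is described as slight.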
2A
3A
4. linear trend lines
1. Check the Absences data spreadsheet answers on the MS3000 OTLE site.
2. There is a linear trend that is decreasing slightly.
Seasonal variation is present, with a decrease in the hours of sleep for Year 13 students on Saturdays followed by a peak in the hours of sleep for Year 13 students on Sundays.
Random variation (residual effect) is present and is shown by the irregular pattern in the peaks and troughs and in particular the second weekend showing a deeper trough and larger peak.
3. y = –0.0285x + 7.2609
If you did not get this answer, check the Excel file on the OTLE MS3000 site.
4. I notice that:
• the trend is linear
• the units on the y-axis – hours of sleep
• the units on the x-axis – days of a five day week
• the numerical value of the gradient is –0.0285
• the gradient is negative so use decreasing to describe the trend.
This means that: the hours of sleep for Year 13 students at this school are decreasing by approximately 0.0285 hours each day over this four-week period. This is better given as: the hours of sleep for Year 13 students at this school are ‘decreasing’ by approximately 1 minute and 43 seconds each day of this four-week period.
5. Your comments.
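The conversion in answer 4 from 0.0285 hours to about 1 minute and 43 seconds can be checked with a short calculation. The Python below is a minimal sketch of that arithmetic; the variable names are chosen for this example only.

```python
# Gradient of the trend line from answer 3, in hours of sleep per day.
gradient_hours_per_day = -0.0285

# Work with the size of the change; the negative sign only tells us it is a decrease.
total_seconds = round(abs(gradient_hours_per_day) * 3600)  # 3600 seconds in an hour

# Split the total into whole minutes and leftover seconds.
minutes, seconds = divmod(total_seconds, 60)

print(minutes, "minute(s) and", seconds, "seconds")
```

Giving the gradient in minutes and seconds makes the size of the daily change easier to picture than a small decimal number of hours.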
5. first steps of the report
1. 1. investigation 2. time 3. absences 4. equation 5. data
2. 1. number 2. students 3. week 4. days 5. variables 6. dependent 7. OTLE
4A
5A
acknowledgments
Every effort has been made to acknowledge and contact copyright holders. Te Aho o Te Kura Pounamu apologises for any
omissions and welcomes more accurate information.
Data sets: All from Stage 1 team, Department of Statistics, The University of Auckland except for the Hot Stuff Bakery
data that is imaginary.
Images:
iStock:
Cover: Generic chart – 18869948
Detective – 18432196
ECG graph (BW closeup) – 171684
Annual report – 6372295
Happy students attending class – 17757341
Student sleeping – 1739813
Bread – 17790922.
self-assessment ms3081
Fill in the rubric by ticking the boxes you think apply for your work. This is an opportunity for you to reflect on your achievement in this topic and think about what you need to do next. It will also help your teacher. Write a comment if you want to give your teacher more feedback about your work or to ask any questions.
Fill in your name and ID number. Student name: Student ID:
Not yet attempted
Didn’t understand
Understood some
Understood most
Very confident in my understanding
Investigate time series data with odd point moving means.
Please place your comments in the relevant boxes below.
Student comment
Investigate time series data with odd point moving means.
Any further student comments.
Contact your teacher if you want to talk about any of this work. Freephone 0800 65 99 88
teacher use only
Teacher comment
Please find attached letter
authentication statement I certify that the assessment work is the original work of the student named above.
for school use only
assessment
www.tekura.school.nz
cover sheet – ms3081
students – place student address label below or write in your details.
Full name
ID no.
Address (If changed)
Signed(Student)
Signed(Supervisor)