MS3081

te tauanga statistics 2013/1 MS3081 introduction to time series ncea level 3

description

Maths Statistics NCEA Level 3

Transcript of MS3081

Page 1: MS3081

te tauanga

statistics

2013/1

MS3081

introduction to time series

ncea level 3

Page 2: MS3081

© te aho o te kura pounamu

statistics ncea level 3

Expected time to complete work

This work will take you about 10 hours to complete.

You will work towards the following standard: Achievement Standard 91580 (Version 1), Mathematics and Statistics 3.8: Investigate time series data. Level 3, External, 4 credits.

In this booklet you will focus on this learning outcome: • investigating time series data with odd point moving means.

You will continue to work towards this standard in booklets MS3082 and MS3083.

Copyright © 2013 Board of Trustees of Te Aho o Te Kura Pounamu, Private Bag 39992, Wellington Mail Centre, Lower Hutt 5045, New Zealand. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means without the written permission of Te Aho o Te Kura Pounamu.

Page 3: MS3081


contents

1 Introduction to time series

2 First steps in smoothing the raw data

3 Using a spreadsheet to find the trend

4 Linear trend lines

5 First steps of writing a statistical report

6 Review activity

7 Answer guide

Page 4: MS3081


how to do the work

When you see:

1A Complete the activity.

Check your answers.

Your teacher will assess this work.

Check the website.

Contact your teacher.

You will need:

• access to the Internet to get on to the MS3000 OTLE website where you will find the data sets you need and the spreadsheet answers.

• any type of appropriate mathematical and statistical technology such as graphing calculator, net book or tablet that has a spreadsheet.

Resource overview

In this booklet you will learn how to comment on the features of time series graphs. You will learn how to smooth the raw data using odd point moving means and then find the equation of the trend line. It is expected that you will use a spreadsheet for this work. In the following booklet, MS3082, you will learn about even point moving means.

Page 5: MS3081


1 introduction to time series

learning intentions

In this lesson you will learn to:
• identify the features in a time series
• relate the features to the context.

introduction

Do you:
• measure your heartbeat?
• log your sports performance?
• watch the share market?

If you do, then you’re already into time series.

features of time series

A time series is a set of observations or measurements made at regular intervals over a period of time. Daily temperatures, commodity prices, records of growth, population patterns and exchange rates are all examples of time series.

The graph of an electrocardiogram (ECG) is a time series graph.

Throughout the booklets on time series, you’ll be looking at various types of time series, learning the skills and techniques needed to interpret what has happened and forecast what may happen next.

You’ll have a chance to apply the skills as you learn them, then in the last lesson you will put it all together and work through the assessment task.


Page 6: MS3081


features of time series graphs

Time series are made up of four components:

two long-term:
• T the overall trend
• C the long-term cycle

and two short-term:
• S the seasonal variation
• R (or I) the residual; that is, any random irregularity.

Any or all of these components may be present in a time series. Let’s have a look at some time series graphs and identify the components.

the trend

T, the trend of a series, is sometimes called the secular trend (secular means ‘over a long period of time’). The easiest way to discover a trend is to use a spreadsheet to graph the raw data and trend line. The trend may be a straight line or a curve. Try to think what the trend implies about the data.

You can see from the graph that the trend shows a gradual increase in the cost of living from March 1992 to December 1994.

The steeper slope of the graph line from June 1994 shows that the rate of increase was greater during the second half of 1994 than it was during the first half.

The secular trend tells you that the cost of living is rising.

[Figure: New Zealand consumer price index (quarterly). Vertical axis: Price ($), 960 to 1 020; horizontal axis: Year, quarters from Mar-92 to Dec-94.]

Page 7: MS3081


the cycle

Looking at the graph of a series is also the best way to pick up any long-term cycles, C, which are hard to spot from the data. For a cycle to be considered long-term, it has to be over two years in length. Anything shorter is called a seasonal variation.

You’ll be able to work out the length of a cycle by finding the time between peaks or dips in the graph line.

Here, you can see that the graph line peaks roughly every seven years.

The Sound and Sight Company share price time series has a seven-year cycle. There’s also a long-term or secular trend showing an overall increase in the share price.

[Figure: Sound and Sight Company share price. Vertical axis: Price (cents), 0 to 200; horizontal axis: Year, 1965 to 2000.]
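The idea of measuring a cycle by the time between peaks can be sketched in a few lines of Python. This is only an illustration: the prices below are invented to show a seven-year pattern on a rising trend (they are not the Sound and Sight Company data), and the function name `peak_positions` is hypothetical.

```python
def peak_positions(values):
    """Return the indices of points that are higher than both neighbours."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]


# Invented share prices (cents), one per year from 1965; not the booklet's data.
years = list(range(1965, 1986))
cycle = [0, 10, 20, 30, 20, 10, 5]  # made-up seven-year pattern
prices = [50 + 2 * i + cycle[i % 7] for i in range(len(years))]

# Years in which a peak occurs, and the number of years between successive peaks.
peak_years = [years[i] for i in peak_positions(prices)]
gaps = [later - earlier for earlier, later in zip(peak_years, peak_years[1:])]

print(peak_years)
print(gaps)
```

With real data the gaps will rarely be identical, which is why the booklet says the line peaks "roughly" every seven years.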

Page 8: MS3081


seasonal variation

S, the seasonal variations, are often known about before the data is collected. With their regular and recurring patterns, they are easy to spot from the graph. Look at the time axis (horizontal) to work out the period of the variations. The period is the length of time between two peaks (or troughs). It may be a week, a month or any fraction of a year.

Even regular daily variations over the period of a day are classed as seasonal features, as in this example.

[Figure: Patient temperature chart, May 13–16. Vertical axis: Temperature (°C), 35 to 40, with normal marked at 36.8; horizontal axis: Time, with readings at 6.00 am, 2.00 pm and 9.00 pm each day.]

You can see that the patient’s temperature rises to a peak at 9.00 pm, then falls steadily during the night to an early-morning low. This marked seasonal pattern diminishes as she recovers.

Seasonal variations are often present in time series connected with climate or season.

Page 9: MS3081


residual effect

R, the residual effect, describes the irregular or random variations that occur in all kinds of time series. They show up as ‘spikes’ or ‘dips’ in the graph line and are often described as noise in a time series.

They can be caused by unpredictable rare events (earthquake, fire, etc.), measurement error, or just plain random human behaviour. The residual components make it impossible to forecast future values with total accuracy. Who knows what spanner may get thrown in the works!

[Figure: Price of audio component ($). Vertical axis: Price ($), 0 to 350; horizontal axis: Months, Mar-90 to Jul-93.]

The spike in the graph shows a temporary rise in price. A factory fire halted production of the component. This led to a shortage, which in turn pushed prices up temporarily.

Page 10: MS3081


Other irregular features you may see from time to time are:

[Figures: Steps; Ramps]

These cause the trend line to be displaced.

They can be caused by:
• a change in the time period
• a change in collection method or reporting procedure
• opening or closing of a sales outlet
• a change in price
• a change in the law
• and so on.

In the July 1986 budget, there was a large increase in the tax on tobacco products. This is evident in the step feature indicating a dramatic rise in the price index recorded over the next quarter (three months).

[Figure: Quarterly price index for tobacco products (Base Dec 93, index 1000). Vertical axis: Price index, 200 to 600; horizontal axis: Quarter, Dec-82 to Dec-88.]

Page 11: MS3081


reading time scales on graphs

Sometimes it can be confusing to know exactly when changes are occurring, especially when data points are not graphed separately, but joined by lines.

For most data measured or collected, the time point stated usually means ‘up to the end of the period’.

This is a small part of a data set on mackerel caught in New Zealand waters.

Year     1977     1978    1979
Tonnes   18 492   6 605   7 589

Between the end of 1977 and the end of 1978, there were 6 605 tonnes of mackerel caught. That means that during 1978, there was a decrease of almost 12 000 tonnes. You can see this represented on the graph in activity 1B, question 3.
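The decrease quoted here can be checked with a line or two of arithmetic. A minimal sketch in Python, using the three catch figures from the table (the variable names are illustrative only):

```python
# Mackerel catch (tonnes) recorded up to the end of each year, from the booklet's table.
catch = {1977: 18492, 1978: 6605, 1979: 7589}

# Change from the previous year: a negative number means a decrease.
changes = {year: catch[year] - catch[year - 1]
           for year in catch if year - 1 in catch}

for year, change in changes.items():
    print(year, change)
```

The change for 1978 comes out at 11 887 tonnes fewer than 1977, which is the "almost 12 000 tonnes" described in the text.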

identifying features in the data

Look at the time series graph of the raw data to identify features.

Consider all four components of time series, although you may find that not all four are present. Often the long-term cycle, C, is not apparent. Make sure that you write your comments in context. This means that you will need to identify the variables and relate these to your comments. Your comments should be about the data set that you are investigating and not generic comments alone. You need to be able to interpret what you see in the graphs and any statistical summaries that you create from the spreadsheet. You are a data detective and your mission is to unravel the story that is hidden within and between data points.

An ECG graph is made by recording the electrical impulses made while the heart is beating and printing these on a paper strip where the side of each small square represents 1 mm (each larger square is 5 mm in length). Each larger square represents 0.2 seconds. Look at the bottom time series in the ECG graph below. This is the rhythm strip. The data was collected over approximately five seconds and the voltage is measured on the vertical axis, with 10 mm being equal to 1 mV. Comment on the features of this time series by filling in the gaps in the passage below.

1A

Page 12: MS3081


The variables for the ECG graph are 1. ________________ measured in mV and time measured in 2. ________________.

There is a clear seasonal 3. ________________ that begins with a small wave, a large spike and a second wave that is 4. _______________ than the first. This indicates that the resting 5. ____________ recorded from the heart makes a small 6. ____________ and decays, then rapidly spikes by about 1 mV followed by an equally rapid 7.____________ that drops below the resting voltage. Finally there is a second slower increase of just under 8.________________ mV. The trend is neither increasing nor 9._______________ so we say that the trend for voltage is constant over this 10.__________________ period.

As this data covers only a few seconds we are not able to determine the 11. _____________________ effect.

The third spike is slightly lower than the others recorded on this graph but is probably not significant enough to consider as 12.__________________.

To find out more about the ECG refer to the links on the MS3000 OTLE website.


Page 13: MS3081


Describe the features you see in these time series graphs. Make sure that you write your comments in context.

1.

[Figure: Yearly cigarette consumption in New Zealand. Vertical axis: Number (millions), 3 000 to 6 500; horizontal axis: year, 1980 to 1993.]

2.

[Figure: Quarterly retail chemists' sales in New Zealand ($ millions). Vertical axis: Sales ($m), 200 to 280; horizontal axis: quarter, Mar-91 to Sep-94.]

1B

Page 14: MS3081


3.

[Figure: Mackerel and barracouta catches in New Zealand waters. Vertical axis: Tonnes, 0 to 45 000; horizontal axis: Year, 1971 to 1991. Series: mackerel; barracouta.]

Page 15: MS3081


4. Graph matching Choose the correct title from the list on page 15 or at the end of this exercise, and write it above the appropriate graph.

1. No trend with random variation

2. Trend plus seasonal variation plus random variation

3. Perfect trend

4. Spike

5. Perfect trend plus seasonal variation

Work through the PPT ‘Belle’s Dairy Farm’ on the MS3000 OTLE site and use this to check your answers.

Check your answers.

[Graphs for matching: untitled time series plots of Milk (10 000 L) against Time period.]

Page 16: MS3081


2 first steps in smoothing the raw data

learning intention

In this lesson you will learn to:

• smooth the raw data using odd point moving means.

introduction

Often it is hard to see the secular trend in a time series graph because of the fluctuations in the graph line. Smoothing the data makes the trend more evident.

moving means

It is easy to use a spreadsheet to calculate moving means.

The advantages of this method are:
• it gives a smoother curve than the raw data
• it spreads the effect of extreme values.

We will use moving means in our analysis of time series.

In this example, the average or mean of every three data values is found.

Death rate per thousand of young American males from 1912 to 1922

Year            1912  1913  1914  1915  1916  1917  1918  1919  1920  1921  1922
Rate             4.5   4.7   4.4   4.2   4.5   5.0  12.2   5.3   4.8   3.8   3.8
Mean 3 years           4.5   4.4   4.4   4.6   7.2   7.5   7.4   4.6   4.1

The smoothed rate for 1918 is the mean of the rates for 1917, 1918 and 1919. See how the effect of the ‘spike’ has been spread over the three years.

[Figure: Death rate per thousand of young American males. Vertical axis: Rate per thousand, 0.0 to 14.0; horizontal axis: Year, 1912 to 1922. Series: raw data; moving means (3). Annotation on the 1918 moving mean: (5.0 + 12.2 + 5.3) ÷ 3 = 7.5]
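The three-point moving means in the death-rate table can be reproduced outside a spreadsheet. The following is a minimal sketch in Python, not the booklet's method: the function name `moving_means` is an illustrative choice, and it handles odd orders only.

```python
def moving_means(values, order):
    """Centred moving means for an odd order, rounded to one decimal place.

    The first and last positions that cannot be averaged are left as None.
    """
    if order % 2 == 0:
        raise ValueError("this sketch only handles odd orders")
    half = order // 2
    smoothed = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]  # the point and its neighbours
        smoothed[i] = round(sum(window) / order, 1)
    return smoothed


years = list(range(1912, 1923))
rates = [4.5, 4.7, 4.4, 4.2, 4.5, 5.0, 12.2, 5.3, 4.8, 3.8, 3.8]
means = moving_means(rates, 3)

for year, rate, mean in zip(years, rates, means):
    print(year, rate, "" if mean is None else mean)
```

The value printed for 1918 is 7.5, the mean of 5.0, 12.2 and 5.3, so the spike of 12.2 is shared across 1917, 1918 and 1919.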

Page 17: MS3081


number of points – series without a seasonal component

The number of values the data is averaged over is often referred to as the number of points. So ‘a three-point moving average’ tells you that means of groups of three values have been used to smooth the data. We say that we calculate a moving mean of order three.

For time series without a seasonal pattern, the number of time periods chosen is not critical.

But remember, the larger the number of time periods or points chosen for averaging, the more values you lose off the ends. With three points you lose one value from each end. With five points you lose two, and with seven you lose three. This can be a drawback if you don't have many data values.
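The rule for how many values are lost can be written as a small calculation: for an odd order, (order − 1) ÷ 2 values are lost from each end. A minimal sketch in Python, with hypothetical function names:

```python
def values_lost_per_end(order):
    """Number of values with no moving mean at each end, for an odd order."""
    if order < 1 or order % 2 == 0:
        raise ValueError("this sketch only handles odd orders")
    return (order - 1) // 2


def smoothed_count(n_values, order):
    """How many moving means can be calculated from n_values data values."""
    return max(n_values - 2 * values_lost_per_end(order), 0)


# For a series of 19 values: order, values lost per end, moving means available.
for order in (3, 5, 7):
    print(order, values_lost_per_end(order), smoothed_count(19, order))
```

For a short series this makes the drawback concrete: a seven-point moving mean on 19 values leaves only 13 smoothed points.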

If the number of periods averaged is too small, the graph may still have too many variations for the trend to be evident. If the number of periods is too large, many features of the data may be lost.

You'll get the idea by looking at the example shown for the value of exports. The 12-point moving average smooths the data well, while the six-point moving average leaves quite large variations remaining.

series with a seasonal pattern

For time series with a seasonal pattern, the number of time periods chosen for averaging is usually the same as is present in one cycle of the raw data. For quarterly raw data, one cycle has four pieces of data so a moving mean of order four (four-point moving mean) is used. For data collected on each of seven days of a week, one cycle is made up of seven pieces of data so a seven-point moving mean is used. We would say that we calculate a moving mean of order seven.

[Figure: Value of exports: fish, crustaceans, dairy and other animal products. Vertical axis: Value, 100 to 500; horizontal axis: Mar-85 to Mar-95. Key: monthly; 6 pt moving mean; 12 pt moving mean.]

Page 18: MS3081


In the chart below, means of five adjacent data values have been found. (The means have been rounded to the nearest whole number.)

Complete the chart by calculating the missing five-point means, d, e and f.

Thickness of the ozone layer above New Zealand (measured in Dobson Units in July each year)

Year   Data   Mean 5
1974   289
1975   305
1976   300    305
1977   326    309
1978   307    d
1979   308    309
1980   313    308
1981   292    311
1982   320    312
1983   322    311
1984   313    306
1985   306    300
1986   271    e
1987   290    290
1988   292    287
1989   292    288
1990   289    f
1991   276
1992   291

Check your answers.
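Once you have worked out d, e and f by hand, one way to check them is to let a short program calculate every five-point mean for the ozone data. This is a sketch only: the function name `moving_means` is hypothetical, and the means are rounded to the nearest whole number as in the chart.

```python
def moving_means(values, order):
    """Centred moving means for an odd order, rounded to the nearest whole number.

    Positions at each end that lack enough neighbours are left as None.
    """
    if order % 2 == 0:
        raise ValueError("this sketch only handles odd orders")
    half = order // 2
    means = [None] * len(values)
    for i in range(half, len(values) - half):
        means[i] = round(sum(values[i - half:i + half + 1]) / order)
    return means


years = list(range(1974, 1993))
ozone = [289, 305, 300, 326, 307, 308, 313, 292, 320, 322,
         313, 306, 271, 290, 292, 292, 289, 276, 291]

for year, value, mean in zip(years, ozone, moving_means(ozone, 5)):
    print(year, value, "" if mean is None else mean)
```

The rows printed for 1978, 1986 and 1990 correspond to d, e and f; the other rows should match the means already given in the chart.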

2A

Page 19: MS3081


3 using a spreadsheet to find the trend

learning intention

In this lesson you will learn to:

• use a spreadsheet to find the trend.

introduction

You will need access to the MS3000 OTLE website. The data sets can be downloaded from the Time Series section.

spreadsheets

It is expected that you will complete this work by using a statistical package or a spreadsheet programme (such as Excel) on a net book, tablet or any other computer or on a graphing or CAS calculator. You can use the spreadsheet you are familiar with for processing the time series data. The files on the MS3000 OTLE website are Excel files, so if you are not using Excel you will need to copy and paste the data into the file type suitable for the spreadsheet you are using. Make it a habit to add your name to the beginning of the file name because your teacher needs to know that it is yours.

Make sure that you save your work on your spreadsheet regularly.

Write your answers to each exercise in the spaces in this workbook, print the spreadsheet work you have completed, making sure that all work is clearly labelled, and post your work back to your Mathematics and Statistics teacher either with this booklet or before you complete this booklet to provide evidence of your learning. It is important that you send a cover sheet with any work you send in by post as the bar code is necessary for Te Kura to track your work return.

It is important to post work regularly to your Mathematics and Statistics teacher so that it can be recorded as engagement.

If you need help with your spreadsheet contact your Statistics teacher.


Page 20: MS3081


the trend

One of the main uses for time series analysis is to make a forecast. We need to graph the raw data and the moving means and this is easier when using a spreadsheet. Once the raw data and the moving means have been graphed on the same axes, it is a small step to find the equation of the trend line. The trend equation is based on the smoothed data only so it is vital to ensure that only the graph of the moving means is highlighted.

In Excel, once the moving means graph is highlighted, a dropdown menu gives a choice of models to fit as the trend and gives the option to find its equation. This booklet will focus on a linear trend so you will choose the linear model from the dropdown menu. You will find help for using Excel on the MS3000 OTLE site. Contact your Mathematics and Statistics teacher if you need help with this.
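To see what fitting a linear trend to the smoothed data involves, here is a minimal sketch in Python using an ordinary least-squares line. Several things here are assumptions made for illustration: the absence counts are invented (they are not the Absences data set), the x-values are taken to be the day positions 1, 2, 3, … in the series, and the function names are hypothetical. A spreadsheet may number its x-values differently, so the equation it reports need not match.

```python
def moving_means(values, order):
    """Centred moving means for an odd order; unaveraged ends are None."""
    half = order // 2
    means = [None] * len(values)
    for i in range(half, len(values) - half):
        means[i] = sum(values[i - half:i + half + 1]) / order
    return means


def fit_line(xs, ys):
    """Least-squares straight line through the points: returns (gradient, intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    gradient = sxy / sxx
    return gradient, y_bar - gradient * x_bar


# Invented daily absence counts for four five-day weeks (illustration only).
raw = [20, 14, 12, 15, 22, 19, 13, 12, 14, 21,
       20, 13, 11, 14, 20, 18, 12, 11, 13, 20]

smoothed = moving_means(raw, 5)

# Fit the line to the smoothed values only, never to the raw data.
points = [(day, m) for day, m in enumerate(smoothed, start=1) if m is not None]
xs = [day for day, _ in points]
ys = [m for _, m in points]

gradient, intercept = fit_line(xs, ys)
print(f"y = {gradient:.4f}x + {intercept:.3f}")
```

The point of the sketch is the order of the steps: smooth first, then fit the line to the moving means alone.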

Absences data

1. Save the data set Absences data from the MS3000 OTLE site to your computer. Add your name to the beginning of the file name. Open the file by double-clicking on the left of the file name.

This data is a time series of the number of absences by Year 12 students over a four-week period. As the data were collected over a five-day period we calculate moving means of order five. We say that the data has seasonality of order five.

Using your spreadsheet, you are to:
• calculate odd point moving means to smooth the data
• draw the graph of the raw data and the smoothed data on the same graph
• draw the trend line on the graph (this must be based on the smoothed data, not the raw data)
• comment on the features of this time series data
• find the equation of the trend line
• check the MS3000 OTLE website to mark and correct your work.

Name all work clearly.


3A

Name your files by adding your name to the beginning of the file name.

Make sure that you save your work regularly. (Press the save icon or use CTRL + S on Excel.)

Page 21: MS3081


2. Write your comments on the features of this time series here. Make sure that you write your comments in context.

3. For the Absences data, write your equation of the trend line in the space provided.

y =

Check your answers.

Check the website.

4. If your answer is not correct, comment on the changes that you think you need to make to your spreadsheet so that you get the correct answer.

It is best to correct your work before you send it to your teacher.

Name your marked spreadsheet work. Print out your work and send it in to your teacher with this booklet.

Page 22: MS3081


naming files

When you save a file to your computer, it’s best to personalise it. Make sure that the file name begins with your name so that your teacher can read your name without opening the file. Then write the name of the task and whether it is an assignment or assessment. Your Te Kura ID number is useful too. For example: Jo Brown-10339906-TimeSeriesAssignment.doc

Remember that your teacher gets lots of these and if they are all named ‘Retail data’ it would be a real problem to try to identify who each file belongs to.

How to name files (*for some computers):
• left-click to highlight the file name
• left-click again so that the box around the file name appears
• then hover over the name of the file with the mouse to the position you want and click again to place a cursor in the name box, then you can type what you want.

*NB: For Mac users don’t left/right-click.

Page 23: MS3081


4 linear trend lines

learning intention

In this lesson you will learn to:

• write comments about the gradient in context.

introduction

A line of best fit can be drawn through the smoothed data, enabling us to see the trend clearly. This line is used for the first step when calculating a forecast. This lesson will focus on describing linear trends.

linear trend lines

Before we attempt to describe the gradient we note:
• the trend (it can often be described as linear in this course)
• the units on the y-axis – number of absences
• the units on the x-axis – day
• the numerical value of the gradient (rounding is usually sensible)
• whether the gradient is positive or negative
  – if positive, use increasing to describe the trend
  – if negative, use decreasing to describe the trend
• all comments need to be in context. This means that you need to write a sentence using the variables and units given with the data – in this case number of absences per day.

using the absences data

[Figure: Number of absences by day. Vertical axis: Number of absences, 0 to 35; horizontal axis: days Mon to Fri over four weeks. Series: Raw; cmm; Linear (cmm), with trend line y = –0.0618x + 16.424.]

I notice that:
• the trend is linear
• the units on the y-axis – number of absences
• the units on the x-axis – days of a five-day week
• the numerical value of the gradient is –0.0618
• the gradient is negative so use decreasing to describe the trend.

Page 24: MS3081


This means that the quantitative description of the gradient is:

Absences for Year 12 students at this school are decreasing by approximately 0.0618 absences each day of a five-day week.

A secondary school is required by law to be open 190 days. We could multiply 0.0618 by 190 to give a yearly approximation (190 × 0.0618 = 11.74).

Absences are decreasing by approximately 12 absences per year for a school year of 190 days.

Remember the units must be correct and it is wise to round your answer sensibly.
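The scaling from a daily gradient to a yearly figure is a single multiplication, sketched here in Python. The gradient and the 190-day school year come from the text; the variable names are illustrative only.

```python
gradient_per_day = -0.0618  # absences per school day, from the trend line
school_days_per_year = 190  # days a secondary school is required to be open

# Scale the daily change up to a school year.
yearly_change = gradient_per_day * school_days_per_year

print(round(yearly_change, 2))  # change per school year, two decimal places
print(round(yearly_change))     # sensibly rounded for a written comment
```

The negative sign carries the direction, so in a written comment it becomes the word "decreasing" rather than a minus sign.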

Sleep data

1. Save the file Sleep data from the MS3000 OTLE website to your computer. Add your name to the beginning of the file name. This time series data records the number of hours of sleep had by a Year 13 student over a four-week period (moving means of order 7). Note that in this case you will have to leave three empty cells at the top of the moving average column and three empty cells at the bottom.

You are to:
• calculate odd point moving means
• draw the graph of the raw data and the moving means
• draw the trend line on the graph
• comment on the features of this time series data
• find the equation of the trend line
• write a quantitative comment about the gradient.

Make sure that you save your work regularly.

2. Write your comments on the features of this time series here:

Your answers must be written in context and rounded sensibly.

4A


Page 25: MS3081


3. For the Sleep data, write your equation of the trend line in the space provided.

4. Write your description of the gradient here:

5. If your answer is not correct, comment on the changes you need to make:

Name your final marked spreadsheet work. Send it to your teacher.

Check your answers.

Your teacher will be interested in reading your answer.

significant figures and rounding numbers

Computers and calculators will give us many more decimal places for a number than is sensible for the problem. If we are working with data from an experiment that has been recorded to three significant figures, is it sensible to give the answer to more than three significant figures? No, it is not. The third significant figure has itself arisen as a result of rounding and could contain errors. It is perhaps more sensible to round the final answer to two significant figures. Think carefully about the number of decimal places you will use in your spreadsheets and always look at the number of significant figures in the original data as a guide. It is wise to consider how the data were gathered and why a given number of significant figures or decimal places was selected.
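Rounding to a chosen number of significant figures can be sketched in Python as follows. The helper name `round_sig` is hypothetical, and the example values are the gradient and intercept of the absences trend line from earlier in the booklet.

```python
from math import floor, log10


def round_sig(x, sig):
    """Round x to sig significant figures."""
    if x == 0:
        return 0.0
    # Position of the leading digit decides how many decimal places to keep.
    decimals = sig - 1 - floor(log10(abs(x)))
    return round(x, decimals)


print(round_sig(-0.0618, 2))  # gradient to two significant figures
print(round_sig(16.424, 2))   # intercept to two significant figures
print(round_sig(16.424, 3))   # intercept to three significant figures
```

Note that significant figures and decimal places are not the same thing: two significant figures keeps three decimal places for the gradient but none for the intercept.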

y =

Page 26: MS3081


5 first steps of writing a statistical report

learning intentions

In this lesson you will learn to:

• present your report in a structured manner.

introduction

So that the report is cohesive, it is advisable to follow the structure outlined below:
• Introduction: the purpose of the investigation
• Data source: a brief description of the data and where it comes from
• Results from your data including analysis
• Discussion of your findings.

Remember the statistical enquiry cycle (PPDAC). You may wish to check the PPDAC poster on the MS3000 OTLE site.

the report introduction

1. Begin your report with a statement describing the purpose of your investigation.

The following could be used as an introduction for the absences data used in chapter three. Complete the paragraph by filling in the gaps:

‘The purpose of this 1. _______________ is to determine the features of the 2. ___________ series formed by the number of 3. _____________________ by Year 12 students over a four-week period. I will find the 4. ____________ of the trend line after smoothing the raw 5. _________________ .’

5A

Page 27: MS3081


the data

2. Briefly describe the data, where it comes from and state the variables used.

Here’s an example that could be used for the Absences data. Complete the paragraph by filling in the gaps:

The data set I am using is of the 1. _________________________ of absences by Year 12 2. ________________ over a four 3. _____________________ period for the five 4. ________________________ of a school week. The 5. ___________________ are days of the week for the independent variable and number of absences for the 6. _____________________________ variable. I found this data on the Te Kura MS3000 7. ________________________ website.

Check your answers.

Page 28: MS3081


6 review activity

learning intention

In this lesson you will learn to:

• investigate time series data with odd point moving means.

the hot stuff bakery

Save the file Hot Stuff Bakery data from the MS3000 OTLE website to your computer. Add your name and ID number to the beginning of the file name.

task

The Hot Stuff Bakery kept a record of bread sales for a four-week period. The Hot Stuff Bakery is a small bakery situated on the main highway on the outskirts of a small rural town. It specialises in bread, but also sells other bakery products.

Carry out a time series statistical investigation using the Hot Stuff bakery data.

statistical investigation process

Undertake an investigation to determine what pattern there is to food spending and use this to determine the trend.

• Select and use appropriate displays in your analysis and discuss these.
• Identify features in the data and relate these to the context.
• Find an appropriate model.

report structure

• Introduction: the purpose of the investigation.
• Data source: a brief description of the data and where it comes from.
• Results from your data including analysis.
• Discussion of your findings.

the results

Include all of the analysis you have carried out on the data set. Use the statistical graphs and numerical statistics you have obtained from the time-series graph, and make sure that you include clear headings in the spreadsheet and that your work is set out in a logical way.

Refer to lessons 1–4 to review the concepts of:
• smoothing a time-series graph using odd point moving means
• using a spreadsheet to view the overall trend
• adding linear trend lines
• evaluating the value of the correlation coefficient and other features evident in the data set.


Page 29: MS3081


discussion

Make sure that you answer the questions that you pose about the data set and that you make statements, based on your findings, which relate to the situation being investigated.

Again, refer back to lessons 1–4 to review the concepts.

Read your introduction to ensure that you have addressed all the points you raised initially about what you saw in the data set. Justify the statements you make: well-supported answers refer back to the data, the visual and calculated trends, any seasonal patterns noted and other features evident in the visual displays you have produced.

Relate your findings to the context of the data set – your statistical thinking should be evident in your report.

Name and print out all of your files.

Remember to fill in the self-assessment page when you have completed the work in this booklet. You’ll find it inside the back cover sheet. Return this booklet and a printout (hard copy) of your self-marked Excel answers to the exercises along with your answers to this review activity to your Mathematics and Statistics teacher to assess.

Your ability to relate your findings to the context and the depth of thinking evident in your report will determine your overall grade.

Your teacher will assess this work.

Page 30: MS3081


what to do now

Complete the cover sheet and self-assessment form and send with this booklet to your Te Kura Mathematics and Statistics teacher. Remember to include a printout of your spreadsheet with all of your working for this review activity.

presenting material for assessment

Your teacher needs to see all the evidence of your selected method, so you must print out all files and include these with any other related material. Send these all together to your Te Kura Statistics teacher with the cover sheet (MS3081) attached.

You could email your assessment files to your teacher, but you must also post the completed task. Ideally, you should do both. This means you should post a printout of your complete Review activity and email the files to your Statistics teacher.

Post: print out all files related to the assessment. Write your name and MS3081 at the top of each piece of paper. Send to your Statistics teacher with the cover sheet attached (MS3081).

Email: ensure that all files for the assessment are named correctly before you email them to your teacher. Begin the file name with your name. Then write the name of the task and whether it is an assignment or assessment. Your Te Kura ID number is useful too. For example: Jo Brown-10339906-TimeSeriesAssignment.docx
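The naming pattern in the example above can be written as a small helper. This is a sketch only, assuming the parts are joined with hyphens exactly as in the Jo Brown example; the function name `assessment_file_name` and its parameters are hypothetical.

```python
def assessment_file_name(student_name, student_id, task, kind, extension="docx"):
    """Build a file name following the pattern Name-ID-TaskKind.extension."""
    return f"{student_name}-{student_id}-{task}{kind}.{extension}"


# Reproduces the example given in the booklet.
example_name = assessment_file_name("Jo Brown", "10339906", "TimeSeries", "Assignment")
print(example_name)
```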

Remember – clearly name your electronic files.

Page 31: MS3081


7 answer guide

1. introduction to time series

1. voltage 2. seconds 3. effect 4. larger 5. voltage 6. increase 7. decrease 8. 0.5 mV 9. decreasing 10. time 11. cyclical 12. unusual

Your answers could be different from these.

1. Until about 1984 the overall trend appears to be increasing slightly. After 1984 the trend is decreasing and it appears that it could be linear. This means that the consumption of cigarettes was decreasing from 1984 to 1993. There appears to be some kind of pattern with peaks and troughs and, as the time period is approximately three years, this indicates that there could be a cyclical effect, but it does not appear to be strong. There are no obvious spikes.

2. There appears to be a linear trend that is increasing. A clear seasonal pattern is present indicating that there is a strong seasonal effect. The peak of the last cycle could be higher than expected indicating that some kind of noise is present.

3. The large spike or irregularity in the barracouta catches was caused by foreign fishing vessels catching as much as possible before the introduction of the 200 nautical mile Exclusive Economic Zone (EEZ) to New Zealand waters in 1978. The effect is visible in the marked drop in mackerel catches during 1978, too. Neither graph has any obvious cyclical or seasonal component. The trend in the barracouta catches shows a slight overall increase since 1978, and mackerel catches are on the rise since 1980 and at a faster rate than the barracouta catch. More mackerel than barracouta were caught in 1991.

4. Check out the file Belle's Dairy Farm on the MS3000 OTLE website.

1A

1B

Page 32: MS3081


2. first steps in smoothing the raw data

Year   Data   Mean 5
1974   289
1975   305
1976   300    305
1977   326    309
1978   307    311
1979   308    309
1980   313    308
1981   292    311
1982   320    312
1983   322    311
1984   313    306
1985   306    300
1986   271    294
1987   290    290
1988   292    287
1989   292    288
1990   289    288
1991   276
1992   291

d = 311. That is the mean of the five data values centred on 1978: (300 + 326 + 307 + 308 + 313) ÷ 5 = 310.8 (311 to the nearest whole number).
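If you want to check the whole Mean 5 column rather than a single value, the same calculation can be repeated for every centred group of five years. The following Python sketch is one way to do this outside the spreadsheet; the function name `moving_mean` is an illustrative choice, and the data values are the ones in the table above.

```python
def moving_mean(values, points):
    """Return the centred moving means for an odd number of points."""
    if points % 2 == 0:
        raise ValueError("this sketch handles odd point moving means only")
    half = points // 2
    means = []
    # The first and last `half` values have no complete window around them.
    for centre in range(half, len(values) - half):
        window = values[centre - half : centre + half + 1]
        means.append(sum(window) / points)
    return means


years = list(range(1974, 1993))
data = [289, 305, 300, 326, 307, 308, 313, 292, 320, 322,
        313, 306, 271, 290, 292, 292, 289, 276, 291]

means = moving_mean(data, 5)
centred_years = years[2:-2]          # 1976 to 1990
rounded = [round(m) for m in means]  # nearest whole number, as in the table

for year, mean in zip(centred_years, rounded):
    print(year, mean)
```

The first mean is centred on 1976 and the last on 1990, which is why the Mean 5 column has no entries for the first two and last two years.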

3. using the spreadsheet to find the trend

1. Check the Absences data spreadsheet answers on the MS3000 OTLE site.

2. There is a linear trend that is decreasing slightly. This means that over this three-week period, the number of absences is decreasing slightly.

Seasonal variation is present, with a peak in the number of absences for Year 12 students on Fridays and a dip in the number of absences on Wednesdays.

Random variation (residual effect) is present and is shown by the irregular pattern in the peaks and troughs. The data for the third week shows a trough that is deeper than expected and a peak that is higher than expected. This means that on the third Wednesday there were fewer absences than expected, whereas on the third Friday there were more absences than expected.

3. y = –0.0618x + 16.424

4. Your comments.

2A

3A

Page 33: MS3081


4. linear trend lines

1. Check the Absences data spreadsheet answers on the MS3000 OTLE site.

2. There is a linear trend that is decreasing slightly.

Seasonal variation is present, with a decrease in the hours of sleep for Year 13 students on Saturdays followed by a peak in the hours of sleep for Year 13 students on Sundays.

Random variation (residual effect) is present and is shown by the irregular pattern in the peaks and troughs and in particular the second weekend showing a deeper trough and larger peak.

3. y = –0.0285x + 7.2609. If you did not get this answer, check the Excel file on the MS3000 OTLE site.

4. I notice that:
• the trend is linear
• the units on the y-axis – hours of sleep
• the units on the x-axis – days of a five day week
• the numerical value of the gradient is –0.0285
• the gradient is negative so use decreasing to describe the trend.

This means that the hours of sleep for Year 13 students at this school are decreasing by approximately 0.0285 hours each day over this four-week period. This is better given as: the hours of sleep for Year 13 students at this school are ‘decreasing’ by approximately 1 minute and 43 seconds each day of this four-week period.
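The change of units from hours to minutes and seconds is simple arithmetic: 0.0285 hours × 3600 seconds per hour is about 102.6 seconds, which rounds to 103 seconds, or 1 minute 43 seconds. A short Python sketch of the same conversion is given below; the function name `hours_to_min_sec` is hypothetical.

```python
def hours_to_min_sec(hours):
    """Convert a number of hours to whole minutes and seconds (nearest second)."""
    total_seconds = round(hours * 3600)
    minutes, seconds = divmod(total_seconds, 60)
    return minutes, seconds


gradient = -0.0285  # hours of sleep per day, from y = -0.0285x + 7.2609
minutes, seconds = hours_to_min_sec(abs(gradient))
print(f"Decreasing by about {minutes} minute(s) and {seconds} seconds each day")
```

The sign of the gradient tells you the direction (decreasing), so the conversion is applied to its size only.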

5. Your comments.

5. first steps of the report

1. 1. investigation 2. time 3. absences 4. equation 5. data

2. 1. number 2. students 3. week 4. days 5. variables 6. dependent 7. OTLE

4A

5A

Page 34: MS3081


acknowledgments

Every effort has been made to acknowledge and contact copyright holders. Te Aho o Te Kura Pounamu apologises for any omissions and welcomes more accurate information.

Data sets: All from Stage 1 team, Department of Statistics, The University of Auckland, except for the Hot Stuff Bakery data, which is imaginary.

Images:

iStock:

Cover: Generic chart – 18869948

Detective – 18432196

ECG graph (BW closeup) – 171684

Annual report – 6372295

Happy students attending class – 17757341

Student sleeping – 1739813

Bread – 17790922.

Page 35: MS3081

self-assessment ms3081

Fill in the rubric by ticking the boxes you think apply for your work. This is an opportunity for you to reflect on your achievement in this topic and think about what you need to do next. It will also help your teacher. Write a comment if you want to give your teacher more feedback about your work or to ask any questions.

Fill in your name and ID number. Student name: Student ID:

Not yet attempted

Didn’t understand

Understood some

Understood most

Very confident in my understanding

Investigate time series data with odd point moving means.

Please place your comments in the relevant boxes below.

Student comment

Investigate time series data with odd point moving means.

Any further student comments.


Contact your teacher if you want to talk about any of this work. Freephone 0800 65 99 88

teacher use only

Teacher comment

Please find attached letter

Page 36: MS3081

authentication statement I certify that the assessment work is the original work of the student named above.

for school use only

assessment

www.tekura.school.nz

cover sheet – ms3081

students – place student address label below or write in your details.

Full name

ID no.

Address (If changed)

Signed (Student)

Signed (Supervisor)