Forecasting - Rob J Hyndman · 2014. 9. 16.
Rob J Hyndman
Forecasting: Principles and Practice
2. The forecaster’s toolbox
OTexts.com/fpp/2/
Outline
1 Time series graphics
2 Seasonal or cyclic?
3 Autocorrelation
4 Forecast residuals
5 White noise
6 Evaluating forecast accuracy
Time series graphics

[Time plot: Economy class passengers, Melbourne-Sydney, 1988-1993, thousands]

plot(melsyd[,"Economy.Class"])
[Time plot: Antidiabetic drug sales, $ million]

> plot(a10)
[Seasonal plot: antidiabetic drug sales, one line per year (Jan-Dec), 1991-2008, $ million]
Seasonal plots

Data plotted against the individual “seasons” in which the data were observed. (In this case a “season” is a month.)
Something like a time plot except that the data from each season are overlapped.
Enables the underlying seasonal pattern to be seen more clearly, and also allows any substantial departures from the seasonal pattern to be easily identified.
In R: seasonplot
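The book builds seasonal plots with R's seasonplot. As an illustration only, here is a small Python sketch of the reshaping step behind such a plot: cut the series into one row per year so the rows can be overlaid. The data and function name are hypothetical.

```python
def seasonal_rows(values, period=12):
    """Split a series into consecutive rows of length `period` (one per
    "year"), the reshaping behind a seasonal plot: each row is then drawn
    as a separate overlaid line.  A final partial year stays shorter."""
    return [values[i:i + period] for i in range(0, len(values), period)]

# Hypothetical data: two full years of monthly observations.
monthly = list(range(24))
rows = seasonal_rows(monthly)   # rows[0] is year 1 (Jan..Dec), rows[1] is year 2
```

Plotting each row against month 1..12 then reproduces the overlapping-lines view described above.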
Seasonal subseries plots
[Seasonal subseries plot: antidiabetic drug sales, each month's values collected together, $ million]

> monthplot(a10)
Seasonal subseries plots

Data for each season collected together in a time plot as separate time series.
Enables the underlying seasonal pattern to be seen clearly, and changes in seasonality over time to be visualized.
In R: monthplot
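The grouping behind R's monthplot can be sketched in Python (an illustration only; the data values below are made up): collect all observations from the same season into one sub-series, whose mean gives the seasonal level drawn by monthplot.

```python
def seasonal_subseries(values, period=4):
    """Collect the observations belonging to each season together,
    the grouping behind a seasonal subseries plot: result[s] is the
    sub-series of all values observed in season s."""
    groups = [[] for _ in range(period)]
    for i, v in enumerate(values):
        groups[i % period].append(v)
    return groups

# Hypothetical quarterly data over three years (values are made up).
beer_like = [500, 420, 410, 480, 510, 430, 400, 490, 505, 425, 415, 485]
groups = seasonal_subseries(beer_like, period=4)
q1_mean = sum(groups[0]) / len(groups[0])   # mean of the Q1 sub-series: 505.0
```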
Quarterly Australian Beer Production
library(fpp)   # assuming the fpp package that accompanies the book (provides ausbeer)

beer <- window(ausbeer, start=1992)
plot(beer)
seasonplot(beer, year.labels=TRUE)
monthplot(beer)
[Time plot: Australian quarterly beer production, megalitres]
[Seasonal plot: quarterly beer production, one line per year (Q1-Q4), 1992-2008, megalitres]
[Seasonal subseries plot: quarterly beer production, megalitres]
Time series graphics

Time plots (R command: plot or plot.ts)
Seasonal plots (R command: seasonplot)
Seasonal subseries plots (R command: monthplot)
Lag plots (R command: lag.plot)
ACF plots (R command: Acf)
Outline
1 Time series graphics
2 Seasonal or cyclic?
3 Autocorrelation
4 Forecast residuals
5 White noise
6 Evaluating forecast accuracy
Time series patterns

Trend pattern exists when there is a long-term increase or decrease in the data.
Seasonal pattern exists when a series is influenced by seasonal factors (e.g., the quarter of the year, the month, or day of the week).
Cyclic pattern exists when data exhibit rises and falls that are not of fixed period (duration usually of at least 2 years).
Time series patterns

[Time plot: Australian electricity production, 1980-1995, GWh]
[Time plot: Australian clay brick production, million units]
[Time plot: Sales of new one-family houses, USA, 1975-1995, total sales]
[Time plot: US Treasury bill contracts, price, days 0-100]
[Time plot: Annual Canadian lynx trappings, 1820-1920, number trapped]
Seasonal or cyclic?

Differences between seasonal and cyclic patterns:

seasonal pattern constant length; cyclic pattern variable length
average length of cycle longer than length of seasonal pattern
magnitude of cycle more variable than magnitude of seasonal pattern

The timing of peaks and troughs is predictable with seasonal data, but unpredictable in the long term with cyclic data.
Outline
1 Time series graphics
2 Seasonal or cyclic?
3 Autocorrelation
4 Forecast residuals
5 White noise
6 Evaluating forecast accuracy
Autocorrelation

Covariance and correlation: measure extent of linear relationship between two variables (y and X).
Autocovariance and autocorrelation: measure linear relationship between lagged values of a time series y.

We measure the relationship between:
$y_t$ and $y_{t-1}$
$y_t$ and $y_{t-2}$
$y_t$ and $y_{t-3}$
etc.
Example: Beer production

[Lagged scatterplots of quarterly beer production: beer plotted against its own lagged values for lags 1-9]

> lag.plot(beer,lags=9)
Example: Beer production

[Lagged scatterplots of quarterly beer production, lags 1-9, points only]

> lag.plot(beer,lags=9,do.lines=FALSE)
Lagged scatterplots

Each graph shows $y_t$ plotted against $y_{t-k}$ for different values of k.
The autocorrelations are the correlations associated with these scatterplots.
Autocorrelation

We denote the sample autocovariance at lag k by $c_k$ and the sample autocorrelation at lag k by $r_k$. Then define

$$c_k = \frac{1}{T}\sum_{t=k+1}^{T} (y_t - \bar{y})(y_{t-k} - \bar{y}) \qquad \text{and} \qquad r_k = c_k/c_0$$

$r_1$ indicates how successive values of y relate to each other.
$r_2$ indicates how y values two periods apart relate to each other.
$r_k$ is almost the same as the sample correlation between $y_t$ and $y_{t-k}$.
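The book computes these in R. As an illustration only, here is a direct Python transcription of the definitions of $c_k$ and $r_k$ above; note the single divisor T at every lag, which is why $r_k$ is only almost the sample correlation.

```python
def acf(y, max_lag):
    """Sample autocorrelations r_1..r_max_lag, transcribing the slide's
    definition: c_k = (1/T) * sum_{t=k+1..T} (y_t - ybar)(y_{t-k} - ybar)
    and r_k = c_k / c_0."""
    T = len(y)
    ybar = sum(y) / T
    def c(k):
        return sum((y[t] - ybar) * (y[t - k] - ybar) for t in range(k, T)) / T
    c0 = c(0)
    return [c(k) / c0 for k in range(1, max_lag + 1)]

r = acf([1, 2, 3, 4], max_lag=2)   # [0.25, -0.3] up to float rounding
```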
Autocorrelation

Results for first 9 lags for beer data:

r1 r2 r3 r4 r5 r6 r7 r8 r9
−0.126 −0.650 −0.094 0.863 −0.099 −0.642 −0.098 0.834 −0.116
[Correlogram (ACF plot) of the beer data, lags 1-17]
Autocorrelation

$r_4$ is higher than for the other lags. This is due to the seasonal pattern in the data: the peaks tend to be 4 quarters apart and the troughs tend to be 2 quarters apart.
$r_2$ is more negative than for the other lags because troughs tend to be 2 quarters behind peaks.
Together, the autocorrelations at lags 1, 2, . . . , make up the autocorrelation function or ACF.
The plot is known as a correlogram.
ACF

Acf(beer)

[Correlogram of the beer data produced by Acf, lags 1-17]
Recognizing seasonality in a time series

If there is seasonality, the ACF at the seasonal lag (e.g., 12 for monthly data) will be large and positive.

For seasonal monthly data, a large ACF value will be seen at lag 12 and possibly also at lags 24, 36, . . .
For seasonal quarterly data, a large ACF value will be seen at lag 4 and possibly also at lags 8, 12, . . .
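As an illustration of this rule (not the book's own code, which is in R), the Python sketch below applies the sample autocorrelation $r_k$ defined on the earlier slide to a synthetic quarterly series with a strict period-4 pattern; the spike at lag 4 stands out exactly as described. The data are made up.

```python
def acf(y, max_lag):
    """Sample autocorrelations r_1..r_max_lag (same definition as on
    the earlier slide: c_k with divisor T, r_k = c_k / c_0)."""
    T = len(y)
    ybar = sum(y) / T
    c = lambda k: sum((y[t] - ybar) * (y[t - k] - ybar)
                      for t in range(k, T)) / T
    c0 = c(0)
    return [c(k) / c0 for k in range(1, max_lag + 1)]

# Synthetic quarterly series with a strict period-4 pattern, six years long.
quarterly = [10, 6, 4, 8] * 6
r = acf(quarterly, max_lag=8)
# r_4 (and, more weakly, r_8) is large and positive; other lags are not.
```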
Australian monthly electricity production

[Time plot: Australian electricity production, 1980-1995, GWh]
[ACF of Australian monthly electricity production, lags 0-40]
Australian monthly electricity production

Time plot shows clear trend and seasonality. The same features are reflected in the ACF.
The slowly decaying ACF indicates trend.
The ACF peaks at lags 12, 24, 36, . . . , indicate seasonality of length 12.
Which is which? Match each time series with its ACF.

[Four time plots:
1. Daily morning temperature of a cow
2. Accidental deaths in USA (monthly)
3. International airline passengers
4. Annual mink trappings (Canada)]

[Four ACF plots, labelled A, B, C, D, lags 1-20]
Outline
1 Time series graphics
2 Seasonal or cyclic?
3 Autocorrelation
4 Forecast residuals
5 White noise
6 Evaluating forecast accuracy
Forecasting residuals

Residuals in forecasting: difference between an observed value and its forecast based on all previous observations: $e_t = y_t - \hat{y}_{t|t-1}$.

Assumptions
1. $\{e_t\}$ uncorrelated. If they aren’t, then there is information left in the residuals that should be used in computing forecasts.
2. $\{e_t\}$ have mean zero. If they don’t, then forecasts are biased.

Useful properties (for prediction intervals)
3. $\{e_t\}$ have constant variance.
4. $\{e_t\}$ are normally distributed.
Forecasting Dow-Jones index

[Time plot: Dow-Jones index over 300 trading days]
Forecasting Dow-Jones index

Naïve forecast:
$$\hat{y}_{t|t-1} = y_{t-1}$$
$$e_t = y_t - y_{t-1}$$
Note: $e_t$ are one-step-forecast residuals.
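The two equations above amount to differencing the series. As an illustration only, here is a minimal Python sketch; the index values are made up (roughly Dow-Jones scale) and the function name is hypothetical.

```python
def naive_residuals(y):
    """One-step residuals of the naive forecast y_hat(t|t-1) = y_{t-1}:
    e_t = y_t - y_{t-1}, defined from the second observation onward."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]

# Hypothetical index levels (made up, roughly Dow-Jones scale).
dj_like = [3600, 3610, 3595, 3620]
e = naive_residuals(dj_like)   # [10, -15, 25]
```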
[Time plot: Dow-Jones index over 300 trading days, with naïve forecasts]
Forecasting Dow-Jones index

[Time plot: Change in Dow-Jones index over 300 days]
Forecasting Dow-Jones index

[Histogram of residuals (changes in Dow-Jones index). Normal?]
Forecasting Dow-Jones index

[ACF of the residuals, lags 1-22]
Forecasting Dow-Jones index

fc <- rwf(dj)    # naive (random walk) forecasts; dj is in the fpp package
res <- residuals(fc)
plot(res)
hist(res, breaks="FD")
Acf(res, main="")
Outline
1 Time series graphics
2 Seasonal or cyclic?
3 Autocorrelation
4 Forecast residuals
5 White noise
6 Evaluating forecast accuracy
Example: White noise

[Time plot: a white noise series, 50 observations]

White noise data is uncorrelated across time with zero mean and constant variance. (Technically, we require independence as well.)

Think of white noise as completely uninteresting with no predictable patterns.
Example: White noise

Sample autocorrelations for the white noise series:

r1 = 0.013, r2 = −0.163, r3 = 0.163, r4 = −0.259, r5 = −0.198,
r6 = 0.064, r7 = −0.139, r8 = −0.032, r9 = 0.199, r10 = −0.240

For uncorrelated data, we would expect each autocorrelation to be close to zero.

[ACF plot of the white noise series, lags 1-15]
Sampling distribution of autocorrelations

Sampling distribution of $r_k$ for white noise data is asymptotically N(0, 1/T).
95% of all $r_k$ for white noise must lie within $\pm 1.96/\sqrt{T}$.
If this is not the case, the series is probably not WN.
Common to plot lines at $\pm 1.96/\sqrt{T}$ when plotting the ACF. These are the critical values.
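This check can be sketched in Python (an illustration only; function names are hypothetical). Applied to the ten sample autocorrelations from the white-noise example, with T = 50 the bounds are roughly ±0.28 and no lag is flagged.

```python
import math

def wn_bounds(T):
    """95% critical values +/- 1.96/sqrt(T) for the ACF of white noise."""
    b = 1.96 / math.sqrt(T)
    return -b, b

def lags_outside(r, T):
    """1-based lags whose sample autocorrelation falls outside the bounds."""
    lo, hi = wn_bounds(T)
    return [k + 1 for k, rk in enumerate(r) if not lo <= rk <= hi]

# The slide's white-noise example: T = 50, bounds roughly +/-0.28.
r = [0.013, -0.163, 0.163, -0.259, -0.198,
     0.064, -0.139, -0.032, 0.199, -0.240]
flagged = lags_outside(r, T=50)   # [] : nothing exceeds the bounds
```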
Autocorrelation

[ACF plot of the white noise series with critical-value lines at ±0.28]

Example: T = 50 and so critical values at $\pm 1.96/\sqrt{50} = \pm 0.28$. All autocorrelation coefficients lie within these limits, confirming that the data are white noise. (More precisely, the data cannot be distinguished from white noise.)
Example: Pigs slaughtered

[Time plot: Number of pigs slaughtered in Victoria, thousands, 1990-1995]
[ACF of the pigs series, lags 0-40]
Example: Pigs slaughtered

Monthly total number of pigs slaughtered in the state of Victoria, Australia, from January 1990 through August 1995. (Source: Australian Bureau of Statistics.)

Difficult to detect pattern in time plot.
ACF shows some significant autocorrelation at lags 1, 2, and 3.
$r_{12}$ relatively large although not significant. This may indicate some slight seasonality.

These show the series is not a white noise series.
ACF of residuals

We assume that the residuals are white noise (uncorrelated, mean zero, constant variance). If they aren’t, then there is information left in the residuals that should be used in computing forecasts.
So a standard residual diagnostic is to check the ACF of the residuals of a forecasting method. We expect these to look like white noise.

Dow-Jones naïve forecasts revisited:
$$\hat{y}_{t|t-1} = y_{t-1}, \qquad e_t = y_t - y_{t-1}$$
Forecasting Dow-Jones index

[Figure: time plot of the daily change in the Dow-Jones index, days 0–300 (changes roughly between −100 and 50).]
Forecasting Dow-Jones index

[Figure: ACF of the daily changes in the Dow-Jones index, lags 1–23 (ACF values roughly between −0.15 and 0.15).]
Example: Dow-Jones residuals

[Figure: ACF of the Dow-Jones naive-forecast residuals, lags 1–23, with all spikes inside the significance bounds.]

These look like white noise.
But the ACF is a multiple testing problem.
Portmanteau tests
Consider a whole set of r_k values, and develop a test to see whether the set is significantly different from a zero set.

Box-Pierce test

Q = T ∑_{k=1}^{h} r_k²

where h is the maximum lag being considered and T is the number of observations.

My preferences: h = 10 for non-seasonal data, h = 2m for seasonal data.
If each r_k is close to zero, Q will be small.
If some r_k values are large (positive or negative), Q will be large.
Ljung-Box test

Q* = T(T + 2) ∑_{k=1}^{h} (T − k)⁻¹ r_k²

where h is the maximum lag being considered and T is the number of observations.

My preferences: h = 10 for non-seasonal data, h = 2m for seasonal data.
Better performance, especially in small samples.
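Both statistics can be sketched directly from the definitions above (plain Python with NumPy, illustrative only — not the book's R code; `acf` here uses the standard sample autocorrelation with denominator T):

```python
import numpy as np

def acf(x, h):
    """Sample autocorrelations r_1..r_h of a series x."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    d = x - x.mean()
    c0 = np.sum(d * d) / T  # sample variance (lag-0 autocovariance)
    return np.array([np.sum(d[:T - k] * d[k:]) / (T * c0) for k in range(1, h + 1)])

def box_pierce(x, h):
    """Q = T * sum_{k=1}^{h} r_k^2."""
    r = acf(x, h)
    return len(x) * np.sum(r ** 2)

def ljung_box(x, h):
    """Q* = T(T+2) * sum_{k=1}^{h} r_k^2 / (T - k)."""
    T = len(x)
    r = acf(x, h)
    k = np.arange(1, h + 1)
    return T * (T + 2) * np.sum(r ** 2 / (T - k))
```

Since (T + 2)/(T − k) > 1 for every k, Q* is always larger than Q; the two agree closely for long series.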
Portmanteau tests
If the data are WN, Q* has a χ² distribution with (h − K) degrees of freedom, where K = number of parameters in the model.
When applied to raw data, set K = 0.

For the Dow-Jones example:

res <- residuals(naive(dj))
# lag=h and fitdf=K
> Box.test(res, lag=10, fitdf=0)
Box-Pierce test
X-squared = 14.0451, df = 10, p-value = 0.1709
> Box.test(res, lag=10, fitdf=0, type="Lj")
Box-Ljung test
X-squared = 14.4615, df = 10, p-value = 0.153
Exercise

1 Calculate the residuals from a seasonal naive forecast applied to the quarterly Australian beer production data from 1992.
2 Test if the residuals are white noise.

beer <- window(ausbeer, start=1992)
fc <- snaive(beer)
res <- residuals(fc)
Acf(res)
Box.test(res, lag=8, fitdf=0, type="Lj")
Outline
1 Time series graphics
2 Seasonal or cyclic?
3 Autocorrelation
4 Forecast residuals
5 White noise
6 Evaluating forecast accuracy
Forecasting: Principles and Practice Evaluating forecast accuracy 61
Measures of forecast accuracy

Let y_t denote the t-th observation and ŷ_{t|t−1} denote its forecast based on all previous data, where t = 1, …, T. Then the following measures are useful.

MAE = T⁻¹ ∑_{t=1}^{T} |y_t − ŷ_{t|t−1}|

MSE = T⁻¹ ∑_{t=1}^{T} (y_t − ŷ_{t|t−1})²    RMSE = √( T⁻¹ ∑_{t=1}^{T} (y_t − ŷ_{t|t−1})² )

MAPE = 100 T⁻¹ ∑_{t=1}^{T} |y_t − ŷ_{t|t−1}| / |y_t|

MAE, MSE, RMSE are all scale dependent.
MAPE is scale independent but is only sensible if y_t ≫ 0 for all t, and y has a natural zero.
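These four measures follow directly from the definitions (plain Python with NumPy, illustrative only — the book computes them with R's `accuracy()`):

```python
import numpy as np

def accuracy_measures(y, yhat):
    """MAE, MSE, RMSE and MAPE for forecasts yhat of observations y."""
    y = np.asarray(y, dtype=float)
    yhat = np.asarray(yhat, dtype=float)
    e = y - yhat                           # forecast errors
    mae = np.mean(np.abs(e))
    mse = np.mean(e ** 2)
    rmse = np.sqrt(mse)
    mape = 100.0 * np.mean(np.abs(e / y))  # only sensible when y_t >> 0
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}
```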
Measures of forecast accuracy

Mean Absolute Scaled Error

MASE = T⁻¹ ∑_{t=1}^{T} |y_t − ŷ_{t|t−1}| / Q

where Q is a stable measure of the scale of the time series {y_t}.
Proposed by Hyndman and Koehler (IJF, 2006).

For non-seasonal time series,

Q = (T − 1)⁻¹ ∑_{t=2}^{T} |y_t − y_{t−1}|

works well. Then MASE is equivalent to MAE relative to a naive method.

For seasonal time series,

Q = (T − m)⁻¹ ∑_{t=m+1}^{T} |y_t − y_{t−m}|

works well. Then MASE is equivalent to MAE relative to a seasonal naive method.
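A compact sketch of MASE (plain Python with NumPy, illustrative only). Here Q is computed from a separate training series, as is usual in practice; m = 1 gives the non-seasonal scaling, m > 1 the seasonal one:

```python
import numpy as np

def mase(y, yhat, y_train, m=1):
    """MASE: forecast MAE divided by the scale Q, where Q is the mean
    absolute (seasonal) naive error on the training data."""
    y = np.asarray(y, dtype=float)
    yhat = np.asarray(yhat, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    q = np.mean(np.abs(y_train[m:] - y_train[:-m]))  # scale Q
    return np.mean(np.abs(y - yhat)) / q
```

A value below 1 means the forecasts beat the in-sample (seasonal) naive method on average.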
Measures of forecast accuracy

[Figure: forecasts for quarterly beer production, 1995–2005 (values roughly 400–500), comparing the mean method, naive method and seasonal naive method.]
Measures of forecast accuracy

Method                  RMSE      MAE     MAPE    MASE
Mean                 38.0145  33.7776   8.1700  2.2990
Naïve                70.9065  63.9091  15.8765  4.3498
Seasonal naïve       12.9685  11.2727   2.7298  0.7673
Measures of forecast accuracy

[Figure: Dow Jones Index (daily, ending 15 Jul 94), days 0–300 (values roughly 3600–3900), with forecasts from the mean method, naive method and drift model.]
Measures of forecast accuracy

Method           RMSE       MAE     MAPE    MASE
Mean         148.2357  142.4185   3.6630  8.6981
Naïve         62.0285   54.4405   1.3979  3.3249
Drift         53.6977   45.7274   1.1758  2.7928
Training and test sets

Available data is divided into a training set (e.g., 80%) and a test set (e.g., 20%).

The test set must not be used for any aspect of model development or calculation of forecasts.
Forecast accuracy is based only on the test set.
Training and test sets

beer3 <- window(ausbeer, start=1992, end=2005.99)
beer4 <- window(ausbeer, start=2006)

fit1 <- meanf(beer3, h=20)
fit2 <- rwf(beer3, h=20)

accuracy(fit1, beer4)
accuracy(fit2, beer4)

# In-sample accuracy (one-step forecasts)
accuracy(fit1)
accuracy(fit2)
Beware of over-fitting

A model which fits the data well does not necessarily forecast well.
A perfect fit can always be obtained by using a model with enough parameters. (Compare R².)
Over-fitting a model to data is as bad as failing to identify the systematic pattern in the data.
Problems can be overcome by measuring true out-of-sample forecast accuracy. That is, the total data is divided into a "training" set and a "test" set.
The training set is used to estimate parameters; forecasts are made for the test set.
Accuracy measures are computed for errors in the test set only.
Poll: true or false?

1 Good forecast methods should have normally distributed residuals.
2 A model with small residuals will give good forecasts.
3 The best measure of forecast accuracy is MAPE.
4 If your model doesn't forecast well, you should make it more complicated.
5 Always choose the model with the best forecast accuracy as measured on the test set.
Forecasting: Principles and Practice Evaluating forecast accuracy 72