# Weighing Individual Observations for Time Series Forecasting


Victor Hoornweg & Philip Hans Franses

Erasmus University Rotterdam, Tinbergen Institute, Econometric Institute

Rotterdam, July 1, 2014

Introduction

Issue:

• How to deal with structural breaks or outliers?

Weigh individual observations

• Model: $y_t = \mu + \eta_t$

• DGP: $y_t = 3 - 2 \cdot \mathbf{1}_{t>80} + 2 \cdot \mathbf{1}_{t>120} + \varepsilon_t$,

– where $t = 1, 2, \dots, 170$ and $\varepsilon_t \sim N(0, 1)$

Figure 1. Simulated series
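A series like the one in Figure 1 can be generated along the following lines. This is a minimal sketch of the DGP stated on the slide; the seed, the function names `simulate_series` and `regime_means`, and the use of Python's `random` module are assumptions made for illustration and do not come from the presentation.

```python
import random


def simulate_series(n=170, seed=0):
    """Simulate y_t = 3 - 2*1[t>80] + 2*1[t>120] + eps_t with eps_t ~ N(0, 1)."""
    rng = random.Random(seed)  # seed is an arbitrary choice for reproducibility
    series = []
    for t in range(1, n + 1):
        level = 3.0 - 2.0 * (t > 80) + 2.0 * (t > 120)
        series.append(level + rng.gauss(0.0, 1.0))
    return series


def regime_means(series):
    """Sample mean within each regime: t = 1..80, 81..120, 121..170."""
    segments = [series[:80], series[80:120], series[120:]]
    return [sum(s) / len(s) for s in segments]


y = simulate_series()
print(len(y), [round(m, 2) for m in regime_means(y)])
```

The three regime means should lie near 3, 1 and 3, which is the level shift pattern the weighting scheme is meant to pick up.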


Figure 2. Individual weights assigned to observations across time

Proposed solution:

• Assign robust weights to observations based on pseudo out-of-sample forecasts (posf):

– $y_{w,t} = w_t y_t$

– $X_{w,t} = w_t X_t$

– $\sum_{t=1}^{T} w_t = 1$

• Use discrete, exponential, and/or equal weights ($w_t = 1/T \;\forall t$)

• Exponential posf
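To make the effect of the weights concrete, the sketch below compares equal weights with one set of discrete weights for the constant-only model $y_t = \mu + \eta_t$. It assumes that the weighted estimate of $\mu$ is the weighted average $\sum_t w_t y_t$, which is what weighted least squares gives for a constant-only model when the weights sum to one. The helper names and the noise-free series are illustrative assumptions, not part of the slides.

```python
def equal_weights(T):
    """w_t = 1/T for all t."""
    return [1.0 / T] * T


def discrete_weights(T, start):
    """Zero weight before period `start` (1-based), equal weight from `start` on."""
    n_used = T - start + 1
    return [0.0 if t < start else 1.0 / n_used for t in range(1, T + 1)]


def weighted_mean_forecast(y, w):
    """Forecast of a constant-only model: the weighted average of y."""
    assert abs(sum(w) - 1.0) < 1e-9, "weights must sum to one"
    return sum(wi * yi for wi, yi in zip(w, y))


# Noise-free version of the slide's DGP (assumption: noise dropped for clarity).
T = 170
y = [3.0 - 2.0 * (t > 80) + 2.0 * (t > 120) for t in range(1, T + 1)]

print(weighted_mean_forecast(y, equal_weights(T)))
print(weighted_mean_forecast(y, discrete_weights(T, 121)))
```

With equal weights the forecast is pulled towards the low middle regime (about 2.53), while weights that discard everything before $t = 121$ return the current level of 3.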

Relevance:

• Interpretation: which period in the past is akin to the present period

• Forecasting accuracy: focus on relevant data

• Robust: shrink towards equal weights with penalty for unequal weights

• Easy to apply to many types of datasets (high/low-frequency, many/few variables) and models

Introduction

Overview:

• Literature

• Innovations

• Simulations

– Forecasting accuracy

– Influence of statistical decisions on forecasts

• Practical application

• Discussion

Literature on weighing observations

Select optimal starting point (Pesaran & Timmermann 2007):

• Compute posf for different starting points

– Select best starting point

– Take a weighted combination of starting points
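The idea of choosing a starting point by pseudo out-of-sample forecasts can be sketched as follows for a constant-only model. This is an illustration under stated assumptions, not the procedure of Pesaran & Timmermann: the function names, the one-step-ahead mean forecast, the number of posf (`n_posf`) and the minimum window length (`min_obs`) are all hypothetical choices.

```python
def posf_msfe(y, start, n_posf):
    """MSFE of one-step-ahead mean forecasts for the last n_posf observations.

    Each forecast of y[p] uses only y[start:p] (0-based indices).
    """
    T = len(y)
    errors = []
    for p in range(T - n_posf, T):
        window = y[start:p]
        forecast = sum(window) / len(window)
        errors.append(y[p] - forecast)
    return sum(e * e for e in errors) / n_posf


def best_start(y, n_posf, min_obs):
    """Starting index with the lowest posf MSFE (earliest one in case of ties)."""
    last_start = len(y) - n_posf - min_obs
    return min(range(0, last_start + 1), key=lambda s: posf_msfe(y, s, n_posf))


# Noise-free series with a level shift after observation 80, seen at T = 120.
y = [3.0] * 80 + [1.0] * 40
print(best_start(y, n_posf=20, min_obs=10))
```

In this noise-free case the selected index is the first observation after the break. With noisy data the posf MSFE trades off bias from pre-break observations against the variance of a short window.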

Exponential smoothing (Holt 1957, Brown 1959):

– Basic model: $\hat{y}_{T+1} = \sum_{i=1}^{T} w_i(\gamma)\, y_i$
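A minimal sketch of such a forecast is given below. The slide does not spell out $w_i(\gamma)$; the code assumes the textbook geometric form, with weights proportional to $(1-\gamma)^{T-i}$ and rescaled to sum to one. The function names are illustrative.

```python
def exponential_weights(T, gamma):
    """Weights proportional to (1 - gamma)^(T - i), i = 1..T, normalized to sum to 1."""
    raw = [(1.0 - gamma) ** (T - i) for i in range(1, T + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def smoothing_forecast(y, gamma):
    """Forecast of y_{T+1} as the exponentially weighted average of y_1..y_T."""
    w = exponential_weights(len(y), gamma)
    return sum(wi * yi for wi, yi in zip(w, y))


print(smoothing_forecast([1.0, 1.0, 1.0, 5.0], gamma=0.5))
```

A larger $\gamma$ concentrates the weight on the most recent observations, so the forecast reacts faster to a break but is noisier.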

Discrete and exponential weights (Pesaran, Pick & Pranovich -PPP- 2013):

• $\hat{\beta}_T(\mathbf{w}) = \left( \sum_{t=1}^{T} w_t \mathbf{x}_t \mathbf{x}_t' \right)^{-1} \sum_{t=1}^{T} w_t \mathbf{x}_t y_t$, with $\sum_{t=1}^{T} w_t = 1$ and $h = 1$

• Choose weights so that pMSFE of $\hat{y}_{T+1} = \hat{\beta}_T' \mathbf{x}_{T+1}$ is minimized

– Discrete breaks: analytic expression of optimal weights for multiple IVs

• Determine breakpoints by considering all possible combinations between two breakpoints with certain limits for $b_1$ and $b_2$

– Continuous breaks: exponential smoothing
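The weighted estimator can be illustrated for the special case of an intercept and one regressor, where the weighted normal equations are a 2×2 system. The sketch below is not the PPP procedure for choosing weights; it only shows how a given weight vector enters the estimate and the forecast. The function names and the toy data are assumptions.

```python
def weighted_ols(z, y, w):
    """Weighted least squares for y_t = a + b*z_t with observation weights w_t.

    Solves (sum w x x')^{-1} (sum w x y) for x_t = (1, z_t)' in closed form.
    """
    s_w = sum(w)
    s_wz = sum(wi * zi for wi, zi in zip(w, z))
    s_wzz = sum(wi * zi * zi for wi, zi in zip(w, z))
    s_wy = sum(wi * yi for wi, yi in zip(w, y))
    s_wzy = sum(wi * zi * yi for wi, zi, yi in zip(w, z, y))
    det = s_w * s_wzz - s_wz ** 2
    intercept = (s_wzz * s_wy - s_wz * s_wzy) / det
    slope = (s_w * s_wzy - s_wz * s_wy) / det
    return intercept, slope


def forecast(z, y, w, z_next):
    """One-step-ahead forecast for a given next-period regressor value."""
    a, b = weighted_ols(z, y, w)
    return a + b * z_next


# Toy data: the slope changes from 1 to 2 after observation 6.
T = 12
z = [float(t % 4) for t in range(1, T + 1)]
y = [1.0 + (1.0 if t <= 6 else 2.0) * zt for t, zt in zip(range(1, T + 1), z)]

w_equal = [1.0 / T] * T
w_post = [0.0] * 6 + [1.0 / 6] * 6  # discrete weights: post-break data only

print(weighted_ols(z, y, w_equal))
print(weighted_ols(z, y, w_post))
```

With the post-break weights the estimate recovers the current intercept and slope, whereas equal weights return a blend of the two regimes.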

Innovations

Example

• DGP: $y_t = 3 - 2 \cdot \mathbf{1}_{t>80} + 2 \cdot \mathbf{1}_{t>120} + \varepsilon_t$, where $t = 1, 2, \dots, 170$ and $\varepsilon_t \sim N(0, 1)$

• Model: $y_t = \mu + \eta_t$

– Computation time: 2.35 sec

Figure 3. Individual weights assigned to observations across time

Innovations

• Exponentially weighted posf

• Steps:

1. Determine breakpoints

2. Assign discrete weights to observations

3. Shrink discrete weights towards equal or exponential weights

• Use penalty for deviating from equally weighted observations

Figure 4. Individual weights assigned to observations at T=120

1. Determine breakpoints

Known methods to identify breakpoints or outliers

• CUSUM(SQ)

• Chow break test

• Quandt-Andrews Sup F test

• Studentized residuals / dfbetas / dffits

– $y = X\beta + D_j\gamma + \varepsilon$, where $D_j$ is an $(n \times 1)$ indicator vector with $D_{jj} = 1$ and zeros elsewhere
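The dummy-variable regression can be worked out by hand for the constant-only model used in the running example, where $X$ is a column of ones. The dummy absorbs observation $j$, so $\hat{\mu}$ is the mean of the remaining observations and $\hat{\gamma}$ is the gap between $y_j$ and that mean; the t-statistic of $\hat{\gamma}$ is the externally studentized residual. The sketch below assumes this special case; the function name is illustrative.

```python
import math


def dummy_t_statistic(y, j):
    """t-statistic of gamma in y = mu + D_j * gamma + eps (j is a 0-based index)."""
    others = [v for i, v in enumerate(y) if i != j]
    n_others = len(others)
    mu_hat = sum(others) / n_others           # D_j absorbs observation j
    gamma_hat = y[j] - mu_hat
    rss = sum((v - mu_hat) ** 2 for v in others)
    sigma2 = rss / (n_others - 1)             # n - 2 degrees of freedom
    se_gamma = math.sqrt(sigma2 * (1.0 + 1.0 / n_others))
    return gamma_hat / se_gamma


print(dummy_t_statistic([1.0, 2.0, 3.0, 10.0], j=3))
```

A large absolute value flags observation $j$ as an outlier relative to the rest of the sample. The test looks at one observation at a time, which is part of the motivation for a method that handles multiple breakpoints.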

Motivation for new method:

• Determine multiple breakpoints

• Applicable to various statistical models

Determine breakpoints

Figure 5. Finding breakpoints at T=120

Model: $y_t = \mu + \eta_t$

DGP: $y_t = 3 - 2 \cdot \mathbf{1}_{t>80} + 2 \cdot \mathbf{1}_{t>120} + \varepsilon_t$, where $t = 1, 2, \dots, 170$ and $\varepsilon_t \sim N(0, 1)$

1. Determine breakpoints

$$S(t) = \frac{1}{\mathit{WIN}} \sum_{p=T-\mathit{WIN}+1}^{T} \left( \hat{y}_p^{\neg t} - \hat{y}_p^{\neg (t-1)} \right)$$

• Largest values of 𝑆 are breakpoints

– Contiguous high values of 𝑆 form a ‘breakperiod’

– Quick way to find many candidate breakpoints in real-time

• Combination of test for outlier identification (‘leave-one-out’) and analyzing influence of configurations on posf (Hoornweg & Franses, 2013)

1. Determine breakpoints

Alternative

• Equally distribute breaks over the treatment sample.

• Adjust each breakpoint and select the adjustment that leads to the biggest increase in forecasting accuracy of the posf. Continue until no improvement is made (an adaptation of the Patient Rule Induction Method, PRIM, algorithm).

• Computation time: 7.61 seconds instead of 2.35 seconds.

Figure 6. Adjusting equally distributed breakpoints at T=120

2. Discrete weights

1. Determine pMSFE of each period.

– Periods with too few observations receive an average weight.

2. Consider all possible combinations of leaving out periods.

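– Periods left in receive equal weights or inverse pMSFE weights:

$$w_t^{i} = \frac{\left( \frac{1}{v} \sum_{\tau=T-v+1}^{T} e_{\tau,i}^{2} \right)^{-1}}{\sum_{j=1}^{N} \left( \frac{1}{v} \sum_{\tau=T-v+1}^{T} e_{\tau,j}^{2} \right)^{-1}}$$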

3. Select discrete weights with highest accuracy of posf
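The inverse pMSFE weighting in step 2 can be sketched as follows. The code assumes that $i$ indexes the periods that are left in and that `errors[i]` holds the last $v$ pseudo out-of-sample forecast errors associated with period $i$. How a period's weight is spread over its individual observations is not specified on the slide, so the sketch stops at the period level. The function name and the toy errors are illustrative.

```python
def inverse_msfe_weights(errors):
    """Period weights proportional to the inverse of each period's pMSFE.

    errors[i] is a list with the last v forecast errors of period i.
    """
    inverse = []
    for period_errors in errors:
        msfe = sum(e * e for e in period_errors) / len(period_errors)
        inverse.append(1.0 / msfe)
    total = sum(inverse)
    return [value / total for value in inverse]


# Two periods: forecast errors of size 1 versus size 2 (made-up numbers).
print(inverse_msfe_weights([[1.0, -1.0], [2.0, -2.0]]))
```

A period whose data produce forecast errors twice as large receives a quarter of the relative weight, because the weights scale with the inverse of the squared errors.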

Figure 6. Assigning weights to periods at T=120

3. Shrink

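• $w_t^{EXP} = \frac{-\log(1 - t/T)}{T - 1}$ for $t = 1, 2, \dots, T-1$, and $w_T^{EXP} = \frac{\log(T)}{T - 1}$ (PPP, p. 144)

• $w_t^{EQUAL} = \frac{1}{T}$

• $w_t^{shrink}(\varphi) = (1 - \varphi)\, w_t^{discrete} + \varphi\, w_t^{\{EXP,\, EQUAL\}}$, $\varphi \in \{0, 0.1, 0.2, \dots, 1\}$

• $RMSFE(\lambda) = MSFE_W + \lambda \cdot \frac{\sum_{i=1}^{T} \left| w_i - \frac{1}{T} \right|}{\sum_{j=1}^{T} \left| w_j^{min} - \frac{1}{T} \right|} \cdot MSFE_{EQUAL}$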
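The shrinkage step and the penalized criterion can be sketched as below. Two points are assumptions rather than facts from the slide: the deviation from equal weights is measured with absolute differences, and `w_min` is taken as given, since the slide does not define it further. The function names are illustrative.

```python
import math


def exponential_weights_ppp(T):
    """w_t = -log(1 - t/T)/(T - 1) for t < T and w_T = log(T)/(T - 1)."""
    w = [-math.log(1.0 - t / T) / (T - 1) for t in range(1, T)]
    w.append(math.log(T) / (T - 1))
    return w


def shrink(w_discrete, w_target, phi):
    """Convex combination: (1 - phi) * discrete weights + phi * target weights."""
    return [(1.0 - phi) * d + phi * g for d, g in zip(w_discrete, w_target)]


def penalized_msfe(msfe_w, msfe_equal, w, w_min, lam):
    """MSFE_W plus a penalty that grows with the distance of w from equal weights."""
    T = len(w)
    deviation = sum(abs(wi - 1.0 / T) for wi in w)
    deviation_min = sum(abs(wj - 1.0 / T) for wj in w_min)
    return msfe_w + lam * deviation / deviation_min * msfe_equal


w_discrete = [0.0, 0.0, 0.5, 0.5]
w_equal = [0.25] * 4
print(shrink(w_discrete, w_equal, phi=0.5))
print(penalized_msfe(1.0, 2.0, w_discrete, w_discrete, lam=0.5))
```

With $\varphi = 0$ the discrete weights are kept and with $\varphi = 1$ they are replaced by the target weights. The penalty is zero for equal weights, so unequal weights are only selected when their gain in MSFE outweighs it.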

Figure 7. Shrinking to exponential weights at T=120

Figure 8. Shrinking to equal weights at T=120

HF-weight

• Exponentially weighted posf

• Steps:

1. Determine breakpoints ($S(t)$)

2. Assign discrete weights to observations

3. Shrink discrete weights towards equal or exponential weights

Figure 9. Individual weights assigned to observations at T=120

• Ad hoc decisions:

– posf:

• number: 20

• exponentially weighted

– Maximum number of periods: 4

– minOBS = 20

• minimum number of observations for a period to get an individual weight

• minimum number of observations in the treatment sample

– $\lambda = 0.5$: penalty for deviating from equally weighted observations

Simulation study

• $y_t = X_t \beta_t + \varepsilon_t$, $\varepsilon_t \sim N(0, 1)$, $v_t \sim N(0, 1)$, number of simulations = 1000

• Score: % better (-) or worse (+) MSFE in comparison to $MSFE(\hat{y}_{EW})$

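| DGP | Mean1 | Mean2 | Random walk | Regressor |
|---|---|---|---|---|
| $X_t$ | 1 | 1 | $X_{t-1} + 0.5 \cdot v_t$ | $\sim N(0,1)$ |
| $\beta_{1 \le t \le 70}$ | 3 | 3 | 1 | 3 |
| $\beta_{71 \le t \le 120}$ | 4 | 5 | 1 | 4 |
| $\beta_{121 \le t \le 170}$ | 3 | 3 | 1 | 3 |
| **Model** | | | | |
| HF-Weight | -9 | -36 | -224 | -35 |
| Exponential | -11 | -43 | -252 | -40 |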

Discrete -14 -53 -27
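The score reported in the table can be computed as a percentage change in MSFE relative to the equally weighted forecast, so that negative values mean an improvement. The sketch below follows that sign convention; the function names and the toy numbers are made up for illustration and are not simulation results from the presentation.

```python
def msfe(forecasts, actuals):
    """Mean squared forecast error."""
    errors = [a - f for f, a in zip(forecasts, actuals)]
    return sum(e * e for e in errors) / len(errors)


def score(msfe_model, msfe_equal):
    """Percentage change in MSFE versus equal weights: negative is better."""
    return 100.0 * (msfe_model - msfe_equal) / msfe_equal


# Made-up forecasts for four periods (illustration only).
actuals = [3.0, 3.2, 2.8, 3.1]
equal_weight_forecasts = [2.5, 2.5, 2.5, 2.5]
model_forecasts = [3.0, 3.0, 3.0, 3.0]

print(score(msfe(model_forecasts, actuals), msfe(equal_weight_forecasts, actuals)))
```

A score of -9, for example, corresponds to an MSFE that is 9% lower than that of the equally weighted forecast.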
