Forecasting Multiple Time Series Using the baselineforecast R Package
Forecasting (Revenue for S&P 500 Companies) Using the baselineforecast Package
by Konstantin Golyaev, Microsoft Azure Machine Learning
Konstantin Golyaev, useR! 2016, Stanford, CA, 6/30/2016
Motivation
• “Prediction is very difficult, especially about the future”
  • © Niels Bohr (allegedly)
• We want to:
  • Forecast multiple time series at different horizons
  • Leverage useful external information, when available
  • Employ state-of-the-art methods
Note: no results are shown because of the five-minute time limit
Two Ways to Forecast
1. Time-series methods (ARIMA, ETS, STL, etc.)
  • Great for modeling trend and seasonality
2. Regression-based methods (elastic net, random forest, boosted regression trees, etc.)
  • Derive power from external information (features)
Can we get the best of both worlds?
Illustration
• Take a small window of the series $y_1, y_2, \dots, y_8, \dots$
• Fit a model to it and forecast a few steps ahead: $f_{7|6}, f_{8|6}, \dots$ (here $f_{t|s}$ is the forecast of $y_t$ made with data through period $s$)
• Move the window forward
• Repeat the process: $f_{8|7}, f_{9|7}, \dots$
• Continue until out of data, then combine the results, pairing each actual value with its forecasts:
  $(y_7, f_{7|6}),\ (y_8, f_{8|6}),\ (y_8, f_{8|7}),\ (y_9, f_{9|7}),\ \dots$
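The rolling-window procedure from the Illustration slide can be sketched in base R. This is a minimal illustration and not the baselineforecast implementation: the function name rolling_forecast, its arguments, the linear-trend stand-in model, and the toy series are all assumptions made for this example.

```r
# Rolling-window forecasting sketch (base R only).
# Hypothetical helper: NOT part of baselineforecast.
rolling_forecast <- function(y, window = 6, horizon = 2) {
  n <- length(y)
  stopifnot(n > window)
  origins <- seq(window, n - 1)            # last period included in each window
  out <- lapply(origins, function(origin) {
    idx <- (origin - window + 1):origin    # periods inside the current window
    train <- data.frame(t = idx, y = y[idx])
    fit <- lm(y ~ t, data = train)         # stand-in model: linear trend
    target <- origin + seq_len(horizon)    # periods being forecast
    fc <- predict(fit, newdata = data.frame(t = target))
    data.frame(origin = origin,
               target = target,
               actual = y[target],         # NA beyond the end of the series
               forecast = as.numeric(fc))
  })
  do.call(rbind, out)                      # combine results from all windows
}

# Toy series, for illustration only
y <- c(10, 12, 14, 16, 18, 20, 22, 24, 26)
res <- rolling_forecast(y, window = 6, horizon = 2)
print(res)
```

Each row of res pairs one forecast with the actual value it targets, so the same period can appear several times with forecasts made from different origins.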
What Else Can We Do?
Date-Based Features
Examples:
• Year
• Quarter
• Month
• Week
• Holidays
• Etc…
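Date-based features like the ones listed above can be derived from a date column with base R alone. The sketch below is an assumption about how one might do it: the helper name date_features, the column names, and the holiday list are hypothetical and do not come from the package.

```r
# Hypothetical helper: derive calendar features from a vector of dates.
date_features <- function(dates, holidays = as.Date(character())) {
  dates <- as.Date(dates)
  month <- as.integer(format(dates, "%m"))
  data.frame(
    date       = dates,
    year       = as.integer(format(dates, "%Y")),
    quarter    = (month - 1L) %/% 3L + 1L,
    month      = month,
    week       = as.integer(format(dates, "%U")),  # week of year, weeks start on Sunday
    is_holiday = dates %in% holidays               # TRUE when the date is in the supplied list
  )
}

feats <- date_features(as.Date(c("2016-06-30", "2016-07-04")),
                       holidays = as.Date("2016-07-04"))
print(feats)
```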
Lags or Other Functions of $y_t$
• Standard lag functions in R shift values by position, so they return wrong lags when the series has gaps in its index (e.g. missing months or days)
• So we implemented our own lag computation that handles gaps
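To see why gaps matter, compare a positional lag with one that looks up the previous period by its index. This is only a sketch of the idea, not the package's code: the helper lag_by_index, the month-index encoding, and the toy values are assumptions.

```r
# Hypothetical helper: lag by period index instead of by row position.
# index: integer period index (e.g. year * 12 + month); gaps are allowed.
lag_by_index <- function(index, y, k = 1L) {
  y[match(index - k, index)]   # NA when the period k steps back is missing
}

# Toy monthly series with March 2016 missing (illustrative values)
dates <- as.Date(c("2016-01-01", "2016-02-01", "2016-04-01", "2016-05-01"))
y <- c(100, 110, 130, 140)
month_index <- as.integer(format(dates, "%Y")) * 12L +
  as.integer(format(dates, "%m"))

naive     <- c(NA, head(y, -1))            # positional: April gets February's value
gap_aware <- lag_by_index(month_index, y)  # April gets NA because March is absent

print(data.frame(date = dates, y = y, naive = naive, gap_aware = gap_aware))
```

The positional lag silently treats February as the month before April, while the index-based lag marks that value as missing.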
External Series as Features
• This is very much problem-specific
• What we used in various projects:
  • Macroeconomic data from Federal Reserve Economic Data (FRED)
  • Web search trends from Bing, Google, etc.
  • Tweets scored for sentiment
  • External business drivers such as promotions
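One simple way to attach an external series to the target is a left join on the date. The sketch below uses base R's merge(). The data frames, column names, and values are toy assumptions for illustration, not data from any of the projects mentioned.

```r
# Toy target series and toy external driver (illustrative values only)
target <- data.frame(
  date = as.Date(c("2016-01-01", "2016-02-01", "2016-03-01")),
  y    = c(100, 110, 120)
)
external <- data.frame(
  date  = as.Date(c("2016-01-01", "2016-03-01")),
  promo = c(1, 0)                # hypothetical promotion indicator
)

# Left join: keep every target row; dates with no external value get NA
combined <- merge(target, external, by = "date", all.x = TRUE)
print(combined)
```

The NA for February shows where the external series has no observation, and that gap has to be handled before model fitting.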
Implementation
• All code is combined into the baselineforecast R package
• Function ConstructDataset() takes the series $y_t$ and external data $X_t$ and returns a data frame with the target and features
• Function FitModel() interfaces with the caret package to train any regression learning algorithm and perform time series cross-validation
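The signatures of ConstructDataset() and FitModel() are not given in these slides, so the sketch below does not call them. It only shows, under assumptions, what time series cross-validation through caret can look like: trainControl(method = "timeslice") with a fixed rolling window, a plain linear model, and a simulated series. The variable names and window sizes are illustrative choices, and the caret part runs only if the package is installed.

```r
set.seed(1)
n <- 24
y <- 100 + 2 * seq_len(n) + rnorm(n)   # simulated series, for illustration only

# Data frame with target and features (trend and first lag)
df <- data.frame(
  y    = y,
  t    = seq_len(n),
  lag1 = c(NA, head(y, -1))            # no gaps here, so a positional lag is safe
)
df <- df[complete.cases(df), ]         # drop the first row, whose lag is undefined

fit <- NULL
if (requireNamespace("caret", quietly = TRUE)) {
  # Rolling-origin resampling: train on 12 periods, evaluate on the next 2
  ctrl <- caret::trainControl(method = "timeslice",
                              initialWindow = 12,
                              horizon = 2,
                              fixedWindow = TRUE)
  fit <- caret::train(y ~ t + lag1, data = df,
                      method = "lm", trControl = ctrl)
  print(fit$results)                   # cross-validated error metrics
}
```

Swapping method = "lm" for another caret method name changes the learner while the resampling scheme stays the same.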
Future Work
• Exploratory Data Analysis
• Computing Prediction Intervals
• Deciding on the license/distribution model
Have questions?
Ping me at [email protected]