Model Building Steps: Forecasting the Jobs number
Model Building Steps
Forecasting the jobs number
John H. Muller
October 1, 2012
John H. Muller () Model Building Steps October 1, 2012 1 / 40
Outline
1 Goals
2 The Task
    Forecasting the jobs number
    Predictor Variables
3 Modeling Process
    Preliminaries
    Fitting and Tuning the Model
    Model Selection
    Prediction
4 Resources
Goals for the presentation
Illustrate issues and choices in a typical model building process.
To do that we take the following as our task: build a model to forecast a macroeconomic time series.
Time is limited, so we don’t have time to discuss:
econometrics or macroeconomics
time series methods
details or merits of particular modeling or model fitting methods
Total nonfarm (FRED symbol = PAYEMS)
Figure: PAYEMS level, 1960 to 2010 (y-axis ticks: 60000 to 140000)
Every month BLS publishes the Employment Situation report.
Two most important numbers: Unemployment rate and Total nonfarm
Total nonfarm: count of jobs from a survey of businesses (units: thousands of jobs)
Task: Forecast month-over-month change in Total nonfarm
Total nonfarm (FRED symbol = PAYEMS)
Figure: PAYEMS since 2000 (x-axis: 2000 to 2012; y-axis ticks: 130000 to 138000)
Month over Month Change in PAYEMS
mean=23, sd = 289
Figure: Month-over-month change in PAYEMS (x-axis: 2000 to 2012; y-axis ticks: -1000 to 1000)
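The target series is the first difference of the PAYEMS level. A minimal sketch of that calculation follows. The function name and the sample levels are hypothetical, made up for illustration; they are not data from the talk.

```python
from statistics import mean, stdev


def month_over_month_change(levels):
    """Return the first difference: levels[t] - levels[t-1] for t >= 1."""
    return [curr - prev for prev, curr in zip(levels, levels[1:])]


# Hypothetical PAYEMS-like levels in thousands of jobs (made-up numbers).
levels = [132000, 132150, 132100, 132400, 132380]
changes = month_over_month_change(levels)

print(changes)  # one fewer value than the input series
print("mean =", mean(changes), "sd =", stdev(changes))
```

The summary line mirrors the "mean, sd" annotation on the slide; `stdev` here is the sample standard deviation, which is an assumption about how the slide's figure was computed.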
ID Description
ALTSALES Light Weight Vehicle Sales: Autos & Light Trucks
BUSLOANS Commercial and Industrial Loans at All Commercial Banks
CE16OV Civilian Employment
CIVPART Civilian Labor Force Participation Rate
CLF16OV Civilian Labor Force
CONSUMER Consumer Loans at All Commercial Banks
CPATAX Corporate Profits After Tax
DPI Disposable Personal Income
PAYEMS All Employees: Total nonfarm
PCE Personal Consumption Expenditures
PSAVERT Personal Saving Rate
SRVPRD All Employees: Service-Providing Industries
TCU Capacity Utilization: Total Industry
UEMP27OV Civilians Unemployed for 27 Weeks and Over
UEMPLT5 Civilians Unemployed - Less Than 5 Weeks
UEMPMEAN Average (Mean) Duration of Unemployment
UEMPMED Median Duration of Unemployment
UNEMPLOY Unemployed
UNRATE Civilian Unemployment Rate
USGOOD All Employees: Goods-Producing Industries
Table: Variables and descriptions
Figure: Original Series, Jobs panels (CE16OV, CIVPART, CLF16OV, PAYEMS, SRVPRD, UEMP27OV, UEMPLT5, UEMPMEAN, UEMPMED, UNEMPLOY, UNRATE, USGOOD), 2000 to 2012
Figure: Original Series, Consumer panels (ALTSALES, CONSUMER, DPI, PCE, PSAVERT), 2000 to 2012
Figure: Original Series, Business panels (BUSLOANS, CPATAX, TCU), 2000 to 2012
Figure: Differenced Series, Jobs panels (CE16OV, CIVPART, CLF16OV, PAYEMS, SRVPRD, UEMP27OV, UEMPLT5, UEMPMEAN, UEMPMED, UNEMPLOY, UNRATE, USGOOD), 2000 to 2012
Figure: Differenced Series, Consumer panels (ALTSALES, CONSUMER, DPI, PCE, PSAVERT), 2000 to 2012
Figure: Differenced Series, Business panels (BUSLOANS, CPATAX, TCU), 2000 to 2012
Preliminaries
Choose target and predictor variables.
    Considerations might include: history, cost, frequency, accuracy
Choose model form and method: Lasso and random forest.
    Alternatives: neural networks, OLS, robust regression, ...
    Criteria for choosing:
    ◮ prediction accuracy
    ◮ interpretability
    ◮ suitability to the task and data
    ◮ available software, model maintenance, implementation complexity
Derive variables from inputs: smoothed, standardized.
    Alternatives: powers of original variables, cross terms, ratios
Plan for estimating out-of-sample error: cross validation and test/train split.
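The plan for estimating out-of-sample error can be sketched as index bookkeeping: hold out a block of observations as the test set, then cut the remaining training rows into folds for cross validation. This is only an illustration under assumptions. The function names, the choice to hold out the most recent observations, and the use of contiguous folds are all hypothetical; the slides do not say how the split or the folds were built.

```python
def train_test_split_ordered(n_obs, n_test):
    """Hold out the last n_test observations as the test set."""
    if not 0 < n_test < n_obs:
        raise ValueError("n_test must be between 1 and n_obs - 1")
    cut = n_obs - n_test
    return list(range(cut)), list(range(cut, n_obs))


def k_fold_indices(train_idx, k):
    """Split train_idx into k contiguous folds; yield (fit, validate) index lists."""
    if not 1 < k <= len(train_idx):
        raise ValueError("k must be between 2 and the number of training rows")
    base, extra = divmod(len(train_idx), k)
    start = 0
    for fold in range(k):
        # The first `extra` folds get one additional row each.
        size = base + (1 if fold < extra else 0)
        validate = train_idx[start:start + size]
        fit = train_idx[:start] + train_idx[start + size:]
        yield fit, validate
        start += size


train_idx, test_idx = train_test_split_ordered(n_obs=10, n_test=3)
for fit, validate in k_fold_indices(train_idx, k=3):
    print("fit on", fit, "validate on", validate)
print("final check on", test_idx)
```

The test rows are never touched while fitting or tuning; they are used once, at the end, to compare the finished models.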
Preliminaries
Data issues
Missing data: remove
    alternatives: impute, ignore (for some model forms)
Outliers: trim to within 3 sd of rolling mean
    alternatives: ignore, remove
Correlated predictor variables: ignore
    alternatives: cluster variables and choose one from each cluster
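The outlier rule above ("trim to within 3 sd of rolling mean") could look roughly like the sketch below. Several details are assumptions, since the slides do not give them: the window length of 12, the use of a trailing window that excludes the current point, and clipping (rather than dropping) values that fall outside the band. The function name is hypothetical.

```python
from statistics import mean, stdev


def trim_to_rolling_band(values, window=12, n_sd=3.0):
    """Clip each value to mean +/- n_sd * sd of the preceding `window` raw values."""
    trimmed = []
    for i, x in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < 2:
            # Not enough history to estimate a standard deviation; keep as-is.
            trimmed.append(x)
            continue
        center, spread = mean(history), stdev(history)
        lower, upper = center - n_sd * spread, center + n_sd * spread
        trimmed.append(min(max(x, lower), upper))
    return trimmed


# Made-up series with one obvious spike.
series = [10, 12, 11, 9, 10, 11, 500, 10]
print(trim_to_rolling_band(series, window=6))
```

Note that the band is computed from the raw history, so a spike inflates the band for the points that follow it. Computing the band from already-trimmed values is an equally plausible variant.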
Figure: Trimmed and Smoothed, Jobs panels (CE16OV, CIVPART, CLF16OV, PAYEMS, SRVPRD, UEMP27OV, UEMPLT5, UEMPMEAN, UEMPMED, UNEMPLOY, UNRATE, USGOOD), 2000 to 2012
Figure: Trimmed and Smoothed, Consumer panels (ALTSALES, CONSUMER, DPI, PCE, PSAVERT), 2000 to 2012
Figure: Trimmed and Smoothed, Business panels (BUSLOANS, CPATAX, TCU), 2000 to 2012
Fitting and Tuning the Model
Complexity: how many knobs the model has
    e.g. degrees of freedom, number of variables, shrinkage factor, tree size, ...
Fitting: estimating parameters for a given complexity
    Methods: least squares, method of moments, maximum likelihood, optimization
Tuning: adjusting the model's complexity
Possibly iterative, using diagnostics:
    out of sample error
    sensitivity, e.g. ∂error/∂data
    significance of parameters
    error structure, e.g. heteroskedastic
    alignment with prior beliefs
Which variables are important for the model?
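To make the fitting/tuning distinction concrete, here is a minimal Lasso sketch in which the penalty `lam` is the complexity knob and `fit_lasso` estimates the coefficients for one fixed value of it. It uses cyclic coordinate descent, which is one common way to fit a Lasso and not necessarily the method used for this talk. It assumes `X` and `y` are already centered (no intercept is fitted); all names and the toy data are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(X, y, lam, n_sweeps=200):
    """Minimize (1/(2n)) * sum((y - X b)^2) + lam * sum(|b_j|) by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_scale = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_scale[j] == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the residual that excludes j's own fit.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_scale[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Toy centered data: the first column matters, the second barely does.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 0.1, -0.1]
for lam in (0.0, 0.1, 2.0):
    print("lam =", lam, "beta =", fit_lasso(X, y, lam))
```

Tuning then means repeating the fit over a grid of `lam` values and keeping the one with the lowest out-of-sample error; larger `lam` drives more coefficients exactly to zero.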
Figure: Cross-Validated MSE (y-axis, ticks at 4e+04 and 8e+04) against fraction of final L1 norm (x-axis, 0.0 to 1.0)
Figure: LASSO coefficient paths, Standardized Coefficients (y-axis) against |beta|/max|beta| (x-axis, 0.0 to 1.0)
variable estimate
ALTSALES 0.000
BUSLOANS 0.000
CE16OV 0.157
CIVPART 0.000
CLF16OV 0.000
CONSUMER 0.000
CPATAX 0.000
DPI 0.000
PAYEMS 0.028
PCE 2.250
PSAVERT -15.350
SRVPRD 0.000
TCU 0.000
UEMP27OV -0.265
UEMPLT5 0.000
UEMPMEAN -18.559
UEMPMED 0.000
UNEMPLOY 0.000
UNRATE -305.852
USGOOD 0.292
Table: Coefficient estimates for LASSO
Figure: Random Forest: predictor variable importance (x-axis: IncNodePurity, 0 to 1200000)
Variables in the order they appear on the plot's axis: ALTSALES, CPATAX, DPI, UEMPLT5, BUSLOANS, CONSUMER, PSAVERT, UEMPMED, UEMPMEAN, SRVPRD, CIVPART, PAYEMS, CLF16OV, PCE, UNRATE, UEMP27OV, CE16OV, UNEMPLOY, TCU, USGOOD
Model Selection
Model selection: choosing the best among different models.
Our criterion: prediction accuracy. How will we measure this?
Training set cross-validation estimates of out-of-sample MSE:
    RF: 52,000
    Lasso: 62,000
Separate test data:
    25,000, essentially the same for both!
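Comparing models by prediction accuracy comes down to computing the same error measure for each model on the same held-out rows. A minimal sketch follows; the function names and the numbers are made up for illustration and are not the talk's data or results.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def select_model(actual, predictions_by_model):
    """Score every model on the same data; return (best_name, all_scores)."""
    scores = {
        name: mean_squared_error(actual, preds)
        for name, preds in predictions_by_model.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical held-out targets and predictions.
actual = [100, 150, 80]
predictions = {
    "rf": [110, 140, 90],
    "lasso": [120, 150, 60],
}
best, scores = select_model(actual, predictions)
print(best, scores)
```

When two models score about the same on the test data, as the slide reports, a secondary criterion from the Preliminaries list (interpretability, maintenance cost) may reasonably break the tie.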
Figure: target, rf and lasso series (x-axis: 2000 to 2012; y-axis ticks: -1000 to 1000)
Figure: Training set error (rf and lasso, 2000 to 2012)
Figure: Training error ACF (panels: Random Forest, Lasso; x-axis: Lag, 0 to 20; y-axis: ACF)
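The ACF panels check whether the training errors are serially correlated. A minimal sketch of the sample autocorrelation function is below, using the usual estimator that divides every lag's sum by the lag-0 sum of squares. The function name and the toy residuals are hypothetical.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation of series at lags 0..max_lag."""
    n = len(series)
    if n < 2 or not 0 <= max_lag < n:
        raise ValueError("need at least 2 points and 0 <= max_lag < len(series)")
    center = sum(series) / n
    deviations = [x - center for x in series]
    denom = sum(d * d for d in deviations)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(deviations[t] * deviations[t + lag] for t in range(n - lag)) / denom
        for lag in range(max_lag + 1)
    ]


# Made-up residuals that flip sign every period.
residuals = [1, -1, 1, -1, 1, -1]
print(autocorrelation(residuals, max_lag=3))
```

As a rough guide, values at nonzero lags well outside about ±2/√n suggest structure that the model has left in the errors.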
Figure: target, rf and lasso series (x-axis: Jan to Sep; y-axis ticks: 0 to 500)
Figure: Test set error (rf and lasso; x-axis: Jan to Sep; y-axis ticks: -400 to 200)
Prediction
Random Forest: 120
Lasso: 86
The Secrets of Economic Indicators, Bernard Baumohl
The Elements of Statistical Learning, Hastie, Tibshirani, Friedman
Macroeconomic Patterns and Stories, Edward E. Leamer
Analysis of Financial Time Series, Ruey S. Tsay
http://api.stlouisfed.org/docs/fred/ (good source for both FRED and ALFRED)
http://cran.r-project.org/
Thank you!
and thank you to John Verostek, Vladimir Valenta and Steve Kusiak