Chapter 12 Regression with Time-Series Data: Nonstationary Variables
Nonstationary Time Series Examples
• Consider the Chemical Process Viscosity data set
• Is this time series stationary? Why or why not?
• Is a time effect present?
• What do the lag scatter plots tell us?
• What does the ACF tell us?
• Conduct the Durbin-Watson test
The Durbin-Watson Test
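The slides run this test through Minitab's regression output. As a rough sketch of what the statistic itself is, here is a plain-NumPy version applied to illustrative data (not the actual viscosity series):

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic: d = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2.

    d near 2 suggests no first-order autocorrelation; d well below 2
    points to positive autocorrelation, as in a trending series.
    """
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Illustrative data, not the viscosity series
rng = np.random.default_rng(1)
white_noise = rng.normal(size=500)
random_walk = np.cumsum(white_noise)   # strongly positively autocorrelated
print(durbin_watson(white_noise))      # near 2
print(durbin_watson(random_walk))      # near 0
```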
Nonstationary Time Series Examples
• We want to predict the viscosity at time periods 101, 102, and 103.
• Since the data are nonstationary, the time period is important.
• We could use a simple linear model to make the predictions:
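A minimal sketch of that linear trend model in Python (NumPy), using a made-up stand-in for the 100-observation viscosity series:

```python
import numpy as np

def linear_trend_forecast(y, horizons):
    """Least-squares fit of y_t = b0 + b1*t, then extrapolate.

    Sketch of the slide's 'simple linear model'; Minitab's regression
    output adds prediction intervals to these point forecasts.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(1, len(y) + 1)
    b1, b0 = np.polyfit(t, y, 1)            # slope, intercept
    return {h: b0 + b1 * h for h in horizons}

# Hypothetical stand-in for the 100-observation viscosity series
y = 30 + 0.05 * np.arange(1, 101)
print(linear_trend_forecast(y, [101, 102, 103]))
```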
Nonstationary Time Series Examples
• We want to predict the viscosity at time periods 101, 102, and 103.
Note that in Minitab, predictions made under Time Series do not come with prediction intervals, but predictions made under Regression do.
Nonstationary Time Series Examples
• How do we know how well our model works compared to other models?
• Mean Absolute Percentage Error (MAPE)
• Mean Absolute Deviation (MAD)
• Mean Squared Deviation (MSD)
For all three measures, smaller values indicate a better fitting model.
(handouts)
Nonstationary Time Series Examples
The measures MAPE, MAD, and MSD are not very informative by themselves, but can be used to compare the fits obtained by using different models.
Mean absolute percentage error (MAPE) – Expresses accuracy as a percentage of the error. Because this number is a percentage, it may be easier to understand than the other statistics. For example, if the MAPE is 5, on average the forecast is off by 5%.
Mean absolute deviation (MAD) – Expresses accuracy in the same units as the data, which helps conceptualize the amount of error. Outliers have less of an effect on MAD than on MSD.
Mean squared deviation (MSD) – A commonly used measure of accuracy of fitted time series values. Outliers have more influence on MSD than on MAD.
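The three measures can be sketched directly from their definitions (NumPy; the sample numbers here are made up, not from any of the course data sets):

```python
import numpy as np

def accuracy_measures(actual, fitted):
    """MAPE, MAD, and MSD for a fitted time series, with e_t = y_t - fit_t:
    MAPE = 100 * mean(|e_t / y_t|), MAD = mean(|e_t|), MSD = mean(e_t^2).
    """
    y = np.asarray(actual, dtype=float)
    e = y - np.asarray(fitted, dtype=float)
    return {"MAPE": 100 * np.mean(np.abs(e / y)),
            "MAD": np.mean(np.abs(e)),
            "MSD": np.mean(e ** 2)}

m = accuracy_measures([100, 200], [90, 210])
print(m)   # smaller values indicate a better fitting model
```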
Nonstationary Time Series Examples
• Minitab will compute MAPE, MAD, and MSD for this model:
Nonstationary Time Series Examples
• Now, fit the quadratic model and compare the MAPE, MAD, and MSD values to the linear model.
How do the predictions and prediction intervals for this model compare to the linear model?
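One way to make that comparison outside Minitab is to fit both trend degrees with the same helper and compare the accuracy measures of the fitted values. This NumPy sketch uses a fabricated series with genuine curvature, not the viscosity data:

```python
import numpy as np

def trend_forecast(y, degree, horizons):
    """Polynomial trend fit (degree 1 = linear, 2 = quadratic).

    Returns the fitted values (for computing MAPE/MAD/MSD) and the
    point forecasts at the requested future time periods.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(1, len(y) + 1)
    coeffs = np.polyfit(t, y, degree)
    fitted = np.polyval(coeffs, t)
    return fitted, {h: np.polyval(coeffs, h) for h in horizons}

# Hypothetical series with genuine curvature
t = np.arange(1, 101)
y = 2 + 0.1 * t + 0.01 * t ** 2
fit_lin, lin = trend_forecast(y, 1, [101])
fit_quad, quad = trend_forecast(y, 2, [101])
print(lin[101], quad[101])   # quadratic tracks the curvature; linear does not
```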
Smoothing Techniques
Consider the Chemical Process Viscosity data set and in Minitab go to Time Series > Moving Average.
Choose a moving average of length 4
Look at the graph and comment
Smoothing Techniques
Consider the Chemical Process Viscosity data set and in Minitab go to Time Series > Moving Average.
Use a moving average of length 4 and predict the viscosities at time periods 101, 102, and 103.
What do you notice?
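A sketch of what the smoother and its forecasts do (NumPy, with made-up values rather than the viscosity data):

```python
import numpy as np

def moving_average(y, length):
    """Length-`length` moving average of the series (trailing windows)."""
    y = np.asarray(y, dtype=float)
    return np.convolve(y, np.ones(length) / length, mode="valid")

def moving_average_forecast(y, length, horizons):
    """Moving-average forecasts: every future period gets the average of
    the last `length` observations, so all the forecasts are identical --
    which is the behavior the slide asks you to notice."""
    flat = np.asarray(y, dtype=float)[-length:].mean()
    return {h: flat for h in horizons}

y = [30, 31, 33, 32, 34, 35]           # hypothetical viscosity values
print(moving_average(y, 4))
print(moving_average_forecast(y, 4, [101, 102, 103]))
```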
Comparing Models
Compare predicting viscosity at time period 101 using a linear, quadratic, and moving average model with moving average length of your choice.
Which model do you prefer? Why?
Did you check your residual plots?
Comparing Models
Consider the Cheese data set again and compare predicting production in 1998 using a linear, quadratic, and moving average model with moving average length of your choice.
Which model do you prefer? Why?
Did you check your residual plots?
Comparing Models
Consider the Champagne data set again and compare predicting sales in January 1970 using:
• Linear
• Quadratic
• Linear or quadratic with month dummy variables
• Moving average model with a moving average length of your choice (notice this data set is very sensitive to the moving average length)
Which model do you prefer? Why?
Did you check your residual plots?
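For the dummy-variable option, a minimal sketch of building month indicators with January as the baseline (NumPy; the month codes below are hypothetical, not the Champagne data):

```python
import numpy as np

def month_dummies(months):
    """Indicator columns for months 2..12, with January as the baseline,
    ready to append to a trend design matrix for monthly data."""
    m = np.asarray(months)
    return np.column_stack([(m == k).astype(float) for k in range(2, 13)])

months = np.array([1, 2, 3, 12])       # hypothetical month codes
D = month_dummies(months)
print(D.shape)   # one column per non-baseline month: (4, 11)
```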
Comparing Models
Consider the Violent Crime Rate data set and model the data.
Who has the best model?
Remove the last observation from your data set and predict it. Whose prediction is best?
Comparing Models
Consider the drowning children data set and model the data.
Who has the best model?
Remove the last observation from your data set and predict it. Whose prediction is best?
Stepwise Regression
Open the data set Wage2
There are a lot of potential predictor variables to include in a model. One way to deal with this is stepwise regression.
Stepwise Regression
Response: Wage
Potential predictors: Hours, IQ, KWW, educ, exper, tenure, age, sibs, brthord, meduc, feduc
Stepwise Regression
• Forward selection, which involves starting with no variables in the model, trying out the variables one by one and including them if they are 'statistically significant'.
• Backward elimination, which involves starting with all candidate variables and testing them one by one for statistical significance, deleting any that are not significant.
• Methods that are a combination of the above, testing at each stage for variables to be included or excluded.
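A rough sketch of the forward-selection idea on synthetic data (not Wage2); a fixed F-to-enter threshold of 4, roughly α = 0.05, stands in here for the slide's significance test:

```python
import numpy as np

def ols_sse(X, y):
    """Residual sum of squares from regressing y on X plus an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return resid @ resid

def forward_selection(candidates, y, f_to_enter=4.0):
    """At each step add the candidate with the largest partial F statistic,
    stopping when no remaining candidate's F exceeds the threshold."""
    selected, remaining, n = [], list(candidates), len(y)
    while remaining:
        base = (np.column_stack([candidates[v] for v in selected])
                if selected else np.empty((n, 0)))
        sse_cur = ols_sse(base, y)
        best_var, best_f = None, 0.0
        for v in remaining:
            sse_new = ols_sse(np.column_stack([base, candidates[v]]), y)
            df_resid = n - (len(selected) + 2)   # intercept + terms so far + new
            f = (sse_cur - sse_new) / (sse_new / df_resid)
            if f > best_f:
                best_var, best_f = v, f
        if best_var is None or best_f < f_to_enter:
            break
        selected.append(best_var)
        remaining.remove(best_var)
    return selected

# Synthetic data: y depends on x1 and x2, while x3 is pure noise
rng = np.random.default_rng(7)
n = 200
candidates = {name: rng.normal(size=n) for name in ["x1", "x2", "x3"]}
y = 3 * candidates["x1"] + 2 * candidates["x2"] + 0.5 * rng.normal(size=n)
print(forward_selection(candidates, y))
```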
Forward or Backward?
If you have a very large set of potential independent variables from which you wish to extract a few, you should generally go forward.
If you have a modest-sized set of potential variables from which you wish to eliminate a few, you should generally go backward.
Best Subsets Regression
The ultimate in fishing trips. Minitab examines all possible subsets of the predictors, beginning with all models containing one predictor, and then all models containing two predictors, and so on. By default, Minitab displays the two best models for each number of predictors.
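A sketch of the same exhaustive search on synthetic data: score every subset of each size and keep the two best, with R² standing in here for Minitab's fuller per-model display:

```python
import numpy as np
from itertools import combinations

def best_subsets(candidates, y, per_size=2):
    """All-subsets regression: for each subset size keep the top `per_size`
    models by R^2, mirroring Minitab's two-best-per-size output."""
    names, n = list(candidates), len(y)
    sst = np.sum((y - y.mean()) ** 2)
    results = {}
    for k in range(1, len(names) + 1):
        scored = []
        for combo in combinations(names, k):
            X = np.column_stack([np.ones(n)] + [candidates[v] for v in combo])
            beta, *_ = np.linalg.lstsq(X, y, rcond=None)
            resid = y - X @ beta
            scored.append((1 - (resid @ resid) / sst, combo))
        scored.sort(reverse=True)
        results[k] = scored[:per_size]
    return results

# Synthetic data: y depends on x1 and x2, while x3 is pure noise
rng = np.random.default_rng(3)
n = 150
candidates = {name: rng.normal(size=n) for name in ["x1", "x2", "x3"]}
y = 3 * candidates["x1"] + 2 * candidates["x2"] + 0.5 * rng.normal(size=n)
subsets = best_subsets(candidates, y)
for k, models in subsets.items():
    print(k, models)
```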