
Munich Personal RePEc Archive

Variational Bayes inference in high-dimensional time-varying parameter models

Koop, Gary and Korobilis, Dimitris

University of Strathclyde, University of Essex

15 July 2018

Online at https://mpra.ub.uni-muenchen.de/87972/

MPRA Paper No. 87972, posted 18 Jul 2018 12:38 UTC

Variational Bayes inference in high-dimensional time-varying parameter models

Gary Koop* (University of Strathclyde) and Dimitris Korobilis† (University of Essex)

Abstract

This paper proposes a mean field variational Bayes algorithm for efficient posterior and predictive inference in time-varying parameter models. Our approach involves: i) computationally trivial Kalman filter updates of regression coefficients, ii) a dynamic variable selection prior that removes irrelevant variables in each time period, and iii) a fast approximate state-space estimator of the regression volatility parameter. In an exercise involving simulated data we evaluate the new algorithm numerically and establish its computational advantages. Using macroeconomic data for the US we find that regression models that combine time-varying parameters with the information in many predictors have the potential to improve forecasts over a number of alternatives.

Keywords: dynamic linear model; approximate posterior inference; dynamic variable selection; forecasting

JEL Classification: C11, C13, C52, C53, C61

*Strathclyde Business School, Glasgow, G4 0GE, UK, email: [email protected]
†Essex Business School, Colchester, CO4 3SQ, UK, email: [email protected]


1 Introduction

Regression models that incorporate stochastic variation in parameters have been used

by economists at least since the works of Sarris (1973) and Cooley and Prescott (1976).

Granger (2008) argued that time-varying parameter models might become the norm

in econometric inference since (as he illustrated via White’s theorem) generic time-

varying parameter (TVP) models can approximate any form of nonlinearity. Much

empirical work shows the benefits of TVP models for forecasting. For instance, Stock

and Watson (2007) show that their flexible TVP model with no predictors can forecast

inflation more accurately than traditional constant parameter regressions based on the

Phillips curve augmented with exogenous predictors. Extending such evidence, recent

studies have developed novel Bayesian estimation algorithms that are able to combine

time-varying parameter regressions with information in exogenous predictors. Relevant

papers include Belmonte et al. (2014), Chan et al. (2012), Dangl and Halling (2012),

Groen et al. (2013), Kalli and Griffin (2014), Koop and Korobilis (2012), Kowal et al.

(2017), Nakajima and West (2013), Ročková and McAlinn (2018), and Uribe and Lopes

(2017).

Such algorithms demonstrate various inventive ways of allowing for many predictor

variables in a TVP regression setting. In practice, however, empirical application of all

these algorithms is restricted to a handful of predictors and short (quarterly instead

of monthly) data, because of the high computational complexity associated with their reliance on

Markov Chain Monte Carlo (MCMC) or other computationally intensive estimation

methods.1 In light of evidence that information in many predictors can be beneficial

in constant parameter regressions (Stock and Watson, 2002), the inability of existing

1The only exception is the recent work by Ročková and McAlinn (2018) that, alongside an MCMC algorithm, also proposes an expectation-maximization (EM) algorithm that searches for the mode of the posterior.


estimation algorithms to be used in high-dimensional TVP settings is a fundamental

shortcoming. Therefore, an open question of interest in applied econometric research

is whether models that combine large information sets with time-varying parameters

could also be beneficial. This question does not arise only in the case of many exogenous predictors; it is equally important when the high dimensionality comes from using monthly or even daily data sets, since regressions with higher-frequency data are more likely to exhibit time-varying parameter behavior.2

In this paper, we fill this gap in the literature by developing an iterative algorithm

that can handle regressions with many time series observations and/or many predictors in

the presence of time-varying parameters. We use variational Bayes (VB) methods which

allow us to approximate the true high-dimensional posterior distribution in a simple and

straightforward manner. The main idea behind VB methods is to approximate the high-

dimensional and intractable posterior distribution using a simpler, tractable distribution.

VB methods ensure that the approximation is good by minimizing the Kullback-Leibler

distance between the true posterior and the proposed approximation. Following a large

literature in physics and engineering where the mean field approximation was first

developed, our proposed approximation to the posterior is decomposed into a series

of simpler, independent densities that make inference scalable in high dimensions. We

tackle computation by means of an optimization algorithm that has as output the

first two moments of the posterior density and resembles the expectation-maximization

(EM) algorithm, instead of relying on computationally intensive MCMC methods. The

result is an algorithm that combines Kalman filter updates for time-varying coefficients

and volatilities with trivial posterior updates of all other model parameters and, hence,

we call it the Variational Bayes Kalman Filter (VBKF).
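To make the recursion concrete, here is a minimal sketch of the Kalman filter step for a TVP regression y_t = x_t'β_t + ε_t with random-walk coefficients β_t = β_{t-1} + η_t. The variance settings, function name, and simulated data are illustrative only; the paper's VBKF additionally iterates these updates together with posterior updates for the variances and prior hyperparameters.

```python
import numpy as np

def tvp_kalman_filter(y, X, sigma2=1.0, q=0.01):
    """Kalman filter for y_t = x_t' beta_t + e_t, beta_t = beta_{t-1} + eta_t.

    sigma2: measurement variance Var(e_t); q: state innovation variance.
    Both are held fixed here for illustration; the VBKF iterates on their
    (approximate) posteriors. Returns filtered means and covariances of beta_t.
    """
    T, p = X.shape
    beta = np.zeros(p)            # prior mean of beta_0
    P = np.eye(p) * 10.0          # diffuse prior covariance
    means, covs = [], []
    for t in range(T):
        x = X[t]
        # Predict: random-walk state equation inflates the covariance
        P = P + q * np.eye(p)
        # Update: the observation is scalar, so the gain has a closed form
        S = x @ P @ x + sigma2              # forecast variance of y_t
        K = P @ x / S                       # Kalman gain
        beta = beta + K * (y[t] - x @ beta)
        P = P - np.outer(K, x @ P)
        means.append(beta.copy())
        covs.append(P.copy())
    return np.array(means), np.array(covs)

# Simulated example: coefficients drift slowly over time
rng = np.random.default_rng(0)
T, p = 200, 3
X = rng.normal(size=(T, p))
true_beta = np.cumsum(rng.normal(scale=0.05, size=(T, p)), axis=0) + 1.0
y = (X * true_beta).sum(axis=1) + rng.normal(scale=0.5, size=T)
means, covs = tvp_kalman_filter(y, X, sigma2=0.25, q=0.0025)
```

Because each update involves only a rank-one correction per observation, the cost is linear in T, which is the computational property the VBKF exploits.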

2See Bauwens et al. (2015) for a comparison of the number of estimated breaks in monthly vs quarterly macroeconomic time series using a variety of structural breaks and time-varying parameter models.


The use of the VBKF surmounts the computational problem associated with TVP

regressions with many predictors. However, on its own it does not surmount over-

parameterization concerns. Accordingly, we derive a dynamic version of the stochastic

search variable selection (SSVS) prior of George and McCulloch (1993) and incorporate

it into the VBKF. This prior allows us to implement dynamic variable selection by

stochastically searching for probable predictors at each point in time. While this time-

varying variable selection problem is typically of tremendous complexity3, we are able

to integrate it into our efficient VBKF setting. Therefore, the proposed dynamic SSVS

prior extends existing dynamic model selection and shrinkage algorithms (e.g. Kalli

and Griffin, 2014; Koop and Korobilis, 2012) to high-dimensional regression problems.
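As a rough illustration of how a spike-and-slab update of this kind works, the sketch below computes a posterior inclusion probability for a single coefficient under an SSVS-style two-component mixture prior. The scales `tau0`, `tau1` and prior probability `pi0` are illustrative, and this static, single-coefficient version is a simplification of the paper's dynamic, time-indexed prior.

```python
import numpy as np
from scipy.stats import norm

def ssvs_inclusion_prob(beta_mean, beta_var, tau0=0.01, tau1=10.0, pi0=0.5):
    """Approximate posterior inclusion probability under a spike-and-slab
    (SSVS) prior: beta ~ (1 - gamma) N(0, tau0^2) + gamma N(0, tau1^2).

    A mean-field-style update evaluates the two mixture components at the
    current posterior summary of beta, folding the posterior variance into
    each component's scale. All parameter values are illustrative.
    """
    l0 = norm.pdf(beta_mean, scale=np.sqrt(tau0**2 + beta_var))  # spike
    l1 = norm.pdf(beta_mean, scale=np.sqrt(tau1**2 + beta_var))  # slab
    return pi0 * l1 / (pi0 * l1 + (1 - pi0) * l0)

# A coefficient estimated near zero gets a low inclusion probability,
# while a clearly nonzero one gets a high probability.
print(ssvs_inclusion_prob(0.001, 0.0001))  # close to 0
print(ssvs_inclusion_prob(2.0, 0.0001))    # close to 1
```

In the dynamic version, an update of this form is applied to every coefficient at every time period, so the closed-form expression matters for scalability.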

Finally, we augment the VBKF algorithm for the time-varying regression coefficients with an

approximate VBKF estimator for stochastic volatility (SV) models. This latter filter is

as fast as the exponentially weighted moving average (EWMA) filter used in Koop and

Korobilis (2012), but it is less ad hoc and can also provide a full characterization of the

posterior distribution of the volatility process instead of a point volatility estimate.
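For reference, the EWMA benchmark mentioned here is a one-line recursion on squared residuals. A minimal sketch follows; the decay factor and initialization are illustrative, and unlike the paper's VBKF volatility filter this produces only a point estimate of the variance path.

```python
import numpy as np

def ewma_volatility(residuals, lam=0.94, sigma2_0=None):
    """Exponentially weighted moving average variance filter:
    sigma2_t = lam * sigma2_{t-1} + (1 - lam) * e_t^2.

    lam is the decay factor (0.94 is a common choice); sigma2_0 defaults
    to the sample mean of the squared residuals. Returns the filtered
    variance path as a point estimate, with no posterior uncertainty.
    """
    e2 = np.asarray(residuals) ** 2
    sigma2 = np.empty_like(e2, dtype=float)
    prev = e2.mean() if sigma2_0 is None else sigma2_0
    for t, v in enumerate(e2):
        prev = lam * prev + (1 - lam) * v
        sigma2[t] = prev
    return sigma2

rng = np.random.default_rng(1)
# Residual standard deviation doubles halfway through the sample
e = np.concatenate([rng.normal(scale=1.0, size=300),
                    rng.normal(scale=2.0, size=300)])
s2 = ewma_volatility(e)
```

The filtered variance tracks the regime change with a lag governed by the decay factor, which is the sense in which the recursion is fast but ad hoc.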

The purpose of these computationally efficient approximations, as well as the

dynamic shrinkage and variable selection prior, is prediction. While approximation-free

parameter estimation is equally important, there are several reasons why we do not focus

this aspect of statistical inference using the proposed algorithm. First, even though

asymptotic properties of general variational Bayes estimators have been derived in

various regression settings (Wang and Blei, forthcoming), establishing consistency of our

time-varying parameter estimators under a dynamic hierarchical prior is a non-trivial

task. Second, for the kind of high-dimensional inference problems we are interested

in, estimation error might be large. For example, our empirical exercise uses up to

3A traditional static variable selection problem with p predictors involves a model space of K = 2^p possible models containing combinations of these predictors. The dynamic variable selection problem has to solve the static problem at each of the T observations in a given time series data set.


118 predictors, all featuring parameters that drift at each time period. In this case,

the parameter space is so vast that regardless of whether using exact or approximate

estimators the sampling error for TVP problems is high.4 As a result, having a

flexible and subjective shrinkage prior in our proposed algorithm is desirable as it leads

to posterior mean estimates that might be biased, but provide a huge reduction in

estimation variance (with benefits in terms of mean squared error compared to unbiased

estimators that might have extremely large variance). This observation is confirmed

by the fact that all the recent contributions in this field (see citations above) focus

exclusively on forecasting, and not causal analysis using flexible TVP models.

We show, via a Monte Carlo exercise and an empirical application, that our proposed

algorithm works well in high-dimensional sparse time-varying parameter sett