Variational Bayes inference in high-dimensional time ... · PDF file we call it the...

Click here to load reader

  • date post

  • Category


  • view

  • download


Embed Size (px)

Transcript of Variational Bayes inference in high-dimensional time ... · PDF file we call it the...

  • Munich Personal RePEc Archive

    Variational Bayes inference in

    high-dimensional time-varying parameter


    Koop, Gary and Korobilis, Dimitris

    University of Strathclyde, University of Essex

    15 July 2018

    Online at

    MPRA Paper No. 87972, posted 18 Jul 2018 12:38 UTC

  • Variational Bayes inference in high-dimensional

    time-varying parameter models

    Gary Koop✯

    University of Strathclyde Dimitris Korobilis ❸

    University of Essex


    This paper proposes a mean field variational Bayes algorithm for efficient posterior and predictive inference in time-varying parameter models. Our approach involves: i) computationally trivial Kalman filter updates of regression coefficients, ii) a dynamic variable selection prior that removes irrelevant variables in each time period, and iii) a fast approximate state-space estimator of the regression volatility parameter. In an exercise involving simulated data we evaluate the new algorithm numerically and establish its computational advantages. Using macroeconomic data for the US we find that regression models that combine time-varying parameters with the information in many predictors have the potential to improve forecasts over a number of alternatives.

    Keywords: dynamic linear model; approximate posterior inference; dynamic variable selection; forecasting

    JEL Classification: C11, C13, C52, C53, C61

    ✯Strathclyde Business School, Glasgow, G4 0GE, UK, email: [email protected] ❸Essex Business School, Colchester, CO4 3SQ, UK, email: [email protected]


  • 1 Introduction

    Regression models that incorporate stochastic variation in parameters have been used

    by economists at least since the works of Sarris (1973) and Cooley and Prescott (1976).

    Granger (2008) argued that time-varying parameter models might become the norm

    in econometric inference since (as he illustrated via White’s theorem) generic time-

    varying parameter (TVP) models can approximate any form of nonlinearity. Much

    empirical work shows the benefits of TVP models for forecasting. For instance, Stock

    and Watson (2007) show that their flexible TVP model with no predictors can forecast

    inflation more accurately than traditional constant parameter regressions based on the

    Phillips curve augmented with exogenous predictors. Extending such evidence, recent

    studies have developed novel Bayesian estimation algorithms that are able to combine

    time-varying parameter regressions with information in exogenous predictors. Relevant

    papers include Belmonte et al. (2014), Chan et al. (2012), Dangl and Halling (2012),

    Groen et al. (2013), Kalli and Griffin (2014), Koop and Korobilis (2012), Kowal et al.

    (2017), Nakajima and West (2013), Ročková and McAlinn (2018), and Uribe and Lopes


    Such algorithms demonstrate various inventive ways of allowing for many predictor

    variables in a TVP regression setting. In practice, however, empirical application of all

    these algorithms is restricted to a handful of predictors and short (quarterly instead

    of monthly) data, because of their high complexity associated with their reliance on

    Markov Chain Monte Carlo (MCMC) or other computationally intensive estimation

    methods.1 In light of evidence that information in many predictors can be beneficial

    in constant parameter regressions (Stock and Watson, 2002), the inability of existing

    1The only exception is the recent work by Ročková and McAlinn (2018) that, alongside an MCMC algorithm, also proposes an expectation-maximization (EM) algorithm that searches for the mode of the posterior.


  • estimation algorithms to be used in high-dimensional TVP settings is a fundamental

    shortcoming. Therefore, an open question of interest in applied econometric research

    is whether models that combine large information sets with time-varying parameters

    could also be beneficial. This question doesn’t just hold for the case of many exogenous

    predictors, but it is also important when the high dimensionality comes from using

    monthly or even daily data sets: Regressions with higher frequency data will be more

    likely to exhibit time-varying parameter behavior.2

    In this paper, we fill this gap in the literature by developing an iterative algorithm

    that can handle regressions with many time series observations and/or many predictors in

    the presence of time-varying parameters. We use variational Bayes (VB) methods which

    allow us to approximate the true high-dimensional posterior distribution in a simple and

    straightforward manner. The main idea behind VB methods is to approximate the high-

    dimensional and intractable posterior distribution using a simpler, tractable distribution.

    VB methods ensure that the approximation is good by minimizing the Kullback-Leibler

    distance between the true posterior and the proposed approximation. Following a large

    literature in physics and engineering where the mean field approximation was first

    developed, our proposed approximation to the posterior is decomposed into a series

    of simpler, independent densities that make inference scalable in high dimensions. We

    tackle computation by means of the an optimization algorithm that has as output the

    first two moments of the posterior density and resembles the expectation-maximization

    (EM) algorithm, instead of relying on computationally intensive MCMC methods. The

    result is an algorithm that combines Kalman filter updates for time-varying coefficients

    and volatilities with trivial posterior updates of all other model parameters and, hence,

    we call it the Variational Bayes Kalman Filter (VBKF).

    2See Bauwens et al. (2015) for a comparison of the number of estimated breaks in monthly vs quarterly macroeconomic time series using a variety of structural breaks and time-varying parameter models.


  • The use of the VBKF surmounts the computational problem associated with TVP

    regressions with many predictors. However, on its own it does not surmount over-

    parameterization concerns. Accordingly, we derive a dynamic version of the stochastic

    search variable selection (SSVS) prior of George and McCulloch (1993) and incorporate

    it into the VBKF. This prior allows us to implement dynamic variable selection by

    stochastically searching for probable predictors at each point in time. While this time-

    varying variable selection problem is typically of tremendous complexity3, we are able

    to integrate it to our efficient VBKF setting. Therefore, the proposed dynamic SSVS

    prior extends existing dynamic model selection and shrinkage algorithms (e.g. Kalli

    and Griffin, 2014; Koop and Korobilis, 2012) to high-dimensional regression problems.

    Finally, we add to the VBKF algorithm for the time-varying regression coefficients, an

    approximate VBKF estimator for stochastic volatility (SV) models. This latter filter is

    as fast as the exponentially weighted moving average (EWMA) filter used in Koop and

    Korobilis (2012), but it is less ad-hoc and can also provide a full characterization of the

    posterior distribution of the volatility process instead of a point volatility estimate.

    The purpose of these computationally efficient approximations, as well as the

    dynamic shrinkage and variable selection prior, is prediction. While approximation-free

    parameter estimation is equally important, there are several reasons we don’t focus on

    this aspect of statistical inference using the proposed algorithm. First, even though

    asymptotic properties of general variational Bayes estimators have been derived in

    various regression settings (Wang and Blei, forthcoming), establishing consistency of our

    time-varying parameter estimators under a dynamic hierarchical prior is a non-trivial

    task. Second, for the kind of high-dimensional inference problems we are interested

    in, estimation error might be large. For example, our empirical exercise uses up to

    3A traditional static variable selection problem with p predictors involves a model space of K = 2p

    possible models containing combinations of these predictors. The dynamic variable selection has to solve the static problem in all T observations associated with a given time series data set.


  • 118 predictors, all featuring parameters that drift at each time period. In this case,

    the parameter space is so vast that regardless of whether using exact or approximate

    estimators the sampling error for TVP problems is high.4 As a result, having a

    flexible and subjective shrinkage prior in our proposed algorithm is desirable as it leads

    to posterior mean estimates that might be biased, but provide a huge reduction in

    estimation variance (with benefits in terms of mean squared error compared to unbiased

    estimators that might have extremely large variance). This observation is confirmed

    by the fact that all the recent contributions in this field (see citations above) focus

    exclusively on forecasting, and not causal analysis using flexible TVP models.

    We show, via a Monte Carlo exercise and an empirical application, that our proposed

    algorithm works well in high-dimensional sparse time-varying parameter sett