WORKING PAPER
CMVM COMISSÃO DO MERCADO DE VALORES MOBILIÁRIOS * Nº 03/2013
MODELING AND FORECASTING
LIQUIDITY USING PRINCIPAL
COMPONENT ANALYSIS
AND DYNAMIC FACTOR MODELS
AN ILLIQUIDITY COMPOSITE
INDICATOR PROPOSAL
Paulo Pereira da Silva*
CMVM-Portuguese Securities Commission
Rua Laura Alves nº 4
Apartado 14258
1064-003 LISBOA
Email: [email protected]
* The views stated herein are those of the author and not those of the Portuguese Securities Commission.
ABSTRACT
I survey and describe the main liquidity proxies used in the literature, highlighting
some of their merits. Some theoretical background and motivation for the usage of
PCA and DFM in the design of a liquidity composite indicator is provided. I apply the
PCA/DFM to a set of nine liquidity proxies over a group of four western European
equity markets. The emphasis is placed on extracting a latent variable, a liquidity
component that captures the co-movement of the proxies. Besides the signal
extraction, stress testing for equity market liquidity is illustrated. Finally, I also
present some applications regarding the suitability of DFM to model and forecast
future liquidity.
1. INTRODUCTION
The microeconomic concept of liquidity is multidimensional. Five dimensions are
well documented in the literature: (i) tightness, (ii) immediacy, (iii) depth,
(iv) breadth and (v) resilience (see Sarr and Lybek, 2002, for a quick survey).
Tightness refers to reduced transaction costs. Immediacy reflects the speed at
which orders are transmitted to the market and settled. Depth concerns the
presence of abundant orders both above and below the price at which the security
is trading. Breadth refers to the existence of numerous large-volume orders
with minimal impact on prices. Finally, resilience is associated with the market's
ability to correct order imbalances, which tend to move the price away from the
intrinsic value of the security. In short, market participants perceive a security as
liquid if they can quickly sell large amounts of the security without affecting its price.
Liquidity is not directly observable. Since there are several dimensions of liquidity,
there are also numerous different empirical measures. Several proxies are often
used but none captures all the dimensions of the concept. In this regard, Goyenko
et al. (2009) perform a horserace of both monthly and annual liquidity measures to
evaluate their merits. Sarr and Lybek (2002), Lesmond et al. (1999), Hasbrouck
(2004, 2009) and Lesmond (2005) also compare several liquidity proxies based on
monthly and daily data.
In this paper, I propose the use of a composite indicator of liquidity based on a
well-known static method, Principal Component Analysis (PCA, henceforth), and on
dynamic factor models. One of the main characteristics of these methods is their
ability to capture the main features of the data. I use PCA to extract a few key,
uncorrelated latent liquidity variables, called the principal components, from a
larger set of correlated liquidity proxies. The suitability of the technique depends
on the correlation of the proxies: the higher the correlation between the original
variables, the better it performs. In effect, a highly correlated set of variables
requires only a few principal components to characterize the latent variable(s).
PCA takes historical data on movements in the proxies and attempts to define a
set of orthogonal components that explain the movements. The PCA methodology is
derived from an eigenvalue analysis of a large covariance matrix of several
commonly used variables that proxy liquidity. The basic idea is that the main
factors represent the common trend of liquidity over the analyzed time span. PCA
makes it possible to reduce the number of liquidity proxies to a manageable
dimension and to detect their sources of variability.
Notice that proxies with higher correlations are considered more capable of
capturing liquidity, given that they convey the same information (Naes et al., 2011).
A good liquidity proxy should capture time-series variation in liquidity. So, PCA and
dynamic factor models can be used to assess the liquidity of stock markets and to
capture the co-movement of different correlated proxies. In addition, dynamic
factor models also capture persistence in liquidity and allow one-step-ahead
predictions for the liquidity proxies and latent liquidity.
In the second section, I survey and describe the liquidity proxies used in the study
highlighting some of their merits. I use the majority of the proxies proposed by
Sarr and Lybek (2002), Zhang (2010) and by Goyenko et al. (2005).
In the third section some theoretical background and motivation for the usage of
PCA and dynamic factor models in the design of the liquidity composite indicator is
provided.
In the fourth section, I apply the PCA/dynamic factor models to a set of nine
liquidity proxies over a group of four western European equity markets. The
emphasis is placed on extracting a latent variable, a liquidity component, from
the proxies described in section 2. Besides the signal extraction, stress testing for
equity market liquidity is performed. In this section I also present some
applications regarding the suitability of dynamic factor models to model and
forecast future liquidity.
2. LIQUIDITY PROXIES
Nine liquidity proxies are used in this paper. Three of them are closely related to
transaction costs (bid-ask spread, effective bid-ask spread and Roll’s modified
measure), four are associated with market impact (Amihud illiquidity indicator, HHL,
Zeros and Market-Efficiency coefficient) and the final two are related to breadth
and depth (value turnover and turnover ratio).
i) Value turnover: indicator of realized liquidity that is computed as the daily
sum of the value of all the transactions. Benston and Hagerman (1974) and
Stoll (1978) argue that value turnover, volatility and price influence liquidity.
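ii) Turnover ratio: defined as the ratio between the value turnover and the
market capitalization of a listed company:
$$TR_t = \frac{VT_t}{MC_t}$$
where $VT_t$ is the value turnover and $MC_t$ the market capitalization at $t$.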
iii) Bid-ask spread: measured as the absolute difference between bid and ask
prices or as a percentage spread. The latter is more convenient in comparisons
of different securities, given that higher-priced securities tend to exhibit higher
absolute spreads. The bid-ask spread is a measure of implicit transaction
costs: high transaction costs reduce the demand for trades and, thus, the
number of potentially active participants in a market. Concurrently, the
reduction of the number of participants in the market due to high transaction
costs influences market breadth and resilience. According to Glosten and
Milgrom (1985), bid-ask spreads may also reflect the degree of information
asymmetry. The absolute bid-ask spread is expressed as:
$$S_t = P^{A}_t - P^{B}_t$$
where $P^{A}_t$ and $P^{B}_t$ are the ask and bid prices, respectively. The percentage spread is
defined as:
$$S^{\%}_t = \frac{P^{A}_t - P^{B}_t}{(P^{A}_t + P^{B}_t)/2}$$
iv) Effective bid-ask spread is also used to capture transaction costs.
$$ES_t = \frac{2\,|P_t - M_t|}{M_t}$$
where $P_t$ is the trading price of the security and $M_t$ the prevailing mid-quote when
the trade occurs.
v) Roll (1984) proposes an estimator of the effective spread based on the
serial covariance of the changes in prices. Suppose that the unobservable
fundamental value of a stock is a random walk with the following stochastic
behavior:
$$V_t = V_{t-1} + e_t$$
where $e_t$ is a white noise. The last observed trade price on day $t$ is given by
$$P_t = V_t + \tfrac{1}{2} S\, Q_t$$
where $S$ is the effective spread and $Q_t$ is a categorical variable that equals 1 if the
last trade was buyer initiated and -1 otherwise. Roll (1984) assumes equal
probabilities for each of the possible values. In addition, he considers that $Q_t$ is serially
uncorrelated and independent of $e_t$ such that:
$$\Delta P_t = \tfrac{1}{2} S\, \Delta Q_t + e_t$$
The serial covariance might be written as:
$$\mathrm{Cov}(\Delta P_t, \Delta P_{t-1}) = -\tfrac{1}{4} S^2$$
The effective bid-ask spread (Roll’s estimator) can be expressed as:
$$S = 2\sqrt{-\mathrm{Cov}(\Delta P_t, \Delta P_{t-1})}$$
When the sample covariance is positive, the formula is undefined. Thus, a modified
version of Roll’s estimator presented by Goyenko et al. (2009) is defined as
$$S = \begin{cases} 2\sqrt{-\mathrm{Cov}(\Delta P_t, \Delta P_{t-1})} & \text{if } \mathrm{Cov}(\Delta P_t, \Delta P_{t-1}) < 0 \\ 0 & \text{if } \mathrm{Cov}(\Delta P_t, \Delta P_{t-1}) \geq 0 \end{cases}$$
vi) Amihud (2000)[1] suggests the following ratio as an indicator of market impact:
$$ILLIQ_t = \frac{|r_t|}{VT_t}$$
where $r_t$ is the stock return at $t$ and $VT_t$ is the value turnover in Euro at
$t$. One drawback of this indicator is that it is undefined on zero-volume days.
Nonetheless, it is useful to capture the price impact of trades and is widely used as a
liquidity proxy. Hasbrouck (2009) argues that, among the daily proxies, the Amihud
illiquidity measure is most strongly correlated with the price impact coefficient
based on transactions and quotes. On the other hand, the liquidity effect of asymmetric
information is most likely captured in the price impact of a trade (Glosten and
Harris, 1988). Acharya and Pedersen (2005), Watanabe and Watanabe (2008), Spiegel
and Wang (2005), Avramov et al. (2006) and Kamara et al. (2008) use the Amihud
proxy to assess commonality in liquidity among stocks.
vii) Zeros. Lesmond et al. (1999) compute the proportion of days with zero
returns as a proxy for illiquidity. They present two reasons to support this
indicator: (i) securities with lower liquidity are more likely to have zero volume
days and thus more likely to have zero return days; (ii) stocks with higher
transaction costs have less private information acquisition (since it is more
difficult to overcome higher transaction costs) and thus, even on positive
volume days, they are more likely to have no-information-revelation, zero
return days.
[1] Acharya and Pedersen (2004) also adopted this indicator.
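viii) Hui-Heubel Liquidity ratio (HHL) attempts to capture the price impact, breadth
and resilience dimensions of liquidity. It relates the volume of trades to
their impact on prices, and is computed as an average of 5-day periods in a
sample, in order to smooth volatility.
$$HHL = \frac{(P_{max} - P_{min})/P_{min}}{V/(N \times \bar{P})}$$
where $P_{max}$ and $P_{min}$ are the highest and lowest daily prices over the 5-day
period, $V$ is the value traded over the period, $N$ is the number of shares
outstanding and $\bar{P}$ is the average closing price.
The Hui-Heubel Liquidity ratio uses the turnover ratio in the denominator,
scaling price movements by the speed of rotation of the equity in the markets.
The higher the liquidity of an asset, the lower HHL will be.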
ix) The Market-Efficiency Coefficient (MEC) was proposed by Hasbrouck and
Schwartz (1988) to distinguish short-term from long-term price changes.
Indeed, price movements are more continuous in liquid markets, even if new
information influences equilibrium prices. Consequently, for a given
permanent price change, the transitory changes to that price should be minimal in
resilient markets.
$$MEC = \frac{Var(R_T)}{T \cdot Var(r_t)}$$
where $Var(R_T)$ is the variance of returns over the longer period, $Var(r_t)$ is the
variance of the return over the shorter period and $T$ is the number of shorter periods
embedded in the longer period.
MEC should be close to one in more resilient markets (although slightly lower
than one), in the sense that overreaction and underreaction to new information
should be minimal. Prices of assets with high market resilience may exhibit lower
volatility (fewer transitory changes) between periods in which the equilibrium price is
changing. Excessive short-term volatility/overshooting leads to MEC figures
significantly lower than one.
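The proxies built from daily data can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the paper: the function names are hypothetical, returns are assumed to be log returns (so that longer-period returns are sums of daily ones), sample covariances use the n-1 divisor, and the Amihud ratio simply skips zero-volume days.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def sample_covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def percentage_spread(ask, bid):
    """Quoted spread relative to the mid-quote."""
    return (ask - bid) / ((ask + bid) / 2)


def roll_modified(prices):
    """Modified Roll estimator: zero when the serial covariance is non-negative."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    cov = sample_covariance(changes[1:], changes[:-1])
    return 2 * math.sqrt(-cov) if cov < 0 else 0.0


def amihud(returns, value_turnover):
    """Average |return| / value turnover, skipping zero-volume days (assumption)."""
    ratios = [abs(r) / v for r, v in zip(returns, value_turnover) if v > 0]
    return mean(ratios)


def zeros(returns):
    """Proportion of days with a zero return."""
    return sum(1 for r in returns if r == 0) / len(returns)


def hui_heubel(highs, lows, closes, value_traded, shares_outstanding):
    """Price range over the window scaled by the window's turnover ratio."""
    price_range = (max(highs) - min(lows)) / min(lows)
    turnover_ratio = value_traded / (shares_outstanding * mean(closes))
    return price_range / turnover_ratio


def market_efficiency_coefficient(short_returns, periods):
    """Var(long-period return) / (T * Var(short-period return))."""
    # Long-period returns are sums of non-overlapping blocks of log returns.
    long_returns = [
        sum(short_returns[i:i + periods])
        for i in range(0, len(short_returns) - periods + 1, periods)
    ]
    return sample_variance(long_returns) / (periods * sample_variance(short_returns))
```

For example, a price series that merely bounces between two quotes yields a positive Roll spread, whereas a steadily trending series has a positive serial covariance and the modified estimator returns zero.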
3. EXTRACTING LATENT VARIABLES
A. The Principal Component Approach
Principal component analysis (PCA) is a method for detecting patterns in data and
emphasizing similarities and differences in variables. PCA reduces the dimension of
the data, that is, it attempts to reduce the number of variables to analyse without
much loss of information. Put differently, it aims to explain the variability of a set of
variables through a smaller set of new uncorrelated (orthogonal) latent variables.
Thus, one may describe the variation in a set of correlated variables using a
smaller new set of uncorrelated factors.
Each component is computed so as to account for the maximum possible variation
of the initial dataset. The first component is the most relevant. It denotes
the linear combination of the original variables that yields the highest sample
variance (the largest eigenvalue) among all linear combinations with unit-length
weights. If the variables display a high correlation altogether, the first component
usually denotes a common trend. Fewer components ease the analyst's task of
providing an intuitive meaning to the set of components. The interpretation of the
components is usually guided by the level of correlation of each variable with a
particular component.
Principal component analysis consists in the spectral decomposition of a covariance
matrix or of a correlation matrix. Performing PCA is equivalent to determining the
eigenvalues and eigenvectors of the covariance (correlation) matrix. If PCA is
calculated using the correlation matrix, then the outcome will only be affected by the
correlations of the variables, but if the input to PCA is the covariance matrix the results
will depend not only on the correlations of the variables but also on their standard
deviations. Indeed, the representation of the principal components based on the
covariance (correlation) matrix provides a linear representation for the (standardized)
liquidity proxies.
It can be shown that there is no general association between the spectral
decomposition of a covariance matrix and that of the corresponding
correlation matrix. Accordingly, there is no general association between the
principal components of a covariance matrix and those of its correlation matrix. However,
if the variables have similar standard deviations, both methods should yield similar
results. PCA is sensitive to the units of measurement, which determine variances and
covariances. In our case, it is preferable to work with the correlation matrix because
correlation is not affected by the scale of the variables.
One should also note that the PCA is one of the simplest of many dimension
reduction methodologies that transform a set of correlated variables into a set of
uncorrelated variables. The main difference between the PCA and other factor
analysis methods derives from the fact that the former seeks to identify a small
number of factors to explain the total variation of the dataset while the latter place
the emphasis on using a small number of hypothetical random variables to explain
the correlations or covariances in a multivariate dataset.
PCA can be applied to any set of stationary time series, regardless of the level of
correlation of the set of variables. The assumption that the variables are normally
distributed is not required, only that they have finite variances and covariances.
Standard variances and covariances are not robust and are sensitive to outliers.
In order to make PCA insensitive to outliers, robust versions of variances and
covariances are necessary.
PCA can be implemented in three steps:
1. Calculate the covariance or correlation matrix of the original dataset.
2. Derive the eigenvalues and the eigenvectors of that matrix. Next, rank the
eigenvalues by their value. The first principal component is associated with the largest
eigenvalue; the second principal component is associated with the second largest
eigenvalue, and so on.
The first component explains the most variation of the dataset. In very highly
correlated datasets, this component captures an almost parallel shift in all
variables, and more generally it is labelled the common trend component (in our case it
captures the most often experienced type of common movement in all the liquidity
proxies). The second eigenvector belongs to the second largest eigenvalue, and
therefore the second component explains the second most variation in the dataset.
3. Let X be the time series dataset, V the covariance matrix (correlation matrix)
and P the principal components. There is a representation of the data such that:
$$P = XW$$
where W is a p-by-p matrix whose columns are the eigenvectors of $X^{T}X$ (factor
scores matrix).
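The three steps can be sketched in plain Python for the first principal component. This is a minimal illustration, not the procedure used in the paper: it works on the correlation matrix, and it swaps a full eigen-decomposition for power iteration, which only recovers the largest eigenvalue and its eigenvector. The function names and the starting vector are assumptions made for the example.

```python
import math


def standardize(column):
    """Centre a series and scale it to unit sample standard deviation."""
    n = len(column)
    m = sum(column) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in column) / (n - 1))
    return [(x - m) / sd for x in column]


def correlation_matrix(columns):
    """Step 1: correlation matrix of the proxies (one list per proxy)."""
    z = [standardize(c) for c in columns]
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def first_component(matrix, iterations=1000):
    """Step 2 (first component only): largest eigenvalue and its eigenvector."""
    k = len(matrix)
    # Arbitrary starting vector; it must not be orthogonal to the leading eigenvector.
    w = [1.0 / (i + 1) for i in range(k)]
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]
    mw = [sum(matrix[i][j] * w[j] for j in range(k)) for i in range(k)]
    eigenvalue = sum(a * b for a, b in zip(w, mw))  # Rayleigh quotient, w has unit norm
    return eigenvalue, w


def component_scores(columns, w):
    """Step 3: project the standardized data on the eigenvector (first column of P = XW)."""
    z = [standardize(c) for c in columns]
    n = len(columns[0])
    return [sum(w[j] * z[j][t] for j in range(len(z))) for t in range(n)]
```

With a correlation matrix the eigenvalues sum to the number of proxies, so the share of variance explained by the first component is simply `eigenvalue / len(columns)`.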
PCA allows transforming the original data into a system of orthogonal factors.
Consider the following PCA representation of k liquidity proxies:
$$X_i = w_{i1} P_1 + w_{i2} P_2 + \dots + w_{ik} P_k$$
where $X_i$ denotes liquidity proxy $i$ ($i = 1, \dots, k$) and $w_{ij}$ is the factor loading of
liquidity proxy $i$ on principal component $j$. Thus, if the $j$-th principal component moves
by $\Delta P_j$, liquidity proxy $i$ will move by $w_{ij}\,\Delta P_j$, ceteris paribus. Factor
score $i$ is computed from $\lambda_i$, the $i$-th largest eigenvalue, and $w_i$, the eigenvector
associated with eigenvalue $i$. The correlation/covariance matrix of the principal components
is diagonal, given that the factors are uncorrelated. The variance of each principal
component is equal to the corresponding eigenvalue.
The total variation of the original time series is provided by the sum of the
eigenvalues of the covariance (correlation) matrix. Consequently, one can assess the
contribution of each factor by dividing its eigenvalue by the sum of the eigenvalues:
$$\frac{\lambda_i}{\sum_{j=1}^{k} \lambda_j}$$
The capacity of PCA to reduce dimensions, combined with the use of orthogonal
variables for risk factors, makes this technique an extremely attractive option for
Monte Carlo simulation and scenario analysis.
B. Dynamic Factor Models
Dynamic Factor Models (DFM) are flexible models for multivariate time series. DFM
aim to combine the cross-section analysis through Principal Components Analysis
and the time series dimension of data through linear regression modelling (Federici
and Mazzitelli, 2010). These models allow for serial and mutual correlation of the
idiosyncratic errors. One advantage of factor models lies in the fact that they may
use information from many variables without running into scarce degrees of free-
dom, which is a problem frequently faced in regression analyses. Because of their
ability to simultaneously and consistently model data sets in which the number of
series exceeds the number of time series observations, these types of models have
received considerable attention in the past decade. Breitung and Eickmeier (2005)
point out two other reasons to use factor models: the idiosyncratic movements, which
possibly include measurement errors and local shocks, can be eliminated with this
technique, and one does not need to rely on overly tight assumptions, as is
sometimes the case in structural models.
Coppi and Zannella (1978) introduced Dynamic Factor Models. They seek to
decompose the covariance matrix (V) of a set of time series variables into three distinct
covariance matrices:
$$V = V_I + V_T + V_{IT}$$
where $V_I$ represents the variability of the data structure without taking the time
dimension into account (it equals the covariance matrix of the average of the units
with respect to time); $V_T$ reflects the variability, due to the time dimension, of the
average of the units, regardless of the dynamics of the single units; $V_{IT}$ measures the
variability due to the difference between the dynamics of the overall average of the
units, that is the average dynamics, and the dynamics of the single units.
The observed endogenous variables are linear functions of exogenous covariates
and unobserved factors, which have a vector autoregressive structure, and thus are
persistent over time. In this framework, the unobserved factors can also be a func-
tion of exogenous covariates. The error terms in the equations for the dependent
variables may be autocorrelated.
Stock and Watson (2010) divide the time-domain estimation of DFM into three
generations. The first generation is based on low-dimensional (small N) parametric
models estimated in the time domain using Gaussian maximum likelihood
estimation and the Kalman filter. This methodology provides optimal estimates of the
factors (and optimal forecasts) under the model assumptions and parameters.
Notwithstanding, this estimation method requires nonlinear optimization, which may be
a serious drawback due to convergence issues. The second generation of estimators
involves nonparametric estimation with a large set of variables using
cross-sectional averaging methods, in particular principal components and related
methods. The principal components estimator of the space spanned by the factors is
consistent. Moreover, if N is sufficiently large, then the factors are estimated
precisely enough to be used as data in later regressions (Stock and Watson, 2010).
The third generation uses consistent nonparametric estimates of the factors to
estimate the parameters of the state space model used in the first generation, thereby
solving the dimensionality problem of first-generation models.
As with principal components, estimating latent factors in this way is sometimes referred to
as extracting or estimating an indicator. The principle of a dynamic factor model is
that a few latent dynamic factors drive the comovements of a high-dimensional
set of time-series variables, which is also influenced by a vector of mean-zero
idiosyncratic disturbances. The error term arises from measurement errors and
from special aspects of the individual series. The latent factors follow a time series
process, which is commonly taken to be a vector autoregression (VAR) (Stock and
Watson, 2010).
For each of the analyzed countries, I estimate the following dynamic factor model:
$$y_{i,t} = \mu_i + \beta_i f_t + \varepsilon_{i,t}$$
$$f_t = \phi f_{t-1} + \eta_t$$
where $y_{i,t}$ is liquidity proxy $i$ at $t$, $\varepsilon_{i,t}$ and $\eta_t$ are disturbance
terms, and $f_t$ denotes a latent variable that represents a common movement in
liquidity and follows an AR(1) process. The model is estimated in its state space
representation using the stationary Kalman filter. $\phi$ represents persistence in liquidity.
Thus, if liquidity is persistent it may also be foreseeable, which is an additional
advantage over the standard PCA method. $\mu_i$ represents the expected value for
proxy $i$ during normal periods and $\beta_i$ is the sensitivity of proxy $i$ to movements in the
latent variable. Notice that I use seasonally adjusted variables in the estimation,
and consequently there is no need to model seasonality.
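To make the filtering step concrete, the sketch below runs a Kalman filter for a one-factor model of this kind with known parameters. It is an illustration under assumptions that go beyond the text: the idiosyncratic errors are taken to be mutually and serially uncorrelated (which allows the proxies to be processed one at a time), the parameters are given rather than estimated by maximum likelihood, and the names `mu`, `beta`, `r`, `phi` and `q` are hypothetical labels for the proxy means, the sensitivities, the error variances, the persistence coefficient and the factor innovation variance.

```python
def kalman_filter(observations, mu, beta, r, phi, q):
    """Return filtered factor estimates and one-step-ahead proxy forecasts.

    observations: list of time periods, each a list with one value per proxy.
    """
    f = 0.0                       # unconditional mean of the AR(1) factor
    p = q / (1.0 - phi ** 2)      # unconditional (stationary) variance of the factor
    filtered, forecasts = [], []
    for y in observations:
        # Prediction step: propagate the factor one period ahead.
        f = phi * f
        p = phi ** 2 * p + q
        forecasts.append([m + b * f for m, b in zip(mu, beta)])
        # Update step: sequential scalar updates, valid because the
        # measurement errors are assumed uncorrelated across proxies.
        for y_i, m, b, r_i in zip(y, mu, beta, r):
            innovation = y_i - (m + b * f)
            s = b * b * p + r_i   # innovation variance
            gain = p * b / s      # Kalman gain
            f += gain * innovation
            p *= 1.0 - gain * b
        filtered.append(f)
    return filtered, forecasts
```

Each entry of `forecasts` is the prediction made for a period before that period's data are used, which is the one-step-ahead forecast that a persistent latent factor makes possible.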
4. APPLICATION TO FOUR WESTERN EUROPEAN COUNTRIES
A. Principal Components Analysis
In this section, I apply the PCA methodology to measure the liquidity of four west-
ern European equity markets. As said before, liquidity is a latent variable, that is, it
is not directly observed. Nine well documented proxies of liquidity are used to cap-
ture the movements of liquidity: bid-ask spread, effective bid-ask spread, Roll’s
modified measure, Amihud illiquidity indicator, HHL, Zeros, MEC, turnover and turn-
over ratio.
The analysis is based on monthly data, given that some of the proxies are only
available on a monthly basis (e.g. Zeros, MEC, Roll’s modified measure). In order to
calculate the values of the proxies, I collect daily data from Bloomberg, namely last
trade prices, bid and ask prices, market capitalization and turnover. The data
collected cover the period between 2000 and 2012 and comprise 2,043 securities
traded in France, Italy, Spain and Portugal. All the securities (active or inactive)
present in the MiFID database at 31-12-2012 are included in the main sample.
In order to obtain country aggregate values for the liquidity proxies, the following
procedure is conducted:
i) first, logs are taken to smooth the path of some of the proxies (Amihud
indicator, HHL, turnover, bid-ask spread and effective bid-ask spread);
ii) next, the (average) monthly proxies for each of the individual securities are
calculated;
iii) finally, the weighted averages of the monthly liquidity proxies are computed
for each market, using the securities' market capitalization as weights (a 20%
cap is introduced to reduce the dependency of the aggregate liquidity proxies
on a small set of securities).
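The capped weighting in the last step can be sketched as follows. The text does not say how the 20% cap is enforced, so the redistribution rule used here (capped securities are fixed at 20% and the excess is spread over the remaining securities in proportion to their market capitalization, repeating until no weight exceeds the cap) is an assumption, as are the function names.

```python
def capped_weights(market_caps, cap=0.20):
    """Market-capitalization weights with no single weight above `cap`."""
    if len(market_caps) * cap < 1.0:
        raise ValueError("too few securities for the weights to sum to one under the cap")
    total = sum(market_caps)
    weights = [m / total for m in market_caps]
    capped = set()
    while True:
        over = [i for i, w in enumerate(weights) if i not in capped and w > cap + 1e-12]
        if not over:
            return weights
        capped.update(over)
        free = [i for i in range(len(weights)) if i not in capped]
        remaining = 1.0 - cap * len(capped)   # weight left for uncapped securities
        free_total = sum(market_caps[i] for i in free)
        for i in capped:
            weights[i] = cap
        for i in free:
            weights[i] = remaining * market_caps[i] / free_total


def weighted_average(values, weights):
    """Country-level proxy as the weighted average of the security-level proxies."""
    return sum(v * w for v, w in zip(values, weights))
```

For instance, with one security worth 60 and six worth 10 each, the large security is held at 20% and each of the others receives 0.8/6 of the weight.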
The first step in PCA consists in the computation and analysis of the correlation
matrix. As expected, the reported results show some similarities in the data: the
correlation matrix shows a high linear statistical association between the (ln)
Amihud Indicator and the (ln) HHL; and between the Turnover Ratio and the
Turnover; MEC exhibits a low correlation with the other variables (Table 1).
In a first stage, the PCA method is applied to the nine variables (Table 2). The KMO
measure and Bartlett's Test of Sphericity suggest that the application of the PCA
method provides good results for Portugal and Italy (KMO higher than 0.7) and
acceptable for Spain and France (KMO higher than 0.5). In the case of Portugal, the
first component accounts for 44.9% of the variance of the data. The (ln) Amihud
Indicator, (ln) HHL, and (ln) Turnover are the variables with a higher percentage of
the variance explained by the extracted component. For Italy, the first component
explains 37.7% of the total variance, whereas for France and Spain that percentage
drops to 32.6% and 29.7%, respectively. In general, the (ln) Amihud Indicator, the
(ln) HHL and effective bid-ask spread are the proxies with higher contribution to the
first principal component.
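As a quick check on these figures, and assuming the PCA is run on the correlation matrix of the nine standardized proxies (so that the eigenvalues sum to nine), the 44.9% share reported for Portugal implies a first eigenvalue of roughly four:

```latex
\frac{\lambda_1}{\sum_{j=1}^{9} \lambda_j} = \frac{\lambda_1}{9} = 0.449
\quad\Longrightarrow\quad \lambda_1 \approx 4.04
```

Under that assumption, the first component carries about as much variation as four of the nine standardized proxies taken together.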
In a second stage, I reduce the number of variables to six. Roll’s modified measure,
Zeros and the Market-Efficiency coefficient seem to have little impact on the
co-movement of the proxies according to the factor scores. They are very indirect
measures of liquidity that show little correlation with the first principal
component, and for that reason they are dropped in further analyses.
Notice that the analysis so far does not take seasonality into consideration. Two different approaches
are used to address seasonality. In the first, raw data are used to run the PCA and
then seasonality is extracted from the principal component using the additive
method of seasonal decomposition. In the second, seasonality is removed from the
raw data, again using the additive seasonal decomposition method, before running
the PCA. The two approaches yield similar results. Although both approaches are
performed, I put more emphasis on the second one in the subsequent analysis.
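A stripped-down version of additive seasonal adjustment is sketched below. It is a simplification and not necessarily the decomposition used in the paper: a classical additive decomposition first estimates the trend with a centred moving average, whereas this sketch assumes there is no trend and takes each calendar month's seasonal effect to be the deviation of that month's average from the overall average. The function name is hypothetical.

```python
def remove_additive_seasonality(series, period=12):
    """Subtract each season's average deviation from the overall mean."""
    overall = sum(series) / len(series)
    seasonal = []
    for k in range(period):
        same_season = series[k::period]          # e.g. all Januaries
        seasonal.append(sum(same_season) / len(same_season) - overall)
    return [x - seasonal[i % period] for i, x in enumerate(series)]
```

The function can be applied either to each raw proxy before the PCA or to the extracted component afterwards, mirroring the two approaches described above.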
After dropping Roll’s modified measure, Zeros and Market-Efficiency coefficient, the
performance of PCA increases dramatically for some of the analyzed equity markets
(Table 3). In Portugal and Italy the first principal component now represents 65.7%
and 46.5% of the variance of the sub dataset. In Spain and France, the total vari-
ance explained increases to 40.2% and 46.5% of the variance of the proxies, re-
spectively. Communality analysis shows that the Amihud indicator, HHL, bid-ask
spread or effective bid-ask spread are the proxies that capture a higher percentage
of the variation of the first principal component.
Figure 1 displays the evolution of the first component between January 2000 and
December 2012 and allows identifying the pattern of the latent variable. For instance, all
the analyzed countries exhibit a decline of liquidity after 2008, due to the
international financial crisis. PCA indicates that at the end of 2012 the liquidity level was
still above the level displayed in 2008 in three countries (the exception is France).
In 2010, Spain had already recovered the liquidity level displayed before the crisis.
Notwithstanding, the recovery process was reversed at the end of 2011 with the
European sovereign debt crisis.
One of the advantages of using the PCA approach is its flexibility to model the
correlations between variables. The factor loadings obtained from PCA allow designing
a stress test approach, where the impact of aggregate liquidity shocks on the
proxies is simulated. This stress test exercise is presented in Figure 2. For example,
in Spain the effective bid-ask spread rises from 0.362% in normal times to
0.841% in stress periods. In France and Italy, the HHL and Amihud indicators more
than double in periods of high stress, meaning that negative shocks in aggregate
liquidity affect price impact measures to a large extent in these countries.
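One way such a stress test could be set up is sketched below. It is an assumption about the mechanics rather than the paper's documented procedure: the proxy is taken to be standardized before the PCA, the loading is read as its correlation with a unit-variance component, and a shock of a given number of standard deviations in the component is mapped back to the proxy's own units. Proxies that enter the PCA in logs are converted back with the exponential. All names and the numbers in the usage note are hypothetical.

```python
import math


def stressed_value(mean, std, loading, shock_sd, logged=False):
    """Proxy level implied by a shock of `shock_sd` standard deviations in the component."""
    value = mean + loading * shock_sd * std
    # Undo the log transform when the proxy entered the PCA in logs.
    return math.exp(value) if logged else value
```

For instance, a proxy with mean 0.5, standard deviation 0.1 and loading 0.8 would move to 0.66 under a two standard deviation shock in the latent variable.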
Instead of performing PCA in the liquidity proxies, one might consider as an alter-
native their changes over time. Table 4 shows the correlation matrix of the first
differences of the liquidity proxies. Changes in the Amihud indicator and HHL are
highly correlated in the four countries. The same occurs with the (ln) turnover and
the turnover ratio (except for Italy). Table 5 presents the total variance explained
by the first and second principal component, communalities, factor loadings and
factor scores for each proxy. With the exception of Italy, the first two principal
components account for more than 60% of the variance of the six liquidity proxies.
These principal components represent different dimensions of liquidity. One way of
assessing the economic interpretation of the principal components is through the
analysis of the factor loadings, which in some sense represent the correlation
between the factors and the liquidity proxies. For instance, in the case of Spain the
first principal component is associated with breadth (given its correlation with the
turnover ratio and log turnover), whereas the second principal component denotes
price impact and transaction costs. In the case of France, the first principal
component represents price impact, whilst the second denotes breadth. Finally, in Italy
and Portugal the first principal component is highly correlated with the price impact
measures.
Using differences instead of levels in PCA also permits simulating the impact of
shocks in the latent variables over the liquidity proxies. Figure 3 shows 95% confi-
dence intervals for the liquidity proxies (corrected for seasonality). In the case of
Portugal, stress testing indicates a decline of the turnover ratio of 0.37% in the
event of a 1.645 standard deviation shock in the liquidity latent variables. In Spain,
the turnover ratio is not particularly affected by changes in the latent variable, and
in France and Italy that impact is also of minor importance. However, a 1.645
standard deviation liquidity shock in the latent variables may have serious reper-
cussions in price impact measures: the HHL indicator more than triples in Spain and
the effective bid-ask spread increases by 30 basis points in France.
B. Dynamic Factor Models
DFM permit modelling the dynamics of our variables of interest and unobserved
components in a VAR framework. I estimate a DFM for the liquidity proxies of each
country assuming that the latent variable displays first order autocorrelation and
that the behaviour of the proxies is explained by this latent variable and a disturb-
ance term. Moreover, I am assuming that the dynamics of the proxies are solely
described by the dynamics of the unobserved liquidity. This analysis focuses on the
bid-ask spread, effective bid-ask spread, Roll’s modified measure, Amihud illiquidity
indicator, HHL and turnover ratio. Value turnover is excluded due to convergence
issues and possible non-stationarity of the series.
The estimated models reveal that the unobserved component is very persistent
over time. The autoregressive coefficient of the latent variable, $\phi$, is statistically
different from zero and ranges between 0.86 and 0.94. Moreover, the coefficients
associated with the variables ln(1+ Amihud indicator), ln(1+ HHL), ln(1+ bid-ask
spread) and ln(1+ effective bid-ask spread) are statistically significant at the 5%
significance level for all countries. The liquidity component does not seem to explain
the turnover ratio in France, Spain and Italy.
The forecasting ability of the DFM is also tested. To do so, the sample is divided
into two subsamples: from January 2000 to December 2010 and from January 2011 to
December 2012. I re-estimate the model on the first period and use the second for
out-of-sample forecasting. Two measures of forecasting accuracy are computed,
namely MAPE and RMSE. I also compare the forecasting accuracy of the DFM with that
of the historical mean. For that purpose, an RMSE-based R-squared is computed and
the Diebold and Mariano test is run. DFM presents a remarkable forecasting accuracy
in the cases of Portugal and Spain, where the Diebold and Mariano statistic is
significant at the 10% significance level for all the liquidity proxies, with the
exception of the turnover ratio in Spain. In the cases of France and Italy, DFM
only appears to provide higher predictive accuracy than the historical average in
the forecast of the bid-ask spread and the Amihud illiquidity indicator,
respectively. The RMSE-based R-squared also suggests that the forecasting ability
of DFM is higher for price impact measures (Amihud indicator and HHL) than for
transaction cost measures.
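The accuracy measures mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the RMSE-based R-squared is taken to be one minus the ratio of the model's squared forecast errors to those of the benchmark, the Diebold and Mariano statistic uses a squared-error loss with no autocorrelation correction (adequate only for one-step-ahead forecasts), and the numbers are made up to show the arithmetic.

```python
import math
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual must be non-zero)."""
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)

def rmse(actual, forecast):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

def oos_r2(actual, model_fc, bench_fc):
    """Out-of-sample R-squared of the model against a benchmark forecast."""
    sse_model = np.sum((actual - model_fc) ** 2)
    sse_bench = np.sum((actual - bench_fc) ** 2)
    return float(1.0 - sse_model / sse_bench)

def diebold_mariano(actual, fc1, fc2):
    """DM statistic (squared-error loss) and two-sided normal p-value.
    A negative statistic means fc1 has the smaller average loss."""
    d = (actual - fc1) ** 2 - (actual - fc2) ** 2
    stat = d.mean() / math.sqrt(d.var(ddof=0) / d.size)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return float(stat), float(p_value)

# Made-up numbers (assumption); a real test needs a much longer sample.
actual = np.array([1.0, 2.0, 3.0, 4.0])
model_fc = np.array([1.1, 1.9, 3.2, 3.8])   # stand-in for DFM forecasts
bench_fc = np.full(4, 2.5)                  # stand-in for the historical mean

r2 = oos_r2(actual, model_fc, bench_fc)
dm_stat, dm_p = diebold_mariano(actual, model_fc, bench_fc)
print(mape(actual, model_fc), rmse(actual, model_fc), r2, dm_stat, dm_p)
```

A positive R-squared and a significantly negative statistic both point to the model forecasting better than the benchmark.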
5. FINAL REMARKS
PCA methodology is widely used to capture unobserved variables through the
analysis of observable proxies. In this paper, I apply PCA to capture the evolution
of liquidity, which is not directly observed. In doing so, I use a set of nine
liquidity proxies. I also show how to simulate the impact of liquidity shocks on
proxies of liquidity such as the bid-ask spread, the turnover ratio and the Amihud
indicator. In that sense, PCA can be useful for measuring aggregate liquidity risk
and for stress test reporting. In terms of the results, the unobserved liquidity
variable shows a significant downturn after the Lehman Brothers bankruptcy in the
four analysed markets. Even though this event affected the liquidity of all
markets, it was particularly severe in Spain and France, but with transitory
effects. Both markets recovered to their long-term liquidity level before mid-2010.
Italy also exhibits a decline of the liquidity component after the Lehman Brothers
bankruptcy but, unlike in Spain and France, that liquidity shock appears to have a
more permanent effect.
Furthermore, I present an extension of PCA methodology, Dynamic Factor Models.
The DFM estimates reveal that the unobserved component is very persistent over
time, and thus predictable. DFM presents a remarkable forecasting accuracy,
particularly in Portugal and Spain. Even in the cases of France and Italy, DFM
appears to provide higher predictive accuracy than the historical average in
forecasting the bid-ask spread and the Amihud illiquidity indicator, respectively.
This analysis also shows commonalities across the liquidity components of the four
markets. In other words, the liquidity of different European markets tends to
co-move.
Comparing the two approaches, PCA has the advantage of being easier to implement
and more flexible, whereas DFM computation is slow and convergence is sometimes not
achieved. On the other hand, DFM permits modelling the time series structure of the
proxies and computing forecasts of the liquidity component and of the proxies.
REFERENCES
Amihud, Y. (2002). “Illiquidity and stock returns: Cross-section and time-series
effects”. Journal of Financial Markets 5, 31-56.
Acharya, V. and L. H. Pedersen (2005). “Asset pricing with liquidity risk”. Journal of
Financial Economics 77, 375-410.
Avramov, D., T. Chordia and A. Goyal (2005). “Liquidity and autocorrelations in
individual stock returns”. Working Paper.
Benston, G. and R. Hagerman (1974). “Determinant of bid-asked spreads in the
over-the-counter market”. Journal of Financial Economics 1(4), 353-364.
Breitung, J. and S. Eickmeier (2009). “Testing for structural breaks in dynamic
factor models”. Deutsche Bundesbank Economic Studies Discussion Paper No.
Coppi, R. and F. Zanella (1978). “L’analisi fattoriale di una serie temporale múltipla
relativa allo stesso insieme di unità statistiche”. Società Italiana di Statistica, XXIX
riunione.
Federici, A. and A. Mazzitelli (2010). “Dynamic factor analysis with Stata”. 2nd
Italian Stata Users Group meeting.
Glosten, L. and L. Harris (1988). “Estimating the components of the bid/ask
spread”. Journal of Financial Economics, 21, 125-142.
Glosten, L. and P.R. Milgrom (1985). “Bid, ask and transaction prices in a specialist
market with heterogeneously informed traders”. Journal of Financial Economics,
14-71.
Glosten, L. (1987). “Components of the bid–ask spread and the statistical
properties of transaction prices”. Journal of Finance 42, 1293–1307.
Goyenko, R.Y., C.W. Holden and C.A. Trzcinka (2009). “Do liquidity measures
measure liquidity?”. Journal of Financial Economics 92, 153-181.
Hasbrouck, J. (2004). “Liquidity in the futures pits: inferring market dynamics from
incomplete data”. Journal of Financial and Quantitative Analysis 39, 305–326.
Hasbrouck, J. (2009). “Trading costs and returns for US equities: estimating
effective costs from daily data”. Journal of Finance.
Hasbrouck, J. and R. A. Schwartz (1988). “An assessment of stock exchange and
over-the-counter markets”. Journal of Portfolio Management 14, 10-16.
Kamara, A., X. Lou and R. Sadka (2008). “The divergence of liquidity commonality
in the cross-section of stocks”. Journal of Financial Economics 89, 444–466.
Kyle, A. (1985). “Continuous auctions and insider trading”. Econometrica 53
(6), 1315-1335.
Lesmond, D. A., J. P. Ogden and C. Trzcinka (1999). “A new estimate of
transaction costs”. Review of Financial Studies 12 (5).
Lesmond, D. (2005). “Liquidity of emerging markets”. Journal of Financial
Economics 77, 411–452.
Naes, R., J. A. Skjeltorp and B. A. Ødegaard (2011). “Stock market liquidity and the
business cycle”. The Journal of Finance 66, 139–176.
Roll, R. (1984). “A simple implicit measure of the effective bid–ask spread in an
efficient market”. Journal of Finance 39, 1127–1139.
Sarr, A. and T. Lybek (2002). “Measuring liquidity in financial markets”. IMF
Working Paper No. 02/232.
Spiegel, M. I. and X. Wang (2005). “Cross-sectional Variation in Stock Returns:
Liquidity and Idiosyncratic Risk”. Yale ICF Working Paper No. 05-13.
Stock, J. H. and M. W. Watson (2010). “Dynamic factor models”. Oxford Handbook
of Economic Forecasting.
Stoll, H. (1978). “The pricing of security dealers services: An empirical study of
NASDAQ Stocks”. Journal of Finance 33, 1153 – 1172.
Watanabe, A. and M. Watanabe (2008). “Time-varying liquidity risk and the cross
section of stock returns”. Review of Financial Studies, 21, 2449-2486.
Zhang, H. (2010). “Measuring liquidity in emerging markets”. Working Paper.
TABLES
Table 1
Correlation Matrix
Panel A – Portugal
Panel B – Spain
Panel C – Italy
Panel D – France
Table 2
Total Variance Explained by the First Principal Component
and Communalities Using the 9 Liquidity Proxies
Panel A - Portugal
Panel B - Spain
Panel C - Italy
Panel D - France
Table 3
Total Variance Explained by the First Principal Component
and Communalities Using 6 Liquidity Proxies
Portugal
Panel A - Raw Series Panel B - Seasonally Adjusted Series
Spain
Panel A - Raw Series Panel B - Seasonally Adjusted Series
Italy
Panel A - Raw Series Panel B - Seasonally Adjusted Series
France
Panel A - Raw Series Panel B - Seasonally Adjusted Series
Figure 1
First Principal Component Evolution Between 2000 and 2012
Panel A - Portugal
Panel B - Spain
Panel C - Italy
Panel D - France
Figure 2
Liquidity Shocks – Impact of a Liquidity Aggregate Factor Shock
on the Liquidity Proxies
Panel A - Portugal
Panel B - Spain
Panel C - Italy
Panel D - France
Table 4
Correlation Matrix of Differentiated Variables
Panel A - Portugal
Panel B - Spain
Panel C - Italy
Panel D - France
Table 5
Total Variance Explained by the First and Second Principal Component,
Factor Scores, Factor Loadings and Communalities Using 6 First-Differenced
Liquidity Proxies
Panel A - Portugal
Panel B - Spain
Panel C - Italy
Panel D - France
Figure 3
Liquidity Shocks – Impact of a Shock in the Liquidity Aggregate Factor
on the Liquidity Proxies, Corrected for Seasonality
Panel A - Portugal
Panel B - Spain
Panel C - Italy
Panel D - France
Table 6
Dynamic Factor Model Estimation
Figure 4
Latent Liquidity Derived From a Dynamic Factor Model
Table 7
Out-of-Sample Forecasting Accuracy: DFM versus Sample Average
Table 8
Out-of-Sample (One-Step-Ahead) Forecasting Accuracy:
DFM versus Historical Average