A word on metadata sheets and observed heterogeneity in ad hoc quality indicators of BCS data Presentation by Christian Gayer DG ECFIN A.4.2, Business and consumer surveys and short-term forecast BCS Workshop 15-16 November 2012, Brussels

Page 1

A word on metadata sheets and observed heterogeneity in ad hoc quality indicators of BCS data

Presentation by Christian Gayer, DG ECFIN A.4.2, Business and consumer surveys and short-term forecast

BCS Workshop, 15-16 November 2012, Brussels

Page 2

Background (1)

• Transparency calls for regular updating of metadata sheets

• Apart from contact data etc., metadata sheets contain valuable info on sampling (universe, frame, sampling method, sample size, sampling error, response rates, treatment of non-response, weighting, etc.)

• Ideally, sheets should enable users to gauge the "quality" of survey data

Page 3

Background (2)

• Quality is multi-faceted (relevance, accuracy, timeliness, accessibility, comparability, …)

• Focus here on accuracy

• Components are: sampling errors and non-sampling errors (frame, measurement, processing, non-response, assumptions)

• Sampling error depends on 1) inherent variability of figures to be measured, 2) sampling design, esp. sample size ("sample 4 times larger, sampling error 2 times smaller")
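
The "4 times larger, 2 times smaller" rule reflects the sampling error shrinking with the square root of the sample size. A minimal sketch, assuming simple random sampling and an estimated share p (neither is stated on the slide; the function name is hypothetical):

```python
import math

def standard_error_of_share(p: float, n: int) -> float:
    """Standard error of an estimated share p from a sample of size n.

    Assumption: simple random sampling, no finite-population correction.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)

# Compare a sample of 500 with one four times as large.
se_small = standard_error_of_share(0.5, 500)
se_large = standard_error_of_share(0.5, 2000)

# The ratio equals sqrt(2000 / 500), whatever value of p is used.
ratio = se_small / se_large
```

Real BCS samples are usually stratified and weighted, so this only illustrates the order of magnitude of the effect.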

Page 4

Background (3)

• High sampling error reduces accuracy of point estimates

• Also leads to higher volatility of estimates over time

• Month-on-month changes therefore subject to noise, which is increasing in the sampling error

• Currently: particular interest in signals of turning points (TPs)

• The more noise, the more difficult to detect TPs

• We look at ad hoc quality indicators of BCS data
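
Why month-on-month changes inherit the noise can be written out explicitly. A sketch, assuming the sampling errors e_t of two consecutive monthly estimates share the same variance σ² and have correlation ρ (symbols introduced here for illustration, not taken from the slides):

```latex
\operatorname{Var}(e_t - e_{t-1}) = \sigma^2 + \sigma^2 - 2\rho\sigma^2 = 2\sigma^2(1 - \rho)
```

With independent monthly samples (ρ = 0) the standard error of the change is √2 times that of the level; positively correlated errors, as in a stable panel, reduce it.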

Page 5

Outline

Focus here on

1. Sample sizes

2. Volatility

3. Tracking performance

Page 6

1. Gross sample sizes

• Top (green) and lowest (red) 10

• Samples vary broadly as function of country size

            total    indu    serv    cons    buil    reta
EU         172389   37846   42885   39510   21711   30437
EA          99178   23115   23208   27290    9396   16169
BE           6070    1270    1350    1850     700     900
BG           4654    1194     835    1000     592    1033
CZ           3850    1000     800    1000     600     450
DK           7800     500    3600    1500     800    1400
DE          11700    3800    3900    2000    1000    1000
EE           1675     275     360     800      90     150
EL           4295    1065     870    1500     410     450
ES           5903    2268     800    2000     285     550
FR          18050    4000    4500    3300    2500    3750
IT           9900    4100    2100    2000     700    1000
CY           1100     100     200     500     100     200
LV           4139     810    1283    1000     426     620
LT           4424     795     976    1200     536     917
HU           6100    1500    1500    1000    1500     600
MT           2519     350     600    1000     219     350
NL           7478    1400    3000    1661     917     500
AT           9403     925    1798    1500     380    4800
PL          19203    3500    4352    1020    5000    5331
PT           5836    1242    1533    1629     873     559
RO           9641    2338    2565    1000    1630    2108
SI           4212     744     791    1500     395     782
SK           3797     756     606    1200     557     678
FI           4510     700     800    2350     160     500
SE           7650    1594    2766    1500     481    1309
UK           5750    1500    1000    2000     750     500
average      6553    1456    1715    1463     835    1217

Rows with empty cells on the slide (values in source order; columns not recoverable from this transcript): IE 2000; LU 730 120 500 110

Page 7

Gross sample sizes (total)

• Outliers particularly visible for large countries

• Response rates have to be taken into account

Page 8

Effective sample sizes

• Top (green) and lowest (red) 10

• Effective samples can be significantly reduced, reflecting low response rates (which imply bias and higher sampling error)

• Largest effective samples in indu & serv, smallest in reta & buil

            total    indu    serv    cons    buil    reta
EU         122197   29133   29863   27081   15810   20310
EA          68528   17924   16151   18663    6443    9348
BE           5821    1205    1251    1850     651     863
BG           4127    1159     787     610     572     999
CZ           3193     850     600    1000     450     293
DK           5244     425    2340     945     624     910
DE           8673    2660    2613    2000     630     770
EE           1483     206     288     800      77     113
EL           2472     746     357    1050     131     189
ES           3067    1361     352     800     114     440
FR          12343    3000    2970    2310    1625    2438
IT           9567    3977    1890    2000     700    1000
CY           1100     100     200     500     100     200
LV           2994     559     847     870     302     415
LT           3615     684     810     960     418     743
HU           2365     375     315    1000     375     300
MT           1615     158     270    1000      72     116
NL           5317    1120    2400    1030     367     400
AT           2250     435     557     300     190     768
PL          17095    3395    4004     765    4400    4531
PT           4506     994    1226    1140     698     447
RO           8427    2104    2309     650    1467    1897
SI           3214     684     712     750     363     704
SK           3292     601     504    1147     488     551
FI           3126     560     560    1528     128     350
SE           5305    1164    1521    1500     361     759
UK           1306     495     180     118     398     115
average      4695    1121    1195    1003     608     812

Rows with empty cells on the slide (values in source order; columns not recoverable from this transcript): IE 118; LU 565 118 340 108
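
The gap between gross and effective samples can be summarised as an implied response rate. A minimal sketch, assuming the effective sample equals the gross sample times the response rate (the slides only say the reduction reflects low response rates); the totals are copied from the two tables for a handful of countries, and the function name is hypothetical:

```python
# "total" column for selected countries, copied from the gross and
# effective sample size tables.
gross_total = {"BE": 6070, "DE": 11700, "HU": 6100, "AT": 9403, "UK": 5750}
effective_total = {"BE": 5821, "DE": 8673, "HU": 2365, "AT": 2250, "UK": 1306}

def implied_response_rate(gross: int, effective: int) -> float:
    """Effective/gross ratio, read here as a response rate (an assumption)."""
    if gross <= 0:
        raise ValueError("gross sample must be positive")
    return effective / gross

rates = {
    country: implied_response_rate(gross_total[country], effective_total[country])
    for country in gross_total
}

# Countries ordered from the lowest to the highest implied response rate.
ranking = sorted(rates, key=rates.get)

for country in ranking:
    print(f"{country}: {rates[country]:.0%}")
```

Sorting by the ratio makes it easy to see which large gross samples shrink the most once non-response is taken into account.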

Page 9

Effective sample sizes (total)

Outliers persist for at least two countries

Page 10

2. Measure of the volatility/noise in the series: Months for cyclical dominance (MCD)

MCD = time span (in months) that one has to wait before one can attribute a change of direction to cyclical rather than irregular (noise) factors

• Based on time series decomposition into trend/cycle and irregular component

• Computation of m-on-m, 2-month, 3-month, etc. changes

• Computation of absolute averages of these n-period changes

• Comparison for the two components (ratio irreg/trend-cycle)
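
The steps above can be sketched as follows. This is an illustration, not the procedure actually used for the presentation: it assumes the series has already been decomposed (the decomposition method is not given here), it takes the MCD to be the first span at which the average absolute change of the irregular component falls below that of the trend/cycle, and all function names are hypothetical.

```python
def mean_abs_changes(series, max_span):
    """Average absolute k-period change of a series, for k = 1..max_span."""
    x = [float(v) for v in series]
    if len(x) <= max_span:
        raise ValueError("series must be longer than max_span")
    return [
        sum(abs(x[t] - x[t - k]) for t in range(k, len(x))) / (len(x) - k)
        for k in range(1, max_span + 1)
    ]

def mcd_from_means(irregular_means, trend_cycle_means):
    """First span (1-based) where the irregular change is below the trend/cycle change."""
    pairs = zip(irregular_means, trend_cycle_means)
    for span, (irregular, trend_cycle) in enumerate(pairs, start=1):
        if irregular < trend_cycle:
            return span
    return None  # no examined span has the trend/cycle dominating

def mcd(trend_cycle, irregular, max_span=12):
    """MCD from the two components of an already decomposed monthly series."""
    return mcd_from_means(
        mean_abs_changes(irregular, max_span),
        mean_abs_changes(trend_cycle, max_span),
    )

# Average absolute changes quoted for the consumer confidence example,
# assuming the three trend/cycle figures refer to spans of 1, 2 and 3 months.
slide_mcd = mcd_from_means([2.19, 2.18, 2.16], [0.88, 1.74, 2.57])
```

Stricter conventions (for instance requiring the ratio to stay below 1 at all longer spans) are deliberately left out of this sketch.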

Page 11

[Figure: consumer confidence indicator (CONS) with its irregular component (CONS_IR) and trend/cycle component (CONS_TC), monthly, 1986-2012]

mean|IR-IR(-1)| = 2.19  >  mean|TC-TC(-1)| = 0.88

mean|IR-IR(-2)| = 2.18  >  mean|TC-TC(-2)| = 1.74

mean|IR-IR(-3)| = 2.16  <  mean|TC-TC(-3)| = 2.57

The irregular component dominates over spans of 1 and 2 months; the trend/cycle dominates from 3 months on, so MCD=3

Page 12

MCDs for confidence indicators

            esi   indu   serv   cons   buil   reta
EU            1      1      2      2      3      3
EA            1      1      2      2      3      4
BE            2      2      2      3      2      3
BG            3      3      4      4      3      3
CZ            3      3      4      3      4      4
DE            1      1      2      2      2      3
EE            2      3      2      3      2      3
EL            3      3      3      3      4      3
ES            3      3      4      3      4      5
FR            2      2      1      3      1      5
IT            2      2      3      3      5      4
CY            4      4      3      5      7      4
LV            4      4      2      3      2      2
LT            3      3      3      2      3      2
HU            3      3      3      2      4      3
NL            2      2      1      3      3      3
AT            2      2      3      3      5      5
PL            3      2      2      4      1      3
PT            3      3      4      3      4      4
RO            3      3      4      3      3      3
SI            3      2      3      4      3      4
SK            4      5      3      3      3      4
FI            3      2      4      3      3      4
SE            2      2      2      2      2      3
UK            2      2      4      3      2      3
average     2.7    2.7    2.8    3.1    3.1    3.5

Rows with empty cells on the slide (values in source order; columns not recoverable from this transcript): DK 2 2 4 3; IE 4; LU 2 2 3 3; MT 3 4 2 3 3

• 1 or 2: green

• 4 or more: red

• EU/EA aggregates are smoother

• ESIs are smoother than CIs

• Some CIs indicate change in cyclical conditions immediately (MCD=1)

• Order of MCDs across surveys in line with sample sizes (indu & serv < cons < reta & buil)

• Strong variation across countries

• Irregular component can dominate cycle even after 4, 5, 6 months

• Caveat: Not always same sample

Page 13

Some examples (1)

[Figure: example series, 1984-2012, each annotated with its effective sample size n and its MCD. Annotations: n=2660, MCD=1; n=700, MCD=5; n=1625, MCD=1; n=601, MCD=5. Only one panel title survives: IT_BUIL]

Page 14

Some examples (2)

[Figure: example series, 1984-2012, each annotated with its effective sample size n and its MCD. Annotations: n=651, MCD=2; n=180, MCD=4; n=2400, MCD=1; n=190, MCD=5. Only one panel title survives: AT_BUIL]

Page 15

But…

[Figure: two counter-examples, annotated n=743, MCD=2 and n=2438, MCD=5]

Page 16

Plotting sample sizes (effective) vs. MCDs

Page 17

Continuous measure:

Ratio of average absolute 2-month changes in irreg to trend/cycle

Overall: no strong evidence, but very large samples correspond with low volatility

(exceptions: Reta FR, PL; Serv RO)
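
One way to quantify such a relationship is a rank correlation between effective sample size and MCD. The sketch below uses only the ten (n, MCD) pairs annotated on the example slides; these were picked as illustrations, so the result says nothing reliable about the full set of country-sector pairs. The choice of Spearman's coefficient and all function names are assumptions made here, not the method behind the plots.

```python
from statistics import mean

# (effective sample size, MCD) pairs as annotated on the example slides.
pairs = [
    (2660, 1), (700, 5), (1625, 1), (601, 5), (651, 2),
    (180, 4), (2400, 1), (190, 5), (743, 2), (2438, 5),
]

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average rank of positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

rho = spearman([n for n, _ in pairs], [m for _, m in pairs])
```

A negative value of `rho` would mean that larger effective samples tend to go with lower MCDs among these examples.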

Page 18

Summary on volatility

• MCDs as materialisation of sampling error

• Wide differences in the usefulness of results for detecting trends and TPs

• In some cases volatility has to be reduced, otherwise short-term noise buries cyclical info we are interested in

• Ways to reduce volatility: larger samples, higher response rates, better stratification/weighting, stabilisation of (panel) responses, …

Page 19

3. Tracking performance

• Behaviour with respect to hard reference series the indicators are supposed to track

• Reference series: growth in GDP, IP, value added in services, private consumption, building production index

• Biased estimates (due to e.g. frame errors or systematic non-response) should have worse tracking performance than unbiased estimates

Page 20

Correlation with reference series

• >75% green, <50% red

• Correlation of EU/EA aggregates higher except for retail

• ESI more strongly correlated with GDP than CIs with sector reference series

• CIs in indu, serv and buil on average better than cons, reta (can also point to worse CI composition)

• Some countries fare much better than others

• There should be some positive correlation between broad sector CIs and respective output data

• Link to sample sizes?

Correlation with reference series (%)

            esi   indu   serv   cons   buil   reta
EU           92     87     92     86     71     45
EA           91     86     91     80     74     34
BE           44     65     57     14     28      5
CZ           83     78     71     59     75     63
DE           81     83     35     31     85     37
ES           93     75     75     85     75     84
FR           79     71     81     59     79     40
IT           78     75     75     61     62      3
HU           86     79     87     78     72     40
NL           85     57     67     67     93     75
AT           77     81     74     15    -16     11
PL           73     62     75     62     88     63
PT           90     55     80     84     89     87
SI           67     84     83     37     84     50
SK           67     63     51     57     69     49
FI           81     78     51     51     68     25
UK           80     70     82     53     56     71
average      77     67     73     57     64     57

Rows with empty cells on the slide (values in source order; columns not recoverable from this transcript): BG (no values); DK 54 48 40; EE 79 72 89 34 75; IE 89; EL 85 73 85 74; CY 83 80 74 15 79; LV 85 68 92 90 88; LT 75 56 93 91 72; LU 70 44 34 7; MT 51 48 43; RO 92 34 79 75 89; SE 75 76 64 70 68

Page 21

Correlations vs. sample sizes

• Some visual correspondence, but not significant

• Non-sampling errors likely more important (frames, systematic non-response, …)
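
A rough numerical check of this link is possible from the tables in this deck. The sketch below is restricted to the industry column and to the 15 countries whose rows are complete in both the effective sample size table and the correlation table, so that the column assignment is unambiguous. That restriction and the use of a plain Pearson coefficient are choices made here; the original plot may have pooled all sectors.

```python
from statistics import mean

# country: (effective industry sample, correlation in % of the industry CI
# with its reference series), copied from the two tables.
industry = {
    "BE": (1205, 65), "CZ": (850, 78), "DE": (2660, 83), "ES": (1361, 75),
    "FR": (3000, 71), "IT": (3977, 75), "HU": (375, 79), "NL": (1120, 57),
    "AT": (435, 81), "PL": (3395, 62), "PT": (994, 55), "SI": (684, 84),
    "SK": (601, 63), "FI": (560, 78), "UK": (495, 70),
}

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

sizes = [n for n, _ in industry.values()]
tracking = [c for _, c in industry.values()]

# Close to zero means sample size alone explains little of the
# differences in tracking performance across these countries.
r = pearson(sizes, tracking)
print(f"sample size vs tracking performance (industry): r = {r:.2f}")
```

With only 15 observations from one sector, the coefficient is indicative at best.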

Page 22

Conclusions (1)

• Few institutes provide info about the sampling error of their estimates

• Need some measure of volatility / the margin of error around the balance results

• Here we looked at MCDs instead

• Strong differences across countries w.r.t. sample sizes, smoothness/volatility and tracking performance (these are already aggregated indicators for total sectors and combining several questions!)

Page 23

Conclusions (2)

• Which factors are driving these differences?

• Volatility: sample size and other characteristics of sampling method, …

• Tracking performance/bias: inappropriate frame, treatment of non-response, …

• Need to develop a common framework for the assessment…

• … with a view to a necessary improvement of the results in some cases

• We propose to set up a Taskforce "BCS quality assessment framework"

Page 24

Thanks for your attention