IMF Approach to Storing Metadata with Macroeconomic Statistics

UNECE Workshop on the Common Metadata Framework (Vienna, Austria, 4-6 July 2007)


Page 1: IMF Approach to Storing Metadata with Macroeconomic Statistics

IMF Approach to Storing Metadata with Macroeconomic Statistics

UNECE Workshop on the Common Metadata Framework

(Vienna, Austria, 4-6 July 2007)

Page 2: IMF Approach to Storing Metadata with Macroeconomic Statistics

Dissemination Standards Bulletin Board (DSBB)

• Data standards initiative (SDDS/GDDS)
• Countries’ dissemination practices
• Information that SDDS countries provide the IMF on their dissemination practices
• Direct links to the economic and financial data that countries disseminate under the SDDS
• Information that GDDS countries make available to the IMF on their statistical practices
• http://dsbb.imf.org

Page 3: IMF Approach to Storing Metadata with Macroeconomic Statistics

Collaboration with OECD

Dec 2006 – Agreement to use Dotstat and MetaStore as the basis of the IMF data warehouse

Jan 2007 – Software available on joint Team Foundation Server (TFS)

Feb 2007 – IMF.Stat installed with the assistance of OECD

May 2007 – Loaded International Financial Statistics (IFS), World Economic Outlook (WEO), and Sub-Saharan Africa Regional Economic Outlook (REO)

June 2007 – Signed an MOU that supports a collaborative approach to future enhancements for the mutual benefit of both organizations

Page 4: IMF Approach to Storing Metadata with Macroeconomic Statistics

IMF.Stat Data Model

Data

Data Fact table

CouGrpID  ConceptID  DataSourceID  UnitOfMeasID  TimeFreqID  StatusID  Observation  Flag
100       250        3000          10            25          2         158.1        E

Dimension tables

Table             Key field   ID    ParentID  Code    Label
Country Group     CouGrpID    100   Null      156     Canada
Concept           ConceptID   250   200       NGDP    Gross ...
DataSource        DatSrceID   3000  Null      WEO     World ...
Unit Of Measure   UofMID      10    Null      N       Nat Curr
Time & Frequency  TimeFreqID  25    Null      200401  2004 Q1
Status            StatusID    2     Null      SHARE   Shareable

Referential Metadata

Metadata Fact table

CouGrpID  ConceptID  DataSourceID  UnitOfMeasID  TimeFreqID  StatusID  MetadataID
100       250        3000          10            25          2         5487

(The slide repeats the same six dimension tables, with the same example rows, beside the Metadata Fact table.)

Metadata

MetadataID  Text
5487        Chain-linked GDP volume measures are expressed in ...
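The data model on this slide can be sketched with an in-memory SQLite database. This is only an illustration, not the actual IMF.Stat schema: the table names (`data_fact`, `metadata_fact`, and so on) and column types are assumptions, only two of the six dimension tables are created, and the rows are the example values from the slide. It shows how an observation and its referential metadata can be brought together because both fact tables carry the same dimension keys.

```python
import sqlite3

# In-memory database; table names and column types are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country_group (CouGrpID INTEGER PRIMARY KEY, ParentID INTEGER, Code TEXT, Label TEXT);
CREATE TABLE concept (ConceptID INTEGER PRIMARY KEY, ParentID INTEGER, Code TEXT, Label TEXT);
CREATE TABLE metadata (MetadataID INTEGER PRIMARY KEY, Text TEXT);
CREATE TABLE data_fact (
    CouGrpID INTEGER, ConceptID INTEGER, DataSourceID INTEGER,
    UnitOfMeasID INTEGER, TimeFreqID INTEGER, StatusID INTEGER,
    Observation REAL, Flag TEXT);
CREATE TABLE metadata_fact (
    CouGrpID INTEGER, ConceptID INTEGER, DataSourceID INTEGER,
    UnitOfMeasID INTEGER, TimeFreqID INTEGER, StatusID INTEGER,
    MetadataID INTEGER);

-- Example rows taken from the slide.
INSERT INTO country_group VALUES (100, NULL, '156', 'Canada');
INSERT INTO concept VALUES (250, 200, 'NGDP', 'Gross ...');
INSERT INTO metadata VALUES (5487, 'Chain-linked GDP volume measures are expressed in ...');
INSERT INTO data_fact VALUES (100, 250, 3000, 10, 25, 2, 158.1, 'E');
INSERT INTO metadata_fact VALUES (100, 250, 3000, 10, 25, 2, 5487);
""")

# Join the observation to its labels and to the metadata text that shares
# the same combination of dimension keys.
row = conn.execute("""
SELECT cg.Label, c.Code, d.Observation, d.Flag, m.Text
FROM data_fact AS d
JOIN country_group AS cg ON cg.CouGrpID = d.CouGrpID
JOIN concept AS c ON c.ConceptID = d.ConceptID
JOIN metadata_fact AS mf
  ON mf.CouGrpID = d.CouGrpID
 AND mf.ConceptID = d.ConceptID
 AND mf.DataSourceID = d.DataSourceID
 AND mf.UnitOfMeasID = d.UnitOfMeasID
 AND mf.TimeFreqID = d.TimeFreqID
 AND mf.StatusID = d.StatusID
JOIN metadata AS m ON m.MetadataID = mf.MetadataID
""").fetchone()

print(row)
```

Keeping the metadata text in its own table and linking it through a second fact table means one note can, in principle, be attached to any combination of dimension members without touching the observations themselves.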

Page 5: IMF Approach to Storing Metadata with Macroeconomic Statistics

Structural metadata

• Economic Concepts – mapped as many time series as possible to the Catalogue of Time Series and loaded them to IMF.Stat

• Countries and groups – used the IFS version of country names and codes as the authoritative source for codes and labels

• Unit – chose to combine unit and scale, e.g. millions of US dollars

• Storing data in native units, i.e. not trying to convert observations to a common unit

• Status, Source, and Time and Frequency have been reasonably straightforward so far. They will become more problematic when we introduce versioning.

Page 6: IMF Approach to Storing Metadata with Macroeconomic Statistics

Referential Metadata

• Working through existing metadata from IFS publications and the production system

• Where necessary and possible, cleaning it up, standardizing it, and loading it to MetaStore

• WEO – metadata sourced from the external web site, reformatted, stored in MetaStore, then exported to IMF.Stat

• All referential metadata loaded to MetaStore and then exported to IMF.Stat

Page 7: IMF Approach to Storing Metadata with Macroeconomic Statistics

Data – IFS

All time series that could be mapped to the Catalogue of Time Series (CTS)

• Includes
  – Exchange rates
  – Balance of Payments
  – International Investment Position
  – Real Sector Statistics
  – International Liquidity
  – Money and Banking non-SRF data

• Excludes
  – Government Finance
  – Money and Banking SRF data
  – Fund Accounts

» 191 concepts
» 233 countries
» 39 groups
» 7.6 million observations

Page 8: IMF Approach to Storing Metadata with Macroeconomic Statistics

Data – WEO

WEO
• Two most recent editions
• Includes series published externally as well as other series available internally
• Concept – generally consistent with the CTS
• Country and group – some differences in codes used, so mapped where possible. Some groups added.
• Unit – limited number of units used, mainly consistent across countries

Page 9: IMF Approach to Storing Metadata with Macroeconomic Statistics

Data – REO

Sub-Saharan Africa REO – Structural Metadata

• Concepts – virtually no codes or labels in common with the CTS

• Able to map the series published in the REO, but the supporting series proved too difficult. Now working through them case by case to determine which, if any, map to the CTS.

• Country and group – country codes and labels mainly consistent with WEO. Groups are all different, even though they sometimes have the same label.

• Units – mainly ratios, which were added to the authoritative list

Sub-Saharan Africa REO – Referential Metadata

• Have sourced top-level referential metadata only. Will work with the Africa Department after the data are loaded to identify any usable referential metadata.

Page 10: IMF Approach to Storing Metadata with Macroeconomic Statistics

MetaStore
• Some modifications with assistance from OECD
  – Now includes
    • structural metadata
    • mappings to authoritative lists
    • referential metadata

SchemaLogic
  – In future may integrate structural metadata in MetaStore or replace

Alignment with SDMX
  – Have used 42 ‘types’ to categorize our referential metadata
  – Added one to the OECD set, which are consistent with SDMX

Page 11: IMF Approach to Storing Metadata with Macroeconomic Statistics

Managing metadata within the IMF

• Locate relevant sources of metadata
• Locate potential warehouse content
• Central repositories for data and metadata
• Harmonizing and mapping to a preferred term
• Authoritative lists
• Working with Information Services Division (ISD) to ensure information management best practice
• Assigning data stewards to manage metadata
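A minimal sketch of "harmonizing and mapping to a preferred term" against an authoritative list. Everything named here is hypothetical: the source names (`SOURCE_A`, `SOURCE_B`), the local codes, and the function names are invented for illustration and do not describe how MetaStore stores its mappings. The authoritative codes 156 (Canada) and 111 (USA) are the ones that appear elsewhere in these slides.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Authoritative list: preferred code -> preferred label.
AUTHORITATIVE_COUNTRIES: Dict[str, str] = {"156": "Canada", "111": "USA"}

# Hypothetical per-source mappings from local codes to authoritative codes.
SOURCE_MAPPINGS: Dict[str, Dict[str, str]] = {
    "SOURCE_A": {"CAN": "156", "USA": "111"},
    "SOURCE_B": {"Canada": "156"},
}


def to_preferred(source: str, local_code: str) -> Optional[str]:
    """Return the authoritative code for a source-specific code, or None if unmapped."""
    preferred = SOURCE_MAPPINGS.get(source, {}).get(local_code)
    # Only accept mappings that point at an entry in the authoritative list.
    return preferred if preferred in AUTHORITATIVE_COUNTRIES else None


def harmonize(source: str, local_codes: Iterable[str]) -> Tuple[Dict[str, str], List[str]]:
    """Split local codes into those mapped to a preferred code and those left unmapped."""
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for code in local_codes:
        preferred = to_preferred(source, code)
        if preferred is None:
            unmapped.append(code)  # left for a data steward to resolve case by case
        else:
            mapped[code] = preferred
    return mapped, unmapped


mapped, unmapped = harmonize("SOURCE_A", ["CAN", "USA", "XYZ"])
print(mapped)
print(unmapped)
```

Returning the unmapped codes separately, rather than dropping them, leaves a worklist for whoever is responsible for that dimension.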

Page 12: IMF Approach to Storing Metadata with Macroeconomic Statistics

Governance

• Establishing groups and individuals with certain roles and responsibilities for management of metadata
  – Economic Data Advisory Group
    • Representation from departments across the Fund
    • Includes several working groups with specific focus
  – Information Services Division
    • Responsible for provision of metadata
  – Metadata and Standards team
    • New group in the Statistics Department currently focusing on metadata used in the data warehouse

Page 13: IMF Approach to Storing Metadata with Macroeconomic Statistics

Next Steps

• Changes to work practices across the Fund
• Identify a data steward for each dimension in IMF.Stat
• Standardization, authoritative sources
• Reuse of metadata across systems
• Raise awareness of the value of quality metadata
• Tie together basic schemas

Page 14: IMF Approach to Storing Metadata with Macroeconomic Statistics

EDW Top Level Diagram

[Diagram; only the labels are recoverable]

Data sources: Internal (IFS, WEO); External (DataStream, Haver)

Components: ETL, IMF.Stat, MetaStore (referential and structural metadata), User interface (shown twice), End-users

Dimensions: Concept, Country Group, Data Source, Time & Freq

Legend: Data flow, Referential metadata, Structural metadata

Sample structural metadata: 111 USA, 112 UK, 273 MEX, ...

Sample data records: ...2005;USA,GDP,548.25 ...2004;USA,GDP,526.25
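The sample records in the diagram (`2005;USA,GDP,548.25`) suggest what the ETL step has to do: parse each incoming record and replace source labels with codes from the structural metadata (`111 USA`, `112 UK`, `273 MEX`). The sketch below is a guess at that step, not the actual ETL. The record layout (period, then country, concept, and value) is an assumption read off the two sample lines, and the `Observation` class and `parse_record` function are hypothetical names.

```python
from dataclasses import dataclass

# Country codes as listed in the diagram's structural metadata sample.
COUNTRY_CODES = {"USA": 111, "UK": 112, "MEX": 273}


@dataclass(frozen=True)
class Observation:
    period: str
    country_code: int
    concept: str
    value: float


def parse_record(line: str) -> Observation:
    """Parse one record, assumed to be laid out as 'period;country,concept,value'."""
    period, rest = line.strip().split(";", 1)
    country, concept, value = rest.split(",")
    if country not in COUNTRY_CODES:
        # Unknown labels are rejected rather than loaded with a guessed code.
        raise KeyError(f"no code for country label {country!r}")
    return Observation(period, COUNTRY_CODES[country], concept, float(value))


records = ["2005;USA,GDP,548.25", "2004;USA,GDP,526.25"]
observations = [parse_record(r) for r in records]
for obs in observations:
    print(obs)
```

Rejecting records whose labels are missing from the code list keeps unmapped series out of the warehouse until the structural metadata has been extended.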