The FRB and XML: National data and International standards San Cannon Federal Reserve Board IASSIST...


The FRB and XML:

National data and International standards

San Cannon
Federal Reserve Board

IASSIST 2005

2

Background:

The Fed is a statistical agency as well as a central bank and regulatory agency.

Lots of data and information are available on the public website.

Statistical data is varied: monthly industrial production indexes (non-financial), daily interest and exchange rates (financial), quarterly financial flows for various sectors of the economy, surveys of small businesses and consumers, etc.

3

The different roles are often competing interests...

Sometimes it seems that the statistical agency role is secondary.

Data are not always easy to find.

Downloads are not customizable.

Example: Trying to extract one industrial production series requires two text files, cutting and pasting, reformatting...

All-or-nothing approach.

Complete? Yes. User friendly? No.

4

Other agencies making great strides:

Bureau of Economic Analysis has wonderful tabling capabilities: www.bea.gov

Bureau of Labor Statistics has query screens, series select screens and frequently requested statistics: www.bls.gov

5

Taking an extra step:

We wanted to build something forward looking; XML was identified early on.

Most flexible and seems to be the trend for the future.

Financial data already heading that way: FinXML, FpML (Financial products Markup Language), MDDL (Market Data Definition Language), XBRL (eXtensible Business Reporting Language)

6

How do we do it?

Build our own XML definitions:
- Pro: would fit our data perfectly
- Con: we'd be the only ones

Use financial definitions:
- Pro: lots of others use them
- Con: we have nonfinancial data

Try SDMX (Statistical Data and Metadata eXchange):
- Pro: designed for time series data
- Con: new kid on the block

7

But nothing goes smoothly at first:

SDMX is based on 'key families' and codelists where every concept can be represented by a code with a corresponding definition in a list:

HBBA  Int. Rate, Official, Discount rate/Base rate
HBCA  Int. Rate, Official, Intra-day loans
SCBA  Indust. Production, Motor vehicles, NSA
SCBB  Indust. Production, Motor vehicles, SA
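A codelist of this kind behaves like a flat lookup table: each code stands for one complete concept, with no structure inside the code itself. A minimal Python sketch, using the four codes from the slide; the names `CL_TOPIC` and `describe` are hypothetical and not part of SDMX:

```python
# Codes and definitions are copied from the slide.
# CL_TOPIC and describe() are illustrative names, not SDMX identifiers.
CL_TOPIC = {
    "HBBA": "Int. Rate, Official, Discount rate/Base rate",
    "HBCA": "Int. Rate, Official, Intra-day loans",
    "SCBA": "Indust. Production, Motor vehicles, NSA",
    "SCBB": "Indust. Production, Motor vehicles, SA",
}


def describe(code, codelist=CL_TOPIC):
    """Return the definition for a code, or fail loudly if it is not listed."""
    try:
        return codelist[code]
    except KeyError:
        raise ValueError(f"{code!r} is not in the codelist") from None


if __name__ == "__main__":
    print(describe("SCBA"))
```

Note that SCBA and SCBB differ only in seasonal adjustment, yet nothing in the codes themselves says so; the relationship lives entirely in the definitions.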

8

We think about data differently

The Fed uses mnemonic series names where each character in our series name has meaning and names are hierarchical.

RIFSPFF_N.B:

R.*: Rate
R.I.*: Rate of interest in money and capital markets
R.I.F.*: Federal Reserve System
R.I.F.S.*: Short-term or money market
R.I.F.S.P.*: Private securities
R.I.F.S.P.FF.: Federal funds
_N.: Not seasonally adjusted
.B: Business (five days, Monday-Friday)

JQI_I02Y3361T3_N.M:

J.*: Indices except of prices
J.Q.*: Production
J.Q.I.: Industrial
_I.*: NAICS-based industry classification
02Y: codes from year 2002
3361.: Motor Vehicle Manufacturing
T: thru
3363: Motor Vehicle Parts Manufacturing
_N.: Not seasonally adjusted
.M: Monthly
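The hierarchy in these names can be made concrete by walking the prefixes of the first segment. The sketch below is only an illustration: the prefix meanings come from the slide, while the parsing rules (frequency after the last dot, underscore-separated segments, adjustment code in the last segment) are assumptions inferred from the two examples, and `decode_series_name` is a hypothetical helper, not the Board's code.

```python
# Meanings copied from the slide; the lookup tables cover only the two examples.
TOPIC_PREFIXES = {
    "R": "Rate",
    "RI": "Rate of interest in money and capital markets",
    "RIF": "Federal Reserve System",
    "RIFS": "Short-term or money market",
    "RIFSP": "Private securities",
    "RIFSPFF": "Federal funds",
    "J": "Indices except of prices",
    "JQ": "Production",
    "JQI": "Industrial",
}
ADJUSTMENT = {"N": "Not seasonally adjusted"}
FREQUENCY = {"B": "Business (five days, Monday-Friday)", "M": "Monthly"}


def decode_series_name(name):
    """Split a mnemonic into topic path, middle segments, adjustment, frequency.

    Assumes the name has at least a topic segment and an adjustment segment.
    """
    body, freq = name.rsplit(".", 1)      # frequency follows the last dot
    segments = body.split("_")            # e.g. ["RIFSPFF", "N"]
    topic, adjustment = segments[0], segments[-1]
    # Every known prefix of the topic contributes one level of the hierarchy.
    path = [
        TOPIC_PREFIXES[topic[:i]]
        for i in range(1, len(topic) + 1)
        if topic[:i] in TOPIC_PREFIXES
    ]
    return {
        "topic_path": path,
        "middle_segments": segments[1:-1],  # left undecoded in this sketch
        "adjustment": ADJUSTMENT.get(adjustment, adjustment),
        "frequency": FREQUENCY.get(freq, freq),
    }


if __name__ == "__main__":
    for series in ("RIFSPFF_N.B", "JQI_I02Y3361T3_N.M"):
        print(series, decode_series_name(series))
```

The two examples already decode to different shapes: the federal funds name has a six-level topic path and no middle segment, while the industrial production name has a three-level path plus an industry segment.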

9

Fitting a square peg in a round hole...

Data represented by a fixed number of concepts are much easier to represent with key family dimensions and attributes:

Q.SCBA.GB.92 → Freq.Topic.Country.BIS code
M.HBBA.US.01 → Freq.Topic.Country.BIS code

Hierarchical relationships and a varying number of concepts make life more difficult; a single key family isn't possible:

JQI_I02YMF_N.M → Topic_Industry_SA.Freq
RIFSPPNA2P2D30_N.B → Topic?_SA.Freq
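The contrast can be shown in a few lines. With a fixed set of dimensions, a key splits positionally into named parts; a Fed mnemonic does not fit the same template. This is a sketch under the assumption that the dot-separated keys above always carry exactly the four dimensions named on the slide; `parse_fixed_key` is a hypothetical helper.

```python
# Dimension names taken from the slide's "Freq.Topic.Country.BIS code" pattern.
DIMENSIONS = ("Freq", "Topic", "Country", "BIS code")


def parse_fixed_key(key, dimensions=DIMENSIONS):
    """Map a dot-separated key onto a fixed, ordered list of dimensions."""
    parts = key.split(".")
    if len(parts) != len(dimensions):
        raise ValueError(
            f"{key!r} has {len(parts)} parts, expected {len(dimensions)}"
        )
    return dict(zip(dimensions, parts))


if __name__ == "__main__":
    print(parse_fixed_key("Q.SCBA.GB.92"))
    try:
        parse_fixed_key("JQI_I02YMF_N.M")  # a Fed mnemonic, not a fixed key
    except ValueError as err:
        print("does not fit:", err)
```

Because the Fed names carry a different number of concepts depending on the topic, no single dimension list like `DIMENSIONS` can describe all of them, which is the reason one key family is not enough.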

10

SDMX only provides a framework:

We still needed to build the actual schemas to describe our data within the SDMX metaschema framework.

Each data release uses its own schema or set of schemas. Each schema is based on a key family used to describe the data.

Currently, our schemas are tailored to meet our data needs.

11

Storage adds further complications:

We need to store data and metadata in a database to be retrieved with queries.

Native XML databases are in their infancy.

We couldn't find many people storing XML-tagged data in relational databases.

12

So what did we end up with?

Data model is hybrid: tree structure flattened to fit codelist setup.

We store the XML as carefully sliced text in a relational database and we can build an index structure that allows us to respond to ad-hoc queries very efficiently, even for large volumes of data.
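The slides do not show the actual tables, so the following is only a sketch of the general idea: keep each series' XML as a text fragment in a relational table, put the dimension codes in ordinary indexed columns, and assemble an extract by concatenating the fragments a query returns. The table, column, and element names are all hypothetical, and SQLite stands in for whatever database the Board uses.

```python
import sqlite3


def build_store():
    """Create an in-memory table of XML fragments keyed by dimension codes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE series_fragment (
               series_name TEXT PRIMARY KEY,
               freq        TEXT NOT NULL,
               sa          TEXT NOT NULL,
               xml_text    TEXT NOT NULL
           )"""
    )
    # The index lets ad-hoc queries on dimension codes avoid scanning the XML.
    conn.execute("CREATE INDEX idx_fragment_dims ON series_fragment (freq, sa)")
    return conn


def add_series(conn, series_name, freq, sa, xml_text):
    conn.execute(
        "INSERT INTO series_fragment VALUES (?, ?, ?, ?)",
        (series_name, freq, sa, xml_text),
    )


def extract(conn, freq, sa):
    """Concatenate the stored fragments that match the requested codes."""
    rows = conn.execute(
        "SELECT xml_text FROM series_fragment "
        "WHERE freq = ? AND sa = ? ORDER BY series_name",
        (freq, sa),
    )
    return "<DataSet>" + "".join(row[0] for row in rows) + "</DataSet>"


if __name__ == "__main__":
    conn = build_store()
    add_series(conn, "RIFSPFF_N.B", "B", "N", '<Series name="RIFSPFF_N.B"/>')
    add_series(conn, "JQI_I02Y3361T3_N.M", "M", "N",
               '<Series name="JQI_I02Y3361T3_N.M"/>')
    print(extract(conn, "M", "N"))
```

The design choice illustrated here is that the database never parses the XML at query time: selection happens on plain columns, and the XML is treated as opaque text until it is delivered.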

13

This kind of structure:

14

Looks like this in SDMX-ML:

<structure:KeyFamily id="CP_OUTST" agency="FRB">
  <structure:Name xml:lang="en">Commercial Paper Outstandings</structure:Name>
  <structure:Components>
    <structure:TimeDimension concept="TIME" codelist="CL_TIME">
      <structure:TextFormat/>
    </structure:TimeDimension>
    <structure:FrequencyDimension concept="FREQ" codelist="CL_FREQ"/>
    <structure:Dimension concept="CP_SA" codelist="CL_CP_SA"/>
    <structure:Dimension concept="CP_IND_TYPE" codelist="CL_CP_IND_TYPE"/>
    <structure:Dimension concept="CP_ORIG" codelist="CL_CP_ORIG"/>
    <structure:Dimension concept="CP_OWN" codelist="CL_CP_OWN"/>
    <structure:Dimension concept="CP_NSASC" codelist="CL_CP_NSASC"/>
    <structure:Attribute concept="UNIT" codelist="CL_UNIT" attachmentLevel="Group" assignmentStatus="Mandatory"/>
    <structure:Attribute concept="UNIT_MULT" codelist="CL_UNIT_MULT" attachmentLevel="Group" assignmentStatus="Mandatory"/>
    <structure:Attribute concept="OBS_STATUS" codelist="CL_OBS_STATUS" attachmentLevel="Observation" assignmentStatus="Mandatory"/>
    <structure:Attribute concept="SERIES_NAME" attachmentLevel="Series" assignmentStatus="Mandatory" />
    <structure:Attribute concept="DESCRIPTION" attachmentLevel="Series" assignmentStatus="Conditional" />
  </structure:Components>
</structure:KeyFamily>
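A key family definition like this one can be read with any XML parser to list its dimensions and mandatory attributes. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the definition. The slide does not show the namespace declaration for the `structure` prefix, so the URI here is a placeholder assumption, and `summarize_key_family` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace: the slide omits the real declaration for "structure".
NS = {"structure": "urn:example:sdmx-structure"}

# Shortened copy of the CP_OUTST key family from the slide.
FRAGMENT = """
<structure:KeyFamily xmlns:structure="urn:example:sdmx-structure"
                     id="CP_OUTST" agency="FRB">
  <structure:Name xml:lang="en">Commercial Paper Outstandings</structure:Name>
  <structure:Components>
    <structure:FrequencyDimension concept="FREQ" codelist="CL_FREQ"/>
    <structure:Dimension concept="CP_SA" codelist="CL_CP_SA"/>
    <structure:Dimension concept="CP_IND_TYPE" codelist="CL_CP_IND_TYPE"/>
    <structure:Attribute concept="UNIT" codelist="CL_UNIT"
        attachmentLevel="Group" assignmentStatus="Mandatory"/>
    <structure:Attribute concept="DESCRIPTION"
        attachmentLevel="Series" assignmentStatus="Conditional"/>
  </structure:Components>
</structure:KeyFamily>
"""


def summarize_key_family(xml_text):
    """List the plain dimensions and the mandatory attributes of a key family."""
    root = ET.fromstring(xml_text)
    components = root.find("structure:Components", NS)
    dimensions = [
        (dim.get("concept"), dim.get("codelist"))
        for dim in components.findall("structure:Dimension", NS)
    ]
    mandatory = [
        attr.get("concept")
        for attr in components.findall("structure:Attribute", NS)
        if attr.get("assignmentStatus") == "Mandatory"
    ]
    return {
        "id": root.get("id"),
        "name": root.findtext("structure:Name", namespaces=NS),
        "dimensions": dimensions,
        "mandatory_attributes": mandatory,
    }


if __name__ == "__main__":
    print(summarize_key_family(FRAGMENT))
```

Each `Dimension` points at a codelist, so the key family is effectively the list of lookup tables a series key must be checked against.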

15

Which gets stored like this:

16

And the end result?

The Data Download Project (DDP) is the largest, most complex application on the Board's public website.

It's also the first production application to deliver customized data extracts in SDMX format.

And now...

Version 1.0!

17-35

[Slides 17 through 35 contain no transcribed text.]

36

Next steps...

Test performance and verify server load capabilities.

Polish interface, do usability testing and verify compliance with Section 508 regulations.

Long run: work with other central banks on a common schema framework.

Release on the unsuspecting public! Target: third quarter 2005

37

The last slide...

Questions? Comments?

Thank you for your attention!

San Cannon

[email protected]@frb.gov

(202) 452-3710