New Resources in the Research Data Archive
Doug Schuster
Topic Outline
• New Resources
– Search/Discovery and Data Delivery
– TIGGE
– JRA-25
• Routine Updates
Data Search, Discovery and Delivery
• Popular Datasets
• Google Style Search
• Drill Down Style Search
• File Level Metadata
Example: Search for model-generated tropical cyclone track data using the “Drill Down” method.
Data Search, Discovery, and Delivery
Data Search, Discovery, and Delivery (Drill Down)
Data Search, Discovery, and Delivery (File Level Metadata)
Background on TIGGE
WMO World Weather Research Programme THORPEX
– THe Observing system Research and Predictability EXperiment
– THORPEX Interactive Grand Global Ensemble (TIGGE) Archive supports research
• Grand Ensemble = multiple NWP centers' ensembles are combined (an ensemble of ensembles)
• 10 international NWP Centers contributing to TIGGE
Background on TIGGE
Three mirrored archive centers
• NCAR
• ECMWF
• CMA
{Shared System Development!}
• Daily Data Flow Metrics
– 245 GB
– 1.6 million gridded fields as separate data packets
– 3000+ files/day
Data Receipt
[Diagram: data flow from the current data providers to the archive centres. Labels shown: NCAR, NCEP, CMC, UKMO, ECMWF, MeteoFrance, JMA, KMA, CMA, BoM, CPTEC, NCDC. Transfer methods shown: IDD/LDM, HTTP, FTP.]
Unidata IDD/LDM
– Internet Data Distribution / Local Data Manager
– Commodity internet application to send and receive data
Archive Summary
• Online Data
– Period, most recent two weeks
– ~4 TB, public products
– ~2 TB, data preparation, subsetting, DB
• Offline Data
– Full period of record
– ~200 TB, NCAR MSS system
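As a rough consistency check, the daily inflow quoted on the TIGGE background slide (245 GB/day) can be multiplied by the two-week online window. This is only a back-of-the-envelope sketch; the decimal GB-to-TB conversion is an assumption, and the function name is illustrative.

```python
# Back-of-the-envelope check: how much data does a two-week online window hold?
DAILY_INFLOW_GB = 245     # daily data flow quoted on the TIGGE background slide
ONLINE_WINDOW_DAYS = 14   # "most recent two weeks"
GB_PER_TB = 1000          # assumption: decimal units


def online_volume_tb(daily_gb=DAILY_INFLOW_GB, days=ONLINE_WINDOW_DAYS):
    """Return the approximate volume held online, in TB."""
    return daily_gb * days / GB_PER_TB


if __name__ == "__main__":
    print(f"{online_volume_tb():.2f} TB")  # 3.43 TB
```

Roughly 3.4 TB of raw inflow is in line with the "~4 TB, public products" figure.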
Major Challenges
• Ensure data receipt, build complete archive
– Exchange manifest files as part of IDD/LDM data transmission between Archive centers
– Verify send, receive
– Automated resend requests for missing fields
• Collate data fields into different file types
• Harvest and hold metadata in MySQL DBs
– Identify location of every field in file set
– Updated often
– Critical for user interface and background data processing
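The manifest check described on this slide can be sketched as a comparison between what the sender says it sent and what actually arrived. This is a minimal illustration, not the archive centers' implementation: the field IDs, the center label, and the request layout are all hypothetical.

```python
# Sketch only: field IDs and the request layout are hypothetical,
# not the actual TIGGE manifest format.

def missing_fields(manifest, received):
    """Return IDs listed in the sender's manifest that never arrived, in manifest order."""
    received_ids = set(received)
    return [field_id for field_id in manifest if field_id not in received_ids]


def build_resend_request(sender, manifest, received):
    """Return a resend request for the sender, or None if nothing is missing."""
    missing = missing_fields(manifest, received)
    if not missing:
        return None
    return {"sender": sender, "resend": missing}


if __name__ == "__main__":
    manifest = ["field-001", "field-002", "field-003"]
    received = ["field-001", "field-003"]
    print(build_resend_request("CENTER_A", manifest, received))
```

Running the comparison after every transmission is what allows the resend requests to be automated rather than triggered by hand.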
Major Challenges
• Access system must accurately display what common parameters are available as users make selections
– Driven by multi-center research (Grand Ensemble)
– Parameters vary between centers
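One way to picture the "common parameters" problem is as a set intersection over the centers a user has selected. The sketch below assumes a simple in-memory inventory; the center labels and the per-center parameter lists are made up for illustration and do not reflect actual TIGGE holdings.

```python
# Sketch only: the inventory below is invented for illustration.

def common_parameters(inventory, centers):
    """Return the parameters available from every selected center."""
    selected = [set(inventory[center]) for center in centers]
    if not selected:
        return set()
    return set.intersection(*selected)


if __name__ == "__main__":
    inventory = {
        "CENTER_A": ["2m temperature", "geopotential height", "parameter_x"],
        "CENTER_B": ["2m temperature", "geopotential height"],
        "CENTER_C": ["2m temperature"],
    }
    print(sorted(common_parameters(inventory, ["CENTER_A", "CENTER_B"])))
    print(sorted(common_parameters(inventory, ["CENTER_A", "CENTER_B", "CENTER_C"])))
```

Each additional center can only shrink the common set, which is why the interface has to recompute the available choices as selections change.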
Variance between centers
[Chart: Spatial Resolution. Number of data providers (0 to 4) at each model resolution: N200, N128, 0.56x0.56, 1.00x1.00, 1.25x0.83, 1.25x1.25, 1.50x1.50]
[Chart: Conforming parameters and ensemble members (# fields, # ensemble members) by center: ECMWF, UKMO, JMA, NCEP, CMA, CMC, BOM, MF, KMA, CPTEC]
[Chart: Forecast Length, Initialization. Forecast length (days) and forecasts/day by center: ECMWF, UKMO, JMA, NCEP, CMA, CMC, BOM, MF, KMA, CPTEC]
Get Forecast Data
NCAR online file archive
• Selection options (Portal or RDA)
– Center(s)
– Date
– File type (sl, pl, etc.)
– Initialization time
– Forecast length
• Download options
– Point and click using browser, one file at a time
– Script to run on local machine
• User and password encrypted ‘wget’ commands
• Background process to access all files
User customized files
• Selection options (Portal)
– Same as for files, plus
– Parameter subsets
– Grid interpolation
– Spatial subsets
– Formats: GRIB2, NetCDF
Delayed Mode, Real Time
Two User Interfaces
User access selection demonstration
Animation, what you will see
– Multiple centers
• (ECMWF, UKMO, NCEP, CMA, CMC, KMA)
– Fields/Parameters
• (Geopotential Height, 2m Temperature)
– Levels
• (500 hPa, Single Level)
– Spatial and temporal ranges
• (Global, 3-days, 12Z initializations, 48 hour forecasts)
– Regridding to common spatial resolution
• (1.5°)
– Output format
• (netCDF)
Sample Data Request for an Event
Retrieve Completed Subset
Subset Request Animation
Gustav/Hanna Animation
Features of JRA-25/JCDAS at NCAR
All data available through web/RDA portal and NCAR MSS, 11 TB
• Available dates, 1979 through 2007
• 23 different data products
– 4 x daily, GRIB1 format
– Monthly mean, netCDF (NCAR derived from binary) format
• All data users are registered and must agree to JMA’s ‘Condition of Use’
Typhoon Sepat, 16 August 2007
Images courtesy Dave Stepaniak
Routine Updates
• NCEP
– FNL Global Tropospheric Analysis (Daily)
– BUFR/PREPBUFR obs. data (Weekly)
• Unidata IDD data (Daily)
– NetCDF format obs collected from GTS
– IDD model data (GRIB-2)
• GFS
• NAM
• RUC
Routine Updates
• SST
– NCEP OI Global SST 1x1 Deg (weekly)
– NOAA OI Global 0.25 x 0.25 SST (monthly)
– Hadley Centre Global Sea Ice and SST (monthly)
• Reanalysis
– NNR Yearly updates
– NARR Yearly updates
– JRA-25
Questions?
Lessons Learned
• Manifest files and automated resend are critical for a complete archive
• The impact of different contributions from the NWP centers across the archive should not be underestimated
• There are important design considerations to ensure prompt browser interactions
– Caching data from the DB
Lessons Learned
• Computational resource requirements ramp up quickly with multi-dimensional problems
– Dimensions: center, ensemble member, parameter, forecast length, etc.
• Archive file structure choices greatly impact subsetting ability
– TIGGE currently based on synoptic order
– Time-series by parameter could be better?
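The file-structure trade-off can be made concrete with an idealized count of how many files a time-series request touches. The model below is an assumption for illustration only: it treats "synoptic order" as one file per initialization time holding all parameters, and the alternative as one file per parameter holding all times. The real archive splits files further (by center, file type, and so on).

```python
# Idealized model, not the actual TIGGE file layout.

def files_to_open(layout, n_init_times, n_parameters):
    """Count files touched when extracting n_parameters over n_init_times."""
    if layout == "synoptic":
        # One file per initialization time: every time step means another file.
        return n_init_times
    if layout == "by_parameter":
        # One file per parameter: the whole time series sits in a single file.
        return n_parameters
    raise ValueError(f"unknown layout: {layout!r}")


if __name__ == "__main__":
    # One parameter, one initialization per day, for a year.
    print(files_to_open("synoptic", 365, 1))      # 365
    print(files_to_open("by_parameter", 365, 1))  # 1
```

The same model favors synoptic order for requests that want many parameters at a single time, which is why the choice depends on the dominant access pattern.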
Major Challenges
• Limited online storage
– 4 TB, ≅ 2 weeks temporal coverage
– Full archive on NCAR Mass Storage System
• User registration and metrics required
– Accept data policy; for research and education only
– 48 hour delay from forecast initialization time
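The 48 hour delay translates into a simple availability rule: a forecast becomes accessible two days after its initialization time. A minimal sketch, assuming UTC timestamps; the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

ACCESS_DELAY = timedelta(hours=48)  # delay from forecast initialization time


def available_from(init_time):
    """Return the earliest time a forecast with this initialization can be accessed."""
    return init_time + ACCESS_DELAY


def is_available(init_time, now):
    """Return True once the access delay has elapsed."""
    return now >= available_from(init_time)


if __name__ == "__main__":
    init = datetime(2008, 9, 1, 12, tzinfo=timezone.utc)  # a 12Z initialization
    print(available_from(init).isoformat())  # 2008-09-03T12:00:00+00:00
```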