Growing and Future Datasets in the SCD Research Data Archives for NSF SCD Review Panel 16 October 2001 Steven Worley Scientific Computing Division Data Support Section

Growing and Future Datasets in the SCD Research Data Archives

for

NSF SCD Review Panel

16 October 2001

Steven Worley

Scientific Computing Division
Data Support Section

Outline

• Introduction
• Extant Growing Archives
• Data Archive Assistance for the Research Community
• Future Archives

Introduction

• Components of a good archive
  – Maintained in a reliable system (MSS)
  – Clear and concise information interface
    • Complete discovery metadata
  – Convenient data access for many users
    • Local computing platforms
    • Transfer to remote computing platforms
  – Consultants for assistance
    • Guidance to the best products, sometimes within multi-product complex collections

• Components, continued
  – Underlying archive with rich content
    • Many historical reference datasets
      - have 100's of these, but not discussed here
    • Relevant new and frequently updated datasets

Focus today: Growing and Future datasets

Global Observations

Dataset | P.O.R. | # Yrs | Incep. Date | Comments
Rawinsondes | 1946-on | 55 | 1967 | Upper Air
Pibals | 1942-on | 59 | 1973 | Upper Air, wind
Aircraft | 1947-on | 52 | 1973 | USAF and Commer.
Sat. cloud wind drift | 1967-on | 34 | 1973 | GOES and GTS
Satellite Soundings | 1969-92 | 25 | 1973 | TOVS + irradiance
Surface Synoptic | 1948-on | 53 | 1975 | some much older
Ocean Surface | 1794-on | 203 | 1981 | COADS

Usages:

• Input for global atmospheric reanalysis

• Basic long term climate assessment and case studies

Operational and Composite Analyses

Product | P.O.R. | Comments

U.S. Analyses for the N. H. (Early Operational outputs and composites)
Daily SLP Analysis | 1889-on | Composite of data sources, 2 x daily later period
Selected Early Analyses | 1946, 1950-on | 700mb, 500mb, 300mb
NMC Oper. Analysis | 1962-on | Z & T @ 10mb - sfc. (11 lev)

Global Operational Analyses
NCEP/NMC | 1976-on | Many levels and variables
ECMWF | 1980-on | Many levels and variables

Special Analyses
Australian | 1972-1992 | Discontinued
FNOC (U.S. Navy) | 1973-1993 | Discontinued

• Special analyses were discontinued when global operational analyses became very good

• Daily SLP is a small but very popular dataset, e.g. NAO evaluations

ECMWF Global Operational Analyses

Data Product | Period of Record | Temporal Res. | Spatial Res. (dg) | Update Cycle | # Levs. | # Vars. | Major Variables
Upper Air | 1985-06/2001 | 6 hr | ~1.125 | 6 mn | 21 | 8 | z, t, wind, rh
Surface | 1985-06/2001 | 6 hr | ~1.125 | 6 mn | 1 | 47 | p, t, wind, soil.t, soil.moist.
Supplemental | 1985-06/2001 | 6 hr | ~1.125 | 6 mn | - | 16 | rad., stress, heat.flux, clouds
Extension | 1991-06/2001 | 6 hr | ~1.125 | 6 mn | - | 18 | precip, heat.flux
Sfc/Up.Air Low Resolution | 1985-06/2001 | 12 hr | 2.5 | 1 mn | 21+ | 14 | sfc.t, sfc.p, z, t, wind, rh
Sfc/Up.Air Low Resolution † | 1985-06/2001 | 1 mn | 2.5 | ~1 mn | 21+ | 14 | sfc.t, sfc.p, z, t, wind, rh

† Computed by the SCD/DSS

Highlights
• up to date, 1985 – June 2001
• different temporal resolutions, 6 hr to 1 mn
• different spatial resolutions, ~1 degree to 2.5 degree
• many atmospheric levels and variables

Details and Drawbacks

• Distribution restriction: U.S. non-profits and UCAR members only.

• Cost, increasing and unpredictable
  – $11K in 1999, $16K in 2000, $19K in 2001

• We get only modest resolution (T106, N80). T319 and N256 are available, but again cost is an issue.
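
To put numbers on "increasing and unpredictable", the year-over-year change can be worked out from the three cost figures on this slide. This is only a small arithmetic sketch; the variable and function names are illustrative and not part of any SCD tool.

```python
# ECMWF data cost by year, in thousands of US dollars (figures from the slide).
costs_k = {1999: 11, 2000: 16, 2001: 19}


def yearly_increase_pct(costs):
    """Return {year: percent increase over the previous listed year}."""
    years = sorted(costs)
    return {
        later: 100.0 * (costs[later] - costs[earlier]) / costs[earlier]
        for earlier, later in zip(years, years[1:])
    }


increases = yearly_increase_pct(costs_k)
total_pct = 100.0 * (costs_k[2001] - costs_k[1999]) / costs_k[1999]

for year, pct in increases.items():
    print(f"{year}: +{pct:.1f}% over the previous year")
print(f"1999 to 2001: +{total_pct:.1f}%")
```

The two yearly steps differ a lot in size, which is the "unpredictable" part of the bullet.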

NCEP Operational Analyses

Data Product | Period of Record | Temporal Res. | Spatial Res. (dg) | Update Cycle | # Levs. | # Vars. | Major Variables
Final Analysis Global 2.5 | 1976-08/2001 | 6 hr | 2.5 | 1 mn | 11+ | 15 | z, t, wind, rh, sfc.t, sfc.p
Final Analysis Global 1.0 | 09/1999-today | 6 hr | 1.0 | Daily (FTP) | 26+ | 71 | z, t, wind, rh, vorticity, sfc.t, sfc.p
ETA-3D N. America | 05/1995-07/2001 | 6 hr | 40 (km) | 1 mn | 26+ | 5 | z, t, wnd, sh, precip (forecast)
ETA-Surface N. America | 05/1995-07/2001 | 6 hr | 40 (km) | 1 mn | - | 12 | wind, sfc.p, sfc.t, soil.t, soil.p

LFM (1971-1995) and NGM (1984-cont), N. America, 190km and 6 hr resolution, are available but ETA is considered a superior replacement.

Highlights

• Very current, FNL 1.0 is done daily

• High resolution N. America, ETA at 40km

• No cost or distrib. restrictions from NCEP - GREAT
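
As a rough illustration of what the move from the 2.5 degree to the 1.0 degree Final Analysis means in grid size, the sketch below counts points per level. It assumes a regular global latitude-longitude grid that includes both poles; that layout is an assumption made for illustration, not something stated on the slide, and `latlon_grid_shape` is a hypothetical helper.

```python
def latlon_grid_shape(step_deg):
    """Point counts (n_lon, n_lat) for a regular global grid that includes both poles."""
    n_lon = round(360 / step_deg)      # longitudes wrap, so no duplicate at 360
    n_lat = round(180 / step_deg) + 1  # latitudes run from -90 to +90 inclusive
    return n_lon, n_lat


for step in (2.5, 1.0):
    n_lon, n_lat = latlon_grid_shape(step)
    print(f"{step} degree grid: {n_lon} x {n_lat} = {n_lon * n_lat} points per level")
```

Under that assumption the 1.0 degree grid carries roughly six times as many points per level as the 2.5 degree grid.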

Reanalyses

Product | P.O.R. | # Yrs | Incep. Date
NCEP/NCAR Reanalysis I | 1948-06/2001 | 53 | 1994
ECMWF ERA-15 | 1979-1993 | 15 | 1994
NCEP Reanalysis II | 1979-06/2001 | 22 | 1998

Notes:

• ERA-15 is terminal, ERA-40 is under development now

• NCEP II, experimental run (testing new data and schemes)

NCEP/NCAR Global Atmospheric Reanalysis

Data Product | Period of Record | Temporal Res. | Spatial Res. (dg) | Update Cycle | # Levs. | # Vars. | Major Variables
Analysis on Pressure Sfc. | 1948-6/2001 | 6 hr | 2.5 | 1-2 mn | 17 | 7 | u, v, z, t, rh
Analysis on Sigma Sfc. | 1948-6/2001 | 6 hr | 192x94 Gaussian | 1-2 mn | 28 | 6 | u, v, t, sph, rel.vort
Analysis on Theta Sfc. | 1948-6/2001 | 6 hr | 2.5 | 1-2 mn | 11 | 10 | N**2, ab.vort, u, v, t, rh, pot.vort
Surface Flux Fields | 1948-6/2001 | 6 hr | 2.5 | 1-2 mn | - | 12 | clouds, rad.flx, soil.moist, heat.flx, precip
Monthly Mean Anal. P. Sfc. | 1948-2000 | 1 mn | 2.5 | 1-2 mn | 17+ | 36 | u, v, z, t, rh
CD-ROMS | 1953-1999 | 12 hr, 1 day, 1 mn | 2.5 | - | 3-6 | 12 | u, v, z, t, rh, heat.flx, rad.flx, precip

Model QC'ed observations are returned. Forecasts: once every 5 days, 6 hr forecast fields, available out to 8 days.

Outstanding Features
• Three different coordinate surfaces
• Very long analysis
• Unrestricted distribution
• CD-ROMS are very popular

Countries Receiving Reanalysis CD-ROMs

Highlights
• Over 8900 CD-ROMs, 1997-09/2001

• Top 13: U.S. 46%, Japan 11%, (Canada, UK) @ 4%, (Germany, India) @ 3%, (Australia, S.Korea, Spain, Mexico, Norway, Russia, France) @ 2%
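
The shares above can be added up to see how much of the distribution the top 13 countries cover, and turned into rough disc counts. This is a sketch only: the percentages on the slide are rounded, and "over 8900" is treated here as exactly 8900, so the counts are approximate lower bounds.

```python
# Percent of Reanalysis CD-ROMs sent to each of the top 13 countries (from the slide).
shares_pct = {
    "U.S.": 46, "Japan": 11, "Canada": 4, "UK": 4, "Germany": 3, "India": 3,
    "Australia": 2, "S.Korea": 2, "Spain": 2, "Mexico": 2, "Norway": 2,
    "Russia": 2, "France": 2,
}
TOTAL_CDROMS = 8900  # assumption: "over 8900" taken as 8900

top13_pct = sum(shares_pct.values())
rest_pct = 100 - top13_pct  # share left for all other countries
approx_counts = {c: TOTAL_CDROMS * p // 100 for c, p in shares_pct.items()}

print(f"Top 13 countries: {top13_pct}% of CD-ROMs, all others: {rest_pct}%")
for country, count in approx_counts.items():
    print(f"{country}: about {count} CD-ROMs")
```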

[Figure: NCEP/NCAR Reanalysis I, MSS. Unique users per year, 1995-2001, split into NCAR Users, Univ. Users, and Other Users; vertical axis 0 to 200. 2001 is Jan.-Sep. ONLY.]

Other Users, Jan.-Sep. 2001
• 35 Received CD-ROMs
• 80 Custom data orders (FTP, tape)
• 906 Data downloads from the online server
  – 406 taking more than one file (66 GB)

678 Served

Data Archive Assistance for the Research Community

• Who is the Research Community?
  – International
  – University
  – Other UCAR Research Programs

• Why?
  – Helps others achieve goals
  – Provides additional resources at NCAR
  – Can lead to future opportunities

International collaboration

GCIP Model Data Center

High res. atmos. models focused on energy and hydrology cycles – many surface boundary layer data.

GCIP: GEWEX Continental-Scale International Project
GEWEX: Global Energy and Water Cycle Experiment

• Critical data for N. American mesoscale studies
• Complete archive is approx. 1 Terabyte
• JOSS/UCAR has many of the GCIP observations

Model | Temporal Res. | Spatial Res. | Levels | P.O.R.
Eta – NCEP | 3 hr | 40 km | 25 lvs | 5/1995 – 7/2001
MAPS – FSL NOAA | 3 hr | 40 km | 5 lvs | 8/1996 – 7/2001
GEM – Canadian | 6 hr | 41 km | 28 lvs | 4/1997 – 6/2001

University collaboration

MICOM: Miami Isopycnic Coordinate Ocean Model, 1/12th degree, 70N to 28S, 16-20 layers

U. Miami, Ocean Model Data

Forcing | Length | Volume
COADS Clim. Forcing | 6 yrs | 305 Gigabytes
ECMWF Clim. Forcing | 2 yrs | 164 Gigabytes
ECMWF Daily Forcing | 5 yrs (1979-1983) | 415 Gigabytes

Why? Needed help getting the data to users

How?
– Web order interface
– Automatic processes to stage the data from the MSS and create subsets
– Data then staged for FTP pickup
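
The three "How?" steps can be pictured as a small pipeline. The sketch below is purely illustrative and is not the SCD/DSS system: the MSS is replaced by an in-memory dictionary, the FTP pickup area by a temporary directory, and every name (`FAKE_MSS`, `stage_from_mss`, `make_subset`, `stage_for_pickup`, `fulfill_order`) and data value is hypothetical.

```python
import tempfile
from pathlib import Path

# Stand-in for the MSS: file name -> list of (year, value) records. Made-up data.
FAKE_MSS = {
    "model_output.dat": [(1979, 1.0), (1980, 2.0), (1981, 3.0), (1982, 4.0), (1983, 5.0)],
}


def stage_from_mss(name):
    """Copy the requested file from the mass store to working storage."""
    return list(FAKE_MSS[name])


def make_subset(records, first_year, last_year):
    """Keep only the records inside the years the user asked for."""
    return [(y, v) for y, v in records if first_year <= y <= last_year]


def stage_for_pickup(records, pickup_dir, order_id):
    """Write the subset where the user can fetch it (an FTP area in practice)."""
    out = Path(pickup_dir) / f"order_{order_id}.txt"
    out.write_text("\n".join(f"{y} {v}" for y, v in records) + "\n")
    return out


def fulfill_order(order, pickup_dir):
    """Run one web order through staging, subsetting, and pickup."""
    records = stage_from_mss(order["file"])
    subset = make_subset(records, order["first_year"], order["last_year"])
    return stage_for_pickup(subset, pickup_dir, order["id"])


# A web order would arrive as a small request like this one.
order = {"id": 1, "file": "model_output.dat", "first_year": 1980, "last_year": 1981}
pickup_dir = tempfile.mkdtemp()
result_path = fulfill_order(order, pickup_dir)
print(result_path.read_text())
```

Keeping each step as its own function mirrors the slide: the order interface only builds the request, and the automatic processes do the staging and subsetting without staff involvement.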

6-yr Mean T at 5 meters

UCAR research collaboration

GST: GPS Science and Technology Program, leveraging GPS satellites for science

SuomiNet Data from GST

GPS satellite signals at receivers

• Estimate integrated water vapor

• Total electron count (strato.)

Why?
• The GST project is focused on real-time data capture and provision
• We want to preserve the archive for long term studies in the future

Receiver Sites

How?
• GST staff stage data to the MSS
• SCD staff perform archive maintenance and access

UCAR research collaboration

Support real-time data services

• Unidata's (UCAR) role
  – Runs full capacity IDD/LDM application
  – Serves approx. 20 universities downstream

• SCD's role
  – 24 hr x 7 day operation monitoring
  – Routine system backup
  – Data archive backup and maintenance
    • Covers one year, done daily
    • 2.3 GB per day to the MSS
    • Observations, NCEP model data, wind profilers
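
The daily rate and the one-year coverage together give a rough size for this archive. A back-of-envelope sketch, assuming "covers one year" means a 365-day window at a steady 2.3 GB per day (both are assumptions for illustration, as is the decimal GB to TB conversion):

```python
DAILY_GB = 2.3        # from the slide: 2.3 GB per day to the MSS
RETENTION_DAYS = 365  # assumption: "covers one year" read as 365 days

archive_gb = DAILY_GB * RETENTION_DAYS
archive_tb = archive_gb / 1000  # decimal terabytes

print(f"Approximate archive size: {archive_gb:.0f} GB ({archive_tb:.2f} TB)")
```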

Future Data Collection: Observations

• Rescue of African Data (NOAA/FSU)
  – Upper air and surface data
  – Digitize and save magnetic tapes
  – Why? Improve coverage

• Rescue of Russian Ocean Observations (NSF)
  – Digitize six million marine surface data, 1937-1993
  – "Cold War" archive: global coverage, 60's and 70's

• Map reanalysis model QC back onto original data
  – Merge many sources of observations together

• Many others
  – E.g. more global river flow data

Future Data Collection: Reanalyses

• ECMWF's ERA-40, 1957-2001
  – SCD has provided observational datasets
  – Three time periods computed at once, completion approx. late 2002
  – T159 (~82km), 60 levels, and 6 hr resolutions
  – 15 TB of data
  – Cost = $100K, and restricted distribution

Future Data Collection: Reanalyses

• NCEP N. America Mesoscale
  – Probably 30 km, 25-30 levels
  – Starting late 2001, maybe later
  – Based on the ETA model

• Next U.S. Global Reanalysis
  – Based on NCEP experiments and ERA40 results
  – Might be under a combined NASA and NOAA project

Generally: Stay informed about science activities, sometimes through participation, and then collect new data sources as they emerge.

Key Summary Points

• Update many archives to support research
  – Observational collections
  – Operational Analyses
  – Reanalyses

• Collaborations to promote great science
  – International
  – University
  – NCAR/UCAR

• Have plans to collect emerging new data resources