HDF - 1 - Mike Folk National Center for Supercomputing Applications HDF and HDF-EOS Workshop VI...

- 1 - HDF

HDF Update

Mike Folk

National Center for Supercomputing Applications

HDF and HDF-EOS Workshop VI

December 4-5, 2002

- 2 - HDF

Topics

• Who is supporting HDF
• HDF software in 2002
• Other activities of interest

- 3 - HDF

Who is supporting HDF?

• NASA/ESDIS
  – Earth science applications, instrument data
• DOE/ASCI (Accelerated Strategic Computing Initiative)
  – Simulations on massively parallel machines
• NCSA/NSF/State of Illinois
  – HPC and Grid data intensive apps, visualization, user support
  – Atmospheric and ocean modeling environments
• DOE Scientific Data Analysis & Computation Program
  – High performance I/O R&D
• National Archives and Records Administration
  – Small grant to consider HDF5 as an archive format

- 4 - HDF

HDF software in 2002

• Library releases
• Java Products
• Tools
• Compression
• Investigations of Web technologies

- 5 - HDF

HDF4 library

• No releases in 2002
• Release 1.6 planned for May 2003
  – Bug fixes
  – New compilers
    • Intel
    • Portland Group
  – New OS
    • Mac OS X
    • AIX 5.1 64-bit

- 6 - HDF

HDF5 software milestones in 2002

[Figure: timeline by quarter, Q1 '02 to Q4 '02. Rows: Base library, High level library, Java products, Other tools. Milestones shown: base library 1.4.3, 1.4.4, 1.4.5; High level APIs; HDF5 tables; H5im port; H4-H5 conversion library; Java products 1.0, 1.1, 1.2.]

- 7 - HDF

HDF5 library in 2002

• Compilers, configuration, etc.
  – “h5cc” script to simplify compilation of HDF5 programs
  – F90 shared library and C++ supported on Windows
  – Intel C, F90 and C++ on Linux, IA32/64 and Windows
  – Support for zlib 1.1.4
• Performance
  – Added library performance tests
  – Performance improvements
    • Hyperslabs, data conversions, chunking
  – Fewer and larger I/O requests when accessing a file
  – Parallel I/O performance improvements
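The "fewer and larger I/O requests" item can be pictured with a small sketch. The slide does not say how the library achieves this; the Python below only illustrates the general idea of coalescing adjacent byte ranges into one request. The function name `coalesce` and the `max_gap` parameter are hypothetical and are not part of HDF5.

```python
def coalesce(requests, max_gap=0):
    """Merge (offset, length) byte ranges that touch or overlap.

    max_gap is a hypothetical tuning knob: ranges separated by at most
    that many bytes are merged too, trading extra bytes read for fewer
    requests.
    """
    merged = []
    for offset, length in sorted(requests):
        end = offset + length
        if merged and offset <= merged[-1][1] + max_gap:
            # Extend the previous range instead of issuing a new request.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


# Five small reads collapse into two larger ones.
small_reads = [(0, 512), (512, 512), (1024, 512), (4096, 256), (4352, 256)]
print(coalesce(small_reads))
```

A nonzero `max_gap` would also merge the two resulting ranges if the gap between them were small enough to be worth reading through.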

- 8 - HDF

Parallel HDF5

• Parallel I/O performance benchmark suite
  – Compares raw I/O, MPI-I/O, and HDF5 I/O
  – Distributed with HDF5
  – http://hdf/RFC/PIO_Perf/PHDF5_performance.html
• Parallel HDF5 tutorial
  – http://hdf.ncsa.uiuc.edu/HDF5/doc/Tutor/
• “Flexible parallel HDF5” programming model
  – More flexible model for parallel HDF5
• Performance studies and tuning activities

- 9 - HDF

Next major release -- HDF5 1.6

• Release date: Spring 2003
• New format and library features include
  – Compression enhancements, including szip
  – Generic Properties
  – Checksum
  – Dimension scale support (tentative)
• Performance improvements include
  – Chunking & compression
  – Parallel I/O performance benchmark suite

- 10 - HDF

Next major release -- HDF5 1.6

• Flexible parallel HDF5
• Special platforms
  – Large Compaq cluster (Pittsburgh SC)
  – Crays
  – Windows XP
  – Mac
  – Several new compilers (e.g. Intel, Portland Group)
• Documentation
  – New User’s Guide: good draft, first version

- 11 - HDF

High level APIs

• Make HDF5 easier to use
  – More operations per call than the normal HDF5 API
• Encourage standard ways to store objects
  – Enforce standard representation of objects in HDF5

- 12 - HDF

High level APIs

• Lite – done
  – Same as HDF5, but simpler
• Image – done
  – Interprets dataset as image/palette
  – 2-D raster data like HDF4 raster images
• Table – partly done
  – Interprets dataset as “tables” – collections of records
  – Insert, delete records or fields
  – Future: sort and search
• Dimension scale – in the works
• Unstructured grids – in the works
• http://hdf.ncsa.uiuc.edu/HDF5/hdf5_hl/doc/

- 13 - HDF

HDF5 tools activities

- 14 - HDF

HDF Java Products – 2002

• Goal: replace older tools with single viewer/editor
• HDF Java Products
  – Java HDF Interface (JHI) – to access the HDF4 library
  – Java HDF5 Interface (JHI5) – to access the HDF5 library
  – New hdf-object package – understands HDF4 and HDF5
  – HDFView – tool for browsing/editing HDF4 and HDF5
• See demo, brochure, CD, web page
  – http://hdf.ncsa.uiuc.edu/hdf-java-html/

- 15 - HDF

HDFView releases in 2002

• Version 1.0 (Q2): browser for both HDF4 and HDF5
• Version 1.1 (Q3): editor for both HDF4 and HDF5
• Version 1.2 (Q4): all features of old Java tools, plus some new features

HDFView can do as much as JHV and H5View and also includes many new editing features.

http://hdf.ncsa.uiuc.edu/hdf-java-html/hdfview/

- 16 - HDF

H4toH5 Conversion Toolkit

• Goal: support transition from HDF4 to HDF5
• Version 1.0 released in July 2002
• Includes
  – h4toh5 converter
  – h5toh4 converter
  – library of functions for converting HDF4 objects into HDF5 objects
• Download from:
  – http://hdf.ncsa.uiuc.edu/h4toh5/libh4toh5.html
• Mapping specification and FAQ
  – http://hdf.ncsa.uiuc.edu/HDF5/doc/ADGuide/H4toH5Mapping.pdf

- 17 - HDF

Other tools work

• H5import – convert flat files to HDF5 datasets
  – ASCII text file with numeric data (float or integer)
  – Binary file with native floating point data
  – Binary file with native integer data
• hdf4import – souped up version of the old fptohdf
  – Available in hdf4r1.6
• HDF5-to-GIF and GIF-to-HDF5 converters
• H5dump improvements
  – Subsetting
  – Support variable length datatypes including strings

- 18 - HDF

Other tools work

• H5diff
  – Compare the structure and contents of two HDF5 files, and report differences
  – Command line utility like Unix ‘diff’ and older ‘hdiff’
  – Report missing objects, inconsistent size, datatype, etc.
  – Compare values of numeric datasets
  – First beta available January 2003
  – RFC: http://hdf.ncsa.uiuc.edu/RFC/H5diff/h5diff.html
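The checks listed for H5diff can be sketched in plain Python. This is a toy model, not h5diff and not the HDF5 API: each "file" is a hypothetical dict mapping an object path to a (datatype, shape, values) tuple, and `diff_files` is an illustrative name.

```python
def diff_files(a, b):
    """Compare two toy 'files': dicts of path -> (dtype, shape, values)."""
    report = []
    for path in sorted(set(a) | set(b)):
        # Structure check: the object must exist in both files.
        if path not in a or path not in b:
            missing_from = "first" if path not in a else "second"
            report.append(f"{path}: missing from {missing_from} file")
            continue
        (dtype_a, shape_a, vals_a), (dtype_b, shape_b, vals_b) = a[path], b[path]
        if dtype_a != dtype_b:
            report.append(f"{path}: datatype {dtype_a} vs {dtype_b}")
        if shape_a != shape_b:
            report.append(f"{path}: size {shape_a} vs {shape_b}")
        elif dtype_a == dtype_b:
            # Content check: only compare values when type and size agree.
            n = sum(1 for x, y in zip(vals_a, vals_b) if x != y)
            if n:
                report.append(f"{path}: {n} differing value(s)")
    return report


file1 = {"/temp": ("float32", (3,), [1.0, 2.0, 3.0]),
         "/mask": ("int8", (2,), [0, 1])}
file2 = {"/temp": ("float32", (3,), [1.0, 2.5, 3.0]),
         "/flags": ("int8", (2,), [0, 1])}
for line in diff_files(file1, file2):
    print(line)
```

Reporting structural problems before comparing values mirrors the order of the bullets: a value comparison is only meaningful once the objects, datatypes and sizes match.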

- 19 - HDF

Compression

• Szip – fast compression method for EOS data
  – Expect to include in next releases of HDF4 and HDF5
• Shuffling – reorder bytes before compressing
  – Can improve compression ratio
• Performance study – BZIP2 vs gzip compression
  – Study: whether or not to support bzip2 compression
  – Result: BZIP2 not significantly better than gzip
  – So not currently supported in the release
  – But BZIP2 can be used with HDF5
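To see why reordering bytes before compressing can help, here is a minimal sketch using only the Python standard library. It is not the HDF5 filter code: `shuffle` and `unshuffle` are illustrative names, and the sample data (slowly increasing 4-byte integers) is an assumption chosen to show the effect.

```python
import struct
import zlib


def shuffle(data: bytes, elem_size: int) -> bytes:
    """Group byte 0 of every element, then byte 1, and so on."""
    assert len(data) % elem_size == 0
    return b"".join(data[i::elem_size] for i in range(elem_size))


def unshuffle(data: bytes, elem_size: int) -> bytes:
    """Invert shuffle(): interleave the byte planes back into elements."""
    n = len(data) // elem_size
    planes = [data[i * n:(i + 1) * n] for i in range(elem_size)]
    return bytes(b for elem in zip(*planes) for b in elem)


# Assumed sample: 10,000 slowly increasing 32-bit little-endian integers.
values = list(range(1000, 11000))
raw = struct.pack(f"<{len(values)}i", *values)

plain = zlib.compress(raw)
shuffled = zlib.compress(shuffle(raw, 4))
print("unshuffled:", len(plain), "bytes; shuffled:", len(shuffled), "bytes")
```

In data like this the high-order bytes of neighbouring values are nearly identical, so grouping them produces long runs that a general-purpose compressor handles well. On data without that regularity, shuffling may not help.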

- 20 - HDF

Investigations of Web technologies

- 21 - HDF

HDF5 XML

• Great interest in XML, interoperation of XML and binary formats
• Results
  – HDF5 DTD
  – h5dump –XML
  – H5View reads XML and writes HDF5
• Studies, design notes, other info
  – http://hdf.ncsa.uiuc.edu/HDF5/XML/
• Possible future activity:
  – XML schema
  – Update tools
  – HDF4 schema, tools
  – Format translation via XSLT

- 22 - HDF

XML, Java Server Pages, etc.

• How to use HDF5 data in Web environment
• Experiments with XML, Java Server Pages (JSP), etc.
  – JSP server
    • Access HDF5 files on Web server using Web browser, or Java applet, or Java application
  – Several variations demonstrated
  – Is not a product!
• http://hdf.ncsa.uiuc.edu/HDF5/XML/

- 23 - HDF

CORBA Experiments

• HDF5 with CORBA on distributed systems
  – Prototype CORBA server to wrap HDF5 library and datasets (C++)
  – Remote access via C++, Java, Web
  – Might be valuable as replacement for Java Native Interface
  – Successful demonstration, but many open issues
  – Is not a product!

http://hdf.ncsa.uiuc.edu/HDF5/XML/JSPExperiments/index.html

- 24 - HDF

Other Activities of Interest

- 25 - HDF

NPOESS

• National Polar-orbiting Operational Environmental Satellite System
  – Combine satellite systems of civil and defense programs
• HDF5 to be used to distribute data to users
• First implementation in 2006
  – Support the NPOESS Preparatory Program
• Later full implementation by 2013
  – Converged system provides global coverage
• http://www.ipo.noaa.gov

- 26 - HDF

Neutron Research Community

• Worldwide research community
  – England, France, Germany, Japan, Italy, Switzerland, Russia
  – US centers at Argonne, NIST, Los Alamos
• Neutron and X-ray scattering experiments and simulations
  – Common software and formats to gather, share, archive, post-process data
• NeXus data format
  – Enforces standardization of metadata and data structures
  – Based on HDF4 for many years
  – Now switching to HDF5
  – http://www.neutron.anl.gov/nexus/

- 27 - HDF

National Archives and Records Administration

• Pilot project for HDF5
• Explore scientific data format requirements for long term archiving of electronic records
• Identify record types for which HDF5 is suited

- 28 - HDF

Atmospheric and Ocean Models

• Modeling Environment for Atmospheric Discovery (MEAD)
• HDF5 for high performance I/O for atmospheric and ocean modeling
  – Weather Research and Forecasting (WRF) model
  – Regional Ocean Modeling System (ROMS)
  – Coupling of WRF and ROMS
• UAH ESML & data mining also involved

- 29 - HDF

HDF5 Mesh API prototype

• Support for structured and unstructured “mesh” data
• For applications such as computational fluid dynamics, finite element analysis, and visualization
• A higher-level API
• Format
  – HDF5 groups and datasets to organize the data
• Collaboration involving NCSA, CEI and others
• Documentation still pretty sketchy, but see
  – ftp://ftp.ensight.com/pub/HDF_RW/hdf_rw.tgz
• Discussion list in the works

- 30 - HDF

HDF5 Wins 2002 R&D Magazine Award

“The 100 products and processes that are the most ‘technologically significant’ and can change people's lives for the better”

http://www.ncsa.uiuc.edu/News/Access/Releases/020722.HDF5.html

- 31 - HDF

Thank you!

Information Sources

• HDF website
  – http://hdf.ncsa.uiuc.edu/
• HDF5 Information Center
  – http://hdf.ncsa.uiuc.edu/HDF5/
• HDF Helpdesk
  – [email protected]
• HDF users mailing list
  – [email protected]

- 32 - HDF

Backup slides

- 33 - HDF

HDF5 funding sources

• NASA: $588,000 (37%)
• ASCI: $495,000 (31%)
• NSF: $225,553 (14%)
• State of IL: $162,750 (10%)
• DOE SciDAC: $70,000 (4%)
• Other: $60,000 (4%)
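The percentages in the chart follow from the dollar amounts. As a quick check, this short Python sketch recomputes each share from the figures on the slide. The variable names are illustrative.

```python
# Dollar amounts as listed on the slide.
funding = {
    "NASA": 588_000,
    "ASCI": 495_000,
    "NSF": 225_553,
    "State of IL": 162_750,
    "DOE SciDAC": 70_000,
    "Other": 60_000,
}

total = sum(funding.values())
# Share of the total for each source, rounded to a whole percent.
shares = {source: round(100 * amount / total) for source, amount in funding.items()}

print(f"total: ${total:,}")
for source, pct in shares.items():
    print(f"{source}: {pct}%")
```

The amounts sum to $1,601,303, and the rounded shares reproduce the 37%, 31%, 14%, 10%, 4% and 4% shown in the chart.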

- 34 - HDF

HDF5 User Community

• Worldwide use in government, academia, industry
• How many users?
  – 450 organizations or individuals have filled in “user” form in the past year
  – There are many times this many anonymous users
  – And some organizations have thousands of users (e.g. the Earth Observing System)
• Public applications
  – More than 25 publicly available applications
  – Four vendors so far
    • LabVIEW
    • IDL
    • EarthScan Network
    • HDF Explorer
    • Others in the works (e.g. Matlab)

- 35 - HDF

Technical fields that use HDF5

• Aerospace
• Agricultural research
• Air traffic control
• Aircraft emissions database
• Applied mathematics
• Astrophysics
• Astrophysics / supernovae
• Atmospheric chemistry
• Atmospheric physics
• Bioengineering
• CEM Simulation
• Climatology / hydrology
• Computational fluid dynamics
• Computational physics
• Computational physics / education
• Computational physics and computational astrophysics
• Computer modeling
• Computer science
• Data processing
• Earth observation / atmospheric science
• Earth science
• Environmental science
• Fast searching, sorting and retrieval
• Film making special effects
• Fluid mechanics
• GIS
• Geodetic Science
• Geology
• Gravitational physics
• Hydrology
• Information technology
• Magnetic mass spectrometer development
• Marine biology / ecology
• Materials science
• Meteorological data products
• Meteorology
• Microscopy
• Molecular biology
• Nano device simulation
• Neutron scattering
• Ocean color
• Ocean remote sensing
• Optics / optoelectronics
• Petroleum engineering
• Photonic band gap studies
• Photonic crystals
• Photonics
• Post-fire erosion analysis
• Protein crystallography, molecular modeling
• Protostellar accretion discs
• Remote sensing
• SAR processing
• Satellite / weather radar remote sensing
• Satellite oceanography
• Semiconductor process simulation
• Software engineering, distributed systems
• Space geodesy
• Space physics
• Surface water flow and sediment transport
• Theoretical chemistry
• Visualization
• Volcanology
• Water resources management
• X-ray physics

- 36 - HDF

Users of HDF5 – 66 countries

- 37 - HDF

Next major release -- HDF5 1.6

- 38 - HDF

Next major release -- HDF5 1.6

• Performance improvements
  – Chunking
  – Compression (several)
  – Parallel I/O
  – Metadata I/O
  – Compact dataset storage
• Other parallel
  – Parallel I/O performance benchmark suite
  – Flexible parallel HDF5

  – Portland group C, Fortran 90 and C++ compilers
  – Quite a bit of Fortran work

- 39 - HDF

Next major release -- HDF5 1.6

• Testing (several)
• Special platforms
  – PSC cluster
  – Cray
  – Windows XP
  – Mac
  – Several new compilers (e.g. Intel, Portland Group)
• Documentation
  – New User’s Guide: good draft, first version

- 40 - HDF

HDF5 High Level APIs – HDF5 Image

• For datasets to be interpreted as images/palettes
  – 2-D raster data like HDF4 raster images
• Image operations
  – Create, write, read, query
• Based on “HDF5 Image & Palette Specification”

- 41 - HDF

HDF5 High Level APIs – HDF5 Table

• For datasets to be interpreted as “tables”
  – A collection of records
  – All records have the same structure
  – Like Vdatas in HDF4, but more operations
• Table operations
  – Create, write, read, query
  – Insert, delete records or fields
  – Future: sort and search
  – Includes the following new Table functions:

- 42 - HDF

HDF5 High Level APIs – HDF5 Table

• For datasets to be interpreted as “tables”
  – A collection of records
  – All records have the same structure
  – Like Vdatas in HDF4, but more operations
• Table operations
  – Create, write, read, query
  – Insert, delete records or fields
  – Later: sort and search
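The table model described here (records that all share one structure, with insert and delete operations on records or fields) can be sketched as a small in-memory class. This is a plain-Python illustration only. It does not use or mirror the HDF5 Table API, and the class and method names are hypothetical.

```python
class Table:
    """Toy in-memory table: a list of records that all share the same fields."""

    def __init__(self, fields):
        self.fields = list(fields)
        self.records = []

    def insert_record(self, index, record):
        # Every record must have exactly the table's fields.
        if set(record) != set(self.fields):
            raise ValueError("record does not match the table's fields")
        self.records.insert(index, dict(record))

    def delete_records(self, start, count):
        del self.records[start:start + count]

    def insert_field(self, name, default):
        # Adding a field changes the structure of every existing record.
        self.fields.append(name)
        for record in self.records:
            record[name] = default

    def delete_field(self, name):
        self.fields.remove(name)
        for record in self.records:
            del record[name]


table = Table(["station", "pressure"])
table.insert_record(0, {"station": "A", "pressure": 1013.2})
table.insert_record(1, {"station": "B", "pressure": 1009.8})
table.insert_record(1, {"station": "C", "pressure": 1011.5})
table.delete_records(0, 1)             # drop station A
table.insert_field("temperature", 0.0)
print(table.fields, table.records)
```

Field operations touch every record, which is why a table layer needs them as explicit operations rather than leaving them to the caller.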

- 43 - HDF

HDF5 High Level API – Future

• Dimension scales
  – Similar to HDF4
  – In progress
• More table operations
  – Sort and search
• Unstructured grids
  – E.g. triangle mesh

- 44 - HDF

Szip Compression Software

• Implements CCSDS lossless compression algorithm
• Fast compression method for EOS data
• Expect to include in next releases of HDF4 and HDF5
  – HDF4: compress SDS and image
  – HDF5: compress datasets
• Intellectual property issues
  – Owned by U of Idaho (formerly U of New Mexico)
  – Open source
  – No commercial use of encoder without license
  – Decoder free for everyone

- 45 - HDF

Performance study – BZIP2 compression

• Goal: decide whether or not to support bzip2 compression
• Compared bzip2 and gzip
• Observations
  – Bzip2 always better than gzip in compression ratio
  – But the difference was just a few percentage points
  – And bzip2 always takes more processing time, especially for decoding
• Result
  – Not currently supported in the release
  – But BZIP2 can be used with HDF5 (checked with HDF5-1.4.4)
• http://hdf.ncsa.uiuc.edu/HDF5/papers/bzip2/
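A comparison of this kind can be tried on one's own data with the Python standard library. The sketch below is not the study's benchmark: the sample data is synthetic, the `measure` helper is hypothetical, and zlib's deflate stands in for gzip. Results depend heavily on the data, so no particular outcome is implied.

```python
import bz2
import time
import zlib


def measure(name, compress, decompress, data):
    """Time one compress/decompress round trip and report the ratio."""
    t0 = time.perf_counter()
    packed = compress(data)
    t1 = time.perf_counter()
    restored = decompress(packed)
    t2 = time.perf_counter()
    assert restored == data  # both codecs are lossless
    return {
        "codec": name,
        "ratio": len(data) / len(packed),
        "compress_s": t1 - t0,
        "decompress_s": t2 - t1,
    }


# Assumed sample: 20,000 lines of formatted numbers, standing in for real data.
sample = b"".join(b"%08.3f\n" % (i * 0.125) for i in range(20000))

results = [
    measure("gzip (zlib deflate)", zlib.compress, zlib.decompress, sample),
    measure("bzip2", bz2.compress, bz2.decompress, sample),
]
for r in results:
    print(f"{r['codec']}: ratio {r['ratio']:.2f}, "
          f"compress {r['compress_s']:.4f}s, decompress {r['decompress_s']:.4f}s")
```

Measuring ratio and both timings together reflects the trade-off in the observations: a slightly better ratio has to be weighed against extra processing time, especially when decoding.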

- 46 - HDF

New HDFView features

• Display palette in graph as separate RGB lines
• Open file as read-only option
• Create new array from old array
• Import data from text file
• Save to HDF4, HDF5 or binary
• Create new image from subset of existing image
• Modify string-type dataset content
• Convert jpeg to HDF image
• Convert HDF to jpeg image
• More user options and well organized GUI
• Select vdata or compound datatype by field
• Select subset from preview image and using mouse
• Support unlimited dimension when creating new HDF4 dataset
• Enable application of simple math calculations to data
• Support multiple palettes/image
• Create new image with default attributes
• Modify image palette or select predefined palette

- 47 - HDF

CORBA, XML etc. permutations

[Diagram: client/remote components (Java applet, Web browser using HTML and XML, other applications in any language) connected to server/local components (CORBA server in C++, Java server platform, Java Native Interface between Java and C, H5view etc., HDF library and file). Legend: Distributed Product; Demonstrated in Research; Should work, but not demonstrated.]

- 48 - HDF

National Polar-orbiting Operational Environmental Satellite System (NPOESS)

U.S. civil and defense programs to combine weather data collection, expanding to global coverage and long-term continuity of observations at less cost!

Distribute in HDF5

• Today: 4-Orbit System
  – 2 US Military
  – 2 US Civilian
• Tomorrow (2005)
  – 2 US Military
  – 1 US Civilian
  – 1 EUMETSAT/METOP
• Future (2013)
  – 2 US Converged
  – 1 US “Lite”
  – 1 EUMETSAT/METOP
  – Specialized Satellites

[Figure: three orbit diagrams, one per period, placing DMSP, POES, METOP, NPOESS and NPOESS Lite satellites by local equatorial crossing time (0530, 0730, 0830, 0930, 1330).]