NASA HDF/HDF-EOS Data Access Challenges
www.hdfgroup.org
ESIP 2013 Summer Meeting
H. Joe Lee ([email protected]), Kent Yang ([email protected])
The HDF Group
July 9, 2013
Hal Varian, Google’s chief economist
“The ability to take data – to be able to understand it, to process it, to extract value from it, to visualize
it, to communicate it – that’s going to be a hugely important skill in the next decades.”
For Earth Science Data Users
The ability to take NASA HDF/HDF-EOS data – to be able to understand it, to process it, to extract value from it, to visualize it, to
communicate it – that’s a hugely important skill right now.
Is it easy to take NASA HDF data?
No, not for the Average Joe data user.
Understand
“I'm new to IDL and HDF; and I'm currently working with MODIS L1B data. I found your examples very helpful. Is
it possible to show how radiance is calculated?”
Process
“I work in NASA/GSFC GES-DISC on AIRS
project. We have new idl version 8.1.
But got a core dump error when we run EOS function EOS_SW_INQSWATH to
inqure swath name from a AIRS level 2 product file. Need your help. Thanks.”
Extract Values
“Hi, I want to use the following TRMM data, http://mirador.gsfc.nasa.gov/...2A25.... Can you provide me some programs that deal with these datasets so that I can obtain the daily convective precipitation in the region 110-180E,0-40N during 2006?”
Visualize
“Can you please make the matlab file for reading ozone hdf5 files obtained from mls available to the public. I wanted to obtain ozone distribution over the world and ozone distributions with height etc. thank you :)….oh can you tell me which function can i use to plot latitude in the x-axis, pressure in the y-axis and a contour plot of ozone over it?”
Communicate
“Your prog is very helpful to verify my process. I have one more doubt. I am trying to
convert this hdf to Geotiff using Matlab. Do have any written code to do the same. Doing it with HEG tool given an error specifying that 5D are only supported for SOM projections. Also I am doing all processing with Matlab. So could you pl. help me.”
NASA HDF Users See Challenges
in accessing satellite-product-specific (MODIS, AIRS, MLS),
geo-location/time-specific (lat/lon/height/year)
data with their favorite software packages (MATLAB/IDL/ArcGIS).
What Makes Access Challenging?
1. Some files use techniques that end users may
not be familiar with, although these techniques may
help store data efficiently.
2. Information from a source outside the files is
required to retrieve the data in a physically
meaningful manner.
3. Attributes do not comply with widely used
conventions.
4. Metadata in the HDF file contains incorrect information.
Converted File Size Comparison
• HDF-EOS2: 72M
• NetCDF-4: 128M
• NetCDF-3: 656M (9X the HDF-EOS2 size)
Challenge 1: Unfamiliar Techniques
Users look for Latitude/Longitude datasets that match
variable (e.g., Ozone) datasets.
Some HDF products have
• lat/lon datasets whose sizes do not match the variable.
• lat/lon information stored in a metadata attribute.
• duplicate lat/lon information.
Swath Dimension Map Example
An HDF-EOS Swath Dimension Map allows geolocation and
data dimensions to have different sizes.
• Latitude[512][512]
• Longitude[512][512]
• Data[1024][1024]
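To make the mismatch concrete, here is a minimal Python sketch of how a reader might stretch a coarse geolocation track onto the finer data dimension. It assumes the dimension map places geolocation element g at data index offset + increment * g, and it uses linear interpolation along one dimension only. The function name and sample numbers are hypothetical; real products need 2-D handling and care at the date line.

```python
def expand_geolocation(geo, offset, increment, data_size):
    """Linearly interpolate a 1-D geolocation track onto the data dimension.

    Assumes geolocation element g sits at data index offset + increment * g
    and that geo has at least two elements.
    """
    expanded = []
    last = len(geo) - 1
    for i in range(data_size):
        pos = (i - offset) / increment            # fractional geolocation index
        lo = min(max(int(pos // 1), 0), last - 1)  # left neighbour, clamped to the track
        frac = pos - lo                            # >1 or <0 means extrapolation at the edges
        expanded.append(geo[lo] + frac * (geo[lo + 1] - geo[lo]))
    return expanded


# Three latitudes stretched over six data rows (increment 2, offset 0).
print(expand_geolocation([10.0, 12.0, 14.0], offset=0, increment=2, data_size=6))
```

With Latitude[512] and Data[1024] along one dimension, the same call would use increment=2 and data_size=1024.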
NSIDC AMSR_E NCL Example
; Read the file as an HDF4 file to obtain dataset attributes.
hdf4_file = addfile("AMSR_E_L3_WeeklyOcean_V03_20020616.hdf", "r")
; Read the file as an HDF-EOS2 file to obtain lat and lon.
hdfeos2_file = addfile("AMSR_E_L3_WeeklyOcean_V03_20020616.hdf.he2", "r")
Users should call both the HDF4 and HDF-EOS2 APIs:
• The HDF4 API alone cannot resolve lat/lon.
• The HDF-EOS2 API alone cannot retrieve some attributes
that were added later with the HDF4 API.
Challenge 2: Information Outside HDF
Users must read the data product manual to find
• fill value / valid ranges
• units or discrete key values
• scale / offset equation
• physical description of data
Some products are not self-describing!
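As an illustration of why these items matter, the following Python sketch decodes raw stored integers once the fill value, valid range, and scale / offset are known. It assumes the CF-style equation physical = raw * scale_factor + add_offset; a product manual may specify a different one. The function name and all numbers are hypothetical stand-ins for what a manual would supply.

```python
def decode(raw_values, fill_value, valid_min, valid_max, scale_factor, add_offset):
    """Convert raw stored integers to physical values; None marks missing data."""
    decoded = []
    for raw in raw_values:
        if raw == fill_value or not (valid_min <= raw <= valid_max):
            decoded.append(None)                      # fill value or out of valid range
        else:
            decoded.append(raw * scale_factor + add_offset)
    return decoded


# Hypothetical values: the last two entries are a fill value and an out-of-range count.
print(decode([100, 200, -9999, 32000],
             fill_value=-9999, valid_min=0, valid_max=30000,
             scale_factor=0.5, add_offset=10.0))
```

Without the manual's numbers, the raw integers 100 and 200 would be plotted as if they were physical values, and the fill value would distort any color scale.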
Without Information Outside HDF
With Information Outside HDF
Challenge 3: The CF Conventions
Following the widely accepted CF conventions is
important for interoperability but some HDF products
• use non-alphanumeric characters.
• use non-CF attribute names and values.
• use non-CF scale / offset rules.
• use a different data type for an attribute (e.g.,
_FillValue) than for the variable.
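The scale / offset item can be illustrated with a short Python sketch. It assumes, as one example of a non-CF rule, that a product defines physical = scale * (raw - offset), and rewrites that pair into the CF form physical = raw * scale_factor + add_offset. Which rule a given product actually uses has to come from its documentation; the function name and numbers here are hypothetical.

```python
def to_cf_scale_offset(scale, offset):
    """Rewrite physical = scale * (raw - offset) as raw * scale_factor + add_offset."""
    scale_factor = scale
    add_offset = -scale * offset   # expand the product: scale*raw - scale*offset
    return scale_factor, add_offset


scale_factor, add_offset = to_cf_scale_offset(0.5, 10.0)
raw = 30
# Both equations give the same physical value for the same raw count.
print(scale_factor * raw + add_offset, 0.5 * (raw - 10.0))
```

A CF-aware tool that applied the original attributes unchanged would compute raw * 0.5 + 10.0 instead, which is why the rule has to be converted rather than copied.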
Attribute Type Mismatch Example
Int16 data[180][360] // Variable
String valid_range “0,100” // Attribute (Wrong)
Byte _FillValue 255 // Attribute (Wrong)
Int16 data[180][360] // Variable
Int16 valid_range 0,100 // Attribute (Correct)
Int16 _FillValue 255 // Attribute (Correct)
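A check like the one above is easy to automate. The Python sketch below flags attributes whose declared type differs from the variable's type. It works on plain type names rather than a real HDF file, and the function name and the set of checked attributes are assumptions made for illustration.

```python
def mismatched_attributes(variable_type, attributes,
                          checked=("_FillValue", "valid_range")):
    """Return names of checked attributes whose type differs from the variable's.

    attributes maps an attribute name to its declared type name.
    """
    return [name for name in checked
            if name in attributes and attributes[name] != variable_type]


wrong = {"valid_range": "String", "_FillValue": "Byte"}
right = {"valid_range": "Int16", "_FillValue": "Int16"}
print(mismatched_attributes("Int16", wrong))   # both attributes are flagged
print(mismatched_attributes("Int16", right))   # nothing is flagged
```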
Challenge 4: Incorrect Information
Sometimes, metadata contains incorrect information.
This is rare and such information is usually corrected
immediately by data producers.
Incorrect Information Example
An NCL user reported that the same code doesn’t work
for an older MOP02 HDF-EOS5 file.
In the 2008/01/01 file, StructMetadata has the wrong value:
nTime = 250841130416
In the 2008/12/31 file, StructMetadata has the correct value:
nTime = 2
LaRC ASDC has already fixed this!
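One way a tool could catch this kind of error is to compare the dimension sizes declared in metadata with the sizes of the data actually stored. The Python sketch below does that on plain dictionaries; it does not parse StructMetadata, and the function name and the "actual" size are hypothetical.

```python
def suspicious_dimensions(declared_sizes, actual_sizes):
    """Return dimension names whose declared size disagrees with the stored data."""
    return sorted(name for name, size in declared_sizes.items()
                  if name in actual_sizes and size != actual_sizes[name])


declared = {"nTime": 250841130416}   # value reported in the metadata
actual = {"nTime": 2}                # hypothetical size taken from the stored dataset
print(suspicious_dimensions(declared, actual))
```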
Good News
Recent efforts by The HDF Group address many of these
challenges:
• HDF4/HDF5 OPeNDAP Handler with EnableCF option
• H4CF Conversion Toolkit with NcML / NCO examples
• HDF-EOS5 Augmentation Tool
• HDF-EOS2 Dumper tool with Comprehensive
Examples for MATLAB/IDL/NCL
The above tools and their examples are available at
HDFEOS.org.
Challenge 1: Unfamiliar Techniques
HDF OPeNDAP handlers & H4CF Conversion Toolkit
• provide full geo-location information as explicit datasets.
HDF-EOS5 Augmentation Tool
• provides ways to associate geo-location information with
existing datasets or to supply new ones.
HDF-EOS2 Dumper Tool
• prints out geo-location information in ASCII because
MATLAB/IDL/NCL can read ASCII text data.
Challenge 2: Information Outside HDF
HDF OPeNDAP handlers
• provide fill value / valid range information.
• apply CF scale / offset rule.
• calculate latitude and longitude values for some NASA
non-EOS products.
• are tested against ncml_handler so that data centers
can add additional information using NcML.
H4CF Conversion Toolkit (h4tonccf)
• provides NcML and NCO examples to add or edit
attributes for converted NetCDF files.
Challenge 3: The CF Conventions
HDF OPeNDAP handlers & H4CF Conversion Toolkit
• flatten group hierarchies.
• change variable & attribute types, names, and values.
• add named dimensions.
• add coordinate information.
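The flattening and renaming steps can be pictured with a small Python sketch that turns a group path into a single name made only of letters, digits, and underscores. This is an illustration of the idea, not the naming rule the handlers or the toolkit actually apply; the function name and the sample path are hypothetical.

```python
import re


def cf_name(hdf_path):
    """Flatten a group path into one name made of letters, digits, and underscores."""
    # Drop the leading and trailing slash, then replace every other
    # character outside [A-Za-z0-9_] (including '/') with an underscore.
    return re.sub(r"[^A-Za-z0-9_]", "_", hdf_path.strip("/"))


print(cf_name("/Data Fields/O3.Column"))
```

A real converter also has to handle name collisions after flattening, which this sketch ignores.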
Challenge 4: Incorrect Information
HDF OPeNDAP handlers & H4CF Conversion Toolkit
• temporarily correct errors in old products.
• catch errors in new products.
Better News
We see fewer and fewer challenges in newer HDF products
thanks to open communication and standardization efforts
among Earth Science communities through meetings,
telecons, and mailing lists.
• HDF – DAACs Telecons
• ESDSWG – H5CF Conventions
• ESIP
• CF (satellite) conventions mailing lists
Future Challenges
• Data Discovery
• Subsetting and Aggregation
• Sharing Research Data
Data Discovery
Some users still don’t know how to search for data or
where to download it.
Spatial search in Reverb doesn’t guarantee that the
matched HDF data files contain valid values at
the specific location the user is looking for.
Browse images are helpful, but users don’t want to
examine them one by one.
Reverb Browse Image for O3 at Seoul
The returned HDF file has no value at Seoul
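The gap between "the granule covers Seoul" and "the granule has a value at Seoul" can be checked programmatically. The Python sketch below looks for the nearest non-fill pixel within a tolerance of a target point. It assumes flattened 1-D lists of latitude, longitude, and value, and it measures distance in degrees for simplicity; the function name, the tolerance, and the sample pixels are hypothetical.

```python
def value_at(lats, lons, values, fill_value, target_lat, target_lon, max_distance_deg):
    """Return the valid value nearest to the target, or None if none is close enough."""
    best = None
    best_dist = max_distance_deg
    for lat, lon, value in zip(lats, lons, values):
        if value == fill_value:
            continue                                   # skip missing pixels
        dist = max(abs(lat - target_lat), abs(lon - target_lon))
        if dist <= best_dist:
            best, best_dist = value, dist
    return best


# Seoul is near 37.57N, 126.98E. The two pixels close to it hold fill values,
# so the granule "matches" spatially but has nothing usable there.
lats = [37.5, 37.6, 10.0]
lons = [127.0, 126.9, 100.0]
print(value_at(lats, lons, [-9999, -9999, 300.0], -9999, 37.57, 126.98, 0.5))
print(value_at(lats, lons, [-9999, 285.0, 300.0], -9999, 37.57, 126.98, 0.5))
```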
Subsetting and Aggregation
Customized on-demand HDF product generation is
desired based on the user’s query. For example,
“Give me all L2 Ozone data at Seoul from 2002 to 2013
and allow me to download it as a single HDF file.”
Most HDF data products are packaged as daily granules
covering large regions. A search returns thousands of HDF
files, and users cannot download them one by one.
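The kind of on-demand product described here amounts to a filter plus a merge across granules. The Python sketch below shows that logic on in-memory records: it keeps pixels inside a latitude / longitude box and a range of years, and returns one combined time series. The record layout, the function name, and the sample values are hypothetical; a real service would read each granule from an HDF file.

```python
def aggregate(granules, lat_range, lon_range, year_range):
    """Collect (date, value) pairs that fall inside the requested box and years."""
    series = []
    for granule in granules:
        year = int(granule["date"][:4])               # dates are "YYYY-MM-DD" strings
        if not (year_range[0] <= year <= year_range[1]):
            continue
        for lat, lon, value in granule["pixels"]:
            if (lat_range[0] <= lat <= lat_range[1]
                    and lon_range[0] <= lon <= lon_range[1]):
                series.append((granule["date"], value))
    return sorted(series)


granules = [
    {"date": "2013-07-09", "pixels": [(37.6, 127.0, 290.0), (10.0, 100.0, 250.0)]},
    {"date": "2002-09-01", "pixels": [(37.5, 126.9, 310.0)]},
    {"date": "2001-01-01", "pixels": [(37.5, 126.9, 999.0)]},
]
# A small box around Seoul, 2002 through 2013.
print(aggregate(granules, (37.0, 38.0), (126.5, 127.5), (2002, 2013)))
```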
Reverb Query Result for AIRS at Seoul
Showing 1 to 9 of 5,047 granules
Sharing Research Data
How can users easily compose and publish new
research data from the different NASA data product
sources?
“I’d like to combine AIRS Ozone and OMI Ozone data
at Seoul from 2002-2013 and share it with journal
editors.”
Can this be shared as a single URL query to a NASA
data cloud?
Acknowledgements
This work was supported by Subcontract number 114820 under Raytheon Contract number NNG10HP02C, funded by the National Aeronautics and Space Administration (NASA) and by cooperative agreement number NNX08AO77A from NASA. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of Raytheon or the National Aeronautics and Space Administration.