CROSS DISCIPLINARY APPLICATIONS OF MULTIPLEX OBSERVATIONAL AND COMPUTATIONAL DATASETS USING HDF5 FOR ARCHIVING AND HIGH PERFORMANCE PROCESSING. Marcel Ritter, Werner Benger, Joseph Stoeckl, Donna Delparte, Mike Folk, Quincey Koziol, Frank Steinbacher and Markus Aufleger. Center for Computation & Technology, ASTRO@UIBK


Page 1: Marcel Ritter , Werner  Benger , Joseph  Stoeckl ,  Donna  Delparte , Mike Folk,  Quincey  Koziol,

CROSS DISCIPLINARY APPLICATIONS OF MULTIPLEX OBSERVATIONAL AND COMPUTATIONAL DATASETS USING HDF5 FOR ARCHIVING AND HIGH PERFORMANCE PROCESSING.

Marcel Ritter, Werner Benger, Joseph Stoeckl, Donna Delparte, Mike Folk, Quincey Koziol, Frank Steinbacher and Markus Aufleger

Center for Computation & Technology

ASTRO@UIBK

Page 2:

Outlook

• Motivation
• Requirements on a Data Format
• Introduction to HDF5
• F5
  – Introduction
  – Examples of Data Sets
• Application Example:
  – The Hawaiian Geospatial Data Repository
• Conclusion

Page 3:

Motivation

Workgroup A Workgroup B

Scientific Collaboration

Page 4:

Motivation

Workgroup A Workgroup B

Software Tool 1

Software Tool 2

File Format 2

Scientific Collaboration

File Format 1

Page 5:

Motivation

Workgroup A Workgroup B

Software Tool 1

Software Tool 2

File Format 2

File Format 1

Data Exchange

Page 6:

Motivation

File Format 2

File Format 1

File Format 3

File Format 4

File Format 5

…File Format N

Page 7:

Motivation

File Format 2

File Format 1

File Format 3

File Format 4

File Format 5

…File Format N

Huge Implementation Effort O(N²)

Page 8:

Motivation

File Format 2

File Format 1

File Format 3

File Format 4

File Format 5

…File Format N

Common Data Format

Less Implementation Effort O(N)
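The O(N²) versus O(N) comparison on these slides can be checked with a small count. This is a sketch under one assumption that the slides do not state explicitly: a "converter" is counted per direction, so translating format A to format B and B to A are two separate pieces of work. The function names are illustrative only.

```python
def pairwise_converters(n: int) -> int:
    """Converters needed when every format is translated directly into every other one."""
    return n * (n - 1)


def hub_converters(n: int) -> int:
    """Converters needed when each format only gets a reader and a writer for one common format."""
    return 2 * n


# Compare both strategies for a growing number of file formats.
for n in (2, 5, 10, 20):
    print(f"N={n:2d}  pairwise={pairwise_converters(n):3d}  common format={hub_converters(n):2d}")
```

With ten formats this gives 90 direct converters against 20 via a common format, and the gap widens quadratically as formats are added.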

Page 9:

Motivation

Workgroup A Workgroup B

Workgroup C Workgroup D

Software 3

Software Tool 1

Software 4

Software Tool 2

Common Data Format

Easier collaboration

More time for science

Page 10:

Requirements on a Data Format

Easy access

Fast and efficient

Huge data (Terabytes)

Huge variety of data

Self-descriptive

Well documented, with a user community

Sustainable (>10 years)

Page 11:

HDF5: Hierarchical Data Format 5

http://www.hdfgroup.org/HDF5

Page 12:

HDF5 - A Few Analogies

• File system (in a file)
• Binary XML file
• PDF for numerical data
• Database (container for array variables)

Page 13:

HDF5 - Relationships

lat | lon | temp
----|-----|-----
 12 |  23 |  3.1
 15 |  24 |  4.2
 17 |  21 |  3.6

/

SimOut

City A

Parameters: 10;100;1000

Timestep: 36,000

Legend: Group, Dataset, Attribute, Relation
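The four building blocks in the legend (group, dataset, attribute, relation) can be illustrated with a toy in-memory model. This is not the HDF5 library API: the classes `Group` and `Dataset`, the dataset name `table`, and the choice to attach `Parameters` and `Timestep` to the `SimOut` group are all assumptions made for illustration, since the slide text does not say where the attributes hang.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class Dataset:
    """An array-like payload plus its own attributes."""
    data: List[Any]
    attrs: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Group:
    """A named container; the parent-to-child links play the role of relations."""
    children: Dict[str, Union["Group", Dataset]] = field(default_factory=dict)
    attrs: Dict[str, Any] = field(default_factory=dict)

    def require_group(self, path: str) -> "Group":
        """Walk a '/'-separated path, creating missing groups on the way."""
        node = self
        for name in filter(None, path.split("/")):
            child = node.children.setdefault(name, Group())
            if not isinstance(child, Group):
                raise TypeError(f"{name!r} is a dataset, not a group")
            node = child
        return node

    def lookup(self, path: str) -> Union["Group", Dataset]:
        """Resolve a '/'-separated path to a group or dataset."""
        node: Union["Group", Dataset] = self
        for name in filter(None, path.split("/")):
            if not isinstance(node, Group):
                raise KeyError(f"cannot descend into dataset at {name!r}")
            node = node.children[name]
        return node


root = Group()                                  # the "/" root group
sim = root.require_group("/SimOut")
sim.attrs["Parameters"] = (10, 100, 1000)       # attribute placement is an assumption
sim.attrs["Timestep"] = 36000
sim.children["table"] = Dataset(                # "table" is a hypothetical name
    data=[(12, 23, 3.1), (15, 24, 4.2), (17, 21, 3.6)],
    attrs={"columns": ("lat", "lon", "temp")},
)

print(root.lookup("/SimOut/table").data[1])
```

The point of the sketch is the addressing scheme: every object is reached by a path from the root, and metadata travels with the object it describes.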

Page 14:

HDF5 - What Users Get…

• A multi-platform library and tools built on over 10 years of experience in large data handling from the high performance computing (HPC) community.
• A capability that:
  – Lets them organize large and/or complex collections of data
  – Gives them efficient and scalable data storage and access
  – Lets them integrate a wide variety of types of data and data sources
  – Guarantees long-term data integrity and preservation

Page 15:

• Shapefiles: HDF5 as container format

HDF5

Browser application

Page 16:

• Shapefiles: HDF5 as container format

HDF5

Browser application

Vector data

Pixel data

Attribute data

Page 17:

HDF5 - More Applications

• Earth Science (Earth Observing System)
  – Aura: TES, HIRDLS, MLS, OMI
  – Terra: CERES, MISR, MODIS, MOPITT
  – Aqua (6/01): CERES, MODIS, AMSR
• Big simulations
• Movie Making
• Flight Testing

Billions of elements, dozens of associated values

Page 18:

HDF5

• More than a ZIP or TAR
• Also allows describing the structure of the contents of a file
• How to store different kinds of data sets consistently in HDF5?

Page 19:

F5: Fiber Bundle Data Model

http://www.fiberbundle.net

Page 20:

• Based on HDF5
• Inspired by concepts of:
  – Topology
  – Differential Geometry
  – Geometric Algebra
• Separation of Geometry (Grids) and Datafield (Fields)

F5

Grid

Field

Page 21:

• Hierarchical Structure:

F5

Fiber Bundle

Time Slice

Grid

Topology

Coordinates

Field

Page 22:

F5

Visible to the end user

• Hierarchical Structure:

Fiber Bundle

Time Slice

Grid

Topology

Coordinates

Field

Page 23:

Fiber: 0D 1D 3D 6D

BASE: 3D 2D 1D 0D

Page 24:

• Multi Channel – Multi Resolution Images:

F5

Page 25:

• Multi Channel – Multi Resolution Images:

F5

Time Grid Topology Representation Field [Datatype]

/1.4/Satellite/VertexRefinement1x1/Cartesian/Positions [uniform-grid]
                                            /RGB [byte,byte,byte]
                                            /N-IR [float64]
                                            /T-IR [float64]
              /VertexRefinement2x2/Cartesian/Positions
                                            /RGB "
                                            /N-IR
                                            /T-IR
/1.6/ …
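The five-level naming scheme in this listing (Time, Grid, Topology, Representation, Field) can be handled as a simple record. A minimal sketch, assuming only what the listing shows: this is string handling for illustration, not the F5 library API, and `F5Path` and `parse_f5_path` are hypothetical names.

```python
from typing import NamedTuple


class F5Path(NamedTuple):
    """One field addressed by the five hierarchy levels shown in the listing."""
    time: str
    grid: str
    topology: str
    representation: str
    field: str

    def __str__(self) -> str:
        # Join the five levels into a '/'-separated path.
        return "/" + "/".join(self)


def parse_f5_path(path: str) -> F5Path:
    """Split a '/'-separated path into its five levels."""
    parts = path.strip("/").split("/")
    if len(parts) != 5:
        raise ValueError(f"expected 5 levels, got {len(parts)}: {path!r}")
    return F5Path(*parts)


# The same image channel at two refinement levels of the same grid.
coarse = F5Path("1.4", "Satellite", "VertexRefinement1x1", "Cartesian", "RGB")
fine = coarse._replace(topology="VertexRefinement2x2")

print(coarse)
print(fine)
print(parse_f5_path(str(fine)).topology)
```

Switching resolution changes only the topology level, which is why the multi-resolution variants sit side by side under one grid.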

Page 26:

• Full Waveform LIDAR:

F5

t1 t2 t3

t_emission

Page 27:

F5

• Full Waveform LIDAR: - Laser Data

Time Grid Topology Representation Field [Datatype]

/CorseTime/LASER/POINTS/CartesianCoords/Positions [point3D]
                                       /TimeStamp [float64]
                                       /Waveform [uint16,uint16]
                                       /Reflectance [float32]
                /SHOTS/SHOTSAsPOINTS/Positions vlen[uint32]
                                    /Origin [point3D]
                                    /Direction [vector3D]
                                    /EmissionTime [float64]

t1 t2 t3

t_emission
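The relation between SHOTS and POINTS in this layout can be sketched as follows: each shot has an origin, a direction and an emission time, produces a variable number of echoes (t1, t2, t3), and keeps a variable-length list of indices into the point set. The code is an illustration under stated assumptions, not the authors' processing chain: times are taken to be in seconds, each echo is placed at the standard time-of-flight range c·(t − t_emission)/2 along a straight beam, atmospheric effects are ignored, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
SPEED_OF_LIGHT = 299_792_458.0  # metres per second (vacuum value; atmosphere ignored)


def echo_positions(origin: Vec3, direction: Vec3,
                   t_emission: float, echo_times: Sequence[float]) -> List[Vec3]:
    """Place each echo along the beam at range c * (t - t_emission) / 2."""
    norm = sum(c * c for c in direction) ** 0.5
    if norm == 0.0:
        raise ValueError("direction must be a non-zero vector")
    unit = tuple(c / norm for c in direction)
    points = []
    for t in echo_times:
        rng = 0.5 * SPEED_OF_LIGHT * (t - t_emission)  # halve the two-way travel time
        points.append(tuple(o + rng * u for o, u in zip(origin, unit)))
    return points


def build_points(shots):
    """Return all echo positions plus, per shot, the indices of the points it produced."""
    positions, shot_to_points = [], []
    for origin, direction, t_emission, echo_times in shots:
        first = len(positions)
        positions.extend(echo_positions(origin, direction, t_emission, echo_times))
        shot_to_points.append(list(range(first, len(positions))))
    return positions, shot_to_points


# Two made-up shots fired straight down from 1000 m: three echoes, then one echo.
shots = [
    ((0.0, 0.0, 1000.0), (0.0, 0.0, -1.0), 0.0, [6.0e-6, 6.2e-6, 6.5e-6]),
    ((5.0, 0.0, 1000.0), (0.0, 0.0, -2.0), 1.0e-3, [1.0e-3 + 6.4e-6]),
]
positions, shot_to_points = build_points(shots)
print(shot_to_points)
```

The per-shot index lists have different lengths, which is the situation a variable-length integer field such as `vlen[uint32]` is suited to.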

Page 28:

• Full Waveform LIDAR: - Airplane Data

F5

/CorseTime/PLANE/POINTS/CartesianCoords/Positions [point3D]
                                       /Rotation [rotor3D]
                                       /TimeStamps [float64]

Page 29:

Benefits

• Bringing together in F5:
  – Satellite data
  – LIDAR
  – Shapefiles
• Features of HDF5
  – Sustainable storage
  – Meta data
  – Compression
  – Parallel IO
  – Hyperslab access
• Consistent data organization of simple and complex spatial-temporal data
• Handle time series of data easily
• Make tools of other disciplines applicable to the geoscience community, such as the astrophysics image mosaic tool Montage for satellite data: http://montage.ipac.caltech.edu

F5
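Hyperslab access, listed among the HDF5 features on this slide, means reading a regular pattern of elements without loading the whole dataset. In HDF5 a hyperslab is described per dimension by four numbers: start, stride, count and block. The sketch below reproduces that selection rule in one dimension on a plain Python list; it does not call the HDF5 library, and the function names are illustrative.

```python
from typing import List, Sequence


def hyperslab_indices(start: int, stride: int, count: int, block: int = 1) -> List[int]:
    """Indices selected by `count` blocks of `block` elements, spaced `stride` apart."""
    if stride < block:
        raise ValueError("stride must be at least as large as block, or blocks overlap")
    return [start + i * stride + j for i in range(count) for j in range(block)]


def read_hyperslab(data: Sequence, start: int, stride: int,
                   count: int, block: int = 1) -> list:
    """Pick the selected elements out of an in-memory sequence."""
    return [data[i] for i in hyperslab_indices(start, stride, count, block)]


samples = list(range(100, 112))  # stand-in for a large on-disk array
print(hyperslab_indices(start=1, stride=4, count=3, block=2))
print(read_hyperslab(samples, start=1, stride=4, count=3, block=2))
```

For multi-dimensional data the same four numbers are given for each axis, and the selection is the product of the per-axis selections.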

Page 30:

HAWAIIAN DATA REPOSITORY

http://www.epscor.hawaii.edu

Application Example

Page 31:

HAWAIIAN DATA REPOSITORY

Centralized integrative capability to store and manage access to massive (terabytes) research datasets

Goal:

Users:

University of Hawaii research teams

Broad statewide research community

Objectives:

Collect, store and manage access to data

Utilize user portals

Utilize and link to the Maui High Performance Computing Center (MHPCC)

Discovery, manipulation, fusion and visualization

Mission:

Page 32:

Geospatial Information and Mass Storage

Page 33:

How to manage and store large complex datasets?!!

Geospatial Information and Mass Storage

Page 34:

Geospatial Information and Mass Storage

HDF5

Page 35:

Geospatial Information and Mass Storage

F5

HDF5

Page 36:

CONCLUSION

• A common data format eases COLLABORATIONS and reduces wasted time spent on data conversions
• Data formats for sustainable, transparent storage of huge and complex data exist, one just has to use them: HDF5
• F5 captures observational and simulation data consistently.
• Geoscience repositories, such as the HAWAIIAN DATA REPOSITORY, can be built upon this format.

Page 38:

HDF5 - HDFView

screenshot of shapefiles

Page 39:

Geospatial Information and Mass Storage

• Weather station data
• Marine buoy sensor data
• GPS data collection
• Database datasets, Excel files
• Spatial data - imagery, LiDAR, GIS
• Geoweb application services – WMS, WFS, WPC
• Database management
• Data streaming
• Data storage of statewide datasets
• Upload and download capability
• Metadata search capacity
• Visualization of spatial and non-spatial datasets
• Access to HPC services
• Real-time modeling and analysis

Page 40:

F5

• Grid
  – Manifold describing the base space
• Topology
• Refinement level
• Coordinate representation
• Vertex positions in representation