Lossless Compression of Meteorological Data in GRIB Format

R. Lorentz

Fraunhofer Institute for Scientific Computation and Algorithms (SCAI)

Germany

Prof. Dr. Rudolph Lorentz, FhG-SCAI.NuSo

Data Compression

• What is it?

• What is it good for in the context of WIS?
  a) Reducing archive size
  b) Speeding up data transfer

• Who needs it?
  a) Anyone with an archive larger than ~ 100 terabytes
  b) Anyone who frequently transfers large blocks of data (~ 1 GB)

Some numbers

Lossless data compression

e.g., Zip programs; compression factors:
  for text: 2 – 3
  for simulation data: 1 – 1.2

Lossy data compression

e.g., JPEG, MPEG; compression factors:
  for pictures: ~ 10 – 100
  (not suitable for floating-point numbers)
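
To see where numbers like these come from, here is a minimal sketch that measures the compression factor of a general-purpose lossless compressor. It assumes the compression factor means uncompressed size divided by compressed size, and it uses Python's zlib as a stand-in for a Zip program. The sample text and the noisy floating-point values are invented for illustration; they are not data from the talk.

```python
import random
import struct
import zlib

def compression_factor(data: bytes) -> float:
    """Uncompressed size divided by compressed size (zlib, level 9)."""
    return len(data) / len(zlib.compress(data, 9))

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical stand-in for text: words drawn at random from a small vocabulary.
vocabulary = ("the forecast model writes temperature pressure wind humidity "
              "fields for every grid point and each time step of a run").split()
text = " ".join(rng.choice(vocabulary) for _ in range(20000)).encode("ascii")

# Hypothetical stand-in for simulation output: noisy double-precision values.
values = [rng.gauss(280.0, 5.0) for _ in range(20000)]
floats = struct.pack(f"<{len(values)}d", *values)

k_text = compression_factor(text)
k_floats = compression_factor(floats)
print(f"text:   K = {k_text:.2f}")
print(f"floats: K = {k_floats:.2f}")
```

The exact factors depend on the input, but the floating-point data should compress far less than the text, because the low-order mantissa bits look random to a byte-oriented compressor.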

Disadvantages of data compression

1. Costs resources

Compression and decompression take time: roughly 20 MB/s on a 3 GHz Linux PC

2. Software must be integrated into the production run
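
A rough worked estimate of this cost for data transfer, under simplifying assumptions that are not from the slides: compression and transfer run one after the other, decompression at the receiver is ignored, and the symbols S (data size), B (network bandwidth), c (compression throughput) and K (compression factor) are introduced here only for illustration.

```latex
\[
t_{\text{raw}} = \frac{S}{B},
\qquad
t_{\text{compressed}} = \frac{S}{c} + \frac{S}{K\,B}
\]
\[
t_{\text{compressed}} < t_{\text{raw}}
\;\iff\;
\frac{1}{c} < \frac{1}{B}\Bigl(1 - \frac{1}{K}\Bigr)
\;\iff\;
B < c\,\Bigl(1 - \frac{1}{K}\Bigr)
\]
\[
c = 20\ \text{MB/s},\; K = 2.5
\;\Longrightarrow\;
B < 20 \cdot 0.6 = 12\ \text{MB/s}
\]
```

With these illustrative values, compressing a 1 GB block (taken as 1000 MB) costs 50 s, and compressing before sending pays off only on links slower than about 12 MB/s. Overlapping compression with transfer would shift this threshold.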

Example

Compression of meteorological data for the German Weather Service

1. Lossless compression of LME data in GRIB1 format

   compression factor: ~ 2.5

   archive size: 3.5 petabytes

2. Data on rectangular grids

3. The compression factor is what matters most

Data Formats

Meteorological

• GRIB1, GRIB2: have built-in compression

• BUFR: has compression option

General purpose

• HDF5

• NetCDF

GRIB1 Grid Types

1. Function values, rectangular grid

2. Function values, global triangular grid (GRIB2)

3. Function values, global Gaussian grid (topologically equivalent to a rectangular grid)

4. Function values, thinned Gaussian grid (global)

5. Spectral coefficients, both simple and complex packing

How does lossless compression work?

For grid data:

Neighboring grid points have similar values => store only the differences

Heuristic conclusions:

• the higher the grid resolution, the better the compression

• the smoother the functions (observables) the better the compression
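
The idea of storing differences can be sketched as follows. This is only an illustration of the principle, not the algorithm used by GRIBZip, which the slides do not describe. It assumes a synthetic smooth field quantized to 12-bit integers (loosely similar to packed GRIB values), takes differences between neighbors in row-major order, and hands both versions to zlib. All function names and parameters are hypothetical.

```python
import math
import struct
import zlib

MOD = 1 << 16  # values and differences are stored as unsigned 16-bit integers

def make_field(n=128, levels=4096):
    """Smooth synthetic 2D field in row-major order, quantized to [0, levels - 1]."""
    half = (levels - 1) / 2
    return [
        round(half * (1 + math.sin(math.pi * i / n) * math.cos(math.pi * j / n)))
        for i in range(n) for j in range(n)
    ]

def delta_encode(values):
    """Replace each value by its difference from the previous one (mod 2**16)."""
    out, prev = [], 0
    for v in values:
        out.append((v - prev) % MOD)
        prev = v
    return out

def delta_decode(deltas):
    """Invert delta_encode with a running sum (mod 2**16)."""
    out, total = [], 0
    for d in deltas:
        total = (total + d) % MOD
        out.append(total)
    return out

def pack16(values):
    """Serialize integers as big-endian unsigned 16-bit words."""
    return struct.pack(f">{len(values)}H", *values)

values = make_field()
raw_bytes = pack16(values)
raw_z = zlib.compress(raw_bytes, 9)
delta_z = zlib.compress(pack16(delta_encode(values)), 9)

# The scheme is lossless: decoding the differences restores every value.
decoded = delta_decode(delta_encode(values))

print(f"zlib only:    K = {len(raw_bytes) / len(raw_z):.2f}")
print(f"delta + zlib: K = {len(raw_bytes) / len(delta_z):.2f}")
```

On a smooth field the differences are small numbers drawn from a narrow range, which is why the second factor comes out higher. A finer grid or a smoother field makes the differences smaller still, in line with the heuristic conclusions above.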

Some Numbers

Computed with GRIBZip, a commercial program developed at SCAI

Average compression factors over all GRIB files of a forecast.

Rectangular grids, function values (LME model), resolution 7 km

2D: K = 2.65

Rectangular grids (LMK model), resolution 2.8 km

2D: K = 2.75

Global triangular grids (GME model), resolution 40 km

2D: K = 2.38

More Numbers

Calculated with experimental programs (work in progress):

• Gaussian grids, resolution 63 km (DKRZ, Max Planck Institute), K = 3.1

• Thinned Gaussian grids, resolution 39 km (ECMWF), K = 2.34

• Spectral data, simple packing, highest frequency 213 (DKRZ), K = 1.99

• Spectral data, complex packing (ECMWF), not possible??

3D Compression

Compressing several layers of data together

Typical examples

Local grid: LME data

  2D: K = 2.7

  3D: K = 3.17

Global grid: GME data

  2D: K = 1.97

  3D: K = 2.59

[Figure labels: one GRIB record; several GRIB records]
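
One possible way to compress several layers together is sketched below; it is an assumption for illustration, not the method behind the numbers above. The sketch builds a hypothetical stack of smooth layers whose amplitude drifts slowly from level to level, then compares compressing every layer on its own (2D) with storing each layer as its difference from the layer below and compressing the whole stack at once (3D).

```python
import math
import struct
import zlib

N = 64        # grid points per side (hypothetical)
LAYERS = 8    # number of vertical levels (hypothetical)
MOD = 1 << 16

def make_layer(k):
    """Smooth synthetic layer k in row-major order, quantized to 12-bit integers."""
    amp = 0.5 + 0.03 * k  # amplitude drifts slowly from level to level
    return [
        round(2047.5 * (1 + amp * math.sin(math.pi * i / N) * math.cos(math.pi * j / N)))
        for i in range(N) for j in range(N)
    ]

def delta(values):
    """Differences between neighboring values (mod 2**16)."""
    out, prev = [], 0
    for v in values:
        out.append((v - prev) % MOD)
        prev = v
    return out

def undelta(deltas):
    """Invert delta with a running sum (mod 2**16)."""
    out, total = [], 0
    for d in deltas:
        total = (total + d) % MOD
        out.append(total)
    return out

def compressed_size(values):
    """Size in bytes after packing as 16-bit words and applying zlib."""
    return len(zlib.compress(struct.pack(f">{len(values)}H", *values), 9))

def restore(stack_values):
    """Rebuild all layers from the first layer plus layer-to-layer differences."""
    n = N * N
    layers_out = [stack_values[:n]]
    for k in range(1, LAYERS):
        diff = stack_values[k * n:(k + 1) * n]
        layers_out.append([(l + d) % MOD for l, d in zip(layers_out[-1], diff)])
    return layers_out

layers = [make_layer(k) for k in range(LAYERS)]

# 2D: each layer is compressed independently of the others.
size_2d = sum(compressed_size(delta(layer)) for layer in layers)

# 3D: first layer as is, then each layer minus the layer below it.
stack = list(layers[0])
for lower, upper in zip(layers, layers[1:]):
    stack.extend((u - l) % MOD for l, u in zip(lower, upper))
size_3d = compressed_size(delta(stack))

raw_size = 2 * N * N * LAYERS
print(f"2D: K = {raw_size / size_2d:.2f}")
print(f"3D: K = {raw_size / size_3d:.2f}")
```

The gain in this toy setup is much larger than the gains reported above, because the synthetic layers are almost perfectly correlated; real model levels are not.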

Comments:

• 3D data (in GRIB format) is relatively hard to compress => 3D compression is particularly effective

• Improvement of the compression factor by 0.5 to 1.0

• Harder to implement

• Is it worth it? => depends on the proportion of 3D data.
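
The dependence on the proportion of 3D data can be made explicit with a short calculation. The symbols are introduced here for illustration only: p is the fraction of the uncompressed archive volume that is 3D data, K_method is the factor reached on that 3D data by the chosen method, and K_rest is the factor reached on everything else.

```latex
\[
K_{\text{total}} = \frac{1}{\dfrac{p}{K_{\text{method}}} + \dfrac{1 - p}{K_{\text{rest}}}}
\]
\[
p = 0.5,\; K_{\text{method}} = 3.17,\; K_{\text{rest}} = 2.7
\;\Longrightarrow\;
K_{\text{total}} = \frac{1}{0.1577 + 0.1852} \approx 2.92
\]
\[
p = 0.1,\; K_{\text{method}} = 3.17,\; K_{\text{rest}} = 2.7
\;\Longrightarrow\;
K_{\text{total}} = \frac{1}{0.0315 + 0.3333} \approx 2.74
\]
```

The values 3.17 and 2.7 are the LME factors quoted earlier; K_rest = 2.7 and the two values of p are assumptions. Under them, an archive that is half 3D data improves from 2.7 to about 2.92 overall, while one with only 10% 3D data barely moves.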

My Message

Compression is possible and saves resources

1. For archiving

2. When transferring data

Work initiated as a research cooperation between the DWD (German Weather Service) and SCAI.

Work done together with R. Iza-Teran, M. Rettenmeier.