Visualisation of Big Imaging Data

Visualisation of Big Imaging Data: Radio Astronomy Case Slava Kitaeff


Contents

•  Pawsey, ICRAR, computers and telescopes
•  Astronomy image formats and visualisation software
•  The era of Big Data in astronomy
•  JPEG2000 and JPIP
•  SkuareView – new astronomy remote visualisation framework and tool
•  Demo


Pawsey, ICRAR, computers and telescopes.

Pawsey Supercomputing Centre


Pawsey is the government-supported high-performance computing national facility (Perth, Western Australia) that supports researchers in Western Australia and across Australia by providing the infrastructure for computational research workflows. This includes
•  supercomputers
•  cloud computing
•  data storage
•  visualisation

HPC@Pawsey


Magnus

•  48 blades x 4 nodes x 2 CPUs (Intel Xeon E5-2690 V3 “Haswell” 2.6 GHz) x 12 cores = 35,712 cores
•  1.1 PetaFLOP
•  Interconnect - Cray Aries
•  Local storage – 3 PB Cray Sonexion 1600 Lustre appliance
•  #41 in TOP500 (November 2014)

HPC@Pawsey


Galaxy

•  118 compute blades, each of which has four nodes
•  Each node supports two 10-core Intel Xeon E5-2690 V2 “Ivy Bridge” processors operating at 3.00 GHz
•  Total of 9,440 cores
•  ~200 TeraFLOPS of compute power
•  Interconnect - Cray Aries

Zeus & Zythos

•  39 nodes in various configurations
•  Zythos is the largest node: SGI UV2000 system with 6 TB shared memory, 264 Intel Xeon processor cores and 4 NVIDIA K20 GPUs
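The Galaxy core count quoted above follows directly from the listed configuration. A minimal check, using only the figures on the slide (the variable names are illustrative):

```python
# Galaxy configuration as listed on the slide
blades = 118
nodes_per_blade = 4
cpus_per_node = 2
cores_per_cpu = 10

# Total cores = blades x nodes per blade x CPUs per node x cores per CPU
total_cores = blades * nodes_per_blade * cpus_per_node * cores_per_cpu
print(f"Galaxy: {total_cores:,} cores")
```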

NeCTAR

•  NeCTAR (National eResearch Collaboration Tools and Resources)
•  NeCTAR is an Australian Government project to build infrastructure specifically for the needs of Australian researchers
•  NeCTAR is a $47 million Australian Government project conducted as part of the Super Science initiative and financed by the Education Investment Fund
•  NeCTAR has built:
   –  New virtual laboratories
   –  A research cloud
   –  eResearch tools
   –  Hosting services

RDS/RDSI

Australian National Data Storage

In Perth: ~4 Petabytes of disk storage (GPFS), plus >35 Petabytes of DMF tape storage

ICRAR

•  The International Centre for Radio Astronomy Research is a collaborative centre that is international in scope and that achieves research excellence in astronomical science and engineering.

•  ICRAR is an equal joint venture between Curtin University and The University of Western Australia with funding support from the State Government of Western Australia


ASKAP/MWA/SKA

2013 Harley Wood Winter School


Astronomy Image Formats and Visualisation Software

Astronomy Visualisation

Astronomy datasets are n-dimensional
•  An electro-magnetic wave is described by Amp(RA, DEC, spectral/velocity/energy, polarization, time [phase])
•  Project an n-dimensional object on a 2-dimensional plane
•  Add other dimensions through other means
•  No other dimensions: projection of data, slices
•  Time/movies
•  Projection can also show combinations of dimensions, rotation of cubes, volume rendering/opaqueness
•  Collapse can be done in different ways, e.g. moment maps, peak flux maps, medians, etc.
•  Can be combined, e.g. brightness/hue
•  Contours, markers, vectors
•  Polarization is used e.g. in 3d-movies
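As an illustration of collapsing a cube along the spectral axis, here is a minimal sketch that computes a moment-0 map (integrated intensity), a moment-1 map (intensity-weighted mean velocity) and a peak flux map. The function name, the `cube[channel][row][col]` layout and the toy values are assumptions made for the example; plain Python lists keep it dependency-free, whereas real cubes would normally be handled with an array library.

```python
def collapse_cube(cube, velocities):
    """Collapse a cube indexed as cube[channel][row][col] along the spectral axis.

    Returns (moment0, moment1, peak) as 2-D lists.
    """
    n_rows, n_cols = len(cube[0]), len(cube[0][0])
    dv = velocities[1] - velocities[0]  # assumes uniform channel spacing
    moment0 = [[0.0] * n_cols for _ in range(n_rows)]
    moment1 = [[None] * n_cols for _ in range(n_rows)]
    peak = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            spectrum = [plane[r][c] for plane in cube]
            total = sum(spectrum)
            moment0[r][c] = total * dv      # integrated intensity
            peak[r][c] = max(spectrum)      # peak flux along the spectrum
            if total != 0:
                # intensity-weighted mean velocity
                weighted = sum(s * v for s, v in zip(spectrum, velocities))
                moment1[r][c] = weighted / total
    return moment0, moment1, peak


# Toy cube: 3 channels, 1 x 2 pixels (made-up values)
velocities = [100.0, 110.0, 120.0]  # km/s
cube = [
    [[0.0, 1.0]],
    [[2.0, 1.0]],
    [[0.0, 2.0]],
]
m0, m1, pk = collapse_cube(cube, velocities)
print("moment 0:", m0)
print("moment 1:", m1)
print("peak:    ", pk)
```

Each output is a 2-D image, which is what makes these collapses convenient for display on a plane.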


SAOImage


Aladin Sky Atlas - Lite


NRAO casaviewer


Astronomy Image Formats

•  Flexible Image Transport System (FITS)
•  CASA Measurement Set (and Image Tables)
•  HDF5 (LOFAR)
•  others

•  No PNG, JPG, TIFF etc, as they are poor in metadata handling
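To make the metadata point concrete: a FITS header is a sequence of 80-character "cards" of the form `KEYWORD = value / comment`. The sketch below parses a few such cards with the standard library only. It is a simplified illustration (string values containing `/` are not handled), and the card values are made up for the example.

```python
def parse_card(card):
    """Parse one 80-character FITS header card into (keyword, value, comment).

    Simplified sketch: string values containing '/' are not handled.
    """
    keyword = card[:8].strip()
    if card[8:10] != "= ":
        # Commentary card such as COMMENT, HISTORY or END
        return keyword, None, card[8:].strip()
    value_text, _, comment = card[10:].partition("/")
    value_text = value_text.strip()
    if value_text.startswith("'"):
        value = value_text.strip("'").rstrip()   # quoted string, padding removed
    elif value_text in ("T", "F"):
        value = value_text == "T"                # logical value
    else:
        try:
            value = int(value_text)
        except ValueError:
            value = float(value_text)
    return keyword, value, comment.strip()


cards = [
    "NAXIS   =                    3 / number of data axes",
    "CTYPE3  = 'FREQ    '           / spectral axis",
    "CRVAL3  =              9.4E+08 / reference frequency in Hz (made-up value)",
]
header = {}
for card in cards:
    key, value, _ = parse_card(card.ljust(80))
    header[key] = value
print(header)
```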


Spectral-imaging data-cube


•  Right Ascension
•  Declination
•  Velocity (frequency/wavelength)
•  Polarisation
•  Sky Model
•  Beam map


Neutral Hydrogen (HI) in the Universe


Cosmos HI Large Extragalactic Survey (CHILES)


VLA in B array, covering a redshift range from z=0 to z=0.45

Full data-cube is 500 GB


The Era of Big Data in Astronomy

SKA1 data sizes/volumes


•  CHILES cube: 0.5 TB
•  ASKAP DINGO cube: 1 TB

HDD capacity

Moore’s law for HDD

•  ~10 times every 5 years
•  10 TB HDD in today’s desktop
•  100 TB HDD/SSD/(?) by 2020

Another problem: Network speed

•  Moore’s law for network I/O
•  ~10 times every 6-10 years
•  1-10 Gb/s in desktop/server today
•  100 Gb/s by ~2025

[Figure: network capacity (Gb/s), log scale from 0.001 to 100, plotted against year, 1990 to 2025]
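The growth rules quoted for disks and networks are simple exponentials. A minimal sketch of the extrapolation, assuming steady tenfold growth over the stated period (the function names are illustrative, and the starting values are the ones on the slides):

```python
import math


def projected_capacity(base, years_ahead, tenfold_period_years):
    """Capacity after `years_ahead` years if it grows 10x every `tenfold_period_years`."""
    return base * 10 ** (years_ahead / tenfold_period_years)


def years_to_reach(base, target, tenfold_period_years):
    """Years needed to grow from `base` to `target` at the same rate."""
    return tenfold_period_years * math.log10(target / base)


# Disk: 10 TB today, 10x every 5 years
hdd_in_5_years = projected_capacity(10, 5, 5)

# Network: 10 Gb/s today, 10x every 6 to 10 years
net_years_fast = years_to_reach(10, 100, 6)
net_years_slow = years_to_reach(10, 100, 10)

print(f"HDD in 5 years: {hdd_in_5_years:.0f} TB")
print(f"100 Gb/s reached in {net_years_fast:.0f} to {net_years_slow:.0f} years")
```

Under these rules storage grows noticeably faster than the links used to move the data.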

Download of a 22 TB SKA data-cube
•  Today, at 1 Gb/s: ~60 hours
•  By 2020, at 10 Gb/s: ~6 hours

The data will likely have to stay in the archive, and we need to be able to work with it remotely.
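The download times can be reproduced with a short calculation. At full line rate a 22 TB cube takes about 49 hours over 1 Gb/s; the quoted ~60 hours is consistent with an effective throughput of roughly 80% of line rate. That efficiency figure is an assumption made for this sketch, not a number from the slides.

```python
def transfer_hours(size_tb, link_gbps, efficiency=1.0):
    """Hours needed to move `size_tb` decimal terabytes over a `link_gbps` link."""
    bits = size_tb * 1e12 * 8                         # terabytes to bits
    seconds = bits / (link_gbps * 1e9 * efficiency)   # effective bits per second
    return seconds / 3600


for gbps in (1, 10):
    ideal = transfer_hours(22, gbps)
    derated = transfer_hours(22, gbps, efficiency=0.8)  # assumed 80% efficiency
    print(f"{gbps:>2} Gb/s: {ideal:.1f} h at line rate, {derated:.1f} h at 80% efficiency")
```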


“Must-haves” to enable SKA-scale visualisation


•  Remote visualisation from archive or cluster

•  Multiple representations of data

•  Entirely different data organisations

“Must-haves” to enable visualisation

•  Multiple resolutions without penalties

•  Lossless & lossy compression to save bandwidth

•  Streaming progressively instead of cutting out

•  Comprehensive support for metadata


Current formats and frameworks can’t do it!


One of the few alternatives


JPEG2000 & JPIP technology

JPEG2000 encoding


JPIP – interaction protocol


Part 9 of the JPEG2000 standard
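In JPIP the client describes the view it wants (a region at a given resolution) and the server returns only the data needed for it. The sketch below builds such a request as an HTTP query string using the `target`, `fsiz`, `roff`, `rsiz`, `layers` and `type` request fields. The server address, file name and helper function are hypothetical, and a real client would also manage sessions and caching, which are left out here.

```python
from urllib.parse import urlencode


def jpip_request_url(server, target, frame_size, region_offset, region_size, layers=None):
    """Build a JPIP view-window request URL (illustrative helper, not a library API)."""
    params = {
        "target": target,                                      # image on the server
        "fsiz": f"{frame_size[0]},{frame_size[1]}",            # full-frame size at the wanted resolution
        "roff": f"{region_offset[0]},{region_offset[1]}",      # top-left corner of the region
        "rsiz": f"{region_size[0]},{region_size[1]}",          # width and height of the region
        "type": "jpp-stream",                                  # requested return type
    }
    if layers is not None:
        params["layers"] = str(layers)                         # limit the number of quality layers
    return f"{server}?{urlencode(params, safe=',')}"


url = jpip_request_url(
    "http://jpip.example.org/jpip",  # hypothetical server
    "cube.jpx",
    frame_size=(1024, 1024),
    region_offset=(256, 256),
    region_size=(512, 512),
    layers=3,
)
print(url)
```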

Distributed client-server architecture

[Diagram components: SkuareView Client (x2), Proxy, SkuareView Server (x2), JPX component (x2), JPX (x2), JPX merger]

JPEG 2000 Key Benefits

•  Superior compression performance (CDF 5/3 for lossless and CDF 9/7 for lossy compression) at low computational requirements.
•  Availability of multi-component transforms including arbitrary wavelet transforms, arbitrary linear transforms (e.g., KLT, block-wise KLT, etc.) with both reversible and irreversible versions.
•  Superior compression efficiency and graceful degradation (no blocking artifacts, visually lossless compression)
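The reason the 5/3 wavelet can be used for lossless coding is that it can be computed with integer lifting steps that are exactly invertible. Below is a minimal one-level, one-dimensional sketch, assuming an even-length integer signal and symmetric extension at the boundaries. It illustrates the principle only and is not the codec's actual implementation.

```python
def forward_53(x):
    """One level of the reversible 5/3 lifting transform on an even-length list of ints."""
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch handles even-length signals only"
    half = n // 2
    # Predict step: detail = odd sample minus floor of the mean of its even neighbours
    d = []
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]   # symmetric extension
        d.append(x[2 * i + 1] - (left + right) // 2)
    # Update step: smooth = even sample plus a rounded quarter of the neighbouring details
    s = []
    for i in range(half):
        prev = d[i - 1] if i > 0 else d[0]                     # symmetric extension
        s.append(x[2 * i] + (prev + d[i] + 2) // 4)
    return s, d


def inverse_53(s, d):
    """Undo forward_53 exactly by reversing the lifting steps."""
    half = len(s)
    n = 2 * half
    x = [0] * n
    for i in range(half):                                      # undo the update step
        prev = d[i - 1] if i > 0 else d[0]
        x[2 * i] = s[i] - (prev + d[i] + 2) // 4
    for i in range(half):                                      # undo the predict step
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = d[i] + (left + right) // 2
    return x


signal = [10, 12, 14, 13, 9, 7, 8, 8]
smooth, detail = forward_53(signal)
restored = inverse_53(smooth, detail)
print("lossless round trip:", restored == signal)
```

Because every step uses only integer additions and floor divisions, the inverse recovers the input bit for bit; the smooth half is also a half-resolution version of the signal.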

http://www.aware.com/biometrics

100:1 JPEG 2000 100:1 JPEG

AstroHPC’12, June 19, 2012, Delft, The Netherlands

PERFORMANCE AND EFFICIENCY

JPEG 2000 Key Benefits

SCALABILITY: MULTIPLE VERSIONS OUT OF A SINGLE COMPRESSED IMAGE

•  Multiple fidelity/resolution representation.
•  Progressive transmission/recovery by fidelity or resolution.
•  Several mechanisms to support spatial random access to image regions at varying degrees of granularity.
•  Easy proxy generation.
•  Bandwidth optimization and adaptive transmission (only what’s needed)
•  Different parts of the same image can be stored using different quality (e.g. ROI at highest quality).
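Multiple resolutions come from the wavelet decomposition itself: each decomposition level halves the image dimensions, so a single compressed file can be decoded at any of these sizes. A minimal sketch of the available sizes, assuming an image anchored at the canvas origin (the function name and image dimensions are illustrative):

```python
def resolution_levels(width, height, decomposition_levels):
    """Image size at each resolution, from full size down to the coarsest level."""
    sizes = []
    for r in range(decomposition_levels + 1):
        scale = 1 << r
        # Integer ceiling division: dimensions round up at each halving
        sizes.append(((width + scale - 1) // scale, (height + scale - 1) // scale))
    return sizes


for level, (w, h) in enumerate(resolution_levels(1001, 600, 3)):
    print(f"reduced by 2^{level}: {w} x {h}")
```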


[Figure label: low quality area]

JPEG 2000 Key Benefits

FORMAT AND ACCESS

•  Support of volumetric image cubes through JP3D and 3D volumetric compression.


Bruylants et al, 2007


JPEG 2000 Key Benefits


POWERFUL METADATA SUPPORT

•  Store existing metadata headers
   –  FITS
   –  WCS
•  Provenance
•  Cataloguing
   –  Supports complex geometries
   –  Comments/labels/links to other files

Almost any data can be compressed

Lossless

•  FITS – 16.97 MB
•  JPEG2000 – 1.68 MB
•  Ratio 1:10

•  FITS – 6.9 MB
•  JPEG2000 – 2.3 MB
•  Ratio 1:3
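The quoted ratios follow from the file sizes. A minimal check using the two size pairs above (the labels "image 1" and "image 2" are placeholders, since the slides do not name the images):

```python
def compression_ratio(original_mb, compressed_mb):
    """How many times smaller the compressed file is than the original."""
    return original_mb / compressed_mb


examples = [("image 1", 16.97, 1.68), ("image 2", 6.9, 2.3)]
for name, fits_mb, jp2_mb in examples:
    ratio = compression_ratio(fits_mb, jp2_mb)
    saved = 100 * (1 - jp2_mb / fits_mb)   # percentage of the original size saved
    print(f"{name}: 1:{ratio:.1f}, {saved:.0f}% smaller")
```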

Almost any data can be compressed

Lossy

•  Most could be compressed lossily to at least a 1:20 ratio with no visually noticeable degradation
•  Ratios in the 1:100s can be achieved with adaptive quality

[Figure panels: Lossy (target PSNR = 44.5 dB), ratio 1:20; Original]
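PSNR, the quality target used for the lossy example, compares the mean squared error of the decoded data with the square of a peak value. A minimal sketch with made-up 8-bit samples follows; for floating-point astronomical data the choice of peak (for instance the data range) is an assumption that has to be stated alongside the number.

```python
import math


def psnr(original, decoded, peak):
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf              # identical data, i.e. lossless
    return 10 * math.log10(peak ** 2 / mse)


original = [0, 50, 100, 150, 200, 255]   # made-up samples
decoded = [1, 49, 101, 149, 201, 254]    # each sample off by one
print(f"{psnr(original, decoded, peak=255):.1f} dB")
```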

Almost any data can be compressed

Cosmological simulations

What’s the damage if lossy?


•  ~1:10 – no difference for given precision (<0.1%, at quantisation step 10^-4)
•  ~1:20 – visually lossless
•  great benefits for the source finding
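The effect of a quantisation step can be bounded directly: with uniform quantisation, no value moves by more than half a step. The sketch below checks this for a step of 10^-4. The sample values are made up, and the slides do not say how the step relates to the data units, so treat this as an illustration of the bound rather than of the survey data.

```python
def quantise(values, step):
    """Uniform quantisation: map each value to the index of the nearest step."""
    return [round(v / step) for v in values]


def dequantise(indices, step):
    """Map quantisation indices back to values."""
    return [i * step for i in indices]


step = 1e-4
values = [0.123456, 0.5, 0.999949, 0.00003]   # made-up samples of order unity
restored = dequantise(quantise(values, step), step)
max_error = max(abs(a - b) for a, b in zip(values, restored))
print(f"max error {max_error:.1e}, bound {step / 2:.1e}")
```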


SkuareView – New Astronomy Remote Visualisation Framework and Tool

Data Reduction Pipeline

[Diagram: data reduction pipeline]

Stages: Raw data from antennas, Channelization (PFB), Beam-forming, Correlation, Calibration, Imaging, Cleaning

Products: Spectral-imaging data-cube, Polarization Map, Continuum Map, Catalogues

Cluster/Cloud processes (each with MS, FITS and JPX):
•  Process #1: 940–944 MHz
•  Processes #2…119: 4 MHz chunks
•  Process #120: 1416–1420 MHz

Combined output: Cube.jpx
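The split into 120 processes follows from dividing the 940 to 1420 MHz band into 4 MHz chunks. A minimal sketch of that bookkeeping (the function name is illustrative):

```python
def frequency_chunks(start_mhz, stop_mhz, width_mhz):
    """Split [start, stop] into consecutive chunks: (process number, low MHz, high MHz)."""
    count = (stop_mhz - start_mhz) // width_mhz
    return [
        (i + 1, start_mhz + i * width_mhz, start_mhz + (i + 1) * width_mhz)
        for i in range(count)
    ]


chunks = frequency_chunks(940, 1420, 4)
print(len(chunks), "chunks")
print("first:", chunks[0])
print("last: ", chunks[-1])
```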

SkuareView implements

[Diagram components: SkuareView Client (x2), Proxy, SkuareView Server (x2), JPX component (x2), JPX (x2), JPX merger]


Interactive

CHILES talk-fest 19/04/2016 | JT Malarecki

Demo

The data is in AWS (US, Oregon)
1) MWA GLEAM: rgb_map_hp_trim.jpx (167MB, raw data ~769M)
2) CHILES in AWS (500 GB as FITS, ~120GB as JPX)
   Data cubes: 120 chunks (4MHz, 256 channels)
   Full data cube: cube.jpx (virtually joined chunks, 25088 channels; 5 chunks are being reprocessed)


SkuareView Framework


Astronomy Data Services at Pawsey
