Visualisation of Big Imaging Data
slava-kitaeff-phd -
Contents
• Pawsey, ICRAR, computers and telescopes
• Astronomy image formats and visualisation software
• The era of Big Data in astronomy
• JPEG2000 and JPIP
• SkuareView: new astronomy remote visualisation framework and tool
• Demo
Pawsey Supercomputing Centre
Pawsey is the government-supported high-performance computing national facility (Perth, Western Australia) that supports researchers in Western Australia and across Australia by providing the infrastructure for computational research workflows. This includes:
• supercomputers
• cloud computing
• data storage
• visualisation
HPC@Pawsey
Magnus
• 48 blades x 4 nodes x 2 CPUs (Intel Xeon E5-2690 V3 “Haswell”, 2.6 GHz) x 12 cores = 35,712 cores
• 1.1 PetaFLOP
• Interconnect: Cray Aries
• Local storage: 3 PB Cray Sonexion 1600 Lustre appliance
• #41 in TOP500 (November 2014)
HPC@Pawsey
Galaxy
• 118 compute blades, each of which has four nodes
• Each node supports two 10-core Intel Xeon E5-2690 V2 “Ivy Bridge” processors operating at 3.00 GHz
• Total of 9,440 cores
• ~200 TeraFLOPS of compute power
• Interconnect: Cray Aries
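The quoted core counts and FLOPS figures can be cross-checked with a rough peak-performance estimate. The sketch below is illustrative only: the double-precision FLOPs per cycle per core (8 for "Ivy Bridge", 16 for "Haswell") are assumptions based on the CPU generation, not numbers taken from the slides.

```python
# Theoretical peak = cores x clock (GHz) x double-precision FLOPs per cycle per core.
# ASSUMPTION: 8 FLOPs/cycle for "Ivy Bridge" (AVX), 16 for "Haswell" (AVX2 with FMA).

def peak_tflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Return the theoretical peak performance in TeraFLOPS."""
    return cores * clock_ghz * flops_per_cycle / 1000.0

# Galaxy: 118 blades x 4 nodes x 2 CPUs x 10 cores, as listed on the slide.
galaxy_cores = 118 * 4 * 2 * 10
galaxy_peak = peak_tflops(galaxy_cores, 3.0, 8)

# Magnus: total core count as quoted on the slide.
magnus_cores = 35_712
magnus_peak = peak_tflops(magnus_cores, 2.6, 16)

print(f"Galaxy: {galaxy_cores} cores, ~{galaxy_peak:.0f} TFLOPS peak")
print(f"Magnus: {magnus_cores} cores, ~{magnus_peak:.0f} TFLOPS peak")
```

Under these assumptions Galaxy comes out at about 227 TFLOPS, in line with the quoted ~200 TeraFLOPS, and Magnus at about 1,486 TFLOPS. The quoted 1.1 PetaFLOP for Magnus is lower, which would be consistent with a measured rather than a theoretical figure.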
Zeus & Zythos
• 39 nodes in various configurations
• Zythos is the largest node: an SGI UV2000 system with 6 TB shared memory, 264 Intel Xeon processor cores and 4 NVIDIA K20 GPUs.
NeCTAR
• NeCTAR (National eResearch Collaboration Tools and Resources)
• NeCTAR is an Australian Government project to build infrastructure specifically for the needs of Australian researchers
• NeCTAR is a $47 million Australian Government project conducted as part of the Super Science initiative and financed by the Education Investment Fund
• NeCTAR has built:
– New virtual laboratories
– A research cloud
– eResearch tools
– Hosting services
RDS/RDSI
Australian National Data Storage
In Perth: ~4 Petabytes of disk storage (GPFS), plus >35 Petabytes of DMF tape storage
ICRAR
• The International Centre for Radio Astronomy Research is a collaborative centre that is international in scope and that achieves research excellence in astronomical science and engineering.
• ICRAR is an equal joint venture between Curtin University and The University of Western Australia with funding support from the State Government of Western Australia
Astronomy Visualisation
Astronomy datasets are n-dimensional
• An electromagnetic wave is described by Amp(RA, DEC, spectral/velocity/energy, polarization, time [phase])
• Project an n-dimensional object onto a 2-dimensional plane
• Add other dimensions through other means
• No other dimensions: projection of data, slices
• Time/movies
• Projection can also show combinations of dimensions, rotation of cubes, volume rendering/opacity
• Collapse can be done in different ways, e.g. moment maps, peak flux maps, medians, etc.
• Can be combined, e.g. brightness/hue
• Contours, markers, vectors
• Polarization is used e.g. in 3D movies
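As an illustration of collapsing a cube along its spectral axis, here is a minimal pure-Python sketch of a moment-0 map (integrated intensity), a moment-1 map (intensity-weighted mean velocity) and a peak flux map. The cube layout `cube[channel][y][x]`, the uniform channel spacing and the toy values are assumptions made for the example; real tools would work on arrays read from FITS or similar.

```python
def collapse(cube, velocities):
    """Collapse cube[channel][y][x] along the spectral axis.

    Returns (moment0, moment1, peak) as 2-D lists. The channel width is
    taken from the first two velocities (uniform spacing is assumed).
    """
    dv = abs(velocities[1] - velocities[0])
    ny, nx = len(cube[0]), len(cube[0][0])
    moment0 = [[0.0] * nx for _ in range(ny)]
    moment1 = [[None] * nx for _ in range(ny)]  # None where there is no signal
    peak = [[0.0] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            spectrum = [plane[y][x] for plane in cube]
            total = sum(spectrum)
            moment0[y][x] = total * dv          # integrated intensity
            peak[y][x] = max(spectrum)          # peak flux along the spectrum
            if total != 0:
                weighted = sum(s * v for s, v in zip(spectrum, velocities))
                moment1[y][x] = weighted / total  # intensity-weighted velocity
    return moment0, moment1, peak


# Toy cube: 3 channels, 1 row, 2 pixels.
velocities = [100.0, 110.0, 120.0]
cube = [[[0.0, 1.0]], [[2.0, 1.0]], [[0.0, 1.0]]]
m0, m1, pk = collapse(cube, velocities)
print(m0, m1, pk)
```

Each output map is a 2-D image, so the third dimension has been folded into a single displayable plane.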
Astronomy Image Formats
• Flexible Image Transport System (FITS)
• CASA Measurement Set (and Image Tables)
• HDF5 (LOFAR)
• others
• No PNG, JPG, TIFF etc., as they are poor in metadata handling
Spectral-imaging data-cube
• Right Ascension
• Declination
• Velocity (frequency/wavelength)
• Polarisation
• Sky Model
• Beam map
Cosmos HI Large Extragalactic Survey (CHILES)
VLA in B array, covering a redshift range from z=0 to z=0.45
Full data-cube is 500 GB
HDD capacity
Moore’s law for HDD
• ~10 times every 5 years
• 10 TB HDD in today’s desktop
• 100 TB HDD/SSD/(?) by 2020
Another problem: Network speed
• Moore’s law for network I/O
• ~10 times every 6-10 years
• 1-10 Gb/s in desktop/server today
• 100 Gb/s by ~2025
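The two growth rules can be compared with a small extrapolation. This is a sketch under stated assumptions: "today" is taken to be 2015, and the starting points are 10 TB for disk and 10 Gb/s for the network, the upper ends of the figures quoted above.

```python
def project(value, start_year, target_year, years_per_tenfold):
    """Extrapolate a quantity that grows 10x every `years_per_tenfold` years."""
    return value * 10 ** ((target_year - start_year) / years_per_tenfold)


BASE_YEAR = 2015  # ASSUMPTION: the year meant by "today" on the slides

hdd_2020 = project(10, BASE_YEAR, 2020, 5)    # TB, 10x every 5 years
net_fast = project(10, BASE_YEAR, 2025, 6)    # Gb/s, optimistic: 10x every 6 years
net_slow = project(10, BASE_YEAR, 2025, 10)   # Gb/s, pessimistic: 10x every 10 years

print(f"HDD in 2020: ~{hdd_2020:.0f} TB")
print(f"Network in 2025: ~{net_slow:.0f} to ~{net_fast:.0f} Gb/s")
```

Over the same ten years storage would grow 100-fold while the network grows between 10-fold and roughly 46-fold, which is why data volume outpaces the ability to move it.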
[Figure: network capacity (Gb/s), log scale from 0.001 to 100, plotted against year from 1990 to 2025]
Download of a 22 TB SKA data-cube
• Today, at 1 Gb/s: ~60 hours
• By 2020, at 10 Gb/s: ~6 hours
The data will likely have to stay in the archive, and we need to be able to work with it remotely.
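The download times can be reproduced with a back-of-the-envelope calculation. In this sketch 1 TB is taken as 10^12 bytes, and the 80% link efficiency is an assumption introduced here to account for protocol overhead; the slides do not say how the ~60 hours was derived.

```python
def transfer_hours(size_tb: float, rate_gbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move `size_tb` terabytes (10**12 bytes each) over a
    link of `rate_gbps` gigabits per second at the given link efficiency."""
    bits = size_tb * 1e12 * 8
    seconds = bits / (rate_gbps * 1e9 * efficiency)
    return seconds / 3600


ideal_1g = transfer_hours(22, 1)
ideal_10g = transfer_hours(22, 10)
real_1g = transfer_hours(22, 1, efficiency=0.8)    # ASSUMED 80% efficiency
real_10g = transfer_hours(22, 10, efficiency=0.8)

print(f"1 Gb/s:  {ideal_1g:.1f} h ideal, {real_1g:.1f} h at 80% efficiency")
print(f"10 Gb/s: {ideal_10g:.1f} h ideal, {real_10g:.1f} h at 80% efficiency")
```

The ideal figures are about 49 and 4.9 hours; with the assumed 80% efficiency they become about 61 and 6.1 hours, close to the ~60 and ~6 hours quoted.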
“Must-haves” to enable SKA-scale visualisation
• Remote visualisation from archive or cluster
• Multiple representations of data
• Entirely different data organisations
“Must-haves” to enable visualisation
• Multiple resolutions without penalties
• Lossless and lossy compression to save bandwidth
• Progressive streaming instead of cutting out
• Comprehensive support for metadata
Distributed client-server architecture
[Diagram: SkuareView Clients, SkuareView Servers (each with a JPX component and a JPX file), a Proxy, and a JPX merger]
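To give a feel for what a client in such an architecture might send, here is a sketch of a JPIP view-window request built as an HTTP query string. The field names `target`, `fsiz`, `roff`, `rsiz`, `layers` and `type` are standard JPIP request fields; the host, path and file name are hypothetical, and this is not SkuareView's actual request code.

```python
from urllib.parse import urlencode


def jpip_window_request(base_url, target, full_size, offset, size, layers=None):
    """Build a JPIP request URL for a rectangular region of an image.

    full_size: (width, height) of the resolution the client wants to work at
    offset:    (x, y) of the region's top-left corner at that resolution
    size:      (width, height) of the region at that resolution
    layers:    optional cap on the number of quality layers returned
    """
    params = {
        "target": target,
        "fsiz": f"{full_size[0]},{full_size[1]}",
        "roff": f"{offset[0]},{offset[1]}",
        "rsiz": f"{size[0]},{size[1]}",
        "type": "jpp-stream",
    }
    if layers is not None:
        params["layers"] = str(layers)
    # Keep commas readable instead of percent-encoding them.
    return base_url + "?" + urlencode(params, safe=",")


# Hypothetical server and file name, for illustration only.
url = jpip_window_request(
    "http://jpip.example.org/jpip", "cube.jpx",
    full_size=(1024, 1024), offset=(256, 256), size=(512, 512), layers=4,
)
print(url)
```

The point of the design is that the client describes the window it wants to look at, and the server decides which parts of the compressed file are needed to serve it.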
JPEG 2000 Key Benefits
PERFORMANCE AND EFFICIENCY
• Superior compression performance (CDF 5/3 for lossless and CDF 9/7 for lossy compression) at low computational requirements.
• Availability of multi-component transforms, including arbitrary wavelet transforms and arbitrary linear transforms (e.g. KLT, block-wise KLT), with both reversible and irreversible versions.
• Superior compression efficiency and graceful degradation (no blocking artifacts, visually lossless compression).
[Figure: the same image at 100:1 JPEG 2000 and at 100:1 JPEG. Source: http://www.aware.com/biometrics]
AstroHPC’12, June 19, 2012, Delft, The Netherlands
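The reason the 5/3 wavelet can be used for lossless compression is that it can be computed entirely in integers using lifting steps that are exactly invertible. The following is a minimal one-dimensional, single-level sketch of that idea, assuming an even-length integer signal and mirrored boundaries; it is not a JPEG 2000 codec, and a real encoder adds 2-D processing, multiple levels and entropy coding.

```python
def forward_53(x):
    """One level of the reversible 5/3 lifting transform on an even-length
    list of ints. Returns (lowpass, highpass)."""
    n = len(x)
    assert n >= 2 and n % 2 == 0, "this sketch handles even lengths only"
    half = n // 2
    # Predict: detail = odd sample minus the floor-average of its even neighbours.
    high = []
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]  # mirror at right edge
        high.append(x[2 * i + 1] - (left + right) // 2)
    # Update: smooth the even samples using the neighbouring details.
    low = []
    for i in range(half):
        prev = high[i - 1] if i > 0 else high[0]  # mirror at left edge
        low.append(x[2 * i] + (prev + high[i] + 2) // 4)
    return low, high


def inverse_53(low, high):
    """Undo forward_53 exactly by running the lifting steps in reverse."""
    half = len(low)
    n = 2 * half
    x = [0] * n
    for i in range(half):  # undo the update step to recover even samples
        prev = high[i - 1] if i > 0 else high[0]
        x[2 * i] = low[i] - (prev + high[i] + 2) // 4
    for i in range(half):  # undo the predict step to recover odd samples
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = high[i] + (left + right) // 2
    return x


signal = [10, 12, 15, 11, 8, 8, 9, 40]
low, high = forward_53(signal)
restored = inverse_53(low, high)
print(low, high, restored == signal)
```

The lowpass half is a half-resolution version of the signal, which is also what makes multi-resolution access cheap: a decoder can stop after the lowpass data if that resolution is all it needs.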
JPEG 2000 Key Benefits
SCALABILITY: MULTIPLE VERSIONS OUT OF A SINGLE COMPRESSED IMAGE
• Multiple fidelity/resolution representation.
• Progressive transmission/recovery by fidelity or resolution.
• Several mechanisms to support spatial random access to image regions at varying degrees of granularity.
• Easy proxy generation.
• Bandwidth optimization and adaptive transmission (only what’s needed).
• Different parts of the same image can be stored using different quality (e.g. ROI at highest quality).
[Figure annotation: LOW QUALITY AREA]
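Resolution scalability follows from the wavelet decomposition: each decomposition level halves the image dimensions, so a single compressed file offers a whole pyramid of sizes. The sketch below lists those sizes and picks the coarsest one that still covers a viewport. The function names and the example numbers are hypothetical, chosen for illustration.

```python
def resolution_pyramid(width, height, levels):
    """Image sizes available from a codestream with `levels` wavelet
    decompositions, from full resolution down to the coarsest."""
    sizes = []
    for r in range(levels + 1):
        scale = 1 << r
        # Ceiling division: discarding r levels divides each dimension by 2**r.
        sizes.append((-(-width // scale), -(-height // scale)))
    return sizes


def pick_level(width, height, levels, view_w, view_h):
    """Return the number of levels to discard so that the decoded image is
    the smallest one still at least as large as the viewport."""
    best = 0  # fall back to full resolution
    for r, (w, h) in enumerate(resolution_pyramid(width, height, levels)):
        if w >= view_w and h >= view_h:
            best = r
    return best


pyramid = resolution_pyramid(4096, 4096, 5)
level = pick_level(4096, 4096, 5, 800, 600)
print(pyramid)
print("discard levels:", level, "->", pyramid[level])
```

For an 800 x 600 viewport on a 4096 x 4096 image, only the 1024 x 1024 representation needs to be transmitted, about one sixteenth of the full-resolution pixels.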
JPEG 2000 Key Benefits
FORMAT AND ACCESS
• Support of volumetric image cubes through JP3D and 3D volumetric compression.
Bruylants et al, 2007
JPEG 2000 Key Benefits
Powerful metadata support
• Store existing metadata headers
– FITS
– WCS
• Provenance
• Cataloguing
– Supports complex geometries
– Comments/labels/links to other files
Almost any data can be compressed
Lossless
• FITS: 16.97 MB; JPEG2000: 1.68 MB; ratio 1:10
• FITS: 6.9 MB; JPEG2000: 2.3 MB; ratio 1:3
Almost any data can be compressed
Lossy
• Most data could be compressed lossily to at least a 1:20 ratio with no visually noticeable degradation
• Ratios in the 1:100s can be achieved with adaptive quality
[Figure: lossy (target PSNR = 44.5 dB), ratio 1:20, compared with the original]
What’s the damage if lossy?
• ~1:10: no difference for the given precision (<0.1%, at quantisation step 10^-4)
• ~1:20: visually lossless
• Great benefits for source finding
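One common way to quantify the damage from lossy compression is the peak signal-to-noise ratio (PSNR), the measure used for the 44.5 dB target quoted earlier. Below is a minimal sketch of the usual definition, PSNR = 10 log10(peak^2 / MSE). The choice of `peak` for floating-point astronomy data is an assumption here (the maximum of the data range), and the sample values are made up.

```python
import math


def psnr(original, decoded, peak):
    """Peak signal-to-noise ratio in dB between two equal-length sequences.

    `peak` is the maximum possible signal value (ASSUMED here to be the
    top of the data range, since the slides do not define it).
    """
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical data: lossless
    return 10 * math.log10(peak ** 2 / mse)


# Made-up pixel values, for illustration only.
original = [0.0, 0.5, 1.0, 0.25]
decoded = [0.0, 0.51, 0.99, 0.25]
value = psnr(original, decoded, peak=1.0)
print(f"PSNR = {value:.2f} dB")
```

A higher PSNR means smaller errors; a lossless round trip gives an infinite PSNR because the mean squared error is zero.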
Data Reduction Pipeline
Raw data from antennas → Channelization (PFB) → Beam-forming → Correlation → Calibration → Imaging → Cleaning
Products:
• Spectral-imaging data-cube
• Polarization Map
• Continuum Map
• Catalogues
Cluster/Cloud:
• Process #1: 940–944 MHz; MS → FITS → JPX
• Processes #2…119: 4 MHz chunks; MS → FITS → JPX
• Process #120: 1416–1420 MHz; MS → FITS → JPX
All JPX chunks are joined into Cube.jpx
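The split of the band into per-process chunks can be written down directly. This sketch only generates the frequency ranges; the start frequency, chunk width and process count are those shown in the diagram, while the function name is hypothetical and the conversion steps themselves (MS to FITS to JPX) are not modelled.

```python
def chunk_ranges(start_mhz=940, chunk_mhz=4, n_chunks=120):
    """Return (process_number, low_mhz, high_mhz) for each chunk of the band."""
    return [
        (i + 1, start_mhz + i * chunk_mhz, start_mhz + (i + 1) * chunk_mhz)
        for i in range(n_chunks)
    ]


chunks = chunk_ranges()
total_bandwidth = chunks[-1][2] - chunks[0][1]

print(chunks[0])    # first process
print(chunks[-1])   # last process
print("total bandwidth:", total_bandwidth, "MHz")
```

Because each chunk is independent, the 120 processes can run in parallel on a cluster or cloud, and the resulting JPX files are then joined into a single cube.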
SkuareView implements
[Diagram: the distributed client-server architecture: SkuareView Clients, SkuareView Servers (each with a JPX component and a JPX file), a Proxy, and a JPX merger]
Demo
The data is in AWS (US, Oregon)
1) MWA GLEAM: rgb_map_hp_trim.jpx (167 MB, raw data ~769 MB)
2) CHILES in AWS (500 GB as FITS, ~120 GB as JPX)
• Data cubes: 120 chunks (4 MHz, 256 channels)
• Full data cube: cube.jpx (virtually joined chunks, 25088 channels; 5 chunks are being reprocessed)
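For reference, the compression ratios implied by the sizes quoted in this talk can be computed as follows. The sizes are the ones stated on the slides; the dictionary labels are descriptive names chosen here.

```python
def compression_ratio(original_size, compressed_size):
    """Return how many times smaller the compressed data is."""
    return original_size / compressed_size


# (original, compressed) in matching units, as quoted on the slides.
datasets = {
    "lossless example 1 (MB)": (16.97, 1.68),
    "lossless example 2 (MB)": (6.9, 2.3),
    "MWA GLEAM demo (MB)": (769, 167),
    "CHILES demo (GB)": (500, 120),
}

ratios = {name: compression_ratio(o, c) for name, (o, c) in datasets.items()}
for name, r in ratios.items():
    print(f"{name}: 1:{r:.1f}")
```

The two demo datasets both come out between 1:4 and 1:5.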