Changing our understanding of the Universe · Changing our understanding of the Universe with the...

25

Transcript of Changing our understanding of the Universe · Changing our understanding of the Universe with the...

Changing our understanding of the Universe

with the SKA

Cradle of life

Cosmology

Cosmic dawn

Galaxy evolution

Cosmic magnetism

Fundamental physics

Transient sky

Courtesy: Loeb

Courtesy: Calcada Courtesy: Miville-Deschenes

Courtesy: SKAO

Courtesy: SKAO

CERN OpenStack DayMay 27, 2019 3

At the origin of the SKA concept:

neutral hydrogen (HI)

Light from galaxies:

visible & radio

Courtesy: Duc & Renaud

Courtesy: SKAO

CERN OpenStack DayMay 27, 2019 4

Galaxy evolution

TodaySKA

Sta

r fo

rmation

rate

Look back time (Gyr)

Courtesy: Volonteri

5

Cosmic DawnCourtesy: ESA

Courtesy: NASA

Courtesy: AAO

6

Fundamental physics

CERN OpenStack DayMay 27, 2019Courtesy: SKAO

Our Galaxy: an « instrument » to

detect gravitational waves!

CERN OpenStack DayMay 27, 2019 7

Kramer & Stappers 2015

Miller & Yunes 2019

8

Transients

CERN OpenStack DayMay 27, 2019

Courtesy: 1M2H/UC Santa Cruz and Carnegie

Observatories/Ryan Foley

Courtesy: Hallinan et al. 2017Relatively slow variability

Find (mostly) in images

Courtesy: R. Fender

Relatively slow variability

Find (mostly) in pulsar mode

(see after)

CERN OpenStack DayMay 27, 2019 9

Why with the SKA?

Today with the JVLA Tomorrow with SKA1

> 25 GHz

50 MHz

Dishes

Mid-Frequency

Aperture Arrays

Low-Frequency

Aperture Arrays

Courtesy: SKAO

CERN OpenStack DayMay 27, 2019 10

SKA Phase 1 (SKA1)SKA1-LOW

(AUS)

131,000 log

periodic

antennas

SKA1-MID

(ZA)

197 dishes

(15m)

Courtesy: SKAO

50 MHz 350 MHz 15 GHz

• Non-imaging (Tide Array Beams)

• Imaging

CERN OpenStack DayMay 27, 2019 12

Aperture*Synthesis*

V(u,v)*can*be*measured*on*a*discrete*number*of*points.*A*good*image*quality*requires*a*good*coverage*of*the*uv*plane.*We*can*use*the*earth*rotaHon*to*increase*the*uv*coverage*

Andrea*Isella***::****CASA*Radio*Analysis*Workshop****::****Caltech,**January*19,*2012*

SKA Observatory data products – data rates

• Image cubes (2 spatial dimensions, plus

radio spectral frequency, polarization)

– Each can be huge, typically minutes-to-hours integrated together

– High speed image plane searches

• Deep-cube: per 6 hours integration, O(50k x 50k) pixels, 50k channels,

4 polarisations: 5 Petabytes. 1.85 Tbits/s (20 x 100gbit/s links)

• Image plane searching: per 1 second, O(5k x 5k) pixels, 10 channels, 1 polarisation: each cube 25 Gbytes, 200 gbit/s

Observing modes

Courtesy: Condon & Ransom,

van Haarlem et al. 2013; ATNF &

SUT

CERN OpenStack DayMay 27, 2019 13

The SKA data challenge: a schematic view

SKA1-MID

~2 Pb/s

8.8 Tb/s

7.2 Tb/s

50 PFLOPS

250 PFLOPS

2 x 5 Tb/s

350 PB/telescope/yr

(could be a lot, lot, lot more)

Beam-

forming

Imaging &

Science

Pulsar search & Correlation

SKA1-LOW

CERN OpenStack DayMay 27, 2019 14

SKA Regional Centers

(SRC)

• SKA observatory responsible for all data products up to and including level 6

– Assume in baseline

• Regional Science Centres responsible for level 7

– Assume not in baseline

• SKA Observatory responsible for data distribution system

Enhanced data products e.g. Source identification and association

Validated science data products (released by Science Teams)

Calibrated data, images and catalogues

Visibility data

Correlator output

Beam-former output

ADC outputs

7

6

5

4

3

2

1

ST

ST

SKA

SKA

SKA

SKA

SKA

DefinitionLevel Responsibility

AENEAS 8 INFRASUPP-03-2016

Figure 3: Illustration of the proposed federated network of SKA Regional Centres (SRCs) distributed

around the world. These SRCs will provide access to the accumulated SKA science data for

communities in different regions and globally. The proposed European Science Data Centre (ESDC)

would serve as the European hub in such a network and support the European SKA community.

A European Science Data Centre for the SKA

The SKA Organisation (SKAO) is expected to adopt a tiered model for data and science support similar

to that employed for other successful large infrastructures in particular CERN. Storage and computing

resources associated with the operational SKA Observatory itself are expected to be highly constrained

in order to keep up with SKA operations. Any further processing and subsequent science extraction by

users will require significant, outside computing and storage resources in the form of SKA Regional

Centres. In this model, SKA Regional Centres will play a role analogous to CERN’s Tier 1 sites and

provide sufficient resources to store subsets of the SKA archive, support significant processing and post-

processing capability, and further distribute data to users and smaller Tier 2 sites. The specific

capabilities required by the SKAO of affiliated SRCs are still being defined; however, based on the

science drivers of the SKA project, we can anticipate a well informed model for the functions that a

regional centre must support.

In this context, SKA Regional Centres will be a vital resource to enable the community to take maximal

advantage of the scientific potential of the SKA. Moreover, within the tiered SKA operational model

currently being considered, the SRCs will provide essential functionality which is not currently

provisioned within the directly operated SKAO facilities. Therefore, SRCs will form an intrinsic part of

SKA operations and be the working interface for most scientists using the SKA (see Figure 4). As such,

national investments in a distributed SRC across Europe could represent a significant contribution to

SKA operations.

As the primary interfaces for extracting science, the ultimate success of the SKA will be directly

coupled to the capabilities of these SRCs. Establishing a large-scale, distributed European Science Data

Centre (ESDC) for SKA research represents an important opportunity to provide the astronomy

community with the scale of computational infrastructure necessary to maximally exploit the scientific

potential of the SKA. Within Europe, a joint effort provides the opportunity to utilize existing

infrastructure in a uniform way, coordinate engagement with both European and national ICT

This proposal version was submitted by Rob VAN DER MEER on 30/03/2016 16:52:22 Brussels Local Time. Issued by the Participant Portal Submission Service.

Courtesy: H2020 AENEAS; T.

Cornwell

CERN OpenStack DayMay 27, 2019 15

The SDP Data Challenge

Low MidLow Mid

Data Ingest Rate

(GBytes/second) ~707 ~775

Hot Data Buffer

(Petabytes) ~38 ~32

Hot Buffer per Node

(Terabytes) ~21 ~20

Processing Power

(PetaFLOPS) ~125 ~125Morganoshell - Own work, CC BY-SA 4.0

CERN OpenStack DayMay 27, 2019 16

Meeting the Challenge

• Performance & Scalability

• Modifiability & Maintainability

• Availability

• Reliability

• Usability

• Affordability

• Testability

• Portability

CERN OpenStack DayMay 27, 2019 17

The SDP

CERN OpenStack DayMay 27, 2019 18

Scheduling ObservationsThe SDP is controlled by the Telescope Manager

• Observations to be conducted arrive as scheduling block instances

• Processing Blocks define a workflow and its parameters

• Execution Engines are the processing platforms for different workflows

Monitoring of hardware, software and workflow metrics is generated by the SDP

• Workflow may be steered, adjusted or abandoned based on real-time workflow telemetry data

Transient events could require rapid reconfiguration of the telescope and SDP processing blocks

CERN OpenStack DayMay 27, 2019 19

Data Parallelism and Decomposition

Divide and conquer:

• Frequency

• Time

• Space

CERN OpenStack DayMay 27, 2019 20

The Data Island

• Data locality for a workflow

• Isolation brings performance guarantees

• Provisioned dynamically, according to workflow needs

• Fully-provisioned non-blocking network

• POSIX namespace likely

Workflow

(Buffer) Data Islands

Execution Engines

Internal Distribution

CERN OpenStack DayMay 27, 2019 21

ALaSKA - à la SKA:

SKA Performance Prototyping

CERN OpenStack DayMay 27, 2019 22

ALaSKA - Performance Prototype Platform

CERN OpenStack DayMay 27, 2019 23

Kayobe: A Universe from Nothing

• Infrastructure management with Ironic

• BIOS configuration with Ansible

• Network configuration with Ansible

• Containerised with Kolla

• Configured with Kolla-Ansible

• Orchestrated with Python and Ansible

• Optimised for Research Computing

• Open source: OpenStack big tent

• Infrastructure as code

• https://kayobe.readthedocs.io/

CERN OpenStack DayMay 27, 2019 24

The Scientific OpenStack Book:

● Written with help from the OpenStack

Scientific SIG

● Current best practice for OpenStack and

HPC

● Six subject overviews with case studies

contributed by WG members

https://www.openstack.org/science/

OpenStack Scientific SIG