IDIA Pipelines - University of Cape Town
IDIA Pipelines
Bradley Frank (SARAO / IDIA)
Srikrishna Sekhar, Jordan Collier, David Aikema, Russ Taylor, Sourabh Paul
IDIA Pipelines
• Initiated in 2017 in response to the IDIA Call for Projects.
• Umbrella Programme for Science Data Processing at IDIA.
• Data transport and management (with SARAO).
• Provision of Software (via Singularity Containers).
• Design, use and testing of Virtual Machine infrastructures.
• Fat Nodes vs Clusters.
• Prototype Astronomer User Models.
General User Model
[Diagram: a Head Node, Worker Nodes and a Fat Node, linked by Control and Data paths. Distributed Storage (BeeGFS / CEPH) is attached via a High-Speed Network Mount. Software: Singularity Containers. Workflow/Resource on VHPC: SLURM.]
MeerKAT Data
• Level 0: Raw (32k, 0.5s dump).
• 8-hr full-pol full-res dataset ~ 20TB.
• Difficult to move: can we do pre-processing?
• HPC CASA at the CHPC.
• Level 1: Calibrated/Imaged at KAPB.
• SDP Pipeline: Corrected Data, Flags + Calibration Tables.
• Potentially smaller volume/averaged?
• Initially meant for QA purposes.
• Can we use L1 for science?
• DQA.
The Data Flow
[Diagram: MeerKAT -> Correlator / SDP Cal (L0) (Site/Karoo) -> Archive (L0+L1) (CHPC/Cape Town) -> Data Transfer Node(s) (CHPC/Cape Town) -> IDIA*. Link speeds shown: 100Gb/s and 10/40/100 Gb/s.]
*Or any other 3rd-Party Data Centre (in theory).
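To put the link speeds in the data flow next to the ~20TB dataset size quoted on the MeerKAT Data slide, here is a minimal back-of-the-envelope sketch. It assumes decimal terabytes and an idealised link; the `efficiency` parameter is a hypothetical knob for protocol and storage overheads, not a measured value.

```python
# Rough transfer-time estimate for one ~20 TB MeerKAT dataset.
# Assumptions: 1 TB = 1e12 bytes, and the link is the only bottleneck.

DATASET_TB = 20                   # ~20 TB for an 8-hr full-pol full-res dataset
LINK_SPEEDS_GBPS = [10, 40, 100]  # link speeds shown in the data-flow diagram


def transfer_hours(size_tb, link_gbps, efficiency=1.0):
    """Hours needed to move size_tb terabytes over a link_gbps link.

    efficiency is the assumed fraction of line rate actually achieved
    (1.0 means a perfect link).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600.0


if __name__ == "__main__":
    for speed in LINK_SPEEDS_GBPS:
        hours = transfer_hours(DATASET_TB, speed)
        print(f"{speed:>3} Gb/s: {hours:.2f} h at line rate")
```

At line rate this gives roughly 4.4 hours at 10 Gb/s and under half an hour at 100 Gb/s; real transfers will be slower once disk and protocol overheads are included.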
Data Transfer: Current State
• AOD Informs PI that Data is Ready
• PI instructs AOD to trigger push to DTN
• AOD/Dev confirms arrival of data on DTN.
• AOD/Dev contacts IDIA to initiate pull.
• Data arrives.
• PI is informed.
Data Transfer: Current State
• IDIA/SARAO Data Transfer Node.
• Raw data scraped off S3 database (Rados + NPY Array).
• Converted to MS -> DTN.
• DTN push request initiated.
• GridFTP Transfer queued (managed by FTS).
• Received at IDIA cluster.
Data Transfer: In Development
• PI checks archive interface (via VPN) for data.
• IDIA-affiliated PI can Push-To-IDIA (it's a button).
• MS data is transferred directly to the appropriate IDIA directory.
• Transfer progress can be monitored on the archive dashboard.
Data Transfer: In Development with SARAO
Screenshot Courtesy of Chris Schollar
Data Quality Assurance
• Jordan Collier, in close collaboration with MeerKAT SDP.
• Framework to measure quality of pipelines.
• Science context (LSP-based) to be included in standard SDP Cal Report.
• SDP Cal Pipeline adjusted for science output.
• Mapping science requirements from LSP to technical requirements for pipelines.
IDIA Pipelines
• processMeerKAT Pipeline.
• Package for processing on HPC (SLURM + ILIFU Cluster).
• To be generalised for use on PBS/Torque controlled system.
• Robust, generic, fast implementation of a priori calibration (including flagging).
• General purpose Selfcal.
• Aim: T(cal) ~ T(obs)
• Framework.
• Best practices on how to use SLURM and MPICASA.
• Developer’s Guide.
• How to write and include your own modules in the pipeline.
IDIA Pipelines
• Algorithms written using CASA.
• MOU with NRAO.
• Most radio astronomers are familiar with CASA and MSs.
• Heterogeneous application: Single Node/Single Thread, Single-Node/Multi-Threaded (OMP), Multi Node/Multi-Threaded (OMP+MPI).
• Many pipelines already use CASA for flagging and a priori calibration (gain/bandpass).
• Management:
• Run MPICASA using SLURM (srun), which in turn runs the appropriate container image.
• Sidesteps SQL thread-unsafe quirks.
• Keeps MPI and software quarantined (as recommended by LBL).
• Just Python and Bash (SBATCH).
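As an illustration of the "just Python and Bash (SBATCH)" approach, the following sketch renders a minimal batch script in which `srun` launches CASA inside a Singularity container. This is not the processMeerKAT implementation: the function name `render_sbatch`, the container path, and the default job name are hypothetical, and whether CASA's MPI layer picks up tasks launched this way depends on how MPI is set up inside the container.

```python
import shlex


def render_sbatch(script, container, nodes=1, tasks_per_node=1, job_name="casa_job"):
    """Return the text of a minimal SLURM batch script (illustrative only).

    script    -- CASA Python script to execute (hypothetical name)
    container -- path to the Singularity image (hypothetical path)
    """
    # CASA options taken from the slides; -c runs a script non-interactively.
    casa_cmd = f"casa --nologger --log2term --nogui -c {shlex.quote(script)}"
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={tasks_per_node}",
        "",
        # srun starts the tasks; each task runs CASA inside the container.
        f"srun singularity exec {shlex.quote(container)} {casa_cmd}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_sbatch("myscript.py", "/path/to/container.simg",
                        nodes=2, tasks_per_node=8))
```

Generating the script from Python keeps the resource request and the CASA invocation in one place, so the two can be kept consistent when job parameters change.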
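Anatomy of an MPICASA call:
mpicasa -hostfile hostnames /path/to/casa --someoptions commands
• mpicasa: wraps mpirun.
• hostnames: the hostfile, one host per line:
localhost slots=20
10.0.0.1 slots=30
10.0.0.2 slots=40
• /path/to/casa: the Executable.
• --someoptions: --nologger --log2term --nogui
• commands: myscript.py, containing task calls such as some_task(arg1='blah', arg2=123, arg4='whatever')
• Result: orted * 20 on localhost, orted * 30 on 10.0.0.1, orted * 40 on 10.0.0.2.
[Diagram labels: LBL, CASA]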
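The hostfile and command line from the MPICASA diagram can be assembled programmatically. The sketch below is a minimal illustration using the host names and slot counts shown on the slide; the helper names `hostfile_text` and `mpicasa_command` are hypothetical, and passing the script with `-c` is an assumption about how the "commands" part is supplied.

```python
def hostfile_text(slots_by_host):
    """Render an MPI hostfile: one 'host slots=N' line per host."""
    return "".join(f"{host} slots={slots}\n" for host, slots in slots_by_host.items())


def mpicasa_command(hostfile_path, casa_path, script):
    """Build the argument list for an mpicasa call (not executed here)."""
    return [
        "mpicasa", "-hostfile", hostfile_path,
        casa_path, "--nologger", "--log2term", "--nogui",
        "-c", script,  # assumption: the script is passed via CASA's -c option
    ]


if __name__ == "__main__":
    # Host names and slot counts as shown in the diagram.
    slots = {"localhost": 20, "10.0.0.1": 30, "10.0.0.2": 40}
    print(hostfile_text(slots), end="")
    print(" ".join(mpicasa_command("hostnames", "/path/to/casa", "myscript.py")))
    print("total slots:", sum(slots.values()))
```

The command is returned as a list so that it can be handed to `subprocess.run` without shell quoting issues, should one wish to launch it directly.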
Some Results
55-dish/856MHz
Almost Noise Limited
Status
• COSMOS 55-dish, ~8-hr, 150MHz
• T(crosscal) = 0.5 T(obs), T(crosscal) + T(image) ~ T(obs) (Not Optimised!!)
• Currently matching SLURM and MPICASA parallelism.
• Not all tasks are parallelised the same.
• Partition (IO), TCLEAN (CPU), Flagdata (RAM).
• Bandpass and Gaincal (Not parallelised).
• Given an input MS and operations, decide on robust job parameters.
• Kicking off low-freq selfcal with AP from high frequency (works better than expected).
• Selfcal Recipes.
• 2 in dev (iterative masking).
• Continuum subtraction: efficacy and performance.
• UVLIN vs UVMODEL vs UVLIN+UVMODEL.
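The point about matching SLURM and MPICASA parallelism to each task can be sketched as a small lookup table. The limiting resource per task and the "not parallelised" status come from the Status slide; everything else (the `TaskProfile` class, the `mpi_tasks_for` helper, and the deliberately naive rule of one task for non-parallelised steps) is a hypothetical illustration, not the pipeline's actual logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskProfile:
    bottleneck: Optional[str]  # limiting resource named on the slide, if any
    parallelised: bool         # whether the task benefits from MPI tasks


# Per-task behaviour as listed on the Status slide.
PROFILES = {
    "partition": TaskProfile("IO", True),
    "tclean": TaskProfile("CPU", True),
    "flagdata": TaskProfile("RAM", True),
    "bandpass": TaskProfile(None, False),
    "gaincal": TaskProfile(None, False),
}


def mpi_tasks_for(task, max_tasks):
    """Naive rule: parallelised tasks get max_tasks, all others get 1."""
    try:
        profile = PROFILES[task.lower()]
    except KeyError:
        raise ValueError(f"no profile for task {task!r}") from None
    return max_tasks if profile.parallelised else 1


if __name__ == "__main__":
    for name in PROFILES:
        print(name, PROFILES[name].bottleneck, mpi_tasks_for(name, 16))
```

A real scheduler-facing rule would also need the MS size and node memory to pick robust parameters; this table only captures the distinction the slide draws between tasks.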
Moving Forward
• processMeerKAT currently under performance testing.
• Public IDIA release soon.
• A priori calibration by the end of 2018?
• SLURM/MPICASA User Guide: released soon thereafter.
• Developer's Guide: early 2019.
• Selfcal: Feb 2019 (planned).