CernVM Program of Work 2021
Jakob Blomer for the CernVM Team
SFT Meeting, 22 February 2021
Infrastructure: WLCG
Stratum 0/1
WLCG squid
Software, containers, auxiliary data for HEP, LIGO, EUCLID, LSST, EESSI, and many others
Available in the default configuration: ∼1.4B files, ∼125 repositories
CERN Stratum 0s fully on Ceph S3
Code Works
• Among the top 1.5% of active open-source projects
• Steady 50–100 commits per month
• ∼30 000 LOC changed in 2020
Review of 2020
Highlights
• Scaling up CernVM-FS container hub: 800+ images on /cvmfs/unpacked.cern.ch
  • Container runtime support: containerd/k8s, podman [GSoC contribution]
  • Major improvements to container conversion service (DUCC)
• New and improved publishing workflows
  • Template transactions: ultra-fast, meta-data only publishing
  • Ephemeral writable shell: technical foundation to publish from anywhere
  • Fine-grained publisher monitoring
• Performance improvements: parallelized garbage collection and storage gateway services
  → now fast enough to publish all LHCb nightlies (1.5k packages, >10M files) by the start of the working day
  • Commissioning of the gateway services for LHCb nightlies
• Experimental support for Microsoft Azure blob storage [Microsoft contribution]
• CernVM 5 prototype, EL8 based
• Infrastructure modernization: web presence, CI pipeline, VM & storage replacement
• Dissemination: pre-GDB, EGI webinar, EGI Clinic, IPDPS’20 (with U Notre Dame), CernVM virtual workshop 1–2 February 2021 with 99 registered participants
Unfinished Tasks
• Container conversion status REST API & dashboard
• Shared, external cache manager for multi-container host
• Client pre-caching (due to reduced summer student programme)
• In progress:
  • Transition of publishing code to new libcvmfs_server
  • Connecting ephemeral writable shell to gateway services
Platform Support Commissioned in 2020
• A Platforms:
  • EL 7–8
  • Ubuntu 16.04, 18.04, 20.04 (new)
• B Platforms:
  • macOS 10.15, 11 Big Sur (M1 + Intel) (new)
  • SLES 11–12
  • Fedora, latest two versions
  • Debian 8–10
  • EL7 AArch64
  • IA32 architecture
  • Linux on Windows via WSL-2 (new)
  • Client packaged as a container (new), for container-only Linux distros such as Atomic Host
Highlights: New Web Site
https://cernvm.cern.ch
• Moved from Drupal to Jekyll
• Modern look, responsive design
Highlights: New Monitoring Site
Repository monitor
• Hosted on CERN OpenShift
• Based on JavaScript and libcvmfs
• Easy to add your repository: add repository metadata and submit pull request
• JSON API
Highlights: CernVM Forum
https://cernvm-forum.cern.ch
• Discourse forum for CernVM-FS and the CernVM appliance
• Many nice features: searchable, questions can be marked as resolved, . . .
• Supposed to eventually replace mailing lists
Highlights: JSROOT Powered Fine-Grained Publish Monitoring
CVMFS_UPLOAD_STATS_DB=true
Demo
• Statistics are generated with ROOT
• Uploaded as static files to Stratum 0 storage
• Interactive plots (JavaScript / JSROOT)
Highlights: Template Transactions
/cvmfs/sw.cvmfs.io
  amd64-gcc9
    4.2
      ChangeLog
      ...
    4.2-patches  (template clone of 4.2)
      ChangeLog
      ...
cvmfs_server transaction \
  -T /amd64-gcc9/4.2=/amd64-gcc9/4.2-patches \
  sw.cvmfs.io
• As part of opening the transaction, “4.2” is cloned to “4.2-patches”
• Meta-data only copy, thus extremely fast: observed 50 kHz file publish rate
• Only changes on top need to be published
Used in fast container image ingestion
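To illustrate why a meta-data only copy is cheap, here is a minimal Python sketch of a toy catalog that maps paths to content hashes. The `Catalog` class, its `clone_subtree` method and the hash values are hypothetical and chosen for illustration only; they are not the CernVM-FS catalog format or API. Cloning a subtree only adds catalog entries that point at already-stored content, so no file data is read or written.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Toy file catalog (hypothetical): maps a path to the hash of its content."""
    entries: dict[str, str] = field(default_factory=dict)

    def clone_subtree(self, src: str, dst: str) -> int:
        """Duplicate all entries below src under dst; file content is not touched."""
        src = src.rstrip("/")
        dst = dst.rstrip("/")
        cloned = {
            dst + path[len(src):]: digest
            for path, digest in self.entries.items()
            if path == src or path.startswith(src + "/")
        }
        self.entries.update(cloned)
        return len(cloned)


catalog = Catalog({
    "/amd64-gcc9/4.2/ChangeLog": "hash-a",
    "/amd64-gcc9/4.2/bin/tool": "hash-b",
})

# Clone "4.2" to "4.2-patches": two new catalog entries, no new content objects.
n_cloned = catalog.clone_subtree("/amd64-gcc9/4.2", "/amd64-gcc9/4.2-patches")

# Only the change on top of the clone introduces new content.
catalog.entries["/amd64-gcc9/4.2-patches/ChangeLog"] = "hash-c"
```

In this toy model the clone cost depends only on the number of entries, not on file sizes, which is the property that the slide attributes to template transactions.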
Highlights: Ephemeral Publish Shell
• A new command, cvmfs_server enter, creates a sub-shell with a writable /cvmfs
• Uses internally user namespaces and fuse-overlayfs
• Works unprivileged on any modern Linux (e.g. EL8) that can mount the client
• Could eventually be used to publish directly from any node to a gateway; however, the 2.8 release provides only an ephemeral writable shell as a first step
$ cvmfs_server enter hsf.cvmfs.io
... Opens a shell with write access to /cvmfs/hsf.cvmfs.io
$ cvmfs_server diff --worktree
... Close shell, back to read-only mode
Solves the main technology challenge of moving away from a dedicated publisher node, i.e. publish from anywhere!
Highlights: Infrastructure Modernization
• Migration of 26 OpenStack VMs (builders, web services, etc.); campaign triggered by hypervisor decommissioning. We would prefer automatic migrations in the future
• Migration of ∼1 TB project storage from NFS to Ceph-FS and Ceph S3
• Migration of ∼15 build & test jobs to new Jenkins server
• Commissioning of GitHub pull request builder: allows us to fully test changes before merging
There has been an exceptional amount of infrastructure work in 2020. We count on this work being amortized over the coming years.
CernVM / CernVM-FS Program of Work 2021
Developer Power
Name               Role    2020    2021
Jakob Blomer       Staff   50%     50%
TBS                Staff   -       50%
Simone Mosciatti   Fellow  100%    25%
Jan Priessnitz     Tech    60%     -
Andrea Valenzuela  Tech    33%     66%
TBS                Tech    -       33%
FTE                        ∼2.4    ∼2.25
Significant contributors: Mohit Tyagi (GSoC student), Enrico Bocchi (IT-ST), Dave Dykstra (FNAL)
CernVM Calendar
Release timeline:
12/19  v2.7
02/20  2.7.1
04/20  2.7.2
07/20  2.7.3
09/20  2.7.4
10/20  2.7.5
02/21  v2.8, CernVM 4.5
02/21  CernVM ’21
03/21  2.8.1
Q4/21  v2.9
Q2/22  CernVM ’22 (NIKHEF)
Point releases (2.7.x, 2.8.1): bugfix releases
Consolidation & Improvements
Ongoing effort to consolidate CernVM-FS developments in a single repository, e.g. gateway services and containerd plugin scheduled for merging
CernVM Appliance Plan of Work for 2021
1. Ready to use platform for LHC experiment production and development
2. Reference platform for long-term data preservation
• 10 000+ booted VMs / day
• 45% of all ATLAS simulation jobs in 2020 ran at point 1 on CernVM!
• 𝜇CernVM bootloader + reference containers covering EL 4–7
• Interactive support: cernvm-launch and cernvm-online.cern.ch
2021 Plan of Work
• Maintenance updates for CernVM 4 [est 1 FTW]
• Migration of cernvm-online.cern.ch to new single sign-on system [est 1 FTW]
• Stretch goal: CernVM 5 pre-production release [est 1 FTM]
CernVM-FS Plan of Work for 2021
1. Maintenance and support
2. Consolidation tasks
3. Seamless container image ingestion
4. Kubernetes-native publisher (in collaboration with SPI)
5. Client performance improvements for very large applications (e.g. TensorFlow)
Maintenance and Support
Significant maintenance and support load
Key figures from 2020:
• 450 mails on support mailing lists
• 40 bug fixes merged
Consolidation Tasks
• Addressing open issues: bugfix sprint [est 1 FTM]
• Addressing known shortcomings of the gateway services
  • Trigger garbage collection from remote publishers [est 1 FTW]
  • Use template transaction from remote publishers [est 1 FTW]
  • Transaction wait queue to prevent concurrent publishers from starvation [est 1 FTW]
  • Full repository tagging support [est 1 FTW]
  • Rebase gateway receiver on new libcvmfs_server [est 1 FTW]
• Source code repository consolidation [est 1 FTW]
• New platforms: SLES15 (for HPC), Debian 11 [est 1 FTW]
• macOS binary signatures [est 1 FTW]
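As an illustration of the transaction wait queue item listed above, the following Python sketch shows one way to hand out a single transaction lease in arrival order, so that no publisher can be overtaken indefinitely. The `TransactionQueue` class and the publisher names are hypothetical; this is a sketch of first-come-first-served hand-over under stated assumptions, not the gateway services' design.

```python
from collections import deque


class TransactionQueue:
    """Hypothetical FIFO queue that grants one transaction lease at a time."""

    def __init__(self):
        self._waiting = deque()  # publishers waiting for the lease, oldest first
        self.holder = None       # publisher currently holding the lease

    def request(self, publisher):
        """Grant the lease immediately if it is free, otherwise enqueue the publisher."""
        if self.holder is None and not self._waiting:
            self.holder = publisher
            return True
        self._waiting.append(publisher)
        return False

    def release(self):
        """Pass the lease to the longest-waiting publisher and return the new holder."""
        self.holder = self._waiting.popleft() if self._waiting else None
        return self.holder


queue = TransactionQueue()
granted = [queue.request(p) for p in ("publisher-a", "publisher-b", "publisher-c")]
handover = [queue.release(), queue.release(), queue.release()]
```

Because waiting publishers are served strictly in arrival order, a publisher that retries later can never jump ahead of one that has been waiting longer, which is the starvation-freedom property the task description asks for.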
Future-proofing: Next-Generation Server Code
Legacy Code
A set of tools targeted for a dedicated release manager machine, and the interactive workflow open transaction + copy + commit

New Architecture
A common base library providing repository transformation primitives, on top of which higher-level publish abstractions can be built
  CLI, GW receiver, REST API, · · ·
  libcvmfs_server: commit changeset, GC, tag management, . . .
  PUT/GET storage abstraction

Initial CLI commands ported to libcvmfs_server: info, diff, transaction, enter.
Foundation for future maintainability and other consolidation tasks (e.g. gateway services)
Plan for 2021: port complete publish workflow to libcvmfs_server, including transaction abort & commit, tagging, garbage collection [est 2 FTM]
Seamless Container Image Ingestion
Approach
Users develop containers with the standard tools and services (GitLab, Docker Hub, etc.).
For their large-scale deployment, we want to automatically ingest them in /cvmfs/unpacked.cern.ch

Container Publishing
• Based on working prototype, commission web-hook connection from standard registry to CernVM-FS [est 2 FTW]
• Based on working prototype, integrate fast merging of image layers [est 2 FTW]
• Dashboard and status API: display current activity, list of hosted images, etc. [est 1 FTM]
• Develop standard benchmark for publish throughput to assess supported scale of user container ingestion [est 1 FTM, summer student]

Container Engine Integration
Engine       Type    CernVM-FS Support
singularity  flat    native
docker       layers  graph driver [1]
containerd   layers  remote snapshotter
podman       layers  extra image store
[1] Expected to be replaced by containerd remote snapshotter

Review and improve documentation, examples, integration tests for different deployment options [est 2 FTW]
Usability Milestones
• Implement publishing to gateway services from ephemeral writable shell; relies on libcvmfs_server consolidation tasks [est 1 FTM]
• Based on the ephemeral publish container (see before), demonstrate a Kubernetes-native publish workflow in collaboration with SPI [est 1 FTM]
• Implement a client pre-caching mechanism to improve cold-cache start-up performance of very large applications (e.g. TensorFlow); design ready, planned as GSoC project [est 2 FTM]
• Stretch goal: shared, external cache manager for multi-container host [est 1 FTM]
• Stretch goal: restart activity on CernVM-FS Conveyor (see backup slides)
Community Interaction
• Developers and operators meet in a monthly coordination call (no changes for 2021)
• Weekly operations coffee with IT-SM (no changes for 2021)
• New CernVM forum supposed to take over from mailing lists
• Mattermost becoming an important information exchange between developers and power users
• Two publications in preparation for vCHEP 2021
  • A CernVM-FS powered container hub (with IT-ST)
  • Performance engineering LHCb nightly builds publishing (with LHCb, IT-ST)
• Frontiers in Big Data publication in preparation on containerised analysis workflows with Kubernetes (with CMS)
• Conferences and workshops on the radar: experiment computing weeks, GDB, HEPiX, ACAT
• Stretch goal: repository content manager training course for software librarians [est 2 FTW]
Summary
Outlook and Goals for 2021
Main Priorities for 2021
1. Consolidation and exploitation of the CernVM-FS new services and features
2. Improve usability and scale of CernVM-FS based container deployments
3. Demonstration of a Kubernetes-native publishing workflow (with SPI)
The team successfully addressed a number of technology challenges in the last 12–18 months, in particular CernVM-FS integration with the container ecosystem, unprivileged client deployments (crucial for HPC access) and containerized publishing. In 2021, the new developments will undergo a phase of consolidation and hardening.
Backup Slides
Stretch Goal: CernVM-FS Conveyor
A high-level abstraction of writing based on interdependent publication jobs.
Current approach
$ ssh cvmfs-sft.cern.ch
$ cvmfs_server transaction sft.cern.ch /lcg/ROOT
$ tar -xf ROOT-6.18.tar.gz
$ post-install.sh
$ cvmfs_server publish

{
  "repository": "sft.cern.ch",
  "path": "/lcg/ROOT",
  "payload": "https://root.cern.ch/ROOT-6.18.tar.gz",
  "script": "https://spi.cern.ch/post-install.sh",
  "uuid": "e7b67a2...",
  "dependencies": ["f61d...", "a00e...", "..."]
}
• Send jobs to Conveyor API
• Conveyor distributes work to multiple publisher nodes
Goal: liberate CI pipeline from handling cvmfs_server intrinsics.
Prototype available, est 1–2 months to develop into a first usable version in collaboration with SPI
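The "dependencies" field of the job description implies that interdependent publication jobs must run in a dependency-respecting order. The following Python sketch shows how such an order could be derived with a topological sort from the standard library. The job UUIDs, the second and third job, and the `publication_order` helper are hypothetical; only the field names follow the job description above, and this is not the Conveyor prototype's scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs; field names follow the job description, values are made up.
jobs = [
    {"uuid": "job-root", "repository": "sft.cern.ch", "path": "/lcg/ROOT",
     "dependencies": ["job-gcc"]},
    {"uuid": "job-gcc", "repository": "sft.cern.ch", "path": "/lcg/gcc",
     "dependencies": []},
    {"uuid": "job-geant4", "repository": "sft.cern.ch", "path": "/lcg/Geant4",
     "dependencies": ["job-root", "job-gcc"]},
]


def publication_order(jobs):
    """Return job UUIDs so that every job appears after all of its dependencies."""
    graph = {job["uuid"]: set(job["dependencies"]) for job in jobs}
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(graph).static_order())


order = publication_order(jobs)
```

A scheduler that distributes work to several publisher nodes could use the same graph incrementally through `TopologicalSorter.get_ready()` and `done()`, dispatching every job whose dependencies have already been published.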