glideinWMS Architecture - glideinWMS Training Jan 2012
![Page 1: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/1.jpg)
UCSD Jan 17th 2012 glideinWMS architecture 1
glideinWMS training @ UCSD
glideinWMS architecture
by Igor Sfiligoi (UCSD)
![Page 2: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/2.jpg)
Outline
● A high level overview of the glideinWMS
● Description of the components
![Page 3: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/3.jpg)
glideinWMS from 10k feet
![Page 4: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/4.jpg)
Refresher - Condor
● A Condor pool is composed of 3 pieces
[Diagram: central manager (Collector, Negotiator), submit nodes (Schedd) and execution nodes (Startd, Job).]
![Page 5: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/5.jpg)
What is a glidein?
● A glidein is just a properly configured execution node submitted as a Grid job
[Diagram: the same Condor pool (central manager, submit nodes), with glideins acting as the execution nodes.]
![Page 6: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/6.jpg)
What is glideinWMS?
● glideinWMS is an automated tool for submitting glideins on demand
[Diagram: the same Condor pool, with glideinWMS submitting glideins through Globus and CREAM; the glideins act as execution nodes.]
![Page 7: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/7.jpg)
glideinWMS architecture
● glideinWMS has 3 logical pieces
[Diagram: Factory node (Condor, Factory); Frontend node (Frontend); Globus and CREAM; worker node (glidein_startup, Startd); submit nodes and central manager (Frontend domain); glideins as execution nodes. Arrow labels: Monitor Condor, Request glideins, Submit glideins, Configure Condor, Match. Also labelled: G.N.]
![Page 8: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/8.jpg)
glideinWMS architecture
● glideinWMS has 3 logical pieces
  ● glidein_startup – Configures and starts Condor execution daemons (runtime environment discovery and validation)
  ● Factory – Knows about the sites and does the submission (Grid knowledge and troubleshooting)
  ● Frontend – Knows about user jobs and requests glideins (site selection logic and job monitoring)
![Page 9: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/9.jpg)
Cardinality
● N-to-M relationship
  ● Each Frontend can talk to many Factories
  ● Each Factory may serve many Frontends
[Diagram: two VO Frontends, each with its own Schedd, Collector and Negotiator, and two Glidein Factories; Startds run the user jobs.]
![Page 10: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/10.jpg)
Many operators
● Factory and Frontend are usually operated by different people
● Frontends are VO specific
  ● Operated by VO admins
  ● Each sets policies for its users
● Factories are generic
  ● Do not need to be affiliated with any group
  ● Factory ops' main task is Grid monitoring and troubleshooting
![Page 11: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/11.jpg)
A (sort of) detailed view of glidein_startup
![Page 12: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/12.jpg)
Refresher – glideinWMS arch.
● glidein_startup configures and starts Condor
[Diagram: the architecture overview from slide 7, repeated.]
![Page 13: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/13.jpg)
glidein_startup tasks
● Validate node (environment)
● Download Condor binaries
● Configure Condor
● Start Condor daemon(s)
● Collect post-mortem monitoring info
● Cleanup
Performed by plugins
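The task sequence above can be sketched as a small driver that runs a list of plugins in order and always performs cleanup. This is a minimal Python sketch, not the actual glidein_startup script; `run_glidein`, the `Plugin` type, the shared state dictionary and the step functions are all hypothetical names used for illustration.

```python
from typing import Callable, Dict, List

# A plugin is any callable that receives the shared glidein state and may update it.
# The name `Plugin` and the state dictionary are assumptions made for this sketch.
Plugin = Callable[[Dict[str, object]], None]


def run_glidein(plugins: List[Plugin], cleanup: Plugin) -> Dict[str, object]:
    """Run plugins in order; stop at the first failure, but always clean up."""
    state: Dict[str, object] = {"log": [], "failed": None}
    try:
        for plugin in plugins:
            plugin(state)
    except Exception as err:  # e.g. a failing validation aborts the remaining steps
        state["failed"] = str(err)
    finally:
        cleanup(state)
    return state


# Placeholder steps that only record that they ran.
def validate_node(state):
    state["log"].append("validate")


def download_condor(state):
    state["log"].append("download")


def configure_condor(state):
    state["log"].append("configure")


def start_condor(state):
    state["log"].append("start")


def collect_monitoring(state):
    state["log"].append("monitoring")


def cleanup(state):
    state["log"].append("cleanup")


STEPS = [validate_node, download_condor, configure_condor, start_condor, collect_monitoring]
result = run_glidein(STEPS, cleanup)
```

Keeping cleanup outside the plugin list means it still runs when an earlier step, such as node validation, raises an error.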
![Page 14: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/14.jpg)
glidein_startup plugins
● Config files and scripts loaded via HTTP
  ● From both the factory and the frontend Web servers
  ● Can use local Web proxy (e.g. Squid)
  ● Mechanism tamper proof and cache coherent
[Diagram: glidein_startup reaches the HTTPd on the Factory node and the HTTPd on the Frontend node through Squid, then starts the Startd. Steps shown inside glidein_startup:]
● Load files from factory Web
● Load files from frontend Web
● Run executables
● Start Condor
● Cleanup
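One way a download mechanism can be made tamper proof is to compare each fetched file against a checksum taken from a trusted list. The slides do not say how glidein_startup achieves this, so the following is only a sketch under that assumption; `fetch_verified`, the `fetch` callable, the example URL and the choice of SHA-256 are all hypothetical.

```python
import hashlib
from typing import Callable


def fetch_verified(url: str, expected_sha256: str, fetch: Callable[[str], bytes]) -> bytes:
    """Fetch url with the given fetch function and reject content whose hash differs."""
    data = fetch(url)
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"checksum mismatch for {url}")
    return data


# Stand-in for an HTTP GET (possibly through a Squid proxy); no network is used here.
FAKE_WEB = {"http://factory.example/setup.sh": b"echo validate node\n"}


def fake_fetch(url: str) -> bytes:
    return FAKE_WEB[url]


good_hash = hashlib.sha256(FAKE_WEB["http://factory.example/setup.sh"]).hexdigest()
script = fetch_verified("http://factory.example/setup.sh", good_hash, fake_fetch)
```

Because the check depends only on the content, a copy served from a cache passes only if it is byte-identical to the expected version.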
![Page 15: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/15.jpg)
glidein_startup scripts
● Standard plugins
  ● Basic Grid node validation (certs, disk space, etc.)
  ● Setup Condor (glexec, CCB, etc.)
● VO provided plugins
  ● Optional, but can be anything
  ● CMS@UCSD checks for CMS SW
● Factory admin can also provide them
● Details about the plugins can be found at http://tinyurl.com/glideinWMS/doc.prd/factory/custom_scripts.html
![Page 16: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/16.jpg)
A (sort of) detailed view of the glidein factory
![Page 17: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/17.jpg)
Refresher – glideinWMS arch.
● The factory knows about the grid and submits glideins
[Diagram: the architecture overview from slide 7, repeated.]
![Page 18: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/18.jpg)
Glidein factory
● Glidein factory knows how to contact sites
  ● List in a local config
  ● Only trusted and tested sites should be included
● For each site (called entry)
  ● Contact info (Node, grid type, jobmanager)
  ● Site config (startup dir, glexec, OS type, …)
  ● VOs supported
  ● Other attributes (Site name, closest SE, ...)
● Admin maintained, periodically compared to BDII
http://tinyurl.com/glideinWMS/doc.prd/factory/configuration.html
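The per-site information listed above maps naturally onto a record type. The sketch below is illustrative only: the real factory configuration format is described at the URL above, and the class `Entry`, its field names and all sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    """One site as seen by the factory (field names are assumptions for this sketch)."""
    name: str
    # Contact info
    node: str
    grid_type: str
    jobmanager: str
    # Site config
    startup_dir: str
    glexec: bool
    os_type: str
    # VOs supported and other descriptive attributes
    supported_vos: List[str] = field(default_factory=list)
    attributes: Dict[str, str] = field(default_factory=dict)

    def serves(self, vo: str) -> bool:
        """Return True if this entry accepts glideins for the given VO."""
        return vo in self.supported_vos


# Made-up sample values, for illustration only.
example = Entry(
    name="Example_Site",
    node="gatekeeper.example.org",
    grid_type="gt2",
    jobmanager="condor",
    startup_dir="/tmp",
    glexec=False,
    os_type="linux",
    supported_vos=["cms"],
    attributes={"site_name": "Example", "closest_se": "se.example.org"},
)
```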
![Page 19: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/19.jpg)
Glidein factory role
● The glidein factory is just a slave
  ● The frontend(s) tell it how many glideins to submit where
  ● Once the glideins start to run, they report to the VO collector and the factory is not involved
● The communication is based on ClassAds
  ● The factory has a Collector for this purpose
[Diagram: Factory node (Collector, Factory) and Frontend node (Frontend).]
![Page 20: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/20.jpg)
Factory collector
● The factory collector handles all communication
[Diagram: Factory node (Collector, Factory, several Entry processes) and several Frontend nodes (Frontend). Arrow labels: Spawn, Advertise entry, Find sites, Request glideins, Retrieve orders.]
http://tinyurl.com/glideinWMS/doc.prd/factory/design_data_exchange.html
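The exchange shown above can be modelled as a shared bulletin board: entries advertise themselves and later pick up orders, while frontends look up entries and post requests. The following is a toy in-memory model, not the Condor Collector API; `ToyCollector` and its method names are hypothetical, and plain dictionaries stand in for ClassAds.

```python
from typing import Dict, List, Tuple


class ToyCollector:
    """In-memory stand-in for the factory collector (illustration only)."""

    def __init__(self) -> None:
        self.entry_ads: Dict[str, Dict[str, str]] = {}
        # Requests are keyed by (entry, frontend), so a newer request replaces the older one.
        self.request_ads: Dict[Tuple[str, str], int] = {}

    def advertise_entry(self, entry: str, attrs: Dict[str, str]) -> None:
        self.entry_ads[entry] = dict(attrs)

    def find_sites(self, **wanted: str) -> List[str]:
        """Return entries whose attributes contain all the wanted key/value pairs."""
        return sorted(
            name for name, attrs in self.entry_ads.items()
            if all(attrs.get(k) == v for k, v in wanted.items())
        )

    def request_glideins(self, frontend: str, entry: str, idle_glideins: int) -> None:
        self.request_ads[(entry, frontend)] = idle_glideins

    def retrieve_orders(self, entry: str) -> Dict[str, int]:
        """Return the latest request per frontend addressed to this entry."""
        return {fe: n for (e, fe), n in self.request_ads.items() if e == entry}


collector = ToyCollector()
collector.advertise_entry("Site_A", {"grid_type": "gt2"})
collector.advertise_entry("Site_B", {"grid_type": "cream"})
sites = collector.find_sites(grid_type="cream")
collector.request_glideins("frontend_cms", sites[0], 10)
collector.request_glideins("frontend_cms", sites[0], 5)  # replaces the earlier request
orders = collector.retrieve_orders("Site_B")
```

In this model neither side calls the other directly; each only reads from and writes to the collector.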
![Page 21: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/21.jpg)
Frontends
● The factory admin decides which Frontends to serve
  ● Valid proxy with known DN needed to talk to the collector
● Factory config has further fine grained controls
[Diagram: Factory node (Collector, Factory) and Frontend nodes (Frontend).]
![Page 22: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/22.jpg)
Glidein submission
● The glidein factory (entry) uses Condor-G to submit glideins
  ● Condor-G does the heavy lifting
  ● The factory just monitors the progress
[Diagram: on the Factory node, each Entry has a Schedd; arrows labelled Submit and Monitor; glideins go out through Globus and CREAM.]
![Page 23: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/23.jpg)
Credentials/Proxy
● Proxy typically provided by the frontend
  ● Although the factory can provide a default one (rarely used)
● Proxy delivered encrypted in the ClassAd
  ● Factory (entry) provides the encryption key (PKI)
● Proxy stored on disk
  ● Each VO mapped to a different UID
[Diagram: Factory node (Collector, Entry, Schedd) and Frontend node (Frontend). Arrow labels: Get key, Deliver proxy (encrypted).]
![Page 24: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/24.jpg)
A (sort of) detailed view of the VO frontend
![Page 25: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/25.jpg)
Refresher – glideinWMS arch.
● The frontend monitors the user Condor pool, does the matchmaking and requests glideins
[Diagram: the architecture overview from slide 7, repeated, with the Frontend domain marked.]
![Page 26: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/26.jpg)
VO frontend
● The VO frontend is the brain of a glideinWMS-based pool
  ● Like a site-level “negotiator”
[Diagram: Factory node; Frontend node (Frontend); VO domain (submit nodes, central manager). Arrow labels: Monitor Condor, Request glideins, Match. Steps shown inside the Frontend: Find idle jobs, Find entries, Match, Request glideins.]
![Page 27: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/27.jpg)
Two level matchmaking
● The frontend triggers glidein submission
● The “regular” negotiator matches jobs to glideins
[Diagram: the Condor pool (central manager with Collector and Negotiator, submit node with Schedd), the Frontend, the Factory, and glideins submitted through Globus and CREAM acting as execution nodes.]
![Page 28: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/28.jpg)
Frontend logic
● The glideinWMS glidein request logic is based on the principle of “constant pressure”
  ● Frontend requests a certain number of “idle glideins” in the factory queue at all times
  ● It does not request a specific number of glideins
● This is done due to the asynchronous nature of the system
  ● Both the factory and the frontend are in a polling loop and talk to each other indirectly
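Under constant pressure, the factory side only needs to top up its queue to the requested idle level on each polling pass. A minimal sketch of that arithmetic follows; the function name `glideins_to_submit` and the sample numbers are hypothetical, and any additional limits a factory may apply are not modelled.

```python
def glideins_to_submit(requested_idle: int, idle_in_queue: int) -> int:
    """Number of new glideins needed to bring the idle count up to the request."""
    return max(0, requested_idle - idle_in_queue)


# Simulate a few polling passes: some idle glideins start running between passes.
idle = 0
submitted_per_pass = []
for started in [0, 4, 1, 0]:
    idle -= min(started, idle)            # glideins that left the idle state
    new = glideins_to_submit(10, idle)    # the frontend keeps asking for 10 idle
    idle += new
    submitted_per_pass.append(new)
```

The request stays the same on every pass; only the number of glideins that started running in between changes how many new ones are submitted.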
![Page 29: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/29.jpg)
Frontend logic
● Frontend matches job attrs against entry attrs
  ● It then counts the matched idle jobs
  ● A fraction of this number becomes the “pressure requests” (up to 1/3)
● The matchmaking expression is defined by the frontend admin
  ● Not the user
  ● Debatable if it is better or worse, but it does reduce frontend code complexity
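The counting step can be sketched as follows. This is a simplified illustration, not the frontend's actual code: the `matches` predicate stands in for the admin-defined matchmaking expression, the attribute names are hypothetical, the fraction is fixed at the 1/3 upper bound mentioned above, and rounding up is an assumption of this sketch.

```python
from fractions import Fraction
from typing import Callable, Dict, List

Attrs = Dict[str, object]


def pressure_requests(
    idle_jobs: List[Attrs],
    entries: Dict[str, Attrs],
    matches: Callable[[Attrs, Attrs], bool],
    fraction: Fraction = Fraction(1, 3),
) -> Dict[str, int]:
    """For each entry, count the matching idle jobs and request a fraction of that count."""
    requests = {}
    for name, entry in entries.items():
        matched = sum(1 for job in idle_jobs if matches(job, entry))
        # Ceiling division (an assumption): one matching idle job still yields one request.
        requests[name] = -((-matched * fraction.numerator) // fraction.denominator)
    return requests


def admin_match(job: Attrs, entry: Attrs) -> bool:
    """Hypothetical admin-defined expression: the job lists the entry's site as acceptable."""
    return entry["site"] in job["desired_sites"]


jobs = [{"desired_sites": ["A", "B"]}] * 6 + [{"desired_sites": ["B"]}] * 3
entries = {"entry_A": {"site": "A"}, "entry_B": {"site": "B"}}
requests = pressure_requests(jobs, entries, admin_match)
```

Note that a job acceptable to several entries is counted once for each of them in this sketch.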
![Page 30: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/30.jpg)
Frontend config
● The frontend owns the “glidein proxy”
  ● And delegates it to the factory(s) when requesting glideins
  ● Must keep it valid at all times (usually at OS level)
● The VO frontend can (and should) provide VO-specific validation scripts
● The VO frontend can (and should) set the glidein start expression
  ● Used by the VO negotiator for final matchmaking
![Page 31: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/31.jpg)
And the summary
![Page 32: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/32.jpg)
Summary
● Glideins are just properly configured Condor execute nodes submitted as Grid jobs
● The glideinWMS is a mechanism to automate glidein submission
● The glideinWMS is composed of three logical entities, two being actual services:
  ● Glidein factories know about the Grid
  ● VO frontends know about the users and drive the factories
![Page 33: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/33.jpg)
Pointers
● glideinWMS development team is reachable at [email protected]
● The official project Web page is http://tinyurl.com/glideinWMS
● CMS frontend at UCSD
  http://glidein-collector.t2.ucsd.edu:8319/vofrontend/monitor/frontend_UCSD-v5_2/frontendStatus.html
● OSG glidein factory at UCSD
  http://hepuser.ucsd.edu/twiki2/bin/view/UCSDTier2/OSGgfactory
  http://glidein-1.t2.ucsd.edu:8319/glidefactory/monitor/glidein_Production_v4_1/factoryStatus.html
![Page 34: glideinWMS Architecture - glideinWMS Training Jan 2012](https://reader034.fdocuments.in/reader034/viewer/2022051514/54b6bcdb4a79592f7a8b4604/html5/thumbnails/34.jpg)
Acknowledgments
● The glideinWMS is a CMS-led project developed mostly at FNAL, with contributions from UCSD and ISI
● The glideinWMS factory operations at UCSD are sponsored by OSG
● The funding comes from NSF, DOE and the UC system