Understanding priorities in HTCondor
CERN, Dec 2012 HTCondor priorities 1
glideinWMS for users
Understanding priorities in HTCondor
by Igor Sfiligoi (UCSD)
Scope of this talk
This talk provides an overview of how priorities work in HTCondor,
both between users and among jobs of the same user,
and how the user can affect policies.
The reader is expected to already have a basic understanding of HTCondor.
HTCondor Architecture
● As a reminder
[Diagram: one central manager, three submit nodes and five execute nodes, each running Condor]
HTCondor Architecture
● And with relevant daemon names
[Diagram: the central manager runs the Negotiator and each submit node runs a Schedd; the execute nodes are labeled Condor]
User Priorities
What is a user?
● Before talking about priorities between users, we need to define what a user IS
● An “HTCondor user” is represented as Owner@Domain
  ● In most setups, the Owner is the “Login User Name” on the submit node
  ● The Domain may either represent the submit node itself, or a set of submit nodes that share the same Owner identification policies
  ● Both rules are defined by the HTCondor admin and cannot be changed by the final user
● Yes, priorities are based on the User, not the Owner
User priorities
● By default, the Negotiator treats all users equally
  ● You get fair-share out of the box
● Each user is assigned a priority number
  ● The lower, the better
  ● Two users with the same priority number each get, on average, half of the Slots
● User priority asymptotically steers toward the number of Slots used
  ● Both up and down
http://research.cs.wisc.edu/htcondor/manual/v7.8/3_4User_Priorities.html#SECTION00444000000000000000
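The "steers toward the number of Slots used" behavior can be illustrated with a small simulation. This is a minimal sketch, assuming the half-life smoothing rule described in the HTCondor manual (controlled by PRIORITY_HALFLIFE, one day by default). The function name, the hourly update interval, and the slot counts are made up for illustration, and the sketch ignores the lower bound HTCondor places on the real priority.

```python
def update_priority(priority, slots_used, dt, halflife=86400.0):
    """One smoothing step that moves priority toward slots_used.

    Assumption: the manual's half-life rule. halflife mirrors the
    PRIORITY_HALFLIFE knob (seconds); dt is the update interval (seconds).
    """
    beta = 0.5 ** (dt / halflife)
    return beta * priority + (1.0 - beta) * slots_used


# Hypothetical user: starts at a real priority of 0.5,
# runs on 100 slots for two days, then is idle for two days.
priority = 0.5
history = []
for hour in range(96):
    slots = 100 if hour < 48 else 0
    priority = update_priority(priority, slots, dt=3600)
    history.append(priority)

print(f"after 1 day busy:  {history[23]:.2f}")
print(f"after 2 days busy: {history[47]:.2f}")
print(f"after 2 days idle: {history[95]:.2f}")
```

With a one-day half-life, the priority number covers half of the remaining distance to the current usage every day, which is why a heavy user's priority worsens gradually and recovers gradually once the jobs are gone.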
Special users
● If not all users are equally important, the Negotiator supports
  ● Accounting groups: when you need to group users
  ● Priority factors: works on a user-by-user basis
● The two mechanisms can be combined
http://research.cs.wisc.edu/htcondor/manual/v7.8/3_4User_Priorities.html
Accounting groups
● Users can be joined in accounting groups
  ● The Negotiator defines the groups, but jobs specify which group they belong to
● Each group can be given a quota
  ● Can be absolute or relative to the size of the pool
  ● Sum of running jobs in the group cannot exceed it
● If the quotas add up to more than 100%, they can be used for relative priority
  ● Here higher is better
  ● Each group will be given, on average, quotaG/sum(quotas) of slots
● Jobs without any group may never get anything
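The quotaG/sum(quotas) rule can be worked through with a few numbers. This is only an arithmetic sketch: the group names and quota values are hypothetical, and it assumes every group always has enough idle jobs to use its share.

```python
def expected_shares(quotas):
    """Map each group to quota_G / sum(quotas), the rule quoted on the slide."""
    total = sum(quotas.values())
    return {group: quota / total for group, quota in quotas.items()}


# Hypothetical relative quotas that add up to 180% of the pool.
quotas = {"group_higgs": 0.9, "group_top": 0.6, "group_susy": 0.3}
shares = expected_shares(quotas)

for group, share in shares.items():
    print(f"{group}: {share:.1%} of slots on average")
```

Because the quotas are oversubscribed, none of them acts as a hard cap in practice; only their ratios matter.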
Mapping jobs to A.G.
● Users must specify which group they belong to
  ● No automatic mapping or validation in Condor
  ● Based on trust
● Jobs must add to their submit file
    +AccountingGroup = "<group>.<owner>"

    Universe   = vanilla
    Executable = cosmos
    Arguments  = -k 1543.3
    Output     = cosmos.out
    Input      = cosmos.in
    Log        = cosmos.log
    +AccountingGroup = "group_higgs.frieda"
    Queue 1
● “AccountingGroup@Domain” is effectively the identifier used by the Negotiator for Priority purposes
  ● With the default being A.G.==Owner
Priority Factors
● Each user can be assigned a Priority Factor
  ● PF>1 will worsen a user's priority (the priority number gets larger)
    – If users X and Y have PFX=(N-1)*PFY, on average user X gets 1/N of slots (with user Y the rest)
● Can be managed with the command-line tool condor_userprio
  ● Only the superuser can set the factor
● The admin has likely set a high default PF (e.g. 1000)
    – PF cannot go below 1

    $ condor_userprio -all -allusers | grep [email protected]
    group1.user1@node1 8016.22 8.02 10.00 0 15780.63 11/23/2011 05:59 11/30/2012 20:37
    $ condor_userprio -setfactor group1.user1@node1 1000
    The priority factor of group1.user1@node1 was set to 1000.000000
    $ condor_userprio -all -allusers | grep [email protected]
    group1.user1@node1 8016.22 8.02 1000.00 0 15780.63 11/23/2011 05:59 11/30/2012 20:37

http://research.cs.wisc.edu/htcondor/manual/v7.8/2_7Priorities_Preemption.html#sec:user-priority-explained
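The 1/N rule for two users can be checked numerically. This is a minimal sketch under a stated assumption: in the long run, with both users always having idle jobs, the slot share is inversely proportional to the Priority Factor. The function name and the factor values are hypothetical.

```python
def expected_share(factors):
    """Long-run slot share per user.

    Assumption: share is proportional to 1/PF and every user
    always has idle jobs waiting.
    """
    weights = {user: 1.0 / pf for user, pf in factors.items()}
    total = sum(weights.values())
    return {user: weight / total for user, weight in weights.items()}


# Hypothetical case from the slide's rule: PF_X = (N - 1) * PF_Y
N = 4
pf_y = 10.0
shares = expected_share({"X": (N - 1) * pf_y, "Y": pf_y})

print(f"user X: {shares['X']:.2f} of slots")
print(f"user Y: {shares['Y']:.2f} of slots")
```

Only the ratio of the two factors matters here; multiplying both by the same constant leaves the shares unchanged.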
Efficiency trade-off
● After getting a Slot, the schedd will keep it for an extended period of time
  ● i.e. will schedule several jobs of the same user on it
  ● For efficiency reasons
    – Negotiator can take a few mins to do the matching
  ● Configurable, but it is a trade-off. In glideinWMS, the default is the lifetime of the glidein
● As a side effect
  ● A low priority user may keep the execute node even if jobs from a higher priority user show up
Preemption
● HTCondor has the notion of preemption
  ● If a job from a higher priority user shows up, the Negotiator may instruct an execute node to kill the running job and re-negotiate
● Yes, all work done to that point is lost (unless the job is able to checkpoint)
● Disabled by default on glideinWMS systems
Submit node limits
● HTCondor resource usage on the submit node scales with the number of running jobs
  ● So an admin will likely set a limit: MAX_JOBS_RUNNING
● If the submit node gets close to the limit, you are likely to see “weird behavior”
  ● The negotiator will try to be fair, and distribute the remaining wiggle room to several users with a similar priority number
● Remember: User priority is a dynamic property
Monitoring per-user usage
● Submitter ClassAds provide per-user info
  ● But one ClassAd per submit node
● The long format contains info about limits

    $ condor_status -submitters

    (actual ClassAds)
    Name                 Machine     Running IdleJobs HeldJobs

    uscms3024@cmsanalysi glidein-2.      802      299        1
    uscms3024@cmsanalysi submit-2.t     2063     1131        0
    uscms3044@cmsanalysi submit-2.t      663      344        0
    uscms3045@cmsanalysi submit-2.t        0        1        0

    (summary)
                         RunningJobs IdleJobs HeldJobs

    uscms3024@cmsanalysi        2865     1430        1
    uscms3044@cmsanalysi         663      344        0
    uscms3045@cmsanalysi           0        1        0

                   Total        3528     1775        1
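The relation between the per-submit-node ClassAds and the summary can be reproduced by summing the rows per user. The sketch below copies the four rows from the condor_status output above into a plain Python list; it does not query HTCondor, and the tuple layout is just a convenient assumption.

```python
from collections import defaultdict

# (user, submit machine, running, idle, held), copied from the output above
ads = [
    ("uscms3024@cmsanalysi", "glidein-2.", 802, 299, 1),
    ("uscms3024@cmsanalysi", "submit-2.t", 2063, 1131, 0),
    ("uscms3044@cmsanalysi", "submit-2.t", 663, 344, 0),
    ("uscms3045@cmsanalysi", "submit-2.t", 0, 1, 0),
]

# Sum the counters of all ads that belong to the same user.
summary = defaultdict(lambda: [0, 0, 0])
for name, _machine, running, idle, held in ads:
    summary[name][0] += running
    summary[name][1] += idle
    summary[name][2] += held

# Column-wise totals over all users.
totals = [sum(column) for column in zip(*summary.values())]

for name, counts in summary.items():
    print(name, *counts)
print("Total", *totals)
```

A user who submits from two nodes appears twice in the first table and once in the summary, which is the case for uscms3024 here.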
Job Priorities
Priority-FIFO
● So, a user will have many jobs
  ● In which order will they be executed?
● HTCondor guarantees the Priority-FIFO policy
  ● Each job has a priority associated with it
  ● Jobs in the same priority class will start in FIFO order
  ● Jobs with higher priority always start before jobs with lower priority
    – i.e. higher priority is better
● Job priority is user-specific: it will not affect priority between users
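The Priority-FIFO order amounts to a two-key sort: highest job priority first, and earliest submission first within the same priority. The sketch below is a toy model, not HTCondor code; the Job class, the job ids, and the timestamps are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    priority: int     # job priority, higher is better
    submit_time: int  # hypothetical submission timestamp


def pfifo_order(jobs):
    """Sort by descending priority, then by ascending submission time."""
    return sorted(jobs, key=lambda job: (-job.priority, job.submit_time))


jobs = [
    Job("1.0", priority=0, submit_time=100),
    Job("2.0", priority=5, submit_time=200),
    Job("3.0", priority=0, submit_time=50),
    Job("4.0", priority=5, submit_time=150),
]
order = [job.job_id for job in pfifo_order(jobs)]
print(order)  # ['4.0', '2.0', '3.0', '1.0']
```

Both priority-5 jobs come before the priority-0 jobs even though they were submitted later; within each class the earlier submission wins.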
Non-uniform environments
● Of course, everything is contingent on matching
  ● P-FIFO only applies to jobs that match at least one Slot
● If not all Slots are uniform
  ● Lower priority (or submitted late) Jobs may start before high priority (or submitted early) Jobs if the latter do not match any Unclaimed Slots
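The effect of matching on the start order can be shown with a toy model. This is a sketch under loud assumptions: the only requirement is a minimum amount of memory, jobs and slots are plain dictionaries, and all names and numbers are hypothetical. Real matchmaking evaluates full ClassAd expressions.

```python
def next_job_for_slot(queue, slot):
    """Return the first job in P-FIFO order that the slot can run, or None."""
    ordered = sorted(queue, key=lambda job: (-job["priority"], job["submitted"]))
    for job in ordered:
        # Hypothetical requirement: the slot must offer enough memory.
        if job["min_memory_mb"] <= slot["memory_mb"]:
            return job
    return None


queue = [
    {"id": "big", "priority": 10, "submitted": 1, "min_memory_mb": 8000},
    {"id": "small", "priority": 0, "submitted": 2, "min_memory_mb": 1000},
]

small_slot = {"memory_mb": 2000}
large_slot = {"memory_mb": 16000}

print(next_job_for_slot(queue, small_slot)["id"])  # small
print(next_job_for_slot(queue, large_slot)["id"])  # big
```

If only the small slot is unclaimed, the low-priority job starts first, because the high-priority job does not match it.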
Job restarts
● If an execute node dies for whatever reason, HTCondor will try to restart the job that was running there somewhere else
● In a typical (glidein) setup, it will get the next available matching slot for that user
  ● i.e. it will not preempt a lower priority job
Multiple submit nodes
● The same user may have submitted jobs on many submit nodes
  ● Here assuming they share the same Domain name
● Each submit node will handle its jobs on its own
  ● No guarantee on the execution order between jobs on different nodes
  ● HTCondor will try to Round-Robin between them
● NEW: In 7.9.x, HTCondor can be configured to treat the Job priority as a global property
  ● i.e. first high priority jobs, no matter which submitter
  ● But still no guarantee within the prio. class
Priorities in glideinWMS
None
● The glideinWMS layer does not handle priorities in any shape or form
● All jobs from all users are treated the same
  ● Although it may create different execute node requirements for some of them
    – But it is effectively a binary decision
The End
Pointers
● HTCondor Home Page: http://research.cs.wisc.edu/htcondor/
● HTCondor [email protected]@cs.wisc.edu
Acknowledgments
● The creation of this document was sponsored by grants from the US NSF and US DOE, and by the University of California system