The OxGrid Resource Broker David Wallom. Overview OxGrid Resource Broking Why build our own Job...

The OxGrid Resource Broker

David Wallom

Overview

• OxGrid
• Resource Broking
• Why build our own
• Job Submission and other tools
• Future developments

OxGrid, a University Campus Grid

• Single entry point for users to shared and dedicated resources

• Seamless access to NGS and OSC for registered users

Resource Broking

• The original idea of the grid relied on efficient resource broking to abstract the user away from the resources
• This has been significantly neglected by grid software developers
  – Push or pull type of mechanism; each has significant advantages and disadvantages
  – Resources that have multiple job sources increase complexity manyfold

Why build our own?

• OxGrid is intended to be a lightweight development
• Replacement of individual components should be simple
  – Use of service-based interfaces is the goal
• Current solutions do not allow this, with massive dependencies and non-trivial maintenance requirements
• Condor-G is a simple off-the-shelf grid meta-scheduler; why make it so much more complicated?

Condor Matchmaking

• Matchmaking is a methodology for Distributed Resource Management

• Conceptually simple:
  – Service providers and requesters advertise
  – Compatible advertisements are matched
  – Matched entities cooperate to perform service
• Developed for opportunistic environments
  – Use resources as and when available

Thanks to Miron and the Condor Team

Condor Matchmaking (Cont.)

• Customers and Servers advertise to a Matchmaking Service

• Advertisements describe advertising entities
  – Characteristics
  – Requirements and Constraints
  – Preferences

• These descriptions are called classified advertisements (classads)

Thanks to Miron and the Condor Team
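The matchmaking idea on the two slides above can be illustrated with a toy sketch. This is not Condor's implementation: ads are plain Python dictionaries, and the `requirements` and `rank` callables, the machine names, and the `match` helper are all hypothetical stand-ins for real classad expressions.

```python
from typing import Any, Dict, List, Optional

Ad = Dict[str, Any]  # a toy "classad": attributes plus two callables


def match(job: Ad, machines: List[Ad]) -> Optional[Ad]:
    """Return the machine the job prefers among mutually compatible ads."""
    compatible = [
        m for m in machines
        # Both sides must accept each other (requirements and constraints).
        if job["requirements"](job, m) and m["requirements"](m, job)
    ]
    if not compatible:
        return None
    # Preferences: the job ranks the compatible machines.
    return max(compatible, key=lambda m: job["rank"](job, m))


def accepts_while_not_full(my: Ad, target: Ad) -> bool:
    # Hypothetical provider-side constraint: cap the number of matches.
    return my["CurMatches"] < 20


machines = [
    {"Name": "small", "OpSys": "LINUX", "Memory": 501, "CurMatches": 0,
     "requirements": accepts_while_not_full},
    {"Name": "medium", "OpSys": "LINUX", "Memory": 1024, "CurMatches": 5,
     "requirements": accepts_while_not_full},
    {"Name": "full", "OpSys": "LINUX", "Memory": 4096, "CurMatches": 20,
     "requirements": accepts_while_not_full},
    {"Name": "other-os", "OpSys": "WINDOWS", "Memory": 8192, "CurMatches": 0,
     "requirements": accepts_while_not_full},
]

job = {
    "Owner": "alice",
    # Requester-side constraint: Linux with at least 256 MB of memory.
    "requirements": lambda my, t: t["OpSys"] == "LINUX" and t["Memory"] >= 256,
    # Requester-side preference: more memory is better.
    "rank": lambda my, t: t["Memory"],
}

best = match(job, machines)
print(best["Name"] if best else "no match")
```

The point of the sketch is the symmetry: both the requester and the provider state constraints, and a match needs both sides to agree before preferences are applied.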

Static and Dynamic Information

• Static information
  – e.g. processor architecture, physical memory, operating system, scheduling system, no. of nodes
• Dynamic information
  – e.g. system availability, scheduler load, queue length, used disk or memory

OxGrid Virtual Organisation Manager Database

• Final repository for authorisation information

• Stores additional static information for each resource such as capability and maximum number of submitted jobs for that node

Data Harvesting cycle

• Information sources can be added or removed at will

• Either a single repository for information aggregation (e.g. ngsinfo) or individual machines

• A simple internal representation of information makes it easy to add new types of information source
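One way to picture the harvesting cycle is as a registry of pluggable sources whose results are merged into one dictionary per resource. The sketch below is an assumption about the shape of such a design, not the OxGrid code: the `Harvester` class, the source functions, and the example host names are hypothetical.

```python
from typing import Callable, Dict

Record = Dict[str, object]
# A source returns {resource name: attributes}; an aggregating repository
# returns many resources, an individual machine returns just itself.
Source = Callable[[], Dict[str, Record]]


class Harvester:
    def __init__(self) -> None:
        self._sources: Dict[str, Source] = {}

    def add_source(self, name: str, source: Source) -> None:
        self._sources[name] = source

    def remove_source(self, name: str) -> None:
        self._sources.pop(name, None)

    def harvest(self) -> Dict[str, Record]:
        """Run one cycle: query every source and merge per resource."""
        merged: Dict[str, Record] = {}
        for source in self._sources.values():
            for resource, attrs in source().items():
                merged.setdefault(resource, {}).update(attrs)
        return merged


def aggregate_source() -> Dict[str, Record]:
    # Stand-in for an aggregating repository covering several resources.
    return {
        "cluster-a.example.org": {"OpSys": "LINUX", "Memory": 2048},
        "cluster-b.example.org": {"OpSys": "LINUX", "Memory": 1024},
    }


def single_machine_source() -> Dict[str, Record]:
    # Stand-in for querying one machine directly for dynamic information.
    return {"cluster-a.example.org": {"QueueLength": 4}}


harvester = Harvester()
harvester.add_source("aggregate", aggregate_source)
harvester.add_source("cluster-a", single_machine_source)
snapshot = harvester.harvest()
print(snapshot["cluster-a.example.org"])
```

Because every source returns the same simple structure, adding a new kind of source only means writing one more function.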

Generated classad

MyType = "Machine"
TargetType = "Job"
Name = "bedrock.oucs.ox.ac.uk-condor"
gatekeeper_url = "bedrock.oucs.ox.ac.uk/jobmanager-condor"
Requirements = (CurMatches < 20) && (TARGET.JobUniverse == 9)
WantAdRevaluate = True
UpdateSequenceNumber = 1097580300
CurMatches = 0
OpSys = "LINUX"
Arch = "INTEL"
Memory = 501
MPI = False
INTEL_COMPILER = True
GCC3 = True
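A machine classad like the one above could be produced from harvested attributes with a few lines of code. The following is a minimal sketch under stated assumptions: the `Expr` marker class and the `render_classad` helper are hypothetical, and string values are quoted without any escaping.

```python
class Expr(str):
    """Marks a value as a raw classad expression, not a string literal."""


def format_value(value):
    if isinstance(value, Expr):
        return str(value)                      # emit expressions unquoted
    if isinstance(value, bool):                # check bool before int
        return "True" if value else "False"
    if isinstance(value, (int, float)):
        return str(value)
    return '"{}"'.format(value)                # assumption: no escaping needed


def render_classad(attrs):
    """Render a dict as one 'Name = value' line per attribute."""
    return "\n".join(
        "{} = {}".format(name, format_value(value))
        for name, value in attrs.items()
    )


machine = {
    "MyType": "Machine",
    "TargetType": "Job",
    "Name": "bedrock.oucs.ox.ac.uk-condor",
    "gatekeeper_url": "bedrock.oucs.ox.ac.uk/jobmanager-condor",
    "Requirements": Expr("(CurMatches < 20)"),
    "CurMatches": 0,
    "OpSys": "LINUX",
    "Arch": "INTEL",
    "Memory": 501,
    "MPI": False,
}

ad_text = render_classad(machine)
print(ad_text)
```

Static attributes such as `OpSys` and dynamic ones such as `CurMatches` go through the same path, so the renderer does not need to know where a value came from.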

Tuning Condor to act as a metascheduler

• The default configuration of Condor is as a cycle scavenger

• Alter this by ensuring that the Negotiator attempts to match all available tasks on each pass

• Since we are a Condor-G-only system, we change the default universe of the system to grid

Changes to Condor configuration

DEFAULT_UNIVERSE = GLOBUS
CLASSAD_LIFETIME = 900
NEGOTIATE_ALL_JOBS_IN_CLUSTER = True
NEGOTIATOR_INTERVAL = 30
JOB_START_DELAY = 10
GRIDMANAGER_JOB_PROBE_INTERVAL = 60

Job Submission

• Most users are comfortable with command-line applications
  – Condor submission scripts would be another language for our users to learn…
  – Submission step as a scriptable application with arguments
• Created job-submission

job-submission

-h <HOSTNAME>/<JOBMANAGER>
-e <EXECUTABLE>
-t Boolean transfer exe?
-a EXE arguments
-i Input files to be transferred
-o Output files to be transferred
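The option list above could be parsed as follows. This is a sketch only, not the actual job-submission tool: it assumes `-t` is a bare flag, that `-h` is optional so the broker can choose a resource, and that `-i` and `-o` accept several file names. The `dest` names are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # add_help=False frees "-h", which argparse otherwise reserves for help.
    parser = argparse.ArgumentParser(prog="job-submission", add_help=False)
    parser.add_argument("-h", dest="resource", default=None,
                        metavar="HOSTNAME/JOBMANAGER",
                        help="target resource (assumed optional)")
    parser.add_argument("-e", dest="executable", required=True,
                        metavar="EXECUTABLE", help="program to run")
    parser.add_argument("-t", dest="transfer_exe", action="store_true",
                        help="transfer the executable (assumed to be a flag)")
    parser.add_argument("-a", dest="arguments", default="",
                        help="arguments passed to the executable")
    parser.add_argument("-i", dest="input_files", nargs="*", default=[],
                        help="input files to be transferred")
    parser.add_argument("-o", dest="output_files", nargs="*", default=[],
                        help="output files to be transferred")
    return parser


args = build_parser().parse_args([
    "-h", "bedrock.oucs.ox.ac.uk/jobmanager-sge",
    "-e", "update_file",
    "-t",
    "-a", "TEST_3_2.in TEST_3_2.out",
    "-i", "TEST_3_2.in",
    "-o", "TEST_3_2.out",
])
print(args.resource, args.executable, args.transfer_exe)
```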

Job classad

executable = update_file
Transfer_Executable = True
globusscheduler = $$(gatekeeper_url)
Requirements = (TARGET.gatekeeper_url == "t2ce02.physics.ox.ac.uk/jobmanager-lcgpbs" || TARGET.gatekeeper_url == "condor.oucs.ox.ac.uk/jobmanager-condor" || TARGET.gatekeeper_url == "grid-compute.oesc.ox.ac.uk/jobmanager-pbsox" || TARGET.gatekeeper_url == "bedrock.oucs.ox.ac.uk/jobmanager-sge") && TARGET.gatekeeper_url =!= UNDEFINED && TARGET.OpSys == "LINUX"
match_list_length = 1
arguments = TEST_3_2.in TEST_3_2.out
transfer_input_files = TEST_3_2.in
transfer_output_files = TEST_3_2.out
WhenToTransferOutput = ON_EXIT
universe = grid
grid_type = gt2
notification = ERROR
output = temp-1168783341-2.out
error = temp-1168783341-2.err
log = temp-1168783341-2.log
queue
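A wrapper tool could generate a submit description of this shape from a handful of inputs, which is what spares users from learning the Condor submission language. The sketch below is an illustration under assumptions, not the OxGrid implementation: `build_submit_description`, its parameters, and the `stem` used for the output, error and log file names are hypothetical.

```python
from typing import List


def build_submit_description(executable: str, arguments: str,
                             input_files: List[str], output_files: List[str],
                             gatekeepers: List[str], stem: str,
                             transfer_executable: bool = True) -> str:
    """Build submit-file text that restricts matching to the given gatekeepers."""
    if not gatekeepers:
        raise ValueError("at least one gatekeeper URL is required")
    allowed = " || ".join(
        'TARGET.gatekeeper_url == "{}"'.format(url) for url in gatekeepers
    )
    requirements = (
        '({}) && TARGET.gatekeeper_url =!= UNDEFINED '
        '&& TARGET.OpSys == "LINUX"'
    ).format(allowed)
    lines = [
        "executable = {}".format(executable),
        "Transfer_Executable = {}".format(transfer_executable),
        # $$() is filled in from the matched machine ad at match time.
        "globusscheduler = $$(gatekeeper_url)",
        "Requirements = {}".format(requirements),
        "match_list_length = 1",
        "arguments = {}".format(arguments),
        "transfer_input_files = {}".format(",".join(input_files)),
        "transfer_output_files = {}".format(",".join(output_files)),
        "WhenToTransferOutput = ON_EXIT",
        "universe = grid",
        "grid_type = gt2",
        "notification = ERROR",
        "output = {}.out".format(stem),
        "error = {}.err".format(stem),
        "log = {}.log".format(stem),
        "queue",
    ]
    return "\n".join(lines) + "\n"


text = build_submit_description(
    executable="update_file",
    arguments="TEST_3_2.in TEST_3_2.out",
    input_files=["TEST_3_2.in"],
    output_files=["TEST_3_2.out"],
    gatekeepers=["bedrock.oucs.ox.ac.uk/jobmanager-sge",
                 "condor.oucs.ox.ac.uk/jobmanager-condor"],
    stem="temp-example",
)
print(text)
```

The list of gatekeepers would come from the resources the user is authorised to use, so the requirements expression doubles as an access filter.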

Additional User Tools

• oxgrid_certificate_import
  – Simplifies the installation of a user digital certificate to a single command
• oxgrid_q
  – Displays the user's current queue at the resource broker, with an option to show the full task queue
• oxgrid_status
  – Displays the resources that are available to the user, with an option to show all resources currently registered with the resource broker
• oxgrid_cleanup
  – Removes either a single submitted process or a range of child processes with their master

oxgrid_status

Users

• Statistics
• Materials science
• Inorganic chemistry
• Theoretical chemistry
• Biochemistry
• Computational biology
• Astrophysics
• Condensed matter physics
• Zoology

• Researchers and students

Future Developments

• As part of GridBS project development:
  – Additional direct submission into MS CCS using GridSAM BLAH
  – Addition of new types of data sources
    • EGEE
    • Grimoires
• Continue to improve packaging to ensure ease of installation and re-distribution

Conclusion

• We have designed a resource broker that is orders of magnitude smaller than current solutions, with minimal external dependencies

• Simple tools have allowed users of OxGrid easy access to resources in many different institutions

• Over 65k individual tasks have been submitted to connected resources since January