Preference–Based Matchmaking of Grid Resources with CP–Nets

J Grid Computing (2013) 11:211–237
DOI 10.1007/s10723-012-9235-2


Massimo Cafaro · Maria Mirto · Giovanni Aloisio

Received: 21 December 2011 / Accepted: 10 September 2012 / Published online: 14 October 2012
© Springer Science+Business Media Dordrecht 2012

Abstract We deal with the problem of preference-based matchmaking of computational resources belonging to a Grid. We introduce CP–Nets, a recent development in the field of Artificial Intelligence, as a means to deal with user’s preferences in the context of Grid scheduling. We discuss CP–Nets from a theoretical perspective and then analyze, qualitatively and quantitatively, their impact on the matchmaking process, with the help of a Grid simulator we developed for this purpose. Many different experiments have been set up and carried out, and we report here our main findings and the lessons learnt.

Keywords Grids · Matchmaking · CP–Nets

M. Cafaro (B) · G. Aloisio
University of Salento, Lecce, Italy
e-mail: [email protected]

G. Aloisio
e-mail: [email protected]

M. Cafaro · M. Mirto · G. Aloisio
CMCC - Euro-Mediterranean Centre on Climate Change, Lecce, Italy

M. Mirto
e-mail: [email protected]

1 Introduction

Grid computing [16] emerged as a new paradigm distinguished from traditional distributed computing because of its focus on large-scale resource sharing and innovative high-performance applications. The Grid infrastructure ties together a number of Virtual Organizations (VOs) [17] that reflect dynamic collections of individuals, institutions and computational resources.

A Grid Information Service (GIS) [12] aims at providing an information rich environment to support service/resource discovery and decision making processes. The main goal of Grid environments is indeed the provision of flexible, secure and coordinated resource sharing among VOs to tackle large-scale scientific problems, which in turn require addressing, besides other challenging issues like authentication/authorization, access to remote data etc., service/resource discovery and management for scheduling and/or co-scheduling of resources.

Information thus plays a key role allowing, if exploited, high performance execution in Grid environments: the use of manual or default/static configurations hinders application performance, whereas the availability of information regarding the execution environment fosters design and implementation of so-called Grid-aware applications.


Obviously, applications can react to changes in their execution environment only if these changes are somehow advertised. Therefore, self-adjusting, adaptive applications are natural consumers of information produced in Grid environments where distributed computational resources and services are sources and/or potential sinks of information, and the data produced can be static, semi-dynamic or fully dynamic [33]. Resource brokering services and Grid schedulers also need to access this information for matchmaking available Grid resources against a user’s request [15, 47].

The problem of matchmaking available resources in a Grid environment against a user’s request entails finding one or more (a pooled set of) resources that best match the user’s request. A matchmaking service is in charge of finding the best match given the current status of the Grid environment: indeed, the same request may result in different matchings under different resource load, etc.

The input to the matchmaking service is a job description expressed in a specified formalism (e.g., the Job Submission Description Language [1], the Job Description Language [34], the Condor’s classified advertisements [39] etc.) containing constraints to be satisfied by resources to execute a batch, parameter sweep or workflow job.

Our contribution is two-fold. First, we propose to extend the matchmaking process to take into account the user’s preferences (besides the usual constraints) and, in order to deal with preferences, we suggest the use of Conditional Preference Networks (CP–Nets) [7], a powerful concept borrowed from the field of Artificial Intelligence that can be used to describe, structure and reason about user’s preferences. Second, we thoroughly analyze, both qualitatively and quantitatively, the impact of CP–Nets on the matchmaking process. Our analysis takes into account both the resource broker (or Grid scheduler) and the users’ perspectives, in order to assess the validity of our approach.

It is worth remarking here that our focus is not on the scheduling process: we limit ourselves in this analysis to matchmaking only, i.e., to the problem of finding a matching set of resources taking into account user’s constraints and preferences. Therefore, in what follows, we will not deal with the problem of scheduling Grid resources using algorithms such as FCFS, SJF, backfilling etc. to achieve, for instance, minimization of the makespan metric, and we do not discuss Grid scheduling systems such as GlideinWMS [37] etc. Instead, we will utilize the simplest possible strategy: since our matchmaking algorithm returns a set of resources ranked according to their matching degrees, we will simply schedule the corresponding job on the first available resource. If this resource is not available, we will try to schedule the job on the second resource if available and so on, round-robin.

The rest of the paper is organized as follows. Section 2 introduces CP–Nets. Our matchmaking approach based on CP–Nets is presented in Section 3. This approach is implemented in the Grid simulator we used for our tests, which is described in Section 4. We discuss the impact of CP–Nets on the scheduling process in Section 5, analyzing the results of several computer simulations. We discuss related work in Section 6, and draw our conclusions in Section 7.

2 CP–Nets

Conditional Preference Networks [6] address the problem of representing and reasoning with preferences over a multivariate domain; their broad applicability to many fields such as, for instance, design, planning and decision making, is related to the ability to succinctly specify and represent preference orderings graphically. This is an extremely important feature, owing to the fact that correspondingly explicit representations of preference orderings of multivariate domains are exponential in the number of variables and thus infeasible. We begin by formally defining preference relations.

Definition 1 (Preference Relation) Given a set of variables $V = \{v_1, \ldots, v_n\}$ and the outcome space $O = \mathrm{Dom}(v_1) \times \cdots \times \mathrm{Dom}(v_n)$, a preference relation or ranking is a total preorder $\succeq$ over $O$; if $o_1 \succeq o_2$ then the outcome $o_1$ is equally or more preferred than $o_2$.


Each variable $v_i$ (also known as attribute or feature) may assume a value belonging to $\mathrm{Dom}(v_i) = \{v_i^1, \ldots, v_i^{n_i}\}$. The size of the set of outcomes $O$ is thus exponential in the number of variables.

CP–Nets capture ceteris paribus (all else being equal) conditional preference statements, whose semantics is based on the notion of preferential independence.
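To make the exponential growth of the outcome space concrete, the following minimal Python sketch enumerates $O$ as a Cartesian product. The variable names and domains are illustrative assumptions, not data from our experiments.

```python
from itertools import product
from math import prod

# Hypothetical domains for three variables (illustrative only).
domains = {
    "os": ["aix", "linux"],
    "cpu": ["power", "xeon"],
    "lib": ["EESL", "NAG"],
}

# The outcome space O is the Cartesian product of the variable domains.
outcomes = list(product(*domains.values()))

# |O| is the product of the domain sizes, i.e. 2**n for n binary variables.
outcome_count = prod(len(values) for values in domains.values())
```

With three binary variables the space already holds eight outcomes; each additional binary variable doubles it.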

Definition 2 (Preferential Independence) Let $x$ denote an assignment of values to a set $X \subseteq V$ and $xy$ the concatenation of two assignments to $X$ and $Y$ with $X \cap Y = \emptyset$. A set of variables $X$ is preferentially independent of its complement $Y = V - X$ iff

$$x_1 y_1 \succeq x_2 y_1 \iff x_1 y_2 \succeq x_2 y_2 \quad \forall x_1, x_2, y_1, y_2$$

When the preferential independence relation holds, x1 is preferred to x2 ceteris paribus: fixing the values of all of the other variables, the preference relation (over assignments to the set X) holds independently of the values taken by the other variables.
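Definition 2 can be checked by brute force on small domains. The sketch below is only an illustration: the function name, the two score tables and the use of numeric scores to induce a total preorder are all assumptions made for the example.

```python
from itertools import product


def is_preferentially_independent(x_values, y_values, weakly_prefers):
    """Brute-force check of Definition 2 for outcomes given as (x, y) pairs.

    weakly_prefers(o1, o2) must return True when o1 is equally or more
    preferred than o2.
    """
    for x1, x2 in product(x_values, repeat=2):
        # The verdict on x1 versus x2 must not change when y changes.
        verdicts = {weakly_prefers((x1, y), (x2, y)) for y in y_values}
        if len(verdicts) > 1:
            return False
    return True


# Hypothetical scores: here the preference over x ignores y.
additive = {("a", "p"): 3, ("a", "x"): 2, ("l", "p"): 1, ("l", "x"): 0}
# Hypothetical scores: here the preferred x depends on y.
dependent = {("a", "p"): 3, ("l", "p"): 1, ("a", "x"): 0, ("l", "x"): 2}

independent_case = is_preferentially_independent(
    ["a", "l"], ["p", "x"], lambda o1, o2: additive[o1] >= additive[o2]
)
dependent_case = is_preferentially_independent(
    ["a", "l"], ["p", "x"], lambda o1, o2: dependent[o1] >= dependent[o2]
)
```

In the first table the choice between a and l is the same for every y, so the check succeeds; in the second the choice reverses when y changes, so it fails.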

We are now ready to define conditional preferential independence.

Definition 3 (Conditional Preferential Independence) Given a partition of $V$ such that $V = X \cup Y \cup Z$, $X$ and $Y$ are conditionally preferentially independent given $z$ iff

$$x_1 y_1 z \succeq x_2 y_1 z \iff x_1 y_2 z \succeq x_2 y_2 z \quad \forall x_1, x_2, y_1, y_2$$

In practice, X and Y are preferentially independent iff Z is assigned z. If the relation holds for all possible assignments z then X and Y are conditionally preferentially independent given Z.

The preference elicitation process requires that users specify for each variable x ∈ V the parent variable Parent(x) that can affect their preferences over the values of x. The CP–Net graph is then constructed so that for each node x, Parent(x) is the immediate predecessor. A more general approach is to allow for Parent(x) to represent a set of vertices instead of a single vertex. On the basis of the particular assignment to the vertex Parent(x) the user is able to determine a specific preference order over Dom(x), the domain of the variable x, all other things being equal. Thus, a CP–Net associates to each assignment to Parent(x) a Conditional Preference Table (CPT).

Definition 4 (CP–Net) A CP–Net is a directed graph $G = (V, E)$. The set of vertices $V = \{v_1, \ldots, v_n\}$ represents the CP–Net variables and $E = \{(v_i, v_j) : v_i, v_j \in V\}$ is the set of edges between variables. For each $v \in V$, the function $\mathrm{Parent}(v)$ returns the vertex $v' \in V$ such that $(v', v) \in E$. The CPT specifies a strict partial order $\succ^i_u$ over $\mathrm{Dom}(x_i)$ representing the conditional preference of the instantiations of $x_i$ for a given instantiation $u$ of $\mathrm{Parent}(x_i)$.

Given the ceteris paribus preference statement “I prefer wine to beer with my meal”, its interpretation is: given two identical meals, one with wine and one with beer, I prefer the former. The statement “I prefer red wine to white wine with my meal, ceteris paribus, given that meat is served” is interpreted as: given two identical meals in which meat is served, I prefer red wine to white wine. This tells us nothing about two identical meals in which meat is not served. Ceteris paribus preference statements induce independence relations: for instance, if my preference for wine depends on (and only on) the main course, then wine choice is conditionally preferentially independent of all other variables given the main course value.

We now give an example describing how a CP–Net can be used to represent the following preference statements:

– I strictly prefer aix to linux as operating system;

– I prefer power processors if the operating system is aix, and xeon processors if the operating system is linux;

– I prefer the EESL math library when using power processors and the NAG library when using xeon processors.

Let the variables a, l, p, x, e and n represent respectively a preference for aix, linux, power processor, xeon processor, EESL and NAG math library.

The first preference statement is unconditional; the other ones are conditional. Figure 1a shows the CP–Net related to the previous example. Each node represents a domain variable, and the immediate parents Parent(v) of a variable v in the network are those variables that affect the user’s preference over the values of v. Associated to each node, there is a Conditional Preference Table (CPT) which provides an ordering over the values of the node for each possible parent’s context. Figure 1b shows the corresponding Conditional Preference Graph, in which l ∧ p ∧ n represents the worst outcome and a ∧ p ∧ e the best one.

Fig. 1 CP–Net and corresponding Conditional Preference Graph. (a) CP–Net over the variables Operating System, CPU and Math Library, with CPTs a ≻ l (Operating System); a : p ≻ x and l : x ≻ p (CPU); p : e ≻ n and x : n ≻ e (Math Library). (b) Corresponding Conditional Preference Graph over the outcomes l ∧ p ∧ n, l ∧ p ∧ e, l ∧ x ∧ e, l ∧ x ∧ n, a ∧ x ∧ n, a ∧ p ∧ n, a ∧ x ∧ e and a ∧ p ∧ e
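The CP–Net of Fig. 1a and the preference graph it induces can be sketched in a few lines of Python. The dictionary encoding and the names DOMAINS, CP_NET and improving_flips are assumptions made for this illustration; they are not the data structures of our simulator.

```python
from itertools import product

# Domains of the three variables in the example (a/l, p/x, e/n).
DOMAINS = {"os": ["a", "l"], "cpu": ["p", "x"], "lib": ["e", "n"]}

# Hypothetical encoding: variable -> (parents, CPT). Each CPT maps a tuple of
# parent values to the variable's values, most preferred first.
CP_NET = {
    "os": ((), {(): ["a", "l"]}),
    "cpu": (("os",), {("a",): ["p", "x"], ("l",): ["x", "p"]}),
    "lib": (("cpu",), {("p",): ["e", "n"], ("x",): ["n", "e"]}),
}


def improving_flips(outcome):
    """Return the outcomes reachable from `outcome` by one improving flip."""
    better = []
    for var, (parents, cpt) in CP_NET.items():
        order = cpt[tuple(outcome[p] for p in parents)]
        rank = order.index(outcome[var])
        for value in order[:rank]:  # values strictly preferred in this context
            better.append({**outcome, var: value})
    return better


outcomes = [dict(zip(DOMAINS, values)) for values in product(*DOMAINS.values())]

# Edges of the induced preference graph, directed from worse to better.
edges = [
    (tuple(o.values()), tuple(b.values()))
    for o in outcomes
    for b in improving_flips(o)
]
nodes = {tuple(o.values()) for o in outcomes}
best = nodes - {worse for worse, _ in edges}     # no improving flip leaves it
worst = nodes - {better for _, better in edges}  # no improving flip reaches it
```

On this encoding the only outcome without an improving flip is a ∧ p ∧ e, and the only outcome that no flip reaches is l ∧ p ∧ n, in agreement with Fig. 1b.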

As discussed in [7], any acyclic CP–Net defines a consistent partial order over the outcome space; given a CP–Net N and two possible outcomes x and y, a dominance query asks whether or not $x \succ y$ is a consequence of the preferences of N. When the Conditional Preference Graph is a DAG (Directed Acyclic Graph), it can be shown that, in order to answer the query, a simple polynomial time sweep algorithm only needs to search for a flipping sequence (path) from the less preferred outcome y through a series of more preferred outcomes, to the more preferred outcome x, where each value flip in the sequence is sanctioned by the network N. The time complexity of the flipping-sequence search over binary–valued, DAG–structured CP–Nets is $O(n^2)$, where n is the number of variables in the CP–Net [6]. For instance, the dominance query $a \wedge x \wedge n \succ l \wedge p \wedge e$ can be shown to be true in the example of Fig. 1b, owing to the fact that there exists a flipping sequence (a path of improving flips) from l ∧ p ∧ e to a ∧ x ∧ n. This proves that a ∧ x ∧ n is preferred to l ∧ p ∧ e.
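A dominance query on the example network can be illustrated as follows. Note that this sketch uses a plain breadth-first search over improving flips, which is exhaustive and not the polynomial time sweep discussed above; the encoding and the names VARIABLES, CP_NET, improving_flips and flipping_sequence are assumptions made for the example.

```python
from collections import deque

VARIABLES = ("os", "cpu", "lib")

# Hypothetical encoding of the CP-Net of Fig. 1a: variable -> (parents, CPT),
# with CPT values listed from most to least preferred.
CP_NET = {
    "os": ((), {(): ["a", "l"]}),
    "cpu": (("os",), {("a",): ["p", "x"], ("l",): ["x", "p"]}),
    "lib": (("cpu",), {("p",): ["e", "n"], ("x",): ["n", "e"]}),
}


def improving_flips(outcome):
    """Yield every outcome obtained by one flip sanctioned by CP_NET."""
    assignment = dict(zip(VARIABLES, outcome))
    for index, var in enumerate(VARIABLES):
        parents, cpt = CP_NET[var]
        order = cpt[tuple(assignment[p] for p in parents)]
        for value in order[:order.index(outcome[index])]:
            yield outcome[:index] + (value,) + outcome[index + 1:]


def flipping_sequence(worse, better):
    """Breadth-first search for improving flips leading from worse to better."""
    queue = deque([[worse]])
    seen = {worse}
    while queue:
        path = queue.popleft()
        if path[-1] == better:
            return path
        for candidate in improving_flips(path[-1]):
            if candidate not in seen:
                seen.add(candidate)
                queue.append(path + [candidate])
    return None  # no flipping sequence: dominance does not follow from the net


path = flipping_sequence(("l", "p", "e"), ("a", "x", "n"))
reverse = flipping_sequence(("a", "x", "n"), ("l", "p", "e"))
```

The search finds a sequence from l ∧ p ∧ e to a ∧ x ∧ n and none in the opposite direction, so only the first dominance holds.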

Given a CP–Net N, generating an optimal outcome is even easier: this requires sweeping through the network from top (ancestor vertices) to bottom (descendant vertices), setting each variable to its most preferred value given the instantiation of its parents. Even if the network does not, in general, determine a unique ranking, it does determine a unique best outcome (assuming no indifference). Therefore, outcome optimization queries can be answered using the outlined forward sweep algorithm, whose complexity is O(n), i.e., linear in the number n of variables [6].
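The forward sweep can be sketched as follows, assuming a dictionary encoding of the example network (the encoding and the name optimal_outcome are illustrative assumptions). Variables are visited in topological order so that every parent is assigned before its children.

```python
from graphlib import TopologicalSorter

# Hypothetical encoding of the CP-Net of Fig. 1a: variable -> (parents, CPT),
# with CPT values listed from most to least preferred.
CP_NET = {
    "lib": (("cpu",), {("p",): ["e", "n"], ("x",): ["n", "e"]}),
    "cpu": (("os",), {("a",): ["p", "x"], ("l",): ["x", "p"]}),
    "os": ((), {(): ["a", "l"]}),
}


def optimal_outcome(net):
    """Forward sweep: assign each variable its top CPT value given its parents."""
    predecessors = {var: parents for var, (parents, _) in net.items()}
    outcome = {}
    for var in TopologicalSorter(predecessors).static_order():
        parents, cpt = net[var]
        outcome[var] = cpt[tuple(outcome[p] for p in parents)][0]
    return outcome


best = optimal_outcome(CP_NET)
```

Each variable is assigned exactly once, which is where the linear cost in the number of variables comes from (for a bounded number of parents per variable).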

Finally, CP–Nets also allow expressing relative importance relations. These express the fact that one variable’s value is more important than another’s; moreover, CP–Nets induce implicit importance relations between nodes and their descendants. As an example, one could say that processor type is more important to me than operating system (all else being equal). If it is more important to me that the value of x be high than the value of y be high, then x is more important than y. The notation to express this relative importance relation is $x \rhd y$.

A variable may be conditionally more important than another. For instance, one could say the operating system is more important than processor type (all else being equal), if the workstation is used primarily for graphical applications. Given z ∈ Dom(Z), if it is more important to me that the value of x be high than the value of y be high, then x is conditionally more important than y. The corresponding notation is $x \rhd_z y$.

3 Conditional Preference Matchmaking

In this Section we briefly review our approach to matchmaking available resources in a Grid environment against a user’s request. Matchmaking available resources in a Grid environment against a user’s request entails finding one or more (a pooled set of) resources that best match the user’s request. Our matchmaking service is in charge of finding the best match given the current status of the Grid environment, the user’s constraints and preferences: indeed, the same request may result in different matchings under different resource load, etc. The first input to the matchmaking service is a job description expressed in a specified formalism (e.g., the Job Submission Description Language [1], the Job Description Language [34], the Condor’s classified advertisements [39] etc.) containing constraints to be satisfied by resources to execute a batch, parameter sweep or workflow job (e.g., the machine’s memory must be at least 16 GB, the processor must be AMD etc.). The second input consists of the user’s preferences w.r.t. the resources to be used for the execution of his/her job. In our simulator, both the constraints and the preferences are expressed using a simple XML dialect, as shown in Section 4.4.

Algorithm 1 describes the steps to achieve Conditional Preference Matchmaking. We start by querying a Grid Information Service (GIS) using the constraints in C to determine RMC, a set of resources matching the constraints on which the user’s job may run (step 1). If RMC is empty, no machine on the Grid actually satisfies the user’s constraints, so the scheduler discards the job and informs the user (steps 2–3). Otherwise, in steps 4–11 we deal with the resources in RMC.

We begin by building the CP–Net graph related to the preferences P in step 5. Then, we run the linear time outcome optimization query on the CP–Net in step 6, determining RMPC ⊂ RMC, a subset of resources matching both the user’s preferences and constraints. If RMPC is empty, the CP–Net algorithm did not find any resource in RMC satisfying the user’s preferences. Hence, we schedule the job on one of the machines belonging to RMC (steps 7–8). This can be done using, for instance, a heuristic.

When RMPC is not empty, steps 9–11 return a set of resources suitable for job execution. Since RMPC may contain many resources, we select F ⊂ RMPC, a fraction of these resources in step 10. For instance, we select and return the initial 30 % of the machines returned in RMPC, ordered from best to worst w.r.t. preferences. The job is scheduled on the first machine available in F. If a machine is not available, the scheduler will try scheduling the job on the next one, round robin, until it succeeds or a timeout elapses.
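The control flow described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code of Algorithm 1: preference_rank is a hypothetical stand-in for the CP–Net query, the resource dictionaries are invented for the example, and the names matchmake and first_available do not come from our simulator.

```python
import math


def matchmake(resources, satisfies_constraints, preference_rank, fraction=0.3):
    """Return candidate resources, best first, or None if the job is discarded.

    preference_rank(resource) is a hypothetical stand-in for the CP-Net query:
    it returns a sortable rank (lower is better), or None when the resource
    does not satisfy the preferences.
    """
    rmc = [r for r in resources if satisfies_constraints(r)]
    if not rmc:
        return None  # no resource satisfies the constraints
    rmpc = sorted(
        (r for r in rmc if preference_rank(r) is not None), key=preference_rank
    )
    if not rmpc:
        return rmc  # fall back to the constraint matches (a heuristic may reorder)
    keep = max(1, math.ceil(fraction * len(rmpc)))
    return rmpc[:keep]  # the fraction F of RMPC


def first_available(candidates, is_available):
    """Walk the ranked candidates and return the first available one."""
    return next((r for r in candidates if is_available(r)), None)


# Hypothetical resources and request, for illustration only.
resources = [
    {"name": "n1", "ram": 32, "os": "linux"},
    {"name": "n2", "ram": 8, "os": "aix"},
    {"name": "n3", "ram": 64, "os": "aix"},
    {"name": "n4", "ram": 16, "os": "solaris"},
]
os_rank = {"aix": 0, "linux": 1}  # aix preferred to linux; others unranked

candidates = matchmake(
    resources,
    satisfies_constraints=lambda r: r["ram"] >= 16,
    preference_rank=lambda r: os_rank.get(r["os"]),
    fraction=1.0,
)
chosen = first_available(candidates, is_available=lambda r: r["name"] != "n3")
```

In this example n3 ranks first but is marked unavailable, so the job falls through to n1, the next candidate in preference order.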

We now analyze the computational complexity of Algorithm 1. We denote by T(R, C, P) the time required to determine a feasible set of resources on a Grid consisting of R resources taking into account the constraints C and the preferences P, by H(RMC) the time to execute a heuristic on the set of resources RMC and by CP–Net(RMC, P) the time to execute the CP–Net outcome optimization query on RMC and P. We have T(R, C, P) = Max(H(RMC), CP–Net(RMC, P)).

Indeed, the time to query a GIS (we have an explicit query in step 1, and implicit queries in steps 8 and 11) is usually negligible w.r.t. the time used for scheduling the job. The CP–Net outcome optimization query requires in the worst case time linear in the input size n (number of preferences), i.e. O(n). Therefore, $T(R, C, P) = \Omega(|RMC|)$.

Since analyzing all of the resources in step 8 using a heuristic requires at least time linear in the number of resources in RMC, we conclude that the overall time complexity depends on the actual number of steps executed for each resource. As an example, if no more than O(1) steps are executed on each resource in RMC, then the overall complexity of the algorithm is linear in the number of resources in RMC.

4 A Simulator for Grid Matchmaking and Scheduling

Efficient and effective scheduling is very important in Grid computing environments, as shown in [13, 25, 26]. The problem can be addressed considering experimental or simulation approaches. Validating the performance of Grid scheduling strategies in a real production environment would be the ideal scenario, but it is not feasible in practice. The complexity of production systems, the dynamism of Grid execution environments and the difficulty of reproducing experiments make scheduling in production systems a complex research environment. So, given the difficulties tied to the experimental approach, simulation is the most flexible and viable way of evaluating different Grid scheduling algorithms as well as other design issues, although some simplifications and assumptions are made.

In order to develop and evaluate new Grid scheduling algorithms, the use of simulators is fundamental for performance evaluation studies that consider possible constraints and preferences given by the users. On the other hand, it is worth noting here that the evaluation of the performance obtained with a simulator is just a first step, which must be followed by the setup of a good testing environment with representative workload traces to produce dependable results. Computer simulation is our approach for the evaluation of CP–Nets in the matchmaking of the involved resources; therefore, a simulator has been designed and implemented. Even though several simulators have already been developed, e.g. Alea [20], Bricks [44], ChicSim [36], GrenchMark [19], GridNet [24], GridSim [8], MicroGrid [40], NSGrid [46], G3S [32], OptorSim [9], SimGrid [10] and GSSIM [23], we decided to implement our own simulator for the following reasons:

– we did not need the ability to plug in several algorithms;

– the majority of simulators do not provide C bindings (with the exception of SimGrid);

– in order to reduce software engineering costs and to maximize reuse we had to exploit an already implemented code base;

– the time required to install a simulator, read and understand its documentation and implement our algorithm according to the simulator’s architecture was much higher than the time required to integrate our own software modules for our purposes.

Our simulator takes as parameters constraints and preferences, related to a set of information on resources (CPU, memory and storage usage), network links and applications, provided by the users (our scheduler’s clients) in an XML format. We first describe the workload data, then present the architecture of the simulator and, finally, several implementation details.


4.1 Workload Data

The workload plays an important role in experimental performance evaluation of computer systems. Many studies have been conducted to design better and more effective resource allocation schemes [13].

Using the simulation approach, a data set representative of the job inter–arrival times can be retrieved by using various statistical distributions such as the Uniform and Exponential distributions [14, 29, 31].

For describing job arrivals, we used statistical distributions such as Exponential, Gaussian and Weibull. The first one, the Exponential distribution, is motivated by the need to simulate an incremental rate of task arrivals; the second one, the Gaussian distribution, performs well even under the stress of millions of tightly packed data points; finally, the Weibull distribution’s failure rate is a power function of time, where the instantaneous failure rate at time t is defined as the probability of failure between time t and t + dt given that no failure has occurred in the system until time t.
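Inter–arrival times drawn from these three distributions can be generated with the Python standard library, as in the sketch below. The parameter values (rate, mean, standard deviation, scale and shape), the clipping of negative Gaussian samples and the function names are assumptions chosen for illustration; they are not the settings used in our experiments.

```python
import random
from itertools import accumulate


def inter_arrival_times(distribution, count, seed=0):
    """Draw `count` non-negative inter-arrival times from the named distribution."""
    rng = random.Random(seed)  # a fixed seed keeps simulation runs reproducible
    samplers = {
        "exponential": lambda: rng.expovariate(1.0),         # rate = 1.0
        "gaussian": lambda: max(0.0, rng.gauss(1.0, 0.25)),  # clipped at zero
        "weibull": lambda: rng.weibullvariate(1.0, 1.5),     # scale, shape
    }
    draw = samplers[distribution]
    return [draw() for _ in range(count)]


def arrival_times(gaps):
    """Turn inter-arrival gaps into absolute arrival timestamps."""
    return list(accumulate(gaps))


workload = {
    name: arrival_times(inter_arrival_times(name, 1000))
    for name in ("exponential", "gaussian", "weibull")
}
```

Each entry of workload is a non-decreasing list of timestamps that could drive the submission of simulated jobs.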

4.2 Simulator Architecture

The simulator has a client-server architecture. The server starts by initializing a number of child processes specified as a command line parameter. When a connection request is issued by a client (service consumer), the server spawns a new thread to serve the incoming request.

A submitted job contains constraints and preferences expressed in an XML file, described in Section 4.4. As shown in Fig. 2, when the request is submitted by a client, the server queues the request in the Arrivals Queue (managed by the process_request thread). The job is then validated against a Document Type Definition (DTD) document (validator thread).

If the validation is successful, then the job is queued in the Job Queue, handled with a FIFO (First In First Out) policy. The job status becomes JOB_SUBMITTED and it is stored in an internal database with the current timestamp. Otherwise, if the XML file is not well formed or the client has resubmitted a job that was submitted previously, then the validator thread kills the job and the server moves on to the following requests. Jobs queued in the Job Queue are not scheduled immediately, owing to the fact that there is another queue, the Pending Queue, with higher priority. The latter queue handles jobs that were submitted previously but that the scheduler was not able to schedule, taking into account the constraints and preferences characterizing them on the one hand, and the Grid status on the other.

Fig. 2 The simulator flow chart (stages: Data Generation, Arrivals Queue (An), Validation, Job Queue (Jn), Job Evaluation, Query Evaluation, Check grid status, Job Execution, Job Pending (Pn))

When a job is dequeued from the Job Queue, the scheduler queries the internal database (which fulfills the role of a Grid Information Service) to retrieve RMC, the set of computational resources matching the constraints, and then processes the preferences specified in the corresponding XML file through the CP–Net outcome optimization query, returning RMPC, the set of resources matching both the user’s preferences and constraints. If none of the resources satisfies the constraint query (|RMC| == 0), the client request cannot be scheduled and hence the job is suppressed and its status is updated to JOB_SUPPRESSED. If a set of resources matches the constraints but no resource satisfies the preferences (|RMC| > 0 ∧ |RMPC| == 0), then one of the resources matching the constraints is chosen. Finally, when a set of resources matches the constraints and the CP–Net algorithm returns a set RMPC including at least one resource, which is the best outcome with regard to the preferences (|RMC| > 0 ∧ |RMPC| > 0), the scheduler selects a subset of resources in RMPC totally ordered from best to worst w.r.t. preferences, and tries to schedule the job on the first available resource (round robin).

Before executing the job, the scheduler queries the database again because, owing to the dynamic nature of the Grid environment, it needs to verify whether the selected resource still provides the required number of CPU cores etc.; indeed, another scheduler’s thread may have concurrently submitted another job or rescheduled a job dequeued from the Pending Queue.

If this is the case, i.e., the selected resource can no longer accommodate the job, the job is queued in the Pending Queue (also handled with a FIFO policy). Since this queue has a higher priority than the Job Queue, the scheduler will attempt to serve job requests belonging to this queue before the ones in the Job Queue. In turn, this provides a certain degree of fault–tolerance. However, a job cannot be queued into the Pending Queue over and over again: a field of the database (maxPendingTimes (MPT)) specifies a Time To Live (TTL), so that when this limit is exceeded the job is suppressed and its status is updated to JOB_TIMEOUT.
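The interplay between the two queues and the TTL can be illustrated with a small single-threaded Python sketch. It rests on several assumptions: maxPendingTimes is treated as a count of re-queue operations, try_schedule is a hypothetical callback standing in for matchmaking plus the resource check, and the result label "scheduled" is invented for the example (only JOB_TIMEOUT comes from the text).

```python
from collections import deque


def run_scheduler(jobs, try_schedule, max_pending_times):
    """Serve the Pending Queue before the Job Queue; expire jobs past the TTL."""
    job_queue = deque(jobs)   # FIFO queue of validated jobs
    pending_queue = deque()   # FIFO queue of (job, times already pended)
    result = {}
    while job_queue or pending_queue:
        if pending_queue:     # the Pending Queue has higher priority
            job, pended = pending_queue.popleft()
        else:
            job, pended = job_queue.popleft(), 0
        if try_schedule(job):
            result[job] = "scheduled"
        elif pended >= max_pending_times:
            result[job] = "JOB_TIMEOUT"  # TTL exceeded: the job is suppressed
        else:
            pending_queue.append((job, pended + 1))
    return result


# Hypothetical behaviour: "a" always fits, "b" fits on its third attempt,
# "c" never fits.
attempts = {}


def flaky(job):
    attempts[job] = attempts.get(job, 0) + 1
    return job == "a" or (job == "b" and attempts[job] == 3)


result = run_scheduler(["a", "b", "c"], flaky, max_pending_times=3)
```

Job b is retried from the Pending Queue until it fits, while job c exhausts its allowed re-queues and times out.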

Figure 3 depicts job states and transitions between states; available states include:

– JOB_PENDING: the job is waiting in the Arrivals or Pending Queue;

– JOB_SUBMITTED: the job is processed by the server;

– JOB_TIMEOUT: the time specified in the maxPendingTimes field has elapsed without the scheduler being able to submit the job;

– JOB_SUPPRESSED: no resource was available for job scheduling;

– JOB_DONE: the job completed successfully;

– JOB_FAILED: job execution failed.

4.3 The Information Database

Fig. 3 Job states (start, JOB_SUBMITTED, JOB_PENDING, JOB_TIMEOUT, JOB_SUPPRESSED, JOB_FAILED, JOB_DONE, end; transitions labelled MPT < Max and MPT > Max)

One of the aspects considered in the design of the simulator has been the definition of the database schema in order to provide a system for storing and accessing the data with the usual CRUD operations. Therefore, we modeled the information to be managed taking into account information about the available CPUs and cores, resources, nodes, applications, jobs and queues. A snapshot of the relational schema is shown in Fig. 4. The “Grid” database contains the following information:

– CPU entity:

  – serial_number: CPU serial number (primary key of this entity);
  – node_id: node identifier, represents a resource node containing the CPU;
  – hourly_cost: hourly cost of the CPU;
  – frequency: clock frequency;
  – brand: CPU manufacturer;
  – cores: number of cores;
  – c_int: integer performance value acquired through the SPEC benchmark;
  – c_float: floating point performance value acquired through the SPEC benchmark;
  – cache_L1: size of level 1 cache;
  – cache_L2: size of level 2 cache.

– Node entity:

  – id_node: node identifier (primary key of this entity);
  – ram: RAM size;
  – os_name: installed operating system;
  – os_version: operating system version;
  – os_load_average: load average;
  – network_interface: available network interface;
  – hostname_resource: node’s hostname.

Fig. 4 Relational schema of the “Grid” database (tables: CPU, node, resource, application, application_installed_resource, queue, job, job status)


– Resource entity:

  – hostname: hostname identifying the resource (primary key of this entity);
  – place: resource’s location;
  – storage_type: storage type of the resource;
  – storage_name: storage name of the resource;
  – storage_brand: storage brand of the resource;
  – storage_space: size of space currently used;
  – storage_free_space: size of free space available;
  – bandwidth_in: input network bandwidth;
  – bandwidth_out: output network bandwidth.

– Application entity:

  – name_application: application name (primary key of this entity);
  – type: application type (sequential, parallel);
  – expected_workload: expected workload of the application;
  – additional_prefs: possible preferences.

– Job entity:

  – id: job identifier (primary key of this entity);
  – type: job type (sequential, parallel);
  – SLA: Service Level Agreement associated to the job;
  – name: job name;
  – parent_id: identifier of parent job (required to support workflow applications; NULL for a job without parent);
  – name_queue: name of resource queue in which the job has been queued for execution;
  – name_application: name of application to which the job refers.

– Job Status entity:

  – id_job: job identifier (primary key of this entity);
  – date_time: timestamp (date and hour) associated to the job status;
  – value: job status.

– Queue entity:

  – name: queue name (primary key of this entity, along with hostname_resource);
  – hostname_resource: hostname of the resource to which the queue belongs;
  – type: queue type (sequential, parallel);
  – priority: priority level (high, medium, low);
  – cpus: total number of CPUs handled;
  – free_cpus: number of CPUs currently available for job execution;
  – policy: management policy of the queue.
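As an illustration of how a constraint query could run against such a schema, the following Python sketch builds a reduced version of the node and CPU entities in an in-memory SQLite database. The column types, the sample rows, the choice of SQLite and the query itself are assumptions made for the example; only the table and column names are taken from the listing above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE node (
        id_node INTEGER PRIMARY KEY,
        ram REAL,
        os_name TEXT,
        hostname_resource TEXT
    );
    CREATE TABLE cpu (
        serial_number TEXT PRIMARY KEY,
        node_id INTEGER REFERENCES node(id_node),
        frequency REAL,
        hourly_cost REAL,
        cores INTEGER
    );
    """
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?, ?)",
    [(1, 4, "Aix", "host-a"), (2, 16, "Linux", "host-b"), (3, 2, "Aix", "host-c")],
)
conn.executemany(
    "INSERT INTO cpu VALUES (?, ?, ?, ?, ?)",
    [("cpu-1", 1, 2.0, 1.5, 16), ("cpu-2", 2, 2.4, 1.0, 8), ("cpu-3", 3, 1.6, 0.5, 4)],
)

# Constraint query: frequency at least 1.8, hourly cost at most 2, Aix nodes.
rows = conn.execute(
    """
    SELECT n.hostname_resource
    FROM node AS n JOIN cpu AS c ON c.node_id = n.id_node
    WHERE c.frequency >= ? AND c.hourly_cost <= ? AND n.os_name = ?
    ORDER BY n.hostname_resource
    """,
    (1.8, 2.0, "Aix"),
)
rmc = [hostname for (hostname,) in rows]
```

The resulting list plays the role of RMC, the set of resources matching the constraints.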

4.4 Job Description

The jobs executed during the simulation are represented by XML files containing both the constraints and preferences. These files are validated using a suitable schema. In particular, the Job tag, which represents the root element, must have three child tags: Parameters, Requirements and Preferences. The Parameters tag specifies the job executable, the command line arguments, the type of job (serial or parallel) and the number of required CPUs. Requirements contains constraints on the resources such as the type of CPUs, the amount of RAM, the operating system etc.; Preferences contains the user’s desiderata. A Preferences node may have a maxT node (maxterm) containing a minT (minterm) node which must have a child node named depends. Indeed, preferences’ modeling is based on CP–Nets: a CP–Table associated with a CP–Net node may be expressed using standard forms of expressions such as sum of products (minterms) or product of sums (maxterms) commonly used in boolean algebra and Karnaugh maps. The following is the Document Type Definition for job description.

<!ELEMENT Job (Parameters, Requirements, Preferences)>
<!ELEMENT Parameters (Executable, Arguments, Type)>
<!ELEMENT Executable (#PCDATA)>
<!ELEMENT Arguments (#PCDATA)>
<!ELEMENT Type (#PCDATA)>
<!ELEMENT Requirements (CPU?, Node?, Resource?)>
<!ELEMENT CPU (brand?, cores?, frequency?, cache_L1?, cache_L2?, CINT?, CFP?, hourly_cost?)>
<!ELEMENT Node (ram?, os_name?, os_version?)>
<!ELEMENT Resource (hostname?, place?, bandwidth_in?, bandwidth_out?, storage_free_space?)>
<!ELEMENT Preferences (CPU?, Node?, Resource?)>
<!ELEMENT brand (maxT?)>
<!ELEMENT cores (maxT?)>
<!ELEMENT frequency (maxT?)>
<!ELEMENT cache_L1 (maxT?)>
<!ELEMENT cache_L2 (maxT?)>
<!ELEMENT CINT (maxT?)>
<!ELEMENT CFP (maxT?)>
<!ELEMENT hourly_cost (maxT?)>
<!ELEMENT ram (maxT?)>
<!ELEMENT os_name (maxT?)>
<!ELEMENT os_version (maxT?)>
<!ELEMENT hostname (maxT?)>
<!ELEMENT place (maxT?)>
<!ELEMENT bandwidth_in (maxT?)>
<!ELEMENT bandwidth_out (maxT?)>
<!ELEMENT storage_free_space (maxT?)>
<!ELEMENT maxT (minT+)>
<!ELEMENT minT (depends+)>
<!ELEMENT depends EMPTY>
<!ATTLIST Executable applicationName CDATA #REQUIRED expectedWorkload CDATA #REQUIRED>
<!ATTLIST Type nCpu CDATA #REQUIRED>
<!ATTLIST brand value CDATA #REQUIRED operator (equal) "equal">
<!ATTLIST cores value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST frequency value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST cache_L1 value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST cache_L2 value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST CINT value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST CFP value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST hourly_cost value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST ram value CDATA #REQUIRED operator (max|min) "min">
<!ATTLIST os_name value CDATA #REQUIRED operator (equal) "equal">
<!ATTLIST os_version value CDATA #REQUIRED operator (equal) "equal">
<!ATTLIST hostname value CDATA #REQUIRED operator (equal) "equal">
<!ATTLIST place value CDATA #REQUIRED operator (equal) "equal">
<!ATTLIST bandwidth_in value CDATA #REQUIRED operator (min) "min">
<!ATTLIST bandwidth_out value CDATA #REQUIRED operator (min) "min">
<!ATTLIST storage_free_space value CDATA #REQUIRED operator (min) "min">
<!ATTLIST maxT operation (and|or) "and">
<!ATTLIST minT operation (and|or) "and">
<!ATTLIST depends node (brand|cores|frequency|cache_L1|cache_L2|CINT|CFP|hourly_cost|ram|os_name|os_version|hostname|place|bandwidth_in|bandwidth_out|storage_free_space) "brand" denied (y|n) "n">

Here is an example of a job description file:

<Job>
  <Parameters>
    <Executable applicationName="app_8" expectedWorkload="53.675">48</Executable>
    <Arguments>arg4</Arguments>
    <Type nCpu="8">PARALLEL</Type>
  </Parameters>
  <Requirements>
    <CPU>
      <frequency value="1.8" operator="min"/>
      <hourly_cost value="2" operator="max"/>
    </CPU>
    <Node>
      <ram value="4" operator="max"/>
      <os_name value="Aix" operator="equal"/>
    </Node>
  </Requirements>
  <Preferences>
    <CPU>
      <cores value="16" operator="min">
        <maxT operation="or">
          <minT operation="and">
            <depends node="bandwidth_in" denied="n"/>
          </minT>
        </maxT>
      </cores>
      <hourly_cost value="1.5" operator="max"/>
    </CPU>
    <Resource>
      <bandwidth_in value="12" operator="min">
        <maxT operation="or">
          <minT operation="and">
            <depends node="hourly_cost" denied="y"/>
          </minT>
        </maxT>
      </bandwidth_in>
    </Resource>
  </Preferences>
</Job>

This file describes the following requests of a user:

– Execution of a parallel job named ‘48’, related to the ‘app_8’ application, with at least 8 CPUs (constraint);

– RAM size must be at least 4 GB on each node (constraint);

– The operating system must be Aix (constraint);

– The CPU frequency must be greater than or equal to 1.8 GHz (constraint);

– The hourly cost of the CPUs must be less than or equal to 2 dollars (constraint);

– If possible, the hourly cost of the CPUs should be less than or equal to 1.5 dollars (preference);

– If it is possible to submit the job to a CPU with an hourly cost under 1.5 dollars, preferably the resource should have a minimum input bandwidth of 12 Mb/s (preference);

– If the previous requests can be satisfied, the job should be run on a CPU with 16 cores (preference).
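As a minimal sketch (not part of the simulator described here), a job description of this kind could be read with Python's standard xml.etree.ElementTree module. The helper name read_section and the tuple layout it returns are illustrative assumptions. The external DTD is not loaded by this parser, so the default value of the denied attribute is supplied explicitly.

```python
import xml.etree.ElementTree as ET

# The job description file from the example above.
JOB_XML = """
<Job>
  <Parameters>
    <Executable applicationName="app_8" expectedWorkload="53.675">48</Executable>
    <Arguments>arg4</Arguments>
    <Type nCpu="8">PARALLEL</Type>
  </Parameters>
  <Requirements>
    <CPU>
      <frequency value="1.8" operator="min"/>
      <hourly_cost value="2" operator="max"/>
    </CPU>
    <Node>
      <ram value="4" operator="max"/>
      <os_name value="Aix" operator="equal"/>
    </Node>
  </Requirements>
  <Preferences>
    <CPU>
      <cores value="16" operator="min">
        <maxT operation="or">
          <minT operation="and">
            <depends node="bandwidth_in" denied="n"/>
          </minT>
        </maxT>
      </cores>
      <hourly_cost value="1.5" operator="max"/>
    </CPU>
    <Resource>
      <bandwidth_in value="12" operator="min">
        <maxT operation="or">
          <minT operation="and">
            <depends node="hourly_cost" denied="y"/>
          </minT>
        </maxT>
      </bandwidth_in>
    </Resource>
  </Preferences>
</Job>
"""


def read_section(job, section):
    """Return {attribute: (operator, value, [(parent, denied), ...])}.

    Hypothetical helper: 'section' is "Requirements" or "Preferences".
    """
    entries = {}
    for group in job.find(section):            # CPU, Node, Resource
        for attr in group:                     # e.g. frequency, ram, cores
            parents = [(d.get("node"), d.get("denied", "n") == "y")
                       for d in attr.iter("depends")]
            entries[attr.tag] = (attr.get("operator"), attr.get("value"), parents)
    return entries


job = ET.fromstring(JOB_XML)
requirements = read_section(job, "Requirements")
preferences = read_section(job, "Preferences")

for name, (operator, value, parents) in preferences.items():
    print(name, operator, value, parents)
```

Keeping the parents of each preference next to its operator and value makes the conditional structure of the CP–Net explicit: a preference with an empty parent list is unconditional.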

The following example shows how to express three preferences, each one depending on the previous one; we omit details related to parameters and requirements. In particular, the user prefers a CINT value (related to the CPU performance) of at least 42 and, if this preference holds, the user prefers an AMD CPU; finally, if the CPU is an AMD one, the user prefers a level 2 cache size of at least 1 MB.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Job SYSTEM "gridsim.dtd">
<Job>
  <Parameters> ...</Parameters>
  <Requirements>...</Requirements>
  <Preferences>
    <CPU>
      <brand value="AMD" operator="equal">
        <maxT operation="and">
          <minT operation="or">
            <depends node="CINT" denied="n"/>
          </minT>
        </maxT>
      </brand>
      <cache_L2 value="1" operator="min">
        <maxT operation="and">
          <minT operation="or">
            <depends node="brand" denied="n"/>
          </minT>
        </maxT>
      </cache_L2>
      <CINT value="42" operator="min"/>
    </CPU>
  </Preferences>
</Job>
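The chain of dependencies in this example can be illustrated with a short Python sketch. It assumes one particular reading of the DTD, which the text does not spell out: the operation attribute of maxT combines its minT children, the operation attribute of minT combines its depends children, and denied="y" negates a dependency. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Preferences element of the example above (parameters and requirements omitted).
PREFERENCES_XML = """
<Preferences>
  <CPU>
    <brand value="AMD" operator="equal">
      <maxT operation="and"><minT operation="or">
        <depends node="CINT" denied="n"/>
      </minT></maxT>
    </brand>
    <cache_L2 value="1" operator="min">
      <maxT operation="and"><minT operation="or">
        <depends node="brand" denied="n"/>
      </minT></maxT>
    </cache_L2>
    <CINT value="42" operator="min"/>
  </CPU>
</Preferences>
"""


def combine(operation, values):
    """Fold a list of booleans with 'and' or 'or'."""
    return all(values) if operation == "and" else any(values)


def holds(depends, satisfied):
    """A dependency holds if its node is satisfied, or unsatisfied when denied."""
    negated = depends.get("denied", "n") == "y"
    return (depends.get("node") in satisfied) != negated


def is_active(preference, satisfied):
    """True if the condition attached to a preference is met."""
    max_t = preference.find("maxT")
    if max_t is None:                       # unconditional preference
        return True
    min_terms = [
        combine(min_t.get("operation", "and"),
                [holds(d, satisfied) for d in min_t.findall("depends")])
        for min_t in max_t.findall("minT")
    ]
    return combine(max_t.get("operation", "and"), min_terms)


def active_preferences(xml_text, satisfied):
    """Names of the preferences whose conditions hold, given satisfied ones."""
    root = ET.fromstring(xml_text)
    return sorted(p.tag for group in root for p in group if is_active(p, satisfied))


print(active_preferences(PREFERENCES_XML, set()))
print(active_preferences(PREFERENCES_XML, {"CINT"}))
print(active_preferences(PREFERENCES_XML, {"CINT", "brand"}))
```

Under this reading, only the CINT preference applies initially; the brand preference becomes relevant once CINT is satisfied, and the cache_L2 preference once brand is satisfied.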

5 Impact of CP–Nets on Scheduling

In this section we present the experimental results we obtained. We begin by describing the experiments that have been carried out, which are characterized by the following parameters:

– r, number of computational resources managed by the Grid scheduler;

– j, number of jobs submitted to the Grid scheduler;

– w, workload expressed as number of jobs already running on the Grid;

– e, boolean value indicating if the CP–Net algorithm is enabled or disabled;

– a, number of applications to be simulated;

– n, total number of nodes;

– c, total number of CPUs.

We designed and carried out 38 different experiments, which are characterized by r ∈ {500, 1000, 2000}, j ∈ {500, 1000, 2000, 100000}, w ∈ {0, 500, 1000, 2000}, e ∈ {TRUE, FALSE}, a = 24. The first 36 experiments have been run submitting up to 2,000 jobs on small, medium and large Grids: for r = 500, n = 13,567 and c = 365,204; for r = 1,000, n = 27,766 and c = 729,084; for r = 2,000, n = 52,642 and c = 1,330,148. The last two experiments have been run submitting 100,000 jobs on a large Grid: r = 2,000, n = 52,737 and c = 1,404,214.
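The count of 38 experiments can be reproduced by enumerating the parameter combinations. The sketch below assumes, as the table captions suggest, that each Grid of r resources is tested either unloaded (w = 0) or with an initial workload of w = r jobs; this pairing is an inference, not something stated explicitly in the text, and the function name is hypothetical.

```python
from itertools import product

GRID_SIZES = (500, 1000, 2000)     # r
JOB_COUNTS = (500, 1000, 2000)     # j, for the first 36 experiments
APPLICATIONS = 24                  # a


def experiment_grid():
    """Enumerate (r, j, w, e) for all experiments (assumed pairing w in {0, r})."""
    configs = []
    for r, j, loaded, e in product(GRID_SIZES, JOB_COUNTS, (False, True), (False, True)):
        configs.append({"r": r, "j": j, "w": r if loaded else 0, "e": e, "a": APPLICATIONS})
    # Last two experiments: 100,000 jobs on the unloaded large Grid.
    for e in (False, True):
        configs.append({"r": 2000, "j": 100_000, "w": 0, "e": e, "a": APPLICATIONS})
    return configs


configs = experiment_grid()
print(len(configs))
```

Three Grid sizes, three job counts, two workload settings and two values of e give 36 combinations; the two large runs bring the total to 38.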

The hardware used in the first 36 experiments consists of three SMP (Symmetric Multi-Processor) nodes configured with two Intel Itanium 2 single core processors at 1.4 GHz, with 1.5 MB of level 3 cache and 4 GB of main memory. One of the nodes was dedicated to the execution of our Grid simulator, another one to the back-end PostgreSQL database, and the last one was used to issue the user’s requests to the Grid simulator. In order to stress the simulator, all of the requests were issued concurrently. For the last two experiments, the hardware used consists of three SMP nodes configured with two Intel Xeon E5520 dual core processors at 2.27 GHz, with 8 MB of level 3 cache and 8 GB of main memory.

Table 1 Unloaded Grid consisting of 500 resources

Jobs                                      500      1,000      2,000
Time (CP–Net disabled)               1,329.29   1,618.02   1,592.54
Time (CP–Net enabled)                2,525.93   2,080.67   2,798.98
Difference                           1,196.64     462.64   1,206.44
% Difference                            90.02      28.59      75.76
Average (CP–Net disabled)                2.66       3.24       3.19
Std deviation (CP–Net disabled)          5.29       5.76       3.28
Average (CP–Net enabled)                 5.05       4.16       5.60
Std deviation (CP–Net enabled)           8.47       5.61       6.57
Resources (CP–Net disabled)               383        484        486
Resources (CP–Net enabled)                293        420        482
Difference                                −90        −64         −4
% Difference                           −23.49     −13.22      −0.82

Table 2 Grid consisting of 500 resources, initial workload of 500 jobs

Jobs                                      500      1,000      2,000
Time (CP–Net disabled)                 945.63   1,293.75   1,399.59
Time (CP–Net enabled)                1,946.44   2,355.04   2,687.59
Difference                           1,000.81   1,061.29   1,287.99
% Difference                           105.83      82.03      92.03
Average (CP–Net disabled)                1.89       2.59       2.80
Std deviation                            4.13       6.19       6.68
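The Difference and % Difference rows of the tables follow directly from the two Time rows. As a worked check on the values reported in Table 1 (the function name overhead is an illustrative choice), the absolute and relative overhead of enabling the CP–Net algorithm can be computed as follows; small discrepancies in the last digit with respect to the table are due to rounding of the published figures.

```python
# Total scheduling times in seconds from Table 1: jobs -> (disabled, enabled).
TABLE_1 = {
    500: (1329.29, 2525.93),
    1000: (1618.02, 2080.67),
    2000: (1592.54, 2798.98),
}


def overhead(disabled, enabled):
    """Return (absolute difference, percentage difference w.r.t. disabled)."""
    difference = enabled - disabled
    return difference, 100.0 * difference / disabled


for jobs, (disabled, enabled) in TABLE_1.items():
    difference, percent = overhead(disabled, enabled)
    print(f"{jobs:>5} jobs: +{difference:.2f} s ({percent:.2f} %)")
```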

Table 3 Unloaded Grid consisting of 1,000 resources

Jobs                                      500      1,000      2,000
Time (CP–Net disabled)               4,478.43   5,832.93   5,948.85
Time (CP–Net enabled)                6,874.56   8,533.51   8,644.94
Difference                           2,396.12   2,700.58   2,696.09
% Difference                            53.50      46.30      45.32
Average (CP–Net disabled)                8.96      11.67      11.90
Std deviation                           27.96      36.77      37.33

Table 4 Grid consisting of 1,000 resources, initial workload of 1,000 jobs

Jobs                                      500      1,000      2,000
Time (CP–Net disabled)               3,639.75   5,067.47   5,525.62
Time (CP–Net enabled)                5,719.73   7,587.18   7,612.77
Difference                           2,079.98   2,519.71   2,087.15
% Difference                            57.15      49.72      37.77
Average (CP–Net disabled)                7.28      10.13      11.05
Std deviation                           23.01      32.11      35.12

Table 5 Unloaded Grid consisting of 2,000 resources

Jobs                                      500      1,000      2,000
Time (CP–Net disabled)               34,256.4   63,764.1    132,312
Time (CP–Net enabled)                38,972.8   76,280.9    153,356
Difference                           4,716.47   12,516.8   21,044.2
% Difference                            13.76      19.62       15.9
Average (CP–Net disabled)               68.51      63.76      66.15
Std deviation (CP–Net disabled)        179.16     177.42     184.17
Average (CP–Net enabled)                77.94      76.28      76.67
Std deviation (CP–Net enabled)         186.84     192.03     193.25
Resources (CP–Net disabled)               445        852      1,559
Resources (CP–Net enabled)                435        795      1,319

Table 6 Grid consisting of 2,000 resources, initial workload of 2,000 jobs

Jobs                                      500      1,000      2,000
Time (CP–Net disabled)               32,767.8   61,576.5    125,502
Time (CP–Net enabled)                38,217.1   75,871.9    145,591
Difference                            5,449.3   14,295.4     20,089
% Difference                            16.63      23.21     16.007
Average (CP–Net disabled)               65.53      61.57      62.75
Std deviation (CP–Net disabled)        173.91     171.65      174.7
Average (CP–Net enabled)                76.43      75.87      72.79
Std deviation (CP–Net enabled)         185.39     191.97      184.4
Resources (CP–Net disabled)               855      1,564      1,948
Resources (CP–Net enabled)                794      1,317      1,869

Table 7 Unloaded Grid consisting of 2,000 resources, 100,000 submitted jobs

Jobs                                  100,000
Time (CP–Net disabled)              9,949,690
Time (CP–Net enabled)              11,782,200
Difference                          1,832,480
% Difference                            18.41
Average (CP–Net disabled)               99.49
Std deviation (CP–Net disabled)        324.07
Average (CP–Net enabled)               117.82
Std deviation (CP–Net enabled)         370.76
Resources (CP–Net disabled)             1,950
Resources (CP–Net enabled)              1,954
Difference                                  4
% Difference                              0.2

Tables 1, 2, 3, 4, 5, 6 and 7 and Figs. 5, 6, 7, 8, 9, 10, 11, 12, 13 and 14 summarize the results obtained. In these tables, the total time for scheduling the jobs, the average time to schedule one job and the standard deviation are expressed in seconds. It is worth clarifying here that, since our focus is on matchmaking and not on scheduling, the impact of CP–Nets on scheduling is measured by carrying out several pairs of experiments, respectively enabling or disabling the CP–Net algorithm (parameter e), in order to verify quantitatively the net effect of enabling it and to demonstrate that it is negligible. We schedule the jobs using the simplest round-robin strategy applied to the set of resources returned by the CP–Nets algorithm, which are ranked according to their matching degrees; i.e., each job is scheduled on the first available resource returned by the algorithm. Given that this is the simplest possible scheduling strategy, our aim is therefore not to compare different scheduling algorithms but, rather, to determine experimentally the time required to schedule all of the jobs, the average time and the standard deviation to schedule a job, and resource utilization when CP–Nets matchmaking is taken into account.
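The placement rule used in the experiments, in which each job goes to the first available resource in the ranked list returned by the matchmaker, can be sketched as follows. This is an illustration rather than the simulator's code: availability is modeled here simply as a count of free CPUs per resource, and all names and sample values are hypothetical.

```python
def schedule(jobs, ranked_resources, free_cpus):
    """Assign each job to the first ranked resource with enough free CPUs.

    jobs: list of (job_name, cpus_needed)
    ranked_resources: job_name -> resource names, best match first
    free_cpus: resource name -> free CPU count (updated in place)
    """
    placement = {}
    for job, cpus_needed in jobs:
        placement[job] = None                      # stays None if nothing fits
        for resource in ranked_resources[job]:
            if free_cpus.get(resource, 0) >= cpus_needed:
                free_cpus[resource] -= cpus_needed
                placement[job] = resource
                break
    return placement


# Hypothetical data: two 8-CPU jobs that both rank resource "A" first.
sample_jobs = [("j1", 8), ("j2", 8)]
sample_ranking = {"j1": ["A", "B"], "j2": ["A", "B"]}
sample_free = {"A": 8, "B": 16}
print(schedule(sample_jobs, sample_ranking, sample_free))
```

Because no scheduling optimization is attempted, any difference between the runs with and without the CP–Net algorithm can be attributed to the matchmaking step alone.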


Fig. 5 500 jobs on a Grid consisting of 500 resources [histograms; x axis: Seconds, y axis: Number of jobs; series: CP–Net disabled, CP–Net enabled; panels: (a) Grid unloaded, (b) Grid workload: 500 jobs]

We begin by discussing the results related to Table 1 and Figs. 5–7. The figures are histograms in which we plot the distribution of the data as frequencies. On the x and y axes we plot, respectively, the time in seconds to schedule a job (rounded to the nearest integer) and the number of jobs that required that time to be scheduled.
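The binning behind such histograms is straightforward: each per-job scheduling time is rounded to the nearest integer number of seconds, and the jobs falling on each value are counted. A minimal sketch follows; the sample times are made up for illustration and are not data from the experiments.

```python
import math
from collections import Counter


def scheduling_histogram(times_in_seconds):
    """Map each rounded scheduling time (s) to the number of jobs with that time."""
    # floor(t + 0.5) rounds halves up; Python's round() would round halves to even.
    return Counter(math.floor(t + 0.5) for t in times_in_seconds)


# Hypothetical per-job scheduling times in seconds.
sample_times = [1.2, 1.4, 2.6, 3.0, 2.5]
histogram = scheduling_histogram(sample_times)
for seconds in sorted(histogram):
    print(seconds, histogram[seconds])
```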

For a Grid consisting of 500 resources with no initial workload (Table 1 and Figs. 5–7), enabling the CP–Net algorithm leads, as expected, to an increase of the total time required to schedule the jobs. However, the rate of increase is not directly proportional to the number of submitted jobs. When submitting 500 jobs, the rate of increase is 90.02 %, on average it takes 5.05 s to schedule a job (versus 2.66 s with CP–Nets disabled) and the standard deviation is 8.47 s (versus 5.29 s). The rate of increase is only 28.59 % when submitting 1,000 jobs. Correspondingly, on average it takes 4.16 s to schedule a job (versus 3.24 s with CP–Nets disabled) and the standard deviation is 5.61 s (versus 5.76 s). Therefore, in this case the scheduling process appears to be more uniform, with less dispersion around the mean value w.r.t. the same experiment in which CP–Nets are not enabled. Finally, when submitting 2,000 jobs the rate of increase becomes 75.76 %, the average time to schedule a job is 5.6 s (versus 3.19 s with CP–Nets disabled) and the standard deviation is 6.57 s (versus 3.28 s). From these experimental results we conclude that, for this Grid, increasing the number of submitted jobs leads to a reduction of the small overhead associated with the CP–Net algorithm up to (probably) a minimum value, after which the overhead increases again. The behavior is thus that of a function that first decreases and then increases. As can be seen in Figs. 5–7, the majority of the submitted jobs require a few seconds to be scheduled, with or without the CP–Net algorithm. The figures also show the presence of outliers, i.e., a few jobs requiring more time to be scheduled. We now discuss resource utilization.

Fig. 6 1,000 jobs on a Grid consisting of 500 resources [histograms; x axis: Seconds, y axis: Number of jobs; panel: (a) Grid unloaded]





Fig. 7 2,000 jobs on a Grid consisting of 500 resources [histograms; x axis: Seconds, y axis: Number of jobs; panel: (a) Grid unloaded]




Since the applications in our simulations are not installed on each resource, a complete utilization of the full set of available resources is not possible. In the experiments, resource utilization falls from 383 to 293 resources used when submitting 500 jobs, from 484 to 420 resources for 1,000 jobs and finally from 486 to 482 for 2,000 jobs. The trend is therefore that of a monotonically increasing function; when enabling the CP–Net algorithm, the overall difference in resource utilization becomes negligible as the number of submitted jobs increases.

Regarding the rate of increase of the scheduling time associated with the CP–Net algorithm, the same pattern can be observed in the results related to Table 2 and Figs. 5–7. These results are related to the same Grid consisting of 500 resources, but in the corresponding experiments there is an initial workload of 500 jobs already running on the Grid before submitting new jobs. While the rate of increase is higher than for the unloaded Grid, we observe that on average the time required to schedule a job is almost always lower. Resource utilization is also better, with almost no difference when increasing the number of submitted jobs, and very close to the maximum possible (given that, as already stated, full resource utilization is not possible).

We now analyze the results obtained for a Grid consisting of 1,000 resources, with no initial workload. These results are provided in Table 3 and Figs. 8–10. As shown, when increasing the number of computational resources belonging to the Grid, the rate of increase is monotonically decreasing when increasing the number of submitted jobs from 500 to 2,000. The average time to schedule a job using the CP–Net algorithm is, respectively, 13.75, 17.07 and 17.29 s (versus 8.96, 11.67 and 11.9 s without CP–Net) for 500, 1,000 and 2,000 submitted jobs. The overall resource utilization factor is very good for 500 and 2,000 jobs (respectively a difference of 31 and 85 resources not utilized) and worse for 1,000 submitted jobs (difference of 162 resources). The number of resources utilized increases steadily with the number of submitted jobs, reaching 887 resources (out of 1,000) for 2,000 jobs.

Fig. 8 500 jobs on a Grid consisting of 1,000 resources [histograms; x axis: Seconds, y axis: Number of jobs; panels: (a) Grid unloaded, (b) Grid workload: 1,000 jobs]

Fig. 9 1,000 jobs on a Grid consisting of 1,000 resources [histograms; x axis: Seconds, y axis: Number of jobs; panel: (a) Grid unloaded]

When considering the same Grid consisting of 1,000 resources (Table 4 and Figs. 8–10), this time with an initial workload of 1,000 jobs already running on the Grid before submitting new jobs, we obtained the following results. As in the previous case, the rate of increase is monotonically decreasing when increasing the number of submitted jobs from 500 to 2,000. Moreover, the average time required to schedule a job is lower than the corresponding time in the previous case, and resource utilization is again steadily increasing with the number of submitted jobs, reaching 972 resources (out of 1,000) for 2,000 jobs.

Regarding the results obtained for a Grid consisting of 2,000 resources, with no initial workload and with an initial workload of 2,000 jobs (Tables 5–6 and Figs. 11–13), we note a dramatic decrease of the rate of increase of the total time required to schedule the jobs with respect to the small (500 resources) and the medium (1,000 resources) sized Grids in all of the experiments with 500, 1,000 and 2,000 submitted jobs. The average time to schedule a job and the standard deviation when using the CP–Net algorithm are only slightly larger than the corresponding values without the CP–Net algorithm, and the increase is negligible. Resource utilization is also quite good. The utilization factor increases with the number of submitted jobs, and, for the Grid with no initial workload, in the worst case (2,000 jobs submitted) there is only a −15.39 % difference between the corresponding experiments with and without CP–Net. For the Grid with an initial workload of 2,000 jobs, the worst case happens when submitting 1,000 jobs, with a similar −15.79 % difference. Interestingly, the percentage difference is the lowest (−4.05 %) when submitting 2,000 jobs.
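The percentages quoted above can be checked against the Resources rows of Tables 5 and 6. In this worked sketch the second Resources row of each table is taken to be the CP–Net enabled case, which is consistent with the percentages quoted in the text; the dictionary layout and function name are illustrative, and the published figures appear to be truncated rather than rounded in one case.

```python
# Resources used: (grid state, submitted jobs) -> (CP-Net disabled, CP-Net enabled).
UTILIZATION = {
    ("unloaded", 500): (445, 435),
    ("unloaded", 1000): (852, 795),
    ("unloaded", 2000): (1559, 1319),
    ("loaded", 500): (855, 794),
    ("loaded", 1000): (1564, 1317),
    ("loaded", 2000): (1948, 1869),
}


def utilization_change(disabled, enabled):
    """Percentage change in the number of resources used when CP-Nets are enabled."""
    return 100.0 * (enabled - disabled) / disabled


for (state, jobs), (disabled, enabled) in UTILIZATION.items():
    print(f"{state:>8} {jobs:>5} jobs: {utilization_change(disabled, enabled):.2f} %")
```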







Fig. 11 500 jobs on a Grid consisting of 2,000 resources [histograms; x axis: Seconds, y axis: Number of jobs; panel: (a) Grid unloaded]




Finally, Table 7 and Fig. 14 refer to the last two experiments, carried out on an unloaded Grid consisting of 2,000 resources. To assess the scalability of the CP–Net algorithm, we submitted 100,000 jobs. As shown, the time required to submit all of the jobs using the CP–Net algorithm is only 18.41 % more than the corresponding time without the algorithm. Average time and standard deviation increase slightly, and resource utilization is in this case even slightly better, with more resources utilized when running the experiment with the CP–Net algorithm enabled.

6 Related Work

In this paper we strictly deal with the matchmaking process in the context of Grid scheduling, not with scheduling algorithms in general; therefore, we discuss in this section relevant work in the field of matchmaking only.

In the field of Artificial and Computational Intelligence, the earliest results in matchmaking include [5, 22, 35, 38, 41, 43]. Agent-Based Software Interoperability (ABSI) [38] takes advantage of the KQML (Knowledge Query and Manipulation Language) specification and uses KIF (Knowledge Interchange Format) as content language. Matchmaking of advertisements and users’ requests happens through the unification of equality predicates.

COIN [22] is a system in which matchmaking is based on a unification process quite similar to the one carried out by the Prolog programming language. InfoSleuth [5] uses KIF as the content language; the matchmaking process is based on solving a constraint satisfaction problem, so that an advertisement and a request match if the user’s constraints are satisfied.

The Service Description Language has been proposed in [41] to describe available services. Here, matchmaking requires determining the k-nearest services for a request according to the distance between the service names (pairs of verb and noun terms) and the request. Capability Description Language was proposed in [48]. It supports reasoning through the notions of subsumption and instantiation.






Language for Advertisement and Request for Knowledge Sharing (LARKS) appeared in [43] and is able to describe both service capabilities and service requests. It is based on the ITL (Information Terminological Language) concept language [42]. LARKS exploits the relations among concepts in order to compute semantic similarities.

Traditionally, service and resource discovery have been carried out using methods based on name and keyword matchmaking. A semantic matchmaking framework based on DAML-S, a DAML (DARPA Agent Markup Language)-based language for service description, has been proposed in [35]. In this ontology-based matchmaking framework an advertisement matches a request when the service or resource provided by the advertisement can provide a certain degree of usefulness to the requester. When performing matchmaking, the system compares the outputs and inputs of the advertisement and of the request on the basis of the available ontologies and, through the subsumption relationship between a concept of the input/output of the advertisement and a concept of the input/output of the request, is able to determine four different levels of matching: exact, plug-in, subsume, and fail.
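To make the four levels concrete, the following sketch classifies a pair of concepts using subsumption in a toy ontology. It is a simplified reading for a single pair of concepts, not the full algorithm of [35]; the ontology, the concept names and the function names are all hypothetical.

```python
# Hypothetical ontology: child concept -> parent concept.
PARENT = {"Sedan": "Car", "Car": "Vehicle", "Truck": "Vehicle"}


def subsumes(general, specific):
    """True if 'general' is 'specific' or one of its ancestors in the ontology."""
    while specific is not None:
        if specific == general:
            return True
        specific = PARENT.get(specific)
    return False


def degree_of_match(advertised, requested):
    """Classify the match between an advertised and a requested concept."""
    if advertised == requested:
        return "exact"
    if subsumes(advertised, requested):      # advertisement is more general
        return "plug-in"
    if subsumes(requested, advertised):      # request is more general
        return "subsume"
    return "fail"


for advertised in ("Car", "Vehicle", "Sedan", "Truck"):
    print(advertised, "vs Car:", degree_of_match(advertised, "Car"))
```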

Another ontology-based matchmaking service is presented in [18]. It uses separate ontologies to declaratively describe resources and job requests. Instead of exact syntax matching, their ontology-based matchmaker performs semantic matching using terms defined in ontologies. The loose coupling between resource and request descriptions removes the tight coordination requirement between resource providers and consumers. The authors designed and prototyped their matchmaking service using TRIPLE to use ontologies encoded in W3C Resource Description Format (RDF) and rules (based on Horn logic and F-logic) for resource matching. Resource descriptions, request descriptions, and usage policies are all independently modeled and syntactically and semantically described using RDF schema. Inference rules are utilized for reasoning about the characteristics of a request, available resources, and usage policies to find a resource that satisfies the request requirements.

Fig. 14 Unloaded Grid consisting of 2,000 resources, 100,000 submitted jobs [histogram; x axis: Seconds, y axis: Number of jobs; series: CP–Net enabled, CP–Net disabled]

In [3] the authors implemented a matchmaking service in an intelligent Grid environment, the BondGrid [4]. Their matchmaking framework is based on a resource specification component, a request specification component, and matchmaking algorithms. The request specification includes a matchmaking function and possibly two additional constraints, namely a cardinality threshold and a matching degree threshold. The cardinality threshold specifies how many resources the requestor expects from the matchmaking service. The purpose of the matching degree threshold is to specify the least matching degree that a returned resource may have. The input of the matchmaking algorithm is the request and the Grid resource instances stored in a knowledge base; the algorithm evaluates the request function in the context of each resource instance. The output is a number of Grid resources, which are ranked according to their matching degrees. The matchmaking service returns the Grid resources that have the n largest matching degrees to the requester, where n is the cardinality threshold specified by the request.
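The selection step described for [3] (evaluate the request function on every resource, discard matches below the matching degree threshold, return the n best) can be sketched in a few lines. This is an illustration of the idea only, not the BondGrid implementation; the function name, the resource attributes and the request function are hypothetical.

```python
import heapq


def matchmake(request_fn, resources, cardinality, min_degree=0.0):
    """Return the names of the 'cardinality' best resources, best first.

    request_fn: maps a resource description to a matching degree
    resources: resource name -> description
    min_degree: matching degree threshold; lower-scoring resources are dropped
    """
    scored = ((request_fn(description), name) for name, description in resources.items())
    eligible = [(degree, name) for degree, name in scored if degree >= min_degree]
    return [name for _, name in heapq.nlargest(cardinality, eligible)]


# Hypothetical resources and request: prefer more RAM, saturating at 16 GB.
sample_resources = {"a": {"ram": 4}, "b": {"ram": 16}, "c": {"ram": 8}}
print(matchmake(lambda r: min(r["ram"] / 16, 1.0), sample_resources,
                cardinality=2, min_degree=0.3))
```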

In [30], the authors discuss the problem of matchmaking for mathematical services, where the semantics play a critical role in determining the applicability or otherwise of a service and for which they use OpenMath descriptions of pre- and post-conditions. A matchmaking architecture supporting the use of match plug-ins is described, along with five kinds of plug-in that have been developed for this purpose: (i) a basic structural match, (ii) a syntax and ontology match, (iii) a value substitution match, (iv) an algebraic equivalence match and (v) a decomposition match. Their matchmaker uses the individual match scores from the plug-ins to compute a ranking by applicability of the services. The authors consider the effect of pre- and post-conditions of mathematical service descriptions on matching, and how and why to reduce queries into Disjunctive Normal Form (DNF) before matching. Finally, a case study demonstrates in detail how the matching process works.

Trust-aware matchmaking is the subject of [2]. The authors present a peer-to-peer trust brokering system, in which the network of trust brokers operates by providing peer reviews in the form of recommendations regarding potential resource targets. One of the distinguishing features of this work is that it separately models the accuracy and honesty concepts, so that their model is able to significantly improve the performance. The trust brokering system is applied to a resource manager in order to illustrate its utility in a public-resource Grid environment. The simulations performed to evaluate the trust-aware resource matchmaking strategies indicate that high levels of robustness can be attained by considering trust while matchmaking and allocating resources.

The Condor high-throughput resource management system for compute-intensive jobs [45] requires task submissions to be specified in description files containing basic information and task requirements. The latter are translated to classified advertisements (ClassAds), which are sets of named expressions. ClassAds, which map attribute names to expressions, are also used to express characteristics of resources, and a matchmaking service matches task and resource-related ClassAds to determine the proper resources where tasks can be executed. A Constraint attribute in a classad is evaluated against the classad being matched with it, and when the Constraint attributes of both classads evaluate to true, the two classads are matched. Another attribute, Rank, measures the quality of a match. The value of Rank therefore provides an indication of how well the two classads match, so that the larger the value, the better the match. Condor requires a provider and a requester to know each other’s classad structure. The evaluation result of the Rank attribute is, in general, not normalized and cannot tell explicitly how well two classads match.
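The symmetric nature of this matching can be illustrated with plain Python dictionaries whose Constraint and Rank entries are functions of the two advertisements. This is only a sketch of the idea under that modeling assumption; it does not use the ClassAd language or any Condor API, and the attribute names other than Constraint and Rank are hypothetical.

```python
def symmetric_match(ad_a, ad_b):
    """Two ads match when each one's Constraint holds against the other."""
    return bool(ad_a["Constraint"](ad_a, ad_b) and ad_b["Constraint"](ad_b, ad_a))


# Hypothetical job ad: needs a machine with at least its own memory demand,
# and ranks machines by their (made-up) Mips attribute.
job_ad = {
    "Type": "Job",
    "Memory": 4,
    "Constraint": lambda my, other: other["Type"] == "Machine"
                                    and other["Memory"] >= my["Memory"],
    "Rank": lambda my, other: other["Mips"],
}

# Hypothetical machine ad: accepts any job and is indifferent among them.
machine_ad = {
    "Type": "Machine",
    "Memory": 8,
    "Mips": 1200,
    "Constraint": lambda my, other: other["Type"] == "Job",
    "Rank": lambda my, other: 0,
}

if symmetric_match(job_ad, machine_ad):
    # Rank is not normalized: it only orders candidate matches for this ad.
    print("match, job-side rank:", job_ad["Rank"](job_ad, machine_ad))
```

Note how both Constraint functions refer to attributes of the other advertisement, which is why each party must know the structure of the other's classad.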

Matchmaking in Condor supports selecting only one resource. In [28], an extension, called set-extended classad syntax, was proposed in order to support multiple resource selection. Matchmaking works by evaluating a set-extended classad against a set of classads and returning the classad set with the highest rank. However, when the size of the classad set is large, evaluating all of the possible combinations is infeasible; in this case, a simple greedy heuristic is used to find the classad set providing the highest rank.

In [27] the authors present a new approach to symmetric matching that achieves significant advances in expressiveness relative to ClassAds. It allows multi-way matches, expression and location of resource with negotiable capability. The key to their approach is reinterpreting matching as a constraint problem and exploiting constraint-solving technologies to implement matching operations. A prototype matchmaking mechanism, named Redline, has been implemented and used to model and solve several challenging matching problems.

GREEN [11] matches a job demand with a Grid resource supply on the basis of a characterization of resources by means of their performance, evaluated through benchmarks relevant to the application. The matchmaking service is based on a two-level benchmarking methodology; a requestor specifies both syntactic and performance requirements, independently of the underlying middleware. GREEN fosters Grid interoperability through the use of JSDL to express job submission requirements, and an internal translation to the job submission languages used by the target middleware. Middleware independence is pursued through an extension of JSDL based on the GLUE 2.0 schema. Moreover, some extensions to JSDL related to concurrency aspects were borrowed from the JSDL SPMD Application Extension, in order to support the execution of parallel applications.

A resource selection system for exploiting graphics processing units (GPUs) as general-purpose computational resources in desktop Grid environments is presented in [21]. The system allows Grid users to share remote GPUs, which are traditionally dedicated to local users who directly see the display output. The key contribution of this paper is a novel system for non-dedicated environments. The authors first show criteria for defining idle GPUs from the Grid users’ perspective. Based on these criteria, the system uses a screensaver approach with some sensors that detect idle resources at a low overhead. The idea for this lower overhead is to avoid GPU intervention during resource monitoring. Detected idle GPUs are then selected according to a matchmaking service, making the system adaptive to the rapid advance of GPU architecture. Though the system itself is not yet interoperable with current desktop Grid systems, the idea can be applied to screensaver-based systems such as BOINC. The system has been evaluated using Windows PCs with three generations of nVIDIA GPUs. The experimental results show that it achieves a low overhead of at most 267 ms, minimizing interference to local users while maximizing the performance delivered to Grid users. Some case studies are also performed in an office environment to demonstrate the effectiveness of the system in terms of the amount of detected idle time.

7 Conclusions

In this paper we dealt with the problem of conditional preference matchmaking of computational resources belonging to a Grid. We introduced CP–Nets, a recent development in the field of Artificial Intelligence, as a means to deal with user’s preferences in the context of Grid scheduling. We discussed CP–Nets from a theoretical perspective and then analyzed, qualitatively and quantitatively, their impact on the matchmaking process, with the help of a Grid simulator we developed for this purpose. Many different experiments have been set up and carried out, and we report here our main findings and the lessons learnt.

1. Introducing CP–Nets in the matchmaking process is feasible. The overhead associated with CP–Nets is minimal when considering that the average time to schedule a job in all of our experiments ranges from 3.89 s (best case) to 117.82 s (worst case, less than two minutes). We also note here that, besides requiring a few seconds, the average time to schedule a job when using CP–Nets is almost always close to the average time to schedule a job without CP–Nets, and is never more than two times this value. Moreover, the outcome optimization query is polynomial (linear) in its input for the particular case related to our experiments, and the average time to schedule a job is not directly proportional to the number of submitted jobs.

2. Bigger Grids are well suited to the use of CP–Nets in the matchmaking process. Compared to smaller Grids, bigger ones exhibit a reduced rate of increase of the scheduling time associated with the CP–Net algorithm.

3. Resource utilization does not decrease excessively when using CP–Nets. Overall, resource utilization is extremely good, ranging from no difference at all (best case) to a maximum difference of 23.49 %. For bigger Grids and workloads, resource utilization is close to the maximum possible.

4. Grids with an initial workload provide better performance w.r.t. unloaded ones. Interestingly, Grids which are already busy executing a previous workload react better to conditional preference matchmaking, leading in almost all of the experiments to a reduced average time to schedule a job.

Therefore, we conclude that CP–Nets can be a useful tool to ensure that user’s preferences are met in the matchmaking process, so that scheduling may provide results more appealing to the end users, with minimal overhead.

Acknowledgements The authors would like to thank the anonymous reviewers for their useful, constructive comments, which greatly helped to improve the quality of this paper.

References

1. Anjomshoaa, A., Brisard, F., Drescher, M., Fellows,D., Ly, A., McGough, S., Pulsipher, D., Savva, A.: Jobsubmission description language (jsdl), specification,version 1.0. Global Grid Forum Working Draft (2005)

2. Azzedin, F., Maheswaran, M., Mitra, A.: Trust broker-ing and its use for resource matchmaking in public-resource Grids. J. Grid Computing 4, 247–263 (2006)

3. Bai, X., Yu, H., Ji, Y., Marinescu, D.C.: Resourcematching and a matchmaking service for an intelligentGrid. In: International Conference on ComputationalIntelligence, pp. 262–265 (2004)

4. Bai, X., Yu, H., Wang, G., Ji, Y., Marinescu, D., Bölöni,L.: Intelligent Grids. In: Grid Computing: Software En-vironments and Tools, pp. 45–74. Springer (2005)

5. Bayardo, R.J. Jr., Bohrer, W., Brice, R., Cichocki, A.,Fowler, J., Helal, A., Kashyap, V., Ksiezyk, T., Martin,G., Nodine, M., Rashid, M., Rusinkiewicz, M., Shea,R., Unnikrishnan, C., Unruh, A., Woelk, D.: Infos-leuth: agent-based semantic integration of informationin open and dynamic environments. SIGMOD Rec.26(2), 195–206 (1997)

6. Boutilier, C., Brafman, R.I., Domshlak, C., Hoos, H.H.,Poole, D.: Cp-nets: a tool for representing and reason-ing with conditional ceteris paribus preference state-ments. J. Artif. Intell. Res. 21, 135–191 (2004)

7. Boutilier, C., Brafman, R.I., Hoos, H.H., Poole, D.:Reasoning with conditional ceteris paribus preferencestatements. In: Laskey, K.B., Prade, H. (eds.) UAI, pp.71–80. Morgan Kaufmann (1999)

8. Buyya, R., Murshed, M.: Gridsim: a toolkit for themodeling and simulation of distributed resource man-agement and scheduling for Grid computing. Concurr.Comput.: Pract. Exper. 14(13–15), 1175–1220 (2002)

9. Cameron, D.G., Millar, A.P., Nicholson, C., Carvajal-Schiaffino, R., Stockinger, K., Zini, F.: Analysis ofscheduling and replica optimisation strategies for dataGrids using optorsim. J. Grid Computing 2(1), 57–69(2004)

10. Casanova, H., Legrand, A., Quinson, M.: Simgrid:a generic framework for large-scale distributed ex-periments. In: Proceedings of the 10th InternationalConference on Computer Modeling and Simulation,UKSIM ’08, pp. 126–131. IEEE Computer Society(2008)

11. Clematis, A., Corana, A., D’Agostino, D., Galizia,A., Quarati, A.: Job-resource matchmaking on Grid

through two-level benchmarking. Future Gener.Comput. Syst. 26(8), 1165–1179 (2010)

12. Czajkowski, K., Fitzgerald, S., Foster, I., Kesselman, C.:Grid information services for distributed resource shar-ing. In: Proceedings of 10th IEEE International Sym-posium on High Performance Distributed Computing,2001, pp. 181–194 (2001)

13. Dail, H., Sievert, O., Berman, F., Casanova, H.,YarKhan, A., Vadhiyar, S., Dongarra, J., Liu, C., Yang,L., Angulo, D., Foster, I.: Scheduling in the Grid ap-plication development software project. In: Nabrzyski,J., Schopf, J.M., Weglarz, J. (eds.) Grid ResourceManagement, pp. 73–98. Kluwer Academic Publishers,Norwell, MA, USA (2004)

14. Dumitrescu, C., Foster, I.: Usage policy-based cpusharing in virtual organizations. In: Proceedings ofthe 5th IEEE/ACM International Workshop on GridComputing, GRID ’04, pp. 53–60. IEEE ComputerSociety, Washington, DC, USA (2004)

15. Elmroth, E., Tordsson, J.: Grid resource brokering al-gorithms enabling advance reservations and resourceselection based on performance predictions. FutureGener. Comput. Syst. 24, 585–593 (2008)

16. Foster, I., Kesselman, C.: The Grid. Blueprint for aNew Computing Infrastructure (Elsevier Series in GridComputing), 2nd edn., Morgan-Kaufmann (2003)

17. Foster, I., Kesselman, C., Tuecke, S.: The anatomy ofthe Grid: enabling scalable virtual organizations. Int. J.High Perform. Comput. Appl. 15(3), 200–222 (2001)

18. Harth, A., Decker, S., He, Y., Tangmunarunkit, H.,Kesselman, C.: A semantic matchmaker service on theGrid. In: Proceedings of the 13th international WorldWide Web Conference on Alternate Track Papers &Posters, WWW Alt. ’04, pp. 326–327. ACM (2004)

19. Iosup, A., Epema, D.: Grenchmark: a framework foranalyzing, testing, and comparing Grids. In: IEEE In-ternational Symposium on Cluster Computing and theGrid, pp. 313–320. IEEE Computer Society (2006)

20. Klusácek, D., Rudová, H.: Alea 2: job schedulingsimulator. In: Proceedings of the 3rd InternationalICST Conference on Simulation Tools and Techniques,SIMUTools ’10, vol. 61, pp. 1–61:10, ICST (Institute forComputer Sciences, Social-Informatics and Telecom-munications Engineering). Brussels, Belgium, Belgium(2010)

21. Kotani, Y., Ino, F., Hagihara, K.: A resource selectionsystem for cycle stealing in gpu Grids. J. Grid Comput-ing 6, 399–416 (2008)

22. Kuokka, D., Harada, L.: Matchmaking for informa-tion agents. In: Proceedings of the 14th InternationalJoint Conference on Artificial Intelligence-IJCAI’95,vol. 1, pp. 672–678. Morgan Kaufmann Publishers Inc.(1995)

23. Kurowski, K., Nabrzyski, J., Oleksiak, A., Weglarz, J.:Grid scheduling simulations with gssim. In: Proceed-ings of the 13th International Conference on Paralleland Distributed Systems-ICPADS ’07, vol. 02, pp. 1–8.IEEE Computer Society (2007)

24. Lamehamedi, H., Shentu, Z., Szymanski, B., Deelman,E.: Simulation of dynamic data replication strategiesin data Grids. In: Proceedings of International Parallel


and Distributed Processing Symposium, 2003, pp. 10(2003)

25. Li, H., Buyya, R.: Model-driven simulation of Gridscheduling strategies. In: Proceedings of the 3rdIEEE International Conference on e-Science and GridComputing, pp. 287–294. IEEE Computer Society.Washington, DC, USA (2007)

26. Li, H., Buyya, R.: Model-based simulation and perfor-mance evaluation of Grid scheduling strategies. FutureGener. Comput. Syst. 25, 460–465 (2009)

27. Liu, C., Foster, I.: A constraint language approach tomatchmaking. In: Proceedings of the 14th InternationalWorkshop on Research Issues on Data Engineering:Web Services for E-Commerce and E-GovernmentApplications (RIDE’04), RIDE ’04, pp. 7–14. IEEEComputer Society (2004)

28. Liu, C., Yang, L., Foster, I., Angulo, D.: Design andevaluation of a resource selection framework for Gridapplications. In: Proceedings of the 11th IEEE Inter-national Symposium on High Performance DistributedComputing, HPDC ’02, pp. 63. IEEE Computer Soci-ety (2002)

29. Lublin, U., Feitelson, D.G.: The workload on parallelsupercomputers: modeling the characteristics of rigidjobs. J. Parallel Distrib. Comput. 63, 1105–1122 (2003)

30. Ludwig, S., Rana, O., Padget, J., Naylor, W.: Match-making framework for mathematical web services. J.Grid Computing 4, 33–48 (2006)

31. Medernach, E.: Workload analysis of a cluster in aGrid environment. In: Feitelson, D., Frachtenberg, E.,Rudolph, L., Schwiegelshohn, U. (eds.) Job Schedul-ing Strategies for Parallel Processing of Lecture Notesin Computer Science, vol. 3834, pp. 36–61. SpringerBerlin/Heidelberg (2005)

32. Naqvi, S., Riguidel, M.: Grid security services simu-lator (g3s) — a simulation tool for the designand analysis of Grid security solutions. In: Proceedingsof the 1st International Conference on e-Science andGrid Computing, E-SCIENCE ’05, pp. 421–428. IEEEComputer Society (2005)

33. Nassif, L.N., Nogueira, J.M., de Andrade, F.V.V.: Re-source selection in Grid: a taxonomy and a new systembased on decision theory, case-based reasoning, andfine-grain policies. Concurr. Comput.: Pract. Exper. 21,337–355 (2009)

34. Pacini, F.: Job submission description language at-tributes, glite specification (submission through wm-proxy service). egee-jra1-tec-590869-jdlattributes-v0-8.EGEE (2006)

35. Paolucci, M., Srinivasan, N., Sycara, K.P., Nishimura,T.: Towards a semantic choreography of web services:from wsdl to daml-s. In: Proceedings of the Interna-tional Conference on Web Services, ICWS ’03, pp. 22–26. Las Vegas, Nevada, USA. CSREA Press, 23–26June 2003

36. Ranganathan, K., Foster, I.: Computation schedulingand data replication algorithms for data Grids. In:Nabrzyski, J., Schopf, J.M., Weglarz, J. (eds.) Grid Re-source Management, pp. 359–373. Kluwer AcademicPublishers (2004)

37. Sfiligoi, I., Bradley, D.C., Holzman, B., Mhashilkar,P., Padhi, S., Wurthwein, F.: The pilot way to Gridresources using glideinwms. In: Proceedings of the 2009WRI World Congress on Computer Science and In-formation Engineering-CSIE ’09, vol. 02, pp. 428–432.IEEE Computer Society (2009)

38. Singh, N.: A common lisp api and facilitator forabsi: version 2.0.3. Technical Report Logic-93-4,Logic Group, Computer Science Department. StanfordUniversity (1993)

39. Solomon, M.: The Classad Language Reference Man-ual v2.1. Computer Sciences Department. University ofWisconsin, Madison, USA (2003)

40. Song, H.J., Liu, X., Jakobsen, D., Bhagwan, R., Zhang,X., Taura, K., Chien, A.: The microgrid: a scientific toolfor modeling computational Grids. Sci. Program. 8(3),127–141 (2000)

41. Subrahmanian, V.S., Bonatti, P., Dix, U.J., Eiter, T.,Kraus, S., Ross, R.: Heterogeneous Agent Systems.MIT Press (2000)

42. Sycara, K., Lu, J., Klusch, M.: Interoperability amongheterogeneous software agents on the internet. Tech-nical Report CMU-RI-TR-98-22, Robotics Institute.Pittsburgh, PA (1998)

43. Sycara, K., Widoff, S., Klusch, M., Lu, J.: Larks: Dy-namic matchmaking among heterogeneous softwareagents in cyberspace. Auton. Agent Multi-Agent Syst.5(2), 173–203 (2002)

44. Takefusa, A., Matsuoka, S., Nakada, H., Aida, K.,Nagashima, U.: Overview of a performance evaluationsystem for global computing scheduling algorithms. In:Proceedings of the 8th International Symposium onHigh Performance Distributed Computing, pp. 97–104(1999)

45. Thain, D., Tannenbaum, T., Livny, M.: Condor andthe Grid. In: Berman, F., Fox, G., Hey, T. (eds.) GridComputing: Making the Global Infrastructure a Real-ity. Wiley (2002)

46. Thysebaert, P., Volckaert, B., de Turck, F., Dhoedt, B.,Demeester, P.: Evaluation of Grid scheduling strate-gies through nsgrid: a network-aware Grid simulator.Neural Parallel Sci. Comput. 12(3), 353–378 (2004)

47. Wang, C.-M., Chen, H.-M., Hsu, C.-C., Lee, J.: Dy-namic resource selection heuristics for a non-reservedbidding-based Grid environment. Future Gener.Comput. Syst. 26, 183–197 (2010)

48. Wickler, G.J.: Using expressive and flexible actionrepresentations to reason about capabilities for intel-ligent agent cooperation. PhD thesis, University ofEdinburgh (1999)
