Optimizing Performance and Reliability on Heterogeneous Parallel
Systems: Approximation Algorithms and Heuristics
Emmanuel Jeannota, Erik Sauleb, Denis Trystramc
a INRIA Bordeaux Sud-Ouest, Talence, France
b BMI, Ohio State University, Columbus, OH 43210, USA
cGrenoble Institute of Technology, Grenoble, France
Abstract
We study the problem of scheduling tasks (with and without precedence constraints) on
a set of related processors which have a probability of failure governed by an exponen-
tial law. The goal is to design approximation algorithms or heuristics that optimize both
makespan and reliability. First, we show that both objectives are contradictory and that
the number of points of the Pareto-front can be exponential. This means that this prob-
lem cannot be approximated by a single schedule. Second, for independent unitary tasks,
we provide an optimal scheduling algorithm where the objective is to maximize the relia-
bility subject to makespan minimization. For the bi-objective optimization, we provide a
(1+ε,1)-approximation algorithm of the Pareto-front. Next, for independent arbitrary tasks,
we propose a ⟨2, 1⟩-approximation algorithm (i.e. for any fixed value of the makespan, the
obtained solution is optimal on the reliability and no more than twice the given makespan)
that has a much lower complexity than the other existing algorithms. This solution is used
to derive a (2 + ε, 1)-approximation of the Pareto-front of the problem.
All these proposed solutions are discriminated by the value of the product {failure
rate}×{unitary instruction execution time} of each processor, which appears to be a cru-
cial parameter in the context of bi-objective optimization. Based on this observation, we
provide a general method for converting scheduling heuristics on heterogeneous clusters into
heuristics that take into account the reliability when there are precedence constraints. The
average behaviour is studied by extensive simulations. Finally, we discuss the specific case
of scheduling a chain of tasks which leads to optimal results.
Preprint submitted to J. of Parallel and Dist. Computing November 22, 2011
Keywords: Scheduling, Pareto-front approximation, Reliability, Makespan, Precedence
Task Graphs.
1. Introduction
With the recent development of large parallel and distributed systems (computational
grids, cluster of clusters, peer-to-peer networks, etc.), it is difficult to ensure that the re-
sources are always available for a long period of time. Indeed, hardware failures, software
faults, power breakdown or resources removal often occur when using a very large number
of machines. Hence, in this context, taking into account new objectives dealing with fault-
tolerance is a major issue. Several approaches have been proposed to tackle the problem of
faults. One possible approach is based on duplication. The idea is that if one resource fails,
other resources can continue to correctly execute the redundant parts of the application.
However, the main drawback of this approach is a possible waste of resources. An alternative solution consists in check-pointing the computations from time to time and, in case of failure, restarting them from the last check-point [1, 2]. However, check-pointing an application is costly and may require modifying it. Furthermore, restarting an application slows it down.
Therefore, in order to minimize the cost of the check-point/restart mechanism, it is necessary
to provide a reliable execution that minimizes the probability of failure of this application.
Scheduling an application corresponds to determining which resources will execute the tasks
and when they will start. Thus, the scheduling algorithm is responsible for minimizing the
probability of failure of the application by choosing the adequate set of resources that enable
a fast and reliable execution.
Unfortunately, as we will show in this paper, increasing the reliability implies, most of the
time, an increase of the execution time (a fast schedule is not necessarily a reliable one). This motivates the design of algorithms that look for a set of trade-off solutions between these two objectives.
In this paper, we study the problem of scheduling an application represented by a prece-
dence task graph or by a set of independent tasks on heterogeneous computing resources.
The objectives are to minimize the makespan and to maximize the reliability of the schedule.
In the literature, this problem has been mainly studied from a practical point of view [3, 4, 5].
We lack analysis based on well-founded theoretical studies of this problem. Some unanswered
questions are the following:
• Is maximizing the reliability a difficult (NP-Hard) problem?
• Is it possible to find polynomial solutions of the bi-objective problem for special kinds of precedence task graphs?
• Is it possible to approximate the general problem for any precedence relations?
• Can we build approximation schemes?
• How to help the user in finding a good trade-off between reliability and makespan?
All these questions will be addressed in this article. More precisely, we show why both objectives are contradictory and how to provide approximations of the Pareto-front1 in the case of independent tasks and task graphs (with the special case of chains of tasks).
The main goal of this paper is to provide a deep understanding of the bi-criteria problem
(makespan vs. reliability) we study as well as different ways to tackle the problem depending
on the specificity of the input.
The content and the organization of this paper are as follows. In section 2.1, we intro-
duce the definitions of reliability and makespan and some related notations. In section 2.2, we present and discuss the most significant related works. In section 3, we study some basic characteristics of the bi-objective problem. In particular, we show that maximizing the reliability
is a polynomial problem (Proposition 1) and is simply obtained by executing the application
on the processors that have the smallest product of {failure rate} and {unitary instruction
execution time} sequentially. This means that minimizing the makespan is contradictory
to the objective of maximizing the reliability. Furthermore, we show that for the general
case, approximating both objectives simultaneously is not possible (Proposition 2). We show
1 Intuitively, the Pareto-front is the set of best compromise solutions; any absolutely better solution being infeasible.
that the number of points of the Pareto-front in the case of independent tasks can be expo-
nential (Theorem 2) and hence it is required to be able to approximate it. In section 4.2,
we study the problem of scheduling a set of independent unitary tasks (i.e. same length).
For this case, we propose an optimal algorithm (Algorithm 3) for maximizing the reliability
subject to makespan minimization. We also propose an (1+ε,1)-approximation of the Pareto-
front (Section 4.2.2). This means that we can provide a set of solutions of polynomial size
that approximates, at a constant ratio, all the optimal makespan/reliability trade-offs. In
section 4.3, we study the case of independent tasks of arbitrary length. We provide a ⟨2, 1⟩-approximation algorithm (Algorithm 4), i.e. for any fixed value of the makespan, the obtained solution is optimal on the reliability and no more than twice the given makespan,
and derive a Pareto-front approximation from this algorithm (Section 4.3.2). An experimen-
tal evaluation of this algorithm is provided in Section 4.4. All the above solutions emphasize
the importance of the {failure rate} by {unitary instruction execution time} product. Based
on this observation, we show, in section 5.1, how to easily transform a heuristic that targets
makespan minimization to a bi-objective heuristic for the case of any precedence relation
(Algorithm 5). In this case also, we demonstrate how to help the users to choose a suitable
makespan/reliability trade-off. We implement this methodology using two heuristics and we
compare our approach against other heuristics of the literature. Moreover, in section 5.2
we study a special sub-case of precedence task graphs where all the tasks are sequentially
serialized by a chain (Lemma 4). Finally, we conclude the paper and discuss some challenging
perspectives.
2. Preliminaries
2.1. Problem Definition
As in most related studies, a parallel application is represented by a precedence task
graph: let G = (T , E) be a Directed Acyclic Graph (DAG) where T is the set of n vertices
(that represent the tasks) and E is the set of edges that represent precedence constraints
among the tasks (if there are any). Let Q be a set of m uniform processors as described
in [6]. A uniform processor is defined as follows: processor j computes 1/τj operations by
time unit and pij = piτj denotes the running time of task i on processor j (τj is also called
the unitary instruction execution time, i.e. the time to perform one operation). In the
remainder of the paper, i will denote the task index while j will refer to the processors. pi
denotes the processing requirement of task i. Moreover, processor j has a constant failure
rate of λj. When a processor is affected by a failure, it stops working until the end of the
schedule (this model is usually called crash fault). If a processor fails before completing the
execution of all its tasks, the execution has failed.
A schedule s = (π, σ) is composed of two functions: a function π : T → Q that maps a task to the processor that executes it and a function σ : T → R∗ that associates to each task the time when it starts its execution. We denote by π−1 the function which maps a processor to the set of tasks allocated on it, which we improperly call the inverse of function π. To be valid, a schedule must satisfy the precedence constraints, and no processor should execute more than one task at a time. The completion time of processor j is the first time when all its tasks are completed: Cj(s) = max_{i∈π−1(j)} (σ(i) + pij). The makespan of a schedule is defined as the maximum completion time: Cmax(π) = max_j Cj(π). The probability that a processor j executes all its tasks successfully is given by an exponential law: Prjsucc(π) = e^{−λjCj(π)}. We assume that faults are independent; therefore, the probability that schedule π finishes correctly is: Prsucc(π) = Π_j Prjsucc(π) = e^{−Σ_j Cj(π)λj}. The reliability index is defined by rel(π) = Σ_j Cj(π)λj. When no confusion is possible, π will be omitted.
We are interested in minimizing both Cmax and rel simultaneously (i.e. minimizing the
makespan and maximizing the probability of success of the whole schedule).
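As a minimal illustration of these definitions, both objectives can be computed directly from a schedule. The following Python sketch uses illustrative names (none of this code comes from the paper); the example values are hypothetical:

```python
# Sketch of the model of Section 2.1. A schedule assigns each task i a
# processor pi_[i] and a start time sigma[i]; p[i] is the processing
# requirement, tau[j] the unitary execution time and lam[j] the failure
# rate of processor j.
from math import exp

def completion_times(p, tau, pi_, sigma):
    """C_j = max over tasks i mapped on j of sigma(i) + p_i * tau_j (0 if none)."""
    C = [0.0] * len(tau)
    for i, j in enumerate(pi_):
        C[j] = max(C[j], sigma[i] + p[i] * tau[j])
    return C

def makespan(p, tau, pi_, sigma):
    return max(completion_times(p, tau, pi_, sigma))

def rel_index(p, tau, lam, pi_, sigma):
    """rel(pi) = sum_j C_j * lambda_j; the success probability is exp(-rel)."""
    C = completion_times(p, tau, pi_, sigma)
    return sum(Cj * lj for Cj, lj in zip(C, lam))

# Two unit tasks run back to back on processor 0, one task on processor 1.
p = [1, 1, 2]
tau = [1.0, 0.5]
lam = [0.1, 0.4]
pi_ = [0, 0, 1]
sigma = [0.0, 1.0, 0.0]
# C_0 = 2, C_1 = 1, so Cmax = 2 and rel = 2*0.1 + 1*0.4 = 0.6.
assert makespan(p, tau, pi_, sigma) == 2.0
assert abs(rel_index(p, tau, lam, pi_, sigma) - 0.6) < 1e-12
print(exp(-rel_index(p, tau, lam, pi_, sigma)))  # probability of success
```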
2.2. Related Works
Optimizing single objectives. First, we discuss briefly how each single-objective problem
has been studied in the literature. The minimization of the makespan is a classical problem.
It is well-known that scheduling independent tasks on uniform processors in less than a fixed amount of time is an NP-complete problem because it contains PARTITION, which is NP-complete [7], as a sub-problem. A low-cost (2 − 1/(m+1))-approximation algorithm has been proposed
in [8]. It consists of classical list scheduling where the longest task of the list is iteratively
mapped on the processor that will complete it the soonest. Hochbaum and Shmoys pro-
posed a PTAS(Polynomial Time Approximation Scheme) based on the bin packing problem
with variable bin sizes [9]. However, this result is only of theoretical interest because its
runtime complexity is far too high. The problem with precedence constraints is much less
understood from the approximation theory point of view. Without communication delays, the best known approximation algorithm for arbitrary dependency graphs and uniform processors is an O(log m)-approximation proposed by [10]. The problem with communication delays is known to be difficult even on identical processors and often requires distinguishing between small and large communication delays, or hypotheses such as Unitary Execution Tasks [11]. A full review of scheduling theory is beyond the scope of this paper; the reader is referred to [12] for more details.
Since there exist many reliability models, there exist multiple methods to optimize the reliability, depending on the chosen model. Some models make determining the maximum reliability harder: the main problem is to avoid dependent probabilistic events, which prevent the existence of a useful closed formula and often arise in schedules with replication. For instance, [13] needs to add constraints on the structure of the schedule to be able to compute the reliability in polynomial time; without this restriction, determining the reliability of a schedule would be a difficult problem [14]. We consider in
this work a realistic model where the schedule with the best reliability can be computed in
polynomial time (of course, the corresponding makespan may be very large). The assumption
of crash faults is realistic in the sense that it corresponds to the most common case of failure:
a machine goes offline. The assumption that the probability of success follows an exponential
law is a direct consequence of the assumption that the failure rate is constant during the
execution of the application. This assumption is reasonable since the execution time of the
application is small compared to the lifetime of the cluster. Moreover, this assumption is the basis of the Shatz-Wang reliability model [15], which has been used in numerous works on reliability such as [13, 3, 4, 5]. Finally, some authors studied new non-conventional objectives
like maximizing the number of tasks performed before failure [16].
Related bi-objective problems: Shmoys and Tardos studied the bi-objective problem of
minimizing both the makespan and the sum of costs of a schedule of independent tasks on
unrelated machines in [17]. This problem is mathematically the same as the problem of
optimizing the makespan and reliability of independent tasks. In their model, the cost is
induced by scheduling a task on a processor and the cost function is given by a cost matrix.
They proposed an algorithm that takes two parameters, namely a target value M for the makespan and a target value C for the cost, and returns a schedule whose makespan is lower than 2M with a cost better than C. This method can be adapted to solve our problem. However, it is difficult to implement since it relies on Linear Programming, and its complexity is in O(mn² log n), which is costly. Section 4.3.1 will present an algorithm tailored to our case of uniform machines that is asymptotically faster by a ratio of O(nm). It is also possible
to use integrated approaches where one of the objectives implicitly contains the other like
the minimization of the mean makespan with check-points [18]. Here, the trade-off between
doing a check-point or not is included into the expression of the mean makespan.
Optimizing both makespan and reliability: several heuristics have been proposed to solve
this bi-objective problem. Dogan and Ozguner proposed in [3] a bi-objective heuristic called
RDLS. In [4], the same authors improved their solution using an approach based on genetic
algorithms. In [5], Hakem and Butelle proposed a bi-objective heuristic called BSA that out-
performs RDLS. In [19], the authors proposed MRCD, an algorithm that computes a makespan/reliability compromise. They show that this compromise can be better than the ones found by other heuristics but, contrary to this work, they do not focus on the whole Pareto-front. All these
results focused on the general case where the precedence task graph is arbitrary. Moreover,
none of the proposed heuristics have a constant approximation ratio. This manuscript is an
extended version of two works on this topic: [20] and [21]. On the theoretical side, we prove here that the Pareto-front can be exponential, and we study the case of chains of tasks and propose an optimal algorithm for it. On the experimental side, we have added a substantial amount of work on the experimental validation of our algorithm for independent arbitrary tasks and of the different heuristics studied in the case of arbitrary task graphs.
2.3. Preliminary Analysis
The goal of our work is to solve a bi-objective problem, namely minimizing the makespan and maximizing the reliability (which corresponds to minimizing the probability of failure). Unfortunately, these objectives are conflicting. More precisely, as shown in Proposition 1 below, the optimal reliability is obtained by mapping all the tasks on the processor j such that j = argmin_j(τjλj), i.e., on the processor for which the product {failure rate}×{unitary instruction execution time} is minimal. However, from the viewpoint of the makespan, such a schedule can be arbitrarily far from the optimal one.
Proposition 1. Let S be a schedule where all the tasks have been assigned to processor j0, in topological order, such that τj0λj0 is minimal. Let rel be the reliability index of schedule S. Then, any schedule S′ ≠ S, with reliability index rel′, is such that rel ≤ rel′.
Proof. Suppose without loss of generality that j0 = 0 (i.e. ∀j : τ0λ0 ≤ τjλj). Then rel = C0λ0 (all the tasks are mapped to processor 0). Let us call C′j the completion date of the last task on processor j in schedule S′. Therefore, rel′ ≥ Σ_{j=0}^{m} C′jλj (the inequality comes from the idle times that may appear; they can be omitted here since this decreases the bound on rel′, and a lower bound is enough for our calculations). Let T̄ be the set of tasks that are not executed on processor 0 in schedule S′. Then C′0 ≥ C0 − τ0 Σ_{i∈T̄} pi (the tasks of T \ T̄ remain to be executed on processor 0). Let T̄ = T̄1 ∪ T̄2 ∪ . . . ∪ T̄m, where T̄j is the subset of the tasks of T̄ executed on processor j in schedule S′ (these sets are disjoint: ∀j1 ≠ j2, T̄j1 ∩ T̄j2 = ∅). Then, ∀j, 1 ≤ j ≤ m, C′j ≥ τj Σ_{i∈T̄j} pi. Let us compute the difference rel′ − rel:

rel′ − rel ≥ Σ_{j=0}^{m} C′jλj − C0λ0
≥ C0λ0 − τ0λ0 Σ_{i∈T̄} pi + Σ_{j=1}^{m} (τjλj Σ_{i∈T̄j} pi) − C0λ0
= Σ_{j=1}^{m} (τjλj Σ_{i∈T̄j} pi) − τ0λ0 Σ_{i∈T̄} pi
= Σ_{j=1}^{m} (τjλj Σ_{i∈T̄j} pi) − τ0λ0 Σ_{j=1}^{m} (Σ_{i∈T̄j} pi)   (because the T̄j's are disjoint)
= Σ_{j=1}^{m} ((τjλj − τ0λ0) Σ_{i∈T̄j} pi) ≥ 0   (because ∀j : τ0λ0 ≤ τjλj)
This proposition shows that the problem of minimizing the makespan subject to the condition that the reliability is maximized corresponds to the problem of minimizing the makespan using only the processors having a minimal τjλj. If there is only one such processor, the problem is straightforward: the reliability is maximized only if all the tasks are sequentially executed on this processor. However, when several processors have the same minimal λjτj value, the problem is NP-Hard since it requires minimizing the makespan on all of these processors.
The following proposition proves that, for the problem we are interested in, no solution of the bi-objective problem can be simultaneously close to the optimum on both objectives.
Proposition 2. The bi-objective problem of minimizing Cmax and rel cannot be approxi-
mated within a constant factor with a single solution.
Proof. Consider the class of instances Ik of the problem with two machines such that τ1 = 1,
τ2 = 1/k and λ1 = 1, λ2 = k2 (k ∈ R+∗) and a single task t1 with p1 = 1. There exist only
two feasible schedules, namely, π1 in which t1 is scheduled on processor 1 and π2 in which it
is scheduled on processor 2. Remark that π2 is optimal for Cmax and that π1 is optimal for
rel.
Cmax(π1) = 1 and Cmax(π2) = 1/k, which leads to Cmax(π1)/Cmax(π2) = k. This ratio goes to infinity when k goes to infinity. Similarly, rel(π1) = 1 and rel(π2) = k²/k = k, which leads to rel(π2)/rel(π1) = k. Again, this ratio goes to infinity with k.
None of these feasible schedules can approximate both objectives within a constant factor.
Proposition 2 shows that the problem of optimizing both objectives simultaneously cannot be approximated: in general, there exists no solution that is close to the optimal value on both objectives at the same time. Therefore, we will tackle the problem as optimizing one objective subject to the condition that the second one is kept at a reasonable value ([22], Chap. 3, p. 12). For our problem, this corresponds to maximizing the reliability subject to the condition that the makespan is under a threshold value. This approach may be seen as giving priority to the makespan (the most difficult objective to optimize) and optimizing the reliability as a secondary goal. However, since finding the optimal makespan is usually NP-hard, we aim first at designing an approximation algorithm and then at determining an approximation of the Pareto-front.
As the number of Pareto-optimal solutions can be exponential, it is important to be able
to generate an approximation of the Pareto-front that has a polynomial size. In order to
achieve this goal, we use the methodology proposed by Papadimitriou and Yannakakis in
[23]. It is recalled briefly in the next section. This methodology will be used in section 4 for
the case of independent tasks.
3. Bi-objective Approximation
In bi-objective optimization there is no concept of absolute best solution. In general,
no solution is the best on both objectives. However, a given solution may be better than
another one on both objectives. It is said that the former Pareto-dominates the latter.
The interesting solutions in bi-objective optimization, called Pareto-optimal solutions,
are those that are not dominated by any other solutions. The Pareto-front (also called
Pareto-set) of an instance is the set of all Pareto-optimal solutions.

Figure 1: Bold crosses are a (ρ1, ρ2)-approximation of the Pareto-front.

Intuitively, the Pareto-front divides the solution space between feasible and unfeasible solutions. It is the set
of interesting compromise solutions and determining this set is the main target of multi-
objective optimization. Unfortunately, this set is most of the time difficult to compute
because one of the underlying optimization problems is NP-hard or because its cardinality is exponential. In our case, both reasons hold2. Thus, we look for an approximation of the
Pareto-front with a polynomial cardinality.
A generic method to obtain an approximated Pareto-front was introduced by Papadimitriou and Yannakakis in [23]. Pc is a (ρ1, ρ2)-approximation of the Pareto-front Pc∗ if each solution s∗ ∈ Pc∗ is (ρ1, ρ2)-approximated by a solution s ∈ Pc: ∀s∗ ∈ Pc∗, ∃s ∈ Pc, Cmax(s) ≤ ρ1 Cmax(s∗) and rel(s) ≤ ρ2 rel(s∗). Fig. 1 illustrates this concept. Crosses
are solutions of the scheduling problem represented in the (Cmax; rel) space. The bold
crosses are an approximated Pareto-front. Each solution (x; y) in this set (ρ1, ρ2)-dominates
a quadrant delimited in bold in the figure and whose origin is at (x/ρ1; y/ρ2). All solutions
are dominated by a solution of the approximated Pareto-front as they are included into a
(ρ1, ρ2)-dominated quadrant.
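The dominance relation itself is easy to state in code. The following sketch (illustrative names, not from the paper) assumes both Cmax and the rel index are to be minimized and filters a set of (Cmax, rel) points down to its Pareto-front:

```python
# Minimal sketch: extract the Pareto-front from a set of (Cmax, rel) points,
# both objectives being minimized (rel is the reliability index of Section 2.1).
def dominates(a, b):
    """a Pareto-dominates b: no worse on both objectives and distinct from b."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b

def pareto_front(points):
    return sorted(p for p in points if not any(dominates(q, p) for q in points))

pts = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
# (3.0, 4.0) is dominated by (2.0, 3.0); the other three points are
# Pareto-optimal trade-offs.
assert pareto_front(pts) == [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
```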
One possible way for building such an approximation uses an algorithm that constructs
2 We will show in the next section that the size of the Pareto-front can be exponential.
a ρ2-approximation of the second objective constrained by a threshold on the first one. The threshold cannot be exceeded by more than a constant factor ρ1. Such an algorithm is said to be a ⟨ρ1, ρ2⟩-approximation algorithm. More formally,
Definition 1. Given a threshold value of the makespan ω, a ⟨ρ1, ρ2⟩-approximation algorithm delivers a solution whose Cmax ≤ ρ1ω and rel ≤ ρ2 rel∗(ω), where rel∗(ω) is the best possible value of the reliability index among schedules whose makespan is less than ω.
Let APPROX be a ⟨ρ1, ρ2⟩-approximation algorithm (for instance, Algorithms 3 and 4, presented later). Algorithm 1 constructs a (ρ1 + ε, ρ2)-approximation of the Pareto-front of the problem by applying APPROX on a geometric sequence of makespan thresholds. The geometric sequence is only considered between a lower bound C^min_max and an upper bound C^max_max on the makespan of Pareto-optimal solutions.
Algorithm 1: Pareto-front approximation (according to the method of Papadimitriou and Yannakakis)
Data: ε a positive real number
Result: S a set of solutions
begin
    k ← 0
    S ← ∅
    while k ≤ ⌈log_{1+ε/ρ1}(C^max_max / C^min_max)⌉ do
        ωk ← (1 + ε/ρ1)^k C^min_max
        sk ← APPROX(ωk)
        S ← S ∪ {sk}
        k ← k + 1
    return S
end
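Algorithm 1's threshold loop can be sketched in Python as follows. The names are illustrative; APPROX is passed in as a callable that may return None when no schedule fits under the threshold, and the bounds Cmin and Cmax are assumed given:

```python
# Sketch of Algorithm 1: build a (rho1 + eps, rho2)-approximation of the
# Pareto-front by calling a <rho1, rho2>-approximation algorithm APPROX on a
# geometric sequence of makespan thresholds omega_k = (1 + eps/rho1)^k * Cmin.
from math import ceil, log

def pareto_approx(approx, cmin, cmax, eps, rho1):
    """approx(omega) returns a schedule, or None if infeasible under omega."""
    base = 1.0 + eps / rho1
    solutions = []
    for k in range(ceil(log(cmax / cmin, base)) + 1):
        s = approx(base ** k * cmin)
        if s is not None:
            solutions.append(s)
    return solutions

# Toy APPROX that just records the threshold it was called with: with
# eps = rho1 = 1 the thresholds double at each step, starting from Cmin = 1
# until the upper bound 7 is covered.
sols = pareto_approx(lambda w: w, cmin=1.0, cmax=7.0, eps=1.0, rho1=1.0)
assert sols == [1.0, 2.0, 4.0, 8.0]
```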
Theorem 1. The method of Papadimitriou and Yannakakis described in Algorithm 1 builds a (ρ1 + ε, ρ2)-approximation of the Pareto-front from a ⟨ρ1, ρ2⟩-approximation algorithm.
Figure 2: APPROX(ωk+1) is a (ρ1 + ε, ρ2)-approximation of the Pareto-optimal solutions whose makespan is between ωk and ωk+1. There is at most a factor of ρ2 on the reliability between APPROX(ωk+1) and rel∗(ωk+1). The ratio on the makespan between APPROX(ωk+1) and ωk+1 is less than ρ1, and ωk+1 = (1 + ε/ρ1)ωk. Thus, APPROX(ωk+1) is a (ρ1 + ε, ρ2)-approximation of (ωk, rel∗(ωk+1)).
Proof. Let s∗ be a Pareto-optimal schedule. Then, there exists k ∈ ℕ such that (1 + ε/ρ1)^k C^min_max ≤ Cmax(s∗) ≤ (1 + ε/ρ1)^{k+1} C^min_max. We show that sk+1 is a (ρ1 + ε, ρ2)-approximation of s∗. The construction from step k to step k + 1 is illustrated in Figure 2.
• Reliability. rel(sk+1) ≤ ρ2 rel∗((1 + ε/ρ1)^{k+1} C^min_max) (by definition). s∗ is Pareto-optimal, hence rel(s∗) = rel∗(Cmax(s∗)). But Cmax(s∗) ≤ (1 + ε/ρ1)^{k+1} C^min_max. Since rel∗ is a decreasing function, we have rel(sk+1) ≤ ρ2 rel(s∗).
• Makespan. Cmax(sk+1) ≤ ρ1(1 + ε/ρ1)^{k+1} C^min_max = (ρ1 + ε)(1 + ε/ρ1)^k C^min_max (by definition), and Cmax(s∗) ≥ (1 + ε/ρ1)^k C^min_max. Thus, Cmax(sk+1) ≤ (ρ1 + ε) Cmax(s∗).
Remark that APPROX(ωk) may not return a solution (in this case, sk is set to ∅ and we increment k). However, this is not a problem: it means that no solution has a makespan lower than ωk, and APPROX(ωk) approximates the Pareto-optimal solutions whose makespan is lower than ωk. Hence, no solution is forgotten.
The algorithm generates ⌈log_{1+ε/ρ1}(C^max_max / C^min_max)⌉ solutions and calls the APPROX algorithm the same number of times.
4. Independent tasks
4.1. Size of the Pareto-front
Before proposing algorithmic solutions for the bi-objective problem, we show that it is
not possible to compute the whole Pareto-front in polynomial time. More precisely, we show
that the number of points of the Pareto-front can be exponential in the size of the input.
Theorem 2. There exists a class of instances whose set of Pareto-optimal solutions is ex-
ponential in the number of tasks.
Proof. The proof is obtained by exhibiting a class of instances with an exponential number of Pareto-optimal solutions. Let us consider the instance In composed of n tasks such that pi = 2^{i−1}, ∀i, 1 ≤ i ≤ n, and 2 processors, where the first one is very fast and unreliable (τ1 = 2−n, λ1 = 1) whereas the second one is very slow but highly reliable (τ2 = 1, λ2 = 2−n). The processor parameters and task sizes induce that:
• The makespan is only determined by the tasks scheduled on processor 2: Cmax = Σ_{i∈π−1(2)} pi (or is equal to Σ_{i=1}^{n} 2^{i−1} × τ1 = (2^n − 1)/2^n ≈ 1 if all the tasks are scheduled on processor 1).
• The reliability index is mainly determined by the tasks scheduled on processor 1: rel = Σ_{i∈π−1(1)} pi (the contribution of the tasks on processor 2 is less than (2^n − 1)/2^n and can thus be omitted for the sake of clarity).
• There are exactly 2^n solutions, since each task may be scheduled either on processor 1 or 2. Each solution is uniquely described by the sum of the processing times of the tasks scheduled on processor 1, which can take all the values between 0 and 2^n − 1.
From the above, let solution πi be the schedule with a makespan of Cmax = i. Its reliability index is rel = 2^n − 1 − i. All the solutions have different objective values. Moreover, the makespan strictly increases with i whereas the reliability index strictly decreases. This proves that each solution is Pareto-optimal.
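For small n, this instance family can be enumerated directly, using the simplified objectives stated in the bullets of the proof (a sketch, not part of the paper; constants and negligible terms are dropped as in the proof):

```python
# Sketch of the instance family of Theorem 2: tasks p_i = 2^(i-1); the
# makespan is the total work placed on processor 2 and the reliability index
# is the total work placed on processor 1.
from itertools import product

def objective_points(n):
    pts = set()
    for assign in product((1, 2), repeat=n):  # processor of each task
        on2 = sum(2 ** i for i, q in enumerate(assign) if q == 2)
        on1 = (2 ** n - 1) - on2
        pts.add((on2, on1))  # (makespan, rel)
    return sorted(pts)

pts = objective_points(3)
# 2^3 = 8 distinct trade-offs; the makespan takes every value 0..7 while the
# rel index strictly decreases, so every schedule is Pareto-optimal.
assert len(pts) == 8
assert all(pts[k] == (k, 7 - k) for k in range(8))
```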
4.2. Independent unitary tasks
Notice that when we consider only independent tasks, all the solutions are compact (i.e.,
they do not contain idle time) and the order of the tasks does not matter. Therefore, a
solution for independent unitary tasks is entirely defined by the number of tasks allocated
to each processor.
4.2.1. A ⟨1, 1⟩-approximation algorithm
Given a makespan objective ω, we show how to find a task allocation that is the most
reliable for a set of n independent unitary tasks (∀i ∈ T , pi = 1).
To build a ⟨ρ1, ρ2⟩-approximation algorithm, we consider the problem of minimizing the probability of failure subject to the condition that the makespan is constrained. Since the tasks are unitary and independent, the problem is then to find, for each processor j ∈ Q, the number aj of tasks to allocate to processor j such that the following constraints are fulfilled: (1) Σ_{j∈Q} aj = n. (2) The makespan is constrained: ∀j ∈ Q, ajτj ≤ ω; this threshold ω on the makespan is assumed to be larger than the optimal makespan C∗max. (3) Subject to the previous constraints, rel is minimized, i.e., Σ_{j∈Q} ajλjτj is minimized. Once the allocation is known, it is easy to express a solution π such that aj = |π−1(j)|.
First, it is important to notice that a schedule whose makespan is smaller than a given objective ω can be found in polynomial time. Indeed, Algorithm 2 determines the minimal-makespan allocation for any given set of independent unitary tasks, as shown in [24], p. 161.
Second, we propose Algorithm 3 to solve the problem. It determines an optimal allocation, as proven by Theorem 3. It is a greedy algorithm that allocates the tasks to the processors in increasing order of their λjτj products. Each processor receives the largest possible number of tasks while keeping its completion time no greater than ω.
Theorem 3. Algorithm 3 is a ⟨1, 1⟩-approximation algorithm.
Proof. Let X be the number of tasks already assigned. Since when X < n we allocate at
most n−X tasks to a processor, at the end of the algorithm we have: X ≤ n (since ω ≥ C∗max,
Algorithm 2: Optimal allocation for independent unitary tasks
begin
    for j from 1 to m do
        aj ← ⌊(1/τj) / (Σ_l 1/τl) × n⌋
    while Σ_j aj < n do
        k ← argmin_l (τl(al + 1))
        ak ← ak + 1
end
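Algorithm 2 can be sketched in Python as follows (illustrative names; the pseudocode above is authoritative):

```python
# Sketch of Algorithm 2: minimal-makespan allocation of n unitary tasks on
# uniform processors with unitary execution times tau[j].
from math import floor

def optimal_allocation(n, tau):
    speed_sum = sum(1.0 / t for t in tau)
    # Proportional initial split, rounded down.
    a = [floor((1.0 / t) / speed_sum * n) for t in tau]
    while sum(a) < n:  # hand out the remaining tasks one by one
        k = min(range(len(tau)), key=lambda l: tau[l] * (a[l] + 1))
        a[k] += 1
    return a

a = optimal_allocation(10, [1.0, 2.0])  # processor 0 is twice as fast
# Initial split floor(20/3) = 6 and floor(10/3) = 3; the leftover task goes
# to processor 0 (7*1 < 4*2), giving makespan max(7*1, 3*2) = 7.
assert a == [7, 3]
assert max(aj * tj for aj, tj in zip(a, [1.0, 2.0])) == 7.0
```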
Algorithm 3: Optimal reliable allocation for independent unitary tasks
Input: ω ≥ C∗max
begin
    Sort the processors by increasing λjτj
    X ← 0
    for j from 1 to m do
        if X < n then
            aj ← min(n − X, ⌊ω/τj⌋)
        else
            aj ← 0
        X ← X + aj
end
at the end of the algorithm X = n, i.e. all the tasks are assigned). For each processor j we allocate at most ⌊ω/τj⌋ tasks, hence the makespan constraint is respected: ajτj ≤ ω. Since in Algorithm 2 the order of the tasks and the order of the processors are not taken into account, Algorithm 3 is valid (i.e., all tasks are assigned using at most the m processors). Hence, the makespan of the schedule is lower than ω.
We now need to show that Σ_{j∈Q} ajλjτj is minimum. First, let us remark that Algorithm 3 allocates the tasks to the processors in increasing order of the λjτj values. Hence, any other valid allocation a′ moves some tasks from a processor i to a processor j with i < j (i.e., a′i < ai and a′j > aj). Without loss of generality, let us assume that a′1 = a1 − k, a′i = ai + k and a′j = aj for some k ∈ ℕ, 1 ≤ k ≤ a1, and all j ≠ 1, j ≠ i. Then, the difference between the two objective values is

D = Σ_{x∈Q} a′xλxτx − Σ_{x∈Q} axλxτx
= λ1τ1(a′1 − a1) + λiτi(a′i − ai)
= −kλ1τ1 + kλiτi
= k(λiτi − λ1τ1)
≥ 0 because λiτi ≥ λ1τ1.

Hence, the first allocation has a smaller objective value.
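Algorithm 3's greedy rule can be sketched in Python as follows (illustrative names; the pseudocode above is authoritative, and the example parameters are hypothetical):

```python
# Sketch of Algorithm 3: most reliable allocation of n unitary tasks under a
# makespan threshold omega (assumed >= the optimal makespan).
from math import floor

def reliable_allocation(n, tau, lam, omega):
    order = sorted(range(len(tau)), key=lambda j: lam[j] * tau[j])
    a = [0] * len(tau)
    x = 0  # tasks assigned so far
    for j in order:  # fill the smallest lambda*tau processors first
        if x < n:
            a[j] = min(n - x, floor(omega / tau[j]))
            x += a[j]
    return a

# Processor 1 is slower but has the smaller lambda*tau product (0.2 vs 0.3),
# so it is filled first: floor(6/2) = 3 tasks; the remaining 2 go to proc 0.
a = reliable_allocation(5, tau=[1.0, 2.0], lam=[0.3, 0.1], omega=6.0)
assert a == [2, 3]
assert all(aj * tj <= 6.0 for aj, tj in zip(a, [1.0, 2.0]))
```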
4.2.2. Approximating the Pareto-front
We propose below two methodologies for computing the Pareto-front based on Algo-
rithm 3.
The first technique consists in using the method of Papadimitriou and Yannakakis presented in Algorithm 1. Since Algorithm 3 is a ⟨1, 1⟩-approximation algorithm, we obtain a (1 + ε, 1)-approximation of the Pareto-front thanks to Theorem 1. In this case, the lower bound C^min_max = C∗max is computed by Algorithm 2, and the upper bound C^max_max = nτ1 is the makespan where all the tasks are executed on the processor that leads to the most reliable schedule (hence, longer schedules are Pareto-dominated by this one). The time-complexity of this method is in O(m log_{1+ε}(nτ1)), which is polynomial.
The second method consists in calling Algorithm 3 only on relevant values of ω. It leads
to the question: “What is the smallest value of ω′ > ω that produces a different schedule?”.
ω′ must be large enough to allow one task scheduled on processor j to be scheduled on
processor j′ < j instead, improving the reliability. Therefore, only the values ω = xτj
are interesting; they correspond to the execution time of x (1 ≤ x ≤ n) tasks on processor
j (1 ≤ j ≤ m). There are fewer than nm interesting times and thus fewer than nm Pareto-
optimal solutions. Using Algorithm 3, the exact Pareto-front can be found in O(nm²); this
time-complexity is exponential in the size of the instance. Indeed, the size of the instance is
not n but O(log n): we only need to encode the value of n, not the n tasks, as they are all
identical.
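The enumeration of the relevant thresholds can be sketched as follows (a small self-contained Python illustration; `_alg3` restates Algorithm 3 inline and all names are ours):

```python
import math

def _alg3(n, tau, lam, omega):
    # Algorithm 3 restated: fill processors in increasing lambda*tau
    # order, at most floor(omega / tau_j) tasks each
    a = [0] * len(tau)
    left = n
    for j in sorted(range(len(tau)), key=lambda j: lam[j] * tau[j]):
        a[j] = min(left, math.floor(omega / tau[j]))
        left -= a[j]
    return a

def exact_pareto_front(n, tau, lam):
    """Sketch: the only thresholds that can change Algorithm 3's output
    are omega = x * tau_j (x tasks on processor j), so the exact
    Pareto-front has at most n*m points."""
    front = []
    for omega in sorted({x * t for t in tau for x in range(1, n + 1)}):
        a = _alg3(n, tau, lam, omega)
        if sum(a) < n:               # omega below C*max: infeasible
            continue
        mk = max(a[j] * tau[j] for j in range(len(tau)))
        rel = sum(a[j] * lam[j] * tau[j] for j in range(len(tau)))
        if not any(m <= mk and r <= rel for m, r in front):
            front = [(m, r) for m, r in front
                     if not (mk <= m and rel <= r)]
            front.append((mk, rel))
    return sorted(front)
```

On a toy instance with two processors the returned list is exactly the set of non-dominated (makespan, Σ ajλjτj) trade-offs.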
4.3. Independent arbitrary tasks
In this section, we extend the analysis to the case where the tasks are not unitary (the
values pi are arbitrary integers). As before, the makespan objective is fixed and we aim at
determining the best possible reliability. However, since deciding whether there exists a
schedule whose makespan is smaller than a target value, given a set of processors and
arbitrary independent tasks, is NP-complete, it is not possible to find an optimal schedule
unless P = NP.
4.3.1. A ⟨2, 1⟩-approximation algorithm
We present below a ⟨2, 1⟩-approximation algorithm called CMLT (for ConstrainedMinLambdaTau)
which has a better complexity and is easier to implement than the general
algorithm presented in [17].
Let ω be the guess value of the optimum makespan. Let M(i) = {j | piτj ≤ ω} be the
set of processors able to execute task i in at most ω units of time. It is obvious that if i is
executed on j ∉ M(i) then the makespan will be greater than ω.
The following proposition states that if task i has fewer operations than task i′, then all
the machines able to schedule i′ in at most ω time units can also schedule i in the same
time. The proof is directly derived from the definition of M and thus it is omitted.
Proposition 3. ∀i, i′ ∈ T such that pi ≤ pi′, M(i′) ⊆ M(i).
CMLT proceeds as follows: for each task i, considered in non-increasing number of
operations, schedule i on the processor j of M(i) that minimizes λjτj with Cj ≤ ω (or
return no schedule if there is no such processor). Sorting the tasks by non-increasing number
of operations implies that more and more processors become usable over time.
The principle of the algorithm is rather simple. However several properties should be
verified to ensure that it is always possible to schedule all the tasks this way.
Lemma 1. CMLT returns a schedule whose makespan is lower than 2ω or ensures that there
is no schedule whose makespan is lower than ω.
Proof. We first remark that if the algorithm returns a schedule, then its makespan
is lower than 2ω: task i is executed on processor j ∈ M(i) only when Cj ≤ ω, and its
execution time piτj on j is at most ω. It remains to prove that if the algorithm does not
return a schedule then there is no schedule with a makespan lower than ω.
Suppose that task i cannot be scheduled on any processor of M(i). Then all processors
of M(i) execute tasks during more than ω units of time: ∀j ∈ M(i), Cj > ω.
Moreover, due to Proposition 3, each task i′ ≤ i (for which pi′ ≥ pi) could not have been
scheduled on a processor not belonging to M(i). Thus, in a schedule with a makespan lower
than ω, all the tasks i′ ≤ i must be scheduled on M(i).
But there are more operations in the set of tasks {i′ | i′ ≤ i} than the processors of M(i)
can execute in ω units of time, since they are all busy for more than ω. Hence no schedule
of makespan at most ω exists.
Lemma 2. CMLT generates a schedule such that rel ≤ rel∗(ω)
Proof. We first construct a (possibly non-feasible) schedule π∗ whose reliability is a lower
bound of rel∗(ω). Then, we show that rel(CMLT) ≤ rel(π∗).
We know from Theorem 3 that the optimal reliability under the makespan constraint
for unitary tasks is obtained by adding tasks to the processors (sorted in increasing order
of λτ) up to reaching the threshold ω. For arbitrary lengths, we can construct a schedule π∗
using a similar method. Task i is allocated to the processor of M(i) that minimizes the λτ
product, but if i finishes after ω, the exceeding quantity is scheduled on the next processor
belonging to M(i) in λτ order. Note that such a schedule exists because CMLT returns a
solution. Of course this schedule is not always feasible, as the same task can be required to
be executed on more than one processor at the same time. However, it is easy to adapt the
proof of Theorem 3 and to show that rel(π∗) ≤ rel∗(ω).
The schedule generated by CMLT is similar to π∗. The only difference is that some
operations are scheduled after ω. In π∗, these operations are scheduled on less reliable
processors. Thus, the schedule generated by CMLT has a better reliability than π∗.
Finally, we have rel(CMLT) ≤ rel(π∗) ≤ rel∗(ω), which concludes the proof.
Remark that if ω is very large, then M(i) = Q for all tasks i; hence all the tasks will
be scheduled on the processor which minimizes the λτ product, leading to the most reliable
schedule.
Lemma 3. The time complexity of CMLT is in O(n log n+m logm).
Proof. The algorithm is implemented using a heap, as presented in Algorithm 4. The cost
of sorting the tasks is in O(n log n) and the cost of sorting the processors is in O(m logm).
Adding (and removing) a processor to (from) the heap costs O(logm) and such operations
are done m times, so heap operations cost O(m logm). Scheduling a task and all the
complementary tests are done in constant time, and there are n tasks to schedule, so
scheduling operations cost O(n).
All the results of this section are summarized in the following theorem:
Theorem 4. CMLT is a ⟨2, 1⟩-approximation algorithm with a complexity in O(n log n + m logm).
4.3.2. Approximating the Pareto-front
Here again we can approximate the Pareto-front using the method of Papadimitriou and
Yannakakis. Thanks to Theorem 1, Algorithm 1 applied on CMLT leads to a (2 + ε, 1)-
approximation of the Pareto-front.
Algorithm 4: CMLT
Input: ω the makespan threshold
begin
    Sort the tasks in non-increasing pi order (now, ∀i ∈ [1, n−1], pi ≥ pi+1)
    Sort the processors in non-decreasing τj order (now, ∀j ∈ [1, m−1], τj ≤ τj+1)
    Let H be an empty heap
    j ← 1
    for i from 1 to n do
        while j ∈ M(i) do
            Add j to H with key λjτj
            j ← j + 1
        if H.empty() then Return no solution
        j′ ← H.min()
        schedule i on j′
        Cj′ ← Cj′ + piτj′
        if Cj′ > ω then
            Remove j′ from H
end
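A direct Python transcription of Algorithm 4, using the standard `heapq` module for the λτ-keyed heap, could look as follows (the function and variable names are ours; `None` plays the role of "no solution"):

```python
import heapq

def cmlt(p, tau, lam, omega):
    """CMLT sketch: consider tasks in non-increasing p_i order; processors
    join a min-heap keyed by lambda*tau as soon as p_i*tau_j <= omega;
    each task goes to the most reliable admissible processor, which is
    dropped from the heap once its load exceeds omega."""
    tasks = sorted(range(len(p)), key=lambda i: -p[i])
    procs = sorted(range(len(tau)), key=lambda j: tau[j])  # fastest first
    C = [0.0] * len(tau)                 # current load per processor
    heap, nxt, sched = [], 0, {}
    for i in tasks:
        # as p_i decreases, slower processors enter M(i)
        while nxt < len(procs) and p[i] * tau[procs[nxt]] <= omega:
            j = procs[nxt]
            heapq.heappush(heap, (lam[j] * tau[j], j))
            nxt += 1
        if not heap:
            return None                  # no schedule of makespan <= omega
        _, j = heap[0]                   # peek: smallest lambda*tau
        sched[i] = j
        C[j] += p[i] * tau[j]
        if C[j] > omega:                 # j is full: remove it
            heapq.heappop(heap)
    return sched, C
```

By Lemma 1 the returned loads never exceed 2ω, and a `None` result certifies that no schedule of makespan at most ω exists.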
The lower bound Cminmax = (Σi pi)/(Σj 1/τj) is obtained by considering that a single
virtual processor gathers the whole computational power of all the processors.
The upper bound Cmaxmax = Σi pi · maxj τj is the makespan obtained by scheduling all
tasks on the slowest processor. No solution can have a worse makespan without introducing
idle times, which are harmful for both objective functions. Notice that Cmaxmax can be
achieved by a Pareto-optimal solution if the slowest processor is also the most reliable one.
The last points to answer concern the cardinality of the generated set and the complexity
of the algorithm.
• Cardinality: The algorithm generates at most ⌈log1+ε/2(Cmaxmax/Cminmax)⌉ ≤
⌈log1+ε/2(maxj τj · Σj 1/τj)⌉ ≤ ⌈log1+ε/2(m · maxj τj/minj τj)⌉ solutions, which is
polynomial in 1/ε and in the size of the instance.
• Complexity: Remark that CMLT sorts the tasks in an order which is independent of
ω. This sorting can be done once for all. Thus, the complexity of the Pareto-front
approximation algorithm is O(n log n + ⌈log1+ε/2(Cmaxmax/Cminmax)⌉(n + m logm)).
In Section 2.2 we briefly recalled the work of Shmoys and Tardos [17], done for a different
bi-objective problem, which may also be used in our context. Using this method, we can
derive a ⟨2, 1⟩-approximation algorithm whose time-complexity is in O(mn² log n). This is
larger than the time-complexity of CMLT, in O(n log n + m logm). Moreover, in the
perspective of approximating the Pareto-front of the problem with the method previously
presented, the algorithm derived from [17] would have a time-complexity in
O(⌈log1+ε/2(Cmaxmax/Cminmax)⌉ · mn² log n). Unlike CMLT, this algorithm cannot be easily
tuned to avoid a significant part of the computations when it is called several times. Thus,
CMLT is significantly better than the algorithm presented in [17], which was established in
a more general setting on unrelated processors.
4.4. Experimental analysis of CMLT
The goal of this section is to compare the front obtained by Algorithm 1
applied with CMLT to an idealized virtual front (called F). We intend to show that this
[Figure 3 plot: a front of schedules π1–π5 (failure probability vs. makespan) with a reference point; the dominated zone is shaded.]
Figure 3: The hypervolume is the set of the points that are dominated by a point of the front and that
dominates the reference point. In this example, it is the blue zone. When the two objectives have to be
minimized, the hypervolume should be maximized.
algorithm has not only a very good worst-case guarantee, as shown in Theorem 4, but
also a good behavior on average.
More precisely, we use Algorithm 1 with ε = 10⁻³ applied on CMLT. The obtained result
is compared to a front F composed of three points, namely the HEFT [25] schedule (oriented
to optimize the makespan), the most reliable schedule (obtained by scheduling all the tasks
on the processor with the smallest λτ product), and a fictitious schedule with the same
makespan as HEFT and the best reliability. Although one can find a better makespan-centric
schedule than the one found by HEFT, the front F is a very good front that dominates all
the fronts found by CMLT.
To compare the fronts we use the hypervolume unary indicator [26] (see Fig. 3), which
measures the volume of the objective space dominated by the considered front up to a
reference point. This choice is motivated by the fact that this indicator is the only unary
indicator that is sensitive to any type of improvement. Hence, if a front maximizes this
indicator, then it contains all the Pareto-optimal solutions. Since we target a problem of
minimizing two objectives, the greater the hypervolume, the better the front [26]. In our
case, the hypervolume of F is always a rectangle.
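For two minimized objectives the hypervolume reduces to an area, computable by a single sweep over the front. The following Python sketch (names are ours; a minimal 2-D version, not necessarily the implementation used in the experiments) illustrates the computation:

```python
def hypervolume_2d(front, ref):
    """Area dominated by `front` (list of (makespan, failure-proba) points,
    both minimized) and dominating the reference point `ref`: the union of
    the rectangles [x, ref_x] x [y, ref_y], swept in increasing x order."""
    hv, last_y = 0.0, ref[1]
    for x, y in sorted(front):
        if x < ref[0] and y < last_y:   # skip dominated / out-of-box points
            hv += (ref[0] - x) * (last_y - y)
            last_y = y
    return hv
```

For a front reduced to a single point the formula degenerates to the rectangle between that point and the reference point, which is the "rectangle" case of F mentioned above.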
[Figure 4 plot ("2-approx vs. Inf. Bound"): histogram and ECDF of the hypervolume ratio, frequency vs. ratio in [0.6, 1.0].]
Figure 4: ECDF and histogram of the hypervolume ratio between the approximation algorithm front and F
The input cases are the following. We consider three sets of machines with respectively
10, 20 and 50 processors. The speeds and the inverses of the failure rates are randomly
generated according to a uniform distribution. We generate sets of tasks with cardinality
between 10 and 100 (in increments of 1). For each set of tasks we draw the processing
requirements uniformly between 1 and 100 (resp. 10⁴, 10⁶ and 10⁹) for sets of class A
(resp. B, C and D). For each set and class of tasks, 4 different seeds were used.
In Fig. 4, we plot the empirical cumulative distribution function (ECDF) and the
histogram of the ratio between the hypervolumes of the two fronts for all the input cases
(the higher the ratio, the closer the approximation algorithm front is to F). From this figure,
we see that the ratio is never lower than 0.6, the median is 0.94, and 2/3 of the cases have a
ratio greater than 0.9. This means that the (2 + ε, 1)-approximation algorithm gives very
good fronts on average: in most of the cases, the obtained fronts are very close to the
optimal ones.
5. Precedence Task Graphs
5.1. Arbitrary Graphs
In this section, we study the general case where there is no restriction on the precedence
task graph. We present three ways of designing bi-objective heuristics from makespan-centric
ones. The first one is based on the characterization of the role of the λτ product ({failure
rate} × {unitary instruction execution time}). The second one uses aggregation to change
the allocation decision in the list-based makespan-optimizing heuristic. The third one, called
geometric, selects the solution that best follows a given direction in the objective space.
5.1.1. The case of communication
When dealing with precedence task graphs, edges model communications. In this case,
failures of the network can also have an impact on the reliability. We could tackle this
problem by considering the network as a new resource, as in [27, 5]. However, a simpler way
is to incorporate the network and the CPU into one entity called a node³. As
we only consider fail-stop errors, a node has to be up from the start of the application to its
end. We assume that each node has a unique dedicated link to a fail-free network backbone.
If, for a schedule π, node j is used during Cj(π), this means that both the CPU and the
network must work. Let us call λcj, λnj and λlj the failure rates of the CPU, the network
card and the network link of node j; the probability that the three are up is therefore
e^(−λcjCj(π)) × e^(−λnjCj(π)) × e^(−λljCj(π)) = e^(−(λcj+λnj+λlj)Cj(π)). This means the node has a
failure rate which is the sum of the failure rates of its CPU, its network card and its network
link to the fail-free backbone. Therefore, in the following we will call λj the failure rate of
the whole node in order to take into account both the CPU and the network failures.
5.1.2. Approximating the Pareto-front Using a Makespan-Centric Heuristic
For both unitary and non-unitary independent tasks, we have shown that scheduling
tasks on the nodes with the smallest λτ helps in improving the reliability. Therefore, in order
³In the remainder, the term node is used to encompass both the CPU and the network card.
to approximate the Pareto-front we propose a heuristic, called GPFA (General Pareto-Front
Approximation), which is detailed in Algorithm 5 below.
Algorithm 5: GPFA, a general heuristic for approximating the Pareto-front
Input: H a makespan-centric heuristic
Data: G the input DAG
Result: S an approximation of the Pareto-front
begin
    Sort the nodes in non-decreasing λjτj order
    S ← ∅
    for j from 1 to m do
        Let πj be the schedule of G obtained by H using the first j nodes
        if πj is not dominated by any solution of S then
            S ← S ∪ {πj}
    return S
end
The idea is to build a set of makespan/reliability trade-offs by scheduling the tasks on a
subset of the nodes (sorted by non-decreasing λτ product) using a makespan-centric heuristic.
The smaller the number of used nodes, the larger the makespan and the better the reliability
(and vice versa). Any makespan-centric heuristic can be used to implement this strategy,
such as HEFT [25], BIL [28], PCT [29], GDL [30], HSA [5] or CPOP [25].
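A minimal Python sketch of GPFA (Algorithm 5) follows; `heuristic` stands for any makespan-centric heuristic H and is assumed, for illustration, to return a (makespan, reliability-index) pair for a given node subset (all names are ours):

```python
def gpfa(heuristic, G, nodes, lam, tau):
    """GPFA sketch: run the makespan-centric heuristic on the j most
    reliable nodes (smallest lambda*tau) for j = 1..m, keeping only the
    non-dominated (makespan, reliability-index) schedules."""
    order = sorted(nodes, key=lambda j: lam[j] * tau[j])
    S = []
    for j in range(1, len(order) + 1):
        mk, rel = heuristic(G, order[:j])
        # keep the new point only if no kept point dominates it,
        # and drop any kept point it dominates
        if not any(m <= mk and r <= rel for m, r, _ in S):
            S = [s for s in S if not (mk <= s[0] and rel <= s[1])]
            S.append((mk, rel, order[:j]))
    return S
```

The heuristic itself is a black box here, which is what lets the same wrapper derive P-HEFT from HEFT and P-HSA from HSA.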
5.1.3. Bi-objective Aggregation-based Heuristic
The class of heuristics based on aggregation uses an additive function to combine the
objectives. As in [5], we use the following function: given a ranking of the tasks, the
heuristic schedules task i on the node j that minimizes

√( α (end(i, j)/maxj′ end(i, j′))² + (1 − α)(piτjλj/maxj′ piτj′λj′)² ),

where end(i, j) is the completion time of task i if it is scheduled as soon as
possible on node j, and α is a parameter given by the user that determines the trade-off
between the objectives (α = 1 leads to a makespan-centric heuristic). Each term represents
one of the objectives and is normalized, since the objectives are expressed in different units
and can have different orders of magnitude. The normalization is done relatively to an
approximation of the worst allocation of the task.
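The selection rule can be sketched in Python as follows (an illustrative fragment; the names are ours, and `end[j]` is assumed to be precomputed for the current task):

```python
import math

def aggregation_score(i, j, end, p, tau, lam, alpha):
    """Aggregated criterion sketch: normalized completion time and
    normalized reliability cost, combined with weight alpha
    (alpha = 1 is makespan-centric, alpha = 0 reliability-centric)."""
    m = len(tau)
    t_norm = end[j] / max(end[k] for k in range(m))
    r_norm = (p[i] * tau[j] * lam[j]) / max(p[i] * tau[k] * lam[k]
                                            for k in range(m))
    return math.sqrt(alpha * t_norm ** 2 + (1 - alpha) * r_norm ** 2)

def pick_node(i, end, p, tau, lam, alpha):
    # schedule task i on the node minimizing the aggregated score
    return min(range(len(tau)),
               key=lambda j: aggregation_score(i, j, end, p, tau, lam, alpha))
```

Dividing each term by its maximum over the nodes is what makes the two objectives, expressed in different units, commensurable.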
5.1.4. Bi-objective Geometric-based Heuristic
Concerning the geometric class of heuristics, the idea has been introduced in [31] and is
described below. The user provides an angle θ between 0° and 90° and a greedy scheduling
algorithm. Intuitively, θ is the direction in the objective space that the user wants to follow. A
value close to 0° means that the user favors the makespan, while a value close to 90° means
the opposite. At each step, a partial schedule S is constructed and a new task is considered.
The algorithm simulates its execution on all the m nodes and hence generates m partial
schedules, each one having its own reliability and makespan. Among these schedules, we
discard the Pareto-dominated ones. Then, these partial schedules and S (the one generated
at the previous step) are plotted into a square of size 1, S being at the origin (see Fig. 5).
A line determined by the origin and an angle θ with the x-axis is drawn. The closest
partial schedule to this line is retained (s2 in the figure) and we proceed to the next step.
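One step of this selection can be sketched as follows (an illustrative Python fragment; the names and the exact normalization are ours, the principle is the one described above):

```python
import math

def pick_direction(partial, candidates, theta_deg):
    """One step of the geometric heuristic sketch: among the candidate
    (makespan, failure-probability) points, rescaled into a unit square
    with the previous partial schedule at the origin, keep the point
    closest to the ray at angle theta from the makespan axis."""
    mk0, fp0 = partial
    dmax = max(c[0] - mk0 for c in candidates) or 1.0
    fmax = max(c[1] - fp0 for c in candidates) or 1.0
    theta = math.radians(theta_deg)
    dx, dy = math.cos(theta), math.sin(theta)
    def dist(c):
        x, y = (c[0] - mk0) / dmax, (c[1] - fp0) / fmax
        # perpendicular distance from (x, y) to the line through the origin
        return abs(x * dy - y * dx)
    return min(candidates, key=dist)
```

With θ = 0° this degenerates into a pure makespan-centric greedy choice, and with θ = 90° into a pure reliability-centric one.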
5.1.5. Experimental Settings
We compare experimentally the three ways of designing bi-objective heuristics from
makespan-centric ones by implementing them on HEFT and HSA. Therefore, GPFA is
used to derive P-HEFT and P-HSA, the aggregation scheme is used to derive B-HEFT and
B-HSA⁴, and the geometric construction is used to derive G-HEFT and G-HSA.
We have used 3 types of graphs: the Strassen DAG [32] and 2 random graph families,
namely samepred (each created node can be connected to any other existing node) and
layrpred (where the nodes are arranged by layers). We have used the following parameters
to build the graphs:
• Number of tasks: 10, 100, 1000 for random graphs or 23, 163, 1143 for Strassen DAGs.
⁴Notice that this heuristic was first proposed by Hakem and Butelle and is called BSA in [5].
[Figure 5 plot: partial schedules s1–s10 in the (makespan, failure probability) square, with the previous schedule S at the origin and a direction line at angle Θ.]
Figure 5: The geometric heuristic with 10 nodes: stars are Pareto-optimal solutions; crosses are dominated
solutions and are discarded. Here, partial schedule s2 is selected and the task is mapped on node 2.
• Average task cost pi (in FLOP), for random graphs: 10⁶, 10⁷ or 10⁹ (fixed by the
structure for Strassen).
• Variation of the task costs: 0.5, 0.0001, 0.1, 0.3, 1 or 2 for random graphs (fixed by
the structure for Strassen). These numbers, combined with the average costs, are used
to compute the standard deviation of the Gamma distribution used to draw the task
costs (we use a Gamma distribution because it is positive and commonly used
to model timings). The standard deviation is computed by multiplying the
average cost by the variation.
• Average communication cost (in bytes): 10³, 10⁴ or 10⁶ for random graphs (fixed by the
structure for Strassen).
• Variation of communication costs: 0.5, 0.0001, 0.1, 0.3, 1 or 2 for random graphs (fixed
by structure for Strassen). Here again, the variation is used combined with the average
cost to compute the standard deviation of the distribution.
• Average number of edges per node: 1, 3 or 5 for random graphs (fixed by structure for
Strassen).
• Number of available machines: 10, 20 or 50
• The speeds of the machines are randomly generated according to a uniform distribution:
τ = (10⁷(U(1000) + 1))⁻¹.
• The inverses of the failure rates are randomly generated according to a uniform
distribution: λ = BFR/(U(1000) + 1), where BFR is used to scale the probability of
failure and is equal to 1, 10⁻³ or 10⁻⁶.
• The network is homogeneous and the topology is supposed to be complete. There is no
latency.
• 10 seeds from 0 to 9 for random graphs (no seed for the Strassen).
All the combinations lead to 525 123 different settings. For each setting, we have
computed an approximation of the Pareto-front as follows. For P-HSA and P-HEFT, we have
generated as many schedules as available nodes (one with the node with the smallest value
of λτ, one with the two nodes with the smallest λτ, etc.). For the 4 other heuristics, we have
used 1000 different values of the compromise parameter to generate the front. In total,
more than 2.1 billion schedules have been computed.
Among the 525 123 fronts generated, about 34 100 have only one point and are
not considered in the evaluation. These are the cases where the probability of success of the
schedule is 0 even on the nodes with the smallest λτ product (e.g., when the generated
platform is highly unreliable or the precedence task graph is very large).
5.1.6. Results
As in Section 4.4, we use the hypervolume unary indicator [26] to compare the fronts of
each heuristic.
We have computed the hypervolume using the same reference point for all the fronts
of the same setting that have at least 2 points. We then compare each pair of heuristics
[Figure 6 matrix, upper-part statistics of the hypervolume ratio (row vs. column): % of ratios > 1, % of ratios = 1, median [first quartile, third quartile]:]
• P-HEFT vs. P-HSA: 43.50% (>), 32.17% (=), 1.000 [1.00, 1.01]
• P-HEFT vs. G-HEFT: 84.20% (>), 2.20% (=), 1.046 [1.02, 1.11]
• P-HEFT vs. G-HSA: 83.11% (>), 2.04% (=), 1.048 [1.02, 1.11]
• P-HEFT vs. B-HEFT: 69.08% (>), 4.11% (=), 1.074 [0.995, 1.37]
• P-HEFT vs. B-HSA: 69.78% (>), 3.16% (=), 1.092 [0.993, 1.39]
• P-HSA vs. G-HEFT: 82.26% (>), 1.72% (=), 1.046 [1.01, 1.11]
• P-HSA vs. G-HSA: 82.70% (>), 2.38% (=), 1.045 [1.01, 1.10]
• P-HSA vs. B-HEFT: 68.20% (>), 2.71% (=), 1.086 [0.992, 1.36]
• P-HSA vs. B-HSA: 67.36% (>), 3.60% (=), 1.077 [0.992, 1.38]
• G-HEFT vs. G-HSA: 42.09% (>), 17.34% (=), 1.000 [0.99, 1.01]
• G-HEFT vs. B-HEFT: 60.16% (>), 1.42% (=), 1.054 [0.916, 1.30]
• G-HEFT vs. B-HSA: 60.18% (>), 1.16% (=), 1.068 [0.915, 1.32]
• G-HSA vs. B-HEFT: 60.74% (>), 1.09% (=), 1.071 [0.915, 1.31]
• G-HSA vs. B-HSA: 60.52% (>), 1.30% (=), 1.060 [0.92, 1.32]
• B-HEFT vs. B-HSA: 50.68% (>), 20.33% (=), 1.001 [0.997, 1.01]
Figure 6: Scatter plot of the ratio of the hypervolume indicator for the 6 heuristics
by computing, for each setting, the ratio of the hypervolumes of each pair of fronts. Fig. 6
shows the obtained results. The six heuristics are displayed on the diagonal of the figure. In
the lower part, the histogram and the ECDF (empirical cumulative distribution function) of
the hypervolume ratio are displayed for the two heuristics on the corresponding row and
column. In the upper part, we summarize some numerical values: the percentage of
ratios that are strictly above 1, the percentage of ratios that are equal to 1, the median ratio,
and, in brackets, the first and the third quartiles⁵. For example, we see that 84.2% of the
hypervolume ratios between P-HEFT and G-HEFT are greater than 1 (the hypervolume of
P-HEFT is greater than the one of G-HEFT in 84.2% of the cases), in 2.2% of the cases the
hypervolumes are equal, and the median hypervolume ratio is 1.046. Moreover, half of the
ratios are between 1.02 and 1.11, a quarter of them being under 1.02 and the other quarter
above 1.11.
The results show that the GPFA-based heuristics (P-HEFT and P-HSA) perform the best
according to the hypervolume indicator. They are much better than the geometric heuristics
and outperform the aggregation heuristics in more than two thirds of the cases. P-HEFT is
slightly better than P-HSA (it outperforms P-HSA in 43.5% of the cases while P-HSA
outperforms P-HEFT in 24.33% of the cases). Next, the geometric heuristics (G-HEFT and
G-HSA) are better than the aggregation ones (B-HEFT and B-HSA). This is explained by
the fact that some fronts computed by B-HEFT or B-HSA are really bad (as shown by the
histograms in the lower part: some of the hypervolumes are more than twice as large as their
respective P-HEFT or P-HSA counterparts). We also see that the HEFT ordering provides
better results than the HSA ordering (G-HEFT is slightly better than G-HSA and B-HEFT
is slightly better than B-HSA).
We have also compared the resources required to compute the fronts. We record, for each
schedule, the number of nodes it uses. In Fig. 7, we present a scatter plot similar to that of
Fig. 6. The difference is that, as we want to minimize the number of nodes, we now display,
in the upper part, the fraction of ratios that are lower than 1 instead of greater than 1.
⁵The first (resp. third) quartile is the value such that 25% of the results are below (resp. above) it.
[Figure 7 matrix, upper-part statistics of the node-count ratio (row vs. column): % of ratios > 1, % of ratios = 1, median [first quartile, third quartile]:]
• P-HEFT vs. P-HSA: 32.98% (>), 20.09% (=), 1.000 [0.98, 1.01]
• P-HEFT vs. G-HEFT: 11.62% (>), 7.22% (=), 0.877 [0.778, 0.97]
• P-HEFT vs. G-HSA: 12.00% (>), 6.39% (=), 0.872 [0.778, 0.964]
• P-HEFT vs. B-HEFT: 2.87% (>), 8.15% (=), 0.700 [0.603, 0.896]
• P-HEFT vs. B-HSA: 4.31% (>), 7.30% (=), 0.701 [0.604, 0.887]
• P-HSA vs. G-HEFT: 13.72% (>), 6.98% (=), 0.885 [0.78, 0.979]
• P-HSA vs. G-HSA: 11.63% (>), 7.47% (=), 0.883 [0.781, 0.97]
• P-HSA vs. B-HEFT: 3.87% (>), 7.75% (=), 0.700 [0.607, 0.908]
• P-HSA vs. B-HSA: 2.82% (>), 8.62% (=), 0.701 [0.608, 0.899]
• G-HEFT vs. G-HSA: 43.44% (>), 10.63% (=), 1.000 [0.97, 1.03]
• G-HEFT vs. B-HEFT: 14.26% (>), 8.10% (=), 0.867 [0.732, 0.988]
• G-HEFT vs. B-HSA: 14.26% (>), 7.32% (=), 0.862 [0.728, 0.984]
• G-HSA vs. B-HEFT: 16.06% (>), 7.32% (=), 0.867 [0.734, 0.991]
• G-HSA vs. B-HSA: 14.90% (>), 7.73% (=), 0.865 [0.731, 0.986]
• B-HEFT vs. B-HSA: 41.08% (>), 13.25% (=), 1.000 [0.986, 1.01]
Figure 7: Scatter plot of the ratio of the average number of nodess required to compute a given front for the
6 heuristics
32
1.
The results show that GPFA-based heuristics (P-HEFT and P-HSA) use many fewer nodes
than the other heuristics (P-HEFT being slightly better than P-HSA for this metric): between
79.3% and 88.97% of the cases are favorable to this type of heuristics. Here again, geometric
heuristics are better than aggregation ones. Last, the difference between the HSA-based and
HEFT-based heuristics is very small, with the HEFT-based heuristics being marginally
better.
We have also projected these results onto the different possible parameters (task cost,
number of tasks, failure rate, etc.). Most of the time, varying a parameter has no or very
little influence on the obtained results, and these results are therefore not displayed here.
The only interesting case is related to the number of machines: when this number is low
(10 machines to schedule the whole graph), we see that, for the hypervolume metric, the
B-HEFT and B-HSA heuristics perform the best. For instance, the hypervolume ratio is
favorable to B-HEFT in 71.07% of the cases (resp. 74.27%, 93.36%, 92.96%) compared to
P-HEFT (resp. P-HSA, G-HEFT, G-HSA) when using 10 nodes.
In conclusion to this section, we see that, thanks to our understanding of the problem and
to the identification of the crucial role of the product {failure rate} × {unitary instruction
execution time}, we have been able to design a heuristic that provides a good approximation
of the Pareto-front. This understanding allowed us to outperform generic solutions
(aggregation- or geometric-based heuristics) that do not exploit this problem-specific feature.
Moreover, the geometric heuristics are much better than the aggregation ones.
5.2. A single chain on m nodes
Computing an approximation algorithm for the general case is a difficult problem. Indeed,
there is no known constant-factor approximation algorithm for optimizing the makespan of
an arbitrary precedence task graph on related nodes; even the case of multiple chains has
none. In this section, we address the following special case.
5.2.1. Characterizing the Pareto-Front
In this section we are interested in an elementary subcase of the general graph case: the
precedence task graph is a single chain, so task i − 1 must be completed before task i can
start its execution.
Precedence constraints may induce idle times in the schedule, and the formulation of Cj,
the completion time of node j, must be modified to take the precedences into account. The
important point is that the reliability computation must account for these idle times.
The precedence graph being a single chain, only one node is working at a time. Therefore,
one can assume that no node is both faster and more reliable than another one (a node that
is slower and less reliable than another node could simply be ignored). Without loss of
generality, the nodes are ordered from the fastest to the most reliable, that is to say,
τj < τj+1 and λjτj > λj+1τj+1; a direct consequence is that λj > λj+1.
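To make both objectives concrete, the following Python sketch evaluates a single interval-partition schedule of a chain under the model outlined above: node j takes τj time units per instruction, fails according to an exponential law of rate λj, and must be alive when it completes its last task, so the idle time it spends waiting for the chain is charged to its reliability. The function name and the interval encoding are hypothetical illustrations, not the paper's implementation:

```python
import math

def chain_schedule_cost(p, tau, lam, starts):
    """Makespan and reliability of a chain split into consecutive intervals.

    p[i]      -- cost (number of instructions) of task i
    tau[j]    -- unit-instruction execution time of node j
    lam[j]    -- failure rate of node j (exponential law)
    starts[j] -- first task of node j's interval (node j runs tasks
                 starts[j] .. starts[j+1]-1; empty intervals are allowed)
    """
    m = len(tau)
    t = 0.0           # current time along the chain
    log_rel = 0.0     # log of the schedule's reliability
    for j in range(m):
        hi = starts[j + 1] if j + 1 < m else len(p)
        work = sum(p[starts[j]:hi])
        if work > 0:
            t += tau[j] * work      # completion time C_j of node j
            log_rel -= lam[j] * t   # alive during idle and busy time
    return t, math.exp(log_rel)
```

For instance, with p = [2, 1, 3], tau = [1.0, 2.0], lam = [1e-3, 1e-4] and starts = [0, 2], node 0 finishes at time 3 and node 1 at time 9, so the makespan is 9 and the reliability is exp(−(10^−3·3 + 10^−4·9)).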
Lemma 4. A solution that executes a task on node j after executing a task on node j′, with
j′ > j, is not Pareto-optimal.
Proof. Let π be such a solution. The proof is done by constructing a solution π′ that Pareto-
dominates π.
Let i be the last task executed on node j′ for which a successor is executed on node j.
π′ is constructed by keeping the same allocation as π, except that task i is moved onto node
j. The completion times in π′ are never larger than in π: the completion time of each node
j′′ that executes a task i′ ≥ i decreases by (τj′ − τj)pi, and the other nodes complete at the
same time. This improves both the makespan and the reliability.
The solutions complying with Lemma 4 have the following structure: at most one interval
of tasks is scheduled on each node, and if task i is scheduled on node j then all the tasks
i′ > i are scheduled on nodes j′ ≥ j. These solutions are in bijection with the set of partitions
of the chain of n tasks into m intervals, allowing empty intervals.
These partitions can be enumerated using a recursive function that takes a schedule of
the first x tasks and returns all the solutions that comply with this partial schedule. The
number of such partitions is in O(n^{m−1}).
Theorem 5. The number of Pareto-optimal solutions for the problem of scheduling a chain
of tasks on related nodes to optimize the makespan and the reliability is in O(n^{m−1}), and
they can be enumerated by an algorithm of time complexity O(n^{m−1}).
Since each task may be scheduled on m different nodes, there are m^n valid schedules for
this problem. However, Lemma 4 allows restricting the number of Pareto-optimal solutions
to O(n^{m−1}), which is significantly better since there are usually more tasks than nodes in
such problems.
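The enumeration behind Theorem 5 can be sketched directly: generate every partition of the n tasks into m consecutive (possibly empty) intervals, evaluate each one, and keep the non-dominated (makespan, reliability) points. This self-contained Python illustration uses the same idle-time-aware reliability model as in the text; the names and encodings are our own:

```python
import math
from itertools import combinations_with_replacement

def interval_partitions(n, m):
    """Yield every split of tasks 0..n-1 into m consecutive, possibly
    empty, intervals as a tuple of m start indices: O(n^(m-1)) tuples."""
    for cuts in combinations_with_replacement(range(n + 1), m - 1):
        yield (0,) + cuts

def evaluate(p, tau, lam, starts):
    """Makespan and reliability of one partition (idle time charged)."""
    t, log_rel = 0.0, 0.0
    for j in range(len(tau)):
        hi = starts[j + 1] if j + 1 < len(tau) else len(p)
        work = sum(p[starts[j]:hi])
        if work > 0:
            t += tau[j] * work
            log_rel -= lam[j] * t
    return t, math.exp(log_rel)

def pareto_front(p, tau, lam):
    """Brute-force Pareto front: minimize makespan, maximize reliability."""
    sols = [evaluate(p, tau, lam, s)
            for s in interval_partitions(len(p), len(tau))]
    sols.sort(key=lambda mr: (mr[0], -mr[1]))  # by makespan, best rel first
    front, best_rel = [], -1.0
    for ms, rel in sols:
        if rel > best_rel:
            front.append((ms, rel))
            best_rel = rel
    return front
```

On a toy instance with two unit tasks, tau = [1.0, 2.0] and lam = [0.2, 0.05], all three partitions are Pareto-optimal: the fast node alone gives makespan 2, while the reliable node alone gives the best reliability at makespan 4.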
5.2.2. Experimental evaluation
Here we compare the optimal front found by the method described above with the ones
found by P-HEFT (GPFA implementation with the HEFT heuristic), G-HEFT (geometric
heuristic) and B-HEFT (aggregation heuristic). We did not use the HSA heuristic because
it differs from HEFT only in the ranking of the tasks; as we are dealing with chains, this
ranking is imposed by the chain and is therefore the same for HSA and HEFT.
The experimental setting is the same as in Section 4.4, the only difference being that
the tasks are strictly ordered. As the optimal algorithm is exponential, we limit its usage
to the cases where the number of explored solutions is lower than 100 000 000. Moreover,
we discarded the cases where the Pareto-front is reduced to one point (e.g., a low number
of tasks). In total, we have compared more than 30 000 fronts for each of the 3 heuristics,
each front requiring up to 100 schedules.
Results are displayed in Fig. 8. We see that P-HEFT and G-HEFT find the optimal
result in only about one percent of the cases, with median ratios of 1.149 and 1.451
respectively. P-HEFT outperforms G-HEFT in more than 44% of the cases and is
outperformed in only 0.62% of the cases. P-HEFT (resp. G-HEFT) outperforms B-HEFT in
more than 88% (resp. 56%) of the cases. Hence, our GPFA-based heuristic is better than the
other heuristics (the aggregation-based one performing particularly poorly). Last, our
GPFA-based heuristic is not too far from the optimal, since 75% of the cases have a ratio
lower than 1.47.
[Figure 8: Scatter plot of the ratio of the hypervolume indicator of the optimal front and
the one found by the 3 heuristics (P-HEFT, G-HEFT, B-HEFT) for the chain case.]
6. Conclusions
As larger and larger infrastructures are available to execute distributed applications,
reliability becomes a crucial issue. However, optimizing both the reliability and the length
of the schedule is not always possible as they are often conflicting objectives.
Here, we have studied the problem of scheduling tasks on heterogeneous platforms. We
have tackled two metrics: reliability and makespan. As these two objectives are unrelated
and sometimes contradictory, we need to investigate bi-objective approximation algorithms.
In previous works, several heuristics have been proposed to solve similar problems
[3, 4, 5]. However, none of them discusses the fundamental properties of a good bi-objective
scheduling algorithm. In this paper, we have tackled important subproblems in order to
determine how to solve this problem efficiently.
We have shown that optimizing the reliability alone is a polynomial problem, but that
optimizing both the makespan and the reliability cannot be approximated by a single
schedule. For the case of scheduling independent unitary tasks, we have proposed an
approximation algorithm that finds, among the schedules that do not exceed a given
makespan, the one with the best reliability. Based on this algorithm, we have derived a
(1 + ε, 1)-approximation algorithm of the Pareto-front. For the case of independent
non-unitary tasks and uniform processors, we have designed the CMLT algorithm and proved
that it is a ⟨2, 1⟩-approximation. Finally, we derived a (2 + ε, 1)-approximation of the
Pareto-front of the problem.
The above results have highlighted the role of the product {failure rate} × {unitary
instruction execution time} (λτ). For general precedence task graphs, based on the
importance of this product, we have shown that it is easy to extend most of the heuristics
designed for optimizing the makespan so that they also take the reliability into account.
Experiments show that we outperform the other heuristics of the literature both in terms of
front quality and of resource usage. Finally, for a single chain on m nodes, we have proved
that the Pareto-front can be obtained in polynomial time for any fixed number of nodes.
References
[1] R. Koo, S. Toueg, Checkpointing and rollback-recovery for distributed systems, IEEE
Transactions on Software Engineering 13 (1987) 23–31. doi:10.1109/TSE.1987.232562.
[2] A. Bouteiller, T. Herault, G. Krawezik, P. Lemarinier, F. Cappello, MPICH-V: a Mul-
tiprotocol Fault Tolerant MPI, International Journal of High Performance Computing
and Applications 20 (3) (2006) 319–333.
[3] A. Dogan, F. Ozguner, Matching and Scheduling Algorithms for Minimizing Execution
Time and Failure Probability of Applications in Heterogeneous Computing, IEEE Trans.
Parallel Distrib. Syst. 13 (3) (2002) 308–323.
[4] A. Dogan, F. Ozguner, Bi-objective Scheduling Algorithms for Execution Time-
Reliability Trade-off in Heterogeneous Computing Systems, Comput. J. 48 (3) (2005)
300–314.
[5] M. Hakem, F. Butelle, A Bi-objective Algorithm for Scheduling Parallel Applications
on Heterogeneous Systems Subject to Failures, in: Renpar 17, 2006.
[6] R. L. Graham, E. L. Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, Optimization and
approximation in deterministic sequencing and scheduling: a survey, Annals of Discrete
Mathematics 5 (1979) 287–326.
[7] M. R. Garey, D. S. Johnson, Computers and Intractability, Freeman, San Francisco,
1979.
[8] T. Gonzalez, O. H. Ibarra, S. Sahni, Bounds for LPT schedules on uniform processors,
SIAM Journal on Computing 6 (1977) 155–166.
[9] D. S. Hochbaum, D. B. Shmoys, A polynomial approximation scheme for scheduling on
uniform processors: Using the dual approximation approach, SIAM Journal on Com-
puting 17 (3) (1988) 539–551.
[10] C. Chekuri, M. A. Bender, An efficient approximation algorithm for minimizing
makespan on uniformly related machines, Journal of Algorithms 41 (2001) 212–224.
[11] R. Giroudeau, J. Konig, Multiprocessor Scheduling: Theory and Applications, ARS
publishing, 2007, Ch. Scheduling with Communication Delays.
[12] M. L. Pinedo, Scheduling: Theory, Algorithms, and Systems, 3rd Edition, Springer
Publishing Company, Incorporated, 2008.
[13] A. Girault, E. Saule, D. Trystram, Reliability versus performance for critical applica-
tions, Journal of Parallel and Distributed Computing 69 (3) (2009) 326–336.
[14] A. Benoit, L.-C. Canon, E. Jeannot, Y. Robert, Reliability of task graph schedules with
transient and fail-stop failures: complexity and algorithms, Journal of Scheduling.
[15] S. Shatz, J. Wang, Task allocation for maximizing reliability of distributed computer
systems, IEEE Transactions on Computers 41 (9) (1992) 1156–1169.
[16] A. Benoit, Y. Robert, A. Rosenberg, F. Vivien, Static worksharing strategies for hetero-
geneous computers with unrecoverable interruptions, Parallel Computing 37 (8) (2011)
365–378.
[17] D. B. Shmoys, E. Tardos, Scheduling unrelated machines with costs, in: Proceedings
of the Fourth Annual ACM/SIGACT-SIAM Symposium on Discrete Algorithms, 1993,
pp. 448–454.
[18] X. Besseron, S. Bouguerra, T. Gautier, E. Saule, D. Trystram, Fault tolerance and
availability awareness in computational grids, in: Fundamentals of Grid Computing,
Chapman and Hall/CRC Press, 2009, Ch. 5.
[19] I. Sardina, C. Boeres, L. de A. Drummond, An efficient weighted bi-objective schedul-
ing algorithm for heterogeneous systems, in: H.-X. Lin, M. Alexander, M. Forsell,
A. Knupfer, R. Prodan, L. Sousa, A. Streit (Eds.), Euro-Par 2009 – Parallel Process-
ing Workshops, Vol. 6043 of Lecture Notes in Computer Science, Springer Berlin /
Heidelberg, 2010, pp. 102–111.
[20] J. J. Dongarra, E. Jeannot, E. Saule, Z. Shi, Bi-objective scheduling algorithms for
optimizing makespan and reliability on heterogeneous systems, in: Proc. of SPAA,
2007, pp. 280–288.
[21] E. Jeannot, E. Saule, D. Trystram, Bi-Objective Approximation Scheme for Makespan
and Reliability Optimization on Uniform Parallel Machines, in: The 14th International
Euro-Par Conference on Parallel and Distributed Computing (Euro-Par 2008), Las Pal-
mas de Gran Canaria, Spain, 2008.
[22] J. Y.-T. Leung (Ed.), Handbook of Scheduling. Algorithms, Models and Performance
Analysis, Chapman & Hall/CRC, 2004.
[23] C. H. Papadimitriou, M. Yannakakis, On the approximability of trade-offs and optimal
access of web sources, in: Proc. of FOCS, 2000, pp. 86–92.
[24] A. Legrand, Y. Robert, Algorithmique Parallele, Dunod, 2005.
[25] H. Topcuoglu, S. Hariri, M.-Y. Wu, Task scheduling algorithms for heterogeneous pro-
cessors, 8th IEEE Heterogeneous Computing Workshop (HCW’99) (1999) 3–14.
[26] E. Zitzler, L. Thiele, M. Laumanns, C. M. Fonseca, V. Grunert da Fonseca, Performance
Assessment of Multiobjective Optimizers: An Analysis and Review, IEEE Transactions
on Evolutionary Computation 7 (2) (2003) 117–132.
[27] S. Shatz, J.-P. Wang, M. Goto, Task allocation for maximizing reliability of dis-
tributed computer systems, IEEE Transactions on Computers 41 (9) (1992) 1156–1168.
doi:10.1109/12.165396.
[28] H. Oh, S. Ha, A static scheduling heuristic for heterogeneous processors, in: L. Bouge,
P. Fraigniaud, A. Mignotte, Y. Robert (Eds.), Euro-Par, Vol. II, Vol. 1124 of Lecture
Notes in Computer Science, Springer, 1996, pp. 573–577.
[29] M. Maheswaran, H. J. Siegel, A dynamic matching and scheduling algorithm for hetero-
geneous computing systems, in: Heterogeneous Computing Workshop, 1998, pp. 57–69.
URL http://computer.org/proceedings/hcw/8365/83650057abs.htm
[30] G. Sih, E. Lee, A compile-time scheduling heuristic for interconnection-constrained
heterogeneous processor architectures, IEEE Transactions on Parallel and Distributed
Systems 4 (2).
[31] L.-C. Canon, E. Jeannot, Evaluation and Optimization of the Robustness of DAG Sched-
ules in Heterogeneous Environments, IEEE Transactions on Parallel and Distributed
Systems 21 (4) (2010) 532–546.
[32] T. H. Cormen, C. E. Leiserson, R. L. Rivest, C. Stein, Introduction to Algorithms, 2nd
Edition, The MIT Press, 2001.