Reliable Computing 4: 39–53, 1998. © 1998 Kluwer Academic Publishers. Printed in the Netherlands.

Algorithms That Still Produce a Solution (Maybe Not Optimal) Even When Interrupted: Shary’s Idea Justified

MARIA BELTRAN, GILBERT CASTILLO, and VLADIK KREINOVICH
Department of Computer Science, University of Texas at El Paso, El Paso, TX 79968, USA, e-mail: {mbeltran, castillo, vladik}@cs.utep.edu

(Received: 5 December 1995; accepted: 3 June 1996)

Abstract. Many problems of interval computations are NP-hard; this means, crudely speaking, that every algorithm that finds an exact solution will, in some cases, require exponential time, which is often unrealistic. If it turns out that an algorithm works too long, then we have to stop it. Some algorithms, when stopped, produce a reasonable approximation to the solution (for example, an enclosure for the desired interval). Some other algorithms only produce a solution at the very end, so when interrupted, they produce no solution at all.

For interval computations, this distinction was first described by Shary, who proposed to require that algorithms still produce a solution, maybe not optimal, even when interrupted. In order for an algorithm to still produce a solution, the algorithm has to do some extra work. The natural question is: how will it affect the computation time?

In this paper, we show that even with this “extra work” requirement, computation time stays (asymptotically) the same: linear-time algorithms can be converted into linear-time interruptible ones, quadratic-time into quadratic-time interruptible, etc. Thus, from the theoretical viewpoint, Shary’s idea is quite feasible.

1. Formulation of the Problem

1.1. MANY ALGORITHMS TAKE TOO LONG, SO INTERRUPTIONS ARE NEEDED

Many problems of interval computations are extremely computationally time-consuming (in precise terms, NP-hard; for precise definitions see, e.g., [3]). The NP-hardness of some interval computation problems was first proved by Gaganov in [1], [2]. Namely, he proved that if we are given a polynomial ƒ(x1, …, xn) of n real variables, and n intervals x1 = [x1−, x1+], …, xn = [xn−, xn+], then computing the exact range

y = ƒ([x1−, x1+], …, [xn−, xn+]) = {ƒ(x1, …, xn) | x1 ∈ [x1−, x1+], …, xn ∈ [xn−, xn+]}

is NP-hard. (For a recent survey, see, e.g., [4].)

NP-hardness means, crudely speaking, that every algorithm that finds the exact solution will, in some cases, require exponential time, which is often unrealistic. If it turns out that an algorithm works too long, then we have to stop it.


1.2. DIFFERENT ALGORITHMS REACT DIFFERENTLY TO INTERRUPTIONS: SHARY’S IDEA

Shary was the first to notice (in [8]) that with respect to such unexpected interruptions, algorithms exhibit two types of behavior:

• Some algorithms, when stopped, produce a reasonable approximation to the solution (for example, an enclosure Y ⊇ y for the desired interval y). Shary called such algorithms sequentially guaranteeing.

• Some other algorithms only produce a solution at the very end, so when interrupted, they produce no solution at all. Such algorithms are called finally guaranteeing.

Shary proposed to require that algorithms still produce a solution, maybe not optimal, even when interrupted. This idea is supported by Shokin, who, in his survey paper [9], describes this idea as one of the three fundamental ideas that define the future of interval computations.

This proposal is not just a theoretical idea: Shary has actually produced several sequentially guaranteeing algorithms for some classes of problems in [7], [8]. In the general case, however, there may be a problem:

1.3. THERE MAY BE A PROBLEM

In order for an algorithm to still produce a solution, the algorithm has to do some extra work. The natural question is: how will it affect the computation time?

What may happen is that, in some cases, the only way to convert a finally guaranteeing algorithm into a sequentially guaranteeing one may be to drastically increase its computation time. For example, the extra work that is needed to guarantee a partial solution may lead to an increase from linear to cubic time. If this is the case, then Shary’s requirement may not be universally applicable.

1.4. WHAT WE ARE GOING TO PROVE

In this paper, we show that even with this “extra work” requirement, computation time stays (asymptotically) the same: linear-time algorithms can be converted into linear-time interruptible ones, quadratic-time into quadratic-time interruptible, etc.

Thus, from the theoretical viewpoint, Shary’s idea is quite feasible.

Two Words of Warning:

• The result that we will prove is mainly of theoretical value. It shows that in principle, we can make a sequentially guaranteeing algorithm that is not much worse than finally guaranteeing ones in terms of computation time. However, the algorithm proposed in this proof should not be viewed as a practical one: the algorithms that we produce may still require, say, linear time (t ≤ Cn), but the multiplicative constant C may be too large for practical applications.


This remark does not mean that our result is of no value at all. We just mean that, for every practical case, there still is a necessity to come up with a practical sequentially guaranteeing algorithm (just like Shary did in [8] for interval linear equations).

• Our result will not include proving a new technically difficult theorem. Instead, it will be a reasonably straightforward application of a theorem proved by Levin in [5]. The reader should keep in mind, however, that Levin’s result was about a completely different situation. In 1973, no distinction between sequentially guaranteeing and finally guaranteeing algorithms had been proposed. To the best of our knowledge, our result about “interruptible” and “non-interruptible” algorithms is new.

2. Motivations for the Following Definitions

• In order to formulate our result in the most general of terms, let us first describe what “approximate solution” means. In this description, we will follow ideas from [8] and [9].

For intervals, if we cannot produce the exact range y, we would like to produce an enclosure Y that is as close to y as possible. We can thus say that every interval that is an enclosure is a solution to the interval problem and that the desired range itself is an optimal solution.

Range estimation is not the only possible problem of interval computations. For example, in some control problems or design problems, we are, vice versa, interested in the subintervals of the optimal interval y [8], [9]. In this case, an interval Y is a solution iff Y ⊆ y. In some other real-life problems, an even more complicated definition of a solution is possible.

It is reasonable to assume that some numerical measure of “distance” ρ(Y, y) of a solution Y to the optimal solution y is available. For example:

− We may take a number that describes whether the enclosure Y is twice as wide as the desired interval y, or three times as wide, etc. In other words, we may take the ratio of the widths of these two intervals: ρ(Y, y) = w(Y) / w(y). We can also re-scale this measure in such a way that the ideal enclosure will be characterized by 0 distance. One way to accomplish this is to subtract 1 from the ratio. Another way would be to take the logarithm of the ratio.

This definition does not always convey the meaning of approximation:

∗ For example, if the desired interval y consists of a single point, then its width w(y) is zero, and for every Y ⊃ y with w(Y) ≠ 0, we have ρ(Y, y) = ∞. If we get the exact same degenerate interval (Y = y), then the above ratio is not well defined; it is natural to define it as 1 in this case (Y = y and w(y) = 0); then, ρ(Y, y) = w(Y) / w(y) − 1 = 0.


∗ In this case, for any ε > 0, if we want to find an estimate Y that is ε-close to y (i.e., for which ρ(Y, y) ≤ ε), we end up with the task of computing Y = y.

In other words, for the degenerate case w(y) = 0, contrary to our intuitive understanding of the approximation, the ε-approximate problem is identical with the original problem. To remedy this situation, we can consider a different measure of distance:

− As an alternative measure of distance, we may take a number that describes whether the enclosure is 1 unit wider, or two units wider, etc. In other words, take the difference between the widths ρ(Y, y) = w(Y) − w(y) as the measure of distance.

Several other possible measures of “distance” are used in different interval computation problems, and there are other reasonable measures that can, in principle, be used. Some of these measures satisfy all three axioms that define a metric:

− the condition that ρ(g, g′) = 0 iff g = g′;

− symmetry ρ(g, g′) = ρ(g′, g); and

− triangle inequality ρ(g, g′′) ≤ ρ(g, g′) + ρ(g′, g′′).

Other natural measures of “distance” satisfy only some (or even none) of these properties.

For each of these measures, in addition to the problem of finding the exact range y, we can formulate the following ε-approximate problem for every real number ε ≥ 0: find an enclosure Y whose measure of distance from the desired interval y does not exceed ε.

For ε = 0, this is the same problem of finding the exact interval y. For ε > 0, approximate solutions suffice.

For smaller values of ε, we seek a more accurate solution, and therefore, the smaller ε, the more complicated the problem becomes. Conversely, as ε increases, the requirements on the accuracy are relaxed and the problem of finding an ε-approximate solution becomes easier. (A small code sketch of the two distance measures mentioned above is given below.)
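The following minimal sketch (our own illustration, not part of the paper’s formalism) shows the two distance measures for intervals represented as pairs (lower, upper) of floating-point numbers, together with the ε-approximation test; all function names are ours.

    def width(iv):
        # Width of an interval iv = (lower, upper).
        return iv[1] - iv[0]

    def rho_ratio(Y, y):
        # Re-scaled ratio of widths: w(Y) / w(y) - 1, so the ideal enclosure has distance 0.
        # Returns +inf when y is degenerate (zero width) but Y is not.
        if width(y) == 0:
            return 0.0 if width(Y) == 0 else float("inf")
        return width(Y) / width(y) - 1

    def rho_diff(Y, y):
        # Difference of widths: w(Y) - w(y); well-behaved even for a degenerate y.
        return width(Y) - width(y)

    def is_eps_approximate(Y, y, eps, rho=rho_diff):
        # Check whether Y is an eps-approximate solution with respect to the measure rho.
        return rho(Y, y) <= eps

    # Example: the enclosure [0.5, 2.5] of the range [1, 2] is exactly 1 unit wider.
    print(is_eps_approximate((0.5, 2.5), (1.0, 2.0), eps=1.0))   # True
    print(is_eps_approximate((0.5, 2.5), (1.0, 2.0), eps=0.5))   # False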

• One important feature of one-dimensional intervals and reasonable measures of distance (such as difference between the widths) is that:

− if we have two intervals g and g′ that are both solutions to a certain problem,

− then we can combine them into a new interval (we will denote this combined interval by g ∗ g′) that is at least as close to the ideal solution as both g and g′: ρ(g ∗ g′, y) ≤ min(ρ(g, y), ρ(g′, y)).

For example:

− When we are interested in an enclosure, we can take the intersection of g and g′ as g ∗ g′.


− When we are interested in a subinterval, we can take the union of g and g′ as g ∗ g′.

This operation is computationally very simple; namely, it consists of two elementary operations (minimum and maximum), as the following two cases (and the short code sketch after them) show:

− When we are interested in an enclosure, the intersection of g = [g−, g+] and g′ = [g′−, g′+] is g ∗ g′ = [max(g−, g′−), min(g+, g′+)].

− When we are interested in a subinterval, the union of g = [g−, g+] and g′ = [g′−, g′+] is g ∗ g′ = [min(g−, g′−), max(g+, g′+)].
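To illustrate the two cases above, here is a minimal sketch (ours; the paper itself gives no code) of the combination operation ∗ for one-dimensional intervals represented as pairs (lower, upper):

    def combine_enclosures(g, gp):
        # Enclosure case: the intersection of two enclosures is again an enclosure,
        # and it is at least as narrow as each of them.
        return (max(g[0], gp[0]), min(g[1], gp[1]))

    def combine_subintervals(g, gp):
        # Subinterval (inner estimation) case: the interval from the smallest lower
        # endpoint to the largest upper endpoint of two subintervals of y is again
        # a subinterval of y.
        return (min(g[0], gp[0]), max(g[1], gp[1]))

    # Example: combining two enclosures of the same (unknown) range.
    g, gp = (0.0, 3.0), (1.0, 4.0)
    print(combine_enclosures(g, gp))     # (1.0, 3.0) -- tighter than either input
    print(combine_subintervals(g, gp))   # (0.0, 4.0)

As the text notes, each combination uses just one minimum and one maximum, i.e., a constant number of elementary operations in the algebraic-complexity sense.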

Similar operations exist for many other important interval computation problems (in general, they require finitely many computational steps):

− Similar operations exist for multi-dimensional enclosure-type problems, in which the goal is to find the enclosure for an implicitly defined n-dimensional set (e.g., for the solution set of a system of linear equations): if two multi-dimensional intervals

g = [g1−, g1+] × · · · × [gn−, gn+]

and

g′ = [g′1−, g′1+] × · · · × [g′n−, g′n+]

are desired enclosures, then their intersection

[max(g1−, g′1−), min(g1+, g′1+)] × · · · × [max(gn−, g′n−), min(gn+, g′n+)]

is also an enclosure. If we take, e.g., the difference between the volume of the enclosure and the volume of the enclosed set as a measure of distance, then the intersection is at least as close to the ideal solution as each of the intersected intervals g and g′.

− Similar operations exist also for “inner estimation” multi-dimensional problems, in which the problem is to find the interval that is contained in the desired solution set, and the measure of distance is, e.g., the difference between the volume of the desired set and the volume of the interval. In this case, for every two given inner estimates g and g′, we can take the one with the largest volume as g ∗ g′. Due to this choice, the resulting interval g ∗ g′ is at least as close to the ideal solution as each of the intervals g and g′.

• The whole idea of interval computations is to consider only algorithms that provide a guaranteed solution. There are many algorithms in numerical mathematics that result in estimates whose accuracy is not immediately known and still needs to be verified. If an interval computations algorithm produces a result, the accuracy of this result is produced automatically. The fact that the solution has an automatic result verification means that the algorithm not only produces the interval, but that it can be proven that for all possible inputs, the resulting interval (if the algorithm generates such an interval) has the desired property (e.g., that it is an enclosure). This proof (that every interval generated by this algorithm has the desired property) is usually not printed by the algorithm, but this proof exists in the literature (and we can, therefore, easily modify the algorithm so that it will produce this proof as well). If the proof is detailed enough (i.e., if this proof is a sequence of statements with an explanation of how each statement follows from the previous ones), then a computer can easily check in linear time whether this is indeed a correct proof. We only need to check that each transition in this proof is done according to the rules of this particular deductive system.

The proof is about the algorithm; this is a general proof in the sense that it serves all possible input data for this algorithm and thus does not depend on the input. So, for completeness, we can add this proof to the outcome of the algorithm. In other words, we can safely assume that for each problem x, the output y of the interval computations program contains the proof that this algorithm produces the desired interval (and thus, that for this particular input x, the output interval y is indeed an enclosure). We can also include checking this proof as a part of the algorithm. Since the proof of the algorithm’s correctness does not depend on the input, checking whether this proof is really a proof will only add a constant (independent of the input, but, of course, depending on the algorithm) to the total computation time of the algorithm. This addition will not, therefore, change the asymptotics of computation time: a linear-time algorithm will remain linear-time, a quadratic-time algorithm will remain quadratic-time, etc.

• In the majority of the existing computers, all data, programs, texts, etc., are represented by binary numbers (i.e., by sequences of 0 and 1). Therefore, in order to simplify the exposition, we will assume that the input data, the intermediate and output intervals, the proofs, etc. are all represented by binary numbers.

• In our result, we will be talking about the computation time of the algorithm. In theory of computation, the computation time is usually identified with the total number of computational steps performed by the algorithm (see, e.g., [3]). To get the actual computation time, we need to multiply this number by the actual time of performing one step. This simplified definition has good and bad aspects:

− The good side of this definition is that the computation time, when defined in this way, depends only on the algorithm itself and not on the exact hardware that we use.

− The negative side of this definition is that we do not get the real computation times, only computation times within an unknown constant factor: e.g., quadratic time may mean n², may mean 100 ⋅ n², may mean C ⋅ n² for some really large constant factor C. However, this uncertainty is acceptable for our purposes because we will be showing that the resulting sequentially guaranteeing algorithm is no worse than the original finally guaranteeing algorithm only in the asymptotic sense, i.e., also within an unknown constant factor.

This definition also depends on what we call an elementary step. There are two main definitions (see, e.g., [3]):

− In algebraic complexity, every arithmetic operation is considered one step. In this case, e.g., computing the sum or the minimum of two real numbers is one elementary operation.

− In bit complexity, a single operation with bits is considered one elementary step. In this case, adding two n-bit numbers requires at least n steps, because we need to add them bit-by-bit (counting carries, we actually need up to 2n + 1 elementary steps); a small sketch of such bit-by-bit addition is given below.
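As a small illustration of this bit-complexity count (our own sketch, not from the paper), the following ripple-carry addition of two n-bit numbers, given as strings of '0'/'1', performs a constant number of elementary bit operations per bit position, i.e., on the order of n elementary steps in total:

    def add_binary(a, b):
        n = max(len(a), len(b))
        a, b = a.zfill(n), b.zfill(n)      # pad both numbers to n bits
        carry, result = 0, []
        for i in range(n - 1, -1, -1):     # proceed bit by bit, from the last bit
            s = int(a[i]) + int(b[i]) + carry
            result.append(str(s % 2))
            carry = s // 2
        if carry:
            result.append('1')
        return ''.join(reversed(result))

    print(add_binary("1011", "0111"))      # 11 + 7 = 18, i.e., '10010'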

In the following examples:

− we will mainly consider algebraic complexity, and

− we will comment on how our results will change if we use bit-wise complexity instead.

3. Definitions and the Main Result

The following definitions can be viewed as an algorithmic version of definitions presented in [9].

DEFINITION 3.1.

• By a decidable set (of binary sequences), we will mean a set S of finite binary sequences for which there exists an algorithm that for every binary sequence s checks whether s ∈ S or s ∉ S.

• By a decidable subset of a set S, we mean a set T ⊆ S for which there exists an algorithm that for every binary sequence s ∈ S checks whether s ∈ T or s ∉ T.

• We say that an algorithm is constant-time if it takes a finite number of elementary steps.

• By an easy operation ∗ on the set S of binary sequences, we mean a constant-time algorithm that, given any two binary sequences s ∈ S and s′ ∈ S, returns a new sequence belonging to the same set S. We will denote this sequence by s ∗ s′.

Comment. For example, in terms of algebraic complexity, the intersection and union of one-dimensional intervals are easy (constant-time) operations. (From the viewpoint of bit complexity, however, they are linear-time operations.)

DEFINITION 3.2. By a generalized interval problem, we mean a tuple (G, P, E, M, ∗, ρ), where:


G is a decidable set; its elements g, g′, … are called generalized intervals, or simply intervals (for short);

P is a decidable set; its elements are called problems, or instances of the generalized interval problem;

E is a partial function from P to G (i.e., a function from a subset of P to G); the value E(p) is called the optimal solution to the problem p;

M is a relation on G; if gM(E(p)), we say that g is a solution to the problem p;

∗ is an easy operation on G with the property that for every g, g′, and g′′, if gMg′′ and g′Mg′′, then (g ∗ g′)Mg′′;

ρ is a function that is defined on all pairs that satisfy M and results in a number ρ(g, g′) ∈ [0, +∞] (i.e., a non-negative real number or +∞). This number will be called a measure of distance between g and g′, and we require that ρ(g ∗ g′, g′′) ≤ min(ρ(g, g′′), ρ(g′, g′′)). For every real number ε > 0, if ρ(g, E(p)) ≤ ε, we will say that g is an ε-approximate solution to the problem p.

Comment 1. To clarify this definition, let us explain it on the simple example of range estimation for linear polynomials with binary-rational coefficients (a small code sketch of E(p) follows this list):

G: the set of all binary representations of intervals (with binary-rational endpoints);

P: each instance p ∈ P is a description of the coefficients p0, p1, …, pn of the linear function ƒ = p0 + p1x1 + · · · + pnxn and of the input intervals [xi−, xi+];

E: for p = (p0, p1, …, pn, [x1−, x1+], …, [xn−, xn+]), the value E(p) is equal to the range of the function ƒ: E(p) = [ỹ − ∆, ỹ + ∆], where:

ỹ = p0 + ∑ pi ⋅ x̃i;

∆ = ∑ |pi| ⋅ ∆i;

x̃i = (1/2) ⋅ (xi− + xi+), and

∆i = (1/2) ⋅ (xi+ − xi−);

M: gMg′ is equivalent to g ⊇ g′;

∗: an intersection of two intervals;

ρ: e.g., the difference between the widths.
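A minimal sketch of E(p) for this example (assuming a straightforward representation of an instance as a constant term p0, a list of coefficients, and a list of interval endpoints; the paper itself works with binary encodings):

    def linear_range(p0, coeffs, intervals):
        # Exact range of p0 + sum(pi * xi) when each xi runs over intervals[i] = (lo, hi):
        # evaluate at the midpoints and add/subtract the accumulated half-widths.
        midpoints  = [(lo + hi) / 2 for (lo, hi) in intervals]
        halfwidths = [(hi - lo) / 2 for (lo, hi) in intervals]
        y     = p0 + sum(pi * xm for pi, xm in zip(coeffs, midpoints))
        delta = sum(abs(pi) * dx for pi, dx in zip(coeffs, halfwidths))
        return (y - delta, y + delta)

    # Example: f = 1 + 2*x1 - x2 with x1 in [0, 1] and x2 in [1, 3].
    print(linear_range(1.0, [2.0, -1.0], [(0.0, 1.0), (1.0, 3.0)]))   # (-2.0, 2.0)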

Comment 2. In the example given above, every instance p ∈ P of the generalized interval problem has a solution. For more complicated problems, it can happen that no solution exists. This is why in Definition 3.2, we required that the function E is, in general, partial, i.e., it may not be defined for some p ∈ P.

DEFINITION 3.3. By a computer, we will mean a triple (S, C, V), where:

S is a decidable set; its elements are called syntactically correct programs for this computer.

C is an algorithm that can be applied to pairs (s, p) consisting of a syntactically correct s ∈ S and a problem p ∈ P.


The algorithm C is called a compiler. If the algorithm C stops on a pair (s, p), we say that s is applicable to p. The result of applying C to the pair (s, p) will also be called the result of applying s to p and denoted by s(p).

V is a decidable subset of the set S that satisfies the following property:

If an element v ∈ V is applicable to a problem p ∈ P, then the result v(p) of applying v to p is a solution to p (i.e., (v(p))M(E(p))). In this case, we say that v solves p.

Elements of the set V will be called programs with automatic result verification, or interval programs (for short).

Comment. Informally, an algorithm UV that checks whether a given program v belongs to V or not is an algorithm that does two things:

• first, it checks that the program v actually contains a part called its correctness proof, and

• second, it checks (line-by-line) whether the proof presented as a part of v is correct.

This algorithm is usually linear-time.

DENOTATION. By ts(p), we will denote the computation time of a program s on an input p (i.e., the computation time of the compiler C on the pair (s, p)).

Comment. We want to show that there exists, in Shary’s terms, a sequentially guaranteeing algorithm V that can solve interval problems in time that is not much longer than the time of any given finally guaranteeing algorithm. We will design this algorithm by combining the computation steps of different interval algorithms v ∈ V.

THEOREM 3.1. Let a generalized interval problem and a computer be given. Then, there exists an algorithm V with the following three properties:

• first, if V is applicable to a problem p ∈ P, then the result V(p) of this application is a solution to p;

• second, for every problem p ∈ P, if any interval program v is applicable to this problem, then V is also applicable to p;

• third, for every interval program v ∈ V, there exist constants C0 > 0 and C1 > 0 such that for every problem p, if v produces an ε(p)-approximate solution of this problem in time tv(p), then the algorithm V produces an ε(p)-approximate solution to p in time ≤ C0 + C1 ⋅ tv(p).

Comment 1. Informally, this theorem says the following:

• if any interval algorithm solves the ε-approximation problem in linear time, then V also solves this same problem in linear time;


• if any interval algorithm solves the ε-approximation problem in quadratic time, then V also solves this same problem in quadratic time;

• etc.

In other words, if we interrupt V at any time, the accuracy of the resulting approximate solution will not be much worse (in the asymptotic sense) than the accuracy of the approximate solution produced by any interval algorithm that was scheduled to complete by this time.

Comment 2. The theorem itself is based on the assumption that the combination operation ∗ is easy, i.e., that it takes finitely many computational steps. We have already mentioned that for reasonable (generalized) interval problems, whether the corresponding operation is easy or not depends on how we define complexity (i.e., on how we define an elementary step). For example, if we count all bit-wise operations, then the combination is not easy even for one-dimensional intervals: it becomes a linear-time operation.

From the proof of the theorem, one can easily see that

• first, the construction of the algorithm V does not depend on this “easiness” assumption; and

• second, in general, the running time of the algorithm V is bounded by (a constant times) the product of the running time of the given interval program v and the running time of the combination ∗.

For example, if ∗ is a linear-time operation, then:

• if any interval algorithm solves the ε-approximation problem in linear time, then V solves this same problem in quadratic time;

• if any interval algorithm solves the ε-approximation problem in quadratic time, then V also solves this same problem in cubic time;

• etc.

In other words, if ∗ is a linear-time operation, and if we interrupt V at any time, the accuracy of the resulting approximate solution will still not be much worse than the accuracy of the approximate solution produced by any interval algorithm that was scheduled to complete by the slightly longer (quadratic) time.

A Word of Warning. It is possible that for some p ∈ P, the given particular generalized interval problem has no solutions at all: e.g., if we want to describe a non-empty interval that is included in the solution set of a system, and the system is inconsistent, then clearly, no solution is possible. For such particular cases p, the algorithm V will not produce any result at all.

4. Proof

This proof follows the idea of the universal search algorithm proposed by L. Levin [5] (a simplified proof of Levin’s result is presented in [6]).


4.1. FIRST PREPARATORY STEP

First, for every binary sequence s, let us describe the following algorithm p0(s):

• Since S is a decidable set, there exists an algorithm US that checks whether s is a syntactically correct program (i.e., formally, whether s is an element of the set S). So, first, we apply this algorithm US to the given word s:

− If the result of applying this algorithm is “s ∉ S”, we stop p0(s).

− If s ∈ S, we continue.

• Since V is a decidable subset of the set S, there exists an algorithm UV that checks whether a given program s is an interval program (i.e., formally, whether s is an element of the set V). So, as a second part of p0(s), we apply this algorithm UV to the given word s:

− If the result of applying this algorithm is “s ∉ V”, we stop p0(s).

− If s ∈ V , we continue.

• If we have reached this point, this means that s is an interval program. So, as a third part of p0(s), we apply s to the input p.

The resulting algorithm p0(s) consists of two parts that are independent of the input and of the third part in which we actually apply s to the input p. Therefore, the computation time of this algorithm can be bounded as follows: tp0(s)(p) ≤ C(s) + ts(p), where C(s) is the total time of the first two parts of the algorithm p0(s).

4.2. MAIN IDEA OF THE ALGORITHM

We are now ready to construct the desired algorithm V. This algorithm works as follows:

Our goal is to find a solution g to the given problem p. Since we are designing a “sequentially guaranteeing” algorithm, we will try to have some current solution g at each computation step, and to update this solution if necessary.

The algorithm V consists of running different steps of different algorithms p0(s) in a special order (this order is described in [6] as a simplification of the order proposed by Levin in [5]). Some of these algorithms may result in generating a solution to the original problem p.

If none of these algorithms generate a solution, then we simply do nothing and produce no results. However, if one of the algorithms does produce a solution, then, when it happens for the first time, we take the resulting (generalized) interval and keep it as a current solution g. When (and if) it happens the next time, we combine the newly produced solution g′ with the current solution g and keep the combination g ∗ g′ as the new updated value of the current solution: g ← g ∗ g′.
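The following toy sketch (our own simplification, not the construction of the proof) illustrates this idea: candidate programs are modeled as Python generators that may eventually yield an enclosure, their steps are interleaved (here by plain round-robin; the order actually used in the proof is described in Section 4.3), and the current solution is updated by the combination operation whenever a new solution appears.

    import itertools

    def combine(g, gp):
        # Enclosure case: combine two enclosures by intersection.
        return (max(g[0], gp[0]), min(g[1], gp[1]))

    def interruptible_solve(programs, max_steps=1000):
        current = None                        # current (possibly non-optimal) solution
        runs = [prog() for prog in programs]  # start every candidate program
        for _, run in zip(range(max_steps), itertools.cycle(runs)):
            try:
                result = next(run)            # perform one more step of this program
            except StopIteration:
                continue                      # this program has already finished
            if result is not None:            # this step produced a solution
                current = result if current is None else combine(current, result)
            # if we are interrupted here, `current` is already a valid (if coarse) solution
        return current

    # Toy candidates: a quick coarse enclosure and a slower, tighter one.
    def coarse():
        yield (0.0, 10.0)

    def tight():
        for _ in range(50):
            yield None                        # still "computing"
        yield (2.0, 3.0)

    print(interruptible_solve([coarse, tight]))   # (2.0, 3.0)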

To specify the idea, let us describe the order.


4.3. LEVIN’S ORDER

To describe which operation of which algorithm will be performed at a given moment t, we must do the following (a short code sketch follows this list):

• Expand t into the binary code;

• Starting from the last bit of this expansion (the bit that corresponds to ones), count the number N of consecutive 1’s. For example:

− the binary number 100 has no 1’s at the end, so N = 0;

− the binary number 10111 has 3 consecutive 1’s at the end, so N = 3.

• Cut the 1’s and the 0 right before them (if any) from the binary expansion of the number t. What is left will be the binary code of the local time loc of the program that runs at time t. For example:

− the binary number 100 has no 1’s at the end, so we only cut the 0, and get loc = 10₂ = 2₁₀;

− the binary number 10111 has 3 consecutive 1’s at the end, so when we cut them and the 0, we get loc = 1₂ = 1₁₀.

• To get the program s, we add 1 to the binary expansion of N and cut the first 1 from the left in the resulting binary number. For example:

− For N = 0, we have N + 1 = 1₁₀ = 1₂. When we delete the leftmost 1, we get an empty string: s = Λ.

− For N = 3, we have N + 1 = 4₁₀ = 100₂. When we delete the first 1 from the left, we get s = 00.
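The decoding just described can be written down directly; the following short sketch (our transcription) maps a moment of time t to the pair (s, loc), with the empty string standing for Λ:

    def decode(t):
        b = bin(t)[2:]                          # binary expansion of t
        n = len(b) - len(b.rstrip('1'))         # N = number of trailing 1's
        rest = b[:len(b) - n]                   # cut the trailing 1's ...
        if rest.endswith('0'):
            rest = rest[:-1]                    # ... and the 0 right before them (if any)
        loc = int(rest, 2) if rest else 0       # what is left is the local time loc
        s = bin(n + 1)[2:][1:]                  # N + 1 in binary, with the leading 1 cut
        return s, loc

    for t in (0, 1, 2, 3, 7, 15):
        print(t, decode(t))                     # reproduces the corresponding rows of Table 1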

This algorithm describes the following idea. We run:

• program p0(Λ) every other step;

• program p0(0) every other step of the remaining unused steps;

• program p0(1) every other step of the remaining unused steps;

• …

To illustrate this algorithm, let us describe which steps of which algorithms are run during moments of time 0 through 15 (see Table 1).

This order can be easily reversed. Namely, if we want to know when step number loc of the program p0(s) will be performed, we must do the following:

• Add a 1 to the front of the binary sequence s and read the result as a binary number M.

• Subtract 1 from M, getting N = M − 1.

• Expand loc into the binary code.

• The binary expansion of the desired moment of time t will be as follows:

− the binary expansion of loc;

− followed by 0;

− followed by N 1’s.

Table 1.

time t    t in binary    N    loc    s
   0           0         0     0     Λ
   1           1         1     0     0
   2          10         0     1     Λ
   3          11         2     0     1
   4         100         0     2     Λ
   5         101         1     1     0
   6         110         0     3     Λ
   7         111         3     0     00
   8        1000         0     4     Λ
   9        1001         1     2     0
  10        1010         0     5     Λ
  11        1011         2     1     1
  12        1100         0     6     Λ
  13        1101         1     3     0
  14        1110         0     7     Λ
  15        1111         4     0     01

For example, if we are interested in the moment of time t when program p0(01) (with s = 01) is performing its step loc = 0, we must do the following:

• Add a 1 to the front of the binary sequence 01 and read the result as a binary number M = 101₂.

• Subtract 1 from M, getting N = M − 1 = 100₂ = 4₁₀.

• Expand loc = 0 into the binary code: 0₁₀ = 0₂.

• The binary expansion of the desired moment of time t will be as follows:

− the binary expansion of loc, i.e., 0;

− followed by 0;

− followed by N = 4 1’s.

In other words, we get 001111₂ = 15₁₀.
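The reverse direction can be written down in the same way; this short sketch (again our transcription) recovers the worked example above:

    def encode(s, loc):
        m = int('1' + s, 2)                     # add a 1 in front of s, read as a binary number M
        n = m - 1                               # N = M - 1
        return int(bin(loc)[2:] + '0' + '1' * n, 2)

    print(encode('01', 0))                      # 15, as in the example above
    print(encode('', 2))                        # 4: step loc = 2 of p0(Lambda) runs at time t = 4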

4.4. THE PROOF ITSELF

• The first statement of the theorem says that if the algorithm V produces some result, then this result is a solution. This statement easily follows from the following two facts:

− First, we “screen” the possible programs s so that only interval programs are allowed; interval programs, by definition, always lead to solutions.


− Second, we combine the solutions by using a combination operation ∗; according to Definition 3.2, a combination g ∗ g′ of two solutions g and g′ is always a solution.

• Let us now prove the desired bounds on computation time. According to the above construction, step number loc of the program p0(s) will be implemented by V at the moment of time t = loc ⋅ 2^M + 1…1 (N times), i.e., t = loc ⋅ 2^M + (2^N − 1). Therefore, if the program p0(s) finishes its computations at a time tp0(s)(p), then this same result is produced by V at a moment of time

t ≤ tp0(s)(p) ⋅ 2^M + 1…1.

If we take c1 = 2^M and c0 = 1…1 (= 2^N − 1), we get the same solution in time t ≤ c0 + c1 ⋅ tp0(s)(p).

Each of these t steps can also contain a combination operation ∗. We assumed that a combination operation takes constant time, i.e., its running time is bounded by some constant C∗. Therefore, each of the t steps takes ≤ 1 + C∗ moments of time. Hence, all these operations take time

T ≤ (C∗ + 1) ⋅ t ≤ (C∗ + 1) ⋅ (c0 + c1 ⋅ tp0(s)(p)).

Finally, tp0(s)(p) ≤ C(s) + ts(p); hence,

T ≤ (C∗ + 1) ⋅ (c0 + c1 ⋅ (C(s) + ts(p))).

In other words, we can take C0 = (C∗ + 1) ⋅ (c0 + c1 ⋅ C(s)) and C1 = (C∗ + 1) ⋅ c1.

• Finally, let us show that the solution produced by V is as accurate as the solution produced by s.

Indeed, the actual output g of the algorithm V is not necessarily equal to the output s(p) of the program s: it could happen that g is the result of combining s(p) with some previous solution. However, according to Definition 3.2, the combination operation does not worsen the accuracy of the solution. So, if s(p) is ε-accurate, then the generalized interval g = s(p) ∗ … that is obtained by combining s(p) with some other generalized interval (or intervals) is also ε-accurate.

The theorem is proven. □

A Word of Warning. The fact that this algorithm is not very practical follows from the fact that the constant C1 is proportional to 2^M, where M is the binary number that is obtained if we read the binary code describing the program s as an integer. For a program of length 100, we get M ≈ 2^100, so 2^M is absolutely not realistic.

Acknowledgements

This work was partially supported by NSF Grant No. EEC-9322370, NASA Grants No. NAG 9-757 and NCCW-0089, and by a grant from the German Science Foundation.


We are thankful to Luc Longpre, Alexander Semenov, Sergey Shary, and Yuri Shokin for important discussions, and to the anonymous referees for the important suggestions.

One of the authors (V.K.) is greatly thankful to Leonid Levin who, in 1972, gave a very clear and very exciting presentation of his universal search theorem in St. Petersburg, Russia, and to Nikolai Alexandrovich Shanin, who, during this presentation, emphasized the fact that Levin’s universal search algorithm is not yet a practical result.

References

1. Gaganov, A. A.: Computational Complexity of the Range of the Polynomial in Several Variables, Leningrad University, Math. Department, M.S. Thesis, 1981 (in Russian).

2. Gaganov, A. A.: Computational Complexity of the Range of the Polynomial in Several Variables, Cybernetics (1985), pp. 418–421.

3. Garey, M. R. and Johnson, D. S.: Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, San Francisco, 1979.

4. Kreinovich, V., Lakeyev, A., and Rohn, J.: Computational Complexity of Interval Algebraic Problems: Some Are Feasible and Some Are Computationally Intractable—A Survey, in: Alefeld, G., Frommer, A., and Lang, B. (eds), Scientific Computing and Validated Numerics, Akademie-Verlag, Berlin, 1996, pp. 296–306.

5. Levin, L. A.: Universal Sequential Search Problems, Problems of Information Transmission 9 (3) (1973), pp. 265–266.

6. Li, M. and Vitanyi, P.: An Introduction to Kolmogorov Complexity and Its Applications, Springer-Verlag, N.Y., 1993.

7. Shary, S. P.: A New Class of Algorithms for Optimal Solution of Interval Linear Systems, Interval Computations 2 (4) (1992), pp. 18–29.

8. Shary, S. P.: On Optimal Solution of Interval Linear Equations, SIAM J. Numer. Anal. 32 (1995), pp. 610–630.

9. Shokin, Yu. I.: On Interval Problems, Interval Algorithms and Their Computational Complexity, in: Alefeld, G., Frommer, A., and Lang, B. (eds), Scientific Computing and Validated Numerics, Akademie-Verlag, Berlin, 1996, pp. 314–328.

Note added in proof

After this paper was written, the authors learned that a similar concept of interruptible algorithms exists in Artificial Intelligence (AI), under the name of anytime algorithms (see, e.g., ACM SIGART Bulletin 7 (2) (1996), and references therein). Although the AI definitions and techniques are somewhat different, it may be advantageous to relate these two ideas.