1
Approximation Algorithms
2
Motivation
• By now we’ve seen many NP-Complete problems.
• We conjecture that none of them has a polynomial-time algorithm.
3
Motivation
• Is this a dead-end? Should we give up altogether?
4
Motivation
• Or maybe we can settle for good approximation algorithms?
5
Introduction
• Objectives:– To formalize the notion of
approximation.– To demonstrate several such
algorithms.• Overview:
– Optimization and Approximation– VERTEX-COVER, SET-COVER
6
Optimization
• Many of the problems we’ve encountered so far are really optimization problems.
• I.e., the task can be naturally rephrased as finding a maximal/minimal solution.
• For example: finding a maximal clique in a graph.
7
Approximation
• An algorithm which returns an answer C which is “close” to the optimal solution C* is called an approximation algorithm.
• “Closeness” is usually measured by the ratio bound ρ(n) the algorithm produces.
• ρ(n) is a function that satisfies, for any input size n, max{C/C*, C*/C} ≤ ρ(n).
8
VERTEX-COVER
• Instance: an undirected graph G=(V,E).
• Problem: find a set C ⊆ V of minimal size s.t. for any (u,v) ∈ E, either u ∈ C or v ∈ C.
Example:
9
Minimum VC is NP-hard
Proof: It is enough to show the decision problem below is NP-Complete:
• Instance: an undirected graph G=(V,E) and a number k.
• Problem: to decide if there exists a set V’ ⊆ V of size k s.t. for any (u,v) ∈ E, u ∈ V’ or v ∈ V’.
This follows immediately from the following observation.
10
Minimum VC is NP-hard
Observation: Let G=(V,E) be an undirected graph. The complement V\C of a vertex-cover C is an independent set of G.
Proof: Two vertices outside a vertex-cover cannot be connected by an edge.
11
VC - Approximation Algorithm
• C ← ∅
• E’ ← E
• while E’ ≠ ∅ do
– let (u,v) be an arbitrary edge of E’
– C ← C ∪ {u,v}
– remove from E’ every edge incident to either u or v
• return C
COR(B) 523-524
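A minimal Python sketch of this 2-approximation, assuming the graph is given as a plain list of edges (scanning the edges and skipping already-covered ones plays the role of the E’ bookkeeping above):

def approx_vertex_cover(edges):
    # edges: iterable of (u, v) pairs of an undirected graph.
    # Returns a set of vertices covering every edge (at most twice the optimum).
    cover = set()
    for u, v in edges:
        if u in cover or v in cover:
            continue          # this edge is already covered
        cover.add(u)          # take both endpoints of an uncovered edge
        cover.add(v)
    return cover

# Example: a path on 4 vertices; the optimal cover has size 2, we may return 4 vertices.
print(approx_vertex_cover([(1, 2), (2, 3), (3, 4)]))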
12
Demo
Compare this cover to the one from the example
13
Polynomial Time
• C ← ∅
• E’ ← E
• while E’ ≠ ∅ do
– let (u,v) be an arbitrary edge of E’
– C ← C ∪ {u,v}
– remove from E’ every edge incident to either u or v
• return C
Costs: initializing E’ is O(n²); choosing an edge is O(1); removing the edges incident to u or v is O(n); overall O(n²).
14
Correctness
The set of vertices our algorithm returns is clearly a vertex-cover, since we iterate until every edge is covered.
15
How Good an Approximation is it?
Observe the set of edges our algorithm chooses:
• no two of them share a common vertex,
• any vertex cover must contain at least one endpoint of each of them,
• our cover contains both endpoints, hence it is at most twice as large as the optimum.
16
The Traveling Salesman Problem
17
The Mission: A Tour Around the World
18
The Problem: Traveling Costs Money
19
Introduction
• Objectives:
– To explore the Traveling Salesman Problem.
• Overview:
– TSP: Formal definition & examples
– TSP is NP-hard
– Approximation algorithm for special cases
– Inapproximability result
20
TSP
• Instance: a complete weighted undirected graph G=(V,E) (all weights are non-negative).
• Problem: to find a Hamiltonian cycle of minimal cost.
21
Polynomial Algorithm for TSP?
What about the greedy strategy: at any point, choose the closest vertex not explored yet?
22
The Greedy $trategy Fails
23
The Greedy $trategy Fails
24
TSP is NP-hard
The corresponding decision problem:
• Instance: a complete weighted undirected graph G=(V,E) and a number k.
• Problem: to decide whether there exists a Hamiltonian cycle whose cost is at most k.
25
TSP is NP-hard
Theorem: HAM-CYCLE ≤p TSP.
Proof: By the straightforward efficient reduction illustrated below:
HAM-CYCLE → TSP: every edge of G gets weight 0, every non-edge gets weight 1, and the bound is set to k=0.
verify!
26
What Next?
• We’ll show an approximation algorithm for TSP,
• which yields a ratio-bound of 2
• for cost functions which satisfy a certain property.
27
The Triangle Inequality
Definition: We’ll say the cost function c satisfies the triangle inequality if
∀ u,v,w ∈ V : c(u,v) + c(v,w) ≥ c(u,w)
28
Approximation Algorithm
1. Grow a Minimum Spanning Tree (MST) for G.
2. Return the cycle resulting from a preorder walk on that tree.
COR(B) 525-527
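A minimal Python sketch of this algorithm, assuming the metric is given as a symmetric distance matrix satisfying the triangle inequality; it grows the MST with Prim’s algorithm and returns the preorder walk (the function name tsp_two_approx is illustrative):

def tsp_two_approx(dist):
    # dist: symmetric matrix of non-negative weights obeying the triangle inequality.
    # Returns a Hamiltonian cycle (list of vertex indices) of cost at most twice the optimum.
    n = len(dist)
    # 1. Grow a minimum spanning tree rooted at vertex 0 (Prim's algorithm).
    in_tree = [False] * n
    parent = [0] * n
    best = [float("inf")] * n
    best[0] = 0
    children = [[] for _ in range(n)]
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if u != 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v], parent[v] = dist[u][v], u
    # 2. Return the cycle resulting from a preorder walk on that tree.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour + [0]   # close the cycle

# Example: 4 points on a line (a metric); prints a tour starting and ending at 0.
d = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
print(tsp_two_approx(d))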
29
Demonstration and Analysis
The cost of a minimal Hamiltonian cycle is at least the cost of an MST (removing one edge from the cycle yields a spanning tree).
30
Demonstration and Analysis
The cost of a preorder walk is twice the cost of
the tree
31
Demonstration and Analysis
Due to the triangle inequality, shortcutting the preorder walk into a Hamiltonian cycle does not make it worse.
32
What About the General Case?
• We’ll show TSP cannot be approximated within any constant factor ρ ≥ 1,
• by showing the corresponding gap version is NP-hard.
COR(B) 528
33
gap-TSP[ρ]
• Instance: a complete weighted undirected graph G=(V,E).
• Problem: to distinguish between the following two cases:
– There exists a Hamiltonian cycle whose cost is at most |V|.
– The cost of every Hamiltonian cycle is more than ρ·|V|.
34
Instances
[Figure: possible instances arranged along a “min cost” axis, with |V| and ρ·|V| marked.]
35
What Should an Algorithm for gap-TSP Return?
If the minimum cycle cost is at most |V| the algorithm must return YES; if it is more than ρ·|V| it must return NO; inside the gap between |V| and ρ·|V| we DON’T-CARE.
36
gap-TSP & Approximation
Observation: An efficient approximation of factor ρ for TSP implies an efficient algorithm for gap-TSP[ρ].
37
gap-TSP is NP-hard
Theorem: For any constant ρ ≥ 1, HAM-CYCLE ≤p gap-TSP[ρ].
Proof Idea: Edges from G cost 1. Other edges cost much more.
38
The Reduction Illustrated
HAM-CYCLE → gap-TSP: every edge of G gets weight 1, every non-edge gets weight ρ·|V|+1.
Verify (a) correctness (b) efficiency.
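A small Python sketch of this reduction, assuming the input graph is given by its vertex count and edge list (the name ham_cycle_to_gap_tsp and its parameters are illustrative):

def ham_cycle_to_gap_tsp(n, edges, rho):
    # Builds the weight matrix of a complete graph on vertices 0..n-1:
    # edges of G cost 1, non-edges cost rho*n + 1.
    edge_set = {frozenset(e) for e in edges}
    heavy = rho * n + 1
    w = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if u != v:
                w[u][v] = 1 if frozenset((u, v)) in edge_set else heavy
    return w

# If G has a Hamiltonian cycle, the instance has a tour of cost exactly n = |V|;
# otherwise every tour uses at least one heavy edge and costs more than rho*n.
print(ham_cycle_to_gap_tsp(4, [(0, 1), (1, 2), (2, 3), (3, 0)], rho=2))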
39
Approximating TSP is NP-hard
gap-TSP[ρ] is NP-hard
⇒ approximating TSP within factor ρ is NP-hard
40
Summary
• We’ve studied the Traveling Salesman Problem (TSP).
• We’ve seen it is NP-hard.
• Nevertheless, when the cost function satisfies the triangle inequality, there exists an approximation algorithm with ratio-bound 2.
41
Summary
• For the general case we’ve proven there is probably no efficient approximation algorithm for TSP.
• Moreover, we’ve demonstrated a generic method for showing approximation problems are NP-hard.
42
SET-COVER
• Instance: a finite set X and a family F of subsets of X, such that X = ∪S∈F S.
• Problem: to find a subfamily C ⊆ F of minimal size which covers X, i.e. ∪S∈C S = X.
43
SET-COVER: Example
44
SET-COVER is NP-Hard
Proof: Consider the corresponding decision problem.
• Clearly, it’s in NP (check!).
• We’ll sketch a reduction from (decision) VERTEX-COVER to it:
45
VERTEX-COVER ≤p SET-COVER
VC → SC:
• one element for every edge;
• one set for every vertex, containing the edges it covers.
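A small Python sketch of this reduction, assuming the graph is given as a vertex list and an edge list (the names X and F follow the SET-COVER notation above):

def vc_to_set_cover(vertices, edges):
    # Ground set X: one element per edge.
    # Family F: for each vertex v, the set of edges incident to v.
    X = {frozenset(e) for e in edges}
    F = {v: {e for e in X if v in e} for v in vertices}
    return X, F

# A vertex cover of size k corresponds exactly to a set cover of size k.
X, F = vc_to_set_cover([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
print(F[2])   # the edges covered by vertex 2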
46
Greedy Algorithm
• C ← ∅
• U ← X
• while U ≠ ∅ do
– select an S ∈ F that maximizes |S ∩ U|
– C ← C ∪ {S}
– U ← U - S
• return C
Each iteration costs O(|F|·|X|); the number of iterations is at most min{|X|, |F|}.
COR(B) 530-533
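A minimal Python sketch of the greedy algorithm above, assuming the family F is given as a dictionary from set names to sets and that F indeed covers X:

def greedy_set_cover(X, F):
    # Repeatedly pick the set covering the most still-uncovered elements.
    uncovered = set(X)
    chosen = []
    while uncovered:
        best = max(F, key=lambda name: len(F[name] & uncovered))
        chosen.append(best)
        uncovered -= F[best]
    return chosen

X = {1, 2, 3, 4, 5, 6}
F = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6}, "D": {1, 4}}
print(greedy_set_cover(X, F))   # e.g. ['A', 'C']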
47
Demonstration
Compare to the optimal cover.
48
Is Being Greedy Worthwhile? How Do We Proceed From Here?
• We can easily bound the approximation ratio by log n.
• A more careful analysis yields a tight bound of ln n.
49
Loose Ratio-Bound
Claim: If there exists a cover of size k, then after k iterations the algorithm has covered at least ½ of the n elements.
Suppose it doesn’t, and observe the situation after k iterations: more than ½ of the n elements are still uncovered.
50
Loose Ratio-Bound
Claim: If there exists a cover of size k, then after k iterations the algorithm has covered at least ½ of the n elements.
Since this uncovered part (more than ½ of the n elements) can also be covered by k sets...
51
Loose Ratio-Bound
Claim: If there exists a cover of size k, then after k iterations the algorithm has covered at least ½ of the n elements.
...there must be a set not chosen yet that covers at least (½n)·(1/k) of the still-uncovered elements.
52
Loose Ratio-Bound
Claim: If there exists a cover of size k, then after k iterations the algorithm has covered at least ½ of the n elements.
Thus in each of the k iterations we’ve covered at least (½n)·(1/k) new elements, i.e. at least ½n elements in total, and the claim is proven!
53
Loose Ratio-Bound
Claim: If there exists a cover of size k, then after k iterations the algorithm has covered at least ½ of the elements.
Therefore after k·log n iterations (i.e., after choosing k·log n sets) all n elements must be covered, and the bound is proved.
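A compact restatement of the argument (a LaTeX sketch; u_i denotes the number of still-uncovered elements after i iterations, a name not used on the slides, and the claim is applied repeatedly to the remaining elements, which are also coverable by k sets):

\[
u_{(j+1)k} \le \tfrac{1}{2}\, u_{jk}
\quad\Longrightarrow\quad
u_{jk} \le \frac{n}{2^{j}} < 1 \ \text{ for } j > \log_2 n ,
\]

so at most k(⌊log₂ n⌋ + 1) sets are chosen, i.e. the greedy cover is within an O(log n) factor of the optimal size k.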
54
Tight Ratio-Bound
Claim: The greedy algorithm approximates the optimal set-cover within factor
H(max{ |S| : S ∈ F })
where H(d) is the d-th harmonic number:
H(d) := Σ_{i=1}^{d} 1/i
55
Tight Ratio-Bound
H(n) = 1 + 1/2 + ... + 1/n = Σ_{k=1}^{n} 1/k ≤ 1 + ∫₁ⁿ (1/x) dx = ln n + 1
56
Claim’s Proof
• Whenever the algorithm chooses a set, charge it 1.
• Split this cost between all the newly covered elements (e.g., a set covering five new elements charges 0.2 to each).
57
Analysis
• That is, we charge every element x ∈ X with
c_x := 1 / |S_i - (S_1 ∪ ... ∪ S_{i-1})|
• where S_i is the first set which covers x.
58
Lemma
Lemma: For every S ∈ F,   Σ_{x∈S} c_x ≤ H(|S|).
Proof: Fix an S ∈ F. For any i, define
u_i := |S - (S_1 ∪ ... ∪ S_i)|
(the number of members of S left uncovered after i iterations).
Let k be the smallest index for which u_k = 0.
For 1 ≤ i ≤ k: S_i covers u_{i-1} - u_i elements from S.
59
Lemma
Σ_{x∈S} c_x = Σ_{i=1}^{k} (u_{i-1} - u_i) · 1/|S_i - (S_1 ∪ ... ∪ S_{i-1})|
(since for any 1 ≤ i ≤ k, S_i covers u_{i-1} - u_i elements of S, each charged 1/|S_i - (S_1 ∪ ... ∪ S_{i-1})|)
≤ Σ_{i=1}^{k} (u_{i-1} - u_i) · 1/u_{i-1}
(our greedy strategy promises that S_i (1 ≤ i ≤ k) covers at least as many new elements as S does, namely at least u_{i-1})
≤ Σ_{i=1}^{k} (H(u_{i-1}) - H(u_i))
(for any b > a in ℕ, H(b) - H(a) = 1/(a+1) + ... + 1/b ≥ (b-a) · 1/b)
= H(u_0) - H(u_k)   (a telescopic sum)
= H(|S|) - H(0) = H(|S|)   (u_0 = |S|, u_k = 0, H(0) = 0)
60
Analysis
Now we can finally complete our analysis:
Now we can finally complete our analysis:
|C| = Σ_{x∈X} c_x ≤ Σ_{S∈C*} Σ_{x∈S} c_x ≤ |C*| · H(max{ |S| : S ∈ F })
61
Summary
• As it turns out, we can sometimes find efficient approximation algorithms for NP-hard problems.
• We’ve seen two such algorithms:
– for VERTEX-COVER (factor 2)
– for SET-COVER (logarithmic factor).
62
The Subset Sum Problem
• Problem definition
– Given a finite set S and a target t, find a subset S’ ⊆ S whose elements sum to t.
• All possible sums
– S = {x1, x2, .., xn}
– Li = set of all possible sums of subsets of {x1, x2, .., xi}
• Example
– S = {1, 4, 5}
– L1 = {0, 1}
– L2 = {0, 1, 4, 5} = L1 ∪ (L1 + x2)
– L3 = {0, 1, 4, 5, 6, 9, 10} = L2 ∪ (L2 + x3)
• Li = Li-1 ∪ (Li-1 + xi)
63
Subset Sum, revisited:
• Given a set S of numbers, find a subset S’ that adds up to some target number t.
• To find the largest possible sum that doesn’t exceed t:
T = {0};
for each x in S {
T = union(T, x+T);
remove elements from T that exceed t;
}
return largest element in T;
• (Aside: How should we implement T?)
x + T adds x to each element in the set T
Potential doubling at each step
Complexity O(2ⁿ)
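A minimal Python sketch of this exact, worst-case exponential procedure, keeping the set of achievable sums explicitly:

def exact_subset_sum(S, t):
    # Builds L_i = L_{i-1} ∪ (L_{i-1} + x_i), dropping sums above t.
    # Returns the largest achievable subset sum that does not exceed t.
    T = {0}
    for x in S:
        T = T | {y + x for y in T}       # union(T, x + T)
        T = {y for y in T if y <= t}     # remove elements that exceed t
    return max(T)

print(exact_subset_sum([1, 4, 5], 8))    # -> 6 (= 1 + 5)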
64
Trimming:
• To reduce the size of the set T at each stage, we apply a trimming process.
• For example, if y and z are consecutive elements and (1-δ)·y ≤ z < y, then remove y (it is represented by z).
• If δ = 0.1: {10, 11, 12, 15, 20, 21, 22, 23, 24, 29} → {10, 12, 15, 20, 23, 29}
65
Subset Sum with Trimming:
• Incorporate trimming in the previous algorithm:
T = {0};
for each x in S {
    T = union(T, x+T);
    T = trim(δ, T);
    remove elements from T that exceed t;
}
return largest element in T;
• Trimming only eliminates values, it doesn’t create new ones. So the final result is still the sum of a subset of S that does not exceed t.
(trimming parameter δ, 0 < δ < 1/n)
66
• At each stage, values in the trimmed T are within a factor somewhere between (1-δ) and 1 of the corresponding values in the untrimmed T.
• The final result (after n iterations) is within a factor somewhere between (1-δ)ⁿ and 1 of the result produced by the original algorithm.
67
• After trimming, the ratio between successive elements in T is at least 1/(1-δ), and all of the values are between 0 and t.
• Hence the maximum number of elements in T is log1/(1-δ) t ≈ (log t)/δ.
• This is enough to give us a polynomial bound on the running time of the algorithm.
68
Subset Sum – Trim
• Want to reduce the size of a list by “trimming”
– L: the original list
– L’: the list after trimming L
– d: trimming parameter, d ∈ [0, 1]
– y: an element that is removed from L
– z: the corresponding (representing) element in L’ (also in L)
– (y-z)/y ≤ d
– (1-d)·y ≤ z ≤ y
• Example
– L = {10, 11, 12, 15, 20, 21, 22, 23, 24, 29}
– d = 0.1
– L’ = {10, 12, 15, 20, 23, 29}
– 11 is represented by 10: (11-10)/11 ≤ 0.1
– 21, 22 are represented by 20: (21-20)/21 ≤ 0.1, (22-20)/22 ≤ 0.1
– 24 is represented by 23: (24-23)/24 ≤ 0.1
69
Subset Sum – Trim (2)
• Trim(L, d)   // L: y1, y2, .., ym, in increasing order
1. L’ = {y1}
2. last = y1   // most recent element z in L’ which represents elements in L
3. for i = 2 to m do
4.     if last < (1-d)·yi then   // i.e. (1-d)·yi > last
5.         // yi is appended onto L’ when it cannot be represented by last
6.         append yi onto the end of L’
7.         last = yi
8. return L’
• Example
– L = {10, 11, 12, 15, 20, 21, 22, 23, 24, 29}
– d = 0.1
– L’ = {10, 12, 15, 20, 23, 29}
• O(m)
70
Subset Sum – Approximate Algorithm
• Approx_subset_sum(S, t, e)   // S = x1, x2, .., xn
1. L0 = {0}
2. for i = 1 to n do
3.     Li = Li-1 ∪ (Li-1 + xi)
4.     Li = Trim(Li, e/n)
5.     remove elements that are greater than t from Li
6. return the largest element in Ln
• Example
– S = {104, 102, 201, 101}, t = 308, e = 0.20, d = e/n = 0.05
– L0 = {0}
– L1 = {0, 104}
– L2 = {0, 102, 104, 206}
• After trimming 104: L2 = {0, 102, 206}
– L3 = {0, 102, 201, 206, 303, 407}
• After trimming 206: L3 = {0, 102, 201, 303, 407}
• After removing 407: L3 = {0, 102, 201, 303}
– L4 = {0, 101, 102, 201, 203, 302, 303, 404}
• After trimming 102, 203, 303: L4 = {0, 101, 201, 302, 404}
• After removing 404: L4 = {0, 101, 201, 302}
– Return 302 (= 201 + 101)
• The optimal answer is 104 + 102 + 101 = 307.
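A compact Python sketch of Trim and Approx_subset_sum as described above, assuming S is a list of positive integers; on the worked example it returns 302:

def trim(L, d):
    # L is sorted; drop y whenever the last kept element z satisfies (1-d)*y <= z,
    # i.e. z already represents y.
    kept = [L[0]]
    last = L[0]
    for y in L[1:]:
        if last < (1 - d) * y:    # y cannot be represented by `last`
            kept.append(y)
            last = y
    return kept

def approx_subset_sum(S, t, e):
    # Returns a subset sum z with (1-e) * optimum <= z <= t.
    n = len(S)
    L = [0]
    for x in S:
        L = sorted(set(L) | {y + x for y in L})   # L_i = L_{i-1} ∪ (L_{i-1} + x_i)
        L = trim(L, e / n)
        L = [y for y in L if y <= t]              # drop sums above t
    return max(L)

print(approx_subset_sum([104, 102, 201, 101], t=308, e=0.20))   # -> 302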
71
Subset Sum - Correctness
• The approximate solution C is not smaller than (1-e) times the optimal solution C*
– i.e., (1-e)·C* ≤ C
• Proof
– For every element y in L there is a z in L’ such that
• (1-e/n)·y ≤ z ≤ y
– For every element y in Li there is a z in Li’ such that
• (1-e/n)^i · y ≤ z ≤ y
– If y* is the optimal solution in Ln, then there is a corresponding z in Ln’ with
• (1-e/n)^n · y* ≤ z ≤ y*
– Since (1-e) ≤ (1-e/n)^n   [ (1-e/n)^n is increasing in n ]
• (1-e)·y* ≤ (1-e/n)^n · y* ≤ z
• (1-e)·y* ≤ z
– So the value z returned is not smaller than (1-e) times the optimal value y*.
72
Subset Sum – Correctness (2)
• The approximation algorithm is fully polynomial.
• Proof
– Successive elements z’ < z in Li’ must satisfy z/z’ ≥ 1/(1-e/n),
• i.e., they differ by a factor of at least 1/(1-e/n).
– The number of elements in each Li is therefore at most
• log1/(1-e/n) t   [ t is the largest value ]
• = ln t / (-ln(1-e/n)) ≤ (ln t)/(e/n)   [Eq. 2.10: x/(1+x) ≤ ln(1+x) ≤ x, for x > -1]
• = (n ln t)/e
– So the length of each Li is polynomial,
– and the running time of the algorithm is polynomial.
73
Summary:
• Not all problems are computable.
• Some problems can be solved in polynomial time (P).
• Some problems can be verified in polynomial time (NP).
• Nobody knows whether P = NP.
• But the existence of NP-complete problems is often taken as an indication that P ≠ NP.
• In the meantime, we use approximation to find “good-enough” solutions to hard problems.
74
What’s Next?
• But where can we draw the line?
• Does every NP-hard problem have an approximation?
• And to within which factor?
• Can approximation be NP-hard as well?