1
Approximation Algorithms
2
Motivation
• By now we’ve seen many NP-Complete problems.
• We conjecture that none of them has a polynomial-time algorithm.
3
Motivation
• Is this a dead-end? Should we give up altogether?
4
Motivation
• Or maybe we can settle for good approximation algorithms?
5
Introduction
• Objectives:– To formalize the notion of
approximation.– To demonstrate several such
algorithms.• Overview:
– Optimization and Approximation– VERTEX-COVER, SET-COVER
6
Optimization
• Many of the problems we’ve encountered so far are really optimization problems.
• I.e., the task can be naturally rephrased as finding a maximal/minimal solution.
• For example: finding a maximal clique in a graph.
7
Approximation
• An algorithm which returns an answer C which is “close” to the optimal solution C* is called an approximation algorithm.
• “Closeness” is usually measured by the ratio bound ρ(n) the algorithm produces.
• ρ(n) is a function that satisfies, for any input size n, max{C/C*, C*/C} ≤ ρ(n).
8
VERTEX-COVER
• Instance: an undirected graph G=(V,E).
• Problem: find a set C ⊆ V of minimal size s.t. for any (u,v) ∈ E, either u ∈ C or v ∈ C.
Example:
9
Minimum VC is NP-hard
Proof: It is enough to show the decision problem below is NP-Complete:
• Instance: an undirected graph G=(V,E) and a number k.
• Problem: to decide if there exists a set V’ ⊆ V of size k s.t. for any (u,v) ∈ E, u ∈ V’ or v ∈ V’.
This follows immediately from the following observation.
10
Minimum VC is NP-hard
Observation: Let G=(V,E) be an undirected graph. The complement V\C of a vertex-cover C is an independent set of G.
Proof: Two vertices outside a vertex-cover cannot be connected by an edge.
11
VC - Approximation Algorithm
• C ← ∅
• E’ ← E
• while E’ ≠ ∅ do
– let (u,v) be an arbitrary edge of E’
– C ← C ∪ {u,v}
– remove from E’ every edge incident to either u or v
• return C
COR(B) 523-524
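A minimal Python sketch of this 2-approximation, assuming the graph is given as a plain list of edges (scanning the edges and skipping already-covered ones plays the role of the E’ bookkeeping above):

def approx_vertex_cover(edges):
    # edges: iterable of (u, v) pairs of an undirected graph.
    # Returns a set of vertices covering every edge (at most twice the optimum).
    cover = set()
    for u, v in edges:
        if u in cover or v in cover:
            continue          # this edge is already covered
        cover.add(u)          # take both endpoints of an uncovered edge
        cover.add(v)
    return cover

# Example: a path on 4 vertices; the optimal cover has size 2, we may return 4 vertices.
print(approx_vertex_cover([(1, 2), (2, 3), (3, 4)]))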
12
Demo
Compare this cover to the one from the example
13
Polynomial Time
• C ← ∅
• E’ ← E
• while E’ ≠ ∅ do
– let (u,v) be an arbitrary edge of E’
– C ← C ∪ {u,v}
– remove from E’ every edge incident to either u or v
• return C
Costs: initializing E’ is O(n²); choosing an edge is O(1); removing the edges incident to u or v is O(n); overall O(n²).
14
Correctness
The set of vertices our algorithm returns is clearly a vertex-cover, since we iterate until every edge is covered.
15
How Good an Approximation is it?
Observe the set of edges our algorithm chooses:
• no two of them share a common vertex,
• any vertex cover must contain at least one endpoint of each of them,
• our cover contains both endpoints, hence it is at most twice as large as the optimum.
16
The Traveling Salesman Problem
17
The Mission: A Tour Around the World
18
The Problem: Traveling Costs Money
19
Introduction
• Objectives:
– To explore the Traveling Salesman Problem.
• Overview:
– TSP: Formal definition & examples
– TSP is NP-hard
– Approximation algorithm for special cases
– Inapproximability result
20
TSP
• Instance: a complete weighted undirected graph G=(V,E) (all weights are non-negative).
• Problem: to find a Hamiltonian cycle of minimal cost.
21
Polynomial Algorithm for TSP?
What about the greedy strategy: at any point, choose the closest vertex not explored yet?
22
The Greedy $trategy Fails
23
The Greedy $trategy Fails
24
TSP is NP-hard
The corresponding decision problem:
• Instance: a complete weighted undirected graph G=(V,E) and a number k.
• Problem: to decide whether there exists a Hamiltonian cycle whose cost is at most k.
25
TSP is NP-hard
Theorem: HAM-CYCLE ≤p TSP.
Proof: By the straightforward efficient reduction illustrated below:
HAM-CYCLE → TSP: every edge of G gets weight 0, every non-edge gets weight 1, and the bound is set to k=0.
verify!
26
What Next?
• We’ll show an approximation algorithm for TSP,
• which yields a ratio-bound of 2
• for cost functions which satisfy a certain property.
27
The Triangle Inequality
Definition: We’ll say the cost function c satisfies the triangle inequality if
∀ u,v,w ∈ V : c(u,v) + c(v,w) ≥ c(u,w)
28
Approximation Algorithm
1. Grow a Minimum Spanning Tree (MST) for G.
2. Return the cycle resulting from a preorder walk on that tree.
COR(B) 525-527
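A minimal Python sketch of this algorithm, assuming the metric is given as a symmetric distance matrix satisfying the triangle inequality; it grows the MST with Prim’s algorithm and returns the preorder walk (the function name tsp_two_approx is illustrative):

def tsp_two_approx(dist):
    # dist: symmetric matrix of non-negative weights obeying the triangle inequality.
    # Returns a Hamiltonian cycle (list of vertex indices) of cost at most twice the optimum.
    n = len(dist)
    # 1. Grow a minimum spanning tree rooted at vertex 0 (Prim's algorithm).
    in_tree = [False] * n
    parent = [0] * n
    best = [float("inf")] * n
    best[0] = 0
    children = [[] for _ in range(n)]
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if u != 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v], parent[v] = dist[u][v], u
    # 2. Return the cycle resulting from a preorder walk on that tree.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour + [0]   # close the cycle

# Example: 4 points on a line (a metric); prints a tour starting and ending at 0.
d = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
print(tsp_two_approx(d))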
29
Demonstration and Analysis
The cost of a minimal Hamiltonian cycle is at least the cost of an MST (removing one edge from the cycle yields a spanning tree).
30
Demonstration and Analysis
The cost of a preorder walk is twice the cost of
the tree
31
Demonstration and Analysis
Due to the triangle inequality, shortcutting the preorder walk into a Hamiltonian cycle does not make it worse.
32
What About the General Case?
• We’ll show TSP cannot be approximated within any constant factor ρ ≥ 1,
• by showing the corresponding gap version is NP-hard.
COR(B) 528
33
gap-TSP[ρ]
• Instance: a complete weighted undirected graph G=(V,E).
• Problem: to distinguish between the following two cases:
– There exists a Hamiltonian cycle whose cost is at most |V|.
– The cost of every Hamiltonian cycle is more than ρ·|V|.
34
Instances
[Figure: possible instances arranged along a “min cost” axis, with |V| and ρ·|V| marked.]
35
What Should an Algorithm for gap-TSP Return?
If the minimum cycle cost is at most |V| the algorithm must return YES; if it is more than ρ·|V| it must return NO; inside the gap between |V| and ρ·|V| we DON’T-CARE.
36
gap-TSP & Approximation
Observation: An efficient approximation of factor ρ for TSP implies an efficient algorithm for gap-TSP[ρ].
37
gap-TSP is NP-hard
Theorem: For any constant ρ ≥ 1, HAM-CYCLE ≤p gap-TSP[ρ].
Proof Idea: Edges from G cost 1. Other edges cost much more.
38
The Reduction Illustrated
HAM-CYCLE → gap-TSP: every edge of G gets weight 1, every non-edge gets weight ρ·|V|+1.
Verify (a) correctness (b) efficiency.
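A small Python sketch of this reduction, assuming the input graph is given by its vertex count and edge list (the name ham_cycle_to_gap_tsp and its parameters are illustrative):

def ham_cycle_to_gap_tsp(n, edges, rho):
    # Builds the weight matrix of a complete graph on vertices 0..n-1:
    # edges of G cost 1, non-edges cost rho*n + 1.
    edge_set = {frozenset(e) for e in edges}
    heavy = rho * n + 1
    w = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if u != v:
                w[u][v] = 1 if frozenset((u, v)) in edge_set else heavy
    return w

# If G has a Hamiltonian cycle, the instance has a tour of cost exactly n = |V|;
# otherwise every tour uses at least one heavy edge and costs more than rho*n.
print(ham_cycle_to_gap_tsp(4, [(0, 1), (1, 2), (2, 3), (3, 0)], rho=2))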
39
Approximating TSP is NP-hard
gap-TSP[ρ] is NP-hard
⇒ approximating TSP within factor ρ is NP-hard
40
Summary
• We’ve studied the Traveling Salesman Problem (TSP).
• We’ve seen it is NP-hard.
• Nevertheless, when the cost function satisfies the triangle inequality, there exists an approximation algorithm with ratio-bound 2.
41
Summary
• For the general case we’ve proven there is probably no efficient approximation algorithm for TSP.
• Moreover, we’ve demonstrated a generic method for showing approximation problems are NP-hard.
42
SET-COVER
• Instance: a finite set X and a family F of subsets of X, such that X = ∪S∈F S.
• Problem: to find a subfamily C ⊆ F of minimal size which covers X, i.e. ∪S∈C S = X.
43
SET-COVER: Example
44
SET-COVER is NP-Hard
Proof: Consider the corresponding decision problem.
• Clearly, it’s in NP (check!).
• We’ll sketch a reduction from (decision) VERTEX-COVER to it:
45
VERTEX-COVER ≤p SET-COVER
VC → SC:
• one element for every edge;
• one set for every vertex, containing the edges it covers.
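A small Python sketch of this reduction, assuming the graph is given as a vertex list and an edge list (the names X and F follow the SET-COVER notation above):

def vc_to_set_cover(vertices, edges):
    # Ground set X: one element per edge.
    # Family F: for each vertex v, the set of edges incident to v.
    X = {frozenset(e) for e in edges}
    F = {v: {e for e in X if v in e} for v in vertices}
    return X, F

# A vertex cover of size k corresponds exactly to a set cover of size k.
X, F = vc_to_set_cover([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
print(F[2])   # the edges covered by vertex 2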
46
Greedy Algorithm
• C ← ∅
• U ← X
• while U ≠ ∅ do
– select an S ∈ F that maximizes |S ∩ U|
– C ← C ∪ {S}
– U ← U - S
• return C
Each iteration costs O(|F|·|X|); the number of iterations is at most min{|X|, |F|}.
COR(B) 530-533
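A minimal Python sketch of the greedy algorithm above, assuming the family F is given as a dictionary from set names to sets and that F indeed covers X:

def greedy_set_cover(X, F):
    # Repeatedly pick the set covering the most still-uncovered elements.
    uncovered = set(X)
    chosen = []
    while uncovered:
        best = max(F, key=lambda name: len(F[name] & uncovered))
        chosen.append(best)
        uncovered -= F[best]
    return chosen

X = {1, 2, 3, 4, 5, 6}
F = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6}, "D": {1, 4}}
print(greedy_set_cover(X, F))   # e.g. ['A', 'C']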
47
Demonstration
Compare to the optimal cover.
48
Is Being Greedy Worthwhile? How Do We Proceed From Here?
• We can easily bound the approximation ratio by log n.
• A more careful analysis yields a tight bound of ln n.
49
Loose Ratio-Bound
Claim: If there exists a cover of size k, then after k iterations the algorithm has covered at least ½ of the n elements.
Suppose it doesn’t, and observe the situation after k iterations: more than ½ of the n elements are still uncovered.
50
Loose Ratio-Bound
Claim: If there exists a cover of size k, then after k iterations the algorithm has covered at least ½ of the n elements.
Since this uncovered part (more than ½ of the n elements) can also be covered by k sets...
51
Loose Ratio-Bound
Claim: If there exists a cover of size k, then after k iterations the algorithm has covered at least ½ of the n elements.
...there must be a set not chosen yet that covers at least (½n)·(1/k) of the still-uncovered elements.
52
Loose Ratio-Bound
Claim: If there exists a cover of size k, then after k iterations the algorithm has covered at least ½ of the n elements.
Thus in each of the k iterations we’ve covered at least (½n)·(1/k) new elements, i.e. at least ½n elements in total, and the claim is proven!
53
Loose Ratio-Bound
Claim: If there exists a cover of size k, then after k iterations the algorithm has covered at least ½ of the elements.
Therefore after k·log n iterations (i.e., after choosing k·log n sets) all n elements must be covered, and the bound is proved.
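A compact restatement of the argument (a LaTeX sketch; u_i denotes the number of still-uncovered elements after i iterations, a name not used on the slides, and the claim is applied repeatedly to the remaining elements, which are also coverable by k sets):

\[
u_{(j+1)k} \le \tfrac{1}{2}\, u_{jk}
\quad\Longrightarrow\quad
u_{jk} \le \frac{n}{2^{j}} < 1 \ \text{ for } j > \log_2 n ,
\]

so at most k(⌊log₂ n⌋ + 1) sets are chosen, i.e. the greedy cover is within an O(log n) factor of the optimal size k.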
54
Tight Ratio-Bound
Claim: The greedy algorithm approximates the optimal set-cover within factor
H(max{ |S| : S ∈ F })
where H(d) is the d-th harmonic number:
H(d) := Σ_{i=1}^{d} 1/i
55
Tight Ratio-Bound
H(n) = 1 + 1/2 + ... + 1/n = Σ_{k=1}^{n} 1/k ≤ 1 + ∫₁ⁿ (1/x) dx = ln n + 1
56
Claim’s Proof
• Whenever the algorithm chooses a set, charge it 1.
• Split this cost between all the newly covered elements (e.g., a set covering five new elements charges 0.2 to each).
57
Analysis
• That is, we charge every element x ∈ X with
c_x := 1 / |S_i - (S_1 ∪ ... ∪ S_{i-1})|
• where S_i is the first set which covers x.
58
Lemma
Lemma: For every S ∈ F,   Σ_{x∈S} c_x ≤ H(|S|).
Proof: Fix an S ∈ F. For any i, define
u_i := |S - (S_1 ∪ ... ∪ S_i)|
(the number of members of S left uncovered after i iterations).
Let k be the smallest index for which u_k = 0.
For 1 ≤ i ≤ k: S_i covers u_{i-1} - u_i elements from S.
59
Lemma
Σ_{x∈S} c_x = Σ_{i=1}^{k} (u_{i-1} - u_i) · 1/|S_i - (S_1 ∪ ... ∪ S_{i-1})|
(since for any 1 ≤ i ≤ k, S_i covers u_{i-1} - u_i elements of S, each charged 1/|S_i - (S_1 ∪ ... ∪ S_{i-1})|)
≤ Σ_{i=1}^{k} (u_{i-1} - u_i) · 1/u_{i-1}
(our greedy strategy promises that S_i (1 ≤ i ≤ k) covers at least as many new elements as S does, namely at least u_{i-1})
≤ Σ_{i=1}^{k} (H(u_{i-1}) - H(u_i))
(for any b > a in ℕ, H(b) - H(a) = 1/(a+1) + ... + 1/b ≥ (b-a) · 1/b)
= H(u_0) - H(u_k)   (a telescopic sum)
= H(|S|) - H(0) = H(|S|)   (u_0 = |S|, u_k = 0, H(0) = 0)
60
Analysis
Now we can finally complete our analysis:
Now we can finally complete our analysis:
|C| = Σ_{x∈X} c_x ≤ Σ_{S∈C*} Σ_{x∈S} c_x ≤ |C*| · H(max{ |S| : S ∈ F })
61
Summary
• As it turns out, we can sometimes find efficient approximation algorithms for NP-hard problems.
• We’ve seen two such algorithms:
– for VERTEX-COVER (factor 2)
– for SET-COVER (logarithmic factor).
62
The Subset Sum Problem
• Problem definition
– Given a finite set S and a target t, find a subset S’ ⊆ S whose elements sum to t.
• All possible sums
– S = {x1, x2, .., xn}
– Li = set of all possible sums of subsets of {x1, x2, .., xi}
• Example
– S = {1, 4, 5}
– L1 = {0, 1}
– L2 = {0, 1, 4, 5} = L1 ∪ (L1 + x2)
– L3 = {0, 1, 4, 5, 6, 9, 10} = L2 ∪ (L2 + x3)
• Li = Li-1 ∪ (Li-1 + xi)
63
Subset Sum, revisited:
• Given a set S of numbers, find a subset S’ that adds up to some target number t.
• To find the largest possible sum that doesn’t exceed t:
T = {0};
for each x in S {
T = union(T, x+T);
remove elements from T that exceed t;
}
return largest element in T;
• (Aside: How should we implement T?)
x + T adds x to each element in the set T
Potential doubling at each step
Complexity O(2ⁿ)
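A minimal Python sketch of this exact, worst-case exponential procedure, keeping the set of achievable sums explicitly:

def exact_subset_sum(S, t):
    # Builds L_i = L_{i-1} ∪ (L_{i-1} + x_i), dropping sums above t.
    # Returns the largest achievable subset sum that does not exceed t.
    T = {0}
    for x in S:
        T = T | {y + x for y in T}       # union(T, x + T)
        T = {y for y in T if y <= t}     # remove elements that exceed t
    return max(T)

print(exact_subset_sum([1, 4, 5], 8))    # -> 6 (= 1 + 5)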
64
Trimming:
• To reduce the size of the set T at each stage, we apply a trimming process.
• For example, if y and z are consecutive elements and (1-δ)·y ≤ z < y, then remove y (it is represented by z).
• If δ = 0.1: {10, 11, 12, 15, 20, 21, 22, 23, 24, 29} → {10, 12, 15, 20, 23, 29}
65
Subset Sum with Trimming:
• Incorporate trimming in the previous algorithm:
T = {0};
for each x in S {
    T = union(T, x+T);
    T = trim(δ, T);
    remove elements from T that exceed t;
}
return largest element in T;
• Trimming only eliminates values, it doesn’t create new ones. So the final result is still the sum of a subset of S that does not exceed t.
(trimming parameter δ, 0 < δ < 1/n)
66
• At each stage, values in the trimmed T are within a factor somewhere between (1-δ) and 1 of the corresponding values in the untrimmed T.
• The final result (after n iterations) is within a factor somewhere between (1-δ)ⁿ and 1 of the result produced by the original algorithm.
67
• After trimming, the ratio between successive elements in T is at least 1/(1-δ), and all of the values are between 0 and t.
• Hence the maximum number of elements in T is log1/(1-δ) t ≈ (log t)/δ.
• This is enough to give us a polynomial bound on the running time of the algorithm.
68
Subset Sum – Trim
• Want to reduce the size of a list by “trimming”
– L: the original list
– L’: the list after trimming L
– d: trimming parameter, d ∈ [0, 1]
– y: an element that is removed from L
– z: the corresponding (representing) element in L’ (also in L)
– (y-z)/y ≤ d
– (1-d)·y ≤ z ≤ y
• Example
– L = {10, 11, 12, 15, 20, 21, 22, 23, 24, 29}
– d = 0.1
– L’ = {10, 12, 15, 20, 23, 29}
– 11 is represented by 10: (11-10)/11 ≤ 0.1
– 21, 22 are represented by 20: (21-20)/21 ≤ 0.1, (22-20)/22 ≤ 0.1
– 24 is represented by 23: (24-23)/24 ≤ 0.1
69
Subset Sum – Trim (2)
• Trim(L, d)   // L: y1, y2, .., ym, in increasing order
1. L’ = {y1}
2. last = y1   // most recent element z in L’ which represents elements in L
3. for i = 2 to m do
4.     if last < (1-d)·yi then   // i.e. (1-d)·yi > last
5.         // yi is appended onto L’ when it cannot be represented by last
6.         append yi onto the end of L’
7.         last = yi
8. return L’
• Example
– L = {10, 11, 12, 15, 20, 21, 22, 23, 24, 29}
– d = 0.1
– L’ = {10, 12, 15, 20, 23, 29}
• O(m)
70
Subset Sum – Approximate Algorithm
• Approx_subset_sum(S, t, e)   // S = x1, x2, .., xn
1. L0 = {0}
2. for i = 1 to n do
3.     Li = Li-1 ∪ (Li-1 + xi)
4.     Li = Trim(Li, e/n)
5.     remove elements that are greater than t from Li
6. return the largest element in Ln
• Example
– S = {104, 102, 201, 101}, t = 308, e = 0.20, d = e/n = 0.05
– L0 = {0}
– L1 = {0, 104}
– L2 = {0, 102, 104, 206}
• After trimming 104: L2 = {0, 102, 206}
– L3 = {0, 102, 201, 206, 303, 407}
• After trimming 206: L3 = {0, 102, 201, 303, 407}
• After removing 407: L3 = {0, 102, 201, 303}
– L4 = {0, 101, 102, 201, 203, 302, 303, 404}
• After trimming 102, 203, 303: L4 = {0, 101, 201, 302, 404}
• After removing 404: L4 = {0, 101, 201, 302}
– Return 302 (= 201 + 101)
• The optimal answer is 104 + 102 + 101 = 307.
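A compact Python sketch of Trim and Approx_subset_sum as described above, assuming S is a list of positive integers; on the worked example it returns 302:

def trim(L, d):
    # L is sorted; drop y whenever the last kept element z satisfies (1-d)*y <= z,
    # i.e. z already represents y.
    kept = [L[0]]
    last = L[0]
    for y in L[1:]:
        if last < (1 - d) * y:    # y cannot be represented by `last`
            kept.append(y)
            last = y
    return kept

def approx_subset_sum(S, t, e):
    # Returns a subset sum z with (1-e) * optimum <= z <= t.
    n = len(S)
    L = [0]
    for x in S:
        L = sorted(set(L) | {y + x for y in L})   # L_i = L_{i-1} ∪ (L_{i-1} + x_i)
        L = trim(L, e / n)
        L = [y for y in L if y <= t]              # drop sums above t
    return max(L)

print(approx_subset_sum([104, 102, 201, 101], t=308, e=0.20))   # -> 302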
71
Subset Sum - Correctness
• The approximate solution C is not smaller than (1-e) times the optimal solution C*
– i.e., (1-e)·C* ≤ C
• Proof
– For every element y in L there is a z in L’ such that
• (1-e/n)·y ≤ z ≤ y
– For every element y in Li there is a z in Li’ such that
• (1-e/n)^i · y ≤ z ≤ y
– If y* is the optimal solution in Ln, then there is a corresponding z in Ln’ with
• (1-e/n)^n · y* ≤ z ≤ y*
– Since (1-e) ≤ (1-e/n)^n   [ (1-e/n)^n is increasing in n ]
• (1-e)·y* ≤ (1-e/n)^n · y* ≤ z
• (1-e)·y* ≤ z
– So the value z returned is not smaller than (1-e) times the optimal value y*.
72
Subset Sum – Correctness (2)
• The approximation algorithm is fully polynomial.
• Proof
– Successive elements z’ < z in Li’ must satisfy z/z’ ≥ 1/(1-e/n),
• i.e., they differ by a factor of at least 1/(1-e/n).
– The number of elements in each Li is therefore at most
• log1/(1-e/n) t   [ t is the largest value ]
• = ln t / (-ln(1-e/n)) ≤ (ln t)/(e/n)   [Eq. 2.10: x/(1+x) ≤ ln(1+x) ≤ x, for x > -1]
• = (n ln t)/e
– So the length of each Li is polynomial,
– and the running time of the algorithm is polynomial.
73
Summary:
• Not all problems are computable.
• Some problems can be solved in polynomial time (P).
• Some problems can be verified in polynomial time (NP).
• Nobody knows whether P = NP.
• But the existence of NP-complete problems is often taken as an indication that P ≠ NP.
• In the meantime, we use approximation to find “good-enough” solutions to hard problems.
74
What’s Next?
• But where can we draw the line?
• Does every NP-hard problem have an approximation?
• And to within which factor?
• Can approximation be NP-hard as well?