Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing...

37
Fast Algorithms for Maximization of Submodular Functions Jan Vondrák 1 1 Theory Group IBM Almaden Research Center IMA Workshop — Convexity and Optimization, February 2015 Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 1 / 19

Transcript of Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing...

Page 1: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Fast Algorithmsfor Maximization of Submodular Functions

Jan Vondrák1

1Theory GroupIBM Almaden Research Center

IMA Workshop — Convexity and Optimization, February 2015

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 1 / 19

Page 2: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Algorithms for maximization of submodular functions

What do we want to do?Maximize a submodular function f (S) over S ∈ Fwhere F is a constraint (family of feasible sets).Typical constraints:

|S| ≤ k (influence on social networks, sensor placement, active setselection, ranking, summarization,...)

S ∈ matroid (welfare maximization in combinatorial auctions, aboveapplications with group constraints)∑

j∈S aij ≤ Bi (budget constraits)

Quality of solution vs. performance:Some simple (greedy) algorithms already work well and are fast.More sophisticated tools have been developed in the last decade.These find solutions of better value / for more generalconstraints but they are slower.

Recent work: making the algorithms faster and usable in practice.Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 2 / 19

Page 3: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Overview of algorithms for submodular maximizationAlgorithm year constraint runtime issues

Greedy 1978 cardinality O(kn) not versatileCont. Greedy 2008 matroid O(n8) versatile but slow

Rnd. Local Search 2011 matroid O(n5) still slowDoubleGreedy 2012 none O(n) special-purpose

Greedy works only for monotone functions and simple constraints;O(kn) is fast but in some applications even this is not enough.Continuous Greedy gives a versatile framework, but is really slowdue to random sampling and small steps in a continuous domain.

NEW:

Randomized/Thresholding Greedy 2014 cardinality O(n)Accelerated Cont. Greedy 2014 matroid O(n2)

Cont. Greedy with MWU 2015 Ax ≤ b O(n2)

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 3 / 19

Page 4: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Overview of algorithms for submodular maximizationAlgorithm year constraint runtime issues

Greedy 1978 cardinality O(kn) not versatileCont. Greedy 2008 matroid O(n8) versatile but slow

Rnd. Local Search 2011 matroid O(n5) still slowDoubleGreedy 2012 none O(n) special-purpose

Greedy works only for monotone functions and simple constraints;O(kn) is fast but in some applications even this is not enough.Continuous Greedy gives a versatile framework, but is really slowdue to random sampling and small steps in a continuous domain.

NEW:

Randomized/Thresholding Greedy 2014 cardinality O(n)Accelerated Cont. Greedy 2014 matroid O(n2)

Cont. Greedy with MWU 2015 Ax ≤ b O(n2)

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 3 / 19

Page 5: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

New algorithms in more detail

Faster algorithms: [Ashwinkumar B.V., J.V., in SODA ’14]

Thresholding Greedy, O(nε log n

ε ) time for a cardinality constraint,

Accelerated Cont.Greedy, O(knε4

log2 nε ) for a matroid of rank k ,

both achieving (1− 1/e − ε)-approximation of the optimum.

MWU framework: [C. Chekuri, T.S. Jayram, J.V., in ITCS ’15]

Continuous Greedy integrated with multiplicative weight updateseffectively combines multiple constraints into oneO( 1

ε2(m + n)2) algorithm for m packing constraints Ax ≤ b

Implementation and testing of fast algorithms:[Ashwin B.V., B. Mirzasoleiman, A. Karbasi, A. Krause, J.V., in AAAI ’15]

new Randomized Greedy, compared with other greedy heuristicsdatasets from active set selection and k-medoid clusteringprovable guarantees and demonstrated practical improvements

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 4 / 19

Page 6: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

New algorithms in more detail

Faster algorithms: [Ashwinkumar B.V., J.V., in SODA ’14]

Thresholding Greedy, O(nε log n

ε ) time for a cardinality constraint,

Accelerated Cont.Greedy, O(knε4

log2 nε ) for a matroid of rank k ,

both achieving (1− 1/e − ε)-approximation of the optimum.

MWU framework: [C. Chekuri, T.S. Jayram, J.V., in ITCS ’15]

Continuous Greedy integrated with multiplicative weight updateseffectively combines multiple constraints into oneO( 1

ε2(m + n)2) algorithm for m packing constraints Ax ≤ b

Implementation and testing of fast algorithms:[Ashwin B.V., B. Mirzasoleiman, A. Karbasi, A. Krause, J.V., in AAAI ’15]

new Randomized Greedy, compared with other greedy heuristicsdatasets from active set selection and k-medoid clusteringprovable guarantees and demonstrated practical improvements

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 4 / 19

Page 7: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

New algorithms in more detail

Faster algorithms: [Ashwinkumar B.V., J.V., in SODA ’14]

Thresholding Greedy, O(nε log n

ε ) time for a cardinality constraint,

Accelerated Cont.Greedy, O(knε4

log2 nε ) for a matroid of rank k ,

both achieving (1− 1/e − ε)-approximation of the optimum.

MWU framework: [C. Chekuri, T.S. Jayram, J.V., in ITCS ’15]

Continuous Greedy integrated with multiplicative weight updateseffectively combines multiple constraints into oneO( 1

ε2(m + n)2) algorithm for m packing constraints Ax ≤ b

Implementation and testing of fast algorithms:[Ashwin B.V., B. Mirzasoleiman, A. Karbasi, A. Krause, J.V., in AAAI ’15]

new Randomized Greedy, compared with other greedy heuristicsdatasets from active set selection and k-medoid clusteringprovable guarantees and demonstrated practical improvements

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 4 / 19

Page 8: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

The Randomized Greedy Algorithm

The Greedy Algorithm: [Nemhauser-Wolsey-Fisher 1978]

We want to solve max{f (S) : |S| ≤ k}, f monotone submodular.Pick elements one-by-one, maximizing the gain in f (S).Finds a solution of value at least (1− 1/e)× optimal.

Si , maximizing f (S + i)− f (S)

Randomized Greedy:Find an element i maximizing the marginal gain among allelements in a randomly sampled set R.|R| ' n

k log 1ε is sufficient to find a solution of expected value at

least (1− 1/e − ε)× optimal.Instead of O(kn), the running time is O(n log 1

ε ).

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 5 / 19

Page 9: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

The Randomized Greedy Algorithm

The Greedy Algorithm: [Nemhauser-Wolsey-Fisher 1978]

We want to solve max{f (S) : |S| ≤ k}, f monotone submodular.Pick elements one-by-one, maximizing the gain in f (S).Finds a solution of value at least (1− 1/e)× optimal.

Si , maximizing f (S + i)− f (S)

Randomized Greedy:Find an element i maximizing the marginal gain among allelements in a randomly sampled set R.|R| ' n

k log 1ε is sufficient to find a solution of expected value at

least (1− 1/e − ε)× optimal.Instead of O(kn), the running time is O(n log 1

ε ).

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 5 / 19

Page 10: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Analysis of Randomized Greedy

S = current solutionOPT = optimal solution (size k ),R = random set (size r ):

SS OPTR

Pr[R ∩ (OPT \ S) 6= ∅] ≥ 1− (1− |OPT\S|n )r ≥ |OPT\S|

k (1− e−rk/n)

We choose r = nk log 1

ε ⇒ Pr[R ∩ (OPT \ S) 6= ∅] ≥ (1− ε) |OPT\S|k .

Conditioned on R ∩ (OPT \ S) 6= ∅, we gain the best element inR ∩ (OPT \ S): on average at least a random element of OPT \ S.E[gain] ≥ Pr[R∩(OPT \S) 6= ∅]·Ej∈OPT\S[fS(j)] ≥ 1−ε

k (OPT−f (S)).

Standard greedy analysis⇒ E[ALG] ≥ (1− e−(1−ε))OPT .

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 6 / 19

Page 11: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Analysis of Randomized Greedy

S = current solutionOPT = optimal solution (size k ),R = random set (size r ):

SS OPTR

Pr[R ∩ (OPT \ S) 6= ∅] ≥ 1− (1− |OPT\S|n )r ≥ |OPT\S|

k (1− e−rk/n)

We choose r = nk log 1

ε ⇒ Pr[R ∩ (OPT \ S) 6= ∅] ≥ (1− ε) |OPT\S|k .

Conditioned on R ∩ (OPT \ S) 6= ∅, we gain the best element inR ∩ (OPT \ S): on average at least a random element of OPT \ S.E[gain] ≥ Pr[R∩(OPT \S) 6= ∅]·Ej∈OPT\S[fS(j)] ≥ 1−ε

k (OPT−f (S)).

Standard greedy analysis⇒ E[ALG] ≥ (1− e−(1−ε))OPT .

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 6 / 19

Page 12: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Analysis of Randomized Greedy

S = current solutionOPT = optimal solution (size k ),R = random set (size r ):

SS OPTR

Pr[R ∩ (OPT \ S) 6= ∅] ≥ 1− (1− |OPT\S|n )r ≥ |OPT\S|

k (1− e−rk/n)

We choose r = nk log 1

ε ⇒ Pr[R ∩ (OPT \ S) 6= ∅] ≥ (1− ε) |OPT\S|k .

Conditioned on R ∩ (OPT \ S) 6= ∅, we gain the best element inR ∩ (OPT \ S): on average at least a random element of OPT \ S.E[gain] ≥ Pr[R∩(OPT \S) 6= ∅]·Ej∈OPT\S[fS(j)] ≥ 1−ε

k (OPT−f (S)).

Standard greedy analysis⇒ E[ALG] ≥ (1− e−(1−ε))OPT .

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 6 / 19

Page 13: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Analysis of Randomized Greedy

S = current solutionOPT = optimal solution (size k ),R = random set (size r ):

SS OPTR

Pr[R ∩ (OPT \ S) 6= ∅] ≥ 1− (1− |OPT\S|n )r ≥ |OPT\S|

k (1− e−rk/n)

We choose r = nk log 1

ε ⇒ Pr[R ∩ (OPT \ S) 6= ∅] ≥ (1− ε) |OPT\S|k .

Conditioned on R ∩ (OPT \ S) 6= ∅, we gain the best element inR ∩ (OPT \ S): on average at least a random element of OPT \ S.

E[gain] ≥ Pr[R∩(OPT \S) 6= ∅]·Ej∈OPT\S[fS(j)] ≥ 1−εk (OPT−f (S)).

Standard greedy analysis⇒ E[ALG] ≥ (1− e−(1−ε))OPT .

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 6 / 19

Page 14: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Analysis of Randomized Greedy

S = current solutionOPT = optimal solution (size k ),R = random set (size r ):

SS OPTR

Pr[R ∩ (OPT \ S) 6= ∅] ≥ 1− (1− |OPT\S|n )r ≥ |OPT\S|

k (1− e−rk/n)

We choose r = nk log 1

ε ⇒ Pr[R ∩ (OPT \ S) 6= ∅] ≥ (1− ε) |OPT\S|k .

Conditioned on R ∩ (OPT \ S) 6= ∅, we gain the best element inR ∩ (OPT \ S): on average at least a random element of OPT \ S.E[gain] ≥ Pr[R∩(OPT \S) 6= ∅]·Ej∈OPT\S[fS(j)] ≥ 1−ε

k (OPT−f (S)).

Standard greedy analysis⇒ E[ALG] ≥ (1− e−(1−ε))OPT .

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 6 / 19

Page 15: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Analysis of Randomized Greedy

S = current solutionOPT = optimal solution (size k ),R = random set (size r ):

SS OPTR

Pr[R ∩ (OPT \ S) 6= ∅] ≥ 1− (1− |OPT\S|n )r ≥ |OPT\S|

k (1− e−rk/n)

We choose r = nk log 1

ε ⇒ Pr[R ∩ (OPT \ S) 6= ∅] ≥ (1− ε) |OPT\S|k .

Conditioned on R ∩ (OPT \ S) 6= ∅, we gain the best element inR ∩ (OPT \ S): on average at least a random element of OPT \ S.E[gain] ≥ Pr[R∩(OPT \S) 6= ∅]·Ej∈OPT\S[fS(j)] ≥ 1−ε

k (OPT−f (S)).

Standard greedy analysis⇒ E[ALG] ≥ (1− e−(1−ε))OPT .

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 6 / 19

Page 16: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Implementation and experiments

We implemented several algorithms for max{f (S) : |S| ≤ k}[Mirzasoleiman, Karbasi, Krause, Ashwinkumar & V.; AAAI ’15]

classical GreedyLazy Greedy (speed-up heuristic, performance equiv. to Greedy)Sample Greedy (Greedy on a random subsample)Thresholding Greedy (new algorithm)Randomized Greedy (new algorithm)

Datasets:1 Active set selection: Parkinson’s Telemonitoring Dataset

(5875 patients; selecting a small subset representing the dataset).2 k-medoid clustering: Tiny Images, distance defined by similarity

(selecting a representative subset of 10,000 images of 32x32 pixels).

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 7 / 19

Page 17: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Results

Key points: Rand-Greedy is the fastest method and finds solutionsclose to Lazy-Greedy, the top-quality but slower performer;Rand-Greedy dominates the bi-criteria plot.

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 8 / 19

Page 18: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Beyond the cardinality constraint

Consider max{f (S) : S ∈ I} for a more complicated constraint I.Greedy does not work as well anymore (depending on I)We want a more robust tool — continuous relaxation!

Submodular (discrete)optimization problem

max{f (S) : S ∈ I}

Multilinear (continuous)optimization problem

max{F (y) : y ∈ P(I)}

Instead of the original problem, we solve a continuousoptimization problem (approximately!)Multilinear relaxation is useful due to convexity/concavityproperties [V. ’08]

Fractional solutions need to be rounded but that can be done for avariety of constraints; e.g. matroids [CCPV ’07, CVZ ’11, CVZ ’12]

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 9 / 19

Page 19: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Beyond the cardinality constraint

Consider max{f (S) : S ∈ I} for a more complicated constraint I.Greedy does not work as well anymore (depending on I)We want a more robust tool — continuous relaxation!

Submodular (discrete)optimization problem

max{f (S) : S ∈ I}

Multilinear (continuous)optimization problem

max{F (y) : y ∈ P(I)}

Instead of the original problem, we solve a continuousoptimization problem (approximately!)Multilinear relaxation is useful due to convexity/concavityproperties [V. ’08]

Fractional solutions need to be rounded but that can be done for avariety of constraints; e.g. matroids [CCPV ’07, CVZ ’11, CVZ ’12]

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 9 / 19

Page 20: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Matroids

DefinitionA matroid on N is a system of independent sets I ⊂ 2N , satisfying

1 ∀B ∈ I,A ⊂ B ⇒ A ∈ I.2 ∀A,B ∈ I, |A| < |B| ⇒ ∃x ∈ B \ A; A ∪ {x} ∈ I.

Q1 Q2 Q3 Q4 Q5

Example: partition matroid

S is independent, if|S ∩Qi | ≤ ki for each Qi .

Multilinear relaxation works well with matroids: [Calinescu,Chekuri,Pal,V. ’07]

Any fractional solution of the problem max{F (x) : x ∈ P(I)} canbe converted into a discrete solution I ∈ I without loss of value.

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 10 / 19

Page 21: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Matroids

DefinitionA matroid on N is a system of independent sets I ⊂ 2N , satisfying

1 ∀B ∈ I,A ⊂ B ⇒ A ∈ I.2 ∀A,B ∈ I, |A| < |B| ⇒ ∃x ∈ B \ A; A ∪ {x} ∈ I.

Q1 Q2 Q3 Q4 Q5

Example: partition matroid

S is independent, if|S ∩Qi | ≤ ki for each Qi .

Multilinear relaxation works well with matroids: [Calinescu,Chekuri,Pal,V. ’07]

Any fractional solution of the problem max{F (x) : x ∈ P(I)} canbe converted into a discrete solution I ∈ I without loss of value.

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 10 / 19

Page 22: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Continuous Greedy [V. ’08]

Problem: continuous optimization over a polytope, max{F (y) : y ∈ P}.

Continuous Greedy Algorithm:dxdt = argmaxv∈Pv · ∇F (x),

continuous process gets discretized.

Slow (O(n8)) because(1) small time steps are necessary

(2) F (x) requires random sampling.

Approximation guarantee: F (x(1)) ≥ (1− 1/e)OPT .

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 11 / 19

Page 23: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Accelerated Continuous Greedy [Ashwinkumar-V. ’14]

Problem: optimization over a matroid polytope, max{F (y) : y ∈ P(I)}.

Accelerated Continuous Greedy:dxdt = argmaxv∈Pv · ∇F (x);

inner loop is replaced by Discrete Greedy,this allows larger steps without loss.

Faster (O(kn)) because(1) larger steps mean fewer iterations,

(2) fewer samples per iteration.

Approximation guarantee: F (x(1)) ≥ (1− 1/e − ε)OPT .

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 12 / 19

Page 24: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Analysis of Accelerated Continuous Greedy (1)

Previously: find v ∈ P(I) maximizing v · ∇F , update x′ = x + δv.

F (x′)− F (x) ≥ δ(v · ∇F )−O(δ2vT (∇2F )v)

≥ δ(OPT − F (x))−O(δ2vT (∇2F )v);

we choose δ = O(1/n2) to make the error� the gain.

New approach: find v = 1I ∈ P(I) one coordinate at a time, whileupdating partial derivatives after each increment:

F (x′) = F (x) + δ

k∑i=1

∂F∂xπ(i)

∣∣∣x+

∑j<i δeπ(j)

.

Why is it better? Updating partial derivatives between incrementsleads to a cleaner analysis, mimicking the discrete greedy algorithm:

F (x′)− F (x) ≥ δ∑

j∈OPT

∂F∂xj

∣∣∣x′≥ δ(OPT − F (x′)).

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 13 / 19

Page 25: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Analysis of Accelerated Continuous Greedy (1)

Previously: find v ∈ P(I) maximizing v · ∇F , update x′ = x + δv.

F (x′)− F (x) ≥ δ(v · ∇F )−O(δ2vT (∇2F )v)

≥ δ(OPT − F (x))−O(δ2vT (∇2F )v);

we choose δ = O(1/n2) to make the error� the gain.

New approach: find v = 1I ∈ P(I) one coordinate at a time, whileupdating partial derivatives after each increment:

F (x′) = F (x) + δ

k∑i=1

∂F∂xπ(i)

∣∣∣x+

∑j<i δeπ(j)

.

Why is it better? Updating partial derivatives between incrementsleads to a cleaner analysis, mimicking the discrete greedy algorithm:

F (x′)− F (x) ≥ δ∑

j∈OPT

∂F∂xj

∣∣∣x′≥ δ(OPT − F (x′)).

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 13 / 19

Page 26: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Analysis of Accelerated Continuous Greedy (1)

Previously: find v ∈ P(I) maximizing v · ∇F , update x′ = x + δv.

F (x′)− F (x) ≥ δ(v · ∇F )−O(δ2vT (∇2F )v)

≥ δ(OPT − F (x))−O(δ2vT (∇2F )v);

we choose δ = O(1/n2) to make the error� the gain.

New approach: find v = 1I ∈ P(I) one coordinate at a time, whileupdating partial derivatives after each increment:

F (x′) = F (x) + δ

k∑i=1

∂F∂xπ(i)

∣∣∣x+

∑j<i δeπ(j)

.

Why is it better? Updating partial derivatives between incrementsleads to a cleaner analysis, mimicking the discrete greedy algorithm:

F (x′)− F (x) ≥ δ∑

j∈OPT

∂F∂xj

∣∣∣x′≥ δ(OPT − F (x′)).

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 13 / 19

Page 27: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Analysis of Accelerated Continuous Greedy (2)

We have:

F (x′)− F (x) ≥ δ∑

j∈OPT

∂F∂xj

∣∣∣x′≥ δ(OPT − F (x′)).

After 1/δ steps:

F (xout) ≥(

1− 1(1 + δ)1/δ

)OPT .

Notes:we can choose δ > 0 constant, to achieve (1− 1/e −O(δ))OPT ;even δ = 1 works, to achieve 1

2OPT ;we interpolate between classical greedy and continuous greedy.

Additional speed-up tricks lead to O(knδ4 log2 n

δ ) running time for theproblem max{f (S) : S ∈ I} (including rounding).

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 14 / 19

Page 28: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Analysis of Accelerated Continuous Greedy (2)

We have:

F (x′)− F (x) ≥ δ∑

j∈OPT

∂F∂xj

∣∣∣x′≥ δ(OPT − F (x′)).

After 1/δ steps:

F (xout) ≥(

1− 1(1 + δ)1/δ

)OPT .

Notes:we can choose δ > 0 constant, to achieve (1− 1/e −O(δ))OPT ;even δ = 1 works, to achieve 1

2OPT ;we interpolate between classical greedy and continuous greedy.

Additional speed-up tricks lead to O(knδ4 log2 n

δ ) running time for theproblem max{f (S) : S ∈ I} (including rounding).

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 14 / 19

Page 29: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Analysis of Accelerated Continuous Greedy (2)

We have:

F (x′)− F (x) ≥ δ∑

j∈OPT

∂F∂xj

∣∣∣x′≥ δ(OPT − F (x′)).

After 1/δ steps:

F (xout) ≥(

1− 1(1 + δ)1/δ

)OPT .

Notes:we can choose δ > 0 constant, to achieve (1− 1/e −O(δ))OPT ;even δ = 1 works, to achieve 1

2OPT ;we interpolate between classical greedy and continuous greedy.

Additional speed-up tricks lead to O(knδ4 log2 n

δ ) running time for theproblem max{f (S) : S ∈ I} (including rounding).

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 14 / 19

Page 30: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Beyond matroids

Less structured constraints: suppose m constraints, n variables,

max{f (x) : Ax ≤ b,x ∈ {0,1}n}.

Discrete greedy doesn’t work at all.Fractional problem max{F (x) : Ax ≤ b} can still be useful.Rounding depends on the properties of A, e.g. k -sparse matricescan be handled with a loss of factor O(k).

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 15 / 19

Page 31: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Submodular optimization with MWU

Multiplicative Weight Updates: iterative technique useful inoptimization, game theory, machine learning, differential privacy, . . .

Submodular optimization with MWU:Discrete Greedy has been combined with MWU [Azar-Gamzu ’12],performance depends on the width of the constraints.We show that Continuous Greedy can be integrated with MWU ina natural way, to obtain a (1− 1/e − ε)-approximation formax{F (x) : Ax ≤ 1} [Chekuri, Jayram, V.; ITCS ’15]

The framework is more versatile: allowing combinations of convexconstraints + concave and (nonmonotone) submodular objectives.

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 16 / 19

Page 32: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Continuous-time MWU process (monotone case)

Initialize: x(0) = (0,0, . . . ,0),w(0) = (1,1, . . . ,1).Continuous process running from t = 0 to t = 1:

v(t) = argmax{y · ∇F |x(t) : y ≥ 0,wT Ay ≤ wT 1}dwi

dt= ηwi(t)Aiv(t) ∀i ∈ [m]

dxdt

= v(t)

Analysis:

Aix(1) = 1η

∫ 10

1wi (t)

dwidt dt = 1

η

∫ 10

ddt (ln wi(t))dt = 1

η ln wi(1)

1η ln

∑i wi(1)− 1

η ln m = 1η

∫ 10

ddt (ln

∑i wi(t))dt =

∫ 10

wT Av(t)wT 1 dt ≤ 1

dFdt = v(t) · ∇F ≥ x∗ · ∇F ≥ OPT − F (x(t))

⇒ Ax(1) ≤ 1 + ln mη , F (x(1)) ≥ (1− 1/e)OPT .

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 17 / 19

Page 33: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Continuous-time MWU process (monotone case)

Initialize: x(0) = (0,0, . . . ,0),w(0) = (1,1, . . . ,1).Continuous process running from t = 0 to t = 1:

v(t) = argmax{y · ∇F |x(t) : y ≥ 0,wT Ay ≤ wT 1}dwi

dt= ηwi(t)Aiv(t) ∀i ∈ [m]

dxdt

= v(t)

Analysis:

Aix(1) = 1η

∫ 10

1wi (t)

dwidt dt = 1

η

∫ 10

ddt (ln wi(t))dt = 1

η ln wi(1)

1η ln

∑i wi(1)− 1

η ln m = 1η

∫ 10

ddt (ln

∑i wi(t))dt =

∫ 10

wT Av(t)wT 1 dt ≤ 1

dFdt = v(t) · ∇F ≥ x∗ · ∇F ≥ OPT − F (x(t))

⇒ Ax(1) ≤ 1 + ln mη , F (x(1)) ≥ (1− 1/e)OPT .

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 17 / 19

Page 34: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Discretization of the MWU process

Differential equations⇒ algorithm:Direction v(t) can be chosen to be a single nonzero coordinate.Non-uniform time steps are chosen to guarantee stronglypolynomial running time (à la [Garg-Könemann ’98])Time steps can be defined by requiring each weight to multiply byat most eε at a time.

Objective function does not pose issues since we move alongcoordinate-aligned steps⇒ F (x) is linear in each step!More complications for non-monotone submodular functions.

Results:1 (1− 1/e − ε)-approximation in time O( 1

ε4(m + n)2) for

max{F (x) : Ax ≤ b} when F is monotone multilinear submodular,2 (1− 1/e − ε)-approximation in time O( 1

ε4mn2(m + n)) for

max{F (x) : Ax ≤ b} when F is nonmonotone smooth submodular.

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 18 / 19

Page 35: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Discretization of the MWU process

Differential equations⇒ algorithm:Direction v(t) can be chosen to be a single nonzero coordinate.Non-uniform time steps are chosen to guarantee stronglypolynomial running time (à la [Garg-Könemann ’98])Time steps can be defined by requiring each weight to multiply byat most eε at a time.Objective function does not pose issues since we move alongcoordinate-aligned steps⇒ F (x) is linear in each step!More complications for non-monotone submodular functions.

Results:1 (1− 1/e − ε)-approximation in time O( 1

ε4(m + n)2) for

max{F (x) : Ax ≤ b} when F is monotone multilinear submodular,2 (1− 1/e − ε)-approximation in time O( 1

ε4mn2(m + n)) for

max{F (x) : Ax ≤ b} when F is nonmonotone smooth submodular.

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 18 / 19

Page 36: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Discretization of the MWU process

Differential equations⇒ algorithm:Direction v(t) can be chosen to be a single nonzero coordinate.Non-uniform time steps are chosen to guarantee stronglypolynomial running time (à la [Garg-Könemann ’98])Time steps can be defined by requiring each weight to multiply byat most eε at a time.Objective function does not pose issues since we move alongcoordinate-aligned steps⇒ F (x) is linear in each step!More complications for non-monotone submodular functions.

Results:1 (1− 1/e − ε)-approximation in time O( 1

ε4(m + n)2) for

max{F (x) : Ax ≤ b} when F is monotone multilinear submodular,2 (1− 1/e − ε)-approximation in time O( 1

ε4mn2(m + n)) for

max{F (x) : Ax ≤ b} when F is nonmonotone smooth submodular.

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 18 / 19

Page 37: Fast Algorithms for Maximization of Submodular Functions · 2 (m +n) 2) algorithm for m packing constraints Ax b Implementation and testing of fast algorithms: [Ashwin B.V., B. Mirzasoleiman,

Conclusions

1 Submodular function maximization under the constraint |S| ≤ kcan be done very fast (near-linear time)

2 More complicated constraints can be handled using variants ofContinuous Greedy

3 Continuous Greedy is not as slow as originally designed:can be implemented in O(kn) running time for matroids.(Recent improvement [Buchbinder-Feldman-Schwartz ’15]:O(n + k

√n) running time for partition matroids.)

Questions:1 Is near-linear time possible for (1− 1/e)-approx for submodular

maximization over matroids?2 Is it possible to eliminate randomization?

Jan Vondrák (IBM Almaden) Fast Algorithms for Submodular Functions 19 / 19