Post on 15-Jan-2016
Learning and Testing Submodular Functions
Grigory Yaroslavtsev, http://grigory.us
Slides at http://grigory.us/cis625/lecture3.pdf
CIS 625: Computational Learning Theory
Submodularity
• Discrete analog of convexity/concavity, "law of diminishing returns"
• Applications: combinatorial optimization, AGT, etc.
• Let f : 2^X → ℝ
• Discrete derivative: ∂_x f(S) = f(S ∪ {x}) − f(S)
• Submodular function: ∂_x f(S) ≥ ∂_x f(T) for all S ⊆ T ⊆ X and x ∉ T
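These definitions can be checked by brute force on a small ground set; a minimal Python sketch (the coverage function below, where element i covers {i, i+1}, is a hypothetical example):

```python
from itertools import combinations

def subsets(X):
    # all subsets of X, as frozensets
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def derivative(f, S, x):
    # discrete derivative: d_x f(S) = f(S ∪ {x}) − f(S)
    return f(S | {x}) - f(S)

def is_submodular(f, X):
    # check d_x f(S) >= d_x f(T) for all S ⊆ T and x ∉ T
    for T in subsets(X):
        for S in subsets(T):
            for x in X - T:
                if derivative(f, S, x) < derivative(f, T, x):
                    return False
    return True

X = frozenset({0, 1, 2})
coverage = lambda S: len(set().union(*({i, i + 1} for i in S))) if S else 0
square = lambda S: len(S) ** 2  # convex in |S|: diminishing returns fails

print(is_submodular(coverage, X))  # True: coverage functions are submodular
print(is_submodular(square, X))    # False
```

Coverage functions satisfy diminishing returns (a new element covers fewer fresh points as the set grows), while |S|² has increasing marginal gains, so the checker separates the two.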
Approximating everywhere
• Q1: Approximate a submodular f for all arguments with only poly(|X|) queries?
• A1: Only a (multiplicative) approximation is possible [Goemans, Harvey, Iwata, Mirrokni, SODA'09]
• Q2: Approximate f on a (1 − ε)-fraction of arguments (PAC-style learning with membership queries under the uniform distribution)?
• A2: Almost as hard [Balcan, Harvey, STOC'11]:
  Pr_{randomness of A} [ Pr_{S∼U(2^X)} [ A(S) = f(S) ] ≥ 1 − ε ] ≥ 1/2
Approximate learning
• PMAC-learning (multiplicative), with poly(|X|) queries [Balcan, Harvey '11]
• PAAC-learning (additive):
  Pr_{rand. of A} [ Pr_{S∼U(2^X)} [ |f(S) − A(S)| ≤ β ] ≥ 1 − ε ] ≥ 1/2
  – Running time: [Gupta, Hardt, Roth, Ullman, STOC'11]
  – Running time: poly [Cheraghchi, Klivans, Kothari, Lee, SODA'12]
Learning
• Approximation everywhere (multiplicative): [Goemans, Harvey, Iwata, Mirrokni], Poly(|X|) time
• PMAC (multiplicative): [Balcan, Harvey], Poly(|X|) time; extra: works under an arbitrary distribution
• PAAC (additive): [Gupta, Hardt, Roth, Ullman], with tolerant queries; [Cheraghchi, Klivans, Kothari, Lee], with SQ-queries, agnostic
• PAC (bounded integral range R): [Raskhodnikova, Y.], Polylog(|X|) queries (for all algorithms)
Learning: Bigger picture
Class hierarchy [Badanidiyuru, Dobzinski, Fu, Kleinberg, Nisan, Roughgarden, SODA'12]:
• Subadditive ⊇ XOS (= fractionally subadditive) ⊇ Submodular ⊇ Gross substitutes ⊇ OXS ⊇ Additive (linear)
• Coverage (valuations) ⊆ Submodular
Other positive results:
• Learning valuation functions [Balcan, Constantin, Iwata, Wang, COLT'12]
• PMAC-learning (sketching) coverage functions [BDFKNR'12]
• PMAC-learning Lipschitz submodular functions [BH'10] (concentration around average via Talagrand)
Discrete convexity
• Monotone convex
• Convex
[figures: plots of f against |S| = 1, 2, 3, …, n, with the regions |S| ≤ R and |S| ≥ n − R marked]
Discrete submodularity
• Monotone submodular: f is determined by its values on sets with |S| ≤ R
• Submodular: f is determined by its values on sets with |S| ≤ R or |S| ≥ |X| − R
[figures: the Boolean lattice between ∅ and X, with the regions |S| ≤ R and |S| ≥ |X| − R marked]
• Case study: R = 1 (Boolean submodular functions)
  – Monotone submodular = monotone 1-DNF (an OR of single-variable monomials)
  – Submodular = 2-term CNF
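For the case R = 1 the characterization can be confirmed by exhaustive enumeration on a small ground set; a brute-force sketch (assuming the monotone Boolean submodular functions are exactly ORs of variable subsets plus the constant 1):

```python
from itertools import product, combinations

n = 3
points = list(product([0, 1], repeat=n))  # subsets of {0,...,n-1} as bit-vectors

def join(S, T): return tuple(max(a, b) for a, b in zip(S, T))
def meet(S, T): return tuple(min(a, b) for a, b in zip(S, T))
def leq(S, T): return all(a <= b for a, b in zip(S, T))

def is_submodular(f):
    # lattice form: f(S) + f(T) >= f(S ∨ T) + f(S ∧ T)
    return all(f[S] + f[T] >= f[join(S, T)] + f[meet(S, T)]
               for S in points for T in points)

def is_monotone(f):
    return all(f[S] <= f[T] for S in points for T in points if leq(S, T))

# enumerate all 2^(2^n) Boolean functions on n = 3 variables
funcs = [dict(zip(points, vals)) for vals in product([0, 1], repeat=2 ** n)]
mono_sub = {tuple(f[S] for S in points)
            for f in funcs if is_monotone(f) and is_submodular(f)}

# ORs of variable subsets (the empty OR is the constant 0), plus the constant 1
ors = {tuple(int(any(S[i] for i in idx)) for S in points)
       for r in range(n + 1) for idx in combinations(range(n), r)}
ors.add(tuple(1 for _ in points))

print(mono_sub == ors)  # True
```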
Discrete monotone submodularity
• Monotone submodular: f(S) ≥ max(f(S₁), f(S₂)) for all S₁, S₂ ⊆ S
• Theorem: for monotone submodular f with range {0, 1, …, R},
  f(S) = max_{T ⊆ S, |T| ≤ R} f(T)
• "≥" holds by monotonicity
• "≤": let S′ be a smallest subset of S such that f(S′) = f(S)
  – By minimality, ∂_x f(S′ ∖ {x}) > 0 for every x ∈ S′
  – By submodularity, the restriction of f to subsets of S′ is strictly increasing along chains, so f(S′) ≥ |S′|
  – Hence |S′| ≤ R, and T = S′ achieves the maximum
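The identity f(S) = max over T ⊆ S with |T| ≤ R of f(T) can be sanity-checked by brute force on a small monotone submodular function; a sketch on a hypothetical coverage instance:

```python
from itertools import chain, combinations

# hypothetical coverage instance: element i covers the point set sets[i]
sets = {0: {1, 2}, 1: {2, 3}, 2: {3}, 3: {1}}
X = tuple(sets)
R = 3  # the range of f is {0, ..., 3}, since |{1, 2, 3}| = 3

def f(S):
    # coverage functions are monotone submodular
    return len(set().union(*(sets[i] for i in S))) if S else 0

def small_subsets(S, r):
    # all subsets of S of size at most r
    return chain.from_iterable(combinations(S, k) for k in range(r + 1))

# check the theorem on every S
for k in range(len(X) + 1):
    for S in combinations(X, k):
        assert f(S) == max(f(T) for T in small_subsets(S, R))
print("theorem holds on this instance")
```

The only non-trivial case here is |S| = 4 > R; e.g. T = {0, 1} already covers all three points, witnessing the maximum.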
Representation by a formula
• Theorem: for monotone submodular f with range {0, …, R}: f(S) = max_{T ⊆ S, |T| ≤ R} f(T)
• Alternative notation: identify S with its indicator vector (x₁, …, x_n); then
  f(x₁, …, x_n) = max_{|T| ≤ R} ( a_T · ∧_{i ∈ T} x_i ), where a_T = f(T)
• (Monotone, if no negations)
• Theorem (restated): monotone submodular f with range {0, …, R} can be represented as a monotone pseudo-Boolean R-DNF with constants
Discrete submodularity
• Submodular f with range {0, …, R} can be represented as a pseudo-Boolean 2R-DNF with constants
• Hint [Lovász] (submodular monotonization): given submodular f, define
  f^mon(S) = min_{T : S ⊆ T} f(T)
  Then f^mon is monotone and submodular.
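The monotonization can be checked directly on a small example; a sketch using a concave-of-cardinality function (submodular but not monotone; the sequence g is a hypothetical instance):

```python
from itertools import product

n = 4
points = list(product([0, 1], repeat=n))
def leq(S, T): return all(a <= b for a, b in zip(S, T))

# g is concave, so f(S) = g[|S|] is submodular; g decreases at the end,
# so f is not monotone
g = [0, 2, 3, 3, 2]
f = {S: g[sum(S)] for S in points}

# Lovász monotonization: f_mon(S) = min over T ⊇ S of f(T)
f_mon = {S: min(f[T] for T in points if leq(S, T)) for S in points}

# f_mon is monotone ...
assert all(f_mon[S] <= f_mon[T] for S in points for T in points if leq(S, T))
# ... and still submodular
for S in points:
    for T in points:
        u = tuple(max(a, b) for a, b in zip(S, T))
        i = tuple(min(a, b) for a, b in zip(S, T))
        assert f_mon[S] + f_mon[T] >= f_mon[u] + f_mon[i]
print("f_mon is monotone and submodular")
```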
Proof
• We're done if we have a coverage {f_T}:
  1. All sets T have large size: |T| ≥ |X| − R
  2. For every S there exists T with f_T(S) = f(S)
  3. For every T, the restriction of f to T is monotone
• Every f_T is a monotone pB R-DNF, by (3)
• Add at most R negated variables to every clause to restrict f_T to T, by (1)
• The maximum of the resulting formulas computes f, by (2)
Proof
• Such a coverage may not exist => use a relaxation [GHRU'11]:
  – All sets have large size
  – For every S there exists a pair (T, T′)
  – The restriction of f to each T is monotone
Coverage by monotone lower bounds
• Let f_T^mon be defined as f_T^mon(S) = min_{S ⊆ S′ ⊆ T} f(S′):
  – f_T^mon is monotone submodular [Lovász]
  – For all S ⊆ T we have f_T^mon(S) ≤ f(S)
  – For every S there is a T with f_T^mon(S) = f(S)
• Hence f = max_T f_T^mon (where each f_T^mon is a monotone pB R-DNF)
Learning pB-formulas and k-DNF
• Let C = the class of pseudo-Boolean k-DNF with constants in {0, …, R}
• The i-slice of f is the Boolean function f_i(S) = 1 iff f(S) ≥ i
• f ∈ C iff its i-slices are k-DNF; then f(S) = max { i : f_i(S) = 1 } (0 if none)
• PAC-learning:
  Pr_{rand(A)} [ Pr_{S∼U({0,1}^n)} [ A(S) = f(S) ] ≥ 1 − ε ] ≥ 1/2
• Learn every i-slice on a (1 − ε/R)-fraction of arguments => union bound over the R slices
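The slice decomposition can be illustrated on a toy pseudo-Boolean 2-DNF (a hypothetical example):

```python
from itertools import product

n, R = 3, 2
points = list(product([0, 1], repeat=n))

def f(x):
    # hypothetical pseudo-Boolean 2-DNF with constants in {0, ..., R}:
    # f = max(2 * (x0 AND x1), 1 * x2)
    return max(2 * (x[0] & x[1]), x[2])

def f_slice(i, x):
    # the i-slice is the Boolean indicator of f(x) >= i; here each slice
    # is itself a 2-DNF (e.g. slice 1 is (x0 AND x1) OR x2)
    return int(f(x) >= i)

# f is recovered from its slices: f(x) = max{ i : f_i(x) = 1 }, 0 if none
for x in points:
    assert f(x) == max((i for i in range(1, R + 1) if f_slice(i, x)), default=0)
print("f reconstructed from its", R, "slices")
```

Learning each slice separately and recombining by this max is exactly the reduction used above.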
Learning Fourier coefficients
• Learn each i-slice (a k-DNF) on a (1 − ε/R)-fraction of arguments
• Fourier sparsity = # of largest Fourier coefficients sufficient to PAC-learn every k-DNF
• [Mansour]: the Fourier sparsity of a k-DNF doesn't depend on n!
  – Kushilevitz-Mansour (Goldreich-Levin): finds the large coefficients with membership queries
  – "Attribute-efficient learning": reduces the number of queries
  – Lower bound: learning a random k-junta (a k-DNF) up to constant precision requires many queries
• Optimizations: do all R iterations of KM/GL in parallel by reusing queries
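The basic primitive behind these algorithms is estimating a single Fourier coefficient from uniform samples; a sketch (the k-DNF is a hypothetical example, and this is only the estimation step, not the full KM/GL search):

```python
import random
from itertools import product

n = 8
rng = random.Random(0)

def f(x):
    # a small k-DNF in the +/-1 convention used for Fourier analysis:
    # f = 1 on (x0 AND x1) OR x3, else -1
    return 1 if (x[0] and x[1]) or x[3] else -1

def chi(S, x):
    # parity character: chi_S(x) = (-1)^(sum of x_i over i in S)
    return (-1) ** sum(x[i] for i in S)

def estimate_coeff(S, samples=20000):
    # \hat f(S) = E_x[f(x) * chi_S(x)], estimated from uniform samples
    total = 0
    for _ in range(samples):
        x = tuple(rng.randint(0, 1) for _ in range(n))
        total += f(x) * chi(S, x)
    return total / samples

# exact coefficient by full enumeration, for comparison
exact = sum(f(x) * chi((3,), x) for x in product([0, 1], repeat=n)) / 2 ** n
print(round(exact, 3))  # -0.75
```

With 20000 samples the standard error is about 0.005, so the estimate lands well within 0.05 of the exact value.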
Property testing
• Let C be the class of submodular functions
• How to (approximately) test whether a given f is in C?
• Property tester: a (randomized) algorithm for distinguishing:
  1. f ∈ C (C-close)
  2. f is ε-far from C: f differs from every g ∈ C on at least an ε-fraction of arguments
• Key idea: k-DNFs have small representations:
  – [Gopalan, Meka, Reingold CCC'12] (using quasi-sunflowers [Rossman'10]): for every k-DNF formula F there exists a k-DNF formula F′ of bounded size such that F′ is close to F
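For intuition, the distinguishing task can be illustrated by a naive one-sided tester that samples random pairs and looks for violations of submodularity; this is only a sketch, not the implicit-learning tester discussed below, and the example functions are illustrative:

```python
import random

def naive_submodularity_tester(f, n, trials=1000, seed=1):
    # Sample random pairs (S, T) and check f(S) + f(T) >= f(S|T) + f(S&T).
    # One-sided error: rejects only on a witnessed violation.
    rng = random.Random(seed)
    for _ in range(trials):
        S = tuple(rng.randint(0, 1) for _ in range(n))
        T = tuple(rng.randint(0, 1) for _ in range(n))
        u = tuple(max(a, b) for a, b in zip(S, T))  # union
        i = tuple(min(a, b) for a, b in zip(S, T))  # intersection
        if f(S) + f(T) < f(u) + f(i):
            return False  # reject: definitely not submodular
    return True  # accept

budget = lambda S: min(sum(S), 2)  # concave in |S|: submodular
parity = lambda S: sum(S) % 2      # has many submodularity violations

print(naive_submodularity_tester(budget, 6))  # True
print(naive_submodularity_tester(parity, 6))  # False
```

A random pair violates submodularity of the parity function with constant probability, so a modest number of trials finds a witness; the tester never rejects a genuinely submodular function.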
Testing by implicit learning
• Good approximation by juntas => efficient property testing [Diakonikolas, Lee, Matulef, Onak, Rubinfeld, Servedio, Wan]
  – Approximation by a junta
  – Good dependence on the parameters
• For submodular functions:
  – Query complexity independent of n!
  – Running time exponential in the junta size
  – Lower bound for testing k-DNF (reduction from Gap Set Intersection)
• [Blais, Onak, Servedio, Y.] exact characterization of submodular functions
Previous work on testing submodularity
[Parnas, Ron, Rubinfeld '03; Seshadhri, Vondrak, ICS'11]:
• Upper bound
• Lower bound
• Special case: coverage functions [Chakrabarty, Huang, ICALP'12]
• Gap in query complexity
Directions
• Close the gaps between upper and lower bounds; extend to more general learning/testing settings
• Connections to optimization?
• What if we use L_p distance between functions instead of Hamming distance in property testing? [Berman, Raskhodnikova, Y.]