Determinantal Point Processes
machinelearning.math.rs/Zecevic-DPP.pdf · 2019-10-23
Determinantal Point Processes (DPPs)
Problem of interest: how can we select a subset of a given set that has both good quality and good diversity?
Sources:
Alex Kulesza & Ben Taskar, 2012: Determinantal Point Processes for Machine Learning
Alexei Borodin, 2009: Determinantal Point Processes
Motivation - Information Retrieval
Motivation - Recommender Systems
Motivation - Extractive Text Summarization
Discrete Point Processes
● Number of items (videos, sentences, …): N
● Ground set: the index set 𝒴 = {1, 2, …, N}
● Number of subsets: 2^N
● Point process: a probability measure P over the 2^N subsets of the ground set
Discrete Point Processes - Independent Point Process
Each element i is included independently with probability p_i:
P(Y) = ∏_{i∈Y} p_i · ∏_{i∉Y} (1 − p_i)
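As a point of comparison before DPPs, the independent process can be sketched in a few lines of NumPy (a minimal sketch; the function names are mine, not from the slides):

```python
import numpy as np

def sample_independent(p, rng=None):
    """Draw Y by including each element i independently with probability p[i]."""
    rng = np.random.default_rng(rng)
    p = np.asarray(p, dtype=float)
    return set(np.flatnonzero(rng.random(p.size) < p))

def prob_independent(p, Y):
    """P(Y) = prod_{i in Y} p_i * prod_{i not in Y} (1 - p_i)."""
    p = np.asarray(p, dtype=float)
    mask = np.zeros(p.size, dtype=bool)
    mask[list(Y)] = True
    return float(np.prod(np.where(mask, p, 1.0 - p)))
```

Note that such a process cannot express any interaction between items, which is exactly what the DPP below adds.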
DPP - marginal kernel
Let K = [K_ij] be an N x N real, symmetric matrix with the properties:
- K is positive semidefinite: x^T K x ≥ 0 for all x ∈ R^N (equivalently, all principal minors of K are non-negative; equivalently, all eigenvalues are non-negative)
- all eigenvalues of K are bounded above by 1
Such a matrix K is called a marginal kernel.
Intuition: K_ij represents the similarity of items i and j.
DPP - marginal kernel
If A ⊆ 𝒴, then K_A = [K_ij]_{i,j∈A} is the submatrix of K with rows and columns indexed by the elements of A.
DPP - definition
A point process P is called a determinantal point process if, for a random subset Y drawn according to P, for every A ⊆ 𝒴 it holds that
P(A ⊆ Y) = det(K_A)
Notation: Y ~ DPP(K)
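Under this definition, marginal probabilities come straight from principal minors of K; a minimal sketch (the kernel values are illustrative, chosen so the eigenvalues 0.3 and 0.7 lie in [0, 1]):

```python
import numpy as np

def marginal(K, A):
    """P(A ⊆ Y) = det(K_A) for Y ~ DPP(K)."""
    A = list(A)
    return float(np.linalg.det(K[np.ix_(A, A)]))

# A valid marginal kernel: real, symmetric, eigenvalues in [0, 1]
K = np.array([[0.5, 0.2],
              [0.2, 0.5]])
```

For example, `marginal(K, [0])` is just K_00 = 0.5, while `marginal(K, [0, 1])` is the 2x2 determinant 0.5 · 0.5 − 0.2 · 0.2 = 0.21.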
DPP - diversification
For A = {i}: P(i ∈ Y) = K_ii
For A = {i, j}: P(i, j ∈ Y) = K_ii K_jj − K_ij^2 = P(i ∈ Y) P(j ∈ Y) − K_ij^2
The off-diagonal entries of K make similar items less likely to co-occur: DPPs have the ability to diversify!
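The repulsion is easy to see numerically: the off-diagonal entry can only lower the pair marginal below the product of the singleton marginals (kernel values chosen for illustration):

```python
import numpy as np

K = np.array([[0.5, 0.3],
              [0.3, 0.5]])    # eigenvalues 0.2 and 0.8: a valid marginal kernel
p_i, p_j = K[0, 0], K[1, 1]   # singleton marginals P(i in Y), P(j in Y)
p_ij = np.linalg.det(K)       # pair marginal: K_ii K_jj - K_ij^2 = 0.16
# 0.16 < 0.25 = p_i * p_j: inclusion of i and j is negatively correlated
```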
DPP properties
Restriction: if Y is distributed as a DPP with marginal kernel K and A ⊆ 𝒴, then Y ∩ A is a DPP with marginal kernel K_A.
Complement: if Y is distributed as a DPP with marginal kernel K, then 𝒴 \ Y is distributed as a DPP with marginal kernel I − K.
Scaling: if K = ɣK′ for some 0 ≤ ɣ < 1, then for all A ⊆ 𝒴 we have det(K_A) = ɣ^|A| det(K′_A).
L-ensemble
Let L = [L_ij] be an N x N real, symmetric positive semidefinite matrix.
An L-ensemble is a point process that satisfies
P_L(Y = A) ∝ det(L_A)
Normalization constant: Σ_{A⊆𝒴} det(L_A) = det(L + I), so for every A ⊆ 𝒴 we have
P_L(Y = A) = det(L_A) / det(L + I)
Borodin & Rains, 2005
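The normalization identity Σ_{A⊆𝒴} det(L_A) = det(L + I) can be checked by brute force on a small kernel (matrix values are illustrative; the empty set contributes the determinant of a 0x0 matrix, which is 1):

```python
import numpy as np
from itertools import combinations

def sum_principal_minors(L):
    """Brute-force sum of det(L_A) over all subsets A of the ground set."""
    N = L.shape[0]
    return sum(
        np.linalg.det(L[np.ix_(list(A), list(A))])
        for k in range(N + 1)
        for A in combinations(range(N), k)
    )

L = np.array([[2.0, 0.5, 0.1],
              [0.5, 1.0, 0.3],
              [0.1, 0.3, 1.5]])
# sum_principal_minors(L) agrees with det(L + I)
```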
DPP and L-ensemble
● An L-ensemble is a DPP with marginal kernel K given by K = L(L + I)^{-1} = I − (L + I)^{-1}
● Not all DPPs are L-ensembles! When any eigenvalue of K achieves the upper bound of 1, the DPP is not an L-ensemble.
DPP and L-ensemble
If L = Σ_n λ_n v_n v_n^T is an eigendecomposition of L, then
K = Σ_n (λ_n / (1 + λ_n)) v_n v_n^T
is an eigendecomposition of K.
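The eigenvalue map λ → λ/(1 + λ) can be confirmed directly (a sketch on a random PSD kernel; this also shows why every eigenvalue of K lands in [0, 1)):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
L = B @ B.T                              # random symmetric PSD L-ensemble kernel
K = L @ np.linalg.inv(L + np.eye(4))     # marginal kernel K = L(L + I)^(-1)

lam = np.sort(np.linalg.eigvalsh(L))
mu = np.sort(np.linalg.eigvalsh(K))
# mu should equal lam / (1 + lam), entrywise
```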
Sampling Algorithm
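The algorithm on this slide is presumably the standard spectral DPP sampler (Hough et al., as presented by Kulesza & Taskar); a minimal NumPy sketch with my own function name:

```python
import numpy as np

def sample_dpp(L, rng=None):
    """Exact sample from the L-ensemble DPP with kernel L (spectral method)."""
    rng = np.random.default_rng(rng)
    lam, vecs = np.linalg.eigh(L)
    # Phase 1: keep eigenvector n independently with probability λ_n / (1 + λ_n)
    keep = rng.random(lam.size) < lam / (1.0 + lam)
    V = vecs[:, keep]
    Y = []
    while V.shape[1] > 0:
        # Phase 2: pick item i with probability (1/|V|) Σ_v v_i^2
        p = np.sum(V**2, axis=1)
        i = int(rng.choice(p.size, p=p / p.sum()))
        Y.append(i)
        # Eliminate the component along e_i, then re-orthonormalize (Gram-Schmidt step)
        j = int(np.argmax(np.abs(V[i, :])))
        V = np.delete(V - np.outer(V[:, j] / V[i, j], V[i, :]), j, axis=1)
        if V.shape[1] > 0:
            V, _ = np.linalg.qr(V)
    return sorted(Y)
```

Sanity check: for L = 10·I each item is included independently with probability 10/11, so the average sample size over many draws should be close to 3 · 10/11 ≈ 2.73 for N = 3.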
Sampling Algorithm Complexity
Algorithm complexity: O(Nk^3), where k = |V| is the number of eigenvectors selected in the first phase
Gram-Schmidt orthonormalization: O(Nk^2)
Bottleneck: the eigendecomposition of L, with complexity O(N^3)
There exist exact and approximate DPP samplers with lower complexity (Dereziński et al., 2019).
Sampling Algorithm - Visualization
NP-Hardness
Ko et al. (1995): finding the set Y that maximizes det(L_Y) is NP-hard.
However, log det(L_Y) is a submodular function, so it can be approximately maximized in polynomial time.
Submodularity
A set function f is submodular if it exhibits diminishing returns: for all A ⊆ B ⊆ 𝒴 and i ∉ B,
f(A ∪ {i}) − f(A) ≥ f(B ∪ {i}) − f(B)
Question of Diversity
L can be decomposed as L = B^T B for some D x N matrix B with D ≤ N. Let B_i be the i-th column of B; then L_ij = B_i^T B_j, and det(L_Y) is the squared volume of the parallelepiped spanned by the vectors {B_i : i ∈ Y}.
Question of Quality
The columns of B are vectors that represent the items of the ground set 𝒴.
Question of Quality
Write each column as B_i = q_i φ_i, where q_i ≥ 0 is a quality score and φ_i is a unit-norm diversity feature vector.
Similarity matrix S: S_ij = φ_i^T φ_j, the similarity between items i and j, so that L_ij = q_i S_ij q_j.
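The decomposition is easy to assemble; a minimal sketch (the quality and similarity numbers are illustrative):

```python
import numpy as np

def build_L(q, S):
    """L_ij = q_i * S_ij * q_j from quality scores q and similarity matrix S."""
    q = np.asarray(q, dtype=float)
    return np.outer(q, q) * S

q = [2.0, 3.0]                 # item quality scores
S = np.array([[1.0, 0.5],
              [0.5, 1.0]])     # unit-diagonal similarity matrix
L = build_L(q, S)              # [[4, 3], [3, 9]]
```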
Question of Quality and Diversity
With L_ij = q_i S_ij q_j, the probability factors as P_L(Y) ∝ (∏_{i∈Y} q_i^2) · det(S_Y): the product rewards quality, the determinant rewards diversity.
Decomposition for large sets
Dual representation: C = BB^T is a D x D real, symmetric positive semidefinite matrix. It shares its nonzero eigenvalues with L = B^T B, so when D ≪ N it is much cheaper to work with.
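The duality rests on the fact that B^T B and BB^T share their nonzero spectrum, so in particular det(L + I_N) = det(C + I_D) (the Weinstein-Aronszajn / Sylvester determinant identity); a quick numerical check with random features:

```python
import numpy as np

rng = np.random.default_rng(0)
D, N = 3, 6
B = rng.standard_normal((D, N))   # one D-dimensional feature column per item
L = B.T @ B                       # N x N primal kernel
C = B @ B.T                       # D x D dual kernel
# det(L + I_N) == det(C + I_D), and the top D eigenvalues of L equal those of C
```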
Decomposition for large sets
Random projection: the dimension D of the diversity features can be large. Idea: project the diversity vectors to a space of low dimension d ≪ D.
There are theoretical guarantees (of Johnson-Lindenstrauss type) that random projection approximately preserves distances.
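A minimal sketch of such a projection, assuming a Gaussian projection matrix (one common choice; the dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
D, d, N = 1000, 100, 20
B = rng.standard_normal((D, N))                  # high-dimensional diversity features
G = rng.standard_normal((d, D)) / np.sqrt(d)     # scaled Gaussian random projection
Bp = G @ B                                       # d-dimensional projected features

# pairwise distances are approximately preserved
orig = np.linalg.norm(B[:, 0] - B[:, 1])
proj = np.linalg.norm(Bp[:, 0] - Bp[:, 1])
# proj / orig is close to 1
```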
Learning
Training data: example pairs of an input X and a selected subset Y.
We assume that the DPP kernel L(X; θ) is parametrized in terms of a generic parameter θ that reflects quality and/or diversity properties.
Learning
A conditional DPP is a conditional probabilistic model which assigns a probability to every possible subset Y ⊆ 𝒴(X). The model takes the form of an L-ensemble,
P(Y | X) ∝ det(L_Y(X)),
where L(X) is a positive semidefinite matrix that depends on the input X.
Learning
Goal: choose the parameter θ to maximize the conditional log-likelihood of the training data (so-called Maximum Likelihood Estimation, MLE):
θ* = argmax_θ Σ_t log P_θ(Y_t | X_t),
where P_θ(Y | X) is the conditional probability of an output Y given input X under parameter θ.
Learning the Parameters of DPP Kernels
● It is conjectured that MLE for DPPs is NP-hard to compute (Kulesza, 2012)
○ non-convex optimization
○ non-convexity holds even under various simplified assumptions on the form of L
○ approximating the mode of size k of a DPP to within a c^k factor (c > 1) is known to be NP-hard
● Special cases of quality or similarity functions:
○ Gaussian similarity with uniform quality
○ Gaussian similarity with Gaussian quality
○ Polynomial similarity with uniform quality
● Nelder-Mead simplex algorithm: does not require explicit knowledge of the derivatives of the log-likelihood function, but there are no theoretical guarantees of convergence to a stationary point.
Learning the Parameters of DPP Kernels
Heuristics:
○ Expectation-Maximization (Gillenwater et al., 2014)
○ MCMC (Affandi et al., 2014)
○ Fixed-point algorithms (Mariet & Sra, 2015)
Extractive Document Summarization
Goal: learn a DPP that models good summaries Y for a given input X.
Kulesza & Taskar, 2012
Extractive Document Summarization
Similarity features: sentences are represented as normalized tf-idf vectors.
Similarity function: similarity among sentences, computed from the normalized tf-idf vectors.
Extractive Document Summarization
Quality scores: log-linear in the features, log q_i(X) ∝ θ^T f_i(X), where f_i(X) is a feature vector for the sentence i and θ is the parameter vector.
Quality features: length, position of the sentence in its original document, mean cluster similarity, personal pronouns, LexRank score, …
Extractive Document Summarization
Quality and similarity are combined in the form of a conditional DPP probability:
P_θ(Y | X) ∝ (∏_{i∈Y} q_i^2(X)) · det(S_Y(X))
It holds that the log-likelihood is concave in θ.
Its gradient can be computed exactly, so gradient-based optimization applies.
Extractive Document Summarization
Learning: the quality parameters θ are fit by gradient ascent on the concave log-likelihood.
Extractive Document Summarization
Inference: for a given X and learned parameter θ, find the best Y with at most b characters.
Maximum a posteriori: Y* = argmax_Y P_θ(Y | X) subject to the length constraint.
The objective log P_θ(Y | X) is submodular, so we can approximate Y* with a simple greedy algorithm.
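A minimal version of that greedy step, without the character-budget bookkeeping (which would just skip sentences that overflow b); the function name is mine:

```python
import numpy as np

def greedy_map(L, k):
    """Greedily add, up to k times, the item that most increases det(L_Y)."""
    Y = []
    for _ in range(k):
        best_j, best_det = None, -np.inf
        for j in range(L.shape[0]):
            if j in Y:
                continue
            idx = Y + [j]
            d = np.linalg.det(L[np.ix_(idx, idx)])
            if d > best_det:
                best_det, best_j = d, j
        Y.append(best_j)
    return Y
```

For example, with items b_0 = (2, 0), b_1 = (1, 0), b_2 = (0, 1) and L = B^T B, the greedy picks the high-quality item 0 first, then skips the near-duplicate item 1 in favor of the orthogonal item 2.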
Diversified Recommendation
● Content is presented in the form of a feed: an ordered list of items through which the user browses.
● The goodness of the recommendation is measured via a utility function.
● Goal: select and order a set of k items such that the utility of the set is maximized.
Gillenwater et al., 2018
k-DPP
A k-DPP on a discrete set 𝒴 is a distribution over all subsets with cardinality k:
P_L^k(Y) = det(L_Y) / e_k(λ_1, …, λ_N)
The normalization constant is e_k, the k-th elementary symmetric polynomial over the eigenvalues λ_1, …, λ_N of the matrix L.
k-DPP
For N = 3 the elementary symmetric polynomials are:
e_1 = λ_1 + λ_2 + λ_3
e_2 = λ_1λ_2 + λ_1λ_3 + λ_2λ_3
e_3 = λ_1λ_2λ_3
There is an efficient recursive algorithm for computing elementary symmetric polynomials.
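The recursion in question is presumably e_l(λ_1…λ_n) = e_l(λ_1…λ_{n-1}) + λ_n e_{l-1}(λ_1…λ_{n-1}); a minimal sketch:

```python
import numpy as np

def elem_sympoly(lam, k):
    """E[l, n] = e_l(lam[0], …, lam[n-1]), filled by the recursion
    e_l(n) = e_l(n-1) + λ_n * e_{l-1}(n-1)."""
    N = len(lam)
    E = np.zeros((k + 1, N + 1))
    E[0, :] = 1.0          # e_0 is always 1
    for l in range(1, k + 1):
        for n in range(1, N + 1):
            E[l, n] = E[l, n - 1] + lam[n - 1] * E[l - 1, n - 1]
    return E
```

For λ = (1, 2, 3) this reproduces the N = 3 values above: e_1 = 6, e_2 = 11, e_3 = 6.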
k-DPP
Sampling: a k-DPP can be sampled like an ordinary DPP, except that the first phase selects exactly k eigenvectors, with probabilities computed from the elementary symmetric polynomials.
Diversified Recommendation
DPP inputs: personalized quality scores and pairwise item distances.
Diversified Recommendation
The observed interaction of a user u with a feed list of length N is given as a binary vector: y_u = [0, 1, 0, 1, 1, …, 0]
Goal: maximize the total number of interactions.
Slight modification, in order to train models from records of previous interactions: maximize the cumulative gain obtained by reranking the feed items, where j is the new rank that the model assigns to an item.
Diversified Recommendation
The interaction vector y_u = [0, 1, 0, 1, 1, …, 0] can be written as the set Y = {2, 4, 5}.
Assumption: Y represents a draw from the probability distribution defined by a user-specific DPP.
Diversified Recommendation
q_i: personalized quality score; D_ij: Jaccard distance built on video descriptions.
ɑ ∈ [0, 1) and σ are learned parameters.
Run a grid search to find the parameter values that maximize the cumulative gain.
Diversified Recommendation
f: parametrized quality function
g: parametrized content re-embedding function
ẟ: fixed regularization parameter
w: the parameters of L
Diversified Recommendation
Inference: a greedy algorithm for submodular maximization over a size-k window.
It runs k iterations, adding one video to Y in each iteration.
The greedy algorithm gives the natural order for the videos in the size-k window. Then the algorithm is repeated for the unused N − k videos.
DPP classes
● Discrete DPP
● k-DPP
● Continuous DPP
● Structured DPP (machine translation, bioinformatics, …)
● Sequential DPP (video summarization, …)
Code
Matlab: https://www.alexkulesza.com/code/dpp.tgz
Python: https://github.com/guilgautier/DPPy
Thank you!