Fast ALS-Based Matrix Factorization for Recommender Systems
David Zibriczky
LAWA Workpackage Meeting
16th January, 2013
Problem setting
Item Recommendation
• Classical item recommendation problem (cf. the Netflix Prize)
• Explicit feedback (ratings)

[Figure: one user's ratings for The Matrix, The Matrix 2, Twilight and The Matrix 3: one known rating (5), the rest unknown (?)]
Collaborative Filtering (Explicit)
• Classical item recommendation problem (cf. the Netflix Prize)
• Explicit feedback (ratings)
• Collaborative Filtering: estimates are based on the ratings of other users

[Figure: user-item rating matrix over the same movies, with known ratings (4s and 5s) from other users and unknown entries (?)]
Collaborative Filtering (Implicit)
• Items are not only movies (live content, products, holidays, …)
• Implicit feedback (buy, view, …)
• Less information about preferences

[Figure: user-item matrix over Item1-Item4 with implicit events and unknown entries (?)]
Industrial motivation
• Keeping the response time low
• User models must stay up to date; adaptation should be fast
• Items may change rapidly, so training time can become a bottleneck for live performance
• Increasing amount of data per customer → increasing training time
• Limited resources
Model
Preference Matrix
• Matrix representation
• Implicit feedback: assume positive preference, value = 1
• How to estimate the unknown preferences?
• Sorting items by estimated preference gives the item recommendation
R Item1 Item2 Item3 Item4
User1 1 ? ? ?
User2 ? ? 1 ?
User3 1 1 ? ?
User4 ? 1 ? 1
Matrix Factorization

R ≈ P Q^T,   r̂_ui = p_u^T q_i

R (N×M): preference matrix
P (N×K): user feature matrix
Q (M×K): item feature matrix
N: #users, M: #items, K: #features, with K ≪ N and K ≪ M
p_u := (P_u)^T (the u-th row of P as a column vector), q_i := (Q_i)^T (the i-th row of Q as a column vector)

[Figure: r̂_ui is the dot product of the row p_u^T of P and the column q_i of Q^T]
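To make the notation concrete, here is a minimal numpy sketch (toy values, chosen for illustration) of how P and Q produce an estimated preference for every user-item pair and a per-user item ranking:

```python
import numpy as np

# Toy dimensions: N=4 users, M=4 items, K=2 latent features (K << N, M in practice)
P = np.array([[-0.2,  0.6],
              [ 0.6,  0.4],
              [ 0.7,  0.2],
              [ 0.5, -0.2]])          # user feature matrix, N x K
Q = np.array([[ 0.1,  0.6],
              [-0.4,  0.7],
              [ 0.8, -0.7],
              [ 0.6, -0.2]])          # item feature matrix, M x K

R_hat = P @ Q.T                       # estimated preferences: R_hat[u, i] = p_u . q_i

# Recommendation for one user: sort items by estimated preference
u = 0
ranking = np.argsort(-R_hat[u])       # item indices, best first
```

Sorting the estimated row is exactly the "sorting items by estimation" step of the item recommendation.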
Objective Function
Preference Matrix
• Zero value for unknown preferences (zero examples); in practice there are many 0s and few 1s

R      Item1  Item2  Item3  Item4
User1    1      0      0      0
User2    0      0      1      0
User3    1      1      0      0
User4    0      1      0      1

• c_ui: confidence of a known feedback (constant, or a function of the context of the event)
• Zero examples are less important than positive ones, but still matter
Confidence Matrix

R      Item1  Item2  Item3  Item4
User1    1      0      0      0
User2    0      0      1      0
User3    1      1      0      0
User4    0      1      0      1

C      Item1  Item2  Item3  Item4
User1   c11     1      1      1
User2    1      1     c23     1
User3   c31    c32     1      1
User4    1     c42     1     c44
• Objective function: Weighted Sum of Squared Errors (with R and C as above)

f(P, Q) = WSSE = Σ_{(u,i)} c_ui (r_ui − r̂_ui)²

• Find the P and Q that minimize f
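Spelled out in code, the objective is a confidence-weighted squared loss over every cell, zero examples included (a sketch with an assumed confidence scheme: c_ui = 10 for positives, 1 for zero examples):

```python
import numpy as np

R = np.array([[1., 0, 0, 0],
              [0., 0, 1, 0],
              [1., 1, 0, 0],
              [0., 1, 0, 1]])          # implicit preference matrix (0 = zero example)
C = np.where(R > 0, 10.0, 1.0)         # assumed confidences: higher weight on known feedback

def wsse(P, Q, R, C):
    """Weighted sum of squared errors over ALL cells, zero examples included."""
    E = R - P @ Q.T                    # residuals r_ui - r_hat_ui
    return float(np.sum(C * E**2))

rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(4, 2))
Q = rng.normal(scale=0.1, size=(4, 2))
loss = wsse(P, Q, R, C)
```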
Optimizer
Optimizer – Alternating Least Squares
• Fix Q and solve for each user vector, then fix P and solve for each item vector (Ridge Regression):
  p_u = (Q^T C^u Q)^{-1} Q^T C^u r_u   (r_u: the u-th row of R; C^u: diagonal matrix of the confidences c_u1, …, c_uM)
  q_i = (P^T C^i P)^{-1} P^T C^i r_i   (r_i: the i-th column of R; C^i defined analogously)

Example snapshot during training (R as above):
P
 -0.2  0.6
  0.6  0.4
  0.7  0.2
  0.5 -0.2
Q^T
  0.1 -0.4  0.8  0.6
  0.6  0.7 -0.7 -0.2
Alternating the two steps, Q^T is re-solved first (e.g. to 0.3 -0.3 0.7 0.7 / 0.7 0.8 -0.5 -0.1), then P (e.g. to -0.2 0.7 / 0.6 0.5 / 0.8 0.2 / 0.6 -0.2), and so on until convergence.
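One user step of the ALS update can be sketched directly from the formula (toy values; no regularization term, matching the slide):

```python
import numpy as np

def als_user_step(Q, c_u, r_u):
    """One ALS user update: p_u = (Q^T C^u Q)^{-1} Q^T C^u r_u,
    where C^u = diag(c_u).  (Regularization omitted, as on the slide.)"""
    A = Q.T @ (c_u[:, None] * Q)       # Q^T C^u Q, a K x K matrix
    b = Q.T @ (c_u * r_u)              # Q^T C^u r_u, a K-vector
    return np.linalg.solve(A, b)

# Hypothetical example values
Q = np.array([[0.1, 0.6], [-0.4, 0.7], [0.8, -0.7], [0.6, -0.2]])  # item factors, M x K
r_u = np.array([1.0, 0, 0, 0])         # the user's row of R
c_u = np.array([10.0, 1, 1, 1])        # assumed confidences
p_u = als_user_step(Q, c_u, r_u)
```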
Optimizer – Alternating Least Squares
• Complexity of the naive solution: O(IK²NM + IK³(N + M))
  (E: number of examples, I: number of iterations)
• Improvement (Hu, Koren, Volinsky):
  Ridge Regression: p_u = (Q^T C^u Q)^{-1} Q^T C^u r_u
  Q^T C^u Q = Q^T Q + Q^T (C^u − I) Q = COV_Q^0 + COV_Q^+; the O(IK²NM) term is the costly one
  COV_Q^0 is user independent; it needs to be computed only once, at the start of each iteration
  Computing COV_Q^+ takes only #P(u)^+ steps
    #P(u)^+: number of positive examples of user u
  Complexity: O(IK²E + IK³(N + M)) = O(IK²(E + K(N + M)))
  Codename: IALS
• Complexity issues on large datasets:
  If K is low, O(IK²E) is dominant
  If K is high, O(IK³(N + M)) is dominant
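The speed-up rests on splitting Q^T C^u Q into a shared part and a small per-user correction. A sketch with hypothetical confidences (in the real IALS, COV_Q^0 is computed once per sweep rather than inside each call):

```python
import numpy as np

def ials_user_step(Q, pos_items, pos_conf):
    """User update with the Hu-Koren-Volinsky trick:
    Q^T C^u Q = Q^T Q + sum over positives of (c_ui - 1) q_i q_i^T = COV_Q^0 + COV_Q^+,
    so per user only the #P(u)+ positive items are touched."""
    COV0 = Q.T @ Q                       # user-independent part (shared within a sweep)
    A = COV0.copy()
    b = np.zeros(Q.shape[1])
    for i, c in zip(pos_items, pos_conf):
        q = Q[i]
        A += (c - 1.0) * np.outer(q, q)  # COV_Q^+ correction for one positive item
        b += c * q                       # Q^T C^u r_u, since r_ui = 1 on positives, 0 elsewhere
    return np.linalg.solve(A, b)

# Hypothetical example: one positive item (item 0) with confidence 10
Q = np.array([[0.1, 0.6], [-0.4, 0.7], [0.8, -0.7], [0.6, -0.2]])
p_u = ials_user_step(Q, pos_items=[0], pos_conf=[10.0])
```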
Problem: Complexity
Ridge Regression with Coordinate Descent
• Initialize p_u with zero values

Example (User1's row of R; K = 3):
R      Item1  Item2  Item3  Item4
User1    1      0      0      0
Q^T
  0.9 -0.4  0.8  0.6
  0.6  0.7 -0.7 -0.2
 -0.1 -0.4 -0.1  0.6
p_u = (0, 0, 0)
• Residual vector: e_u = r_u − p_u Q^T
• Optimize only one feature of p_u at a time:
  p_uk = (Σ_{i=1}^{M} c_ui q_ik e_ui) / (Σ_{i=1}^{M} c_ui q_ik q_ik) = SQE / SQQ
• After changing p_uk, update the residuals: e_ui ← e_ui − Δp_uk · q_ik
• Apply more iterations

First coordinate update: p_u = (0.51, 0, 0)
Updating the remaining coordinates and then iterating again, p_u evolves:
(0.51, 0, 0) → (0.51, 0.10, 0) → (0.51, 0.10, 0.08) → (0.47, 0.10, 0.08) → (0.46, 0.11, 0.07) → …
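The per-user coordinate descent loop can be sketched as follows (toy values; no regularization, matching the slide):

```python
import numpy as np

def rr_cd_user(Q, c_u, r_u, n_sweeps=3):
    """Ridge Regression for one user's p_u via Coordinate Descent:
    optimize one feature p_uk at a time as p_uk = SQE / SQQ,
    then update the residuals e_ui (regularization omitted, as on the slide)."""
    M, K = Q.shape
    p_u = np.zeros(K)                  # initialize with zero values
    e_u = r_u.astype(float).copy()     # residuals r_ui - r_hat_ui (p_u = 0, so e_u = r_u)
    for _ in range(n_sweeps):          # "apply more iterations"
        for k in range(K):
            q_k = Q[:, k]
            sqe = np.sum(c_u * q_k * (e_u + p_u[k] * q_k))  # restore p_uk's own contribution
            sqq = np.sum(c_u * q_k * q_k)
            new = sqe / sqq
            e_u -= (new - p_u[k]) * q_k                     # e_ui <- e_ui - delta_p_uk * q_ik
            p_u[k] = new
    return p_u

# Hypothetical example: User1's row, confidence 10 on the single positive item
Q = np.array([[0.9, 0.6, -0.1], [-0.4, 0.7, -0.4], [0.8, -0.7, -0.1], [0.6, -0.2, 0.6]])
c_u = np.array([10.0, 1, 1, 1])
r_u = np.array([1.0, 0, 0, 0])
p_u = rr_cd_user(Q, c_u, r_u)
```

With enough sweeps the coordinate updates converge to the exact Ridge Regression solution.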
Optimizer – Coordinate Descent
• Apply Ridge Regression with Coordinate Descent alternately to the rows of P and the rows of Q

Example snapshot (R as above; the first coordinate of p_1 has just been updated):
Q^T
  0.1  0.4  1.1  0.6
  0.6  0.7  1.5  1.0
P
  0.3  0
  0    0
  0    0
  0    0
The coordinates of P are updated one by one across all users (reaching, e.g., 0.3 -0.1 / 0.1 -0.5 / -0.4 0.2 / 0.5 -0.4), then Q^T is recomputed the same way, coordinate by coordinate (reaching, e.g., 0.1 0.4 -0.1 0.2 / 0.6 0.7 0.8 0.5), and the alternation continues until convergence.
Optimizer – Coordinate Descent
• Complexity of the naive solution: O(IKNM)
• Ridge Regression with CD computes the features directly from the examples, so the covariance-precomputation trick cannot be applied here.
Optimizer – Coordinate Descent Improvement
• Synthetic examples (Pilászy, Zibriczky, Tikk)
• Solution of Ridge Regression with CD: p_uk = (Σ_{i=1}^{M} c_ui q_ik e_ui) / (Σ_{i=1}^{M} c_ui q_ik q_ik) = SQE / SQQ
• Precompute the statistics for a user who watched nothing (SQE^0 and SQQ^0)
• The solution is then computed incrementally: p_uk = SQE / SQQ = (SQE^0 + SQE^+) / (SQQ^0 + SQQ^+)   (M + #P(u)^+ steps)
• Eigenvalue decomposition: Q^T Q = S Λ S^T = (S Λ^{1/2})(Λ^{1/2} S^T) = G^T G, with G = Λ^{1/2} S^T
• The zero examples are compressed into K synthetic examples: Q (M×K) → G (K×K)
• SGG^0 = SQQ^0, but needs only K steps to compute: p_uk = (SGE^0 + SQE^+) / (SGG^0 + SQQ^+)   (K + #P(u)^+ steps)
• SGE^0 is computed the same way as SQE^0, but in K steps only
• Complexity: O(IK(E + KM + KN)) = O(IK(E + K(M + N)))
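The core algebraic trick, compressing the M zero examples into K synthetic ones, can be checked numerically (random toy Q; this shows only the identity Q^T Q = G^T G, not the full IALS1 update):

```python
import numpy as np

rng = np.random.default_rng(1)
M, K = 1000, 8
Q = rng.normal(size=(M, K))            # item feature matrix (hypothetical values)

# Eigenvalue decomposition of the K x K Gram matrix: Q^T Q = S Lambda S^T
lam, S = np.linalg.eigh(Q.T @ Q)       # Q^T Q is PSD, so all eigenvalues are >= 0
G = np.diag(np.sqrt(lam)) @ S.T        # K synthetic examples with G^T G = Q^T Q

# The K rows of G stand in for the M zero examples: any statistic of the form
# sum_i q_ik q_il over all items (e.g. SQQ^0) is recovered from G in K steps.
```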
Optimizer – Coordinate Descent
• Complexity of the naive solution: O(IKNM)
• Ridge Regression with CD computes the features directly from the examples; covariance precomputation cannot be applied, but
• Synthetic examples can
• Codename: IALS1
• Complexity reduction (IALS → IALS1): O(IK²(E + K(N + M))) → O(IK(E + K(M + N)))
• IALS1 requires a higher K for the same accuracy as IALS.
Optimizer – Coordinate Descent
…does it work in practice?
Comparison
• Average Rank Position (ARP) on a subset of a proprietary implicit feedback dataset; lower is better.
• IALS1 offers better time-accuracy tradeoffs, especially when K is large.

           IALS                IALS1
K      ARP     time (s)    ARP     time (s)
5      0.1903      153     0.1898      112
10     0.1578      254     0.1588      134
20     0.1427      644     0.1432      209
50     0.1334     2862     0.1344      525
100    0.1314    11441     0.1325     1361
250    0.1311    92944     0.1312     6651
500    N/A         N/A     0.1282    24697
1000   N/A         N/A     0.1242   104611

[Figure: ARP vs. training time (s, log scale) for IALS and IALS1]
Conclusion
• Explicit feedback is rarely provided, or not at all.
• Implicit feedback is more general.
• Alternating Least Squares has complexity issues.
• They can be solved efficiently by using approximation and synthetic examples.
• IALS1 offers better time-accuracy tradeoffs, especially when K is large.
• IALS is an approximation algorithm too, so why not make it even more approximate?
Other algorithms
Model – Tensor Factorization
• Users have different preferences during the day
• Split the events into time periods, e.g. period 1: 06:00-14:00, period 2: 14:00-22:00, period 3: 22:00-06:00
• This gives one sparse preference matrix per period, R1, R2, R3, over the same users and items

[Figure: the sparse preference matrices R1, R2, R3 built up period by period]
Model – Tensor Factorization

r̂_uit = Σ_k p_uk q_ik t_tk,   R = P ∘ Q ∘ T

R (N×M×L): preference tensor
P (N×K): user feature matrix
Q (M×K): item feature matrix
T (L×K): time feature matrix
N: #users, M: #items, L: #time periods, K: #features

[Figure: r_uit expressed from row u of P, row i of Q and row t of T]
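The three-way model can be sketched with random toy factors; `np.einsum` evaluates r̂_uit = Σ_k p_uk q_ik t_tk for every cell at once:

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, L, K = 5, 6, 3, 2
P = rng.normal(size=(N, K))            # user features
Q = rng.normal(size=(M, K))            # item features
T = rng.normal(size=(L, K))            # time-period features

def predict(u, i, t):
    """r_hat_uit = sum_k p_uk * q_ik * t_tk."""
    return float(np.sum(P[u] * Q[i] * T[t]))

# Equivalently, the whole estimated tensor at once: R_hat = P o Q o T
R_hat = np.einsum('uk,ik,tk->uit', P, Q, T)
```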
Comparison – ITALS vs. IALS
• Datasets: Netflix (rating 5), IPTV provider VOD rentals, grocery purchases
• Evaluation metrics: Recall@20, Precision-Recall@20
• Number of features: 20

Test case (Recall@20)   IALS    ITALS
Netflix Probe           0.087   0.097
Netflix Time Split      0.054   0.071
IPTV VOD 1 day          0.063   0.112
IPTV VOD 1 week         0.055   0.100
Grocery                 0.065   0.103
Objective Function – Ranking-based objective function
• Ranking-based objective function approach:
  (r_ui − r_uj): difference of preference between items i and j
  (r̂_ui − r̂_uj): estimated difference of preference between items i and j
  s_j: importance of item j in the objective function
• Model: Matrix Factorization
• Optimizer: Alternating Least Squares
• Name: RankALS

f(θ) = Σ_{u∈U} Σ_{i∈I} c_ui Σ_{j∈I} s_j [(r_ui − r_uj) − (r̂_ui − r̂_uj)]²
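Written out naively, the ranking objective looks like this (toy confidences and uniform item importances s_j = 1 are assumptions; RankALS itself reorders the sums so this O(NM²) cost is never paid):

```python
import numpy as np

def ranking_objective(R, R_hat, C, s):
    """f(theta) = sum_u sum_i c_ui sum_j s_j [(r_ui - r_uj) - (r_hat_ui - r_hat_uj)]^2,
    spelled out naively in O(N M^2)."""
    total = 0.0
    N, M = R.shape
    for u in range(N):
        for i in range(M):
            diff = (R[u, i] - R[u]) - (R_hat[u, i] - R_hat[u])  # vector over j
            total += C[u, i] * float(np.sum(s * diff ** 2))
    return total

# Hypothetical example on the 4x4 implicit matrix from the earlier slides
R = np.array([[1.0, 0, 0, 0], [0, 0, 1, 0], [1, 1, 0, 0], [0, 1, 0, 1]])
C = np.where(R > 0, 10.0, 1.0)         # assumed confidences
s = np.ones(4)                         # uniform item importance (assumed)
R_hat = np.full((4, 4), 0.5)           # a constant predictor ranks nothing correctly
loss = ranking_objective(R, R_hat, C, s)
```

A constant predictor leaves all estimated pairwise differences at zero, so only the true differences contribute to the loss.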
Comparison – RankALS vs. IALS

[Figures: results comparing RankALS and IALS]
Related Publications
• Alternating Least Squares with Coordinate Descent:
  I. Pilászy, D. Zibriczky, D. Tikk: Fast ALS-based matrix factorization for explicit and implicit feedback datasets. RecSys 2010
• Tensor Factorization:
  B. Hidasi, D. Tikk: Fast ALS-based tensor factorization for context-aware recommendation from implicit feedback. ECML PKDD 2012
• Personalized Ranking:
  G. Takács, D. Tikk: Alternating least squares for personalized ranking. RecSys 2012
• IPTV Case Study:
  D. Zibriczky, B. Hidasi, Z. Petres, D. Tikk: Personalized recommendation of linear content on interactive TV platforms: beating the cold start and noisy implicit user feedback. TVMMP @ UMAP 2012