Deviation-Based Contextual SLIM Recommenders
Yong Zheng, Bamshad Mobasher, Robin Burke
DePaul University, Chicago, IL, USA
@CIKM 2014, Shanghai, China, Nov 4, 2014
Outline of the Talk
• Context-aware Recommender Systems (CARS)
• Collaborative Filtering and SLIM Recommenders
• CSLIM: Contextualizing SLIM Recommenders
• Experimental Evaluations
• Conclusions and Future Work
Traditional Recommender Systems (RS)
T1 T2 T3 T4 T5
U1 3 2
U2 3 3 4
U3 4 2 1
U4 2 5 5
U5 3 2 4 2
Example: User-Item 2D-Rating Matrix
Traditional Recommender: Users × Items Ratings
Context-aware RS (CARS)
Motivation: recommendation cannot ignore contexts, because users' preferences change from one context to another (e.g., time, location, companion).
Context-aware RS (CARS)
Example: User-Item Contextual Rating Matrix
In CARS: Users × Items × Contexts Ratings
Context-aware RS (CARS)
Example: User-Item Contextual Rating Matrix
Terminology:
Context dimension: time, location, companion
Context condition: the values in a specific dimension, e.g., weekend and weekday are two conditions in the context dimension "Time"
Context-aware RS (CARS)
Representational CARS (R-CARS):
Assuming there are known influential contextual variables available (e.g., location, time, mood, etc.), the problem is how to build CARS algorithms that adapt to users' preferences in different contextual situations.
Context-aware RS (CARS)
Most research in R-CARS focuses on developing context-aware collaborative filtering (CACF): CF + Contexts → CACF.
Collaborative Filtering (CF)
CF is one of the most popular recommendation algorithms.
1). Memory-based CF
Such as user-based CF and item-based CF
Pros: good for explanation; Cons: sparsity problems
2). Model-based CF
Such as matrix factorization, etc.
Pros: good performance; Cons: cold-start, explanation
3). Hybrid CF Recommendation Algorithms
Such as content-based hybrid CF, etc.
Pros: further improvement; Cons: running costs
Item-based CF (ItemKNN, Sarwar, 2001)
T1 T2 T3 T4 T5
U1 3 2
U2 3 3 ??? 4
U3 4 2 1
U4 2 5 5
U5 3 2 4 2
Rating Prediction:
$P_{u,i} = \dfrac{\sum_{j \in N_i} R_{u,j} \cdot sim(i,j)}{\sum_{j \in N_i} sim(i,j)}$
Cons: item-item similarity calculations and neighborhood selections rely on co-ratings.
What if the # of co-ratings is limited?
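The prediction rule above can be sketched in Python. This is a minimal illustration, assuming a precomputed item-item similarity matrix `sim` and a rating matrix `R` that uses 0 for missing ratings; `k` is the neighborhood size.

```python
import numpy as np

def itemknn_predict(R, u, i, sim, k=20):
    """Predict R[u, i] as a similarity-weighted average of the user's
    ratings on the k items most similar to item i (neighborhood N_i)."""
    rated = [j for j in range(R.shape[1]) if j != i and R[u, j] > 0]
    # keep the k rated items most similar to item i
    neighbors = sorted(rated, key=lambda j: sim[i, j], reverse=True)[:k]
    num = sum(R[u, j] * sim[i, j] for j in neighbors)
    den = sum(sim[i, j] for j in neighbors)
    return num / den if den > 0 else 0.0
```

Note how the neighborhood depends on co-ratings: if user u has rated few items similar to i, the prediction rests on very little evidence, which is exactly the weakness the slide points out.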
SLIM (Ning, et al., 2011)
The Sparse Linear Method (SLIM) can be viewed as another form of the collaborative filtering approach.
Ranking Score Prediction:
Matrix R = User-Item rating matrix; Matrix W = Item-Item coefficient matrix ≈ similarity matrix
We name this approach SLIM-I, since W represents item-item coefficients.
$S_{i,j} = R_{i,:} \cdot W_{:,j} = \sum_{h=1,\, h \neq j}^{N} R_{i,h} W_{h,j}$
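Given an already-learned coefficient matrix W, the ranking score is a simple masked dot product, as in this sketch (learning W itself, which SLIM does with a regularized least-squares objective, is omitted here):

```python
import numpy as np

def slim_score(R, W, i, j):
    """SLIM-I ranking score: S[i,j] = sum over h != j of R[i,h] * W[h,j].
    W is assumed to be an already-learned item-item coefficient matrix."""
    mask = np.ones(R.shape[1], dtype=bool)
    mask[j] = False                      # exclude item j itself (h != j)
    return float(R[i, mask] @ W[mask, j])
```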
Comparison Between ItemKNN & SLIM-I
Pros of SLIM-I: matrix W is learned directly against the prediction/ranking error; in other words, the item-item coefficient/similarity is no longer calculated from co-ratings, which makes it more reliable and allows it to be optimized directly for ranking.
SLIM-I has been demonstrated to outperform UserKNN, ItemKNN, matrix factorization and other traditional RS algorithms.
Rating Prediction in ItemKNN:
$P_{u,i} = \dfrac{\sum_{j \in N_i} R_{u,j} \cdot sim(i,j)}{\sum_{j \in N_i} sim(i,j)}$

Ranking Score Prediction in SLIM-I:
$S_{i,j} = R_{i,:} \cdot W_{:,j} = \sum_{h=1,\, h \neq j}^{N} R_{i,h} W_{h,j}$
SLIM-I and SLIM-U
SLIM-I is another form of ItemKNN; W = item-item coefficient matrix.
SLIM-U is another form of UserKNN; W = user-user coefficient matrix.
Outline of the Talk
• Context-aware Recommender Systems (CARS)
• Collaborative Filtering and SLIM Recommenders
• CSLIM: Contextualizing SLIM Recommenders
• Experimental Evaluations
• Conclusions and Future Work
我爱PPT中文网 整理 www.iloveppt.org
CSLIM: Contextual SLIM Recommenders
We use SLIM-I as an example to introduce how to build CSLIM-I approaches; contexts can also be incorporated into SLIM-U to formulate CSLIM-U models accordingly.
Ranking Prediction in SLIM-I:
$S_{i,j} = R_{i,:} \cdot W_{:,j} = \sum_{h=1,\, h \neq j}^{N} R_{i,h} W_{h,j}$

Incorporating contexts, CSLIM has a uniform ranking prediction:
$S_{i,j,c} = \sum_{h=1,\, h \neq j}^{N} R_{i,h,c} W_{h,j}$

CSLIM aggregates contextual ratings with item-item coefficients. There are two key points:
1). The ratings to be aggregated should be placed under the same context c;
2). Accordingly, W indicates coefficients under the same contexts.
CSLIM: Contextual SLIM Recommenders
The challenge is how to estimate $R_{i,h,c}$, since contextual ratings are usually sparse – it is not guaranteed that the same user has already rated other items in the same context c.
Ranking Prediction in CSLIM-I:
We used a deviation-based approach to estimate it.
Matrix R: user-item 2D rating matrix (non-contextual ratings)
Matrix W: item-item coefficient matrix
Matrix D: a matrix estimating rating deviations in contexts
Here, D is a CI matrix (rows are items, columns are contexts). This approach is named CSLIM-I-CI.
$S_{i,j,c} = \sum_{h=1,\, h \neq j}^{N} R_{i,h,c} W_{h,j}$
CSLIM: Contextual SLIM Recommenders
We used a deviation-based approach to estimate it.
Example: CSLIM-I-CI
R = non-contextual Rating Matrix
D = Contextual Rating Deviation Matrix
W = Item-item Coefficient Matrix
C = a binary context vector, as below
$R_{i,j,c} = R_{i,j} + \sum_{l=1}^{L} D_{j,l}\, c_l$

Weekend  Weekday  At Home  At Park
   1        0        0        1
We use this estimation even if we already know a real contextual rating in situation c, since we’d like to learn as many cells in D as possible.
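Putting the deviation estimate and the SLIM-I aggregation together gives the CSLIM-I-CI score. A minimal sketch, assuming R, D and W are already learned and `c` is the binary context vector described above:

```python
import numpy as np

def cslim_ici_score(R, D, W, c, i, j):
    """CSLIM-I-CI ranking score for user i, item j in context c:
    each contextual rating is estimated as R[i,h] + D[h,:] @ c
    (non-contextual rating plus the item's deviation under the active
    context conditions), then aggregated with W as in SLIM-I."""
    c = np.asarray(c, dtype=float)   # binary context vector, length L
    Rc = R[i, :] + D @ c             # estimated contextual ratings R[i,h,c]
    mask = np.ones(R.shape[1], dtype=bool)
    mask[j] = False                  # h != j
    return float(Rc[mask] @ W[mask, j])
```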
CSLIM: Contextual SLIM Recommenders
There are three ways to model contextual rating deviation (CRD) in D:
1). D is a CI matrix – assuming there is a CRD for each <item, context> pair
2). D is a CU matrix – assuming there is a CRD for each <user, context> pair
3). D is a vector – assuming the CRD depends only on the context
Incorporate contexts into SLIM-I: CSLIM-I-CI, CSLIM-I-CU, CSLIM-I-C;
Incorporate contexts into SLIM-U: CSLIM-U-CI, CSLIM-U-CU, CSLIM-U-C.
We have built six Deviation-based CSLIM models!!
Further Step: General CSLIM Approaches
Cons: CSLIM requires users' non-contextual ratings on items. If there are no such ratings, we proposed using the average of a user's contextual ratings on the item as a substitute, which was demonstrated to be feasible in our experiments.
However, we'd like to build more General CSLIM (GCSLIM) models that do not require non-contextual ratings.
Simply, we model matrix D as a CC matrix, where each cell in D represents the CRD between two contextual conditions. GCSLIM-I-CC can then estimate the deviation from one contextual rating to another contextual rating (same item but different contexts).
Further Step: General CSLIM Approaches
For example, we want to estimate R<u1, t1, {Weekday, At home}>, and we already know the rating R<u1, t1, {Weekend, At cinema}>. Matrix D helps us learn and estimate CRD(Weekday, Weekend) and CRD(At home, At cinema).

Therefore, R<u1, t1, {Weekday, At home}> = R<u1, t1, {Weekend, At cinema}> + CRD(Weekday, Weekend) + CRD(At home, At cinema)
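This estimation is simple arithmetic over learned deviations. A sketch with hypothetical CRD values (the dictionary below is illustrative only, not from the paper):

```python
def gcslim_cc_estimate(known_rating, crd, target_conds, known_conds):
    """Estimate a contextual rating from a known rating of the same
    (user, item) pair in another context: add the learned CRD between
    each pair of corresponding conditions (cells of the CC matrix D)."""
    return known_rating + sum(
        crd[(t, k)] for t, k in zip(target_conds, known_conds) if t != k
    )

# hypothetical learned deviations (illustrative values)
crd = {("Weekday", "Weekend"): -0.4, ("At home", "At cinema"): -0.6}
est = gcslim_cc_estimate(
    4.0, crd, ("Weekday", "At home"), ("Weekend", "At cinema"))
```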
Similarly, matrix D can be paired with users or items; e.g., we may assume the CRD between contexts differs from user to user.
Further Step: General CSLIM Approaches
Two challenges in GCSLIM approaches:
1). For each <user, item> pair, there could be several ratings for this pair but in different contexts. Which contextual rating should be applied?
If we use all those ratings → increased computational costs. If we select just one of them, there are three strategies: MostSimilar, LeastSimilar and Random; our experiments showed we can simply pick one at random. See our papers for more details.
Further Step: General CSLIM Approaches
Two challenges in GCSLIM approaches:
2). How to couple matrix D with the user or item dimension?
Assigning a separate D to each user/item → increased computational costs.
Solution: we can cluster users/items into small groups and assume the users/items in the same group share the same matrix D.
We will explore this attempt in our future work.
Data Sets
The current situation in the CARS research domain:
1). The number of data sets is limited;
2). The data is either small or sparse;
3). There are no large data sets, or larger ones are not publicly accessible. Most data were collected from surveys.
All the data sets used can be found here: http://tiny.cc/contextdata
Due to limited time, we only present results based on the restaurant and music data in these slides. See more results in our CIKM paper.
Baseline Approaches
We chose state-of-the-art CACF algorithms as baselines:
1). Differential Context Modeling (DCM): DCM incorporates contexts into UserKNN/ItemKNN, but it suffers from the sparsity problem and performs the worst in terms of precision, recall and MAP.
2). Context-aware Splitting Approaches (CASA): CASA is a contextual transformation approach, where contextual data are converted to a 2D user-item rating matrix, so that a traditional approach (MF in this case) can be applied to the transformed data.
3). Context-aware Matrix Factorization (CAMF): CAMF incorporates contexts into MF, where the CRD is modeled in a similar way as in CSLIM.
4). Tensor Factorization (TF): TF is an independent context-aware algorithm, since contexts are assumed to be independent of the user and item dimensions. TF's computational cost grows as the number of contexts increases.
Evaluation Protocols
1). 5-fold Cross-validation
All algorithms were run on the same 5 folds of the data.
2). Top-N Recommendation Evaluations
Metrics: Precision, Recall and MAP (Mean Average Precision).
Precision and Recall measure accuracy; MAP measures the positions of relevant items in the rankings.
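The ranking metric can be computed per user as average precision over the top-N list, then averaged across users to get MAP. A minimal sketch of one common AP variant (dividing by the number of relevant items):

```python
def average_precision(ranked, relevant):
    """AP for one user's top-N list: mean of precision@k taken at the
    rank of each relevant (hit) item."""
    hits, total = 0, 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(lists):
    """MAP: average AP over all users; `lists` is a sequence of
    (ranked_items, relevant_set) pairs."""
    return sum(average_precision(r, rel) for r, rel in lists) / len(lists)
```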
Research Questions:
1). Does CSLIM outperform the state-of-the-art CARS algorithms?
2). How about GCSLIM? Is it better than CSLIM?
3). With so many CSLIM algorithms, are there guidelines to pre-select the appropriate one?
Evaluation Results
There are two pieces in a CSLIM algorithm's name; for example, CSLIM-I-CI:
1). CSLIM-I indicates we perform an ItemKNN-style CF approach;
2). -CI indicates we model the CRD as a CI matrix.
Questions:
1). Should CSLIM-I/ItemKNN or CSLIM-U/UserKNN be used?
A: it depends on the average number of ratings on items versus the average number of ratings by users.
2). Should -CI, -CU or -C be applied?
A: it depends on whether contexts are more dependent on users or on items.
For more details, see our CIKM paper.
Evaluation Results
How about the running efficiency? Typically, in CSLIM and GCSLIM, the matrices D and W must be learned during training. There are different challenges:
1). Large number of users/items/ratings
In this case, the non-contextual rating matrix R (or the rating space P) will be very large, as will the matrix W.
Solution: adopt a KNN strategy – do not use all the ratings, but only the top-N neighbors (items or users).
2). Large scale of contexts
What if there are tons of contextual conditions? Usually, in the CARS domain, the number of contextual dimensions is within 10, and the number of contextual conditions is 100 at most.
Solution: there are many ways to pre-select influential contexts, which helps reduce the number of contexts.
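The KNN strategy for large rating matrices can be sketched as truncating the learned coefficient matrix, keeping only the top-N neighbors per column. This is an illustrative post-hoc truncation under that assumption; in practice the restriction could also be built into training.

```python
import numpy as np

def keep_topn_per_column(W, n):
    """KNN-style truncation: keep only the n largest coefficients in
    each column of W (the n nearest neighbors) and zero the rest,
    which reduces the cost of the score-aggregation step."""
    W = W.copy()
    for j in range(W.shape[1]):
        col = W[:, j]
        if np.count_nonzero(col) > n:
            cutoff = np.sort(col)[-n]   # n-th largest value in the column
            col[col < cutoff] = 0.0     # zero everything below it
    return W
```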
Conclusions
1). CSLIM has been demonstrated to outperform the state-of-the-art CARS algorithms;
2). GCSLIM sometimes contributes further improvements, but it is not guaranteed that GCSLIM always beats CSLIM – it depends on how sparse the contextual ratings are;
3). We identified influential factors and rules for selecting the appropriate CSLIM algorithm in advance.
Future Work
1). Examine CSLIM and GCSLIM on larger data sets;
2). Compare against more models, e.g., factorization machines;
3). Couple the CC matrix with users/items in the GCSLIM approach;
4). Incorporate contexts into matrix W instead of adding the matrix D.
Deviation-Based Contextual SLIM Recommenders
Yong Zheng, Bamshad Mobasher, Robin Burke
DePaul University, Chicago, IL, USA
@CIKM 2014, Shanghai, China, Nov 4, 2014
Thanks!
Questions?