![Page 1: Combining Content-based and Collaborative Filtering Department of Computer Science and Engineering, Slovak University of Technology polcicova@dcs.elf.stuba.sk.](https://reader035.fdocuments.in/reader035/viewer/2022072010/56649dd05503460f94ac5d15/html5/thumbnails/1.jpg)
Combining Content-based and Collaborative Filtering
Department of Computer Science and Engineering, Slovak University of Technology
Gabriela Polčicová, Pavol Návrat
Overview
• Information Filtering and its Types
• Combined Method
• Experiment with Information Filtering Methods
• Conclusions
Information Filtering (1)
– delivery of relevant information to the people who need it
• Types of Information Filtering
– Content-based - for textual documents
– Collaborative - for communities of users
• Interests
– information about interests - stored in profiles
– opinions about documents are expressed as ratings
• Ratings {i, j, rij}
– rij is the value of the rating given by user i to item j
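A minimal sketch of how the rating triples {i, j, rij} could be stored, assuming a nested dictionary keyed by user and then by item. The names `ratings`, `add_rating`, `unrated_items` and the sample values are illustrative and do not come from the slides.

```python
from collections import defaultdict

# ratings[user][item] = value; one entry per triple {i, j, rij}
ratings = defaultdict(dict)


def add_rating(user, item, value):
    """Store one rating triple (hypothetical helper)."""
    ratings[user][item] = value


def unrated_items(user, all_items):
    """Items the user has not rated yet, i.e. candidates for estimation."""
    rated = ratings.get(user, {})
    return [j for j in all_items if j not in rated]


# Made-up sample data
add_rating("u1", "movieA", 5)
add_rating("u1", "movieB", 2)
add_rating("u2", "movieA", 4)
```

The unrated items returned by `unrated_items` are the ones a filtering method would have to estimate.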
Information Filtering (2)
Filter
Learning interests
Estimating the value of rating
Choosing recommendations
Rated items {user, item, value}
Unrated items {user, item}
Recommendations {user, item, estimation}
Content-based Filtering (1)
• Basic idea
– recommending documents based on their content and properties
• Profile
– consists of keywords with assigned weights
– only documents matching profile are recommended
• Recommendations
– based on objective measurable properties
Content-based Filtering (2)
Documents rated by the user
Documents of interest
Documents unrated by the user
PROFILE: keywords, phrases with weights
Documents matching profile => recommended documents
Documents, ratings
Collaborative Filtering (1)
• Basic idea
– automating “word of mouth”
– leverage opinions of like-minded users while making decisions
• Schema
– collecting users’ opinions
– searching for like-minded users
– making recommendations
Collaborative Filtering (2)
Profile of current user
Profile of user 1
Profile of user 2
Profile of user 3
Profile of user 4
Profile of user 5
Documents from like-minded users' profiles => recommended documents
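Collaborative Filtering (3)
• Similarity measure: Pearson Correlation Coefficient
$$k_{ci} = \frac{\sum_{j \in I_{ci}} (r_{cj} - \bar{r}_c)(r_{ij} - \bar{r}_i)}{\sqrt{\sum_{j \in I_{ci}} (r_{cj} - \bar{r}_c)^2 \sum_{j \in I_{ci}} (r_{ij} - \bar{r}_i)^2}}$$
• Recommendations computation: weighted sum of ratings
$$\hat{r}_{cj} = \bar{r}_c + \frac{\sum_{i \in U_{cj}} (r_{ij} - \bar{r}_i)\, k_{ci}}{\sum_{i \in U_{cj}} |k_{ci}|}$$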
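The Pearson similarity and the weighted-sum estimate on this slide could be sketched as follows. The slide does not define its index sets, so this sketch assumes that Ici is the set of items rated by both users, that Ucj is the set of users who rated item j and have a defined correlation with the current user, and that each mean is taken over all ratings of the user. The function names are hypothetical.

```python
from math import sqrt


def mean_rating(user_ratings):
    """Mean over all ratings of one user (assumed meaning of the barred r)."""
    return sum(user_ratings.values()) / len(user_ratings)


def pearson(rc, ri):
    """k_ci over items rated by both users; None when it is undefined."""
    common = set(rc) & set(ri)          # assumed meaning of I_ci
    if len(common) < 2:
        return None
    mc, mi = mean_rating(rc), mean_rating(ri)
    num = sum((rc[j] - mc) * (ri[j] - mi) for j in common)
    den = sqrt(sum((rc[j] - mc) ** 2 for j in common)
               * sum((ri[j] - mi) ** 2 for j in common))
    return num / den if den else None


def predict(ratings, c, j):
    """Weighted sum of neighbours' deviations from their means, added to c's mean."""
    rc = ratings[c]
    num = den = 0.0
    for i, ri in ratings.items():
        if i == c or j not in ri:       # assumed U_cj: other users who rated j
            continue
        k = pearson(rc, ri)
        if k is None:
            continue
        num += (ri[j] - mean_rating(ri)) * k
        den += abs(k)
    if den == 0:
        return None                     # no usable neighbour: item is not covered
    return mean_rating(rc) + num / den
```

Returning `None` when no neighbour is usable keeps items without an estimate distinguishable from estimated ones.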
Combining Content-based and Collaborative Filtering (1)
• Computing estimates of missing ratings for each user with the Content-based Filtering method
• Searching for like-minded users
– computing coefficient kci between the current and the i-th user (from ratings only)
– computing coefficient kci’ between the current and the i-th user (from both ratings and estimates)
• New recommendations computation
– weighted sum of ratings and estimates: ratings are weighted by coefficients kci, estimates by coefficients kci’
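The first step above, filling in missing ratings with content-based estimates, might look like the sketch below. The callable `cbf_estimate` stands in for whatever content-based estimator is used; it and the function name `augment_with_estimates` are assumptions, not part of the slides.

```python
def augment_with_estimates(ratings, items, cbf_estimate):
    """Return per-user dicts holding real ratings plus CBF estimates.

    ratings:      {user: {item: rating}}
    items:        iterable of all item ids
    cbf_estimate: hypothetical callable (user, item) -> estimate or None
    """
    items = list(items)
    augmented = {}
    for user, rated in ratings.items():
        filled = dict(rated)                 # never overwrite a real rating
        for j in items:
            if j not in rated:
                est = cbf_estimate(user, j)
                if est is not None:          # CBF may not cover every item
                    filled[j] = est
        augmented[user] = filled
    return augmented
```

Under this reading, kci would be computed from `ratings` and kci’ from the augmented dictionaries.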
Datasets for Experiments
• Data:
– EachMovie - users' ratings for movies
www.research.digital.com/SRC/eachmovie/
– IMDB - textual information for CBF (movies' descriptions)
www.imdb.com/
• Datasets:
– A - ratings from the period up to Mar 1, 1996
(810 ratings from 71 users)
– B - ratings from the period up to Mar 15, 1996
(2407 ratings from 131 users)
– C - ratings from the period up to Apr 1, 1996
(12290 ratings from 651 users)
EachMovie Data and Constant Method
[Figure: bar chart "Percentage of ratings in EachMovie"; x-axis: ratings 1 to 6; y-axis: 0% to 45%; series: datasets A, B, C]
• Constant Method rcj = 5
Experiments with Combination of Content-based and Collaborative Filtering (2)
Dataset
• Divide dataset into training set (90%) and test set (10%)
• Apply filtering methods and evaluate their performance
– Content-based Filtering method: test, training sets => recommendations
– Collaborative Filtering method: test, training sets => recommendations
– Combined Filtering method: test, training sets => recommendations
– Constant method: test set => recommendations
• Evaluation of methods' performance
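A minimal sketch of the experimental loop, assuming a random 90%/10% split of the rating triples (the slides do not say how the split was drawn). The constant method returns 5 for every item, as defined in the deck; the names `split_ratings`, `constant_method` and `evaluate` are illustrative.

```python
import random


def split_ratings(triples, test_fraction=0.1, seed=0):
    """Shuffle (user, item, value) triples and split them into training and test sets."""
    rng = random.Random(seed)            # fixed seed keeps the split repeatable
    shuffled = list(triples)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def constant_method(user, item):
    """Constant baseline: every estimate is 5."""
    return 5


def evaluate(method, test):
    """Collect (estimate, actual) pairs for test items the method can estimate."""
    pairs = []
    for user, item, actual in test:
        est = method(user, item)
        if est is not None:              # None means the item is not covered
            pairs.append((est, actual))
    return pairs
```

Any other method can be plugged into `evaluate` as long as it is a callable taking a user and an item and returning an estimate or `None`.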
Metrics
• Coverage = percentage of items for which the method is able to compute estimates
• Accuracy = $\frac{|R \cap L| + |\bar{R} \cap \bar{L}|}{|L| + |\bar{L}|}$
• F-measure = $\frac{2 \cdot Precision \cdot Recall}{Precision + Recall}$
• NMAE = $\frac{\sum |r_{ij} - \hat{r}_{ij}|}{n \cdot s}$
Precision = $\frac{|R \cap L|}{|R|}$
Recall = $\frac{|R \cap L|}{|L|}$
R - set of recommended items
L - set of liked items
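The metrics above could be computed as in the sketch below, with R and L as Python sets. The slide does not define n and s; this sketch assumes n is the number of estimates and s is the width of the rating scale. All function names are illustrative.

```python
def precision(recommended, liked):
    """|R ∩ L| / |R|"""
    return len(recommended & liked) / len(recommended) if recommended else 0.0


def recall(recommended, liked):
    """|R ∩ L| / |L|"""
    return len(recommended & liked) / len(liked) if liked else 0.0


def f_measure(recommended, liked):
    """Harmonic mean of precision and recall."""
    p, r = precision(recommended, liked), recall(recommended, liked)
    return 2 * p * r / (p + r) if p + r else 0.0


def accuracy(recommended, liked, all_items):
    """Share of items correctly recommended or correctly not recommended."""
    not_recommended = all_items - recommended
    not_liked = all_items - liked
    hits = len(recommended & liked) + len(not_recommended & not_liked)
    return hits / len(all_items)         # |L| + |not L| equals the item count


def nmae(pairs, scale):
    """Mean absolute error of (estimate, actual) pairs divided by `scale` (assumed s)."""
    return sum(abs(actual - est) for est, actual in pairs) / (len(pairs) * scale)


def coverage(n_estimated, n_items):
    """Fraction of items for which an estimate could be computed."""
    return n_estimated / n_items
```

How a rating is turned into "liked" or "recommended" (for example, a threshold on the rating scale) is not stated in the slides and is left to the caller here.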
Results of Experiments
[Figure: four bar charts comparing the CF, CBF, combined and constant methods on datasets A, B, C. Panel titles in the transcript: Coverage, Accuracy, F-measure, F-measure]
Conclusions
• Combining content-based and collaborative filtering might help in the initial phase, when only a few ratings are available
Future work
• Weighting of coefficients
• Comparing the combined method with additional methods
Content-based Filtering - Vector Representation of Documents and Profiles
D = ( … , computer, … , learning, … , machine, … )
Document j: "computer machine learning"
W_j = (0, … , 0, 0.5, 0, … , 0, 0.3, 0, … , 0, 0.2, 0, … , 0), where the nonzero entries are the TF-IDF weights of the document's keywords
$$profile_i = \sum_{j=1}^{n} r_j \cdot w_{ij}$$
$$Sim(W, Profile) = \frac{W \cdot Profile}{|W| \cdot |Profile|}$$
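A small sketch of the vector representation on this slide, using sparse dictionaries instead of full-length vectors. The slides do not say which TF-IDF variant was used; this sketch assumes term frequency times log(N / document frequency), and all function names and sample documents are illustrative.

```python
from math import log, sqrt


def tf_idf_vectors(docs):
    """docs: {doc_id: non-empty list of keywords} -> {doc_id: {keyword: weight}}."""
    n = len(docs)
    df = {}
    for words in docs.values():
        for w in set(words):
            df[w] = df.get(w, 0) + 1     # document frequency of each keyword
    vectors = {}
    for doc_id, words in docs.items():
        vectors[doc_id] = {
            w: (words.count(w) / len(words)) * log(n / df[w])  # assumed variant
            for w in set(words)
        }
    return vectors


def build_profile(vectors, user_ratings):
    """profile_i = sum over rated documents j of r_j * w_ij."""
    profile = {}
    for doc_id, r in user_ratings.items():
        for w, weight in vectors[doc_id].items():
            profile[w] = profile.get(w, 0.0) + r * weight
    return profile


def cosine(vec, profile):
    """Normalized dot product of a document vector and a profile."""
    dot = sum(weight * profile.get(w, 0.0) for w, weight in vec.items())
    norm = (sqrt(sum(v * v for v in vec.values()))
            * sqrt(sum(v * v for v in profile.values())))
    return dot / norm if norm else 0.0


docs = {
    "d1": ["computer", "machine", "learning"],
    "d2": ["machine", "learning"],
    "d3": ["cooking", "recipes"],
}
vectors = tf_idf_vectors(docs)
profile = build_profile(vectors, {"d1": 5})   # the user rated only d1
```

With this data, `cosine(vectors["d2"], profile)` is positive because d2 shares keywords with the rated document, while d3 shares none and scores zero.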
Collaborative Filtering - Example
A B C D E F G
current 1 4 5
1 3 5 1 2
2 1 3 2 5
3 5 1 4 5
4 1 4 2 4
5 2 4 2 5
2
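Combining Content-based and Collaborative Filtering (2)
• Similarity measure: Pearson Correlation Coefficient
$$k_{ci} = \frac{\sum_{j \in I_{ci}} (r_{cj} - \bar{r}_c)(r_{ij} - \bar{r}_i)}{\sqrt{\sum_{j \in I_{ci}} (r_{cj} - \bar{r}_c)^2 \sum_{j \in I_{ci}} (r_{ij} - \bar{r}_i)^2}}$$
– $k'_{ci}$ has the same form, computed from ratings together with the CBF estimates $r^{CBF}$
• Recommendations computation: weighted sum of ratings and estimates
$$\hat{r}_{cj} = \bar{r}_c + \frac{\sum_{i \in U_{cj}} (r_{ij} - \bar{r}_i)\, k_{ci} + \sum_{i \in U'_{cj}} (r^{CBF}_{ij} - \bar{r}_i)\, k'_{ci}}{\sum_{i \in U_{cj}} |k_{ci}| + \sum_{i \in U'_{cj}} |k'_{ci}|}$$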
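One possible reading of the combined estimate is sketched below. It assumes that Ucj contains users who actually rated item j (weighted by kci, computed from ratings only), that U'cj contains users for whom only a CBF estimate of j exists (weighted by kci’, computed from ratings plus estimates), and that every user has at least one real rating. These interpretations and all names are assumptions, not taken from the slides.

```python
from math import sqrt


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(a, b):
    """Pearson correlation over the items present in both dicts; None if undefined."""
    common = set(a) & set(b)
    if len(common) < 2:
        return None
    ma, mb = mean(a.values()), mean(b.values())
    num = sum((a[j] - ma) * (b[j] - mb) for j in common)
    den = sqrt(sum((a[j] - ma) ** 2 for j in common)
               * sum((b[j] - mb) ** 2 for j in common))
    return num / den if den else None


def predict_combined(ratings, estimates, c, j):
    """ratings[u][j]: real ratings; estimates[u][j]: CBF estimates for unrated items."""
    num = den = 0.0
    for i in ratings:
        if i == c:
            continue
        if j in ratings[i]:
            # user i rated j: use the rating, weighted by k_ci (ratings only)
            k = pearson(ratings[c], ratings[i])
            value = ratings[i][j]
        elif j in estimates.get(i, {}):
            # only a CBF estimate exists: weight it by k'_ci (ratings + estimates)
            full_c = {**estimates.get(c, {}), **ratings[c]}
            full_i = {**estimates[i], **ratings[i]}
            k = pearson(full_c, full_i)
            value = estimates[i][j]
        else:
            continue
        if k is None:
            continue
        num += (value - mean(ratings[i].values())) * k
        den += abs(k)
    if den == 0:
        return None                      # neither ratings nor estimates help
    return mean(ratings[c].values()) + num / den
```

When `estimates` is empty, the function reduces to the plain collaborative weighted sum, so the CBF estimates only add neighbours that would otherwise be ignored.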
Experiments with Combination of Content-based and Collaborative Filtering (1)
• Content-based Filtering Method (CBF)
– documents and profiles: vector representation - weighted keywords (TF-IDF)
– estimation computation: normalized dot product of document and profile vectors
• Collaborative Filtering (CF)
– Pearson correlation coefficient
– weighted sum of ratings
• Combination of CF and CBF
– Pearson correlation coefficients
– weighted sum of ratings and CBF estimations
• Constant Method (rcj = 5)