Developing Recommendation Techniques for Scholarly Papers
Kazunari Sugiyama
National University of Singapore
Previous Research Topics
• Web Information Retrieval (@NAIST)
  • How to Characterize Web Pages
  • User Adaptive Information Retrieval
• Disambiguation (@TITECH)
  • Personal Name Disambiguation in Web Search Results
  • Word Sense Disambiguation in Japanese Texts
Scholarly Paper Recommendation (@NUS)
• Junior researchers: only one recently published paper, without citations
• Senior researchers: multiple published papers, with citation papers
User Profile Construction (Junior Researchers)
Weighting scheme: Cosine similarity
User Profile Construction (Senior Researchers)
Weighting scheme: Cosine similarity
Forgetting factor
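The following is a minimal sketch of how a senior researcher's profile could be assembled from past papers, with older papers counting less. The slide names a forgetting factor but does not give its form, so the exponential decay `exp(-gamma * age)` used here is an assumption, as are the names `senior_profile`, `rank_candidates`, and `gamma`. Ranking candidates by cosine similarity to the profile is also an assumption for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def senior_profile(papers, current_year, gamma=0.2):
    """Sum the feature vectors of past papers, down-weighting older ones.

    papers: list of (publication_year, feature_vector_dict).
    ASSUMPTION: the forgetting factor is exp(-gamma * age_in_years);
    the slides do not state the actual formula.
    """
    profile = {}
    for year, vec in papers:
        forgetting = math.exp(-gamma * (current_year - year))
        for term, w in vec.items():
            profile[term] = profile.get(term, 0.0) + forgetting * w
    return profile

def rank_candidates(profile, candidates):
    """Return (paper_id, score) pairs sorted by cosine similarity, best first."""
    scored = [(pid, cosine(profile, vec)) for pid, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For a junior researcher the same ranking step would apply, with the profile being the feature vector of the single most recent paper.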
Feature Vector Construction for Candidate Papers
• Basically, TF-IDF
• Also use information about citation and reference papers
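[Figure: candidate paper $p_{rec}$, a paper that cites it ($p_{c_1 \to p_{rec}}$), and a paper in its References ($p_{rec \to ref_1}$)]
$F^{p_{rec}} = f^{p_{rec}} + W^{p_{c_1 \to p_{rec}}} f^{p_{c_1 \to p_{rec}}} + W^{p_{rec \to ref_1}} f^{p_{rec \to ref_1}}$
Weighting scheme: Cosine similarity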
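As a rough illustration of the construction on this slide, the sketch below adds each citing and reference paper's vector to the candidate paper's own vector, weighted by cosine similarity. The function names and the dict-based sparse vectors are assumptions for illustration, and it assumes the similarity is taken between the candidate paper and each neighbouring paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def paper_feature_vector(f_p, citing_vectors, reference_vectors):
    """Combine a paper's own vector with its citation and reference papers.

    f_p: TF-IDF-style vector of the candidate paper (dict term -> weight).
    citing_vectors / reference_vectors: lists of such dicts.
    ASSUMPTION: each neighbour's weight W is its cosine similarity to f_p.
    """
    combined = dict(f_p)
    for f_n in list(citing_vectors) + list(reference_vectors):
        w = cosine(f_p, f_n)
        for term, value in f_n.items():
            combined[term] = combined.get(term, 0.0) + w * value
    return combined
```

A neighbour that shares no terms with the candidate paper gets weight 0 and therefore contributes nothing to the combined vector.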
Is Pruning of Citation and Reference Papers Effective?
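[Figure: target paper $p_i$, the papers in its References, and the papers that cite it, each labeled with a similarity score]
Reference papers: $p_{i \to ref_1}$ (sim: 0.18), $p_{i \to ref_2}$ (sim: 0.58), $p_{i \to ref_3}$ (sim: 0.22), $p_{i \to ref_4}$ (sim: 0.36), $p_{i \to ref_l}$ (sim: 0.45)
Citing papers: $p_{c_1 \to p_i}$ (sim: 0.32), $p_{c_2 \to p_i}$ (sim: 0.27), $p_{c_3 \to p_i}$ (sim: 0.42), $p_{c_4 \to p_i}$ (sim: 0.25), $p_{c_k \to p_i}$ (sim: 0.13)
Threshold: 0.3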
Is Pruning of Citation and Reference Papers Effective?
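[Figure: the same papers after pruning; only the papers whose similarity exceeds the threshold keep a weight W]
Reference papers: $p_{i \to ref_1}$ (sim: 0.18), $p_{i \to ref_2}$ (sim: 0.58), $p_{i \to ref_3}$ (sim: 0.22), $p_{i \to ref_4}$ (sim: 0.36), $p_{i \to ref_l}$ (sim: 0.45)
Citing papers: $p_{c_1 \to p_i}$ (sim: 0.33), $p_{c_2 \to p_i}$ (sim: 0.27), $p_{c_3 \to p_i}$ (sim: 0.42), $p_{c_4 \to p_i}$ (sim: 0.25), $p_{c_k \to p_i}$ (sim: 0.13)
Threshold: 0.3
Weights kept: $W^{p_{c_1 \to p_i}}$, $W^{p_{c_3 \to p_i}}$, $W^{p_{i \to ref_2}}$, $W^{p_{i \to ref_4}}$, $W^{p_{i \to ref_l}}$
Weighting scheme: Cosine similarity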
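The pruning step on this slide can be sketched as a simple filter over the similarity scores shown in the figure. The dictionary keys and the `prune` helper are illustrative names, and keeping papers at or above the threshold (rather than strictly above) is an assumption that makes no difference for these values.

```python
THRESHOLD = 0.3

# Similarity scores as shown on the slide.
reference_sims = {"ref_1": 0.18, "ref_2": 0.58, "ref_3": 0.22, "ref_4": 0.36, "ref_l": 0.45}
citation_sims = {"c_1": 0.33, "c_2": 0.27, "c_3": 0.42, "c_4": 0.25, "c_k": 0.13}

def prune(sims, threshold=THRESHOLD):
    """Keep only the neighbouring papers whose similarity reaches the threshold."""
    return {name: sim for name, sim in sims.items() if sim >= threshold}

kept_references = prune(reference_sims)  # ref_2, ref_4, ref_l
kept_citations = prune(citation_sims)    # c_1, c_3
```

The papers that survive match the ones that keep a weight W in the figure: two citing papers and three reference papers.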
Experiments
Experimental Data
• Researchers
  • 15 junior researchers
  • 13 senior researchers
  NLP and IR researchers who have publication lists in DBLP
• Candidate Papers to Recommend
  • ACL Anthology Reference Corpus
  Includes information about citation and reference papers
Junior Researchers
The most recent paper, with its reference papers pruned
[Figure: results in NDCG@5]
Pruning is effective!
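Since the results are reported in NDCG@5, a short sketch of that metric may help. This uses one common formulation of DCG (relevance divided by log2 of rank + 1); the slides do not say which variant was used, so treat the exact formula and the function names as assumptions.

```python
import math

def dcg_at_k(relevances, k=5):
    """Discounted cumulative gain over the top-k items of a ranked list."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))

def ndcg_at_k(relevances, k=5):
    """DCG normalised by the DCG of the ideal (descending) ordering.

    relevances: relevance judgments in the order the system ranked the papers.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that places every relevant paper first scores 1.0, and the score drops as relevant papers slip to lower ranks.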
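Senior Researchers
Past published papers with forgetting factor
[Figure: results in NDCG@5]
When the two parameters (symbols lost in extraction; a stray "d" survives) are small, the forgetting factor (FF) is effective!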
Extensions
• Characterize the target paper using potential papers
• Serendipitous recommendation
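Characterize the target paper using potential papers
[Figure: target paper $p_{tgt}$ and the papers that cite it, $p_{c_1 \to p_{tgt}}$ through $p_{c_k \to p_{tgt}}$, with year labels ('05), ('06), ('07), ('09); a callout marks a "Potential paper that should cite the target paper"]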
Finding potential papers with collaborative filtering
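[Matrix: rows are papers $p_1, p_2, p_3, \ldots, p_{tgt}, \ldots, p_{N-1}, p_N$; columns are citation papers $pc_1, pc_2, pc_3, \ldots, pc_i, \ldots, pc_{N-1}, pc_N$; the column position of each value is not recoverable]
$p_1$: 0.212, 0.735, 0.687
$p_2$: 0.656, 0.328, 0.436
$p_3$: 0.764, 0.527
$p_{tgt}$: 0.581, 0.330
$p_{N-1}$: 0.248
$p_N$: 0.654, 0.525
Other values shown: 0.536, 0.472, 0.368, 0.211
$P_i$ (i = 1, …, N): all papers in the dataset
$P_{cj}$ (j = 1, …, N): papers as citation papers in the dataset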
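To make the collaborative-filtering idea concrete, here is a toy sketch that fills in missing cells of a target paper's row from rows that resemble it. The slides do not specify the algorithm, so this neighbourhood-based scheme, the function names, and the small matrix in the usage note are all assumptions for illustration and are not the values from the slide.

```python
import math

def row_similarity(a, b):
    """Cosine similarity restricted to the columns both rows have values for."""
    shared = [col for col in a if col in b]
    if not shared:
        return 0.0
    dot = sum(a[col] * b[col] for col in shared)
    norm_a = math.sqrt(sum(a[col] ** 2 for col in shared))
    norm_b = math.sqrt(sum(b[col] ** 2 for col in shared))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_missing(matrix, target):
    """Estimate scores for the citation-paper columns the target row lacks.

    matrix: dict paper_id -> dict citation_paper_id -> score (sparse rows).
    Each prediction is a similarity-weighted average over the other rows.
    """
    target_row = matrix[target]
    numerators, denominators = {}, {}
    for paper_id, row in matrix.items():
        if paper_id == target:
            continue
        sim = row_similarity(target_row, row)
        if sim <= 0:
            continue
        for col, value in row.items():
            if col not in target_row:
                numerators[col] = numerators.get(col, 0.0) + sim * value
                denominators[col] = denominators.get(col, 0.0) + sim
    return {col: numerators[col] / denominators[col] for col in numerators}
```

For example, with the hypothetical rows `{"p_tgt": {"pc1": 0.5}, "p_1": {"pc1": 0.5, "pc2": 0.8}, "p_2": {"pc3": 0.4}}`, only `p_1` overlaps with the target, so the sketch predicts a score for `pc2` and nothing for `pc3`. Columns with high predicted scores would then be the candidates for potential papers.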
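Characterize the target paper using potential papers
[Figure: target paper $p_{tgt}$ with citing papers $p_{c_1 \to p_{tgt}}$ through $p_{c_k \to p_{tgt}}$ (year labels '05, '06, '07, '09), now joined by $p_{c_3 \to p_{tgt}}$ and $p_{c_N \to p_{tgt}}$; a callout marks a "Potential paper that should cite the target paper"]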
Serendipitous Recommendation
[Figure: User 1, User 2, User 3, ..., User n. Each user has a user profile generated from history of contents and a user profile for serendipitous recommendation; the latter is linked to other users, each labeled with a similarity and a weight of 1/(Sim + 1)]
User 4 (Sim: 0.16), Weight: 1/(0.16+1)
User 10 (Sim: 0.26), Weight: 1/(0.26+1)
User 5 (Sim: 0.21), Weight: 1/(0.21+1)
User 1 (Sim: 0.32), Weight: 1/(0.32+1)
User 1 (Sim: 0.14), Weight: 1/(0.14+1)
User 7 (Sim: 0.25), Weight: 1/(0.25+1)
User 6 (Sim: 0.07), Weight: 1/(0.07+1)
User 2 (Sim: 0.12), Weight: 1/(0.12+1)
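The weights in the figure all follow the pattern 1/(Sim + 1), so less similar users receive larger weights. The sketch below computes that weight and, as an assumption about how the figure's pieces fit together, combines other users' profiles into a serendipitous profile by a weighted sum. The function names and the dict-based profiles are illustrative.

```python
def serendipity_weight(sim):
    """Weight from the slide: 1 / (Sim + 1), larger for less similar users."""
    return 1.0 / (sim + 1.0)

def serendipitous_profile(neighbours):
    """Weighted sum of other users' profiles.

    neighbours: list of (similarity, profile_dict) pairs.
    ASSUMPTION: profiles are combined by a plain weighted sum; the slide
    shows only the weights, not the combination rule.
    """
    profile = {}
    for sim, vec in neighbours:
        w = serendipity_weight(sim)
        for term, value in vec.items():
            profile[term] = profile.get(term, 0.0) + w * value
    return profile

# Similarities taken from the figure.
weights = {name: serendipity_weight(sim)
           for name, sim in [("User 4", 0.16), ("User 10", 0.26), ("User 5", 0.21)]}
```

Because the weight shrinks as similarity grows, the resulting profile leans toward users whose interests differ from the target user's, which is what pushes the recommendations away from the expected.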