Recommendation Engines for Scientific Literature
Kris Jack, PhD, Data Mining Team Lead
Summary
➔ 2 recommendation use cases
➔ literature search with Mendeley
➔ use case 1: related research
➔ use case 2: personalised recommendations
Use Cases
Two types of recommendation use cases:
1) Related Research: given 1 research article, find other related articles
2) Personalised Recommendations: given a user's profile (e.g. interests), find articles of interest to them
Literature Search Using Mendeley
My secondment (Dec-Feb): Challenge!
● Use only Mendeley to perform literature search for:
● Related research
● Personalised recommendations
Eating your own dog food...
Queries: “content similarity”, “semantic similarity”, “semantic relatedness”, “PubMed related articles”, “Google Scholar related articles”
Found (running count over successive searches): 0 → 1 → 1 → 2 → 4 → 4
Literature Search Using Mendeley
Summary of Results

Strategy                Num Docs Found   Comment
Catalogue Search        19               9 from “Related Research”
Group Search            0                Needs work
Perso Recommendations   45               Led to a group with 37 docs!

Found: 64
Eating your own dog food... Tastes good!
64 => 31 docs, read 14 so far, so what do they say...?
Use Cases
1) Related Research: given 1 research article, find other related articles
Use Case 1: Related Research
Q1/4: How are the systems evaluated? (from 7 highly relevant papers on related research for scientific articles)
● User studies (e.g. Likert scale to rate relatedness between documents) (Beel & Gipp, 2010)
● TREC collections with hand-classified 'related articles' (e.g. TREC 2005 genomics track) (Lin & Wilbur, 2007)
● Trying to reconstruct a document's reference list (Pohl, Radlinski, & Joachims, 2007; Vellino, 2009)
Use Case 1: Related Research
Q2/4: How are the systems trained? (from 7 highly relevant papers on related research for scientific articles)
● Paper reference lists (Pohl et al., 2007; Vellino, 2009)
● Usage data (e.g. PubMed, arXiv) (Lin & Wilbur, 2007)
● Document content (e.g. metadata, co-citation, bibliographic coupling) (Gipp, Beel, & Hentschel, 2009)
● Collocation in mind maps (Beel & Gipp, 2010)
Use Case 1: Related Research
Q3/4: Which techniques are applied? (from 7 highly relevant papers on related research for scientific articles)
● BM25 (Lin & Wilbur, 2007); see the scoring sketch below
● Topic modelling (Lin & Wilbur, 2007)
● Collaborative filtering (Pohl et al., 2007)
● Bespoke heuristics for feature extraction (e.g. in-text citation metrics for the same sentence or paragraph) (Pohl et al., 2007; Gipp et al., 2009)
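As a reminder of what the BM25 scoring named above computes, here is a minimal self-contained sketch in plain Python. The toy corpus and the default parameters (k1=1.2, b=0.75) are illustrative assumptions, not details taken from the cited papers.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenised document against the query terms with classic BM25."""
    N = len(docs)
    doc_tfs = [Counter(doc) for doc in docs]       # term frequencies per document
    doc_lens = [len(doc) for doc in docs]
    avgdl = sum(doc_lens) / N                      # average document length
    df = {t: sum(1 for tf in doc_tfs if t in tf) for t in query_terms}  # document frequencies
    scores = []
    for tf, dl in zip(doc_tfs, doc_lens):
        score = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)       # smoothed idf
            score += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * dl / avgdl))
        scores.append(score)
    return scores

# Toy usage: rank three tokenised "abstracts" against a query document's terms
docs = [["topic", "model", "citation"], ["citation", "graph"], ["topic", "model", "abstract"]]
print(bm25_scores(["topic", "citation"], docs))
```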
Use Case 1: Related Research
Q4/4: Which techniques have most success? (from 7 highly relevant papers on related research for scientific articles)
● Topic modelling slightly improves on BM25 (MEDLINE abstracts) (Lin & Wilbur, 2007): BM25 = 0.383 precision @ 5; PMRA = 0.399 precision @ 5
● Seeding CF with usage data from arXiv won out over using citation lists (Pohl et al., 2007)
● Not yet found significant results showing whether content-based or CF methods are better for this task
Use Case 1: Related Research
Progress so far...
Q1/2 How do we evaluate our system?
Construct a non-complex data set of related research:
● include groups with 10-20 documents (i.e. topics)
● no overlaps between groups (i.e. no documents in common)
● only take documents recognised as being in English
● document metadata must be 'complete' (i.e. has title, year, author, published in, abstract, filehash, tags/keywords/MeSH terms)
→ 4,382 groups → mean size = 14 → 60,715 individual documents
Given a doc, aim to retrieve the other docs from its group:
● tf-idf with the Lucene implementation (a tf-idf retrieval sketch follows below)
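A minimal sketch of this evaluation loop, using scikit-learn's TfidfVectorizer in place of the Lucene implementation mentioned above. The field contents, the group labels and the precision @ 5 cut-off are illustrative; `texts` and `group_of` are assumed inputs, not Mendeley's actual data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def precision_at_k(query_idx, texts, group_of, k=5):
    """Retrieve the k nearest docs by tf-idf cosine similarity and measure
    how many of them belong to the query document's group."""
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(texts)
    sims = cosine_similarity(tfidf[query_idx], tfidf).ravel()
    sims[query_idx] = -1.0                      # never retrieve the query itself
    top_k = sims.argsort()[::-1][:k]
    hits = sum(1 for i in top_k if group_of[i] == group_of[query_idx])
    return hits / k

# Toy usage: texts could be e.g. concatenated title+abstract strings,
# and group_of[i] the (hypothetical) group id of document i
texts = ["topic models for text", "collaborative filtering ratings",
         "latent topic models", "matrix factorisation for ratings",
         "probabilistic topic modelling", "neighbourhood based filtering"]
group_of = [0, 1, 0, 1, 0, 1]
print(precision_at_k(0, texts, group_of, k=2))
```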
[Chart: Metadata Presence in Documents: % of documents in which each metadata field appears (title, year, author, publishedIn, fileHash, abstract, generalKeyword, meshTerms, keywords, tags), shown separately for the Group and Catalogue evaluation data sets]
Use Case 1: Related Research
Progress so far...
Q2/2 What are our results?
[Chart: tf-idf Precision @ 5 per Metadata Field for the Complete Data Set; fields: abstract, title, generalKeyword, mesh-term, author, keyword, tag]
Use Case 1: Related Research
Progress so far...
Q2/2 What are our results?
[Chart: tf-idf Precision @ 5 per Metadata Field when the Field is Available; fields: tag, abstract, mesh-term, title, general-keyword, author, keyword]
Use Case 1: Related Research
Progress so far...
Q2/2 What are our results?
BestCombo = abstract+author+general-keyword+tag+title
[Chart: tf-idf Precision @ 5 for Field Combos on the Complete Data Set; field(s): bestCombo, abstract, title, generalKeyword, mesh-term, author, keyword, tag]
Use Case 1: Related Research
Progress so far...
Q2/2 What are our results?
BestCombo = abstract+author+general-keyword+tag+title
[Chart: tf-idf Precision @ 5 for Field Combos when the Field is Available; field(s): tag, bestCombo, abstract, mesh-term, title, general-keyword, author, keyword]
Use Case 1: Related Research
Future directions...?
Evaluate multiple techniques on the same data set:
● Construct a public data set: similar to the current one but with data from only public groups; analyse the composition of the data set in detail
● Train: content-based filtering, collaborative filtering, and hybrid recommenders
● Evaluate the different systems on the same data set
...and let's brainstorm!
Use Cases
2) Personalised Recommendations: given a user's profile (e.g. interests), find articles of interest to them
Use Case 2: Perso Recommendations
Q1/4: How are the systems evaluated? (from 7 highly relevant papers on perso recs for scientific articles)
● Cross validation on user libraries (Bogers & van Den Bosch, 2009; Wang & Blei, 2011)
● User studies (McNee, Kapoor, & Konstan, 2006; Parra-Santander & Brusilovsky, 2009)
Use Case 2: Perso Recommendations
Q2/4: How are the systems trained? (from 7 highly relevant papers on perso recs for scientific articles)
● CiteULike libraries (Bogers & van Den Bosch, 2009; Parra-Santander & Brusilovsky, 2009; Wang & Blei, 2011)
● Documents representing users, with their citations as the documents of interest (McNee et al., 2006)
● User search history (Kapoor et al., 2007)
Use Case 2: Perso Recommendations
Q3/4: Which techniques are applied? (from 7 highly relevant papers on perso recs for scientific articles)
● CF (Parra-Santander & Brusilovsky, 2009; Wang & Blei, 2011); a minimal user-based CF sketch follows below
● LDA (Wang & Blei, 2011)
● Hybrid of CF + LDA (Wang & Blei, 2011)
● BM25 over tags to form a user neighbourhood (Parra-Santander & Brusilovsky, 2009)
● Item-based and content-based CF (Bogers & van Den Bosch, 2009)
● User-based CF, Naïve Bayes classifier, Probabilistic Latent Semantic Indexing, and a textual tf-idf-based algorithm (uses document abstracts) (McNee et al., 2006)
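Several of the CF variants above start from a binary user-item matrix built from users' libraries. Here is a minimal user-based CF sketch on such a matrix, NumPy only; the toy matrix, the cosine weighting and the neighbourhood size k are illustrative assumptions, not any cited paper's implementation.

```python
import numpy as np

def user_based_scores(R, user, k=2):
    """Score articles for `user` via a cosine-weighted vote of the k most
    similar other users in a binary user-item matrix R (rows = users)."""
    norms = np.linalg.norm(R, axis=1)
    norms[norms == 0] = 1.0                       # avoid division by zero
    sims = (R @ R[user]) / (norms * norms[user])  # cosine similarity to every user
    sims[user] = -1.0                             # exclude the user themselves
    neighbours = np.argsort(sims)[::-1][:k]
    scores = sims[neighbours] @ R[neighbours]     # weighted vote over neighbours' libraries
    scores[R[user] > 0] = 0.0                     # don't re-recommend items already in the library
    return scores

# Toy usage: 4 users x 5 articles, 1 = article is in that user's library
R = np.array([[1, 1, 0, 0, 1],
              [1, 1, 1, 0, 0],
              [0, 0, 1, 1, 0],
              [1, 0, 1, 0, 0]], dtype=float)
print(user_based_scores(R, user=0))
```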
Use Case 2: Perso Recommendations
Q4/4: Which techniques have most success? (from 7 highly relevant papers on perso recs for scientific articles)
● CF is much better than topic modelling (Wang & Blei, 2011)
● A CF-topic modelling hybrid slightly outperforms CF alone (Wang & Blei, 2011)
● Content-based filtering performed slightly better than item-based filtering on a test set of 1,322 CiteULike users (Bogers & van Den Bosch, 2009)
● User-based CF and tf-idf significantly outperformed Naïve Bayes and Probabilistic Latent Semantic Indexing (McNee et al., 2006)
● BM25 gave better results than CF, but the study used just 7 CiteULike users, so it was small scale (Parra-Santander & Brusilovsky, 2009)
Use Case 2: Perso Recommendations
Q4/4: Which techniques have most success? (from 7 highly relevant papers on perso recs for scientific articles)
Content-based:
● Advantages: human-readable form of the user's profile; quickly absorbs new content without the need for ratings
● Disadvantage: tends to over-specialise
CF:
● Advantages: works on an abstract item-user level, so you don't need to 'understand' the content; tends to give more novel and creative recommendations
● Disadvantage: requires a lot of data
Use Case 2: Perso Recommendations
Our progress so far...
Q1/2 How do we evaluate our system?
Construct an evaluation data set from user libraries:
● 50,000 user libraries
● 10-fold cross validation
● libraries vary from 20-500 documents
● preference values are binary (in library = 1; 0 otherwise)
Train:
● item-based collaborative filtering recommender
Evaluate:
● train the recommender and test how well it can reconstruct the users' hidden testing libraries (see the sketch below)
● multiple similarity metrics (e.g. cooccurrence, loglikelihood)
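A minimal sketch of that evaluation loop under the same binary in-library preferences: hide part of one user's library, recommend with item-item cooccurrence counts (one of the similarity metrics mentioned above), and score precision over the hidden items. NumPy only; the matrix, the hidden set and the cut-off n are toy assumptions, not Mendeley's data or implementation.

```python
import numpy as np

def evaluate_item_based(R, user, hidden, n=10):
    """Hide the given items from `user`'s row, recommend by item-item
    cooccurrence, and return precision@n over the hidden items."""
    train = R.copy()
    train[user, hidden] = 0                 # hide part of the user's library for testing
    cooc = train.T @ train                  # item-item cooccurrence counts
    np.fill_diagonal(cooc, 0)               # an item shouldn't recommend itself
    scores = cooc @ train[user]             # aggregate evidence from items still visible
    scores[train[user] > 0] = -1            # don't recommend what the user already has
    top_n = np.argsort(scores)[::-1][:n]
    hits = len(set(top_n) & set(hidden))
    return hits / n

# Toy usage: 4 users x 5 articles; hide article 2 from user 0 and check precision@1
R = np.array([[1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0],
              [1, 1, 1, 0, 1],
              [0, 0, 0, 1, 1]], dtype=int)
print(evaluate_item_based(R, user=0, hidden=[2], n=1))
```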
Use Case 2: Perso Recommendations
Our progress so far...
Q2/2 What are our results?
Cross validation: 0.1 precision @ 10 articles
Usage logs: 0.4 precision @ 10 articles
Use Case 2: Perso Recommendations
Our progress so far...
Q2/2 What are our results?
[Chart: Precision at 10 articles vs. number of articles in user library]
Use Case 2: Perso Recommendations
Future directions...?
Evaluate multiple techniques on the same data set:
● Construct a data set: similar to the current one but with more up-to-date data; analyse the composition of the data set in detail
● Train: content-based filtering, collaborative filtering (user-based and item-based), and hybrid recommenders
● Evaluate the different systems on the same data set
...and let's brainstorm!
Conclusion
➔ 2 recommendation use cases
➔ similar problems and techniques
➔ good results so far
➔ combining CF with content would likely improve both
www.mendeley.com
References
Beel, J., & Gipp, B. (2010). Link Analysis in Mind Maps: A New Approach to Determining Document Relatedness. Citeseer. Retrieved from http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:Link+Analysis+in+Mind+Maps+:+A+New+Approach+to+Determining+Document+Relatedness#0
Bogers, T., & van Den Bosch, A. (2009). Collaborative and Content-based Filtering for Item Recommendation on Social Bookmarking Websites. ACM RecSys ’09 Workshop on Recommender Systems and the Social Web. New York, USA. Retrieved from http://ceur-ws.org/Vol-532/paper2.pdf
Gipp, B., Beel, J., & Hentschel, C. (2009). Scienstein: A research paper recommender system. Proceedings of the International Conference on Emerging Trends in Computing (ICETiC’09) (pp. 309–315). Retrieved from http://www.sciplore.org/publications/2009-Scienstein_-_A_Research_Paper_Recommender_System.pdf
Kapoor, N., Chen, J., Butler, J. T., Fouty, G. C., Stemper, J. A., Riedl, J., & Konstan, J. A. (2007). Techlens: a researcher’s desktop. Proceedings of the 2007 ACM conference on Recommender systems (pp. 183–184). ACM. doi:10.1145/1297231.1297268
Lin, J., & Wilbur, W. J. (2007). PubMed related articles: a probabilistic topic-based model for content similarity. BMC Bioinformatics, 8(1), 423. BioMed Central. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/17971238
McNee, S. M., Kapoor, N., & Konstan, J. A. (2006). Don’t look stupid: avoiding pitfalls when recommending research papers. Proceedings of the 2006 20th anniversary conference on Computer supported cooperative work (p. 180). ACM. Retrieved from http://portal.acm.org/citation.cfm?id=1180875.1180903
Parra-Santander, D., & Brusilovsky, P. (2009). Evaluation of Collaborative Filtering Algorithms for Recommending Articles. Web 3.0: Merging Semantic Web and Social Web at HyperText ’09 (pp. 3–6). Torino, Italy. Retrieved from http://ceur-ws.org/Vol-467/paper5.pdf
Pohl, S., Radlinski, F., & Joachims, T. (2007). Recommending related papers based on digital library access records. Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries (pp. 418–419). ACM. Retrieved from http://portal.acm.org/citation.cfm?id=1255175.1255260
Vellino, A. (2009). The Effect of PageRank on the Collaborative Filtering Recommendation of Journal Articles. Retrieved from http://cuvier.cisti.nrc.ca/~vellino/documents/PageRankRecommender-Vellino2008.pdf
Wang, C., & Blei, D. M. (2011). Collaborative topic modeling for recommending scientific articles. Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 448–456). ACM. Retrieved from http://dl.acm.org/citation.cfm?id=2020480