Newsjunkie
Newsjunkie: Providing Personalized Newsfeeds via Analysis of Information Novelty
-- Evgeniy Gabrilovich
Fan Jin
4047 8963
What’s the problem and what’s the solution?
• Users of news sites do not want to read the same information over and over again; they are primarily interested in learning what’s new
• Newsjunkie is designed to rank news articles by their novelty
• Results were evaluated against a baseline
• Newsjunkie can be personalized to match a user’s specific requirements
A framework for comparing text collections
• Features used to represent documents: vectors of TF.IDF weights
• Distance metrics are used to identify the documents that differ most from previously read documents: Kullback-Leibler (KL) divergence
• Algorithm to rank news by novelty
R ← seedStory
for i = 1 to |D|
    d ← argmax over d′ ∈ D of dist(d′, R)
    R ← R ∪ {d}; D ← D \ {d}
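The greedy ranking above can be sketched in Python. This is a minimal illustration under stated assumptions, not the Newsjunkie implementation: documents are represented by smoothed word-frequency distributions rather than TF.IDF vectors (KL divergence needs probability distributions), and the function names, the whitespace tokenizer, and the smoothing constant `alpha` are all hypothetical choices.

```python
import math
from collections import Counter


def word_distribution(text, vocabulary, alpha=0.01):
    """Smoothed unigram distribution over a fixed vocabulary.

    Additive smoothing (alpha, an assumed value) keeps every probability
    positive so that the KL divergence below is always finite.
    """
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocabulary) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / total for w in vocabulary}


def kl_divergence(p, q):
    """KL(p || q) = sum over w of p(w) * log(p(w) / q(w))."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def rank_by_novelty(seed_story, articles):
    """Greedy ranking: repeatedly pick the article farthest from what was read."""
    vocabulary = sorted(set(" ".join([seed_story] + articles).lower().split()))
    read_text = seed_story          # plays the role of R
    remaining = list(articles)      # plays the role of D
    ranked = []
    while remaining:
        read_dist = word_distribution(read_text, vocabulary)
        best = max(
            remaining,
            key=lambda a: kl_divergence(word_distribution(a, vocabulary), read_dist),
        )
        ranked.append(best)
        remaining.remove(best)      # D <- D \ {d}
        read_text += " " + best     # R <- R U {d}
    return ranked


if __name__ == "__main__":
    seed = "storm hits coast"
    candidates = [
        "storm hits coast again",
        "rescue teams deploy boats inland",
        "storm hits coast",
    ]
    for article in rank_by_novelty(seed, candidates):
        print(article)
```

Because the read set grows after every pick, an article that repeats an already selected story drops in rank, which is the behaviour the novelty ranking aims for.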
Evaluate results
• Data:
12 news topics, each spanning 2-9 days and containing 36-328 articles
• Baseline method
Chronological ordering of articles
• Evaluation methods
Participants were asked to read all the documents and decide which carried the most novel information
• Hypothesis testing
Wilcoxon signed-rank test, a non-parametric alternative to the paired Student's t-test
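To illustrate the hypothesis test, the sketch below computes the Wilcoxon signed-rank statistic for paired scores in plain Python. It rests on assumptions: the function name is hypothetical, the sample scores are made up for illustration and are not results from the study, zero differences are dropped, and the z-score uses the normal approximation without a tie correction, which is only rough for small samples.

```python
import math


def wilcoxon_signed_rank(xs, ys):
    """Return (w_plus, w_minus, z) for the paired samples xs and ys."""
    # Paired differences; pairs with no difference are dropped.
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Sum the ranks of positive and negative differences separately.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Normal approximation of the smaller rank sum under the null hypothesis.
    mean = n * (n + 1) / 4
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / std
    return w_plus, w_minus, z


if __name__ == "__main__":
    # Hypothetical novelty scores for the same five topics under two orderings.
    novelty_ranked = [5, 7, 8, 6, 9]
    chronological = [4, 4, 9, 2, 3]
    print(wilcoxon_signed_rank(novelty_ranked, chronological))
```

Unlike the paired t-test, this statistic depends only on the ranks of the differences, so it does not assume that the differences are normally distributed.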
Personalized news updates
• A single daily update
• Reporting breaking news