Summarization of Multiple, Metadata Rich, Product Reviews
Fotis Kokkoras, Efstratia Lampridou,
Konstantinos Ntonas, Ioannis Vlahavas
Department of Informatics – Aristotle University of Thessaloniki
LPIS Group: http://lpis.csd.auth.gr
MSoDa '08
ECAI 2008 Workshop on Mining Social Data
Introduction
Modern, successful on-line shops allow consumers to express their opinions on the products and services they have purchased.
These reviews are valuable for new customers.
When a single product has dozens, or even hundreds, of reviews, reading them all is time-consuming.
Hence the need for automatically generated summaries of these reviews.
Summarization Background
Types of summary:
  Extractive: use sentences from the original text
  Abstractive: reuse sentence fragments
Text features usually used:
  frequency and location of words, sentence location in the article, syntactic rules, dictionaries of important words
Various techniques/approaches:
  Machine learning techniques
  LSA (Latent Semantic Analysis)
  Lexical chains
  Cluster-based
They perform well on article-style texts.
The Special Nature of Reviews
On-line product reviews in e-shops are quite different from article-style texts:
  They are usually short and do not obey strict syntactic rules.
  They convey only the subjective opinion of each reviewer, and there are a lot of reviewers!
  They include a lot of repeated content.
  There are usually too many reviews.
What is the problem?
Traditional summarization techniques do not work very well on such data. Why?
  Summarizers that work at the sentence level may report a frequently mentioned problem many times in the summary.
  Reusing sentence fragments to construct new sentences is risky because reviews are short, with weak syntax.
  It is difficult to detect biased reviews based on their text alone.
Motivation
On-line reviews are usually accompanied by various metadata, such as:
  the buyer's technology level,
  ownership of the product,
  overall judgment of the product or service, on some scale,
  labeled (positive or negative) or unlabeled comments,
  usefulness of the review to other customers, etc.
How can these metadata help in summarization?
Our Approach
ReSum Algorithm (Review Summarizer)
  Creates an extractive summary
  Uses a dictionary of important words and metadata
  Is applied separately to positive (+) and negative (-) comments
  Two summaries are created for each product
How it works
  Scores the sentences based on their words
  Adjusts the initial score based on the metadata
  Selects sentences while avoiding repetition of concepts
Tested on newegg.com
Requirements
A dictionary D of important words for the domain, automatically created from a few thousand reviews of the domain in question:
  concatenation of the reviews
  removal of 500 common English words
  selection of the 150 most frequent remaining words
Access to the reviews (and their metadata):
  we use DEiXTo, a web content extraction system developed in-house
  HTML/DOM-based extraction rules
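The dictionary construction described above could be sketched roughly as follows. This is only an illustration: the function name `build_dictionary`, the regex tokenizer, and the tiny `common` word list are assumptions made for the example, not part of the presented system.

```python
import re
from collections import Counter

def build_dictionary(reviews, common_words, top_n=150):
    """Return the top_n most frequent words over all reviews, skipping common_words."""
    text = " ".join(reviews).lower()        # concatenate all reviews of the domain
    words = re.findall(r"[a-z]+", text)     # crude tokenizer (an assumption)
    counts = Counter(w for w in words if w not in common_words)
    return {w for w, _ in counts.most_common(top_n)}

# Toy input; a real run would use a few thousand reviews and a 500-word list.
reviews = ["Great picture and great price.", "The picture is sharp, the stand is weak."]
common = {"the", "and", "is"}
dictionary = build_dictionary(reviews, common, top_n=2)
```

With real data, `top_n` would stay at 150 and `common_words` would hold the 500 common English words mentioned on the slide.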
ReSum – Initial Scoring
Step 1:
  Concatenate all positive (or negative) comments and divide them into separate sentences.
  Remove stop words, punctuation, numbers, etc.
  Count the frequency $f_v$ of every word $v$.
Step 2:
  Score every sentence $i$ based on its words and the dictionary D:
\[ R_i = 2 \sum_{v_j \in D} f_{v_j} + \sum_{v_j \notin D} f_{v_j} \]
  where $v_j$ ranges over the words of sentence $i$.
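Steps 1 and 2 might look like the minimal sketch below. The stop-word list, the sentence splitter, and the choice to count each distinct word once per sentence are assumptions made for illustration; the slides do not specify them.

```python
import re
from collections import Counter

STOP_WORDS = {"the", "is", "a", "and", "it"}   # tiny illustrative list (assumption)

def tokenize(sentence):
    """Lowercase, keep alphabetic tokens only, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOP_WORDS]

def split_sentences(comments):
    """Step 1: concatenate the comments and split them into sentences."""
    text = " ".join(comments)
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]

def initial_scores(sentences, dictionary):
    """Step 2: dictionary words contribute twice their frequency, other words once."""
    freq = Counter(w for s in sentences for w in tokenize(s))
    scores = []
    for s in sentences:
        words = set(tokenize(s))   # each distinct word counted once (assumption)
        scores.append(sum((2 if w in dictionary else 1) * freq[w] for w in words))
    return scores, freq

comments = ["Great picture. Great price!", "Picture is sharp."]
sentences = split_sentences(comments)
scores, freq = initial_scores(sentences, dictionary={"picture"})
```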
ReSum – Metadata Contribution
Metadata used:
  Reviewer's technology level ($w_1$)
  Ownership duration of the product ($w_2$)
  Usefulness of a review to other users ($w_3$)
Step 3:
  The initial score $R_i$ is adjusted based on the metadata, in a weighted fashion:
\[ S_i = R_i + R_i \sum_{k=1}^{3} w_k \]
  The weights are initialized using multicriteria techniques (explained later).
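As a small worked illustration of Step 3, the adjustment multiplies the initial score by one plus the sum of the metadata weights. The function name `adjusted_score` and the sample numbers are hypothetical.

```python
def adjusted_score(r_i, weights):
    """S_i = R_i + R_i * (w_1 + w_2 + w_3)."""
    return r_i + r_i * sum(weights)

# Hypothetical sentence with R_i = 6 whose review earns w_1 and w_3 but not w_2.
s_i = adjusted_score(6.0, (0.14, 0.0, 0.62))
```

When all three weights are zero the score is left unchanged, so metadata can only move a sentence relative to sentences from other reviews.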
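ReSum – Redundancy Elimination
Step 4:
  Select the sentence with the highest score S.
  Penalize the remaining sentences that share words with the selected one. This eliminates redundancy.
\[ S'_i = S_i - \Big( 2 \sum_{v_j \in D} f_{v_j} + \sum_{v_j \notin D} f_{v_j} \Big) \]
  where $v_j$ ranges over the words that sentence $i$ shares with the selected sentence.
The step is repeated until the desired number of sentences is reached.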
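The greedy selection with penalties can be sketched as below. This is an illustrative reading of Step 4, assuming the penalty uses the same weighting as the initial score (dictionary words count double); the function name and the toy data are hypothetical.

```python
def select_sentences(sentence_words, scores, freq, dictionary, n):
    """Pick n sentence indices greedily, penalizing overlap with earlier picks.

    sentence_words: list of sets of words, one set per sentence
    scores:         adjusted scores S_i, aligned with sentence_words
    freq:           word -> frequency over all sentences
    """
    remaining = dict(enumerate(scores))
    chosen = []
    while remaining and len(chosen) < n:
        best = max(remaining, key=remaining.get)   # highest current score
        chosen.append(best)
        del remaining[best]
        for i in remaining:                        # penalize shared words
            shared = sentence_words[i] & sentence_words[best]
            penalty = sum((2 if w in dictionary else 1) * freq[w] for w in shared)
            remaining[i] -= penalty
    return chosen

sentence_words = [{"great", "picture"}, {"great", "price"},
                  {"picture", "sharp"}, {"weak", "stand"}]
freq = {"great": 2, "picture": 2, "price": 1, "sharp": 1, "weak": 1, "stand": 1}
chosen = select_sentences(sentence_words, [6, 3, 5, 2], freq, {"picture"}, n=2)
```

In this toy run the second pick is the sentence about the stand: it starts with the lowest score, but the two sentences that overlap with the first pick are penalized below it.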
Weight Initialization (1/3)
Subjective task:
  we need a consistent way to initialize the weights
Analytic Hierarchy Process (AHP–Saaty ‘99):
  a multicriteria method
  provides a methodology for calculating consistent weights for selection criteria, according to the importance we assign to them
  importance values are selected from a predefined scale (defined by AHP)
Weight Initialization (2/3)
Subjective importance values we used:

              Tech level   Ownership   Usefulness
  Tech level  1            1/2         3/2
  Ownership   2            1           2/3
  Usefulness  3            2           1

Fundamental scale of AHP:

  Value           Interpretation
  1               Criteria a and b are of the same importance.
  2               Criterion a is very slightly more important than b.
  3               Criterion a is slightly more important than b.
  5               Criterion a is considerably more important than b.
  etc. (up to 9)  etc.
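One common way to turn a pairwise comparison matrix into weights is to normalize each column and average each row, which approximates the principal-eigenvector calculation used in AHP. The slides do not say which calculation was used, so the sketch below is an assumption, and the matrix in it is a hypothetical reciprocal matrix chosen for illustration, not the values from the slide.

```python
def ahp_weights(matrix):
    """Approximate AHP priorities: normalize each column, then average each row."""
    n = len(matrix)
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    return [sum(matrix[r][c] / col_sums[c] for c in range(n)) / n for r in range(n)]

# Hypothetical reciprocal matrix: entry [a][b] says how much more important
# criterion a is than criterion b, and entry [b][a] is its reciprocal.
matrix = [
    [1.0, 1 / 2, 1 / 4],   # tech level
    [2.0, 1.0, 1 / 3],     # ownership
    [4.0, 3.0, 1.0],       # usefulness
]
weights = ahp_weights(matrix)
```

The resulting weights sum to 1 and preserve the ordering implied by the comparisons, with the most favoured criterion getting the largest weight.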
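Weight Initialization (3/3)
Calculated weights: $w'_1 = 0.14$, $w'_2 = 0.24$, $w'_3 = 0.62$
The initial weights were further adjusted based on the metadata values:
\[ w_1 = \begin{cases} w'_1 & \text{if TechLevel = high} \\ 0 & \text{otherwise} \end{cases} \]
\[ w_2 = \begin{cases} w'_2 & \text{if Ownership is more than a year} \\ 0 & \text{otherwise} \end{cases} \]
\[ w_3 = g(w'_3, \delta) \]
  with $g$ a logistic-type function of $\delta$.
[Figure: plot of $w_3$ as a function of $\delta$; x-axis from -40 to 40, y-axis from -0.8 to 0.8]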
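The two piecewise rules for the technology-level and ownership weights can be sketched as follows. The function name and argument names are hypothetical, and the usefulness weight is passed in by the caller because the exact form of g is not reproduced here.

```python
def adjust_weights(base, tech_level_high, owned_over_a_year, w3):
    """Gate w_1 and w_2 on the review's metadata; w_3 is supplied by the caller."""
    w1_base, w2_base, _ = base
    w1 = w1_base if tech_level_high else 0.0      # kept only for high tech level
    w2 = w2_base if owned_over_a_year else 0.0    # kept only after a year of ownership
    return (w1, w2, w3)

# Base weights as reported on the slide; the w3 value is an arbitrary placeholder.
adjusted = adjust_weights((0.14, 0.24, 0.62), tech_level_high=True,
                          owned_over_a_year=False, w3=0.5)
```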
Experimental Results (1/2)
Dataset:
  1587 reviews from newegg.com
  3 domains (monitors, printers, CPU coolers)
  9 products (3 from each domain)
Reference summary:
  manually generated by 3 human experts
Comparison systems:
  Two commercial summarizers: TextAnalyst (Megaputer Intelligence Inc) and Copernic (Copernic Inc)
  Naive ReSum: the contribution of metadata (step 3) was removed
Experimental Results (2/2)
Average Recall: 91.7 (78.8), 69.5, 54
Average Precision: 73.3 (62.8), 58.3, 53.3
[Figure: two bar charts, Precision % and Recall % (scale 0 to 100), for Monitor A, Monitor B, Monitor C, Printer A, Printer B, Printer C, Cooler A, Cooler B and Cooler C; series: ReSum, Naïve ReSum, Copernic, TextAnalyst]
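The slides do not state how precision and recall were computed. A minimal sketch, assuming each summary is reduced to a set of concepts and compared with the experts' reference set, could look like this; the function name and the sample concepts are hypothetical.

```python
def precision_recall(system_items, reference_items):
    """Set-overlap precision and recall of a system summary against a reference."""
    system, reference = set(system_items), set(reference_items)
    hits = len(system & reference)
    precision = hits / len(system) if system else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall

system = {"great resolution", "hdmi input", "weak stand", "low price"}
reference = {"great resolution", "weak stand", "no dead pixels"}
precision, recall = precision_recall(system, reference)
```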
Interesting Facts in our Summaries
Neither biased nor abusive comments appeared
  this did happen in the other 3 systems
Comments with low frequency but significant meaning were included
  this was not the case for the other 3 systems
Repetition of concepts was minimal or absent
  thanks to the redundancy elimination step
  that's why naive ReSum performed so well
  repetition in Copernic and TextAnalyst was evident
Conclusions
Metadata can contribute to a better summary.
We proposed an algorithm for summarizing on-line, metadata-rich product reviews. It:
  is statistical in nature.
  assumes labeled comments (pros & cons).
  works at the sentence level:
    ranks sentences based on an "importance" measure and selects the N most important of them.
    uses metadata to produce a "good" ranking.
Future Work
Generalize our methodology to adapt to the availability, or not, of the various metadata.
  the scoring algorithm is modular: weights/metadata can easily be added or removed
Remove the requirement for categorized reviews (positive and negative).
Thank you!
Monitor A - ReSum
PROS
1. Great resolution, clear picture, very very good price, 24in monitors are gigantic, widescreen aspect ratio makes dvds look awesome
2. Very, VERY bright, HDMI, no dead pixels, looks much nicer than online photos, unbeatable viewing angle
3. Excellent color reproduction; fantastic image and text quality; very good brightness and contrast; HDMI input; unbeatable value
4. Several things stood out above all other monitors I'd considered: Almost non-existent issues of dead/stuck pixels
5. Resolution & sharpness is amazing In my opinion, sleek design Functional speakers (not the best) Audio output is available Multiple inputs
CONS
1. So when Windows power management turns off the monitor signal, instead of turning off the monitor goes to bluescreen and says "no signal" on the HDMI input
2. no height or rotation adjustments; flimsy base; awkward location of OSD buttons; no DVI connection (no DVI to HDMI cable included)
3. Weak stand, awful menu controls, no audio out, no USB ports, low buzzing sound when brightness turned down
4. This monitor is so darn tall it strains my neck a bit to view it - but that's simply a natural consequence of its size
5. Doesn't come with a DVI to HDMI cable that you will need to run this with a computer to get a good picture (don't use the vga port)