KIT – The Research University in the Helmholtz Association
INSTITUTE OF APPLIED INFORMATICS AND FORMAL DESCRIPTION METHODS (AIFB)
www.kit.edu
LinkSUM: Using Link Analysis to Summarize Entity Data
Andreas Thalhammer, Nelia Lasierra, and Achim Rettinger
16th International Conference on Web Engineering (ICWE 2016) 08.06.2016
Lugano
Outline
Introduction
Approach: LinkSUM
Related Resources
Predicate Selection
Configuration
Evaluation
Quantitative
Qualitative
Conclusions
INTRODUCTION
Motivation: Entity Summarization (I)
Examples for entities:
Movies:
Pulp Fiction
Kill Bill vol. 1
Books:
1984
A farewell to arms
People:
John Travolta
Arnold Schwarzenegger
etc.
Example for data about an entity:
Motivation: Entity Summarization (II)
How to decide:
Which facts should we show?
Facts are unranked in the knowledge base.
Entities have individual features (even if they are of the same type).
Idea
Use link analysis for selecting facts.
Step 1: Select top-k important related resources.
Step 2: Select the most relevant connecting predicate.
Strongly relevance-oriented.
Lightweight.
Avoids redundancy.
APPROACH: LINKSUM
Related Resources (I)
Compute PageRank [1] scores (pr) of entities using the (untyped) links that
occur in textual descriptions of entities (i.e., Wikipedia):
$$pr(r) = (1 - d) + d \cdot \sum_{r_n \in l(r)} \frac{pr(r_n)}{c(r_n)}$$
l(r) – set of resources that link to r (incoming links of r).
c(r) – number of outgoing links of r.
d – damping factor (usually 0.85).
Example:
dbpedia:Category:English-language_films 220.961
dbpedia:Quentin_Tarantino 137.403
dbpedia:John_Travolta 105.771
dbpedia:Miramax_Films 993.986
... ...
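The PageRank computation can be sketched as a simple fixed-point iteration. This is a minimal illustration, not the implementation behind the DBpedia PageRank dataset: it assumes the unnormalized variant in which each resource receives (1 - d) plus d times the sum, over all resources linking to it, of their score divided by their number of outgoing links. The function name `pagerank`, the fixed iteration count, and the toy graph are hypothetical.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate the unnormalized PageRank recurrence on a small link graph.

    links maps each resource to the resources it links to (assumed layout).
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}

    # l(r): resources with a link to r; c(r): number of outgoing links of r.
    incoming = {n: [] for n in nodes}
    for source, targets in links.items():
        for target in set(targets):
            incoming[target].append(source)
    out_count = {n: len(set(links.get(n, ()))) for n in nodes}

    pr = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        pr = {
            n: (1 - d) + d * sum(pr[s] / out_count[s] for s in incoming[n])
            for n in nodes
        }
    return pr


# Made-up toy graph, not taken from DBpedia or Wikipedia.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
scores = pagerank(toy_links)
for resource in sorted(scores, key=scores.get, reverse=True):
    print(resource, round(scores[resource], 3))
```

A fixed number of iterations keeps the sketch short; a real computation over the Wikipedia link graph would typically iterate until the scores stop changing.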
Related Resources (II)
Use Backlinks [2] for finding strong connections:
Example:
[Figure: Pulp Fiction and Quentin Tarantino, connected by the predicate "director"]
dbpedia:Quentin_Tarantino dbpedia:Roger_Avary
dbpedia:Bruce_Willis dbpedia:Tim_Roth
dbpedia:John_Travolta dbpedia:Ving_Rhames
dbpedia:Samuel_L._Jackson dbpedia:Amanda_Plummer
dbpedia:Harvey_Keitel dbpedia:Lawrence_Bender
dbpedia:Miramax_Films dbpedia:Sally_Menke
dbpedia:Uma_Thurman dbpedia:Maria_de_Medeiros
dbpedia:Andrzej_Sekuła dbpedia:Rosanna_Arquette
dbpedia:Christopher_Walken dbpedia:Eric_Stoltz
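As a rough sketch of the Backlinks idea, the snippet below collects the resources that an entity links to and that link back to it. The reading of a backlink as a mutual link is an assumption here, and the function name `backlinks` and the toy link data are hypothetical, not taken from DBpedia.

```python
def backlinks(entity, links):
    """Return resources that the entity links to and that link back to it.

    links maps each resource to the set of resources it links to (assumed layout).
    """
    outgoing = set(links.get(entity, ()))
    return {r for r in outgoing if entity in links.get(r, ())}


# Made-up link data for illustration only.
toy_links = {
    "Pulp_Fiction": {"Quentin_Tarantino", "Miramax_Films", "English-language_films"},
    "Quentin_Tarantino": {"Pulp_Fiction"},
    "Miramax_Films": {"Pulp_Fiction"},
    "English-language_films": set(),
}
bl = backlinks("Pulp_Fiction", toy_links)
print(sorted(bl))
```

In this toy data the category page does not link back to the movie, so it is not a backlink, which is the kind of broad, weakly connected resource the step is meant to filter out.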
Related Resources (III)
Combined score for related resources:
Linear combination.
Normalized PageRank scores.
Indicator function on the set of Backlinks of e (bl(e)).
Parameter α (alpha) to be estimated.
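One possible reading of the combined score is sketched below. Several details are assumptions rather than facts from the slides: that α weights the PageRank term and (1 - α) the Backlink indicator, and that PageRank scores are normalized by the maximum score among the resources the entity links to. The function names and the link data are hypothetical; the three PageRank values are the ones from the example above.

```python
def related_resource_scores(entity, links, pr, alpha):
    """Score each linked resource by normalized PageRank plus a backlink bonus.

    Assumed form: alpha * pr(r) / max_pr + (1 - alpha) * [r in bl(e)].
    """
    candidates = set(links.get(entity, ()))
    if not candidates:
        return {}
    max_pr = max(pr[r] for r in candidates)  # assumed normalization
    bl = {r for r in candidates if entity in links.get(r, ())}
    return {
        r: alpha * pr[r] / max_pr + (1 - alpha) * (1.0 if r in bl else 0.0)
        for r in candidates
    }


def top_k(scores, k):
    """Return the k highest-scoring resources, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Link data is made up; PageRank values are copied from the earlier example.
toy_links = {
    "Pulp_Fiction": {"English-language_films", "Quentin_Tarantino", "John_Travolta"},
    "Quentin_Tarantino": {"Pulp_Fiction"},
    "John_Travolta": {"Pulp_Fiction"},
}
toy_pr = {
    "English-language_films": 220.961,
    "Quentin_Tarantino": 137.403,
    "John_Travolta": 105.771,
}
combined = related_resource_scores("Pulp_Fiction", toy_links, toy_pr, alpha=0.5)
best_two = top_k(combined, 2)
print(best_two)
```

With these toy inputs the backlink term lifts the two people above the category, even though the category has the highest PageRank score.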
Predicate Selection
Problem: multiple predicates connect two resources.
Approaches:
Frequency (FRQ)
#times the predicate is used
Exclusivity (EXC)
1 / (N + M)
Description (DSC):
#domain + #range + #label
and combinations of those, e.g., FRQ * EXC.
[Figure: Pulp Fiction and Quentin Tarantino, connected by the predicates "starring" and "director"]
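Selecting a predicate with a combined measure can be sketched as picking the candidate with the largest product of the chosen measures. This is an illustration only: the FRQ, EXC, and DSC values below are invented, the measures are assumed to be precomputed, and the function name `select_predicate` is hypothetical.

```python
def select_predicate(candidates, measures=("frq", "exc", "dsc")):
    """Pick the predicate whose product of the chosen measures is largest.

    candidates maps predicate name -> dict of precomputed measure values.
    """
    def combined(predicate):
        score = 1.0
        for measure in measures:
            score *= candidates[predicate][measure]
        return score

    return max(candidates, key=combined)


# Invented measure values for two predicates connecting the same two resources.
toy_candidates = {
    "director": {"frq": 1000, "exc": 0.5, "dsc": 3},
    "starring": {"frq": 5000, "exc": 0.05, "dsc": 3},
}
by_product = select_predicate(toy_candidates)
by_frequency = select_predicate(toy_candidates, measures=("frq",))
print(by_product, by_frequency)
```

In this toy setting, frequency alone would favor the more common predicate, while the product with exclusivity favors the more specific one.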
CONFIGURATION
Dataset and Measure
Introduced in Gunaratna et al. [3].
Contains human-created summaries of 50 entities
(DBpedia 3.9, outgoing relations).
Includes seven top-5 and seven top-10 summaries for each entity.
The dataset was created by 15 experts from the Semantic Web field.
Used measure:
Configuration
[Figure: two panels, top-5 and top-10]
Parameters:
α – linear combination of PageRank and Backlinks.
Predicate selection – combinations of FRQ, EXC, and DSC.
Best configuration: α = 0.8 / α = 0.9, FRQ*EXC*DSC
EVALUATION
Setup: Quantitative Evaluation
Compare results to the FACES system (introduced in [3]).
FACES:
Semantically diverse predicates via clustering.
Basic ranking heuristic for selecting cluster representatives.
Dataset and quality measure: the same as in the configuration step.
Evaluated configurations:
config-1: α = 0.8, FRQ*EXC*DSC
config-2: α = 0.9, FRQ*EXC*DSC
Significance testing:
Wilcoxon Signed-Rank Test with two tails.
Results: Quantitative Evaluation
SO: Subject-Object pairs (predicates not considered).
SPO: Full triple.
Significance with respect to both LinkSUM configurations (p < 0.05).
Significance with respect to the best LinkSUM configuration (p < 0.05).
SD: Standard deviation.
Setup: Qualitative Evaluation
Scenario: Search Engine Result Page (SERP).
20 users, 10 entities (from the FACES dataset).
Results: Qualitative Evaluation
In some cases the task is subjective.
Reasons for:
Selection
- the presented related resources are relevant for the entity.
Rejection
- redundancy.
- related resources do not characterize the entity.
CONCLUSIONS
Conclusions
LinkSUM improves on the state of the art in entity summarization.
LinkSUM is lightweight and can be applied in other scenarios, e.g.:
Web sites with semantic annotations.
Semantic MediaWikis.
Entity summarization in SERP scenarios:
Focus should be on selecting relevant resources.
Redundancies at the object level should be avoided.
Lessons learned and future directions
Selected facts should provide information about the entity (the main difference from recommender systems).
Summaries and rankings are often subjective (but general tendencies are noticeable).
Established quantitative evaluation datasets are still missing (although several research efforts have already targeted that problem).
Presentation aspects are very important (these should be neutralized in qualitative evaluation [4]).
Personalization and contextualization of entity summaries are becoming an important field (LinkSUM can serve as a basis).
Questions?
@thalhamm
Resources
DBpedia PageRank dataset:
http://people.aifb.kit.edu/ath/#DBpedia_PageRank
LinkSUM:
http://km.aifb.kit.edu/services/link/
International Workshop on Summarizing and Presenting Entities and
Ontologies:
http://km.aifb.kit.edu/ws/sumpre2015
http://km.aifb.kit.edu/ws/sumpre2016
FACES:
http://wiki.knoesis.org/index.php/FACES
References
1. S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search
engine. In Proceedings of the 7th International Conference on World Wide
Web 7. Elsevier, 1998.
2. J. Waitelonis and H. Sack. Towards exploratory video search using linked
data. Multimedia Tools and Applications, 59:645–672, 2012.
doi:10.1007/s11042-011-0733-1.
3. K. Gunaratna, K. Thirunarayan, and A. P. Sheth. FACES: Diversity-Aware
Entity Summarization Using Incremental Hierarchical Conceptual Clustering.
In Proceedings of the 29th AAAI Conference on Artificial Intelligence,
Austin, Texas, USA, 2015.
4. A. Thalhammer and S. Stadtmüller. SUMMA: A Common API for Linked Data
Entity Summaries. In Engineering the Web in the Big Data Era. Springer,
2015.