
EVELIN: Exploration of Event and Entity Links in Implicit Networks

Andreas Spitz, Satya Almasian and Michael Gertz
Database Systems Research Group, Heidelberg University, Im Neuenheimer Feld 205, 69120 Heidelberg, Germany

Representing Document Collections as Implicit Networks of Entities

EVELIN Online Query Interface

Implicit entity networks are graph structures that can be constructed easily from large collections of text documents. Unlike knowledge bases, implicit networks do not encode explicit, discrete relations between entities. Instead, they represent the strength of entity relations as weights of the edges between entities. These weights support ad-hoc information retrieval on the underlying document collection. The recently introduced LOAD networks [1] include entities of the types location, organization, actor, and date, which frequently appear in the description of events. EVELIN is an interactive online tool for the exploration and analysis of such entity-centric networks.

How It Works: For edges between entities x ∈ X and y ∈ Y, we generate weights ω(x, y) from their textual distances δ (measured in the number of sentences) over all instances I in which x and y are mentioned together. We normalize by the size of the neighbourhood N(x) of x within Y.

$$
\omega(x, y) := \left( \log \frac{|Y|}{|N(x) \cap Y|} \right) \sum_{i \in I} e^{-\delta(x, y, i)}
$$
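The weighting above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `load_edge_weights`, the `(x, y, delta)` tuple layout for co-occurrence instances, and the `size_of_y` parameter standing in for |Y| are all assumptions made for this example.

```python
import math
from collections import defaultdict


def load_edge_weights(instances, size_of_y):
    """Compute omega(x, y) from co-occurrence instances.

    instances: iterable of (x, y, delta) tuples (assumed layout), where delta
               is the sentence distance of one joint mention of x and y.
    size_of_y: |Y|, the total number of entities in the set Y (assumed input).
    """
    distance_sums = defaultdict(float)  # sum over i of exp(-delta(x, y, i))
    neighbours = defaultdict(set)       # N(x) restricted to Y

    for x, y, delta in instances:
        distance_sums[(x, y)] += math.exp(-delta)
        neighbours[x].add(y)

    return {
        (x, y): math.log(size_of_y / len(neighbours[x])) * total
        for (x, y), total in distance_sums.items()
    }


# Hypothetical toy data: two joint mentions of one pair, one of another.
toy_instances = [("Perth", "WWW", 0), ("Perth", "WWW", 1), ("Perth", "UN", 2)]
toy_weights = load_edge_weights(toy_instances, size_of_y=4)
```

Because the logarithmic factor depends on the neighbourhood of x, the sketch produces directed weights: ω(x, y) and ω(y, x) generally differ.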

Demo Document Collection: LOAD network created from the text of the English Wikipedia (excluding structured tables). All entity mentions are linked to Wikidata entities by following Wikipedia links. Temporal expressions are annotated with HeidelTime [2], and entity types are classified according to YAGO [3].

Online Demonstration: The EVELIN online search interface for the demonstration data is available as a web application. The interface is mobile friendly and you can try it out for yourself at:

evelin.ifi.uni-heidelberg.de

Input Entity Selection

Composing Queries from Entities: Enter a string and use the pop-up menu to select one of the suggested Wikidata entities from the database of stored entities that occur in the documents. Alternatively, press enter to input the string as an untyped search term. Continue adding more entities or launch a query.

Entity Queries and Term Queries

A query consists of a set of query entities Q and some target set X. Queries are launched by selecting the corresponding target set button. The result of a query is a ranking of all entities in X according to their relation strength to entities in the query set. An entity query results in a ranking of locations, organizations, actors or dates. Entities in the result include links to the Wikidata knowledge base. A term query returns a ranking of stemmed keywords for the query entities.

Query Extension: Discovered entities can be added to grow the query with a double click on the entity.

How It Works: We rank all entities x in target set X in relation to entities in the query set Q in a two component ranking scheme. It accounts for the number of common neighbours and the weights of edges between the target and query entities.

$$
r(x \mid Q) := |N(x) \cap Q| - 1 + \frac{1}{s_{\max}} \sum_{q \in Q} \omega(q, x)
$$
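The two components of the entity ranking can be illustrated with the following Python sketch. It rests on assumptions that the poster does not state: s_max is taken to be the largest weight sum over all candidate entities (so that the second component stays within [0, 1]), an entity x counts as a neighbour of q whenever an edge weight for (q, x) is stored, and the names `rank_entities`, `weights`, `query` and `targets` are hypothetical.

```python
def rank_entities(weights, query, targets):
    """Rank target entities x by r(x|Q).

    weights: dict mapping (q, x) to omega(q, x); a missing key means no edge.
    query:   collection of query entities Q.
    targets: collection of candidate entities from the target set X.
    """
    cohesion = {}     # |N(x) ∩ Q|: number of query entities adjacent to x
    weight_sums = {}  # sum over q in Q of omega(q, x)

    for x in targets:
        edges = [weights[(q, x)] for q in query if (q, x) in weights]
        if edges:  # entities without any edge to the query are not ranked
            cohesion[x] = len(edges)
            weight_sums[x] = sum(edges)

    if not weight_sums:
        return []

    # Assumption: s_max normalizes by the largest weight sum among candidates.
    s_max = max(weight_sums.values())
    scores = {
        x: cohesion[x] - 1 + (weight_sums[x] / s_max if s_max > 0 else 0.0)
        for x in weight_sums
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy network: x1 touches both query entities, x2 only one.
toy_weights = {("a", "x1"): 2.0, ("b", "x1"): 1.0, ("a", "x2"): 4.0}
toy_ranking = rank_entities(toy_weights, query=["a", "b"], targets=["x1", "x2", "x3"])
```

Under this reading, the integer part of a score reflects how many query entities a candidate is connected to, and the normalized weight sum only breaks ties among candidates with the same count.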

Sentence Queries

Similar to entities, sentences can be used as a query target. Here, the aim is to extract sentences that best describe an entity or a combination of entities.

How It Works: A set of the most relevant terms T_Q for the query entities in Q is determined and used to refine the ranking of sentences s in a two component scheme.

$$
r(s \mid Q) := |N(s) \cap Q| + \frac{|N(s) \cap T_Q|}{|T_Q| + 1}
$$
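A short Python sketch may help to see how the sentence score behaves. It assumes that the neighbourhood N(s) of a sentence is simply the set of entities and stemmed terms it contains, and that the term set T_Q has already been obtained from a term query; the names `rank_sentences`, `sentences` and `top_terms` are hypothetical.

```python
def rank_sentences(sentences, query, top_terms):
    """Rank sentences s by r(s|Q).

    sentences: dict mapping a sentence id to the set of entities and terms
               it contains (assumed representation of N(s)).
    query:     collection of query entities Q.
    top_terms: collection of the most relevant terms T_Q for the query.
    """
    query = set(query)
    top_terms = set(top_terms)
    scores = {}
    for sentence_id, neighbourhood in sentences.items():
        entity_part = len(neighbourhood & query)
        # The denominator |T_Q| + 1 keeps the term component strictly below 1.
        term_part = len(neighbourhood & top_terms) / (len(top_terms) + 1)
        scores[sentence_id] = entity_part + term_part
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data with two sentences and three relevant terms.
toy_sentences = {
    "s1": {"Perth", "Australia", "confer"},
    "s2": {"Perth", "web", "confer"},
}
toy_ranking = rank_sentences(
    toy_sentences,
    query={"Perth", "Australia"},
    top_terms={"web", "confer", "demo"},
)
```

Since the term component is always smaller than 1, a sentence that mentions more query entities outranks one that mentions fewer, regardless of how many relevant terms either contains.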

Page Queries

Entities or sets of entities can be linked to informative pages in the document collection that describe them well. Since we use Wikipedia as a document collection, we link entities to Wikipedia pages.

How It Works: Ranking scores are computed for sentences s based on the input entities and then propagated to the pages p.

$$
r(p \mid Q) := \max_{s \in p} \{ |N(s) \cap Q| \} + \frac{1}{s_{\max}} \sum_{s \in p} r(s \mid Q)
$$
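The propagation from sentences to pages can be sketched as follows. As before, this is an illustration under assumptions: s_max is taken to be the largest sum of sentence scores over all pages, each sentence is represented by the set of entities and terms it contains, and the names `sentence_score`, `rank_pages`, `pages` and `top_terms` are hypothetical.

```python
def sentence_score(neighbourhood, query, top_terms):
    """r(s|Q) for one sentence, given its set of entities and terms."""
    return len(neighbourhood & query) + len(neighbourhood & top_terms) / (len(top_terms) + 1)


def rank_pages(pages, query, top_terms):
    """Rank pages p by r(p|Q).

    pages: dict mapping a page id to a list of sentence neighbourhoods (sets).
    """
    query = set(query)
    top_terms = set(top_terms)
    cohesion = {}    # max over sentences of |N(s) ∩ Q|
    score_sums = {}  # sum over sentences of r(s|Q)

    for page, sentences in pages.items():
        if not sentences:
            continue  # a page without sentences cannot be scored
        cohesion[page] = max(len(s & query) for s in sentences)
        score_sums[page] = sum(sentence_score(s, query, top_terms) for s in sentences)

    if not score_sums:
        return []

    # Assumption: s_max normalizes by the largest sentence score sum of any page.
    s_max = max(score_sums.values())
    scores = {
        page: cohesion[page] + (score_sums[page] / s_max if s_max > 0 else 0.0)
        for page in score_sums
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy collection with two pages.
toy_pages = {
    "P1": [{"a", "b", "t1"}, {"a"}],
    "P2": [{"a", "t1", "t2"}],
}
toy_ranking = rank_pages(toy_pages, query={"a", "b"}, top_terms={"t1", "t2", "t3"})
```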

Subgraph Queries

For exploratory purposes, subgraphs can be extracted around the set of query entities to highlight the immediate neighbourhood. Here, we consider the three highest ranked entities of each type for the subgraph construction. Due to the density of neighbourhoods, this extraction tends to be slower than ranking queries.
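One possible reading of the subgraph construction is sketched below in Python. The poster does not say which edges are kept, so the sketch assumes an induced subgraph over the query entities and the selected top-ranked entities; the names `extract_subgraph`, `ranked_by_type` and `k` are hypothetical, and the rankings are assumed to be precomputed and sorted by descending score.

```python
def extract_subgraph(weights, query, ranked_by_type, k=3):
    """Collect the query entities plus the k best entities of each type.

    weights:        dict mapping (u, v) to an edge weight.
    query:          collection of query entities.
    ranked_by_type: dict mapping an entity type to a list of (entity, score)
                    pairs, sorted by descending score (assumed precomputed).
    """
    nodes = set(query)
    for ranking in ranked_by_type.values():
        nodes.update(entity for entity, _score in ranking[:k])

    # Assumption: keep every stored edge whose endpoints were both selected.
    edges = {
        (u, v): weight
        for (u, v), weight in weights.items()
        if u in nodes and v in nodes
    }
    return nodes, edges


# Hypothetical toy data: four ranked locations, one ranked actor.
toy_weights = {("q", "l1"): 1.0, ("q", "l4"): 2.0, ("l1", "a1"): 0.5}
toy_ranked = {
    "location": [("l1", 2.0), ("l2", 1.5), ("l3", 1.0), ("l4", 0.5)],
    "actor": [("a1", 1.0)],
}
toy_nodes, toy_edges = extract_subgraph(toy_weights, {"q"}, toy_ranked)
```

The filter over all stored edges hints at why this query type is comparatively slow: in dense neighbourhoods, many edges have to be inspected even though only a few nodes are kept.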

Contact Information:
Andreas Spitz
[email protected]
dbs.ifi.uni-heidelberg.de/

References
[1] A. Spitz and M. Gertz: Terms over LOAD: Leveraging Named Entities for Cross-Document Extraction and Summarization of Events. SIGIR’16, 2016
[2] J. Strötgen and M. Gertz: Multilingual and Cross-domain Temporal Tagging. Language Resources and Evaluation, 47(2): 269–298, 2013
[3] F. Mahdisoltani, J. Biega and F. Suchanek: YAGO3: A Knowledge Base from Multilingual Wikipedias. CIDR’14, 2014

This demonstration was presented at the 26th International World Wide Web Conference (WWW ’17), April 3-7, 2017, Perth, Australia.
