Harvesting Knowledge from Social Networks: Extracting Typed Relationships among Entities

Andrea Caielli, Marco Brambilla, Stefano Ceri, Florian Daniel

[email protected]

@marcobrambi

SoWeMine Workshop @ ICWE 2017, Rome, Italy

Agenda

(1) Context

(2) Objectives

(3) Method

(4) Experiments and Validation

(5) Visualization and Exploration

(6) Conclusions

(1) Context

Ontology is the philosophical study of the nature of being, becoming, existence or reality, and the basic categories of being and their relations.

Formalizing new knowledge is hard

Only high frequency emerges

The long tail challenge

Sourcing the Long Tail

[Figure labels: Famous, Emerging]

(2) Objective

Objective

Extraction of relationships among entities

Reconstruct a typed graph of entities & relationships

Represent the knowledge contained in social data

No need for a priori domain knowledge
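
As a minimal sketch of what a "typed graph of entities & relationships" could look like in code, the following uses plain Python dataclasses. The class names, field names, and the type labels `TVSeason` and `Date` are assumptions made for illustration; the slides do not specify a data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    name: str
    entity_type: str  # illustrative type label, not taken from the slides


@dataclass(frozen=True)
class Relationship:
    source: Entity
    target: Entity
    relation_type: str  # normalized relation label, e.g. "air on"


@dataclass
class TypedGraph:
    entities: set = field(default_factory=set)
    relationships: list = field(default_factory=list)

    def add(self, source, relation_type, target):
        """Register both endpoints and the typed edge between them."""
        self.entities.update((source, target))
        self.relationships.append(Relationship(source, target, relation_type))

    def relations_of(self, entity):
        """Return every relationship in which the entity takes part."""
        return [r for r in self.relationships if entity in (r.source, r.target)]


graph = TypedGraph()
season = Entity("Season 2", "TVSeason")
date = Entity("May 24th", "Date")
graph.add(season, "air on", date)
```

The frozen dataclasses are hashable, so entities can be stored in a set and the same entity mentioned in many posts is kept only once.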

Knowledge Enrichment Setting

[Diagram: types (Type1, Type11, Type2, Type111) and instances connected by <<instanceof>> links. Instances comprise high-frequency entities (HF Entity1 to HF Entity5) and low-frequency entities (LF Entity1 to LF Entity4), with properties Property1 and Property2. Elements marked "??" are unknown. Labeled elements: Seed Entity, Seed Type, Type of interest.]

Legend: Expert inputs; Enrichment problems

- Relations HF - LF entities
- Relations LF - LF entities
- Typing of LF entities
- Extraction of new LF entities
- Finding attribute values

A Practical Example

Challenge and Innovation

Highly unstructured social data (tweets and Facebook posts)

No reliable grammar structures

(3) Method

Analysis Pipeline

(0) Preprocessing

(1) Entity Extraction

(2) Relationship Extraction

(3) Relationship Aggregation

(4) Relationship Typing

Evolution of the work presented in: M. Brambilla, S. Ceri, E. Della Valle, R. Volonterio, and F. Acero Salazar. “Extracting Emerging Knowledge from Social Media”, WWW 2017.

Pipeline Summary

(0) Preprocessing

Text cleaning and enrichment

+ Traditional text preprocessing (stemming, …)
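
The text-cleaning step could be sketched as below, using only the standard library. The specific rules (dropping URLs, keeping the handle of a mention, splitting camel-case hashtags) are assumptions about what "cleaning and enrichment" might involve, not the authors' actual rules, and `clean_post` is a hypothetical helper name. Stemming is left out because it would require an external library.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")
# Zero-width position between a lowercase and an uppercase letter.
CAMEL_RE = re.compile(r"(?<=[a-z])(?=[A-Z])")


def clean_post(text: str) -> str:
    """Apply assumed cleaning rules to a raw social media post."""
    text = URL_RE.sub(" ", text)          # drop links
    text = MENTION_RE.sub(r"\1", text)    # keep the handle, drop the "@"
    # Turn "#BlackSails" into "Black Sails" so later steps see plain words.
    text = HASHTAG_RE.sub(lambda m: CAMEL_RE.sub(" ", m.group(1)), text)
    return " ".join(text.split())         # collapse repeated whitespace


cleaned = clean_post("#BlackSails fans! Season 2 airs on May 24th http://t.co/abc @History")
```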

(1) Entity Extraction

Entity identification and semantic typing

Exploiting:

Stanford CoreNLP NER

Dandelion API

(2) Relationship Extraction

Baseline with Stanford OpenIE for triple extraction:

Several issues:

- Meaningless relations

- Wrong relations

- Multiple relations
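
One plausible way to reduce the issues listed above is to discard triples whose relation is too generic or whose arguments were not recognized as entities in the previous step. The sketch below assumes this filtering strategy; the `Triple` type, the `GENERIC_RELATIONS` list, and the sample triples are all hypothetical and do not come from the slides or from the OpenIE output format.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


# Assumed list of relation phrases too generic to carry meaning.
GENERIC_RELATIONS = {"is", "be", "has", "have", "do"}


def keep_triple(triple, known_entities):
    """Accept a triple only if its relation is specific and both ends are entities."""
    if triple.relation.lower() in GENERIC_RELATIONS:
        return False
    return triple.subject in known_entities and triple.obj in known_entities


def filter_triples(triples, known_entities):
    return [t for t in triples if keep_triple(t, known_entities)]


entities = {"Season 2", "May 24th", "History"}
candidates = [
    Triple("Season 2", "airs on", "May 24th"),
    Triple("fans", "is", "Sails"),
    Triple("Season 2", "airs on", "May 24th on History"),
]
kept = filter_triples(candidates, entities)
```

Only the first candidate survives: the second has a generic relation and unrecognized arguments, and the third has an object that is not a single recognized entity.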

(3) Relationship Aggregation

Sails fans. Season 2 airs on May 24th on History on D Stv Jag Comms

Too many answers for the same question!

Empirical rules

{"entity1": "Season 2", "relationship": "air on", "entity2": "May 24th"}

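The slides do not spell out the empirical rules, so the following is only a sketch of one possible aggregation: candidates that share the same `entity1` and `relationship` are grouped, and a single `entity2` is kept. The selection rule used here (most frequent object, ties broken by shortest string) is an assumption, as is the second candidate in the sample data.

```python
from collections import Counter, defaultdict


def aggregate(triples):
    """Keep one object per (entity1, relationship) pair.

    Assumed rule: prefer the most frequent object and break ties
    by choosing the shortest string.
    """
    candidates = defaultdict(Counter)
    for entity1, relationship, entity2 in triples:
        candidates[(entity1, relationship)][entity2] += 1

    result = []
    for (entity1, relationship), counts in candidates.items():
        best = min(counts, key=lambda obj: (-counts[obj], len(obj)))
        result.append(
            {"entity1": entity1, "relationship": relationship, "entity2": best}
        )
    return result


raw = [
    ("Season 2", "air on", "May 24th"),
    ("Season 2", "air on", "May 24th on History"),
    ("Season 2", "air on", "May 24th"),
]
aggregated = aggregate(raw)
```

With this input the three candidates collapse into the single record shown on the slide.
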
(4) Relationship Typing (A): Synonyms

Exploiting synsets based on WordNet 3.1
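
Synonym-based typing can be illustrated by mapping every relation verb to one representative of its synonym set and counting the results. In the sketch below, `TOY_SYNSETS` is a hand-written stand-in for WordNet synsets and the function names are hypothetical; a real run would query WordNet rather than a hard-coded list.

```python
from collections import Counter

# Hand-written stand-in for WordNet synsets (assumption, for illustration only).
TOY_SYNSETS = [
    {"air", "broadcast", "transmit"},
    {"watch", "view", "see"},
    {"love", "adore"},
]


def build_canonical_map(synsets):
    """Map each verb to the alphabetically first member of its synonym set."""
    canonical = {}
    for synset in synsets:
        label = min(synset)
        for verb in synset:
            canonical.setdefault(verb, label)
    return canonical


def group_relations(verbs, canonical):
    """Count relation occurrences after collapsing synonyms; unknown verbs stay as-is."""
    return Counter(canonical.get(verb, verb) for verb in verbs)


CANONICAL = build_canonical_map(TOY_SYNSETS)
grouped = group_relations(["air", "broadcast", "watch", "see", "premiere"], CANONICAL)
```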

(4) Relationship Typing (B): Matching Types

(4) Relationship Typing (C): Linguistics

Based on VerbNet

Groupings of verbs based on syntactic and semantic properties
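
Class-based typing differs from synonym grouping in that a verb may belong to more than one class. The sketch below counts occurrences per class under that assumption. The table `TOY_VERB_CLASSES` and its labels are placeholders invented for illustration; they are not real VerbNet class identifiers.

```python
from collections import Counter

# Illustrative verb-to-class table; labels are placeholders, not VerbNet IDs.
TOY_VERB_CLASSES = {
    "air": ["communication"],
    "announce": ["communication"],
    "watch": ["perception"],
    "see": ["perception"],
    "love": ["emotion"],
    "miss": ["emotion", "perception"],
}


def count_classes(relation_verbs, verb_classes):
    """Count one occurrence for every class a relation verb belongs to."""
    counts = Counter()
    for verb in relation_verbs:
        for verb_class in verb_classes.get(verb, ["unclassified"]):
            counts[verb_class] += 1
    return counts


class_counts = count_classes(["air", "watch", "see", "miss", "slay"], TOY_VERB_CLASSES)
```

Counts of this kind are what an occurrences-per-class bar chart would plot.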

Pipeline Implementation

(4) Validation

Experiments

TV Series: Black Sails, Teen Wolf, Vikings

Milan Fashion Week

Rugby games

Domains and quality of results: summary

Relationships and Verb Classes

Example: Teen Wolf

[Bar chart: Teen Wolf synonym classes; y-axis: occurrences, 0 to 800]

[Bar chart: Teen Wolf VerbNet classes; y-axis: occurrences]

Overall Quality Indexes of Entity and Relationships Extraction

(5) Visualization

Motivation

The resulting semantic models are extremely large and hard to interpret

Example: the Black Sails collection contains 1243 entities and 2025 relations.

Exploration

Visualization

Filtering

Navigation

Exploration

Visualization

Relationship filtering

Navigation
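
Filtering and navigation over the extracted graph can be sketched with two small functions on an edge list. This is not the authors' exploration tool, only an illustration of the two operations; the function names are hypothetical, and only the first edge in `EDGES` comes from the slides while the others are made up.

```python
# Toy edge list: (source entity, relationship type, target entity).
# Only the first edge appears in the slides; the rest are invented.
EDGES = [
    ("Season 2", "air on", "May 24th"),
    ("Season 2", "air on", "History"),
    ("Fans", "watch", "Season 2"),
]


def filter_by_relationship(edges, relation_type):
    """Keep only the edges carrying the requested relationship type."""
    return [edge for edge in edges if edge[1] == relation_type]


def neighbors(edges, entity):
    """Entities one hop away from the given entity, in either direction."""
    found = set()
    for source, _, target in edges:
        if source == entity:
            found.add(target)
        elif target == entity:
            found.add(source)
    return found


air_edges = filter_by_relationship(EDGES, "air on")
around_season = neighbors(EDGES, "Season 2")
```

On a graph with thousands of relations, restricting the view to one relationship type or to the neighborhood of one entity is a simple way to keep the display readable.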

Examples

Milano Fashion Week

Generated graph

(6) Conclusions

Conclusions

Extraction of relevant emerging relationships is feasible even for extremely unstructured and informal content (social media)

Still a long way to perfect extraction:

- N-ary relations
- Time dependency
- Poor typing of entities in ontologies

THANKS! QUESTIONS?

Marco Brambilla @marcobrambi [email protected]

http://datascience.deib.polimi.it http://home.deib.polimi.it/marcobrambi