Co-funded by the European Union
Information Access through Textual Entailment: The Experience of the QALL-ME project
Bernardo Magnini, FBK-irst, Trento, Italy


Page 2: Outline

First Kyoto Workshop, February 3, 2009

The QALL-ME scenario
Semantic interpretation of user queries
• Suggested direction: textual entailment engines
Interacting with the user
• Suggested direction: provide answers with as much structure as possible (RDF)
Porting the system
• Suggested direction: learn as much as possible from data (user questions)
Conclusions

Page 3: QALLME

Question Answering Learning Technologies in a Multilingual and Multimodal Environment

Reference: FP6 IST-033860
Contract type: STREP
Start date: October 1st, 2006
Duration: 36 months
Project funding: 2.82 M euros
http://qallme.itc.it

FBK-irst, Italy
DFKI, Germany
University of Alicante, Spain
University of Wolverhampton, UK
Comdata S.p.A., Italy
Ubiest S.p.A., Italy
Waycom S.r.l., Italy

Page 4: Query Driven vs Answer Driven Information Access

How many people live in Trento?
• No answer in the first ten documents using Google.

When did Hitler attack the Soviet Union?
• We find documents containing the question itself, no matter whether or not the answer is actually provided.

Current information access is query driven. Question Answering proposes an answer driven approach to information access.

See how Google and Yahoo answer “Who is Bill Clinton?”

Page 5: QALL-ME Scenario

Mobile devices: mobile phones and PDAs
Question input: Voice/SMS
Answer output: Voice/SMS/MMS/Digital Assistant (images, audio, video, maps and geo-referenced interactive maps)

[Diagram labels: INPUT, OUTPUT, SMS, MMS, VOICE, TEXT, VIDEO, DIGITAL ASSISTANT]

Page 6: QALL-ME: Requests

“hallo I am in Trento and I would like to visit a church in the centre of the town I would like to know the name and the location of one of these churches thanks”
(from the QALL-ME benchmark)

Pages 7-10: QALL-ME Questions

The request from Page 6 (from the QALL-ME benchmark) is shown again, one speech act per slide:
• To greet
• To contextualise (this is explicit context; time is implicit)
• To ask
• To thank

Page 11: QALL-ME Resources

QALL-ME benchmark: acquisition for four languages (about 12,000 requests in total). Semantic annotations: transcriptions, speech acts, EAT, translations.

Language   Audio   Transcr.   Eng. translat.   Speech acts   EAT Sekine        EAT ontology
ITALIAN    X       X          X                X             X                 X
SPANISH    X       X          X                X             X                 X
ENGLISH    X       X          ---              X             almost finished   almost finished
GERMAN     X       X          X                X             in progress       in progress

QALL-ME Ontology: version 4

Both the QALL-ME benchmark and the QALL-ME ontology are being made incrementally available at the project website (http://qallme.fbk.eu) under a Creative Commons license.

Two papers at LREC 2008.

Page 12: QALL-ME Mobile Infrastructure

[Diagram: Waycom srl demo prototype. Component labels: Front-End Application, Client Library, Virtual Phone Engine, ASR Manager, ASR Engine, TTS Manager, TTS Engine, Resource Interface (German/English), Server Side Application, QALL-ME Webservices. Link labels: IP, API, Voice data, Application data (TOWN: TRENTO; Address: VIA VERDI 3)]

Page 13: Showcases

Cinema and Accommodation domain
• Automatic procedures for daily updating (Trento)
• Distributed services
• Cross-language
• More complex questions

Mobile showcase
• Infrastructure has been consolidated
• Runs on Comdata server
• Nokia N95 with GPS
• Speech input (Italian only)
• Cross-language: SMS only
• Navigation
• Text to Speech

Page 14: QALL-ME architecture

[Diagram labels: QALL-ME central QA planner, Service Provider, Spanish Answer Extractor, Italian Answer Extractor, German Answer Extractor, English Answer Extractor, Speech Recognizers, Local Information Sources, Shared Semantic Representation, Question Type Ontology, Answer Type Ontology, Dialog Models]

Page 15: Structured and Unstructured Data

MOVIE
Title: 007 Casino Royale
…
Date: from 01/26/2007 to 02/01/2007
Hours: from 01/26 to 01/30: 19.30; 02/01: 18.20
…
Original title: Casino Royale
Director: Martin Campbell
Genre: Action
Characters: James Bond

Page 16: QALL-ME in a nutshell

[Diagram labels: User Data, Question Collection, Question Annotation, Training, Entailment Engine, QALL-ME Ontology, Question, Answer Representation, Presentation Template, Presentation output]

Page 17: QALL-ME Architecture

Page 18: Outline

The QALL-ME scenario
Semantic interpretation of user queries
• Suggested direction: Entailment Engine
Presenting information
How to build the system
Conclusions

Pages 19-22: Question Interpretation (entailment-based Relation Extraction)

Given:
1. A domain ontology describing binary relations of interest
2. A natural language question

Determine ALL the relations of interest expressed by the question.

Pages 23-24: Question Interpretation (entailment-based RE)

[Diagram: questions Q1 to Q10; Page 24 adds the label “Out of domain questions”]

Pages 25-29: The task: example

R1: HasDirector(Movie, Director)
R2: HasGenre(Movie, Genre)
R3: HasPhoneNumber(Cinema, Phone)
R4: HasActor(Movie, Actor)
R5: IsInDestination(Cinema, Destination)
R6: HasDescription(Movie, Description)
R7: IsInSite(Movie, Site)
R8: HasDate(Movie, Date)
…
Rn: HasStartTime(Movie, StartTime)

INPUT: “What science fiction movie can I see today at cinema Astra in Trento?”

OUTPUT (one relation added per slide): R2, R5, R7, R8
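The input/output behaviour of this example can be sketched in a few lines of Python. This is only an illustration of the task, not the QALL-ME implementation: the hand-written cue phrases in `CUES` stand in for the entailment engine, and both `CUES` and `relations_for` are hypothetical names.

```python
# Subset of the relation inventory listed on the slide.
RELATIONS = {
    "R1": "HasDirector(Movie, Director)",
    "R2": "HasGenre(Movie, Genre)",
    "R5": "IsInDestination(Cinema, Destination)",
    "R7": "IsInSite(Movie, Site)",
    "R8": "HasDate(Movie, Date)",
}

# ASSUMPTION: hand-written cue phrases replace the real entailment engine.
CUES = {
    "R1": ["directed by", "director"],
    "R2": ["science fiction", "comedy", "genre"],
    "R5": ["in trento"],
    "R7": ["at cinema"],
    "R8": ["today", "tomorrow"],
}

def relations_for(question: str) -> list[str]:
    """Return the ids of ALL relations whose cues occur in the question."""
    text = question.lower()
    return [rid for rid in RELATIONS if any(cue in text for cue in CUES[rid])]

question = "What science fiction movie can I see today at cinema Astra in Trento?"
print(relations_for(question))
```

The point of the sketch is the output type: a set of relations per question rather than a single class label.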

Page 30: Textual Entailment

t: The technological triumph known as GPS … was incubated in the mind of Ivan Getting.

h: Ivan Getting invented the GPS.

TE tutorial at ACL 2007, Dagan, Roth, Zanzotto

Page 31: Applied Textual Entailment

A directional relation between two text fragments, Text (t) and Hypothesis (h):

t entails h (t ⇒ h) if humans reading t will infer that h is most likely true.

Operational (applied) definition:
• Human gold standard, as in NLP applications
• Assuming common background knowledge, which is indeed expected from applications

TE tutorial at ACL 2007, Dagan, Roth, Zanzotto

Page 32: Distance-Based TE Engine

Determines the best (least costly) sequence of edit operations that transforms T into H:
• Linear distance
• Tree Edit Distance

Determines the cost of the three edit operations (insertion, deletion, substitution).

Each entailment rule has a probability representing the degree of confidence of the rule. Rules can be at different levels (e.g. lexical, syntactic).
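A minimal sketch of the linear (token-level) variant may help make the idea concrete. It assumes uniform default costs, a normalisation by the cost of deleting all of T and inserting all of H, and a threshold of 0.25; none of these values come from the slides, and `edit_cost` and `entails` are hypothetical names. Tree Edit Distance and probabilistic rules are not covered.

```python
def edit_cost(t_tokens, h_tokens, ins=1.0, dele=1.0, sub=1.0):
    """Minimum total cost of edit operations turning t_tokens into h_tokens."""
    rows, cols = len(t_tokens) + 1, len(h_tokens) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = d[i - 1][0] + dele          # delete every token of T
    for j in range(1, cols):
        d[0][j] = d[0][j - 1] + ins           # insert every token of H
    for i in range(1, rows):
        for j in range(1, cols):
            same = t_tokens[i - 1] == h_tokens[j - 1]
            d[i][j] = min(
                d[i - 1][j] + dele,                        # deletion
                d[i][j - 1] + ins,                         # insertion
                d[i - 1][j - 1] + (0.0 if same else sub),  # match or substitution
            )
    return d[-1][-1]

def entails(t, h, threshold=0.25, **costs):
    """ASSUMPTION: entailment holds when the normalised cost is small enough."""
    t_tok, h_tok = t.lower().split(), h.lower().split()
    worst = edit_cost(t_tok, [], **costs) + edit_cost([], h_tok, **costs)
    if worst == 0:
        return True
    return edit_cost(t_tok, h_tok, **costs) / worst <= threshold

print(entails("give me the address of cinema astra", "give me the address of astra"))
```

In a trained system the three costs and the threshold would be estimated from annotated T/H pairs rather than fixed by hand.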

Pages 33-39: Entailment-based QA over structured data

Input question
Q: “Where is cinema Astra located?”

Pattern repository (each pattern Pi is paired with a SPARQL query)
P1: What is the telephone number of Cinema:X?   P1SPARQL
P2: Who is the director of Movie:X?             P2SPARQL
P3: What is the ticket price of Cinema:X?       P3SPARQL
P4: Give me the address of Cinema:X.            P4SPARQL
…                                               …
Pn                                              PnSPARQL

Entailment engine
Q ⇒ P4

P4SPARQL
CONSTRUCT { ?addr tourism:street ?address }
WHERE { ?cinema rdf:type tourism:Cinema .
        ?cinema tourism:name "Astra" .
        ?cinema tourism:hasPostalAddress ?addr .
        ?addr tourism:street ?address }

Answer
A: Corso Buonarroti, 16 - Trento

Pages 37-39 repeat the same pipeline (pattern P4, query, answer) with other input questions:
Q: “What’s the address of Astra?”
Q: “Where can I find a cinema in the city centre?”
Q: “I want to see a movie at Astra. Where is it?”
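The pipeline on these slides (input question, pattern repository, entailment engine, SPARQL query) can be sketched as follows. This is a toy stand-in, not the project's engine: the gazetteer `ENTITIES`, the two lexical rules in `RULES`, the stopword list and the function names are all assumptions made for illustration. Only the P4 query text comes from the slides (with a braced CONSTRUCT template); the other queries are kept as opaque placeholders.

```python
import re

P4_QUERY = """CONSTRUCT { ?addr tourism:street ?address }
WHERE { ?cinema rdf:type tourism:Cinema .
        ?cinema tourism:name "{X}" .
        ?cinema tourism:hasPostalAddress ?addr .
        ?addr tourism:street ?address }"""

# Pattern repository: pattern text paired with its query ({X} is the entity slot).
REPOSITORY = {
    "P1": ("What is the telephone number of Cinema:X?", "P1SPARQL"),
    "P2": ("Who is the director of Movie:X?", "P2SPARQL"),
    "P3": ("What is the ticket price of Cinema:X?", "P3SPARQL"),
    "P4": ("Give me the address of Cinema:X.", P4_QUERY),
}

# ASSUMPTIONS (not from the slides): gazetteer, lexical rules, stopwords.
ENTITIES = {"astra": ("cinema:x", "Astra")}
RULES = {"where": "address", "located": "address"}
STOPWORDS = {"what", "who", "is", "the", "of", "give", "me"}

def tokens(text):
    return re.findall(r"[a-z0-9:]+", text.lower())

def select_pattern(question):
    """Return (pattern id, instantiated query), or None if no pattern is covered."""
    q, entity = set(), None
    for tok in tokens(question):
        if tok in ENTITIES:
            tok, entity = ENTITIES[tok]   # replace the name by its typed slot
        q.add(tok)
        if tok in RULES:
            q.add(RULES[tok])             # apply a lexical rule
    for pid, (pattern, query) in REPOSITORY.items():
        content = set(tokens(pattern)) - STOPWORDS
        if content.issubset(q):           # every content word of the pattern is covered
            return pid, query.replace("{X}", entity)
    return None

print(select_pattern("Where is cinema Astra located?")[0])
```

Because the matching happens between two texts (question and pattern), the query side never has to change when users phrase the same request differently.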

Page 40: Entailment-Based QA

Language variations are handled at the textual level.
• Alleviates the need for lexical mapping (as in traditional NLI systems)

Any textual entailment approach/algorithm can be used
• Distance-based, Machine Learning based
• Entailment rules with lexical and syntactic information

Linguistic phenomena are independent from the database organization
• Re-usable across different tasks (e.g. Relation Extraction)
• Does not change in case of open domain QA

Page 41: Outline

The QALL-ME scenario
Semantic interpretation of user queries
Presenting information
• Suggested direction: provide answers with as much structure as possible (RDF)
How to build the system
Conclusions

Page 42: QALLME: RDF-based output

RDF is a standard for representing knowledge in the Semantic Web.

RDF is independent both from languages and from media, allowing specific presentation components to be designed on top of it.

All reasoning capabilities allowed by RDF will be available in order to draw inferences from answers.

In order to represent the informative content of an answer, it seems natural to re-use concepts and relations already defined for the QALL-ME Ontology, rather than define a new set of predicates.

However, the informative content alone is not adequate for generating interactive QA presentations.

Pages 43-45: A closer look at SPARQL queries

CONSTRUCT { }
WHERE { }

“Construct” portion
Returns fragments of the ontology in the form of an RDF graph that represent the “answer” (core answer PLUS relevant additional information, useful for different answer presentation strategies).

“Where” portion
Represents the constraints necessary for answer extraction.

Pages 46-47: CONSTRUCT portion

IN: What’s on at Modena?

CONSTRUCT {
  ?event qmo:hasPeriod ?period .
  ?event qmo:isInSite ?cinema .
  ?event qmo:hasEventContent ?movie .
  ?movie rdf:type ?movietype .
  ?movie qmo:name ?moviename .
  ?cinema qmo:hasGPSCoordinate ?coordinate .
  ?cinema qmo:name ?cinemaname .
  ?cinema qmo:hasPostalAddress ?postaladdress .
  ?postaladdress qmo:isInDestination ?destination .
  …
  qma:AnswerInstance a qma:AnswersObject ;
    qma:hasAnswerValue ?movie
}

[Graph on Page 47, the same template drawn as nodes and edges:
event --hasPeriod--> period
event --isInSite--> cinema
event --hasEventContent--> movie
movie --type--> movietype
movie --name--> moviename
cinema --hasGPSCoord.--> coordinate
cinema --name--> cinemaName
cinema --hasPostalAddr.--> postalAddress
postalAddress --isInDestination--> Destination]

Page 48: WHERE portion

IN: What’s on at Modena?

CONSTRUCT { … }
WHERE {
  ?event qmo:hasPeriod ?period .
  ?event qmo:isInSite ?cinema .
  …
  { { ?cinema qmo:name "Supercinema Modena" } UNION { ?cinema qmo:name "Multisala Modena" } } .
  …
  FILTER (xsd:dateTime("2008-12-05T14:19:55") <= xsd:dateTime(fn:string-join(fn:string-join(xsd:string(?date),"T"),xsd:string(?time))))
  …
}

…the name of the cinema is “SUPERCINEMA MODENA” or “MULTISALA MODENA”
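As a sketch of how such a name constraint might be generated, the helper below builds the UNION group from a partial name given by the user. The substring matching rule, the list of known names passed in, and the function name `cinema_name_constraint` are assumptions for illustration; the slides do not say how QALL-ME resolves “Modena” to the two cinema names.

```python
def cinema_name_constraint(mention, known_names, var="?cinema"):
    """Build a SPARQL group accepting every known cinema name that contains the mention."""
    # ASSUMPTION: case-insensitive substring matching against a list of known names.
    matches = [n for n in known_names if mention.lower() in n.lower()]
    groups = [f'{{ {var} qmo:name "{n}" }}' for n in matches]
    if not groups:
        return ""
    return "{ " + " UNION ".join(groups) + " } ."

names = ["Supercinema Modena", "Multisala Modena", "Astra"]
print(cinema_name_constraint("Modena", names))
```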

Page 49: WHERE portion

IN: What’s on at Modena?

The FILTER clause of the query on Page 48 compares the time of the query with the date and time of each showing:

FILTER (xsd:dateTime("2008-12-05T14:19:55") <= xsd:dateTime(fn:string-join(fn:string-join(xsd:string(?date),"T"),xsd:string(?time))))

…the movie should be TODAY, and AFTER THE TIME OF THE QUERY
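The comparison made by the FILTER can be mirrored in plain Python, which may make the intended semantics easier to check. The sketch assumes ISO-formatted date and time strings and covers only the comparison shown in the FILTER; a same-day (“today”) check would be an additional constraint. `passes_time_filter` is a hypothetical name.

```python
from datetime import datetime

def passes_time_filter(query_time, date, time):
    """True when the showing starts at or after the time of the query."""
    # Same construction as in the FILTER: date, then "T", then time.
    start = datetime.fromisoformat(f"{date}T{time}")
    return start >= datetime.fromisoformat(query_time)

query_time = "2008-12-05T14:19:55"   # timestamp used in the slide
print(passes_time_filter(query_time, "2008-12-05", "21:00:00"))
```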

Page 50: Resulting RDF graph

IN: What’s on at Modena?

[Graph:
event --hasPeriod--> period
event --isInSite--> cinema
event --hasEventContent--> movie
movie --type--> Crime
movie --name--> La Fuga
cinema --name--> Modena
cinema --hasGPSCoord.--> coordinate (Longitude: 11°7′0′′E, Latitude: 46°4′0′′N)
cinema --hasPostalAddr.--> postalAddress
postalAddress --isInDestination--> Trento
period --HasDatePeriod--> Dateperiod (StartDate: 12/11/2008)
period --HasTimePeriod--> Timeperiod (StartTime: 21.00)]

Page 51: Using RDF for Presentations

Annotations over RDF triples for interactive/presentation purposes:
• Core vs complementary information with respect to the question
• Typical follow-up questions in specific domains
• Explanations for errors
• Aggregation of redundant information
• Natural messages

Three steps:
1. Produce RDF output
2. Annotate RDF with presentation metadata (in progress)
3. Generate a presentation for a specific media/language/device/user/… (in progress)


Page 53: Co-funded by the European Union Information Access through Textual Entailment: The Experience of the QALL-ME project Bernardo Magnini FBK-irst, Trento,

First Kyoto Workshop, February 3, 2009 53

Adding metadata to the RDF graphAdding metadata to the RDF graph

IN: What’s on at Modena? eventperiod hasPeriod

cinema

isInSite

movie

hasEventContent

Crime

type

La Fuga

name

coordinate

Modena

name

postalAddress

hasPostalAddr.

Trento isInDestination

hasGPSCoord.

11°7′0′′E

Longitude

46°4′0′′N

Latitude

Dateperiod

HasDatePeriod

12/11/2008

StartDate

Timeperiod

HasTimePeriod

21.00

StartTime

CORE ANSWERCORE ANSWER

Page 54: Co-funded by the European Union Information Access through Textual Entailment: The Experience of the QALL-ME project Bernardo Magnini FBK-irst, Trento,

First Kyoto Workshop, February 3, 2009 54

Adding metadata to the RDF graphAdding metadata to the RDF graph

IN: What’s on at Modena? eventperiod hasPeriod

cinema

isInSite

movie

hasEventContent

Crime

type

La Fuga

name

coordinate

Modena

name

postalAddress

hasPostalAddr.

Trento isInDestination

hasGPSCoord.

11°7′0′′E

Longitude

46°4′0′′N

Latitude

Dateperiod

HasDatePeriod

12/11/2008

StartDate

Timeperiod

HasTimePeriod

21.00

StartTime

DEFAULT TIMEDEFAULT TIME

Page 55: Co-funded by the European Union Information Access through Textual Entailment: The Experience of the QALL-ME project Bernardo Magnini FBK-irst, Trento,

First Kyoto Workshop, February 3, 2009 55

Adding metadata to the RDF graphAdding metadata to the RDF graph

IN: What’s on at Modena? eventperiod hasPeriod

cinema

isInSite

movie

hasEventContent

Crime

type

La Fuga

name

coordinate

Modena

name

postalAddress

hasPostalAddr.

Trento isInDestination

hasGPSCoord.

11°7′0′′E

Longitude

46°4′0′′N

Latitude

Dateperiod

HasDatePeriod

12/11/2008

StartDate

Timeperiod

HasTimePeriod

21.00

StartTime

NAMED BY USERNAMED BY USER

Adding metadata to the RDF graph

IN: What’s on at Modena?

[Figure: the same RDF answer graph (event, period, cinema, movie, coordinate, postalAddress, Dateperiod, Timeperiod and their values). Highlighted annotation: COMPLEMENTARY INFORMATION]
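
The four labels shown on these slides (CORE ANSWER, DEFAULT TIME, NAMED BY USER, COMPLEMENTARY INFORMATION) can be read as metadata attached to parts of the answer graph. Below is a minimal Python sketch of that idea using plain tuples rather than an RDF library. The node identifiers (event1, movie1, ...), the subjects chosen for each edge, and above all the assignment of a label to each triple are illustrative assumptions: the slides show which labels exist, not which triples carry them.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class AnnotatedTriple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    role: str  # presentation metadata attached to the triple


# Hypothetical node ids and hypothetical role assignments (not from the slides).
ANSWER_GRAPH: List[AnnotatedTriple] = [
    AnnotatedTriple("event1", "hasEventContent", "movie1", "CORE ANSWER"),
    AnnotatedTriple("movie1", "name", "La Fuga", "CORE ANSWER"),
    AnnotatedTriple("cinema1", "name", "Modena", "NAMED BY USER"),
    AnnotatedTriple("dateperiod1", "StartDate", "12/11/2008", "DEFAULT TIME"),
    AnnotatedTriple("timeperiod1", "StartTime", "21.00", "DEFAULT TIME"),
    AnnotatedTriple("movie1", "type", "Crime", "COMPLEMENTARY INFORMATION"),
    AnnotatedTriple("coordinate1", "Longitude", "11°7′0′′E", "COMPLEMENTARY INFORMATION"),
    AnnotatedTriple("coordinate1", "Latitude", "46°4′0′′N", "COMPLEMENTARY INFORMATION"),
]


def group_by_role(graph: List[AnnotatedTriple]) -> Dict[str, List[AnnotatedTriple]]:
    """Group triples by their metadata label so a front end can order the presentation."""
    grouped: Dict[str, List[AnnotatedTriple]] = defaultdict(list)
    for triple in graph:
        grouped[triple.role].append(triple)
    return dict(grouped)


if __name__ == "__main__":
    for role, triples in group_by_role(ANSWER_GRAPH).items():
        print(role)
        for t in triples:
            print(f"  {t.subject} --{t.predicate}--> {t.obj}")
```

With such a grouping, a presentation layer could show the CORE ANSWER triples first and offer the remaining groups on request.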

Outline

The Qallme scenario

Semantic Interpretation of user queries

Presenting information

How to port the system to a different domain
  A 101 hours experiment

Conclusions

Porting through domains

Methodological issues
• Acquisition of domain-specific questions
• Questions Annotation

Estimating domain complexity

Estimating the costs of domain portability

Porting to ACCOMMODATION

System Training and Evaluation

Focus on two crucial aspects of the QALL-ME approach
• Expected Answer Type recognition
• Entailment-based Relation extraction from questions

STRUCTURED FIELDS: name, category rating, street, postal code, town, region, telephone number, fax number, email, website

UNSTRUCTURED FIELDS: website, description, email, year of construction, number of junior suites, number of single rooms, category (recommended), services, languages, federal state, stars, room, ….

Domain Complexity

Cinema Vs Accommodation: a look at the data

                  CINEMA   ACCOMMODATION
Class instances   1010     4595
Relations         1692     6895
Data properties   1107     7912
Total             3809     19402

Domain Complexity

1. Number of possible Expected Answer Types (EAT)
   The more EATs, the harder the EAT recognition task

2. Number of valid relations
   Impact on RTE performance: the more relations, the harder the task

3. Size of the vocabulary
   Impact on EAT and RTE performance (language variability)

4. Average number of relations per question
   Impact on RTE performance (question complexity)

5. Average length of the collected questions
   Impact on RTE performance (question complexity)
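
The five indicators above can all be computed from a set of annotated questions. The following Python sketch shows one way to do so. It is an illustration under stated assumptions, not the project's tooling: the `AnnotatedQuestion` record, the lowercase whitespace tokenization used for vocabulary and length, and the sample questions and labels are all hypothetical, since the slides do not say how vocabulary size or question length were measured.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AnnotatedQuestion:
    text: str
    eat: str              # Expected Answer Type label
    relations: List[str]  # relation labels annotated on the question


def complexity_indicators(questions: List[AnnotatedQuestion]) -> Dict[str, float]:
    """Compute the five domain-complexity indicators over annotated questions."""
    if not questions:
        raise ValueError("at least one annotated question is required")
    # Assumption: tokens are lowercase whitespace-separated words.
    tokenized = [q.text.lower().split() for q in questions]
    n = len(questions)
    return {
        "num_eats": len({q.eat for q in questions}),
        "num_relations": len({r for q in questions for r in q.relations}),
        "vocabulary_size": len({tok for toks in tokenized for tok in toks}),
        "avg_relations_per_question": sum(len(q.relations) for q in questions) / n,
        "avg_question_length": sum(len(toks) for toks in tokenized) / n,
    }


# Hypothetical sample; labels and questions are invented for illustration.
SAMPLE = [
    AnnotatedQuestion("Where is the hotel Bellavista", "Address", ["hasPostalAddress"]),
    AnnotatedQuestion(
        "What is the phone number of the hotel Bellavista",
        "Telephone",
        ["hasTelephone", "hasName"],
    ),
]

if __name__ == "__main__":
    for name, value in complexity_indicators(SAMPLE).items():
        print(f"{name}: {value}")
```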

101 hours experience

Porting from Movies to ACCOMMODATION

• 1 Italian speaker, hired for 40 hours of work
  ~32 hours for question collection
  ~8 hours for question annotation (EAT, Relations, MRPs)
  Result:
    232 questions collected and annotated
    144 MRPs (13 on average per relation; min 5, max 57; 18 per hour)

• System training (182 questions)
  Development of a rule-based EAT recognizer
  Training of the RTE engine

• System evaluation (50 questions)
  EAT recognition
  Entailment-based Relation Extraction

Development and evaluation

Matteo Negri, Domain Portability - Dec. 11, 2008

EAT recognition, on Italian data:

• ML results confirm that ACCO is slightly easier than CIN

• On ACCO, +9% Accuracy over ML with only 120 rules

• Limited impact on performance for the combined system
  In the combined test set, most of the errors are still due to wrong assignments for the ACCO domain

Domain         #Questions  EATs  Approach     Person hours  Accuracy
CIN (ALL)      283         18    Rules (257)  ~100          75%
CIN (correct)  86          18    Rules (257)  ~100          92%
CIN (correct)  86          18    ML1          ~30           52%
ACCO           232         14    Rules (120)  ~30           68%
ACCO           232         14    ML2          ~1            59%
CIN+ACCO       318         28    Rules (437)  ~130          78%

Accuracy: CINEMA Vs ACCO

[Figure: Accuracy (y-axis) against Number of patterns (hours of work) (x-axis)]

Accuracy in terms of Exact Matches (proportion of questions for which the system recognized ALL and ONLY the correct relations), using different amounts of patterns/person hours
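
The Exact Match criterion defined in the caption can be made concrete with a short Python sketch. A question counts as correct only when the set of predicted relations equals the set of gold relations, so both a missing relation and a spurious one make it wrong. The function name and the relation labels in the example are hypothetical.

```python
from typing import Iterable, Sequence


def exact_match_accuracy(
    predicted: Sequence[Iterable[str]], gold: Sequence[Iterable[str]]
) -> float:
    """Proportion of questions whose predicted relations are ALL and ONLY the gold ones."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same questions")
    if not gold:
        raise ValueError("at least one question is required")
    hits = sum(1 for p, g in zip(predicted, gold) if set(p) == set(g))
    return hits / len(gold)


# Hypothetical relation labels, one entry per question.
GOLD = [{"hasName"}, {"hasAddress", "hasPhone"}, {"hasStars"}, {"hasPrice"}]
PREDICTED = [
    {"hasName"},              # exact match
    {"hasAddress"},           # one relation missing: not a match
    {"hasStars", "hasName"},  # one spurious relation: not a match
    {"hasPrice"},             # exact match
]

if __name__ == "__main__":
    print(exact_match_accuracy(PREDICTED, GOLD))
```

This measure is stricter than per-relation precision or recall, because partial credit is never given for a question.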

Accuracy: CINEMA Vs ACCO

[Figure: Accuracy (y-axis) against Number of patterns (hours of work) (x-axis)]

Accuracy: ACCO Vs CINEMA

• The curves confirm the previous conclusions…
  • ACCO is slightly easier to handle than CIN (92% Vs 82% with IDF, 80% Vs 48% with WOLP)
  • In the CIN domain, higher Accuracy differences reflect differences between questions and patterns

• …together showing that
  • more than the number of instantiated classes/relations/properties, the number of different classes/relations/properties (the database schema) is a relevant indicator of domain complexity

101 hours experiment: summary

40 hours acquisition/annotation
• 232 questions featuring:
  1. Number of possible EATs: 14
  2. Number of valid relations: 11
  3. Size of the vocabulary: 120
  4. Average number of relations per question: 1.44
  5. Average length of the collected questions: 9.45
• 144 MRPs (13 on average per relation)

30 hours on EAT rules development
• 68% Accuracy

31 hours on ML-based EAT recognizer (CIN)
• 59% Accuracy on ACCO

0 hours on entailment-based RE
• 92% Exact Matches in RE
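
The figures on this slide are consistent with each other and with the earlier "101 hours experience" slide, which a few lines of Python arithmetic can confirm. The variable names are ours. Reading "18 per hour" as MRPs per annotation hour is an interpretation that matches the reported numbers but is not stated in the slides.

```python
# Person hours per activity, as listed on the summary slide.
HOURS = {
    "question acquisition and annotation": 40,
    "EAT rules development": 30,
    "ML-based EAT recognizer": 31,
    "entailment-based RE": 0,
}
TOTAL_HOURS = sum(HOURS.values())

MRPS = 144              # patterns collected
VALID_RELATIONS = 11    # relations in the ACCOMMODATION domain
ANNOTATION_HOURS = 8    # from the "101 hours experience" slide (~8 hours)
TRAIN_QUESTIONS = 182   # from the "101 hours experience" slide
TEST_QUESTIONS = 50     # from the "101 hours experience" slide

MRPS_PER_RELATION = MRPS / VALID_RELATIONS  # reported as 13 on average
MRPS_PER_HOUR = MRPS / ANNOTATION_HOURS     # assumed meaning of "18 per hour"
TOTAL_QUESTIONS = TRAIN_QUESTIONS + TEST_QUESTIONS

if __name__ == "__main__":
    print(f"total person hours: {TOTAL_HOURS}")
    print(f"MRPs per relation: {MRPS_PER_RELATION:.2f}")
    print(f"MRPs per annotation hour: {MRPS_PER_HOUR:.0f}")
    print(f"questions (train + test): {TOTAL_QUESTIONS}")
```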

Conclusion

The QALL-ME project

• Entailment-based QA
  EDITS system: open source release expected March 2009

• Consolidated web service architecture: open source release expected March 2009

• Interactive QA
  Based on richer structured output
  From RDF to annotated RDF for interaction/presentations

• Porting through domains
  Learn from data (questions)

Interesting perspectives for deployment
• Several application scenarios (mobile services in a town, FAQ)
• Integration with digital assistant