NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

Multilingual Retrieval Interface for Structured Data on the Web Dana Dannells, Ramona Enache, Mariana Damova NLIWoD - ISWC’2014

description

This presentation describes a multilingual retrieval interface for structured data on the Web, given as a talk at the NLIWoD workshop at ISWC 2014. The approach is based on Grammatical Framework (GF), Semantic Web, and Linked Data technologies.

Transcript of NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

Page 1: NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

Multilingual Retrieval Interface for Structured Data on the Web

Dana Dannells, Ramona Enache, Mariana Damova

NLIWoD - ISWC’2014

Page 2: NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

Semantic Data Infrastructures

[Figure: an RDF repository queried with SPARQL. Leonardo (rdf:type :Painter) is linked by :painted to Mona Lisa (rdf:type :Painting); the query asks who painted Mona Lisa.]

• Semantic Web

• Linked data

• SPARQL query language
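The triple pattern behind the slide's figure can be illustrated with a minimal sketch. It models the repository as a set of (subject, predicate, object) triples and answers the question mark by pattern matching. The identifiers `Leonardo` and `Mona_Lisa` and the `match` helper are illustrative assumptions, not part of the talk; a real deployment would send a SPARQL query to an RDF store instead.

```python
# Toy in-memory triple store; identifiers are illustrative, not the talk's dataset.
TRIPLES = {
    ("Leonardo", "rdf:type", ":Painter"),
    ("Mona_Lisa", "rdf:type", ":Painting"),
    ("Leonardo", ":painted", "Mona_Lisa"),
}


def match(pattern, triples=TRIPLES):
    """Return variable bindings for a single triple pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    results = []
    for triple in sorted(triples):
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            results.append(binding)
    return results


# "Who painted Mona Lisa?" expressed as one triple pattern.
print(match(("?who", ":painted", "Mona_Lisa")))
```

The same pattern with the variable in the object position, `("Leonardo", ":painted", "?what")`, would list the works attributed to Leonardo in this toy store.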

2 10/19/2014

Page 3: NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

Natural Interface through NL


Q:

EN | FR | DE
Who painted Mona Lisa? | Qui a peint Mona Lisa? | Wer hat Mona Lisa gemalt?
Who is Mona Lisa’s painter? | Qui est le peintre de Mona Lisa? | Wer ist der Maler von Mona Lisa?
Who created Mona Lisa? | Qui a créé Mona Lisa? | Wer hat Mona Lisa geschaffen?

A: Leonardo da Vinci

NL to ontology interoperability

Page 4: NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

Multilingual Retrieval Interface


Page 5: NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

GF – Grammatical Framework

• Type-theoretical grammar formalism supporting multilingual applications

• Two-layered architecture

– Abstract syntax - semantics

– Concrete syntax – language dependent surface structure


Abstract syntax:

cat NP, VP, S;
fun Mary, John: NP;
fun Love: NP -> VP;
fun Pred : NP -> VP -> S;

Concrete English syntax:

lincat NP, VP, S = {s: Str};
lin Mary = {s = "Mary"};
lin John = {s = "John"};
lin Love o = {s = "loves" ++ o.s};
lin Pred sub v = {s = sub.s ++ v.s};

Abstract representation:

lin Mary = mkNP (mkPN "Mary");
lin John = mkNP (mkPN "John");
lin Love o = mkVP (mkV2 "love") o;
lin Pred sub v = mkS (mkCl sub v);

Ex: John loves Mary
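The two-layer idea can be mimicked outside GF. The following is only a sketch in Python, not GF itself: abstract syntax trees are built from the function names on the slide, and a concrete syntax is a table of linearization rules that return records with an `s` field, mirroring the record-based English rules. The `Tree` class and the `linearize` helper are assumptions introduced for illustration.

```python
from dataclasses import dataclass


# Abstract syntax: trees built from the functions Mary, John, Love, Pred.
@dataclass(frozen=True)
class Tree:
    fun: str
    args: tuple = ()


# Concrete English syntax: one linearization rule per abstract function,
# each returning a record {"s": ...} like the rules on the slide.
ENGLISH = {
    "Mary": lambda: {"s": "Mary"},
    "John": lambda: {"s": "John"},
    "Love": lambda o: {"s": "loves " + o["s"]},
    "Pred": lambda sub, v: {"s": sub["s"] + " " + v["s"]},
}


def linearize(tree, concrete):
    """Linearize the subtrees first, then apply the rule for this node."""
    args = [linearize(arg, concrete) for arg in tree.args]
    return concrete[tree.fun](*args)


# Pred John (Love Mary)
tree = Tree("Pred", (Tree("John"), Tree("Love", (Tree("Mary"),))))
print(linearize(tree, ENGLISH)["s"])  # John loves Mary
```

Adding another language in this sketch means supplying another rule table for the same four function names; the tree itself does not change.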

Page 6: NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

Multilingual Aspect of GF


Page 7: NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

YAQL (Yet Another Query Language)

• A common architecture with one base module and domain knowledge representation

• Straightforward abstract syntax generation from ontology with just the minimum lexical types

– Common noun – Kind

– Noun phrase – Entity

– Verb phrase – Property

– Verb phrase with higher arity – Relation

• Reusable generic grammar structure
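As a rough illustration of generating abstract syntax from an ontology, the sketch below emits one GF-style `fun` declaration per ontology element, using the four lexical types listed above. The ontology fragment, the element kinds (`class`, `individual`, `property`, `relation`), and the flat declaration shape are all assumptions made for this example; the talk does not show the actual generation procedure or the real type signatures.

```python
# Lexical type assigned to each kind of ontology element.
# The pairing of ontology element kinds with types is an assumption;
# the slide only pairs each type with its grammatical category.
CATEGORY = {
    "class": "Kind",          # common noun
    "individual": "Entity",   # noun phrase
    "property": "Property",   # verb phrase
    "relation": "Relation",   # verb phrase with higher arity
}

# Hypothetical ontology fragment; these names are not from the talk.
ONTOLOGY = [
    ("Painting", "class"),
    ("Mona_Lisa", "individual"),
    ("IsFamous", "property"),
    ("PaintedBy", "relation"),
]


def abstract_syntax(ontology):
    """Emit one GF-style `fun` declaration per ontology element."""
    return [f"fun {name} : {CATEGORY[kind]} ;" for name, kind in ontology]


for line in abstract_syntax(ONTOLOGY):
    print(line)
```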


Abstract syntax:

Move ;
Query ;
MQuery : Query -> Move ;

Concrete syntax:

Move = Utt ;
Query = QS ;

Query Command Answer

Page 8: NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

YAQL and the Semantic Web

The category Kind gets coupled with OWL entities


• YAQL (yet another NL query language) has two layers, so that a new domain model can be easily integrated into the query module

• Bidirectional translation in 15 languages (Bulgarian, Catalan, Danish, Dutch, English, Finnish, French, Hebrew, Italian, German, Norwegian, Romanian, Russian, Spanish, Swedish)

[Figure: architecture diagram with the labels Text, Query, Answer, Data, Lexicon, RGL, YAQL]

Page 9: NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

NL to SPARQL


Page 10: NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

Who painted Mona Lisa?

English: Who painted t ?

QPainter t = mkQS pastTense (mkQCl who_IP paint_V2 t)

Finnish: Whose painting is t ?

QPainter t = mkQS (mkQCl (mkIP (E.GenIP who_IP) (mkN "maalaama")) t)

French: By who is t ?

QPainter t = mkQS (mkQCl (mkIAdv by8agent_Prep who_IP) t)


Abstract syntax:

MQuery (QPainter (PTitle TMona_Lisa))

Concrete syntax (SPARQL):

MQuery q = "PREFIX painting:<http://spraakbanken.gu.se/rdf/owl/painting.owl#> PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#> SELECT distinct" ++ q.wh1 ++ " WHERE { ?painting rdf:type painting:Painting; rdfs:label ?title; " ++ q.wh2 ++ q.prop ++ "}" ;
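The SPARQL linearization is essentially a fixed template with three slots filled from the query record. A minimal Python sketch of that assembly follows. The prefixes and the template come from the slide; the values used for `wh1`, `wh2`, and `prop`, including the `painting:createdBy` property and the `FILTER` clause, are hypothetical, since the talk does not show what `QPainter` contributes to these fields.

```python
# Prefix declarations taken from the slide's MQuery rule.
PREFIXES = (
    "PREFIX painting:<http://spraakbanken.gu.se/rdf/owl/painting.owl#> "
    "PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#> "
    "PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#> "
)


def mquery(q):
    """Join the fixed template with the wh1, wh2 and prop fields of q."""
    parts = [
        PREFIXES + "SELECT distinct",
        q["wh1"],
        "WHERE { ?painting rdf:type painting:Painting; rdfs:label ?title;",
        q["wh2"],
        q["prop"],
        "}",
    ]
    # Single spaces between tokens, similar in spirit to GF's ++ operator.
    return " ".join(p.strip() for p in parts if p.strip())


# Hypothetical field values for QPainter (PTitle TMona_Lisa);
# the property name and the filter are assumptions, not from the talk.
q_painter = {
    "wh1": "?painter",
    "wh2": "painting:createdBy ?painter.",
    "prop": 'FILTER (str(?title) = "Mona Lisa")',
}

print(mquery(q_painter))
```

Because only the three fields vary, each natural-language concrete syntax and the SPARQL concrete syntax can share the same abstract tree.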

Page 11: NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

Evaluation

• User satisfaction

• Efficiency in terms of time, effort and cost

• Effectiveness, how the system scales up


Coverage: 1159 query patterns in 15 languages; 10 characteristics of CH objects

Extendibility: new query grammar in 150 lines of code

Evaluation: random queries in 7 languages with very few native informants’ corrections

Page 12: NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

Conclusion

• NL to ontology interoperability approach

• Multilingual interface for retrieval of structured data from the Web

• Easily extendable initial base of YAQL transformations

• Great coverage of paraphrases

• Expert language/information engineers required


Page 13: NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web

Thank you for your attention

