NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on the Web
Multilingual Retrieval Interface for Structured Data on the Web
Dana Dannells, Ramona Enache, Mariana Damova
NLIWoD - ISWC’2014
Semantic Data Infrastructures
[Figure: an RDF repository queried with SPARQL. Leonardo (rdf:type :Painter) :painted Mona Lisa (rdf:type :Painting); the query asks who painted Mona Lisa.]
• Semantic Web
• Linked data
• SPARQL query language
2 10/19/2014
Natural Interface through NL
Q:
EN: Who painted Mona Lisa? / Who is Mona Lisa’s painter? / Who created Mona Lisa?
FR: Qui a peint Mona Lisa ? / Qui est le peintre de Mona Lisa ? / Qui a créé Mona Lisa ?
DE: Wer hat Mona Lisa gemalt? / Wer ist der Maler von Mona Lisa? / Wer hat Mona Lisa geschaffen?
A: Leonardo da Vinci
NL to ontology interoperability
Multilingual Retrieval Interface
GF – Grammatical Framework
• Type-theoretical grammar formalism supporting multilingual applications
• Two-layered architecture
– Abstract syntax - semantics
– Concrete syntax – language dependent surface structure
Abstract syntax:
  cat NP, VP, S;
  fun Mary, John: NP;
  fun Love: NP -> VP;
  fun Pred : NP -> VP -> S;
Concrete English syntax:
  lincat NP, VP, S = {s: Str};
  lin Mary = {s = "Mary"};
  lin John = {s = "John"};
  lin Love o = {s = "loves" ++ o.s};
  lin Pred sub v = {s = sub.s ++ v.s};
Abstract representation:
  lin Mary = mkNP (mkPN "Mary");
  lin John = mkNP (mkPN "John");
  lin Love o = mkVP (mkV2 "love") o;
  lin Pred sub v = mkS (mkCl sub v);
Ex: John loves Mary
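The two-layer idea can be illustrated outside GF. The following is a minimal Python sketch, not GF itself: an abstract tree is a nested tuple, and each concrete syntax is a dictionary of string-building rules. The Swedish rules and the names `TREE`, `ENGLISH`, `SWEDISH`, and `linearize` are assumptions added for illustration; they do not appear in the slides.

```python
# Abstract syntax tree as (function_name, *arguments), mirroring
#   fun Mary, John : NP ; Love : NP -> VP ; Pred : NP -> VP -> S
TREE = ("Pred", ("John",), ("Love", ("Mary",)))

# One concrete syntax per language: every abstract function maps to a
# rule that builds a string from its already linearized arguments.
ENGLISH = {
    "Mary": lambda: "Mary",
    "John": lambda: "John",
    "Love": lambda obj: "loves " + obj,
    "Pred": lambda subj, vp: subj + " " + vp,
}

# Illustrative assumption: a second language sharing the same abstract syntax.
SWEDISH = {
    "Mary": lambda: "Mary",
    "John": lambda: "John",
    "Love": lambda obj: "älskar " + obj,
    "Pred": lambda subj, vp: subj + " " + vp,
}


def linearize(tree, concrete):
    """Recursively turn an abstract tree into a string of one language."""
    name, *args = tree
    return concrete[name](*(linearize(arg, concrete) for arg in args))


print(linearize(TREE, ENGLISH))  # John loves Mary
print(linearize(TREE, SWEDISH))  # John älskar Mary
```

Because both dictionaries interpret the same tree, translating amounts to parsing in one language and linearizing in another, which is the sense in which the abstract syntax acts as an interlingua.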
Multilingual Aspect of GF
YAQL (Yet Another Query Language)
• A common architecture with one base module and domain knowledge representation
• Straightforward abstract syntax generation from ontology with just the minimum lexical types
– Common noun – Kind
– Noun phrase – Entity
– Verb phrase – Property
– Verb phrase with higher arity – Relation
• Reusable generic grammar structure
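As a rough sketch of what generating an abstract syntax from an ontology could look like, the Python snippet below maps ontology items to the four YAQL categories and emits GF-style `fun` declarations. The ontology fragment, the mapping from OWL constructs to categories, and every name in the code are hypothetical; the slides do not specify how the generation is actually implemented, and a real Relation would likely carry argument types rather than a bare category.

```python
# Hypothetical ontology fragment: (identifier, kind of OWL construct).
ONTOLOGY = [
    ("Painting", "class"),
    ("Mona_Lisa", "individual"),
    ("Painted", "datatype_property"),
    ("CreatedBy", "object_property"),
]

# Assumed mapping from OWL constructs to the YAQL lexical categories.
CATEGORY_FOR = {
    "class": "Kind",                  # common noun
    "individual": "Entity",           # noun phrase
    "datatype_property": "Property",  # verb phrase
    "object_property": "Relation",    # verb phrase with higher arity
}


def abstract_syntax(ontology):
    """Emit simplified GF-style declarations, one per ontology item."""
    lines = ["cat Kind ; Entity ; Property ; Relation ;"]
    for name, construct in ontology:
        lines.append(f"fun {name} : {CATEGORY_FOR[construct]} ;")
    return "\n".join(lines)


print(abstract_syntax(ONTOLOGY))
```

Keeping the mapping in a single table is what would make such a generator reusable: a new domain only supplies a new ontology list.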
Abstract syntax:
  Move ;
  Query ;
  MQuery : Query -> Move ;
Concrete syntax:
  Move = Utt ;
  Query = QS ;
Query Command Answer
YAQL and the Semantic Web
The category Kind gets coupled with OWL entities
• Yet another NL query language with two layers, so that a new domain model can be easily integrated into the query module
• Bidirectional translation in 15 languages (Bulgarian, Catalan, Danish, Dutch, English, Finnish, French, Hebrew, Italian, German, Norwegian, Romanian, Russian, Spanish, Swedish)
[Figure: architecture diagram with the labels Text, Query, Answer, Data, Lexicon, RGL, YAQL]
NL to SPARQL
Who painted Mona Lisa?
English: Who painted t ?
QPainter t = mkQS pastTense (mkQCl who_IP paint_V2 t)
Finnish: Whose painting is t ?
QPainter t = mkQS (mkQCl (mkIP (E.GenIP who_IP) (mkN "maalaama")) t)
French: By whom is t ?
QPainter t = mkQS (mkQCl (mkIAdv by8agent_Prep who_IP) t)
MQuery (QPainter (PTitle TMona_Lisa))
Abstract syntax
SPARQL
MQuery q = "PREFIX painting:<http://spraakbanken.gu.se/rdf/owl/painting.owl#> PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#> SELECT distinct" ++ q.wh1 ++ " WHERE { ?painting rdf:type painting:Painting; rdfs:label ?title; " ++ q.wh2 ++ q.prop ++ "}" ;
Concrete syntax
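The SPARQL concrete syntax assembles a query from three fields of the linearized question (`wh1`, `wh2`, `prop`). A minimal Python sketch of that assembly follows. The field values for the Mona Lisa question, including the property name `painting:createdBy` and the `FILTER` clause, are hypothetical placeholders: the slides show only the template, not what each field contains.

```python
# Prefixes copied from the MQuery rule on the slide.
PREFIXES = (
    "PREFIX painting:<http://spraakbanken.gu.se/rdf/owl/painting.owl#> "
    "PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#> "
    "PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>"
)

# Hypothetical field values for MQuery (QPainter (PTitle TMona_Lisa)).
QPAINTER_MONA_LISA = {
    "wh1": "?painter",
    "wh2": "painting:createdBy ?painter .",         # assumed property name
    "prop": 'FILTER (str(?title) = "Mona Lisa")',   # assumed title constraint
}


def mquery(q):
    """Join the template and the query fields with single spaces,
    mimicking the token concatenation (++) of the GF rule."""
    tokens = [
        PREFIXES,
        "SELECT distinct",
        q["wh1"],
        "WHERE { ?painting rdf:type painting:Painting; rdfs:label ?title;",
        q["wh2"],
        q["prop"],
        "}",
    ]
    return " ".join(token for token in tokens if token)


print(mquery(QPAINTER_MONA_LISA))
```

Treating SPARQL as one more concrete syntax means the same abstract tree that yields "Who painted Mona Lisa?" in English also yields the executable query.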
Evaluation
• User satisfaction
• Efficiency in terms of time, effort and cost
• Effectiveness, how the system scales up
Coverage: 1159 query patterns in 15 languages; 10 characteristics of CH objects
Extendibility: new query grammar in 150 lines of code
Evaluation: random queries in 7 languages, with very few native informants’ corrections
Conclusion
• NL to ontology interoperability approach
• Multilingual interface for retrieval of structured data from the Web
• Easily extendable initial base of YAQL transformations
• Great coverage of paraphrases
• Expert language/information engineers required
Thank you for your attention