Post on 02-Jul-2015
On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study
André Freitas, João C. Pereira da Silva, Edward Curry
Insight Centre for Data Analytics
NLIWoD, ISWC 2014
Riva del Garda
Outline
Goals
Semantic Tractability
Dimensions of Query-Database Semantic Heterogeneity
Definitions
Semantic Resolvability
Summary
Motivation
QA/NLI
Q0, R0
Q1, R1
...
Qn, Rn
f-measure
What is being evaluated by the test collection?
semantic matching
Goals
Provide a preliminary categorization of the semantic matching (schema-agnosticism) classes.
Support a conceptual understanding of the semantic phenomena behind schema-agnostic queries.
Applications:
- Help on the design and evaluation of schema-agnostic query mechanisms
- Relevant to Question Answering and Natural Language Interfaces
Semantic Tractability
Popescu et al. (2003)
Towards a Theory of Natural Language Interfaces to Databases
Definition focuses on soundness and completeness conditions for mapping Natural Language Queries to Database elements
Semantic Tractability
Leaves many queries outside the tractability scope
Conditions:
- Query-Database syntactic isomorphism
- Explicit and unambiguous synonymic mapping
The goal is to provide an all-inclusive categorization system
Dimensions of Query-Database Semantic Heterogeneity
Methodology for the creation of a taxonomy of lexico-semantic differences
Listing of concepts expressed in the existing semantic heterogeneity taxonomies:
- George, 2005
- Colomb, 1997
- Parent & Spaccapietra, 1998
- Kashyap & Sheth, 1996
Elimination of concepts which were not relevant in the context of the query-database semantic differences
Merging and renaming of equivalent concepts
Taxonomy of Semantic Differences
Semantic Mapping
[Figure: query tokens (token q) and dataset lexical elements (lexicon Σ), related through an associated semantic knowledge base (M)]
Semantic Reachability
Query-Dataset Semantic mapping:
Semantic Resolvability
Resolved Schema-agnostic Query
Semantic Mapping Types
Classifies each semantic mapping
According to the semantic heterogeneity classes
Taking into account some semantic phenomena (ambiguity, vagueness)
AP: Abstraction Process
Trivial
Lexical
Synonymic
Generalization/specialization
Conceptual
Functional/Aggregation
PS: Predicate Structure
Predication preserving
Predication difference
M: Semantic Knowledge Base
Self-Sufficient
Dependent on External Knowledge Base
SE: Semantic Evidence & Uncertainty
Absolute
Context resolvable
CT: Context
Sufficient
Insufficient
MC: Mapping Cardinality
1:1
1:N
N:1
M:N
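The six dimensions listed above can be read as the fields of a record that classifies one semantic mapping. The following is a minimal sketch in Python, not the authors' implementation: the class names (`AP`, `PS`, `M`, `SE`, `CT`, `MC`, `MappingType`) and the `signature` method are hypothetical, and the values are taken flat as they appear on the slide (any nesting among them is not recoverable here).

```python
from dataclasses import dataclass
from enum import Enum


class AP(Enum):  # Abstraction Process
    TRIVIAL = "trivial"
    LEXICAL = "lexical"
    SYNONYMIC = "synonymic"
    GENERALIZATION_SPECIALIZATION = "generalization/specialization"
    CONCEPTUAL = "conceptual"
    FUNCTIONAL_AGGREGATION = "functional/aggregation"


class PS(Enum):  # Predicate Structure
    PRESERVING = "preserving"
    DIFFERENCE = "difference"


class M(Enum):  # Semantic Knowledge Base
    SELF_SUFFICIENT = "self-sufficient"
    EXTERNAL = "external"


class SE(Enum):  # Semantic Evidence & Uncertainty
    ABSOLUTE = "absolute"
    CONTEXT_RESOLVABLE = "context-resolvable"


class CT(Enum):  # Context
    SUFFICIENT = "sufficient"
    INSUFFICIENT = "insufficient"


class MC(Enum):  # Mapping Cardinality
    ONE_TO_ONE = "1:1"
    ONE_TO_N = "1:N"
    N_TO_ONE = "N:1"
    M_TO_N = "M:N"


@dataclass(frozen=True)  # frozen, so instances are hashable and can be counted
class MappingType:
    ap: AP
    ps: PS
    m: M
    se: SE
    ct: CT
    mc: MC

    def signature(self) -> str:
        """Render the class in the (AP=..., PS=..., ...) notation."""
        return (
            f"AP={self.ap.value}, PS={self.ps.value}, MC={self.mc.value}, "
            f"SE={self.se.value}, M={self.m.value}, CT={self.ct.value}"
        )


# Hypothetical example: a mapping that varies along none of the dimensions.
easy = MappingType(AP.TRIVIAL, PS.PRESERVING, M.SELF_SUFFICIENT,
                   SE.ABSOLUTE, CT.SUFFICIENT, MC.ONE_TO_ONE)
print(easy.signature())
```

Keeping each dimension as a closed enumeration means a mapping can only be assigned one of the listed values per dimension, which is what makes counting distinct classes well defined.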
Semantic Interpretation Model
Example
Semantic Resolvability Classes
[Figure: semantic resolvability classes ordered from easier to harder]
Example test collection analysis
Test collection X
Has 4 distinct semantic resolvability classes
50% are trivial mappings
23% are lexical mappings
27% are synonymic mappings
100% of the predicates are structure preserving
100% of the mapping cardinalities are 1:1
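A profile like the one above could be produced by counting the values of each dimension over the annotated mappings of a collection. The sketch below assumes a hypothetical representation (one dict per mapping, keyed by the dimension abbreviations) and uses made-up toy data, not the figures of Test collection X.

```python
from collections import Counter

DIMENSIONS = ("AP", "PS", "MC", "SE", "M", "CT")


def dimension_distribution(mappings, dimension):
    """Percentage of mappings taking each value of one dimension."""
    counts = Counter(m[dimension] for m in mappings)
    total = sum(counts.values())
    return {value: 100.0 * n / total for value, n in counts.items()}


def distinct_classes(mappings):
    """Set of distinct value combinations across all six dimensions."""
    return {tuple(m[d] for d in DIMENSIONS) for m in mappings}


def annotate(ap):
    # Toy helper: only the abstraction process varies in this example.
    return {"AP": ap, "PS": "preserving", "MC": "1:1",
            "SE": "absolute", "M": "self-sufficient", "CT": "sufficient"}


# Made-up annotations for four query-dataset mappings.
toy_collection = [annotate("trivial"), annotate("trivial"),
                  annotate("lexical"), annotate("synonymic")]

print(dimension_distribution(toy_collection, "AP"))
print(dimension_distribution(toy_collection, "MC"))
print(len(distinct_classes(toy_collection)))
```

On the toy data, half of the mappings are trivial, all cardinalities are 1:1, and three distinct classes occur.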
Example system evaluation
System Y
Addresses 5 out of 10 semantic resolvability classes
(AP=conceptual, PS=*, MC=1:1, SE=*, M=*, CT=*)
- map = 0.51, recall = 0.7
...
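A per-class score of this kind could be computed by selecting the queries whose mapping class matches a pattern, with `*` standing for any value, and averaging a measure over them. The sketch below is an illustration under assumptions: the function names, the dict representation, and the choice of macro-averaged recall are hypothetical, and the data is made up rather than taken from System Y.

```python
def matches(pattern, mapping_class):
    """True if every dimension in the pattern is '*' or equals the class value."""
    return all(want == "*" or mapping_class.get(dim) == want
               for dim, want in pattern.items())


def mean_recall(pattern, results):
    """Average per-query recall over the queries whose class matches the pattern.

    `results` holds (mapping_class, retrieved_set, relevant_set) triples.
    Returns None when no query with relevant answers matches.
    """
    scores = [len(retrieved & relevant) / len(relevant)
              for mapping_class, retrieved, relevant in results
              if matches(pattern, mapping_class) and relevant]
    return sum(scores) / len(scores) if scores else None


def make_class(ap, mc="1:1"):
    # Toy helper: fixes the dimensions that this example does not vary.
    return {"AP": ap, "PS": "preserving", "MC": mc,
            "SE": "absolute", "M": "self-sufficient", "CT": "sufficient"}


PATTERN = {"AP": "conceptual", "PS": "*", "MC": "1:1",
           "SE": "*", "M": "*", "CT": "*"}

# Made-up system output: (class of the query, retrieved answers, gold answers).
TOY_RESULTS = [
    (make_class("conceptual"), {"a", "b"}, {"a", "b", "c", "d"}),  # recall 0.5
    (make_class("conceptual"), {"x"}, {"x"}),                      # recall 1.0
    (make_class("trivial"), set(), {"y"}),                         # not matched
    (make_class("conceptual", mc="1:N"), {"z"}, {"z"}),            # not matched
]

print(mean_recall(PATTERN, TOY_RESULTS))
```

Reporting one score per pattern, rather than a single score for the whole collection, shows which resolvability classes a system actually handles.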
Summary
NLI/QA systems have semantic matching (schema-agnosticism) at their center
The proposed categorization can be used for a more principled interpretation of the results of NLI/QA systems
... and also to identify which dimensions evaluation campaigns actually measure
It supports deeper comparative analysis
Future work includes the categorization of the QALD test collection