On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study. André Freitas, João C. Pereira da Silva, Edward Curry. Insight Centre for Data Analytics. NLIWoD, ISWC 2014, Riva del Garda

description

The growing size, heterogeneity and complexity of databases demand strategies that help users and systems consume data. Ideally, query mechanisms should be schema-agnostic or vocabulary-independent, i.e. they should be able to match user queries, expressed in the users' own vocabulary and syntax, to the data, abstracting data consumers from the representation of the data. Although this is a central requirement across natural language interfaces and entity search, there is a lack of conceptual analysis of schema-agnosticism and of the associated semantic differences between queries and databases. This work provides an initial conceptualization of schema-agnostic queries, with a fine-grained classification that can support the scoping, evaluation and development of semantic matching approaches for schema-agnostic queries.

Transcript of On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Page 1: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

André Freitas, João C. Pereira da Silva, Edward Curry

Insight Centre for Data Analytics

NLIWoD, ISWC 2014

Riva del Garda

Page 3: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Outline

Goals

Semantic Tractability

Dimensions of Query-Database Semantic Heterogeneity

Definitions

Semantic Resolvability

Summary

Page 4: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Motivation

QA/NLI

Q0, R0

...Q1, R1

Qn, Rn

f-measure

What is being evaluated by the test collection?

semantic matching

Page 5: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Goals

Provide a preliminary categorization of the semantic matching (schema-agnosticism) classes.

Support a conceptual understanding of the semantic phenomena behind schema-agnostic queries.

Applications:

- Help with the design and evaluation of schema-agnostic query mechanisms

- Relevant to Question Answering and Natural Language Interfaces

Page 6: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Semantic Tractability

Popescu et al. (2003)

Towards a Theory of Natural Language Interfaces to Databases

Definition focuses on soundness and completeness conditions for mapping Natural Language Queries to Database elements

Page 7: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Semantic Tractability

Leaves many queries outside the tractability scope

Conditions:

- Query-Database syntactic isomorphism

- Explicit and unambiguous synonymic mapping

Goal is to provide an all-inclusive categorization system

Page 8: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Dimensions of Query-Database Semantic Heterogeneity

Methodology for the creation of a taxonomy of lexico-semantic differences

Listing of concepts expressed in the existing semantic heterogeneity taxonomies

- George, 2005

- Colomb, 1997

- Parent & Spaccapietra, 1998

- Kashyap & Sheth, 1996

Elimination of concepts which were not relevant in the context of the query-database semantic differences

Merging and renaming of equivalent concepts

Page 9: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Taxonomy of Semantic Differences

Page 10: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Semantic Mapping

Query Tokens

Dataset Lexical Element

Associated Semantic Knowledge Base (M)

[Figure: query token q, dataset lexicon Σ, semantic knowledge base M]

Semantic Reachability

Query-Dataset Semantic mapping:

Page 11: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Semantic Resolvability

Page 12: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Resolved Schema-agnostic Query

Page 13: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Semantic Mapping Types

Classifies each semantic mapping

According to the semantic heterogeneity classes

Taking into account some semantic phenomena (ambiguity, vagueness)

Page 14: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

AP: Abstraction Process

Trivial

Lexical

Synonymic

Generalization/specialization

Conceptual

Functional/Aggregation

Page 15: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

PS: Predicate Structure

Predication preserving

Predication difference

Page 16: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

M: Semantic Knowledge Base

Self-Sufficient

Dependent on External Knowledge Base

Page 17: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

SE: Semantic Evidence & Uncertainty

Absolute

Context resolvable

Page 18: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

CT: Context

Sufficient

Insufficient

Page 19: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

MC: Mapping Cardinality

1:1

1:N

N:1

M:N
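The six dimensions listed on the preceding slides (AP, PS, M, SE, CT, MC) can be read as the fields of a single record describing one semantic mapping. The sketch below is only an illustration of that reading, not the authors' implementation: the class name `MappingType`, the enum member names and the `label` helper are assumptions introduced here, while the value strings are taken from the slides.

```python
from dataclasses import dataclass
from enum import Enum


class AP(Enum):  # abstraction process
    TRIVIAL = "trivial"
    LEXICAL = "lexical"
    SYNONYMIC = "synonymic"
    GENERALIZATION_SPECIALIZATION = "generalization/specialization"
    CONCEPTUAL = "conceptual"
    FUNCTIONAL_AGGREGATION = "functional/aggregation"


class PS(Enum):  # predicate structure
    PRESERVING = "predication preserving"
    DIFFERENCE = "predication difference"


class M(Enum):  # semantic knowledge base
    SELF_SUFFICIENT = "self-sufficient"
    EXTERNAL = "dependent on external knowledge base"


class SE(Enum):  # semantic evidence and uncertainty
    ABSOLUTE = "absolute"
    CONTEXT_RESOLVABLE = "context resolvable"


class CT(Enum):  # context
    SUFFICIENT = "sufficient"
    INSUFFICIENT = "insufficient"


class MC(Enum):  # mapping cardinality
    ONE_TO_ONE = "1:1"
    ONE_TO_MANY = "1:N"
    MANY_TO_ONE = "N:1"
    MANY_TO_MANY = "M:N"


@dataclass(frozen=True)  # frozen makes instances hashable, so they can be counted or grouped
class MappingType:
    """Hypothetical record: one value per dimension for a single semantic mapping."""
    ap: AP
    ps: PS
    mc: MC
    se: SE
    m: M
    ct: CT

    def label(self) -> str:
        # Render the record in a "(AP=..., PS=..., ...)" tuple style.
        return (f"(AP={self.ap.value}, PS={self.ps.value}, MC={self.mc.value}, "
                f"SE={self.se.value}, M={self.m.value}, CT={self.ct.value})")


example = MappingType(AP.SYNONYMIC, PS.PRESERVING, MC.ONE_TO_ONE,
                      SE.ABSOLUTE, M.SELF_SUFFICIENT, CT.SUFFICIENT)
print(example.label())
```

Because the record is immutable and hashable, two mappings with the same value on every dimension compare equal, which is convenient when mappings are later grouped by type.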

Page 20: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Semantic Interpretation Model

Page 21: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Example

Page 22: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Semantic Resolvability Classes

Easier

Harder

Page 23: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Example test collection analysis

Test collection X

Has 4 distinct semantic resolvability classes

50% are trivial mappings

23% are lexical mappings

27% are synonymic mappings

100% of the predicates are structure preserving

100% of the mapping cardinalities are 1:1
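A profile like the one above could be computed by counting, for each dimension, how often each value occurs among the annotated mappings of a test collection. The following is a minimal sketch under that assumption; the function names `dimension_profile` and `distinct_classes`, the dictionary layout and the toy data are all hypothetical and do not reproduce "Test collection X".

```python
from collections import Counter


def dimension_profile(mappings, dimension):
    """Return the percentage share of each value of one dimension (e.g. "AP")."""
    counts = Counter(m[dimension] for m in mappings)
    total = sum(counts.values())
    return {value: 100.0 * n / total for value, n in counts.items()}


def distinct_classes(mappings):
    """Count how many different combinations of dimension values occur."""
    return len({tuple(sorted(m.items())) for m in mappings})


# Hypothetical annotated mappings (toy data, not from the slides).
mappings = [
    {"AP": "trivial", "PS": "preserving", "MC": "1:1"},
    {"AP": "trivial", "PS": "preserving", "MC": "1:1"},
    {"AP": "lexical", "PS": "preserving", "MC": "1:1"},
    {"AP": "synonymic", "PS": "preserving", "MC": "1:1"},
]

print(dimension_profile(mappings, "AP"))
print(dimension_profile(mappings, "PS"))
print(distinct_classes(mappings))
```

Reporting the profile per dimension, rather than a single aggregate score, makes it visible when a collection only exercises a narrow part of the classification.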

Page 24: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Example system evaluation

System Y

Addresses 5 out of 10 semantic resolvability classes

(AP=conceptual, PS=*, MC=1:1, SE=*, M=*, CT=*)

- map = 0.51, recall = 0.7

...
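The tuple notation above, with `*` standing for "any value", suggests that per-class scores can be obtained by filtering the evaluated queries with a wildcard pattern and averaging a score over the matches. The sketch below illustrates that idea only; the helper names `matches` and `mean_score`, the per-query score and the toy data are assumptions made here and are not the numbers reported for System Y.

```python
from statistics import mean


def matches(pattern, mapping_type):
    """True if every non-wildcard entry of the pattern equals the mapping's value."""
    return all(v == "*" or mapping_type.get(k) == v for k, v in pattern.items())


def mean_score(pattern, scored):
    """Average the scores of all (mapping_type, score) pairs selected by the pattern."""
    selected = [score for mapping_type, score in scored if matches(pattern, mapping_type)]
    return mean(selected) if selected else None


# Hypothetical per-query results: (mapping type, some per-query score such as recall).
scored = [
    ({"AP": "conceptual", "PS": "preserving", "MC": "1:1"}, 0.5),
    ({"AP": "conceptual", "PS": "difference", "MC": "1:1"}, 1.0),
    ({"AP": "trivial", "PS": "preserving", "MC": "1:1"}, 1.0),
]

pattern = {"AP": "conceptual", "PS": "*", "MC": "1:1"}
print(mean_score(pattern, scored))
```

Returning `None` when no query matches keeps "class not covered by the collection" distinct from "class covered with a score of zero".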

Page 25: On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study

Summary

NLI/QA systems have semantic matching (schema-agnosticism) at their center

The proposed categorization can be used for a more principled interpretation of the results of NLI/QA systems

... and also of which dimensions evaluation campaigns actually measure

It supports deeper comparative analysis

Future work includes the categorization of the QALD test collection