Gsdi10

Lorenzino Vaccari, Pavel Shvaiko, Maurizio Marchese, DISI Department University of Trento, Italy email: vaccari/pavel/[email protected] An emergent semantics approach to integration of GI in SDI GSDI10 - Tenth International Conference for Spatial Data Infrastructure St. Augustine, Trinidad - February 25-29, 2008

description

Semantic, SDI, GIS, Web Services, SMatch

Transcript of Gsdi10

Page 1: Gsdi10

Lorenzino Vaccari, Pavel Shvaiko, Maurizio Marchese, DISI Department University of Trento, Italy email: vaccari/pavel/[email protected]

An emergent semantics approach to integration of GI in SDI

GSDI10 - Tenth International Conference for Spatial Data Infrastructure St. Augustine, Trinidad - February 25-29, 2008

Page 2: Gsdi10

Overview

• Context
• GI Interoperability Issues
• Emergent GI Semantics
• Structure Preserving Semantic Matching
  – Geo-Services Example
  – Preliminary Results
• Conclusion and Future Work

Page 3: Gsdi10

Context

• Sharing GI between different stakeholders
• International, national and local initiatives, e.g.:
  – INSPIRE
  – The Italian “Intesa Stato-Regioni”
• Spatial Data Infrastructure adoption
  – Geo-Data Integration
  – Geo-Services Discovery and Coordination

Page 4: Gsdi10

GI Interoperability Issues

• Geo-Data Integration
  – Physical format
  – Production process
  – Different resolution
  – Different schemas/ontologies
• Geo-services interoperability
  – Geo-services discovery and coordination
  – Technological solutions: SOA and OGC specifications
  – Semantic heterogeneity of geo-services

Page 5: Gsdi10

Emergent GI Semantics

• Our approach to improving GI interoperability is based on:
  – The Structure Preserving Semantic Matching (SPSM) algorithm
• Emergent GI semantics
  – We store the matching results and reuse them in subsequent interactions
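The idea of storing matching results for later interactions can be pictured as a small cache. The following Python sketch is only an illustration, not the OpenKnowledge implementation: `MatchCache`, `expensive_match`, and the returned score are hypothetical names and placeholder values.

```python
from typing import Callable, Dict, Tuple

MatchResult = Tuple[float, Dict[str, str]]  # (similarity, correspondences)


class MatchCache:
    """Remembers results of earlier matching runs (hypothetical helper)."""

    def __init__(self, matcher: Callable[[str, str], MatchResult]) -> None:
        self._matcher = matcher
        self._store: Dict[Tuple[str, str], MatchResult] = {}
        self.misses = 0  # number of times the matcher actually ran

    def match(self, sig1: str, sig2: str) -> MatchResult:
        key = (sig1, sig2)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._matcher(sig1, sig2)
        return self._store[key]


def expensive_match(sig1: str, sig2: str) -> MatchResult:
    # Placeholder standing in for a real matcher; the values are made up.
    return 0.5, {"Version": "Edition"}


cache = MatchCache(expensive_match)
first = cache.match("getMap(Version)", "getMap(Edition)")
second = cache.match("getMap(Version)", "getMap(Edition)")  # served from the cache
```

Keying the cache on the pair of signatures means a repeated interaction between the same two services skips the matcher entirely.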

Page 6: Gsdi10

Semantic heterogeneity of Geo-Services

getMap(Dimension(Width, Height), MapFile, Edition, Layers, DataFormat, Xmin, Ymin, Xmax, Ymax)

getMap(MapFile, Version, Layers, Width, Height, Format, XMin_BB, YMin_BB, XMax_BB, YMax_BB)

SPSM ?

Page 7: Gsdi10

Structure Preserving Semantic Matching

Information sources (e.g., schemas, ontologies, web service descriptions) can be viewed as graph-like structures containing terms and their inter-relationships.

Matching takes two graph-like structures and produces a set of correspondences between the semantically related nodes of those graphs.

Structure preserving semantic matching finds correspondences between semantically related nodes of the graphs, while preserving a set of structural properties (e.g., vertical ordering of nodes).

Page 8: Gsdi10

Example from Geo-Services

T1: getMap(MapFile, Version, Layers, Width, Height, Format, XMin_BB, YMin_BB, XMax_BB, YMax_BB)

T2: getMap(Dimension(Width, Height), MapFile, Edition, Layers, DataFormat, Request, Xmin, Ymin, Xmax, Ymax)

SPSM(T1, T2) = 0.38 + set of correspondences

[Figure: T1 and T2 drawn as trees rooted at getMap; in T2, Width and Height are children of Dimension.]
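To make the tree view of the two getMap signatures concrete, here is a minimal Python sketch that parses a signature string into a tree. The `Node` class and `parse` function are hypothetical helpers written for this illustration; they are not part of SPSM or S-Match.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)

    def size(self) -> int:
        """Number of nodes in the subtree rooted here."""
        return 1 + sum(child.size() for child in self.children)


def parse(text: str) -> Node:
    """Parse a term such as 'f(a, g(b))' into a tree of Node objects."""
    pos = 0

    def term() -> Node:
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        node = Node(text[start:pos].strip())
        if pos < len(text) and text[pos] == "(":
            pos += 1  # skip '('
            while True:
                node.children.append(term())
                if text[pos] == ",":
                    pos += 1  # another argument follows
                else:
                    break
            pos += 1  # skip ')'
        return node

    return term()


t1 = parse("getMap(MapFile, Version, Layers, Width, Height, Format, "
           "XMin_BB, YMin_BB, XMax_BB, YMax_BB)")
t2 = parse("getMap(Dimension(Width, Height), MapFile, Edition, Layers, "
           "DataFormat, Request, Xmin, Ymin, Xmax, Ymax)")
```

With these inputs, `t1` is a flat tree with ten arguments under `getMap`, while `t2` nests `Width` and `Height` under `Dimension`.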

Page 9: Gsdi10

Characteristics of matching

• Returns:
  – a global similarity between two trees in [0, 1]
  – a set of correspondences
• Approximation is on two levels:
  – Node (S-Match)
  – Structure
• Provides one-to-one node correspondences
  – Functions are matched to functions
  – Variables are matched to variables

Page 10: Gsdi10

Abstraction operations (AO)

Predicate (Pd):
― Two or more predicates are merged, typically to the least general generalization in the predicate type hierarchy
― Height(X) + Dimension(X) → Dimension(X)

Domain (D):
― Two or more terms are merged, typically by moving the functions or constants to the least general generalization in the domain type hierarchy
― Xmin_BB + Xmin → Xmin

Propositional (P):
― One or more arguments are dropped
― Layers(L1) → Layers

Composition example:
― Height(X) ⊑P Height ⊑D Dimension
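The composition example can be traced step by step in code. This is a toy Python sketch under stated assumptions: the `PARENT` table holds only the two generalization links that appear in the slide examples, and `generalize` and `drop_arguments` are hypothetical helper names.

```python
from typing import Dict, Tuple

Term = Tuple[str, Tuple[str, ...]]  # (symbol, arguments)

# Toy "is more specific than" links taken from the slide examples.
PARENT: Dict[str, str] = {"Height": "Dimension", "Xmin_BB": "Xmin"}


def generalize(symbol: str) -> str:
    """Predicate/domain abstraction: move a symbol one step up the hierarchy."""
    return PARENT.get(symbol, symbol)


def drop_arguments(term: Term) -> Term:
    """Propositional abstraction: forget the arguments of a term."""
    symbol, _args = term
    return (symbol, ())


height = ("Height", ("X",))
step1 = drop_arguments(height)              # Height(X) -> Height
step2 = (generalize(step1[0]), step1[1])    # Height -> Dimension
```

Each step corresponds to one abstraction operation, so the composed chain uses two operations in total.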

Page 11: Gsdi10

Approximate SPSM

Node matching: two nodes n1 and n2 in trees T1 and T2 approximately match if and only if c@n1 R c@n2 holds (based on S-Match), where:
― c@n1 and c@n2 are the concepts at nodes n1 and n2
― R ∈ {=, ⊑, ⊒, idk}
― Version = Edition, Xmin_BB ⊑ Xmin

Tree matching: two trees T1 and T2 approximately match if and only if there is at least one node n1 in T1 and one node n2 in T2 such that:
― n1 approximately matches n2
― All ancestors of n1 are approximately matched to the ancestors of n2
― Horizontal order of siblings is not preserved (in most cases)
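The node-level relation can be illustrated with a lookup table. The Python sketch below is a stand-in for S-Match, whose actual interface is not shown on the slides: the two tables contain only the slide examples, the ASCII strings "<" and ">" stand for ⊑ and ⊒, and treating "idk" as a non-match is an assumption of this sketch.

```python
# Hypothetical background knowledge, limited to the slide examples.
EQUIVALENT = {frozenset({"Version", "Edition"})}
LESS_GENERAL = {("Xmin_BB", "Xmin")}  # (more specific, more general)


def relation(a: str, b: str) -> str:
    """Return '=', '<' (less general), '>' (more general) or 'idk'."""
    if a == b or frozenset({a, b}) in EQUIVALENT:
        return "="
    if (a, b) in LESS_GENERAL:
        return "<"
    if (b, a) in LESS_GENERAL:
        return ">"
    return "idk"


def approximately_match(a: str, b: str) -> bool:
    # Assumption: "idk" means no usable relation was found.
    return relation(a, b) != "idk"
```

For example, `relation("Version", "Edition")` yields "=" and `relation("Xmin_BB", "Xmin")` yields "<", mirroring the two correspondences listed above.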

Page 12: Gsdi10

Tree edit distance

Key idea: use abstractions/refinements (standing for relations of a correspondence) as tree edit distance operations in order to estimate the similarity of two given trees.

Tree edit distance (TED): the minimum number of tree edit operations (node insertion, deletion, replacement) required to transform one tree into another. We want to:
― Minimize the editing cost, i.e., compute the minimal-cost composition of abstraction/refinement operations
― Allow only those tree edit operations that have abstraction-theoretic counterparts
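For readers unfamiliar with tree edit distance, the following Python sketch computes the classic unit-cost distance between ordered trees using the standard rightmost-root recursion with memoization. It is not the SPSM variant: it compares labels by plain string equality, preserves sibling order, and allows every edit operation. The tuple-based tree encoding is an assumption of this sketch.

```python
from functools import lru_cache
from typing import Tuple

Tree = Tuple[str, tuple]      # (label, children); children is a tuple of Trees
Forest = Tuple[Tree, ...]


def size(forest: Forest) -> int:
    """Total number of nodes in a forest."""
    return sum(1 + size(children) for _label, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f: Forest, g: Forest) -> int:
    if not f:
        return size(g)        # insert every node of g
    if not g:
        return size(f)        # delete every node of f
    (v_label, v_kids), (w_label, w_kids) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = forest_distance(f[:-1] + v_kids, g) + 1
    # Insert the rightmost root of g.
    insert = forest_distance(f, g[:-1] + w_kids) + 1
    # Map the two rightmost roots onto each other.
    replace = (forest_distance(v_kids, w_kids)
               + forest_distance(f[:-1], g[:-1])
               + (0 if v_label == w_label else 1))
    return min(delete, insert, replace)


def tree_edit_distance(t1: Tree, t2: Tree) -> int:
    return forest_distance((t1,), (t2,))


def leaf(label: str) -> Tree:
    return (label, ())


flat = ("getMap", (leaf("Width"), leaf("Height")))
nested = ("getMap", (("Dimension", (leaf("Width"), leaf("Height"))),))
```

Here a single insertion (the `Dimension` node) turns `flat` into `nested`, so their distance is 1.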

Page 13: Gsdi10

TED operations and costs

AO        | TED operation | Preconditions                               | Cost= | Cost⊑ | Cost⊒
n1 ⊒Pd n2 | replace(a, b) | a ⊒ b; a and b correspond to predicates     | 1     | ∞     | 1
n1 ⊒D n2  | replace(a, b) | a ⊒ b; a and b correspond to functions, …   | 1     | ∞     | 1
n1 ⊒P n2  | insert(a)     | a corresponds to predicate, function, …     | 1     | ∞     | 1
n1 ⊑Pd n2 | replace(a, b) | a ⊑ b; a and b correspond to predicates     | 1     | 1     | ∞
n1 ⊑D n2  | replace(a, b) | a ⊑ b; a and b correspond to functions, …   | 1     | 1     | ∞
n1 ⊑P n2  | delete(a)     | a corresponds to predicate, function, …     | 1     | 1     | ∞
n1 = n2   | replace(a, b) | a = b; corresponds to predicate, …          | 0     | 0     | 0
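The cost table can be encoded directly as a lookup. In the Python sketch below, the string keys (such as ">Pd" for ⊒Pd) are hypothetical labels chosen for this illustration. The `similarity` function uses an assumed normalization, one minus the cost divided by the size of the larger tree, which is not stated on the slides.

```python
import math

INF = math.inf

# Each abstraction operation maps to (Cost=, Cost⊑, Cost⊒), transcribed
# from the table. ">" stands for ⊒ and "<" for ⊑ in the keys.
COSTS = {
    ">Pd": (1, INF, 1),
    ">D": (1, INF, 1),
    ">P": (1, INF, 1),
    "<Pd": (1, 1, INF),
    "<D": (1, 1, INF),
    "<P": (1, 1, INF),
    "=": (0, 0, 0),
}
COLUMN = {"=": 0, "<": 1, ">": 2}


def total_cost(operations, column):
    """Sum the cost of a sequence of operations under one cost column."""
    return sum(COSTS[op][COLUMN[column]] for op in operations)


def similarity(cost, size1, size2):
    """Assumed normalization (not from the slides): 1 - cost / larger tree size."""
    if math.isinf(cost):
        return 0.0
    return max(0.0, 1.0 - cost / max(size1, size2))


ops = ["=", "<P", "<D"]  # an arbitrary example sequence
```

The infinite entries make any sequence containing a disallowed operation unusable for that column, which is how the table rules out edits without an abstraction counterpart.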

Page 14: Gsdi10

Example from Geo-Services (cont’d)

[Figure: the trees T1 and T2 from the previous example with numbered nodes; T1 has nodes 1 to 11 and T2 has nodes 1 to 13.]

Page 15: Gsdi10

Preliminary evaluation

Synthesized datasets (hundreds of trees) from various versions of the SUMO and AKT ontologies and the Brown Corpus lexicon. E.g.:

journal(periodical_publication) vs. magazine(periodical-publication)
flowers_Michigan(northern, Whisky) vs. flowers_Michigan(Whisky, bourbon)

Measures: Precision, Recall, F-measure, Time

Results on a standard laptop (Core Duo CPU, 2 GHz, 2 GB RAM, Windows Vista):
― average F-measure = 0.78
― average execution time = 93 ms
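The quality measures follow their standard definitions over sets of correspondences. The Python sketch below shows the computation; the `found` and `expected` sets are invented for illustration and are not the evaluation data.

```python
from typing import Set, Tuple

Correspondence = Tuple[str, str]


def precision_recall_f(found: Set[Correspondence],
                       expected: Set[Correspondence]):
    """Return (precision, recall, F-measure) for a set of correspondences."""
    true_positives = len(found & expected)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Made-up example: two of three found correspondences are correct,
# and two of four expected correspondences were found.
found = {("Version", "Edition"), ("Width", "Width"), ("Format", "Request")}
expected = {("Version", "Edition"), ("Width", "Width"),
            ("Format", "DataFormat"), ("Layers", "Layers")}
p, r, f = precision_recall_f(found, expected)
```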

Page 16: Gsdi10

Conclusions and future work

SPSM for Geo-services
― Geo-Services and Geo-data heterogeneity
― Geo-Service use case (GetMap)
― Node matching of S-Match
― Structure preserving matching based on theory of abstraction and tree edit distance
― Preliminary evaluation with encouraging results

Future work
― Conducting an extensive evaluation
― Extending the matching approach for dealing with fully fledged SDI/GIS geo-data ontologies

Page 17: Gsdi10

Acknowledgements

We are grateful to Fausto Giunchiglia, Mikalai Yatskevitch, Juan Pane and Fiona McNeill for many fruitful discussions on structure preserving semantic matching.

This work has been supported by the FP6 OpenKnowledge project (http://www.openk.org).

Page 18: Gsdi10

Thank you for your attention and...

Questions ?

Lorenzino Vaccari: [email protected]

[1] University of Trento – DISI department: www.disi.unitn.it
[2] OpenKnowledge project (SPSM, WP3): www.openk.org
[3] Knowdive group (S-match): http://dit.unitn.it/~knowdive/

“MANAGING KNOWLEDGE DIVERSITY BUILDING A BRIDGE FOR INTEGRATING THE DIVERSE KNOWLEDGE IN DIFFERENT RESEARCH FIELDS, SPANNING ACROSS PEOPLE OF DIFFERENT NATIONS AND CULTURES” Knowdive Group

Page 19: Gsdi10

Geo-Services Example (LCC)

a(map_requestor,R)::
  requestMap(MapFile, Version, Layers, Width, Height, Format, SRS, XMin_BB, YMin_BB, XMax_BB, YMax_BB) => a(ga_sp,P) <-
    selectLayers(AvailableLayers,Layers) and
    needMap(Width, Height) and
    selectBoundingBox(XMin_ME, YMin_ME, XMax_ME, YMax_ME, XMin_BB, YMin_BB, XMax_BB, YMax_BB) and
    selectFormat(AvailableFormats, Format)
  then
  returnMap(Map) <= a(ga_sp,P)
  then
  null <- showMap(Map)

a(map_provider,P)::
  requestMap(MapFile, Version, Layers, Width, Height, Format, SRS, XMin_BB, YMin_BB, XMax_BB, YMax_BB) <= a(ga_sr,R)
  then
  returnMap(Map) => a(ga_sr,R) <-
    getMap(MapFile, Version, Layers, Width, Height, Format, SRS, XMin_BB, YMin_BB, XMax_BB, YMax_BB, Map)

Page 20: Gsdi10
Page 21: Gsdi10

Assign the same unit cost to all operations that have their abstraction theoretic counterparts

TED operations not allowed by definition of abstractions/refinements are assigned an infinite cost

TED operations and costs


Page 23: Gsdi10

P2P Infrastructure

•  Workflows are formalized by Interaction Models (IMs) between peers

•  We use the Lightweight Coordination Calculus (LCC), an executable specification language (like BPEL)

–  Uses roles for peers and constraints on message sending to enforce social norms and behaviours

Page 24: Gsdi10

Related works

• Geo-data integration
  – Alignment efforts [Chen et al.]
  – Semantic heterogeneity [Lutz et al.], [GEON project]
  – Ontology matcher [G-Match]
• Geo-services interoperability
  – OGC cataloguing services
  – Geospatial Semantic Interoperability Experiment
  – Chaining Geo-services [Lemmens et al.]

Page 25: Gsdi10

P2P infrastructure

•  Peer-to-peer network for distributed application sharing and execution.

–  Peers can search/download applications from other peers.

–  Developers can publish applications and their interaction specifications.

•  Provides a framework to execute and coordinate the programs in each peer.

•  Anyone with a computer and internet access may join the system.

•  No central organization.

•  Lightweight Coordination Calculus (LCC).

Page 26: Gsdi10

Example of IM

a(inquirer, I)::
  ask(W) => a(oracle,O) ← toknow(W)
  then
  definition(W,D) <= a(oracle,O)
  then
  null ← show(W,D)

a(oracle, O)::
  ask(W) <= a(inquirer,I)
  then
  definition(W,D) => a(inquirer,I) ← define(W,D)

[Figure labels: Roles, Constraints]

Page 27: Gsdi10

Interaction Run

Inquirer Oracle

[Figure: the inquirer and oracle clauses from “Example of IM”, shown side by side. Step labels shown: toknow(W)]

Page 28: Gsdi10

Interaction Run

Inquirer Oracle

[Figure: the inquirer and oracle clauses from “Example of IM”, shown side by side. Step labels shown: toknow(W), ask(W)]

Page 29: Gsdi10

Interaction Run

Inquirer Oracle

[Figure: the inquirer and oracle clauses from “Example of IM”, shown side by side. Step labels shown: toknow(W), ask(W), define(W)]

Page 30: Gsdi10

Interaction Run

Inquirer Oracle

[Figure: the inquirer and oracle clauses from “Example of IM”, shown side by side. Step labels shown: toknow(W), ask(W), define(W), def(W,D)]

Page 31: Gsdi10

Interaction Run

Inquirer Oracle

[Figure: the inquirer and oracle clauses from “Example of IM”, shown side by side. Step labels shown: toknow(W), ask(W), define(W), def(W,D), show(W,D)]
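The interaction run can be replayed in ordinary code. The Python sketch below is a hand-written simulation of the inquirer/oracle exchange, not an LCC interpreter: `run_interaction`, the two message queues, and the `dictionary` argument are all hypothetical, and the constraints `toknow`, `define` and `show` are reduced to trace entries.

```python
from collections import deque
from typing import Dict, List


def run_interaction(word: str, dictionary: Dict[str, str]) -> List[str]:
    """Replay the inquirer/oracle exchange and return the trace of steps."""
    trace: List[str] = []
    to_oracle: deque = deque()
    to_inquirer: deque = deque()

    # inquirer: ask(W) => a(oracle,O) <- toknow(W)
    trace.append(f"toknow({word})")          # constraint satisfied first
    to_oracle.append(("ask", word))
    trace.append(f"ask({word})")

    # oracle: ask(W) <= a(inquirer,I)
    _name, w = to_oracle.popleft()

    # oracle: definition(W,D) => a(inquirer,I) <- define(W,D)
    d = dictionary[w]                        # stands in for define(W,D)
    trace.append(f"define({w})")
    to_inquirer.append(("definition", w, d))
    trace.append(f"definition({w},{d})")

    # inquirer: definition(W,D) <= a(oracle,O), then null <- show(W,D)
    _name, w, d = to_inquirer.popleft()
    trace.append(f"show({w},{d})")
    return trace


steps = run_interaction("SDI", {"SDI": "Spatial Data Infrastructure"})
```

The resulting trace lists the same five steps that accumulate across the interaction-run frames: the constraint check, the outgoing question, the oracle's lookup, the reply, and the final display.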