Quality issues in Spatial Databases. M. Mostafavi, G. Edwards, R. Jeansoulin. CRG & GEOIDE & REVIGIS. Victoria, May 2003.


Page 1: Contents

Quality issues in Spatial Databases

M. Mostafavi, G. Edwards, R. Jeansoulin

CRG & GEOIDE & REVIGIS

Victoria, May 2003

Page 2: Contents

Contents

Introduction

Problems

Objective

Methodology

Results

Discussion

Conclusions and perspectives

Page 3: Contents

Introduction

Data fusion and Data Quality

Multi-source spatial data

Vector data: BNDT, BDTQ, …

Raster data: satellite images, aerial images, …

Need for better quality

Logical consistency

Completeness

Semantic accuracy

Temporal accuracy

Positional accuracy

and more …

Decision making (Effective crisis management (MSPQ))

Page 4: Contents

A real case problem

BNDT: good geometry

Statistics Canada database, Canada election database: rich descriptive information but weak geometry

How to reconcile these two data sets?

BNDT

SC, EC

Page 5: Contents

Context

Fusion

SDB1

SDB3

SDB2 Information of greater quality

User vision (fitness for use)

Producer vision (product ontology)

Page 6: Contents

Logical consistency

Logical consistency is an important element of data quality. It defines the degree of consistency of the data with respect to its specifications.

Integrity constraints

Explicit rules stated in the data specifications (e.g. connectivity between two objects)

Implicit rules (e.g. a river always flows downstream)

Ontology vs. specifications

Ontology

specifications
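The implicit rule quoted above (a river always flows downstream) can be turned into a mechanical check. The following is only a minimal sketch: the function name, the `tolerance` parameter and the sample elevations are illustrative assumptions, not part of the NTDB specifications.

```python
def flows_downstream(elevations, tolerance=0.0):
    """Return True if elevation never rises along the digitizing direction.

    `elevations` lists vertex heights from source to mouth (hypothetical input).
    `tolerance` absorbs small measurement noise; it is an assumed parameter.
    """
    return all(
        later <= earlier + tolerance
        for earlier, later in zip(elevations, elevations[1:])
    )


# Illustrative values only, not taken from any dataset.
print(flows_downstream([120.0, 118.5, 118.5, 110.0]))
print(flows_downstream([120.0, 121.0, 119.0]))
```

An explicit rule from the specifications would be checked the same way, except that the condition is read from the data dictionary instead of being supplied by the analyst.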

Page 7: Contents

Project definition

[Diagram: project workflow, Step 1 to Step 5. Labels: NTDB ontology, BDTQ ontology, NTDB data, BDTQ data, ontology consistency, data consistency, data consistency vs. BNDT, mapping the ontologies, ontology fusion, integrated ontology, data fusion, new dataset.]

Does this help?

Lack of explicit rules

Yes? ... No

Page 8: Contents

Consistency in NTDB

[Diagram: Step 1 and Step 2. Labels: NTDB ontologies, Delphi interface, Prolog, dataset, studying the logical consistency of the dataset.]

Page 9: Contents

Formalizing the ontology

Knowledge base

Facts

Rules

Queries

BNDT Ontology
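The three parts of a knowledge base named on this slide can be illustrated with a small sketch. The project itself used Prolog; the Python below only mirrors the idea. The fact tuples, `has_inverse` and `query` are hypothetical names chosen for illustration, and only the entity names come from the slides.

```python
# Facts: ground statements, stored as (relation, subject, object) tuples.
# The tuples are illustrative; they are not copied from the NTDB data dictionary.
FACTS = {
    ("connected", "railway", "road"),
    ("connected", "road", "building"),
    ("connected", "building", "road"),
}


def has_inverse(facts, a, b):
    """Rule: a connection is well formed if both directions are stated."""
    return ("connected", a, b) in facts and ("connected", b, a) in facts


def query(facts, relation, a=None, b=None):
    """Query: list all facts matching the pattern; None acts as a wildcard."""
    return sorted(
        f for f in facts
        if f[0] == relation
        and (a is None or f[1] == a)
        and (b is None or f[2] == b)
    )


print(query(FACTS, "connected", a="road"))
print(has_inverse(FACTS, "road", "building"))
print(has_inverse(FACTS, "railway", "road"))
```

In Prolog the rule would be a clause and the query a goal; the separation between stored facts, derived rules and ad hoc queries is the same.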

Page 10: Contents

Spatial relations in NTDB

Spatial relations in NTDB are:

1. Connection relations

2. Sharing relations

3. Adjacency relations

4. Superposition relations

[Figure: example configurations of objects A to E, in panels numbered 1 to 4]

Page 11: Contents

Logical approach - facts

For NTDB the facts consist of

Taxonomy of NTDB: Themes, Entities, Allowed combinations, Code (NTDB identity code), Geometric representations

Spatial relations: Connection, Sharing, Superposition / adjacency, Minimal values (e.g. distance constraints between objects)

Page 12: Contents

Logical approach- facts

Types of facts           Total groups   Facts
Taxonomy                 368            386
Connection               574            330 796
Sharing                  523            15 853
Adjacency/superposition  138            1637
Total                                   348 672

There are about 350,000 facts describing the NTDB.

Remark: regrouping of objects for programming purposes has created some inconsistencies.
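As a quick check, the per-type fact counts on this slide do add up to the stated total. The sketch below uses only the numbers from the table; the variable names are arbitrary.

```python
# Fact counts per type, as given on the slide.
facts_per_type = {
    "Taxonomy": 386,
    "Connection": 330_796,
    "Sharing": 15_853,
    "Adjacency/superposition": 1_637,
}

total = sum(facts_per_type.values())
print(total)  # 348672, the total stated on the slide

# Share of each fact type, in percent of all facts.
share = {name: 100 * count / total for name, count in facts_per_type.items()}
largest = max(share, key=share.get)
print(largest)
```

Connection facts dominate the knowledge base, which is why the connection rules receive the most attention in the consistency analysis.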

Page 13: Contents

Logical approach- rules

Several rules are defined to analyze the ontological consistency of the NTDB.

Relation: Inconsistency rules

Connection:

a. Object A is connected to object B and the inverse relation is not defined.

b. Connection is illegal (C1 = 0-0) and for the same objects we have C1 ≠ 0-0.

Sharing:

a. Object A shares with object B but the inverse relation is not defined.

b. The same objects share with different values of C2.

Superposition / Adjacency:

a. Two objects are superposed and adjacent at the same time.
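The inconsistency rules above can be sketched as three small checks. This is an illustrative Python rendering, not the authors' Prolog program: the function names and the tuple layout `(code_a, code_b, cardinality)` are assumptions, and the sample tuples borrow the entity codes reported in the results slides.

```python
from collections import defaultdict


def missing_inverse(pairs):
    """Rule a: A relates to B but the relation from B to A is not defined."""
    stated = {(a, b) for a, b, _ in pairs}
    return sorted((a, b) for a, b in stated if (b, a) not in stated)


def conflicting_values(pairs):
    """Rule b: the same ordered pair carries more than one cardinality value.

    This covers sharing rule b (different C2 values) and, as a special case,
    connection rule b (C1 = 0-0 next to another C1 value).
    """
    values = defaultdict(set)
    for a, b, cardinality in pairs:
        values[(a, b)].add(cardinality)
    return {pair: sorted(v) for pair, v in values.items() if len(v) > 1}


def superposed_and_adjacent(superposed, adjacent):
    """Superposition/adjacency rule: a pair must not appear in both relations."""
    return sorted(set(superposed) & set(adjacent))


# Connection entries as (code A, code B, C1); "-" stands for "not verified".
connections = [(3002, 3660, "1-2"), (788, 147, "0-0"), (788, 147, "-")]
print(missing_inverse(connections))
print(conflicting_values(connections))
```

Each function returns the offending pairs rather than a boolean, so the output can be reviewed against the data dictionary entry by entry.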

Page 14: Contents

Results (1/2)

Inconsistency (inverse connection)

Data dictionary (generic relation):

between themes: Railway (L) connected to Road (L)

between themes: Road (L) connected to Railway (L)

Table of connection and cardinalities

Code  Entity   Combination                                    C1            Code
3002  Railway  Standard, Ground level, Operational, Multiple  1-2           3660
3660  Road     Secondary, Ground level, Hard surface          Not verified

?

Page 15: Contents

Results (2/2)

Inconsistency (different values for cardinality one, C1)

Data dictionary (generic relation):

Gas and oil facilities (P) is connected to Building (P)

Table of connection and cardinalities

Code  Entity                  Combination      C1                Code
788   Gas and oil facilities  Generic/unknown  0-0               147
788   Gas and oil facilities  Generic/unknown  - (Not verified)  147

?

Page 16: Contents

Consistency in Data

[Diagram: Step 1 and Step 2. Labels: NTDB ontologies, Delphi interface, Prolog, VB interface, dataset, studying the logical consistency of the dataset.]

Page 17: Contents

GeoMedia Professional

Spatial operations

• Meet

• Entirely Contained

• Entirely Contained by

• Contains

• Contained by

• Spatially equal

• Touch

[Figure labels: Meet, Overlap]

Page 18: Contents

Mapping

Polygon – Polygon Relations

Relations: Disjoint, Meet, Equal, Inside, Contains, Covered by, Covers, Overlap

Connection

Sharing x x x x

Superposition x x x x x x

Adjacent x

Page 19: Contents

Mapping problems

Several problems

Confusions in spatial relations

Unique mapping is not possible

Cardinalities cannot be considered
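The statement that a unique mapping is not possible can be made concrete with a small lookup. The sketch below is hypothetical: only the number of marked cells per row of the polygon-polygon table (none for Connection, four for Sharing, six for Superposition, one for Adjacent) comes from the slide, while the choice of which topological relations are marked is an assumption made purely for illustration.

```python
# Assumed mapping from NTDB relations to polygon-polygon topological relations.
# The row sizes follow the slide; the individual cells are NOT from the source.
NTDB_TO_TOPOLOGICAL = {
    "connection": set(),
    "sharing": {"meet", "equal", "covered by", "covers"},
    "superposition": {"equal", "inside", "contains", "covered by", "covers", "overlap"},
    "adjacent": {"meet"},
}


def candidates(topological_relation):
    """Return every NTDB relation compatible with an observed topological relation."""
    return sorted(
        name
        for name, topological in NTDB_TO_TOPOLOGICAL.items()
        if topological_relation in topological
    )


for relation in ("meet", "equal", "overlap", "disjoint"):
    print(relation, "->", candidates(relation))
```

Whenever `candidates` returns more than one name, the topological relation measured in the GIS cannot be translated into a single NTDB relation, which is the ambiguity listed on this slide. Cardinalities do not appear in the lookup at all, matching the last point.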

Page 20: Contents

Step 2: BNDT Data vs its ontology

Page 21: Contents

Data vs ontology

File 21E05

Region: Sherbrooke

68 entities

23,283 objects

Analyzed binary relations:

Contours vs. water bodies

Buildings vs. roads

Water bodies vs. buildings

Liquid depot vs. liquid depot

Roads vs. water bodies

…

Page 22: Contents

Results

Liquid depot vs. Liquid depot

Spatial representations (Point, Area)

Spatial relations

Ontology / specification: superposition is illegal

Data: a superposition case is found

Page 23: Contents

Results

• Problem: Road crosses a water body

• Illegal relation with respect to the semantics of the objects

• Incomplete ontology

Page 24: Contents

Results

• Problem: Cut line crosses a water body

• Illegal relation with respect to the semantic definition of the objects

• Incomplete ontology

Page 25: Contents

Results

• Problem: Contour crosses water body

• Illegal relation with respect to the ontology

• Inconsistent data

Page 26: Contents

Results

• Problem: Road crosses water body

• Illegal relation with respect to the ontology

• Inconsistent data

Page 27: Contents

Results

• Problem: Road crosses Building

• Illegal relation with respect to the semantics of objects

• Incomplete ontology

Page 28: Contents

Results

• Problem: Water body (L) superposed on Vegetation (A)

• Illegal relation with respect to the ontology

• Inconsistent data

• Control system problem

Page 29: Contents

• Problem: Buildings (S) superposed to water body (A)

• Illegal relation with respect to the semantics of objects

• Inconsistent data

Results

Page 30: Contents

Results

• Problem: Building (A) overlaps Vegetation (A)

• Illegal relation with respect to the semantics of objects

• Inconsistent data
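One way to read the result slides is as a three-way diagnosis: a relation forbidden by the ontology points to inconsistent data, while a relation that is semantically wrong but not covered by any rule points to an incomplete ontology. The sketch below encodes that reading. It is an interpretation, not the authors' stated procedure; note that some slides also report "Inconsistent data" for a purely semantic violation. `ONTOLOGY_RULES`, `diagnose` and the `semantically_valid` flag are hypothetical names.

```python
# Hypothetical rule base: (class A, class B, relation) -> verdict.
# The two entries reflect cases the slides describe as illegal in the ontology.
ONTOLOGY_RULES = {
    ("contour", "water body", "crosses"): "illegal",
    ("liquid depot", "liquid depot", "superposition"): "illegal",
}


def diagnose(class_a, class_b, relation, semantically_valid):
    """Classify one relation observed in the data.

    `semantically_valid` is an expert judgement supplied by the caller;
    it cannot be derived from the rule base itself.
    """
    verdict = ONTOLOGY_RULES.get((class_a, class_b, relation))
    if verdict == "illegal":
        return "inconsistent data"
    if verdict is None and not semantically_valid:
        return "incomplete ontology"
    return "consistent"


print(diagnose("contour", "water body", "crosses", semantically_valid=False))
print(diagnose("cut line", "water body", "crosses", semantically_valid=False))
```

Cases diagnosed as "incomplete ontology" are candidates for new rules, while "inconsistent data" cases call for correcting the dataset or tightening the control system.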

Page 31: Contents

Suggestions, solutions

Adding new rules

Building (a) and vegetation (a) (illegal superposition)

Road (l) and building (conditional superposition)

A better control system is needed

Find exceptions

Page 32: Contents

Current situation

Product ontology is analyzed

Mapping of topological relations to binary relations

Ontology translation into Prolog (Delphi program)

Consistency study of spatial relations

Connection (table C)

Sharing (table D)

Superposition and adjacency (table E)

Consistency between different relations (fusion of facts)

Connection and sharing, connection and superposition / adjacency, sharing and superposition / adjacency

Consistency of data vs. specifications is studied

Page 33: Contents

Future work

Logical consistency of other available datasets

Mapping of ontologies

Fusion of ontologies

Fusion of data

Consistency of the newly created data set