gis|data 1/17
© stankute, asche·ifg·uni·potsdam 2011
Improvement of spatial data quality through data conflation
Silvija Stankute, Hartmut Asche | Geoinformation Research Group | Dept of Geography | University of Potsdam | Germany
ICCSA 2011 | GEOG-AN-MOD 2011 | University of Santander | 20-23/06/2011
Summary
1. Motivation: Spatial data quality matters
2. Spatial data quality: Definition, indicators
3. Data conflation: Optimising spatial data quality
4. Data conflation at work: Inserting a roundabout
5. Conclusion: What's the merit of data conflation?
1 Motivation: Spatial data quality matters
Introduction of digital mapping techniques and GIS in the 1960s made the quality of digital spatial data an issue in geoinformation (GI) processing
Error and uncertainty in spatial data identified as potential problems in GI processing; such problems were uncommon in the production and use of paper maps
Ongoing development since the 1980s to design and implement data transfer standards that include data quality information, hitherto available only on the margins of paper maps
Objective of this work: to present data conflation as one option in GI processing for improving spatial data quality
1 Motivation: Spatial data quality matters
Potsdam in different spatial datasets
[Figure: Potsdam in OpenStreetMap, analog topo map 1:10K, Brandenburg Viewer]
2 Spatial data quality: Definition, indicators
Geodata quality
ISO 8402: totality of characteristics of a product that bear on its ability to satisfy stated or implied needs > fitness-for-use
Definition of spatial data quality necessitates information on (a) geodata used, (b) user requirements
Fitness-for-use: data meet requirements of target application
Geodata quality indicators
  Completeness
  Logical consistency
  Positional accuracy
  Temporal accuracy: accuracy of reporting time of data
  Semantic/thematic/attribute accuracy
Information on geodata quality included in metadata
2 Spatial data quality: Data acquisition
Different methods for spatial data acquisition developed by spatial data producers result in different data types, data formats and semantic information of geodata
Consequence: multiplicity of spatial data
Problem: multiple use of specific datasets
Option: data integration or data conflation applied to existing datasets instead of continuous acquisition of new spatial data with the above faults
3 Data conflation: Optimising spatial data quality
Objective: automated merge of heterogeneous geodata according to application requirements to produce a best-fit dataset for any specific application
[Figure: source dataset (SDS), target dataset (TDS), output dataset; labels: missing data, inserted data]
3 Data conflation: Optimising spatial data quality
One spatial object, different data models
Real-world spatial data transformed into computer-readable digital data model representing spatial features as (a) points, (b) lines or (c) areas (polygons)
Modelling of real-world spatial data can result in different data models of an identical real-world object, e.g. a traffic roundabout
3 Data conflation: Optimising spatial data quality
One spatial object, multiple geometries
[Figure: one spatial object as represented in OpenStreetMap, TeleAtlas, ATKIS]
4 Data conflation at work: Conceptual framework
Substituting roundabout for road crossing
Inserting a roundabout in a dataset where the roundabout is modelled as a road crossing, i.e. not defined as a roundabout
Detecting the “missing” roundabout by identifying the position of crossings in the input datasets: roundabout identified if a minimum of 3 edges of the road network share an identical start or end point
When 3 edges are identified that share the same node (start or end point of an edge), this intersection is part of a roundabout
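The detection rule on this slide (a node shared by at least 3 edges marks a crossing that may belong to a roundabout) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `find_crossing_nodes`, the edge dictionary layout and the sample network are all assumptions made for the example.

```python
from collections import defaultdict


def find_crossing_nodes(edges, min_edges=3):
    """Return nodes that are the start or end point of at least `min_edges` edges.

    `edges` maps an edge id to a (start_node, end_node) pair; this layout is
    an assumption of the sketch, not a format taken from the slides.
    """
    incident = defaultdict(list)
    for edge_id, (start, end) in edges.items():
        incident[start].append(edge_id)
        if end != start:  # do not count a closed loop twice at the same node
            incident[end].append(edge_id)
    return {node: ids for node, ids in incident.items() if len(ids) >= min_edges}


# Hypothetical road network: edge id -> (start node, end node)
edges = {
    "e1": ((0, 0), (10, 0)),
    "e2": ((10, 0), (20, 0)),
    "e3": ((10, 0), (10, 10)),
    "e4": ((10, 0), (10, -10)),
    "e5": ((20, 0), (30, 0)),
}

crossings = find_crossing_nodes(edges)
print(crossings)
```

In the sample network only node (10, 0) is shared by three or more edges, so it is the only crossing candidate; node (20, 0) joins just two edges and is ignored.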
4 Data conflation at work: Automated workflow
Producing best-fit dataset
[Figure: workflow diagram. Data sources (dataset 1, dataset 2) > pre-processing (per dataset) > object assignment > new dataset]
4 Data conflation at work: Semantic accuracy
Inserting roundabout in target dataset
(a) Edge tracing for identification of roundabout in input dataset 1, (b) search for roundabout accesses/exits in input dataset 2
Merge accesses/exits with corresponding points on crossroads
[Figure: Inserting roundabout]
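One possible way to pair roundabout accesses/exits from one dataset with the road edges that meet at the crossing in the other dataset is a nearest-edge search. The slides do not state which matching criterion is used, so the point-to-segment distance, the `tolerance` parameter, the function names and all coordinates below are assumptions of this sketch.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def match_exits_to_edges(exits, crossing_edges, tolerance):
    """Assign each access/exit point to its nearest edge within `tolerance`."""
    matches = {}
    for exit_id, point in exits.items():
        best_id, best_dist = None, float("inf")
        for edge_id, (a, b) in crossing_edges.items():
            dist = point_segment_distance(point, a, b)
            if dist < best_dist:
                best_id, best_dist = edge_id, dist
        if best_dist <= tolerance:
            matches[exit_id] = best_id
    return matches


# Hypothetical access/exit points of a roundabout traced in one dataset
exits = {"north": (0.5, 15.0), "east": (15.0, -0.4), "south": (0.0, -15.0)}
# Hypothetical edges meeting at the crossing node (0, 0) in the other dataset
crossing_edges = {
    "r1": ((0.0, 0.0), (0.0, 100.0)),
    "r2": ((0.0, 0.0), (100.0, 0.0)),
    "r3": ((0.0, 0.0), (0.0, -100.0)),
}

matches = match_exits_to_edges(exits, crossing_edges, tolerance=2.0)
print(matches)
```

Points farther than `tolerance` from every edge stay unmatched, which keeps the sketch from forcing a pairing where the two datasets disagree too strongly.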
4 Data conflation at work: Geometric completeness
Assigning geometric information
All accesses or exits of the roundabout found in first input dataset
Corresponding edges in second input dataset also detected
Geometrical information about new objects can be assigned to target dataset
[Figure: Inserting roundabout]
4 Data conflation at work: Data quality optimised
After completion of the merge process of 2 or more datasets (points, lines, polygons), completeness of the input data is always increased
Prerequisite: one of the input datasets must have more information than the other(s)
Not all new geometry objects of the target dataset include information on thematic attributes, hence the target dataset can never be complete in terms of thematic information
Consequence: datasets generated by conflation can only be complete in terms of geometrical information
4 Data conflation at work: Data quality optimised
Real-world spatial data: 8 buildings
Source dataset includes information on 6 buildings (geometry, use)
Target dataset includes information on 5 buildings (geometry, floors)
End dataset complete with geometric information
Geometric objects of both input datasets have 100% thematic completeness
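The building example can be worked through numerically. The sketch below assumes that the two input datasets together cover all 8 buildings (as the complete end dataset implies), so that 6 + 5 - 8 = 3 buildings occur in both. Which buildings overlap, the ids `b1` to `b8`, the attribute values and the function names are hypothetical; the presence of an object stands in for its geometry.

```python
def conflate(source, target):
    """Merge two datasets keyed by object id, combining their attributes."""
    merged = {}
    for dataset in (source, target):
        for obj_id, attrs in dataset.items():
            merged.setdefault(obj_id, {}).update(attrs)
    return merged


def completeness(dataset, reference_ids, required=()):
    """Share of reference objects present in `dataset` with all `required` attributes."""
    hits = sum(
        1
        for obj_id in reference_ids
        if obj_id in dataset and all(key in dataset[obj_id] for key in required)
    )
    return hits / len(reference_ids)


# Hypothetical ids for the 8 real-world buildings
real_world = [f"b{i}" for i in range(1, 9)]
# Source dataset: 6 buildings with the attribute "use" (value is a placeholder)
source = {f"b{i}": {"use": "residential"} for i in range(1, 7)}
# Target dataset: 5 buildings with the attribute "floors" (value is a placeholder)
target = {f"b{i}": {"floors": 2} for i in range(4, 9)}

end = conflate(source, target)
geometric = completeness(end, real_world)
thematic = completeness(end, real_world, required=("use", "floors"))
print(geometric, thematic)
```

Under these assumptions the end dataset reaches a geometric completeness of 1.0, while only the 3 buildings present in both inputs carry both attributes, giving a thematic completeness of 0.375.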
5 Conclusion: What's the merit of data conflation?
Conflation methods allow the improvement of positional and temporal accuracy of spatial data
Positional accuracy of a dataset can be increased with the information provided by another input dataset
If both datasets show major variance from the corresponding real-world objects, the arithmetic average of all input datasets can increase this quality element
Temporal accuracy can be improved if metadata provide information about the currency (up-to-dateness) of spatial data
Data conflation facilitates multiple use of quality spatial data, which can be generated automatically according to application requirements from existing suboptimal datasets
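The arithmetic-average idea for positional accuracy can be illustrated with a short sketch. It assumes that corresponding points have already been matched across the datasets and share an object id; the function name `average_positions` and the coordinates are hypothetical. Averaging helps only when the deviations of the inputs are not all biased in the same direction.

```python
def average_positions(*datasets):
    """Average the (x, y) position of every object present in all datasets."""
    common_ids = set(datasets[0]).intersection(*datasets[1:])
    count = len(datasets)
    return {
        obj_id: (
            sum(ds[obj_id][0] for ds in datasets) / count,
            sum(ds[obj_id][1] for ds in datasets) / count,
        )
        for obj_id in common_ids
    }


# Hypothetical coordinates of the same two points in two input datasets
dataset_a = {"p1": (100.0, 200.0), "p2": (50.0, 60.0)}
dataset_b = {"p1": (102.0, 198.0), "p2": (54.0, 62.0), "p3": (7.0, 7.0)}

averaged = average_positions(dataset_a, dataset_b)
print(averaged)
```

Objects that occur in only one dataset, such as `p3` here, are left out because there is nothing to average them with.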
Thank you for your attention
Questions? Comments? Feedback?
Contact: Hartmut Asche | [email protected] | Dept of Geography | University of Potsdam | GER
Web: www.geographie.uni-potsdam.de/geoinformatik