data|fusion
1/18
© stankute|asche·ifg·uni·potsdam 2012
A data fusion system for spatial data mining, analysis and improvement
Silvija Stankute, Hartmut Asche | Geoinformation Research Group | Dept of Geography | University of Potsdam | Germany
ICCSA 2012 | GEOG-AN-MOD 2012 | Salvador da Bahia, Brazil | 18-21/06/2012
Summary Data fusion system for spatial data mining
1. Motivation
2. Concept: Automated data fusion
3. System architecture: Generic components
4. Fusion pipeline: Operations and workflow
5. System operation: User interface
6. Conclusion
1 Motivation Improvement of geodata quality
Acquisition of geodata by a range of actors, including state institutions (national mapping agencies, NMAs) and private enterprises, results in heterogeneous, frequently redundant geospatial databases
Geometric and semantic quality of geospatial data is heterogeneous and frequently insufficient or inaccurate: existing datasets covering the identical real-world section are of unreliable quality
Effective geodata management and use necessitate harmonisation of heterogeneous geodata according to application-specific data quality specifications
To avoid fresh data acquisition, an automated process is required that fuses imperfect geometric and/or semantic information of 2 or more datasets to produce optimal application-specific data
2 Concept Automated fusion of imperfect geodata
[Figure: schematic of source datasets 1, 2 and 3 combined into fused datasets 1+2 and 1+2+3]
Development and implementation of an automated fusion process (DataFusion) that produces a single geospatial dataset from existing datasets, superior in geometric and/or semantic quality to the imperfect source data
Objective: to extract, filter and combine relevant features from diverse source data into a single best-fit quality dataset according to user and application specifications
Data harmonisation and fusion process allows for selection, elimination and/or substitution of unwanted source attribute features by user-specified geometric and/or semantic attributes
DataFusion (DAFU) provides a user-defined data filter to generate optimal geodata in an automated filtering process
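The selection, elimination and substitution of attributes described above can be pictured with a small Python sketch. It is an illustration only, not DAFU code (the slides state that DataFusion is written in Perl). The function name `apply_filter`, its parameters and the dictionary representation of a feature are all assumptions made for this example.

```python
from typing import Dict, Iterable

Feature = Dict[str, object]  # assumed representation: attribute name -> value


def apply_filter(primary: Feature, secondary: Feature,
                 keep: Iterable[str], substitute: Iterable[str]) -> Feature:
    """Build one output feature from two source features.

    keep       -- attributes selected from the primary feature; all other
                  primary attributes are eliminated
    substitute -- attributes taken from the secondary feature, replacing or
                  adding to the primary values
    """
    result = {name: primary[name] for name in keep if name in primary}
    for name in substitute:
        if name in secondary:
            result[name] = secondary[name]
    return result


# Hypothetical example: geometry and speed from dataset 1, name from dataset 2.
road_1 = {"geom": "LINE A", "name": "", "speed": 50}
road_2 = {"geom": "LINE B", "name": "Main St"}
fused = apply_filter(road_1, road_2, keep=["geom", "speed"], substitute=["name"])
```

In this sketch the user specification is reduced to two attribute lists; a real filter would presumably also express quality criteria.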
3 System architecture Modular components
3 System architecture Modular component system
Implementation of DataFusion based on a generic, modular component architecture and an object-oriented, procedural cross-platform programming language (Perl)
At present DataFusion consists of 3 components, sequentially linked in a fusion pipeline
Preprocessing component: preprocessing modules, at present for Tele Atlas, Navteq and ATKIS input data
Filtering/fusion component: merge of 2 or more different input datasets into a single optimal dataset
Validation component: quality assessment of the merged dataset according to user or application specifications
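The sequential linking of the three components can be sketched as plain function composition. This Python outline is an assumption-laden illustration, not the Perl implementation: the stage names `preprocess_stage`, `fusion_stage` and `validation_stage`, the list-of-dictionaries dataset model and the placeholder logic inside each stage are all hypothetical.

```python
from typing import Dict, List

Feature = Dict[str, object]
Dataset = List[Feature]


def preprocess_stage(datasets: List[Dataset]) -> List[Dataset]:
    """Placeholder for component 1: drop features without a geometry."""
    return [[f for f in ds if f.get("geom") is not None] for ds in datasets]


def fusion_stage(datasets: List[Dataset]) -> Dataset:
    """Placeholder for component 2: concatenate the preprocessed datasets."""
    return [f for ds in datasets for f in ds]


def validation_stage(merged: Dataset) -> Dataset:
    """Placeholder for component 3: reject an empty result, else pass through."""
    if not merged:
        raise ValueError("fusion produced no features")
    return merged


def run_pipeline(datasets: List[Dataset]) -> Dataset:
    """Run the three components in sequence; each output feeds the next stage."""
    return validation_stage(fusion_stage(preprocess_stage(datasets)))
```

Keeping each stage behind a single function boundary mirrors the modular idea on the slide: a stage can be replaced, for example by a preprocessing module for another data provider, without touching the others.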
4 Fusion pipeline Preprocessing of source data
[Figure: preprocessing workflow from source data to preprocessed input data. Steps shown: conversion to uniform coordinate system; conversion to uniform data format; analysis for uniqueness; analysis for completeness; analysis for topological completeness; analysis for topological errors; geometric correction; quality measures]
Preprocessing component executes the following operations on heterogeneous geospatial source data:
Objective: Quality assessment of the input vector data model underlying each source dataset
Operations: Selection of source data; integration of source data by conversion to a unified coordinate system; transformation into a common data format; source data assessment for uniqueness and completeness; quality assessment and adjustment of topological correctness and thematic completeness
Result: Preprocessed input datasets used as input data for the subsequent filtering/fusion component
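Two of the listed operations, the assessment for uniqueness and for completeness, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the DAFU preprocessing modules: features are assumed to be dictionaries with a `coords` list, and the function names `check_uniqueness` and `check_completeness` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Feature = Dict[str, object]


def check_uniqueness(features: List[Feature]) -> List[Feature]:
    """Keep the first feature for each distinct geometry, drop later duplicates."""
    seen = set()
    unique = []
    for feature in features:
        key = tuple(tuple(point) for point in feature["coords"])
        if key not in seen:
            seen.add(key)
            unique.append(feature)
    return unique


def check_completeness(features: List[Feature], required: Sequence[str]
                       ) -> Tuple[List[Feature], List[Feature]]:
    """Split features into those with all required attributes and the rest."""
    complete, incomplete = [], []
    for feature in features:
        missing = [name for name in required if feature.get(name) in (None, "")]
        (incomplete if missing else complete).append(feature)
    return complete, incomplete


# Hypothetical road features: one exact duplicate, one without a name.
roads = [
    {"coords": [(0, 0), (1, 1)], "name": "A"},
    {"coords": [(0, 0), (1, 1)], "name": "A (duplicate)"},
    {"coords": [(2, 2), (3, 3)], "name": ""},
]
unique_roads = check_uniqueness(roads)
complete, incomplete = check_completeness(unique_roads, ["name"])
```

Coordinate system and format conversion are left out here because they depend on the provider formats, which the slides do not detail.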
4 Fusion pipeline Fusion of preprocessed data
[Figure: fusion workflow from preprocessed input data to merged output data. Steps shown: detection of relations among input data; assignments of related objects; transfer of thematic information; transfer of geometric information]
Data filtering/fusion component executes the following operations on preprocessed geospatial input data:
Objective: Generation of a single optimal dataset by transmission and augmentation of attribute features from n input datasets
Operations: Iterative comparison of geometric features (coordinates) of vector input datasets; determination of relationships between data features and real-world objects; generation of non-redundant fusion data (1 semantic feature assigned to 1 geometric feature only, and vice versa); transfer (cross-referencing) and extension of specified attributes to the target dataset
Result: Merged dataset used as input data for the subsequent validation component
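The operations above, coordinate comparison, one-to-one assignment and attribute transfer, can be illustrated with a deliberately simple Python sketch. The slides do not say which matching algorithm DAFU uses; the mean nearest-vertex distance, the greedy assignment, the `tolerance` parameter and the function names are assumptions chosen for this example.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Feature = Dict[str, object]


def mean_vertex_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Average distance from each vertex of a to its nearest vertex of b."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)


def fuse_attributes(target: List[Feature], source: List[Feature],
                    attrs: Sequence[str], tolerance: float) -> List[Feature]:
    """Copy attrs from the closest unused source feature to each target feature.

    Each source feature is assigned at most once, so the result stays
    non-redundant (one semantic record per geometry).
    """
    used = set()
    fused = []
    for t in target:
        best, best_d = None, tolerance
        for i, s in enumerate(source):
            if i in used:
                continue
            d = mean_vertex_distance(t["coords"], s["coords"])
            if d <= best_d:
                best, best_d = i, d
        merged = dict(t)  # geometry and existing attributes come from the target
        if best is not None:
            used.add(best)
            for name in attrs:
                if name in source[best]:
                    merged[name] = source[best][name]
        fused.append(merged)
    return fused


# Hypothetical data: the target has good geometry, the source has street names.
target = [{"id": "t1", "coords": [(0.0, 0.0), (10.0, 0.0)]}]
source = [
    {"coords": [(0.0, 0.5), (10.0, 0.5)], "name": "Main St"},
    {"coords": [(0.0, 50.0), (10.0, 50.0)], "name": "Far Rd"},
]
result = fuse_attributes(target, source, attrs=["name"], tolerance=1.0)
```

A greedy assignment like this depends on the processing order; it is used here only to keep the example short.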
4 Fusion pipeline Validation of merged data
[Figure: validation workflow from merged output data to specified DAFU data. Steps shown: validation of fusion quality; interactive error correction; coordinate system transformation; data format conversion]
Validation component executes the following operations on the single merged geospatial dataset:
Objective: Quality verification of fusion process
Operations: Calculation and evaluation of data fusion quality; if required and/or specified, interactive correction of errors of source data (< 5 percent for linear objects, < 10-15 percent for polygonal objects); transfer of merged geodata to specified coordinate systems; conversion of merged dataset into specified data formats (SVG, CSV, SHP, etc.)
End result: Application and/or user-specified optimal geospatial dataset
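The slides do not define how fusion quality is calculated. One plausible reading of the percentages above is that they give the share of objects left for interactive correction, per geometry type. The Python sketch below computes such a share; the `matched` flag, the `type` attribute and the function name `residual_error_share` are hypothetical and not taken from DAFU.

```python
from collections import Counter
from typing import Dict, List

Feature = Dict[str, object]


def residual_error_share(features: List[Feature]) -> Dict[str, float]:
    """Return, per geometry type, the percentage of features flagged as unmatched."""
    total = Counter()
    errors = Counter()
    for feature in features:
        total[feature["type"]] += 1
        if not feature["matched"]:
            errors[feature["type"]] += 1
    return {kind: 100.0 * errors[kind] / total[kind] for kind in total}


# Hypothetical merged dataset: 1 of 20 lines and 1 of 10 polygons unmatched.
merged = (
    [{"type": "line", "matched": i != 0} for i in range(20)]
    + [{"type": "polygon", "matched": i != 0} for i in range(10)]
)
shares = residual_error_share(merged)
```

The resulting shares could then be compared with user-specified limits to decide whether interactive correction is needed.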
5 System operation User interface
Front end of the DataFusion system allows for 2 operation modes: graphical user interface (GUI) or command-line interface
Command-line operation for integration into remote systems, such as servers, clusters, etc., by GI experts
GUI operation is the standard operation mode for application-oriented GI users
GUI composed of 8 widgets covering the core functions of DAFU; widgets communicate via data exchange and signal exchange (bindings)
Additional flexible support system provides the user with relevant information on operation and understanding of DAFU
5 System operation User interface > GUI
[Figure: Fig. 5-3 from the dissertation]
6 Conclusion Data fusion – what’s the benefit?
The DataFusion system presents an innovative approach to geospatial data mining by harmonising and improving the geometric and semantic quality of digital vector data
DAFU demonstrates that a single optimal geospatial dataset can be generated from existing suboptimal datasets, making repeated data acquisition unnecessary
DAFU facilitates cost-effective geospatial data management by multiple re-use of existing datasets customised to individual user and/or application requirements
DAFU contributes to reducing heterogeneity and redundancy of geospatial data in geodatabases, while increasing the efficient, meaningful use of geographically related mass data
Thank you for your attention
Questions? Comments? Feedback?
Contact Hartmut Asche | [email protected] | Dept of Geography | University of Potsdam | GER
Web www.geographie.uni-potsdam.de/geoinformatik