Spatial OLAP for environmental data: solved and unresolved problems Sandro Bimonte – Research...

Geographic OLAP: from Modelling to Visualization Sandro Bimonte TSCF, CEMAGREF, Clermont-Ferrand, France [email protected]


Spatial OLAP for environmental data: solved and unresolved problems
Sandro Bimonte, Research Centre on Technologies, Information Systems and Processes for Agriculture (TSCF), Clermont-Ferrand (France)
Intelligent Analysis of Environmental Data (S4 ENVISA Workshop 2009)


Geographic OLAP: from Modelling to Visualization

Sandro Bimonte

TSCF, CEMAGREF, Clermont-Ferrand, France


S4 ENVISA Workshop, 19/6/2009

Outline

Context
- Geographic information and spatial analysis
- Data Warehouse and OLAP
- Spatial OLAP

Contributions
- Modelling
  - Geographic OLAP
  - GeoCube: conceptual model
- Visualization
  - GeWOlap: a Web-based Geographic OLAP tool
  - GeOlaPivot Table: a 3D visualization and interaction metaphor
  - GoOLAP: integration of geovisualization and OLAP tools

Perspectives

Conclusions


Geographic information

Geographic information is the representation of an object or a real phenomenon located in space.

It is characterized by:
- Spatial component: position and shape
- Semantic component: information about the nature, the aspect and the other descriptive properties
- Spatial, thematic and/or cartographic generalization relationships with other objects or phenomena


Spatial Analysis

The spatial analysis process is flexible and iterative.

[Figure: analysis cycle with the steps "Identify the problem", "Identify data", "Select tools", "Create an analysis plan", "Show results", "Examine results", "Change parameters" and "Redefine the process".]

[Figure: a spatial operation takes Layer A, Layer B and Layer C as input and produces an output.]


Data Warehousing and OLAP (1/2)

A data warehouse is "a subject-oriented, integrated, non-volatile and time-variant collection of data stored in a single site repository and collected from multiple sources" [Inmon92]

Data warehouse models are designed to represent measurable facts, described by measures, and the various dimensions that characterize the facts and represent analysis axes

An instance of a multidimensional model is a hypercube

OLAP tools implement interactive analysis techniques used to rapidly explore the data warehouse through OLAP operators

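[Figure: example multidimensional schema. Fact Sales (measure Volume, aggregated with SUM). Dimensions: Products (Item: Code, Name, Price; Type: Code, Label; Brand: Name, Code), Location (Store: Name, Code, Address; City: Name, Population), Clients (Client: Name, Age), Time (Month: Code_Month, Label; Year: Code_year, Label).]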
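As a rough illustration of the ideas on this slide, the sketch below treats a handful of Sales facts as the most detailed cells of a hypercube and rolls the Volume measure up with SUM. The fact rows, item names, store codes and the `roll_up` helper are all invented for the example; they are not taken from the talk or from any OLAP tool.

```python
from collections import defaultdict

# Hypothetical fact rows: (item, store, city, month, year, volume).
# All names and values are made up for illustration.
SALES = [
    ("pen", "S1", "Lyon", "2009-01", 2009, 10),
    ("pen", "S2", "Lyon", "2009-01", 2009, 5),
    ("ink", "S3", "Paris", "2009-02", 2009, 7),
    ("pen", "S3", "Paris", "2009-02", 2009, 3),
]


def roll_up(facts, key):
    """Aggregate the Volume measure with SUM for each value of key(row)."""
    totals = defaultdict(int)
    for row in facts:
        totals[key(row)] += row[-1]  # the last field is the Volume measure
    return dict(totals)


# Roll-up on the Location dimension: Store -> City.
by_city = roll_up(SALES, key=lambda r: r[2])
# Roll-up on Location (City) and Time (Year) at once.
by_city_year = roll_up(SALES, key=lambda r: (r[2], r[4]))
```

Changing only the `key` function moves the analysis to another level or dimension, which is the sense in which OLAP operators explore the same facts along different analysis axes.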


Spatial OLAP

Spatial OLAP (SOLAP):

"A visual platform built especially to support rapid and easy spatio-temporal analysis and exploration of data following a multidimensional approach comprised of aggregation levels available in cartographic displays as well as in tabular and diagram displays" [Bédard97]

Cartographic representation of multidimensional data makes it possible to:
- visualize the spatial distribution of the facts
- visualize (spatial) relationships between facts and classical dimensions
- visualize facts at different spatial granularities


Main Spatial OLAP Concepts

Spatial dimension:
- Spatial non-geometric (i.e. text-only members)
- Spatial geometric (i.e. members with a cartographic representation)
- Mixed spatial (i.e. combining cartographic and textual members)

Spatial measure:
- List of spatial objects
- Result of spatial operators

Spatio-multidimensional operators:
- Navigate into the spatial dimension (Roll-Up/Drill-Down)
- Slice the hypercube

[Figure: example spatio-multidimensional schemas. Labels, in extraction order: Accidents; Insurance; Number; Validity period; Insurance Type; Time; Date_day; Calendar Month; Name; Amount paid; Location /GU; Client; First name; Last name; Age; Position; Age Category; Age Group; Group name; Min value; Max value; Insurance Category; Name; Week; Week number; Quarter; Number; Year; Year; Highway Maintenance; Coating; Name; Type; Durability; Road Coating; City; Name; Population; Geo Location; State; Name; Population; Area; Highway Segment; Segment number; Road Condition; Highway Structure; Highway Section; Section number; Length (S); No. Cars; Repair Cost; Highway; Name; Date; Date; Event; Season; Time.]

[Legend, translated from French: cardinalities (1,1) and (1,N); levels (level name, key attribute, other attributes); analysis criteria (name); dimension; geometry types point, line, surface, collection of points, collection of lines, collection of surfaces; topological relationships adjacent, intersection, disjoint, within, equal, crosses; fact and measures (fact name, measures).]


Spatial OLAP: Tools

Rivest et al. 05; Scotch et al. 05; Voss et al. 04; Webigeo


Spatial OLAP Limits

Aspect: SOLAP | Geographic information
- Dimension: spatial hierarchy | map generalization relationships
- Measure: spatial component | descriptive attributes (semantic component)
- Flexibility: analysis axes and subject defined a priori | data creation/modification


Geographic OLAP


Geographic Dimension

A dimension is geographic if the members of at least one level are geographic objects


Descriptive Hierarchy

A descriptive hierarchy is defined using descriptive attributes of objects

[Figure: Pollution schema with a descriptive hierarchy. Fact Pollution (measure Rate, aggregated with AVG). Dimensions: Time (Day, Month, Year), Pollutants (Pollutant: Code, Name, Density, BoilingPoint; BoundsType: Bt_code; CarbonsAtomsNumber: Cbn_code; TypeP: Name), Lagoon (Unit: Name, Plants, Area, Type, Salinity; Type: Name). Hierarchy instance: All_units; types Commercial and Industrial; units Ancora, Chioggia, Romea, Mazzorbo.]


Spatial Hierarchy

A spatial hierarchy is a hierarchy where members of different levels are related by topological inclusion and/or intersection relationships

[Figure: Pollution schema with a spatial hierarchy. Fact Pollution (measure Rate, aggregated with AVG). Dimensions: Time (Day, Month, Year), Pollutants (Pollutant: Code, Name, Density, BoilingPoint; BoundsType: Bt_code; CarbonsAtomsNumber: Cbn_code; TypeP: Name), Lagoon (Unit: Name, Plants, Area, Type, Salinity; Zone: Name, Area). Hierarchy instance: All_units; zones and units shown: Canal Bissa, Carbonera, Mazzorbo, Ancora, Bocca Lido, North Swam, Chioggia, Romea, Ronzei, Figheri, Bocca Chioggia, South Swam.]
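The definition of a spatial hierarchy can be made concrete with a small sketch in which the parents of a member are derived from topology rather than from a descriptive attribute. Axis-aligned boxes stand in for real geometries, and the coordinates, the `Box` type and the helper names are assumptions made for the example; only the zone and unit names come from the slide, and which unit lies in which zone is invented.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle used as a stand-in for a real geometry."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def contains(outer: Box, inner: Box) -> bool:
    """Topological inclusion: inner lies entirely inside outer."""
    return (outer.xmin <= inner.xmin and outer.ymin <= inner.ymin
            and inner.xmax <= outer.xmax and inner.ymax <= outer.ymax)


def intersects(a: Box, b: Box) -> bool:
    """Topological intersection: the two boxes share at least one point."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def parents(unit: Box, zones: dict) -> list:
    """Names of the zones related to the unit by inclusion or intersection."""
    return [name for name, zone in zones.items() if intersects(zone, unit)]


# Invented coordinates; the unit-to-zone layout is hypothetical.
ZONES = {"North Swam": Box(0, 5, 10, 10), "South Swam": Box(0, 0, 10, 5)}
UNITS = {"Carbonera": Box(1, 6, 2, 7), "Romea": Box(4, 4, 5, 6)}
```

In this toy layout `parents(UNITS["Romea"], ZONES)` returns both zones, because the unit straddles their common border. A member with more than one parent is what distinguishes an intersection-based hierarchy from a strict inclusion one, and it is the case where aggregation needs extra care.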


Generalization Hierarchy

A hierarchy is a generalization hierarchy if:
- members represent the same geographic information at different scales
- members of a level are the result of the generalization of members of the level directly below

[Figure: Pollution schema (Unità Barenali) with a generalization hierarchy. Fact Pollution (measure Rate, aggregated with AVG). Dimensions: Time (Day, Month, Year), Pollutants (Pollutant: Code, Name, Density, BoilingPoint; Bounds Type: Bt_code; Carbons Atoms Number: Cbn_code; Type: name), Lagoon (Unit 1:500: Name, Plants, Area, Salinity; Unit 1:1500: Name, Plants, Area, Type, Salinity). Hierarchy instance labels: All_units; Paleazza; Sacco GheboStorto; Botta SoraCanal; Botta SoraCanal-Treporti; Treporti; Sacco GheboStorto.]


Geographic Measure

A geographic measure is a geographic object which can belong to one or more hierarchy schemas

[Figure: Pollution schema with a geographic measure. Dimensions: Time (Day, Month, Year), Pollutants (Pollutant: Code, Name, Density, BoilingPoint; BoundsType: Bt_code; CarbonsAtomsNumber: Cbn_code; TypeP: Name), Rate (Rate5: Value5; Rate10: Value10). Measure Unit, with one aggregation per attribute: Geom: Fusion; Name: No Aggregation; Plants: List/Area; Type: Ratio; Salinity: AVG.]
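The figure pairs each attribute of the Unit measure with its own aggregation. A minimal sketch of that idea follows, assuming geometries are sets of grid cells so that "Fusion" becomes a set union. The `Unit` class, the plant names and the salinity values are invented; the Ratio aggregation for Type is left out, and Plants is reduced to a plain list, since the slide does not say how these are computed.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Unit:
    name: str
    geom: frozenset   # grid cells (x, y), a stand-in for a polygon
    plants: tuple     # plant species found in the unit
    salinity: float


def aggregate_units(units):
    """Aggregate several units into one geographic object, attribute by attribute."""
    return {
        "geom": frozenset().union(*(u.geom for u in units)),      # Geom: Fusion
        "name": None,                                              # Name: No Aggregation
        "plants": sorted({p for u in units for p in u.plants}),    # Plants: List
        "salinity": mean(u.salinity for u in units),               # Salinity: AVG
    }


# Hypothetical members; only the unit names appear on the slides.
ancora = Unit("Ancora", frozenset({(0, 0), (0, 1)}), ("Salicornia",), 30.0)
romea = Unit("Romea", frozenset({(0, 1), (1, 1)}), ("Limonium", "Salicornia"), 34.0)
merged = aggregate_units([ancora, romea])
```

The result is itself an object with a geometry and descriptive attributes, which is why such a measure can in turn be placed in a hierarchy.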


Multidimensional Operators

Drill and slice operators

And…
- Operators which dynamically modify spatial dimensions
- Operator to permute measure and dimension
- Operators to navigate into the measure hierarchy


GeoCube


Entity Schemas and Instances model members and measures

Entity Schemas and Instances are organized into hierarchies (Hierarchy Schema and Instance)

The Base Cube represents the fact table where all dimensions are at the most detailed levels

Every level can be used as a dimension or as a measure

A measure belongs to a hierarchy

The Aggregation Mode defines the aggregations for the entity used as measure

A View represents a multidimensional query


Algebra

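Let Vv = ⟨BCbc, L, [?]k, [?]⟩; then

Op(Vv)[parameters] = V'v = ⟨BC'bc, L', [?]'k, [?]'⟩

where [?]' is calculated using an algorithm ([?] marks a symbol lost in extraction)

Operator families: Navigation, Modification

Operators: Roll-Up, Slice, Dice, Classify, Specialize, Permute, OLAP-Buffer, OLAP-Overlay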
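The closure property stated here, that an operator applied to a view yields another view, can be sketched as follows. This is not GeoCube's formal definition: the `View` class keeps only a base cube and one Time level, the `ANCESTOR` mappings and the fact values are invented, and AVG is hard-coded as the aggregation.

```python
from dataclasses import dataclass, replace
from statistics import mean

# Hypothetical mappings from a day (ISO string) to its ancestor at each Time level.
ANCESTOR = {
    "day": lambda d: d,
    "month": lambda d: d[:7],
    "year": lambda d: d[:4],
}


@dataclass(frozen=True)
class View:
    base_cube: tuple   # (day, rate) facts at the most detailed level
    level: str         # current level of the Time dimension

    def cells(self):
        """Compute the aggregated cells of the view from the base cube (AVG)."""
        groups = {}
        for day, rate in self.base_cube:
            groups.setdefault(ANCESTOR[self.level](day), []).append(rate)
        return {member: mean(rates) for member, rates in groups.items()}


def roll_up(view, to_level):
    """Operator: takes a view and returns a new view; the base cube is untouched."""
    return replace(view, level=to_level)


facts = (("2009-06-18", 2.0), ("2009-06-19", 4.0), ("2009-07-01", 9.0))
daily = View(facts, "day")
monthly = roll_up(daily, "month")
```

Because `roll_up` returns a `View`, operators can be chained, and every result is recomputed from the unchanged base cube rather than from a previous aggregate.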


Properties

Data modelling properties | Damiani | Jensen | Ahmed | Pourabbas | GeoCube
Set of measures | OK | NO | OK | NO | OK
Dimension attributes | NO | NO | NO | OK | OK
Multi-valued measures | OK | OK | OK | OK | OK
User-defined aggregation functions | OK | OK | NO | OK | OK
Derived measures (derived dimension attributes) | NO | NO | NO | NO | OK
N-n relationships between dimensions and facts | NO | OK | NO | OK | OK
Complex hierarchies | OK | OK | NO | OK | OK
Correct aggregation of geographic measures | NO | NO | NO | NO | OK
Imprecision of multi-association relationships for map generalization hierarchies | NO | NO | NO | NO | OK


Properties

Spatio-multidimensional operators | Damiani | Jensen | Ahmed | Pourabbas | GeoCube
Operators which modify spatial dimensions | NO | NO | NO | NO | OK
Permute | NO | OK | NO | NO | OK
Navigation into measures hierarchy (multigranular analysis) | Part | Part | NO | NO | OK


GeWOlap


GeWOlap is a Web Geographic OLAP tool:
- OLAP-GIS integrated
- Synchronized environment
- Geographic measures and dimensions
- Geographic OLAP operators


Architecture

[Figure: three-tier architecture.
Spatial Data Warehouse: Spatial ORACLE, holding spatial tables, aggregate tables, and dimension and fact tables.
OLAP Server: Mondrian, with the cube definition in Pollution.xml (elements shown: Schema name=pollution, AggName name=agg_1_poll, …, Cube name=Pollution).
OLAP Client: JPivot (tabular display) + MapXtreme Java (cartographic display).]


User Interface

GIS operators Geographic OLAP operators


Geographic Measures


Drill-down Position


OLAP-Overlay

Depuration

Map


GeOlaPivot Table


GeOlaPivot Table is a 3D interaction metaphor

It combines the Space-Time Cube and Pivot Table concepts

A third dimension gives insight into the spatial evolution of the phenomenon as a function of other inputs (time, products) using map overlay

It lets users visually compare spatial relationships between measures of different members of the same level

It visualizes spatial relationships between measures and dimension members

It gives a visual representation of the structure of the multidimensional application

OLAP operators are triggered through simple interaction


Mock-up


GoOLAP


It combines the facilities provided by a commonly used geobrowser and a traditional OLAP system

It integrates, in a web application, the 3D capabilities of the geobrowser Google Earth with a freely available OLAP server, Mondrian

The main advantage of this solution is that it provides a web-based SOLAP environment able to render spatial data in 3D

Data can be provided by different (remote) data repositories

The decision maker can highly personalize the visual encodings of the information


User Interface


Current work

Introduction of continuous field data into SOLAP
- Aggregation by means of Map Algebra

Definition of visual language for Spatial Data Warehouse

Spatial Data Warehouse using semi-structured data (GML)
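To give an idea of what aggregating continuous field data by means of Map Algebra could look like, the sketch below applies a local (cell-by-cell) operation to rasters of the same shape, for example one grid per day rolled up to a coarser time level. The `local_mean` function and the sample grids are assumptions made for illustration; the slides do not specify which Map Algebra operations are used.

```python
def local_mean(grids):
    """Local Map Algebra operation: cell-by-cell mean of same-shaped rasters."""
    rows, cols = len(grids[0]), len(grids[0][0])
    if any(len(g) != rows or any(len(r) != cols for r in g) for g in grids):
        raise ValueError("all grids must have the same shape")
    return [
        [sum(g[i][j] for g in grids) / len(grids) for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical field values sampled on a 2 x 2 grid for two days.
day1 = [[1.0, 2.0], [3.0, 4.0]]
day2 = [[3.0, 2.0], [5.0, 0.0]]
two_day_mean = local_mean([day1, day2])
```

The aggregate is again a field over the same grid, so the measure stays continuous in space after the roll-up on Time.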


Future Work

Modelling
- SOLAP conceptual model for sensor network data
- Introduction of spatio-temporal multigranular data in SOLAP
- Definition of new operators which dynamically modify spatial dimensions
- Integrity constraints for Spatial Data Warehouses
- Introduction of vague spatial data in SOLAP

Visualization
- Introduction of a temporal component in GoOLAP


Conclusions (1/2)

Spatial OLAP integrates spatial data in OLAP systems

SOLAP models and tools do not handle geographic data and spatial analysis well

A new multidimensional analysis paradigm: Geographic OLAP


Conclusions (2/2)

GeoCube: multidimensional model and algebra for Geographic OLAP

GeWOlap: web OLAP-GIS integrated solution based on GeoCube

GeOlaPivot Table: a visualization and interaction metaphor to analyze geographic measures

GoOLAP: a system which integrates geovisualization and OLAP functionalities

New trends in SOLAP and Spatial Data warehousing


Questions for me…and You

How can we estimate missing values in a Spatial Data Warehouse (SDW)? Using hierarchies?

Is it possible to couple ML and DM algorithms with SOLAP? Using hierarchies?

How can SOLAP visualization be improved? By reducing dimensionality?


Thanks for your attention (Merci, Grazie)

Questions ?