1 Antonio Rogmann (ZEFc), Universität Bonn
Data Management in the GLOWA Volta Project
Data Management and Application of GIS and Remote Sensing in Natural Resources Management Training Workshop
Data Management in the GLOWA Volta Project
Antonio Rogmann
Center for Development Research (ZEFc), University of Bonn
Wednesday, December 12 – Friday, December 14, 2007 DGRE, Ouagadougou, Burkina Faso
2 Antonio Rogmann (ZEFc), Universität Bonn
Data Management
Content
1. Data Management Problems, Solutions and Challenges
2. Data Management Workflow
3. Data Management Infrastructure Concept Components and Interfaces
3 Antonio Rogmann (ZEFc), Universität Bonn
Data Management: Problems
Survey with GLOWA Volta Partner Institutions and Stakeholders at the
PARTNERS’ CAPACITY NEEDS ASSESSMENT WORKSHOP(31.05.-01.06.2007, Accra, Ghana)
To understand
the relationships between the institutions in terms of data exchange / flows related to water management
the data environment: software/models in use, data storage and access facilities, hardware
a defined set of problems in managing (getting access to) data
as a condition for
adjusting the data management system of the GLOWA Volta Project to the requirements of the partners
offering the partners solutions to increase the quality of data management
4 Antonio Rogmann (ZEFc), Universität Bonn
Consequences of lack of data management
Institutions participating in the survey:
Coalition of NGO's in Water and Sanitation (CONIWAS)
Kwame Nkrumah University of Science and Technology, Kumasi (KNUST)
Soil Research Institute, Council for Scientific and Industrial Research (SRI)
Water Research Institute, Council for Scientific and Industrial Research (WRI)
Hydrological Service Department (HSD)
Water Resources Commission (WRC) (2 participants)
Hydrological Service Department (HSD)
Ghana Irrigation Development Authority (GIDA)
Water Research Institute, Council for Scientific and Industrial Research (WRI)
Ghana Water Company Ltd, Head Office (GWCL)
Dept. of Agriculture Economy & Agriculture Business. College of Agric. and Consumer service
Environmental Protection Agency (EPA)
Centre for Environmental Impacts Analysis (CEIA)
Volta Basin Development Foundation (VBDF)
Training, Research Network for Development (TREND)
UDS: Faculty of Integrated Development Studies
Savannah Agricultural Research Institute (SARI)
Volta River Authority (VRA)
5 Antonio Rogmann (ZEFc), Universität Bonn
Results
Lack of information about data = data about data = meta data
Data Management: Problems
Problems getting information about data, by topic
[Bar chart: votes (axis 0 to 10) per category: Data Provider; Data Quality; Data Content/Format; Use Rights/Prices; Software the data has been processed with]
Based on the survey questionnaire. Number of participants = 19.
6 Antonio Rogmann (ZEFc), Universität Bonn
Data Management: Problems
Documentation of data available by
[Bar chart: votes (axis 0 to 7) per category: Meta database (internal); Digital Catalog; Catalog on paper; Without documentation]
Results
Data are documented mainly in internal digital catalogues (e.g. Excel tables), on paper, or not at all
Web-based, searchable metadatabases are the exception
Based on the survey questionnaire. Number of participants = 19.
7 Antonio Rogmann (ZEFc), Universität Bonn
Results
Data transfer is laborious and time-consuming
Sending data by e-mail causes problems because of data volumes and transfer times
Data Management: Problems
Data transfer to partners by
[Bar chart: votes (axis 0 to 11) per category: Email; CD / DVD sent by post; Web download; Picking up personally; Other ways]
Based on the organizations represented by the questionnaire participants. Multiple choice. N = 19.
8 Antonio Rogmann (ZEFc), Universität Bonn
Data Management: Problems
[Diagram labels: Data user; Organization (data); Service Department (data); Institution (data)]
Common questions encountered
when searching for data:
Which data exist that can serve my research / decision / information requirements?
Where are the data available?
How can I get the data with little effort?
What are the formats of the data? Are they compatible with my applications / models?
What are the data characteristics (e.g. time steps, units ...)?
Who owns the data? Are there costs?
9 Antonio Rogmann (ZEFc), Universität Bonn
Solution: Data Management
To solve these problems, the GVP would like to offer you:
a centrally hosted database which provides
access to the GVP datastock
the option to extend the datastock with your own data
a centrally hosted metadatabase giving
information about the data needed
references to data providers
a geoportal informing
about projects related to water management in the Volta Basin
and their data, in a spatial visualization
[Diagram labels: Organization (host); GVP data; Service Department (data); Institution (data); Web; Data Server; Map Server; Metadata; Geoportal]
10 Antonio Rogmann (ZEFc), Universität Bonn
Data
Project data: what can GVP provide?
Hydrological data: water discharge, groundwater (time series) ...
Climatological data: precipitation, temperature, air humidity, evapotranspiration, heat flux (time series and forecasts) ...
Water use data: agricultural (irrigation) / domestic / industrial (hydropower) / reservoirs …
Land use / land cover data: agriculture, urbanization, soil, geology, vegetation ...
Topographic / infrastructure / administrative (basic) data: river networks, lakes, elevation, roads, settlements, electricity, boundaries ....
Socio–economic data: demography, census data, economic activities (markets), surveys ...
in several formats: vector / raster data (remote sensing), tables, documents, model specific formats ...
11 Antonio Rogmann (ZEFc), Universität Bonn
Solution: Data Management
Data Management is the holistic background in which data access facilities are embedded
Data Management in an organization is based on a variety of methods for
Data description (meta data)
Data organization
Data quality assurance
Data access and distribution
Security
12 Antonio Rogmann (ZEFc), Universität Bonn
Solution: Data Management
Data Management in an organization is practically based on
Standards
global standards, e.g. for metadata, resource identification, formats ...
internal standards agreed by consensus inside the organization, e.g. database models, file naming, data policy ...
Workflows / Process Steps / Responsibilities
Technology: hardware, software, interfaces ... data infrastructure
13 Antonio Rogmann (ZEFc), Universität Bonn
Data Management: Metadata
Metadata Standards:
several standards developed by standardization organizations, e.g. the Federal Geographic Data Committee's (FGDC) standard or ISO 19115 for geodata
registered by the International Organization for Standardization (ISO)
consisting of a range of elements/fields to describe resources (data, software, services)
some metadata standards consist of several hundred elements
14 Antonio Rogmann (ZEFc), Universität Bonn
Data Management: Metadata
Metadata Standard in the GVP: Dublin Core (DCMES)
core of 15 elements, extended by some special elements for geodata
all elements except title and identifier are optional
understandable element description
every kind of resource (data, software, model, …) can be described
Searchable Metadata elements like
„Subject“: topic will be categorized using keywords, key phrases, or classification codes
„Publisher“: an entity (institution, person) responsible for making the resources available
„Format“: the file format, physical medium, or dimensions of the resource
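A minimal sketch of how such a Dublin Core record could be written as XML. The namespace URI is the official Dublin Core element namespace; the wrapper element name `metadata`, the function name `build_dc_record` and all field values are assumptions for illustration, not the GVP input mask's actual output.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # official Dublin Core element namespace


def build_dc_record(fields):
    """Build a Dublin Core record; 'title' and 'identifier' are obligatory."""
    for required in ("title", "identifier"):
        if not fields.get(required):
            raise ValueError(f"missing obligatory element: {required}")
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name, value in fields.items():
        # An element may be repeated, e.g. several subject keywords.
        values = value if isinstance(value, list) else [value]
        for v in values:
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = v
    return root


record = build_dc_record({
    "title": "Water level at Kaburi (illustrative)",
    "identifier": "urn:x-gvp:HD12:ds-pd.waterlevel_gh-kab_020101-020630-v1.0.xls.cd",
    "subject": ["hydrology", "water level"],
    "publisher": "Hydro Service",
    "format": "xls",
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

All optional elements can simply be left out of the dictionary; only a missing title or identifier is rejected.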
15 Antonio Rogmann (ZEFc), Universität Bonn
Data Management: Metadata
Creating metadata
Metadata should be stored in a metadatabase
hosted in a central place
providing web-based access and search interfaces to data and resource descriptions
Metadata can be created in two ways:
online: direct entry of metadata into a central metadatabase using an internet browser (JavaScript, PHP)
offline: using an internet browser and JavaScript, storing each metadata set locally, close to the described object, in an XML file
if metadata XML files have been created offline
► a metadata harvester can collect the local files and insert them automatically into the metadatabase on a server
► the metadata files can be uploaded to the central metadatabase
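The harvesting idea can be sketched as follows. This is not the GVP harvester: the dictionary stands in for the central metadatabase, and the function name `harvest`, the wrapper element and the file names are assumptions for illustration.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

DC_NS = "{http://purl.org/dc/elements/1.1/}"


def harvest(root_dir):
    """Collect every *.xml metadata file under root_dir, keyed by identifier."""
    store = {}  # stand-in for the central metadatabase
    for path in sorted(Path(root_dir).rglob("*.xml")):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        identifier = root.findtext(f"{DC_NS}identifier")
        if identifier:  # records without an identifier are ignored
            store[identifier] = {
                "title": root.findtext(f"{DC_NS}title"),
                "source_file": str(path),
            }
    return store


# Demonstration with two throwaway files: one valid record, one broken file.
with tempfile.TemporaryDirectory() as tmp:
    good = ('<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">'
            "<dc:title>Demo</dc:title>"
            "<dc:identifier>urn:x-gvp:demo</dc:identifier></metadata>")
    (Path(tmp) / "site_a").mkdir()
    (Path(tmp) / "site_a" / "demo.dc.xml").write_text(good, encoding="utf-8")
    (Path(tmp) / "broken.xml").write_text("<metadata>", encoding="utf-8")
    harvested = harvest(tmp)

print(sorted(harvested))
```

Keying the store by identifier means that harvesting the same file twice overwrites the record instead of duplicating it.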
16 Antonio Rogmann (ZEFc), Universität Bonn
Data Management: Metadata
Metadata input mask
[Screenshot labels: obligatory metadata elements; metadata element input fields; button that opens the URN mask]
Input mask* for
creating metadata sets as XML files
entering metadata into the metadatabase (web / LAN)
* Developed as a prototype by Dr. Marcel Endejan, Deputy Executive Officer, GWSP, in his dissertation
17 Antonio Rogmann (ZEFc), Universität Bonn
Data Management: Metadata
Metadata input mask
[Screenshot labels: metadata elements; insert button]
18 Antonio Rogmann (ZEFc), Universität Bonn
Data Management: Internal Description
Internal file description for structured data (e.g. measurements):
data file-headers giving information about content, units in use, instrumentation, quality of values, location, ...
apart from the metadata, important information is provided to the data user / recipient
multiple similar files / data sets can be described in
► the first sheet (e.g. of an excel file)
► the first file of a file set (referenced by others)
► separate text file stored close to the files
19 Antonio Rogmann (ZEFc), Universität Bonn
Data Management: Naming Convention
Basic determination of data categories
qualitative data: data which are rich in detail and description, usually in a textual or narrative format, e.g. case studies, document reviews, …
quantitative data: numerical data. Data which are measured on either the ratio or interval scale of measurement, e.g. temperature, water level, …
Naming of data (recommended particularly for quantitative data) should reflect discipline, topic, site, time frame and version
Example: hyd_waterlevel_ghana-kaburi_020101-020630_v1.xls
(discipline = hyd, topic = waterlevel, site = ghana-kaburi, time frame = 020101-020630, version = v1)
Changing current naming systems is not necessary; what is necessary is …
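A short sketch of how the naming convention above could be checked automatically. It assumes that underscores separate the five parts and never occur inside a part; the names `DataFileName`, `build_name` and `parse_name` are illustrative only.

```python
from typing import NamedTuple


class DataFileName(NamedTuple):
    discipline: str
    topic: str
    site: str
    time_frame: str
    version: str
    extension: str


def build_name(parts: DataFileName) -> str:
    """Join the five naming parts with underscores and append the extension."""
    stem = "_".join(parts[:5])
    return f"{stem}.{parts.extension}"


def parse_name(filename: str) -> DataFileName:
    """Split a file name into its parts; reject names that do not fit."""
    stem, _, extension = filename.rpartition(".")
    fields = stem.split("_")
    if len(fields) != 5 or not extension:
        raise ValueError(f"name does not follow the convention: {filename}")
    return DataFileName(*fields, extension)


example = "hyd_waterlevel_ghana-kaburi_020101-020630_v1.xls"
parsed = parse_name(example)
print(parsed.site, parsed.time_frame)
```

Using a hyphen inside a part (as in ghana-kaburi) keeps the underscore free as the only separator.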
20 Antonio Rogmann (ZEFc), Universität Bonn
Data Management: Identification
Identification of resources (data, documents, maps)
… a unique identifier for each resource as a central metadata element. The GVP uses the Uniform Resource Name (URN)
a quasi-standard for identifying resources in information systems. Example: International Standard Book Number (ISBN)
can be used as a name for resources (e.g. file name)
has to follow a standardized syntax
URNs in the GVP can be generated easily using a resource name generator (internet browser)
21 Antonio Rogmann (ZEFc), Universität Bonn
Identification: URN
standardized syntax: 'urn:' <NID> ':' <NSS>
NID = Namespace Identifier representing an organisation, project, network, person
urn:x-gvp:uid:<NSS>
urn = uniform resource name
x = experimental, not officially registered
gvp = GLOWA Volta Project
uid = user identification
Data Management: Identification
22 Antonio Rogmann (ZEFc), Universität Bonn
standardized syntax: 'urn:' <NID> ':' <NSS>
NSS = Namespace Specific String encoding information about the "type", "use" and "storage medium" of the resource / data
urn:<NID>:<resType>-<resSubType>.<sTitel>.v<verNr>.<for>.<med>
<resType> = type of resource, e.g. dataset, document, software
<resSubType> = subtype of resource, e.g. primary / secondary data, model input
<sTitel> = short title, name
<verNr> = version number
<for> = format
<med> = medium on which the resource / data file is stored
Data Management: Identification
23 Antonio Rogmann (ZEFc), Universität Bonn
Example:
urn:x-gvp:HD12:ds-pd.waterlevel_gh-kab_020101-020630-v1.0.xls.cd
gvp = GLOWA Volta Project
HD12 = institution, e.g. "Hydro Service", and editor, e.g. 12 = person xy
ds = dataset
pd = primary data
waterlevel ... 30 = short title, e.g. abbreviation for "hyd_waterlevel_ghana-kaburi_020101-020630"
v1.0 = version of the dataset, e.g. raw data in 1st version (uncontrolled)
xls = MS Excel
cd = on CD
Data Management: Identification
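A sketch of how a URN of this shape could be assembled and taken apart again. It follows the stated syntax, with a dot before the version (".v1.0"), whereas the slide example writes "-v1.0". The class `GvpUrn` and the function `parse_urn` are assumptions for illustration and are not the Resource Name Generator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GvpUrn:
    uid: str           # user identification, e.g. "HD12"
    res_type: str      # e.g. "ds" for dataset
    res_sub_type: str  # e.g. "pd" for primary data
    short_title: str
    version: str       # e.g. "1.0"
    fmt: str           # e.g. "xls"
    medium: str        # e.g. "cd"

    def __str__(self):
        nss = (f"{self.res_type}-{self.res_sub_type}.{self.short_title}"
               f".v{self.version}.{self.fmt}.{self.medium}")
        return f"urn:x-gvp:{self.uid}:{nss}"


def parse_urn(text):
    """Split a URN written in the syntax above back into its parts."""
    scheme, nid, uid, nss = text.split(":", 3)
    if scheme != "urn" or nid != "x-gvp":
        raise ValueError(f"not a GVP URN: {text}")
    types, _, rest = nss.partition(".")
    res_type, _, res_sub_type = types.partition("-")
    # Format and medium are the last two dot-separated fields.
    title_and_version, fmt, medium = rest.rsplit(".", 2)
    short_title, sep, version = title_and_version.rpartition(".v")
    if not sep:
        raise ValueError(f"no version found in: {text}")
    return GvpUrn(uid, res_type, res_sub_type, short_title, version, fmt, medium)


urn = GvpUrn("HD12", "ds", "pd", "waterlevel_gh-kab_020101-020630", "1.0", "xls", "cd")
print(urn)
```

Parsing from the right keeps the dot inside a version number such as 1.0 from being mistaken for a field separator.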
24 Antonio Rogmann (ZEFc), Universität Bonn
Creating URNs using the Resource Name Generator*
creates URNs using a simple JavaScript application running in an internet browser; currently exists as a prototype
* Developed as a prototype by Dr. Marcel Endejan, Deputy Executive Officer, GWSP, in his dissertation
Data Management: Identification
25 Antonio Rogmann (ZEFc), Universität Bonn
Resource Name Generator integrates special codes for resource types within a network that shares data
the resource types have to be identified before modeling the URN-Syntax ….
… and integrated into the script
Resource Type
Resource Sub-Type
Data Management: Identification
26 Antonio Rogmann (ZEFc), Universität Bonn
Resource Name Generator*
version, format and storage medium are selectable
copy and paste the URN into the name of the dataset (if required) and enter it into the metadata
individual URNs will be reconciled with the central metadatabase, in which the data will be registered and described, to avoid duplicates
[Screenshot labels: Version Number; Format; Storage Medium; URN]
* Developed as a prototype by Dr. Marcel Endejan, Deputy Executive Officer, GWSP, in his dissertation
Data Management: Identification
27 Antonio Rogmann (ZEFc), Universität Bonn
Formats
data can be stored in “proprietary” or in “non-proprietary” formats
a proprietary format encodes data in such a way that the file can only be opened with the software that was used to generate the data
non-proprietary formats can be used by a wide range of applications (mostly using import functions) and platforms, and increasingly so in future
data have to be stored for a long period of time, and it is uncertain which programs will be used in future
interoperability between different software applications has to be ensured for as long as possible
Data Management: Formats
28 Antonio Rogmann (ZEFc), Universität Bonn
internationally certified standards like the ISO standard “Open Document Format for Office Applications” (ODF), “HTML”, “XML” or the OGC’s “GML” (Geography Markup Language, Open Geospatial Consortium)
some formats are de facto standards (like MS Excel) because the proprietary programs are used by many people
processing software widely used by the members of a data exchange framework has requirements with respect to input formats
Data Management: Formats
29 Antonio Rogmann (ZEFc), Universität Bonn
Conclusion: use non-proprietary exchange formats as far as possible and consider the format requirements of the software in use
Examples (proprietary format → non-proprietary alternatives):
Microsoft Word (.doc) → Rich Text Format (.rtf), Open Document Text (.odt)
MS Excel (.xls) → Comma-Separated Values (.csv), Extensible Markup Language (.xml)
ESRI shape → Geography Markup Language (GML)
Recommendation:
use open office software like OpenOffice.org
its functionality is similar to that of Microsoft Office (incl. Excel, Access, etc.)
its format has been an ISO standard since 2006 (ODF - ISO/IEC 26300)!
no costs!
Data Management: Formats
30 Antonio Rogmann (ZEFc), Universität Bonn
Security
Safeguards against unauthorized access to and misuse of data and resources
Use of computing security facilities such as
Access Control Lists (ACL)
secure access channels like Secure Shell (SSH) technology
Data Management: Security
31 Antonio Rogmann (ZEFc), Universität Bonn
Data Access Control
data may have been costly to create, may not be in the public domain, or may not yet be published, ….
data access control is based on an agreement on data access rules within a (scientific) community of data producers, data users and data providers
This means:
who (users, user groups) is allowed to use (get) which data under which constraints (owner rights, payment)
how to organize the authentication process schematically: user groups with graduated access rights
how to implement the authentication process on a technical level
Data Management: Access Control
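User groups with graduated access rights can be sketched with a small access control list. The level names, group names and dataset identifiers below are all hypothetical; the actual rules have to come from the agreement within the community.

```python
# Hypothetical access levels, ordered from least to most privileged.
LEVELS = {"none": 0, "metadata": 1, "download": 2}

# Hypothetical access control list: dataset -> {user group: granted level}.
ACL = {
    "urn:x-gvp:demo:ds-pd.rainfall": {"partners": "download", "public": "metadata"},
    "urn:x-gvp:demo:ds-pd.unpublished": {"project_staff": "download"},
}


def is_allowed(user_groups, dataset, requested):
    """True if any of the user's groups is granted at least the requested level."""
    granted = ACL.get(dataset, {})
    best = max((LEVELS[granted.get(group, "none")] for group in user_groups),
               default=0)
    return best >= LEVELS[requested]


print(is_allowed(["public"], "urn:x-gvp:demo:ds-pd.rainfall", "download"))
print(is_allowed(["public", "partners"], "urn:x-gvp:demo:ds-pd.rainfall", "download"))
```

Datasets that are missing from the list are closed to everybody, which is the cautious default for unpublished data.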
32 Antonio Rogmann (ZEFc), Universität Bonn
Quality assurance for data
Data quality means: the state of completeness, validity, consistency, timeliness and accuracy that makes data appropriate for a specific use
In a comprehensive view: provided by data management as the overarching concept
Using computing facilities: software-based methods linked with specific scientific disciplines
have to be transparent and comprehensible
should be declared (recommended) within a scientific or administrative network
level of quality must be described within a data file, within the metadata....
Data Management: Quality
33 Antonio Rogmann (ZEFc), Universität Bonn
Quality assurance for data in the GVP
Is done by the scientists within their own discipline, under their own responsibility
Checking with diagrams whether the data are consistent
Comparisons with other data sources
Routine recalibration of instruments
Program limit checks
Basic statistics
Data Management: Quality
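Two of the checks listed above, limit checks and basic statistics, can be sketched in a few lines. The temperature values and the plausible range of 0 to 50 degrees Celsius are invented for illustration; suitable limits depend on the parameter and the discipline.

```python
import statistics


def limit_check(values, lower, upper):
    """Return the indices of values outside the plausible range [lower, upper]."""
    return [i for i, v in enumerate(values) if not lower <= v <= upper]


def basic_statistics(values):
    """Summarize a series so that odd values stand out at a glance."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


# Illustrative air temperatures in degrees Celsius; 99.9 is an implausible reading.
temperatures = [27.5, 28.1, 99.9, 26.8, 27.0]
suspect = limit_check(temperatures, lower=0.0, upper=50.0)
clean = [v for i, v in enumerate(temperatures) if i not in suspect]
stats = basic_statistics(clean)
print(suspect, stats)
```

The check only flags suspect values; whether to delete or correct them remains a decision for the responsible scientist.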
34 Antonio Rogmann (ZEFc), Universität Bonn
Getting benefits from data management requires the effort of all participants
DM needs firm agreements with regard to
standards
selected data users and their capabilities in accessing and using data
technical environment, such as software (interfaces), network protocols, etc.
personal and / or institutional responsibilities within a ....
... data management workflow: data production → quality control → naming, identifying → description → transfer to data host → delivery from data host
DM requires the willingness to invest time and to adhere to the standards!!
Data Management: Challenges
35 Antonio Rogmann (ZEFc), Universität Bonn
Data Management Workflow
The next slides are part of a digital GVP data management workflow manual and documentation
It will be completed and published at the beginning of 2008
It is the background for the next training sessions on web-based data management and (geo)database administration
The workflow manual will be offered in a similar design but in different formats (PDF, HTML), so that it can be delivered or published on the web
It serves as good practice in the GVP, but has to be extended to fit further requirements of the stakeholders on the system - after the GVP!!
36 Antonio Rogmann (ZEFc), Universität Bonn
Data Management Workflow
[Diagram: overview of the seven linked workflow steps (1 to 7), including the transfer]
37 Antonio Rogmann (ZEFc), Universität Bonn
Data Management: Workflow Steps
Step 1: Data Collection
Processes: survey; data logger download; surveying & mapping
Location: field; site
Processor: scientist; planner; data collector
Software / Interfaces: file explorer; download interface; GPS tracking; data processing software; hardcopies
Hardware: data logger; laptop; GPS; ...
Recommendations (in note form)
Take notes in a log book about
measurement device: name, manufacturer, serial number
date: when the data were collected
name of the person who collects the data in the field
what has been done: maintenance, calibration
particularities: could anything special be observed?
GPS measurements and mappings
choose the appropriate coordinate system for the spatial working area
for Ghana: coordinate system WGS 1984 projected in UTM (zones 30/31N); Burkina Faso: 30/31P
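As a small aid for choosing the UTM zone, the zone number can be computed from the longitude, since standard UTM zones are 6 degrees wide and zone 1 starts at 180 degrees west. The sketch ignores the special zones around Norway and Svalbard; the function name and the sample longitudes are assumptions for illustration.

```python
import math


def utm_zone(longitude_deg):
    """Return the UTM zone number (1 to 60) for a longitude in decimal degrees."""
    if not -180.0 <= longitude_deg <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    if longitude_deg == 180.0:
        return 60  # the antimeridian belongs to the last zone
    return int(math.floor((longitude_deg + 180.0) / 6.0)) + 1


# Illustrative longitudes west and east of the Greenwich meridian.
print(utm_zone(-1.5), utm_zone(0.5))
```

Zone 30 covers 6 degrees west to 0 degrees and zone 31 covers 0 to 6 degrees east, so the two zones meet at the Greenwich meridian.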
38 Antonio Rogmann (ZEFc), Universität Bonn
Data Management workflow steps
Step 2: Quality Control
Processes: searching for gaps, outliers, file damage; deleting data errors; filling gaps; documenting
Location: field; site; office
Processor: scientist; data collector
Software / Interfaces: statistical methods (algorithms); data processing programs (e.g. HYDAT)
Hardware: laptop; PC; workstation
Recommendations (in note form)
Data quality assurance
example methods (Julia, Uli)
Documentation
which uncertainties remain
what was done for quality control
specific algorithms and software used
note it in the metadata
note it in table headers
[Chart: Precipitation Sept. 05 - June 06, in mm (axis 0 to 250)]
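Searching for gaps in a time series can be sketched as below. Missing time steps are inserted with a missing-value code instead of being interpolated, so that the gap stays visible and documented. The code -9999.9 follows the data file header template of this talk; the function name and the rainfall values are invented for illustration.

```python
from datetime import date, timedelta

MISSING = -9999.9  # missing-value code, as in the data file header template


def fill_gaps(series, step=timedelta(days=1)):
    """Return (complete_series, gap_dates); absent time steps get MISSING."""
    if not series:
        return [], []
    known = dict(series)
    current, end = min(known), max(known)
    complete, gaps = [], []
    while current <= end:
        if current in known:
            complete.append((current, known[current]))
        else:
            complete.append((current, MISSING))
            gaps.append(current)
        current += step
    return complete, gaps


# Illustrative daily rainfall in mm with two days missing.
rain = [(date(2005, 9, 1), 12.0), (date(2005, 9, 2), 0.0), (date(2005, 9, 5), 3.5)]
complete, gaps = fill_gaps(rain)
print(gaps)
```

The list of gap dates can go straight into the documentation of what was done for quality control.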
39 Antonio Rogmann (ZEFc), Universität Bonn
Data Management Workflow Steps
Step 3: Naming, URN
Processes: designing an appropriate name syntax; naming of resources; creating URNs
Location: office
Processor: scientists; planners; database administrator
Software / Interfaces: file explorer; Internet Explorer; HTML, JavaScript
Hardware: laptop; PC; workstation
To consider (in note form)
data name reflecting
topic of content
spatial and temporal coverage
status of processing (version)
local data sharing (e.g. office with network)
find an agreement about the file name syntax (GVP standard?)
identify resource / data types to define a URN syntax (GVP standard?)
assign a Uniform Resource Name
use the Resource Name Generator
store the URN within the data sets
store the URN in the local data catalogue
store the URN when creating metadata
40 Antonio Rogmann (ZEFc), Universität Bonn
Data Management Workflow Steps
Step 4: organization of data
Processes: designing an appropriate storage structure (directories) on the file system
Location: office
Processor: scientists; planners; network administrator
Software / Interfaces: file explorer / manager
Hardware: laptop; PC; LAN (server)
Recommendations (in note form)
Directory structure
especially important
► when data or resources are shared within an office community
► within small Local Area Networks (LAN)
► within peer-to-peer networks
can be designed around
► the data processing framework (models etc.)
► the project structure (subprojects, project hierarchy)
► the spatial, temporal or thematic content of the datastock (e.g. by regions, themes ...)
should be mirrored on the local drives of all participants of the network, adjusted to the personal focal points of their work
makes it easier to find resources
41 Antonio Rogmann (ZEFc), Universität Bonn
Data Management Workflow Steps
Step 4: organization of data
Processes: insert information about data into a data dictionary
Location: field; site; office
Processor: scientist; planner
Software / Interfaces: Excel; OpenOffice Calc
Hardware: laptop; PC; workstation
Recommendations (in note form)
Data catalogue
a small table file registering your own data, scripts, etc. on local drives
provides an overview and saves time
minimum elements should be:
► Uniform Resource Name (URN)
► Title / Name
► Short Description
► Format
► Storage Location (path)
[Screenshot: example from the GVP]
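Such a catalogue can also be kept as a plain CSV table, which Excel and OpenOffice Calc both open. The sketch below writes entries with the five minimum elements and rejects incomplete ones; the column names, the function name and the sample entry are assumptions for illustration.

```python
import csv
import io

# The five minimum elements named above, as column names (naming is an assumption).
CATALOGUE_FIELDS = ["urn", "title", "short_description", "format", "storage_location"]


def write_catalogue(entries, stream):
    """Write catalogue entries as CSV; every minimum element must be filled in."""
    writer = csv.DictWriter(stream, fieldnames=CATALOGUE_FIELDS)
    writer.writeheader()
    for entry in entries:
        missing = [f for f in CATALOGUE_FIELDS if not entry.get(f)]
        if missing:
            raise ValueError(f"catalogue entry lacks: {', '.join(missing)}")
        writer.writerow(entry)


entries = [{
    "urn": "urn:x-gvp:HD12:ds-pd.waterlevel_gh-kab_020101-020630-v1.0.xls.cd",
    "title": "Water level at Kaburi",
    "short_description": "Illustrative entry for a water level time series",
    "format": "xls",
    "storage_location": "data/hydrology/waterlevel",  # hypothetical path
}]
buffer = io.StringIO()  # stands in for a file opened with newline=""
write_catalogue(entries, buffer)
rows = list(csv.DictReader(io.StringIO(buffer.getvalue())))
print(rows[0]["title"])
```

For a real catalogue, the stream would be a file on the local drive opened in text mode with newline="".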
42 Antonio Rogmann (ZEFc), Universität Bonn
Data Management Workflow Steps
Step 4: organization of data
Processes: insert dataset information directly into or close to the file
Location: field; site; office
Processor: scientist; planner
Software / Interfaces: processing software; file explorer
Hardware: laptop; PC; workstation
Recommendations (in note form)
table header with details on
Uniform Resource Name: ['urn:' <NID> ':' <NSS>]
Data provided by: [surname, first name, email address, institution]
Location: [name of location, UTM coordinates (X, Y)]
Elevation: [m above sea level]
Measuring design: [description of applied methods]
Measurement carried out by: [name, (project, institution)]
Measuring period: [YYYYMMDD - YYYYMMDD, time steps (d/h/s, minutes)]
Missing values: [-9999.9]
Quality: [description of quality assurance methods]
Notes: [remark]
table header with description of the parameters in use
explain the meanings of abbreviations / codes
declare the units used for the parameters, if not self-explanatory
use information from the data collection log book
On the slide, metadata elements are set in red font; if a metadata file has already been created, they are not necessary in the table header
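A sketch of how such a header could be written in front of the data in a text file. The comment prefix "# ", the column names and the data values are assumptions for illustration; the bracketed placeholders are kept where the slide gives no concrete value.

```python
import csv
import io


def format_header(fields, prefix="# "):
    """Render header fields as commented 'key: value' lines."""
    return "\n".join(f"{prefix}{key}: {value}" for key, value in fields.items())


def write_data_file(stream, fields, columns, rows):
    """Write the descriptive header, then the column names and the data rows."""
    stream.write(format_header(fields) + "\n")
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)


header = {
    "Uniform Resource Name": "urn:x-gvp:HD12:ds-pd.waterlevel_gh-kab_020101-020630-v1.0.xls.cd",
    "Data provided by": "[surname, first name, email address, institution]",
    "Location": "[name of location, UTM coordinates (X, Y)]",
    "Missing values": "-9999.9",
    "Quality": "[description of quality assurance methods]",
}
buffer = io.StringIO()
# Illustrative rows: date as YYYYMMDD and water level in metres.
write_data_file(buffer, header, ["date", "water_level_m"],
                [["20020101", 1.25], ["20020102", -9999.9]])
text = buffer.getvalue()
print(text)
```

Putting the unit into the column name (water_level_m) is one simple way to declare units close to the values.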
43 Antonio Rogmann (ZEFc), Universität Bonn
Data Management Workflow Steps
Data file header: example
[Screenshot: example of a data file header]
44 Antonio Rogmann (ZEFc), Universität Bonn
Data Management workflow steps
Step 5: description, create metadata
Processes: description of data / resources following the metadata standard
Location: office
Processor: data producer
Software / Interfaces: internet browser; HTML, JavaScript
Hardware: laptop; PC; workstation
To do (in note form)
Metadata
at the latest when data are going to be published, they should be described by entering metadata
use the internet browser interface (as described here) for entering metadata
try to fill in as many elements as possible
the accurate use of keywords in the element “subject and keywords” is very important
most queries to metadata address “subject and keywords” as well as “spatial coverage”
45 Antonio Rogmann (ZEFc), Universität Bonn
Data Management Workflow Steps
Step 5: create metadata
Processes: description of data / resources following the metadata standard
Location: office
Processor: data producer
Software / Interfaces: internet browser; HTML, JavaScript
Hardware: PC; workstation
To consider (in note form)
Metadata
don't forget to give access information about the data / resource
► current location: where the resource can be retrieved
► access modalities (costs, user rights, technical way of retrieving, etc.)
► if data are not transmitted to the central host: local contact person
Metadata storage files
if direct input into the metadatabase is not possible (no internet connection), XML metadata files are to be sent to the administrator of the central metadatabase, e.g. on CD by postal service
Data and metadata
metadata only have to be created if the resources are intended for further use by others
46 Antonio Rogmann (ZEFc), Universität Bonn
Data Management Workflow Steps
Step 6: (preparing to) transfer
Processes: decision making on publishing data; access constraints (users); transmission to the central database
Location: collective institution; local offices
Processor: data user framework; database administrators
Software / Interfaces
Hardware
To do (in note form)
Make a decision
whether datasets or resources (e.g. software, models) should be shared
who (persons, institutions, partners) should have access to the data
whether there should be a payment for datasets
where the accessible datasets should be stored: locally or on a central server
who is the person responsible for controlling the transmission to a central database. This person in charge has to check whether
► the resources / datasets meet the data management standard of the community
► in particular, the data have proper metadata including a clear definition of use rights (provide the database administrator with a list of potential user groups)
47 Antonio Rogmann (ZEFc), Universität Bonn
Data Management Workflow Steps
Step 7: transfer
Processes: formatting; upload to central database
Location: local office; central database host
Processor: data producer; central database administrator
Software / Interfaces: data processing software; HTML, JavaScript; SSH (e.g. WinSCP)
Hardware: PC; server
To do (in note form)
Preparing the transfer
reformat the data sets, if required
inform the central database administrator
► which datasets are going to be uploaded to the central database and why
► that metadata are entered directly into the metadatabase using the web interface
► that metadata files are transmitted together with the datasets
Do the transfer
upload the data to a “transfer” directory on the main server
use upload software based on FTP (File Transfer Protocol) or SFTP (SSH File Transfer Protocol), if available
the GVP uses SFTP for transferring data to the data server
if upload is not possible because of a slow internet connection, send the data by postal service on CD / DVD
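Before uploading, it can be checked locally that every dataset is accompanied by its metadata file. The sketch assumes a hypothetical convention in which the metadata file carries the data file's name plus the suffix ".dc.xml"; the function name and the file names are illustrative, and the upload itself is left to the SFTP client.

```python
import tempfile
from pathlib import Path

METADATA_SUFFIX = ".dc.xml"  # hypothetical naming convention for metadata files


def missing_metadata(transfer_dir):
    """Return the names of data files that have no accompanying metadata file."""
    folder = Path(transfer_dir)
    files = [p for p in folder.iterdir() if p.is_file()]
    data_files = [p for p in files if not p.name.endswith(METADATA_SUFFIX)]
    return sorted(p.name for p in data_files
                  if not (folder / (p.name + METADATA_SUFFIX)).exists())


# Demonstration with throwaway files: a.csv is described, b.csv is not.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.csv", "a.csv.dc.xml", "b.csv"):
        (Path(tmp) / name).write_text("", encoding="utf-8")
    not_described = missing_metadata(tmp)

print(not_described)
```

An empty result means that the directory is ready to be handed over to the person controlling the transmission.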
48 Antonio Rogmann (ZEFc), Universität Bonn
GVP Data Infrastructure
[Architecture diagram, labels as extracted:
zones: internet zone; intranet zone
servers: data server (+RAID); web server (VM)
storage: file system (Samba); ESRI geodatabase; MySQL/Postgres (meta DB, portal DB)
web components: GLOWA Volta homepage; Apache; PHP (CGI); MapServer; Mapbender incl. PostgreSQL; Catalog Manager incl. phpMyAdmin; Tomcat (JSP / Java); portal
clients: ESRI ArcGIS clients; Java-based client (COBIDS); metadata interface (JavaScript, xml/xsl) with local/offline file Meta.dc.xml that describes a file
connections: SMB; JDBC:1521; ADODB:1521; CGI; PHP, DOM; request to download]
49 Antonio Rogmann (ZEFc), Universität Bonn
Don't be alarmed: that is the technical side. Let's look at it from the user's side
GVP-Data Infrastructure
50 Antonio Rogmann (ZEFc), Universität Bonn
Harvesting the fruits:
a centrally hosted database
giving access to the GVP datastock
with the option to extend the datastock with your own data
a centrally hosted metadatabase giving
answers about data needed
references about data providers
a geoportal informing
about projects related to water management in the Volta Basin
and their data: in a spatial visualization
[Diagram labels: Organization (host); GVP data; Service Department (data); Institution (data); Data user; Web; Data Server; Map Server; Metadata; Geoportal]
GVP-Data Infrastructure
51 Antonio Rogmann (ZEFc), Universität Bonn
[Diagram labels: GVP data; Service Department (data); Institution (data); Data user; Web; Data Server; Map Server; Metadata; Geoportal; user interfaces; databases]
A look behind the scenes:
user interfaces
Geoportal
internet browser with database interfaces
databases
Portal database
Metadatabase
File system
Geodatabase
web technologies
not today's topic!
GVP-Data Infrastructure
52 Antonio Rogmann (ZEFc), Universität Bonn
Approach 1 to get data: via the Geoportal
1. The Geoportal searches in the metadatabase
2. Catalogue Manager software on the server provides a result list
3. Geodata found can be requested as ...
4. ... interactive maps, provided to the internet browser as a WebMapService generated by UMN MapServer
5. ... or as a download of the original data files (also other data), if allowed
[Diagram: steps 1 to 5 between Metadata, Data Server and GVP data, with an access control list (ACL)]
Data Infrastructure: Web-User‘s view
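A WebMapService delivers maps in response to a GetMap request. The sketch below assembles such a request URL with the standard WMS 1.1.1 parameters; the server address, the layer name and the bounding box are hypothetical, and a particular MapServer installation may need further parameters.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def getmap_url(base_url, layers, bbox, width, height,
               srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL; bbox is (min_x, min_y, max_x, max_y)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value requests the default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical server and layer; the bounding box is a rough illustrative extent.
url = getmap_url("http://example.org/cgi-bin/mapserv", ["rivers"],
                 (-6.0, 4.0, 3.0, 15.0), 600, 800)
query = parse_qs(urlparse(url).query, keep_blank_values=True)
print(url)
```

A browser or a client such as Mapbender sends requests of this kind each time the user zooms or pans.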
53 Antonio Rogmann (ZEFc), Universität Bonn
Approach 2 to get data: via the Resource / Data Catalogue
1. The internet browser interface (on the homepage) searches in the metadatabase
2. Catalogue Manager software on the server provides a result list
3. Geodata found can be requested as ...
4. ... interactive maps, provided to the internet browser as a WebMapService generated by UMN MapServer
5. ... or as a download of the original data files (also other data), if allowed
[Diagram: steps 1 to 5 between Metadata, Data Server and GVP data, with an access control list (ACL)]
Data Infrastructure: Web-User‘s View
54 Antonio Rogmann (ZEFc), Universität Bonn
Data Infrastructure: Geoportal
[Screenshot labels: Layer; Overview; Map Tools: zoom, pan, select; (selected) Feature Info: attribute table with properties and links to documents / graphics / web addresses]
Geoportal components
WebMapService is an OGC standard
Mapbender: free client software for map servers (server-side)
UMN MapServer: free and widely used map server software
Geoportal interface software: currently under development by J. Laubach (Institute for Computer Science III, University of Bonn)
55 Antonio Rogmann (ZEFc), Universität Bonn
Data Infrastructure: Interfaces
[Diagram labels: Data Server (Linux); Geodatabase (ESRI); File System (Samba); web; intranet; GIS client; internet browser; file explorer; SSH client (authorized direct access only)]
56 Antonio Rogmann (ZEFc), Universität Bonn
GVP Data Infrastructure
Intranet view of the data server file system
[Diagram labels: User Group 1; User Group 2; User System; Data user (four times)]
57 Antonio Rogmann (ZEFc), Universität Bonn
GVP Data Infrastructure: Geodatabase
"Geodatabase" = geodatabase format from ESRI
Relational database
all data (entities = objects = layers) organized in tables
tables can be related to each other
► by using keys
► based on cardinality (1:1, 1:n, m:n relationships)
Common GIS formats (shape, ArcInfo coverage, ...) are organized in several single files representing an object class
► for geometry
► for attributes
► for linkage geometry <--> attributes
► etc.
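How tables are related by keys can be shown with a tiny relational example. SQLite is used here only because it ships with Python; it is not part of the GVP infrastructure, and the tables, columns and values are invented for illustration. One station has many measurements (a 1:n relationship).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # let SQLite enforce the key relation
conn.executescript("""
    CREATE TABLE station (
        station_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE measurement (
        measurement_id INTEGER PRIMARY KEY,
        station_id     INTEGER NOT NULL REFERENCES station(station_id),
        measured_on    TEXT NOT NULL,
        water_level_m  REAL
    );
""")
conn.execute("INSERT INTO station VALUES (1, 'Kaburi')")
conn.executemany(
    "INSERT INTO measurement (station_id, measured_on, water_level_m) VALUES (?, ?, ?)",
    [(1, "2002-01-01", 1.25), (1, "2002-01-02", 1.30)],  # illustrative values
)

# Join the two tables through the key to count measurements per station.
rows = conn.execute("""
    SELECT s.name, COUNT(*)
    FROM station AS s JOIN measurement AS m ON m.station_id = s.station_id
    GROUP BY s.name
""").fetchall()

# A measurement pointing to a station that does not exist is rejected.
try:
    conn.execute("INSERT INTO measurement (station_id, measured_on) VALUES (99, '2002-01-03')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rows, rejected)
```

The rejected insert shows the kind of integrity rule that single, unlinked files cannot enforce.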
58 Antonio Rogmann (ZEFc), Universität Bonn
A relational database is managed by a database management system e.g. MS Access, DB2, Oracle, MySQL
An ESRI-Geodatabase is managed by the ArcGIS application „ArcCatalog“
GVP Data Infrastructure: Geodatabase
59 Antonio Rogmann (ZEFc), Universität Bonn
A geodatabase provides comprehensive facilities for
storage of a collection of different geodata types in a central place
application of sophisticated relationships and rules to the data
modeling of complex spatial behaviour (topology, geometric networks ...)
maintenance of data integrity
easy scaling of the data storage
defining custom objects
GVP Data Infrastructure: Geodatabase
60 Antonio Rogmann (ZEFc), Universität Bonn
(Relational) Geodatabase in the GVP
only prototypes
format still not used in the GVP
establishing of a geodatabase in the GVP still under discussion
advantages of geodatabase
well organized geodata
high information level (modeling)
database facilities to safeguard data integrity (quality)
disadvantages of geodatabase
licences for upgrading ArcView Client Software (costs)
high effort for creating the database
GVP Data Infrastructure: Geodatabase
61 Antonio Rogmann (ZEFc), Universität Bonn
Alternatives for geodata storages in the GVP
Geodata in common GIS formats (shape, ...) within a file system, as is done now
Open source (free) geodatabases (PostgreSQL/PostGIS)
no licence costs for installation, but costs for support
not easy to administer
poor connectivity between ESRI software and Postgres
but the use of open source and free geodatabases should be considered in the further development of
the GLOWA Volta Project and its partners
GVP Data Infrastructure: Geodatabase