Post on 11-Jan-2016
1. GLOWA-Volta Database Workshop
September 5, 2006, ZEF Uni Bonn
Antonio Rogmann (ZEFc)
Welcome to the
1st GLOWA-Volta Database Workshop
Agenda
• Aims of the workshop
• Deficits in the data stocks and data management of the GVP
• Data management
• Lifecycle of data
• Conclusions for the GVP
• Need to integrate the data users into database development
• Role of the disciplines in data management
• Steps towards optimized data management
Aims of the workshop
• Initiation of a dialogue with the GVP members about their requirements for efficient data management
This dialogue is a process in which the following items should be discussed:
• data
• use and access
• database structure
• metadatabase
• web presence
• database team and division of work
Aims of the workshop
These points should be discussed within the working groups as far as possible. In this workshop we are focusing on the items
- data (data flow)
- data use and access
- setup of a "database team" and division of work
The technical implementation, the structure and type of the databases, and the ways of access should be developed by a team made up of members of the departments, computer scientists and project leaders!
Current deficits in the data stocks and data management of the GVP
The current situation
Data server
• data stock is incomplete
• searching for data by criteria is not possible
• arrangement of data is unclear
• relation to the project is unclear
• there are no rules for data uploading (location, topic etc.)
The current situation
Data media
• what is their content?
• to which project/thesis do they belong?
The current situation
Metadatabase
• representation of the data stock is incomplete
The current situation
Metadatabase
• if you are looking for data, you have to ask your colleagues inside and outside of ZEF!
• the contact person may not be available
The current situation
Metadatabase
• dead links
The current situation
Datasets
• inconsistency
The current situation
Datasets
• lack of data description
  • which methods were used?
  • are the values correct?
Data management
• Data management is necessary to avoid such problems
• Normally, the data management processes should be implemented in a project from its start!
• Definition (by the "Data Management Association"):
"Data Resource Management is the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise"
Lifecycle of Data and Aspects of its Management
Procurement
- Own investigation
- Own processing
- Supply from other institutions/projects
Structuring (data modeling) and Storing
- Categorisation
- Sorting
- Description
Administration
- User access rights
- Security
Use and Processing
- Data processing
- Content management
- Quality assurance
- Data preparation (for others)
Distribution
- Access
- Delivery
Disposal
- Update or
- Erasure
Lifecycle of Data: Procurement of Data
can come from
• own investigations
• other institutions
• other (sub-)projects within the main project
serves
• to provide the operating processes with input data
needs
• defined data sources and formats
• quality
• application interfaces (import)
Lifecycle of Data: Structuring and Storing
means
• sorting data according to a classification schema
  • by themes
  • by projects/subprojects
  • by formats
  • by applications
  • by spatial research area
  • ...
Lifecycle of Data: Structuring and Storing
and/or
• by a conceptual data model
  • it contains the data entities and their relationships within the scope of a system
  • the entities have properties (attributes)
  • it is independent of the storage in a database and of other technical requirements
  • it can be designed in different forms (relational, network, hierarchical)
  • the target system for data storage can be a relational database as well as a file system
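As a minimal sketch of what a conceptual data model records, the following Python snippet describes entities, their attributes and the relationships between them without committing to any storage technology. The entity names (`Project`, `Dataset`), their attributes and the relationship label are hypothetical examples and are not taken from the GVP.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A data entity with its properties (attributes)."""
    name: str
    attributes: tuple


@dataclass(frozen=True)
class Relationship:
    """A named, directed relationship between two entities."""
    source: str
    target: str
    label: str


@dataclass
class ConceptualModel:
    entities: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_entity(self, name, *attributes):
        self.entities[name] = Entity(name, tuple(attributes))

    def relate(self, source, label, target):
        # Both ends must be known entities of the model.
        for end in (source, target):
            if end not in self.entities:
                raise KeyError(f"unknown entity: {end}")
        self.relationships.append(Relationship(source, target, label))


# Hypothetical example content
model = ConceptualModel()
model.add_entity("Project", "name", "phase")
model.add_entity("Dataset", "title", "theme", "format")
model.relate("Project", "produces", "Dataset")
```

Nothing here says whether the data end up in a relational database or a file system; that decision belongs to the physical data model.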
Lifecycle of Data: Structuring and Storing
serves
• easy searching, finding and use of data
needs
• consensus among data producers and users within an organization about
  • the conceptual data model
  • which data are needed and which are not
  • rules for data updating and archival storage
  • standards for metadata content
• control of compliance with the structure criteria
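One way to picture the control of compliance with an agreed metadata standard is a simple completeness check, sketched below. The list of required fields is an assumption made for illustration; the actual standard would have to be agreed among data producers and users.

```python
# Hypothetical metadata standard: fields every dataset description must fill in.
REQUIRED_FIELDS = ("title", "project", "theme", "format", "contact")


def missing_metadata(record):
    """Return the required fields that are absent or empty in a metadata record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Hypothetical record: no theme given, contact left empty.
record = {"title": "Household survey", "project": "E3", "format": "SPSS", "contact": ""}
print(missing_metadata(record))  # ['theme', 'contact']
```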
Lifecycle of Data: Structuring and Storing
means
• the physical storage of data
needs
• storage locations for the databases (central/distributed)
• a physical data model
  • derived from the conceptual/logical data model
  • takes into account the facilities and constraints of a given database management system
• a database management system with
  • interfaces for applications
  • query and search services
  • backup and security functions
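To illustrate how a physical data model can be derived for one particular database management system, here is a small sketch using SQLite through Python's standard `sqlite3` module. The choice of SQLite, the table and column names and the sample rows are all assumptions for illustration only.

```python
import sqlite3

# In-memory database; a real setup would use a central or distributed server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE dataset (
    id         INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    theme      TEXT NOT NULL,
    format     TEXT NOT NULL,
    project_id INTEGER NOT NULL REFERENCES project(id)
);
-- Index supporting searches by theme (query and search service)
CREATE INDEX idx_dataset_theme ON dataset(theme);
""")

# Hypothetical sample content
conn.execute("INSERT INTO project (id, name) VALUES (1, 'E1')")
conn.executemany(
    "INSERT INTO dataset (title, theme, format, project_id) VALUES (?, ?, ?, ?)",
    [("Land cover map", "land use", "GeoTIFF", 1),
     ("Household survey", "socio-economics", "SPSS", 1)],
)

# Search by criteria: theme and project
rows = conn.execute(
    "SELECT d.title FROM dataset d JOIN project p ON p.id = d.project_id "
    "WHERE d.theme = ? AND p.name = ?",
    ("land use", "E1"),
).fetchall()
print(rows)  # [('Land cover map',)]
```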
Lifecycle of data: Administration
means
• on the technical side
  • installation and maintenance of the database system (database + database management system)
  • user access constraints (rights)
  • backup and archiving tasks
  • security
  • performance
• on the content side
  • integrity: verifying or helping to verify
  • control of data delivery
  • control of data input
  • metadata
Lifecycle of data: Administration
needs
• cooperation between data producers/users and administrators for
  • maintaining and upgrading the database (schema)
  • defining the authorization concept for database access (read only, read/write, database schema modification etc.)
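An authorization concept with the three access levels named above could be sketched as follows. The role names and their assignments are hypothetical; in practice they would be defined jointly by data producers/users and administrators.

```python
from enum import Flag, auto


class Permission(Flag):
    READ = auto()
    WRITE = auto()
    MODIFY_SCHEMA = auto()


# Hypothetical roles and the permissions granted to each.
ROLES = {
    "guest": Permission.READ,
    "project_member": Permission.READ | Permission.WRITE,
    "administrator": Permission.READ | Permission.WRITE | Permission.MODIFY_SCHEMA,
}


def is_allowed(role, needed):
    """True if the role holds every permission contained in `needed`."""
    granted = ROLES.get(role, Permission(0))  # unknown roles get no rights
    return (granted & needed) == needed


print(is_allowed("project_member", Permission.WRITE))  # True
print(is_allowed("guest", Permission.WRITE))           # False
```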
Lifecycle of data: Use and Processing
means
• use of data for analysis
• processing of data inside and outside of models
• production of new or modified (output) data
• control of data accuracy
• preparation of data for other processes/projects
Lifecycle of data: Distribution
means
• delivery of data
  • inside an organization/project
    • by storing in a database (access by the receiving party)
    • by transfer on portable media
    • by publishing the metadata
  • outside an institution/project
    • by direct access to a database
    • Web services
    • publishing the metadata
    • data extract service from a database
    • data downloads
    • map services (geodata)
Lifecycle of data: Distribution
serves
• inside an organization/project
  • to provide work processes with adjusted data
• outside an organization/project
  • to provide work processes with adjusted data
  • to provide data for public information about the projects
needs
• knowledge of the requirements on the demand side concerning
  • further use of data
  • formats
  • clients
  • ...
Lifecycle of data: Disposal
means
• updating the data
• selecting and deleting or archiving data that are
  • out of date
  • no longer in use
serves
• to prevent data overflow in the databases
• to maintain the quality of the data
needs
• cooperation between the data producers/users and the database administrator
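The selection step of disposal can be sketched as a filter that flags datasets which are out of date or in disuse, leaving the final decision (update, archive or delete) to producers/users and the administrator. The thresholds, field names and sample datasets below are assumptions for illustration.

```python
from datetime import date, timedelta


def select_for_disposal(datasets, today, max_age_days=365 * 5, max_idle_days=365 * 2):
    """Split dataset names into those to keep and those to review.

    A dataset is flagged for review when it is out of date (not updated for
    more than `max_age_days`) or in disuse (not accessed for more than
    `max_idle_days`). Both thresholds are assumed values.
    """
    keep, review = [], []
    for ds in datasets:
        out_of_date = today - ds["last_updated"] > timedelta(days=max_age_days)
        in_disuse = today - ds["last_accessed"] > timedelta(days=max_idle_days)
        (review if out_of_date or in_disuse else keep).append(ds["name"])
    return keep, review


# Hypothetical datasets
datasets = [
    {"name": "rainfall_2005", "last_updated": date(2005, 12, 31), "last_accessed": date(2006, 8, 1)},
    {"name": "soil_draft_1999", "last_updated": date(1999, 6, 1), "last_accessed": date(2001, 1, 15)},
]
keep, review = select_for_disposal(datasets, today=date(2006, 9, 5))
print(keep, review)  # ['rainfall_2005'] ['soil_draft_1999']
```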
Conclusions for the GVP
• Conditions
  • The GVP is divided into a range of projects and subprojects
    • e.g. in Phase II "Land Use" with the subprojects L1, L2 etc.
    • e.g. in Phase III "Analysis of Long-Term Environmental Change" with the subprojects E1, E2 etc.
    • with their own processing, models, input and output data (formats), data flows and data storage
    • with specific integrations and dependencies among each other and within "use case" frameworks
  • Projects and their models are also supplied with data from different scientific disciplines such as hydrology, pedology, social economy, ecology etc.
Conclusions for the GVP
• Conditions
  • in Phase III the main objective is the "Integration of Phase I and II research results, knowledge, data and tools"*
  • in Phase III the DSS will be realized as the GVP's primary output
The subprojects are connected by data flows (transfer).
The data flow should be adjusted to the GVP and DSS requirements. This means there must be transparent management that is centralized and standardized.
* GVP Phase III Proposal, p. 8
Need to integrate the data users during development and setup of GVP data management
• Each researcher (or, on a higher level, each project) is a kind of data manager in his own work space. He has
  • his own (local) database
  • his own input and output data and data procurement requirements
  • his own usage and processing
  • his own distribution of data (to other users/projects)
  • and therefore his own (short) data lifecycle
• and is integrated into the data flow between the projects and into their data lifecycle
[Diagram: Project 1, Project 2, Project 3 and Project 4 connected by data flows; Central Database with data flows]
Role of the disciplines in developing data management concepts
Project members ...
• have to decide, together with other project members and the database developers, which data should be stored centrally in order to share them, and which can be stored locally or elsewhere
• have to decide which storage structure is most convenient for optimal use
• have to provide information about their data (create metadata)
and
• are responsible for the data management in their own work area, before it is coordinated across disciplines by the database administrator
Role of the disciplines in developing data management concepts
Developers of a database ...
• are responsible for advising the project members on the requirements of data management
• have to organize the data flow with regard to the (technical) way data are stored and accessed. These activities must be adjusted to the operating processes/projects and their interfaces
• have to develop the data management standards together with the project members
Steps towards optimized data management
Step 1: analyze the data stock (data dictionary)
Step 2: analyze the data flows
Step 3: develop the logical data model for data storage
(within this workshop)
My request to you
Basis for working groups
Data flow model combined with data dictionary
Notation:
• Terminator: data producers (data sources) or users (data sinks) outside the system (external partners, public)
• Process: transformation of input data into output data, e.g. by algorithms
• Data flow (arrow labeled "a"): direction of flow for dataset "a" from the data dictionary
• Data flow (two-way arrow labeled "a"): relay in two directions (between processes)
• Data storage unit (labeled "A"): a data pool (not local) whose time of creation differs from its time of use; "A" refers to the data dictionary
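For the working groups, the combination of a data flow model and a data dictionary can also be written down in a simple machine-readable form. The sketch below is one possible representation, not a prescribed format: every flow must carry a dataset label that exists in the dictionary and must connect known nodes. The node names and dictionary entries are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataFlowModel:
    terminators: set = field(default_factory=set)   # sources/sinks outside the system
    processes: set = field(default_factory=set)     # turn input data into output data
    stores: set = field(default_factory=set)        # shared (non-local) data pools
    dictionary: dict = field(default_factory=dict)  # dataset label -> description
    flows: list = field(default_factory=list)       # (label, source, target)

    def nodes(self):
        return self.terminators | self.processes | self.stores

    def add_flow(self, label, source, target):
        if label not in self.dictionary:
            raise KeyError(f"dataset {label!r} is missing from the data dictionary")
        for node in (source, target):
            if node not in self.nodes():
                raise KeyError(f"unknown node: {node}")
        self.flows.append((label, source, target))

    def inputs_of(self, node):
        """Dataset labels flowing into the given node."""
        return sorted(label for label, _, target in self.flows if target == node)


# Hypothetical example
dfd = DataFlowModel()
dfd.terminators.add("External Partner")
dfd.processes.add("Classification")
dfd.stores.add("A")
dfd.dictionary.update({"a": "raw satellite scenes", "b": "classified land-cover maps"})
dfd.add_flow("a", "External Partner", "Classification")
dfd.add_flow("b", "Classification", "A")
print(dfd.inputs_of("Classification"))  # ['a']
```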
Basis for working groups
Context diagram
[Diagram elements: GLOWA-Volta; External Partner (twice); Decision Makers; Public]
Basis for working groups
Diagram 1: GLOWA-Volta
[Diagram elements: Water Supply and Distribution; Water Demand and Management; Analysis of Long-Term Env. Change; DSS; External Partner (twice)]
Basis for working groups
Diagram 2: Analysis of Long-Term Environmental Change
[Diagram elements: Automated Classification of Remotely Sensed Imagery (E1); Cellular automata (E2); GVP LUDAS (E3); Vendor of remote sensing data; Land-use Change Predictions and LU Policy; S 1]
Basis for working groups
Diagram 3: GVP-LUDAS
[Diagram elements: GVP-LUDAS; Evaluation of Elicitation Results (Household Survey); Elicitation Ghana; E 1; E 4; A; data flows a, b, c, d, e, f, g; working group: natural scientists; working group: social economists]
To Do
In the afternoon I would like to discuss the requirements for a data management system from your point of view.
• Please try to draw a general overview of the data flows and stocks
• And relate data management options to particular data flows or storages
Take it all as a form of brainstorming!!
Thank you!!
Central Database
How to organize (sort) the data in the database?
Project 1:
- theme 1
  - format 1
  - format 2
- theme 2 ...
Formats
- SPSS
  - project 1
  - project 2
- remote sensing ...
Region 1
- Project
  - subproject
    - theme
      - format
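The alternatives above differ only in the order in which the classification criteria are nested. A small sketch makes this concrete by building the storage path of the same dataset under different orderings. The dataset descriptions and file names are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical dataset descriptions
datasets = [
    {"file": "survey.sav", "region": "Region 1", "project": "Project 1",
     "theme": "theme 1", "format": "SPSS"},
    {"file": "scene.tif", "region": "Region 1", "project": "Project 2",
     "theme": "theme 2", "format": "remote sensing"},
]


def storage_path(dataset, order):
    """Build a storage path by nesting the classification criteria in the given order."""
    parts = [dataset[key] for key in order]
    return PurePosixPath(*parts, dataset["file"])


by_project = [str(storage_path(d, ("project", "theme", "format"))) for d in datasets]
by_format = [str(storage_path(d, ("format", "project"))) for d in datasets]
print(by_project[0])  # Project 1/theme 1/SPSS/survey.sav
print(by_format[0])   # SPSS/Project 1/survey.sav
```

Whichever ordering is chosen, the same criteria need to be recorded for every dataset, so the choice mainly affects how users browse the data.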
Basis for working groups
Data flow model combined with data dictionary
Notation II:
• Data flow: dataset "a" originates from datasets "b" and "c"
• Data flow: relay in two directions (between processes)
• Data flow: division of dataset "a" into datasets "b" and "c"
• Data flow: updating of data in a storage