Lecture 5: GIS Data Management - WordPress.com1 Lecture 5: GIS Data Management GE 118: INTRODUCTION...


Lecture 5:

GIS Data Management

GE 118: INTRODUCTION TO GIS

Engr. Meriam M. Santillan

Caraga State University


File Structures

(File-based datasets)

Simple list

Ordered sequential files

Indexed files


Simple List

Simplest file structure

Unordered/unstructured

Arrangement is by whichever comes first


Ordered Sequential Files

Simple lists that are arranged according to some order (e.g., alphabetical order)
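
As a minimal sketch of why ordering matters, the Python snippet below scans an unordered simple list record by record, then runs a binary search on a sorted copy of the same records. The place names and both helper functions are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left

# Hypothetical records, stored in arrival order (a "simple list").
simple_list = ["Surigao", "Butuan", "Tandag", "Bayugan", "Cabadbaran"]

def linear_search(records, target):
    """Scan every record in turn; returns (position, comparisons made)."""
    for i, rec in enumerate(records):
        if rec == target:
            return i, i + 1
    return -1, len(records)

def binary_search(sorted_records, target):
    """Halve the search range at each step; requires sorted input."""
    i = bisect_left(sorted_records, target)
    if i < len(sorted_records) and sorted_records[i] == target:
        return i
    return -1

# An ordered sequential file: the same records in alphabetical order.
ordered_file = sorted(simple_list)

pos, comparisons = linear_search(simple_list, "Cabadbaran")
ordered_pos = binary_search(ordered_file, "Cabadbaran")
```

In the simple list the target happens to be the last record, so every record is compared; the ordered file lets the search discard half of the remaining records at each step.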


Indexed Files

An index to the directory is needed for more efficient searches that find entries matching given criteria

Can be developed as direct files or inverted files


Direct Indexed Files

The records themselves are used to provide access to other pertinent information


Inverted (Indirect) Indexed Files

Index is based on possible search criteria, not on the entities themselves

Attributes are the primary search criteria, and the entities rely on them for selection
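
The two kinds of index can be sketched with plain Python dictionaries. This is only an illustration under assumed data: the parcel records, attribute names, and the `attributes_of` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical parcel records: entity id -> attributes.
parcels = {
    101: {"land_use": "residential", "soil": "clay"},
    102: {"land_use": "agricultural", "soil": "loam"},
    103: {"land_use": "residential", "soil": "loam"},
}

# Direct access: look up an entity by its own id, then read its attributes.
def attributes_of(parcel_id):
    return parcels[parcel_id]

# Inverted index: built from the attributes, so each (attribute, value)
# pair points back to the entities that have it.
inverted = defaultdict(set)
for parcel_id, attrs in parcels.items():
    for name, value in attrs.items():
        inverted[(name, value)].add(parcel_id)

# Searching by criteria becomes a lookup plus a set intersection.
residential = inverted[("land_use", "residential")]
residential_on_loam = residential & inverted[("soil", "loam")]
```

With the inverted index, a query such as "residential parcels on loam" never has to visit the parcel records themselves until the matching ids are known.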


Database

An integrated set of data on a particular subject

Collection of interrelated data stored together with controlled redundancy to serve one or more applications in an optimal fashion

Requires a more elaborate structure, called a database structure or database management system


Significance of Database

Most GIS activities consist of storing entity and attribute data so that we can retrieve any combination of these objects.

Each graphical feature must be stored explicitly with its attributes so that their combined search becomes faster.


Advantages of Database over File-based Datasets

Collecting data at a single location reduces redundancy and duplication

Lower maintenance cost due to better organization and decreased data duplication

Multiple applications can use the same data and can evolve separately over time


User knowledge can be transferred between applications more easily because the database remains constant

Data sharing is facilitated, with a corporate view of the data provided to managers and users

Security and standards for data and data access can be established and enforced


Database Management System

A software application designed to organize the efficient and effective storage of, and access to, data

A suite of software programs designed to store, retrieve and manipulate data within a database


Types of Database Structure

1. Hierarchical Data Structures

2. Network Systems

3. Relational Database Structures


Hierarchical Data Structure

‘One-to-many’ or ‘parent-child’ relationship

Implies that each element has a direct relationship to a number of symbolic children

Each child is capable of having the same direct relationship with its own offspring, and so on.
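
A parent-child structure can be sketched as nested dictionaries, where each key is a parent and its value holds the children. The region/province/city hierarchy and the `find_path` helper below are hypothetical illustrations, not part of the lecture material.

```python
# Hypothetical hierarchy: region -> province -> city.
tree = {
    "Caraga": {
        "Agusan del Norte": {"Butuan": {}, "Cabadbaran": {}},
        "Surigao del Sur": {"Tandag": {}, "Bislig": {}},
    }
}

def find_path(node, target, path=()):
    """Depth-first search from the root; returns the branch leading to target."""
    for name, children in node.items():
        here = path + (name,)
        if name == target:
            return here
        found = find_path(children, target, here)
        if found:
            return found
    return None

path_to_tandag = find_path(tree, "Tandag")
```

Every lookup starts at the root and follows one branch downward, which is why access along the hierarchy is simple while queries that cut across branches are awkward.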


Hierarchical Data Structure

Advantages:

Simple and straightforward data access since parent and children are directly linked

Easy to search since structure is well defined

Relatively easy to expand by adding new branches and formulating new decision rules


Hierarchical Data Structure

Disadvantages:

Confined to queries along one branch only

Difficult to restructure to allow other possible search criteria

Creates large index files

Redundant entries for searching


Network Systems

‘many-to-many’ relationship

Each data item can be linked directly to any other item in the database using pointers, without requiring a parent-child relationship.
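
As a rough sketch of pointer-based linkage, the snippet below stores a many-to-many relationship between roads and cities and builds the reverse pointers so that records can be reached from either side. The road and city names are hypothetical.

```python
# Hypothetical many-to-many data: a road passes through several cities,
# and a city is served by several roads.
road_to_cities = {
    "Road A": ["Butuan", "Cabadbaran"],
    "Road B": ["Butuan", "Bayugan"],
}

# Build the reverse pointers explicitly; nothing links the records
# unless a pointer is defined for it.
city_to_roads = {}
for road, cities in road_to_cities.items():
    for city in cities:
        city_to_roads.setdefault(city, []).append(road)

roads_through_butuan = city_to_roads["Butuan"]

# Each linkage costs one stored pointer in each direction.
pointer_count = sum(len(v) for v in road_to_cities.values()) + \
                sum(len(v) for v in city_to_roads.values())
```

Even this tiny example needs eight pointers for four linkages, which hints at how storage and bookkeeping grow as the number of linkages increases.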


Network Systems

Advantages:

Less rigid compared to hierarchical structure

Can handle many-to-many relationships

Allows much greater flexibility

Reduced redundancy of data


Network Systems

Disadvantages:

In very complex GIS, the number of pointers can become large, thus requiring a lot of storage space

Linkages between data must still be explicitly defined using pointers

Numerous possible linkages can become extremely tangled, resulting in confusion and incorrect linkages

Not recommended for novice users


Relational Database Management Systems (RDBMS)

Data are stored as ordered records or rows of attribute values called tuples

Tuples are grouped with corresponding data rows in a form called relations

Each column represents data for a single attribute for the entire dataset


Primary key – a column (or set of columns) whose values uniquely identify each row; it is used as the search criterion

Foreign key – a column in a second table whose values refer to the primary key of the first table
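
The relationship between the two kinds of key can be sketched with Python's built-in `sqlite3` module. The `owner` and `parcel` tables and their columns are hypothetical; note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, which is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK checks
conn.executescript("""
CREATE TABLE owner (
    owner_id INTEGER PRIMARY KEY,   -- primary key: unique per owner
    name     TEXT NOT NULL
);
CREATE TABLE parcel (
    parcel_id INTEGER PRIMARY KEY,
    area_sqm  REAL,
    owner_id  INTEGER REFERENCES owner(owner_id)  -- foreign key
);
INSERT INTO owner VALUES (1, 'Cruz'), (2, 'Reyes');
INSERT INTO parcel VALUES (101, 250.0, 1), (102, 400.0, 2);
""")

# A parcel that points to a non-existent owner should be rejected.
try:
    conn.execute("INSERT INTO parcel VALUES (103, 90.0, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The foreign key keeps the two tables consistent: a row in `parcel` can only refer to an `owner_id` that actually exists in `owner`.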


Normal forms – a set of rules that indicate the forms the tables should take

1. First Normal Form

2. Second Normal Form

3. Third Normal Form


First Normal Form

Table must contain columns and rows

Because the columns are to be used as search keys, there should be only a single value in each row location
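
A small sketch of the single-value rule: the hypothetical table below stores several crops in one cell, and the list comprehension rewrites it so that each row location holds exactly one value.

```python
# Hypothetical table that violates first normal form:
# the "crops" cell holds several values at once.
unnormalized = [
    {"parcel_id": 101, "crops": "rice, corn"},
    {"parcel_id": 102, "crops": "coconut"},
]

# First normal form: one value per row/column location.
first_normal_form = [
    {"parcel_id": row["parcel_id"], "crop": crop.strip()}
    for row in unnormalized
    for crop in row["crops"].split(",")
]

# Searching on the column is now a simple equality test.
corn_parcels = [r["parcel_id"] for r in first_normal_form if r["crop"] == "corn"]
```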


Second Normal Form

Requires that every column that is not part of the primary key be fully dependent on the primary key

Simplifies the tables

Reduces redundancy by imposing the restriction that each column be searchable only through the primary key


Third Normal Form

States that columns that are not primary keys must “depend” on the primary key, whereas the primary key does not depend on any non-primary-key column; in the usual formulation, non-key columns must also not depend on one another

Primary key must be used to find other columns

But the other columns are not needed to search for values in the primary key column

Idea is to reduce redundancy
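
To illustrate how these rules reduce redundancy, the sketch below starts from a hypothetical flat table in which `owner_name` depends on `owner_id` rather than on the key `parcel_id`, and splits it into two tables. The data and the `owner_name_of` helper are assumptions made for the example.

```python
# Hypothetical flat table: owner_name is repeated for every parcel
# because it depends on owner_id, not on the key parcel_id.
flat = [
    {"parcel_id": 101, "owner_id": 1, "owner_name": "Cruz"},
    {"parcel_id": 102, "owner_id": 1, "owner_name": "Cruz"},
    {"parcel_id": 103, "owner_id": 2, "owner_name": "Reyes"},
]

# Split so that every non-key column depends only on its own table's key.
parcels = {r["parcel_id"]: {"owner_id": r["owner_id"]} for r in flat}
owners = {r["owner_id"]: {"owner_name": r["owner_name"]} for r in flat}

def owner_name_of(parcel_id):
    """Follow parcel -> owner_id -> owner_name across the two tables."""
    return owners[parcels[parcel_id]["owner_id"]]["owner_name"]
```

After the split, each owner's name is stored once, so renaming an owner means changing a single row instead of every parcel that owner holds.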


Relational Database Management Systems (RDBMS)

Advantages:

Allow us to collect data in reasonably simple tables, keeping organization also simple

Capable of doing relational joins, as long as there is at least one column common to the tables to be joined

Allows greatest flexibility, both in design and querying


Data Storage in a DBMS

Object classes/layers are stored in database tables

Each layer is stored as a single database table in a database management system

Rows contain objects, while columns contain attributes/properties of the objects


Geographic database tables have a geometry column (or shape column), which non-geographic tables don’t have

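
The idea of a layer stored as one table with a geometry column can be sketched with `sqlite3`. This is a simplification under stated assumptions: plain SQLite has no geometry type, so the hypothetical `geom` column holds well-known text (WKT) only for illustration, and the table names and coordinates are made up. Spatially enabled databases provide dedicated geometry types instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Non-geographic table: attributes only.
CREATE TABLE owner (owner_id INTEGER PRIMARY KEY, name TEXT);

-- Geographic table (one layer): attributes plus a geometry column.
-- Assumption: geometry is kept as WKT text purely for illustration.
CREATE TABLE well (
    well_id INTEGER PRIMARY KEY,
    depth_m REAL,
    geom    TEXT
);
INSERT INTO well VALUES (1, 35.0, 'POINT (125.54 8.95)');
INSERT INTO well VALUES (2, 52.5, 'POINT (125.60 8.97)');
""")

def columns(table):
    """Column names of a table, read from SQLite's table_info pragma."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

well_columns = columns("well")
owner_columns = columns("owner")
```

Each row of `well` is one object, and each column is one of its properties; the geometry column is what distinguishes the layer from an ordinary attribute table.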


Basic Database Functions/Operations

Join

Tables are joined together using common row/column values or keys

After joining two or more tables, a new table is created which contains all the values of the joined tables

Database tables can be joined together to create new relations, or views of the database.
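
A join on a common key can be sketched in SQL through Python's `sqlite3` module. The `parcel` and `owner` tables and the `parcel_owner` view are hypothetical names used only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parcel (parcel_id INTEGER PRIMARY KEY, owner_id INTEGER, area_sqm REAL);
CREATE TABLE owner  (owner_id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO parcel VALUES (101, 1, 250.0), (102, 2, 400.0), (103, 1, 90.0);
INSERT INTO owner  VALUES (1, 'Cruz'), (2, 'Reyes');

-- The join matches rows on the common column owner_id.
-- A view stores the join definition, not a copy of the data.
CREATE VIEW parcel_owner AS
SELECT p.parcel_id, p.area_sqm, o.name
FROM parcel AS p
JOIN owner AS o ON o.owner_id = p.owner_id;
""")

joined_rows = conn.execute(
    "SELECT parcel_id, area_sqm, name FROM parcel_owner ORDER BY parcel_id"
).fetchall()
```

The result combines columns from both tables in a single relation, one row per parcel with its owner's name attached.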


Link

Tables are linked using common row/column values or keys

Unlike joining, linking tables does not result in a new table. The original tables are retained, but accessing one enables the user to also access a table linked to it
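
Linking can be sketched as two tables that stay separate and are connected only through a shared key, followed on demand. The tables and the two helper functions here are hypothetical.

```python
# Two hypothetical tables kept separate; they share the key "owner_id".
parcels = [
    {"parcel_id": 101, "owner_id": 1},
    {"parcel_id": 102, "owner_id": 2},
    {"parcel_id": 103, "owner_id": 1},
]
owners = {1: {"name": "Cruz"}, 2: {"name": "Reyes"}}

def linked_owner(parcel):
    """Follow the shared key from a parcel row to its owner row."""
    return owners[parcel["owner_id"]]

def linked_parcels(owner_id):
    """Follow the link in the other direction, from owner to parcels."""
    return [p["parcel_id"] for p in parcels if p["owner_id"] == owner_id]

owner_of_first = linked_owner(parcels[0])["name"]
parcels_of_cruz = linked_parcels(1)
```

No combined table is ever built; both original tables are unchanged, and related rows are reached only when they are asked for.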


Database Design

Involves three stages: conceptual, logical, and physical

Involves six practical steps (see Figure)


Stages of Database Design

Conceptual Model: User View; Objects and Relationships; Geographic Representation

Logical Model: Geographic Database Types; Geographic Database Structure

Physical Model: Database Schema


Conceptual Model

Steps involved are:

1. Model the user’s view

2. Define objects and their relationships

3. Select geographic representation


Model the User’s View

Identifying organizational functions, determining the data requirements of these functions, and organizing data into groups for data management

May be presented using a report with tables


Define Objects and Their Relationships

Specification of object types/classes and functions, and their relationships

May be presented using diagrams


Select Geographic Representation

Choosing between the types of discrete objects (point, line, or polygon) or field to represent the data

Selection has a critical impact on the database use

Although it is possible to switch between representations later on, it would be computationally expensive and would lead to information loss


Logical Model

Steps involved are:

1. Match to geographic database types

2. Organize geographic database structure


Match to Geographic Database Types

Matching of object types to be studied to specific data types supported by the GIS


Organize Geographic Database Structure

Defining topological associations, specifying rules and relationships, and assigning coordinate systems


Physical Model

Step involved is:

1. Define database schema

Definition of the actual physical database schema that will hold the database data values

Usually created using the DBMS software’s data definition language (e.g., SQL)
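
As a minimal sketch of defining a schema with a data definition language, the snippet below runs SQL `CREATE TABLE` through Python's `sqlite3` module. The `road` layer, its columns, and the allowed surface values are hypothetical choices for illustration.

```python
import sqlite3

# Hypothetical schema for a single "road" layer, written as SQL DDL.
DDL = """
CREATE TABLE road (
    road_id   INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    surface   TEXT CHECK (surface IN ('paved', 'gravel', 'earth')),
    length_km REAL CHECK (length_km >= 0)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

# The DBMS records the schema in its own catalog (sqlite_master in SQLite).
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Values outside the rules declared in the schema are refused.
try:
    conn.execute("INSERT INTO road VALUES (1, 'Road A', 'asphalt', 3.2)")
    accepted = True
except sqlite3.IntegrityError:
    accepted = False
```

The schema fixes column names, types, and rules before any data values are loaded, so invalid rows are caught by the DBMS rather than by each application.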


Database Organization/Structuring

Necessary for efficient query, analysis, and mapping


Structuring Techniques

1. Topologic Creation

2. Indexing


Topologic Creation

Can be created for vector data using either batch or interactive techniques

Batch Topology – an iterative process, used for CAD, survey, simple feature, and other unstructured vector data

Interactive Topology – performed dynamically at the time objects are added to the database
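
One batch-style pass can be sketched as follows: given polygons stored without topology, find the edges they share and derive adjacency from them. This is a deliberately simplified illustration, assuming that shared vertices match exactly; real data usually needs snapping and cleaning first, which is part of why the batch process is iterative. The polygons and the `edges` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical unstructured polygons: rings of vertices, no stored topology.
polygons = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
    "C": [(5, 5), (6, 5), (6, 6), (5, 6)],
}

def edges(ring):
    """Undirected edges of a closed ring, each as a sorted vertex pair."""
    return {tuple(sorted((ring[i], ring[(i + 1) % len(ring)])))
            for i in range(len(ring))}

# Batch pass: record which polygons use each edge.
edge_owners = defaultdict(set)
for name, ring in polygons.items():
    for e in edges(ring):
        edge_owners[e].add(name)

# Polygons that share an edge are adjacent.
adjacency = defaultdict(set)
for owners in edge_owners.values():
    for a in owners:
        adjacency[a] |= owners - {a}
```

Polygons A and B share the edge from (1, 0) to (1, 1), so they come out as neighbors, while C touches nothing and has no adjacent polygons.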


Indexing

Can help speed up certain types of queries

Three main indexing methods in GIS are grid indexes, quadtrees, and R-trees.

Database index – a special representation of information about objects that improves searching
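
Of the three methods, a grid index is the simplest to sketch. The snippet below assigns hypothetical point features to square cells and answers a box query by checking only the cells the box overlaps. The cell size, the points, and the function names are assumptions made for this example.

```python
from collections import defaultdict

CELL = 10.0  # cell size in map units (an arbitrary choice for this sketch)

def cell_of(x, y):
    """Grid cell (column, row) containing the point (x, y)."""
    return int(x // CELL), int(y // CELL)

# Hypothetical point features: id -> (x, y).
points = {"a": (3.0, 4.0), "b": (12.0, 7.0), "c": (14.0, 18.0), "d": (95.0, 95.0)}

# Build the index: each cell lists the points that fall inside it.
grid = defaultdict(list)
for pid, (x, y) in points.items():
    grid[cell_of(x, y)].append(pid)

def query(xmin, ymin, xmax, ymax):
    """Ids of points inside the box, checking only the cells it overlaps."""
    (c0, r0), (c1, r1) = cell_of(xmin, ymin), cell_of(xmax, ymax)
    hits = []
    for c in range(c0, c1 + 1):
        for r in range(r0, r1 + 1):
            for pid in grid.get((c, r), []):
                x, y = points[pid]
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    hits.append(pid)
    return sorted(hits)

found = query(10.0, 5.0, 15.0, 20.0)
```

The query visits three cells instead of testing every point, which is the kind of saving a spatial index is meant to provide; quadtrees and R-trees pursue the same goal with adaptive rather than fixed subdivisions.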


Thank you!