Module 3 Introduction to GIS - Charles Sturt University

Module 3 Introduction to GIS Lecture 7 – GIS data


What do we know about GIS so far

It allows us to work with geospatial data to solve spatial problems (to know where something is, what happens there and what changes – integration and spatial analysis)

It can be used for many different applications, provided geospatial data exist and are referenced to the correct coordinate system

It needs at least 5 components to work properly (Hardware, Software, Data, Procedures, People) – not only software!

Many GIS software packages are available, including open-source ones (the source code is freely available to view, edit, and redistribute)

All GIS software provides basic GIS tools: projection transformation, spatial analysis, surface creation, data management, and selection and extraction operations.

With any GIS software we can visualize the data and create maps

What can we do with GIS

Quick and easy access to large volumes of data

Search and select data and information

Link or merge one dataset with another (integration of spatial and non-spatial data)

Update data quickly and cheaply

Spatial analysis (with existing data) and modelling (creating new data and information)

Generate outputs (maps and tables)

GIS data model

A simplified representation of the real-world

GIS data model

A GIS data model is a simplified view of the real world.

A data model defines how spatial features are represented and stored in GIS.

We can visualize this data model as a set of layers that represent each real world feature (data model organizes and stores geospatial data).

The layers overlay one another perfectly, so that every location is precisely matched across all layers.

http://images.flatworldknowledge.com/campbell/campbell-fig01_012.jpg

GIS data model – for each application, different thematic layers

http://www.awi.de/fileadmin/user_upload/Research/Research_Divisions/Geosciences/Marine_Geochemistry/Marine_GIS/Infolayer1.jpg

http://www.lakegeorgeassociation.org/assets/images/gis_layers2.jpg

http://kenex.com.au/Images/predictive/layertheme.jpg

Different thematic layers displayed in GIS software

Is the geographic world a jig-saw puzzle of polygons or a club-sandwich of data layers? (Couclelis, 1992)

Entities (objects, discrete data) are described by their attributes, and their positions can be mapped using a coordinate system.

Defining and recognizing the entity (a house? a river? a forest?) is the first step; listing its attributes and defining its boundaries and location is the second.

Entity

The real-world geographic space can be perceived as being occupied by entities or as a field representing a continuous variation of an attribute.

GIS data model (applied to entities/objects)

[Diagram: Real-world, Simplification, Model, Map, Data Model. Geospatial data acquisition covers location, type of representation and classification (what is it?). Database (attributes) columns: ID, X, Y, Name.]

… for the mindless mechanical eye everything in the world is just another array of pixels… (Couclelis, 1992)

Field

Fields are described by the smooth and continuous variation of attributes over a space with continuous coordinates.

Fields are created by modelling continuous data.

GIS data models

Two GIS data models:

VECTOR

more appropriate for mapping discrete geographic entities

RASTER

most appropriate for modelling continuous geographic phenomena

https://csde.washington.edu/services/gis/workshops/Images/IntroGIS_Layers.png

GIS data models choice

There are many possible views for the same reality depending on the abilities and preferences of the GIS user.

Switzerland example (Burrough & McDonnell, 1998):

Should Switzerland be recognized as a land of individual mountain entities (vector) or as a land in which the attribute “elevation” demonstrates extreme variation (raster)?

http://campusarch.msu.edu/wp-content/uploads/2011/10/raster-and-vector-model1.jpg

GIS data models choice

Opting for an entity approach to mountain peaks will provide an excellent basis for a system that records who climbed the mountain and when but it will not provide information for computing slopes for its sides.

Choosing a continuous representation allows the calculation of slopes but does not give names for the peaks.

http://upload.wikimedia.org/wikipedia/commons/6/60/Matterhorn_from_Domh%C3%BCtte_-_2.jpg

http://www.graphatlas.com/switzerland_map_town_terrain_elevation_land_in_meter.gif

GIS data models choice

• The decision will be based on the aim to be achieved and the database needed.

• The choice of the conceptual model determines how information can later be derived.

• It depends on the scientific or technical discipline of the user

http://geog214-7.wikispaces.com/file/view/vector_raster.jpg/169141933/476x567/vector_raster.jpg

Heywood, Cornelius & Carver, 2011

GIS database

Attribute table

GIS data model storage – database / attribute table

http://pubs.sciepub.com/jgg/2/1/1/image/fig2.png

GIS data model = spatial data (geometry, location) + non-spatial data (attribute)

• A GIS database compiles attribute data and stores it in tables, organized by rows and columns.

• Each row represents a spatial feature (normally with an ID), each column describes a characteristic. A row is also called a record and a column is also called a field.

• Attribute tables are different for vector and raster.
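The row and column structure described above can be sketched in Python as a list of records. This is only an illustration, not how any particular GIS stores its tables: the field names (ID, X, Y, Name) follow the slide diagram, while the sample values and the `get_record` helper are made up for the example.

```python
# Each dict is one row (record); each key is one column (field).
# The values below are illustrative, not real data.
attribute_table = [
    {"ID": 1, "X": 147.35, "Y": -35.12, "Name": "Sample site A"},
    {"ID": 2, "X": 146.92, "Y": -36.08, "Name": "Sample site B"},
]

# The field (column) names are the keys of any record.
fields = list(attribute_table[0].keys())

def get_record(table, feature_id):
    """Return the record whose ID matches, or None if there is no such feature."""
    for record in table:
        if record["ID"] == feature_id:
            return record
    return None

print(fields)
print(get_record(attribute_table, 2)["Name"])
```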

Database / attribute table

Attributes are inserted and displayed as tables.

The type of attribute data provided for a spatial feature can determine the utility of datasets in GIS analysis, so the scale of measurement (nominal, ordinal, interval and ratio) used to record attribute data is important.

Nominal and ordinal data are entered as character strings and are used as categorical data in GIS operations (e.g. soil types or levels of soil erosion).

Interval and ratio data are entered as integers or floats (if decimal digits are included).
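A small sketch of why the scale of measurement matters in practice, assuming hypothetical soil records (the field names `soil_type`, `erosion`, `area_ha` and all values are invented). Ordinal categories stored as strings need an explicit ranking to be ordered meaningfully, whereas ratio values support arithmetic directly.

```python
# Hypothetical soil records: nominal and ordinal values are strings,
# the ratio value (area) is a float.
records = [
    {"soil_type": "silty loam", "erosion": "high", "area_ha": 12.5},
    {"soil_type": "clay", "erosion": "low", "area_ha": 30.0},
    {"soil_type": "sand", "erosion": "moderate", "area_ha": 7.25},
]

# Ordinal data needs an explicit rank: plain strings would sort alphabetically.
EROSION_RANK = {"low": 0, "moderate": 1, "high": 2}
by_erosion = sorted(records, key=lambda r: EROSION_RANK[r["erosion"]])

# Arithmetic such as a sum is meaningful for ratio data.
total_area = sum(r["area_ha"] for r in records)

print([r["erosion"] for r in by_erosion])
print(total_area)
```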

Database / attribute table

Attribute tables are also used for a number of GIS operations.

Attribute tables are often joined or related to spatial data layers, and the attribute values they contain can be used to find, query, and symbolize features (vector) or raster cells.

It is possible to have attribute tables that are not linked to spatial features – GIS software uses a database management system to relate the non-spatial attributes to the spatial data.
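Relating a stand-alone table to a spatial layer can be sketched as a join on a shared key. This is a minimal illustration under assumed names: the `ID` key, the `POP` field, the `join_by_id` helper and all values are hypothetical, and real GIS software delegates this to its database management system.

```python
# Spatial layer: feature IDs with a geometry (here just a point coordinate).
features = [
    {"ID": 1, "geometry": (148.0, -35.0)},
    {"ID": 2, "geometry": (149.1, -35.3)},
    {"ID": 3, "geometry": (150.9, -34.4)},
]

# Stand-alone table with no geometry, keyed by the same ID field.
population = {1: 5200, 2: 431000}

def join_by_id(layer, table, field_name):
    """Return copies of the layer records with the table value attached.

    Features without a matching row get None for the new field.
    """
    joined = []
    for feature in layer:
        row = dict(feature)  # copy, so the original layer is unchanged
        row[field_name] = table.get(feature["ID"])
        joined.append(row)
    return joined

joined = join_by_id(features, population, "POP")
print(joined)
```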

Queries - GIS analysis based on the database/attribute table

Queries are questions that you ask of the data (related to the data model and geospatial data used)

Queries can be spatial or aspatial, e.g. “Where have major traffic accidents occurred in the last five years?” (spatial), “What percentage of traffic accidents involved alcohol?” (aspatial).

“Basic” queries – using the “Identify” tool in GIS.

“Advanced” queries – using “Query Builder” in GIS to build queries to select only features that satisfy some criteria (using basic maths operators or Boolean operators).

Spatial analysis using GIS database

Queries

Basic Query – Identify tool

The result of clicking on a feature using the Identify tool:

The fields are the attributes of the record in this particular layer

Advanced Query – Query Builder tool


The basic operators in a query are =, <>, <, >, ≤, ≥, like.

• Equal to (=), not equal to (<>): (CNTRY_NAME = 'Australia')

• Less than (<), less than or equal to (≤): (POP2005 < 1000000)

• Greater than (>), greater than or equal to (≥): (SQKM > 100000)

• Like: begins with or contains, used in conjunction with the * wildcard: (CNTRY_NAME like 'Al*') selects Algeria and Albania
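The basic operators listed above can be imitated in Python to see what each one selects. This is a sketch, not the Query Builder itself: the field names come from the slide, the record values are rough illustrative figures, and `fnmatchcase` from the standard library stands in for the `like` operator with its * wildcard.

```python
from fnmatch import fnmatchcase

# Illustrative records only; the numbers are rough, not authoritative data.
countries = [
    {"CNTRY_NAME": "Australia", "POP2005": 20_000_000, "SQKM": 7_700_000},
    {"CNTRY_NAME": "Albania", "POP2005": 3_000_000, "SQKM": 28_000},
    {"CNTRY_NAME": "Algeria", "POP2005": 33_000_000, "SQKM": 2_380_000},
    {"CNTRY_NAME": "Fiji", "POP2005": 800_000, "SQKM": 18_000},
]

def select(records, predicate):
    """Return the names of the records that satisfy the query predicate."""
    return [r["CNTRY_NAME"] for r in records if predicate(r)]

equal = select(countries, lambda r: r["CNTRY_NAME"] == "Australia")
small_pop = select(countries, lambda r: r["POP2005"] < 1_000_000)
large_area = select(countries, lambda r: r["SQKM"] > 100_000)
like_al = select(countries, lambda r: fnmatchcase(r["CNTRY_NAME"], "Al*"))

print(equal, small_pop, large_area, like_al)
```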

Queries examples

Show all polygons where soil type is silty loam

Show all census districts where population is less than 1000

Find all locations where elevation is less than or equal to 100m and annual rainfall is between 1200 and 1800mm

Advanced Query – Query Builder tool

The Boolean operators are and, or, not - used to create compound queries.

• AND – all criteria must be satisfied: (POP2005 ≤ 10000 AND SQKM ≥ 100000)

• OR – at least one criterion must be satisfied: (STATUS = 'National capital' OR STATUS = 'Provincial capital')

• NOT – negation: (NAME like '*Mexico*' AND NOT STATUS = 'City') returns New Mexico and the nation of Mexico, but does not return Mexico City

Advanced Query – Query Builder tool

Combine operators to make complex queries: ((CNTRY_NAME = 'Australia' OR CNTRY_NAME = 'New Zealand') AND CITY_NAME like 'A*')

The syntax of the queries is important (how the query is written). In different GIS software the syntax will probably be slightly different.
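A short Python sketch of compound queries shows why the way a query is written matters. The city records and the `select` helper are hypothetical, and `fnmatchcase` stands in for `like`. In Python, as in SQL, AND binds more tightly than OR, so leaving out the parentheses changes which features are selected.

```python
from fnmatch import fnmatchcase

# Hypothetical city records; the STATUS values are for illustration only.
cities = [
    {"CITY_NAME": "Adelaide", "CNTRY_NAME": "Australia", "STATUS": "Provincial capital"},
    {"CITY_NAME": "Canberra", "CNTRY_NAME": "Australia", "STATUS": "National capital"},
    {"CITY_NAME": "Auckland", "CNTRY_NAME": "New Zealand", "STATUS": "City"},
    {"CITY_NAME": "Apia", "CNTRY_NAME": "Samoa", "STATUS": "National capital"},
]

def select(records, predicate):
    """Return the names of the cities that satisfy the query predicate."""
    return [r["CITY_NAME"] for r in records if predicate(r)]

# OR: at least one criterion must be satisfied.
capitals = select(cities, lambda c: c["STATUS"] == "National capital"
                  or c["STATUS"] == "Provincial capital")

# AND with NOT: both parts must hold, the second one negated.
au_not_national = select(cities, lambda c: c["CNTRY_NAME"] == "Australia"
                         and not c["STATUS"] == "National capital")

# Parentheses group the OR before the AND is applied.
combined = select(cities, lambda c: (c["CNTRY_NAME"] == "Australia"
                                     or c["CNTRY_NAME"] == "New Zealand")
                  and fnmatchcase(c["CITY_NAME"], "A*"))

# Without parentheses the AND is evaluated first, giving a different result.
ungrouped = select(cities, lambda c: c["CNTRY_NAME"] == "Australia"
                   or c["CNTRY_NAME"] == "New Zealand"
                   and fnmatchcase(c["CITY_NAME"], "A*"))

print(capitals, au_not_national, combined, ungrouped)
```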

The results of the query can be visualized in the layer view. Results of query:

Cities in NSW with population >= 20,000 shown in green

GIS Vector

Geometry, location, attribute table

Vector data model

The vector data model represents space as a series of discrete entities, defined as points, lines and polygons, which are geographically referenced.

Besides geometry and location, an attribute table is also associated with each vector feature.

Vector data is scale-dependent and the user must take this into consideration when choosing the appropriate entity (point, line or polygon) to represent a feature.

For example, a city on a 1:1 000 000 map may appear as a point, but the same city may appear as a polygon on a 1:25 000 map. However, if the city was not defined as a polygon initially, changing the scale does not change the representation: it will always be a point (perhaps, in this case, a huge point on the map).

Vector entity: Point

Points are stored as a pair of x-y coordinates. An attribute table containing point information only needs three columns – namely x, y and a description.

A point can be used to represent features such as a power pole, a tree, a sample location or a town.

GPS locations can be imported to GIS as points.

Vector entity: Line

A line consists of at least two points joined by a segment. A line has length and a direction from start node to end node. The shape of a line may be a smooth curve or a connection of straight-line segments with vertices. A polyline is a feature made of connected lines.
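As a sketch of the length and direction properties, a polyline can be held as an ordered list of vertices and its length computed segment by segment. The `river` coordinates are hypothetical and assumed to be in a projected coordinate system (metres), so that straight-line distances are meaningful.

```python
import math

# A polyline as an ordered list of (x, y) vertices; the order gives the direction.
river = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]

def polyline_length(vertices):
    """Sum the straight-line lengths of consecutive segments."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))

start_node, end_node = river[0], river[-1]
length = polyline_length(river)

print(start_node, end_node, length)
```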

Lines can be used to represent features such as a river, a road, a fence line or a contour.

Most GIS software has an editing toolbox that allows the user to create line features. This is useful when digitizing maps to create lines.

Vector entity: Polygon

A polygon is a two-dimensional feature and has the properties of area such as size and perimeter, in addition to location.
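The area and perimeter of a polygon can be derived from its vertices alone. The sketch below uses the shoelace formula on a hypothetical rectangular `paddock` in projected coordinates; it assumes a simple polygon (no holes, no self-intersections).

```python
import math

# A polygon as a ring of (x, y) vertices; the closing edge back to the
# first vertex is implied.
paddock = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]

def polygon_area(ring):
    """Shoelace formula; abs() makes the result independent of vertex order."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0

def polygon_perimeter(ring):
    """Sum of the edge lengths, including the closing edge."""
    n = len(ring)
    return sum(math.dist(ring[i], ring[(i + 1) % n]) for i in range(n))

print(polygon_area(paddock), polygon_perimeter(paddock))
```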

A polygon can be used to represent features such as a lake, a building, a soil class, a local government area or a state boundary.

Polygon features can also be created using the editing toolbox.

Vector entities: Point, Line and Polygon

http://www.google.com.au/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&uact=8&ved=0CAcQjRw&url=http%3A%2F%2Fwww.tankonyvtar.hu%2Fhu%2Ftartalom%2Ftamop425%2F0032_terinformatika%2Fch01s02.html&ei=vthFVZ_lBoTCmwXSo4DQAw&bvm=bv.92291466,d.dGY&psig=AFQjCNGQLS0zKnGOTWDaHG9Y97qlCsySxA&ust=1430727153186804

Vector data requires Topology

Topology is the arrangement that defines how point, line, and polygon features share coincident geometry. For example, adjacent polygons representing states share their common boundaries.

Topology ensures data quality and integrity (for example, enables detection of lines that do not meet correctly).

Topology can enhance GIS analysis (because it defines correctly the position and direction of features).

Topology rules: Counties must not overlap; Contour lines must not intersect; Label points must be inside polygons.
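One of these rules, "label points must be inside polygons", can be checked with a point-in-polygon test. The sketch below uses the ray-casting method on a hypothetical square county; the names `county` and `labels` are invented, and points lying exactly on a boundary are not handled specially.

```python
def point_in_polygon(point, ring):
    """Ray-casting test: count how often a ray going right from the point
    crosses the polygon boundary. An odd count means the point is inside."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

county = [(0, 0), (10, 0), (10, 10), (0, 10)]
labels = {"A": (5, 5), "B": (12, 5)}

# Labels that break the rule "label points must be inside polygons".
violations = [name for name, pt in labels.items() if not point_in_polygon(pt, county)]
print(violations)
```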

Applications using vector

Use vector data maps as topographic maps (to locate features).

Use vector data for spatial analysis (buffer, proximity, overlay…) to generate new data.
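A buffer or proximity question such as "which points lie within 300 m of this feature?" reduces, for a point feature, to a distance comparison. The sketch below assumes hypothetical projected coordinates in metres; the names `school`, `bus_stops` and `within_buffer` are invented for the example.

```python
import math

# Hypothetical projected coordinates in metres.
school = (500.0, 500.0)
bus_stops = {
    "stop_1": (520.0, 480.0),
    "stop_2": (900.0, 500.0),
    "stop_3": (500.0, 799.0),
}

def within_buffer(center, points, radius):
    """Names of the points no farther than `radius` from `center`,
    i.e. the points falling inside a circular buffer."""
    return sorted(name for name, p in points.items() if math.dist(center, p) <= radius)

nearby = within_buffer(school, bus_stops, 300.0)
print(nearby)
```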

Vector data is the base for network analysis (plan routes, calculate drive-times, locate facilities, least-cost path from a destination point to the nearest least-cost source).

http://www.esri.com/news/arcwatch/1206/graphics/feature2-lg.jpg

Composite vector data: TIN

Composite features are built on points, lines and polygons such as the triangulated irregular network (TIN) which approximates the terrain (surface) with a set of non-overlapping triangles based on an irregular network of elevation points (from GPS, LIDAR, surveys, DEM)

http://map.sdsu.edu/geog104/images/unit-4/TIN.jpg

• Because nodes can be placed irregularly over a surface, TINs can have a higher resolution in areas where a surface is highly variable or where more detail is desired and a lower resolution in areas that are less variable.

• TINs are used to create Digital Terrain Models for GIS terrain mapping and analysis.
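Within a single TIN triangle, the elevation at any location can be estimated by linear interpolation between the three nodes. The sketch below does this with barycentric weights; the function name and the sample triangle are assumptions made for illustration, and a full TIN would first have to find which triangle contains the location.

```python
def interpolate_in_triangle(p, a, b, c):
    """Linearly interpolate z at p = (x, y) inside the triangle a, b, c,
    where each vertex is (x, y, z). Returns None if p is outside."""
    (x, y), (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        return None  # degenerate triangle: the three nodes are collinear
    # Barycentric weights: how much each node contributes at p.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None  # a negative weight means p is outside the triangle
    return w1 * z1 + w2 * z2 + w3 * z3

# Hypothetical triangle with elevations in metres.
a, b, c = (0.0, 0.0, 100.0), (10.0, 0.0, 200.0), (0.0, 10.0, 300.0)
print(interpolate_in_triangle((2.0, 2.0), a, b, c))
```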

GIS Raster

Grid, cells, resolution

Raster data model

Raster data is made up of regular cells coloured according to some value. A raster is a two-dimensional grid where the basic unit is a cell. The resolution of a raster is determined by the grid cell size.

Each cell represents a continuous feature (such as elevation and precipitation) and corresponds to the value of the spatial phenomenon at the cell location.

Raster

• Cells must be aligned to the coordinate system

• Cell start locations cannot be offset

• Grid is not normally at an angle

• Cell is also called a pixel

The resolution of the raster is dependent on the size of the cell. The finer the cell resolution and the greater the number of cells that represent small areas, the more accurate the representation.
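Because the grid is regular and aligned to the coordinate system, the cell containing any location follows from simple arithmetic on the grid origin and cell size. The sketch below assumes a hypothetical raster defined by its upper-left corner, a 25 m cell size and its dimensions; all constants are invented.

```python
import math

# Hypothetical raster definition.
ORIGIN_X, ORIGIN_Y = 500000.0, 6100000.0  # upper-left corner, map units
CELL_SIZE = 25.0                          # resolution: 25 m square cells
N_ROWS, N_COLS = 4, 5

def cell_index(x, y):
    """Return (row, col) of the cell containing (x, y), or None if the
    location is outside the grid. Rows count downward from the top edge."""
    col = math.floor((x - ORIGIN_X) / CELL_SIZE)
    row = math.floor((ORIGIN_Y - y) / CELL_SIZE)
    if 0 <= row < N_ROWS and 0 <= col < N_COLS:
        return row, col
    return None

print(cell_index(500030.0, 6099990.0))
```

Halving `CELL_SIZE` over the same extent would quadruple the number of cells, which is the trade-off between resolution and storage.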

Raster values

Depending on the coding of each cell value, a raster can be either an integer or a floating-point raster (if presented with decimal digits).

Integer cell values usually represent categorical data. A common example is the land cover raster that codes land use as 1 – forest, 2 – agricultural, 3 – water.

Floating point cell data represents continuous numeric data.

Cells can also have a NoData value to represent the absence of data.

Raster attribute table

The attribute table displays the cell values (corresponding to the value of a continuous feature at the cell location) and their frequencies (counts).
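A raster attribute table of values and counts can be sketched by tallying the cells of a small grid. The land cover codes (1 forest, 2 agricultural, 3 water) follow the earlier example; the grid itself, the `NODATA` marker of -9999 and the column names VALUE, COUNT and CLASS are assumptions.

```python
from collections import Counter

NODATA = -9999  # assumed marker for cells with no data

# A tiny integer land cover raster, one list per row.
land_cover = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [NODATA, 3, 3, 2],
]
labels = {1: "forest", 2: "agricultural", 3: "water"}

# Count how many cells hold each value, skipping NoData cells.
counts = Counter(v for row in land_cover for v in row if v != NODATA)

# One row per distinct cell value.
attribute_table = [
    {"VALUE": value, "COUNT": counts[value], "CLASS": labels[value]}
    for value in sorted(counts)
]

for row in attribute_table:
    print(row)
```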

Applications using Raster

Raster as basemaps

Remote sensing (satellite) images, aerial photographs and scanned maps can be used as a background display for other feature layers.

For example, we can create the boundary of a lake as a polygon by tracing its image displayed in a basemap.

Applications using Raster

Raster as surface maps

Rasters are used as regularly spaced representations of surfaces.

Elevation values measured from the earth's surface are the most common application of surface maps, but other values, such as rainfall, temperature, concentration, and population density, can also define surfaces that can be spatially analyzed.

http://www.terramapper.com/image/data/ned2.gif

A DEM (Digital Elevation Model) is very widely used as raster data in GIS for digital terrain mapping and analysis.

It represents a regular array of elevation points converted into surfaces (each elevation point is the center of a cell).
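Because a DEM is a regular array, terrain measures such as slope can be computed from neighbouring cells. The sketch below uses a plain central-difference estimate on a hypothetical 3 x 3 DEM with an assumed 30 m cell size; GIS packages typically use more elaborate neighbourhood methods, so this is an illustration of the idea only.

```python
import math

CELL_SIZE = 30.0  # assumed cell size in metres

# Hypothetical DEM: elevation rises 30 m per cell from west to east.
dem = [
    [0.0, 30.0, 60.0],
    [0.0, 30.0, 60.0],
    [0.0, 30.0, 60.0],
]

def slope_degrees(grid, row, col, cell_size):
    """Central-difference slope for an interior cell, in degrees."""
    dz_dx = (grid[row][col + 1] - grid[row][col - 1]) / (2 * cell_size)
    dz_dy = (grid[row + 1][col] - grid[row - 1][col]) / (2 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))

centre_slope = slope_degrees(dem, 1, 1, CELL_SIZE)
print(centre_slope)
```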

Applications using Raster

Raster as thematic maps

Rasters representing thematic data can be derived from analyzing other data. A common analysis application is classifying a satellite image by land-cover categories.

Thematic maps can also result from geoprocessing to create a raster dataset that maps suitability for a specific activity.

[Figure: Satellite image; Land use raster map]

Vector vs Raster

“Raster is faster, but vector is corrector” (Berry, 1995)

Vector

Can represent point, line and polygon features much more accurately

Require less disk storage space

Complex data structure (topology)

High quality outputs

Enables network design

Complex database

Raster

Spatial inaccuracies due to the limits imposed by the raster dataset cell dimensions

Depending on the resolution, large size files requiring more disk storage space

Simple data structure

Easy to manipulate for analysis

Great diversity of raster data sources

Vector – Raster conversion

Conversion between models

Depending on the GIS operation to perform it might be necessary to convert vector into raster and vice-versa.

Rasterization (converts vector to raster)

Vectorization (converts raster to vector).

Giscommons.org

Rasterization

A grid is placed over the vector and if the cell contains the underlying vector then it is coded as such.

With polygons: difficult to decide whether to classify the boundary cells as part of the polygon or not.
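One way to settle the boundary-cell question is a cell-centre rule: a cell is coded as part of the polygon only if its centre falls inside. The sketch below applies that rule to a hypothetical triangle on a 4 x 4 grid. The rule is an assumption chosen for illustration; other rules (for example, majority of cell area) give different boundary cells.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting point-in-polygon test (boundary cases not handled specially)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def rasterize(ring, origin_x, origin_y, cell_size, n_rows, n_cols):
    """Code each cell 1 if its centre lies inside the polygon, else 0.
    (origin_x, origin_y) is the upper-left corner of the grid."""
    grid = []
    for row in range(n_rows):
        cy = origin_y - (row + 0.5) * cell_size  # y of the cell centres in this row
        grid.append([
            1 if point_in_polygon(origin_x + (col + 0.5) * cell_size, cy, ring) else 0
            for col in range(n_cols)
        ])
    return grid

# Hypothetical polygon and grid, in arbitrary map units.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
grid = rasterize(triangle, 0.0, 4.0, 1.0, 4, 4)

for row in grid:
    print(row)
```

The sloping edge of the triangle becomes a staircase of cells, which is the spatial inaccuracy imposed by the cell dimensions.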

Vectorization

Raster to vector requires thinning (reducing raster lines to a width of one cell), extraction (determining where individual lines begin and end) and topological reconstruction (to connect lines and find errors).

Converting from raster to vector will almost always result in a coarser vector with square edges, depending on the resolution of the raster.

Chang, 2014

More about GIS data acquisition next week

SCI103 notes: Start planning Assessment 6