Post on 28-Jun-2020
Intro to Spatial Data Analysis
GIS 5222
Jake K. Carr
Week 4
Intro to Spatial Data Analysis Jake K. Carr
Spatial Data Analysis
Spatial Data Analysis (SDA) is the process of identifying geographic patterns in data and analyzing how the relationships between features are affected by their relative locations.
What makes spatial data special is that each observation is geographically referenced to a particular area on the map.
By exploiting the information contained in that geographic reference we will be able to draw stronger conclusions than if we simply ignored it.
Spatial Data Types
There are two types of spatial data:
• Continuous
  • Data from a surface of values - like current temperature.
• Discrete
  • Data associated with individual geographic objects - like population by county in OH.
Continuous Phenomena
Continuous phenomena such as precipitation or temperature can be found or measured anywhere.
These phenomena have no gaps.
You can measure a value at any location.
Temperature Readings
Discrete Features
For discrete features the ‘location’ of each feature is mutually exclusive of the ‘location’ of other features.
If a feature is located here, it cannot also be located elsewhere.
• Ex: Counties in OH.
The variables that are associated with discrete features are often what we are most interested in:
• Ex: Population counts for each county in OH.
County Population
Here, the (geographic) features are counties in OH.
The attribute (variable) of interest is Population.
Spatial Data Representation
There are two main ways to represent spatial data:
• Raster
  • A grid of cells in which the value of an attribute is assigned to the grid cell corresponding to the ‘same’ location.
• Vector
  • Locational shapes constructed of individual points (vectors) in which an attribute value is assigned to the point(s) corresponding to that location.
Raster is faster, but Vector is ‘corrector’!
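As a rough illustration of the two representations, the sketch below stores made-up temperature values once as a raster grid and once as vector points. The names `raster`, `origin`, `cell_size`, `raster_value` and `stations` are assumptions for this example only and do not come from any GIS library.

```python
# Raster: a grid of cells; each cell holds one attribute value.
# Rows run north to south, columns west to east (an assumed convention).
raster = [
    [61.0, 62.5, 64.0],
    [60.0, 61.5, 63.0],
]
origin = (0.0, 2.0)   # (x, y) of the grid's upper-left corner, hypothetical
cell_size = 1.0       # width and height of each square cell


def raster_value(x, y):
    """Look up the value of the cell that contains location (x, y).

    No bounds checking is done; (x, y) is assumed to lie inside the grid.
    """
    col = int((x - origin[0]) // cell_size)
    row = int((origin[1] - y) // cell_size)
    return raster[row][col]


# Vector: each feature is a geometry (here a point) plus its attribute value.
stations = [
    {"xy": (0.4, 1.7), "temp": 61.0},
    {"xy": (2.6, 0.2), "temp": 63.0},
]

print(raster_value(0.4, 1.7))          # value of the cell containing the point
print([s["temp"] for s in stations])   # values stored with each point
```

In the raster, any location inside a cell returns that cell's value; in the vector form, the value belongs to the exact coordinates of the feature.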
Spatial Data Representation
Vector Raster
Raster Example
Vector Example
Geometry Types
For this course, we will (almost) always work with vector data.
The vector format is useful for ‘accurate’ representation of spatial data features from the standard geometry types:
• Points: a pair of double-precision coordinates in the order (X,Y).
• Lines: an ordered set of points (vertices), connected in sequence.
• Polygons: one or more rings. A ring is a connected sequence of three or more points (vertices) that form a closed, non-self-intersecting loop.
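The three geometry types can be sketched with plain Python tuples and lists. This is only an illustration of the definitions above: the coordinates are made up, the helper name `is_closed_ring` is hypothetical, and the check covers closure and vertex count but not self-intersection.

```python
# A point is an (x, y) pair of floats.
point = (-83.0, 40.0)

# A line is an ordered sequence of points (vertices).
line = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.0)]

# A polygon is one or more rings; this one has a single ring.
polygon = [
    [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 0.0)],
]


def is_closed_ring(ring):
    """Check two parts of the ring definition: closure and vertex count.

    Self-intersection is NOT checked here.
    """
    closed = len(ring) >= 2 and ring[0] == ring[-1]
    # Ignore the repeated closing vertex when counting distinct vertices.
    distinct = len(set(ring[:-1])) if closed else len(set(ring))
    return closed and distinct >= 3


print(is_closed_ring(polygon[0]))  # the triangle closes on its first vertex
print(is_closed_ring(line))        # an open line is not a ring
```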
All Three Geometry Types
Areal Support/Block
Areal data¹ involves aggregated quantities for each areal unit within some relevant spatial partition of a given region (such as the counties within a state).
On p. 1 of the text (p. 24 in the .pdf) the author mentions the concept of an ‘areal support or block.’ Support is just a statistics word for unit of observation.
The areal support of a set of county population values is the county.
Areal data is always represented by POLYGON geometries - areal and polygon are interchangeable.
¹ The text specifically focuses on analysis of areal data.
Geometry Types
Geometry types build from the ‘ground up’:
Points are the basic building block of all vector geometries (more on that in a minute).
Lines (polylines) are then built up as a series of connected points.
Polygons are closed polylines - the first and last points in the series are the same.
Vector Geometry
Why is vector data called vector data?
Vector Geometry
Where to Find Geometries
Fun with vector geometries!
See:
• polygonVertices.py
• pointVertices.py
• pointVertices shapefile.py
Shapefiles
Every shapefile data set includes a minimum of three files.
The first of these files digitally stores the geometry of the features as sets of vector coordinates (.shp).
A second required file holds a positional index of the feature geometries, which lets software locate each feature's record in the .shp file quickly (.shx).
The third required file stores the attribute data in dBASE format (.dbf).
Shapefiles
At a minimum, we need the following three files to use counties in a map document:
• counties.shp: the main shape file containing vector coordinate data
• counties.shx: the index file
• counties.dbf: the dBASE table
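A quick way to confirm that the three required files are present is to look for them on disk. The sketch below uses only the standard library; the function name `missing_parts` is hypothetical, and it assumes lower-case extensions sitting next to the .shp file.

```python
from pathlib import Path

# The three files every shapefile data set needs.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_parts(shapefile):
    """Return the required extensions that are absent for this shapefile."""
    base = Path(shapefile)
    return [ext for ext in REQUIRED_EXTENSIONS
            if not base.with_suffix(ext).exists()]


# An empty list means all three required files were found.
print(missing_parts("counties.shp"))
```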
Shapefiles
There are a few additional files associated with shapefile construction, but these are optional.
One of the most important optional files is the projection file (.prj). This file includes the coordinate system definition.
– Do you remember how to determine the coordinate system for a given shapefile with ArcPy?
Another optional file stores the metadata for the file (.xml). Metadata is additional descriptive information about the shapefile - like when it was produced, and what time period the attribute data was collected, etc.
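Because the .prj file is plain text holding the coordinate system definition as a well-known text string, its name can also be read without ArcPy. The sketch below is one possible approach, not the ArcPy route the question above refers to (there, `arcpy.Describe(...)` exposes a `spatialReference` property). The function name `coordinate_system_name` is hypothetical.

```python
import re
from pathlib import Path


def coordinate_system_name(shapefile):
    """Return the outermost name in the shapefile's .prj file, or None."""
    prj = Path(shapefile).with_suffix(".prj")
    if not prj.exists():
        return None  # no projection file accompanies this shapefile
    # The text starts with a keyword, then the quoted name, e.g. GEOGCS["..."
    match = re.match(r'\s*\w+\["([^"]+)"', prj.read_text())
    return match.group(1) if match else None


print(coordinate_system_name("counties.shp"))
```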
Spatial Analysis
In some applications the purpose of analysis is to describe the spatial arrangement of geographic features.
In others, the focus may be on describing the spatial variation in attribute values associated with those geographic features.
These descriptions might involve identifying interesting aspects of the data, such as detecting clusters or concentrations of high (or low) values.
The next step might be to try to understand why certain areas of the map have a concentration of high (or low) values.
Explaining Variation
There are two aspects to variation in a spatial data set.
The first is the basic variation in the data values (disregarding the information provided by the locational index).
The second is spatial variation, or the variation in the data values across the map.
Describing these two aspects of variation involves different strategies.
Variation in Data Values
Simply plot a histogram of the data:
See matplotlib’s pyplot submodule example:
variation.py
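This is not the course's variation.py. It is a minimal standard-library sketch of the binning that sits behind a histogram, using made-up population values. The function name `bin_counts` is hypothetical, and the values are assumed not to be all equal. With matplotlib installed, `plt.hist(population, bins=4)` would draw the corresponding chart.

```python
def bin_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts


# Hypothetical county populations, in thousands.
population = [12, 15, 18, 22, 25, 31, 40, 58, 95, 130]
for i, count in enumerate(bin_counts(population, 4)):
    print(f"bin {i}: {'#' * count}")
```

Note that nothing about location enters this calculation, which is why a histogram only describes the first aspect of variation.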
Spatial Variation in Data Values
Spatial variation in AREA:
Explaining Variation
To explain variation we will try to find a model that will account for the variation in some attribute.
It is possible that this model could also provide a good explanation of the spatial variation in our data.
It is also possible that a model that apparently does well in describing attribute variation leaves important aspects of its spatial variation unexplained.
– For example all the cases that are very poorly fitted by the model might be in one part of the map.
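The following sketch illustrates that last point with made-up data. A straight line is fitted by ordinary least squares while ignoring location, and the residuals are then averaged by region. The data, the two region labels and all variable names are assumptions chosen for illustration.

```python
# Hypothetical observations: (x, y, region). All values are made up.
data = [
    (1, -1, "west"), (2, 1, "west"), (3, 3, "west"), (4, 5, "west"),
    (1, 5, "east"), (2, 7, "east"), (3, 9, "east"), (4, 11, "east"),
]

# Ordinary least squares fit of y = a + b * x, ignoring location.
n = len(data)
mean_x = sum(x for x, _, _ in data) / n
mean_y = sum(y for _, y, _ in data) / n
b = (sum((x - mean_x) * (y - mean_y) for x, y, _ in data)
     / sum((x - mean_x) ** 2 for x, _, _ in data))
a = mean_y - b * mean_x

# Average residual (observed minus fitted) within each region.
residuals = {}
for x, y, region in data:
    residuals.setdefault(region, []).append(y - (a + b * x))
mean_residual = {r: sum(v) / len(v) for r, v in residuals.items()}
print(mean_residual)
```

The fitted slope captures how y changes with x, yet every western case is over-predicted and every eastern case is under-predicted, so the spatial pattern in the residuals remains unexplained.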
Elements of Spatial Analysis
• Cartographic Modelling: Each data set is represented as a map and map-based operations (e.g. Buffer Analysis) generate new maps.
• Mathematical Modelling: Model outcomes are dependent on the form of the spatial interaction between objects in the model. This occurs either through the spatial relationships or the geographical positioning of objects within the model.
• Statistical Modelling: Techniques for the proper analysis of spatial data which make use of the spatial referencing in the data.
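To give a flavor of a map-based operation that generates a new layer, the sketch below selects the points that lie within a fixed distance of a center. It is a simplification: a GIS buffer tool produces a polygon, whereas this only performs the distance test. The name `within_buffer` and the coordinates are hypothetical.

```python
import math


def within_buffer(center, points, radius):
    """Return the points lying within `radius` of `center` (planar distance)."""
    return [p for p in points if math.dist(center, p) <= radius]


# Hypothetical input layer of point features, in planar coordinates.
schools = [(1.0, 1.0), (3.0, 4.0), (6.0, 8.0)]

# The output is a new layer holding only the features inside the buffer.
nearby = within_buffer((0.0, 0.0), schools, radius=5.5)
print(nearby)
```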
Spatial Data Matrix
Spatial data consists of a set of (k) attributes
\[ Z = \{Z_1, Z_2, \ldots, Z_k\} \]
measured at (or associated with) a set of (n) spatial locations
\[ S = \{S(1), S(2), \ldots, S(n)\} \]
In other words, there are n locations with k variables measured at each location.
Spatial Data Matrix
Variables are represented by capital letters, like Z and S. Observations from those variables are indicated by lower case letters, such as z and s.
The Spatial Data Matrix consisting of k attributes for each of n geographic features has the standard form:
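\[
\begin{bmatrix}
z_1(1) & z_2(1) & \cdots & z_k(1) & s(1) \\
z_1(2) & z_2(2) & \cdots & z_k(2) & s(2) \\
\vdots & \vdots &        & \vdots & \vdots \\
z_1(n) & z_2(n) & \cdots & z_k(n) & s(n)
\end{bmatrix}
\]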
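In code, the standard form amounts to one row per location, with the k attribute values followed by the location. The sketch below uses n = 3 and k = 2 with made-up values; the variable names and the choice of a coordinate pair for each location are assumptions.

```python
# Hypothetical attribute values for n = 3 counties and k = 2 attributes.
z1 = [1200, 850, 430]          # e.g. population
z2 = [410.5, 620.0, 455.2]     # e.g. area
s = [(-83.0, 40.0), (-81.5, 41.1), (-84.2, 39.3)]  # locations s(1)..s(n)

# Row i holds z1(i), z2(i), ..., zk(i) followed by the location s(i).
matrix = [list(row) for row in zip(z1, z2, s)]

for row in matrix:
    print(row)
```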
Spatial Data Matrix: ArcMap Edition
In ArcMap, the order of these variables changes to the form:
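\[
\begin{bmatrix}
z_1(1) & s(1) & z_2(1) & \cdots & z_k(1) \\
z_1(2) & s(2) & z_2(2) & \cdots & z_k(2) \\
\vdots & \vdots & \vdots &      & \vdots \\
z_1(n) & s(n) & z_2(n) & \cdots & z_k(n)
\end{bmatrix}
\]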
Z1 is typically called FID, and S() is the Shape* variable.
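The reordering can be sketched as moving the location from the last column to the second column of each row. This does not use ArcPy; the function name `to_arcmap_order` and the sample row are hypothetical.

```python
def to_arcmap_order(row):
    """Move the location from the last column to the second column."""
    *attributes, location = row
    return [attributes[0], location, *attributes[1:]]


# Standard form: z1 (playing the role of FID), z2, z3, then the location s.
row = [0, 1200, 410.5, (-83.0, 40.0)]
print(to_arcmap_order(row))
```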
We’ve already seen this!