1 Peter Fox GIS for Science ERTH 4750 (98271) Week 4, Tuesday, February 14, 2012 Geocoding, Simple...

Post on 18-Jan-2018


1

Peter Fox

GIS for Science

ERTH 4750 (98271)

Week 4, Tuesday, February 14, 2012

Geocoding, Simple Interpolation, Sampling

Contents

• Reading review

• Assignment 1 status?

• Geocoding

• Interpolation

• Sampling

• Lab on Friday

• Next week

2

Reading review for last week

• Lab on Friday? (Max can close his ears)…

• Video Tutorials

• MapInfo User Guide Chapter 3 (Basics, esp. Working with Layers in the Layer Control, p. 57) - layering

• MapInfo User Guide Chapter 7 (Drawing and Editing Objects, esp. Editing, p. 170) - digitizing

• MapInfo User Guide Chapter 10 (Buffering and Working with Objects, p. 268) - buffering

• MapInfo User Guide Chapter 12 (Registering Raster Images, p. 324) - registering

3

Geocoding

• “Geocoding is the process of finding associated geographic coordinates (often expressed as latitude and longitude) from other geographic data, such as street addresses, or zip codes (postal codes). With geographic coordinates the features can be mapped and entered into a GIS, or the coordinates can be embedded into media such as digital photographs via geotagging.”

• “Reverse geocoding is the opposite: finding an associated textual location such as a street address, from geographic coordinates.”

• “A geocoder is a piece of software or a (web) service that helps in this process” (could be YOU)
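To make these three definitions concrete, here is a minimal sketch of a table-lookup geocoder and a nearest-match reverse geocoder. The table entries, their coordinates, and the function names are all invented for illustration; a real geocoder would query a street database or a web service instead.

```python
import math

# Hypothetical gazetteer: address text -> (latitude, longitude).
# The entries and coordinates are invented for illustration only.
GAZETTEER = {
    "4000 first st": (42.7300, -73.6800),
    "4100 first st": (42.7300, -73.6700),
    "10 second ave": (42.7350, -73.6800),
}

def geocode(address):
    """Return (lat, lon) for an address, or None if it is not in the table."""
    return GAZETTEER.get(address.strip().lower())

def reverse_geocode(lat, lon):
    """Return the gazetteer address closest to (lat, lon).

    Uses plain Euclidean distance in degrees, which is only a rough
    approximation but is adequate for nearby points in a sketch.
    """
    return min(
        GAZETTEER,
        key=lambda addr: math.hypot(GAZETTEER[addr][0] - lat,
                                    GAZETTEER[addr][1] - lon),
    )
```

Here `geocode` goes from text to coordinates and `reverse_geocode` goes the other way; both are only as good as the table behind them.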

4

Geocoding – the world has changed

• http://geocoder.us/

• http://code.google.com/apis/maps/documentation/geocoding/

• http://www.ffiec.gov/geocode/

• Currently 40% of U.S. addresses are geocoded!

– How about the rest of the world?

5

Geocoding 001

• A regular geocode may show us the address of the office, or the street entrance to the complex.

• The rooftop geocode shows us the exact location.

• Delivering a pizza to Unit 701 would be much easier with rooftop geocoding.

6

Geocoding…

• Often you will have data for which the position is known only by its street or address.

• Such data can be ‘spatially enabled’ by geocoding with the addresses or street names. This requires a street database. (!)

• Often the street database contains geographic coordinates for the centroids of the street segments. If you enter just the street name, MapInfo will geocode to the centroid of the street segment.

7

Example

8

If your street database has only street names and their centroids, you will use "First Street" to describe your first point, at 4029 First Street, when geocoding. If First Street extends from Second Ave. to Fourth Ave., your position will be given as the ‘street centroid’. If First Street extends farther east, its centroid could be even farther from your actual observation point. If First Street extends only between Second Ave. and Third Ave., your point will be assigned to the location of the ‘segment centroid’ between Second and Third Avenues.
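A short sketch of centroid geocoding, assuming straight street segments and made-up planar coordinates for the three intersections (the function names are hypothetical too). It shows how the assigned position depends on how the database splits First Street into segments.

```python
def segment_centroid(start, end):
    """Midpoint of a straight street segment given its two end points (x, y)."""
    return ((start[0] + end[0]) / 2.0, (start[1] + end[1]) / 2.0)

# Hypothetical planar coordinates (e.g. metres) for the intersections of
# First Street with Second, Third and Fourth Avenues.
second_ave = (0.0, 0.0)
third_ave = (100.0, 0.0)
fourth_ave = (200.0, 0.0)

# Case 1: the database stores First Street as one segment, Second to Fourth Ave.
street_centroid = segment_centroid(second_ave, fourth_ave)

# Case 2: the database stores the block from Second to Third Ave. on its own.
block_centroid = segment_centroid(second_ave, third_ave)

def geocode_by_street_name(address_number, centroid):
    """Name-only geocoding: the address number is ignored on purpose."""
    return centroid
```

Because the address number plays no part, any two addresses on the same segment receive identical coordinates.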

Thus…

• By this method your second observation, at 4040 First St., would be assigned the same geographic coordinates as the first.

• Clearly this method is intended for widely spaced points to be geocoded.

• Your street database may contain coordinates and address numbers for the intersections.

– For example, at the intersection of 1st St. and 2nd Ave. (I1) the address numbers for 1st St. start at 4000 and increase eastward to 4100 at the intersection of 1st St. and 3rd Ave. (I2).

– When you geocode the address 4029 First St., MapInfo will interpolate between the two intersections.

9

• Let’s say the geographic coordinates of intersection I1 are $(x_1, y_1)$ and the geographic coordinates of intersection I2 are $(x_2, y_2)$.

• The interpolated coordinates $(x, y)$ of 4029 1st St. are then:

• $x = x_1 + \frac{4029 - 4000}{4100 - 4000}\,(x_2 - x_1)$

• $y = y_1 + \frac{4029 - 4000}{4100 - 4000}\,(y_2 - y_1)$

10
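The same interpolation as a small function, offered as a sketch rather than as what MapInfo does internally. The function name and the coordinates chosen for I1 and I2 are assumptions made to keep the arithmetic easy to follow; the address numbers 4000, 4029, 4040 and 4100 come from the example.

```python
def interpolate_address(number, start_number, end_number, start_xy, end_xy):
    """Linearly interpolate an address position between two intersections.

    start_number and end_number are the address numbers at the two
    intersections (4000 at I1 and 4100 at I2 in the example); start_xy and
    end_xy are their coordinates (x1, y1) and (x2, y2).
    """
    if end_number == start_number:
        raise ValueError("address range must span more than one number")
    fraction = (number - start_number) / (end_number - start_number)
    x = start_xy[0] + fraction * (end_xy[0] - start_xy[0])
    y = start_xy[1] + fraction * (end_xy[1] - start_xy[1])
    return x, y

# Hypothetical coordinates for I1 and I2.
i1 = (0.0, 0.0)
i2 = (100.0, 50.0)

first_point = interpolate_address(4029, 4000, 4100, i1, i2)
second_point = interpolate_address(4040, 4000, 4100, i1, i2)
```

Unlike centroid geocoding, the two observation points now land at different positions along the block.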

Simply put

• In simple terms, because the street number 4029 is 29% of the way from I1 to I2, the geographic coordinates will be assigned 29% of the distance between them.

• Similarly, the second observation point, at 4040 1st St, will be assigned 40% of the way from I1 to I2.

• Importantly, the second point will not coincide with the first.

11

Uh-huh, it’s that easy…

• What about 2300 Downing St, Denver? Is that North Downing or South Downing?

• Street, road, way, circle, court, avenue…

• Centroids for large ‘regions’

• GPS – authenticated?

• Tell me more…

12
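One way to catch the first two problems before geocoding is to normalize the address text and flag a missing direction. This is a minimal sketch only: the abbreviation table, the direction list and the function name are assumptions, and a production address parser handles far more cases.

```python
# Hypothetical abbreviation table; a real one would be far longer.
SUFFIXES = {
    "street": "St", "st": "St",
    "road": "Rd", "rd": "Rd",
    "avenue": "Ave", "ave": "Ave",
    "court": "Ct", "ct": "Ct",
    "circle": "Cir", "cir": "Cir",
    "way": "Way",
}
DIRECTIONS = {"n", "s", "e", "w", "north", "south", "east", "west"}

def normalize_address(address):
    """Return (normalized_text, has_direction) for a simple street address.

    has_direction is False when no N/S/E/W word is present, which flags
    addresses such as '2300 Downing St' that may match two locations.
    """
    words = address.replace(",", " ").replace(".", " ").split()
    has_direction = any(word.lower() in DIRECTIONS for word in words)
    normalized = [SUFFIXES.get(word.lower(), word) for word in words]
    return " ".join(normalized), has_direction
```

An address that comes back with `has_direction` set to `False` is a candidate for manual checking rather than automatic matching.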

Interpolation

• Interpolation is the process of estimating the unknown value of a function based on the known values at neighboring points.

• Because in general we cannot sample every point on the ground, we sometimes interpolate between our observed points to predict, or make an estimate of, the value at some point we did not or could not sample.

• In contrast to extrapolation…

13

Random samples in a region

• For example, let’s imagine that our job is to measure radon concentrations in people’s basements for an entire county with the aim of finding ‘hot-spots’ where levels are dangerously high.

• It would be impossible to sample every home but we want to be able to warn folks whose homes are possibly ‘hot’ even though we did not sample there.

• Hence, we want to be able to make accurate predictions at unsampled points based only on our sampled points.

14

How to?

• Numerous ways to go about making these predictions.

• We will start with the inverse-distance weighting (IDW) method, used by MapInfo Professional.

• In IDW, the value at an unknown point is the weighted average of its neighbors, where the weighting is the inverse of distance raised to some power.

15

The IDW…

• In general, by weighted averages the value of an unknown point $z_j$ at $(x_j, y_j)$ is given by:

– $z_j = \frac{\sum_{i=1}^{n} w_i z_i}{\sum_{i=1}^{n} w_i}$

– $n$ is the number of sampled points and $w_i$ is the weight assigned to sampled point $i$.

• You can see that if each w = 1, this equation represents the common definition of the average; i.e., the sum of the values divided by the number of values.
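A quick check of that statement, as a sketch with made-up sample values; the function name is an assumption.

```python
def weighted_average(values, weights):
    """Return sum(w_i * z_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * z for w, z in zip(weights, values)) / total_weight

# Hypothetical sample values z_i.
z = [2.0, 4.0, 9.0]

equal = weighted_average(z, [1, 1, 1])   # reduces to the ordinary mean
skewed = weighted_average(z, [1, 1, 2])  # the last sample counts twice
```

With equal weights the result is the plain mean of the three values; giving the last sample a larger weight pulls the result toward it.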

16

Choices…

• In IDW, the value of w for a particular sampled point is determined by how far that sampled point is from the point to be estimated.

• The weight decreases as distance d increases such that $w = d^{-k}$, where k is some chosen number (in MapInfo k is between 1 and 10).
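Putting the weighted average and the distance weighting together gives the following sketch of an IDW estimator, with each weight equal to distance raised to the power -k. It uses every sample and plain Euclidean distance; the function name, the default k = 2 and the sample values are assumptions, and MapInfo's own interpolator may differ in details such as search radius.

```python
import math

def idw(samples, x, y, k=2):
    """Inverse-distance-weighted estimate at (x, y).

    samples is a list of (xi, yi, zi) tuples; k is the distance exponent,
    so each weight is d**-k.
    """
    numerator = 0.0
    denominator = 0.0
    for xi, yi, zi in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return zi  # exactly on a sample: avoid dividing by zero
        w = d ** -k
        numerator += w * zi
        denominator += w
    return numerator / denominator

# Hypothetical samples: (x, y, measured value).
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)]
estimate = idw(samples, 2.0, 2.0, k=2)
```

Because the result is a weighted average with positive weights, the estimate always lies between the smallest and largest sampled values.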

17

Sampling stations

18

Weighted Interpolation

19

Weights: distance and k…

• The exponent determines how smooth the map will be.

• The larger the exponent, the more important each sample becomes for the estimated points around it.

20
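The effect of the exponent is easiest to see by rescaling the weights so they sum to 1 and watching the share taken by the nearest sample. A small sketch, with made-up distances and a hypothetical helper name:

```python
def normalized_idw_weights(distances, k):
    """Return the IDW weights d**-k rescaled so that they sum to 1."""
    raw = [d ** -k for d in distances]
    total = sum(raw)
    return [w / total for w in raw]

# Hypothetical distances from an estimated point to three samples.
distances = [1.0, 2.0, 4.0]

for k in (1, 2, 5):
    shares = normalized_idw_weights(distances, k)
    print(k, [round(s, 3) for s in shares])
```

For these distances the nearest sample's share rises from about 0.57 at k = 1 to about 0.97 at k = 5, so with a large exponent each estimate is dominated by its closest sample and the map becomes less smooth.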

Other ways…

• Another method is to use a Gaussian function for the distance weighting if you suspect some correlation distance between the data

• The Gaussian function is:

– $w_i = \frac{1}{d_o \sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{d_i}{d_o}\right)^2\right]$

– where $d_o$ is the correlation distance, $d_i$ is the distance to point $i$, and $\pi = 3.1415926\ldots$
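The Gaussian weight translates directly into code. In this sketch the function name and the example distances are assumptions; `d0` stands for the correlation distance.

```python
import math

def gaussian_weight(d, d0):
    """Gaussian distance weight for distance d and correlation distance d0."""
    if d0 <= 0:
        raise ValueError("correlation distance d0 must be positive")
    return math.exp(-0.5 * (d / d0) ** 2) / (d0 * math.sqrt(2.0 * math.pi))

# Hypothetical distances, in the same units as d0.
narrow = [gaussian_weight(d, d0=1.0) for d in (0.0, 1.0, 3.0)]
broad = [gaussian_weight(d, d0=3.0) for d in (0.0, 1.0, 3.0)]
```

With the larger correlation distance the weights fall off much more slowly, so distant samples keep a meaningful influence on the estimate.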

21

22

The correlation distance determines how smooth the map will be. The larger the distance, the broader and flatter the weighting function.

Uncertainties…

• When the measurements at the sampled points have uncertainties with standard deviation $s_i$, the weighting also includes them:

– $z_j = \frac{\sum_{i=1}^{n} z_i w_i s_i^{-2}}{\sum_{i=1}^{n} w_i s_i^{-2}}$

• NB. the superscript -2 indicates that the weighting is according to $1/s^2$, or one over the variance.
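A sketch of the uncertainty-weighted estimate, in which each distance weight is multiplied by one over the variance of its measurement. The function name and the numbers are made up for illustration.

```python
def weighted_estimate(values, weights, sigmas):
    """Return sum(z_i w_i / s_i**2) / sum(w_i / s_i**2)."""
    numerator = 0.0
    denominator = 0.0
    for z, w, s in zip(values, weights, sigmas):
        if s <= 0:
            raise ValueError("standard deviations must be positive")
        combined = w / s ** 2  # distance weight times inverse variance
        numerator += combined * z
        denominator += combined
    return numerator / denominator

# Hypothetical case: two samples at equal distance (equal w), but the
# second measurement has three times the standard deviation of the first.
estimate = weighted_estimate([10.0, 20.0], [1.0, 1.0], [1.0, 3.0])
```

The estimate ends up much closer to the first value, because the less certain measurement carries only one ninth of the weight.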

23

Leads us to sampling

• Interpolation is the process of estimating, or predicting, values of some spatial property at points between the points at which the property has been measured (sampled).

24

Sampling methods

• Random – samples are randomly distributed in the region of interest

• Regular – sampling on a grid

• Transect – samples are along lines crossing the region

• Cluster – samples are clustered

• Contour – samples are made along contours (often used when digitizing from a map)
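Four of these layouts are easy to generate for a rectangular study area, which can help when planning a survey or testing an interpolator. This is a sketch with hypothetical function names; contour sampling is left out because it needs an existing surface or map to follow.

```python
import random

def random_samples(n, width, height, seed=0):
    """n points drawn uniformly at random inside a width x height rectangle."""
    rng = random.Random(seed)
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]

def regular_samples(nx, ny, width, height):
    """nx x ny grid points including the rectangle's edges (nx, ny >= 2)."""
    xs = [i * width / (nx - 1) for i in range(nx)]
    ys = [j * height / (ny - 1) for j in range(ny)]
    return [(x, y) for y in ys for x in xs]

def transect_samples(n, start, end):
    """n evenly spaced points on the line from start to end (n >= 2)."""
    return [
        (start[0] + t * (end[0] - start[0]), start[1] + t * (end[1] - start[1]))
        for t in (i / (n - 1) for i in range(n))
    ]

def cluster_samples(n, center, spread, seed=0):
    """n points scattered around center with Gaussian spread."""
    rng = random.Random(seed)
    return [(rng.gauss(center[0], spread), rng.gauss(center[1], spread))
            for _ in range(n)]
```

Each function returns a list of (x, y) tuples, so the layouts can be fed to the same interpolation routine and compared.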

25

Choices, choices…

• The sampled points can be distributed in a regular or irregular grid.

• Often interpolation is used to produce a regular grid out of an unevenly sampled distribution.

• A regular grid (raster model) is necessary to examine certain properties of the data, such as slope and curvature, or to produce a more understandable presentation in the form of surface plots.

26

Factors for sampling…

• Sampling is often dictated by economics or by logistics but also must reliably account for the spatial variations in the quantity of interest.

• Widely spaced samples may miss the short-wavelength variations and can lead to aliasing of the signal.

27

Sampling theorem

• In essence, the theorem shows that a band-limited analog signal that has been sampled can be perfectly reconstructed from an infinite sequence of samples if the sampling rate exceeds 2B samples per second, where B is the highest frequency of the original signal.

• If a signal contains a component at exactly B hertz, then samples spaced at exactly 1/(2B) seconds do not completely determine the signal… (Wikipedia)
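The second point can be checked numerically. In this sketch (the function name and the value B = 1 Hz are assumptions), a zero-phase sine at B hertz sampled every 1/(2B) seconds is hit only at its zero crossings, so the samples cannot be told apart from a flat line.

```python
import math

def sample_sine(frequency_hz, sample_rate_hz, n_samples, phase=0.0):
    """Sample sin(2*pi*f*t + phase) at t = 0, 1/rate, 2/rate, ..."""
    return [
        math.sin(2.0 * math.pi * frequency_hz * (i / sample_rate_hz) + phase)
        for i in range(n_samples)
    ]

B = 1.0  # highest frequency in the signal, in hertz (hypothetical value)

# Sampling at exactly 2B: every sample of a zero-phase sine lands on a zero crossing.
critical = sample_sine(B, 2.0 * B, 8)

# Sampling faster than 2B: the oscillation shows up in the samples.
faster = sample_sine(B, 8.0 * B, 8)
```

The same reasoning carries over to spatial data, with distance in place of time and wavelength in place of period.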

28

Aliased Sampling

• In this plot the sampled points (triangles) indicate a decreasing trend, whereas the actual trend is approximately sinusoidal
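A numeric illustration of the same effect, using assumed values rather than the data in the plot: a 0.9 Hz sinusoid sampled once per second produces exactly the samples of a much slower 0.1 Hz sinusoid. The function name is hypothetical.

```python
import math

def alias_frequency(signal_hz, sample_rate_hz):
    """Apparent frequency of a sinusoid after sampling at sample_rate_hz."""
    nearest_multiple = round(signal_hz / sample_rate_hz) * sample_rate_hz
    return abs(signal_hz - nearest_multiple)

# Hypothetical numbers: a 0.9 Hz sinusoid sampled once per second.
signal_hz, sample_rate_hz = 0.9, 1.0
apparent_hz = alias_frequency(signal_hz, sample_rate_hz)

# The samples of the 0.9 Hz sine match a sign-flipped 0.1 Hz sine.
times = [n / sample_rate_hz for n in range(10)]
true_samples = [math.sin(2 * math.pi * signal_hz * t) for t in times]
alias_samples = [-math.sin(2 * math.pi * apparent_hz * t) for t in times]
```

Over a short run, a few samples of that slow alias can look like a steady trend even though the underlying signal is oscillating.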

29

30

Tips

• Sampling on a regular grid can lead to aliasing if the grid spacing is too wide.

• If you want to sample on a grid (regular sampling), first estimate the wavelength of the spatial variations by making some close samples.

• Random sampling allows you to estimate the wavelengths of the variations in your sampling but can leave large un-sampled regions.

31

Summary

• Three more topics for GIS (for Science)

– Geocoding

– Interpolation

– Sampling

• For learning purposes remember:

– Demonstrate proficiency in using geospatial applications and tools (commercial and open-source).

– Present verbally relational analysis and interpretation of a variety of spatial data on maps.

– Demonstrate skill in applying database concepts to build and manipulate a spatial database, SQL, spatial queries, and integration of graphic and tabular data.

– Demonstrate intermediate knowledge of geospatial analysis methods and their applications.

32

Reading for this week

• Chapter 13: Putting your Data on a Map (Geocoding, pp. 353-364)

• Chapter 9: … Thematic maps and Grid surface maps (Interpolation, pp. 264-266)

• Sampling theorem (self directed)

33

Friday Feb. 17th

• Lab session – with a walk-through of examples first

• 12pm-~1:40pm (attendance will be <obvious> ;-) )

• Hands on (as well as layering, etc.)

– Geocoding

– Interpolation

– Sampling

• Assignment due 5pm

34

Next classes

• Introduction to geostatistics.

• Interpolation techniques continued (trend surfaces, Thiessen polygons, splines)

• Lab on Friday (24th)

35