Geographic Information System - York University notes/lecture9.pdf · 2010-03-14 · 1 GEOG4340 Q....
GEOG4340, Q. Cheng
Geographic Information System
Lecture Nine: Data Integration
Spatial Decision Support System (SDSS)
Mapping areas for drilling in the mining industry
Multivariate Logistic Regression
3D Simulation and Decision Making
Various Types of Data Integration
Combining multiple layers of geoinformation for decision making:
•Multi-sources
•Multi-scales
•Multi-formats
•Multi-owners
•Multi-temporal captures
Spatial Decision Support System: Multiple Map Modeling
Decision Theory is concerned with the logic by which one arrives at a choice between alternatives:
•Alternative actions
•Alternative hypotheses
•Alternative objects
•and so on
Potential Applications
•Site Selection
•Suitability Assessment
•Favorability Assessment
•Probability Assessment
Spatial Decision Support System (SDSS): GIS Data Integration for Prediction
[Figure: GIS data integration for prediction. GIS data sources (remote sensing, geological, geochemical, geophysical, geographical, ..., each held in a DBMS) pass through processing (data preprocessing, interpreting, information extraction) to give evidential layers (X); integration modeling (F) then produces the output data (S), a potential map.]
Suitability Map for Planning School
A General Spatial Modeling Process
•Stating the problem
•Breaking the problem down
•Exploring input datasets
•Determining analysis processes
•Verifying the model’s result
•Implementing the result and reporting
Model of processes for finding distance from recreational facilities:
Recreational Site → Buffer → Distance to Rec. Site → Reclassify → Rec. Site Buffer
Model of processes for finding distance beyond existing schools:
Schools → Buffer → Distance to School → Reclassify → School Buffer
Model of processes for finding relatively flat areas:
Elevation → Slope → Slope Map → Reclassify → Slope Classes
Model of processes for finding suitable landuse types:
Landuse Map → Reclassify → Landuse Classes
Model Constraints
Normalization:
1. Convert maps into comparable units:
$$x_i = \begin{cases} 1, & \text{yes, present, suitable} \\ 0, & \text{no, absent, not suitable} \end{cases} \quad \text{or} \quad x_i = 1, 2, \ldots, 10$$
2. Assign weights to each map as percentages:
$$w_1 + w_2 + \cdots + w_n = 1, \quad 0 \le w_i \le 1$$
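The two normalization steps can be sketched in Python. This is a minimal illustration, not the method used in the lecture: the equal-interval classing, the function names `reclassify` and `weights_are_valid`, and the 0 to 1000 value range are all assumptions made for the example.

```python
def reclassify(value, lo, hi, classes=10):
    """Map a raw value in [lo, hi] to an integer class 1..classes (equal intervals)."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clamped = min(max(value, lo), hi)          # values outside the range go to the end classes
    index = int((clamped - lo) / (hi - lo) * classes)
    return min(index, classes - 1) + 1         # the top edge falls into the last class


def weights_are_valid(weights, tol=1e-9):
    """Check 0 <= w_i <= 1 for every weight and that the weights sum to 1."""
    return all(0.0 <= w <= 1.0 for w in weights) and abs(sum(weights) - 1.0) < tol


# Hypothetical distances in metres, classed on a 1..10 scale
scores = [reclassify(d, 0, 1000) for d in (0, 550, 1000)]
ok = weights_are_valid([0.50, 0.25, 0.125, 0.125])
```

Whether a high class should mean "near" or "far" depends on the criterion, so a real model may need to reverse the scale for some layers.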
Model of processes for combining diverse maps:
Landuse, Slope, Dist. Rec. Site, Dist. School → Calculator → Suitability Map
Combine Maps
Grid calculator with equation:
S = 0.50 rec_site + 0.25 dist_school + 0.125 rec_landuse + 0.125 rec_slope
Map S has values between 1 and 10.
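The grid calculator equation is a cell-by-cell weighted sum. A small pure-Python sketch follows; the weights are those of the equation above, while the 2x2 grids and the helper name `combine_maps` are hypothetical and stand in for real reclassified rasters.

```python
def combine_maps(layers, weights):
    """Cell-by-cell weighted sum of equally sized grids (lists of rows)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    names = list(weights)
    n_rows = len(layers[names[0]])
    n_cols = len(layers[names[0]][0])
    return [
        [sum(weights[n] * layers[n][r][c] for n in names) for c in range(n_cols)]
        for r in range(n_rows)
    ]


weights = {"rec_site": 0.50, "dist_school": 0.25, "rec_landuse": 0.125, "rec_slope": 0.125}

# Hypothetical reclassified layers, each already scored 1..10
layers = {
    "rec_site":    [[10, 1], [5, 8]],
    "dist_school": [[10, 1], [4, 8]],
    "rec_landuse": [[10, 1], [6, 4]],
    "rec_slope":   [[10, 1], [2, 4]],
}

S = combine_maps(layers, weights)
```

Because every input is on the 1 to 10 scale and the weights sum to 1, every cell of S also lies between 1 and 10.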
Model Validation
Are the criteria reasonable?
Is the model valid?
Does the result meet the requirement?
Are there errors related to the result?
Are all data used necessary?
General Data Integration Model for SDSS
$$S = F(x_1, x_2, \ldots, x_n)$$
•S: index map showing suitability or probability
•x_i: maps or evidences
•w_i: weights
Simple Linear Model
$$S = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n$$
Model Constraints
Normalization:
1. Convert maps into comparable units:
$$x_i = \begin{cases} 1, & \text{yes} \\ 0, & \text{no} \end{cases} \quad \text{or} \quad x_i = \begin{cases} 1, & \text{yes} \\ 0.5, & \text{unknown} \\ 0, & \text{no} \end{cases} \quad \text{or} \quad x_i = 1, 2, \ldots, 10$$
2. Weights showing the relative importance of maps:
$$w_1 + w_2 + \cdots + w_n = 1, \quad 0 \le w_i \le 1$$
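As an illustration of the simple linear model with three-valued evidence (yes = 1, unknown = 0.5, no = 0), here is a minimal Python sketch. The layer names, the weights and the function name `linear_score` are hypothetical and chosen only for the example.

```python
# Scores for three-valued evidence, as in the normalization above
SCORES = {"yes": 1.0, "unknown": 0.5, "no": 0.0}


def linear_score(evidence, weights):
    """S = w_1 x_1 + ... + w_n x_n for one location."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * SCORES[evidence[name]] for name, w in weights.items())


# Hypothetical layers and weights
weights = {"geology": 0.5, "geochemistry": 0.25, "geophysics": 0.25}
cell = {"geology": "yes", "geochemistry": "unknown", "geophysics": "no"}

S = linear_score(cell, weights)
```

With the weights constrained to sum to 1, S stays between 0 and 1 for any combination of evidence.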
Methods for Calculating Weights for Data Integration
Data-driven methods:
•Weights of evidence
•Logistic regression
•Artificial neural network
Knowledge-driven methods:
•Fuzzy logic
Hybrid methods:
•Fuzzy weights of evidence
Model Types
1. Probabilistic: S is a random variable showing probability, 0 ≤ S ≤ 1, with uncertainty
2. Deterministic: S is a score, 0 ≤ S ≤ 1
Relationships Between Different Models
Simple Overlay Model (Union, Intersect, Identity)
Linear Model (adding weights)
Logistic Model (Weights of Evidence, Logistic Regression)
Fuzzy Logic model (various operators)
Spatial Data Modeler Extension: Arc-SDM
Weights of Evidence
Logistic Regression
Fuzzy Logic
Neural Network
ID  PolyA  PolyB  Points  Area  Points/area
1   A      B      4       7     0.57
2   A      not B  2       13    0.15
3   not A  not B  1       35    0.02
4   not A  B      3       45    0.06
[Figure: map of the four overlay patterns (A B, A not B, not A B, not A not B), labelled with point densities 0.57, 0.15, 0.06 and 0.02.]
Concept of prior probability and posterior probability
Prior probability: total number of points / total area = 10/100 = 0.10 (10%)
Posterior probability: number of points / pattern area (density of points per area), P(D|AB)
Percentage of points: number of points on pattern / total number of points, P(AB|D)
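The three quantities can be computed directly from point counts and areas. The sketch below uses the counts (4, 2, 1, 3 points) and areas (7, 13, 35, 45) from the overlay table above; the dictionary layout and variable names are assumptions made for the example.

```python
# (points, area) for each overlay pattern, taken from the table above
patterns = {
    "A B":         (4, 7),
    "A not B":     (2, 13),
    "not A not B": (1, 35),
    "not A B":     (3, 45),
}

total_points = sum(points for points, _ in patterns.values())
total_area = sum(area for _, area in patterns.values())

# Prior probability: overall density of points
prior = total_points / total_area

# Posterior probability: density of points within each pattern, P(D|pattern)
posterior = {name: points / area for name, (points, area) in patterns.items()}

# Percentage of points: share of all points falling on each pattern, P(pattern|D)
point_share = {name: points / total_points for name, (points, _) in patterns.items()}
```

Comparing each posterior with the prior shows which patterns concentrate points: A B has a density several times the prior, while not A not B is well below it.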
Three patterns: trees, lake and road buffer
[Figure: overlay of three patterns A, B and C, with regions labelled A B C, A not B C, not A B C, not A not B C, A B not C, A not B not C, not A B not C and not A not B not C.]
P(AB|D) = 0.5
Percentage of points (values expected under independence in parentheses)
         A            not A        Total
B        0.4 (0.42)   0.3 (0.28)   0.7
not B    0.2 (0.18)   0.1 (0.12)   0.3
Total    0.6          0.4          1
Joint probabilities: P(AB|D), P(notA B|D), P(A notB|D), P(notA notB|D)
Marginal probabilities: P(A|D), P(notA|D), P(B|D), P(notB|D)
Percentage of points of independent events:
P(AB|D) = P(A|D) P(B|D)
Percentage of points on AB = % points on A × % points on B
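The independence check amounts to comparing each observed joint percentage with the product of its marginals. A minimal Python sketch using the percentage-of-points values above (0.4, 0.3, 0.2, 0.1); the variable names are assumptions made for the example.

```python
# Observed joint percentages of points, P(pattern | D)
joint = {
    ("A", "B"): 0.4,
    ("not A", "B"): 0.3,
    ("A", "not B"): 0.2,
    ("not A", "not B"): 0.1,
}

# Marginal percentages: sum the joint values over the other pattern
p_a = {a: sum(v for (x, _), v in joint.items() if x == a) for a in ("A", "not A")}
p_b = {b: sum(v for (_, y), v in joint.items() if y == b) for b in ("B", "not B")}

# Values expected if A and B were independent given D
expected = {(a, b): p_a[a] * p_b[b] for (a, b) in joint}

# Difference between observed and expected joint percentages
deviation = {key: joint[key] - expected[key] for key in joint}
```

Small deviations (here at most about 0.02) suggest the two patterns are close to conditionally independent with respect to the points.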
Percentage of areas (values expected under independence in parentheses)
         A             not A         Total
B        0.07 (0.10)   0.13 (0.10)   0.20
not B    0.45 (0.42)   0.35 (0.38)   0.80
Total    0.52          0.48          1
Joint probabilities: P(AB), P(notA B), P(A notB), P(notA notB)
Marginal probabilities: P(A), P(notA), P(B), P(notB)
Percentage of areas of independent events:
P(AB) = P(A) P(B)
% area of AB = % area of A × % area of B
Bayes's Rule: Probability map
P(D|A) = P(D) P(A|D)/P(A)
P(D|notA) = P(D) P(notA|D)/P(notA)
P(D|AB) = P(D) [P(A|D)/P(A)] [P(B|D)/P(B)]
P(D|A notB) = P(D) [P(A|D)/P(A)] [P(notB|D)/P(notB)]
P(D|notA B) = P(D) [P(notA|D)/P(notA)] [P(B|D)/P(B)]
P(D|notA notB) = P(D) [P(notA|D)/P(notA)] [P(notB|D)/P(notB)]
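To see the rule at work, the sketch below applies it to the marginal percentages from the two tables above (points: P(A|D) = 0.6, P(B|D) = 0.7; areas: P(A) = 0.52, P(B) = 0.20) with the prior P(D) = 0.10. The function name `posterior` and the boolean arguments are assumptions made for the example, and the product form presumes A and B are conditionally independent.

```python
P_D = 0.10                     # prior probability
P_A_GIVEN_D, P_A = 0.6, 0.52   # share of points on A, share of area in A
P_B_GIVEN_D, P_B = 0.7, 0.20   # share of points on B, share of area in B


def ratio(p_given_d, p, present):
    """P(X|D)/P(X) if the pattern is present, else P(notX|D)/P(notX)."""
    if present:
        return p_given_d / p
    return (1.0 - p_given_d) / (1.0 - p)


def posterior(a_present, b_present):
    """P(D | A-state, B-state) as the prior times one ratio per pattern."""
    return (P_D
            * ratio(P_A_GIVEN_D, P_A, a_present)
            * ratio(P_B_GIVEN_D, P_B, b_present))


probability_map = {
    "A B": posterior(True, True),
    "A not B": posterior(True, False),
    "not A B": posterior(False, True),
    "not A not B": posterior(False, False),
}
```

Each ratio above 1 raises the posterior relative to the prior and each ratio below 1 lowers it, which is what the probability map visualizes.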
Bayes's Rule: Log (Probability)
log[P(D|A)] = log[P(D)] + log[P(A|D)/P(A)] = log[P(D)] + WA+
log[P(D|notA)] = log[P(D)] + log[P(notA|D)/P(notA)] = log[P(D)] + WA-
where
WA+ = log[P(A|D)/P(A)]
WA- = log[P(notA|D)/P(notA)]
If A, B, C are conditionally independent, then
log[P(D|AB)] = log[P(D)] + WA+ + WB+
log[P(D|A notB)] = log[P(D)] + WA+ + WB-
log[P(D|notA B)] = log[P(D)] + WA- + WB+
log[P(D|notA notB)] = log[P(D)] + WA- + WB-
In terms of percentages:
WA+ = log[P(A|D)/P(A)] = log[(% points on A) / (% area of A)]
WA+ > 0 positive correlation between A and points
WA+ = 0 no correlation between A and points
WA+ < 0 negative correlation between A and points
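A short Python sketch of the two weights, using the definitions WA+ = log[P(A|D)/P(A)] and WA- = log[P(notA|D)/P(notA)] from this lecture. The natural logarithm is an assumption (the slides only write "log"), the function name `weights` is hypothetical, and the inputs are the example percentages for pattern A (60% of points on 52% of the area).

```python
import math


def weights(pct_points_on_a, pct_area_of_a):
    """Return (W+, W-) for a binary pattern A, using natural logarithms."""
    w_plus = math.log(pct_points_on_a / pct_area_of_a)
    w_minus = math.log((1.0 - pct_points_on_a) / (1.0 - pct_area_of_a))
    return w_plus, w_minus


# Pattern A: 60% of the points fall on 52% of the area
w_plus_a, w_minus_a = weights(0.6, 0.52)

# A pattern holding points in exact proportion to its area carries no information
w_plus_none, w_minus_none = weights(0.3, 0.3)
```

Here WA+ is positive because A holds more than its proportional share of points, and WA- is negative because the area outside A holds less.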
Spatial Association Index
Contrast
C = WA+ - WA-
(1) -∞ < C < ∞
(2) C = 0: A and D are independent
(3) C > 0: positive correlation between D and A
(4) C < 0: negative correlation between D and A
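The contrast can be sketched in Python from the same two percentages used for the weights. As before, the natural logarithm and the function name `contrast` are assumptions; the inputs are the example percentages for patterns A (points 0.6, area 0.52) and B (points 0.7, area 0.20).

```python
import math


def contrast(pct_points, pct_area):
    """C = W+ - W-, with W+ = log[P(A|D)/P(A)] and W- = log[P(notA|D)/P(notA)]."""
    w_plus = math.log(pct_points / pct_area)
    w_minus = math.log((1.0 - pct_points) / (1.0 - pct_area))
    return w_plus - w_minus


c_a = contrast(0.6, 0.52)    # pattern A
c_b = contrast(0.7, 0.20)    # pattern B
c_indep = contrast(0.5, 0.5) # points spread in proportion to area
```

Under these numbers B has a much larger contrast than A, so B would be read as the more strongly associated pattern.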
Logistic Model for SDSS
$$\operatorname{logit}\{D \mid AB\ldots\} = W_0 + W_A + W_B + \cdots$$
Prior probability: P(D)
Posterior probability: P(D|A), P(D|B), P(D|AB...)
#(D) = 20, Area(T) = 7780, P(D) = 0.0026
Area(A) = 3065, #(D|A) = 15, P(D|A) = 0.0049
Area(B) = 4175, #(D|B) = 19, P(D|B) = 0.0045
Area(AB) = 1624.27, #(D|AB) = 13, P(D|AB) = 0.008
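The prior and posterior probabilities in this example are point densities, and the logit links a probability to the scale on which the weights add. The sketch below recomputes the densities from the counts and areas given above; the helper names `density`, `logit` and `inv_logit` are assumptions made for the example.

```python
import math


def density(points, area):
    """Number of points per unit area."""
    return points / area


def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Probability corresponding to a logit value z."""
    return 1.0 / (1.0 + math.exp(-z))


prior = density(20, 7780)        # P(D)
post_a = density(15, 3065)       # P(D|A)
post_b = density(19, 4175)       # P(D|B)
post_ab = density(13, 1624.27)   # P(D|AB)

# How much more likely a point is on the overlap of A and B than on average
lift_ab = post_ab / prior
```

The overlap of A and B is roughly three times as dense in points as the study area as a whole.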
Au, W, As
Au-Sn-W-As: multiple elements
Spatial Data Modeler Extension: Arc-SDM
Weights of Evidence
Logistic Regression
Fuzzy Logic
Neural Network
Logistic Model for SDSS
$$\operatorname{logit}\{D \mid AB\ldots\} = W_0 + W_A + W_B + \cdots$$
Posterior probability: P(D|AB...)
Prediction of Potential Flowing Wells in the ORM
[Figure: Flowing wells and springs.]
[Figure: Flowing wells vs. distance from ORM. x-axis: distance (0 to 15000); y-axis: spatial correlation (-8 to 12).]
[Figure: Flowing wells vs. distance from high slope zone. x-axis: distance (0 to 8000); y-axis: spatial correlation (-8 to 8).]
[Figure: Flowing wells vs. thickness of drift.]
[Figure: Flowing wells vs. distance from thick drift. x-axis: distance (0 to 20000); y-axis: spatial correlation (-4 to 8).]
Potential Locations of Flowing Wells by SDSS: Weights of Evidence (Cheng, 2001)
Results obtained by Weights of Evidence and Logistic Regression Methods
Theme                                                                 Area %    Points %  Contrast  t-value   LR coeff. LR std
Buffer zone (1~1km) around steep slope                                63        80        0.89      6.54      0.88      0.14
Buffer zone (1~5km) around the ORM                                    40.3      55.3      0.62      5.68      0.28      0.12
Buffer zone around steep slope of lower sand/gravel top               10.1      52.7      0.62      5.68      1.77      0.12
Ratio of sand/gravel unit cumulative thickness in well depth (6~25%)  43.5      65.1      0.90      7.88      0.49      0.12
Buffer zone (1~2.5km) of thick drift area                             16.2      29.3      0.79      6.56      0.45      0.13
Elevation of the upper confined aquifers at 356~375 (m a.s.l.)        16.8      38.6      1.18      10.44     0.47      0.13
Elevation of the lower confined aquifers at 311~347 (m a.s.l.)        40.5      58.2      0.73      6.60      0.28      0.13
Steep slope of confined aquifer surface                               22.9      54.7      1.45      13.19     0.97      0.12
Buffer zone (0~2km) around the small ponds                            61.1      74.9      0.67      5.31      0.43      0.13
Intercept constant                                                                                            -11.62    0.52
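As a rough illustration of how the logistic regression column could be used, the sketch below adds the intercept and the coefficients of the themes present at a location and converts the resulting logit to a probability. It assumes each theme is coded as binary evidence (1 = present, 0 = absent), which the table does not state; the short theme keys and the function name `predict` are hypothetical labels for the table rows.

```python
import math

INTERCEPT = -11.62

# LR coefficients from the table; keys are shortened, hypothetical labels
COEFFS = {
    "steep_slope_buffer": 0.88,
    "orm_buffer": 0.28,
    "lower_sand_gravel_slope_buffer": 1.77,
    "sand_gravel_ratio": 0.49,
    "thick_drift_buffer": 0.45,
    "upper_aquifer_elevation": 0.47,
    "lower_aquifer_elevation": 0.28,
    "aquifer_surface_steep_slope": 0.97,
    "small_ponds_buffer": 0.43,
}


def predict(present):
    """Return (logit, probability) for the set of themes present at a location."""
    z = INTERCEPT + sum(c for name, c in COEFFS.items() if name in present)
    return z, 1.0 / (1.0 + math.exp(-z))


z_all, p_all = predict(set(COEFFS))   # every theme present
z_none, p_none = predict(set())       # no theme present
```

Under this assumed coding, the difference between `p_all` and `p_none` indicates how strongly the combined evidence raises the predicted probability relative to a location with no favourable theme.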
Multivariate Logistic Regression