Evaluation of Five GIS based Interpolation Techniques for Estimating the Radon
Concentration for Unmeasured Zip Codes in the State of Ohio
By
Suman Maroju
Department of Civil Engineering
The University of Toledo
Advisor: Ashok Kumar PhD
Introduction
Radon is a naturally occurring radioactive gas produced by the breakdown of uranium in soil, rock and water.
Radon is the second most common cause of lung cancer after cigarette smoking, accounting for 15,000 to 22,000 cancer deaths per year in the US alone, according to the National Cancer Institute (USA).
Radon gas is believed to cause about 14% of lung cancer deaths
(1000+ deaths) in Ohio annually.
45% of homes in Ohio exceed the USEPA action level.
62.5% of schools in Ohio have at least one room in excess of the
USEPA action level
Data Collection
Data collected from various county health departments, commercial testing services and university researchers.
Original database: Kumar et al. (1990)
1996 and 1997: 82,000
New data being constantly added
Total of 130,826 observations used in this study
Objectives
To evaluate the best interpolation technique for the radon data set.
To perform this interpolation technique on the whole radon data set, obtain a prediction map and estimate concentrations for unmeasured zip codes.
To present the impact of the results obtained from this study.
ArcGIS Geostatistical Analyst
Geostatistical Analyst provides a wide variety of tools for spatial data exploration, identification of data anomalies, evaluation of error in prediction surface models, statistical estimation and optimal surface creation.
Exploratory Spatial Data Analysis (ESDA) Tool
The ESDA tools are designed to explore the distribution of data, look for global trends in the data, examine spatial autocorrelation and understand the correlation between multiple data sets.
Tools include Histogram, Normal QQ Plot, Trend Analysis and Semivariogram/Covariance Cloud.
Histogram
The Histogram tool in ESDA provides a univariate (one-variable) description of the data.
The plot shows the frequency distribution for the radon data set.
Normal QQ Plot
The QQ Plot compares the distribution of the data to a standard normal distribution.
Trend Analysis
The Trend Analysis tool can help identify global trends in the input data set.
[Figure: trend analysis plot showing the North-South trend line and the East-West trend line along the North-South and East-West axes]
Semivariogram/Covariance Cloud
Semivariogram points representing pairs of locations
Approach
The geometric mean of radon concentration values is entered for each zip code, and zero values are assigned to the zip codes that are not measured.
The polygon features of the Ohio zip codes shape file are converted into point features to serve as the point data source for the interpolation techniques.
The point-feature shape file is then divided into two shape files: one containing 1066 zip codes with radon concentration data and the other containing 796 zip codes with no measured radon concentration data.
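The per-zip-code geometric mean described above can be sketched as follows. This is a minimal illustration, not the study's actual workflow: the function name, the zip code labels and the readings are hypothetical, and skipping non-positive readings is an assumption made here because the logarithm is undefined for them.

```python
import math
from collections import defaultdict


def geometric_mean_by_zip(observations):
    """Return {zip_code: geometric mean} for (zip_code, concentration) pairs.

    Non-positive readings are skipped (an assumption of this sketch),
    because their logarithm is undefined.
    """
    logs = defaultdict(list)
    for zip_code, value in observations:
        if value > 0:
            logs[zip_code].append(math.log(value))
    # exp(mean(log(x))) is the geometric mean of the positive readings.
    return {z: math.exp(sum(v) / len(v)) for z, v in logs.items()}


# Hypothetical readings in pCi/l, for illustration only.
sample = [("zip_a", 2.0), ("zip_a", 8.0), ("zip_b", 3.0)]
gm = geometric_mean_by_zip(sample)
```

The geometric mean is less sensitive than the arithmetic mean to a few very high readings, which is a common reason for using it with skewed concentration data.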
Approach
The first step is to evaluate the best interpolation technique.
The point-feature shape file is divided into 80% training data points and 20% test data points.
Sensitivity analysis for division of data set
Then the different interpolation techniques are executed using the training data points, which creates a layer of spatial variation, and the predictions are evaluated for test data points.
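A sketch of the 80%/20% division of the measured points is shown below. The slides do not say how the subsets were drawn, so the seeded random shuffle, the function name `split_points` and the use of placeholder indices for the 1066 measured zip codes are all assumptions of this example.

```python
import random


def split_points(points, train_fraction=0.8, seed=0):
    """Shuffle a copy of the points and split it into (training, test) lists."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# The 1066 measured zip codes, represented here by placeholder indices.
train, test = split_points(range(1066))
```

Passing `train_fraction=0.7` or `0.6` would give the other divisions considered in the sensitivity analysis.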
Approach
Second part
- Best interpolation technique is chosen based on values of statistical parameters.
- Modeling is done for the whole radon data set, which creates a surface of spatial variation, and the predictions for unmeasured zip codes (where no data is collected) are evaluated from the surface created.
Interpolation methods
Five Interpolation Techniques
Ordinary Kriging
Inverse Distance Weighting (IDW)
Radial Basis Function (RBF)
Local Polynomial Interpolation
Global Polynomial Interpolation
Ordinary Kriging
Kriging is divided into two distinct tasks: quantifying the spatial structure of the data and producing a prediction.
Quantifying the spatial structure, known as variography, means fitting a spatial dependence model to the data.
The prediction for the unknown value at a specific location is made using the fitted model from the variography, the spatial data configuration and the values of the measured sample points around the prediction location.
Ordinary Kriging
The equation used in Ordinary Kriging is:
$$Z^*(u) = \sum_{\alpha=1}^{n(u)} \lambda_\alpha(u)\, Z(u_\alpha) + \left[ 1 - \sum_{\alpha=1}^{n(u)} \lambda_\alpha(u) \right] m$$
where
Z*(u) is the Ordinary Kriging estimate at spatial location u,
n(u) is the number of data used at the known locations given a neighborhood,
Z(u_α) are the n measured data at locations u_α located close to u,
m is the mean of the distribution.
Ordinary Kriging
λ_α(u) are the weights for location u_α, computed from the spatial covariance matrix based on the spatial continuity (variogram) model, which is given by:
$$\gamma(h) = \frac{1}{2n} \sum_{i=1}^{n} \left[ z(u_i) - z(u_i + h) \right]^2$$
where
n is the number of data pairs separated by distance h,
z(u_i) and z(u_i + h) are the data values at locations separated by distance h.
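The semivariogram formula, γ(h) equal to the sum of squared differences over all n pairs separated by h divided by 2n, can be sketched for scattered points as follows. Real data rarely has pairs at exactly distance h, so this example assumes a `tolerance` band around the lag; that parameter, the function name and the sample points are hypothetical.

```python
import math
from itertools import combinations


def empirical_semivariogram(points, lag, tolerance):
    """Estimate gamma(h) at h = lag from (x, y, z) tuples.

    Pairs whose separation distance lies within `tolerance` of `lag`
    are treated as being separated by h.
    """
    squared_diffs = []
    for (x1, y1, z1), (x2, y2, z2) in combinations(points, 2):
        distance = math.hypot(x2 - x1, y2 - y1)
        if abs(distance - lag) <= tolerance:
            squared_diffs.append((z1 - z2) ** 2)
    n = len(squared_diffs)
    if n == 0:
        return None  # no pairs found at this lag
    return sum(squared_diffs) / (2 * n)


# Hypothetical points on a line, one unit apart.
pts = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 4.0)]
gamma_1 = empirical_semivariogram(pts, lag=1.0, tolerance=0.1)
gamma_2 = empirical_semivariogram(pts, lag=2.0, tolerance=0.1)
```

Evaluating the function over a series of lags gives the points to which a model such as the spherical model is fitted.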
Ordinary Kriging
There are three primary parameters that describe the autocorrelation of radon concentrations: range, nugget and sill.
- The range (46.55) is the distance at which the best-fit line starts to level off. Data within the range are spatially correlated.
- The sill (0.2869) is the maximum semivariogram value.
- The nugget (0.20487) is the data variation due to measurement errors.
[Figure: semivariogram fitted with a spherical model, with the range, sill and nugget marked]
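To show how the three parameters shape the fitted curve, here is a sketch of the standard spherical semivariogram model evaluated with the range, sill and nugget reported above. It assumes the reported sill (0.2869) is the total sill, so the partial sill is the sill minus the nugget; the slides do not state this explicitly, and the function name is hypothetical.

```python
def spherical_semivariogram(h, nugget, sill, range_):
    """Spherical model: rises from the nugget and levels off at the sill.

    Assumes `sill` is the total sill (nugget plus partial sill).
    """
    if h <= 0:
        return 0.0  # by definition gamma(0) = 0
    if h >= range_:
        return sill  # beyond the range the curve is flat
    ratio = h / range_
    return nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)


# Parameters reported for the radon data set.
NUGGET, SILL, RANGE = 0.20487, 0.2869, 46.55

half_range_value = spherical_semivariogram(RANGE / 2, NUGGET, SILL, RANGE)
```

With these values the nugget accounts for most of the sill, which suggests that much of the variation between zip codes is not explained by distance alone.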
Inverse Distance Weighting (IDW)
IDW interpolation assumes that things close to one another are more alike than those farther apart.
To predict a value for any unmeasured location, IDW will use the measured values surrounding the prediction location.
Measured values closest to the prediction location will have more influence on the predicted value than those farther away.
IDW assumes that each measured point has a local influence that diminishes with distance.
Inverse Distance Weighting
A simple IDW weighting function, as defined by Shepard, is:
$$w(d) = \frac{1}{d^{p}}$$
where
w(d) is the weighting factor applied to a known value,
d is the distance between the known point and the prediction location,
p is the power parameter (most common value is 2).
A general form of interpolating a value using IDW is:
$$\hat{z}(u) = \frac{\sum_{i=1}^{n} w(d_i)\, z(u_i)}{\sum_{i=1}^{n} w(d_i)}$$
where z(u_i) are the n measured values and d_i is the distance from u_i to the prediction location u.
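A minimal sketch of IDW prediction, using Shepard's weight w(d) = 1/d^p and a weighted average of the measured values. It uses every sample rather than a search neighborhood, which is a simplification, and the function name and sample values are hypothetical.

```python
import math


def idw_predict(x, y, samples, power=2.0):
    """Predict the value at (x, y) from (xi, yi, zi) samples with w = 1 / d**power."""
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, zi in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return zi  # at a sample location, return the measured value
        w = 1.0 / d ** power
        weighted_sum += w * zi
        weight_total += w
    return weighted_sum / weight_total


# Two hypothetical measured points, two units apart.
samples = [(0.0, 0.0, 2.0), (2.0, 0.0, 4.0)]
midpoint = idw_predict(1.0, 0.0, samples)
near_first = idw_predict(0.5, 0.0, samples)
```

Because the result is a weighted average with positive weights, an IDW prediction always lies between the minimum and maximum of the measured values used.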
Radial Basis Function (RBF)
RBF is an exact interpolation technique in the sense that the surface created must go through each measured sample value.
It is similar to IDW, except that it can predict values above the maximum and below the minimum measured values.
Global Polynomial Interpolation
The Global Polynomial Interpolation technique fits a single smooth surface, defined by a polynomial, to the measured data points. A flat plane is the first-order case.
Local Polynomial Interpolation
While Global Polynomial interpolation fits a polynomial to the entire surface, Local Polynomial interpolation fits many polynomials, each within specified overlapping neighborhoods.
Evaluation Criteria
Several statistical indicators (Root Mean Square Error (RMSE), Mean Error (ME), Mean Absolute Error (MAE) and Mean Square Error (MSE)) are computed on observed and predicted radon concentrations.
Confidence limits on the statistics for Normalized Mean Square Error (NMSE), Fractional Bias (FB) and Coefficient of Correlation (r) are calculated using the Bootstrap application to identify the most suitable interpolation technique.
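The indicators listed above can be sketched as one function. The slides do not give the formulas, so the definitions here are assumptions: errors are taken as predicted minus measured, NMSE is the mean square error divided by the product of the two means, and FB is the difference of the means (measured minus predicted) divided by their average. The function name and sample values are hypothetical.

```python
import math


def evaluation_metrics(measured, predicted):
    """Return ME, MAE, MSE, RMSE, NMSE, FB and r for paired values."""
    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]
    mse = sum(e * e for e in errors) / n
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    # Pearson correlation from sums of centered products.
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "NMSE": mse / (mean_m * mean_p),
        "FB": (mean_m - mean_p) / (0.5 * (mean_m + mean_p)),
        "r": cov / math.sqrt(var_m * var_p),
    }


# Hypothetical measured and predicted values in pCi/l.
metrics = evaluation_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 4.0, 4.0])
```

Under these sign conventions, over-prediction gives a positive ME and a negative FB.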
Results
Measured vs. Predicted Radon Concentration Values for the Test Datasets
[Figures: five scatter plots, one each for the Ordinary Kriging, IDW, RBF, LPI and GPI estimates for the test dataset. Each plots measured values (0.00 to 10.00) against predicted values (0.00 to 10.00) with a linear trend line.]
Results
ME, MAE, MSE and RMSE values of different interpolation techniques for geometric mean of radon concentration test predictions

Statistic   Ordinary Kriging   IDW    RBF    Global Polynomial Interpolation   Local Polynomial Interpolation
ME          0.09               0.17   0.19   0.1                               0.14
MAE         1.33               1.45   1.44   1.46                              1.4
MSE         4.99               5.77   5.57   5.15                              5.21
RMSE        2.23               2.4    2.36   2.27                              2.28
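As a quick consistency check on the table, RMSE should equal the square root of MSE for each technique. Taking the reported MSE values in the order Ordinary Kriging, IDW, RBF, Global Polynomial, Local Polynomial, the square roots reproduce the reported RMSE values to two decimals:

```latex
\begin{align*}
\text{Ordinary Kriging:} \quad & \sqrt{4.99} \approx 2.23 \\
\text{IDW:} \quad & \sqrt{5.77} \approx 2.40 \\
\text{RBF:} \quad & \sqrt{5.57} \approx 2.36 \\
\text{Global Polynomial:} \quad & \sqrt{5.15} \approx 2.27 \\
\text{Local Polynomial:} \quad & \sqrt{5.21} \approx 2.28
\end{align*}
```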
Results
NMSE, FB and Corr. Values from Bootstrap Method

Statistic   Ordinary Kriging   IDW      RBF      Global Polynomial Interpolation   Local Polynomial Interpolation
NMSE        0.41               0.46     0.44     0.42                              0.42
FB          -0.026             -0.047   -0.055   -0.027                            -0.041
Corr. (r)   0.5                0.42     0.45     0.48                              0.47
Results
Summary of Robust and Seductive 95% Confidence Limits Analyses on Each Technique

Statistic   Ordinary Kriging   IDW   RBF   Global Polynomial   Local Polynomial
NMSE        X                  X     X     X                   X
FB
Corr. (r)   X                  X     X     X                   X

Note: X indicates significantly different from zero. Blank indicates not significantly different from zero.
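The idea behind "significantly different from zero" can be sketched with a plain percentile bootstrap: resample the measured/predicted pairs with replacement, recompute the statistic each time, and check whether the resulting 95% interval contains zero. This is a substitute illustration, not the robust and seductive procedures used in the study, and the function names and sample values are hypothetical.

```python
import random


def bootstrap_ci(measured, predicted, statistic, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(measured, predicted)."""
    rng = random.Random(seed)
    n = len(measured)
    estimates = []
    for _ in range(n_resamples):
        # Resample pairs with replacement so each pair stays matched.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            statistic([measured[i] for i in idx], [predicted[i] for i in idx])
        )
    estimates.sort()
    lower_idx = int(n_resamples * alpha / 2)
    upper_idx = n_resamples - lower_idx - 1
    return estimates[lower_idx], estimates[upper_idx]


def mean_error(measured, predicted):
    """Mean of predicted minus measured."""
    return sum(p - m for m, p in zip(measured, predicted)) / len(measured)


# Hypothetical data in which every prediction is exactly 1.0 too high.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [m + 1.0 for m in measured]
low, high = bootstrap_ci(measured, predicted, mean_error)
significant = not (low <= 0.0 <= high)
```

The same function can be applied to the difference of a statistic between two techniques to ask whether the techniques differ significantly from each other.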
Results
Summary of Robust and Seductive 95% Confidence Limits Analyses among Each Technique

Among techniques, columns: NMSE (Yes / No), FB (Yes / No), Corr. (r) (Yes / No)
Ordinary Kriging - IDW
Ordinary Kriging - RBF: X
Ordinary Kriging - GPI
Ordinary Kriging - LPI
IDW - RBF
IDW - GPI
IDW - LPI
RBF - GPI
RBF - LPI
GPI - LPI

Note: Yes indicates significantly different from zero. No indicates not significantly different from zero.
Comparison of the behavior of the prediction maps with the soil uranium concentrations map
[Figures: prediction maps and the soil uranium concentrations map]
Results
Predicted Geometric Mean of Radon Concentrations Using Ordinary Kriging technique for Lucas County

ZIP CODE   COUNTY   PREDICTED GM
43402      LUCAS    1.88
43445      LUCAS    2.96
43449      LUCAS    2.89
43460      LUCAS    2.35
43522      LUCAS    1.80
43551      LUCAS    2.28
43558      LUCAS    1.92
Conclusion
Prediction maps were created using the training data set for all five interpolation techniques and projected values were estimated for the test data set.
Statistical parameters (error values) were evaluated and the prediction maps generated from these techniques were compared to the soil uranium concentration map.
It was inferred that any of the four (Ordinary Kriging, IDW, RBF and Local Polynomial) interpolation techniques can be used for predicting the radon concentrations for unmeasured zip codes.
Ordinary Kriging technique was chosen and the geometric means of radon concentrations were evaluated for unmeasured zip codes.
Conclusion
From the data sets available prior to the study, the number of zip codes having a geometric mean of radon concentration over 4.0 pCi/l is 390.
After using the Ordinary Kriging interpolation technique to calculate the predictions for unmeasured zip codes, the number of zip codes having radon concentration over 4.0 pCi/l is 688.
The predicted radon concentrations for unmeasured zip codes were found to be below 8 pCi/l.
Therefore, the number of zip codes where the geometric mean of radon concentration exceeds 8 pCi/l and 20 pCi/l is the same after interpolation as in the existing data (85 and 9, respectively).
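The counts above imply the following arithmetic, assuming that 390 refers to measured zip codes only and that 688 is the combined count of measured and interpolated zip codes (the slides do not state this explicitly):

```latex
\begin{align*}
\text{unmeasured zip codes predicted above 4.0 pCi/l} &= 688 - 390 = 298 \\
\text{share of the 796 unmeasured zip codes} &= \frac{298}{796} \approx 0.37
\end{align*}
```

Because every interpolated value is below 8 pCi/l, interpolation adds nothing to the counts above 8 pCi/l and 20 pCi/l, which is why those stay at 85 and 9.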
Thank you
Sensitivity Analysis for division of data set

Interpolation Technique   RMSE, 80-20 (%)   RMSE, 70-30 (%)   RMSE, 60-40 (%)
Ordinary Kriging          2.23              3.33              2.86
IDW                       2.4               3.31              2.29
RBF                       2.36              3.31              2.93
Global Polynomial         2.27              3.57              3.06
Local Polynomial          2.28              3.3               2.91