Robust GW summary statistics & robust GW regression are used to investigate spatial variation and relationships in a freshwater acidification critical load (FACL) data set covering Great Britain. Results show that data relationships can vary across space, but that this perception can sometimes depend on only a few influential outlying observations. An important advance is the development of a new robust GW regression model to cater for such phenomena.
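
As a rough illustration of why robust GW summary statistics matter, the sketch below contrasts a GW mean with a GW median at a single focal location. The Gaussian kernel, the bandwidth, the function names and the toy data are all assumptions made for illustration; they are not taken from the FACL study or from any particular software implementation.

```python
import math

def gaussian_weights(coords, focus, bandwidth):
    """Gaussian kernel weight of each observation relative to a focal location."""
    return [math.exp(-0.5 * (math.dist(c, focus) / bandwidth) ** 2) for c in coords]

def gw_mean(values, weights):
    """Geographically weighted mean (sensitive to outlying values)."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def gw_median(values, weights):
    """Geographically weighted median: the smallest value whose cumulative
    weight reaches half of the total weight (one common robust choice)."""
    half = 0.5 * sum(weights)
    cumulative = 0.0
    for v, w in sorted(zip(values, weights)):
        cumulative += w
        if cumulative >= half:
            return v
    raise ValueError("no observations supplied")

# Hypothetical toy data: five sites on a line, one extreme value at x = 2.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
values = [1.0, 1.2, 50.0, 0.9, 1.1]

w = gaussian_weights(coords, focus=(2.0, 0.0), bandwidth=2.0)
local_mean = gw_mean(values, w)      # pulled upwards by the extreme value
local_median = gw_median(values, w)  # stays at a typical local value
```

With these toy values the single extreme observation pulls the GW mean well above its neighbours, whereas the GW median remains at a typical local value.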

GW principal components analysis is used to investigate the changing local structure in multivariate spatial data sets. Using Dublin voter turnout data, the GW PCA methodology is advanced to incorporate: (i) automatic bandwidth selection, (ii) tests for its application and (iii) visualisation techniques for its output. Furthermore, a robust GW PCA is developed to detect multivariate spatial outliers. This extension is demonstrated using a soils geochemical data set for Northern Europe (Fig. 1).
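
The idea of a locally changing multivariate structure can be sketched for the simplest case of two variables, where the eigenvalues of the locally weighted covariance matrix have a closed form. This is only a minimal, non-robust sketch: the Gaussian kernel, the fixed bandwidth, the function names and the toy data are assumptions, and it covers none of the bandwidth selection, testing or robust extensions described above.

```python
import math

def gaussian_weights(coords, focus, bandwidth):
    """Gaussian kernel weight of each observation relative to a focal location."""
    return [math.exp(-0.5 * (math.dist(c, focus) / bandwidth) ** 2) for c in coords]

def gw_pca_2d(x1, x2, weights):
    """Local PCA of two variables at one focal location.

    Returns the two eigenvalues (largest first) of the weighted covariance
    matrix and the share of local variance carried by the first component.
    """
    total = sum(weights)
    m1 = sum(w * u for w, u in zip(weights, x1)) / total
    m2 = sum(w * v for w, v in zip(weights, x2)) / total
    a = sum(w * (u - m1) ** 2 for w, u in zip(weights, x1)) / total
    c = sum(w * (v - m2) ** 2 for w, v in zip(weights, x2)) / total
    b = sum(w * (u - m1) * (v - m2) for w, u, v in zip(weights, x1, x2)) / total
    # Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, c]].
    mid = 0.5 * (a + c)
    spread = math.hypot(0.5 * (a - c), b)
    lam1, lam2 = mid + spread, mid - spread
    return lam1, lam2, lam1 / (lam1 + lam2)

# Hypothetical toy data: eight sites on a line. In the west x2 is an exact
# multiple of x1; in the east the two variables are only loosely related.
coords = [(float(u), 0.0) for u in range(8)]
x1 = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 4.0, 6.0, 8.0, 3.0, 1.0, 4.0, 2.0]

w_west = gaussian_weights(coords, focus=(0.0, 0.0), bandwidth=1.0)
w_east = gaussian_weights(coords, focus=(7.0, 0.0), bandwidth=1.0)
share_west = gw_pca_2d(x1, x2, w_west)[2]
share_east = gw_pca_2d(x1, x2, w_east)[2]
```

In this toy example the first local component carries almost all of the variance at the western focal point but noticeably less at the eastern one, which is the kind of spatial change in structure that GW PCA is designed to map.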

Geographically weighted (GW) models: advances in investigating spatial heterogeneity

Paul Harris, Martin Charlton & Chris Brunsdon*

Studies for spatial exploration, visualisation & outlier detection

Inference-related problems in GW regression are investigated using GW PCA and other techniques, primarily to address strong criticisms of GW regression in the literature regarding local collinearity (Figs. 3 & 4). A guide to fitting and interpreting GW models in this respect is given. Simulated and Dublin voter turnout data sets are used in this study.
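
A local collinearity diagnostic of the kind referred to here can be sketched for the two-covariate case, where the local variance inflation factor (VIF) reduces to 1 / (1 - r^2), with r the GW correlation between the covariates, and is therefore the same for both covariates. The Gaussian kernel, the function names and the toy data are assumptions for illustration only; the VIF > 10 rule of thumb is the one quoted in the Fig. 3 caption.

```python
import math

def gaussian_weights(coords, focus, bandwidth):
    """Gaussian kernel weight of each observation relative to a focal location."""
    return [math.exp(-0.5 * (math.dist(c, focus) / bandwidth) ** 2) for c in coords]

def gw_correlation(x1, x2, weights):
    """Weighted Pearson correlation between two covariates at one focal location."""
    total = sum(weights)
    m1 = sum(w * a for w, a in zip(weights, x1)) / total
    m2 = sum(w * b for w, b in zip(weights, x2)) / total
    cov = sum(w * (a - m1) * (b - m2) for w, a, b in zip(weights, x1, x2))
    var1 = sum(w * (a - m1) ** 2 for w, a in zip(weights, x1))
    var2 = sum(w * (b - m2) ** 2 for w, b in zip(weights, x2))
    return cov / math.sqrt(var1 * var2)

def local_vif(x1, x2, weights):
    """Local VIF for a model with exactly two covariates."""
    r = gw_correlation(x1, x2, weights)
    return 1.0 / (1.0 - r * r)

# Hypothetical toy data: six sites on a line and three candidate covariates.
coords = [(float(i), 0.0) for i in range(6)]
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]    # nearly a copy of x1
x3 = [2.0, -1.0, 0.0, 0.0, -1.0, 2.0]  # unrelated to x1

weights = gaussian_weights(coords, focus=(2.5, 0.0), bandwidth=3.0)
vif_collinear = local_vif(x1, x2, weights)  # far above the threshold of 10
vif_unrelated = local_vif(x1, x3, weights)  # close to 1
```

Repeating the calculation at every regression location and mapping the result gives a picture of where local collinearity may undermine the interpretation of the local coefficients.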

A Bayesian spatially varying coefficient model is also developed to address inference-related problems in GW regression. The Bayesian model is an entirely different approach to the nonparametric GW regression model and benefits from a better handle on uncertainty. Its performance is tested both for exploration/inference and for spatial prediction/inference (using London house price data).

GW regression as a spatial predictor is assessed in three complementary studies. First, its performance is compared to: (i) standard regression; (ii) standard kriging (from geostatistics) and (iii) hybrids (kriging combined with GW regression), using simulated data sets. Results show promise for the hybrids but indicate that standard kriging should be preferred. In a second study, GW regression with a heteroskedastic error variance is linked and compared with a corresponding kriging model. This new GW predictor is able to provide relatively accurate prediction confidence intervals. In a third study, indicator kriging is combined with GW regression to form a novel hybrid that also provides promisingly accurate prediction confidence intervals. The latter two studies both use the FACL data.
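
To make the notion of GW regression as a spatial predictor concrete, the sketch below fits a kernel-weighted simple linear regression at a prediction location and evaluates it for a new covariate value. It is a basic GW regression with one covariate only, assuming a Gaussian kernel and a fixed bandwidth; it does not implement the heteroskedastic or kriging hybrid predictors described above, and the function name and toy data are hypothetical.

```python
import math

def gwr_predict(coords, x, y, focus, x_new, bandwidth):
    """Predict y at `focus` from a locally weighted simple linear regression.

    Returns (prediction, local intercept, local slope).
    """
    w = [math.exp(-0.5 * (math.dist(c, focus) / bandwidth) ** 2) for c in coords]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx              # weighted least squares slope
    intercept = my - slope * mx    # weighted least squares intercept
    return intercept + slope * x_new, intercept, slope

# Hypothetical toy data: ten sites on a line. The true relationship is
# y = x in the west (u < 5) and y = 3x in the east (u >= 5).
coords = [(float(u), 0.0) for u in range(10)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0]
y = [xi if u < 5 else 3.0 * xi for u, xi in enumerate(x)]

pred_west, b0_west, b1_west = gwr_predict(coords, x, y, (0.0, 0.0), x_new=2.0, bandwidth=1.0)
pred_east, b0_east, b1_east = gwr_predict(coords, x, y, (9.0, 0.0), x_new=2.0, bandwidth=1.0)
```

The same covariate value yields different predictions at the two locations because the local slope changes across space, which a single global regression could not reproduce.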

Kriging with GW variograms is a novel geostatistical-nonparametric hybrid. This non-stationary variogram technique generalises moving window kriging (MWK) by replacing classic local variogram estimators with information-rich GW variogram estimators. Results using four pollution data sets indicate much promise for the new model (Fig. 2). Related studies visualise outputs from such non-stationary variogram predictors; these studies include the development of a robust model.
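
One plausible form of a GW variogram estimator is sketched below: the classic method-of-moments estimator, with each pair of observations weighted by the product of the two points' kernel weights relative to the focal location. The poster does not state the estimator, so the pair-weighting scheme, the Gaussian kernel, the lag binning and all names here are assumptions rather than the authors' definition.

```python
import math
from itertools import combinations

def gw_variogram(coords, z, focus, bandwidth, lag_width, n_lags):
    """Kernel-weighted empirical semivariances at one focal location.

    Bin k collects pairs separated by a distance in
    [k * lag_width, (k + 1) * lag_width). Empty bins are returned as None.
    """
    w = [math.exp(-0.5 * (math.dist(c, focus) / bandwidth) ** 2) for c in coords]
    num = [0.0] * n_lags
    den = [0.0] * n_lags
    for i, j in combinations(range(len(z)), 2):
        k = int(math.dist(coords[i], coords[j]) // lag_width)
        if k < n_lags:
            pair_w = w[i] * w[j]  # assumed pair weight: product of kernel weights
            num[k] += pair_w * (z[i] - z[j]) ** 2
            den[k] += pair_w
    return [n / (2.0 * d) if d > 0 else None for n, d in zip(num, den)]

# Hypothetical toy data: a series that is flat in the west and rough in the east.
coords = [(float(u), 0.0) for u in range(10)]
z = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]

gamma_west = gw_variogram(coords, z, (0.0, 0.0), bandwidth=1.5, lag_width=1.0, n_lags=3)
gamma_east = gw_variogram(coords, z, (9.0, 0.0), bandwidth=1.5, lag_width=1.0, n_lags=3)
```

Here the semivariance for neighbouring sites (bin 1) is close to zero at the western focal point and close to 0.5 at the eastern one, so a variogram model fitted locally would differ between the two, which is the non-stationarity that the kriging step can then exploit.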

Optimal sample (or network) re-design: initial work has been conducted using GW summary statistics and a location-allocation algorithm. Current work uses the outputs from the GW predictors above with simulated annealing to achieve similar optimisation objectives. All studies use an air pollution data set.
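
The simulated annealing step can be sketched as a search over subsets of candidate monitoring sites. The objective used below, the mean distance from each target location to its nearest retained site, is only a stand-in chosen to keep the example self-contained; the work described above instead uses outputs from the GW predictors. The swap proposal, the cooling schedule, the parameter values and all names are likewise assumptions.

```python
import math
import random

def coverage_cost(sites, targets):
    """Stand-in objective: mean distance from each target to its nearest site."""
    return sum(min(math.dist(t, s) for s in sites) for t in targets) / len(targets)

def anneal_design(candidates, targets, k, n_iter=500, t_start=1.0, cooling=0.99, seed=0):
    """Choose k of the candidate sites by simulated annealing (requires k < len(candidates))."""
    rng = random.Random(seed)
    current = rng.sample(candidates, k)
    current_cost = coverage_cost(current, targets)
    best, best_cost = list(current), current_cost
    temperature = t_start
    for _ in range(n_iter):
        # Propose swapping one retained site for one unused candidate.
        unused = [c for c in candidates if c not in current]
        proposal = list(current)
        proposal[rng.randrange(k)] = rng.choice(unused)
        cost = coverage_cost(proposal, targets)
        # Always accept improvements; accept worse designs with a probability
        # that shrinks as the cost increase grows and the temperature falls.
        if cost < current_cost or rng.random() < math.exp((current_cost - cost) / temperature):
            current, current_cost = proposal, cost
            if cost < best_cost:
                best, best_cost = list(proposal), cost
        temperature *= cooling
    return best, best_cost

# Hypothetical toy problem: keep 4 of 25 sites on a 5 x 5 grid.
candidates = [(float(i), float(j)) for i in range(5) for j in range(5)]
targets = list(candidates)
best, best_cost = anneal_design(candidates, targets, k=4)
```

Replacing `coverage_cost` with a criterion built from prediction uncertainties would move the sketch closer to the approach described, at the price of refitting the predictor for each proposed design.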

Research presented in this poster was funded by a Strategic Research Cluster Grant (07/SRC/I1168) by Science Foundation Ireland under the National Development Plan. The authors gratefully acknowledge this support.

Studies for spatial prediction, its uncertainty & sample re-design

Studies for model inference & statistical properties

Fig. 1: Outlier detection with robust GW PCA

The GWmodel R package

The fundamental science of the above studies will be transferred to studies in applied science on the StratAG project via an open source R package of statistical computing code. The R package will include existing GW models and our newly developed GW models (including advancements from other members of the spatial heterogeneity team, not presented here). This R package can be mirrored with a set of GW modelling tools for ESRI’s ArcGIS.

* Visiting professor – University of Liverpool, UK

Model                     Rank
Std kriging               4
Std nonlinear kriging     3
Std MWK                   2
MWK with GW variograms    1

Fig. 2: Classic local variograms (top), GW variograms (middle), global variogram (bottom left) and results

Fig. 4: The use of GW PCA to map matrix condition numbers (> 30 suggests a significant local collinearity effect). Dublin voter turnout covariate data.

Fig. 3: Local variance inflation factors for two covariates of the Dublin voter turnout data set (VIFs > 10 for a bandwidth of 150 nearest neighbours are highlighted).