outlier kerry.pdf

28
Computers & Geosciences 33 (2007) 1233–1260

Determining the effect of asymmetric data on the variogram. II. Outliers

R. Kerry a,*, M.A. Oliver b
a Department of Geography, Brigham Young University, Provo, Utah, USA
b Department of Soil Science, University of Reading, Reading, England

Received 21 May 2005; accepted 26 July 2006

www.elsevier.com/locate/cageo
0098-3004/$ - see front matter © 2007 Elsevier Ltd. All rights reserved.
doi:10.1016/j.cageo.2007.05.009
* Corresponding author. E-mail address: [email protected] (R. Kerry).

Abstract

Asymmetry in a distribution can arise from a long tail of values in the underlying process or from outliers that belong to another population that contaminate the primary process. The first paper of this series examined the effects of the former on the variogram and this paper examines the effects of asymmetry arising from outliers. Simulated annealing was used to create normally distributed random fields of different size that are realizations of known processes described by variograms with different nugget:sill ratios. These primary data sets were then contaminated with randomly located and spatially aggregated outliers from a secondary process to produce different degrees of asymmetry. Experimental variograms were computed from these data by Matheron's estimator and by three robust estimators. The effects of standard data transformations on the coefficient of skewness and on the variogram were also investigated. Cross-validation was used to assess the performance of models fitted to experimental variograms computed from a range of data contaminated by outliers for kriging.

The results showed that where skewness was caused by outliers the variograms retained their general shape, but showed an increase in the nugget and sill variances and nugget:sill ratios. This effect was only slightly greater for the smallest data set than for the two larger data sets and there was little difference between the results for the latter. Overall, the effect of size of data set was small for all analyses. The nugget:sill ratio showed a consistent decrease after transformation to both square roots and logarithms; the decrease was generally larger for the latter, however. Aggregated outliers had different effects on the variogram shape from those that were randomly located, and this also depended on whether they were aggregated near to the edge or the centre of the field. The results of cross-validation showed that the robust estimators and the removal of outliers were the most effective ways of dealing with outliers for variogram estimation and kriging.

© 2007 Elsevier Ltd. All rights reserved.

Keywords: Geostatistics; Normality; Outliers; Simulation; Skewness; Variogram; Robust estimators; Data transformation

1. Introduction

Departures from normality can arise from a long tail of larger or smaller values in the underlying process; in our first paper in this issue (Kerry and Oliver, 2007), we examined the effect of this on the variogram. Here, we focus on the presence of a relatively small number (ε) of extreme values from another population (C) that contaminate a primary Gaussian, N_P(0,1), process (P); these values may be considered as outliers. There has been much

description

outlier

Transcript of outlier kerry.pdf


  • R. Kerry, M.A. Oliver / Computers & Geosciences 33 (2007) 1233–1260

    discussion about what values in a distribution constitute outliers and how to deal with them (see Barnett and Lewis (1994) for a thorough discussion on this subject). In summary, outliers are either distributional in nature and are usually obvious as particularly large or small values in a histogram or box and whisker plot (Tukey, 1977), or they can be spatial, whereby the values are particularly different from other values in their spatial vicinity. The latter need not be distributional outliers and may be identified to an extent from a pixel map of values or by other more elaborate methods such as those proposed by Gnanadesikan and Kettenring (1972) and Haslett et al. (1991), for example. For the purpose of this investigation, we assume that the outliers are distributional, are not erroneous values and have been identified by a thorough preceding exploratory data analysis.

    Environmental data frequently have asymmetric distributions caused by a small number of marginal values or outliers from a secondary process. In the spatial context, the secondary process may give rise to randomly located outliers, such as localized enrichments of ores, additions to the soil by animal faecal deposits (McBratney and Webster, 1986), accidental localized spills of fertilizers and so on. Outliers can also be spatially aggregated, for example, on industrial sites where the secondary process has led to localized contamination of the surface materials at isolated points or in fields where animal faecal deposits are in one part of the field. These can all be regarded as quasi-point processes superimposed on a primary continuous process.

    Although Matheron's (1965) variogram estimator is asymptotically unbiased for any intrinsic random function (Cressie, 1993), it is sensitive to departures from normality or from a symmetric distribution (Webster and Oliver, 2001) because it is based on squared differences. It is particularly sensitive to outlying values of Z and even a single outlier can distort the experimental variogram because it might be involved in several paired comparisons over many or all lag intervals. Several robust estimators of the variogram have been devised to solve the problem of asymmetry resulting from outliers, such as those of Armstrong and Delfiner (1980), Cressie and Hawkins (1980), Dowd (1984) and Genton (1998a). Lark (2000) investigated three of these robust variogram estimators with simulated and real soil data contaminated by outliers. He showed that robust estimators were generally useful for data contaminated with outliers, but not for data where the asymmetry has a more general underlying cause. This was to be expected because an underlying assumption of these robust estimators is that the data have a contaminated normal distribution. Genton (1998b) further showed that the shape of various robust estimators changed in response to the presence of different proportions of outliers. He used the term breakdown point to refer to the number of outliers necessary to make an estimator explode (tend towards infinity) or implode (tend to 0). Lark (2000) concluded from his investigation that robust estimators are not a substitute for a thorough exploratory data analysis with appropriate editing and transformation of the data prior to variography.

    Although we consider robust variogram estimators here, we are aware that analysts do not always have access to appropriate software to compute them. Therefore, based on this and on Lark's (2000) comment mentioned above, we also consider how one should proceed with exploratory data analysis, editing and transformation of data prior to geostatistical analysis where data are contaminated by outliers. The procedure generally agreed on in geostatistical texts is summarized in Fig. 1 of Kerry and Oliver (2007); however, this is based on informed intuition rather than rigorous investigation. If the skewness is outside the bounds of ±1, the histogram and/or box and whisker plot should be investigated. If the asymmetry is caused by outliers, this is often more evident in a schematic box and whisker plot than histogram and the extreme values should be investigated further. If they are clearly the result of errors in the assembly of data or laboratory analysis, they should be removed permanently from the data. If they are true values they are likely to be of interest, particularly in pollution studies. Outliers can be treated as a separate statistical population and removed for computing the variogram because it is often the variogram of the underlying process that is of interest (Cressie, 1993). The removal of outliers is less problematic in spatial than classical statistics because this action does not affect the randomness of the sample (Barnett and Lewis, 1994). Transformation of the data can also be considered, but Goovaerts (1997) has indicated that this is not ideal if the aim is prediction. In general, those who ultimately use the predictions, such as land managers, environmental scientists and so on, want values on the original scale of measurement, which involves a back transformation. For square root

  • and logarithmic transformations, the back-transform tends to exaggerate any error associated with prediction through squaring and exponentiation, respectively. This can affect extreme values the most, which are of most interest in pollution studies. Therefore, one should question the appropriateness of any data transformation, where asymmetry is caused by outliers.

    Kerry and Oliver (2007) showed that the effect of underlying asymmetry on the variogram was less for large sets of data than for small ones, as illustrated in Figs. 2 and 8 of that paper. The effects of sample size, the degree of continuity in the variation and the location of outliers, however, have not been investigated thoroughly where asymmetry has been caused by outliers. This paper explores the following in this context:

    Do similar degrees of asymmetry in the distribution caused by outliers affect the variogram equally for data sets of different size?

    For a given sample size how do different degrees of asymmetry affect data generated by variogram functions with different nugget:sill ratios, i.e. different degrees of spatial continuity?

    Do randomly located and spatially aggregated outliers for different degrees of asymmetry influence the variogram differently?

    To what extent do the removal of outliers, data transformations and robust variogram estimators solve the effects of asymmetry caused by outliers on the variogram and the accuracy of prediction?

    With greater insight into the effects of outliers on the variogram, the standard best practice described above will be appraised and suggestions made as to how it might be improved.

    2. Methods

    2.1. Simulation of two-dimensional data contaminated by outliers

    Twelve random fields with a standard normal distribution, N_P(0,1), were simulated using the simulated annealing procedure of Deutsch and Journel (1992) for a 200 m × 200 m hypothetical field at the nodes of 5-, 10- and 20-m grids, to give 1600, 400 and 100 data, respectively. These data are realizations of the primary process N_P(μ_P, σ_P, φ_P), where φ is a vector of spatial parameters; in this case, a spherical function with a sill variance (c0+c) of 1, a range (a) of 75 m and nugget variances (c0) of 0, 0.25, 0.5 and 0.75. At a proportion, ε, of the sites, chosen at random, values from a secondary random process N_C(μ_C, σ_C) were added to those of the primary Gaussian process. Two rates of contamination were used: ε = 0.02 and 0.05; the latter was the main focus of attention because the smaller rate gave only two contaminated sites for data on the 20-m grid. The contaminants were drawn from random, normally distributed populations with different means, N_C(1,1), N_C(1.25,1), N_C(1.5,1), …, N_C(10,1), and added to the original values of the primary process to give skewness coefficients of 0.5, 1.0, 1.5, 2.0 and 3.0. Table 1 gives the coefficient of skewness, means of the secondary process and the number of randomly located sites contaminated for the 0.05 rate of contamination for each set of data simulated using a variogram with no nugget variance. The mean of the secondary process used to produce a desired coefficient of skewness varied slightly between the data sets because of differences in the original values of the primary process and the locations selected at random for contamination. The overall distribution function of the data is given by

    $$Z(x) = \{(1-\varepsilon)\,N_P(\mu_P,\sigma_P,\varphi_P) + \varepsilon\,N_C(\mu_C,\sigma_C)\}. \qquad (1)$$

    Table 1
    Coefficient of skewness, means of the secondary process and number of sites for a rate of contamination of 0.05 for each set of data produced by simulated annealing using a variogram with a nugget:sill ratio of 0

    Grid interval   Coefficient    Mean of secondary   Number of sites in
    of data (m)     of skewness    process             secondary process
    5               0.5            2.00                80
    5               1.0            3.50                80
    5               1.5            4.50                80
    5               2.0            5.50                80
    5               3.0            9.00                80
    10              0.5            2.50                20
    10              1.0            4.00                20
    10              1.5            5.00                20
    10              2.0            6.00                20
    10              3.0            10.00               20
    20              0.5            3.25                5
    20              1.0            4.25                5
    20              1.5            5.25                5
    20              2.0            6.50                5
    20              3.0            10.00               5
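The contamination scheme of Section 2.1 and Table 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the primary values are uncorrelated N(0, 1) draws standing in for a simulated-annealing realization with a spherical variogram, and `skewness` and `contaminate` are hypothetical helper names introduced here. The sketch adds draws from N_C(μ_C, 1) at a randomly chosen 5% of 400 sites (20 sites, as for the 10-m grid) and prints the coefficient of skewness for three values of μ_C.

```python
import random


def skewness(values):
    """Coefficient of skewness: third central moment over the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def contaminate(primary, rate, mean_c, sd_c=1.0, rng=None):
    """Add draws from N_C(mean_c, sd_c) to a randomly chosen fraction `rate` of sites.

    Returns the contaminated copy and the sorted indices of the contaminated sites.
    """
    rng = rng or random.Random()
    n_outliers = round(rate * len(primary))
    sites = sorted(rng.sample(range(len(primary)), n_outliers))
    contaminated = list(primary)
    for i in sites:
        contaminated[i] += rng.gauss(mean_c, sd_c)
    return contaminated, sites


rng = random.Random(42)
# Assumption: uncorrelated N(0, 1) values stand in for the 400 nodes of the 10-m grid;
# they are NOT a spatially correlated simulated-annealing realization.
primary = [rng.gauss(0.0, 1.0) for _ in range(400)]

for mean_c in (2.5, 5.0, 10.0):
    data, sites = contaminate(primary, rate=0.05, mean_c=mean_c, rng=rng)
    print(f"mean_C={mean_c:5.2f}  sites={len(sites)}  skewness={skewness(data):.2f}")
```

Because the sites and contaminant draws are random, the mean needed to reach a target skewness will differ a little from one data set to another, which is the behaviour the text reports for Table 1.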

  • In addition, the primary process was contaminated at a rate of 0.05 in such a way that the outliers were aggregated either near the edge or centre of the field. A random location for an outlier was selected either near to the edge or centre of the field; the remaining outliers (ε − 1) were then placed at the appropriate number of surrounding sites. The contaminants at the spatially aggregated locations were drawn from the same populations as described above and added to the original values of the primary process to give skewness coefficients of 0.5, 1.0, 1.5, 2.0 and 3.0. Although the outliers are spatially aggregated, they are spatially independent in this case. This might not always be the case; for example, a large chemical spill in a restricted area could result in spatially dependent sample data. We have not considered this scenario here as it did not accord with our underlying model, Eq. (1) above.

    2.2. Approaches to reduce asymmetry

    Values that have been identified as distributional outliers are often removed from the data to achieve a normal or near-normal distribution. However, as mentioned above, there is some reluctance in doing this when the outliers are associated with contaminated sites and are the values of most interest or concern. Therefore, we also transformed the data to square roots and common logarithms (log10) to assess the extent to which they reduced asymmetry and the effects on the variograms computed from the range of data sets described above after transformation. A consistent procedure was adopted before transformation: a constant of 4 was added to each value in the data to make all values just positive as described in Kerry and Oliver (2007).
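The transformation step can be illustrated with a short Python sketch. The data here are an assumption made for illustration only: 400 evenly spaced normal scores with 5 added to every 20th value to mimic a 0.05 rate of contamination; `skewness` is a hypothetical helper. The sketch adds the constant 4, takes square roots and common logarithms, and prints the coefficient of skewness on each scale.

```python
import math
from statistics import NormalDist


def skewness(values):
    """Coefficient of skewness: third central moment over the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Assumption: deterministic normal scores stand in for a simulated field.
n = 400
scores = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
# Every 20th value receives +5, i.e. 20 outliers out of 400 (rate 0.05).
data = [z + 5.0 if i % 20 == 10 else z for i, z in enumerate(scores)]

SHIFT = 4.0  # constant added before transformation, as described in Section 2.2
shifted = [z + SHIFT for z in data]
assert min(shifted) > 0, "the shift must make every value positive"

sqrt_data = [math.sqrt(v) for v in shifted]
log_data = [math.log10(v) for v in shifted]

for name, values in (("raw", data), ("sqrt", sqrt_data), ("log10", log_data)):
    print(f"{name:>5}: skewness = {skewness(values):.3f}")
```

Both transformations are increasing and concave, so each reduces the coefficient of skewness, and the logarithm reduces it more than the square root.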

    2.3. Matheron's variogram estimator and robust variogram estimators

    Omni-directional experimental variograms were computed using Matheron's (1965) estimator as in Kerry and Oliver (2007); it is given by

    $$\hat{\gamma}_M(h) = \frac{1}{2m(h)}\sum_{i=1}^{m(h)}\{z(x_i) - z(x_i+h)\}^2, \qquad (2)$$

    where γ̂_M(h) is the semivariance at a given lag distance h, z(x_i) and z(x_i+h) are the observed values of Z at x_i and x_i+h, and m(h) is the number of paired comparisons at lag h.

    Experimental variograms were computed on the simulated normally distributed data and that contaminated with outliers with initial lag intervals based on the grid spacings of the data of 5, 10 and 20 m. They were then modelled by weighted least-squares approximation using GenStat (Payne, 2006). In addition, variograms were computed on data transformed to square roots and common logarithms (log10), and with outliers removed using Eq. (2) and with robust estimators. Cressie (1993) uses the term robust to describe inference procedures that are stable when model assumptions depart from those of a central model, for example by a small amount of contamination by an independent Gaussian process. Robust variogram estimators are a possible solution to the problem of outliers because the goal is to estimate the variogram of the non-contaminated part of the data (1−ε)P. Consequently, they are less sensitive to outliers than is Eq. (2).

    Lark (2000) gives a succinct description of robust variogram estimators and their properties, and we do not repeat this here. We have used the same three robust estimators as used by Lark (2000), namely those of Cressie and Hawkins (1980), Dowd (1984) and Genton (1998a). We summarize these estimators below following Lark (2000).

    Cressie and Hawkins' (1980) estimator estimates the variogram at lag h for a primary process with a normal distribution of differences, Z(x) − Z(x+h), and damps the effect of outliers from a secondary process. For a given lag, it is an estimation of the location (first-order moment) of the squared differences. Cressie and Hawkins' (1980) estimator is based on taking the fourth roots of the squared differences, and it is given by

    $$2\hat{\gamma}_{CH}(h) = \frac{\left\{\frac{1}{m(h)}\sum_{i=1}^{m(h)}|z(x_i) - z(x_i+h)|^{1/2}\right\}^4}{0.457 + 0.494/m(h) + 0.045/m^2(h)}. \qquad (3)$$
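Eqs. (2) and (3) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the software used in the paper: the field is a hypothetical 10 × 10 grid of uncorrelated N(0, 1) values, and `pair_differences` (a helper name introduced here) forms pairs along rows and columns only, which is a simplification of a truly omni-directional variogram. A single value is then raised by 10 to show how each estimator responds to one outlier.

```python
import random


def pair_differences(grid, lag):
    """Differences z(x_i) - z(x_i + h) for nodes `lag` steps apart along rows and columns."""
    n_rows, n_cols = len(grid), len(grid[0])
    diffs = []
    for r in range(n_rows):
        for c in range(n_cols):
            if c + lag < n_cols:
                diffs.append(grid[r][c] - grid[r][c + lag])
            if r + lag < n_rows:
                diffs.append(grid[r][c] - grid[r + lag][c])
    return diffs


def matheron(diffs):
    """Eq. (2): half the mean squared pair difference."""
    return sum(d * d for d in diffs) / (2 * len(diffs))


def cressie_hawkins(diffs):
    """Eq. (3): fourth power of the mean square-rooted absolute difference, with correction."""
    m = len(diffs)
    mean_root = sum(abs(d) ** 0.5 for d in diffs) / m
    correction = 0.457 + 0.494 / m + 0.045 / m ** 2
    return mean_root ** 4 / correction / 2  # Eq. (3) gives 2*gamma, so halve it


rng = random.Random(0)
# Assumption: uncorrelated N(0, 1) values, not a spatially correlated realization.
clean = [[rng.gauss(0.0, 1.0) for _ in range(10)] for _ in range(10)]
dirty = [row[:] for row in clean]
dirty[5][5] += 10.0  # one gross outlier

for label, grid in (("clean", clean), ("one outlier", dirty)):
    d = pair_differences(grid, lag=1)
    print(f"{label:>12}: Matheron={matheron(d):.3f}  Cressie-Hawkins={cressie_hawkins(d):.3f}")
```

The single outlier enters four of the 180 pairs at lag 1. Squaring lets those four pairs dominate Eq. (2), whereas the square-root damping in Eq. (3) limits their influence without removing it.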

    The denominator in Eq. (3) is a correction based on the assumption that the underlying process to be estimated has normally distributed differences over all lags. Genton (1998a) says that Cressie and Hawkins' estimator is not really a solution to the problem because a single outlier can still have an adverse effect.

    Cressie (1993) suggests that variogram estimation can also be regarded as a problem of identifying the scale at various lags, i.e. the second-order moment of the differences, Z(x) − Z(x+h), and that this

  • approach might be the most suitable for data contaminated by outliers. The estimators of Dowd (1984) and of Genton (1998a) are both scale estimators. They estimate the variogram for a dominant intrinsic process, for which the differences Z(x) − Z(x+h) are normal, in the presence of outliers from a secondary process. Dowd's (1984) estimator is given as

    $$2\hat{\gamma}_D(h) = 2.198\,\{\mathrm{median}(|y_i(h)|)\}^2, \qquad (4)$$

    where y_i(h) = z(x_i) − z(x_i+h), i = 1, 2, …, m(h). The term within the braces is the median absolute pair difference (MAPD) for lag h, which is a scale estimator only for variables where the expectation of the differences is 0. In addition, the pair difference must be distributed symmetrically so that the expectation of the median pair difference is 0. The constant, 2.198, in Eq. (4) is a correction for consistency that scales the MAPD to the standard deviation of a normally distributed population.

    Genton's (1998a) estimator is based on the scale estimator, Q_N, of Rousseeuw and Croux (1992, 1993). The quantity Q_N is given by

    $$Q_N = 2.219\,\{|X_i - X_j|;\ i<j\}_{\binom{H}{2}}, \qquad (5)$$

    where the subscript denotes the \binom{H}{2}-th order statistic of the absolute differences, the constant 2.219 is a correction for consistency with the standard deviation of the normal distribution, and H is the integral part of (N/2)+1. Genton's (1998a) estimator uses Eq. (5) as an estimator of scale applied to the differences at each lag; it is given by

    $$2\hat{\gamma}_G(h) = \left[2.219\,\{|y_i(h) - y_j(h)|;\ i<j\}_{\binom{H}{2}}\right]^2, \qquad (6)$$

    where y_i(h) is the same as for Eq. (4), and now H is the integral part of {m(h)/2}+1.
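The two scale estimators of Eqs. (4) to (6) can be sketched in Python as below. This is a minimal illustration, assuming the pair differences y_i(h) for one lag are already available as a list; the function names `dowd`, `q_n` and `genton` are introduced here for illustration, and the brute-force sort over all pairs is chosen for clarity rather than speed. The sample differences are hypothetical.

```python
from itertools import combinations
from statistics import median


def dowd(diffs):
    """Eq. (4): 2*gamma = 2.198 * (median absolute pair difference)^2."""
    mapd = median([abs(y) for y in diffs])
    return 2.198 * mapd ** 2 / 2


def q_n(values):
    """Eq. (5): 2.219 times the C(H, 2)-th smallest |X_i - X_j|, with H = floor(N/2) + 1."""
    h = len(values) // 2 + 1
    k = h * (h - 1) // 2
    gaps = sorted(abs(a - b) for a, b in combinations(values, 2))
    return 2.219 * gaps[k - 1]


def genton(diffs):
    """Eq. (6): 2*gamma = Q_N(pair differences)^2."""
    return q_n(diffs) ** 2 / 2


# Hypothetical pair differences at one lag, without and with a single gross outlier.
diffs = [-2.0, -1.0, 0.0, 1.0, 2.0]
with_outlier = diffs + [50.0]
for label, y in (("clean", diffs), ("one outlier", with_outlier)):
    matheron = sum(v * v for v in y) / (2 * len(y))
    print(f"{label:>12}: Matheron={matheron:.2f}  Dowd={dowd(y):.2f}  Genton={genton(y):.2f}")
```

Both estimators depend on order statistics, so one extreme difference can shift them only as far as the neighbouring ordered values allow, whereas the squared term in Matheron's estimator grows without bound.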

    2.4. Cross-validation

    Cross-validation was done as described in Kerry and Oliver (2007) for a selection of the fitted models and associated data sets. The method used involved removing each datum in turn and then kriging at the point with the relevant model parameters and neighbouring data points. The diagnostic statistics derived from cross-validation for this investigation were the mean error (ME), mean squared error (MSE), mean squared deviation ratio (MSDR) and median squared deviation ratio (MeSDR). The ratios are derived from the squared errors and can be used to distinguish between variogram models for the range of data examined. Lark (2000), however, recommended the MeSDR to determine the best model for kriging with skewed data because the mean is affected by asymmetry, which means that the MSDR is not robust if the data are contaminated by outliers.

    The ME is given by

    $$\mathrm{ME} = \frac{1}{N}\sum_{i=1}^{N}\{z(x_i) - \hat{z}(x_i)\}$$

    and the MSE is given by

    $$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\{z(x_i) - \hat{z}(x_i)\}^2,$$

    where N is the number of data values, z(x_i) the true value at x_i and ẑ(x_i) the estimated value there. The MSDR is

    $$\mathrm{MSDR} = \frac{1}{N}\sum_{i=1}^{N}\frac{\{z(x_i) - \hat{z}(x_i)\}^2}{\hat{\sigma}^2(x_i)},$$

    where σ̂²(x_i) is the kriging variance at the point. The closer the MSDR is to 1, the better the model is for kriging.

    The MeSDR was determined by dividing the squared errors by the kriging variances for each data point, and then ordering the values; the middle value was taken as the MeSDR. When the correct model is used for kriging, the MeSDR should be close to 0.455, which is the median of the standard χ² distribution with one degree of freedom.

    3. Results and discussion

    Fig. 1 shows the schematic box and whisker plots for the 10-m data for the range of skewness coefficients examined arising from randomly located outliers. The graphs show the box, which contains the middle 50% of the distribution, and the horizontal line is the median. The circles beyond the whiskers are large values at the margins of the distribution and the crosses are the outliers, which are beyond three times the inter-quartile range. For the normal distribution and a skewness coefficient of 0.5 (Fig. 1a and b, respectively) there are no crosses, indicating that the marginal values are not extreme. As the asymmetry increases, the number of large values increases at the extremes on the positive side of the distribution.
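The four cross-validation diagnostics (ME, MSE, MSDR and MeSDR) can be computed with a short Python sketch. It assumes the observed values, the kriged predictions and the kriging variances are already available as equal-length lists; the function name `cross_validation_stats` and the sample numbers are hypothetical. The last lines check the reference value of 0.455, using the fact that the median of a χ² variable with one degree of freedom is the square of the 0.75 quantile of the standard normal distribution.

```python
from statistics import NormalDist, mean, median


def cross_validation_stats(observed, predicted, kriging_var):
    """Return ME, MSE, MSDR and MeSDR for leave-one-out cross-validation results."""
    errors = [z - z_hat for z, z_hat in zip(observed, predicted)]
    squared = [e * e for e in errors]
    ratios = [s / v for s, v in zip(squared, kriging_var)]
    return {
        "ME": mean(errors),
        "MSE": mean(squared),
        "MSDR": mean(ratios),      # should be close to 1 for a good model
        "MeSDR": median(ratios),   # should be close to 0.455 for a good model
    }


# Hypothetical cross-validation output for four sites.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.0]
kriging_var = [0.25, 0.5, 0.5, 1.0]
print(cross_validation_stats(observed, predicted, kriging_var))

# Median of chi-square with 1 degree of freedom: (0.75 quantile of N(0, 1)) squared.
chi2_median = NormalDist().inv_cdf(0.75) ** 2
print(f"median of chi-square(1) = {chi2_median:.3f}")
```

The MeSDR uses the median of the ratios rather than their mean, which is why a few very large squared errors at outlier sites affect it less than they affect the MSDR.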

  • Fig. 1. Schematic box and whisker plots for data on the 10-m grid and 0 nugget variance for (a) a normal distribution, and skewness coefficients of: (b) 0.5, (c) 1.0, (d) 1.5, (e) 2.0 and (f) 3.0; the circles represent large values in the margin of the distribution and the crosses are outliers.

  • Fig. 2. Experimental variograms computed from data on 5-, 10- and 20-m grids simulated by a variogram function with a nugget:sill ratio of 0 (solid line) for (a) a normal distribution, and skewness coefficients of: (b) 0.5, (c) 1.0, (d) 1.5, (e) 2.0 and (f) 3.0 caused by randomly located outliers. Axes: lag distance (m) against variance.

  • 3.1. The effect of randomly located outliers on variograms computed from simulated data of different sample sizes

    Fig. 2 shows the experimental variograms computed from the three sizes of data set simulated with a nugget:sill ratio of 0 for the range of skewness coefficients (0, 0.5, 1.0, 1.5, 2.0 and 3.0) resulting from randomly located outliers with a rate of contamination of 0.05. The exhaustive variograms of the normally distributed data are very similar to those used to simulate the data (the solid line in Fig. 2); therefore, the latter were used to assess the effects of asymmetry in the data on the variogram as in Kerry and Oliver (2007). Tables 2 and 3 give the parameters of the models fitted to the range of data. The general pattern for all sizes of data set is that as skewness increases the sill and nugget variances increase quite dramatically; this is particularly so when the skewness coefficient reaches 3.0 and the vertical scale of the graph (Fig. 2f) is larger. Cressie and Hawkins (1980), Genton (1998a) and Lark (2000) all noted that the effect of outliers on the variogram was to increase the nugget and sill variance. Fig. 3a and b summarizes this effect of increasing asymmetry in the data for the 5- and 20-m data sets for a rate of contamination by outliers of 0.05. Tables 2 and 3 also indicate an increase in nugget:sill ratio as asymmetry in the distribution increases. For skewness coefficients ≥2, the variogram tends towards pure nugget. Genton (1998a) also noted that Matheron's estimator tended towards pure nugget when the proportion of outliers increased to 20–30% of the data, which we can interpret as resulting in greater asymmetry.

    For all coefficients of skewness >0, the 5-m data (1600 sites) have variograms with the smallest nugget and sill variances (Fig. 2 and Table 2). There is little difference between the variograms for the 10-m (400 sites) and 20-m (100 sites) data, except for the skewness coefficient of 2.0 (Fig. 2e). Overall, the effect of the size of data set is less than that observed where asymmetry is caused by a long tail in the distribution (see Kerry and Oliver, 2007). As the asymmetry caused by randomly located outliers increases, the variograms for all sizes of data set are affected similarly (Fig. 2). Changes in the variogram caused by skewness coefficients >1 suggest a need to mitigate the effects of the outliers before computing it.

    Fig. 3c and d shows the experimental variograms for the 5- and 10-m data, with skewness coefficients from 0 to 2.0 caused by contamination with outliers at a rate of 0.02. We excluded the 20-m data from this comparison because there were just two outliers. The nugget and sill variances (Fig. 3c and d and Table 4) are smaller than those for the larger rate of contamination for both sizes of data set (Tables 2–4). These graphs (Fig. 3c and d) further confirm that the effect of data-set size is less than that of either the rate of contamination or degree of asymmetry. The difference in nugget:sill ratio between the 0.02 and 0.05 rates of contamination is least for a coefficient of skewness of 0.5 and this difference increases gradually from 0.033 to 0.230 for the 5-m data and from 0.047 to 0.325 for the 10-m data as the skewness increases.

    Table 2
    Parameters of models fitted to experimental variograms computed from data on 5- and 20-m grids generated by a variogram with a nugget:sill ratio of 0 and with asymmetry caused by randomly located outliers at a rate of contamination of 0.05

    Coefficient   Grid interval   Model type   c0      c       a (m)   c0+c    c0:c0+c
    of skewness   of data (m)
    0             5               Spherical    0       1       75.0    1.00    0
    0.5           5               Spherical    0.340   0.948   70.2    1.288   0.264
    1.0           5               Spherical    0.847   0.901   71.4    1.748   0.484
    1.5           5               Spherical    1.320   0.866   73.1    2.186   0.604
    2.0           5               Circular     1.931   0.796   68.3    2.728   0.708
    3.0           5               Circular     5.248   0.704   80.3    5.952   0.882
    0             20              Spherical    0       1       75.0    1.00    0
    0.5           20              Spherical    0.364   1.030   57.0    1.394   0.261
    1.0           20              Spherical    0.520   1.220   50.5    1.740   0.299
    1.5           20              Spherical    0.981   1.289   54.3    2.270   0.432
    2.0           20              Spherical    1.374   1.622   48.6    2.996   0.459
    3.0           20              Circular     3.674   2.221   40.8    5.895   0.623
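The parameters in Table 2 can be read with the help of a small Python sketch that evaluates a spherical variogram model and the nugget:sill ratio c0:(c0+c). The spherical formula used is the standard one, stated here as background rather than taken from the text, and the function names `spherical` and `nugget_sill_ratio` are introduced for illustration. The example compares the generating model (c0 = 0, c = 1, a = 75 m) with the model fitted to the 5-m data for a skewness coefficient of 1.5.

```python
def spherical(h, c0, c, a):
    """Spherical variogram with nugget c0, correlated variance c and range a."""
    if h == 0:
        return 0.0
    if h >= a:
        return c0 + c  # sill reached at and beyond the range
    s = h / a
    return c0 + c * (1.5 * s - 0.5 * s ** 3)


def nugget_sill_ratio(c0, c):
    """Proportion of the sill variance accounted for by the nugget."""
    return c0 / (c0 + c)


generating = {"c0": 0.0, "c": 1.0, "a": 75.0}
fitted = {"c0": 1.320, "c": 0.866, "a": 73.1}  # Table 2, 5-m grid, skewness 1.5

for lag in (5, 25, 50, 75, 100):
    print(lag, round(spherical(lag, **generating), 3), round(spherical(lag, **fitted), 3))

print("nugget:sill ratio of fitted model:",
      round(nugget_sill_ratio(fitted["c0"], fitted["c"]), 3))
```

In this row the correlated component c and the range a stay close to the generating values of 1 and 75 m, while almost all of the increase in the sill comes from the nugget, which is why the ratio rises from 0 to about 0.6.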

    Fig. 3. Experimental variograms (symbols) computed from 5-m (a, c), 10-m (d) and 20-m (b) data with a normal distribution and different coefficients of skewness caused by contamination at rates of 0.05 (a, b) and 0.02 (c, d) with randomly located outliers. (Axes: lag distance (m) against variance; legend: normal, skew = 0.5, 1.0, 1.5 and 2.0, generating model.)

    Table 3
    Parameters of models fitted to experimental variograms computed from data on the 10-m grid with asymmetry caused by randomly located outliers at a rate of contamination of 0.05 and with different nugget:sill ratios, and cross-validation results

    Coefficient   Nugget:sill ratio of   Model type       c0      c       a      c0+c    c0:c0+c   MSE (a)   MeSDR (a)
    of skewness   generating function
    0             0                      Spherical        0       1       75.0   1.00    0         0.1677    0.496
    0.5           0                      Spherical        0.240   0.962   74.2   1.202   0.200     0.5812    0.219
    1.0           0                      Spherical        0.636   0.950   76.2   1.586   0.401     1.077     0.129
    1.5           0                      Spherical        1.017   0.944   77.9   1.961   0.519     1.530     0.113
    2.0           0                      Circular         1.522   0.905   71.9   2.427   0.627     2.083     0.087
    3.0           0                      Circular         3.890   0.926   78.7   4.817   0.808     5.326     0.065
    0             0.25                   Spherical        0.250   0.750   75.0   1.00    0.250     0.4260    0.447
    0.5           0.25                   Spherical        0.503   0.764   77.0   1.267   0.397     0.7829    0.291
    1.0           0.25                   Spherical        0.903   0.763   78.0   -       0.542     1.080     0.262
    1.5           0.25                   Spherical        1.286   0.765   79.0   2.051   0.627     1.480     0.203
    2.0           0.25                   Spherical        1.761   0.769   80.3   2.531   0.696     2.271     0.155
    3.0           0.25                   Circular         4.175   0.777   77.7   4.952   0.843     5.009     0.099
    0             0.50                   Spherical        0.500   0.500   75.0   1.00    0.500     0.6591    0.434
    0.5           0.50                   Spherical        0.754   0.550   83.9   1.304   0.579     1.054     0.316
    1.0           0.50                   Circular         1.168   0.546   75.2   1.714   0.681     1.573     0.210
    1.5           0.50                   Circular         1.549   0.559   75.7   2.108   0.735     2.047     0.163
    2.0           0.50                   Circular         2.022   0.575   76.5   2.597   0.779     2.622     0.143
    3.0           0.50                   Circular         4.398   0.625   79.8   5.023   0.875     5.946     0.101
    0             0.75                   Spherical        0.750   0.250   75.0   1.00    0.750     0.8825    0.430
    0.5           0.75                   Pentaspherical   1.006   0.264   93.9   1.270   0.792     1.196     0.366
    1.0           0.75                   Pentaspherical   1.390   0.289   94.8   1.679   0.828     1.692     0.293
    1.5           0.75                   Pentaspherical   1.762   0.308   96.9   2.070   0.851     2.156     0.240
    2.0           0.75                   Pentaspherical   2.227   0.329   99.5   2.556   0.871     2.717     0.213
    3.0           0.75                   Circular         4.599   0.399   81.9   4.998   0.920     5.987     0.116

    (a) MSE is the mean squared error and MeSDR is the median squared deviation ratio.

    3.2. Effect of spatial continuity on variograms computed from data simulated on a 10-m grid with randomly located outliers

    Experimental variograms were computed and modelled for the three sizes of data set from data simulated with different nugget:sill ratios (0, 0.25, 0.5 and 0.75) and for the range of asymmetry considered. The results showed that the effects of increasing asymmetry in the distribution and discontinuity in the spatial variation are similar for the three sizes of data set; therefore, the results are given for the 10-m data only (Table 3). The results are summarized in Fig. 4 for data on the 10-m grid. For the data simulated by functions with nugget:sill ratios of 0, 0.25 and 0.5, the variogram shape remains fairly constant as the asymmetry increases. However, for data simulated with a nugget:sill ratio of 0.75, the nugget variance increases as skewness increases (Fig. 4 and Table 3) and for skewness coefficients ≥1.5 the variograms are almost pure nugget (Fig. 4d and Table 3). For a given coefficient of skewness, the sill variance changes little as spatial continuity in the data decreases. Fig. 5 summarizes the effects of asymmetry for the different degrees of spatial continuity on the variogram for data on the 10-m grid.

    The results described above suggest that the degree of asymmetry caused by outliers has a greater effect on the shape of the variogram when the nugget:sill ratio of the original generating variogram of the primary process is >0.5.

    3.3. Effects of spatially aggregated outliers on the variogram

    Fig. 6 shows the experimental variograms computed from data simulated with a nugget:sill ratio of 0 for all sizes of data set and all coefficients of skewness, and with outliers (contamination rate of 0.05) aggregated either near to the edge or the centre of the field. The variograms are quite different in form from those computed from data with randomly located outliers (Fig. 3). Table 5 gives the parameters of the fitted functions for the spatially aggregated outliers; it shows that they have small or 0 nugget effects for all sizes of data set and skewness coefficients, whereas for the randomly located outliers the nugget variance increases with increasing asymmetry in the distribution (Tables 2 and 3). There is also a marked difference between the variograms computed from data with outliers aggregated near the edge and centre of the field (Fig. 6a, c and e and Fig. 6b, d and f, respectively); Table 5 shows that, for the latter, the sill variances are considerably larger and that the range decreases as skewness increases. The large difference between the effect of outliers near the edge and near the centre of the field can be explained by the fact that outliers near the centre are involved in many more paired comparisons than are those at the edge of the field, in particular at the longer lags. The greater continuity in the variogram near to the origin for the spatially aggregated outliers probably relates to strong continuity over the majority of the field, whereas the randomly located outliers increase discontinuity in the variation at several isolated places in the field. The effect of size of data set is again small compared with the other effects.
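The explanation given for the edge and centre difference, that a value near the centre of the field takes part in many more paired comparisons than one at the edge, can be illustrated with a toy count. The sketch below is not the authors' procedure: the 21 x 21 grid with unit spacing, the lag tolerance of half a grid unit and the function name `pairs_at_lag` are all assumptions made for illustration.

```python
import numpy as np

def pairs_at_lag(n, node, lag, tol=0.5):
    """Count grid nodes whose distance from `node` lies within lag +/- tol.

    n    : number of nodes along each side of a square grid (unit spacing)
    node : (row, col) index of the node of interest
    """
    rows, cols = np.mgrid[0:n, 0:n]
    dist = np.hypot(rows - node[0], cols - node[1])
    return int(np.count_nonzero(np.abs(dist - lag) <= tol))

n = 21                       # hypothetical 21 x 21 grid
centre = (n // 2, n // 2)    # node in the middle of the field
corner = (0, 0)              # node at a corner of the field

# Each row printed: lag, comparisons for the centre node, comparisons for the corner node.
for lag in (2, 5, 10):
    print(lag, pairs_at_lag(n, centre, lag), pairs_at_lag(n, corner, lag))
```

Under these assumptions a single node at the centre contributes to several times as many squared differences at each lag as a node at the corner, so a cluster of outliers at the centre has more opportunity to inflate the experimental semivariances.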

    Table 4
    Parameters of models fitted to experimental variograms computed from data on 5- and 10-m grids generated by a variogram function with 0 nugget variance and with skewness caused by contamination at a rate of 0.02 with randomly located outliers

    Coefficient   Grid interval   Model type       c0      c       a      c0+c    c0:c0+c
    of skewness   of data (m)
    0             5               Spherical        0       1       75.0   1.00    0
    0.5           5               Spherical        0.200   0.998   73.8   1.198   0.167
    1.0           5               Spherical        0.387   0.997   74.4   1.383   0.279
    1.5           5               Spherical        0.582   0.994   75.0   1.576   0.369
    2.0           5               Pentaspherical   0.789   1.026   89.9   1.815   0.435
    3.0           5               Circular         1.387   1.014   92.0   2.401   0.578
    0             5               Spherical        0       1       75.0   1.00    0
    0.5           10              Circular         0.275   0.991   65.8   1.266   0.217
    1.0           10              Circular         0.475   1.029   66.5   1.504   0.316
    1.5           10              Circular         0.679   1.067   67.3   1.746   0.389
    2.0           10              Circular         0.924   1.110   68.2   2.034   0.454
    3.0           10              Circular         1.532   1.217   70.1   2.748   0.557

    3.4. Effect of data transformation (square root and log10) on the variogram

    Data simulated by variogram functions with a nugget:sill ratio of 0 were used to determine the effects of square root and log10 transformations, and the removal of outliers, on the variogram for all coefficients of skewness examined. This was done for all sizes of data set for a 0.05 rate of contamination and for data on the 5- and 10-m grids for the 0.02 rate. Table 6 gives the coefficients of skewness before and after transformation for all sizes of data set and 0.05 rate of contamination. Because the results for both rates of contamination followed a similar form, only those for the 0.05 rate are considered here, as they cover all sizes of data set. For original skewness coefficients ≤1 the square root transformation was more effective in reducing the coefficient of skewness, and for original skewness coefficients >1 the logarithmic one was generally the more effective.

    Fig. 7 shows the experimental variograms for the three sizes of data set after transformation to square roots and log10 for all levels of asymmetry examined. The shapes of the variograms remain similar for all sizes of data set. For data with a skewness coefficient of 3.0, the variograms after transformation to log10 are far less different from those computed from less skewed data than were those computed from the raw data (Figs. 7f and 2f, respectively). The experimental variograms in Fig. 7 indicate that the larger the size of data set, the smaller are the nugget and sill variances for both transformations, but this difference is small.

    Tables 7 and 8 give the model parameters of variograms computed for the 0.05 rate of contamination by outliers for all sizes of data set and skewness coefficients after transformation to square roots and logarithms, respectively. For data on the 5- and 10-m grids, the nugget:sill ratios show a consistent decrease after transformation to both square roots and log10 (Tables 7 and 8, respectively) compared with the results for the raw data (Tables 2 and 3), and the decrease is greater after the log10 transformation (Table 8). The same is true for the 20-m data for skewness coefficients >1 (Tables 2, 7 and 8), but the decrease is less than that for the other data sets.

    Table 6 gives the coefficients of skewness after transforming the 10-m data with outliers aggregated at both the edge and centre of the field to square roots and log10; there is little difference between these results and those for randomly located outliers. This is largely to be expected as the transformations do not take into account the spatial positions of the outliers. Fig. 8a and c shows the experimental variograms computed from the square root-transformed values with outliers at the edge and centre of the field, respectively, and Fig. 8b and d those computed from log10-transformed values at the edge and centre of the field, respectively. The differences in sill variance for data with outliers at the centre and edge of the field remain, but there is little difference between the effects of the two transformations. The sill variances for the square root- and log10-transformed data with randomly located outliers and those aggregated at the edge of the field are similar (Figs. 7 and 8a, b, respectively), whereas the sill variances of variograms computed from data with outliers grouped near the centre are far larger after transformation (Fig. 8c and d).

    Outliers were removed from the data and the variogram computed for the primary process (1 - ε)P with 0 nugget variance for each size of data set; this was done once only as the outliers were at the same locations for each coefficient of skewness. Table 9 shows that there is a small increase in the nugget:sill ratio compared with the generating model, which can be explained by the gaps in the data that cause some apparent loss of continuity in the variation at the first lag interval. The results for the 0.02 rate of contamination are similar to those described above for the larger contamination rate (Table 9). Variograms computed for the 10-m data after the removal of outliers aggregated at the edge of the field are surprisingly similar to those computed after the removal of randomly located outliers (Table 9), whereas after the removal of those at the centre of the field the nugget:sill ratio is smaller than for the latter variograms.

    Fig. 4. Experimental variograms (symbols) computed from 10-m data with a normal distribution and different coefficients of skewness caused by randomly located outliers and the variogram functions (solid lines) used to generate the data with nugget:sill ratios of: (a) 0, (b) 0.25, (c) 0.5 and (d) 0.75. (Axes: lag distance (m) against variance.)

    Fig. 5. Comparison of experimental variograms computed from 10-m data generated by functions with different nugget:sill ratios for (a) a normal distribution, and skewness coefficients of: (b) 0.5, (c) 1.0, (d) 1.5 and (e) 2.0 caused by randomly located outliers. (Axes: lag distance (m) against variance.)

    Fig. 6. Experimental variograms (symbols) computed from data on 5-m (a, b), 10-m (c, d) and 20-m (e, f) grids with a normal distribution and different coefficients of skewness caused by contamination with outliers spatially aggregated in the corner (a, c, e) or the centre (b, d, f) of the field. (Axes: lag distance (m) against variance.)

    Table 5
    Parameters of models fitted to experimental variograms computed from data on 5-, 10- and 20-m grids generated by a variogram function with a nugget:sill ratio of 0 and skewness caused by spatially aggregated outliers (rate of contamination of 0.05)

    Coefficient   Grid interval   Location   Model type       c0      c       a       c0+c    c0:c0+c
    of skewness   of data (m)     of outliers
    0             5               -          Spherical        0       1       75.0    1.00    0
    0.5           5               Edge       Circular         0.066   1.008   56.3    1.075   0.062
    1.0           5               Edge       Circular         0.068   1.245   53.6    1.313   0.052
    1.5           5               Edge       Circular         0.071   1.480   52.7    1.551   0.046
    2.0           5               Edge       Circular         0.075   1.776   52.3    1.852   0.041
    3.0           5               Edge       Circular         0.116   3.284   53.0    3.400   0.034
    0             5               -          Spherical        0       1       75.0    1.00    0
    0.5           5               Centre     Circular         0.036   1.620   78.1    1.656   0.022
    1.0           5               Centre     Circular         0.004   2.219   79.8    2.223   0.002
    1.5           5               Centre     Circular         0.000   2.972   80.5    2.972   0.000
    2.0           5               Centre     Circular         0.000   3.889   80.6    3.889   0.000
    3.0           5               Centre     Circular         0.000   9.314   79.9    9.314   0.000
    0             10              -          Spherical        0       1       75.0    1.00    0
    0.5           10              Edge       Circular         0.054   1.059   68.5    1.113   0.048
    1.0           10              Edge       Circular         0.068   1.298   70.9    1.366   0.050
    1.5           10              Edge       Circular         0.101   1.513   72.2    1.614   0.063
    2.0           10              Edge       Spherical        0.104   1.836   82.8    1.940   0.054
    3.0           10              Edge       Pentaspherical   0.451   3.487   108.9   3.938   0.115
    0             10              -          Spherical        0       1       75.0    1.00    0
    0.5           10              Centre     Circular         0.022   1.632   67.9    1.654   0.013
    1.0           10              Centre     Circular         0.000   2.169   67.2    2.169   0.000
    1.5           10              Centre     Circular         0.000   2.866   66.4    2.866   0.000
    2.0           10              Centre     Circular         0.000   4.117   65.4    4.117   0.000
    3.0           10              Centre     Spherical        0.000   8.444   71.5    8.444   0.000
    0             20              -          Spherical        0       1       75.0    1.00    0
    0.5           20              Edge       Circular         0.028   1.211   79.5    1.239   0.023
    1.0           20              Edge       Spherical        0.032   1.552   102.8   1.584   0.020
    1.5           20              Edge       Pentaspherical   0.048   1.948   139.8   1.996   0.024
    2.0           20              Edge       Pentaspherical   0.115   2.401   154.7   2.516   0.046
    3.0           20              Edge       Pentaspherical   0.542   5.370   222.0   5.912   0.092
    0             20              -          Spherical        0       1       75.0    1.00    0
    0.5           20              Centre     Circular         0.000   1.616   43.8    1.616   0.000
    1.0           20              Centre     Circular         0.200   2.000   48.0    2.200   0.091
    1.5           20              Centre     Circular         0.018   2.719   49.4    2.737   0.007
    2.0           20              Centre     Spherical        0.000   3.601   55.5    3.601   0.000
    3.0           20              Centre     Spherical        0.000   8.750   63.4    8.750   0.000

    Table 6
    Coefficients of skewness for data on 5-, 10- and 20-m grids generated by a variogram function with a nugget:sill ratio of 0 and contaminated with randomly located and spatially aggregated outliers (rate of 0.05) after transformation to square roots and log10

    Coefficient of skewness   Coefficient of skewness of      Coefficient of skewness of
    of original data          square root of data (+4) (a)    log10 of data (+4) (a)

    5-m grid, random outliers
    0.5                       0.108                           0.764
    1.0                       0.353                           0.433
    1.5                       0.711                           0.176
    2.0                       1.059                           0.077
    3.0                       2.013                           0.819

    10-m grid, random outliers
    0.5                       0.101                           0.934
    1.0                       0.378                           0.575
    1.5                       0.734                           0.309
    2.0                       1.077                           0.051
    3.0                       2.111                           0.787

    10-m grid, aggregated outliers, edge
    0.5                       0.160                           0.967
    1.0                       0.315                           0.614
    1.5                       0.677                           0.345
    2.0                       1.028                           0.081
    3.0                       2.095                           0.779

    10-m grid, aggregated outliers, centre
    0.5                       0.053                           0.882
    1.0                       0.271                           0.639
    1.5                       0.629                           0.374
    2.0                       1.145                           0.013
    3.0                       2.053                           0.743

    20-m grid, random outliers
    0.5                       0.185                           1.328
    1.0                       0.203                           1.016
    1.5                       0.604                           0.703
    2.0                       1.067                           0.339
    3.0                       2.026                           0.469

    (a) (+4): constant added to standard normal data so that the smallest value was just positive for transformation.

    3.5. Robust variogram estimators

    The results for the robust variograms are given in detail only for data on the 10-m grid with nugget:sill ratios of 0 and 0.5 and for a rate of contamination by outliers of 0.05, as those for the other sizes of data set followed a similar pattern.

    Fig. 9 shows the experimental variograms computed by Matheron's estimator and the three robust estimators, $\hat{\gamma}_{CH}(h)$, $\hat{\gamma}_{D}(h)$ and $\hat{\gamma}_{G}(h)$, for data simulated with a nugget:sill ratio of 0, with a normal distribution and all coefficients of skewness. Tables 3 and 10 give the parameters of the models fitted to Matheron's and the robust variograms, respectively. For the normal distribution, $\hat{\gamma}_{M}(h)$ and $\hat{\gamma}_{CH}(h)$ are closest to the generating model (Fig. 9a and Table 10). As the skewness starts to increase, Matheron's estimator shows an increasing departure from the original variogram and so does that of Cressie and Hawkins (1980), but to a lesser extent (Fig. 9b-f and Table 10). For all degrees of asymmetry, $\hat{\gamma}_{D}(h)$ departs less from the generating variogram than $\hat{\gamma}_{G}(h)$, but there is little difference between these two robust variograms (Fig. 9b-f and Table 10). For Dowd's (1984) estimator, the sill variance is closer to 1 than for Genton's (1998a), but there is little change in either as the asymmetry increases (Table 10). For Cressie and Hawkins's estimator, the nugget and sill variances increase with increasing asymmetry (Table 10).

    Fig. 10 shows the experimental variograms computed by Matheron's estimator and the three robust estimators, $\hat{\gamma}_{CH}(h)$, $\hat{\gamma}_{D}(h)$ and $\hat{\gamma}_{G}(h)$, for data simulated with a nugget:sill ratio of 0.5 with a normal distribution and all coefficients of skewness. Tables 3 and 11 give the parameters of the models fitted to Matheron's and the robust variograms, respectively. Discontinuity in the variation has an adverse effect on the shape of Matheron's estimator as asymmetry in the distribution increases. This is not so for the robust estimators; the forms of these variograms remain close to the original function. The robust variograms show the same pattern in relation to each other as in Fig. 9 for the data with a nugget:sill ratio of 0. Table 11 shows that as asymmetry increases the sill variance also increases in the robust variograms, and this effect is greater for Cressie and Hawkins's and Genton's estimators than for Dowd's. It is also greater overall for the variograms computed from data with a nugget:sill ratio of 0.5 (Table 11) than for those computed from data with a nugget:sill ratio of 0 (Table 10).
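The contrast between Matheron's estimator and the robust estimators can be reproduced on a small scale. The sketch below is a minimal illustration and not the authors' code: it uses the forms of the Matheron, Cressie and Hawkins, and Dowd estimators as they are usually stated in the geostatistical literature, applies them to an uncorrelated one-dimensional transect rather than a simulated field, and assumes a contaminating shift of +5 for 5% of the values. Genton's estimator is omitted for brevity, and all function names are hypothetical.

```python
import numpy as np

def lag_differences(z, lag):
    """Differences z(x_i) - z(x_i + lag) along a regular 1-D transect."""
    return z[:-lag] - z[lag:]

def matheron(d):
    """Method-of-moments semivariance: half the mean squared difference."""
    return 0.5 * np.mean(d ** 2)

def cressie_hawkins(d):
    """Cressie and Hawkins (1980): fourth power of the mean root absolute difference."""
    n = d.size
    correction = 0.457 + 0.494 / n + 0.045 / n ** 2
    return 0.5 * np.mean(np.sqrt(np.abs(d))) ** 4 / correction

def dowd(d):
    """Dowd (1984): scaled square of the median absolute difference."""
    return 0.5 * 2.198 * np.median(np.abs(d)) ** 2

rng = np.random.default_rng(0)
clean = rng.standard_normal(5000)        # uncorrelated N(0, 1), so the semivariance is 1
contaminated = clean.copy()
idx = rng.choice(clean.size, size=250, replace=False)   # 5 % contamination (assumed)
contaminated[idx] += 5.0                 # hypothetical shift for the outliers

for name, z in (("clean", clean), ("contaminated", contaminated)):
    d = lag_differences(z, 1)
    print(name, matheron(d), cressie_hawkins(d), dowd(d))
```

In this toy setting all three estimators are close to the true semivariance for the clean data, whereas contamination inflates Matheron's estimate most, Cressie and Hawkins's less and Dowd's least, which is the same ordering as reported above for the simulated fields.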

    3.6. Cross-validation results

    The MEs for all cross-validation analyses wereclose to 0 showing that the estimators are unbiased.The MSDRs were close to 1 for most analyses; thissupports Larks (2000) observation that the MSDRis poor for comparing the effects of asymmetry inthe data on the variogram. As the MEs and MSDRsdo not provide any insight into the effect of outliersin the data on the variogram and on the accuracy ofprediction, we do not include them here.Table 12 gives the results of cross-validation

    (MSEs and MeSDRs) for data simulated by thespherical function with 0 nugget for the threegrid spacings and the range of asymmetry.Cross-validation was also done for both rates ofcontamination; the results for the 0.05 rate of conta-mination only are given because the differencesbetween the two rates are small. To summarize, the

    MSEs for the 0.02 rate of contamination are smaller

  • ARTICLE IN PRESSR. Kerry, M.A. Oliver / Computers & Geosciences 33 (2007) 12331260 1249 0.25and the departure of the MeSDR from 0.455 isslightly less than are those for the larger rate ofcontamination for the three grid sizes. Cross-

    0 20 40 60 80

    0

    Variance

    Variance

    0.20

    0.15

    0.10

    0.05

    0.00

    Lag Distance (m)

    100

    0 20 40 60 80

    Lag Distance (m)

    100

    0.25

    Variance

    0.20

    0.15

    0.10

    0.05

    0.00

    0.25

    0.20

    0.15

    0.10

    0.05

    0.00

    Lag Distance (m)

    14012010080604020

    square root of skew = 0.5

    square root of skew = 1.0

    square root of skew = 1.5

    square root of skew = 2.0

    square root of skew = 3.0

    Fig. 7. Experimental variograms computed from data on 5-m (a, b), 10

    and log10 (b, d, f) with skewness caused by randomly located outliers.0.035

    0.030validation was also done for the 10-m data withdifferent nugget:sill ratios. The results are given inTable 3 and follow a similar pattern to those for the

    0 20 40 60 80

    Lag Distance (m)

    100

    0 20 40 60 80

    Lag Distance (m)

    100

    Variance

    0.025

    0.020

    0.015

    0.010

    0.005

    0.000

    Variance

    0.035

    0.030

    0.025

    0.020

    0.015

    0.010

    0.005

    0.000

    Variance

    0.035

    0.030

    0.025

    0.020

    0.015

    0.010

    0.005

    0.000

    0

    Lag Distance (m)

    14012010080604020

    log10 of skew = 0.5

    log10 of skew = 1.0

    log10 of skew = 1.5

    log10 of skew = 2.0

    log10 of skew = 3.0

    -m (c, d) and 20-m (e, f) grids transformed to square roots (a, c, e)

  • ARTICLE IN PRESS

    ted o

    d wi

    33

    99

    40

    00

    R. Kerry, M.A. Oliver / Computers & Geosciences 33 (2007) 123312601250Table 7

    Parameters of models tted to experimental variograms compu

    generated by a variogram function with a nugget:sill ratio of 0 an

    Original skewness

    of data

    Grid interval of

    data (m)

    Model type c0

    0.5 5 Spherical 0.01

    1.0 5 Spherical 0.02

    1.5 5 Spherical 0.04

    2.0 5 Spherical 0.06data with 0 nugget. As the nugget variance in-creases, the MSEs increase more for coefcients ofskewness 41.5, but overall there is little difference.For a given coefcient of skewness, the MeSDRsdepart less from 0.455 as the nugget varianceincreases, but again this is small. The effects onthe MeSDR with increasing skewness caused by out-liers appear to be less when the original data weresimulated with a nugget variance (Table 3). This mightbe because the larger the nugget variance, the more is

    3.0 5 Circular 0.1284

    0.5 10 Circular 0.0185

    1.0 10 Circular 0.0391

    1.5 10 Circular 0.0562

    2.0 10 Circular 0.0753

    3.0 10 Circular 0.1663

    0.5 20 Spherical 0.0225

    1.0 20 Spherical 0.0327

    1.5 20 Spherical 0.0441

    2.0 20 Spherical 0.0586

    3.0 20 Spherical 0.1254

    Table 8

    Parameters of models tted to experimental variograms computed on

    generated by a variogram function with a nugget:sill ratio of 0 and wi

    Original

    skewness of data

    Grid interval of

    data (m)

    Model type c0

    0.5 5 Pentaspherical 0.00242

    1.0 5 Pentaspherical 0.00466

    1.5 5 Pentaspherical 0.00633

    2.0 5 Pentaspherical 0.00806

    3.0 5 Pentaspherical 0.01408

    0.5 10 Circular 0.00308

    1.0 10 Circular 0.00575

    1.5 10 Circular 0.00770

    2.0 10 Circular 0.00970

    3.0 10 Circular 0.01763

    0.5 20 Circular 0.00526

    1.0 20 Circular 0.00677

    1.5 20 Spherical 0.00767

    2.0 20 Spherical 0.00959

    3.0 20 Spherical 0.01483n data transformed to square roots on 5-, 10- and 20-m grids

    th skewness caused by randomly located outliers

    c a c0+c c0:c0+c

    0.0591 76.1 0.072 0.183

    0.0578 77.9 0.088 0.341

    0.0571 79.3 0.101 0.435

    0.0566 80.8 0.117 0.515the weight given to the more distant points in thekriging neighbourhood. This results in greater smooth-ing of the predictions and of the kriging errors, inparticular when outliers are present in the data.Table 12 shows that the MSEs increase markedly

    as the coefcient of skewness increases for all sizesof data set. It is notable from Table 12 that theeffect of data-set size is small; the MSEs for the20-m data are larger than for the two larger sets ofdata, but not especially so. The MeSDRs are

    0.0538 77.0 0.182 0.705

    0.0695 65.7 0.088 0.210

    0.0581 66.7 0.097 0.402

    0.0565 67.6 0.113 0.498

    0.0551 68.7 0.130 0.577

    0.0506 74.0 0.217 0.767

    0.0616 62.9 0.084 0.268

    0.0654 59.2 0.098 0.333

    0.0703 55.8 0.114 0.385

    0.0788 51.7 0.137 0.426

    0.0938 53.1 0.219 0.572

    data transformed to logarithms (log10) on 5-, 10- and 20-m grids

    th skewness caused by randomly located outliers

    c a c0+c c0:c0+c

    0.01186 94.1 0.014 0.170

    0.01155 96.4 0.016 0.288

    0.01141 98.0 0.018 0.357

    0.01130 99.6 0.019 0.416

    0.01107 105.3 0.025 0.560

    0.01301 68.0 0.016 0.191

    0.01260 68.7 0.018 0.313

    0.01237 69.3 0.020 0.384

    0.01218 69.9 0.022 0.443

    0.01160 72.4 0.029 0.603

    0.01181 63.1 0.017 0.308

    0.01202 61.2 0.019 0.360

    0.01294 65.4 0.021 0.372

    0.01336 63.3 0.023 0.418

    0.01463 58.8 0.029 0.503

  • ARTICLE IN PRESSR. Kerry, M.A. Oliver / Computers & Geosciences 33 (2007) 12331260 1251riance

    0.35

    0.30

    0.25

    0.20considerably less than 0.455 for all coefcients ofskewness and for all sizes of data set (Table 12); thismeans that the function is over-estimating thekriging variance. The departure from 0.455 isconsiderable even for a skewness of 0.5. TheMeSDRs, however, do not show a consistentpattern in relation to the size of data set.Table 12 gives the cross-validation results for all

    sizes of data set and skewness after transformation

    0

    Va 0.15

    0.10

    0.05

    0.00

    Variance

    0.35

    0.30

    0.25

    0.20

    0.15

    0.10

    0.05

    0.00

    10080604020

    Lag Distance (m)

    0 10080604020

    Lag Distance (m)

    square root of skew = 0.5

    square root of skew = 1.0

    square root of skew = 1.5

    square root of skew = 2.0

    square root of skew = 3.0

    Fig. 8. Experimental variograms computed from data on 10-m grid tr

    caused by outliers spatially aggregated in the corner (a, b) and centre (ariance

    0.045

    0.040

    0.035

    0.030

    0.025to square roots and logarithms. The pattern of theMeSDRs with increasing asymmetry for the 10-mdata after transformation was similar for all nuggetvariances, so the results are not presented here.The MSEs for the square root- and log10-trans-formed data increase as the original skewness in thedata increases and also as the size of data setdecreases. The effect of size of data set is small,however. Table 12 shows that the MeSDRs for all

    V

    0 10080604020

    Lag Distance (m)

    0 10080604020

    Lag Distance (m)

    0.020

    0.015

    0.010

    0.005

    0.000

    Variance

    0.045

    0.040

    0.035

    0.030

    0.025

    0.020

    0.015

    0.010

    0.005

    0.000

    log10 of skew = 0.5

    log10 of skew = 1.0

    log10 of skew = 1.5

    log10 of skew = 2.0

    log10 of skew = 3.0

    ansformed to square roots (a, c) and log10 (b, d) with skewness

    c, d) of the eld.

  • ARTICLE IN PRESS

    from

    aggr

    ode

    ircul

    ircul

    entas

    ircul

    ircul

    ircul

    ircul

    R. Kerry, M.A. Oliver / Computers & Geosciences 33 (2007) 123312601252sizes of data set and for both transformationsdepart considerably from 0.455. There is littledifference between the MeSDRs for a givencoefcient of skewness for both transformations;sometimes the MeSDR departs less from 0.455 forthe square root transformation and at others itdeparts less for the log10 one. The MeSDRs for thetransformed data depart only slightly less from0.455 than do those for the raw data, Table 12. Inall cases, the greater the original asymmetry in thedata, the greater the degree of departure from 0.455.In general, these results suggest that data transfor-mations do not improve markedly the performanceof the model for kriging, although they reduce thecoefcient of skewness.For all sizes of data set, the MSEs and MeSDRs

    for the variograms and data where the outliers hadbeen removed were the smallest and showed theleast departure from 0.455, respectively (Table 12).

    Table 9

    Parameters of models tted to experimental variograms computed

    with a nugget:sill ratio of 0 where randomly located and spatially

    Location of outliers Grid interval

    of data (m)

    Rate of

    contamination

    M

    Random 5 0.05 C

    Random 10 0.05 C

    Random 20 0.05 P

    Random 5 0.02 C

    Random 10 0.02 C

    Aggregated edge 10 0.05 C

    Aggregated centre 10 0.05 C

    data on 5-, 10- and 20-m grids generated by a variogram function

    egated outliers have been removed

    l type c0 c a c0+c c0:c0+c

    ar 0.028 0.946 65.0 0.974 0.029

    ar 0.021 0.963 64.0 0.984 0.021

    pherical 0.060 0.913 94.5 0.973 0.062

    ar 0.022 0.958 65.2 0.980 0.022

    ar 0.023 0.961 65.3 0.984 0.023

    ar 0.042 0.908 68.6 0.950 0.044

    ar 0.010 0.940 60.9 0.950 0.011

    If the outliers are essentially nuisance data, then this approach would be the most sensible. However, if the outliers are important values, such as large values of a pollutant, they should be returned to the data for kriging. The cross-validation results with the outliers restored and using the variogram computed after outliers had been removed are given in parentheses in Table 12 for the 10-m data. The MSEs for all original coefficients of skewness of the 10-m data are larger than are those computed with the original variograms of these data. The MeSDRs also depart more from 0.455 than do those for the original data; they are larger, which means that the kriging variances have been under-estimated. These results show that the variogram model for data with outliers performs better in kriging than that with the outliers removed.

    Table 13 gives the cross-validation results for outliers aggregated either near the edge or the centre of the field for all sizes of data set. This is the only analysis for which there is an obvious effect of the size of data set; for a given coefficient of skewness the MSEs increase as asymmetry increases and as the size of the data set decreases. However, the MSEs are considerably smaller for the aggregated outliers than for the randomly located ones (Table 12). The MSEs for the centrally grouped outliers are larger than are those for outliers at the edge, but the difference is small for all sizes of data. For randomly located outliers, the kriging errors are large in their vicinity because of the lack of any relation with surrounding values. In addition, the larger errors associated with the outliers are widely distributed over the field of data. For the aggregated outliers, the errors in the uncontaminated part of the field are small and they will also be smaller in the contaminated area than for the randomly located outliers. This is because adjacent values within the kriging neighbourhood in the area of aggregated outliers will be more similar to each other than those where the outliers are randomly located.

    The MeSDRs for aggregated outliers depart less overall from 0.455 than do those of the randomly located ones (Tables 13 and 12, respectively). The MeSDRs for outliers aggregated near the edge of the field show less departure from 0.455 in general than do those for outliers aggregated near the centre for the two smaller sets of data; the opposite is the case for the 5-m data. The centrally located outliers are likely to result in larger MSEs than those located at the edge of the field because they will be involved in many more predictions as the kriging neighbourhood moves over the field.
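The two cross-validation diagnostics used throughout this section can be sketched in a few lines. This is an illustration only, not the authors' code: the function name, argument names and synthetic errors are assumptions. The MSE is the mean of the squared differences between observed and kriged values, and the MeSDR is the median of those squared differences divided by the kriging variances. If the errors are Gaussian with variances equal to the kriging variances, each ratio follows a chi-squared distribution with one degree of freedom, whose median is approximately 0.455.

```python
import numpy as np


def cross_validation_diagnostics(observed, predicted, kriging_variance):
    """Return (MSE, MeSDR) for a set of cross-validation predictions.

    Hypothetical helper for illustration; the argument names are assumptions.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    kriging_variance = np.asarray(kriging_variance, dtype=float)

    squared_errors = (observed - predicted) ** 2
    mse = float(np.mean(squared_errors))
    # Squared deviation ratio at each site, summarized by its median.
    mesdr = float(np.median(squared_errors / kriging_variance))
    return mse, mesdr


# Synthetic check (assumed values): Gaussian errors whose variance
# equals the stated kriging variance at every site.
rng = np.random.default_rng(0)
n = 20_000
true_variance = np.full(n, 0.25)
predicted = np.zeros(n)
observed = rng.normal(0.0, np.sqrt(true_variance))

mse, mesdr = cross_validation_diagnostics(observed, predicted, true_variance)

# Same errors, but with the kriging variance halved (under-estimated).
_, mesdr_under = cross_validation_diagnostics(
    observed, predicted, 0.5 * true_variance
)

print(f"MSE = {mse:.3f}")
print(f"MeSDR = {mesdr:.3f}")
print(f"MeSDR with halved kriging variance = {mesdr_under:.3f}")
```

Halving the kriging variance doubles every ratio, so the MeSDR moves above 0.455. That is the direction of departure that the text associates with under-estimated kriging variances.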


    Fig. 9. Experimental variograms computed from data simulated by a variogram function with a nugget:sill ratio of 0 on a 10-m grid by Matheron's estimator and by Cressie and Hawkins's, Dowd's and Genton's robust estimators for skewness coefficients of: (a) 0, (b) 0.5, (c) 1.0, (d) 1.5, (e) 2.0 and (f) 3.0 caused by randomly located outliers. [Figure not reproduced: six panels, each plotting variance (0 to 3.0) against lag distance (0 to 100 m).]
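The four estimators compared in these figures can be sketched as follows. This is a minimal sketch under stated assumptions, not the software used in the paper: it works on a one-dimensional regular transect rather than a grid, the function names are hypothetical, and the contaminating process (mean 6, unit variance) is an arbitrary choice for illustration. The formulas are the standard published forms: Matheron's method-of-moments estimator, the Cressie and Hawkins fourth-power estimator, Dowd's median-based estimator, and Genton's estimator based on the Qn scale estimator of Rousseeuw and Croux.

```python
import numpy as np


def lag_differences(z, lag):
    """Differences z[i + lag] - z[i] along a regular 1-D transect (assumed layout)."""
    z = np.asarray(z, dtype=float)
    return z[lag:] - z[:-lag]


def matheron(d):
    """Method-of-moments semivariance: half the mean squared difference."""
    return 0.5 * np.mean(d ** 2)


def cressie_hawkins(d):
    """Fourth power of the mean square-rooted absolute difference, bias-corrected."""
    n = d.size
    correction = 0.457 + 0.494 / n + 0.045 / n ** 2
    return 0.5 * np.mean(np.sqrt(np.abs(d))) ** 4 / correction


def dowd(d):
    """Semivariance from the median absolute difference."""
    return 0.5 * 2.198 * np.median(np.abs(d)) ** 2


def genton(d):
    """Semivariance from the Qn scale estimator applied to the differences."""
    n = d.size
    h = n // 2 + 1
    k = h * (h - 1) // 2              # order statistic used by Qn
    i, j = np.triu_indices(n, k=1)    # all pairs with i < j
    pairwise = np.abs(d[i] - d[j])
    q_n = 2.2191 * np.partition(pairwise, k - 1)[k - 1]
    return 0.5 * q_n ** 2


estimators = {
    "Matheron": matheron,
    "Cressie-Hawkins": cressie_hawkins,
    "Dowd": dowd,
    "Genton": genton,
}

# Assumed test data: unit-variance white noise, so the true semivariance is 1.
rng = np.random.default_rng(1)
clean = rng.normal(0.0, 1.0, 2000)

# Contaminate 5% of randomly chosen sites with a secondary process (assumed N(6, 1)).
contaminated = clean.copy()
sites = rng.choice(clean.size, size=int(0.05 * clean.size), replace=False)
contaminated[sites] = rng.normal(6.0, 1.0, sites.size)

clean_gamma = {
    name: float(f(lag_differences(clean, 1))) for name, f in estimators.items()
}
contaminated_gamma = {
    name: float(f(lag_differences(contaminated, 1))) for name, f in estimators.items()
}

for name in estimators:
    print(f"{name:16s} clean = {clean_gamma[name]:.3f}  "
          f"contaminated = {contaminated_gamma[name]:.3f}")
```

On uncontaminated Gaussian data all four estimators target the same semivariance. With randomly located outliers, Matheron's estimator is inflated most because it squares every difference, whereas the robust estimators downweight or ignore the extreme differences.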


    Table 10

    Parameters of models fitted to robust experimental variograms computed from data on a 10-m grid generated by a variogram function with a nugget:sill ratio of 0 contaminated by randomly located outliers (rate of 0.05) and cross-validation results

    Robust estimator Model type c0 c a c0+c c0:c0+c Robust models

    Skewness MSE^a MeSDR^a

    Cressie and Hawkins

    0.5 Pentaspherical 0 1.166 71.75 1.166 0 0.654 0.497

    1.0 Pentaspherical 0.1291 1.259 77.18 1.388 0.0930 1.205 0.387

    1.5 Pentaspherical 0.1658 1.372 77.66 1.538 0.1078 1.736 0.352

    2.0 Pentaspherical 0.2018 1.477 78.02 1.679 0.1202 2.382 0.334

    3.0 Pentaspherical 0.3552 1.846 78.99 2.201 0.1614 6.085 0.301

    Dowd

    0.5 Pentaspherical 0.02142 1.155 84.70 1.176 0.0182 0.641 0.515

    1.0 Circular 0.00244 1.162 54.89 1.164 0.0021 1.286 0.753

    1.5 Circular 0 1.200 56.16 1.200 0 1.880 0.885

    2.0 Spherical 0 1.214 65.08 1.214 0 2.608 0.866

    3.0 Spherical 0 1.215 65.14 1.215 0 6.833 1.127

    Genton

    0.5 Spherical 0.02945 1.025 56.65 1.055 0.0279 0.639 0.469

    1.0 Spherical 0.00691 1.200 56.99 1.207 0.0057 1.286 0.631

    1.5 Spherical 0 1.280 58.89 1.280 0 1.886 0.735

    2.0 Spherical 0 1.318 60.15 1.318 0 2.611 0.740

    3.0 Spherical 0 1.336 60.82 1.336 0 6.839 0.956

    a MSE is the mean squared error and MeSDR is the median squared deviation ratio.

    For the transformed data, results for the 10-m data only are given. For both transformations the MSEs increase little as skewness in the original data increases (Table 13). This suggests that transformation has been successful for the aggregated outliers, regardless of whether they are near the edge or the centre of the field. This is supported by the MeSDRs, which are closer to 0.455 than for the original data and for the data with randomly located outliers (Table 12). The results also suggest that the transformation to square roots is more successful for outliers aggregated near the edge of the field and that to log10 for outliers near the centre; however, the difference between them is small. The MeSDRs for data with the outliers removed are the closest to 0.455 for both scenarios, but those for outliers near the edge are the closer. The results after the removal of outliers appear to suggest that this is preferable to data transformation before computing the variogram. However, outliers were not removed for the transformation and this might be important on contaminated sites so that information is not lost. When the outliers are returned to the data for kriging, however, the MeSDRs depart more from 0.455 (Table 13, see results in parentheses for coefficient of skewness of 0.5) than do those for the square root-transformed data for outliers aggregated near the edge and log10-transformed data aggregated near the centre.

    Where the original data contain randomly located outliers, the MeSDRs for the transformed data tend to be more similar for different coefficients of skewness compared with those where there was underlying asymmetry. The latter showed a more pronounced decrease in MeSDR as skewness increased (Kerry and Oliver, 2007). The overall pattern of MeSDR values suggests that for all grid sizes the log10 transformation is the more effective.

    Tables 10 and 11 give the cross-validation results for the robust variogram estimators for the 10-m data only simulated with 0 and 0.5 nugget variance, respectively. The MSEs increase in all cases with increasing asymmetry, but to a slightly smaller extent for the Cressie and Hawkins estimator than for the Dowd and Genton ones. The MSEs are also smaller for the data simulated with a nugget:sill ratio of 0.5, which supports the previous observations for the 10-m data with Matheron's estimator. As above, the smoothing effect on kriged predictions of a nugget effect in the variogram reduces the


    Fig. 10. Experimental variograms computed from data simulated by a variogram function with a nugget:sill ratio of 0.5 on a 10-m grid by Matheron's estimator and by Cressie and Hawkins's, Dowd's and Genton's robust estimators for skewness coefficients of: (a) 0, (b) 0.5, (c) 1.0, (d) 1.5, (e) 2.0 and (f) 3.0 caused by randomly located outliers. [Figure not reproduced: six panels, each plotting variance (0 to 3.0) against lag distance (0 to 100 m).]


    Table 11

    Parameters of models fitted to robust experimental variograms computed from data on a 10-m grid generated by a variogram function with a nugget:sill ratio of 0.5 contaminated by randomly located outliers (rate of 0.05) and cross-validation results

    Robust estimator Model type c0 c1 A c0+c c0:c0+c Robust models

    Skewness MSEa MeSDRa

    Cressie and Hawkins

    0.5 Spherical 0.7718 0.5025 68.98 1.274 0.6057 1.054 0.386

    1.0 Spherical 0.9425 0.5762 71.56 1.519 0.6206 1.577 0.353

    1.5 Spherical 1.046 0.6262 72.24 1.672 0.6255 2.056 0.316

    2.0 Spherical 1.146 0.6714 72.81 1.817 0.6306 2.639 0.305

    3.0 Circular 1.570 0.7901 69.03 2.360 0.6653 5.988 0.328

    Dowd

    0.5 Spherical 0.6609 0.5645 71.58 1.225 0.5395 1.055 0.433

    1.0 Spherical 0.6933 0.6503 73.14 1.343 0.5162 1.587 0.458

    1.5 Spherical 0.6988 0.6651 73.28 1.364 0.5123 2.075 0.470

    2.0 Spherical 0.6974 0.6697 73.18 1.367 0.5102 2.669 0.503

    3.0 Spherical 0.6974 0.6697 73.18 1.367 0.5102 6.104 0.667

    Genton

    0.5 Spherical 0.7436 0.5207 68.37 1.264 0.5883 1.054 0.394

    1.0 Spherical 0.8273 0.6419 70.78 1.469 0.5632 1.582 0.394

    1.5 Spherical 0.8265 0.7692 71.54 1.596 0.5179 2.075 0.398

    2.0 Spherical 0.8266 0.7707 71.75 1.597 0.5176 2.668 0.423

    3.0 Spherical 0.8214 0.7893 71.49 1.611 0.5099 6.111 0.561

    a MSE is the mean squared error and MeSDR is the median squared deviation ratio.
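The last two parameter columns of Tables 10 and 11 follow directly from the fitted nugget variance c0 and the spatially correlated component c: the sill is c0 + c and the nugget:sill ratio is c0/(c0 + c). The short check below recomputes both for the Cressie and Hawkins rows of Table 11. The helper name is hypothetical, and small discrepancies in the last digit are to be expected because the published c0 and c are themselves rounded.

```python
# (skewness, c0, c, published sill c0 + c, published nugget:sill ratio)
# Cressie and Hawkins rows of Table 11.
rows = [
    (0.5, 0.7718, 0.5025, 1.274, 0.6057),
    (1.0, 0.9425, 0.5762, 1.519, 0.6206),
    (1.5, 1.046, 0.6262, 1.672, 0.6255),
    (2.0, 1.146, 0.6714, 1.817, 0.6306),
    (3.0, 1.570, 0.7901, 2.360, 0.6653),
]


def nugget_sill_ratio(c0, c):
    """Return (sill, nugget:sill ratio) from nugget c0 and correlated variance c."""
    sill = c0 + c
    return sill, c0 / sill


computed = [nugget_sill_ratio(c0, c) for _, c0, c, _, _ in rows]

for (skew, c0, c, sill_pub, ratio_pub), (sill, ratio) in zip(rows, computed):
    print(f"skewness {skew}: sill {sill:.4f} (published {sill_pub}), "
          f"ratio {ratio:.4f} (published {ratio_pub})")
```

The generating variogram had a nugget:sill ratio of 0.5, so the recomputed ratios show the fitted ratio for this estimator lying above that value and rising with the coefficient of skewness.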

    Table 12

    Mean squared errors and median squared deviation ratios, MSE and MeSDR, respectively, from cross-validation using variogram model

    parameters from original data contaminated by randomly located outliers (rate 0.05) on 5-, 10- and 20-m grids (nugget:sill ratio of 0), from

    data transformed to square roots and logarithms (log10), and from data with outliers removed

    Coefficient of skewness of original data; Original data: MSE, MeSDR; Transformed data, square root^a: MSE, MeSDR; Transformed data, log10^a: MSE, MeSDR; Outliers removed (outliers returned for kriging): MSE, MeSDR

    5-m grid

    0 0.1322 0.703

    0.5 0.3944 0.214 0.02215 0.226 0.004089 0.206 0.1254 (0.4285) 0.516 (0.773)

    1.0 0.8138 0.128 0.03979 0.146 0.006451 0.144

    1.5 1.217 0.100 0.05479 0.116 0.008219 0.123

    2.0 1.720 0.082 0.07182 0.097 0.01005 0.112

    3.0 4.258 0.058 0.1434 0.069 0.01646 0.076

    10-m grid

    0 0.1677 0.496

    0.5 0.5812 0.219 0.03137 0.234 0.005387 0.245 0.1678 (0.6361) 0.490 (0.660)

    1.0 1.077 0.129 0.05197 0.153 0.008115 0.166 (1.256) (0.869)

    1.5 1.530 0.113 0.06863 0.130 0.01007 0.149 (1.829) (1.017)

    2.0 2.083 0.087 0.08719 0.103 0.01205 0.126 (2.531) (1.006)

    3.0 5.326 0.065 0.1758 0.074 0.01987 0.091 (6.626) (1.424)


    20-m grid

    0 0.3691 0.394

    0.5 0.9816 0.199 0.05992 0.211 0.00922 0.189 0.3306 (1.007) 0.303 (0.586)

    1.0 1.373 0.188 0.07594 0.184 0.01131 0.164

    1.5 1.854 0.185 0.09371 0.185 0.1332 0.155

    2.0 2.605 0.148 0.1183 0.175 0.01595 0.165

    3.0 5.528 0.082 0.1989 0.126 0.02293 0.137

    MSE is the mean squared error.

    MeSDR is the median squared deviation ratio.

    Values in brackets are for kriging using the variogram with outliers removed but with the outliers returned to the data for kriging.

    a A constant of 4 was added to values in each original data set so that all values were just positive for the logarithmic transformation.
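The transformation step described in the footnote can be sketched as follows. This is an illustrative sketch, not the authors' procedure: the sample size, the parameters of the contaminating process (mean 5, unit variance) and the lower guard of -3.9 on the primary process are assumptions made so that adding the constant of 4 keeps every value positive. Only the contamination rate of 0.05 and the constant of 4 come from the text.

```python
import numpy as np


def skewness(x):
    """Moment coefficient of skewness, m3 / m2**1.5."""
    x = np.asarray(x, dtype=float)
    centred = x - x.mean()
    return float(np.mean(centred ** 3) / np.mean(centred ** 2) ** 1.5)


rng = np.random.default_rng(42)
n = 1600

# Primary N(0, 1) process; the guard at -3.9 is an assumption that keeps
# every value positive once the constant of 4 is added.
primary = np.maximum(rng.normal(0.0, 1.0, n), -3.9)

# Contaminate 5% of randomly chosen sites with a secondary process
# (assumed here to be N(5, 1)).
data = primary.copy()
outlier_sites = rng.choice(n, size=int(0.05 * n), replace=False)
data[outlier_sites] = rng.normal(5.0, 1.0, outlier_sites.size)

shifted = data + 4.0                      # constant of 4, as in the footnote
skew_raw = skewness(data)
skew_sqrt = skewness(np.sqrt(shifted))    # square-root transformation
skew_log = skewness(np.log10(shifted))    # log10 transformation

print(f"skewness: raw = {skew_raw:.2f}, "
      f"square root = {skew_sqrt:.2f}, log10 = {skew_log:.2f}")
```

Both transformations compress the upper tail, so the coefficient of skewness falls, and it falls further for logarithms than for square roots. A smaller coefficient of skewness does not by itself mean that the outliers have stopped affecting the variogram, which is why the tables also report cross-validation results for the transformed data.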

    Table 13

    Mean squared errors and median squared deviation ratios, MSE and MeSDR, respectively, from cross-validation using model parameters

    of original data on grids of 5-, 10- and 20-m (nugget:sill ratio of 0), and on the 10-m grid for data transformed to square roots and

    logarithms (log10) and with outliers removed where outliers are spatially aggregated

    Data with grouped outliers; Skewness coefficient of original data; Original data: MSE, MeSDR; Transformed data, square root^a: MSE, MeSDR; Transformed data, log10^a: MSE, MeSDR; Outliers removed (outliers returned for kriging): MSE, MeSDR

    5-m corner 0 0.1322 0.703

    5-m corner 0.5 0.1898 0.357

    5-m corner 1.0 0.2083 0.310

    5-m corner 1.5 0.2254 0.268

    5-m corner 2.0 0.2464 0.234

    5-m corner 3.0 0.3524 0.136

    5-m centre 0 0.1322 0.703

    5-m centre 0.5 0.1982 0.404

    5-m centre 1.0 0.2174 0.437

    5-m centre 1.5 0.2397 0.348

    5-m centre 2.0 0.2686 0.268

    5-m centre 3.0 0.4544 0.114

    10-m corner 0 0.1677 0.496

    10-m corner 0.5 0.2809 0.426 0.01659 0.429 0.002922 0.362 0.1684 (0.2816) 0.458 (0.515)

    10-m corner 1.0 0.3363 0.376 0.01825 0.394 0.003002 0.335

    10-m corner 1.5 0.3983 0.299 0.02037 0.363 0.003203 0.312

    10-m corner 2.0 0.4808 0.262 0.02315 0.298 0.003462 0.267

    10-m corner 3.0 1.041 0.111 0.03870 0.241 0.004750 0.203

    10-m centre 0 0.1677 0.496

    10-m centre 0.5 0.3154 0.357 0.01365 0.361 0.003052 0.404 0.1616 (0.3161) 0.475 (0.578)

    10-m centre 1.0 0.3661 0.318 0.01433 0.305 0.003153 0.356

    10-m centre 1.5 0.4286 0.240 0.01552 0.255 0.003327 0.323

    10-m centre 2.0 0.5502 0.172 0.01799 0.199 0.003655 0.273

    10-m centre 3.0 0.9632 0.084 0.02590 0.118 0.004521 0.196

    20-m corner 0 0.3691 0.394

    20-m corner 0.5 0.4138 0.386

    20-m corner 1.0 0.4547 0.368

    20-m corner 1.5 0.5130 0.305

    20-m corner 2.0 0.5957 0.215

    20-m corner 3.0 1.156 0.112

    20-m centre 0 0.3691 0.394

    20-m centre 0.5 0.7320 0.229

    20-m centre 1.0 0.8937 0.163

    20-m centre 1.5 1.107 0.196

    20-m centre 2.0 1.279 0.101

    20-m centre 3.0 2.592 0.059

    MSE is the mean squared error.

    MeSDR is the median squared deviation ratio.

    Values in brackets are for kriging using the variogram with outliers removed but with the outliers returned to the data for kriging.

    a A constant of 4 was added to values in each original data set so that all values were just positive for the logarithmic transformation.

    localized errors caused by the outliers. The MeSDRs are closer to 0.455 for Cressie and Hawkins's estimator for data simulated with a nugget:sill ratio of 0 (Table 10), followed by those for Genton's estimator. For data simulated with a nugget:sill ratio of 0.5, the MeSDRs are closest to 0.455 for Dowd's estimator for skewness coefficients <2 and for Genton's for skewness coefficients of 2.0 and 3.0. The MeSDRs are sometimes ≫0.455 for the robust variograms, which suggests that the function is under-estimating the kriging variance, possibly because of the effect of non-normality on the robust estimator. The MeSDRs for Cressie and Hawkins's estimator for data with no nugget variance and those for Dowd's and Genton's estimators for data with a nugget variance are similar to those for data with the outliers removed. However, it is important to note that when the outliers are returned for kriging and the variogram computed on data with the outliers removed is used, the MSEs are still slightly smaller than are those for Dowd's and Genton's estimators, but overall the MeSDRs depart more from 0.455 than do those of the robust estimators.

    4. Conclusions

    Where asymmetry arises from contamination of a primary process by a secondary process at a relatively small number of randomly located sites (outliers), the effect of size of data set on both the form of the variogram and the results of cross-validation is small compared with that observed where asymmetry arises from a long tail in the distribution (Kerry and Oliver, 2007). The rate of contamination by outliers, however, has more effect on the variogram and results of cross-validation than does the size of data set. A more modest coefficient of skewness of 3.0 caused by outliers distorts the variogram considerably more than that observed for underlying asymmetry with larger coefficients of skewness.

    For a given coefficient of skewness caused by randomly located outliers, an increase in nugget:sill ratio of the generating variogram up to and including 0.5 has little effect on the shape of the variogram for different coefficients of skewness, but for a nugget:sill ratio of 0.75 the variograms are almost pure nugget for the larger coefficients of

    skewness. However, cross-validation showed that the larger the nugget:sill ratios in the generating variogram, the smaller the departure of the MeSDR from 0.455 as skewness increases.

    The effect of aggregated outliers compared with randomly located ones on the nugget variance is marked; the former have very little effect on it. There is also a difference in the effects on the variogram if outliers are aggregated near to the edge of the field or the centre. The MSEs indicate that if outliers are aggregated, their effect on the accuracy of prediction is less than if they are randomly located. This is further confirmed by the smaller departure of the MeSDR from 0.455 for aggregated outliers compared with randomly located ones. For aggregated outliers, there is less need to ameliorate their effect until the skewness coefficient ≥2.

    Transformation to square roots and logarithms reduces the coefficient of skewness and improves the shape of the variogram for the larger initial skewness coefficients. However, the MeSDRs depart considerably from 0.455 for all coefficients of skewness and this is only marginally less so than for the raw data. In contrast, the MeSDRs for the robust variograms show less departure from 0.455 overall than do those for the transformed data. For a generating variogram with a nugget:sill ratio of 0, Cressie and Hawkins's (1980) estimator resulted in MeSDRs that were closest to 0.455, whereas with a nugget:sill ratio of 0.5 Dowd's estimator performed the best in this context. These results indicate that one should compute several robust estimators, if the presence of outliers is suspected, as they perform differently under different circumstances. The removal of outliers, however, resulted in MeSDRs that were generally closer to 0.455, but, as noted above, this does not hold when the outliers are returned to the data for kriging.

    The MSEs for randomly located outliers suggest that the standard approach that many geostatisticians have adopted of mitigating the effects of asymmetry only when the skewness coefficient exceeds the bounds ±1 might need revising when asymmetry arises from randomly located outliers. There is a large increase in MSE (Table 12) between skewness coefficients of 0.5 and 1.0, which suggests that for skewness coefficients >0.75 there is a need to reduce the asymmetry. The current best practice approach of removing outliers appears to be the most appropriate method, when they are randomly located and will not be returned to the data for kriging. However, when the outliers form a crucial part of the investigation, as on contaminated sites, or when they are difficult to identify in the exploratory data analysis, we recommend that practitioners compute several robust variogram estimators in preference to the removal of outliers and use the one that gives the best cross-validation results.

    Acknowledgements

    We thank Professors R. Webster and A.B. McBratney and an unknown referee for their guidance in revising this paper.

    References

    Armstrong, M., Delfiner, P., 1980. Towards a more robust variogram: a case study on coal. Unpublished Note N-671, Centre de Géostatistique et de Morphologie Mathématique, Fontainebleau, 49pp.

    Barnett, V., Lewis, T., 1994. Outliers in Statistical Data, third ed. Wiley, Chichester, 604pp.

    Cressie, N.A.C., 1993. Statistics for Spatial Data. Wiley, New York, 900pp.

    Cressie, N., Hawkins, D., 1980. Robust estimation of the variogram. Mathematical Geology 12 (2), 115–125.

    Deutsch, C.V., Journel, A.G., 1992. GSLIB: Geostatistical Software Library and User's Guide. Oxford University Press, New York, 369pp.

    Dowd, P.A., 1984. The variogram and kriging: robust and resistant estimators. In: Verly, G., David, M., Journel, A.G., Maréchal, A. (Eds.), Geostatistics for Natural Resources Characterization. Reidel, Dordrecht, pp. 221–236.

    Genton, M.G., 1998a. Highly robust variogram estimation. Mathematical Geology 30 (2), 213–221.

    Genton, M.G., 1998b. Spatial breakdown point of robust estimators. Mathematical Geology 30 (7), 853–871.

    Gnanadesikan, R., Kettenring, J.R., 1972. Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 28 (1), 81–124.

    Goovaerts, P., 1997. Geostatistics for Natural Resources Evaluation. Oxford University Press, New York, 483pp.

    Haslett, J., Bradley, R., Craig, P.S., Wills, G., Unwin, A.R., 1991. Dynamic graphics for exploring spatial data, with application to locating global and local anomalies. American Statistician 45 (3), 234–242.

    Kerry, R., Oliver, M.A., 2007. Determining the effect of asymmetric data on the variogram. I. Underlying asymmetry. Computers and Geosciences, doi:10.1016/j.cageo.2007.05.008.

    Lark, R.M., 2000. A comparison of some robust estimators of the variogram for use in soil survey. European Journal of Soil Science 51 (1), 137–157.

    Matheron, G., 1965. Les variables régionalisées et leur estimation: une application de la théorie de fonctions aléatoires aux sciences de la nature. Masson et Cie, Paris, 306pp.

    McBratney, A.B., Webster, R., 1986. Choosing functions for semi-variograms of soil properties and fitting them to sampling estimates. Journal of Soil Science 37 (4), 617–639.

    Payne, R.W. (Ed.), 2006. The Guide to GenStat Release 9 – Part 2: Statistics. VSN International, Hemel Hempstead.

    Rousseeuw, P.J., Croux, C., 1992. Explicit scale estimators with high breakdown point. In: Dodge, Y. (Ed.), L1 Statistical Analyses and Related Methods. North-Holland, Amsterdam, pp. 77–92.

    Rousseeuw, P.J., Croux, C., 1993. Alternatives to the median absolute deviation. Journal of the American Statistical Association 88 (424), 1273–1283.

    Tukey, J.W., 1977. Exploratory Data Analysis. Addison-Wesley, Reading, MA, 688pp.

    Webster, R., Oliver, M.A., 2001. Geostatistics for Environmental Scientists. Wiley, Chichester, 271pp.
