ELSEVIER The Science of the Total Environment 198 (1997) 243-257

Estimation of measurement uncertainty from field sampling: implications for the classification of contaminated land

Michael H. Ramsey*, Ariadni Argyraki

Environmental Geochemistry Research, Department of Geology, Imperial College, London SW7 2BP, UK

Received 17 November 1996; accepted 8 February 1997

Abstract

Methods have been devised for estimating measurement uncertainties due to field sampling. These methods have revealed very large discrepancies between the estimated levels of lead in one area of contaminated land. The uncertainty estimate for the estimated lead concentration at one sampling site, when sampled by one sampler (an individual collecting the samples) using one protocol, was very large (50%, for 95% confidence). The uncertainty on the estimated mean concentration was also large (55%) for nine different organisations using the same sampling protocol. These organisations participated in the sampling trials by sending samplers to the sampling targets. This uncertainty was due entirely to the sampling, rather than the analytical measurement, as all the analyses were carried out in random sequence within one batch. When sampling organisations used whatever sampling protocol they considered appropriate, and analysed their own samples, the uncertainty on the mean concentration was 45.5%. The similarities in these estimates suggest that heterogeneity is a limiting factor in measurement uncertainty at this site. The existing criteria for the classification of contaminated land generally depend on a deterministic comparison between the measured concentration of a contaminant and a threshold level. With such large values of measurement uncertainty in the estimated concentration of the contaminant due to field sampling, it is likely that land will be misclassified. This has legal, financial and possible health implications, arising both from the unnecessary remediation of ‘uncontaminated’ land and from regarding land as uncontaminated that is in fact contaminated. A new probabilistic classification scheme for contaminated land is proposed that allows for the overall measurement uncertainty as well as the estimated concentration of the contaminant. The term ‘possibly contaminated’ is given to locations where the measured concentration is below a threshold, but the uncertainty interval exceeds the threshold. The acceptable probability of misclassification can be selected to reflect the risks at that particular site. © 1997 Elsevier Science B.V.

Keywords: Measurement uncertainty; Sampling; Contaminated land

* Corresponding author.

0048-9697/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved. PII S0048-9697(97)05456-9

1. Introduction

Uncertainty of measurements in the interpretation of contaminated land can have profound effects on the realistic assessment of the extent of the contamination. Although the importance of the uncertainty is becoming apparent, methods for the estimation of such uncertainty are lacking. For analytical measurements in isolation, the realistic estimation of uncertainty has already become an important issue [1-3]. In contaminated land investigations it has recently been shown that it is field sampling, rather than the chemical analysis, that can contribute the largest source of measurement error and will therefore limit the measurement uncertainty [4]. Methods have been developed for quantifying the errors arising from the sampling of one site, by either a single sampler or multiple samplers. For a single sampler, the methods have been applied using a single sampling protocol [5] and several different sampling protocols [6]. For the case of multiple samplers, different methods have been devised depending on whether all samplers were applying the same protocol [4] or different protocols [7]. These methods were applied to estimating heavy metal concentrations on contaminated land, but the methodologies are equally applicable, in principle, to the measurement of any contaminant in any medium.

This paper considers how estimates of measurement uncertainty can be derived, using examples from published estimates of sampling and analytical errors. The approach taken to the classification of contaminated land is probabilistic, rather than the traditional deterministic one, and can improve the reliability of the classification. For example, when a single estimate of concentration at a location is just below a regulatory limit, but the measurement uncertainty is high, then there is a high probability that the true concentration at that location does exceed the limit.

1.1. Terminology of uncertainty

The formal definition of uncertainty has been given by the International Organization for Standardization (ISO) [1] as ‘a parameter associated with the result of a measurement, that characterises the dispersion of the values that could reasonably be attributed to the measurand’. The standard uncertainty, u, can be considered equivalent to one standard deviation, which is often used to describe a normally distributed error. The extended uncertainty, U, is equal to the product of the standard uncertainty and a coverage factor, k, which typically has a value of 2 or 3. This factor is analogous to the use of multiples of standard deviation for quantifying precision. In the formal definition, systematic errors are not included in estimates of uncertainty, but only residual random errors left after the correction of the systematic errors. However, the distinction between systematic and random errors in this application becomes blurred. The systematic error of one sampler becomes a random error when assessed as part of a multi-sampler comparison. For example, if one sampler systematically overestimates the concentration of a contaminant for some reason, then when assessed in a multi-sampler comparison, this will appear as a component of random variation between the samplers. In the approach used in this paper, therefore, systematic sampling errors do become incorporated in estimates of uncertainty.

1.2. Estimating uncertainty in chemical analysis

The methods developed recently for estimating uncertainty in chemical analysis are of two types. They both need to be evaluated as options for applying to field sampling. In the ‘bottom up’ approach the random error from each individual component of a method is quantified separately as a standard deviation (s). The overall uncertainty is then estimated by summing the individual errors by their variances (s²) [2]. The alternative ‘top down’ approach uses inter-laboratory trials to estimate the total uncertainty of a measurement. In this method, many selected laboratories (n > 8) analyse the same sample, by the same analytical method [3]. The scatter of the measurements reported by all of the laboratories is then used as an overall estimate of uncertainty.

The limitation of the ‘bottom up’ approach is that it requires all of the sources of uncertainty to be identified. It is relatively easy to consider the obvious sources of error which are explicit parts of the method (e.g. weighing, volumetric additions). However, the most important source of uncertainty may not be explicit in the method (e.g. laboratory temperature), and it is therefore easily overlooked, especially by inexperienced practitioners. Furthermore, it can be a long and expensive procedure to quantify all the component errors, if the method is to be applied rigorously. Once the main component of the uncertainty has been identified, however, the variance of this component often dominates to such an extent that future monitoring can concentrate on this one source of variance.
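The ‘bottom up’ combination of component errors by their variances can be illustrated with a minimal sketch. The component names and standard deviations below are hypothetical values chosen only to show how one source can dominate the combined uncertainty; they are not taken from any study cited here, and the components are assumed to be independent.

```python
import math

def combined_standard_uncertainty(component_sds):
    """Combine independent component standard deviations by summing variances."""
    return math.sqrt(sum(s * s for s in component_sds))

# Hypothetical component standard deviations (illustrative only),
# all expressed in the same units as the measurement.
components = {
    "weighing": 0.5,
    "volumetric additions": 1.0,
    "laboratory temperature": 6.0,
}

u = combined_standard_uncertainty(components.values())

# Share of the total variance contributed by each component.
total_variance = sum(s * s for s in components.values())
shares = {name: s * s / total_variance for name, s in components.items()}

print(f"combined standard uncertainty u = {u:.2f}")
for name, share in shares.items():
    print(f"{name}: {100 * share:.1f}% of variance")
```

Because the variances, not the standard deviations, are summed, the largest component in this example accounts for well over 90% of the total variance, which is the situation in which monitoring can concentrate on a single source.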

The benefits of the ‘top down’ approach can be appreciated from the discrepancies that are often evident between laboratories in inter-organisational trials. These differences are often larger than can be accounted for by the individual estimates of uncertainty within each laboratory. This is because the ‘bottom up’ approach used by individual laboratories tends to give over-optimistic estimates of the uncertainty. The limitation of the ‘top down’ approach is that it depends on the selection of the laboratories that contribute. If the laboratories all use a similar source of calibration, they may all be equally biased and therefore give an underestimate of the uncertainty. Alternatively, one laboratory may have gross errors, atypical of the application of the method as a whole, and this will cause an overestimate of the uncertainty.
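As a rough illustration of the ‘top down’ idea, the scatter between laboratories can be summarised as a standard deviation and expressed relative to the mean. The sketch below uses the classical sample standard deviation and invented results for nine hypothetical laboratories; it is an assumption-laden example, not the procedure of any particular trial.

```python
import statistics

def top_down_uncertainty(lab_results, k=2):
    """Return (mean, between-laboratory SD, relative extended uncertainty in %)."""
    mean = statistics.mean(lab_results)
    s = statistics.stdev(lab_results)  # sample standard deviation (n - 1)
    return mean, s, 100 * k * s / mean

# Hypothetical results from nine laboratories analysing the same sample.
lab_results = [98.0, 102.0, 95.0, 105.0, 100.0, 97.0, 103.0, 99.0, 101.0]
mean, s, u_percent = top_down_uncertainty(lab_results)

# The same trial with one laboratory reporting a gross error.
with_gross_error = lab_results[:-1] + [150.0]
_, s_outlier, u_percent_outlier = top_down_uncertainty(with_gross_error)

print(f"mean = {mean:.1f}, s = {s:.2f}, U% = {u_percent:.1f}")
print(f"with gross error: s = {s_outlier:.2f}, U% = {u_percent_outlier:.1f}")
```

The second call shows the overestimate described above: a single atypical laboratory inflates the classical standard deviation, which is one motivation for robust estimators.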

1.3. Estimating uncertainty in field sampling

For environmental materials the primary problem with estimating measurement uncertainty by the methods described is that they ignore the uncertainty arising from field sampling. It is often said that an analysis can never be of better quality than the sample upon which it is made. What has been lacking, however, is the means of estimating the size of the uncertainty that is introduced by field sampling. It is possible to adapt the methods devised originally for estimating analytical uncertainty to estimate sampling uncertainty.

A more holistic approach is to consider field sampling and chemical analysis as just two parts of the same ‘measurement’ process, and to quantify their combined contribution to the uncertainty [8]. Such total ‘measurement’ uncertainty therefore has four potential components. These are the sampling and analytical contributions to random error (i.e. sampling and analytical precision) and any uncorrected systematic errors (i.e. sampling and analytical bias).

Taking the ‘bottom up’ approach to estimating the total measurement uncertainty, we can review the methods available for estimating these four components. Analytical precision can be measured by analysing replicates [9] or in combination with sampling precision using a balanced design of sampling and analytical duplicates [10,11]. An example of such a balanced design consists of measurements made on both of the duplicate samples taken at a certain proportion of the sampling locations. Analytical bias is usually estimated by analysing certified reference materials [12]. There are no methods in general use for estimating sampling bias. Sampling bias has already been determined, however, between the concentration estimates made by applying different sampling protocols [6]. Estimating bias compared with a reference value (i.e. as defined by ISO 1993) may, however, require the introduction of reference sampling targets, analogous to reference materials for the estimation of analytical bias [13]. This method should be equally applicable for estimating bias in protocols with random or systematic sampling designs.

Taking the ‘top down’ approach, it should be possible to use measurements from inter-organisational sampling trials, such as sampling proficiency tests [7] and collaborative trials [4], to estimate uncertainty, as will be discussed below.

2. Methods for estimating the uncertainty of field sampling

There are four cases in which measurement uncertainty, including that from sampling, could be estimated:

1. For single sampler/single protocol
2. For single sampler/multiple protocols
3. For multiple samplers/single protocol
4. For multiple samplers/multiple protocols

These will be discussed in turn, using their application at one particular site of heavy metal contamination. The sampling target used for all four methods was a fallow agricultural field of approximately 1.8 ha (180 × 170 m) near Wirksworth in Derbyshire, UK. The major sampling area is situated on a scarp slope which drops away to the south and southwest. The eastern part of the field is flat and lies along a ridge at an altitude of 250 m. The local geology is predominantly Millstone Grit with some associated siltstones and clays. Soils are classified as generally being loamy sands with little organic matter, developed on Millstone Grit [14]. The heavy metal contamination is due to the use of the field for Pb smelting in medieval times (1300-1550 AD). The topographic location of the area is typical of a medieval bole smelter site, fuelled by wood. Such smelting sites were often located on high scarps, because boles relied on the prevailing wind to create the necessary air blast [14]. In this particular field, however, no evidence of the previous smelting activities is visible. Sampling designs were not intensified in any particular area of the site.

2.1. Method 1: Single sampler/single protocol

This method for estimating sampling and analytical precision involves taking duplicate samples at some proportion of the sampling locations, typically 10% [11]. Duplicates are the most cost-effective type of replicates, allowing the estimation of precision over the greatest number of locations. The duplicate samples are not taken at exactly the same position as the original, but displaced by a distance that represents the uncertainty of locating the sampling point by the particular surveying technology used. This distance will produce differences between sample duplicates caused by small scale local variation, but this variation represents that which may arise randomly in locating the sample location by this surveying technique. In applying this method to the Derbyshire site the duplicate samples were taken 2 m from the original location in randomly selected directions. Full experimental details have been given elsewhere [6]; the objective here is to interpret the findings of this survey in terms of the uncertainty estimation. One sampling protocol that was investigated was the regular grid, with a 20-m spacing. In one application of this the samples were from a single auger core (0-15 cm) at each grid point, and in a second the samples were composites of five auger cores taken within 1 m² at each grid point (Fig. 1). Duplicate samples were taken at 20% of the locations in each grid, and both samples were analysed twice in a balanced design.
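The balanced design described above gives four measurements at each duplicated location (two samples, each analysed twice). A minimal sketch of how such data can be split into analytical, sampling and between-location variance is given below. Note that it uses classical nested analysis of variance, not the robust analysis of variance applied in the study, so it does not downweight outliers; the function name and the demonstration data are hypothetical.

```python
import math

def nested_anova_duplicates(locations):
    """Classical balanced nested ANOVA for duplicate samples analysed twice.

    locations: list of ((a11, a12), (a21, a22)), i.e. two samples per
    location and two analyses per sample.
    """
    n = len(locations)
    if n < 2:
        raise ValueError("need at least two duplicated locations")
    ss_anal = 0.0
    ss_samp = 0.0
    location_means = []
    for (a11, a12), (a21, a22) in locations:
        m1 = (a11 + a12) / 2  # mean of sample 1
        m2 = (a21 + a22) / 2  # mean of sample 2
        # Sum of squares between analyses within each sample.
        ss_anal += (a11 - a12) ** 2 / 2 + (a21 - a22) ** 2 / 2
        # Sum of squares between samples within the location.
        ss_samp += (m1 - m2) ** 2
        location_means.append((m1 + m2) / 2)
    grand_mean = sum(location_means) / n
    ss_loc = 4 * sum((m - grand_mean) ** 2 for m in location_means)

    ms_anal = ss_anal / (2 * n)   # 2n degrees of freedom
    ms_samp = ss_samp / n         # n degrees of freedom
    ms_loc = ss_loc / (n - 1)     # n - 1 degrees of freedom

    s2_anal = ms_anal
    s2_samp = max(0.0, (ms_samp - ms_anal) / 2)
    s2_geochem = max(0.0, (ms_loc - ms_samp) / 4)
    return {
        "mean": grand_mean,
        "s2_anal": s2_anal,
        "s2_samp": s2_samp,
        "s2_geochem": s2_geochem,
        "s_meas": math.sqrt(s2_samp + s2_anal),
    }

# Hypothetical concentrations at three duplicated locations.
demo = [((10, 12), (14, 16)), ((20, 22), (24, 26)), ((30, 32), (34, 36))]
result = nested_anova_duplicates(demo)
print(result)
```

Negative variance estimates are truncated at zero here, which is a common convention but an assumption of this sketch. With strongly skewed data such as the lead concentrations at this site, a robust method would be expected to give different, and usually smaller, variance estimates than this classical calculation.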

Fig. 1. Sampling designs employed to assess both the extent of Pb and Cu contamination and the measurement uncertainty at a site in Derbyshire, UK. For the regular grid (a) single auger samples were taken, but for grid (b) each location was represented by a five-fold composite sample taken in 1 m². For both grids a number of locations had duplicate samples taken at a distance of 2 m from the original, in random directions. These duplicates were used to quantify the within-location variability of metal concentration caused by locational uncertainty (i.e. surveying error). Reproduced from [6] with the kind consent of the Royal Society of Chemistry.

The three components of the variation were separated using robust analysis of variance [15]. Robust statistics rely on the accommodation of outlying values rather than their rejection, by downweighting extreme observations and compensating for that downweighting. Two of the components represent uncertainty: these are the sampling and the analytical variance ($s^2_{samp}$ and $s^2_{anal}$). The third component is the between-location variance due to real variation of the contaminant (e.g. lead) across the site, termed the geochemical variance ($s^2_{geochem}$). The sampling uncertainty, or within-location variance, will be partially due to small scale geochemical variation within the location, but represents the uncertainty in all of the samples that could be taken from that ‘location’ as specified, in this case within a 2-m radius. All three components of variance can be summed to give the total variance of the survey, expressed by:

$$s^2_{total} = s^2_{geochem} + s^2_{samp} + s^2_{anal}$$

The measurement uncertainty (u) can be estimated using this ‘bottom up’ approach, from the ‘holistic’ combination of the sampling and analytical variance described above, giving the measurement variance ($s^2_{meas}$) and hence:

$$u = s_{meas} = \sqrt{s^2_{samp} + s^2_{anal}}$$

To express the extended uncertainty (U), with a coverage factor (k) of two (for 95% confidence), this gives:

$$U = ku = 2s_{meas}$$

As an uncertainty relative to the mean concentration $\bar{x}$, this becomes:

$$U\% = 200\,s_{meas}/\bar{x} \qquad (1)$$

where $\bar{x}$ is the estimated mean concentration of the analyte in the site. The interpretation of this relative uncertainty assumes that it does not change as a function of concentration. This has been demonstrated for analytical precision for concentration values well above the detection limit [9].

For the Derbyshire site with a regular grid of single samples (Table 1), the standard uncertainty ($s_{meas}$ = 1871 µg/g) gives an extended uncertainty of 49.8% (U%) for a single measurement made at the estimated mean concentration of this site (7516 µg/g). However, by using composite samples the uncertainty drops to only 30.9% of the mean. This reduction in variance is statistically significant at the 95% confidence level, using the F-test. This finding quantifies the theoretical prediction that taking composite samples should reduce measurement uncertainty. The question of whether either of these levels of uncertainty is acceptable, and therefore ‘fit for purpose’, will be discussed below.

Table 1
Estimation of measurement uncertainty from the application of single sampler/single protocol on the Derbyshire site

| Sampling design | Sites duplicated | Mean Pb, robust (µg/g) | (header lost in extraction) | u = s_meas, robust (µg/g) | U = 2s_meas (µg/g) | U% = 200s_meas/x̄ |
|---|---|---|---|---|---|---|
| Regular grid, single samples | 7 | 7516 | 8185 | 1871.0 | 3742 | 49.8% |
| Regular grid, composite samples | 9 | 6093 | 5600 | 940.5 | 1881 | 30.9% |

Uncertainty reduced from 49.8% to 30.9% by use of composite samples, as predicted theoretically.

Analytical bias can be added into this estimate of uncertainty. In this study analytical bias was estimated independently by the analysis of six certified reference materials with a wide range of concentrations of lead. Regression techniques were used to estimate that the analytical bias ($B_a$) was approximately -3% over the range of lead concentrations found in the samples.

The uncertainty interval of any concentration c (e.g. the estimated mean for the site) becomes asymmetric. The upper limit can therefore be calculated from:

$$c + U = c(1 + U\%/100)(1 - B_a/100)$$

and the lower limit from:

$$c - U = c(1 - U\%/100)(1 - B_a/100)$$

For the mean of the single sample grid (c = 7516 µg/g, U% = 49.8%, $B_a$ = -3.0%), the upper limit of uncertainty is therefore given by:

$$c + U = 7516(1 + 49.8/100)(1 + 3/100) = 11597\ \mu\mathrm{g/g}$$

The lower limit of uncertainty is given by:

$$c - U = 7516(1 - 49.8/100)(1 + 3/100) = 3886\ \mu\mathrm{g/g}$$
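The arithmetic of the relative extended uncertainty and the bias-corrected limits can be checked with a short sketch. The function names are hypothetical; the input values are those quoted in the text and Table 1 for the Derbyshire site.

```python
def relative_extended_uncertainty(s_meas, mean, k=2):
    """U% = 100 * k * s_meas / mean (Eq. 1 when k = 2)."""
    return 100.0 * k * s_meas / mean

def uncertainty_limits(c, u_percent, bias_percent):
    """Lower and upper limits of c, corrected for analytical bias B_a (in %)."""
    correction = 1.0 - bias_percent / 100.0
    lower = c * (1.0 - u_percent / 100.0) * correction
    upper = c * (1.0 + u_percent / 100.0) * correction
    return lower, upper

# Regular grid, single samples and composite samples (values from Table 1).
u_single = relative_extended_uncertainty(1871.0, 7516.0)
u_composite = relative_extended_uncertainty(940.5, 6093.0)

# Limits for the single-sample grid mean with B_a = -3%.
lower, upper = uncertainty_limits(7516.0, 49.8, -3.0)

print(round(u_single, 1), round(u_composite, 1))
print(round(lower), round(upper))
```

Run as written, this gives 49.8% and 30.9% for the two grids and limits of 3886 and 11597 µg/g, in agreement with the values quoted. Because a negative bias means the method under-reports, the correction factor is greater than one and both limits move upward, which is why the interval is asymmetric about the uncorrected value.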

These equations can be refined if either the bias or the precision varies as a function of concentration. The main problem with this method is that it does not allow for bias originating from the sampling process. This may be caused by the variability between different sampling protocols designed to fulfil the same objective (i.e. estimation of the mean lead concentration of the site) or from different samplers applying the same protocol. The bias between different sampling protocols can be assessed by Method 2, and the bias between different samplers with the same protocol by Method 3.

2.2. Method 2: Single sampler/multiple protocols

It is possible in principle to estimate the bias between different sampling protocols, applied by the same sampler on one sampling target. This procedure was applied to the Derbyshire site using four protocols: (1) a regular grid with single samples; (2) a regular grid with composite samples; (3) a herringbone design; and (4) a stratified random design [6]. The bias between the mean values of lead concentration estimated by each protocol was found not to be statistically significant for the number of samples taken. This was primarily because of the large uncertainties (i.e. standard error) on each of the mean values, which probably originated from the heterogeneity of the site. The estimates of the uncertainty for the concentration values at each location are similar for all four sampling protocols. Combining measurements from the duplicates from all four protocols gives a more reliable estimate of the extended uncertainty value as 60.4%. This is not significantly different from the value calculated for a single protocol by Method 1. The uncertainty in the estimated mean lead concentration for the whole site, calculated across the four separate protocols, is 16.1%. This shows a much smaller uncertainty for the estimated site means than for concentration estimates at individual locations, as would be expected.

2.3. Method 3: Multiple samplers/single protocol

This method of estimating uncertainty is effectively ‘top down’ and is based on the application of a Collaborative Trial in Sampling (CTS). The concept of the collaborative trial is already well accepted as a method for evaluating the performance of a single analytical method. One of the original objectives of the CTS was to validate a particular sampling protocol [4]. It is also possible, however, to use the results of a CTS to estimate the uncertainty of measurements resulting from its application at a particular site.

This method was applied to a sub-area of the same site in Derbyshire (0.9 ha, 60 × 150 m). Nine participants visited the site independently to apply the same sampling protocol, in the shape of a ‘W’, twice over (Fig. 2). The objective of the protocol was to estimate the mean concentration of lead and copper for the site.

Each sampler left no evidence of where they had sampled, and did not communicate with other samplers, to maintain independence between the samplers. All the chemical analyses were performed, for both lead and copper, in one randomised batch within one laboratory, to eliminate the effect of analytical bias between participants.

The three components of the uncertainty (Table 2) were separated using robust analysis of variance and the extended uncertainty (as U%) calculated from Eq. 1, using the corresponding standard deviation.

Although essentially a ‘top down’ approach, the experimental design did allow the separate estimation of the uncertainty due to the chemical analysis within the one laboratory. For lead this value is 5% and is clearly very small when compared with the uncertainty from the two applications of the sampling protocol by each participant (55.4%). Interestingly there is no increase in this uncertainty estimate when multiple samplers use this same protocol. This implies that the spatial heterogeneity of the site was a dominant factor in controlling the uncertainty for lead, rather than differences between samplers.

Fig. 2. Four possible interpretations of the sampling design used to assess mean metal concentration and measurement uncertainty by Method 3 (the collaborative trial in sampling involving nine participants). Reproduced from [4] with permission.

This uncertainty for the estimated mean for a single sampler by the ‘W’ protocol (55.4%) is much larger than for the means estimated by different but more elaborate protocols (16.1%). This may be due partially, but not entirely, to the smaller number of increments (i.e. subsamples) used for the ‘W’ design (n = 17) against the other protocols (n = 39 or 40). This shows the limitation of the ‘W’ protocol, and the findings of the CTS suggest that it is not ‘fit-for-purpose’ given a requirement for the estimated mean to be within 20% of the assigned value for the site [4].

Table 2
Robust estimates of uncertainty for mean concentration of Pb and Cu at the Derbyshire site, made from a collaborative trial in sampling with nine participants

Grand mean: Pb 6575 µg/g; Cu 30 µg/g.

| Component | Pb s_meas (µg/g) | Pb U (µg/g) | Pb U% | Cu s_meas (µg/g) | Cu U (µg/g) | Cu U% |
|---|---|---|---|---|---|---|
| Analytical | 165 | 330 | 5.0% | 1.1 | 2.2 | 7.4% |
| Single sampler | 1822 | 3644 | 55.4% | 1.2 | 2.4 | 8.0% |
| Multiple sampler | 1822 | 3644 | 55.4% | 2.4 | 4.8 | 16.0% |

For copper, which is not at elevated concentrations in this site, the analytical component of the variance is a high proportion of the single sampler uncertainty. Moreover, the uncertainty for multiple samplers is higher than that for single samplers, showing some contribution from the differing sampling bias of the various samplers.

Although this CTS methodology was applied here to estimate the uncertainty on mean concentrations for a whole site, it could equally well be applied to estimate the uncertainty of the concentration at particular locations within a site.

2.4. Method 4: Multiple samplers/multiple protocols

In a Proficiency Test in Sampling (PTS) each sampler uses a different sampling protocol, which is selected by the participant to fulfil the same objective. The uncertainty estimated from such an inter-organisational trial reflects the widest definition of uncertainty at one site. For the application of the PTS at the same site in Derbyshire, the objective addressed by all nine sampling protocols (Fig. 3) was the estimation of the mean metal concentration at the site [7]. The analytical measurements for this trial were each undertaken by the participating organisation, rather than centrally as for the CTS. This design was employed so that each organisation had responsibility for all aspects of their measurements. A reference material (HRM31) was specially prepared with material from this same site, and analysed by each participant to separate the contribution of analytical bias to their measurements.

Fig. 3. Nine sampling designs (Organisations 1 to 9) used to assess mean metal concentration and measurement uncertainty by Method 4 (the sampling proficiency test). Reproduced from [7] with permission.

The resulting scatter in the estimates of the mean concentration of lead gave a robust estimate of overall uncertainty (45.5%) similar to that found in the CTS (55.4%). The contribution of the analytical bias of each laboratory to this overall uncertainty can be estimated by the uncertainty in the measurements on the special reference material (HRM31), which was 23.5%. When this source of uncertainty is subtracted, the estimate of the uncertainty due to sampling is reduced to 39.0%. This is a lower level of uncertainty than estimated from the CTS. This may be due in part to the more appropriate designs chosen by participants in the PTS, with a larger number of increments (mean = 22) than that used in the CTS (n = 17).
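The figures quoted for the PTS are consistent with the analytical contribution having been removed as a variance rather than as a simple difference of percentages. The sketch below assumes that this is how the subtraction was done and that the two contributions are independent; the function name is hypothetical.

```python
import math

def subtract_in_quadrature(total, component):
    """Remove one independent component from a total uncertainty via variances."""
    if component > total:
        raise ValueError("component cannot exceed the total uncertainty")
    return math.sqrt(total ** 2 - component ** 2)

overall_percent = 45.5      # overall uncertainty from the PTS
analytical_percent = 23.5   # uncertainty on the reference material HRM31

sampling_percent = subtract_in_quadrature(overall_percent, analytical_percent)
print(round(sampling_percent, 1))
```

This returns a value that rounds to 39.0%, matching the sampling uncertainty stated above, whereas a direct subtraction of the percentages would give 22.0%.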

For copper, which is known from previous work to be more homogeneously distributed across the site, the two estimates of uncertainty are not significantly different, being 13.9% from the PTS and 16.0% from the CTS.

3. Implication of uncertainty for classification

3.1. Deterministic classification of contaminated land

The current method used to classify contaminated land is to compare element concentrations measured in the soil against a threshold value of some kind. Various generic threshold values, with a variety of different legal connotations, have been applied in many countries. In the UK, for example, the values used as threshold trigger concentrations have been set by the Interdepartmental Committee on the Redevelopment of Contaminated Land (ICRCL) [16]. Even with the current move towards definitions based on site-specific risk assessment, the final result is usually a threshold value over which soil at a particular location can be classified as contaminated to some degree.

The problem with the current method of applying a threshold value, however derived, is that it ignores the existence of measurement uncertainty. There is therefore a risk of misclassification because of this uncertainty. There is a definite probability that a single soil sample from a particular location within an area of land will give a measurement of a contaminant concentration below the threshold value, even when repeated samples taken from that location would show that the soil has an average concentration that is over the threshold. Conversely, there is also a possibility that one sample may give a measurement over the threshold, where repeated measurements show that the average concentration at a particular location is under the threshold. What is required is a probabilistic classification of contaminated land that takes into account the uncertainty of a concentration measurement as well as its value.
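The size of this misclassification risk can be illustrated with a minimal sketch. It assumes that the measurement error is normally distributed with a standard uncertainty proportional to the true concentration; the threshold, the true concentration and the function names are hypothetical, and the relative standard uncertainty of 25% is chosen only because it corresponds to an extended uncertainty of about 50%, similar to the values reported for lead at this site.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def prob_measured_below(threshold, true_conc, rel_std_uncertainty):
    """Probability that one measurement falls below the threshold."""
    sd = true_conc * rel_std_uncertainty
    return normal_cdf(threshold, true_conc, sd)

# Hypothetical case: true concentration 20% above the threshold.
p = prob_measured_below(threshold=500.0, true_conc=600.0,
                        rel_std_uncertainty=0.25)
print(f"probability of a false 'uncontaminated' result: {p:.2f}")
```

Under these assumptions roughly one single measurement in four would fall below the threshold even though the location is contaminated. Real concentration data at contaminated sites are often skewed, so the normal model here is a simplification.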

3.2. Probabilistic classification of contaminated land

The proposed method specifies four categories for the level of contamination based on the overlap of the uncertainty with a single threshold level (Fig. 4). It is derived from the general approach to uncertainty taken by Thompson [3]. The probability of the location being contaminated over the threshold is given in Table 3 (column 4). For the category ‘Uncontaminated’ all of the range of the uncertainty is lower than the threshold value (T). The probability of this site being contaminated over the threshold is therefore < 0.025 (i.e. 2.5%). In the ‘Possibly contaminated’ category, although the measured concentration value is less than the threshold, the upper side of the range of the uncertainty does cover the threshold value. In this case the probability of this site being contaminated over the threshold is therefore between 0.025 and 0.50 (i.e. 2.5-50%). For the category ‘Probably contaminated’ the measured concentration exceeds the threshold value, but there is still a probability that the true mean value is less than the threshold (0.025-0.50). For this category the probability of this site being contaminated over the threshold is therefore between 0.50 and 0.975 (50-97.5%). For the last category, ‘Contaminated’, the probability of the site being contaminated over the threshold is > 0.975, although on the lower boundary of this class there is still a very small chance (2.5%) that the true concentration value at the site is under the threshold value. The probabilities quoted here are given for an extended uncertainty with a coverage factor of two. It is equally valid to select a different coverage factor that reflects a different probability of misclassification that is considered more appropriate for a particular site. The probabilities are also given assuming single estimates of concentration being made at each sampling location. If n multiple measurements have been made at a location, then the uncertainty value for that site should be divided by √n before the classification is made.

The classification of each location on a site depends not only on the concentration value measured on a sample from that location, but also on its uncertainty. An example of the application of this method will be given below.
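The four categories can be sketched as a short function. This is an illustrative sketch only: the function name, the argument names and the handling of exact ties (which Table 3 leaves unspecified) are assumptions, and the uncertainty is taken to be an absolute value in the same units as the concentration.

```python
import math


def classify(c, u, t, n=1):
    """Classify a location from concentration c, uncertainty u and threshold t.

    c: measured (or mean) concentration
    u: expanded uncertainty of a single measurement, same units as c
    t: threshold value
    n: number of measurements averaged at the location; u is divided by
       sqrt(n) before the classification is made
    Ties on a boundary are assigned to the higher category (an assumption).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    u_eff = u / math.sqrt(n)
    if c + u_eff < t:
        return "Uncontaminated"
    if c < t:
        return "Possibly contaminated"
    if c - u_eff < t:
        return "Probably contaminated"
    return "Contaminated"


# Hypothetical values: threshold 500, uncertainty 100 (arbitrary units).
for conc in (350, 450, 550, 650):
    print(conc, classify(conc, 100, 500))
```

Passing `n=4` halves the effective uncertainty, so a mean of 440 from four measurements would move from ‘Possibly contaminated’ to ‘Uncontaminated’ in this hypothetical case.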

There is, however, a modification to this calculation which can make the classification even more rapid, although it can lead to misunderstanding in interpretation unless the logic of the matter is well understood, as will be explained. If the uncertainty values for the particular measurements made are calculated for the concentration equal to the threshold value, then the classification of the soil samples can be made directly, without calculating uncertainties for each measurement. The equivalence of this second method of calculation to the first can be seen by comparing Fig. 4b with Fig. 4a. The concentration ranges for this method of calculation are shown in column 3 of Table 3, and are arithmetically equal to those for the original method shown in column 2.

Table 3
Probabilistic classification of soils with measured concentration (C) and uncertainty (U), based on the probability that the contaminant concentration is greater than a particular threshold value (T)

| Classification | Concentration range with uncertainty on C | Concentration range with uncertainty on T | Probability of C > T, i.e. C being over threshold |
|---|---|---|---|
| Uncontaminated | C + U < T | C < T - U | < 0.025 |
| Possibly contaminated | C < T < C + U | T - U < C < T | 0.025-0.500 |
| Probably contaminated | C - U < T < C | T < C < T + U | 0.500-0.975 |
| Contaminated | T < C - U | C > T + U | > 0.975 |

The effect of uncertainty on the concentration estimate is evaluated (column 2), but a rapid intermediate calculation can consider the uncertainty as being on the threshold value (column 3). Uncertainty values are stated for 95% confidence interval, but could be recalculated for whatever confidence interval is considered appropriate.

The potential problem with this ‘short cut’ method of calculation is that it appears to use a modified ‘threshold value’ (e.g. T - U) and not the regulatory threshold value (T). These are not really modified thresholds, but calculation intermediates to identify measurements that fall into the category of ‘uncontaminated’ (C + U < T). Furthermore, this method assumes that only single samples are taken at each location. For sites with duplicate samples the ‘modified threshold value’ would be T - (U/√2).
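The arithmetic equivalence of columns 2 and 3 of Table 3 can be checked numerically. The sketch below is illustrative only: the function names and the values of T and U are hypothetical, U is treated as a single absolute value calculated at the threshold, and integer inputs are used so that the comparisons are exact.

```python
CATEGORIES = (
    "Uncontaminated",
    "Possibly contaminated",
    "Probably contaminated",
    "Contaminated",
)


def by_concentration(c, u, t):
    """Column 2 of Table 3: the uncertainty is placed on the concentration."""
    if c + u < t:
        return CATEGORIES[0]
    if c < t:
        return CATEGORIES[1]
    if c - u < t:
        return CATEGORIES[2]
    return CATEGORIES[3]


def by_threshold(c, u, t):
    """Column 3 of Table 3: the uncertainty is placed on the threshold."""
    lower, upper = t - u, t + u  # calculation intermediates, not new thresholds
    if c < lower:
        return CATEGORIES[0]
    if c < t:
        return CATEGORIES[1]
    if c < upper:
        return CATEGORIES[2]
    return CATEGORIES[3]


T, U = 500, 100  # hypothetical threshold and uncertainty
mismatches = [c for c in range(0, 1001, 10)
              if by_concentration(c, U, T) != by_threshold(c, U, T)]
print("Concentrations classified differently:", mismatches)
```

The regulatory threshold T itself still separates ‘Possibly’ from ‘Probably contaminated’ in both versions; only the outer boundaries are shifted by U.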

[Fig. 4 (diagram): (a) four example concentrations (C), each shown with its uncertainty interval from C - U to C + U against the threshold (T). The probabilistic classification labels them Uncontaminated, Possibly contaminated, Probably contaminated and Contaminated; the deterministic classification labels the same four cases Uncontaminated, Uncontaminated, Contaminated and Contaminated. (b) the same four concentrations shown against the levels T - U, T and T + U.]

Fig. 4. Probabilistic classification of contaminated soils, allowing for uncertainty on either the measured concentration values (a), or around the threshold value (b) for a rapid method of classification. Both methods give the same classification result although the former is slower in calculation but more rigorous. Classification by traditional deterministic method is provided for comparison.

4. Application of probabilistic classification to contaminated land

In the methods for estimating uncertainty discussed above (2, 3 and 4) the uncertainty (U) is estimated at the mean concentration and converted into relative units (U%) by dividing by the mean (see Eq. 1). The application of this value to all the measurements of soil concentration assumes that this relative uncertainty remains constant with varying concentration. There is evidence to show that relative precision of analytical methods approaches a constant asymptotic value when measurements are made at concentration levels more than two orders of magnitude above the detection limit of the analytical technique [9]. For most applications on contaminated land this is a reasonable assumption, given the high levels of most threshold values quoted for contaminants. If this is not the case then a slightly more elaborate calculation is needed that uses a mathematical model for the change of uncertainty with concentration.

When considering the calculation of the uncertainty on concentration estimates (C), the upper limit of the uncertainty for each single concentration measurement is given by:

C + U = C(1 + U%/100)

The lower limit is similarly given by:

C - U = C(1 - U%/100)

When the ‘short cut’ method is being used, with the uncertainty expressed on the threshold value, then the upper and lower limits are given by:

T + U = T(1 + U%/100) (2)

T - U = T(1 - U%/100) (3)

If U% has been estimated only for a single sampler (methods 1 or 2) and a separate estimate of measurement bias (B) is available, which has not been used to correct the measurements, then the limits are given by:

T + U = T(1 + U%/100)(1 - B/100)

T - U = T(1 - U%/100)(1 - B/100)
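These limits can be sketched as a small helper. The function and parameter names are hypothetical, and the sketch assumes that B is expressed in percent and enters through the factor (1 - B/100) exactly as written above; the sign convention for B is not restated in this section, so it is taken on trust from the equations.

```python
def threshold_limits(t, u_pct, bias_pct=0.0):
    """Return (lower, upper) classification limits around threshold t.

    t:        threshold value
    u_pct:    relative expanded uncertainty U%, in percent
    bias_pct: relative measurement bias B, in percent (0 when the bias is
              either negligible or already corrected)
    """
    bias_factor = 1.0 - bias_pct / 100.0
    lower = t * (1.0 - u_pct / 100.0) * bias_factor
    upper = t * (1.0 + u_pct / 100.0) * bias_factor
    return lower, upper


# Hypothetical inputs: threshold 500, U% = 20, uncorrected bias B = 10%.
lower, upper = threshold_limits(500.0, 20.0, bias_pct=10.0)
print(f"lower = {lower:.1f}, upper = {upper:.1f}")
```

With `bias_pct` left at zero the helper reduces to Eqs. 2 and 3.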

[Fig. 5: two maps, (a) and (b), of Pb in soil (µg/g) at the Hounslow Heath site, plotted on the 30-m sampling grid with both axes in metres (0 to 270) and symbols denoting concentration classes.]

Fig. 5. Geochemical maps of the lead contamination in an area of Hounslow Heath, using a threshold value of 500 µg/g and either the (a) deterministic or (b) probabilistic classification. The deterministic approach (a) shows a few isolated locations of contamination over the threshold, whereas the probabilistic map (b) shows that almost the entire area is possibly contaminated over this same threshold, because of the very high measurement uncertainty.

4.1. Example of probabilistic classification

An area, 270 m × 270 m, of Hounslow Heath in West London, UK, was sampled to a depth of 10 cm using a regular 30-m grid, by the same methods described for the Derbyshire site. Lead concentrations in the soil at this site are due to its use in the past as a landfill site for domestic and possibly some industrial wastes. If this land were to be considered hypothetically for residential development, the lead concentrations could be classified by the usual deterministic method using the ICRCL trigger threshold for domestic use of 500 µg/g (Fig. 5a). It should be noted, however, that as a designated ‘green belt’ area, this land would not be available for housing development. A simplistic interpretation of such a map would be that the area appears broadly uncontaminated, with a few (i.e. 8/100) small ‘hot spots’ that could apparently be avoided for the purposes of development by locating houses in the areas with Pb concentrations < 500 µg/g.

The probabilistic assessment, using Method 1, required that 10 duplicate samples were taken, and the resulting estimates of uncertainty are shown in Table 4. The very high level of uncertainty (83.6%) reflects the extreme small-scale spatial variation of lead contamination on the site. The probabilistic classification boundaries can be calculated using Eqs. 2 and 3. The concentration equal to the upper limit of the category ‘Uncontaminated’ (C = T - U, equivalent to C + U = T) is 82 µg/g and the lower limit of ‘Contaminated’ (C = T + U, equivalent to C - U = T) is 918 µg/g. These concentration limit values are calculated for 95% confidence, but changes to the coverage factor in the calculation of U% from Eq. 1 could be used to specify a different level for the acceptable level of misclassification.
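As a check on the arithmetic, substituting the threshold T = 500 µg/g and the relative uncertainty U% = 83.6 into Eqs. 2 and 3 reproduces the two boundaries quoted for this site:

```latex
\begin{align*}
T - U &= T\left(1 - \frac{U\%}{100}\right) = 500 \times (1 - 0.836) = 82\ \mu\mathrm{g/g} \\
T + U &= T\left(1 + \frac{U\%}{100}\right) = 500 \times (1 + 0.836) = 918\ \mu\mathrm{g/g}
\end{align*}
```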

Table 4
Estimates of uncertainty made by Method 1 for a survey of part of Hounslow Heath

| Sampling design | Mean Pb, robust (µg/g) | s_total, robust (µg/g) | s_meas, robust (µg/g) | U = 2 s_meas | U% = 200 s_meas / x̄ |
|---|---|---|---|---|---|
| Regular grid | 291.3 | 218 | 124.3 | 248.6 | 83.6% |

When the map of contamination is redrawn using this probabilistic classification (Fig. 5b), then a very different result is evident. Virtually all of this area is at least ‘Possibly contaminated’ and only nine sporadic sites are classified as ‘Uncontaminated’. Given the 95% confidence interval used for this classification, we would expect two or three sites (i.e. 2.5%) to be erroneously classified as uncontaminated. In effect there are no groups of sites that could reliably be classified as uncontaminated in the survey area.

4.2. Implications of probabilistic classification

The use of the probabilistic classification can have large financial implications for the treatment of contaminated land. For this example on Hounslow Heath, the specification of 95% confidence for the classification of the land as ‘contaminated’ may mean that the cost of remediation is far higher than if the traditional deterministic classification were used. However, the use of the deterministic method would lead to a high probability of some contaminated areas being misclassified as uncontaminated and therefore not being treated. The traditional approach could therefore have even greater financial implications than the probabilistic approach in the longer term, with respect to legal liability.

The advantages of the probabilistic approach are:

1. A reduced risk of misclassification of land (e.g. cost savings on either unnecessary remediation, or overlooking the need for remediation).

2. Variable probabilities of misclassification can be incorporated, to reflect the particular circumstances of the site (e.g. intended land use).

3. Cost benefit analysis can be employed on the design of the sampling survey, for example in assessing the effectiveness of taking composite samples for reducing uncertainty.

4. Uncertainty estimates can also be incorporated in environmental risk assessment to give uncertainties in estimated risks.

5. It facilitates identification of causes of measurement uncertainty and indicates where improvements need to be made (e.g. in sampling rather than in the analytical method).

Disadvantages of the probabilistic classification:

1. More land will become classified as ‘possibly contaminated’ than is considered ‘contaminated’ under the deterministic method. Such areas can be re-sampled by a method with lower uncertainty, to clarify their true status.

2. Increased cost of initial survey, typically +10% to +20% for duplicate samples (i.e. for Method 1). The measurement of uncertainty has a cost, but the information gained will be more useful than unqualified measurements of concentration.

3. Increased complexity in explaining the survey findings. However, even the general public is becoming more familiar with measurements expressed in a probabilistic way, in some weather forecasts for example.

4. The estimates of uncertainty made for sampling methods are site specific, and are not generally applicable, as would be the case for uncertainty from analytical methods. Given that the classification of contaminated land is moving towards a more site-specific approach this may not be a serious limitation.


In the longer term there will be a need for quality assurance systems to ensure that organisations engaged in field sampling provide unbiased estimates of measurement uncertainty, as well as of the concentration of contaminants. Sampling proficiency tests could be used to fulfil this role by the evaluation of both concentration and uncertainty estimates.

This paper has addressed uncertainty in concentration estimates at particular sampling locations. It has not discussed the uncertainty of concentration estimates between sampling points, which require geostatistical techniques, such as kriging [17] or disjunctive kriging [18], a technique that estimates the risk that the real value at a site exceeds a prescribed threshold. However, the estimates of uncertainty at sampling points made by the methods described here could, in principle, be fed into geostatistical methods to improve estimates of uncertainty between the sampling points.

5. Acceptable levels of uncertainty: fitness for purpose

Once the uncertainty of measurements has been estimated for a particular survey, one question that arises is whether that level of uncertainty is acceptable. This question has barely been addressed for any application except the accounting of nuclear materials [19]. In analytical quality control, for example, a target for analytical precision in trace determinations (e.g. 10%) is common practice. Such targets are rarely based, however, on a rigorous evaluation of the precision required for the interpretation of the analytical results. When the requirement is for the spatial interpretation of concentration measurements, it has been suggested [11] that the limit for measurement variance, and therefore uncertainty, is given by:

When this proportion is exceeded, then the identification of ‘hot spots’ on contaminated land tends to become progressively less reliable. Such a level is related to the variability of the environment under study (e.g. the degree of contamination) rather than using a level fixed in relative terms (e.g. 10% RSD). No matter how large the measurement uncertainty becomes, however, it is always possible to use the estimated uncertainty values. These can be used to test whether one location is contaminated more than another location (or more than a threshold value), using traditional statistical techniques. In that sense, therefore, there is no universal limit to the acceptable level of uncertainty. In retrospect, however, it may prove that the uncertainty of one survey prevents the quantification of a particular objective, such as the delineation of a contamination ‘hot spot’. In this case a second survey would be required in which the value of uncertainty is reduced to an appropriate level. This acceptable level of uncertainty needs to be calculated from the results of the first survey.
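One simple form such a test could take is sketched below. It is an assumption-laden illustration rather than a method prescribed here: it treats the two measurements as independent with normally distributed errors, takes each uncertainty as an expanded value with a coverage factor of two, and uses hypothetical function and parameter names.

```python
import math
from statistics import NormalDist


def prob_greater(c1, u1, c2, u2=0.0, coverage=2.0):
    """Approximate probability that the true value behind c1 exceeds that behind c2.

    c1, u1: concentration and expanded uncertainty at the first location
    c2, u2: the same for a second location, or a threshold with u2 = 0
    Assumes independent, normally distributed errors (not stated in the text).
    """
    combined = math.hypot(u1, u2) / coverage  # combined standard uncertainty
    if combined == 0:
        raise ValueError("at least one uncertainty must be non-zero")
    return NormalDist().cdf((c1 - c2) / combined)


# Hypothetical locations: 800 +/- 300 compared with 400 +/- 150 (arbitrary units).
print(f"P(location 1 > location 2) = {prob_greater(800, 300, 400, 150):.3f}")
# The same function compares a measurement with a fixed threshold.
print(f"P(C > T) at C = T + U     = {prob_greater(600, 100, 500):.3f}")
```

The second call lands on the ‘Contaminated’ boundary of Table 3 and returns a value close to the 0.975 quoted there for a coverage factor of two.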

6. Conclusions

1. Methods have been devised for estimating measurement uncertainty due to both the sampling and chemical analysis of contaminated land. The simplest method uses duplicated samples taken at a small proportion of locations by one sampler, whereas the most sophisticated method uses inter-organisational sampling trials with different sampling protocols.

2. Values of uncertainty found from preliminary studies of lead at one location in Derbyshire are large, ranging from 50% to 72%. This uncertainty can be reduced to 31% for a single sampler by using composite (i.e. bulked) samples. Small-scale heterogeneity appears to be the dominant cause of the uncertainty at this site. This also illustrates how the approach can be used to quantify improvements to sampling protocols and to evaluate differences between them.

3. The uncertainty of the estimated mean concentrations across this site varies from 16.1% for single samplers using a design with around 40 sampling increments taken at locations across the site, to 55.4% for multiple samplers using a design with only 17 sampling increments.

4. Probabilistic classification of contaminated land can be made using the estimates of measurement uncertainty. This approach defines four classes of contamination including ‘possibly contaminated’ in which the measured concentration at a location is below a statutory threshold, but the uncertainty interval exceeds the threshold.

5. The probabilistic classification has several advantages over traditional deterministic classification of contaminated land including the setting of a specified probability of misclassification which can be related to the financial risks involved.

6. The general principles of these methods should be applicable to the sampling of any environmental media (e.g. sediment, water, atmospheres).

Acknowledgements

The authors thank the following people for participating in the inter-organisational sampling trials: Paul Nathanail of Nottingham Trent University; Judith Nathanail of Delta Environmental; Malcolm J. Brown of the British Geological Survey; Matt Hill of the Department of Environmental Sciences, University of Bradford; Joy Rae of the Postgraduate Institute for Sedimentology, University of Reading; Peter Abrahams of the Department of Geography, University College of Wales; Janet Cotter-Howells of the Department of Plant and Soil Science, University of Aberdeen; and Aradhana Mehra of the School of Environmental and Applied Sciences, University of Derby. We would also like to thank Richard Wood and his staff at the London Borough of Hounslow for their support.

References

[1] ISO, Guide to the expression of uncertainty in measurement, ISO, Geneva, 1993.

[2] Eurochem, Quantifying uncertainty in analytical measurement, Eurochem Secretariat, PO Box 46, Teddington, Middlesex, UK, TW11 0NH, 1995, pp. 87.

[3] Analytical Methods Committee, Uncertainty of measurements: implications of its use in analytical science. Analyst, 120 (1995) 2303-2308.

[4] M.H. Ramsey, A. Argyraki and M. Thompson, On the collaborative trial in sampling. Analyst, 120 (1995b) 2309-2317.

[5] M.H. Ramsey, Sampling and analytical quality control (SAQC) for improved error estimation in the measurement of heavy metals in the environment, using robust analysis of variance. Appl. Geochem., 2 (1993) 149-153.

[6] M.H. Ramsey, A. Argyraki and M. Thompson, Estimation of sampling bias between different sampling protocols on contaminated land. Analyst, 120 (1995a) 1353-1356.

[7] A. Argyraki, M.H. Ramsey and M. Thompson, Proficiency testing in sampling: pilot study on contaminated land. Analyst, 120 (1995) 2799-2804.

[8] M. Thompson, Uncertainty in an uncertain world. Analyst, 120 (1995) 117N-118N.

[9] M. Thompson and R.J. Howarth, Duplicate analysis in geochemical practice. Analyst, 101 (1976) 690-698.

[10] R.G. Garrett, The determination of sampling and analytical errors in exploration geochemistry. Econ. Geol., 64 (1969) 568-569.

[11] M.H. Ramsey, M. Thompson and M. Hale, Objective evaluation of precision requirements for geochemical analysis using robust analysis of variance. J. Geoch. Explo., 44 (1992) 23-36.

[12] M.H. Ramsey, M. Thompson and E.K. Banerjee, A realistic assessment of analytical data quality from inductively coupled plasma atomic emission spectrometry. Analyt. Proc., 24 (1987) 260-265.

[13] M. Thompson and M.H. Ramsey, Quality concepts and practices applied to sampling - an exploratory study. Analyst, 120 (1995) 261-270.

[14] J. Maskall and I. Thornton, Metal contamination of soils at historical lead smelting sites. Land Contam. Reclam., 1 (1993) 92-100.

[15] Analytical Methods Committee, Robust statistics - how not to reject outliers - Part 2. Inter-laboratory trials. Analyst, 114 (1989) 1699-1705.

[16] UK Department of the Environment, Interdepartmental Committee on the Redevelopment of Contaminated Land. Guidance on the Assessment and Redevelopment of Contaminated Land, DOE, London, 1987.

[17] E.H. Isaaks and R.M. Srivastava, An Introduction to Applied Geostatistics, Oxford University Press, Oxford, 1989, pp. 489.

[18] J. Rivoirand, Introduction to Disjunctive Kriging and Non-linear Geostatistics, Oxford University Press, Oxford, 1994.

[19] De Bievre, Target values for uncertainty components in fissile elements and isotope assay. ESARDA Bull., 6 (1983) 1-5.