

Science of the Total Environment 426 (2012) 196–210

Contents lists available at SciVerse ScienceDirect

Science of the Total Environment

journal homepage: www.elsevier.com/locate/scitotenv

The concept of compositional data analysis in practice: Total major element concentrations in agricultural and grazing land soils of Europe

Clemens Reimann a,⁎, Peter Filzmoser b, Karl Fabian a, Karel Hron c, Manfred Birke d, Alecos Demetriades e, Enrico Dinelli f, Anna Ladenberger g and The GEMAS Project Team 1

a Geological Survey of Norway, PO Box 6315 Sluppen, N-7491 Trondheim, Norway
b Institute for Statistics and Probability Theory, Vienna University of Technology, Wiedner Hauptstrasse 8–10, A-1040 Wien, Austria
c Department of Mathematical Analysis and Applications of Mathematics, Palacký University, Faculty of Science, Listopadu 12, CZ-77146 Olomouc, Czech Republic
d Federal Institute for Geosciences and Natural Resources (BGR), Branch office Berlin, Wilhelmstr. 25–30, D-13593 Berlin, Germany
e Institute of Geology and Mineral Exploration, Entrance C, Olympic Village, Acharnae, Athens, GR-13677, Greece
f University of Bologna, Department of Earth Science, Piazza di Porta San Donato 1, I-40126 Bologna, Italy
g Geological Survey of Sweden (SGU), Box 670, S-751 28 Uppsala, Sweden

⁎ Corresponding author. Tel.: +47 73 904 307. E-mail address: [email protected] (C. Reimann).

1 S. Albanese, M. Andersson, A. Arnoldussen, R. Baritz, M.J. Batista, A. Bellan, D. Cicchella, B. De Vivo, W. De Vos, M. Duris, A. Dusza-Dobek, O.A. Eggen, M. Eklund, V. Ernstsen, T.E. Finne, D. Flight, S. Forrester, M. Fuchs, U. Fugedi, A. Gilucis, M. Gosar, V. Gregorauskiene, A. Gulan, J. Halamić, E. Haslinger, P. Hayoz, G. Hobiger, R. Hoffmann, J. Hoogewerff, H. Hrvatovic, S. Husnjak, L. Janik, C.C. Johnson, G. Jordan, J. Kirby, J. Kivisilla, V. Klos, F. Krone, P. Kwecko, L. Kuti, A. Lima, J. Locutura, P. Lucivjansky, D. Mackovych, B.I. Malyuk, R. Maquil, M.J. McLaughlin, R.G. Meuli, N. Miosic, G. Mol, P. Négrel, P. O'Connor, K. Oorts, R.T. Ottesen, A. Pasieczna, V. Petersell, S. Pfleiderer, M. Poňavič, C. Prazeres, U. Rauch, . Salpeteur, A. Schedl, A. Scheib, I. Schoeters, P. Sefcik, E. Sellersjö, F. Skopljak, I. Slaninka, A. Šorša, R. Srvkota, T. Stafilov, T. Tarvainen, V. Trendavilov, P. Valera, V. Verougstraete, D. Vidojević, A.M. Zissimos, Z. Zomeni.

0048-9697/$ – see front matter © 2012 Elsevier B.V. All rights reserved.
doi:10.1016/j.scitotenv.2012.02.032

Article info

Article history:
Received 20 December 2011
Received in revised form 14 February 2012
Accepted 14 February 2012
Available online 12 April 2012

Keywords:
Agricultural soil
XRF
Major elements
Europe
Geochemistry
Compositional data

Abstract

Applied geochemistry and environmental sciences invariably deal with compositional data. Classically, the original or log-transformed absolute element concentrations are studied. However, compositional data do not vary independently, and a concentration-based approach to data analysis can lead to faulty conclusions. For this reason a better statistical approach was introduced in the 1980s, exclusively based on relative information. Because the difference between the two methods should be most pronounced in large-scale, and therefore highly variable, datasets, here a new dataset of agricultural soils, covering all of Europe (5.6 million km²) at an average sampling density of 1 site/2500 km², is used to demonstrate and compare both approaches. Absolute element concentrations are certainly of interest in a variety of applications and can be provided in tabulations or concentration maps. Maps for the opened data (ratios to other elements) provide more specific additional information. For compositional data, XY plots for raw or log-transformed data should only be used with care in an exploratory data analysis (EDA) sense, to detect unusual data behaviour, candidate subgroups of samples, or to compare pre-defined groups of samples. Correlation analysis and the Euclidean distance are not mathematically meaningful concepts for this data type. Element relationships have to be investigated via a stability measure of the (log-)ratios of elements. Logratios are also the key ingredient for an appropriate multivariate analysis of compositional data.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

Geochemistry aims to quantitatively determine the chemical composition of the Earth and its parts and to discover the factors that control the distribution of individual elements (Goldschmidt, 1937, 1954). Geochemical studies need to be carried out from the atomic to the continental and finally global (some may argue cosmic) scale (for discussions of scale see: Darnley et al., 1995; Reimann et al., 2009, 2010) to meet these aims.

Geochemical data are usually reported as concentrations in units of mg/kg or weight percent (wt.%) and are thus a classical example of compositional (closed) data (CoDa; Aitchison, 1986). If all chemical elements in a sample are analysed, the analytical results sum to a constant (1,000,000 mg/kg or 100 wt.%). Thus no single variable is free to vary separately from the rest of the total composition. Even if not all chemical elements are analysed, the total element concentrations still depend on each other. The relevant information for each single variable in a geochemical dataset thus lies in the ratios between all variables and not in the measured element concentrations as such. An interpretation and statistical evaluation of the observed concentration values is only meaningful if the relationship to the values of the remaining variables is taken into account (Aitchison, 1986; Filzmoser et al., 2010). It could hence be argued that a multi-element geochemical dataset should only be analysed in multivariate space, without even considering the univariate case. However, a careful univariate data analysis has always been the starting point of statistical analyses of regional geochemical datasets (Reimann et al., 2008). This is a reasonable approach because it helps to better understand the behaviour of the data before more sophisticated multivariate techniques are applied. For example, the very aim of a regional geochemical mapping project is to study and predict the distribution (concentration) of a chemical element in two-dimensional space. Such maps have been successfully used to aid geological mapping, for mineral exploration, for documenting contamination, and for detecting a multitude of additional processes that determine the distribution of chemical elements at the Earth's surface. It will be hard to convince a regional geochemist that all these maps are “wrong” and that dimensionless ratio maps (which ratio?) are the only correct maps. Thus, while many solutions to the closure problem exist for multivariate data analysis (e.g., Aitchison and Greenacre, 2002; Buccianti and Pawlowsky-Glahn, 2005; Buccianti et al., 2006; Egozcue and Pawlowsky-Glahn, 2011; Filzmoser and Hron, 2008; Filzmoser et al., 2009b; Hron et al., 2010; Otero et al., 2005; Pawlowsky-Glahn and Buccianti, 2002, 2011; Tolosana-Delgado and van den Boogaart, 2011; von Eynatten et al., 2003), a sensible approach to univariate and bivariate data analysis of compositional data, satisfying the statistician as well as the geochemist, is still under development (Filzmoser et al., 2009a, 2010).
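The closure effect described above can be made concrete with a small numerical sketch. The numbers below are hypothetical and are not taken from the GEMAS data: two amounts are generated independently, rescaled so that each sample sums to 100 wt.%, and the correlation of the closed parts is then computed. With only two parts the closed values are forced to mirror each other, so their correlation is expected to be −1 up to floating-point error, regardless of how the raw amounts behaved.

```python
import math
import random


def close(parts, total=100.0):
    """Rescale positive parts so that they sum to `total` (closure)."""
    s = sum(parts)
    return [total * p / s for p in parts]


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed, used only for reproducibility

# Two independently generated "absolute" amounts (hypothetical values).
a = [rng.uniform(1.0, 10.0) for _ in range(200)]
b = [rng.uniform(1.0, 10.0) for _ in range(200)]

# Close every sample to 100 wt.%.
closed = [close([ai, bi]) for ai, bi in zip(a, b)]
a_pct = [c[0] for c in closed]
b_pct = [c[1] for c in closed]

r_raw = pearson(a, b)             # correlation of the independent amounts
r_closed = pearson(a_pct, b_pct)  # correlation induced by closure

print("raw amounts:  r =", round(r_raw, 3))
print("closed parts: r =", round(r_closed, 3))
```

With more than two parts the induced correlation is less extreme, but the same mechanism applies: part of any correlation between concentrations is produced by the constant-sum constraint alone.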

The administration of the new European REACH (Registration, Evaluation and Authorisation of Chemicals; EC, 2006) regulation, which came into force on 1 June 2007, requires knowledge about “soil quality” at the European scale. The GEMAS (Geochemical mapping of agricultural and grazing land soils) project, a cooperation project between EuroGeoSurveys and Eurometaux, aims at providing such data for Europe. Samples of agricultural soil and of soil under permanent grass cover were taken during 2008 at an average density of 1 site/2500 km², covering the member states of the European Union (except Malta and Romania) and several neighbouring countries (e.g., Norway, Serbia, Ukraine). In total, over 4000 samples were collected, prepared and analysed (see also Reimann et al., 2012a). The total concentrations of the major elements (Al2O3, CaO, Fe2O3, K2O, MgO, MnO, Na2O, P2O5, SiO2, and TiO2, plus Loss on Ignition (LOI)) in the soil samples, determined for the GEMAS project by X-ray fluorescence spectrometry (XRF; LOI gravimetrically), are a “classical” example of a “closed” dataset. This dataset is used here:

(1) To report the concentration of major elements in European agricultural soils.

(2) To study the regional distribution of the major elements in order to better understand the processes governing the distribution of chemical elements in European agricultural soils and their relative importance at the continental scale.

(3) To investigate effects of data closure and to understand which evaluation procedures may be applied to such data, and which should be avoided.

(4) To compare alternative data analysis techniques to the classical way of treating geochemical data.

(5) To further develop recommendations for the uni-, bi- and multivariate investigation of compositional datasets.

1.1. The survey area

Maps covering topography and land use for Europe can be found in almost any atlas. A number of maps covering different themes at about the scale of the GEMAS project (topography, geology, tectonics, fault and fracture zones, distribution of different rock types, distribution of the main sedimentary basins, precipitation and population density) are collected in Reimann and Birke (2010). Fig. 1 shows a simplified geological map including the main geological structures discussed in this paper. For Europe an excellent source of land use information is the CORINE land cover map of Europe (GLC2000 database, 2003). A detailed geological map of Europe is provided by Asch (2003), and concise descriptions of the geology of Europe can be found in Ziegler (1990), Blundell et al. (1992) and McCann (2008). The soil atlas of Europe provides a wealth of information on European soils, but also contains maps of average precipitation, temperature, land use, population density, extent of the last glaciation, and soil texture (Jones et al., 2005).

2. Material and methods

2.1. Project background and sampling

GEMAS is a cooperation project between the Geochemistry Expert Group of EuroGeoSurveys (EGS) and Eurometaux. The GEMAS project aims to produce consistent soil geochemistry data at the continental scale in accordance with REACH (EC, 2006) requirements. REACH specifies that industry must prove that it can produce and handle its substances safely. Risks due to the exposure to a substance during production and use at the local, regional and European scale all need to be assessed. Industries handling metals needed harmonised data on the natural distribution of chemical elements, and of soil properties governing metal availability in soils at the continental scale. REACH requires that risk assessment is performed according to land use. The GEMAS project focused on agricultural soils from arable and grazing land, both linked to the human food chain. According to REACH the sample depth should be 0–20 cm for agricultural soils (arable land, Ap-horizon) and 0–10 cm for grazing land soils (land under permanent grass cover), and the <2 mm grain size is the fraction to be analysed. With the exception of the sample density, the sampling requirements were thus rigidly fixed by external requirements.

With regard to sample density it was decided to follow the example of an earlier project, covering Northern Europe (the Baltic Soil Survey: Reimann et al., 2003), and to sample one site per 2500 km² (50×50 km grid). The grid cells were centrally provided, but the sample teams were free to decide where in a grid cell the two samples of agricultural and grazing land soil were taken. Sample materials and especially the bags used for storing the samples were centrally provided to all field teams.
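As a quick plausibility check of the sampling density, the following sketch relates the grid cell size to the survey area of 5.6 million km² quoted in the abstract. It is only back-of-the-envelope arithmetic: the nominal cell count it produces ignores coastlines, countries that did not take part and inaccessible cells, so it should be read as an order-of-magnitude figure next to the sample numbers reported in Table 1 (N=2108 and N=2024).

```python
AREA_KM2 = 5.6e6      # survey area quoted in the abstract
CELL_SIDE_KM = 50.0   # edge length of one grid cell

cell_area = CELL_SIDE_KM ** 2          # km² covered by one cell
nominal_cells = AREA_KM2 / cell_area   # cells needed to tile the area

print("cell area (km²):", cell_area)
print("nominal number of cells:", nominal_cells)
```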

Samples were taken as composites from 5 sites spread over a ca. 100 m² area in a large agricultural field (Ap sample) and on land under permanent grass cover (Gr sample). The average weight of a sample was 3.5 kg. It was attempted to find sample sites for the Ap and Gr samples in as close proximity as possible. The average distance between the two sites is 500 m but, depending on land use, single sample pairs where the sites are more than 50 km apart do occur. All sites and the soil profile at any one site were documented in a series of photographs. Field procedures are detailed in a field handbook which is freely available on the internet (EGS, 2008). For quality control purposes, a field duplicate was taken at every 20th sample site with an offset distance of ca. 10–20 m from the original sample site.

2.2. Sample preparation

All samples were prepared in a central laboratory (Geological Survey of the Slovak Republic). The samples were air dried and sieved to pass a 2 mm nylon screen. All samples were then randomised and analytical duplicates and project standards were introduced at a rate of 1 in 20. All samples were then split into ten aliquots using a Jones Riffle splitter. Four splits of ~200 g each are stored for future reference, and 6 splits of 50–100 g each were sent to the different contract laboratories for the immediate analytical work. For analysis by XRF the samples needed to be milled prior to further sample preparation. One of the small sample splits was milled to less than 63 μm in an agate disc mill at BGR's laboratory in Germany. Loss on Ignition (LOI) was then determined on all samples by slowly heating them to 1030 °C, keeping them at this temperature for 15 min in a muffle furnace, letting them cool to room temperature in a desiccator and reporting the weight loss. For soil samples with a LOI <25%, 1 g of sample was mixed with 5 g lithium metaborate and 25 mg lithium bromide in Pt95-Au5 crucibles and fused for 20 min at 1200 °C in an automatic fluxer (HAG 12–1500). For soil samples with LOI >25% a mixture of 1 g of sample with 2.5 g lithium metaborate and 2.5 g lithium tetraborate was used for fluxing.

Fig. 1. Simplified geological map of the survey area (modified from Reimann et al., 2012b). AM: Armorican Massif, BF: Black Forest, BM: Bohemian Massif, Co: Cornwall, H: Harz, IV: Iberian Variscides, ISC: Irish-Scottish Caledonides, MC: Massif Central, RS: Rhenish Slate Mountains, SC: Scandinavian Caledonides, V: Vosges, TESZ: Trans European Suture Zone.

2.3. Analysis

Total concentrations of the 10 major elements reported here (Al2O3, CaO, Fe2O3, K2O, MgO, MnO, Na2O, P2O5, SiO2 and TiO2) were determined by wavelength dispersive X-ray fluorescence spectrometry (WD-XRFS) using PAN2400 and AXIOS WD-XRFs with Cr- and Rh-anode X-ray tubes, respectively. To correct for matrix effects and spectral interferences, calibration curves were constructed using 130 certified reference materials.

2.4. Quality control

Quality control (QC) was based on (a) a field duplicate taken at a rate of 1 in 20 samples, (b) an analytical replicate produced from each field duplicate and (c) the frequent (1 in 20) insertion of a project standard. Results of QC are documented in a report which is freely available on the internet (Reimann et al., 2011b). Precision for all elements/parameters reported here is better than 3%.

2.5. Data analysis

Data analysis and map plotting were carried out in R, an open source software which can be freely downloaded from the CRAN server at http://cran.r-project.org. The R scripts used for producing the graphics in this article can be found in the supplementary material. All results from the GEMAS project will be published in the form of a book in 2013. All project data will accompany that book in the form of Excel files on an attached CD-ROM.

3. Results and discussion

3.1. Tabulation of statistical distribution measures

Table 1
Summary statistics, major elements, agricultural soil (Ap samples, 0–20 cm, <2 mm fraction, N=2108) and grazing land soil (Gr samples, 0–10 cm, <2 mm, N=2024). All analytical results in weight percent (wt.%). Analysis of oxides by WD-XRF; LOI: gravimetric. Mat.: material; DL: detection limit; Min: minimum; Q: quantiles (Q50 = median); Max: maximum; MAD: median absolute deviation; “.log”: for the log-transformed values; “.ilr”: for the ilr-transformed results; Powers: orders of magnitude of variation.

| Oxide / Parameter | Mat. | DL | Min | Q2 | Q5 | Q10 | Q25 | Q50 | Q75 | Q90 | Q95 | Q98 | Max | MAD.log | MAD.ilr | Powers |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Al2O3 | Ap | 0.05 | 0.37 | 1.98 | 2.93 | 4.1 | 6.87 | 10.4 | 12.8 | 14.5 | 15.8 | 17.2 | 27.2 | 0.17 | 0.277 | 1.9 |
| Al2O3 | Gr | 0.05 | 0.29 | 1.47 | 2.42 | 3.57 | 6.44 | 9.8 | 12.6 | 14.6 | 15.8 | 17.2 | 26.7 | 0.191 | 0.312 | 2 |
| CaO | Ap | 0.005 | 0.012 | 0.108 | 0.183 | 0.273 | 0.524 | 1.19 | 2.63 | 10.4 | 19.5 | 28.3 | 52.9 | 0.522 | 0.85 | 3.6 |
| CaO | Gr | 0.005 | 0.007 | 0.104 | 0.168 | 0.265 | 0.489 | 1.07 | 2.51 | 9.76 | 18.8 | 27.4 | 50.8 | 0.525 | 0.856 | 3.9 |
| Fe2O3 | Ap | 0.01 | 0.12 | 0.501 | 0.77 | 1.19 | 2.22 | 3.51 | 4.99 | 6.13 | 6.92 | 8.45 | 22.1 | 0.252 | 0.41 | 2.3 |
| Fe2O3 | Gr | 0.01 | 0.13 | 0.455 | 0.73 | 1.14 | 2.07 | 3.43 | 4.96 | 6.19 | 7.06 | 8.36 | 16.4 | 0.27 | 0.44 | 2.1 |
| K2O | Ap | 0.005 | 0.029 | 0.463 | 0.684 | 0.913 | 1.37 | 1.91 | 2.41 | 2.94 | 3.37 | 4.09 | 9.54 | 0.177 | 0.289 | 2.5 |
| K2O | Gr | 0.005 | 0.029 | 0.345 | 0.541 | 0.787 | 1.25 | 1.78 | 2.32 | 2.89 | 3.4 | 4.13 | 6.03 | 0.195 | 0.317 | 2.3 |
| MgO | Ap | 0.01 | 0.02 | 0.08 | 0.13 | 0.23 | 0.5 | 0.91 | 1.56 | 2.46 | 3.3 | 4.86 | 21 | 0.36 | 0.587 | 3 |
| MgO | Gr | 0.01 | 0.02 | 0.07 | 0.12 | 0.2 | 0.45 | 0.86 | 1.46 | 2.32 | 3.1 | 4.71 | 24.1 | 0.371 | 0.604 | 3.1 |
| MnO | Ap | 0.001 | 0.005 | 0.015 | 0.022 | 0.03 | 0.049 | 0.078 | 0.112 | 0.161 | 0.204 | 0.254 | 2.25 | 0.261 | 0.425 | 2.7 |
| MnO | Gr | 0.001 | 0.004 | 0.011 | 0.017 | 0.026 | 0.045 | 0.074 | 0.114 | 0.158 | 0.202 | 0.257 | 0.708 | 0.289 | 0.471 | 2.2 |
| Na2O | Ap | 0.01 | 0.01 | 0.07 | 0.12 | 0.207 | 0.44 | 0.81 | 1.69 | 2.51 | 2.82 | 3.21 | 4.58 | 0.423 | 0.689 | 2.7 |
| Na2O | Gr | 0.01 | <0.01 | 0.06 | 0.11 | 0.17 | 0.38 | 0.71 | 1.36 | 2.17 | 2.6 | 3.04 | 3.94 | 0.414 | 0.674 | 2.9 |
| P2O5 | Ap | 0.001 | 0.014 | 0.064 | 0.08 | 0.099 | 0.132 | 0.18 | 0.253 | 0.342 | 0.429 | 0.566 | 1.01 | 0.209 | 0.34 | 1.9 |
| P2O5 | Gr | 0.001 | 0.02 | 0.051 | 0.07 | 0.0883 | 0.122 | 0.178 | 0.257 | 0.351 | 0.431 | 0.567 | 2.01 | 0.24 | 0.391 | 2 |
| SiO2 | Ap | 0.1 | 2.46 | 22.4 | 35.6 | 46.5 | 58.9 | 67.2 | 76.8 | 85.9 | 89.5 | 92.2 | 96.5 | 0.0854 | 0.139 | 1.6 |
| SiO2 | Gr | 0.1 | 1.51 | 17 | 30.9 | 42.5 | 56.1 | 64.8 | 74.6 | 84 | 88.5 | 91.5 | 97.8 | 0.0926 | 0.151 | 1.8 |
| TiO2 | Ap | 0.001 | 0.015 | 0.127 | 0.185 | 0.261 | 0.407 | 0.604 | 0.766 | 0.898 | 0.994 | 1.18 | 4.1 | 0.184 | 0.299 | 2.4 |
| TiO2 | Gr | 0.001 | 0.019 | 0.0995 | 0.149 | 0.227 | 0.39 | 0.595 | 0.757 | 0.895 | 1.01 | 1.18 | 3.96 | 0.194 | 0.317 | 2.3 |
| LOI | Ap | 0.01 | 1.46 | 2.59 | 3.11 | 3.91 | 5.87 | 8.57 | 12.8 | 20.5 | 27.4 | 36 | 95.3 | 0.251 | 0.409 | 1.8 |
| LOI | Gr | 0.01 | 1.03 | 2.73 | 3.98 | 5 | 7.74 | 11.6 | 16.6 | 25.9 | 33.2 | 49 | 93.9 | 0.247 | 0.402 | 2 |

Table 1 summarises the analytical results. In an attempt to provide valid measures for compositional data the table was built around percentiles (quantiles of the distribution). Although percentiles of the distribution are not influenced by a log-transformation (the percentile of the log-transformed values is simply the logarithm of the percentile), they can change under a logratio-transformation. Thus, even the validity of Table 1 could be questioned, because it reports percentiles of compositional data. To improve this situation, it would be desirable to calculate the percentiles of the logratio-transformed data and afterwards to transform the results back to the original data scale for ease of interpretation. Unfortunately, such a back-transformation would only be unique if the concentrations for all samples summed to 1, which is usually not the case. An alternative is to work with all possible pairwise logratios and to derive their percentiles. However, having to consider all pairs makes the analysis quite complex when many compositional parts are involved.
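The difference between the two kinds of transformation can be illustrated with a toy example. The five two-part samples below are hypothetical and chosen only so that the medians are easy to follow by hand; they are not GEMAS values. The sketch checks that the median commutes with a plain log-transformation, that the median of a logratio is in general not the logratio of the medians, and how many pairwise logratios arise for the 11 parts (ten oxides plus LOI) of Table 1.

```python
import math
from statistics import median

# Hypothetical concentrations of two parts in five samples.
x = [2.0, 4.0, 8.0, 16.0, 32.0]
y = [64.0, 1.0, 8.0, 2.0, 4.0]

# A log-transformation is monotone, so it does not move the median.
log_of_median = math.log10(median(x))
median_of_logs = median([math.log10(v) for v in x])

# A logratio combines two parts sample by sample, so its median
# cannot be derived from the medians of the individual parts.
median_logratio = median([math.log10(a / b) for a, b in zip(x, y)])
logratio_of_medians = math.log10(median(x) / median(y))

# Number of distinct pairwise logratios for D parts: D * (D - 1) / 2.
n_parts = 11
n_pairs = n_parts * (n_parts - 1) // 2

print("log of median vs median of logs:", log_of_median, median_of_logs)
print("median logratio vs logratio of medians:",
      median_logratio, logratio_of_medians)
print("pairwise logratios for", n_parts, "parts:", n_pairs)
```

The pair count makes the remark on complexity tangible: even for the major elements alone, dozens of logratio distributions would have to be tabulated.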

The information collected in Table 1 is needed if absolute values rather than relative ratios are of interest. Note that Table 1 does not provide values for mean and standard deviation, both of which are based on Euclidean distances. Compositional data, however, do not belong to the classical Euclidean space, but need to be considered in their own Euclidean geometry on the simplex (see Aitchison, 1986; Filzmoser et al., 2009a, 2010; Egozcue and Pawlowsky-Glahn, 2011), even for univariate data analysis. Therefore, all classical statistical tests for comparison of the mean (median) of the two datasets will deliver faulty results because they are based on Euclidean distances. In terms of logratio (the logarithm of a ratio) transformation, three different approaches to open the data are possible:

(a) an additive logratio (alr)-transformation (Aitchison, 1986), sacrificing one variable, e.g., TiO2, and presenting all other results as a logratio to TiO2 (but why to TiO2? Different results must be expected when another variable is sacrificed);

(b) a centred logratio (clr)-transformation (Aitchison, 1986) where, in order to construct the logratios, each variable is divided by the geometric mean of all elements measured, followed by a log-transformation;

(c) an isometric logratio (ilr)-transformation (Egozcue et al., 2003) which has preferable geometrical properties for multivariate data analysis but where the direct relation to the elements is lost.
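The three transformations listed above can be sketched in a few lines. This is a minimal illustration and not the R code used for the article; the function names and the four-part composition are hypothetical, natural logarithms are used throughout, and for the ilr one particular orthonormal basis (so-called pivot coordinates) is assumed, since the ilr is only defined up to the choice of basis.

```python
import math


def alr(x, ref=-1):
    """Additive logratio: log of each part relative to one reference part."""
    ref = ref % len(x)
    return [math.log(v / x[ref]) for i, v in enumerate(x) if i != ref]


def clr(x):
    """Centred logratio: log of each part relative to the geometric mean."""
    mean_log = sum(math.log(v) for v in x) / len(x)
    return [math.log(v) - mean_log for v in x]


def ilr(x):
    """Isometric logratio in pivot coordinates (one possible basis choice)."""
    logs = [math.log(v) for v in x]
    coords = []
    for i in range(len(x) - 1):
        rest = logs[i + 1:]
        k = len(rest)  # number of parts that follow part i
        coords.append(math.sqrt(k / (k + 1)) * (logs[i] - sum(rest) / k))
    return coords


def norm(v):
    """Euclidean length of a coordinate vector."""
    return math.sqrt(sum(t * t for t in v))


composition = [60.0, 25.0, 10.0, 5.0]  # hypothetical parts in wt.%

print("alr:", [round(v, 3) for v in alr(composition)])
print("clr:", [round(v, 3) for v in clr(composition)])
print("ilr:", [round(v, 3) for v in ilr(composition)])
```

Three properties are worth checking on the output: the alr and ilr return one coordinate fewer than there are parts, the clr coordinates sum to zero (the source of the information overlap between clr-variables), and the clr and ilr vectors have the same Euclidean length, which is what "isometric" refers to.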

In the following, the clr-transformation is applied to perform the statistical analysis of compositional data. However, the clr-transformation has a severe disadvantage. The resulting clr-variables have a certain information overlap because the geometric mean is used as a common divisor. A scatter plot of a pair of clr-variables could thus be interpreted in a misleading way. For this reason, in such plots only the single clr-variables will be interpreted later on, but not the relation between them. It has been demonstrated that the single clr-variables are proportional to ilr-variables using a special class of ilr transformations (see Filzmoser et al., 2012). Each of these ilr-variables (and thus each clr-variable) contains all the relative information of the corresponding element to the remaining elements, and is, therefore, fully informative about the compositional information of the underlying element. The ilr-(clr-) variables are a statistically correct representation of a compositional dataset as long as each of these variables is considered separately.
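The proportionality between a single clr-variable and an ilr-variable can be verified numerically. The sketch below assumes that the "special class" of ilr transformations consists of pivot-type coordinates in which the element of interest is placed first; this reading, the function names and the composition are assumptions made for illustration. Under that assumption, and with natural logarithms, the first ilr coordinate equals the clr-variable multiplied by sqrt(D/(D−1)), where D is the number of parts.

```python
import math


def clr_part(x, j):
    """clr-variable of part j (natural logarithm)."""
    mean_log = sum(math.log(v) for v in x) / len(x)
    return math.log(x[j]) - mean_log


def first_pivot_coordinate(x, j):
    """First ilr pivot coordinate when part j is moved to the front."""
    d = len(x)
    other_logs = [math.log(v) for i, v in enumerate(x) if i != j]
    contrast = math.log(x[j]) - sum(other_logs) / (d - 1)
    return math.sqrt((d - 1) / d) * contrast


composition = [60.0, 25.0, 10.0, 5.0]  # hypothetical parts in wt.%
d = len(composition)
factor = math.sqrt(d / (d - 1))  # proportionality constant for D parts

for j in range(d):
    z = first_pivot_coordinate(composition, j)
    c = clr_part(composition, j)
    print(j, round(z, 4), round(factor * c, 4))
```

Because the factor depends only on the number of parts, rankings, percentile positions and relative spreads of a clr-variable carry over unchanged to the corresponding ilr-variable.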

Table 2 shows the CoDa analysis equivalent to Table 1, the statisticalparameters for the clr-transformed variables. Each clr-variable inTable 2 contains all relative information of the studied element to theremaining elements. Since each clr-variable should be considered sepa-rately, any use of correlation analysis with clr-variables would not bemeaningful. These clr-variables are ratios and consequently dimension-less numbers without any obvious meaning to the geochemist. Thereexist no values to compare in the literature, and these ratios do not pro-vide the desired information on how much of an element occurs at agiven location in space. The variables do no longer provide absolutebut relative information. Median values and variances of the singleCoDa variables remain comparable. It is worth noting that, whenaccepting that one is looking at different and dimensionless numbers,Table 2 provides information which is not that different from Table 1.Table 2 shows how dominant an element is in the composition.Elements with high concentrations are also characterised by highclr(element)-values; even the sequence of elements remains thesame. Just like in Table 1, the differences observed between the twodatasets (Ap and Gr) are minimal. The median value of clr(Al2O3) of0.74 explains that on average the concentration of Al2O3 is 100.74=5.5times larger than the geometric mean of all concentrations (for easeof interpretation the log to the base 10 rather than the natural logarithmwas used for the clr-transformation). The median value clr(MnO) of−1.35, in contrast, signifies that MnO has an abundance of only 1/25of the geometric mean. The maximum value observed for clr(MnO) isstill lower than the minimum value for clr(Al2O3). The interquartilerange (IQR) and median absolute deviation (MAD) provide estimates

Page 5: Science of the Total EnvironmentApplied geochemistry and environmental sciences invariably deal with compositional data. Classically, the original ... statistical approach was introduced

Table 2Summary statistics, clr-transformed major elements, agricultural soil (Ap-samples, 0–20 cm, b2 mm fraction, N=2108) and grazing land soil (Gr samples, 0–10 cm, b2 mm,N=2024). All values are ratios, i.e. dimensionless. Mat.: material, Min: minimum; Q: quantiles (Q50=median); Max: maximum; MAD: median absolute deviation, IQR: interquar-tile range.

Oxide       Mat.      Min       Q2       Q5      Q10      Q25      Q50      Q75      Q90      Q95      Q98      Max    RANGE      MAD      IQR
clr(Al2O3)  Ap     0.0101     0.46    0.542    0.609    0.687    0.747    0.809     0.88    0.939     1.02     1.31      1.3   0.0898   0.0904
            Gr    −0.0228    0.405    0.508    0.578    0.672    0.742    0.807    0.891    0.952     1.01     1.24     1.27    0.101   0.0999
clr(CaO)    Ap       −1.7   −0.895   −0.717   −0.595   −0.408   −0.173    0.104    0.678     1.02     1.24     2.19     3.89    0.375     0.38
            Gr      −1.88   −0.929   −0.733   −0.641   −0.428   −0.183    0.108    0.689     1.05     1.28     2.03      3.9    0.387    0.397
clr(Fe2O3)  Ap     −0.245  −0.0511   0.0256   0.0874    0.189    0.296    0.384    0.474     0.54    0.611    0.931     1.18    0.143    0.144
            Gr      −0.29   −0.073   0.0155   0.0777    0.188    0.294    0.401    0.489    0.555    0.631      1.2     1.49    0.158    0.158
clr(K2O)    Ap      −1.07   −0.489     −0.3   −0.201  −0.0696   0.0563    0.171     0.27    0.339    0.459    0.826      1.9    0.178    0.178
            Gr      −1.07   −0.548   −0.359   −0.238  −0.0934    0.031    0.154    0.267     0.35    0.463    0.815     1.88    0.184    0.183
clr(MgO)    Ap     −0.991   −0.839   −0.708   −0.617   −0.443   −0.284   −0.129   0.0162    0.151    0.299     1.11     2.11    0.234    0.233
            Gr     −0.996   −0.833   −0.747   −0.641   −0.463   −0.296   −0.142   0.0126    0.131    0.327     1.24     2.24    0.238    0.238
clr(MnO)    Ap      −1.99    −1.73    −1.66    −1.59    −1.47    −1.35    −1.22    −1.07   −0.979   −0.873  −0.0619     1.93    0.187    0.187
            Gr      −2.19     −1.8     −1.7    −1.61    −1.48    −1.35    −1.21    −1.07   −0.973   −0.856   −0.378     1.81    0.197    0.199
clr(Na2O)   Ap      −1.64    −1.21    −1.02   −0.786   −0.481   −0.269  −0.0788   0.0911    0.168    0.232    0.394     2.04    0.294    0.298
            Gr       −1.8    −1.22    −1.07   −0.879   −0.553   −0.316   −0.129   0.0425    0.127    0.197    0.465     2.27     0.31    0.314
clr(P2O5)   Ap       −1.5    −1.29    −1.24    −1.18    −1.07   −0.955   −0.819   −0.671     −0.6    −0.48 −0.00426      1.5    0.185    0.184
            Gr      −1.55    −1.37    −1.28     −1.2    −1.09    −0.95   −0.805   −0.662   −0.575   −0.404    0.243     1.79    0.206    0.208
clr(SiO2)   Ap      0.857     1.11     1.22     1.31     1.42     1.55     1.75        2     2.14     2.28     2.79     1.94    0.238    0.246
            Gr      0.694     1.11     1.21     1.28     1.41     1.55     1.75     1.98     2.15     2.33      2.9     2.21     0.24    0.253
clr(TiO2)   Ap      −1.38   −0.824   −0.738   −0.668   −0.575   −0.468   −0.365   −0.287   −0.232   −0.125    0.333     1.72    0.156    0.156
            Gr      −1.24   −0.878   −0.772   −0.693   −0.579   −0.469   −0.368   −0.274   −0.199   −0.111    0.226     1.47    0.157    0.156
Parameter
clr(LOI)    Ap        0.2    0.366    0.446    0.508    0.597      0.7    0.836     1.03     1.19     1.45     2.49     2.29    0.171    0.177
            Gr    0.00242    0.448    0.544    0.606    0.707    0.825    0.982     1.19     1.38     1.69     2.49     2.49    0.204    0.204

200 C. Reimann et al. / Science of the Total Environment 426 (2012) 196–210

of the variance of the clr-transformed variables. The higher the variance, the higher is the influence of this variable on the multivariate data ensemble. Interestingly, due to this effect, clr(CaO) and clr(Na2O) both have more influence than the dominating element clr(SiO2). The reason is that the relative concentrations of CaO and Na2O vary more than that of SiO2 (see MAD or IQR in Table 2).
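The reading of a clr value as a multiple of the geometric mean can be illustrated with a minimal sketch. It assumes the usual definition of the centred logratio, uses log base 10 as in Table 2, and works on a hypothetical five-part composition whose values are purely illustrative (they are not GEMAS data); the function name `clr` is likewise only a label chosen here.

```python
import math

def clr(composition):
    """Centred logratio (base 10) of one composition of strictly positive parts."""
    logs = [math.log10(x) for x in composition]
    mean_log = sum(logs) / len(logs)  # log10 of the geometric mean
    return [v - mean_log for v in logs]

# Hypothetical composition in wt.% (illustrative values only)
names = ["SiO2", "Al2O3", "Fe2O3", "CaO", "MnO"]
sample = [65.0, 11.0, 3.5, 1.0, 0.07]

values = clr(sample)
for name, v in zip(names, values):
    # 10**clr is the factor relating the part to the geometric mean of all parts
    print(f"clr({name}) = {v:6.2f}  ->  {10 ** v:8.3f} x geometric mean")
```

By construction the clr values of one sample sum to zero, which is one way of seeing that they carry only relative information.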

To better understand the effects of clr-transformation, the original concentration values are compared to the clr values of the same variable. If opening the data were only a minor correction, one would expect a well-defined one-to-one relationship between concentration and clr values. By showing the plots for Al2O3, CaO, P2O5 and LOI (all plots are collected in the Supplementary material), Fig. 2 demonstrates clearly that no such well-defined relation exists. It would be very difficult to predict the effect of opening the data on any one variable. In contrast to a widespread misconception, the non-uniqueness does not depend on the absolute concentration. In the four examples shown, the largest differences must be expected for Al2O3, P2O5 and Fe2O3 while for CaO the results probably do not change much.

3.2. Plots of the cumulative data distribution

One of the most powerful tools to study the data distribution is a cumulative probability (CP) diagram. For the above examples (Al2O3, CaO, Fe2O3 and P2O5), comparing the CP diagrams for original data and clr-transformed variables in Fig. 3 results in qualitatively similar distributions.

At the level of data interpretation, the statistical distributions of the Ap and Gr datasets are generally surprisingly well comparable for the major elements. Only LOI is significantly higher for the Gr samples (compare Table 1). This probably has several reasons: (1) under permanent grass cover a thin organic layer develops at the top of the soil profile, (2) the dilution of organic by minerogenic material is less at the Gr sample depth of 0–10 cm than at the 0–20 cm used for Ap soils, and (3) in parts of Europe the very reason that agricultural land is used as grazing land is that it is too wet to plough, which implies that it is rich in organic material.

One purpose of CP plots is to detect breaks in the data structure which may indicate subpopulations or certain geochemical processes (Reimann et al., 2008). As already visible in Fig. 2, the CP plots show that the statistical distributions displayed by concentration versus clr-transformed variables contain different information. Most prominently, the breaks shift to different positions in the plots.
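How a break appears in a CP diagram can be sketched as follows. The plotting position (i − 0.5)/n and the helper name `cp_coordinates` are assumptions made for this illustration, and the data are hypothetical values forming two subpopulations; the plotting convention actually used for Fig. 3 is not stated in the text.

```python
from statistics import NormalDist

def cp_coordinates(values):
    """Return (sorted value, normal quantile of its plotting position) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting position (i - 0.5) / n is one common convention (assumed here)
    return [(x, std_normal.inv_cdf((i - 0.5) / n))
            for i, x in enumerate(ordered, start=1)]

# Hypothetical clr values forming two subpopulations (not GEMAS data)
data = [-0.9, -0.8, -0.75, -0.7, -0.65, 0.4, 0.5, 0.55, 0.6, 0.7]
coords = cp_coordinates(data)

# A break shows up as an unusually large step in value between neighbouring points
gaps = [(b[0] - a[0], a[0], b[0]) for a, b in zip(coords, coords[1:])]
largest = max(gaps)
print(f"largest step lies between {largest[1]} and {largest[2]}")
```

Because the position of such a step is defined on the value axis, it will in general move when the same samples are expressed as clr values instead of concentrations.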

3.3. Comparing the two sample materials

In a next step, the results from the two sample materials are compared in more detail than provided by data tabulation or statistical distribution, where Tables 1 and 2, and Fig. 3 indicate that the statistical behaviour of Ap and Gr is almost identical. This can be done graphically by plotting for each variable the Ap result against the result of the neighbouring Gr sample, and adding a 1:1 line for ease of comparison. Fig. 4 compares the datasets sample-by-sample using XY plots: comparisons are done both in log scale and in clr scale. This reveals that substantial local differences between neighbouring Ap and Gr samples occur. However, the majority of sample pairs return well comparable results, and this fact dominates the overall statistical appearance in Fig. 3.

The statistical similarity of the two datasets has an important implication for low density geochemical mapping at the continental (European) scale. It demonstrates that two independent sample sets, collected at the same density throughout Europe, essentially reflect the same information, even if sampling at the very low sample density of one site per 2500 km2 (see also the discussion in Smith and Reimann, 2008). It thereby confirms that robust geochemical maps can be expected by this procedure.

In a way, it is surprising that the two different sample materials (Ap and Gr), collected from sites with dissimilar land use at diverse depths (0–20 cm vs. 0–10 cm), still deliver such comparable results. The differences between the sample materials, as outlined above, explain the scatter in the XY plots (Fig. 4). While the supplement contains all graphics for both sample materials, the spatially more representative Ap-sample set is used as the example in the following sections.

3.4. Mapping

One of the main aims of regional geochemistry is to visualise the spatial data structure on a map (Reimann, 2005). Different approaches to geochemical mapping are outlined in Reimann (2005) and Reimann et al. (2008). Reliable and informative geochemical maps are needed in


Fig. 2. Scattergram of measured element concentration versus the clr-transformed values of the same variable.


mineral exploration, in environmental studies, and for communicating relevant geochemical findings to environmental regulators and policy makers.

The map of SiO2 versus all other elements in Fig. 5 highlights the problem of working with compositional (closed) data, because it is dominated by a belt of very high SiO2 concentrations in northern central Europe (N-Germany, Poland, Baltic States). These high SiO2 concentrations are related to the occurrence of sandy, coarse grained soils in these areas. These soils are young (about 8000 years) and developed on the sediments (end moraines) of the last glaciation. They predominantly consist of quartz and feldspar, such that SiO2 concentrations in the soils reach over 80 wt.% (Fig. 5). Accordingly, in a closed system summing up to 100 wt.%, there is little space for all other elements to vary. This is directly reflected on the maps of the other elements in Fig. 5 (maps of all elements in both sample materials are found in the Supplementary material): low concentrations necessarily appear in the belt where high SiO2 prevails. Mapping the other elements does not deliver intrinsic information for these elements. According to Aitchison (1986), the information value of concentration data lies not in the measured values themselves but rather in the ratios between the variables. Although the geochemist intuitively is interested in the absolute concentrations of elements at any one location of a survey area, one has to question the intrinsic validity of single-element maps. This, of course, questions practically all classical geochemical maps.

The statistical problem from the point of view of compositional data analysis is: single-element maps predominantly deliver results predictable from other elements.

On the other hand, the geochemist argues that single-element maps convey a direct quantitative prediction of the expected element concentration at a specific location, and that they can be directly interpreted in terms of geology (occurrence of certain lithologies) and soil forming processes (occurrence of certain soil types). These two aspects are illustrated by the following examples (see Supplementary material for the maps not shown in Fig. 5):

Al2O3: Scandinavia is, with some exceptions, covered by soils with moderate to high Al2O3 concentrations. The Caledonides and a belt of clay-rich soils running from southern Finland into southern Sweden are marked by high Al2O3. The soils developed on glacial sediments in central northern Europe show uniformly low Al2O3 concentrations, while variation is high in central/southern Europe. Some granitic intrusions as well as the alkaline volcanic provinces (Italy) are marked by high Al2O3 values.

CaO: Even given the fact that one might expect severe disturbances of the natural geochemical patterns due to liming of agricultural fields, the map for CaO is certainly one of the most informative maps in terms of directly depicting geology. A clear break occurs between Scandinavia (moderate to high concentrations) and the rest of Europe, and in this case the break does not follow the glacial sediments but truly bedrock geology. In Scandinavia the Caledonides are marked by the highest CaO concentrations in the soils (note the continuation to Scotland and northern Ireland), and the


Fig. 3. Cumulative probability (CP) diagrams of selected variables for the two sample materials. Left-hand side: original data; right-hand side: clr-transformed data.


[Fig. 4, eight panels, Ap on the horizontal and Gr on the vertical axis: Al2O3, CaO, P2O5 and LOI in wt% (logarithmic axes), and clr(Al2O3), clr(CaO), clr(P2O5) and clr(LOI).]

Fig. 4. Scattergrams for element concentrations and clr(Element) as determined in the two sample materials (Ap and Gr) from neighbouring sample locations. The line indicates a 1:1 relation and makes it easier to judge relative enrichment/depletion in one of the sample materials.

Fig. 5. Maps of element concentrations in agricultural soils (Ap-horizon, 0–20 cm, <2 mm) for four selected elements based on the original data. Classes for the symbols are based on percentiles (0, 5, 25, 75, 95, and 100).


Fennoscandian Shield mostly by moderate concentrations. The glacial sediments return a very uniform signal of low CaO values. In central/southern Europe variability is higher: exceptionally high CaO concentrations mark many of the areas underlain by limestones, dolomites and marble, while areas underlain by granites return uniformly low CaO values.

Fe2O3: The map in general appears rather noisy, with high and low concentrations occurring in close proximity. Clear patterns are shown by low Fe2O3 concentrations in soils on top of glacial sediments in central/northern Europe, and uniformly high Fe2O3 values over much of south-eastern Europe.

K2O: Again a disturbance of natural distribution patterns by input from fertilisers must be expected. The map for K2O is in fact rather noisy. However, many areas underlain by granitic rocks are marked by unusually high K2O concentrations (e.g., northern Spain/Portugal, Massif Central, Bohemian Massif). The alkaline volcanic rocks in southern-central Italy are also marked by high K2O values. The soils developed on top of the glacial sediments in northern central Europe (but note the Baltic States!) give uniformly low K2O concentrations. Soils in Sweden, southern Finland and the Baltic States show in contrast rather high K2O concentrations. In southern Sweden and Finland these high concentrations are often due to the occurrence of clay-rich soils. The effects of agricultural practice on these agricultural soils are thus not immediately deducible from the map.

LOI: A high value for loss on ignition (LOI) can either be related to a high content of organic matter or to the occurrence of calcareous soils. It is thus not surprising that many high values are observed in parts of Scandinavia (organic matter) and in southern Europe (calcareous soil parent material), while soils developed on top of glacial sediments in northern central Europe are marked by uniformly low LOI values.

MgO: The map is separated into three large zones: (1) predominantly high values in Scandinavia (with local exceptions, e.g., on top of the sediments in Skåne in southernmost Sweden), (2) a belt of low values on top of the glacial sediments in northern/central Europe continuing towards northern France, Spain and into Portugal and (3) high values over south-eastern Spain, southern France, Italy and most of south-eastern Europe. Local highs can be related to the occurrence of ophiolites (Greece) or dolomites (Spain, Italy).

MnO: The map of MnO is rather noisy; one of the few clear features is the low MnO concentration in soils developed on top of the glacial sediments in northern/central Europe. Low values also prevail over western France and all of Spain, while much of south-eastern Europe is marked by high MnO concentrations.

Na2O: The map of Na2O shows a clear and striking pattern: high values over all of Scandinavia (Baltic Shield and Caledonides), high values also on top of the Caledonides in Scotland and low values over much of south-western and central northern Europe, including the glacial sediments. Values in the median range (with local exceptions, e.g. much of Hellas with high values) mark large parts of south-eastern Europe. The exceptionally high Na2O concentrations in northern Scandinavia indicate an important crustal boundary.

P2O5: This is another element where one might expect a major influence from the input of fertilisers to the agricultural soils. The map is in parts certainly noisy; however, the mostly high P2O5 concentrations throughout Scandinavia and Scotland are an outstanding feature and clearly linked to natural phenomena. These high values are partly due to the organic rich soils developed in the northern wet and cold climate and partly due to the occurrence of rocks rich in apatite. Soils developed on top of the glacial sediments in northern central Europe show uniformly low P2O5 concentrations. Local highs occur throughout the map in connection with some granitic intrusions and the volcanic rocks in Italy.

SiO2: The coarse grained glacial sediments in central/northern Europe bring about exceptionally high SiO2 concentrations. Values of up to >95% SiO2 indicate that these soils are in part almost pure quartz sands. Most of the northern European soils, and large parts of southern European soils, returned comparatively low SiO2 values (minimum values well under 5% SiO2). The reason for “low SiO2” in the soil samples is not directly discernible on the map: the widespread occurrence of carbonates in the European south and of soils with high organic carbon content in the north can result in both cases in exceptionally low SiO2 concentrations.

TiO2: In Scandinavia an east–west gradient is observed in terms of TiO2 concentrations, with the highest values occurring over the Caledonides in Norway. The soils on glacial sediments in northern central Europe are marked by uniformly low TiO2. The large area underlain by calcareous sediments in eastern Spain also corresponds to a large area with low TiO2 concentrations in soil. High TiO2 concentrations prevail in many of the soils from south-eastern Europe.

Maps for the Ap and the Gr samples are, with some local exceptions, very well comparable (see Supplementary material). The geochemist will thus argue that all these maps are interpretable and deliver important and useful information. From the viewpoint of compositional data analysis, the maps of most elements add very little new, independent information after studying the spatial distributions of SiO2, LOI and CaO.

From this angle, it is more effective to plot maps for the clr-transformed variables, i.e. a map of the logratio of the investigated element to the geometric mean of all elements measured. As in the above case of the tabulation of statistical distribution measures, a map of a clr-transformed variable does not express direct information about the concentration of the investigated element at any one point within the map. Instead, it represents a relative abundance of the element with respect to the geometric mean of all elements measured.

The comparison of such “clr(Element)” maps, as shown in Fig. 6, with the corresponding maps in Fig. 5 demonstrates that for several elements (e.g., SiO2, CaO, MgO) the mapped patterns (high versus low) remain almost the same, while for some other elements (e.g., K2O, P2O5) the patterns on the map change significantly from element to clr(Element). For example, it suddenly becomes visible that relative to the geometric mean these two major elements are actually highly abundant in the glacial sediments of northern central Europe. Their low absolute concentration in soil, which is emphasized by Fig. 5, is primarily due to the high concentrations of SiO2 in these soils (also known and discussed in the geosciences as “the quartz dilution effect”; see, e.g., Bern, 2009). Using the geometric mean as normalizer effectively suppresses the predominant SiO2 signal, and therefore in these cases the two clr(Element) maps deliver completely different information. Arguing from the compositional data side it is possible to state that the “low K2O and P2O5” information in the concentration maps is not the real information, because it is forced by the high SiO2 concentrations, and that one thus should not even bother to present single-element maps, but rather directly move on to multivariate data analysis.
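The quartz dilution effect described above can be reproduced with a small numerical sketch. The starting composition and the mixing proportion (one part soil to three parts pure SiO2) are hypothetical and chosen only for illustration; `close_to_100` and `clr` are helper names introduced here.

```python
import math

def close_to_100(parts):
    """Rescale parts so that they sum to 100 wt.% (closure)."""
    total = sum(parts.values())
    return {k: 100.0 * v / total for k, v in parts.items()}

def clr(parts):
    """Centred logratio (base 10) of a composition given as a dict."""
    logs = {k: math.log10(v) for k, v in parts.items()}
    mean_log = sum(logs.values()) / len(logs)
    return {k: v - mean_log for k, v in logs.items()}

# Hypothetical soil in wt.% (illustrative values only)
soil = close_to_100({"SiO2": 60.0, "Al2O3": 20.0, "CaO": 15.0, "K2O": 4.0, "P2O5": 1.0})

# Dilute: mix 1 part soil with 3 parts pure quartz (SiO2), then close again
diluted = dict(soil)
diluted["SiO2"] += 300.0
diluted = close_to_100(diluted)

for name in ("K2O", "P2O5"):
    print(name, round(soil[name], 2), "->", round(diluted[name], 2), "wt.%")
print("K2O/P2O5:", soil["K2O"] / soil["P2O5"], "->", diluted["K2O"] / diluted["P2O5"])
print("clr(K2O):", round(clr(soil)["K2O"], 3), "->", round(clr(diluted)["K2O"], 3))
```

In this toy case the K2O and P2O5 concentrations fall to a quarter of their original values although nothing happened to either element, their ratio is unchanged, and clr(K2O) shifts far less than the log concentration does. The clr value is not fully invariant, because the geometric mean still contains SiO2.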

In contrast, the geochemist will argue that the “really” measured K2O concentrations are low in these areas, which is important information in itself, e.g., for assessing soil quality or fertility. Moreover, the patterns of some elements, like CaO and MgO, on both maps did not even change. This information would definitely be lost without mapping the concentration of the single elements first. In addition, practically all research in regional geochemistry during the last


Fig. 6. Maps of the regional distribution of the clr-transformed variables for which concentration maps are presented in Fig. 5. Classes chosen for mapping are based on percentiles (0, 5, 25, 75, 95, and 100).


100 years is based on, and discussed in terms of, absolute element concentrations, and not in terms of clr ratios. For example, all action levels for soils are set based on element concentrations. Following a clr-transformation one is studying abstract ratios instead. In the end, it appears that both maps are needed to understand the processes governing the distribution of the chemical elements in space. The classical concentration map is indeed needed for many practical applications; the CoDa logratio map will obviously deliver truly new additional information in some, though not all, cases. It is not possible to extract this additional information from single-element distribution maps.

3.5. Correlation versus stability, scatterplot matrix, XY diagrams

When analysing soil samples for their total element concentration, increasing the dominant element SiO2 must decrease the sum of the other oxides (and actually all elements in the sample) by virtue of the smaller space remaining for them within the 100% total. It was in correlation analysis and correlation-based methods that the closure problem with compositional data was first detected and discussed (Pearson, 1897; Chayes, 1960), often under the name “spurious correlation”. Thinking in terms of correlation, the requirement is that the correlation of two variables must not be affected by the influence of other variables. This requirement is never met when dealing with geochemical data. Even if only two elements are analysed, all other elements in the composition influence the measured concentrations of these two elements. Therefore, any method based on the absolute concentrations of the two parameters studied (like correlation) is bound to fail. When dealing with geochemical data, the measured concentration of any one element in a sample always depends on the concentrations of all other variables, and multivariate information rather than the bivariate information needs to be considered. When studying an XY diagram of absolute concentrations, it must always be kept in mind that there exists a multitude of further variables which influence the pattern visible in any one bivariate plot.
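Spurious correlation induced by closure can be demonstrated with simulated data. The sketch below is an illustration under stated assumptions, not a re-analysis of the GEMAS data: three independent lognormal "amounts" are drawn (one dominant and highly variable, two minor), the distribution parameters and the seed are arbitrary, and `pearson` is a hand-written helper.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)
n = 2000
# Independent amounts: one dominant, highly variable part and two minor parts
dominant = [random.lognormvariate(math.log(5.0), 1.0) for _ in range(n)]
minor_a = [random.lognormvariate(0.0, 0.1) for _ in range(n)]
minor_b = [random.lognormvariate(0.0, 0.1) for _ in range(n)]

# Closure: express every part as a percentage of the sample total
totals = [d + a + b for d, a, b in zip(dominant, minor_a, minor_b)]
closed_d = [100 * d / t for d, t in zip(dominant, totals)]
closed_a = [100 * a / t for a, t in zip(minor_a, totals)]
closed_b = [100 * b / t for b, t in zip(minor_b, totals)]

r_raw = pearson(minor_a, minor_b)      # independent by construction
r_closed = pearson(closed_a, closed_b)  # positive, induced by closure
r_dom = pearson(closed_a, closed_d)     # negative, forced by the dominant part
print(f"minor parts before closure: r = {r_raw:.2f}")
print(f"minor parts after closure:  r = {r_closed:.2f}")
print(f"minor vs dominant part:     r = {r_dom:.2f}")
```

The two minor parts are unrelated before closure, yet after closure they appear positively correlated with each other and negatively correlated with the dominant part, simply because all three share the same varying total.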

Though they are a commonly used tool in geochemical data analysis of compositional data, XY plots should never be interpreted in terms of correlations. Fig. 7 (upper right) shows a scatterplot matrix for the log-transformed data of the Ap dataset. The boxplots of the lower triangle correspond to logratios of the variable in the column divided by the variable in the row. A scatterplot matrix, instead of a single XY diagram, shows the interactions of all the elements in the composition. The correlation between two elements neglects the effect of all the other elements in the composition on the concentration of these two elements. For example, an increase of SiO2 will automatically lead to a decrease in the concentration of most other elements (see Fig. 7). In classical statistics this would be interpreted as a negative correlation, while in fact it is an artefact of the closed data structure and the dominance of the SiO2 variation. Thus, the whole concept of correlation in XY plots does not make sense when dealing with compositional data (Aitchison, 1997; Filzmoser et al., 2010).


Fig. 7. Scatterplot matrix for the log-transformed element concentrations (upper right). The dashed line indicates a constant ratio between the two elements (corresponds to the median of the logratio), and the parallel solid lines indicate a ratio that is twice/half as high. In the lower left part boxplots of the logratios of the element pairs are shown together with a coefficient of stability, which varies between 0 and 1. High stability is indicated by values >0.9. Which of the two elements in the pair dominates the composition is shown by the location of the boxplot in relation to the dashed line (above or below). The black symbols indicate the samples from the sandy soils (end moraines) in northern central Europe.


A more surprising, and probably even more important, artefact is visible in Fig. 7. The dominant concentrations of SiO2 clearly outweigh the concentrations of most other elements (see Table 2, Fig. 7). This not only forces a negative relationship between the other elements and SiO2, but it can also induce artificial positive correlations among the less dominant elements, when interpreted in terms of classical statistics. In the context of compositional data, only the ratios between the elements contain the relevant information. A simultaneous decrease or increase of a pair of variables is not of interest, but rather the stability of their ratio. Fig. 7 (upper right) contains this additional information: the dashed line indicates the constant ratio (corresponds to the median of the logratio), and the parallel solid lines indicate a ratio that is twice/half as high. The closer the data points follow this band, the more stable is the ratio.

In the lower left part of Fig. 7, boxplots of the logratios of the element pairs are shown together with a coefficient of stability, which varies between 0 and 1 (Filzmoser et al., 2010). This coefficient is computed as exp(−var(z)), with z = (1/sqrt(2))·log(x_i/x_j). Here, x_i and x_j are the parts i and j of the composition. Thus the coefficient is a measure of the variance of the logratio of two compositional parts. If the variance is low, the logratio is stable, and the resulting coefficient is close to 1. High instability leads to high variance and thus to a coefficient close to 0. As an example, for Al2O3/TiO2 (lower left) a high (>0.9) coefficient of stability of 0.96 is shown. This indicates that this logratio is stable, i.e. all samples have a comparable logratio. This is also visible in the corresponding plot in the upper right corner of the diagram, where the points follow the dashed line of the median ratio. The boxplots indicate which of the two variables is more dominant in the composition. The median of a boxplot above the stipulated line (0) indicates that the first element in the ratio is more dominant (in the example Al2O3 is much more dominant than TiO2).

Fig. 8 shows a plot of one selected pair of log-transformed variables from Fig. 7 (SiO2 and K2O). In terms of correlation, there exists no obvious and strong relationship between SiO2 and K2O (Pearson correlation coefficient 0.34). In terms of CoDa, only the ratio between the two variables is of interest. The median ratio SiO2/K2O of 33.6 corresponds to the dashed line shown in the XY plot. The band indicated by the solid lines contains all ratios between 16.8 and 67.2, in total 84% of the data, resulting in a coefficient of stability of 0.91 (shown on top of the boxplots in Fig. 7; see Filzmoser et al., 2010). Although the Pearson


Fig. 8. Scatterplots for SiO2 versus K2O and MgO versus Na2O (see Fig. 7). In the first case the logratio is highly stable (coefficient of stability: 0.91); in the right diagram it is unstable (coefficient of stability: 0.65). Note that the two variables are plotted using a log scale.


correlation is rather low, the stability is very high. For comparison, the pair MgO versus Na2O delivers a comparable Pearson correlation of 0.37 while the band indicated in the plot (Fig. 8) contains only 55% of the data and consequently the coefficient of stability is only 0.65 (Fig. 7). The results for SiO2 and K2O indicate that, although not visible when studying the plot in terms of a correlation, there exists in fact a close relation between these two variables, while the relationship between MgO and Na2O is and remains weak. XY plots are usually shown to study the relation between the two variables plotted. Correlation, however, is definitely the wrong concept for quantifying relationships between two variables from compositional data.
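The coefficient of stability quoted in the text, exp(−var(z)) with z = log(x_i/x_j)/sqrt(2), can be sketched in a few lines. Two points are assumptions of this sketch: the natural logarithm is used, and the classical sample variance is taken, whereas the original computation may rely on a different (for example robust) variance estimate. The concentrations are hypothetical.

```python
import math
import statistics

def stability(x_i, x_j):
    """exp(-var(z)) with z = log(x_i / x_j) / sqrt(2), as quoted in the text."""
    z = [math.log(a / b) / math.sqrt(2) for a, b in zip(x_i, x_j)]
    # Sample variance with natural log: both are assumptions of this sketch
    return math.exp(-statistics.variance(z))

# Hypothetical concentrations in wt.% (not GEMAS data)
sio2 = [70.0, 55.0, 80.0, 30.0, 62.0]
k2o = [2.1, 1.6, 2.5, 0.9, 1.8]    # SiO2/K2O stays roughly constant
cao = [0.5, 12.0, 0.3, 30.0, 2.0]  # SiO2/CaO varies over orders of magnitude

print("SiO2/K2O stability:", round(stability(sio2, k2o), 3))
print("SiO2/CaO stability:", round(stability(sio2, cao), 3))
```

A pair whose ratio is nearly constant across samples scores close to 1 regardless of how the two concentrations scatter individually, which is the distinction from the Pearson correlation drawn in the text.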

XY plots can still be used with care in the sense of truly exploratory data analysis (EDA; Tukey, 1977) to detect unusual behaviour between two variables, or groups of samples with a different behaviour that may be indicative of different processes, as long as it is kept in mind that there exist many further variables that influence the visible diagram. The left diagram in Fig. 8 suggests a splitting of the samples into different groups. There exists at least one group of samples where both SiO2 and K2O have unusually low concentrations. In fact, two groups of soil samples are hidden behind this trend (with a constant ratio!): (i) soils developed on the calcareous sediments, dominated by CaO (see Table 1, some soils can contain over 50% CaO) in combination with a high LOI (CO3), that are prominent in southern Europe, and (ii) soils with an unusually high amount of organic material (indicated by high LOI but low CaO), typical for parts of northern Europe. In both cases, all other elements must decrease in view of CoDa. In contrast, the samples in Fig. 8 with very high SiO2 concentrations and decreasing K2O concentrations are the sandy soils of the end moraines of the last glaciation in northern central Europe, where SiO2 concentrations of over 90% leave no room for other elements. It is possible to deduce that there exist samples with very low SiO2 and K2O and that a decrease of SiO2 is accompanied by a decrease in K2O. However, because the decrease is forced by other elements (CaO and LOI) it would be incorrect to suggest that there exists a direct relation between SiO2 and K2O. A correct evaluation of the relation between the elements is only possible when looking at the ratios: in the case of the chosen example, samples with low concentrations have about the same ratio as the majority of samples with higher concentrations, and the relation is in fact strong.

3.6. Exploring new patterns in the clr(Element) maps

Several of the clr(Element) maps (Fig. 6) showed distribution patterns for the ratios that were comparable to the element distribution maps (Fig. 5), while some maps displayed very different patterns. The spatial distribution displayed on the maps of P2O5 (Fig. 5) versus clr(P2O5) (Fig. 6) is one example showing a large difference. The concentration map showed low P2O5 concentrations on the band of sandy soils dominating northern central Europe (the end moraines of the last glaciation), characterised by exceptionally high SiO2 concentrations (leaving no space for other elements to vary). Without intricate knowledge and understanding of the whole dataset it is close to impossible to determine the reason for the different spatial distribution observed on the clr(P2O5) map. These sandy soils from northern central Europe are highlighted in Fig. 7, the scatterplot matrix, in black. The problem is that the clr(P2O5) map is influenced by the ratios to all other elements. What, then, is the reason for the high ratios (relative dominance of P2O5 in the composition) in this area? It is possible that the effect is caused by an increase of P2O5 or a decrease of all or any of the other elements. Studying the location of these soils in the scatterplot matrix in Fig. 7 it is clear that soils with low to moderate P2O5 concentrations are affected. The plot can now be used to investigate where these soils deviate from a constant ratio, indicated by the grey band in the graphic. The strongest deviation is visible in the plot P2O5 versus MgO, where MgO is depleted relative to P2O5. Smaller deviations in the same direction are also visible for CaO and Fe2O3. Fig. 9 shows two maps, the regional distribution of the logratios P2O5/MgO and P2O5/CaO. Here it becomes obvious that the effect is caused by a relative enrichment of P2O5 especially with respect to MgO. The Ca–Mg–P–(Fe) ratios are important for plant fertility and production and these patterns may thus be the first clear indications of the effects of agricultural practice (use of fertilisers) on the studied soils. Note that simple logratio maps as shown in Fig. 9 are true representations of the data even in the CoDa sense.
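Why a simple logratio is easier to trace back to its cause than a clr value can be shown with a short sketch. It compares one hypothetical composition (illustrative values only) with the re-closed subcomposition of CaO, MgO and P2O5; the helper names `close` and `clr` are chosen here for illustration.

```python
import math

def close(parts):
    """Rescale parts so that they sum to 100 (closure)."""
    total = sum(parts.values())
    return {k: 100.0 * v / total for k, v in parts.items()}

def clr(parts):
    """Centred logratio (base 10) of a composition given as a dict."""
    logs = {k: math.log10(v) for k, v in parts.items()}
    mean_log = sum(logs.values()) / len(logs)
    return {k: v - mean_log for k, v in logs.items()}

# Hypothetical sandy soil in wt.% (illustrative values only)
full = close({"SiO2": 85.0, "Al2O3": 6.0, "CaO": 0.6,
              "MgO": 0.3, "P2O5": 0.15, "LOI": 7.95})
# The same sample, but only three parts retained and closed again
sub = close({k: full[k] for k in ("CaO", "MgO", "P2O5")})

lr_full = math.log10(full["P2O5"] / full["MgO"])
lr_sub = math.log10(sub["P2O5"] / sub["MgO"])
clr_full = clr(full)["P2O5"]
clr_sub = clr(sub)["P2O5"]

print("log10(P2O5/MgO):", round(lr_full, 3), "vs", round(lr_sub, 3))
print("clr(P2O5):      ", round(clr_full, 3), "vs", round(clr_sub, 3))
```

The pairwise logratio is identical in both cases, while clr(P2O5) depends on which other parts enter the geometric mean. This is consistent with the observation that a clr map reflects the ratios to all other elements, whereas a map of a single logratio isolates one pair.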

3.7. Multivariate data analysis

Many of the above discussed problems with data closure can be overcome in multivariate data analysis, when the data are automatically clr- or ilr-transformed. However, note that variables with poor data quality must be recognised and removed before any such transformation is carried out, otherwise they can dominate the result. A classical approach to PCA in geochemistry has been to log-transform all data, and to carry out the PCA with the log-transformed values. Fig. 10 shows the robust biplots for the log-transformed data (left), and the results for the conceptually correct version (right), based on ilr-transformed data, and back-transformed to the clr-space for a biplot representation (Filzmoser et al., 2009b). Fig. 10 demonstrates that the log-transformation is not sufficient, because the first principal component (PC1) is driven by the dominating (in terms of absolute concentration) variable SiO2. In contrast, for the clr-transformed values, where relative information is considered, the biplots open up (see Filzmoser et al., 2009b). The biplots for PC1 versus PC2 and PC1 versus PC3 for the log-transformed data clearly demonstrate the closure problem: with the exception of SiO2 all other elements draw into one direction.


Fig. 9. Maps of the P2O5/MgO and the P2O5/CaO ratio in the agricultural soils of Europe.

208 C. Reimann et al. / Science of the Total Environment 426 (2012) 196–210

The true data structure and the relations between the variables are only visible when the data are clr-transformed. Then suddenly the results are no longer dominated by SiO2.
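The two PCA variants compared above can be sketched as follows. This is a minimal illustration under stated assumptions: the compositions are synthetic, the PCA is the classical SVD-based version rather than the robust estimator used for the biplots, and the helper names `clr` and `pca` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic four-part compositions (rows sum to 1); illustration only.
raw = rng.lognormal(mean=[3.0, 1.0, 0.0, -1.0], sigma=0.5, size=(200, 4))
comp = raw / raw.sum(axis=1, keepdims=True)

def clr(x):
    """Centred logratio transformation applied row by row."""
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)

def pca(z):
    """Classical (non-robust) PCA via SVD.

    Returns the loadings (rows of vt) and the explained variance ratios.
    """
    zc = z - z.mean(axis=0)
    _, s, vt = np.linalg.svd(zc, full_matrices=False)
    var = s ** 2
    return vt, var / var.sum()

load_log, ratio_log = pca(np.log(comp))   # the "classical" log-transformed PCA
load_clr, ratio_clr = pca(clr(comp))      # PCA on opened (clr) data

print("explained variance, log:", np.round(ratio_log, 3))
print("explained variance, clr:", np.round(ratio_clr, 3))
```

Because clr coordinates sum to zero in every row, the clr data matrix has rank D − 1 and the last component carries no variance. This is one reason why robust estimation is carried out in ilr coordinates and the results are only back-transformed to clr space for display.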

The biplots for the opened data in Fig. 10 (right) show a strong relation between SiO2, K2O, and P2O5 (sandy and feldspar rich soils). A weaker relation exists between Fe2O3, MgO, MnO, and TiO2 (mafic minerals, mica rich soils), while CaO takes a separate position (calcisols, e.g., in Spain), which is also visible in the scatterplot matrix (Fig. 7) in terms of low coefficients of stability in relation to all other elements. LOI is related to the occurrence of organic soils but is also relatively high in areas with calcareous soils.

4. Conclusions

Geochemical data are compositional and thereby closed data. Mathematically they define points in the Aitchison geometry on the simplex, and not in the usual Euclidean space for which all classical statistical methods are designed. For this reason, all calculations which explicitly or implicitly are based on Euclidean distances give misleading results. This already includes simple calculations, like computing the mean or standard deviation. Moreover, it can also affect simple visual assessments, like detecting groups in the data or suggesting patterns of linear relation, both of which are intimately linked to the Euclidean distance. All such graphics should thus only be used and interpreted with care. For the multivariate case, appropriate transformations from the simplex to the usual Euclidean space overcome the problems of data closure.
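As a minimal sketch of what replaces the Euclidean distance and the arithmetic mean on the simplex, the following uses the standard Aitchison distance (Euclidean distance between clr vectors) and the compositional centre (closed geometric mean). The three-part compositions and all function names are assumptions made for illustration.

```python
import numpy as np

def closure(x):
    """Rescale parts so that each composition sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

def aitchison_distance(x, y):
    """Distance on the simplex: Euclidean distance between clr vectors."""
    return float(np.linalg.norm(clr(x) - clr(y)))

def centre(X):
    """Compositional centre: closed geometric mean of the rows."""
    return closure(np.exp(np.log(X).mean(axis=0)))

# Hypothetical three-part compositions (illustration only).
X = closure(np.array([[80.0, 15.0, 5.0],
                      [60.0, 30.0, 10.0],
                      [90.0, 9.0, 1.0]]))

d_fraction = aitchison_distance(X[0], X[1])
d_rescaled = aitchison_distance(1e4 * X[0], 1e4 * X[1])  # same samples, other units

print("Aitchison distance (unit-free):", round(d_fraction, 4), round(d_rescaled, 4))
print("arithmetic mean:     ", np.round(X.mean(axis=0), 4))
print("compositional centre:", np.round(centre(X), 4))
```

The distance does not depend on the reporting unit, and the centre generally differs from the arithmetic mean of the parts.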

The problem with closed data is that they are multivariate by definition. A uni- or bivariate analysis of such data must always keep that multivariate structure in mind. From a practical point of view, however, geochemists are interested in the statistical and spatial distribution of single variables, e.g., elements or compounds. They also wish to use the correlation between two variables to draw conclusions about geochemical processes. Although the naïve application of uni- or bivariate analysis neglecting the closure problem is probably wrong, experience shows that in many cases this approach leads to interpretable and meaningful results, probably because the interest is often driven by high absolute concentrations of the measured variables and not by the relative contributions of the elements to the whole composition. Relative contributions are valuable because they will provide a deeper insight into the multivariate structure of the data.

Opened variables can differ widely from their closed counterparts. Whether there will be a difference or not does not depend on the concentration range. It is a misconception that closure is only a problem for elements with very high concentrations like SiO2 and Al2O3. Elements with very low concentrations can and will be seriously affected by the relative information of other variables, which is invisible in uni- or bivariate plots.

In the case of the GEMAS dataset, simple “classical” univariate element distribution maps of the major elements in European continental soils can be interpreted in terms of the main geological structures. Most prominent are a band of high SiO2 concentrations marking the end moraines of the last glaciation in northern central Europe, and high concentrations of CaO in the areas underlain by limestones, marls and dolomites (mostly in southern Europe), while uniform moderate CaO concentrations are characteristic for the Fennoscandian Shield. Many of the large granitic intrusions in Europe are related to high K2O values. Soils developed on greenstone belts show high MgO concentrations.

When mapping the opened, clr-transformed variables, the regional distribution does not change substantially for some elements, e.g., CaO and MgO; for other elements, like K2O and P2O5, the patterns change significantly. The sandy soils in northern central Europe suddenly show high ratios of K2O and P2O5 in relation to the geometric mean of all elements; the low absolute values observed on the concentration maps are solely forced by the overall high SiO2 concentrations. The new maps, however, are no longer concentration maps but rather show the regional distribution of a dimensionless ratio. The two maps, (1) original element concentration versus (2) clr-transformed element, deliver qualitatively different information: (1) the concentration of the studied elements at any location on the map and (2) the information whether the measured value is high or low in relation to the geometric mean of all elements. It is worthwhile to study both maps.
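For reference, the quantity mapped in a clr map can be written out explicitly. The block below states the standard clr definition (Aitchison, 1986); the symbols $D$ (number of parts), $g(x)$ (geometric mean) and $c$ (an arbitrary positive scaling factor) are notation introduced here and are not taken from the text above.

```latex
\[
\operatorname{clr}(x)_i = \ln\frac{x_i}{g(x)}, \qquad
g(x) = \Bigl(\prod_{j=1}^{D} x_j\Bigr)^{1/D}, \qquad i = 1,\dots,D,
\]
\[
\operatorname{clr}(c\,x)_i = \ln\frac{c\,x_i}{c\,g(x)} = \operatorname{clr}(x)_i \quad (c > 0),
\qquad \sum_{i=1}^{D}\operatorname{clr}(x)_i = 0 .
\]
```

The first line shows why the mapped value is dimensionless: it is the logarithm of a ratio of two concentrations. The second line shows that the value is unchanged when all parts are rescaled by a common factor, and that the clr values of one sample always sum to zero.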

Classical correlation analysis should not be carried out on compositional data in XY plots. These may only be used with great care in an EDA sense to detect unusual data behaviour or subgroups of samples. The relation between all elements can be evaluated when adding information about the stability of a ratio in a scatterplot matrix. Used in an exploratory sense with indicated subgroups it is a powerful tool to understand complex multivariate relations in two dimensions.
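One simple way to quantify how stable a ratio is can be sketched with the variance of the pairwise logratios (the variation matrix of Aitchison, 1986). This is a stand-in chosen for illustration: it is not necessarily the stability coefficient used in the scatterplot matrix of this paper, and the data and the function name `logratio_variance` are hypothetical.

```python
import numpy as np

def logratio_variance(X):
    """Variation matrix: T[i, j] = var(ln(X[:, i] / X[:, j])) over all samples.

    A value near zero means the ratio of parts i and j is nearly constant.
    """
    logx = np.log(np.asarray(X, dtype=float))
    diff = logx[:, :, None] - logx[:, None, :]   # all pairwise logratios per sample
    return diff.var(axis=0, ddof=1)

rng = np.random.default_rng(1)
a = rng.lognormal(0.0, 0.4, size=100)
# Synthetic parts: the second is exactly proportional to the first,
# the third varies independently (illustration only).
X = np.column_stack([a, 3.0 * a, rng.lognormal(0.0, 0.4, size=100)])

T = logratio_variance(X)
print(np.round(T, 3))
```

Logratios do not depend on the total, so the rows of `X` need not be closed for this calculation. A stable (proportional) pair yields a variance close to zero, while an unrelated pair yields a clearly positive value.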

For multivariate data analysis the effects of closure must be overcome by applying a suitable logratio-transformation (ilr-transformation). When carrying out principal component or factor analysis, the effect of opening the data is immediately visible on a biplot. Only opened data provide information about the true relationships between the variables, relationships that are independent of the total concentrations of the elements.
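An ilr transformation can be sketched with one particular orthonormal basis. The Helmert-type contrasts below are an assumption of this sketch (any orthonormal basis of the clr hyperplane would serve), and the function names are hypothetical. The example also shows the back-transformation from ilr to clr coordinates that is used for biplot display.

```python
import numpy as np

def clr(x):
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

def ilr_basis(D):
    """Orthonormal basis (rows) of the clr hyperplane, Helmert-type contrasts.

    Row i contrasts part i with the geometric mean of all later parts.
    """
    V = np.zeros((D - 1, D))
    for i in range(D - 1):
        k = D - i - 1                              # number of parts after part i
        V[i, i] = np.sqrt(k / (k + 1.0))
        V[i, i + 1:] = -1.0 / np.sqrt(k * (k + 1.0))
    return V

def ilr(x):
    """Isometric logratio coordinates: D parts map to D - 1 real coordinates."""
    x = np.asarray(x, dtype=float)
    return clr(x) @ ilr_basis(x.shape[-1]).T

# Hypothetical four-part compositions (illustration only).
x = np.array([70.0, 20.0, 8.0, 2.0])
y = np.array([40.0, 30.0, 20.0, 10.0])

V = ilr_basis(4)
z = ilr(x)
clr_back = z @ V            # back-transformation from ilr to clr space

print("ilr(x):", np.round(z, 4))
print("distance in ilr:", round(float(np.linalg.norm(ilr(x) - ilr(y))), 4))
print("distance in clr:", round(float(np.linalg.norm(clr(x) - clr(y))), 4))
```

The ilr coordinates have full rank (D − 1 columns), which is what covariance-based and robust multivariate methods require, while distances between samples are the same as in clr space.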

Acknowledgements

The GEMAS project is a cooperation project of the EuroGeoSurveys Geochemistry Expert Group with a number of outside organisations (e.g., Alterra in The Netherlands, the Norwegian Forest and Landscape Institute, several Ministries of the Environment and University Departments of Geosciences in a number of European countries, CSIRO Land


−20 −10 0 10 20 30

−20

−10

010

2030

PC1 (61.5%)

PC

2 (1

4.4%

)

++ ++++ +++

++++

+ ++

+

+++ ++

+++

+

+

++

+

++

+

+ ++++

++

++++++

++

+

+ +++ ++

++

++ +++

+++

++++ +

+ +

+ +

+++ ++++

+

++ ++ ++

++

+

+ +++

+ +

+

++++

++

+

+

+++

++++ +

+ +

+

++

+++++

++++

+

+

+

+

+++

++

+

+ +

+

+

+

+++

+++

+

+ ++ +++

++

+++ +++

++

+ ++

+

++

++

++

++

++++ + ++ ++ +

+

+++

++++

++++

+++++

+

+++

+

++

++++

+ ++

+

+++

+++

+

+++ ++

+

+

++

+

+++

+

+

++++ +

++ ++

+

++++ ++++ +++

+++++

+

+

+++

+

+++ + ++ ++

+

+++

+++

++

++

+++

+ ++ ++++++

+

++

++++ + +

++

+

++++

++

++ +++

+++ +

++

++++ ++

++

+

+ ++ +

+

+

++ ++++++

+ ++++

+

+

++

+

++

+ ++

+ ++++

++++

++

+ ++++

+ ++ ++ +

+

++

+

++++ +++

+

+

++++

+

+

+++

+++++

+

++++

+ +++ +++

++

+

+

++

+

+ +

+

++

++++

+++

++ +

+ ++ ++ ++ +

+++++ ++

+++

+

+++

+

+++

+

+

+++++++

++++++

+

++

++++ +

+

++

++++ ++++ ++

++

++ +++ +

+

+

+

+

+

++ +++ +

++

++

++++

+

++

+

++++

+ ++

+

+ ++

+++++ +

++

+

+++ +

+

+ + ++++

++

++++

+

+

+++

+

+

++

+++

+

++ ++

+

+

+

++ +++ ++

+

++

++

+

+

+

+

++++

++ +

+

+++++++ +++

++++

+

+++ +

+

++

+ +++

++ ++ +

+

+++

+++

+

++

++ +

+

+++

+

++ +

++

++ ++

+ ++

++

+

+

++ ++ + ++

++++ +

+

++++

+

+

++ +

+ ++

+

+

+++

+

+

++

++++ +

++

++

++++ ++

+++

+++

+

+

+ +

+

+

++

+++++

++++

+ +++

++

+ ++

+++

+ +++

+

++

+

+

+++

+

++ +++

+++

+++

++++

+++

+ ++

+

++ ++

+

++++

+

+++

++

++++

++ ++

+

++ +++++

+

+

+

+++

++ +

+

+ ++ ++ ++

++

+

+++

+++

+ ++

+

++

+++

+

++ +++

++ +++ ++

+++++ ++

+++

+

+

++ ++

+

++

+

++

+

+

++

+ +++

+++ +

++++ +++ +

++

+

++ +

++

++

+++ ++ +

+

+++ ++

+

++

+++

++

+

+ ++ ++

++

+

++

++

++

+

++

+

++

++

+

++

+

+

+++

+ +++ +++

++++ + +++

+

+ +

+

+++ ++

+

+++

++

+

++ ++++

++++ +++

+

+ ++

+

++ ++

+ ++

++

+

++ +

+ +

+

++

+++++

+

+

+

+

+

++ + ++++

+

+++

+ ++++

+++ +++

++

+++

+

+

+ ++ ++++ ++ +++

+

+++ + +++ +++ +

++

+ +++

++ +++++++ +

+

+

+++

++++ +

+

+

+

+

++ ++++

+++++

++ +

+

++ ++

+ ++

++++

++

+

+++++

+ ++

+ ++++

++++

++

+

++

+++++

++

+

+

+

+ ++

+++++

+ +++ ++

+++

+ ++++ +++ +

++

+

++++

+++++

++

+++ ++

+

+

++++ ++

++

++ +++

+++

+

+++ +

+

++

++ +++ +

+

++

+ ++ ++ ++ +++ ++

+

++

+++

+

+ ++++

++

++

+++++

+

+++

+

+++

+

+ ++ ++++++ ++

+ ++

+++

+++

++ ++

++

+

+

+ ++++

+

+

++ +

+

++++ +

+

++++

+

+++++++

+

+

+++

+

++

+++

+

++++

+ +++

+

+

+++

+

+

++ ++++++

+++

+

+

+

++++ + +++

+

+

+

+

+++

+

++

++

+

++

++ ++ + ++

+++ +

+

++++++

++

+

+

+

++++

+

+

+++

+++++ ++

+

++++

+

++

+

++

+++

++

+

+++

++ ++++

++ ++ ++

+

++ ++ ++ +++

+++

+

+

++

+++ ++ ++

+ +++

+

++++

+

+++

+

+

+

+

+

+

+

++ +++

+++

+

+

++

+

+

++

+

+++ +++++

+

++

+

+++ ++

+

++ +

+

+

+

+

+

+

++

+

++

++ +++ + +

+

+

+

+++ +

+ +++

+++

+

+

+

+

+

++

+++

++ ++

+++

++ ++

+

+

++

++ +++

++

+

+

++

+ +++++

+

+

+

++++

++

+ ++

+

+++

+++

++

+

+

+

+++++

+

++

+

+

+

+

+++++ ++

++++ +

+

++ +

++ +++++

++++

+

+

++++

++

+

++

++++++++

++ +++++ ++

+ ++

+

+

++

++ +

+

++++

+

+

+ ++

+

+

+

++++

+++

++

++ +++

++

+

+

+

+

+ +++++

+

++ +++ +++

+

++

++ +++ ++++

+++

++

++

+

++

+++++

++

++++++ + +++

+

+ +

+

+ +++

++++ +++++

++ +

+

++

++

+ +++

+++

++

+

+

+

+

++

+

+++

+

+

+

+

+ ++++

++

+

++++ +

++

+

+

++ +++

++

+

++

+

+

++

++++

+

+

+ ++++

+

+

+

+ ++++ + ++++ +++ +

+++++

+++ +

+

+

+

++ +++ +++++ +

+++ ++ ++

+ ++

+

+

+

+

++

+++

++++

++

+

++++ ++

+

+++++

++

+

+

+

+

+

++ ++++ +

+

+

++

+++ +

+

+++ +++

+

+

+

+

+

+

+++++

++++

++

+++

+

+++

+

+ ++

+

+++++

+

++

−0.4 −0.2 0.0 0.2 0.4 0.6

−0.

4−

0.2

0.0

0.2

0.4

0.6

Al2O3

CaO

Fe2O3

LOI

K2O

MgO

MnO

Na2O

P2O5

SiO2

TiO2

PCA log − Ap

−5 0 5 10

−5

05

10

PC1 (43.6%)

PC

2 (2

2.3%

)

++

++

+

++

++

+

++

+

+

++

+

+

++

+

+

++

+

+

+++

+

+

++

+

+

+

+

++

+

+

++

+

+

++

+

+

+

+

+

+

++

++

+++

+ +

++

+

++

++

+

+

+

+

+

+

+

+ +

+

++

+

+

++

++ +

+

++

++

+ +

++

+

+

+

+

+

+

+

+

+

+++

+

+

++ +

+ +

+

+

+ +

+

++

+

+++

+

+

+

+

+

+

+

+

+

+

+

++

+

+

+

++ ++

+ +

+

++

+ +

+

+

+

+

+

++

++

+

+

+

++

+

+

+

+

+

+

++

+

+

+++

+

+ +

++

+

+

+

+

++

+++

+

+

+++

+

+

+

++

+

++

+

+

+

+

+

+

+

+

+

++

+

++

+

++

+

+

+

+

+

++

+

+

+

+

+

+

++

+

+

++

+

+

+

+

+

+

+

+

++

+

+

+

+++

+

++

+

+++

+

+

+

+++

+

+

+++

++ ++

+

+

+

+

+

++

+

+

+

+

+

+

+

+

++

+++

+

+

+

+

++

+

+ ++

+ +

+

+

+

+++

+

+

+

+

+

+

++

+

++++

+

+

+

+

+

+

+

+

+

+

++

+

+

+

+

+ ++

++

+ +

+

+

+++

+

++

+

+

+

++

+

+

+

+++

++

++

+

+

++ +

+++

++

++

+

+ +

+

+

++

+

+

+

+

+

+

+

+

+

+

++

+

+

+

+

++

++

++

+

+

++

+

+

+

++++

++ +

+

+

+

+

+ +

++

+

+

+

++

+++

+ ++

+

+

++

++++

+ +

+

+

+

++

+

+

+

+

+

+

+

+

+

+

++

+

+

+

+

+

+

+

+

+

+

+

+++

+

+

+

++

++ +

+

+

+

+

+

+

++

+

++

+

++

+

+

+

+++ ++

+

+

+

+

++

+

+

+

++

+

+++

+

+

+

+

+

+

++

+

+

+

+

+

+

+

+

+

+

++

+

++++

+

+

+

+

++

+

+

+

+

+

++++ +

++

+

+ ++

+

++

+

+

+

+

++

+

+

+

+

+ +

+

+

+ +

+

+

+++ +

+

+

+

+

+

+

+

+

+

+

+

+++

++

+

+

++

+

+

+

+

+

+

+

+

++++

+

+

++ +

+

+

+

+ +

+

+

+

+

++

+

+

+

+

+

++

+

+

++

+

+

+

+

+

+

+

++

++

+

+

+

+

++

++

+

+

+

+

+

++ +

+

+

+

+

+

+++

++

+

++

++

+

+

+

++

+

+

+

+

+

+

+

+

+

+

+

+

++

+

+

++

+

++

+

++

++

+

+

+

+

++

+

+ +

+

+

+

+

++

+++

+

++

++

+

+

+

+

+

+

++

+

+

++

+

++ +

+

++

+

+

+

+

+

+

+++

+

+

++

++ ++

+

+++

+ ++

+

++

+

+++

+

+

+ ++

+

+

+

+

+

+

+

+

+

++

+

++

+

+

+

+ ++

+

++

+

+

+

+

+

+

+

+

+

+

++

+

+

++

+

+

+

+

+

+

+

+++

+

+

+

+

+

+

+

++

+

++ +

+

+

+

+

+

+

++

+

+++

+

++

+

+

+

+

+

+

++++

+

++

+

+

+

+

+

+

+

+

+

+

+

++

+

+

+

+

+

+

++

++

+

+

+

+

++

+

+

+

+

+

+

++

+

+

+

+++ ++

+

+

+

+

+

+

+

+

+

+ ++

++

+ +

+

++

+

+

++

+

+

+

+

+

+

++

+

+

+

+

+

+

+

+

+++

++

+ +

++

+

+

+

+

+

+

+

+

+

+

+

+

+

+ +

+

++

+

+

+

+

+

+

+

++

+

+

+

+

+ ++ +

+

+

++

+

+

+

++

++

+

+

+

+

+

+ ++ +

+

+

++++

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

++

+

+

+

++

+

+ + +

++

+

+

+

+

+

+

+

+

+

+

+++

+

+

++

+

+

+

+

++ +

+

+

+

+

+

+ ++

+

++ +

+

+

+

+++

++

+ +

+

+ +

+

+

+

++

+++

+

+

+

+

+

+

+

+

++

+++

+ +

+

+

+

+

++

++

+++

+

+ +

+

++

+

++

+

++

+ +

++

+

++

++

+

++

+

+

+

+

+ +

++

+

+

+

++

+

+

+ +

+

+

+

+

+

+

++

+

+ ++

+

++

+

++++

++

++

+

+

+

++

+

+

++

+

+

++ ++

+

+ + +

+

+

+

++

+ +

+

+

+

+

+ +

+ +

+

++

+

+ ++

+

+

+

+

++

++

+

+

+

+

+

+ +

+

+

+

+

+

++++

+

+

+

++

+

+

+

+++

+

++

+

++

++

++

+

+

++

++

+

+ +

+

+ +

+

+

+

+

+

+

+

+

+

++

++

++

+ +

+

++

+

+

+

+

+

+

+ ++

+

+

+

+

+

++

+

+

+++

+

+

+

+

+

+

+

+

+ ++

+

++

+

+

+

+ +

+

+

+

+ ++

+

++

+++

++

+

+

+

+

++

++

+

+

++

+++

+

+

++

+

+

+

+

+++

++

++

+

+

+

+

+

++

+

+

+

+

+

+

+

+

+

+

++ ++

+

+ +

+

+

+

+

+

+ ++

++

+

+

+

+

++++

+

+

+

+

+

++

+

+

+++

+

+++

+

+

+

+

+

+

++

+

+

+

+++

+

+

++

++

++

+

+

+

+++

+

++

+

++

+

+

+

+

+

+

++

+

++

++

++

+

+

++

+

++

+

+

++

+

+

+++

+

+

+

+

+

+

+

++

+

+

+

++

+

+

+

+

+

+

+

++

+

+

+ +

+

+

+

++

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+ +

++

+

++

+

++

+

+

++

+

+

+

+

+

+

+

+

+

++

+

+

+

+

+

+

+

+

+ ++

+

+

+

+

++

+

+

++

+

+

+

++

+++

++

++

+

+

++

++

+++

+

+

+

+

+

++

+

++

+ ++

+

+

+ ++

++

+

+

+

+

+

+

+ +

+ +

+ +

+

+

+

+

+

+++ ++

+

+

++

+ +

+

+

+

+

+

+

+

+++++

++

+

+

+

+

+

++

+

+

+

+

+

+

+

+++ +

+

++

++

++

+ +

+

+

+

+

++

+

+

+

+

+

+ +

+

+

++

+

++

+

+

++

+

+

+++

+

+

+

++

+

+

+++

+

+

+

+

+

+

+

+ +++

+

+

+

++

+

+

++

+

+

+

++

++++ +

+

+++ +

++

+

+

+

+

+

+

+

++ ++

+

++

+

++

+ +

+

+ ++

+++

+

+

++

+

+

++

+++ ++++

+

++

+

+

+

+

+

+

++

+

+

+

++

+

+

+

+

+

+

+

+

++

+

+

+

+

+

++

+

+

+

++

+

+ +

++

+

+

+

+

+

+

++

+

+++

+

++

+

+

++

++

+ +

+

++

+

+++

+

+

+

+

+

++

+

++

+

+

++ +

+

+

+

+

+

+

+

+

+

+

+ +

+

+

+

+

++

+

++

+

+

+

+

+

+

+

+++

++

+

+

+

+

+

+

++

++

++

+

+++

+

+

+++

++

+

+

+

+

+

+

+

++

+

+

+

+

+

+

+

++

++

+

+

+

+

+

++

+

+

+

++

+

+++

+

++

+

+ +

+

++

++

++

+

+

+

+

+

+

+

+

++

+

++

+

+

+

++

+

++

+

++

−0.5 0.0 0.5 1.0

−0.

50.

00.

51.

0

Al2O3

CaO

Fe2O3

LOI

K2O

MgO

MnO

Na2OP2O5

SiO2

TiO2

PCA clr − Ap

−20 −15 −10 −5 0 5 10 15

−20

−15

−10

−5

05

1015

PC1 (61.5%)

PC

3 (6

.8%

)

++ ++

+

+

+

+++ ++++ +++

+++ ++ ++ ++

++

+

+++

+ ++

+

++ ++

+

+++

+++

+

+

+ +++ + ++

+

+++

+ ++++

++++ +

+ +

+

++ +++

+++

+

++ +

+++

++

+ +

++++

+

+

+++

+

+

++

+

+++

++++++ +

+

+

+

++++++

+++

+

+

++

+++ ++++ +

+

+

+

++

+

+++

++

++ +++

++ +

++

+++

++

++++

+

+

++

+ +

++

++ ++

+ +

++

+ +

+

+++

++++

+++

+ +

+

++

+

+

++

+

+

+++ +

++

+

++

+

+++ ++ +

+

+

+

+

+++

+

++

+

++

+ +

+

+++

+ +

+

+ ++

+

++

++ ++++

+++

+++++

++ ++++ ++++ ++ ++

+

++

++

+++

++

+

+ +++ +

+

++++

+

+

+

++

++++ ++

+

+

+

++++ ++ +

+ ++

++++

++

+

+

+++ +

++

+

+

+++ ++

+

+

+

+

+

++

+

+++

+++

+

+

+

+

+

++

+

+++ ++++++

++

+

+

+

++++ +

+++

++

+

+ +++ ++

+

+

++ +

+

+++ +

+

+

+

+++

++

++++ ++

+

++

+++

+

+ + +

+

+

+

+ ++

+

+ ++++ +

+

+

++

+ +

++ ++ ++

++

++++++ +

+ +

+

+

+

+++

+

++ ++

+

+++

+

+

+

+

+

++

+++

+

++ ++

+

+++ +++

++

++++

++

+ +

+

++

+++ +

+

+

+

++ +

+

+++ +++

+

+ ++ +

++

+++

+++

++ ++

+

++

++

+ +++ +

+

+

+

+++ +

++

+ ++++

+

+

+++

++

+

++

+ +

+

++

++

+

+ + + ++ +

+

+

++ +++ +

++

++

+

++ +

+

+

+ +++ +++

+

++

+++++ +++

++

++

+

++

++

+

++

+ +++ +

+

++ +

+

+++

+

++

+

++ ++ +

+

+++

+ ++ ++ +++

++

+

+++

++

++++

+

+ ++ +

+++

++

+

+++

+

+

++ +

+ ++

+

+

+

+

++

+

++

++++ ++

+++ + +++

++

++

+

+

+++

+

+

++

+

+

+

++

+++

+++++ ++

+++

++

++++

+ ++ ++

++

+

+

++

+

+

++

+++ ++

+

+ ++

++ +

+

+

+++ ++

+

++ ++

+

++++

+++

+

+++

+++

+++

++

+

+ ++

+

++

+

+

+

++

++

++

+

+ ++ ++

++ +

+

+

++

+++++ ++

+

++++

+

+

++

+++

+

+

++

++

+++ +

+

++

+ ++

++

+++

++

+

++

++

+

+

+

+++

++

+ +++

++ +++

++

+

+++

++

+

+++

+++++ +

+

+

+++

+

++

+

+++ +

+++

+

+ ++ ++++

+

++

++

++

+

+++ ++ ++

+

++

+

+

++ ++ +++ +++ ++

+

+

+ +++

+

++

+

+++ ++

+

+++ ++

+

+ + +++++

+++

+

+ ++ +

++

+

++

++

++

+

+

++

++ ++

+

+

++ +

++++

+

+

+++

++ + +

+

++

+

+ +++

++

+++ ++ ++ +

++

+++ ++

++

++++

+ ++ +

++

+

+++ +

+

+

++++

++++

+++ ++ +

++

+

+++

+

+

+

+ ++

++ ++ ++

++

+

+++

+++ +

++++ ++ +

+

++ +

++

++++++

++

+

+++++

+ ++ + +++ +

++

+++

+

+

++ ++++

+++

+

+

+

+ ++ +

+ +

+++++

+++

+ ++

+

++ ++

+++ ++

+

++ + ++

++++ ++

++ ++

++

+

+

++++ +

+

+

+

+ + +++

++

+

+ ++

+

++ +

+++ +

++ +

+

++

++++

++

+

+++ ++

+

++

+

++++

++++

+ +++ +++ + +

+

+++

+

++ +

+

+ ++ ++++++ ++

+ ++

++

++ +

+++

+

+

++

+

+

+ +++++

+

++

+

+

+ +++

+

+

+

+

++

+

+ +++++++

+

+

+ +

+++

+

++

+

+

+

+

++ + +

+

+

+

+++

++++ ++

+

+ ++

+

+

+

++

+

+

++

+ + ++

+ +

+

+

+

++ +

+

++

++

++

+++

+

++

++ + ++

++

+

+++++ ++

+

+

+++

+

++

++

+++++

++ +++ ++ ++

+

++

+

+++

+++

+

+

+++ ++ ++ +++ + +

+

+

+

+

++

++ +

+

++++

++

++

++

+++++ ++ +++

+

+

++++

+

+++

+

++

+

+

+

+

++

+++++

+

+

+

+

+

+ +

++

+

+++ ++

+++

+

++

+

+++++

+

++ +

+

+

+

++

++

++

+

+++

++

++ +

+

+

+

++++

+

++

+

++

++

+

+

+

+

+

+

+++ +

++

+

+

++

++ ++

+

+

++

++ +++ +++

++

+

+ +++++

+

+

+

++++ ++ + ++

+

++

+

+++

+

+

+

+

+

++++

++

++

+

+

+

+

+++++ +

++

+++

+

+

++ + ++ +

+++

+ +++++

++++

++

+

+

+++ +

+

+++++

++

+++++

++

+++

+

+

+ ++

+++

+

++ +++

+ +

+

++

+

++++

++

+

+++

+ +

++++

+

+

+

+

+ ++

++

++

+

+

+

+ + ++

+

+

++ +

+++ + +

+

+ +

++

+++

+

+

+

+

+

+++

++ ++

++++++ + ++

++

+ +

+

+ +++

++

++ ++++++

+ ++

++

++

+ ++

+ ++

+

++

+

+

++

++

+

++ +

++

+++ ++

+

+ ++

+

++++ + +

+ +

+

+

+

+ ++ ++

+

+

+++

+

+

++++ ++ + ++ +

+

+

+

+

+ +++

+

++

++

+++++

+++

+

++ +

+ +

+

+

+

++ ++

+ ++

++ +

++

+

+

++ +++ +

+

+

+

+

+

+ +

+

+++ ++

+ + ++

+++ + ++

+

++

+

++

++

+

+

+

+

+ ++ +++

+ +

+

+

+ ++

+++

+

++ + ++++

++

+

+

+

+++ +++++

+ +

++++

+

+++

+

+

+

+

+

++ ++

+

+

++

−1.0 −0.5 0.0 0.5−

1.0

−0.

50.

00.

5

Al2O3

CaO

Fe2O3

LOI

K2O

MgOMnONa2O

P2O5

SiO2

TiO2

PCA log − Ap

−4 −2 0 2 4 6

−4

−2

02

46

PC1 (43.6%)

PC

3 (1

3%)

+

+

++

+

+

++ +

+ +

++

+

+

+

++

+

+

+

+

++

++

++

++

+

++

+

++

+

++

+

+

+

+

+++

++

++

+

+

+

+

+

+

+ +

+

++

+

+

+

+

++

+

+

+

+

+

+

+

+

++

+++

++

+

+

+

+

+ ++

++

++

+

++

+

+

+

+

+

+

+

++

+

+

++

+

+

+ +

+

+

+

+

+

+

++

++

+

+

++

+

+

+

+

++

+

++

+

+

+

+

+

+

+

+

+

++

++

+

+

+++

++

++

+

++ ++

+

+

+

+

++

+

+

++

++

+ ++

++

+

++

+++

+ +

+

+

+

+

+

+

++ ++

+

+

+

+

+

+

+

++

++

++

+

++

+

++

++

+

++

++

+

+

+

++

+++

+

++

+

+

+

++

+

+

+

+

+

++

+

+

+

+

+

++

+

++

+

+

+ ++

++

+++

+ ++

+

++

+

+

++

+

+

+

+

+

+

++

++

+

++

+

+

+

+

+

+

+ +

+

++

+

+ ++

+

+

++

+++

++

+

++ ++

++

++

+

+

+

+

+

+

+

++

+

+

+++

+

++

+

+ ++

+

+

+

+

+

+ ++

+

++

++

+

+

+

+ +

+ +

+

+

+

+

++

++

+

+++ +

+

+

+ +

+

+

++

+

+++

+

+ +

+

+

+

++

+

+

++ +

+

+

+

+

+

+ +

+

+++

++

+

+

+

+

++

+

+

+

+

+

+

+

+

+

+

+

+

++

+++

+ +

+++

+

+

++

+

+

+

++

+

+

+

+

+

++

+

+

+

+

+

+

+

+

+

+

++

+

+

+

+ +

+ +

+

+ +

++

++

++

+

++

+

+

++

+

+

+

+

+

++ +

+

+

+

+

+

+

++

+

+

+

++

++

+

+

+

+ + +

++

+

+

+ +

+

+

+

+

++

+

+

++

+

+

+

+

+

+

+

+

+

++

+

+

++++

+

+

+

++

+

++

++

+

+

+

+

+

+

+

+

+

+

++

+

+

+

+

+

+

+

+

+

++

+

+

+

+

+

+

++

++

+

++

++

+ ++

+

++

+

++

+

+

+

++ +

++

+ +

++

+

+

++

++

+++

++

+

+ +

+

+

++

+

++

+

++

+

+

+

+

+

+

+

+

++

+

+

+

+

+

++ +

+

+

+

+

++

++

+

+++

++

+

+

+

+

+

+

+

++

+

+

+

+++ ++

++

+

+

+

+

+

+

++

+

++

+

+

+

++

+ ++

+

+

+

+

++

+

+

++ +

+

+

+

++

+

+

++

+

+

+

+

+

+ +

+

++

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

++ +

++

+

+

+

+

+

++ +

++

++++

+ +

+

++

++

+

+++

+

+

+

+

+

++

+

++

++

+

+

++

++

+

+

+

+

+

+ +

++

+

+

+

+

+

+

++

+

++

+

+

+

+

+

+++ +

+

+++

+

+ ++ +

+

+

++

+

+

+

+

+ +

+

++

++

+

++

+

+

+

++

+

+ +

+

+

+

+

+

+

+

+

+

++

+

+

+

++ +

+

+

+

+

++

+++

+

++

+

+

++

+

+

+

+

++

++ +

+

+

+

+

+++

++

+

+

+

+

+

++

+

++

++

+

++

+

++

+ ++

++

+

+

+

++

+

+

+

++

+

+

+

+

+

+

+

+

++

+

+

+

+

+++

+

+

++

+

++

+

+

+

+

+

+

+

+

+

+

+

++

+

++

++

++

++

++

+

+

++

+ +

+

+

++

+

+++

++

+

+

+

+

+

+

+

+

+

++ +

++

+

++

+

+

++

+

++

+

+

+

+

+

++

+

+

++

+

+

+++

+

++

+

+ +

+

+

++

++ ++

+

+

++

++

+

++

++

++

+

+

+

+

+

+

++

+

+

+

+

++

+

+ +

+

+

++

+ +

++ +

++

+

+

+

+

+

+

+

+

+

+

++++

+ +

+

++

++

+

++

++

+

+

+

+

+

+

+

+

+

+

+

++

++

+

+

+

+

+

+++

+

+

+

+++

+

+

+

+

+

+

++

+++

+

+

++

++

+

+

+

+

+

+

+

+

+

+

+

++

+ +

++

+

+

++

+

+

+

+

+

++

+

+

++

++

+

+

+

+

+

+

+

+++

+

+

+

+

+

+

+

+

+

++

+

++

+

+

+

+

+

+

+

+

+

+

+++ +

++

+

+

+

+

+

+

+++

+

+

++ +

+

+

+

+

+

+

++

+

++

+

+ ++

+

+ +

+

++ +

+

+

+ +

+

++

++

+

+

+

++

+

+

+

+

++

++

+

+

+

++

+

+

++

++

+

++

++

+

+

+ + +

+

+++

+

++

+

+

++

+

+

+

+

+

++

+

+ +

+

+

+

+

+

+++ +

++ +

+

++

+

++

+

+

+

++

+

+

++ +

+

+

+

+

++

++

++

+

++

++

+

+

+

++

+

+

+

+

+

+

+

+

++

+

+

+

+

+

+

++ +

++

+

+

+

++

+

+

+

+

+

++

+

+

++

+

+

+

+

+

+

+

+

+

+

+

+

+

+

++

+

+

+

+

++

+

+ +

+

+

+

+

+++ +

++

+

++

+ ++

+

+

+

++

+ +

+++

+

+

+

+

+

+

+

+

+

+

+

+

++

+

+

++

+

+

+

+

+

+

+

+

+

+

+ +

+

+

+

+

+

+

+

+ +

+

+

++

+

+

+

+

+

+

++

+

+

+ ++

+

+

++

+

+

+

+

++ ++

++

+

++

+

+

+

+

+

+

+

+

+

+

++

+++

+

+

+

+

++ ++

+

++

++

++

+

+ +

+ +

+

+

+

++

+

+

+

+

+

+

++

+

+

+

++

+

++

+

+

+

+++

++

++

+

+

+

+

+

+

+

+

+

+

+

+

+

+

+

++

+

++

+

+

+

+

+

+

++

+

+

+

+

++

++

+

+++

+

+

+

+

+

++

+

++ + +

+

+

+

+

+

+

+++

++

+

+

+

+++

++

+

+

+

+

++

+

+

+ +

++

+

+

+

++

++

++

+

+

+

+

+

+++

+

++

+

+

+

++

+

++

+

++

+

+

+

+++

+

++

+

+

+

+

+ +

+

++

+

+

++ +

+

+

+

+

+

+

+++ +

++ +

+

+

+ +

+

+

+++

++

+

+

+

++

+

++

+

+

+

+

+++

+

++

+++

+

+

+

+

+

+

+

++

+

+

+

++

++

+

++

+

++

++ +

+

+

+ ++

+

+ ++

+

+

+

+

+

+ +++

+ +

++

+

+

+

++ +

+

+

+

+

+

+

+ +

+

+

++

+ +

+++

+++

+

+ +

+

+

++

+

+

+

+

+

+

++ +

+

+

+

+

+

+

+

++

+

+

++

+

+ +

+

+ ++

+

++

++

+++ +

+ +

+

+

+

+

+

+

+

+

+

+

++

+

++

+

+

+

+

+

+

+

++

+++

+

+

++ +

+

+

+

+

+

++ +

+

+ +

+

+

+

++

++

+

+

+

+

+

+

++

+

+

++

+

+

++

+

+

+

++

+

+

+++

+

++

+

++

+

+

+

+

+++

+

+

+

+

+

+

+

+

+

++

+

+

++

+

+

+

+

+

+ ++

++

+

+

+

+

++

+ +

++

++

+

+

+

++

+ +

++

++

+

++

+

++

+

+

++

+

+

++

+

++

+

+

+++

++

−0.5 0.0 0.5 1.0

−0.

50.

00.

51.

0

Al2O3

CaO

Fe2O3

LOI

K2OMgO

MnO

Na2O

P2O5SiO2TiO2

PCA clr − Ap

−20 −10 0 10 20 30

−20

−10

010

2030

PC2 (14.4%)

PC

3 (6

.8%

)

+++++

++++++++ +++ +

++++++++

+

++ +

+ +++ ++

+

+ +++

+

++++++

+

+

++++++++

+++++ ++

++++

++++

+

+ +++++ +++

++++++ +

+ +++++++

+

+++

+

+

++

+

+++

+ +++++++

++

++++++++++

++ +++ ++ + +++

+

++

++++ ++

+++++++

+++++++

+++ ++++

++

++

++

+ ++++

+++++++

+

+ ++++ + ++++

++

+

+++

+

+++

+

+++++

++++

+

++ +++ +

+

+

+

+

++ +

+

+ +

+

+++ +

+

+ ++++

++++

+

+++++++++++++++ ++ +++++ ++++++++

+

++++

+++

++

+

+++ ++

++++ +

++

+

++

+ +++++

++

+

+++ ++ ++++++

+++++ +

+

++ +++

++

+

+++++

+

+

+

++

+++

+ +++++

++

++

+

++

+

+ ++++ +++++

+

+

+

+

+++++++++

+

+

+++ +++

+

+++ +

+

++++

+

+

+

+++ ++++

++ + ++

+++

++

+

+ ++

+

+

++ ++

+

+++++ ++

+

++

+ ++ +++++++++ +++++

+++

+

+

+++

+++++

+

+ ++

+

+

+

+

+

+++++

+

++++

+

+++ ++ ++++

++ +++

++

+

++

++++

+

+

+

++ +

+

+ +++++

+

+++++

++ ++++++ +++

+

++++++++

+

+

+

+

+++++

++++++

++

++ +++

+

+++ +

+

++

++

+

++++ ++

++

++++++

+ +

++

+

+ +++

+

+++++++

+++

+++ +++++

++++

++

+++

++

++++ ++

+

+++

+

++ ++

+++

+++ ++

+

+++++

++ ++ ++++

+

+ +++

+

+++++++ ++ +++++

++++

+

+

+++++ +

+

+

+

+

++

+

+++

++++ ++++ ++++++

+++

+

+++

+

+

+ +

+

+

+

++++

+ ++++++ +

+ ++

++ ++ ++++++

+++

+

+

+++

+

++++ ++++

+++++++

+

+++++

+

++++

++

++++

+++

++++++++++

+

++++

+

++

+

+

+

++ ++++

+

+++++++ ++

+++++++++ +

+

+ +++

+

+++++++++

+

+++ +++

+

+++ +++ + +++++

+++

+++

++

+ +++

+++++

+ ++++++

+

++ +

++

+

++ +++ ++++

+

++

+++++

+

++ +++

+ +

+

++++ ++ +

+

++

++

++

+

++ ++ ++++

+++

+

+++++++++++++

+

++++

+

+++

+ ++++

+

++ +++

+

++++++ ++++

+++ ++

++

+++++

++++

++

+++++

+

++ +++++

+

+

++ ++++++

+ +

+

+++++++

++++++++

++++ +++

+++++++++

++

++ +

+++

+

++++++ +

+++ +++++++++++

++

+++

++++++

++

+

+++++

+ ++

++++++

+

++++

+++

+++

+ ++

+

+++ ++++ +++++++++++

++

++ ++++

++ +

++

+

+++ +

++

++ ++++++ ++++

++ ++++++ +

+

+++++ ++++++ ++++++

+

+

++++++

++

+++ ++

+++

++++

+++

+ ++++++

+

++++++

++

+

+++++

+

+ ++

++ ++++++

+ ++

+ +++++

+++

+

+

+++

+

+++++++ ++++++++ ++++ ++++

+

++

+

++++++ +

+

+ ++

++++

++

+

+

+++

+

+++++

+++

+

++ +

++ +

+

++

+

+

++++++

++

+

+++

+++ +++

+++++

+

+++

+

+

++++++++

+

+

+

+++

+

++++

+++ ++++++++ +++ +

+

++ +++++

+

+

++++

+ ++

++ ++++++++

+++ ++

+

+++

+ ++++

++

+

+++ +++++++ ++++

+

+

+++ +++

+ +++ + +

++++

+++++++ +++

+

+

++++

+

+++

+

++

+

+

+

+

+++++ ++

++

+

+

+++

++

+++++++++

+++

+

+++++

+

+++

+

++

+ +++

+++

+++

+++++

++

+

++++

+

++

+

+++

++

+

+

+

+

+

+++ ++++

+

++

++++

++

++

+++++ +++

+++

+++++ +

+

+

+

++++ ++++++

++

+

+++ +

+

+

+

+

++++

[Fig. 10 graphic omitted: two PCA biplots with panel titles "PCA log − Ap" and "PCA clr − Ap". Recovered axis labels: PC2 (22.3%) and PC3 (13%). Variables plotted: Al2O3, CaO, Fe2O3, LOI, K2O, MgO, MnO, Na2O, P2O5, SiO2, TiO2.]

Fig. 10. Biplots showing the results of a PCA for the log-transformed (left) and for the opened (right) data.
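The contrast in Fig. 10 can be illustrated with a minimal sketch. It assumes a classical (non-robust) PCA computed by singular value decomposition and randomly generated toy compositions; the function names `clr` and `pca` and the toy data are hypothetical and are not taken from the GEMAS dataset or from the software used for the figure.

```python
import numpy as np

def clr(x):
    """Centred logratio transform of strictly positive compositions (one sample per row)."""
    logx = np.log(np.asarray(x, dtype=float))
    # Subtracting the row mean of the logs equals dividing by the geometric mean.
    return logx - logx.mean(axis=1, keepdims=True)

def pca(z):
    """Classical PCA via SVD of column-centred data.

    Returns scores, loadings and the share of variance explained per component.
    """
    zc = z - z.mean(axis=0)
    u, s, vt = np.linalg.svd(zc, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return u * s, vt.T, explained

# Hypothetical toy data: 50 samples, 4 parts, closed to 100 wt.%.
rng = np.random.default_rng(0)
raw = rng.lognormal(mean=0.0, sigma=0.5, size=(50, 4))
comp = 100.0 * raw / raw.sum(axis=1, keepdims=True)

# PCA of log-transformed concentrations (left panel of the figure, in spirit).
scores_log, loadings_log, var_log = pca(np.log(comp))

# PCA of clr-transformed ("opened") data (right panel of the figure, in spirit).
scores_clr, loadings_clr, var_clr = pca(clr(comp))
```

In this sketch the clr coordinates do not depend on the closure constant, and the last clr component carries essentially no variance because each clr row sums to zero. Scores and loadings from either call could then be passed to any biplot routine.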

C. Reimann et al. / Science of the Total Environment 426 (2012) 196–210



and Water in Adelaide, Australia) and Eurometaux. Very special thanks are due to Ilse Schoeters, Rio Tinto, who arranged the Expert Group's contact with Eurometaux and made this all possible, and to Robert G. Garrett of the Geological Survey of Canada, who established the first contact between Ilse and the Geochemistry Expert Group's chairman. The analytical work was co-financed by the following organisations: Eurometaux, Cobalt Development Institute (CDI), European Copper Institute (ECI), Nickel Institute, Europe, European Precious Metals Federation (EPMF), International Antimony Association (i2a), International Manganese Institute (IMnI), International Molybdenum Association (IMoA), ITRI Ltd. (on behalf of the REACH Tin Metal Consortium), International Zinc Association (IZA), International Lead Association-Europe (ILA-Europe), European Borates Association (EBA), the (REACH) Vanadium Consortium (VC) and the (REACH) Selenium and Tellurium Consortium. Finally, the Directors of the European Geological Surveys and the additional participating organisations are thanked for making sampling of almost all of Europe in a tight time schedule possible. The participation of KH was supported by the Council of the Czech Government MSM6198959214. We would also like to thank the two reviewers, Raimon Tolosana-Delgado and David B. Smith, for thorough and insightful reviews that helped to further improve the clarity of the paper.

Appendix A. Supplementary data

Supplementary data to this article can be found online at doi:10.1016/j.scitotenv.2012.02.032.

References

Aitchison J. The statistical analysis of compositional data. London: Chapman & Hall; 1986. 416 pp.

Aitchison J. The one-hour course in compositional data analysis or compositional data analysis is simple. In: Pawlowsky-Glahn V, editor. Proceedings of IAMG'97 – the third annual conference of the International Association for Mathematical Geology, volume I, II and addendum, International Center for Numerical Methods in Engineering (CIMNE), Barcelona (E), 1100; 1997. p. 3–35.

Aitchison J, Greenacre M. Biplots of compositional data. Appl Stat 2002;51(4):375–92.

Asch, K., 2003. The 1:5 Million International Geological Map of Europe and Adjacent Areas: Development and Implementation of a GIS-enabled Concept. Geologisches Jahrbuch, SA 3, BGR, Hannover (ed.), Schweizerbart (Stuttgart), 172 pp, 45 fig., 46 tab. Available at: http://www.bgr.de/karten/IGME5000/IGME5000.htm.

Bern CR. Soil chemistry in lithologically diverse datasets: the quartz dilution effect. Appl Geochem 2009;24:1429–37.

Blundell DR, Freeman R, Mueller S, editors. A continent revealed – the European Geotraverse. Cambridge: Cambridge University Press; 1992. 274 pp.

Buccianti A, Pawlowsky-Glahn V. New perspectives on water chemistry and compositional data analysis. Math Geol 2005;37:703–27.

Buccianti A, Pawlowsky-Glahn V, Mateu-Figueras G, editors. Compositional data analysis in the geosciences: from theory to practice. London: Geological Society; 2006. 224 pp.

Chayes F. On correlation between variables of constant sum. J Geophys Res 1960;65(12):4185–93.

Darnley A, Björklund A, Bølviken B, Gustavson N, Koval PV, Plant JA, Steenfelt A, Tauchid M, Xuejing X. A global geochemical database for environmental and resource management. Recommendations for international geochemical mapping. Final report of IGCP-project 259. Paris, France: UNESCO Publishing; 1995. 122 pp.

EC, 2006. Regulation (EC) No 1907/2006 of the European Parliament and of the Council of 18 December 2006 concerning the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH), establishing a European Chemicals Agency, amending Directive 1999/45/EC and repealing Council Regulation (EEC) No 793/93 and Commission Regulation (EC) No 1488/94 as well as Council Directive 76/769/EEC and Commission Directives 91/155/EEC, 93/67/EEC, 93/105/EC and 2000/21/EC. Official Journal of the European Communities, 30.12.2006, L396, 1–849.

Egozcue JJ, Pawlowsky-Glahn V. Basic concepts and procedures. In: Pawlowsky-Glahn V, Buccianti A, editors. Compositional data analysis: theory and applications. Chichester: Wiley; 2011. p. 12–28.

Egozcue JJ, Pawlowsky-Glahn V, Mateu-Figueras G, Barceló-Vidal C. Isometric logratio transformations for compositional data analysis. Math Geol 2003;35:279–300.

EGS. EuroGeoSurveys Geochemistry Working Group. EuroGeoSurveys geochemical mapping of agricultural and grazing land in Europe (GEMAS) – Field manual. Norges Geologiske Undersøkelse Report, 2008.038; 2008. Available on the internet at: http://www.ngu.no/upload/Publikasjoner/Rapporter/2008/2008_038.pdf. 46 pp.

Filzmoser P, Hron K. Outlier detection for compositional data using robust methods. Math Geosci 2008;40(3):233–48.

Filzmoser P, Hron K, Reimann C. Univariate statistical analysis of environmental (compositional) data – problems and possibilities. Sci Total Environ 2009a;407:6100–8.

Filzmoser P, Hron K, Reimann C. Principal component analysis for compositional data with outliers. Environmetrics 2009b;20(6):621–32.

Filzmoser P, Hron K, Reimann C. The bivariate statistical analysis of environmental (compositional) data. Sci Total Environ 2010;408:4230–8.

Filzmoser P, Hron K, Reimann C. Interpretation of multivariate outliers for compositional data. Comput Geosci 2012;39:77–85.

GLC2000 database. The CORINE Land Cover Map for Europe. European Commission Joint Research Centre, Ispra; 2003. Available at: http://edit.csica.es/Soil-Vegetation-LandCover.html.

Goldschmidt VM. The principles of distribution of chemical elements in minerals and rocks. J Chem Soc Lond 1937:655–73.

Goldschmidt VM. Geochemistry. Oxford at the Clarendon press; 1954. 730 pp.

Hron K, Templ M, Filzmoser P. Imputation of missing values for compositional data using classical and robust methods. Comput Stat Data Anal 2010;54(12):3095–107.

Jones A, Montanarella L, Jones R. Soil Atlas of Europe. Luxemburg: European Commission; 2005. 128 pp.

McCann, T. (ed), 2008. The Geology of Central Europe. Volume 1: Precambrian and Palaeozoic; Volume 2: Mesozoic and Cenozoic. Geological Society, London, Vol. 1, 748 pp; Vol. 2, 749 pp.

Otero N, Tolosana-Delgado R, Soler A, Pawlowsky-Glahn V, Canals A. Relative vs. absolute statistical analysis of compositions: a comparative study of surface waters of a Mediterranean river. Water Res 2005;39:1404–14.

Pearson K. Mathematical contributions to the theory of evolution. On a form of spurious correlation which may arise when indices are used in the measurement of organs. Proceedings of the Royal Society of London LX; 1897. p. 489–502.

Pawlowsky-Glahn V, Buccianti A. Visualization and modelling of subpopulations of compositional data: statistical methods illustrated by means of geochemical data from fumarolic fluids. Int J Earth Sci Geol Rundsch 2002;91:357–68.

Pawlowsky-Glahn V, Buccianti A. Compositional data analysis: theory and applications. Chichester: Wiley; 2011. 378 pp.

Reimann C. Geochemical mapping – technique or art? Geochem Explor Environ Anal 2005;5:359–70.

Reimann C, Birke M, editors. Geochemistry of European bottled water. Stuttgart: Borntraeger Science Publishers; 2010. 268 pp.

Reimann C, Siewers U, Tarvainen T, Bityukova L, Eriksson J, Gilucis A, Gregorauskiene V, Lukashev VK, Matinian NN, Pasieczna A. Agricultural soils in northern Europe: a geochemical atlas. Geologisches Jahrbuch, Sonderhefte, Reihe D, Heft SD 5, Schweizerbart'sche Verlagsbuchhandlung, Stuttgart; 2003. 279 pp.

Reimann C, Filzmoser P, Garrett RG, Dutter R. Statistical data analysis explained. Applied Environmental Statistics with R. Chichester: Wiley; 2008. 343 pp.

Reimann C, Matschullat J, Birke M, Salminen R. Arsenic distribution in the environment: the effects of scale. Appl Geochem 2009;24:1147–67.

Reimann C, Matschullat J, Birke M, Salminen R. Antimony in the environment – lessons from geochemical mapping. Appl Geochem 2010;25:175–98.

Reimann C, de Caritat P, GEMAS Project Team. New soil composition data for Europe and Australia: demonstrating comparability, identifying continental-scale processes and learning lessons for global geochemical mapping. Sci Total Environ 2012a;416:239–52.

Reimann C, Demetriades A, Eggen OA, Filzmoser P. The EuroGeoSurveys geochemical mapping of agricultural and grazing land soils project (GEMAS) – Evaluation of quality control results of total C and S, total organic carbon (TOC), cation exchange capacity (CEC), XRF, pH, and particle size distribution (PSD) analysis. Norges Geologiske Undersøkelse Report, 2011.043; 2011b. Available on the internet at: http://www.ngu.no/upload/Publikasjoner/Rapporter/2011/2011_043.pdf. 90 pp.

Reimann C, Flem B, Fabian K, Birke M, Ladenberger A, Négrel P, Demetriades A, Hoogewerff J, GEMAS Project Team. Lead and lead isotopes in agricultural soils of Europe – The continental perspective. Appl Geochem 2012b;27:532–42.

Smith DB, Reimann C. Low-density geochemical mapping and the robustness of geochemical patterns. Geochem Explor Environ Anal GEEA 2008;8:219–27.

Tolosana-Delgado R, van den Boogaart KG. Geostatistics for compositions. In: Pawlowsky-Glahn V, Buccianti A, editors. Compositional data analysis: theory and applications. Chichester: Wiley; 2011. p. 73–86.

Tukey JW. Exploratory data analysis. Reading: Addison Wesley; 1977. 506 pp.

von Eynatten H, Barceló-Vidal C, Pawlowsky-Glahn V. Modelling compositional change: the example of chemical weathering of granitoid rocks. Math Geol 2003;35:231–51.

Ziegler PA. Geological Atlas of Western and Central Europe. Netherlands: Shell International Petroleum Maatschappij B.V.; 1990. 239 pp. + Enclosures.