Global Soil Mapping
A proposal for a participatory multiscale approach to GSM
Tomislav Hengl
ISRIC - World Soil Information, Wageningen University
GlobalSoilMap.net presentation, 11 Feb 2011
Outline
- Introduction: This talk; My background; GlobalSoilMap.net
- Misconceptions about DSM/GSM: Mapping efficiency; Soil geodata usability; Soil prediction methods
- A proposal for GSM: Global Soil Mapping is not trivial; Nested regression modeling; The participatory approach
- Malawi showcase: Input data; Results
- Summary points
Topics
- My background;
- Some misconceptions about DSM/GSM;
- A proposal for GSM:
  - A Global Multiscale Prediction Model
  - The crowd-sourcing approach to soil data collection (Open Soil Profiles, Soil Covariates)
- Global task-oriented Land (Soil) Information System
- Report on the results (Malawi showcase).
- Get some feedback.
Previous projects
- My expertise: spatio-temporal data analysis in FOSS (R), digital soil mapping, geomorphometry, geostatistics...
- I have worked with various types of data (climatic/meteorological, species occurrence records, geochemical...);
- Recently published a repository of ca. 100 global layers at a resolution of 0.05 arcdegrees (5.6 km).
- Author of "A Practical Guide to Geostatistical Mapping".
- Main organizer of the GEOSTAT summer school for PhD students (R+OSGeo).
My dream is to build an Open multipurpose GLIS
[Figure: GLOBAL LAND INFORMATION SYSTEM. The user picks a spatial location (site) and queries the site attributes; if the information is incorrect, it can be updated with ground truth data.]
Data channels shown in the figure:
- Soil properties (soil information system): physical and chemical soil properties, nutrient capacity, water storage, acidity/salinity...
- Live weather channel (meteorological forecasting): anticipated temperature (min, max), rainfall, frost hazard, drought hazard, flood hazard...
- Plant monitoring channel (MODIS/ENVISAT): current biomass production, biomass anomalies (pests and diseases), plant health...
- Socio-economic data (site-specific): administrative units, new laws and regulations, market activity, closest offices, agro-dealers...
Outputs shown in the figure (model library; suggest the best land use practice): fertilization, irrigation, pest treatment, best crop calendar, yield estimates, environmental risks.
GlobalSoilMap.net
- An international initiative to make soil property maps (7+3) at six depths at 3 arcsecs (100 m).
- The leitmotif is to "assemble, collate, and rescue as much of the world's existing soil data";
- Some 30 people directly involved (ISRIC is the main project coordinator).
- International compilation of soil data.
- The soil equivalent of OneGeology.org, GBIF, GlobCover and similar projects.
- See full specifications at http://globalsoilmap.org/specifications
World soils in numbers
- The total productive soil area: about 104 million square km.
- To map the world at 100 m (1:200k) would cost about 5 billion EUR (0.5 EUR per ha) using traditional methods.
- We would require some 65M profiles according to the strict rules of Avery (1987).
- A world map at 0.008333333 arcdegrees (ca. 1 km) resolution is an image of 43,200 × 21,600 pixels.
- 27 billion pixels are needed to represent the whole world at 100 m (productive soil areas).
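The cost figure and the image size quoted above follow from simple arithmetic. A minimal sketch in R, using only the numbers from this slide (the variable names are illustrative, not part of any package):

```r
# Inputs quoted on the slide
area_km2 <- 104e6   # total productive soil area (square km)
cost_ha  <- 0.5     # EUR per ha, traditional survey at 1:200k

# 1 square km = 100 ha
area_ha    <- area_km2 * 100
total_cost <- area_ha * cost_ha    # EUR
total_cost / 1e9                   # total cost in billion EUR

# Global grid at 0.008333333 arcdegrees, i.e. 120 cells per degree
n_cols   <- 360 * 120              # cells along longitude
n_rows   <- 180 * 120              # cells along latitude
n_pixels <- n_cols * n_rows
c(cols = n_cols, rows = n_rows, pixels = n_pixels)
```

The total comes out at about 5.2 billion EUR, which the slide rounds to 5 billion.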
GSM in comparison with other similar projects
[Figure: Resolution (m) in log-scale (axis from 2.0 to 4.0) plotted against year (1990 to 2020) for global data sets: SRTM, GADM, GPWv3, MOD13C2, MOD12C1, CHLO/SST, GLWD, DMSP-OLSv4, WorldClim, GlobCov2, FRA, HWSDv1, EcoRegions. A 5.6 km level is marked; GlobalSoilMap and OneGeology are shown with question marks.]
Misconceptions #1
Mapping efficiency can be expressed as cost in $ per area.
To map world soils at 100 m using per unit costs of $2/km² would cost ca. $300 million [1].
[1] Pedro Sanchez; the NY GlobalSoilMap.net meeting (17th Feb 2009).
Survey costs and mapping scale
[Figure: Minimum survey costs in EUR/ha (log-scale) plotted against scale number (log-scale); five data points.]
Mapping accuracy and survey costs
The cost of a soil survey is a function of mapping scale, roughly:
$$\log(X) = b_0 + b_1 \cdot \log(SN) \tag{1}$$
We can fit a linear model to the empirical table data from e.g. Legros (2006; p. 75), and hence we get:
$$X = \exp(19.0825 - 1.6232 \cdot \log(SN)) \tag{2}$$
where X is the minimum cost per ha in Euros (based on estimates from 2002) and SN is the scale number. To map 1 ha of soil at 1:100,000 scale, for example, one needs at least 1.5 Euros.
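Eq. 2 can be evaluated directly. The sketch below wraps it in a small R function (the name `survey_cost` is illustrative); the coefficients are the ones given in Eq. 2, and the underlying table from Legros (2006) is not reproduced here.

```r
# Minimum survey cost in EUR per ha as a function of the scale number SN (Eq. 2)
survey_cost <- function(SN, b0 = 19.0825, b1 = -1.6232) {
  exp(b0 + b1 * log(SN))
}

survey_cost(100000)   # 1:100,000, close to 1.5 EUR per ha
survey_cost(200000)   # 1:200,000, close to 0.5 EUR per ha
```

The second value is consistent with the 0.5 EUR per ha used elsewhere in this talk for mapping at 100 m (1:200k).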
The GSM calculus
- The total productive soil area: about 104 million square km.
- To map the world soils at 100 m (1:200k) would cost about 5 billion EUR (0.5 EUR per ha) using traditional methods. According to Pedro Sanchez, soils could be mapped for about $0.02 per ha ($2 per square km, ca. $300 million in total).
- We would require some 65M profiles according to the strict rules of Avery (1987).
- A world map at 0.008333333 arcdegrees (ca. 1 km) resolution is an image of 43,200 × 21,600 pixels.
- We would need immense storage capacities: one image of the world at 100 m resolution contains 27 billion pixels (productive soil areas only!).
Mapping efficiency
The cost-per-area measure is not really informative (it is easy to spend money). We propose instead a measure called mapping efficiency, defined as the amount of money needed to map an area of standard size and explain each one percent of variation in the target variable:
$$\theta = \frac{X}{A \cdot \mathrm{RMSE}_r} \quad [\mathrm{EUR} \cdot \mathrm{km}^{-2} \cdot \%^{-1}] \tag{3}$$
where X is the total cost of a survey, A is the size of the area in km², and RMSE_r is the amount of variation explained by the spatial prediction model.
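As an illustration of Eq. 3, a minimal R sketch follows. The survey figures are hypothetical and chosen only to make the arithmetic easy to follow; the function name is illustrative.

```r
# Mapping efficiency (Eq. 3): cost per square km per percent of variation explained
mapping_efficiency <- function(X, A, RMSE_r) {
  X / (A * RMSE_r)
}

# Hypothetical survey: 50,000 EUR spent on 1,000 square km,
# with the prediction model explaining 40 % of the variation
theta <- mapping_efficiency(X = 50000, A = 1000, RMSE_r = 40)
theta   # EUR per square km per percent
```

Two surveys with the same cost per area can then be ranked by how much variation each explains.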
Prediction accuracy and survey costs
Information production efficiency
An additional measure of mapping efficiency is the information production efficiency, i.e. the amount of money spent to produce a given quantity of soil information:
$$\Upsilon = \frac{X}{\mathrm{gzip}} \quad [\mathrm{EUR} \cdot \mathrm{B}^{-1}] \tag{4}$$
where gzip is the amount of data (in bytes) left after compression:
$$\mathrm{gzip} = f_c \cdot (f_E \cdot M) \cdot c_Z \quad [\mathrm{B}] \tag{5}$$
where f_c is the loss-less data compression factor, f_E is the extrapolation adjustment factor, c_Z is the variable coding size, and M is the total number of pixels.
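Eqs. 4 and 5 can be sketched in R as follows. All input values are hypothetical, and the sketch assumes that f_c and f_E are fractions between 0 and 1 and that c_Z is given in bytes per pixel; the talk does not state these conventions explicitly.

```r
# Data volume left after compression (Eq. 5), in bytes
gzip_size <- function(fc, fE, M, cZ) {
  fc * (fE * M) * cZ
}

# Information production efficiency (Eq. 4), in EUR per byte
info_efficiency <- function(X, gzip) {
  X / gzip
}

# Hypothetical map: 1 million pixels, 1 byte per pixel,
# 80 % of pixels not extrapolated, compressed to 50 % of the raw size
gz  <- gzip_size(fc = 0.5, fE = 0.8, M = 1e6, cZ = 1)
ups <- info_efficiency(X = 50000, gzip = gz)
c(bytes = gz, eur_per_byte = ups)
```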
Map information content
Variable coding can be set by deriving the (global) effective precision of a soil property map:
$$\Delta z = \frac{\mathrm{RMSE}}{2}; \qquad Z = \{Z(s), \forall s \in A\} \tag{6}$$
Following the Nyquist frequency concept from signal processing, there is no justification for saving the predictions with a precision finer than half the average accuracy.
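A small R sketch of how Eq. 6 could be applied when storing predictions. The pH values and the RMSE are hypothetical, and rounding to multiples of the effective precision is one possible coding choice, not necessarily the one used in the project.

```r
# Effective precision (Eq. 6): half of the average prediction error
effective_precision <- function(rmse) rmse / 2

# Round predictions to multiples of the effective precision
round_to_precision <- function(z, dz) round(z / dz) * dz

# Hypothetical pH predictions with a validation RMSE of 0.4
z  <- c(5.13, 6.48, 7.02)
dz <- effective_precision(0.4)
z_coded <- round_to_precision(z, dz)
z_coded
```

Storing the extra decimals of the original predictions would add bytes without adding information.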
Map information content
Effective information content (bytes remaining after compression) in a soil map for a given map extent is basically a function of three factors:
- Support size (point or block).
- Size of the map in number of pixels, determined by the effective pixel size (which is in turn determined by sampling intensity).
- Effective precision (Eq. 6), estimated using validation points.
Conclusions
- Mapping efficiency (cost / area / percent of variance explained) is an objective criterion for comparing spatial prediction methods. $ / area is incomplete (anyone can spend money to produce maps; the question is how good the maps are).
- Maps are not what they seem: always assess and visualize the accuracy of your maps.
- Soil mapping is an iterative process; in each iteration we explain a bit more of the variability.
- We might never be able to explain 100% of the variability in the target soil variable.
Misconceptions #2
Each node will produce soil property maps for their area of interest, which can then be stitched together [2].
These maps will become the most used soil information in the world.
[2] This is not specified on GlobalSoilMap.net, but there is a general agreement.
A hierarchical approach to GSM
- Country nodes → continental nodes (major players) → global coverage.
- Each country node is responsible for producing maps for its territory. The nodes have complete freedom to select applicable spatial prediction methods (delivery tempo, data sharing policy etc.).
- As long as the technical specifications are satisfied (10 properties, 6 depths, upper and lower confidence limits, 100 m), the maps will be put on GlobalSoilMap.net.
- Inputs and methods to be used for GSM are secondary.
Lessons from geodata usability
- Geodata usability is a function of: (1) adequacy, (2) consistency, (3) completeness, (4) accuracy of the metadata, (5) data interoperability, (6) accessibility and data sharing capacity, (7) attribute and thematic accuracy.
- Each of these aspects can be optimized.
- In reality, we can only increase each of the listed factors up to a certain level; then, for objective reasons, we reach the best possible performance given the available funds and methods. Any further improvement would require additional funds (or radical improvement of the data/operation models).
Soil profiles from various projects (65k points)
Conclusions
- A hierarchical (isolation) approach to global soil mapping (stitching of country maps) would probably lead to products that are inconsistent, incomplete and irreproducible.
- Considering the current state of legacy data, any GSM will need to be largely based on extrapolation and downscaling.
- The Global Soil Mapping initiative should be about building live repositories (Open Soil Profiles, Soil Covariates) and tools (Global Soil Information Facility).
- To map the world soils at 100 m (1:200k) would cost ca. $300 million. To update such a map would cost (again!) $300 million.
- The future of digital soil mapping lies in task-oriented Soil Information Systems (idea by Gerard Heuvelink).
Misconceptions #3
There are many possible DSM techniques that are equally suitable for GSM.
Each node should use whichever technique they find applicable.
GSM techniques
Figure: Groups of techniques suitable for global soil mapping, ordered from data-rich areas to data-poor areas (extrapolation, knowledge transfer); after Minasny and McBratney (2010).
- Profile data and polygon maps: hybrid methods
- Profile data only: purely geostatistical methods
- Polygon maps only: knowledge-driven methods
- No soil data available: extrapolation methods
Conclusions
- Most of the DSM techniques are in fact connected (weighted averaging per polygon is a type of regression, SOLIM is a type of multiple linear regression); hence, there are not as many distinct techniques as it seems.
- For the consistency and completeness of the final outputs it is probably better to build one global model for each soil property (or even one multivariate model).
- Selection of covariates and prediction techniques needs to be clearly driven by objective accuracy assessment.
Other global mapping projects
- SRTM (DEM): 100 m near-to-global coverage.
- MODIS products: a variety of RS-based products (vegetation indices, LAI, land cover maps etc.) at resolutions of 250 m, 500 m, 1 km and 5.6 km.
- GlobCov: ESA's ENVISAT global consistent land cover map (300 m).
- WorldClim: maps of bioclimatic variables interpolated using dense point data (1 km).
- ... there are many more examples (see also: publicly available data sets).
All of these are based on a unified methodology.
Difficulties
- There is probably not enough point data in the world to make soil property maps at such a fine resolution (maps will be largely based on extrapolation and downscaling).
- The most serious problem of GSM is the discrepancy between countries in the amount of (field) data.
- Soils are NOT vegetation: it is much more difficult to map the distribution of soils accurately (RS is helpful, but only up to a certain degree).
- The final global soil property maps might be of poor accuracy in >50% of the world.
Question:
Can we do GSM @ 100 m with such limited data?
Opportunities
- There is enormous potential in getting the legacy data together (there must be thousands and thousands of unused soil profiles).
- There is impressive enthusiasm about this project (many national soil survey agencies see this as an opportunity to get funding).
- The world (scientists, policy makers, crediting organizations, private sector, ... farmers) needs soil information!
The proposal
We propose that, for the purpose of achieving the highest geodata usability, the project should promote the use of a single (participatory) global multiscale nested regression-kriging model (5 km, 1 km, 250 m and 100 m resolution) and then engage local DSM teams to contribute soil ground truth data, polygon maps and predictions that can be integrated into one information system.
Global Multiscale Nested RK
Predictions are based on a nested RK model:
$$z(s_B) = m_0(s_{B-k}) + e_1(s_{B-k} | s_{B-(k+1)}) + \ldots + e_k(s_{B-2} | s_{B-1}) + \varepsilon(s_B) \tag{7}$$
where z(s_B) is the value of the target variable estimated at ground scale (B), B-1, ..., B-k are the higher order components, e_k(s_{B-k} | s_{B-(k+1)}) is the residual variation from scale s_{B-(k+1)} to a higher resolution scale s_{B-k}, and ε is the spatially auto-correlated residual soil variation (dealt with using ordinary kriging).
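The nested structure of Eq. 7 can be illustrated with a two-scale toy example in base R. This is only a sketch under strong assumptions: the data are simulated, the covariate names are hypothetical, plain linear regression stands in for each component, and the kriging of the residual ε is left out.

```r
set.seed(42)
n <- 200

# Hypothetical covariates: one coarse-scale (e.g. climate), one fine-scale (e.g. terrain)
coarse <- rnorm(n)
fine   <- rnorm(n)

# Simulated target variable that depends on both scales plus noise
z <- 2 + 1.5 * coarse + 0.5 * fine + rnorm(n, sd = 0.3)
d <- data.frame(z = z, coarse = coarse, fine = fine)

# Step 1: trend model at the coarsest scale (m0 in Eq. 7)
m0 <- lm(z ~ coarse, data = d)

# Step 2: model the residuals of step 1 with the finer-scale covariate (e1 in Eq. 7)
d$e1 <- residuals(m0)
m1 <- lm(e1 ~ fine, data = d)

# Nested prediction: sum of the components from both scales
z_hat <- fitted(m0) + fitted(m1)

rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
rmse_coarse <- rmse(d$z, fitted(m0))
rmse_nested <- rmse(d$z, z_hat)
c(coarse_only = rmse_coarse, nested = rmse_nested)
```

On the fitting data, each added scale can only reduce or leave unchanged the residual error; whether it helps on new data has to be checked by validation.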
Some drawbacks
- GM-NRK makes all other DSM efforts in the world redundant(!);
- GM-NRK ignores all other sub-100 m resolution data and mapping efforts;
- It could also delay delivery of soil property maps because the mapping activities would be more difficult to organize internationally.
The best combined spatial predictor
To avoid these difficulties, we propose using a participatory approach to GSM: a combination of GM-NRK and local prediction models. Assuming that independent inputs/models are used to generate predictions at local and global scales, the best combined predictor can be obtained by using:
$$z_{\mathrm{BCSP}}(s_0) = \frac{z_{\mathrm{GM-NRK}}(s_0) \cdot \frac{1}{\mathrm{RMSE}_r(\mathrm{GM-NRK})} + z_{\mathrm{LM}}(s_0) \cdot \frac{1}{\mathrm{RMSE}_r(\mathrm{LM})}}{\sum_{j=1}^{2} \frac{1}{\mathrm{RMSE}_r(M_j)}} \tag{8}$$
where RMSE_r is the prediction error estimated using cross-validation (Eq. 3).
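Eq. 8 is a weighted average with weights proportional to 1/RMSE_r. A minimal R sketch with hypothetical predictions and errors (the function name `bcsp` is illustrative):

```r
# Best combined spatial predictor (Eq. 8): inverse-error weighted average
bcsp <- function(z, rmse) {
  w <- 1 / rmse
  sum(z * w) / sum(w)
}

# Hypothetical predictions at one location:
# global model (GM-NRK) predicts 10 with RMSE_r = 2,
# local model (LM) predicts 14 with RMSE_r = 1
z_comb <- bcsp(z = c(10, 14), rmse = c(2, 1))
z_comb
```

The combined value lies closer to the local prediction because the local model has the smaller error.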
The proposed system
[Figure: The proposed system. Components shown: regional mapping organizations and GlobalSoilMap.net continental nodes (new submissions); FTP service (clearing house) with automated validation; multiscale prediction model (5.6 km, 1 km, 250 m, 100 m) with spatial aggregation and downscaling; soil property maps as 1 x 1 degree tiles (7 properties, 6 depths) in GeoTIFF (3 arcsec); PostGIS Raster DB; ISRIC data portal serving WMS (visualization: web browser), KML (visualization: Google Earth) and GeoTIFF (analysis: GIS).]
GM-NRK in action: Malawi showcase
- 2740 soil observations, of which some 800-1000 contain complete analytical and descriptive data.
- 1:800k polygon soil map.
- Some 30-40 gridded layers at various resolutions (covariates).
Data sets available for Malawi
[Figure: Three map panels (a), (b) and (c) with two legends (22000 to 38000; 0.5 to 48.8); map frame labelled 33° to 35° and 10° to 17°.]
Gridded maps for Malawi
[Figure: Gridded maps (covariates) for Malawi, organized by resolution (5.6 km, 1 km, 250 m, 100 m) and by theme (biomes, climate, parent material, general land use, erosion/deposition, land management).]
Layers shown in the figure:
- MODIS-based long-term Land Surface Temperature (day/night)
- Rainfall map of the world
- Elevation
- Geologic Provinces of Africa
- MODIS (MCD12Q1) land cover dynamics
- ENVISAT Land Cover map (GlobCov)
- MODIS (MCD13Q1) Enhanced Vegetation Index (EVI) and medium infrared band (MIR)
- TWI, TRI, slope, surface roughness, insolation
- Landsat ETM thermal band
- Soil polygon map (FAO classes)
Loading the data
# library(GSIF)
# This library is not yet available, hence just load the functions:
> source("http://globalsoilmap.org/data/functions.R")
# load the input data (an .RData file is read with load(), not source()):
> load(url("http://globalsoilmap.org/data/malawi.RData"))
> ls()
# mw_soil.utm --- soil polygon map at 1:800k scale;
# malawi.utm --- ca 2000 soil profiles for the whole of Malawi;
# malawi.poly.utm --- country borders (lines);
This will load all point and polygon data and the R functions required to run this exercise. The input gridded data can be obtained from:
# mode="wb" keeps the zip archive intact on all platforms
> download.file("http://globalsoilmap.org/data/malawi_grids.zip",
+ destfile=file.path(getwd(), "malawi_grids.zip"), mode="wb")
# 313 MB
Regression analysis
12.4 Trend maps 97
3.200
2.533
1.867
1.200
0 100 km
5 km 1 km 250 m
Fig. 12.5 Mean organic carbon (in permille) in soil predicted at 5 km, 1 km and 250 m resolutions. Values in log-scale.
geology for CLYPPT. At 250 m resolution, the models are again more significant: the predictors explain 18.7%of variability for ORCDRC, 21.1% for PHIHO5, and 26.8% for CLYPPT. The best predictors are: MODIS MediumInfrared band and soil type map for ORCDRC, elevation, EVI maps and soil types for PHIHO5, and again elevation,EVI and soil maps for CLYPPT. At finest resolution, we use a smallest subset of predictors (DEM derivatives andLandsat thermal infrared band). Consequently, the R-squares are somewhat lower: 5.5% for ORCDRC; 12.1% forPHIHO5 and 9.3% for CLYPPT. The overall best predictors are elevations, landsat TIR and Topographic WetnessIndex (Table 12.2).
Table 12.2 Summary results of regression analysis for three selected soil variables at various scales (case study Malawi).

Variable name       | OSP code | N   | Scale | Best predictors                        | R-square
Soil organic carbon | ORCDRC   | 785 | 5 km  | rainfall, temperature of warmest month | 0.315
Soil organic carbon | ORCDRC   | 785 | 1 km  | elevation                              | 0.213
Soil organic carbon | ORCDRC   | 785 | 250 m | MODIS MIR, soil types                  | 0.187
Soil organic carbon | ORCDRC   | 785 | 100 m | elevation, Landsat TIR, TRI            | 0.055
pH                  | PHIHO5   | 793 | 5 km  | precipitation, LAI, daily LST          | 0.464
pH                  | PHIHO5   | 793 | 1 km  | TWI                                    | 0.213
pH                  | PHIHO5   | 793 | 250 m | MODIS EVI, soil types                  | 0.211
pH                  | PHIHO5   | 793 | 100 m | elevation, TWI, TRI                    | 0.121
Clay content        | CLYPPT   | 756 | 5 km  | soil mapping units, daily LST          | 0.148
Clay content        | CLYPPT   | 756 | 1 km  | geological units                       | 0.127
Clay content        | CLYPPT   | 756 | 250 m | elevation, MODIS EVI                   | 0.268
Clay content        | CLYPPT   | 756 | 100 m | elevation, TWI, devmean                | 0.093
It is clear from the results shown in Fig. 12.5 that at each scale different predictors play a different role. These results also confirm that some soil properties, such as clay content, can be better explained using fine-scale predictors (SRTM DEM derivatives), while others, such as organic carbon, are controlled by global (coarse) predictors
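The per-scale comparison of R-squares can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated, the column names (`rainfall`, `evi`, `twi`, `log_orc`) are hypothetical stand-ins for covariates at three scales, and plain `lm()` is used in place of whatever model was fitted for the Malawi case.

```r
set.seed(42)
n <- 500

# Simulated covariates standing in for predictors available at each scale
# (hypothetical column names, not the Malawi grids)
d <- data.frame(
  rainfall = rnorm(n),  # coarse climatic layer (5 km)
  evi      = rnorm(n),  # remote-sensing layer (250 m)
  twi      = rnorm(n)   # terrain derivative (100 m)
)

# Simulated log-scale target, driven mainly by the coarse predictor
d$log_orc <- 2 + 0.6 * d$rainfall + 0.3 * d$evi + 0.1 * d$twi +
  rnorm(n, sd = 0.7)

# One regression model per scale
scale_models <- list(
  "5 km"  = log_orc ~ rainfall,
  "250 m" = log_orc ~ evi,
  "100 m" = log_orc ~ twi
)

# R-square of each model, returned as a named vector
r2 <- sapply(scale_models, function(f) summary(lm(f, data = d))$r.squared)
print(round(r2, 3))
```

In this simulated example the coarse predictor explains most of the variance by construction, which mimics the pattern reported for organic carbon. A variable driven by terrain would show the opposite ordering.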
Organic carbon (values in log-scale)
[Figure: organic carbon maps at 5 km, 1 km and 250 m resolution; legend from 1.200 to 3.200 (log scale); scale bar 0-100 km]
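To read the legend in original units, the log-scale values have to be back-transformed. The sketch below assumes a natural logarithm; the slide does not state the base, so the resulting permille values are illustrative only.

```r
# Legend breaks as shown on the map (log scale)
legend_log <- c(1.200, 1.867, 2.533, 3.200)

# Back-transform to permille, ASSUMING natural log was used (not stated in the source)
legend_permille <- exp(legend_log)
print(round(legend_permille, 1))
```

If a different transform was used (for example log base 10 or log(1 + x)), the inverse would need to be changed accordingly.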
pH visualized in Google Earth (1 degree block)
Conclusions
- GSM at 100 m is doable (even without 6M profiles!).
- The multiscale approach allows us to extrapolate over large areas (even to areas where we have no soil data!).
- Selection of covariates and prediction techniques needs to be clearly driven by objective accuracy assessment.
- Point data are the key to GSM: we need to motivate governmental agencies and private individuals to contribute to OSP.
- We need to start developing and testing tools: if you have the inputs and the tools to generate outputs, the outputs can be regenerated as many times as you wish.
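One common way to make the accuracy assessment objective is k-fold cross-validation. The sketch below is an illustration only: the data are simulated, `cv_rmse` is a hypothetical helper, and the column names (`elev`, `noise`, `ph`) are assumptions rather than the Malawi variables.

```r
set.seed(1)
n <- 300

# Simulated data: pH depends on elevation; 'noise' is an irrelevant covariate
d <- data.frame(elev = rnorm(n), noise = rnorm(n))
d$ph <- 6 + 0.5 * d$elev + rnorm(n, sd = 0.5)

# Hypothetical helper: k-fold cross-validated RMSE of a linear model
cv_rmse <- function(formula, data, k = 5) {
  fold <- sample(rep(seq_len(k), length.out = nrow(data)))  # random fold labels
  y <- all.vars(formula)[1]                                 # name of the response
  err <- numeric(nrow(data))
  for (i in seq_len(k)) {
    test <- fold == i
    m <- lm(formula, data = data[!test, ])                  # fit on training folds
    err[test] <- data[[y]][test] - predict(m, newdata = data[test, ])
  }
  sqrt(mean(err^2))
}

# Compare two candidate covariate sets on held-out data
rmse_elev  <- cv_rmse(ph ~ elev, d)
rmse_noise <- cv_rmse(ph ~ noise, d)
print(c(elev = rmse_elev, noise = rmse_noise))
```

The covariate set with the lower cross-validated RMSE would be preferred. The same loop could wrap any prediction technique, not only `lm()`.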
GSM products (revisited)
- SoilGrids.org: covariates at 5 km, 1 km (250 m).
- SoilProfiles.org: Open Soil Profiles (once we reach 1M points we should be able to produce soil property maps with reasonable accuracy).
- R/Python package: automated analysis of point and gridded data.
- GSIF: Global Information Facilities for soil data.
Next steps
- Re-implement the method using a 'clean' data set (USA data).
- Finalize the blue paper (technical specs and methods for GSM).
- Package a showcase that anyone can use.
- Set up web services (ISRIC servers) and start publishing the data (launch OSP, worldmaps).