Global soilmap at_afsis_hengl


A proposal for Global Soil Mapping based on participatory multiscale nested regression-kriging

Transcript of Global soilmap at_afsis_hengl

Page 1: Global soilmap at_afsis_hengl

Global Soil Mapping

A proposal for a participatory multiscale approach to GSM

Tomislav Hengl

ISRIC - World Soil Information, Wageningen University

GlobalSoilMap.net presentation, 11 Feb 2011

Page 2: Global soilmap at_afsis_hengl

Outline

Introduction
- This talk
- My background
- GlobalSoilMap.net

Misconceptions about DSM/GSM
- Mapping efficiency
- Soil geodata usability
- Soil prediction methods

A proposal for GSM
- Global Soil Mapping is not trivial
- Nested regression modeling
- The participatory approach

Malawi show case
- Input data
- Results

Summary points


Page 3: Global soilmap at_afsis_hengl

Topics

- My background;
- Some misconceptions about DSM/GSM;
- A proposal for GSM:
  - A Global Multiscale Prediction Model
  - The crowd-sourcing approach to soil data collection (Open Soil Profiles, Soil covariates)
- Global task-oriented Land (Soil) Information System
- Report on the results (Malawi show case).
- Get some feedback.


Page 4: Global soilmap at_afsis_hengl

Previous projects

- My expertise: spatio-temporal data analysis in FOSS (R), digital soil mapping, geomorphometry, geostatistics...
- I have worked with various types of data (climatic/meteorological, species occurrence records, geochemicals...);
- Recently published a repository of ca. 100 global layers at a resolution of 0.05 arcdegrees (5.6 km).
- Author of "A Practical Guide to Geostatistical Mapping".
- Main organizer of the GEOSTAT summer school for PhD students (R+OSGeo).


Page 5: Global soilmap at_afsis_hengl

My dream is to build an Open multipurpose GLIS

Figure: GLOBAL LAND INFORMATION SYSTEM. Starting from a spatial location (site), the user queries site attributes from four groups of data:

- Soil properties (soil information system): physical and chemical soil properties, nutrient capacity, water storage, acidity/salinity...
- Live weather channel (meteorological forecasting): anticipated temperature (min, max), rainfall, frost hazard, drought hazard, flood hazard...
- Plant monitoring channel (MODIS/ENVISAT): current biomass production, biomass anomalies (pest and diseases), plant health...
- Socio-economic data (site-specific): administrative units, new laws and regulations, market activity, closest offices, agro-dealers...

Model library: suggest the best land use practice (fertilization, irrigation, pest treatment, best crop calendar, yield estimates, environmental risks).

Information incorrect? Update with ground truth data.


Page 6: Global soilmap at_afsis_hengl

GlobalSoilMap.net

- An international initiative to make soil property maps (7+3) at six depths at 3 arcsecs (100 m).
- The leitmotif is to "assemble, collate, and rescue as much of the world's existing soil data";
- Some 30 people directly involved (ISRIC is the main project coordinator).
- International compilation of soil data.
- The soil equivalent of OneGeology.org, GBIF, GlobCover and similar projects.
- See full specifications at http://globalsoilmap.org/specifications


Page 7: Global soilmap at_afsis_hengl

World soils in numbers

- The total productive soil area: about 104 million square km.
- To map the world at 100 m (1:200k) would cost about 5 billion EUR (0.5 EUR per ha) using traditional methods.
- We would require some 65M profiles according to the strict rules of Avery (1987).
- World map at 0.008333333 arcdegrees (ca. 1 km) resolution is an image of size 43,200 × 21,600 pixels.
- 27 billion pixels needed to represent the whole world at 100 m (productive soil areas).


Page 12: Global soilmap at_afsis_hengl

GSM in comparison with other similar projects

Figure: Resolution (m, log-scale, axis range 2.0 to 4.0) plotted against year (1990 to 2020) for global data sets: SRTM, GADM, GPWv3, MOD13C2, MOD12C1, CHLO/SST, GLWD, DMSP-OLSv4, WorldClim, GlobCov2, FRA, HWSDv1 and EcoRegions, with a 5.6 km reference label. GlobalSoilMap and OneGeology are marked with question marks.


Page 13: Global soilmap at_afsis_hengl

Misconceptions #1

Mapping efficiency can be expressed as cost in $ per area.

To map world soils at 100 m using per unit costs of $2/km2 would cost ca. $300 million¹.

¹ Pedro Sanchez; the NY GlobalSoilMap.net meeting (17th Feb 2009).


Page 14: Global soilmap at_afsis_hengl

Survey costs and mapping scale

Figure: Minimum survey costs in EUR / ha (log-scale) plotted against scale number (log-scale).


Page 15: Global soilmap at_afsis_hengl

Mapping accuracy and survey costs

The cost of a soil survey is a function of mapping scale, roughly:

$$\log(X) = b_0 + b_1 \cdot \log(SN) \tag{1}$$

We can fit a linear model to the empirical table data from e.g. Legros (2006; p. 75), and hence we get:

$$X = \exp\left(19.0825 - 1.6232 \cdot \log(SN)\right) \tag{2}$$

where $X$ is the minimum cost/ha in Euros (based on estimates in 2002) and $SN$ is the scale number. To map 1 ha of soil at 1:100,000 scale, for example, one needs (at least) 1.5 Euros.
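A minimal sketch of Eq. (2) in Python, useful for checking the quoted figures. The coefficients are the ones given on the slide; the function name and the choice of example scales are illustrative assumptions, and the natural logarithm is assumed because the model is inverted with exp.

```python
import math

# Coefficients of Eq. (2) as quoted on the slide (fit to Legros 2006, p. 75).
B0 = 19.0825
B1 = -1.6232


def min_survey_cost_eur_per_ha(scale_number: float) -> float:
    """Minimum survey cost in EUR/ha (2002 estimates) for a 1:scale_number map."""
    if scale_number <= 0:
        raise ValueError("scale_number must be positive")
    # log(X) = b0 + b1 * log(SN), solved for X
    return math.exp(B0 + B1 * math.log(scale_number))


if __name__ == "__main__":
    # Example scales chosen for illustration only.
    for sn in (25_000, 100_000, 200_000):
        print(f"1:{sn:,} -> {min_survey_cost_eur_per_ha(sn):.2f} EUR/ha")
```

For a scale number of 100,000 this gives roughly 1.5 EUR/ha, and for 200,000 roughly 0.5 EUR/ha, which matches the per-hectare cost used elsewhere in the talk for mapping at 1:200k.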


Page 16: Global soilmap at_afsis_hengl

The GSM calculus

- The total productive soil area: about 104 million square km.
- To map the world soils at 100 m (1:200k) would cost about 5 billion EUR (0.5 EUR per ha) using traditional methods. According to Pedro Sanchez, soils could be mapped for $0.20 USD per ha ($300 million USD).
- We would require some 65M profiles according to the strict rules of Avery (1987).
- World map at 0.008333333 arcdegrees (ca. 1 km) resolution is an image of size 43,200 × 21,600 pixels.
- We would need immense storage capacities: one image of the world at a 100 m resolution contains 27 billion pixels (productive soil areas only!).
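The arithmetic behind two of these numbers can be sketched as follows. The area, unit cost and cell size are taken from the slide; the function names are illustrative assumptions. The 27 billion pixel figure covers productive soil areas only, so it depends on a mask that is not given here and is not recomputed.

```python
AREA_KM2 = 104e6        # total productive soil area, from the slide
HA_PER_KM2 = 100        # 1 km^2 = 100 ha
COST_EUR_PER_HA = 0.5   # traditional survey cost at 1:200k, from the slide


def total_cost_eur(area_km2: float, cost_per_ha: float) -> float:
    """Total survey cost for an area given in km^2 and a unit cost per ha."""
    return area_km2 * HA_PER_KM2 * cost_per_ha


def global_grid_shape(cell_size_deg: float) -> tuple[int, int]:
    """(columns, rows) of a global latitude/longitude grid with square cells."""
    return round(360 / cell_size_deg), round(180 / cell_size_deg)


if __name__ == "__main__":
    print(f"Traditional survey: {total_cost_eur(AREA_KM2, COST_EUR_PER_HA):.3g} EUR")
    cols, rows = global_grid_shape(0.008333333)
    print(f"ca. 1 km grid: {cols} x {rows} = {cols * rows:,} pixels")
```

The cost comes out at about 5.2 billion EUR, and the ca. 1 km grid at 43,200 × 21,600 cells, i.e. close to one billion pixels for a single global layer before any masking.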


Page 21: Global soilmap at_afsis_hengl

Mapping efficiency

The costs-per-area measure is not really informative (it is easy to spend money). We propose instead a measure called mapping efficiency, defined as the amount of money needed to map an area of standard size and explain each one percent of variation in the target variable:

$$\theta = \frac{X}{A \cdot \mathrm{RMSE}_r} \quad [\mathrm{EUR} \cdot \mathrm{km}^{-2} \cdot \%^{-1}] \tag{3}$$

where $X$ is the total cost of a survey, $A$ is the size of the area in km², and $\mathrm{RMSE}_r$ is the amount of variation (in percent) explained by the spatial prediction model.
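A short sketch of Eq. (3), following the slide's definition of the denominator as the percentage of variation explained. The two surveys in the example are made-up numbers, used only to show why cost per area alone can mislead.

```python
def mapping_efficiency(total_cost_eur: float, area_km2: float,
                       variation_explained_pct: float) -> float:
    """Eq. (3): EUR per km^2 per percent of variation explained."""
    if area_km2 <= 0 or variation_explained_pct <= 0:
        raise ValueError("area and explained variation must be positive")
    return total_cost_eur / (area_km2 * variation_explained_pct)


if __name__ == "__main__":
    # Hypothetical surveys of the same 1000 km^2 area (illustrative values only).
    survey_a = mapping_efficiency(100_000, 1_000, 40.0)  # 100 EUR/km^2, 40 % explained
    survey_b = mapping_efficiency(60_000, 1_000, 20.0)   # 60 EUR/km^2, 20 % explained
    print(f"Survey A: {survey_a:.2f} EUR / km^2 / %")
    print(f"Survey B: {survey_b:.2f} EUR / km^2 / %")
```

In this made-up case survey B is cheaper per km², yet survey A costs less per percent of variation explained, which is the comparison the measure is meant to expose.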


Page 22: Global soilmap at_afsis_hengl

Prediction accuracy and survey costs


Page 23: Global soilmap at_afsis_hengl

Information production efficiency

An additional measure of mapping efficiency is the information production efficiency, i.e. the amount of money spent to produce a given quantity of soil information:

$$\Upsilon = \frac{X}{\mathrm{gzip}} \quad [\mathrm{EUR} \cdot \mathrm{B}^{-1}] \tag{4}$$

where gzip is the amount of data (in Bytes) left after compression:

$$\mathrm{gzip} = f_c \cdot (f_E \cdot M) \cdot c_Z \quad [\mathrm{B}] \tag{5}$$

where $f_c$ is the loss-less data compression factor, $f_E$ is the extrapolation adjustment factor, $c_Z$ is the variable coding size, and $M$ is the total number of pixels.
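Eqs. (4) and (5) can be sketched in a few lines. The function and parameter names are illustrative assumptions, and the example values (map size, factors, cost) are invented for the sake of the arithmetic rather than taken from any real survey.

```python
def compressed_size_bytes(n_pixels: int, coding_size_bytes: float,
                          compression_factor: float,
                          extrapolation_factor: float) -> float:
    """Eq. (5): gzip = f_c * (f_E * M) * c_Z, in bytes."""
    return compression_factor * (extrapolation_factor * n_pixels) * coding_size_bytes


def information_production_efficiency(total_cost_eur: float,
                                      gzip_bytes: float) -> float:
    """Eq. (4): money spent per byte of compressed soil information."""
    if gzip_bytes <= 0:
        raise ValueError("gzip_bytes must be positive")
    return total_cost_eur / gzip_bytes


if __name__ == "__main__":
    # Hypothetical map: 1 million pixels, 2-byte coding, f_c = 0.5, f_E = 0.25.
    size = compressed_size_bytes(1_000_000, 2, 0.5, 0.25)
    print(f"Effective information content: {size:,.0f} B")
    print(f"Efficiency: {information_production_efficiency(50_000, size):.2f} EUR/B")
```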


Page 24: Global soilmap at_afsis_hengl

Map information content

Variable coding can be set by deriving the (global) effective precision of a soil property map:

$$\Delta z = \frac{\mathrm{RMSE}}{2}; \qquad Z = \{Z(s), \forall s \in A\} \tag{6}$$

Following the Nyquist frequency concept from signal processing, there is no justification in saving the predictions with better precision than half the average accuracy.
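As an illustration of Eq. (6), the sketch below rounds predictions to the effective precision and estimates how many bits are needed to code the result. The bit-count helper is an assumption added here to show how a coding size could follow from the precision; it is not a formula from the talk, and the pH range and RMSE in the example are hypothetical.

```python
import math


def effective_precision(rmse: float) -> float:
    """Eq. (6): half the average prediction error."""
    if rmse <= 0:
        raise ValueError("rmse must be positive")
    return rmse / 2


def quantize(values: list[float], rmse: float) -> list[float]:
    """Round predictions to the nearest multiple of the effective precision."""
    dz = effective_precision(rmse)
    return [round(v / dz) * dz for v in values]


def bits_needed(z_min: float, z_max: float, rmse: float) -> int:
    """Assumed helper: bits required to index every precision step in a range."""
    levels = math.floor((z_max - z_min) / effective_precision(rmse)) + 1
    return max(1, math.ceil(math.log2(levels)))


if __name__ == "__main__":
    # Hypothetical soil pH map with RMSE = 0.4, i.e. a precision step of 0.2.
    print(quantize([5.13, 6.48], rmse=0.4))
    print(bits_needed(3.0, 10.0, rmse=0.4), "bits per pixel")
```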


Page 25: Global soilmap at_afsis_hengl

Map information content

Effective information content (bytes remaining after compression) in a soil map for a given map extent is basically a function of three factors:

- Support size (point or block).
- Size of a map in terms of number of pixels, determined by the effective pixel size (which is in turn determined by sampling intensity).
- Effective precision (Eq. 6) estimated using validation points.


Page 26: Global soilmap at_afsis_hengl

Conclusions

- Mapping efficiency (cost / area / percent of variance explained) is an objective criterion to compare spatial prediction methods. $ / area is incomplete (anyone can spend money to produce maps; the question is how good are the maps?).
- Maps are not what they seem: always assess and visualize the accuracy of your maps.
- Soil mapping is an iterative process; in each iteration we explain a bit more of the variability.
- We might never be able to explain 100% of the variability in the target soil variable.


Page 30: Global soilmap at_afsis_hengl

Misconceptions #2

Each node will produce soil property maps for their area of interest, which can then be stitched together².

These maps will become the most used soil information in the World.

² This is not specified on GlobalSoilMap.net, but there is a general agreement.


Page 31: Global soilmap at_afsis_hengl

A hierarchical approach to GSM

- Country nodes → continental nodes (major players) → Global coverage.
- Each country node is responsible for producing maps for their territory. The nodes have complete freedom to select applicable spatial prediction methods (delivery tempo, data sharing policy etc.).
- As long as the technical specifications are satisfied (10 properties, 6 depths, upper and lower confidence limits, 100 m), the maps will be put on GlobalSoilMap.net.
- Inputs and methods to be used for GSM are secondary.


Page 32: Global soilmap at_afsis_hengl

Lessons from geodata usability

- Geodata usability is a function of: (1) adequacy, (2) consistency, (3) completeness, (4) accuracy of the metadata, (5) data interoperability, (6) accessibility and data sharing capacity, (7) attribute and thematic accuracy.
- Each of these aspects can be optimized.
- In reality, we can only increase each of the listed factors up to a certain level; then, for objective reasons, we reach the best possible performance given the available funds and methods. Any further improvement would require additional funds (or radical improvement of the data/operation models).


Page 35: Global soilmap at_afsis_hengl

Soil profiles from various projects (65k points)


Page 36: Global soilmap at_afsis_hengl

Conclusions

- A hierarchical (isolation) approach to global soil mapping (stitching of country maps) would probably lead to products that are inconsistent, incomplete and irreproducible.
- Considering the current state of legacy data, any GSM will need to be largely based on extrapolation and downscaling.
- The Global Soil Mapping initiative should be about building live repositories (Open Soil Profiles, Soil Covariates) and tools (Global Soil Information Facility).
- To map the world soils at 100 m (1:200k) would cost ca. $300 million USD. To update such a map would cost (again!) $300 million USD.
- The future of digital soil mapping lies in task-oriented Soil Information Systems (idea by Gerard Heuvelink).


Page 41: Global soilmap at_afsis_hengl

Misconceptions #3

There are many possible DSM techniques that are equally suitable for GSM.

Each node should use whichever technique they find applicable.


Page 42: Global soilmap at_afsis_hengl

GSM techniques

Figure: Groups of techniques suitable for global soil mapping; after Minasny and McBratney (2010). Data availability runs from data rich areas to data poor areas (extrapolation, knowledge transfer):

- Profile data and polygon maps: hybrid methods
- Profile data only: purely geostatistical methods
- Polygon maps only: knowledge-driven methods
- No soil data available: extrapolation methods


Page 43: Global soilmap at_afsis_hengl

Conclusions

- Most of the DSM techniques are in fact connected (weighted averaging per polygon is a type of regression, SOLIM is a type of multiple linear regression); hence, there are not as many distinct techniques as it seems.
- For the consistency and completeness of final outputs it is probably better to build one global model for each soil property (or even one multivariate model).
- Selection of covariates and prediction techniques needs to be clearly driven by objective accuracy assessment.


Page 46: Global soilmap at_afsis_hengl

Other global mapping projects

- SRTM (DEM): 100 m near-to-global coverage.
- MODIS products: a variety of RS-based products (vegetation indices, LAI, land cover maps etc.) at resolutions of 250 m, 500 m, 1 km and 5.6 km.
- GlobCov: ESA's ENVISAT global consistent land cover map (300 m).
- WorldClim: maps of bioclimatic variables interpolated using dense point data (1 km).
- ... there are many more examples (see also: publicly available data sets).

All of these are based on a unified methodology.


Page 47: Global soilmap at_afsis_hengl

Difficulties

- There is probably not enough point data in the world to make soil property maps at so fine a resolution (maps will be largely based on extrapolation and downscaling).
- The most serious problem of GSM is the discrepancy between countries in the amount of (field) data.
- Soils are NOT vegetation: it is much more difficult to map the distribution of soils accurately (RS is helpful, but only up to a certain degree).
- The final global soil property maps might be of poor accuracy in >50% of the world.


Page 51: Global soilmap at_afsis_hengl

Question:

Can we do GSM @ 100 m with such limited data?


Page 52: Global soilmap at_afsis_hengl

Opportunities

- There is an enormous potential in getting the legacy data together (there must be thousands and thousands of unused soil profiles).
- There is an impressive enthusiasm about this project (many national soil survey agencies see this as an opportunity to get funding).
- The world (scientists, policy makers, crediting organizations, private sector, ... farmers) needs soil information!


Page 55: Global soilmap at_afsis_hengl

The proposal

We propose that, for the purpose of achieving the highest geodata usability, the project should promote use of a single (participatory) global multiscale nested regression-kriging model (5 km, 1 km, 250 m and 100 m resolution) and then engage local DSM teams to contribute soil ground truth data, polygon maps and predictions that can be integrated into one information system.


Page 56: Global soilmap at_afsis_hengl

Global Multiscale Nested RK

Predictions are based on a nested RK model:

$$z(s_B) = m_0(s_{B-k}) + e_1(s_{B-k} \mid s_{B-[k+1]}) + \ldots + e_k(s_{B-2} \mid s_{B-1}) + \varepsilon(s_B) \tag{7}$$

where $z(s_B)$ is the value of the target variable estimated at ground scale ($B$), $B-1, \ldots, B-k$ are the higher order components, $e_k(s_{B-k} \mid s_{B-(k+1)})$ is the residual variation from scale $s_{B-(k+1)}$ to a higher resolution scale $s_{B-k}$, and $\varepsilon$ is spatially auto-correlated residual soil variation (dealt with using ordinary kriging).
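The additive structure of Eq. (7) can be illustrated with a toy one-dimensional example: a coarse trend is refined by residual layers at successively finer resolutions, and a final residual term is added at the finest scale. This is only a sketch of how the components sum. It does not fit regressions or perform kriging; the nearest-neighbour upsampling, the function names and all values are assumptions made for illustration.

```python
def upsample(values: list[float], factor: int) -> list[float]:
    """Repeat each coarse cell `factor` times (nearest-neighbour refinement)."""
    return [v for v in values for _ in range(factor)]


def nested_prediction(trend: list[float],
                      residual_layers: list[list[float]],
                      kriged_residual: list[float]) -> list[float]:
    """Sum a coarse trend, finer-scale residual layers and a final residual."""
    current = list(trend)
    for layer in residual_layers:
        factor, remainder = divmod(len(layer), len(current))
        if remainder or factor < 1:
            raise ValueError("each layer must be an integer refinement of the previous one")
        # Carry the coarser prediction down and add the residual at this scale.
        current = [c + e for c, e in zip(upsample(current, factor), layer)]
    if len(kriged_residual) != len(current):
        raise ValueError("kriged residual must match the finest resolution")
    return [c + eps for c, eps in zip(current, kriged_residual)]


if __name__ == "__main__":
    # Hypothetical values: 2 coarse cells refined to 4 fine cells.
    coarse_trend = [10.0, 20.0]
    finer_residuals = [[1.0, -1.0, 2.0, -2.0]]
    epsilon = [0.5, 0.5, -0.5, -0.5]
    print(nested_prediction(coarse_trend, finer_residuals, epsilon))
```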


Page 57: Global soilmap at_afsis_hengl

Some drawbacks

- GM-NRK makes all other DSM efforts in the World redundant(!);
- GM-NRK ignores all other sub-100 m resolution data and mapping efforts;
- It could also delay delivery of soil property maps because the mapping activities would be more difficult to organize internationally.


Page 58: Global soilmap at_afsis_hengl

The best combined spatial predictor

To avoid these difficulties, we propose using a participatory approach to GSM: a combination of GM-NRK and local prediction models. Assuming that at local and global scales independent inputs/models are used to generate predictions, the best combined predictor can be obtained by using:

$$z_{\text{BCSP}}(s_0) = \frac{z_{\text{GM-NRK}}(s_0) \cdot \frac{1}{\mathrm{RMSE}_r(\text{GM-NRK})} + z_{\text{LM}}(s_0) \cdot \frac{1}{\mathrm{RMSE}_r(\text{LM})}}{\sum_{j=1}^{2} \frac{1}{\mathrm{RMSE}_r(M_j)}} \tag{8}$$

where $\mathrm{RMSE}_r$ is the prediction error estimated using cross-validation (Eq. 3).
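Eq. (8) is an inverse-error weighted average, which can be sketched as below. The function is written for any number of models, although the slide uses two (the global GM-NRK model and one local model); the function name and the example predictions and errors are hypothetical.

```python
def best_combined_prediction(predictions: list[float],
                             rmse_values: list[float]) -> float:
    """Eq. (8): average of model predictions weighted by 1 / RMSE_r."""
    if not predictions or len(predictions) != len(rmse_values):
        raise ValueError("need one RMSE value per prediction")
    if any(r <= 0 for r in rmse_values):
        raise ValueError("RMSE values must be positive")
    weights = [1.0 / r for r in rmse_values]
    weighted_sum = sum(w * z for w, z in zip(weights, predictions))
    return weighted_sum / sum(weights)


if __name__ == "__main__":
    # Hypothetical predictions at one location s0:
    z_gm_nrk, rmse_gm_nrk = 10.0, 2.0   # global model, larger error
    z_lm, rmse_lm = 20.0, 1.0           # local model, smaller error
    print(best_combined_prediction([z_gm_nrk, z_lm], [rmse_gm_nrk, rmse_lm]))
```

The model with the smaller cross-validation error pulls the combined value towards its own prediction; with equal errors the result reduces to a plain average.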


Page 59: Global soilmap at_afsis_hengl

The proposed system

[Diagram of the proposed system. Labelled components: regional mapping organization; GlobalSoilMap.net continental nodes; new submission; FTP service (clearing house); automated validation; multiscale prediction model with grids at 5.6 km, 1 km, 250 m and 100 m; spatial aggregation; downscaling; soil property maps as 1 x 1 degree tiles (7 properties, 6 depths), GeoTiff (3 arcsec); PostGIS Raster DB; ISRIC data portal with outputs as WMS (visualization: web browser), KML (visualization: Google Earth) and GeoTiff (analysis: GIS).]
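A rough sizing of one tile follows from the numbers in the diagram. The sketch below assumes that each 1 x 1 degree tile is stored on the 3 arcsec GeoTiff grid and holds one layer per property and depth; the variable names are illustrative.

```r
# Size of one 1 x 1 degree tile on a 3 arc-second grid
arcsec_per_degree <- 3600
cell_arcsec <- 3
cells_per_side  <- arcsec_per_degree / cell_arcsec   # 1200 cells
cells_per_tile  <- cells_per_side^2                  # 1,440,000 cells
layers_per_tile <- 7 * 6                             # 7 properties x 6 depths = 42
values_per_tile <- cells_per_tile * layers_per_tile  # 60,480,000 values
```

So a single tile carries about 60 million predicted values before any uncertainty layers or compression are considered.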

Page 60: Global soilmap at_afsis_hengl

GM-NRK in action: Malawi showcase

I 2740 soil observations, of which some 800-1000 contain complete analytical and descriptive data.

I 1:800k polygon soil map.

I Some 30-40 gridded layers at various resolutions (covariates).

Page 63: Global soilmap at_afsis_hengl

Data sets available for Malawi

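[Figure: three map panels, (a), (b) and (c), of Malawi, with longitude ticks from 33° to 35° and latitude ticks from 10° to 17°. Two legends survive, one from 22000 to 38000 and one from 0.5 to 48.8; the mapped variables and units are not recoverable.]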

Page 64: Global soilmap at_afsis_hengl

Gridded maps for Malawi

[Diagram: covariate layers arranged by grid resolution (5.6 km, 1 km, 250 m, 100 m).

Themes: biomes, climate, parent material, general land use, erosion and deposition, land management.

Layers: MODIS-based long term Land Surface Temperature (day/night); Rainfall map of the world; Elevation; Geologic Provinces of Africa; MODIS (MCD12Q1) land cover dynamics; ENVISAT Land Cover map (GlobCov); MODIS (MCD13Q1) Enhanced Vegetation Index (EVI) and medium infrared band (MIR); TWI, TRI, Slope, Surface roughness, Insolation; Landsat ETM thermal band; Soil polygon map (FAO classes).]
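Working with grids at several resolutions means moving values between them. A minimal base-R sketch of aggregation by block averaging is given below; the function name `aggregate_grid` is hypothetical, the grid is assumed to be a plain matrix whose dimensions are exact multiples of the aggregation factor, and a production workflow would more likely rely on dedicated GIS tools.

```r
# Aggregate a fine grid to a coarser one by averaging non-overlapping blocks.
# m:    numeric matrix holding the fine-resolution grid
# fact: aggregation factor (e.g. 10 to go from 100 m to 1 km cells)
aggregate_grid <- function(m, fact) {
  stopifnot(nrow(m) %% fact == 0, ncol(m) %% fact == 0)
  # block index of every row and column of the fine grid
  row_block <- rep(seq_len(nrow(m) / fact), each = fact)
  col_block <- rep(seq_len(ncol(m) / fact), each = fact)
  # block indices per cell, in the same column-major order as as.vector(m)
  rb <- row_block[as.vector(row(m))]
  cb <- col_block[as.vector(col(m))]
  out <- tapply(as.vector(m), list(rb, cb), mean, na.rm = TRUE)
  unname(out)
}

# Small example: a 4 x 4 grid aggregated by a factor of 2
fine_grid <- matrix(1:16, nrow = 4, ncol = 4)
coarse_grid <- aggregate_grid(fine_grid, 2)
coarse_grid
#      [,1] [,2]
# [1,]  3.5 11.5
# [2,]  5.5 13.5
```

Block averaging preserves the overall mean of the grid, which is convenient when coarse-scale predictions are later compared with fine-scale ones.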

Page 65: Global soilmap at_afsis_hengl

Loading the data

# library(GSIF)
# This library is not yet available, so just load the functions:
> source("http://globalsoilmap.org/data/functions.R")
# load the input data (a binary .RData file is read with load(), not source()):
> load(url("http://globalsoilmap.org/data/malawi.RData"))
> ls()
# mw_soil.utm --- soil polygon map at 1:800k scale;
# malawi.utm --- ca 2000 soil profiles for the whole of Malawi;
# malawi.poly.utm --- country borders (lines);

This will load all the point and polygon data and the R functions required to run this exercise. The input gridded data can be obtained from:

# mode="wb" keeps the zip archive intact on Windows
> download.file("http://globalsoilmap.org/data/malawi_grids.zip",
+   destfile=file.path(getwd(), "malawi_grids.zip"), mode="wb")
# 313 MB

Page 66: Global soilmap at_afsis_hengl

Regression analysis

Fig. 12.5 Mean organic carbon (in permille) in soil predicted at 5 km, 1 km and 250 m resolutions. Values in log-scale (legend from 1.200 to 3.200; scale bar 0 to 100 km).

[...] geology for CLYPPT. At 250 m resolution, the models are again more significant: the predictors explain 18.7% of variability for ORCDRC, 21.1% for PHIHO5, and 26.8% for CLYPPT. The best predictors are: MODIS Medium Infrared band and soil type map for ORCDRC; elevation, EVI maps and soil types for PHIHO5; and again elevation, EVI and soil maps for CLYPPT. At the finest resolution, we use the smallest subset of predictors (DEM derivatives and Landsat thermal infrared band). Consequently, the R-squares are somewhat lower: 5.5% for ORCDRC, 12.1% for PHIHO5 and 9.3% for CLYPPT. The overall best predictors are elevation, Landsat TIR and Topographic Wetness Index (Table 12.2).

Table 12.2 Summary results of regression analysis for three selected soil variables at various scales (case study Malawi).

| Variable name | OSP code | N | Best predictors and R-square (5 km) | Best predictors and R-square (1 km) | Best predictors and R-square (250 m) | Best predictors and R-square (100 m) |
|---|---|---|---|---|---|---|
| Soil organic carbon | ORCDRC | 785 | rainfall, temperature of warmest month (R2=0.315) | elevation (R2=0.213) | MODIS MIR, soil types (R2=0.187) | elevation, Landsat TIR, TRI (R2=0.055) |
| pH | PHIHO5 | 793 | precipitation, LAI, daily LST (R2=0.464) | TWI (R2=0.213) | MODIS EVI, soil types (R2=0.211) | elevation, TWI, TRI (R2=0.121) |
| Clay content | CLYPPT | 756 | soil mapping units, daily LST (R2=0.148) | geological units (R2=0.127) | elevation, MODIS EVI (R2=0.268) | elevation, TWI, devmean (R2=0.093) |
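The R-square values of Table 12.2 can be compared across scales with a few lines of R. The values below are copied from the table; the object names `r2` and `best_scale` are illustrative, and the comparison says only which scale explains most variation for each variable, not how the components add up.

```r
# R-square per variable (rows) and resolution (columns), from Table 12.2
r2 <- rbind(
  ORCDRC = c(0.315, 0.213, 0.187, 0.055),
  PHIHO5 = c(0.464, 0.213, 0.211, 0.121),
  CLYPPT = c(0.148, 0.127, 0.268, 0.093)
)
colnames(r2) <- c("5 km", "1 km", "250 m", "100 m")

# Resolution with the highest R-square for each variable
best_scale <- colnames(r2)[apply(r2, 1, which.max)]
names(best_scale) <- rownames(r2)
best_scale
#  ORCDRC  PHIHO5  CLYPPT
#  "5 km"  "5 km" "250 m"
```

Organic carbon and pH are best explained at the coarsest resolution, while clay content peaks at 250 m.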

It is clear from the results shown in Fig. 12.5 that at each scale different predictors play a different role. These results also confirm that some soil properties, such as clay content, can be better explained using fine-scale predictors (SRTM DEM derivatives), while others, such as organic carbon, are controlled by global (coarse) predictors [...]

Page 67: Global soilmap at_afsis_hengl

Organic carbon (values in log-scale)

[Figure: predicted organic carbon maps at 5 km, 1 km and 250 m resolution; legend from 1.200 to 3.200; scale bar 0 to 100 km.]

Page 68: Global soilmap at_afsis_hengl

pH visualized in GE (1 degree block)

Page 69: Global soilmap at_afsis_hengl

Conclusions

I GSM at 100 m is doable (even without 6M profiles!).

I The multiscale approach allows us to extrapolate over large areas (even to areas where we have no soil data!).

I Selection of covariates and prediction techniques needs to be clearly driven by objective accuracy assessment.

I The point data is the key to GSM: we need to motivate governmental agencies and private persons to contribute to OSP.

I We need to start developing and testing tools: if you have the inputs and the tools to generate outputs, they can be re-generated as many times as you wish.
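An objective accuracy assessment of the kind called for above can be sketched as k-fold cross-validation in base R. The sketch makes several assumptions: the data are simulated, the function name `cv_rmse` is hypothetical, a linear model stands in for the prediction technique, and the relative error is taken as RMSE divided by the standard deviation of the observations, which may differ from the definition used elsewhere in the talk.

```r
# k-fold cross-validation of a linear model.
# Returns the RMSE and the RMSE relative to the standard deviation of the
# observed values (assumed definition of the relative error).
cv_rmse <- function(formula, data, k = 5, seed = 1) {
  set.seed(seed)
  fold <- sample(rep(seq_len(k), length.out = nrow(data)))
  target <- all.vars(formula)[1]
  pred <- numeric(nrow(data))
  for (i in seq_len(k)) {
    test <- fold == i
    fit <- lm(formula, data = data[!test, , drop = FALSE])
    pred[test] <- predict(fit, newdata = data[test, , drop = FALSE])
  }
  rmse <- sqrt(mean((data[[target]] - pred)^2))
  c(RMSE = rmse, RMSE_r = rmse / sd(data[[target]]))
}

# Hypothetical simulated data: organic carbon increasing with elevation
set.seed(42)
d <- data.frame(elev = runif(150, 500, 1500))
d$oc <- 5 + 0.01 * d$elev + rnorm(150, sd = 2)
cv_result <- cv_rmse(oc ~ elev, d)
```

A relative error below 1 indicates that the model predicts better than the overall mean would; the same function can be rerun for each candidate set of covariates to compare them on equal terms.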

Page 74: Global soilmap at_afsis_hengl

GSM products (revisited)

I SoilGrids.org: covariates at 5 km, 1 km (250 m).

I SoilProfiles.org: Open Soil Profiles (once we reach 1M points we should be able to produce soil property maps with reasonable accuracy).

I R/Python package: automated analysis of point and gridded data.

I GSIF: Global Information Facilities for soil data.

Page 75: Global soilmap at_afsis_hengl

Next steps

I Re-implement the method using a `clean' data set (USA data).

I Finalize the blue-paper (technical specs and methods for GSM).

I Package a showcase that anyone can use.

I Set-up web-services (ISRIC servers) and start publishing the data (launch OSP, worldmaps).
