HYDROLOGICAL PROCESSES
Hydrol. Process. 23, 3017–3029 (2009)
Published online 26 August 2009 in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/hyp.7413
Test of statistical means for the extrapolation of soil depth point information using overlays of spatial
environmental data and bootstrapping techniques
Helen E. Dahlke,1* Thorsten Behrens,2 Jan Seibert3,4 and Lotta Andersson5
1 Biological and Environmental Engineering, Cornell University, 165 Riley-Robb Hall, Ithaca, New York, 14853, USA
2 Physical Geography, Institute of Geography, University of Tuebingen, Ruemelinstrasse 19-23, 72070 Tübingen, Germany
3 Department of Geography, University of Zurich, CH-8057 Zurich, Switzerland
4 Department of Physical Geography and Quaternary Geology, Stockholm University, SE-106 91, Stockholm, Sweden
5 Swedish Meteorological and Hydrological Institute, Department of Research and Development, SE-601 76 Norrköping, Sweden
Abstract:
Hydrological modelling depends highly on the accuracy and uncertainty of model input parameters such as soil properties. Since most of these data are field surveyed, geostatistical techniques such as kriging, classification and regression trees or more sophisticated soil-landscape models need to be applied to interpolate point information to the area. Most of the existing interpolation techniques require a random or regular distribution of points within the study area but are not adequate to satisfactorily interpolate soil catena or transect data. The soil-landscape model presented in this study predicts soil information from transect or catena point data using a statistical mean (arithmetic, geometric and harmonic mean) to calculate the soil information based on class means of merged spatial explanatory variables. A data set of 226 soil depth measurements covering a range of 0–6.5 m was used to test the model. The point data were sampled along four transects in the Stubbetorp catchment, SE-Sweden. We overlaid a geomorphology map (8 classes) with digital elevation model-derived topographic index maps (2–9 classes) to estimate the range of error the model produces with changing sample size and input maps. The accuracy of the soil depth predictions was estimated with the root mean square error (RMSE) based on a testing and training data set. RMSE ranged generally between 0.73 and 0.83 m ± 0.013 m depending on the number of classes the merged layers had, but was smallest for a map combination with a low number of classes predicted with the harmonic mean (RMSE = 0.46 m). The results show that the prediction accuracy of this method depends on the number of point values in the sample, the value range of the measured attribute and the initial correlations between point values and explanatory variables, but suggest that the model approach is in general scale invariant. Copyright © 2009 John Wiley & Sons, Ltd.
KEY WORDS soil-landscape modelling; hydrological modelling; soil depth; bootstrapping; soil attributes; soil attribute prediction; statistical mean; root mean square error
Received 18 November 2008; Accepted 16 June 2009
INTRODUCTION
Demand is still growing for digital high-resolution soil information
and for new approaches to capture landscape heterogeneities, both to
improve existing hydrological models and to capture the space–time
variability of hydrological processes. Soil depth is seen as one of the
essential input parameters for distributed hydrological and
environmental modelling. Soil depth, or the depth
from the ground surface to the surface of the bedrock
or an impermeable layer, is seen as a major control
on soil–water storage and availability in many environments
(Tromp-van Meerveld and McDonnell, 2006a).
Soil depth significantly affects spatial soil moisture patterns
(Burt and Butcher, 1985; Freer et al., 2002; Tromp-van Meerveld and
McDonnell, 2006b) as well as subsurface and groundwater flow
(Buttle and McDonald, 2002; Freer et al., 2002; Stieglitz et al., 2003).
* Correspondence to: Helen E. Dahlke, Biological and Environmental Engineering, Cornell University, 165 Riley-Robb Hall, Ithaca, New York, 14853, USA. E-mail: hed23@cornell.edu
Soil depth or depth to bedrock is thus a standard variable used in
many hydrological models such as soil & water assess-
ment tool (SWAT) (Arnold and Fohrer, 2005), distributed
hydrology soil vegetation model (DHSVM) (Wigmosta
et al., 1994), soil moisture distribution and routing model
(SMDR) (Frankenberger et al., 1999) or TOPMODEL
(Beven et al., 1984). To face the growing demand for
high-resolution spatial soil information, so-called quan-
titative soil-landscape methods are applied to extend
conventional soil survey point observations to the land-
scape scale (Ryan et al., 2000; McBratney et al., 2003).
Approaches applied to predict continuous soil attributes
such as soil depth comprise simple linear regression,
kriging and co-kriging (Odeh et al., 1994, 1995; Ryan
et al., 2000), generalized linear models (McKenzie and
Ryan, 1999), discriminant analysis (Sinowski and Auer-
swald, 1999) and landform evolution models (Saco et al.,
2006).
The development of these models has especially been
facilitated by the achieved advances in geographical
information systems (GIS), digital elevation models
(DEM), terrain analysis, statistical analysis and the
increasing computing capacity during the last decade.
Based on differences in the quality and type of field
measurements of soil properties and the availability of
additional spatial environmental explanatory variables,
the available methods can be categorized into continuous and
discrete approaches (Burrough, 1993). Common continuous approaches
analyze the spatial continuity of a specific soil variable based on
the variance of its distribution using geostatistical methods (e.g. kriging)
or they include known environmental information (e.g.
topographic, land use and substrate information) for the
spatial distribution of the soil variable based on a regres-
sion model (Mertens et al., 2002). Discrete approaches
such as Bayesian expert systems model categorical (nom-
inal, ordinal or interval) soil attributes or soil classes
through the integration of soil and landscape information
into a semantic net and/or the definition of logical rules
(Skidmore et al., 1996). Other methods that predict conventionally mapped soil-landscape units are fuzzy logic
approaches (Zhu, 2000) and neural networks (Lehmann
et al., 1999; Behrens et al., 2005) that use learning algo-
rithms to train a network that predicts the desired output
units based on mapped soil units.
Despite the great variety and advances that have been
made in the development of continuous and discrete soil-
landscape models, the approaches have limitations in
their applicability to provide input parameters for distributed
hydrological models. Discrete approaches provide soil information
for spatial entities and hence provide the data structure required
in most distributed or hydrological response unit (HRU)-based
hydrological models. HRUs describe areas of homogeneous hydrological
response based on similar topographical, pedological and
geomorphological characteristics, which are
extracted from an overlay of topographic, soil and land
use data. The concept is based on the assumption that
hydrological processes within a delineated hydrological
response unit show a certain degree of homogeneity and
therefore less variability as compared with surrounding
area units. In comparison to raster-based hydrological
models, it aims to reduce parameterization complexity
and computing time, especially at regional and catchment
scale applications (Flügel, 1995; Leavesley and Stannard, 1995). Following the HRU concept, discrete soil model
approaches effectively facilitate the reduction of the spa-
tial variability of hydrological processes in the landscape
and reduce the time and effort to collect necessary soil
attribute data in a study area (Park and van de Giesen,
2004). However, they bear the risk that the hydrologi-
cal model application is bound to the scale of the pre-
existing conventional soil surveys, which exist mostly in
the range of 1 : 50 000 to 1 : 1 000 000 (e.g. 1 : 1 000 000
in Sweden) and are rather inflexible to scaling of the soil
information (Olsson, 1999; Behrens and Scholten, 2007).
Moreover, the development of soil unit-based quantitative soil
models has reached a degree of complexity in the user expertise and
knowledge required, both on the soil survey and on the model side,
that challenges their short-term applicability as simple tools to
generate soil input data for hydrological models and modellers.
Continuous approaches have the advantage that they are easily
applicable and place few demands on computation software (e.g. they
are implemented in common GIS) and user expertise. However, most of
the geostatistical methods require a large number of samples or
frequent sampling for accurate predictions and bear the problem that
even with established model functions, the capabilities to
extrapolate the results outside the study area or catch-
ment remain limited (Kravchenko, 2003). Geostatistical
methods also assume a certain data structure such as a
regular grid or uniform distribution (Odeh et al., 1994,
1995; Lane, 2002; Kravchenko, 2003; Lyon et al., 2006).
Methods such as kriging, inverse distance weighting (IDW) and
regression trees require a regular or random distribution of the
point data that are scattered over the observation area. However,
transect or catena data are usually not the object of interpolation
techniques, because their spatial representation for a defined area
of interest is limited to the proximate surrounding of the catena
and the incremental distance of the points along the
catena. The application of common interpolation tech-
niques (e.g. kriging and IDW) to catena point data results
in a decrease of the predictive capacity the farther a
point/cell needs to be predicted from the field-measured
points. Typical artefacts such as stripes or facets are pro-
duced in the prediction maps showing the decreasing
ability of the interpolation algorithm to predict in areas
that lack point observations.
The interpolation of soil information sampled with the
catena approach therefore remains a challenge for geostatistical methods and soil-landscape modelling techniques.
Most studies that use catena soil information are, thus,
limited to small-scale applications such as single hill-
slopes and avoid predictions of larger landscape areas.
Most of the interpolation of catena-sampled soil infor-
mation is facilitated through the integration of digital
terrain analysis into the interpolation process (Moore
et al., 1993; Sommer and Schlichting, 1997; Gessler
et al., 2000; Chamran et al., 2002). Statistical correlations
between soil properties (such as soil moisture, net primary
productivity, soil organic carbon, soil texture classes and
especially soil depth) and terrain attributes generated from a DEM
have been investigated since the late 1970s and have greatly
enhanced the quantitative investigation of hydrological processes in soils
(Beven and Kirkby, 1979; O’Loughlin, 1986; Moore
et al., 1991). These studies contribute to the under-
standing of relations between topography, water move-
ment and ecosystem processes and support quantitative
and dynamic modelling of eco-hydrological processes
through the integration of GIS-based terrain analysis and
field observations (Chamran et al., 2002).
This study presents a soil-modelling technique to
extrapolate soil-depth information from four transects
(soil depth understood as depth to bedrock) to a
small catchment in Sweden based on different maps of
explanatory variables. Three statistical means (arithmetic,
geometric and harmonic) are tested to predict soil depths
based on class means derived from an overlay of the
point observations with each class of a geomorphology
and different terrain maps. Using bootstrapping, the
capability of the statistical means to predict soil depths
and the model uncertainty are estimated for different
levels of spatial disaggregation.
SITE DESCRIPTION
The Stubbetorp catchment (58°44′N, 16°21′E) is located
about 120 km southwest of Stockholm in the eastern part
of central South Sweden (Figure 1). The hilly catchment
belongs to the upper part of the Kolmården mountain
ridge, a region dominated by low-weathering gneissic
granites that bounds the northern shore of the deeply
incised bay Bråviken of the Baltic Sea (Wikström, 1979).
The main valley and the two side valleys of Stubbetorp
catchment, which covers an area of 0.94 km²,
are northwest–southeast orientated following the major
fault line in this region. Altitude in the catchment ranges
from 80 m above sea level (asl) at the gauge to 130 m asl.
The Stubbetorp catchment was completely covered with
water after the last deglaciation period (Persson, 1982).
Both glacial ice movements and the action of ocean
waves, which left the top of the hills with little soil cover,
influenced the present geomorphology and topography.
In large parts of the catchment (46%), the bedrock is
covered with till, on which rather conductive, very stony
soils depleted in fine material have usually developed.
The eroded gravel and fine sediments have accumulated
in depressions and in the main valley, where ombrotrophic
peatlands and swamp forests (in total 10.5%) with a
maximum peat depth of 6.5 m occur. The catchment
is largely dominated by podzolic forest soils, whereas
lithosols with rocky outcrops occur especially in
the southeast part of the catchment. The mean slope of the
catchment is 5.9° with a maximum slope of 26° in the area
of the catchment outlet. Most of the catchment is forested
(83%) with Pinus sylvestris and Picea abies of different
age; deciduous tree species are less important and occur
only in the wetland areas. The climate in the catchment is
characterized by a mean annual precipitation of 666 mm
and an annual potential evaporation of 432 mm (period
1985–1994). Mean annual runoff measured for the same
time period was 230 mm (Pettersson, 1995).
MATERIALS
Soil depth measurements
Soil depth measurements (depth to bedrock) were
available for two longer transects (485 m length) crossing
the main valley in the upper part of the catchment
and for two shorter transects in the central (210 m
length) and lower part (120 m length) of the catchment
(Figure 1). These soil depth measurements were obtained
in 1994 using georadar (Olofsson and Fleetwood, 1994).
The derived data set consists of 226 points with an
incremental distance of 5 m with soil depths varying
between zero and 6.5 m (Figure 2).
Figure 1. Study area: Stubbetorp catchment, central-southeast Sweden. Dots indicate locations of soil depth measurements used in this study. Grey areas indicate wetland areas, mapped in July 2005 in the catchment
Table II. Terrain parameters calculated for Stubbetorp catchment and Pearson product–moment correlation coefficients (r) estimated between soil depths and terrain attributes, respectively
Terrain attribute | r
Vertical distance to channel network (Olaya, 2004) | 0.58
Elevation above channel (McGuire et al., 2005) | 0.54
Relative profile curvature (Behrens, 2003) | 0.52
Relative hillslope position (Hatfield, 1999) | 0.42
Minimum curvature (Wood, 1996) | 0.41
Waxing/waning slopes (Huber, 1994) | 0.36
Longitudinal curvature (Wood, 1996) | 0.35
Mean curvature (Shary et al., 2002) | 0.33
Mean curvature (Zevenbergen and Thorne, 1987) | 0.33
Mean curvature (Bolstad et al., 1998) | 0.33
Mean curvature ‘high pass filter’ (Behrens, 2003) | 0.33
Mean curvature (McNab, 1989) | 0.32
True surface distance from streams (Behrens, 2003) | 0.31
Relative aspect curvature (Lehmeier and Kothe, 1992) | 0.28
Profile curvature (Shary et al., 2002) | 0.27
Minimum curvature (Shary et al., 2002) | 0.25
Height above channel (Behrens, 2003) | 0.23
Maximum curvature (Shary et al., 2002) | 0.23
Maximum curvature (Wood, 1996) | 0.16
Horizontal curvature (Shary et al., 2002) | 0.16
Plan curvature (Zevenbergen and Thorne, 1987) | 0.14
Difference curvature (Shary et al., 2002) | 0.12
Solar insolation (Shary et al., 2002) | 0.12
Vertical excess curvature (Shary et al., 2002) | 0.12
Plan curvature (Shary et al., 2002) | 0.09
Surface volume above minimum elevation (Nogami, 1995) | 0.06
Topographic roughness (Behrens, 2003) | 0.06
Surface area (Jenness, 2004) | 0.03
Unsphericity (Shary et al., 2002) | 0.02
Ring-curvature (Shary et al., 2002) | 0.02
Aspect (Moore et al., 1993) | 0.01
Gaussian curvature (Shary et al., 2002) | 0.02
Slope (Horn, 1981) | 0.04
Surface runoff velocity (Moore et al., 1991) | 0.04
Gradient Factor (Shary et al., 2002) | 0.04
Gradient Factor (Behrens, 2003) | 0.04
Total accumulation curvature (Shary et al., 2002) | 0.04
Horizontal excess curvature (Shary et al., 2002) | 0.07
Cross-curvature (Wood, 1996) | 0.09
Rotor curvature (Shary et al., 2002) | 0.11
Reflectance map (Florinsky, 1998) | 0.12
Topographic index (Beven and Kirkby, 1979) | 0.13
Slope-length-factor (Moore et al., 1991) | 0.16
Relative height curvature (Behrens, 2003) | 0.17
Cross-curvature (Moore et al., 1991) | 0.24
Hemispherical dispersion (Hodgson and Gaile, 1999) | 0.26
Longitudinal curvature (Moore et al., 1991) | 0.27
Steepest downslope (Tarboton, 1997) | 0.28
Profile curvature (Zevenbergen and Thorne, 1987) | 0.35
For the final selection of the terrain attributes as
input data sets for the soil model, both a clustering
of the four single terrain parameters into 2 to 9 classes
and parameter combinations of two, three and all four
terrain parameters were tested, resulting in 104 data sets.
Parameter combinations were tested in order to artificially
generate terrain maps with a varying number of classes
whose spatial disaggregation could best explain the spatial
variability of the measured soil depths. Since one of the
aims of this study is to test the model’s applicability to
predict soil depth for various spatially disaggregated input
data sets, the lack of sufficient environmental data sets as
input data in the model was compensated for by terrain maps
with a variable number of classes generated through the
combination of different terrain parameters. To extract the
terrain parameters or parameter combination that showed the
highest class dissimilarity, the F-value of a one-way
analysis of variance was calculated for each terrain data
set. The F-value measures how well the spatial variance of
the classified terrain maps represents the distribution of
soil depth in the catchment and thus whether the terrain map
can be selected as input data set in the conceptual soil
model or not (Table III).
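The class-dissimilarity measure described above can be illustrated with a minimal Python sketch. It computes the F-value of a one-way analysis of variance for soil depths grouped by terrain class. The function name `one_way_f` and the example depths are hypothetical and are not taken from the study; the study's own implementation is not described in the text.

```python
def one_way_f(groups):
    """Return the one-way ANOVA F-value for a list of groups of soil depths."""
    groups = [g for g in groups if g]          # ignore empty classes
    k = len(groups)                            # number of non-empty classes
    n = sum(len(g) for g in groups)            # total number of points
    if k < 2 or n <= k:
        raise ValueError("need at least two non-empty classes and n > k")
    grand_mean = sum(x for g in groups for x in g) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the class means around the grand mean (between classes)
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the points around their own class mean (within classes)
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical soil depths (m) for a terrain map with two classes
shallow = [0.0, 0.2, 0.4]
deep = [1.0, 1.5, 2.0]
f_value = one_way_f([shallow, deep])
```

A larger F-value means the class means differ strongly relative to the scatter inside each class. The sketch does not handle the degenerate case in which every point equals its class mean, where the within-class variation is zero and the ratio is undefined.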
Soil model approach
The soil model approach aims to generate spatial maps of
soil characteristics (in this study: soil depth) based on
catena point information. The approach is applicable to
generate either user-defined discrete landscape units like
entities used in HRUs or semi-continuous raster maps. The
general approach is based on class means resulting from an
overlay of the soil-depth measurements with each class of
any nominal data set (e.g. geomorphology and terrain layer).
The approach assumes that each environmental data set used
in the model represents actual differences in the soil
characteristic to be modelled in an area of interest. The
class means are calculated as arithmetic means over all
points located in spatial units with the same class-id.
Assuming that the catena of soil depth points crosses
several spatial units in each spatial data layer, the
information of each class can be spread over the study site
if overlaid with other spatial data sets and their class
means. Analogous to the regionalization concept
(Diekkrueger et al., 1999), the overlay of two or more
spatial data sets thus results in the disaggregation of the
study site into smaller discrete units whose ‘real’ soil
depth is approached more closely the more data sets are used
in the model and the more strongly the explanatory variables
correlate with the measured soil attribute.
In this study, we tested three statistical means (arith-
metic, geometric and harmonic mean) to predict the soil
depth for Stubbetorp catchment from class means of the
generated terrain maps and the geomorphology map.
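The overlay idea can be sketched in Python as follows. The sketch computes arithmetic class means per layer and then combines the two class means of an overlay unit with one of the three statistical means. All names (`class_means`, `predict_depth`, the `geo` and `terrain` keys) and the example points are hypothetical. Returning zero from the geometric and harmonic mean whenever a class mean is zero is an assumption of this sketch, since the text does not state how zero depths were handled.

```python
import math
from collections import defaultdict


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # Assumption: a zero depth forces the geometric mean to zero.
    if any(v == 0 for v in values):
        return 0.0
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean(values):
    # Assumption: a zero depth forces the harmonic mean to zero
    # (the limit as one value approaches zero).
    if any(v == 0 for v in values):
        return 0.0
    return len(values) / sum(1.0 / v for v in values)


def class_means(points, layer):
    """Arithmetic mean depth of all points sharing a class id in one layer."""
    depths = defaultdict(list)
    for point in points:
        depths[point[layer]].append(point["depth"])
    return {class_id: arithmetic_mean(d) for class_id, d in depths.items()}


def predict_depth(geo_id, terrain_id, geo_means, terrain_means, combine):
    """Combine the two class means of one overlay unit; None for empty classes."""
    if geo_id not in geo_means or terrain_id not in terrain_means:
        return None
    return combine([geo_means[geo_id], terrain_means[terrain_id]])


# Hypothetical transect points: depth in metres plus class ids of two layers
points = [
    {"depth": 0.0, "geo": 1, "terrain": 1},
    {"depth": 0.4, "geo": 1, "terrain": 2},
    {"depth": 2.0, "geo": 2, "terrain": 2},
    {"depth": 6.0, "geo": 2, "terrain": 2},
]
geo_means = class_means(points, "geo")          # class 1: 0.2 m, class 2: 4.0 m
terrain_means = class_means(points, "terrain")  # class 1: 0.0 m, class 2: about 2.8 m
for combine in (arithmetic_mean, geometric_mean, harmonic_mean):
    print(combine.__name__, predict_depth(1, 2, geo_means, terrain_means, combine))
```

For the same pair of positive class means, the harmonic mean gives the smallest and the arithmetic mean the largest estimate, so the choice of mean matters most where a shallow class overlaps a deep one.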
Model fitting and validation
Table III. Variability of F-values as measure of class dissimilarity in mean soil depth tested for all possible terrain parameters and parameter combinations of the cluster analysis
Parameters combined | Terrain parameter combination | 2 classes | 3 classes | 4 classes | 5 classes | 6 classes | 7 classes | 8 classes | 9 classes
1 | vd | 11.27 | 9.86 | 8.18 | 8.10 | 6.84 | 8.11 | 7.30 | 7.51
1 | rhp | 37.28 | 27.54 | 17.14 | 17.66 | 13.10 | 16.46 | 10.72 | 9.61
1 | eac | 46.85 | 24.13 | 19.33 | 17.07 | 17.41 | 16.86 | 19.29 | 17.12
1 | rpc | 38.81 | 55.78 | 24.28 | 20.36 | 23.75 | 18.90 | 15.68 | 13.37
2 | eac/rhp | 46.85 | 24.13 | 21.94 | 17.07 | 21.66 | 25.45 | 27.62 | 14.87
2 | eac/rpc | 40.10 | 52.91 | 30.56 | 28.46 | 25.82 | 25.45 | 27.62 | 25.64
2 | eac/vd | 46.18 | 25.74 | 21.76 | 16.85 | 17.82 | 15.89 | 15.33 | 13.59
2 | rhp/rpc | 39.53 | 59.21 | 25.98 | 23.64 | 24.81 | 19.45 | 15.14 | 16.30
2 | vd/rhp | 22.18 | 11.41 | 19.61 | 19.92 | 20.72 | 11.14 | 10.16 | 9.45
2 | vd/rpc | 39.53 | 53.20 | 26.80 | 21.81 | 25.13 | 19.46 | 16.23 | 15.04
3 | eac/vd/rpc | 40.10 | 53.15 | 30.56 | 25.99 | 25.89 | 26.29 | 23.18 | 24.02
3 | eac/vd/rhp | 46.18 | 25.74 | 22.51 | 19.99 | 19.99 | 17.73 | 17.78 | 14.99
3 | eac/rhp/rpc | 39.74 | 52.91 | 30.56 | 25.99 | 25.89 | 25.45 | 25.58 | 21.68
4 | eac/rhp/rpc/vd | 8.95 | 52.91 | 30.56 | 26.94 | 28.44 | 21.66 | 23.18 | 21.69
Note: Vd, vertical distance to channel network; eac, elevation above channel; rpc, relative profile curvature; rhp, relative hillslope position. The higher the F-value, the better is the class separation of the arithmetic class means. The highest F-value reached for each group of classes is highlighted in bold.
Table IV. Number of soil depth points in each class of the raster maps
Number of classes | Raster maps used in the overlay | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 | Class 6 | Class 7 | Class 8 | Class 9 | Total
2 | eac | 33 | 193 | - | - | - | - | - | - | - | 226
3 | rhp rpc | 24 | 55 | 147 | - | - | - | - | - | - | 226
4 | eac rpc | 101 | 80 | 11 | 34 | - | - | - | - | - | 226
5 | eac rpc | 71 | 94 | 34 | 7 | 20 | - | - | - | - | 226
6 | eac vd rhp rpc | 5 | 17 | 69 | 45 | 12 | 80 | - | - | - | 226
7 | eac vd rpc | 86 | 30 | 15 | 68 | 5 | 7 | 17 | - | - | 226
8 | eac rpc | 49 | 47 | 0 | 20 | 67 | 22 | 14 | 7 | - | 226
9 | eac rpc | 64 | 14 | 11 | 15 | 45 | 53 | 4 | 0 | 20 | 226
8 | geomorphology | 12 | 3 | 40 | 83 | 37 | 17 | 30 | 4 | - | 226
Note: Vd, vertical distance to channel network; eac, elevation above channel; rpc, relative profile curvature; rhp, relative hillslope position.
The set of 226 soil depth points was split into training
and testing data sets of pre-defined size to evaluate the
spatial soil depth predictions and the model error of the
different soil models. To estimate the model performance,
we applied a bootstrapping technique. Bootstrapping is a
statistical method to estimate standard errors by repeated
sampling with replacement (Efron, 1981).
In this study, we used bootstrapping to estimate the root
mean square error (RMSE) between predicted soil depths
calculated from the training set and the soil depths of the
testing data set, which were used as expected values. Although the
original data set was split into equally sized training and
testing data sets (113/113 points), we expected the RMSE
to be largely influenced by the sample size of some of
the raster map classes. Some of the terrain maps with
a high number of classes contain a low number of soil
depth points or even no soil depth points (empty classes)
(Table IV). Due to the large data range of measured soil
depths, the sample mean of these classes and the RMSE
are greatly influenced by the values picked during the
bootstrapping.
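The resampling procedure can be sketched in Python as below. This is a simplified stand-in, not the study's procedure: it assumes that each iteration draws a fresh random split into training and testing sets, uses a single layer of class ids instead of an overlay of two maps, and predicts with the arithmetic class mean. The names `resampled_rmse` and `fit_class_means` and the example data are hypothetical.

```python
import math
import random
from collections import defaultdict


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


def fit_class_means(training):
    """Arithmetic mean depth per class id from (class_id, depth) pairs."""
    depths = defaultdict(list)
    for class_id, depth in training:
        depths[class_id].append(depth)
    return {class_id: sum(d) / len(d) for class_id, d in depths.items()}


def resampled_rmse(points, n_train, iterations, seed=0):
    """Mean and standard deviation of the RMSE over repeated random splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(iterations):
        shuffled = list(points)
        rng.shuffle(shuffled)
        training, testing = shuffled[:n_train], shuffled[n_train:]
        means = fit_class_means(training)
        # Testing points whose class has no training point cannot be predicted
        pairs = [(means[c], d) for c, d in testing if c in means]
        if pairs:
            predicted, observed = zip(*pairs)
            scores.append(rmse(predicted, observed))
    if not scores:
        raise ValueError("no iteration produced a comparable prediction")
    mean = sum(scores) / len(scores)
    spread = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return mean, spread


# Hypothetical (class_id, depth in m) pairs standing in for transect data
data_rng = random.Random(42)
points = [(1, data_rng.uniform(0.0, 1.0)) for _ in range(20)]
points += [(2, data_rng.uniform(1.0, 6.5)) for _ in range(20)]
mean_rmse, rmse_spread = resampled_rmse(points, n_train=20, iterations=200)
```

Classes with very few points illustrate the sensitivity noted above: when a class holds only a handful of depths, the class mean fitted in one iteration can differ strongly from the next, which widens the spread of the RMSE.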
We calculated the RMSE for different scenarios to
estimate the quality of the predicted soil depth maps
using bootstrapping and 5000 iterations for each test. In
detail, we tested three different scenarios for validation
and calculated the RMSE as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2}$$
where x_i is the estimated soil depth calculated from the
arithmetic class means of two classes when combining
two input maps using one of the statistical means,
y_i is a soil depth point of the testing data set and n is
the number of compared pairs. The three
measures for validation were the following:
1. The RMSE was calculated between the estimated
soil depth (x_i) of a certain class combination of the
training data set and each of the respective soil
depth points of the testing set (y_i) of exactly the same
class combination, in the following referred to as the
RMSE_{single value}.
2. The RMSE was calculated based on the estimated soil
depth (x_i) of a class combination of the training data set
and the class average of the soil depth points of either
the geomorphology or the terrain map class (y_i) that
the class combination consists of, in the following
referred to as the RMSE_{class value}.
3. The RMSE calculated on class level was averaged to
a total RMSE of a given map combination (e.g.
geomorphology and 2-classes terrain map) to compare
the quality of the different dissolved soil depth maps,
for the remainder of this article defined as RMSE_{total}.
RESULTS
Terrain layer selection and classification
F-values were calculated for all combinations of terrain
attributes. For each class category (number of classes),
the highest F-value was determined and the terrain map
showing the best class separation was selected among all
raster maps. Table III shows F-values obtained for all
single terrain parameters and terrain parameter
combinations. The best F-value reached for each class
category is highlighted in bold. If more than one terrain
map reached the best F-value, we chose the terrain map with
the lowest number of combined terrain parameters on the
basis of Ockham’s razor (Wolpert, 1990).
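The selection rule can be written compactly in Python. The sketch below picks the combination with the highest F-value and breaks ties in favour of the combination with the fewest terrain parameters. The F-values are a subset of those reported in Table III for 2, 3 and 4 classes; the function name `select_terrain_map` and the dictionary layout are hypothetical.

```python
def select_terrain_map(f_values):
    """Pick the combination with the highest F-value; ties go to fewer parameters."""
    return max(f_values, key=lambda combo: (f_values[combo], -len(combo)))


# F-values from Table III for selected combinations (2, 3 and 4 classes)
two_classes = {
    ("eac",): 46.85,
    ("eac", "rhp"): 46.85,
    ("eac", "vd"): 46.18,
    ("rpc",): 38.81,
}
three_classes = {
    ("rpc",): 55.78,
    ("rhp", "rpc"): 59.21,
    ("vd", "rpc"): 53.20,
    ("eac", "vd", "rpc"): 53.15,
}
four_classes = {
    ("eac", "rpc"): 30.56,
    ("eac", "vd", "rpc"): 30.56,
    ("eac", "rhp", "rpc"): 30.56,
    ("eac", "rhp", "rpc", "vd"): 30.56,
}
selected = [select_terrain_map(t) for t in (two_classes, three_classes, four_classes)]
```

With these values the rule returns eac for 2 classes, rhp with rpc for 3 classes and eac with rpc for 4 classes, which agrees with the raster maps listed in Table IV for those class counts.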
Variability of soil depth measurements and class
combinations
Table V summarizes the available number of soil depth
points for each class combination when the geomorphology
map is merged with a terrain map, assuming all 226
soil depth points in the model. However, with respect
to the three validation scenarios stated in the section
‘Model fitting and validation’, the best validation method
of the estimated soil depths is to compare the estimated
soil depth of an area to soil depth points that are located
in exactly the same area. Since the soil depth points in our
study show a non-uniform distribution over the catchment
(see Figure 1), the estimated soil depth can only be
verified directly for a few class combinations with soil
depth points located in exactly the same area. Table V
summarizes the maximal number of available points for each
class combination to estimate the soil depths that can be
directly or indirectly validated with soil depths from the
testing data set. The 8-classes and the 9-classes terrain
maps both have ‘empty’ classes (class 3 of the 8-classes
eac rpc map; class 8 of the 9-classes eac rpc map) and
contain no representative soil depth points for the
calculation of a class mean (Table IV).
Soil model test using bootstrapping
Results of the total RMSE averaged over 5000 bootstrapping
iterations using the harmonic mean are shown in Figure 3.
Tests of the arithmetic and geometric mean to predict soil
depths for each geomorphology and terrain map class
combination were also performed. However, the results of the
total RMSE, RMSE_{class value} and RMSE_{single value}
indicated a poorer performance of these two statistical
means as predictors, compared with the harmonic mean. Both
showed in general higher RMSE in all validation scenarios
and predicted lower soil depth ranges in the output maps
compared with the original measurements and the predictions
made with the harmonic mean. The 226 point observations of
soil depth ranged from 0 to 6.5 m. The use of the arithmetic
mean to calculate class means would have resulted in
non-zero values and would have caused a bias of predicted
soil depth in areas (e.g. bare soil areas) where the
majority of soil depth points are zero. Initial tests
computing the coefficient of determination between the
predicted soil depth maps and class means of the original
soil depth measurements resulted in the lowest coefficients
for the maps predicted with the arithmetic mean
(max. R² = 0.60) and the highest coefficients for the maps
predicted with the harmonic mean (max. R² = 0.73).
Consequently, only assessments based on the harmonic mean
were selected for further analyses.
The different map combinations shown in Figure 3
resulted in similar mean RMSE values for the compared
statistical means, with slightly decreasing RMSE values
with increasing number of classes. The means of the
calculated total RMSE values decrease from approximately
0.82 m (12.6% of the total data range) for the 2-classes
terrain map combination to about 0.73 m (11.1% of total
data range) for the 9-classes terrain map combination.
For convenience, the number of classes in the respective
terrain maps is used in the remaining sections to
distinguish the tested map combinations in further
interpretations.
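As a quick check of the percentages quoted above, the mean total RMSE can be divided by the measured soil depth range of 6.5 m. The second ratio uses the rounded value of 0.73 m, so its small difference from the quoted 11.1% is presumably a rounding effect; that reading is an assumption and is not stated in the text.

```latex
\frac{0.82\ \mathrm{m}}{6.5\ \mathrm{m}} \approx 0.126 = 12.6\%, \qquad
\frac{0.73\ \mathrm{m}}{6.5\ \mathrm{m}} \approx 0.112 = 11.2\%
```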
Validation results of the single-RMSE (RMSE_{single value}),
class-RMSE (RMSE_{class value}) and a
comparison of estimated and predicted soil depths are
shown for the harmonic mean in Figure 4. For the
majority of the estimated soil depths, the single and
Figure 3. Box-and-whisker plot of total RMSE reached for the harmonic mean and different map combinations. The RMSE are sorted according to the number of classes of the terrain map used in the overlay with the geomorphology map. The diagram shows for each map combination the median, the upper and lower quartile and the smallest and largest observed RMSE during the 5000 bootstrapping iterations
Table V. Maximum number and number of exactly located soil depth points available for the prediction of soil depths for eachgeomorphology-terrain map combination based on all 226 points
Maximal Available Number of Points Exactly Located Points
Number of TerrainClasses Maps Geomorphology Geomorphology
Class id 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
12 3 40 83 37 17 30 4
2 eac1 33 45 36 73 116 70 50 63 37 3 7 19 4
2 193 205 196 233 276 230 210 223 197 12 3 37 76 18 17 3
3 rhp rpc1 24 36 27 64 107 61 41 54 28 16 82 55 67 58 95 138 92 72 85 59 12 5 4 2 6 263 147 159 150 187 230 184 164 177 151 3 35 63 27 11 4 4
4 eac rpc
1 101 113 104 141 184 138 118 131 105 4 1 23 4 3 13 16 12 80 92 83 120 163 117 97 110 84 2 13 31 31 33 11 23 14 51 94 48 28 41 15 9 24 34 46 37 74 117 71 51 64 38 8 4 3 1 4 14
5 eac rpc
1 71 83 74 111 154 108 88 101 75 13 28 28 22 94 106 97 134 177 131 111 124 98 4 3 21 37 13 163 34 46 37 74 117 71 51 64 38 8 4 3 1 4 144 7 19 10 47 90 44 24 37 11 5 25 20 32 23 60 103 57 37 50 24 2 1 6 2
6 eac vd rhp rpc
1 5 17 8 45 88 42 22 35 9 2 12 17 29 20 57 100 54 34 47 21 6 1 2 83 69 81 72 109 152 106 86 99 73 6 1 19 1 13 24 45 57 48 85 128 82 62 75 49 7 19 17 25 12 24 15 52 95 49 29 42 16 2 2 6 26 80 92 83 120 163 117 97 110 84 2 12 5 12 2 2
7 eac vd rpc
1 86 98 89 126 169 123 103 116 90 2 15 55 1 2 22 30 42 33 70 113 67 47 60 34 4 18 83 15 27 18 55 98 52 32 45 19 6 1 2 64 68 80 71 108 151 105 85 98 72 6 1 18 8 13 225 5 17 8 45 88 42 22 35 9 2 16 7 19 10 47 90 44 24 37 11 3 3 1
7 17 29 20 57 100 54 34 47 21 14 3
8 eac rpc
1 49 61 52 89 132 86 66 79 53 2 13 23 112 47 59 50 87 130 84 64 77 51 5 1 7 6 8 23 0 12 174 20 32 23 60 103 57 37 50 24 2 14 45 67 79 70 107 150 104 84 97 71 1 17 38 7 46 22 34 25 62 105 59 39 52 26 14 87 14 26 17 54 97 51 31 44 18 6 2 68 7 19 10 47 90 44 24 37 11 3 4
9 eac rpc
1 64 76 67 104 147 101 81 94 68 1 17 33 9 42 14 26 17 54 97 51 31 44 18 6 2 63 11 23 14 51 94 48 28 41 15 9 24 15 27 18 55 98 52 32 45 19 2 7 4 25 45 57 48 85 128 82 62 75 49 5 1 7 6 6 2
6 53 65 56 93 136 90 70 83 57 2 13 23 157 4 16 7 44 87 41 21 34 8 1 38 0 12 179 20 32 23 60 103 57 37 50 24 14 6
3 40 83 37 30 4
3 40 83 37 30 4
Note: N is the maximum number of soil depth points located in each class of each map. Light grey highlighted cells show class combinations whose estimated soil depths can be validated directly with soil depth points that are located in exactly the same class combination. Dark grey highlighted cells indicate class combinations that do not comprise direct validation points, but that can be compared with the class mean of the testing data set. Black cells highlight class combinations that occur in the final prediction maps, but whose soil depths cannot be calculated due to a lack of soil depth points located in one or both of the combined classes (empty classes). White cells highlight class combinations that do not occur in the final prediction maps.
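Judging from the rows of Table V, the maximal available number of points of a class combination appears to be the sum of N of the terrain class and N of the geomorphology class. A minimal Python sketch of this reading, using the geomorphology counts and the two classes of the 2-class terrain map as they appear in the table (the function name is hypothetical):

```python
# N per geomorphology class (classes 1 to 8), as listed in Table V.
GEOMORPHOLOGY_N = [12, 3, 40, 83, 37, 17, 30, 4]

# N per class of the 2-class terrain map, as listed in Table V.
TERRAIN_2_N = {1: 33, 2: 193}


def max_available_points(terrain_n, geomorphology_n):
    """Assumed rule: points of the terrain class plus points of each geomorphology class."""
    return [terrain_n + g for g in geomorphology_n]


for class_id, n in TERRAIN_2_N.items():
    print(class_id, n, max_available_points(n, GEOMORPHOLOGY_N))
```

Both the geomorphology counts and the terrain counts add up to the 226 points of the data set, which supports this interpretation of N.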
class RMSE stay in the range of the calculated total RMSE and the data set’s standard deviation of 1.09 m. The single and class RMSE exceed the mean total RMSE for estimated soil depths greater than 1 m. This was expected considering the value range of measured (0–1 m) and predicted soil depths (0–0.54 m) (Figure 4b). RMSEclass values are generally larger than RMSEsingle values because of the greater data range resulting from the comparison of an estimated soil depth point to the mean soil depth of a layer class. The small RMSEsingle values indicate that the soil depths estimated with the harmonic mean differ only slightly from the soil depths actually measured in the area of a certain class.
Figure 4. Comparison of RMSE and estimated soil depths calculated with the harmonic mean. Diagram (a) shows the estimated soil depths (black dots) and the results of the two validation scenarios, RMSEsingle value (black crosses) and RMSEclass value (grey diamonds). The RMSEsingle values result from a comparison of the estimated soil depth of a certain class combination to validation points located in areas with the same class combination, and the RMSEclass values show the comparison of the estimated soil depth of a certain class combination to all validation points of either one of the combined classes in a map combination. Diagram (b) shows a comparison of minimum and maximum estimated soil depths predicted with the testing and training data set
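The difference between the two validation scenarios can be made concrete with a short Python sketch. It follows the description given for Figure 4: RMSEsingle uses only validation points lying in the same class combination, whereas RMSEclass uses all validation points lying in either of the two combined classes. The tuple layout, function names and depth values are hypothetical.

```python
import math


def rmse(estimate, observed):
    """Root mean square error of one estimate against observed depths."""
    return math.sqrt(sum((estimate - o) ** 2 for o in observed) / len(observed))


def rmse_single(estimate, points, geo, terrain):
    """Validate against points located in exactly the same class combination."""
    observed = [d for g, t, d in points if g == geo and t == terrain]
    return rmse(estimate, observed) if observed else None


def rmse_class(estimate, points, geo, terrain):
    """Validate against all points located in either of the combined classes."""
    observed = [d for g, t, d in points if g == geo or t == terrain]
    return rmse(estimate, observed) if observed else None


# Hypothetical validation points: (geomorphology class, terrain class, depth in m).
validation = [(1, 1, 0.5), (1, 1, 0.7), (1, 2, 1.5), (2, 1, 2.0), (2, 2, 3.0)]
print(rmse_single(0.6, validation, 1, 1))
print(rmse_class(0.6, validation, 1, 1))
```

Because the second scenario pools points from a wider set of locations, its observed depths span a larger range, which is consistent with the larger RMSEclass values reported above.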
Predicted soil depth maps
Maps of estimated soil depth were generated with the harmonic mean for each map combination (Figure 5). The predicted soil depth maps show an increasing degree of spatial disaggregation the more classes the spatial data sets in the overlay process have. The number of entities in the prediction maps increases from 128 for the overlay of the geomorphology map with the 2-class terrain map to 1438 for the overlay with the 9-class terrain map. Similarly, the size of the largest spatial entity in the predicted soil maps decreases from 160 800 m² to 32 300 m². Soil depth maps predicted with the 8- or 9-class terrain layer exhibit ‘empty’ or ‘no-data’ areas, where the soil depth cannot be modelled, because both terrain layers lack soil depth points in one of their classes to calculate the class mean. The ‘no-data’ areas cover 0.034 km² in the geomorphology/8-terrain classes map and 0.031 km² in the geomorphology/9-terrain classes map. These areas equal 3.9% and 3.3% of the catchment area (0.942 km²), respectively.
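How a class combination receives its estimate, and why some combinations stay empty, can be sketched in Python as follows. The sketch assumes that the estimate is the harmonic mean of the two class means (geomorphology class and terrain class) computed from the training points, and that a combination is left as no-data when either class holds no training point. The record layout, function names and depth values are hypothetical.

```python
from statistics import harmonic_mean, mean


def class_means(points, key):
    """Arithmetic mean depth per class of one input map (assumed class statistic)."""
    groups = {}
    for p in points:
        groups.setdefault(p[key], []).append(p["depth"])
    return {cls: mean(values) for cls, values in groups.items()}


def predict(geo, terrain, geo_means, terrain_means):
    """Harmonic mean of the two class means, or None for an 'empty' class."""
    if geo not in geo_means or terrain not in terrain_means:
        return None                       # no training points: no-data area
    return harmonic_mean([geo_means[geo], terrain_means[terrain]])


# Hypothetical training points (depth in m).
training = [
    {"geo": 1, "terrain": 1, "depth": 0.5},
    {"geo": 1, "terrain": 2, "depth": 1.5},
    {"geo": 2, "terrain": 2, "depth": 4.5},
]
geo_means = class_means(training, "geo")
terrain_means = class_means(training, "terrain")
print(predict(1, 2, geo_means, terrain_means))
print(predict(1, 3, geo_means, terrain_means))
```

The harmonic mean is dominated by the smaller of the two class means, so a shallow class pulls the estimate of a combination towards shallow depths.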
Soil depth maps with a higher degree of spatial disaggregation also show a greater range of predicted soil depths. With an increasing number of terrain classes included in the predicted soil depth map, the minimum soil depth decreased from 0.50 m to 0.31 m, whereas the maximum and average soil depths increased from 2.24 m to 3.04 m and from 1.2 m to 1.68 m, respectively (Figure 6).
RMSE were calculated between the soil catena points and the cell values in the soil depth prediction maps to identify the most suitable soil depth prediction map (Table VI). The combination of the geomorphology map with the 2-terrain-classes map reached the best coefficients among all map combinations and tested statistical means. The lowest RMSE (RMSE = 0.46 m) was reached for the geomorphology/2-terrain classes map predicted with the harmonic mean, which also showed the highest R². The second lowest RMSE (RMSE = 0.61 m) was reached for the geomorphology/5-terrain classes map. The prediction error of these two map combinations was less than 10% of the overall soil depth range measured in the catchment.
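The 10% statement can be checked with simple arithmetic. The Python sketch below relates the two lowest RMSE values to the measured soil depth range of 0–6.5 m given in the abstract; the function name is hypothetical.

```python
def relative_error_percent(rmse_m, depth_range_m):
    """RMSE expressed as a percentage of the measured depth range."""
    return 100.0 * rmse_m / depth_range_m


DEPTH_RANGE_M = 6.5 - 0.0   # measured soil depths cover 0 to 6.5 m

for label, value in [("geomorphology/2-terrain classes", 0.46),
                     ("geomorphology/5-terrain classes", 0.61)]:
    share = relative_error_percent(value, DEPTH_RANGE_M)
    print(f"{label}: RMSE = {value} m, {share:.1f}% of the depth range")
```

Both values stay below the 10% threshold, with roughly 7% for the 2-class and roughly 9% for the 5-class terrain map.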
DISCUSSION
The R² reached in the soil depth prediction maps agrees well with accuracies achieved for most quantitative spatial soil models (Beckett and Webster, 1971; Ryan et al., 2000). According to Beckett and Webster (1971), R² values greater than 0.7 are unusual for most spatial models and R² values of 0.5 or less are common. In this study, the RMSE of the final soil depth prediction maps showed an error smaller than 10% of the data range. This shows that the presented soil model approach provides a method that is easy to apply in terms of computational requirements and that predicts the spatial variability of soil depth more accurately than a single explanatory variable.
Figure 5. Maps of estimated soil depths using the harmonic mean as prediction model. Each map shows a map combination of the geomorphology layer consisting of 8 classes and a terrain layer with a varying number of classes (2–9 terrain classes). Grey areas indicate ‘empty’ classes, where soil depths could not be estimated due to lacking point data in the training data set
Figure 6. Comparison of minimum, maximum and mean soil depth for each produced soil depth map using the harmonic mean. Statistics are sorted according to the number of classes in the terrain map used in the overlay with the geomorphology map
The fact that the geomorphology/2-terrain classes map reached the lowest RMSE among all tested statistical means was unexpected, because both the value range of estimated soil depths and the degree of spatial disaggregation were smaller in the final prediction map than in map combinations with more classes. However, this can be explained by the clustering approach that was used to reclassify the terrain attributes to generate the second input layer for the overlay process. The k-means clustering algorithm used in this study randomly generates k clusters from the continuous terrain attribute maps. The final location and size of the clusters are, however, statistically determined by the convergence criterion that needs to be met for each cluster (Hartigan and Wong, 1979). The terrain classes resulting from the clustering depend on statistical differences in topography, but might not reflect the actual soil depth variability in the watershed.
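The interplay of random initialisation and convergence can be illustrated with a small one-dimensional example in Python. The sketch uses plain Lloyd iterations, not the Hartigan and Wong (1979) algorithm cited above, and the topographic index values, the function name and its parameters are hypothetical.

```python
import random


def kmeans_1d(values, k, seed=0, max_iter=100):
    """Cluster scalar attribute values into k classes with Lloyd iterations."""
    rng = random.Random(seed)
    centres = rng.sample(values, k)               # random initial cluster centres
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assign every value to its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Move every centre to the mean of its members (keep it if it has none).
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:                # convergence: centres stopped moving
            break
        centres = new_centres
    return labels, centres


# Hypothetical topographic index values of six grid cells.
index_values = [0.1, 0.2, 0.15, 5.0, 5.2, 5.1]
cell_labels, cluster_centres = kmeans_1d(index_values, k=2, seed=3)
print(cell_labels, sorted(cluster_centres))
```

The class boundaries follow purely from the distribution of the attribute values: no soil depth observation enters the clustering.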
An expert-based differentiation and reclassification of the terrain attributes as input layer is therefore suggested for future applications.
Although the best RMSE suggests that the soil depth map with the lowest disaggregation is the best choice for further applications, a user who desires a higher spatial disaggregation has to balance the prediction accuracy against the number of classes used in the overlay process. The use of input layers with more classes may lower the probability that the layer class means (e.g. soil depth) can be calculated. The overlay of several explanatory variables with a low number of classes will likely increase the probability of complete coverage in the prediction maps and of higher prediction accuracies. However, where unpredictable areas occur, post-processing is needed to complete the soil depth information. Several approaches can be applied such as taking only the information from one of the
Nogami M. 1995. Geomorphometric measures for digital elevation models. Zeitschrift für Geomorphologie, N.F. Suppl. Bd. 101: 53–67.
Olofsson B, Fleetwood A. 1994. Georadarundersokningar i Stubbetorpsomradet, Kolmarden. Avd for mark-och vattenresurser, KTH.
O’Loughlin EM. 1986. Prediction of surface saturation zones in natural catchments by topographic analysis. Water Resources Research 22: 794–804.
Odeh IOA, McBratney AB, Chittleborough DJ. 1994. Spatial prediction of soil properties from landform attributes derived from a digital elevation model. Geoderma 63(3–4): 197–214.
Odeh IOA, McBratney AB, Chittleborough DJ. 1995. Further results on prediction of soil properties from terrain attributes: heterotopic cokriging and regression kriging. Geoderma 67(3–4): 215–226.
Olaya V. 2004. A Gentle Introduction to SAGA GIS, 1.1 edn, 216.
Olsson M. 1999. Soil Survey in Sweden. In Soil Resources of Europe, Bullock P, Jones RJA, Montanarella L (eds). European Soil Bureau Research Report No.6, EUR 18991 EN. Office for Official Publications of the European Communities: Luxembourg; 202.
Park SJ, van de Giesen N. 2004. Soil-landscape delineation to define spatial sampling domains for hillslope hydrology. Journal of Hydrology 295: 28–46.
Persson C. 1982. Beskrivning till jordartskartan Katrineholm SO, Serie Ae, Nr 46. SGU (description of the soil map Katrineholm SO, in Swedish).
Pettersson O. 1995. Vattenbalans för fältforskningsområden. SMHI Hydrologi, No 59. SMHI; 21 (Water balance for field research areas, in Swedish).
Quinn T, Zhu AX, Burt JE. 2005. Effects of detailed soil spatial information on watershed modeling across different model scales. International Journal of Applied Earth Observation and Geoinformation 7: 324–338.
Ryan PJ, McKenzie NJ, O’Connell D, Loughhead AN, Leppert PM, Jacquier D, Ashton L. 2000. Integrating forest soils information across scales: spatial prediction of soil properties under Australian forests. Forest Ecology and Management 138: 139–157.
Saco PM, Willgoose GR, Hancock GR. 2006. Spatial organization of soil depths using a landform evolution model. Journal of Geophysical Research 111: 14, F02016. DOI:10.1029/2005JF000351.
Shary PA, Sharaya LS, Mitusov AV. 2002. Fundamental quantitative methods of land surface analysis. Geoderma 107: 1–35.
Sinowski W, Auerswald K. 1999. Using relief parameters in a discriminant analysis to stratify geological areas with different spatial variability of soil properties. Geoderma 89(1–2): 113–128.
Skidmore AK, Watford F, Luckananurug P, Ryan PJ. 1996. An operational GIS expert system to map forest soils. Photogrammetric Engineering and Remote Sensing 62: 501–511.
Sommer M, Schlichting E. 1997. Archetypes of catenas in respect to matter – a concept for structuring and grouping catenas. Geoderma 76: 1–33.
Stieglitz M, Shaman J, McNamara J, Engel V, Shanley J, Kling GW. 2003. An approach to understanding hydrologic connectivity on the hillslope and implications for nutrient transport. Global Biogeochemical Cycles 17(4): 1105. DOI:10.1029/2003GB002041.
Tarboton DG. 1997. A new method for the determination of flow directions and upslope areas in grid digital elevation models. Water Resources Research 33(2): 309–319.
Tromp-van Meerveld HJ, McDonnell JJ. 2006a. Threshold relations in subsurface stormflow: 1. A 147-storm analysis of the Panola hillslope. Water Resources Research 42: W02410.
Tromp-van Meerveld HJ, McDonnell JJ. 2006b. On the interrelations between topography, soil depth, soil moisture, transpiration rates and species distribution at the hillslope scale. Advances in Water Resources 29: 293–310.
Wigmosta MS, Vail L, Lettenmaier DP. 1994. A distributed hydrology-vegetation model for complex terrain. Water Resources Research 30: 1665–1679.
Wikstrom A. 1979. Beskrivning till berggrundskartan Katrineholm SO, Serie Af, Nr 123. SGU (Description of the geological map, Katrineholm SO, in Swedish).
Wolpert D. 1990. The relationship between Occam’s razor and convergent guessing. Complex Systems 4: 319–368.
Wood J. 1996. The Geomorphological Characterisation of Digital Elevation Models. PhD Thesis. Department of Geography, University of Leicester.
Zevenbergen LW, Thorne CR. 1987. Quantitative analysis of land surface topography. Earth Surface Processes and Landforms 12(1): 47–56.
Zhu AX. 2000. Mapping soil landscape as spatial continua: the neural network approach. Water Resources Research 36: 663–677.