

Forest Ecology and Management 310 (2013) 64–73

Contents lists available at ScienceDirect

Forest Ecology and Management

journal homepage: www.elsevier.com/locate/foreco

Combining ensemble modeling and remote sensing for mapping individual tree species at high spatial resolution

Robin Engler ⇑, Lars T. Waser, Niklaus E. Zimmermann, Marcus Schaub, Savvas Berdos, Christian Ginzler, Achilleas Psomas

Swiss Federal Research Institute WSL, Zuercherstrasse 111, CH-8903 Birmensdorf, Switzerland

⇑ Corresponding author. Tel.: +41 (0) 44 7392 448; fax: +41 (0) 44 7392 215. E-mail address: [email protected] (R. Engler).

0378-1127/$ - see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.foreco.2013.07.059

Article info

Article history: Received 5 April 2013; received in revised form 21 July 2013; accepted 31 July 2013

Keywords: Vegetation mapping; Aerial imagery; Species distribution modeling; Ensemble forecast; Forest ecosystems; Switzerland

Abstract

The ability to map vegetation and in particular individual trees is a key component in forest management and long-term forest monitoring. Here we present a novel approach for mapping individual tree species based on ensemble modeling, i.e. combining the projections of several modeling techniques in order to reduce uncertainty. Using statistical modeling in conjunction with high-resolution aerial imagery (50 cm spatial resolution) and topo-climatic variables (5 m spatial resolution), we map the distributions of six major tree species (3 broadleaf and 3 conifers) in a study area of North-Eastern Switzerland. We also compare the relative predictive power of both topo-climatic and remote-sensing variables for mapping the spatial tree patterns and assess the importance of calibration data quality on model performance. We evaluate our projections using cross-validation as well as with independent data. Overall, the evaluations that we obtain for our vegetation maps are in line with, or higher than, those in similar studies. Depending on the considered tree species, 47.8–85.6% of our samples were correctly predicted, and we obtain an overall CCR (correct classification rate) of 0.72 and a Cohen's kappa of 0.65. Comparing the predictive power of the different modeling techniques, we find that ensemble modeling (i.e. combining the projections of different individual modeling techniques) generally performs better than individual modeling techniques.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

Vegetation mapping is often of prime importance in monitoring, protection and restoration programs, and has therefore been the focus of many remote sensing applications (Xie et al., 2008). One particular aspect of vegetation mapping is the identification and classification of individual tree species. Being able to identify and geo-localize individual species provides useful data not only for forest management and long-term forest monitoring (e.g. Lorenz, 1995; Zweifel et al., 2010; Graf Pannatier et al., 2010; Dobbertin, 2009; Thimonier et al., 2010), but also for other applications such as biodiversity assessment and conservation planning (Ferrier, 2002; Kukkala and Moilanen, 2013), or ozone risk assessment (Emberson et al., 2001; Büker et al., 2011). The latter arises because sensitivity to tropospheric ozone depends not only on ozone exposure, microclimate and site conditions, but also on species-specific physiological and morphological characteristics (e.g. Schaub et al., 2005). Therefore, high spatial resolution vegetation mapping is of prime importance for resource management and conservation. More generally, the mapping of individual tree species at high resolution is the first stepping-stone for making use of species-specific data such as carbon- and water-balance, sap-flow, micro-climate or transpiration in spatially explicit model parameterization, calibration, and validation (e.g. Etzold et al., 2011).

During the 1980s and 1990s, mapping and classification of individual tree species was based on the interpretation and mapping of aerial photographs. Methods have also been developed to identify individual tree crowns (Wulder, 1998). Over the last decade, high spatial resolution data (pixel size of one meter or less) became increasingly available, opening a new round of research on classifying tree species at the individual tree level (Brandtberg, 2002; Key et al., 2001; Erikson, 2004). Such information is increasingly needed to assess biodiversity effects and to map ecosystem services or landscape functions. Digital airborne data have facilitated new opportunities for tree species classification since the digital devices are considered to be spectrally and radiometrically superior to analogue cameras (Petrie and Walker, 2007). The data are recorded by frame-based sensors, e.g. Z/I DMC (Olofsson et al., 2006), Ultracam (Hirschmugl et al., 2007) or line-scanning sensors, e.g. ADS40/ADS80 (Waser et al., 2010, 2011; Waser, 2012). Complementary to spectral imagery, high-resolution airborne laser scanning (ALS) has become an operational tool in the last decade, which has rendered the classification of individual species more feasible (Brandtberg, 2007; Ørka et al., 2009; Heinzel and Koch, 2011). Airborne hyperspectral imagery has also been found to produce high accuracies for identifying individual tree species (e.g. Aspinal, 2002; Leckie et al., 2003, 2005; Boschetti et al., 2007; Jones et al., 2010). For instance, in a recent study, Dalponte et al. (2013) obtained up to 96% producer accuracies when mapping conifers in Norwegian boreal forests.

An alternative to the use of remote-sensing based variables for carrying out projections of species distributions is the use of topo-climatic variables (e.g. temperature, rainfall, slope, solar radiation, soil water balance). Such variables have been used abundantly in the field of species distribution modeling, where spatial projections of the ecological niche of species (i.e., habitat suitability conditions) are constructed by statistically relating species observations (presence/absence or abundance) to environmental predictor variables (see Guisan and Zimmermann, 2000; Guisan and Thuiller, 2005 for a review). However, species distribution models have generally not been used to map individual trees but rather to predict habitat suitability conditions at a given site (e.g. Engler et al., 2004; Randin et al., 2006). To date, relatively few efforts have been made to combine remote-sensing data together with topo-climatic variables in order to map vegetation (e.g. Zimmermann et al., 2007), and none to our knowledge has tried to combine these data sources for mapping the distribution of individual tree species at very high spatial resolution.

The technique of species distribution modeling has advanced considerably over the last two decades (Zimmermann et al., 2010), specifically with regard to the use of advanced statistical methods and the implementation of ensemble modeling techniques. Ensemble modeling (or ensemble forecasting; Araújo and New, 2007) consists in combining the projections from several different statistical modeling techniques into a single projection. The idea behind ensemble modeling is that a combined projection will have lower mean error than any of its individual constituents (Thuiller et al., 2009). Ensemble modeling has become increasingly used in bioclimatic modeling of species distribution (e.g. Engler et al., 2011), but has, to our knowledge, never been applied in a remote-sensing mapping application so far.

Here we introduce a novel approach for mapping individual tree species by using ensemble modeling and the combination of high-resolution remote-sensing data (aerial imagery) and topo-climatic variables. We map the distributions of six tree species (3 broadleaf and 3 conifers), over a 20 × 10 km study area of complex topography in North-Eastern Switzerland, using statistical modeling in conjunction with high-resolution aerial imagery (50 cm pixel size) and topo-climatic variables (5 m pixel size). More precisely, we combine six different statistical modeling techniques in an ensemble modeling approach to obtain probabilistic distribution maps for each of our target species. We additionally compare the relative predictive power of both topo-climatic and remote-sensing variables for mapping the spatial tree patterns, and we assess the importance of calibration data quality on model performance. Finally, we combine the spatial patterns obtained for each of our six individual species to obtain a map that classifies each 5 × 5 m pixel located within a forest into one of our target species, or "other" when none of our species was predicted with high enough likelihood. This allows us to map individual species for a range of purposes, or simply to reclassify the species map, e.g. into maps of plant functional types (such as broadleaves and needleleaves).

2. Methods

2.1. Study area and target species

The study area is located in North-Eastern Switzerland and covers a region of 20 × 10 km centered on 9.35°E, 47.36°N (Fig. 3 and Supplementary material Fig. S1). Elevation ranges from 580 to 1300 m a.s.l., and the general topography of the area is very hilly (83% of the study area has a slope > 10%). The land cover is a heterogeneous mixture of forest, grasslands, pastures, agricultural and urban areas.

The forested lands form a mosaic of fragmented surfaces that cover 27.8% of the study area (52.6 km²; Supplementary material Fig. S1). They are mostly characterized by mixed forests with a dominance of deciduous trees along rivers and coniferous trees above 1200 m. The forests are partly managed: clearings and both deforestation and afforestation occur in several parts of the area.

For our mapping exercise we selected six tree species that are typical for the study area: Fagus sylvatica L., Fraxinus excelsior L., Acer sp. (mostly Acer pseudoplatanus L.), Larix decidua Mill., Picea abies L. and Abies alba Mill. The chosen species represent a balanced mix of different tree types (i.e. broadleaf vs. conifers) and occurrence frequency, with species being respectively very common (Picea, Fagus), relatively frequent (Fraxinus, Acer, Abies) and relatively rare (Larix).

2.2. Data sampling

During the summers of 2010 and 2011, 812 individual trees (130 Fagus, 119 Fraxinus, 89 Acer, 69 Larix, 271 Picea, 134 Abies) were mapped following a random-stratified sampling design. The stratification factors used in the random sampling were growing degree days above 5 °C and soil water balance (see Table 1); they ensured that the collected data spanned the major environmental gradients found in our study area in a representative manner (Supplementary material Fig. S1).

Field work was carried out as follows: each visited sampling location was prospected for our six target species within a radius of about 100 m. When a species was found at a given sampling location, one easy-to-identify individual (typically a large tree or a tree within a large patch of trees of the same species) was mapped using prints of the 50 cm resolution aerial imagery and differential GPS (±2 m accuracy). We only mapped those individuals that could be identified with very high certainty on the aerial imagery (typically large canopy trees). The field-collected data were later entered in a GIS, allowing us to manually delineate the crown of each recorded tree individual with high accuracy.

2.3. Topo-climatic variable preparation

Twelve different topo-climatic variables were prepared at a 5 × 5 m spatial resolution (Table 1). Slope and topographic position were computed from a 2 m resolution digital elevation model aggregated to 5 m using bilinear interpolation. Distance to the nearest water body was derived from the numeric land use model of the Swiss Federal Office of Topography (swisstopo, VECTOR25 model). All other topo-climatic variables were downscaled from existing 25 m resolution data (see Zimmermann and Kienast, 1999; Zimmermann et al., 2007 for details on computation), using bilinear interpolation. The original 25 m resolution data were computed from long-term (1961–1990) monthly means for average temperature (°C) and sum of precipitation (mm) provided for different elevations by the Swiss Federal Office of Meteorology and Climatology (MeteoSwiss), a digital elevation model and a soil suitability map (BfS, 2000).

Table 1. List of topo-climatic variables prepared at a 5 × 5 m resolution and tested for multi-collinearity.

| Variable | Description and unit | Reference |
|---|---|---|
| Growing degree days > 5 °Cᵃ | Sum of days multiplied by temperature > 5 °C (°C · d · yr⁻¹) | Zimmermann and Kienast (1999) |
| Mean temperature of coldest month | Average temperature from the coldest month of the year (°C) | Zimmermann and Kienast (1999) |
| Summer moisture index | Average of daily atmospheric H₂O balance from June to August (mm · d⁻¹) | Zimmermann and Kienast (1999) |
| Summer sum of precipitations | Sum of precipitation from April–September (mm) | Zimmermann and Kienast (1999) |
| Winter sum of precipitations | Sum of precipitation from October–February (mm) | Zimmermann and Kienast (1999) |
| Yearly solar radiation | Sum of monthly average of daily global solar radiation (kJ · m⁻² · yr⁻¹) | Zimmermann et al. (2007) |
| Summer solar radiationᵃ | Sum of monthly average of daily global solar radiation from April to September (kJ · m⁻² · yr⁻¹) | Zimmermann et al. (2007) |
| Slopeᵃ | Slope angle (°) | |
| Soil water balanceᵃ | Difference between precipitation and potential evapotranspiration. Accounts for soil water holding capacity (mm) | Zimmermann et al. (2007) |
| Topographic wetness indexᵃ | Relative measure of local hydrological potential (specific catchment area/slope) (m) | Sørensen and Seibert (2007) |
| Topographic positionᵃ | Identification of topographic features (ridge, slope, toe slope) at various spatial scales (unitless) | Zimmermann et al. (2007) |
| Distance to nearest water bodyᵃ | Euclidian distance to closest river, lake or swamp area (m) | |

ᵃ Denotes variables that were kept for model calibration after analysis of multi-collinearity.

2.4. Remote sensing variable preparation

The remote sensing data used for this study were six second generation Airborne Digital Sensor ADS40-SH52 (Leica Geosystems, 2011) image strips with a 50% side overlap. The radiometrically corrected ADS40 data were recorded on 25 July 2008 and had a spatial resolution of 0.5 m (approx. scale 1:15,000), a radiometric resolution of 11 bit and a spectral resolution of four bands: in the blue (428–492 nm), green (533–587 nm), red (608–662 nm), and near infrared (833–887 nm) regions of the electromagnetic spectrum. For details on the sensor see e.g. Kellenberger and Nagy (2008).

Three ortho-images were then calculated from the six image strips using digital surface models generated in Socet Set 5.4.1 (BAE Systems, 2007). Image values were calibrated digital numbers equal to at-sensor radiances multiplied by a factor of 50. From the original four spectral bands (red, blue, green, near infrared), four vegetation indices were then derived: the normalized difference vegetation index (NDVI), the renormalized difference vegetation index (RDVI), the modified simple ratio index (MSR) and the modified chlorophyll absorption ratio index (MCARI1). Details of these different indices are given in Table 2.
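As an illustration only (this is not the processing chain used in the study), the four indices and the shadow-filtered aggregation to 5 m described in the remainder of this section could be sketched as follows. The index formulas are the standard definitions cited in Table 2. The function names, the use of nested lists as rasters, the block size of 10 pixels (5 m / 0.5 m) and the choice of the population standard deviation are assumptions made for this sketch.

```python
import math
from statistics import mean, pstdev


def ndvi(nir, red):
    # Normalized difference vegetation index (undefined when nir + red == 0).
    return (nir - red) / (nir + red)


def rdvi(nir, red):
    # Renormalized difference vegetation index.
    return (nir - red) / math.sqrt(nir + red)


def msr(nir, red):
    # Modified simple ratio index.
    ratio = nir / red
    return (ratio - 1.0) / math.sqrt(ratio + 1.0)


def mcari1(nir, red, green):
    # Modified chlorophyll absorption ratio index 1.
    return 1.2 * (2.5 * (nir - red) - 1.3 * (nir - green))


def aggregate_to_5m(values, ndvi_grid, block=10, ndvi_min=0.2, min_pixels=10):
    """Aggregate a 50 cm raster to 5 m cells (mean and standard deviation).

    50 cm pixels with NDVI below `ndvi_min` (deep shadow) are ignored, and a
    5 m cell with fewer than `min_pixels` remaining pixels is set to None.
    """
    n_rows = len(values) // block
    n_cols = len(values[0]) // block
    means = [[None] * n_cols for _ in range(n_rows)]
    sds = [[None] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            kept = [
                values[r][c]
                for r in range(i * block, (i + 1) * block)
                for c in range(j * block, (j + 1) * block)
                if ndvi_grid[r][c] >= ndvi_min
            ]
            if len(kept) >= min_pixels:
                means[i][j] = mean(kept)
                sds[i][j] = pstdev(kept)  # assumption: population SD
    return means, sds
```

The same aggregation would be applied to each band and each index in turn, giving one "mean" and one "standard deviation" grid per variable.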

For each of the vegetation indices as well as for the original four bands, we aggregated the 50 cm data to 5 m resolution grids by computing the mean and the standard deviation of all 50 cm pixels falling within a 5 m pixel. The pixel size of 5 m was chosen because it roughly corresponds to the size of a tree crown, and earlier studies have shown that this pixel size produces results close to those obtained when using object-based segments (i.e., auto-delineated tree crowns; Waser, 2012). The 5 m grid was overlaid on the study area starting with the upper left corner. Since tree crowns are sometimes mixed with deep shadows that do not carry any useful information but would bias the mean and standard deviation computed for a 5 × 5 m pixel, those 50 × 50 cm pixels with NDVI values < 0.2 (identifying areas of deep shadow) were disregarded when computing the mean and standard deviation values (for an example see Fig. 1). The threshold value of NDVI = 0.2 was chosen because none of our species samples (i.e. manually delineated tree crowns) had a 50 × 50 cm pixel value below that threshold. All 5 × 5 m pixels that, after the NDVI-based filtering, had fewer than ten 50 cm pixels remaining were removed from any further analysis, as the mean or standard deviation computed for such pixels was deemed unreliable.

Fig. 1. (a) Original 50 cm resolution RGB image of a patch of forest showing a mix of tree and shadow areas. The overlaid grey grid shows the dimension of the 5 × 5 m pixels to which the 50 cm pixels were aggregated. (b) Same image but showing those pixels that were kept based on NDVI filtering (in bright blue–yellow–red color) for computing the mean and standard deviation values of each 5 × 5 m pixel. Crossed-out pixels illustrate pixels that were discarded from any further analysis because they contained too few (<10) 50 cm pixels above the shadow-removal threshold.

Table 2. List of remote sensing variables tested for multi-collinearity.

| Vegetation index or band name | Equation | Reference |
|---|---|---|
| Normalized difference vegetation index (NDVI)ᵃ | $\mathrm{NDVI} = (R_{NIR} - R_{RED}) / (R_{NIR} + R_{RED})$ | Rouse et al. (1974) |
| Renormalized difference vegetation index (RDVI) | $\mathrm{RDVI} = (R_{NIR} - R_{RED}) / \sqrt{R_{NIR} + R_{RED}}$ | Rougean and Breon (1995) |
| Modified simple ratio index (MSR) | $\mathrm{MSR} = (R_{NIR}/R_{RED} - 1) / \sqrt{R_{NIR}/R_{RED} + 1}$ | Haboudane et al. (2004) |
| Modified chlorophyll absorption ratio index 1 (MCARI1)ᵃ | $\mathrm{MCARI1} = 1.2\,[2.5\,(R_{NIR} - R_{RED}) - 1.3\,(R_{NIR} - R_{GREEN})]$ | Haboudane et al. (2004) |
| Blue band (428–492 nm) | – | |
| Green band (533–587 nm)ᵃ | – | |
| Red band (608–662 nm) | – | |
| Near infrared band (833–887 nm) | – | |

ᵃ Denotes variables that were kept for model calibration after analysis of multi-collinearity.

2.5. Multi-collinearity among variables (variable pre-selection)

Multi-collinearity among explanatory variables (i.e. topo-climatic and spectral variables) was assessed through pairwise correlation (Pearson correlation coefficient). Most of the remote-sensing derived vegetation indices and spectral bands were highly collinear, and therefore only three of them were kept, both in their "mean" and "standard deviation" form: green band, NDVI and MCARI1 (Table 2). Similarly, only topo-climatic variables with correlation < 0.7 were kept. Thus, out of the original 28 variables, 13 were kept for model calibration. Among the selected 13 variables, all but one pair had correlation values < 0.7.

2.6. Model calibration

For each species, models were calibrated with three different sets of explanatory variables and two different sets of response variables (Fig. 2). The three different sets of explanatory variables were the following: topo-climatic + remote-sensing derived variables (hereafter abbreviated "ALL"; marked with 'a' in Tables 1 and 2), remote-sensing derived variables only (abbreviated "RS"; marked with 'a' in Table 2) and topo-climatic variables only (abbreviated "TC"; marked with 'a' in Table 1).

[Fig. 2 diagram. Explanatory variable levels [3]: topo-climatic + remote-sensing derived (ALL), topo-climatic only (TC), remote-sensing derived only (RS). Response variable levels [2]: pixel coverage > 50% (R1), no limit in pixel coverage (R2). Modeling techniques [9]: GLM, GAM, GBM, MARS, RF, FDA, CART, ANN, EM (combination of models).]

Fig. 2. Experimental design showing the different combinations of explanatory variables, response variable and modeling techniques that were tested. EM = Ensemble forecast model (combination of models).
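The variables marked 'a' are those retained by the collinearity screening of Section 2.5. The text does not state how one variable of a correlated pair was chosen, so the following is only a rough sketch of one possible screening rule: variables are visited in a user-given priority order and kept when their absolute Pearson correlation with every already-kept variable is below 0.7. The function names and the greedy rule are assumptions; the study itself retained one pair above the threshold, which a strict rule like this one would not.

```python
import math


def pearson(x, y):
    # Pearson correlation coefficient of two equally long sequences.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def preselect(variables, max_abs_r=0.7):
    """Greedy screening (hypothetical rule).

    `variables` maps a variable name to its sampled values; the dict order
    is taken as the priority order. Returns the names that are kept.
    """
    kept = []
    for name, values in variables.items():
        if all(abs(pearson(values, variables[k])) < max_abs_r for k in kept):
            kept.append(name)
    return kept
```

For example, with three toy variables where the second is almost a multiple of the first, `preselect` keeps the first and the third.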

Distinguishing two sets of response variables aimed to test whether the quality of the data used for model calibration had an effect on model predictive power. In the first set (hereafter abbreviated "R1"), only those samples covering more than 50% of a 5 × 5 m pixel were kept (a sample being a manually delineated tree crown). In the second set (abbreviated "R2"), all samples were kept, regardless of their size. In other words, the R1 dataset is a subset of the R2 dataset where only those samples (tree crowns) that were large enough to cover >50% of a 5 × 5 m pixel were kept. For each tree crown sample in the R1 and R2 datasets, we extracted the values of the topo-climatic and remote-sensing derived variables from the 5 m grids. If a tree crown extended over more than one 5 m pixel, which happened frequently, the weighted average value of those pixels was computed (weights were proportional to the coverage of each pixel by the tree crown). These values were used for model calibration. For the remote-sensing variables, we chose to extract values from the 5 m grids, rather than to use values based only on those 50 cm pixels falling inside the delineated tree crowns, because our model projections were going to be carried out on the same 5 m grids (and not on delineated tree crowns). Our models were thus calibrated with the same type of data as those onto which they would ultimately be projected.
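The coverage-weighted extraction and the R1 criterion can be illustrated with a minimal sketch. It assumes that, for each crown, the fraction of every touched 5 m pixel covered by the crown has already been computed in a GIS; the function names are hypothetical.

```python
def crown_weighted_mean(pixel_values, coverage):
    """Weighted average of the 5 m pixel values touched by one crown.

    `coverage[i]` is the fraction of pixel i covered by the crown, used
    as the weight of that pixel.
    """
    total = sum(coverage)
    if total <= 0:
        raise ValueError("crown does not cover any pixel")
    return sum(v * w for v, w in zip(pixel_values, coverage)) / total


def belongs_to_r1(coverage, min_cover=0.5):
    # R1 keeps only crowns covering more than 50% of at least one 5 m pixel.
    return max(coverage) > min_cover
```

A crown covering 75% of a pixel with value 10 and 25% of a pixel with value 20 would receive the value 12.5 and would be part of both R1 and R2.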

For each of the above levels, probabilistic models were calibrated using eight different statistical techniques: generalized linear models (GLM; McCullagh and Nelder, 1989), generalized additive models (GAM; Hastie and Tibshirani, 1986), boosted regression trees (GBM; Ridgeway, 1999), random forest (RF; Breiman, 2001), multivariate adaptive regression splines (MARS; Friedman, 1991), flexible discriminant analysis (FDA; Hastie et al., 1994), artificial neural networks (ANN; Ripley, 1996), and classification and regression trees (CART; Breiman et al., 1984). These models relate the response to the explanatory variables in order to derive the probability that a pixel hosts a given target species.

A final synthesis projection of potential tree distribution was obtained by combining the results of six different individual modeling techniques (GLM, GAM, GBM, FDA, MARS and RF) into a so-called ensemble model (Araújo and New, 2007). This ensemble model (EM) was obtained by averaging the individual model projections after having weighted them proportionally to their TSS score (a measure of model quality; see Section 2.7 for details). This is an ensemble modeling method that has been shown to be particularly robust (Marmion et al., 2009; Engler et al., 2011).
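A TSS-weighted average of this kind can be sketched as below. The data layout (one list of per-pixel probabilities per technique) and the function name are assumptions for illustration, and the sketch assumes all TSS scores are positive.

```python
def ensemble_projection(projections, tss_scores):
    """TSS-weighted mean of individual model projections.

    `projections` maps a technique name to its per-pixel probabilities;
    `tss_scores` maps the same names to their TSS (assumed > 0).
    """
    total = sum(tss_scores[name] for name in projections)
    n_pixels = len(next(iter(projections.values())))
    return [
        sum(tss_scores[name] * projections[name][i] for name in projections) / total
        for i in range(n_pixels)
    ]
```

With two techniques scoring TSS 0.5 and 0.75, a pixel predicted at 0.2 and 0.6 respectively would receive (0.5 × 0.2 + 0.75 × 0.6) / 1.25 = 0.44.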

For model calibration, we only had species presence records available. Yet, all the above-mentioned modeling techniques require both presence and absence data. Therefore, for each modeled species we used the presence records of all other species as absence information. For instance, when calibrating a model for Fagus, the presence data from Fagus were contrasted against the presence data of all other species, which were considered absence (of Fagus) records. In order to limit spatial autocorrelation, absence records that were located less than 50 m from an already selected presence were discarded. Presence and absence data were also weighted during model calibration so that both had equal prevalence (i.e. so that models are not biased towards overprediction of either presences or absences). All models were only projected onto pixels located within forested lands as defined by the 1:25,000 scale Swiss topographic map.
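The construction of the presence/absence calibration set for one species can be sketched as follows. The record layout `(species, x, y)` with coordinates in meters, the function name, and the particular weighting scheme (presences weighted 1, absences weighted so that both classes sum to the same total) are assumptions; the text only states that both classes had equal prevalence.

```python
import math


def build_calibration_set(records, target, min_dist=50.0):
    """Split presence-only records into presences and pseudo-absences.

    Records of other species closer than `min_dist` meters to a presence
    of `target` are discarded. Returns presences, absences and the two
    class weights.
    """
    presences = [(x, y) for sp, x, y in records if sp == target]
    absences = [
        (x, y)
        for sp, x, y in records
        if sp != target
        and all(math.hypot(x - px, y - py) >= min_dist for px, py in presences)
    ]
    w_presence = 1.0
    # Assumed scheme: total absence weight equals total presence weight.
    w_absence = len(presences) / len(absences) if absences else 0.0
    return presences, absences, w_presence, w_absence
```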

2.7. Model evaluation

Each individual model was evaluated through two different procedures: repeated split-sample and spatially independent data evaluation (Fig. 3).

In the split-sample evaluation, the predictive power of each individual model was evaluated through a repeated random data-splitting procedure. A model was trained on 70% of the data (chosen randomly) and evaluated on the remaining 30% using the true skill statistic (TSS; Allouche et al., 2006). TSS is a measure that assesses the agreement between predicted and observed values. It is computed as sensitivity (fraction of correctly predicted presences) + specificity (fraction of correctly predicted absences) − 1, and varies between negative values (systematically wrong), 0 (random model) and 1 (perfect agreement). For instance, a TSS of 0.6 means that the proportion of correctly predicted presences and absences in the data is roughly 80% (0.8 + 0.8 − 1 = 0.6). Since projections from our models are continuous probabilities (between 0 and 1), they needed to be converted to binary values (i.e. a species is either predicted present or absent) in order to compute a value of TSS. This was done using a reclassification threshold that is applied to the original, probabilistic, model projections: values above or equal to the threshold are reclassified as predicted species presences, while values below the threshold are reclassified as predicted absences. When we hereafter refer to TSS values, we actually refer to the maximum TSS value that is obtained when testing all possible reclassification thresholds between 0 and 1. This entire split-sample evaluation procedure was repeated 30 times for each model, and the evaluation values obtained from each replicate were averaged.
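The TSS computation and the search for the threshold that maximizes it can be sketched as follows. The function names are hypothetical. Instead of scanning a fixed grid between 0 and 1, the sketch tries every distinct predicted probability as a threshold, which covers all distinct binary classifications that contain at least one predicted presence (the all-absent case has a TSS of 0).

```python
def tss(observed, predicted):
    """True skill statistic = sensitivity + specificity - 1.

    `observed` and `predicted` are sequences of 0/1 values; both classes
    must occur in `observed`.
    """
    tp = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 1)
    fn = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 0)
    tn = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 0)
    fp = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # correctly predicted presences
    specificity = tn / (tn + fp)  # correctly predicted absences
    return sensitivity + specificity - 1.0


def max_tss(observed, probabilities):
    # Returns (best TSS, threshold); values >= threshold count as presences.
    best_score, best_threshold = float("-inf"), None
    for threshold in sorted(set(probabilities)):
        predicted = [1 if p >= threshold else 0 for p in probabilities]
        score = tss(observed, predicted)
        if score > best_score:
            best_score, best_threshold = score, threshold
    return best_score, best_threshold
```

In a repeated split-sample setting, `max_tss` would be called on the 30% evaluation subset of each replicate and the resulting scores averaged.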

In the independent data evaluation, the data were split into two subsets based on geography (Fig. 3). Models were calibrated using the data from the eastern part of the study area (~70% of the data), before being evaluated against the data from the western part of the study area (~30% of the data). Thus, the main difference between the “repeated split-sample” and the “independent” evaluation is that in the former the calibration and evaluation data come from the exact same geographic extent, while in the latter they come from distinct geographical areas. For this reason, the independent evaluation procedure is a tougher test for the models and can be expected to give a better idea of a model’s predictive power when applied outside of its calibration area.

Fig. 3. Illustration of the split-sample and the spatially independent evaluation procedure. In the split-sample evaluation, calibration (70%) and evaluation (30%) samples are drawn over the entire study area. In the spatially independent evaluation, we divided the study area into two distinct regions consisting of 30% and 70% of the total area (dashed line) and then used data from the eastern side for model calibration and data from the western side for model evaluation. The rectangle on the map indicates the study area location.
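The two ways of partitioning the data can be sketched as follows. This is only an illustration under stated assumptions: the sample structure (a dictionary with an easting `x` and a `presence` flag), the function names and the way the east/west boundary is chosen are hypothetical and are not taken from the study.

```python
import random


def random_split(samples, calibration_fraction=0.7, rng=random):
    """Shuffle the samples and cut them into calibration and evaluation sets."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(calibration_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def spatial_split(samples, boundary_x):
    """Calibrate on samples east of a boundary, evaluate on samples west of it."""
    east = [s for s in samples if s["x"] >= boundary_x]
    west = [s for s in samples if s["x"] < boundary_x]
    return east, west


# Hypothetical samples: an easting coordinate and a presence/absence label.
rng = random.Random(42)
samples = [{"x": rng.uniform(0, 100), "presence": rng.random() < 0.5}
           for _ in range(200)]

# Repeated split-sample: 30 random 70/30 partitions drawn over the whole extent.
replicates = [random_split(samples, 0.7, rng) for _ in range(30)]

# Spatially independent: place the boundary so that ~30% of the samples lie west.
xs = sorted(s["x"] for s in samples)
boundary = xs[round(0.3 * len(xs))]
calibration, evaluation = spatial_split(samples, boundary)
```

In the random split every evaluation sample has calibration samples nearby, whereas in the spatial split no evaluation sample lies inside the calibration region, which is what makes the second test harder.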

Finally, we also computed the TSS value that is obtained when using the optimum reclassification threshold obtained from the calibration data (i.e. the threshold that maximizes the TSS value for the calibration data rather than for the evaluation data). This value is interesting because, unlike the other TSS values that we computed from independent data, it is not optimized from the perspective of the data that is used for model evaluation, and hence represents the true TSS value that would be obtained when extrapolating a model outside of its calibration area while using the reclassification threshold derived from its calibration area.
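The difference between a threshold tuned on the evaluation data and one transferred from the calibration data can be illustrated with a short sketch. The data, the function names and the 0.01 threshold grid are assumptions for the example, not values or code from the study.

```python
def tss(probs, observed, threshold):
    """Sensitivity + specificity - 1 for one reclassification threshold."""
    presences = [p for p, o in zip(probs, observed) if o]
    absences = [p for p, o in zip(probs, observed) if not o]
    sensitivity = sum(p >= threshold for p in presences) / len(presences)
    specificity = sum(p < threshold for p in absences) / len(absences)
    return sensitivity + specificity - 1


def best_threshold(probs, observed, steps=100):
    """Threshold on a regular grid that maximizes TSS for the given data."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: tss(probs, observed, t))


# Hypothetical calibration and evaluation data (probabilities, observations).
cal_probs, cal_obs = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1], [1, 1, 1, 0, 0, 0]
eval_probs, eval_obs = [0.8, 0.5, 0.35, 0.45, 0.2, 0.1], [1, 1, 1, 0, 0, 0]

threshold = best_threshold(cal_probs, cal_obs)           # tuned on calibration data
transferred_tss = tss(eval_probs, eval_obs, threshold)   # applied unchanged
optimistic_tss = max(tss(eval_probs, eval_obs, i / 100) for i in range(101))
```

By construction `transferred_tss` can never exceed `optimistic_tss`, because the latter is the maximum over all thresholds on the same grid.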

2.8. Multi-species vegetation map

In order to obtain a single vegetation map where each pixel is attributed a single species (i.e., one of our six target species, or “other” when none of our species was predicted with a high-enough likelihood), we combined the projections that we obtained from our single-species ensemble models (EM) as follows: for each species, the continuous probability values (in the range [0, 1]) that were below the maximum TSS reclassification threshold were set to 0, while the values above the reclassification threshold were linearly rescaled from 1 to 10 (where 10 indicates the highest probability of occurrence). Whenever more than one species was predicted to occur on a given pixel, the species with the highest occurrence probability was attributed to the pixel, except for Larix, over which all other species had precedence (i.e., Larix was only attributed to a pixel when it was predicted present and all other species were predicted to be absent). This is because Larix is the least abundant species in our study area and was often over-predicted by its model. If, for a given pixel, none of our six target species was predicted to be present, then that pixel was set to “other”.
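The per-pixel combination rule described above can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: the linear rescaling is taken to map the interval [threshold, 1] onto [1, 10], and species are compared on their rescaled scores. The function names and the example thresholds are hypothetical.

```python
def rescale(prob, threshold):
    """Return 0 below the threshold, else map [threshold, 1] linearly onto [1, 10]."""
    if prob < threshold:
        return 0.0
    if threshold >= 1.0:
        return 10.0
    return 1.0 + 9.0 * (prob - threshold) / (1.0 - threshold)


def assign_species(probs, thresholds, low_priority="Larix"):
    """Pick one species for a pixel, or "other" if none is predicted present."""
    scores = {sp: rescale(p, thresholds[sp]) for sp, p in probs.items()}
    present = {sp: s for sp, s in scores.items() if s > 0}
    if not present:
        return "other"
    # Larix is only attributed when every other species is predicted absent.
    contenders = {sp: s for sp, s in present.items() if sp != low_priority}
    if not contenders:
        return low_priority
    return max(contenders, key=contenders.get)


# Hypothetical maximum-TSS thresholds for three of the six species.
thresholds = {"Fagus": 0.5, "Picea": 0.4, "Larix": 0.3}

pixel_a = assign_species({"Fagus": 0.6, "Picea": 0.7, "Larix": 0.9}, thresholds)
pixel_b = assign_species({"Fagus": 0.2, "Picea": 0.1, "Larix": 0.9}, thresholds)
pixel_c = assign_species({"Fagus": 0.2, "Picea": 0.1, "Larix": 0.1}, thresholds)
```

In `pixel_a` Larix has the highest probability but is overruled because two other species are also predicted present; it is only attributed in `pixel_b`, where it is the sole species above its threshold.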

3. Results

3.1. Multi-collinearity among variables

All of the variables that were kept for model calibration had correlation values <0.7, except for one pair (NDVI mean – MCARI mean; Fig. 4). Remote-sensing derived and topo-climatic variables showed virtually no correlation with each other (average correlation between both groups <0.05).
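A pairwise collinearity screen of this kind can be sketched in a few lines. The predictor names and values below are made up for illustration, and comparing the absolute value of the Pearson coefficient against the limit is an assumption of the sketch.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def correlated_pairs(predictors, limit=0.7):
    """Return predictor pairs whose absolute Pearson correlation reaches the limit."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson(a, b)
        if abs(r) >= limit:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical predictor values at five sample locations.
predictors = {
    "ndvi_mean": [0.61, 0.70, 0.55, 0.80, 0.66],
    "mcari_mean": [0.30, 0.36, 0.27, 0.41, 0.33],
    "slope": [10.0, 20.0, 12.0, 14.0, 25.0],
}
flagged = correlated_pairs(predictors)
```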

3.2. Comparison of models across modeling techniques

While almost all models obtained evaluation scores greater than the threshold above which models are considered “useful” (TSS = 0.4), the ANN and CART modeling techniques generally received lower evaluation scores (Supplementary material Fig. S2). For this reason, these two modeling techniques were not considered for computing the ensemble model projections (EM).

The EM projections (obtained when combining the projections of the 6 remaining modeling techniques) generally produced the models with the highest rates of correctly classified samples, both in the split-sample and the independent data evaluation (Supplementary material Fig. S2). For this reason, all analyses from here on are carried out solely on the results of the EM model projections.

3.3. Comparison of models fitted with different explanatory and response variable sets

Comparing the models based on the three different levels of explanatory variables (ALL, RS, TC; Fig. 5a and b) reveals that the inclusion of remote-sensing variables (ALL and RS) allowed high TSS scores (around 0.75, meaning that on average ~87.5% of the presences and absences are correctly predicted). In contrast, models based solely on TC variables performed poorly (TSS values generally <0.4). Based on this result, all analyses hereafter consider only the models calibrated with both the remote sensing and topo-climatic variables (ALL).

Fig. 4. Tree view of the Pearson correlation coefficients between predictor variables. The value that is read at the branching between two predictors or groups of predictors indicates the correlation level between the two predictors or groups of predictors. Correlation values were computed by using the predictor values associated with the sample locations of all species. The dashed grey line indicates the 0.8 correlation level threshold. ME stands for “mean” and SD for “standard deviation”.

Fig. 5. Comparison of TSS model evaluation between the models built with all variables (ALL), the remote sensing variables only (RS), and the topo-climatic variables only (TC), as well as with the R1 (light grey) and R2 (dark grey) response variable data sets. (a) TSS values obtained from the random split-sample evaluation procedure. (b) TSS values obtained from the spatially independent data evaluation procedure. (c) TSS values obtained when applying the TSS optimized probability threshold derived from the calibration data to the evaluation data. Each box-plot is built from the six values of our target species. The boxes extend from the data’s 1st to 3rd quartile, the horizontal bars in the box represent the median, and the dashed line marks the threshold necessary to reach a model quality that we consider “useful” (TSS = 0.4).

Table 3. TSS values from split-sample and independent evaluation for individual species models. Values are given for ensemble models (EM) calibrated with ALL R2 data.

Species                   Fagus   Fraxinus   Acer   Larix   Picea   Abies
Split-sample evaluation   0.76    0.70       0.73   0.75    0.75    0.71
Independent evaluation    0.79    0.60       0.72   0.72    0.71    0.66

Comparing the TSS values obtained from the split-sample (Fig. 5a), the spatially independent (Fig. 5b) and the spatially independent with thresholds derived from calibration data (Fig. 5c) evaluation procedures shows that testing the models in a geographically different area (Fig. 5b and c) results in lower rates of correct predictions as compared to testing models against data from the same area where the calibration data originate (split-sample evaluation; Fig. 5a).

Interestingly, while the models based on R1 data (pixel coverage >50%; light grey box-plots in Fig. 5) and R2 data (no lower limit in pixel coverage; dark grey box-plots in Fig. 5) performed similarly in the split-sample and independent data evaluation, the R2 models achieved better classification rates than the R1 models when using the calibration-data derived threshold (Fig. 5c).

3.4. Comparison of models across species

When considering the EM projections, all species achieved high evaluation scores, with TSS values between 0.6 and 0.79 (Table 3). Such TSS values mean that, on average, presences and absences of a species were correctly predicted with a rate of ~80–90%. Fagus is the species that obtained the highest evaluation score. Acer, Larix and Picea are about average. Fraxinus and Abies obtained slightly lower evaluation scores than the other species, but in absolute terms their models remain rather good.
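The correspondence between TSS values and the quoted rates follows if sensitivity and specificity are assumed to be equal, as in the worked example given with the TSS definition. The symbols Se, Sp and r below are introduced only for this short derivation:

```latex
\mathrm{TSS} = Se + Sp - 1, \qquad Se = Sp = r \;\Longrightarrow\; r = \frac{\mathrm{TSS} + 1}{2}

\mathrm{TSS} = 0.60 \;\Rightarrow\; r = 0.80, \qquad
\mathrm{TSS} = 0.79 \;\Rightarrow\; r = 0.895 \approx 0.90
```

When sensitivity and specificity differ, the same TSS value can correspond to different individual rates, so these percentages are best read as rough averages.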

3.5. Multi-species vegetation maps

We computed the multi-species vegetation map using the ensemble forecast models (EM) calibrated with the ALL R2 data (we chose the ALL R2 models because they obtained the highest evaluation scores under independent evaluation; see Fig. 5c).

With 85.6% and 83.8% of the samples correctly predicted, respectively, Picea and Fagus are the two species that are best classified by our multi-species map (Table 4a). Abies samples were correctly classified with a rate of 76.9%, and Fraxinus samples with a rate of 56.3%. The least well classified species are Acer and Larix, with 49.4% and 47.8% of correctly predicted samples, respectively. As expected, using fully independent data for model evaluation (Table 4b) leads to a substantial decrease in correctly predicted samples: the overall correct classification rate (CCR) decreases from 72% to 53% and Cohen’s kappa from 0.65 to 0.41. A subset of the multi-species map is shown in Fig. 6.
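The accuracy figures quoted above can be recomputed from the counts of Table 4a, as the following sketch shows. It is an illustration rather than the authors' procedure: dashes in the table are read as zero counts, and the “Other” column is treated as an extra predicted class that has no matching field-data row. Under these assumptions the sketch reproduces the reported values.

```python
SPECIES = ["Fagus", "Fraxinus", "Acer", "Larix", "Picea", "Abies"]

# Rows: field data; columns: classified as the six species, then "Other" (Table 4a).
TABLE_4A = [
    [109, 6, 8, 1, 0, 3, 3],
    [27, 67, 16, 0, 0, 3, 6],
    [16, 24, 44, 0, 0, 5, 0],
    [1, 3, 4, 33, 25, 0, 3],
    [2, 1, 2, 5, 232, 17, 12],
    [1, 3, 0, 0, 23, 103, 4],
]


def accuracy_metrics(matrix):
    """Producer's and user's accuracy (%), CCR and Cohen's kappa."""
    n_classes = len(matrix)
    total = sum(map(sum, matrix))
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(row[j] for row in matrix) for j in range(n_classes)]
    diagonal = [matrix[i][i] for i in range(n_classes)]
    producer = [100 * d / r for d, r in zip(diagonal, row_totals)]
    user = [100 * d / c for d, c in zip(diagonal, col_totals)]
    ccr = sum(diagonal) / total
    # Chance agreement; "Other" has no matching row, so it contributes nothing.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (ccr - expected) / (1 - expected)
    return producer, user, ccr, kappa


producer, user, ccr, kappa = accuracy_metrics(TABLE_4A)
for name, p, u in zip(SPECIES, producer, user):
    print(f"{name:9s} producer {p:5.1f}%  user {u:5.1f}%")
print(f"CCR {ccr:.2f}, kappa {kappa:.2f}")
```

Producer's accuracy divides the diagonal by the row total (how many field samples of a species were found), while user's accuracy divides it by the column total (how many pixels labelled as a species really are that species).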

It is possible, although not very likely, that two individuals of differing species inhabit the same 5 × 5 m pixel. Nevertheless, analyzing how frequently two different species are predicted to be present within the same pixel gives an indication of which species are the most likely to be mistaken for each other by the model (Table 5). This analysis indicates that, among our species, broadleaf species are more likely to be confused with each other than conifers. For instance, 59.1% of the pixels where Fraxinus was predicted were also predicted to host Acer, and 64.5% of the pixels predicted as being Acer were also projected to host Fraxinus. Fagus was also relatively often predicted to overlap with Fraxinus or Acer. Among coniferous species, Larix and Picea have the highest likelihood of being confused: 31.2% of pixels predicted to host Larix are also predicted as Picea. Overall, broadleaf and coniferous species were rarely mistaken for each other, though there is some risk of confusion between Larix and Acer, or Fagus and Abies.

Table 4. Confusion matrix for tree species classification of the multi-species ensemble forecast (EM) ALL R2 model based on (a) the split-sample evaluation and (b) fully independent data. CCR = overall correct classification rate, Kappa = Cohen’s kappa coefficient. The “Other” column indicates samples for which none of our species was predicted with high enough likelihood. The matrix diagonal holds the correctly predicted samples.

(a) Split-sample evaluation

Field data      Classified as                                             Prod. acc. (%)
                Fagus   Fraxinus   Acer   Larix   Picea   Abies   Other
Fagus           109     6          8      1       –       3       3       83.8
Fraxinus        27      67         16     –       –       3       6       56.3
Acer            16      24         44     –       –       5       –       49.4
Larix           1       3          4      33      25      –       3       47.8
Picea           2       1          2      5       232     17      12      85.6
Abies           1       3          –      –       23      103     4       76.9
User acc. (%)   69.9    64.4       59.5   84.6    82.9    78.6    –
CCR = 0.72, Kappa = 0.65

(b) Fully independent data evaluation

Field data      Classified as                                             Prod. acc. (%)
                Fagus   Fraxinus   Acer   Larix   Picea   Abies   Other
Fagus           26      12         5      –       –       2       1       56.5
Fraxinus        3       18         12     –       1       1       6       43.9
Acer            5       22         7      –       –       1       –       20.0
Larix           –       2          4      6       10      –       9       19.4
Picea           –       –          1      3       79      7       3       84.9
Abies           –       1          –      –       23      15      1       37.5
User acc. (%)   76.5    32.7       24.1   66.7    69.9    57.7    –
CCR = 0.53, Kappa = 0.41

Fig. 6. Subset of (a) original aerial imagery and (b) the multi-species vegetation map where each pixel (5 × 5 m) is attributed to one of our six species or to “other”.

Table 5. Pixel-based overlap between pair-wise projections of the six target species. The overlap is listed column-wise per species and expresses to what percentage the pixels that are projected to have each target species present (listed in columns) are also projected to carry the respective other five species (listed in rows). For instance, the first column of the table indicates the overlap between the projected distribution of Fagus and all other species as a percentage of the total distribution of Fagus. All values represent averages among the EM ALL R2 independent and EM ALL R2 repeated split-sample models.
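The column-wise overlap percentages described for Table 5 can be computed as in the following sketch. It assumes, for illustration only, that each species' predicted presences are available as a set of pixel identifiers; the function name and the toy maps are hypothetical.

```python
def overlap_table(presence_maps):
    """Percentage of each species' predicted pixels that another species also claims.

    Returns table[row_species][column_species], to be read column-wise:
    the share of the column species' pixels where the row species is also predicted.
    """
    table = {}
    for row_sp, row_pixels in presence_maps.items():
        table[row_sp] = {}
        for col_sp, col_pixels in presence_maps.items():
            if row_sp == col_sp or not col_pixels:
                continue
            shared = len(row_pixels & col_pixels)
            table[row_sp][col_sp] = 100 * shared / len(col_pixels)
    return table


# Hypothetical predicted-presence maps, each a set of pixel identifiers.
presence_maps = {
    "Fraxinus": {1, 2, 3, 4, 5},
    "Acer": {3, 4, 5, 6},
    "Picea": {7, 8, 9},
}
table = overlap_table(presence_maps)
```

The table is not symmetric: in the toy data, three of five Fraxinus pixels are shared with Acer, but the same three pixels make up three quarters of the Acer pixels. This is why the text reports two different percentages for the Fraxinus and Acer pair.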

4. Discussion

In this study we mapped the spatial location of six individual forest tree species at a 5 m spatial resolution. We also investigated the impact that different modeling techniques, different explanatory variable types (topo-climatic vs. remote-sensing based), and different qualities of the response variable have on model performance.

4.1. Comparison of model performance across modeling techniques, explanatory and response variables

Comparing the predictive power of the different modeling techniques, we find that ensemble models (i.e. combining the projections of different individual modeling techniques, EM) represent overall the best performing approach (Supplementary material, Fig. S2). This corroborates previous results from the species distribution modeling field (Marmion et al., 2009; Engler et al., 2011). The accuracy of the fitted models that we obtain from the ensemble modeling technique suggests that, at least at a regional level, useful and informative species identity maps can be produced that can be used for a range of research and management applications.

In terms of explanatory variable importance, our results show that the remote-sensing derived variables (raw bands and vegetation indices) are by far the most important variables for mapping the spatial distribution of our target species at high resolution (5 m pixel size). In our particular case, adding topo-climatic variables to the remote sensing predictors did not significantly improve the models’ predictive power (Fig. 5a and b; paired t-test comparing ALL and RS model TSS values, p-values always > 0.15, N = 12). Conversely, removing the remote-sensing derived variables from the models led to a clear drop in predictive power and to poorly performing models (TSS < 0.4). As such, our results sharply contrast with those found by Zimmermann et al. (2007), who also combined topo-climatic and remote-sensing based variables to model tree distributions, although at a much coarser scale (90 m pixels) and over a much larger spatial extent. In their study, Zimmermann and colleagues found that models based solely on topo-climatic variables proved more accurate than those based only on remote-sensing variables, and that the independent contribution of remote-sensing based variables to model predictive power was on average 20%, while it was 40% for topo-climatic predictors. The reason for these divergent findings most likely originates from the differences in both the spatial resolution of the aerial imagery that was used and the geographical extent of the two studies: while Zimmermann et al. worked over a large spatial extent at a moderate spatial resolution (~400 × 600 km, 90 m pixels), the present study uses a much higher resolution (5 m pixels) over a much smaller area (20 × 30 km). In our study, two adjacent 5 m pixels that host different species can have the same topo-climatic conditions, making it difficult for the models to discriminate species based on topo-climatic variables alone. Furthermore, given its relatively small extent, our study area contains only limited variation in climatic space, which lowers the predictive capacity of topo-climatic predictors. The combination of these two reasons most likely explains why topo-climatic variables have little explanatory power in our models, whereas the opposite was found by Zimmermann et al. (2007). As a corollary, we expect topo-climatic variables to have an increased predictive power if working at a larger extent, e.g. whole cantons (states) or all of Switzerland, where some species could for instance be excluded from certain areas on a climatic basis. On the other hand, our results also suggest that high-resolution remote-sensing predictors are crucial to distinguish spatial patterns when mapping species at very high resolution.

Finally, we also evaluated the influence of using two different response variable datasets: the R1 dataset contained only samples (i.e. digitized tree crowns) that covered >50% of a 5 × 5 m pixel, while the R2 dataset contained all samples, regardless of their coverage. Our results indicate that there is no significant difference in model quality to be found when evaluating these two datasets using the split-sample and independent evaluations (Fig. 5a and b; paired t-test p-values always > 0.1, N = 12). However, the R2 dataset scored better when the projections were reclassified using the calibration-data derived threshold (Fig. 5c; paired t-test p = 0.02, N = 12), which is the evaluation method that requires the highest degree of generalization from a model. The reason why the R2 dataset performs significantly better in the latter evaluation is likely related to the fact that the R2 dataset generates models that are more generalizable, and will hence perform better when projected outside of their calibration range (whereas the R1 dataset might produce less generalizable models that perform well when projected on their calibration area, but less so when projected outside of the calibration area).
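For readers unfamiliar with the paired t-test used in these comparisons, the test statistic can be sketched as follows. The TSS values below are invented for illustration and are not results from this study; the sketch stops at the t statistic and its degrees of freedom, since the Python standard library does not provide the Student t distribution needed for the p-value.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(a, b):
    """Return the paired t statistic and its degrees of freedom."""
    diffs = [x - y for x, y in zip(a, b)]
    t = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical TSS values for the same four models fitted with two data sets.
tss_r2 = [0.70, 0.72, 0.68, 0.75]
tss_r1 = [0.66, 0.70, 0.66, 0.70]
t_stat, dof = paired_t(tss_r2, tss_r1)
```

The test works on the per-model differences, so it asks whether one data set is consistently better for the same model rather than whether the two groups differ on average.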

4.2. Species mapping performance

In our ensemble modeling projections that use both remote-sensing derived and topo-climatic variables, the TSS values we obtained for individual species range between ~0.7 and 0.8 (Fig. 5a and b). This means that, on average, about 85–90% of the independent evaluation samples were correctly classified by the models for individual species. The performance values decrease somewhat when using the reclassification threshold (i.e. the threshold used for converting probability to actual presences and absences) derived from the calibration data (Fig. 5c), but remain nevertheless acceptable (TSS between ~0.5 and 0.7, i.e. ~75–85% of correctly classified samples).

Overall, the accuracies obtained in this study are in line with, or higher than, those in similar studies. In the multi-species map (Fig. 6), depending on the considered species, 47.8–85.6% of our samples were correctly predicted (Table 4a). We obtained an overall CCR (correct classification rate) of 0.72 and a Cohen’s kappa of 0.65. Waser et al. (2011), who also used ADS40 multi-spectral data, obtained an overall CCR of 0.76, a Cohen’s kappa of 0.76 and percentages of correctly classified samples ranging between 25.4% and 84.7% depending on the species. Olofsson et al. (2006) used multispectral imagery taken with a Zeiss/Intergraph DMC aerial camera and obtained 88% CCR to discriminate between Scots pine, Norway spruce and deciduous trees. However, direct comparison of our results with these studies is difficult because the study areas and evaluation methods differ. Note that, as expected, the multi-species model has higher error rates than the individual species models as it can only predict a single species at each location (if two species are predicted at the same location, only the one with highest likelihood is kept).

When projecting our models to a geographical location different from the one where the calibration samples originate (fully independent evaluation, Table 4b), the percentage of correctly predicted samples decreased by 27–39% for Fagus, Acer, Larix and Abies, yet only by 1.8% and 12.4% for Picea and Fraxinus, respectively. The overall CCR decreased from 0.72 to 0.53. The fact that models perform better at the geographical location where their calibration data originate can be expected. Yet, our results nevertheless provide an example of the amount of decrease in predictive performance that can be expected when projecting models outside of their calibration area, an evaluation that is rarely provided.

In our study we chose to work at a 5 m spatial resolution as this roughly corresponds to the size of a tree crown, and was shown to provide good results for individual tree species classification (Waser, 2012). However, pixels are not necessarily centered on a single tree, and 5 × 5 m pixels can contain a mixture of different species. This likely increases the rate of inaccurate projections by our models. Working at a higher spatial resolution, for instance 2 m rather than 5 m, might offer improvement as the likelihood of having mixtures of different species in a pixel would be reduced. However, it would also reduce the number of 50 cm pixels on which the mean and standard deviation computation is based (i.e. the values of the remote-sensing based variables were computed as the average and standard deviation of all the 50 cm pixels that fell within a 5 m pixel), thereby increasing variability in these values.
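The trade-off described above can be made concrete with a small aggregation sketch: a 5 m pixel summarizes 10 × 10 = 100 pixels of 50 cm, whereas a 2 m pixel summarizes only 4 × 4 = 16. The code below is an illustration on a made-up grid; the function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not specify which estimator was used.

```python
from statistics import mean, pstdev


def aggregate(grid, block):
    """Mean and standard deviation of each block x block window of a 2-D grid."""
    rows, cols = len(grid), len(grid[0])
    means, sds = [], []
    for r in range(0, rows - block + 1, block):
        mean_row, sd_row = [], []
        for c in range(0, cols - block + 1, block):
            window = [grid[i][j]
                      for i in range(r, r + block)
                      for j in range(c, c + block)]
            mean_row.append(mean(window))
            sd_row.append(pstdev(window))
        means.append(mean_row)
        sds.append(sd_row)
    return means, sds


# Hypothetical 20 x 20 grid of 50 cm values, i.e. a 10 m x 10 m patch.
fine = [[(i * 20 + j) % 7 for j in range(20)] for i in range(20)]

means_5m, sds_5m = aggregate(fine, block=10)   # 5 m pixels: 100 values each
means_2m, sds_2m = aggregate(fine, block=4)    # 2 m pixels:  16 values each
```

With fewer fine pixels per window, each mean and standard deviation rests on a smaller sample, which is the source of the increased variability mentioned in the text.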

A best-of-both-worlds approach might be achieved by combining our approach with filtering techniques such as local maximum filtering (Wulder et al., 2000), or with pattern-recognition techniques using the object-oriented classification software eCognition (Definiens, 2009). These techniques allow delineating the crown of individual trees, and thus the mixture in tree crowns could be reduced while at the same time encompassing as many of the 50 cm pixels as available for a given tree individual (e.g. Waser et al., 2011; Waser, 2012). A similar idea is to combine LiDAR and multi-spectral data, the former to delineate individual trees and the latter to distinguish among the different species (Holmgren et al., 2008; Chubey et al., 2009; Ke et al., 2010), or data from sensors providing additional spectral or 3-D information, i.e. hyperspectral data (e.g. Leckie et al., 2005). Additionally, texture recognition techniques (e.g., Franklin et al., 2000; Dobrowski et al., 2008) could also be used to further improve the classification accuracy, in particular to delineate grassy clearings inside of forest patches that were often wrongly classified in our results.

Besides the mixture of tree canopies in a single pixel that frequently occurs in forest ecosystems, another source of pixel misclassification originates from differences in scene illumination between images. The images that we used were recorded at different times of the day and under varying sun angles. For instance, our models frequently over-predicted Abies alba in areas where the aerial imagery data are darker, because the canopy of Abies alba has lower reflectance values than the canopy of other species. This issue, however, has little effect on our results since our training and evaluation sites are located outside of these darker areas (this was a necessity as it is very difficult to identify individual trees in such areas). A corollary, however, is that our models are somewhat biased towards trees with larger crowns and areas located in the brighter parts of the aerial imagery, and their predictions are thus less reliable in the darker sections of the image. To some extent, this could be improved by processing the different aerial images so as to avoid having too much contrast between the different scenes (Inpho GmbH and Stellacore Corp, 2012).

Finally, multi-temporal data (e.g. Key et al., 2001; Hill et al., 2010) could also represent a possible option to improve model predictive accuracy, as was found e.g. in Schwarz and Zimmermann (2005). For instance, Larix was often mistaken for Picea by our models (Table 5), but imagery taken in the winter (when Larix has no needles) would likely allow better discrimination between the two species.

In terms of practical applications, mapping individual tree species at high spatial resolution may serve a range of different purposes. For example, it can improve the accuracy of physiological model applications. The DO3SE model, a soil–vegetation–atmosphere-transport (SVAT) model, has been specifically designed to estimate ozone deposition to European vegetation, based on species-specific parameterization (Büker et al., 2011). With the additional information gained by vegetation mapping, the accuracy of ozone risk assessment based on species-specific estimates may be significantly improved at forest stand or ecosystem level. Other examples for the use of individual tree species maps include the calculation of ecosystem services provisioning based on species-specific characteristics, the derivation of conservation-related analyses that are based on the knowledge of individual tree species information and their spatial configuration, or other forms of up- or downscaling of forest-related information via statistical or process-based models. Specifically, such maps can assist in scaling the information obtained at forest inventory plots to larger spatial scales, by knowing the number and position of individual tree species between inventory plots.

Acknowledgment

This research was conducted as part of the 6th & 7th European Framework Program Grants GOCE-CT-2007-036866 (ECOCHANGE) and ENV-CT-2009-226544 (MOTIVE).

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.foreco.2013.07.059.

References

Allouche, O., Tsoar, A., Kadmon, R., 2006. Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS). Journal of Applied Ecology 43, 1223–1232.

Araújo, M.B., New, M., 2007. Ensemble forecasting of species distributions. Trends in Ecology and Evolution 22 (1), 42–47.

Aspinal, R.J., 2002. Use of logistic regression for validation of maps of the spatial distribution of vegetation species derived from high spatial resolution hyperspectral remotely sensed data. Ecological Modelling 157, 301–312.

BAE Systems, 2007. SOCET SET User’s manual, Version 5.4.1, October 2007, 1134 pp.

BfS, 2000. Digitale Bodeneignungskarte der Schweiz. Bundesamt für Statistik, Bern, 12 pp.

Boschetti, M., Boschetti, L., Oliveri, S., Casati, L., Canova, I., 2007. Tree species mapping with Airborne hyper-spectral MIVIS data: the Ticino Park study case. International Journal of Remote Sensing 28, 1251–1261.

Brandtberg, T., 2002. Individual tree-based species classification in high spatial resolution aerial images of forests using fuzzy sets. Fuzzy Sets and Systems 132 (3), 371–387.

Brandtberg, T., 2007. Classifying individual tree species under leaf-off and leaf-on conditions using airborne LiDAR. ISPRS Journal of Photogrammetry and Remote Sensing 61, 325–340.

Breiman, L., 2001. Random Forests. Machine Learning 45 (1), 5–32.

Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J., 1984. Classification and Regression Trees. Wadsworth International Group, Belmont, CA, USA.

Büker, P., Morrissey, T., Briolat, A., Falk, R., Simpson, D., Tuovinen, J.-P., Alonso, R., Barth, S., Baumgarten, M., Grulke, N., Karlsson, P.E., King, J., Lagergren, F., Matyssek, R., Nunn, A., Ogaya, R., Peñuelas, J., Rhea, L., Schaub, M., Uddling, J., Werner, W., Emberson, L.D., 2011. DO3SE modelling of soil moisture to determine ozone flux to European forest trees. Atmospheric Chemistry and Physics Discussion 11, 33583–33650.

Chubey, M., Stehle, K., Albricht, R., Gougeon, F., Leckie, D., Gray, S., Woods, M., Courville, P., 2009. Semi-automated species classification in Ontario Great Lakes – St. Lawrence forest conditions. Final Report: Great Lakes – St. Lawrence ITC Project (2005/2008). Ontario Ministry of Natural Resources, January 2009, 71 p.

Dalponte, M., Orka, H.O., Gobakken, T., Gianelle, D., Naeesset, E., 2013. Tree species classification in boreal forests with hyperspectral data. IEEE Transactions on Geoscience and Remote Sensing 51, 2632–2645.

Definiens, A.G., 2009. Definiens eCognition Developer 8 User Guide. Definiens AG, München, Germany.

Dobbertin, M., 2009. Forest ecosystems in a changing environment: what are future monitoring and research needs? Foreword. Ann. For. Sci. 66.

Dobrowski, S.Z., Safford, H.D., Cheng, Y.B., Ustin, S.L., 2008. Mapping mountain vegetation using species distribution modeling, image-based texture analysis, and object-based classification. Applied Vegetation Science 11, 499–508.

Emberson, L.D., Simpson, D., Tuovinen, J.-P., Ashmore, M.R., Cambridge, H.M., 2001. Modelling and mapping ozone deposition in Europe. Water Air and Soil Pollution 130, 577–582.

Engler, R., Guisan, A., Rechsteiner, L., 2004. An improved approach for predicting the distribution of rare and endangered species from occurrence and pseudo-absence data. Journal of Applied Ecology 41, 263–274.

Engler, R., Randin, C.R., Thuiller, W., Dullinger, S., Zimmermann, N.E., Araújo, M.B., Pearman, P.B., Albert, C.H., Choler, P., de Lamo, X., Dirnböck, T., Gómez-García, D., Grytnes, J.-A., Heegard, E., Høistad, F., Le Lay, G., Nogues-Bravo, D., Normand, S., Piédalu, C., Puscas, M., Sebastià, M.-T., Stanisci, A., Theurillat, J.-P., Trivedi, M., Vittoz, P., Guisan, A., 2011. 21st Century climate change threatens mountain flora unequally across Europe. Global Change Biology 17 (7), 2330–2341.

Erikson, M., 2004. Species classification of individually segmented tree crowns in high-resolution aerial images using radiometric and morphologic image measures. Remote Sensing of Environment 91 (3–4), 469–477.

Etzold, S., Ruehr, N.K., Zweifel, R., Dobbertin, M., Zingg, A., Pluess, P., Häsler, R., Eugster, W., Buchmann, N., 2011. The carbon balance of two contrasting mountain forest ecosystems in Switzerland: similar annual trends, but seasonal differences. Ecosystems 14, 1289–1309.

Ferrier, S., 2002. Mapping spatial pattern in biodiversity for regional conservation planning: where to from here? Systematic Biology 51, 331–363.

Franklin, S.E., Hall, R.J., Moskal, L.M., Maudie, A.J., Lavigne, M.B., 2000. Incorporating texture into classification of forest species composition from airborne multispectral images. International Journal of Remote Sensing 21, 61–79.

Friedman, J.H., 1991. Multivariate adaptive regression splines. Annals of Statistics 19, 1–67.

Graf Pannatier, E., Dobbertin, M., Heim, A., Schmitt, M., Thimonier, A., Waldner, P., Frey, B., 2010. Response of carbon fluxes to the 2003 heat wave and drought in three mature forests in Switzerland. Biogeochemistry 107, 295–317.

Guisan, A., Thuiller, W., 2005. Predicting species distribution: offering more than simple habitat models. Ecology Letters 8, 993–1009.

Guisan, A., Zimmermann, N.E., 2000. Predictive habitat distribution models in ecology. Ecological Modelling 135, 147–186.

Haboudane, D., Miller, J.R., et al., 2004. Hyperspectral vegetation indices and novel algorithms for predicting green LAI of crop canopies: modeling and validation in the context of precision agriculture. Remote Sensing of Environment 90 (3), 337–352.

Hastie, T., Tibshirani, R., 1986. Generalized additive models. Statistical Science 1 (3), 297–318.

Hastie, T., Tibshirani, R., Buja, A., 1994. Flexible discriminant analysis by optimal scoring. Journal of the American Statistical Association 89 (428), 1255–1270.

Heinzel, J.N., Koch, B., 2011. Exploring full-waveform LiDAR parameters for tree species classification. International Journal of Applied Earth Observation and Geoinformation 13 (1), 152–160.

Hill, R., Wilson, A.K., et al., 2010. Mapping tree species in temperate deciduous woodland using time-series multi-spectral data. Applied Vegetation Science 13 (1), 86–99.

Hirschmugl, M., Weninger, B., Raggam, H., Schardt, M., 2007. Single tree detection in very high resolution remote sensing data. Remote Sensing of Environment 110 (4), 533–544.

Holmgren, J., Persson, Å., Söderman, U., 2008. Species identification of individual trees by combining high resolution LiDAR data with multispectral images. International Journal of Remote Sensing 29 (5), 1537–1552.

Inpho GmbH and Stellacore Corp., 2012. Orthovista Direct. <http://www.orthovista.com/> (accessed 25.11.12).

Jones, T.G., Coops, N.C., Sharma, T., 2010. Assessing the utility of airborne hyperspectral and LiDAR data for species distribution mapping in the coastal Pacific Northwest, Canada. Remote Sensing of Environment 114, 2841–2852.

Ke, Y., Quackenbush, L.J., et al., 2010. Synergistic use of QuickBird multispectral imagery and LIDAR data for object-based forest species classification. Remote Sensing of Environment 114 (6), 1141–1154.

Kellenberger, T.W., Nagy, P., 2008. Potential of the ADS40 aerial scanner for archaeological prospection in Rheinau, Switzerland. In: Proceedings of XXI Congress International Society for Photogrammetry and Remote Sensing, Beijing, China.

Key, T., Warner, T.A., McGraw, J.B., Ann Fajvan, M., 2001. A comparison of multispectral and multitemporal information in high spatial resolution imagery for classification of individual tree species in a temperate hardwood forest. Remote Sensing of Environment 75, 100–112.

Kukkala, A.S., Moilanen, A., 2013. Core concepts of spatial prioritisation in systematic conservation planning. Biological Reviews 88, 443–464.

Leckie, D.G., Gougeon, F.A., Walsworth, N., Paradine, D., 2003. Stand delineation and composition estimation using semi-automated individual tree crown analysis. Remote Sensing of Environment 85 (3), 355–369.

Leckie, D.G., Tinis, S., Nelson, T., Burnett, Ch., Gougeon, F.A., Cloney, E., Paradine, D., 2005. Issues in species classification of trees in old growth conifer stands. Canadian Journal of Remote Sensing 31 (2), 175–190.

Leica Geosystems, 2011. ADS80 sensor. <http://www.leica-geosystems.com/de/Leica-ADS80-Airborne-Digital-Sensor_86846.htm> (last access: 21.11.2012).

Lorenz, M., 1995. International co-operative programme on assessment and monitoring of air pollution effects on forests – ICP forests. In: 5th International Conference on Acidic Deposition – Science and Policy: Acid Reign 95, pp. 1221–1226. Springer Netherlands, Gothenburg, Sweden.

Marmion, M., Parviainen, M., Luoto, M., Heikkinen, R.K., Thuiller, W., 2009. Evaluation of consensus methods in predictive species distribution modeling. Diversity and Distributions 15, 59–69.

McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models, second ed. Chapman & Hall, London, UK.

Olofsson, K., Wallermann, J., Holmgren, J., Olsson, H., 2006. Tree species discrimination using Z/I DMC imagery and template matching of single trees. Scandinavian Journal of Forest Research 21, 106–110.

Ørka, H.O., Næsset, E., Bollandsås, O.M., 2009. Classifying species of individual trees by intensity and structural features derived from airborne laser scanner data. Remote Sensing of Environment 113 (6), 1163–1174.

Petrie, G., Walker, A.S., 2007. Airborne digital imaging technology: a new overview. Photogrammetric Record 22 (119), 203–225.

Randin, C.F., Dirnböck, T., Dullinger, S., Zimmermann, N.E., Zappa, M., Guisan, A., 2006. Are niche-based species distribution models transferable in space? Journal of Biogeography 33 (10), 1689–1703.

Ridgeway, G., 1999. The state of boosting. Computing Science and Statistics 31, 172–181.

Ripley, B.D., 1996. Pattern Recognition and Neural Networks. Cambridge University Press.

Roujean, J.-L., Breon, F.M., 1995. Estimating PAR absorbed by vegetation from bidirectional reflectance measurements. Remote Sensing of Environment 51 (3), 375–384.

Rouse, J.W., Haas, R.H., et al., 1974. Monitoring the vernal advancements and retrogradation of natural vegetation. NASA/GSFC, Final Report. Greenbelt, MD: 1–137.

Schaub, M., Skelly, J.M., Zhang, J.W., Ferdinand, J.A., Savage, J.E., Stevenson, R.E., Davis, D.D., Steiner, K.C., 2005. Physiological and foliar symptom response in the crowns of Prunus serotina, Fraxinus americana and Acer rubrum canopy trees to ambient ozone under field conditions. Environmental Pollution 133, 553–567.

Schwarz, M., Zimmermann, N.E., 2005. A new GLM-based method for mapping tree cover continuous fields using MODIS reflectance data. Remote Sensing of Environment 95, 428–443.

Sørensen, R., Seibert, J., 2007. Effects of DEM resolution on the calculation of topographical indices: TWI and its components. Journal of Hydrology 347, 79–89.

Thimonier, A., Graf Pannatier, E., Schmitt, M., Waldner, P., Walthert, L., Schleppi, P., Dobbertin, M., Kräuchi, N., 2010. Does exceeding the critical loads for nitrogen alter nitrate leaching, the nutrient status of trees and their crown condition at Swiss Long-term Forest Ecosystem Research (LWF) sites? European Journal of Forest Research 129, 443–461.

Thuiller, W., Lafourcade, B., Engler, R., Araújo, M.B., 2009. BIOMOD – a platform for ensemble forecasting of species distributions. Ecography 32, 369–373.

Waser, L.T., 2012. Airborne remote sensing data for semi-automated extraction of tree area and classification of tree species. Dissertation 20464, ETH Zurich. 153 p.

Waser, L.T., Klonus, S., Ehlers, M., Küchler, M., Jung, A., 2010. Potential of digital sensors for land cover and tree species classifications – a case study in the framework of the DGPF-project. Photogrammetrie Fernerkundung Geoinformation 2, 141–156.

Waser, L.T., Ginzler, C., Kuechler, M., Baltsavias, E., Hurni, L., 2011. Semi-automatic classification of tree species in different forest ecosystems by spectral and geometric variables derived from Airborne Digital Sensor (ADS40) and RC30 data. Remote Sensing of Environment 115, 76–85.

Wulder, M., 1998. Optical remote-sensing techniques for the assessment of forest inventory and biophysical parameters. Progress in Physical Geography 22, 449–476.

Wulder, M., Niemann, K.O., Goodenough, D.G., 2000. Local maximum filtering for the extraction of tree locations and basal area from high spatial resolution imagery. Remote Sensing of Environment 73 (1), 103–114.

Xie, Y., Sha, Z., Yu, M., 2008. Remote sensing imagery in vegetation mapping: a review. Journal of Plant Ecology 1 (1), 9–23.

Zimmermann, N.E., Kienast, F., 1999. Predictive mapping of alpine grasslands in Switzerland: species versus community approach. Journal of Vegetation Science 10 (4), 469–482.

Zimmermann, N.E., Moisen, G.G., Edwards, T.C., Frescino, T.S., Blackard, J.A., 2007. Remote sensing-based predictors improve distribution models of rare, early successional and broadleaf tree species in Utah. Journal of Applied Ecology 44, 1057–1067.

Zimmermann, N.E., Edwards, T.C., Graham, C.G., Pearman, P.B., Svenning, J.-C., 2010. New trends in species distribution modelling. Ecography 33, 985–989.

Zweifel, R., Eugster, W., Etzold, S., Dobbertin, M., Buchmann, N., Häsler, R., 2010. Link between continuous stem radius changes and net ecosystem productivity of a subalpine Norway spruce forest in the Swiss Alps. New Phytologist 187, 819–830.