Emilie Henderson, Janet Ohmann, Matthew Gregory, Heather Roberts and Harold Zald

All for one or One for All?

Mapping many species individually vs. simultaneously with random forest.

August 10, 2012

Ecological Society of America Annual Meeting

Portland, Oregon

Species Distribution Modeling

• Has been around for a long time and has exploded over the last decade.

“With the rise of new powerful statistical techniques and GIS tools, the development of predictive habitat distribution models has rapidly increased in ecology.”

– Guisan and Zimmermann 2000

• Generalized Linear/Additive Models
• Neural networks
• Bayesian models
• Ordination
• Classification methods

• Web of Knowledge: ‘species distribution’
– 2000–2001: 556 articles
– 2011–2012: 1,389 articles

SDM Uses

From Guisan and Thuiller 2005

Strategies for community-level modeling

• ‘assemble first, predict later’

• ‘predict first, assemble later’

• ‘assemble and predict together’

--Ferrier & Guisan 2006

Objective: Compare two strategies for community-level predictive mapping.

You Are Here

Pacific silver fir, Abies amabilis
Grand fir / White fir, Abies grandis / concolor
Subalpine fir, Abies lasiocarpa
Noble fir / Shasta red fir, Abies procera / shastensis
Bigleaf maple, Acer macrophyllum
Red alder, Alnus rubra
Madrone, Arbutus menziesii
Incense cedar, Calocedrus decurrens
Mountain mahogany, Cercocarpus ledifolius
Giant chinkapin, Chrysolepis chrysophylla
Pacific dogwood, Cornus nuttallii
Oregon ash, Fraxinus latifolia
Western juniper, Juniperus occidentalis
No Trees Present
Lodgepole pine, Pinus contorta
Engelmann spruce, Picea engelmannii
Jeffrey pine, Pinus jeffreyi
Sugar pine, Pinus lambertiana
Western white pine, Pinus monticola
Ponderosa pine, Pinus ponderosa
Black cottonwood, Populus balsamifera ssp. trichocarpa
Bitter cherry, Prunus emarginata
Douglas-fir, Pseudotsuga menziesii
Oregon white oak, Quercus garryana
California black oak, Quercus kelloggii
Pacific yew, Taxus brevifolia
Western red cedar, Thuja plicata
Western hemlock, Tsuga heterophylla
Mountain hemlock, Tsuga mertensiana

Plot Data

Forest Inventory and Analysis Annual Plots: 1948 plots

Techniques – Random Forest Based (Breiman 2001, Cutler et al. 2007)

Binary prediction (R package: randomForest, Liaw & Wiener 2002)

Continuous prediction

Nearest Neighbor Imputation (R package: yaImpute, Crookston & Finley 2008)

Spatial Data Layers

Climate (from PRISM climate data)

Soil Parent Material (from SSURGO/Soil Resources Inventory)

Topography (from National Elevation Dataset)

Spectral reflectance (LANDSAT)

[Figure: six example classification trees from a random forest. Splits use SMRTP, SMRTMP, ANNTMP, TC1, TC3 and DEM; each terminal node is labeled TRUE or FALSE.]

# True / # Trees = 4/6 ≈ 0.67

For RF regression, the predicted value for a pixel is the average of the predictions from all trees.
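A minimal sketch of the two aggregation rules above, using made-up per-tree outputs for a single pixel. The vote list mirrors the 4-of-6 example; the regression values are hypothetical basal areas, not results from the study, and the 0.5 cutoff is an assumption.

```python
from statistics import mean

# Hypothetical per-tree outputs for one pixel (illustrative values only).
tree_votes = [True, False, True, True, False, True]   # classification trees
tree_values = [12.0, 0.0, 8.5, 15.0, 3.5, 9.0]        # regression trees (e.g., basal area)

# Classification: fraction of trees voting "present".
vote_fraction = sum(tree_votes) / len(tree_votes)
# A 0.5 majority threshold is an assumption; other cutoffs can be used.
predicted_present = vote_fraction > 0.5

# Regression: average of the individual tree predictions.
predicted_value = mean(tree_values)

print(round(vote_fraction, 2), predicted_present, predicted_value)
```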

Random forest -- Nearest-Neighbor imputation

Imputation = Filling in missing values from existing values.

Methods: k-NN

[Diagram: feature space (axes: Rainfall, Elevation) beside geographic space (study area)]

(1) Place plots within feature space

(2) Place new pixel within feature space

(3) Find nearest-neighbor plot within feature space

(4) Impute nearest neighbor’s Plot ID # to the pixel
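The k-NN steps can be sketched as follows, assuming Euclidean distance in a two-variable feature space. The plot coordinates, the pixel, and the function name `impute_plot_id` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical reference plots: plot ID -> (rainfall, elevation), already scaled.
plots = {1: (0.2, 0.9), 2: (0.8, 0.3), 3: (0.5, 0.5), 4: (0.1, 0.1)}

def impute_plot_id(pixel, plots):
    """Return the ID of the plot closest to the pixel in feature space."""
    return min(plots, key=lambda pid: math.dist(pixel, plots[pid]))

# A new pixel, described by the same two (scaled) variables.
pixel = (0.55, 0.45)
nearest_id = impute_plot_id(pixel, plots)
print(nearest_id)
```

Because the pixel receives a plot ID rather than a single value, every attribute measured on that plot can be mapped from the same neighbor.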

“Assemble and Predict Together”

Methods: GNN (Ohmann and Gregory 2002)

[Diagram: gradient space (CCA Axis 1, e.g., Rainfall, local topography; CCA Axis 2, e.g., Temperature, Elevation) beside geographic space (study area)]

(1) Conduct gradient analysis of plot data

(2) Calculate axis scores of pixel from mapped data layers

(3) Find nearest-neighbor plot in gradient space

(4) Impute nearest neighbor’s Plot ID # to the pixel

Methods: Random Forest Nearest Neighbor Imputation

[Diagram: Random Forest space (six example trees splitting on SMRTP, SMRTMP, ANNTMP, TC1, TC3 and DEM) beside geographic space (study area)]

Nearest Neighbor Plot: #3

Second Nearest Neighbor: #5
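A minimal sketch of neighbor selection in random forest space, assuming the distance is one minus the fraction of trees in which a pixel and a plot fall in the same terminal node (the definition documented for yaImpute's randomForest method). The terminal-node IDs are invented so that the ranking matches the example above; they are not taken from the study's forest.

```python
# Hypothetical terminal-node IDs: one entry per tree (six trees).
plot_nodes = {
    1: [1, 2, 1, 1, 1, 2],
    2: [2, 1, 1, 3, 1, 1],
    3: [4, 3, 5, 2, 4, 4],
    4: [3, 4, 2, 4, 2, 3],
    5: [4, 3, 5, 5, 4, 1],
}
pixel_nodes = [4, 3, 5, 2, 4, 3]

def rf_distance(a, b):
    """One minus the share of trees where a and b share a terminal node."""
    shared = sum(x == y for x, y in zip(a, b))
    return 1 - shared / len(a)

# Rank plots from nearest to farthest in random forest space.
ranked = sorted(plot_nodes, key=lambda pid: rf_distance(pixel_nodes, plot_nodes[pid]))
print(ranked[:2])
```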

Strategies for community-level modeling

• ‘assemble first, predict later’

• ‘predict first, assemble later’
– Random forest – classification (binary prediction)
– Random forest – regression (continuous prediction)

• ‘assemble and predict together’
– Random forest – imputation (continuous prediction)

--Ferrier & Guisan 2006

Dimensions of Map Accuracy

• Single-species metrics
– Range – presence/absence
– Abundance – How much basal area?
– Is the distribution of values predicted realistic?

• Community-level metrics
– Diversity
– Composition

Sensitivity: True Positives / (True Positives + False Negatives)

Specificity: True Negatives / (True Negatives + False Positives)

True Skill Statistic (TSS): Sensitivity + Specificity - 1

Root Mean Square Difference (RMSD): sqrt(mean((predicted - observed)^2))
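The metrics defined above can be computed as in the sketch below. The function names and the small presence/absence and basal-area lists are hypothetical, and the code assumes at least one observed presence and one observed absence so that neither denominator is zero.

```python
import math

def accuracy_metrics(observed, predicted):
    """Sensitivity, specificity and TSS from paired presence/absence lists."""
    tp = sum(o and p for o, p in zip(observed, predicted))
    tn = sum((not o) and (not p) for o, p in zip(observed, predicted))
    fp = sum((not o) and p for o, p in zip(observed, predicted))
    fn = sum(o and (not p) for o, p in zip(observed, predicted))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, sensitivity + specificity - 1

def rmsd(observed, predicted):
    """Root mean square difference between paired continuous values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

# Hypothetical plot-level data (illustrative only).
obs_presence = [True, True, True, False, False, False, False, True]
pred_presence = [True, True, False, False, False, True, False, True]
obs_basal_area = [10.0, 0.0, 4.0, 6.0]
pred_basal_area = [8.0, 2.0, 4.0, 2.0]

sens, spec, tss = accuracy_metrics(obs_presence, pred_presence)
print(sens, spec, tss, rmsd(obs_basal_area, pred_basal_area))
```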

[Figure: plots comparing RF_C, RFNN and Plot Data]

Single Species Models

• Range
– Random Forest – Binary: best
– Random Forest – Nearest Neighbor: acceptable
– Random Forest – Continuous: fail

• Abundance (Basal Area)
– RMSD
  • Random Forest – Continuous: best
  • Random Forest – Nearest Neighbor: acceptable
  • Random Forest – Binary: NA
– Empirical Cumulative Distribution Functions (predicted value distributions)
  • Random Forest – Nearest Neighbor: best
  • Random Forest – Continuous: fail
  • Random Forest – Binary: NA

Diversity: Species Richness and Evenness

[Figure panels: Alpha diversity; Shannon diversity]
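For reference, a minimal sketch of species richness and the Shannon index for a single plot. Using basal area as the abundance measure is an assumption here, and the values are hypothetical.

```python
import math

def richness(abundances):
    """Number of species with non-zero abundance."""
    return sum(1 for a in abundances if a > 0)

def shannon(abundances):
    """Shannon index H' = -sum(p_i * ln(p_i)) over species present."""
    total = sum(abundances)
    return -sum((a / total) * math.log(a / total) for a in abundances if a > 0)

# Hypothetical basal areas for the species on one plot (illustrative only).
plot_basal_area = [10.0, 10.0, 0.0, 10.0, 10.0]

print(richness(plot_basal_area), shannon(plot_basal_area))
```

With four equally abundant species the index equals ln(4), its maximum for that richness, which is how the index reflects evenness as well as richness.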

Beta Diversity

1 1 3 5 6

1 1 3 5 4

2 2 3 5 4

2 2 3 4 4

Average Alpha Diversity for Blue Pixel: 3.04

Results – Composition

Bray-Curtis, Binary

Bray-Curtis, Continuous

What is the Bray-Curtis distance between our observed and predicted communities?
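A minimal sketch of the Bray-Curtis dissimilarity between an observed and a predicted community, in both continuous and binary form. The basal-area vectors are hypothetical, and the function assumes the two communities are not both empty.

```python
def bray_curtis(observed, predicted):
    """Bray-Curtis dissimilarity: sum|x - y| / sum(x + y)."""
    numerator = sum(abs(o - p) for o, p in zip(observed, predicted))
    denominator = sum(o + p for o, p in zip(observed, predicted))
    return numerator / denominator

# Hypothetical basal areas per species for one plot (illustrative only).
observed = [10.0, 5.0, 0.0, 5.0]
predicted = [5.0, 5.0, 5.0, 5.0]

continuous = bray_curtis(observed, predicted)
# Binary version: convert abundances to presence/absence first.
binary = bray_curtis([int(o > 0) for o in observed], [int(p > 0) for p in predicted])
print(continuous, binary)
```

In the binary case a species predicted where it is absent raises the distance, which is why false presences matter for composition.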

Discussion

• Species absences are an important dimension of composition
– Disturbance?
– Succession?
– Competition/Facilitation?
– Dispersal limitations?

• Community assembly rules can be used to help refine mapped species lists. (e.g., Guisan and Rahbek, 2011)

• But… imputation avoids the pitfalls & complications of re-assembling communities after mapping because they are never taken apart.

Conclusions

• Practical Considerations:
– Models of individual species may be
  • Strongest in one dimension
  • Useful for understanding species’ ecology
  • The best option for some types of available data (e.g., presence-only data from museum specimens)
– Nearest Neighbor mapping is a useful tool for building multipurpose maps.
  • Ranges and abundances
  • Composition
  • Diversity

Acknowledgements

• Nationwide Forest Imputation Study

• Landscape Ecology, Modeling, Mapping and Analysis team in Corvallis.
