Plans to improve estimators to better utilize panel data John Coulston Southern Research Station...

Plans to improve estimators to better utilize panel data

John Coulston

Southern Research Station

Forest Inventory and Analysis

Background and Motivation

• Symposium session on combining panel data: Recommendation: “…any serious attempt at defining an estimation system for analysis of changes and trends over time must explicitly account for time in the assumed underlying model…adopt and encourage an inferential model for FIA that places time on an equal footing with area…”

• Putting the “A” back in FIA – Clutter 2006.

Examples and approach

• Forest area change in Georgia from 1998-2007

• Spatial realization of forest age structure in Alabama in 2007

• Use an appropriate technique for the question posed

Why reinvent the wheel?

• Some analytical alternatives to Bechtold and Patterson 2005 for the annual forest inventory
  – Mixed estimator (Van Deusen 1999, 2002)
    • Current estimates – flexible underlying trend
  – Mixed model (Smith & Conkling 2005)
    • Current estimates and significance of annual change – linear
  – Random Forest (Breiman 2001, Crookston & Finley 2008)
    • Machine learning approach to classification and regression. Implemented in temporal map based estimation.

Is there a trend in Georgia forest area from 1998-2007?

[Figure: map of Georgia survey units: North, North Central, Central, Southeast, Southwest]

Mixed Estimator

$\bar{y}_t = \beta_t + e_t$

where
$\bar{y}_t$ = mean value of plots at time t
$e_t$ = independent random error
$\beta_t$ = random coefficient over time (described by a transition equation)
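The idea of a coefficient that evolves through a transition equation can be sketched with a small filter. This is a minimal illustration, not Van Deusen's mixed estimator itself: it assumes a random-walk transition (beta_t = beta_{t-1} + v_t), which the slide does not specify, and the function name, the variances, and the starting values are all hypothetical.

```python
def filter_local_level(y, obs_var, trans_var, beta0, p0):
    """Filter y_t = beta_t + e_t under an ASSUMED random-walk transition.

    y         : observed panel means, one per year
    obs_var   : sampling variance of each panel mean (variance of e_t)
    trans_var : variance of the transition noise v_t (assumption)
    beta0, p0 : starting value and starting variance for beta (assumption)
    Returns a list of (estimate, variance) pairs, one per year.
    """
    beta, p = beta0, p0
    estimates = []
    for y_t, r_t in zip(y, obs_var):
        # Predict: a random walk keeps the mean and adds transition variance.
        p_pred = p + trans_var
        # Update: the gain weights the new panel mean against the prediction.
        gain = p_pred / (p_pred + r_t)
        beta = beta + gain * (y_t - beta)
        # Updated variance, (1 - gain) * p_pred, in a numerically stable form.
        p = gain * r_t
        estimates.append((beta, p))
    return estimates


# Illustrative values only: three annual panel means with equal variance.
est = filter_local_level([1.0, 2.0, 3.0], [1.0, 1.0, 1.0],
                         trans_var=0.0, beta0=0.0, p0=1e6)
```

With `trans_var=0` and a very large `p0`, the coefficient is treated as constant and the final estimate approaches the simple mean of the panels. A larger `trans_var` lets the estimate follow recent panels more closely, which is the sense in which the underlying trend is flexible.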

Mixed Model

Stratified Estimate

$\bar{y} = \sum_h W_h \bar{y}_h$

where
$\bar{y}$ = mean plot-level value
$W_h$ = weight of stratum h (typically defined with remotely sensed information, NLCD for example)
$\bar{y}_h$ = mean plot-level value in stratum h
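A stratified estimate of this kind is a weighted sum of stratum means. The sketch below assumes the stratum weights sum to 1 and uses the textbook stratified-sampling variance without a finite population correction, which may differ from the variance estimator FIA applies. The stratum names and plot values are made up for illustration.

```python
from statistics import mean, variance


def stratified_estimate(plots_by_stratum, weights):
    """Return (estimate, standard_error) for a stratified mean.

    plots_by_stratum : dict mapping stratum name -> list of plot-level values
    weights          : dict mapping stratum name -> stratum weight (sum to 1)
    """
    estimate = 0.0
    var = 0.0
    for stratum, values in plots_by_stratum.items():
        w_h = weights[stratum]
        # Weighted stratum mean contributes to the overall estimate.
        estimate += w_h * mean(values)
        # Textbook variance term: W_h^2 * s_h^2 / n_h (no fpc; assumption).
        var += w_h ** 2 * variance(values) / len(values)
    return estimate, var ** 0.5


# Hypothetical plot-level forest proportions in two NLCD-style strata.
plots = {
    "forest": [1.0, 1.0, 0.5, 1.0],
    "nonforest": [0.0, 0.0, 0.25, 0.0],
}
stratum_weights = {"forest": 0.6, "nonforest": 0.4}
estimate, std_error = stratified_estimate(plots, stratum_weights)
```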

Is there a trend in Georgia forest area from 1998-2007?

[Figure: forest area by year, 1998-2006, one panel per survey unit (Central, North, North Central, South East, South West), comparing mixed estimation, mixed model, simple random sample, and stratified estimation]

Example: forest area trends in GA 1998-2007

[Figure: forest area by year, 1998-2006, one panel per survey unit (Central, North, North Central, South East, South West)]

Typical “sampling error” approach

[Figure: annual estimates with sampling errors, 2000-2005]

Hypothesis:
H0: Δpf = 0
H1: Δpf ≠ 0

Approach: Sampling errors overlap so no significant change.

Issues: Type II errors; failure to leverage repeated measures.
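The Type II error issue can be illustrated with a small numeric sketch. The plot values below are invented, and the setup (eight plots measured in 2000 and remeasured in 2005) is an assumption for illustration. The two confidence intervals overlap heavily because plots differ a lot from one another, yet the change on the remeasured plots is consistent enough that a paired test detects it.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 5% critical value of Student's t with 7 degrees of freedom.
T_CRIT = 2.365


def mean_and_half_width(values, t_crit):
    """Mean and confidence-interval half width for one measurement occasion."""
    return mean(values), t_crit * stdev(values) / sqrt(len(values))


def paired_t(before, after):
    """t statistic for the mean within-plot change (repeated measures)."""
    diffs = [a - b for a, b in zip(after, before)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical forest proportions on the same eight plots at two occasions.
pf_2000 = [0.20, 0.50, 0.80, 0.30, 0.90, 0.60, 0.40, 0.70]
pf_2005 = [0.18, 0.47, 0.79, 0.28, 0.86, 0.58, 0.37, 0.69]

m1, h1 = mean_and_half_width(pf_2000, T_CRIT)
m2, h2 = mean_and_half_width(pf_2005, T_CRIT)

# "Sampling error" approach: declare no change when the intervals overlap.
intervals_overlap = abs(m1 - m2) < h1 + h2

# Repeated-measures approach: test the within-plot differences directly.
t_value = paired_t(pf_2000, pf_2005)
change_detected = abs(t_value) > T_CRIT
```

Here `intervals_overlap` and `change_detected` are both true: the overlap rule misses a change that the paired analysis finds.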

Explicitly testing for change

• If trend is “sufficiently linear” then the mixed model can be used to test:
  – H0: b1 = 0
  – H1: b1 ≠ 0

Unit            b1       t-value   Prob > t
Southeast       -0.05%   -0.730    0.466
Southwest        0.13%    1.094    0.274
Central          0.00%   -0.004    0.997
North Central   -0.31%   -2.489    0.013
North           -0.21%   -2.401    0.017

[Figure: North Central, 1998-2006]

• Recall the mixed model: b1 is the slope (change in y over time).
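The reported probabilities are consistent with a two-sided test of the slope. The slide does not give the degrees of freedom, so the sketch below assumes a large-sample normal reference distribution. It gives values close to the Prob > t column, though not necessarily identical in the third decimal.

```python
from math import erfc, sqrt


def two_sided_p(t_value):
    """Two-sided p-value for a test statistic under a standard normal
    reference distribution (large-sample ASSUMPTION, not from the slide)."""
    # erfc(|t| / sqrt(2)) equals 2 * (1 - Phi(|t|)).
    return erfc(abs(t_value) / sqrt(2.0))


# t-values as reported on the slide, by survey unit.
t_values = {
    "Southeast": -0.730,
    "Southwest": 1.094,
    "Central": -0.004,
    "North Central": -2.489,
    "North": -2.401,
}

p_values = {unit: two_sided_p(t) for unit, t in t_values.items()}
for unit, p in p_values.items():
    print(f"{unit:14s} {p:.3f}")
```

At the 5% level only North Central and North reject H0: b1 = 0, matching the two negative slopes flagged in the table.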

Example 2: Spatial realization of forest age structure in Alabama in 2007

• Using a time series of Landsat images, identify the disturbance year and magnitude for each pixel.

• Calibrate the disturbance year and magnitude information to FIA age class information based on:

$c_{jz} = f(X_z, Y_z, M_{z(j-d)}, (j-d)_z, F_{jz})$

where
$c_{jz}$ = age class for location z at time j
$X_z$ = longitude of location z
$Y_z$ = latitude of location z
$M_{z(j-d)}$ = magnitude of last disturbance in year j-d at location z
$(j-d)_z$ = the number of years since the last disturbance at location z
$F_{jz}$ = land cover in year j at location z

Random Forest Algorithm

Learning algorithm

Each tree is constructed using the following algorithm:

1. Let the number of training cases be N, and the number of variables in the classifier be M.

2. We are told the number m of input variables to be used to determine the decision at a node of the tree; m should be much less than M.

3. Choose a training set for this tree by choosing N times with replacement from all N available training cases (i.e. take a bootstrap sample). Use the rest of the cases to estimate the error of the tree, by predicting their classes.

4. For each node of the tree, randomly choose m variables on which to base the decision at that node. Calculate the best split based on these m variables in the training set.

5. Each tree is fully grown and not pruned (as may be done in constructing a normal tree classifier).
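The sampling parts of the algorithm (steps 1 to 4) can be sketched in a few lines. This covers only the bootstrap draw, the out-of-bag cases, and the random choice of m candidate variables at a node. It does not grow a tree or compute splits. The values of N and m and the seed are hypothetical, and M = 5 simply mirrors the five predictors in the calibration function.

```python
import random


def bootstrap_split(n_cases, rng):
    """Step 3: draw N case indices with replacement, and list the cases
    that were never drawn (the out-of-bag cases used to estimate error)."""
    in_bag = [rng.randrange(n_cases) for _ in range(n_cases)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n_cases) if i not in chosen]
    return in_bag, out_of_bag


def candidate_variables(n_variables, m, rng):
    """Step 4: pick m distinct variables to consider at one node."""
    return rng.sample(range(n_variables), m)


rng = random.Random(2007)   # fixed seed so the sketch is repeatable
N, M, m = 100, 5, 2         # hypothetical sizes; m is smaller than M

in_bag, out_of_bag = bootstrap_split(N, rng)
split_candidates = candidate_variables(M, m, rng)
```

In a full implementation `candidate_variables` would be called again at every node of every tree, and each tree would get its own bootstrap sample.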

Accuracy of Age Class Map

Rows: FIA data. Columns: Random Forest model.

                  0-8    9-16   17-24   25+    non-forest
0-8               233    46     10      35     37
9-16              51     323    42      101    47
17-24             22     34     192     117    21
25+               38     30     45      1265   16
non-forest        29     29     17      103    1710
User's accuracy   62%    70%    63%     78%    93%

Overall accuracy: 81%
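The accuracy figures can be recomputed from the confusion matrix. The sketch assumes rows are FIA age classes and columns are Random Forest model classes, so user's accuracy is the diagonal count divided by its column total. Under that reading the results round to the percentages reported on the slide.

```python
classes = ["0-8", "9-16", "17-24", "25+", "non-forest"]

# Rows: FIA data; columns: Random Forest model (assumed orientation).
confusion = [
    [233, 46, 10, 35, 37],
    [51, 323, 42, 101, 47],
    [22, 34, 192, 117, 21],
    [38, 30, 45, 1265, 16],
    [29, 29, 17, 103, 1710],
]


def users_accuracy(matrix):
    """Correct count in each column divided by that column's total."""
    n = len(matrix)
    column_totals = [sum(row[j] for row in matrix) for j in range(n)]
    return [matrix[j][j] / column_totals[j] for j in range(n)]


def overall_accuracy(matrix):
    """Sum of the diagonal divided by the total number of cases."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


for name, acc in zip(classes, users_accuracy(confusion)):
    print(f"{name:10s} {acc:.0%}")
print(f"{'Overall':10s} {overall_accuracy(confusion):.0%}")
```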

Conclusions

• No one technique could answer the two questions posed.

• Use the appropriate methodology or combination of methodologies to address your question.

• From the examples, time should be explicitly accounted for when doing trend analysis or making “current” estimates.

• Leverage the longitudinal (repeated measure) data when possible.

• The temporally indifferent method currently used by FIA does generally provide estimates with smaller standard error. However, it is not a current estimate, and the estimate should be tied to the approximate mid-point of the cycle, not the end year.

• All demonstrated techniques run using the R statistical package, which can be directly linked to either internal Oracle tables or FIADB.
