Machine Learning for Understanding and Managing Ecosystems
Transcript of Machine Learning for Understanding and Managing Ecosystems
-
Machine Learning for Understanding and Managing Ecosystems
Tom Dietterich, Oregon State University
In collaboration with
Postdocs: Dan Sheldon (now at UMass, Amherst), Mark Crowley (now at U. Waterloo)
Graduate Students: Majid Taleghan, Kim Hall, Liping Liu, Akshat Kumar, Tao Sun, Rachel Houtman, Sean McGregor, Hailey Buckingham
Economists: H. Jo Albers, Claire Montgomery
Cornell Lab of Ornithology: Steve Kelling, Daniel Fink, Andrew Farnsworth, Wes Hochachka, Benjamin Van Doren, Kevin Webb
1 IBM Cognitive Computing
-
The World Faces Many Sustainability Challenges
Species Extinctions
Invasive Species
Effects of Climate Change on these
-
Computational Sustainability
The study of computational methods that can contribute to the sustainable management of the Earth's ecosystems
[Figure: pipeline linking Data, Models, and Policies; stage labels: Data Integration, Data Interpretation, Model Fitting, Policy Optimization, Data Acquisition, Policy Execution]
-
Outline: Three Projects at Oregon State
Models of Bird Migration
  Collective Graphical Models
Policy Optimization
  Controlling Invasive Species
  Managing Wildland Fire
-
BirdCast Project: Understanding Bird Migration
Goal:
  Develop a scientific model of bird migration
  Produce 24- and 48-hour bird migration forecasts
Understanding bird decision making
  Absolute timing (e.g., based on day length)
  Temperature
  Wind speed and direction
  Relative humidity
  Food availability
-
Data (1): www.ebird.org
Volunteer Bird Watchers
  Stationary Count
  Travelling Count
Time, place, duration, distance travelled
Checklist of species seen
8,000-12,000 checklists uploaded per day
-
Data (2): Doppler Weather Radar
Radar detects
  weather (remove)
  smoke, dust, and insects (remove)
  birds and bats
-
Data (3): Acoustic monitoring
Night flight calls
People can identify species or species groups from these calls
-
Modeling Goal: Spatial Hidden Markov Model
Define a grid over the US
Consider a single bird
We say the bird is in state $i$ on day $t$ if it is located inside cell $i$ on that day
Let $P_{t,t+1}(i, j)$ be the probability that the bird will fly from cell $i$ to cell $j$ on the night from day $t$ to day $t + 1$
We will represent this probability in terms of variables such as
  wind speed and direction
  distance from $i$ to $j$
  air temperature
  relative humidity
  day of the year
  etc.
Let $\theta$ be the coefficients of the probability model.
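The slide does not give the functional form that links the coefficients to the transition probabilities. A minimal sketch, assuming a softmax (multinomial logit) over a linear score of per-destination features, could look like this. The function name, the two features, and the coefficient values are all hypothetical.

```python
import math

def transition_probs(features, theta):
    """Return P(i -> j) for every destination cell j from one origin cell i.

    features[j] is the feature vector for moving to cell j; theta is the
    coefficient vector. The softmax form is an assumption for illustration,
    not something stated in the slides.
    """
    scores = [sum(t * f for t, f in zip(theta, feats)) for feats in features]
    top = max(scores)  # subtract the max score for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Three candidate destination cells with two made-up features each:
# (tailwind strength, negative distance).
features = [[0.0, 0.0], [1.0, -1.0], [0.5, -2.0]]
theta = [2.0, 0.5]  # hypothetical coefficients
probs = transition_probs(features, theta)
```

Under this assumed form, a larger positive coefficient makes destinations with a high value of that feature more likely, and the probabilities over destinations always sum to one.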
-
Simulating the Migration of a Single Bird
Assume we know the value of $\theta$
The bird starts in cell 4 at time $t = 1$
Simulate the first night by drawing a cell $j$ according to $P_{1,2}(4, j)$ (rolling a die)
Repeat this for $T$ time steps
If we had enough bird watchers, we could map out the trajectory of the bird
Then we could match that against our simulated trajectory and adjust $\theta$ until the simulations matched the observed behavior
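The simulation of a single bird can be sketched in a few lines. This sketch assumes a fixed transition matrix for every night (the talk lets the probabilities depend on nightly conditions), and the three-cell matrix is made up for illustration.

```python
import random

def simulate_bird(P, start, steps, rng):
    """Simulate one bird moving between grid cells.

    P[i][j] is the probability of flying from cell i to cell j in one night
    (held fixed over time here for simplicity).
    Returns the list of visited cells, beginning with `start`.
    """
    path = [start]
    cells = range(len(P))
    for _ in range(steps):
        current = path[-1]
        # "Roll the die": draw the next cell from row `current` of P.
        nxt = rng.choices(cells, weights=P[current], k=1)[0]
        path.append(nxt)
    return path

# Toy 3-cell example with made-up probabilities; cell 2 is absorbing.
P = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0]]
path = simulate_bird(P, start=0, steps=10, rng=random.Random(0))
```

Fitting would then compare many such simulated paths against observed ones while changing the coefficients, which is the step the later slides replace with probabilistic inference.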
-
Population of Birds
Consider a population of $M$ birds
The state of this population is a vector $\mathbf{n}_t$ such that $n_t(i)$ is the number of birds in cell $i$ on day $t$
We can simulate each of these birds moving simultaneously
  each bird rolls a die every night to decide where to go
If we have enough bird watchers, we can get a good estimate of $\mathbf{n}_t$ every day
We can compare our simulations against the observations and adjust $\theta$ until they match
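A rough sketch of the population simulation follows, again assuming a fixed, made-up transition matrix. It tracks only the count vector per day, which is the quantity that bird watchers could estimate.

```python
import random
from collections import Counter

def simulate_population(P, n0, steps, rng):
    """Track cell counts for a population of independently moving birds.

    P[i][j]: probability of moving from cell i to cell j in one night
    (time-invariant here, an assumption made for brevity).
    n0[i]: number of birds in cell i on the first day.
    Returns a list of count vectors, one per day.
    """
    cells = range(len(P))
    history = [list(n0)]
    for _ in range(steps):
        current = history[-1]
        nxt = Counter()
        for i in cells:
            if current[i] == 0:
                continue
            # Every bird in cell i rolls its own die.
            for j in rng.choices(cells, weights=P[i], k=current[i]):
                nxt[j] += 1
        history.append([nxt[j] for j in cells])
    return history

# Toy example: 1000 birds start in cell 0; cell 2 is absorbing.
P = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0]]
history = simulate_population(P, n0=[1000, 0, 0], steps=20, rng=random.Random(0))
```

The work per night grows with the number of birds, and fitting by repeated simulation multiplies that by the number of candidate coefficient settings.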
-
This is very slow
Computer Science to the rescue
  Formulate the problem mathematically
    Formalism is called the Collective Graphical Model (CGM)
  Develop algorithms for probabilistic inference
  Use these algorithms to fit the model to the observations
-
Probabilistic Inference for CGMs
[Figure: results panels for 16 grid cells and 49 grid cells]
Gibbs sampler + Markov basis [Sheldon, Dietterich, NIPS 2011]
Convex optimization [Sheldon, Sun, Kumar, ICML 2013]
Asymptotic Gaussian approximation [Liu, Sheldon, Dietterich ICML 2014]
Non-linear belief propagation [Sun, Sheldon, Kumar, ICML 2015]
Proximal algorithm [Vilnis, Belanger, Sheldon, McCallum UAI 2015]
-
Initial Results: Ruby-throated Hummingbird
-
Need to Constrain the Model
Problem: The migration model tends to store birds in Canada
  There are no observations there, so the model is not constrained by the data
Solution: Constrain the model
  Specify the times and places where the CGM is allowed to have birds
-
Constrained Results: Ruby-throated Hummingbird
-
Fitted Transition Parameters
Distance and direction traveled:
  northness: 0.4808
  distance: 0.1895
  stayput: 3.5058
time: 0.5217
temperature: 0.1556
wind profit: 0.2754
-
Next Steps: Integrating Multiple Data Sources
[Figure: graphical model linking the bird counts and the nightly transitions between days $t$ and $t+1$ to eBird, acoustic, and radar observations]
-
Invasive Species Management in River Networks
Tamarisk: invasive tree from the Middle East
  Out-competes native vegetation for water
  Reduces biodiversity
What is the best way to manage a spatially-spreading organism?
-
Mathematical Model
Tree-structured river network
  Each segment has sites where a tree can grow.
  Each site can be {empty, occupied by native, occupied by invasive}
Management actions
  Each segment: {do nothing, eradicate, restore, eradicate+restore}
[Figure: river network diagram with labels 1 to 5]
-
Dynamics and Objective
Dynamics: In each time period
  Natural death
  Seed production
  Seed dispersal (preferentially downstream)
  Seed competition to become established
Couples all edges because of spatial spread
  Inference is intractable
Objective: Minimize expected discounted costs (sum of cost of invasion plus cost of management)
  Subject to annual budget constraint
[Figure: river network diagram]
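To make the four dynamics steps concrete, here is a toy one-period simulator. It is a sketch under strong simplifying assumptions and is not the simulator used in the project: seeds either stay on their segment or drift exactly one segment downstream, every rate is a hypothetical placeholder, and management actions are left out.

```python
import random

EMPTY, NATIVE, INVASIVE = "E", "N", "T"

def step(sites, downstream, rng, death=0.2, seeds_per_tree=5, p_down=0.7):
    """Advance a toy river network by one time period.

    sites[e] is the list of site states on segment e; downstream[e] is the
    index of the segment below e (None at the river mouth). All rates are
    hypothetical placeholders.
    """
    # 1. Natural death: each tree dies independently.
    sites = [[EMPTY if s != EMPTY and rng.random() < death else s for s in seg]
             for seg in sites]
    # 2-3. Seed production and dispersal: each seed drifts one segment
    # downstream with probability p_down, otherwise lands where it was made.
    arrived = [[] for _ in sites]
    for e, seg in enumerate(sites):
        for s in seg:
            if s == EMPTY:
                continue
            for _ in range(seeds_per_tree):
                goes_down = downstream[e] is not None and rng.random() < p_down
                target = downstream[e] if goes_down else e
                arrived[target].append(s)
    # 4. Competition: each empty site is claimed by a seed drawn uniformly
    # from those that arrived on its segment (stays empty if none did).
    return [[rng.choice(arrived[e]) if s == EMPTY and arrived[e] else s for s in seg]
            for e, seg in enumerate(sites)]

# Two headwater segments (0 invaded, 1 native) flow into an empty segment 2.
sites = [[INVASIVE] * 3, [NATIVE] * 3, [EMPTY] * 3]
downstream = [2, 2, None]
new_sites = step(sites, downstream, random.Random(1), death=0.0)
```

Even in this toy version, the state of segment 2 depends on both upstream segments, which is the coupling across edges that the slide refers to.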
-
Finding the Optimal Management Policy
Formalize as a Markov Decision Process
Solve by Stochastic Dynamic Programming (SDP)
  SDP requires the transition matrix $P(s, a, s') = P(s' \mid s, a)$
  We don't know $P$
Solution:
  Write a simulator
  Draw Monte Carlo samples from the simulator to estimate $P(s, a, s')$
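The two ingredients named above, estimating the transition probabilities from simulator samples and then running dynamic programming on the estimate, can be sketched as follows. The slides do not say which dynamic programming variant is used; value iteration is substituted here as one standard choice. The two-state simulator, its costs, and all names are hypothetical.

```python
import random
from collections import defaultdict

def estimate_model(simulator, states, actions, n_samples, rng):
    """Estimate P(s' | s, a) and the mean cost C(s, a) by Monte Carlo sampling.

    `simulator(s, a, rng)` is a hypothetical function returning (next_state, cost).
    """
    P, C = {}, {}
    for s in states:
        for a in actions:
            counts, total_cost = defaultdict(int), 0.0
            for _ in range(n_samples):
                s2, cost = simulator(s, a, rng)
                counts[s2] += 1
                total_cost += cost
            P[s, a] = {s2: k / n_samples for s2, k in counts.items()}
            C[s, a] = total_cost / n_samples
    return P, C

def value_iteration(P, C, states, actions, gamma=0.9, tol=1e-8):
    """Minimize expected discounted cost on the estimated model."""
    V = {s: 0.0 for s in states}
    while True:
        Q = {(s, a): C[s, a] + gamma * sum(p * V[s2] for s2, p in P[s, a].items())
             for s in states for a in actions}
        new_V = {s: min(Q[s, a] for a in actions) for s in states}
        if max(abs(new_V[s] - V[s]) for s in states) < tol:
            policy = {s: min(actions, key=lambda a: Q[s, a]) for s in states}
            return new_V, policy
        V = new_V

def toy_simulator(s, a, rng):
    # Made-up dynamics: eradication costs 1 and usually works;
    # an invaded state costs 5 per period.
    if a == "eradicate":
        s2 = "clean" if rng.random() < 0.9 else "invaded"
        action_cost = 1.0
    else:
        s2 = "invaded" if (s == "invaded" or rng.random() < 0.3) else "clean"
        action_cost = 0.0
    return s2, action_cost + (5.0 if s2 == "invaded" else 0.0)

states, actions = ["clean", "invaded"], ["nothing", "eradicate"]
P, C = estimate_model(toy_simulator, states, actions, 500, random.Random(0))
V, policy = value_iteration(P, C, states, actions)
```

This sketch samples every state-action pair equally often, which is only feasible for tiny problems; choosing where to spend simulator calls is the question the next slide addresses.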
-
Solving the Tamarisk MDP using Monte Carlo Samples
Repeat
  Use the current policy to choose a state $s$ and management action $a$
  Invoke the simulator on $(s, a)$
    $s'$ is the resulting state
    $c$ is the cost of the action and the resulting state
  Update our model of $P$
  Apply stochastic dynamic programming to compute an improved policy
Until the policy has converged
Key question: What $(s, a)$ should we choose?
Our answer: The DDV heuristic
-
Comparison against best previous Monte Carlo MDP planning method
[Figure: Number of Samples (log scale, 1.E+05 to 1.E+07) per MDP for DDV and Fiechter]
-
Published Rule of Thumb Policies for Invasive Species Management
Triage Policy
  Treat most-invaded edge first
  Break ties by treating upstream first
Leading edge
  Eradicate along the leading edge of invasion
Chades, et al.
  Treat most-upstream invaded edge first
  Break ties by amount of invasion
DDV
  Our PAC solution
-
Cost Comparisons: Rule of Thumb Policies vs. DDV
[Figure: bar chart of Total Costs (axis from 0 to 450) for the Triage, DDV, Chades, and Leading Edge policies; category labels: Large pop, up to down; Chades; Leading Edge; Optimal]
-
Managing Wildfire in Eastern Oregon
Natural state:
  Large Ponderosa Pine trees with open understory
  Frequent ground fires that remove understory plants (grasses, shrubs) but do not damage trees
Fires have been suppressed since the 1920s
  Heavy accumulation of fuels in understory
  Large catastrophic fires that kill all trees and damage soils
  Huge firefighting costs and lives lost
-
Study Area: Deschutes National Forest
Goal: Return the landscape to its natural fire regime
Management Question: LET-BURN: When lightning ignites a fire, should we let it burn?
-
Formulating LETBURN as a Markov Decision Process
State space:
  4000 management units; each unit is in one of 25 local states
  Weather
  Ignition site
Action space: At each fire ignition time $t$, choose LETBURN or SUPPRESS
Reward function:
  Cost of lost timber value
  Cost of lost species habitat
  Cost of fire suppression
[Figure: one decision step: ignition, action, fire outcome (fire simulator), new ignition at time $t+1$ (lightning simulator)]
-
The Simulator is Very Expensive
Simulating one fire can take from 5 to 60 minutes (depending on the size of the fire)
  FARSITE
  Forest Vegetation Simulator (FVS)
  Lightning Strike model
  Weather Simulator
Monte Carlo methods require at least $10^6$ simulator calls
What can we do?
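A quick back-of-the-envelope calculation shows why these numbers are a problem. The sketch assumes the calls run one after another on a single machine and uses a 365-day year; the function name is hypothetical.

```python
def total_simulation_years(calls, minutes_per_call):
    """Serial wall-clock time, in years, for `calls` simulator runs."""
    minutes_per_year = 60 * 24 * 365
    return calls * minutes_per_call / minutes_per_year

low = total_simulation_years(10**6, 5)    # about 9.5 years
high = total_simulation_years(10**6, 60)  # about 114 years
```

Parallel hardware shortens the wall-clock time but not the total compute, so reducing the number of simulator calls remains the main lever.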
-
Current Strategy: Policy Search using a Surrogate Model
Define a parameterized space of policies $\pi_\theta$
Simulate an initial set of 100-year trajectories under a variety of policies
Apply Bayesian Optimization (SMAC; Hutter, et al., 2011) to find the optimal value of $\theta$
To simulate $\pi_\theta$ for some new $\theta$, apply the Model-Free Monte Carlo algorithm (Fonteneau, et al., 2013)
-
A Simpler Problem: LETBURN one year
Is there any benefit to allowing fires to burn for just one year?
  Year 1: LETBURN
  Years 2-100: SUPPRESS ALL
Evaluate via Monte Carlo trials
-
Expected Benefit of LETBURN (Suppress all fires after year 1)
[Figure: histogram; x-axis: Expected Benefit (x $100,000), from -2 to 60; y-axis: Frequency, from 0 to 35]
mean = $2.47 million
median = $2.74 million
[Houtman, Montgomery, Gagnon, Calkin, Dietterich, McGregor, Crowley 2013]
-
Summary
Models of Bird Migration
  Collective Graphical Models
Policy Optimization
  Controlling Invasive Species
  Managing Wildland Fire
-
Common Threads
Spatially-spreading processes
  Bird migration
  Invasive species
  Fire spread
Dynamical model
  CGM: Spatial HMM with clever inference
  Simulator of seed spread
  Simulator of fire spread
Computational challenges
  Efficient probabilistic inference
  Minimize calls to expensive simulators
    Value of information heuristics + PAC guarantees
    Bayesian optimization
-
Thank-you
Dan Sheldon, Akshat Kumar, Tao Sun: Collective Graphical Models
Steve Kelling, Andrew Farnsworth, Wes Hochachka, Daniel Fink: BirdCast
H. Jo Albers, Kim Hall, Majid Taleghan, Mark Crowley: Tamarisk
Claire Montgomery, Sean McGregor, Mark Crowley, Rachel Houtman
Carla Gomes for spearheading the Institute for Computational Sustainability
National Science Foundation Grants 0832804 (CompSust), 1331932 (CyberSEES), 1125228 (BirdCast), 1521687 (CompSustNet)