Nonparametric Bayesian Learning of Switching Dynamical Processes


Transcript of Nonparametric Bayesian Learning of Switching Dynamical Processes

Page 1: Nonparametric Bayesian Learning of Switching Dynamical Processes

Massachusetts Institute of Technology

Stochastic Systems Group

Nonparametric Bayesian Learning of Switching Dynamical Processes

Emily Fox, Erik Sudderth, Michael Jordan, and Alan Willsky

Nonparametric Bayes Workshop 2008

Helsinki, Finland

Laboratory for Information and Decision Systems

Page 2: Nonparametric Bayesian Learning of Switching Dynamical Processes


Applications

Page 3: Nonparametric Bayesian Learning of Switching Dynamical Processes


Priors on Modes

• Switching linear dynamical processes useful for describing nonlinear phenomena

• Goal: allow uncertainty in number of dynamical modes

Utilize hierarchical Dirichlet process (HDP) prior

Cluster based on dynamics

Switching Dynamical Processes

θ_k = set of dynamic parameters for mode k

Page 4: Nonparametric Bayesian Learning of Switching Dynamical Processes


Outline

• Background: switching dynamical processes (SLDS, VAR), prior on dynamic parameters, sticky HDP-HMM

• HDP-AR-HMM and HDP-SLDS

• Sampling Techniques

• Results: synthetic data, IBOVESPA stock index, dancing honey bee

Page 5: Nonparametric Bayesian Learning of Switching Dynamical Processes


Linear Dynamical Systems

• State space LTI model:

• Vector autoregressive (VAR) process:
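For reference, the standard forms (with x_t the latent state, y_t the observation, and Gaussian noise terms e_t, w_t):

x_t = A x_{t-1} + e_t,   e_t \sim \mathcal{N}(0, \Sigma)
y_t = C x_t + w_t,       w_t \sim \mathcal{N}(0, R)

y_t = \sum_{i=1}^{r} A_i \, y_{t-i} + e_t,   e_t \sim \mathcal{N}(0, \Sigma)    (order-r VAR)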

Page 6: Nonparametric Bayesian Learning of Switching Dynamical Processes


Linear Dynamical Systems

• State space LTI model:

[Figure: state space models and VAR processes]

• Vector autoregressive (VAR) process:

Page 7: Nonparametric Bayesian Learning of Switching Dynamical Processes


Switching Dynamical Systems

• Switching linear dynamical system (SLDS):

• Switching VAR process:
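In the same notation, a sketch of the switching versions, where z_t is the discrete mode selecting the active dynamic parameters:

z_t \sim \pi_{z_{t-1}}
x_t = A^{(z_t)} x_{t-1} + e_t(z_t),   y_t = C x_t + w_t            (SLDS)

z_t \sim \pi_{z_{t-1}}
y_t = \sum_{i=1}^{r} A_i^{(z_t)} y_{t-i} + e_t(z_t)                (switching VAR(r))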

Page 8: Nonparametric Bayesian Learning of Switching Dynamical Processes


Prior on Dynamic Parameters

Rewrite the VAR process in matrix form:

Group all observations assigned to mode k and define the following mode-specific matrices:

Results in K decoupled linear regression problems

Place a matrix-normal inverse-Wishart prior on:
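A sketch of the construction (notation broadly follows the authors' papers; the hyperparameters M, V, n_0, S_0 below are placeholders):

Y^{(k)} = A^{(k)} \bar{Y}^{(k)} + E^{(k)}

where Y^{(k)} collects, as columns, the observations y_t with z_t = k, \bar{Y}^{(k)} collects the corresponding lagged observations, and A^{(k)} = [A_1^{(k)} \cdots A_r^{(k)}]. Placing the conjugate matrix-normal inverse-Wishart prior

A^{(k)} \mid \Sigma^{(k)} \sim \mathcal{MN}(A^{(k)}; M, \Sigma^{(k)}, V),   \Sigma^{(k)} \sim \mathrm{IW}(n_0, S_0)

yields closed-form posterior updates for each of the K decoupled regressions.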

Page 9: Nonparametric Bayesian Learning of Switching Dynamical Processes


Sticky HDP-HMM

• Dirichlet process (DP): mode space of unbounded size; model complexity adapts to observations

• Hierarchical: ties mode transition distributions; shared sparsity

• Sticky: self-transition bias parameter

[Figure: mode sequence over time]

Infinite HMM: Beal et al., NIPS 2002; HDP-HMM: Teh et al., JASA 2006; Sticky HDP-HMM: Fox et al., ICML 2008

Page 10: Nonparametric Bayesian Learning of Switching Dynamical Processes


Sticky HDP-HMM

• Global transition distribution:

• Mode-specific transition distributions:

The sparsity of the global distribution β is shared across the mode-specific distributions, and the sticky parameter κ increases the probability of self-transition.
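In equations, the sticky HDP-HMM prior of Fox et al. (ICML 2008) is:

\beta \mid \gamma \sim \mathrm{GEM}(\gamma)
\pi_k \mid \alpha, \kappa, \beta \sim \mathrm{DP}\!\left(\alpha + \kappa,\ \frac{\alpha \beta + \kappa \delta_k}{\alpha + \kappa}\right)
z_t \mid z_{t-1} \sim \pi_{z_{t-1}}

The global weights \beta induce a shared sparsity pattern across the mode-specific transition distributions \pi_k, and \kappa > 0 biases each \pi_k toward self-transition.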

Page 11: Nonparametric Bayesian Learning of Switching Dynamical Processes


HDP-AR-HMM and HDP-SLDS

[Figures: HDP-AR-HMM and HDP-SLDS]
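Combining the sticky HDP-HMM prior with the switching models of the earlier slides gives, in condensed form:

HDP-AR-HMM:  z_t \sim \pi_{z_{t-1}},   y_t = \sum_{i=1}^{r} A_i^{(z_t)} y_{t-i} + e_t(z_t)
HDP-SLDS:    z_t \sim \pi_{z_{t-1}},   x_t = A^{(z_t)} x_{t-1} + e_t(z_t),   y_t = C x_t + w_t

with \{\pi_k\} drawn from the sticky HDP and the mode-specific dynamic parameters drawn from the matrix-normal inverse-Wishart prior above.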

Page 12: Nonparametric Bayesian Learning of Switching Dynamical Processes


Blocked Gibbs Sampler

Sample parameters

• Approximate the HDP: truncate the stick-breaking construction (weak limit approximation):

• Sample transition distributions:

• Sample dynamic parameters using state sequence as VAR(1) pseudo-observations:

Fox et al., ICML 2008
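The weak-limit approximation truncates the HDP at L modes, so the truncated prior is (a sketch, with L the truncation level):

\beta \mid \gamma \sim \mathrm{Dir}(\gamma/L, \ldots, \gamma/L)
\pi_k \mid \alpha, \kappa, \beta \sim \mathrm{Dir}(\alpha\beta_1, \ldots, \alpha\beta_k + \kappa, \ldots, \alpha\beta_L)

Conditioned on the mode sequence, each row \pi_k then has a conjugate Dirichlet posterior, and the dynamic parameters are resampled from their matrix-normal inverse-Wishart posteriors given the pseudo-observations assigned to each mode.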

Page 13: Nonparametric Bayesian Learning of Switching Dynamical Processes


Blocked Gibbs Sampler

Sample mode sequence

• Use state sequence as pseudo-observations of an HMM

• Compute backwards messages:

• Block sample as:
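A minimal sketch of this backward-filter / forward-sample step for an HDP-VAR(1)-HMM, in Python, assuming the current Gibbs draws of the truncated transition matrix pi, the mode-specific dynamics A, and noise covariances Sigma; the function name and the uniform initial-mode assumption are illustrative, not the authors' code:

import numpy as np
from scipy.stats import multivariate_normal

def sample_mode_sequence(y, pi, A, Sigma, rng):
    # y: (T, d) observations; pi: (K, K) transition matrix with rows pi[j];
    # A, Sigma: (K, d, d) mode-specific VAR(1) dynamics and noise covariances.
    T, d = y.shape
    K = pi.shape[0]

    # Mode-specific likelihoods of each observed transition y_{t-1} -> y_t
    # (the first time step is left unmodeled in this sketch).
    lik = np.ones((T, K))
    for t in range(1, T):
        for k in range(K):
            lik[t, k] = multivariate_normal.pdf(y[t], mean=A[k] @ y[t - 1], cov=Sigma[k])

    # Backward messages: m[t, j] proportional to sum_k pi[j, k] * lik[t, k] * m[t + 1, k]
    m = np.ones((T + 1, K))
    for t in range(T - 1, 0, -1):
        m[t] = pi @ (lik[t] * m[t + 1])
        m[t] /= m[t].sum()            # normalize for numerical stability

    # Forward block-sample the mode sequence z_{1:T}.
    z = np.zeros(T, dtype=int)
    p0 = lik[0] * m[1]                # assumes a uniform initial mode distribution
    z[0] = rng.choice(K, p=p0 / p0.sum())
    for t in range(1, T):
        p = pi[z[t - 1]] * lik[t] * m[t + 1]
        z[t] = rng.choice(K, p=p / p.sum())
    return z

In the full sampler this routine runs once per Gibbs iteration, alternating with the transition-distribution and dynamic-parameter updates of the previous slide; for the HDP-SLDS, the sampled state sequence plays the role of the pseudo-observations here.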

Page 14: Nonparametric Bayesian Learning of Switching Dynamical Processes


Blocked Gibbs Sampler

Sample state sequence

• Conditioned on the mode sequence, equivalent to an LDS with time-varying dynamic parameters

• Compute backwards messages (backwards information filter):

• Block sample as:

All Gaussian distributions
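Conditioned on the mode sequence, the state sequence can be block-sampled in the same backward-filter / forward-sample fashion; a sketch of the factorization being exploited:

p(x_{1:T} \mid z_{1:T}, y_{1:T}) = p(x_1 \mid z_{1:T}, y_{1:T}) \prod_{t=2}^{T} p(x_t \mid x_{t-1}, z_{t:T}, y_{t:T})

Because the model is linear-Gaussian given z_{1:T}, the backward information-filter messages and every conditional above are Gaussian, so each x_t can be drawn in closed form.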

Page 15: Nonparametric Bayesian Learning of Switching Dynamical Processes


Hyperparameters

• Place priors on hyperparameters and learn them from data

• Weakly informative priors

• All results use the same settings

Hyperparameters can be set using the data.

Page 16: Nonparametric Bayesian Learning of Switching Dynamical Processes


Results: Synthetic VAR(1)

[Figures: performance on 5-mode VAR(1) data for the HDP-HMM, HDP-VAR(1)-HMM, HDP-VAR(2)-HMM, and HDP-SLDS]

Page 17: Nonparametric Bayesian Learning of Switching Dynamical Processes


Results: Synthetic AR(2)

[Figures: performance on 3-mode AR(2) data for the HDP-HMM, HDP-VAR(1)-HMM, HDP-VAR(2)-HMM, and HDP-SLDS]

Page 18: Nonparametric Bayesian Learning of Switching Dynamical Processes


Results: Synthetic SLDS

[Figures: performance on 3-mode SLDS data for the HDP-HMM, HDP-VAR(1)-HMM, HDP-VAR(2)-HMM, and HDP-SLDS]

Page 19: Nonparametric Bayesian Learning of Switching Dynamical Processes


Results: IBOVESPA

• Data: São Paulo stock index

• Goal: detect changes in volatility

• Compare inferred change-points to 10 cited world events

[Figures: daily returns; ROC curves for the sticky and non-sticky HDP-SLDS]

Carvalho and Lopes, Comp. Stat. & Data Anal., 2006

Page 20: Nonparametric Bayesian Learning of Switching Dynamical Processes


Results: Dancing Honey Bee

• Six bee dance sequences with expert-labeled dances: turn right (green), waggle (red), turn left (blue)

[Figures: x-pos, y-pos, and head-angle sine/cosine trajectories over time for Sequences 1-6]

Oh et al., IJCV, 2007

• Observation vector: head angle (cos, sin) and x-y body position
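As a concrete sketch, the per-frame observation vector is four-dimensional (the component ordering here is an assumption):

y_t = [\cos\theta_t,\ \sin\theta_t,\ u_t,\ v_t]^{\mathsf{T}}

where \theta_t is the head angle and (u_t, v_t) the body position.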

Page 21: Nonparametric Bayesian Learning of Switching Dynamical Processes


Movie: Sequence 6

Page 22: Nonparametric Bayesian Learning of Switching Dynamical Processes


Results: Dancing Honey Bee

Nonparametric approach:

• Model: HDP-VAR(1)-HMM

• Set hyperparameters

• Unsupervised training from each sequence

• Infer: number of modes, dynamic parameters, mode sequence

Supervised Approach [Oh:07]:

• Model: SLDS

• Set number of modes to 3

• Leave-one-out training: fixed label sequences on 5 of 6 sequences

• Data-driven MCMC: use learned cues (e.g., head angle) to propose mode sequences

Oh et al., IJCV, 2007

Page 23: Nonparametric Bayesian Learning of Switching Dynamical Processes


Results: Dancing Honey Bee

Sequence 4: HDP-AR-HMM 83.2%, SLDS [Oh] 93.4%
Sequence 5: HDP-AR-HMM 93.2%, SLDS [Oh] 90.2%
Sequence 6: HDP-AR-HMM 88.7%, SLDS [Oh] 90.4%

Page 24: Nonparametric Bayesian Learning of Switching Dynamical Processes


Results: Dancing Honey Bee

Sequence 1: HDP-AR-HMM 46.5%, SLDS [Oh] 74.0%
Sequence 2: HDP-AR-HMM 44.1%, SLDS [Oh] 86.1%
Sequence 3: HDP-AR-HMM 45.6%, SLDS [Oh] 81.3%

Page 25: Nonparametric Bayesian Learning of Switching Dynamical Processes


Conclusion

• Examined HDP as a prior for nonparametric Bayesian learning of SLDS and switching VAR processes.

• Presented efficient Gibbs sampler

• Demonstrated utility on simulated and real datasets