Deep Energy-Based NARX Models · 2021. 7. 14.

Deep Energy-Based NARX Models. Johannes N. Hendriks¹, Fredrik K. Gustafsson², Antônio H. Ribeiro², Adrian G. Wills¹, Thomas B. Schön². ¹The University of Newcastle, Australia; ²Uppsala University, Sweden. Workshop on Nonlinear System Identification Benchmarks 2021.


Deep Energy-Based NARX Models

Johannes N. Hendriks¹, Fredrik K. Gustafsson², Antônio H. Ribeiro², Adrian G. Wills¹, Thomas B. Schön²

¹The University of Newcastle, Australia
²Uppsala University, Sweden

Workshop on Nonlinear System Identification Benchmarks 2021


Motivation

Common performance criteria, such as maximum-likelihood or prediction-error criteria, usually involve assumptions about uncertainty, be they explicit or implicit.

Nonlinear SysId Benchmarks, 2021 1 / 13 Hendriks, Gustafsson, Ribeiro, Wills, Schon


Nonlinear ARX model (Gaussian noise)

• Data model:

$$y_t = f_\theta(y_{t-1}, u_{t-1}) + e_t,$$

where $e_t \sim \mathcal{N}(0, \sigma)$.

• $f_\theta$ → model structure.

• Maximum likelihood estimator:

$$\hat{\theta} = \arg\min_\theta \sum_{t=1}^{T} \| y_t - f_\theta(y_{t-1}, u_{t-1}) \|^2.$$
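Under the Gaussian assumption, maximum likelihood reduces to least squares. A minimal sketch of that estimator, assuming a linear-in-parameters $f_\theta(y_{t-1}, u_{t-1}) = a\,y_{t-1} + b\,u_{t-1}$ (the slides leave $f_\theta$ general; the values of `a`, `b`, the noise level, and the function names are illustrative, not taken from the talk):

```python
import numpy as np

def simulate_arx(a, b, sigma, T, rng):
    """Simulate y_t = a*y_{t-1} + b*u_{t-1} + e_t with Gaussian e_t (std sigma)."""
    u = rng.standard_normal(T)
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = a * y[t - 1] + b * u[t - 1] + sigma * rng.standard_normal()
    return y, u

def fit_arx_least_squares(y, u):
    """Minimise sum_t (y_t - f_theta(y_{t-1}, u_{t-1}))^2 for a linear f_theta."""
    X = np.column_stack([y[:-1], u[:-1]])   # regressors (y_{t-1}, u_{t-1})
    theta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return theta

rng = np.random.default_rng(0)
y, u = simulate_arx(a=0.7, b=1.0, sigma=0.1, T=2000, rng=rng)
theta_hat = fit_arx_least_squares(y, u)
print(theta_hat)  # estimates of (a, b); expected to lie near (0.7, 1.0)
```

For a neural-network $f_\theta$ the same sum of squares would be minimised with a gradient-based optimiser instead of `lstsq`.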


Energy-based NARX models

• Arbitrary distributions:

$$y_t \mid (y_{t-1}, u_{t-1}) \sim p_\theta(y_t \mid y_{t-1}, u_{t-1}),$$

• Energy-based model:

$$p_\theta(y_t \mid y_{t-1}, u_{t-1}) = \frac{e^{g_\theta(y_t, y_{t-1}, u_{t-1})}}{\int e^{g_\theta(\gamma, y_{t-1}, u_{t-1})} \, d\gamma},$$

Gustafsson, F.K., Danelljan, M., Bhat, G., and Schön, T.B. (2020). Energy-based models for deep probabilistic regression. In Proceedings of the European Conference on Computer Vision (ECCV).

• $g_\theta$ → highly flexible structure: in our case a neural network.
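For a scalar output, the normalisation in the energy-based model can be checked numerically. The sketch below assumes a hand-written toy energy `g_toy` in place of the neural network $g_\theta$ and approximates the integral with a Riemann sum on a uniform grid; both choices are illustrative assumptions, not the talk's implementation.

```python
import numpy as np

def g_toy(y, y_prev, u_prev):
    """Hypothetical scalar energy; two bumps give a bimodal conditional density."""
    m = 0.5 * y_prev + u_prev
    return np.logaddexp(-0.5 * ((y - m - 1.0) / 0.3) ** 2,
                        -0.5 * ((y - m + 1.0) / 0.3) ** 2)

def conditional_density(g, y_prev, u_prev, grid):
    """p(y | y_prev, u_prev) = exp(g) / integral of exp(g), on a uniform grid."""
    scores = g(grid, y_prev, u_prev)
    scores = scores - scores.max()      # stabilise the exponentials
    unnorm = np.exp(scores)
    dy = grid[1] - grid[0]
    Z = unnorm.sum() * dy               # Riemann-sum approximation of the integral
    return unnorm / Z

grid = np.linspace(-6.0, 6.0, 2001)
p = conditional_density(g_toy, y_prev=0.4, u_prev=-0.2, grid=grid)
print(p.sum() * (grid[1] - grid[0]))    # numerical integral of the density
```

Because only the ratio matters, adding any constant to the energy leaves the density unchanged, which is why subtracting the maximum score is safe.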


Model training

• Maximum likelihood, with $x_t = (y_{t-1}, u_{t-1})$:

$$\hat{\theta} = \arg\min_\theta \sum_{t=1}^{T} -\log p_\theta(y_t \mid y_{t-1}, u_{t-1}) = \arg\min_\theta \sum_{t=1}^{T} \left( -g_\theta(y_t, x_t) + \ln \int e^{g_\theta(\gamma, x_t)} \, d\gamma \right)$$

• Noise contrastive estimation:
Gutmann, M. and Hyvärinen, A. (2010). Noise-contrastive estimation: A new estimation principle for unnormalized statistical models. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 297–304.
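The negative log-likelihood above can be evaluated directly when the output is scalar, by approximating the log-partition term on a grid. This is a sketch under that assumption and is not the noise contrastive estimation procedure cited on the slide; the energy `make_g` and all parameter values are hypothetical.

```python
import numpy as np

def log_partition(g, x, grid):
    """ln of the integral of exp(g(gamma, x)), via a Riemann sum and log-sum-exp."""
    s = g(grid, x)
    dy = grid[1] - grid[0]
    return s.max() + np.log(np.exp(s - s.max()).sum() * dy)

def negative_log_likelihood(g, ys, xs, grid):
    """sum_t ( -g(y_t, x_t) + ln integral exp(g(gamma, x_t)) dgamma )."""
    return sum(-g(y, x) + log_partition(g, x, grid) for y, x in zip(ys, xs))

def make_g(a, sigma):
    # Hypothetical energy: Gaussian log-density up to a constant, with mean a*x.
    return lambda y, x: -0.5 * ((y - a * x) / sigma) ** 2

rng = np.random.default_rng(0)
xs = rng.standard_normal(200)
ys = 0.95 * xs + 0.1 * rng.standard_normal(200)
grid = np.linspace(-8.0, 8.0, 4001)

nll_good = negative_log_likelihood(make_g(0.95, 0.1), ys, xs, grid)
nll_bad = negative_log_likelihood(make_g(0.50, 0.1), ys, xs, grid)
print(nll_good, nll_bad)  # the data-generating slope should give the lower loss
```

A grid is only practical in low dimensions, which is one reason to prefer a sampling-based objective such as noise contrastive estimation for training.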


Example 1: AR model

$$y_t = 0.95\, y_{t-1} + e_t.$$

Figure: Gaussian error $e_t$

Figure: Gaussian mixture error $e_t$

Figure: Cauchy error $e_t$

Figure: Gaussian error $e_t$ with conditional variance
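Data for the four noise cases can be generated along the following lines. The slides do not give the noise parameters, so every scale, the mixture components, and the form of the conditional variance below are illustrative assumptions.

```python
import numpy as np

def simulate_ar1(noise, T, a=0.95):
    """Simulate y_t = a*y_{t-1} + e_t, where noise(y_prev) draws e_t."""
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = a * y[t - 1] + noise(y[t - 1])
    return y

rng = np.random.default_rng(0)

# Noise parameters are assumptions for illustration, not the slides' values.
noises = {
    "gaussian": lambda y_prev: 0.1 * rng.standard_normal(),
    "gaussian_mixture": lambda y_prev: rng.choice([-0.4, 0.4])
                                       + 0.1 * rng.standard_normal(),
    "cauchy": lambda y_prev: 0.05 * rng.standard_cauchy(),
    "conditional_variance": lambda y_prev: (0.05 + 0.1 * abs(y_prev))
                                           * rng.standard_normal(),
}

series = {name: simulate_ar1(noise, T=500) for name, noise in noises.items()}
for name, y in series.items():
    print(name, y.shape)
```

Only the Gaussian case matches the assumption behind the least-squares criterion; the other three are the situations where a learned conditional density is meant to help.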


Example 2: ARX model

$$y_t = 1.5\, y_{t-1} - 0.7\, y_{t-2} + u_{t-1} + 0.5\, u_{t-2} + e_t.$$

Figure: Estimates of $p_\theta(y_t \mid x_t)$ for validation data. (a) Sequence, (b) $t = 56$.


Example 3: nonlinear model

Model:

$$y^*_t = \left(0.8 - 0.5\, e^{-(y^*_{t-1})^2}\right) y^*_{t-1} - \left(0.3 + 0.9\, e^{-(y^*_{t-1})^2}\right) y^*_{t-2} + u_{t-1} + 0.2\, u_{t-2} + 0.1\, u_{t-1} u_{t-2} + v_t,$$

$$y_t = y^*_t + w_t$$

Process and output error:

$$v_t \sim \mathcal{N}(0, \sigma_v^2), \qquad w_t \sim \mathcal{N}(0, \sigma_v^2)$$

Figure: System only with process noise. Input in blue and output in red.

Chen, S., Billings, S.A., and Grant, P.M. (1990). Non-Linear System Identification Using Neural Networks. International Journal of Control, 51(6), 1191–1214.
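A minimal simulation sketch of this system follows. The input signal (uniform on [-1, 1]), the sequence length, and the noise levels are assumptions, and the two standard deviations are kept as separate arguments only so that the process-noise-only case from the figure can be reproduced by setting `sigma_w=0`.

```python
import numpy as np

def simulate_chen(u, sigma_v, sigma_w, rng):
    """Simulate the latent output y*_t and the measured output y_t = y*_t + w_t."""
    T = len(u)
    y_star = np.zeros(T)
    for t in range(2, T):
        decay = np.exp(-y_star[t - 1] ** 2)
        y_star[t] = ((0.8 - 0.5 * decay) * y_star[t - 1]
                     - (0.3 + 0.9 * decay) * y_star[t - 2]
                     + u[t - 1] + 0.2 * u[t - 2] + 0.1 * u[t - 1] * u[t - 2]
                     + sigma_v * rng.standard_normal())   # process noise v_t
    y = y_star + sigma_w * rng.standard_normal(T)         # output noise w_t
    return y_star, y

rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, size=500)   # input signal: an assumption
y_star, y = simulate_chen(u, sigma_v=0.1, sigma_w=0.1, rng=rng)
print(y_star[:5], y[:5])
```

Note that the process noise enters the recursion and is propagated through the nonlinearity, whereas the output noise only perturbs the measurement.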


Table: Simulated nonlinear MSE on the validation set for the fully connected network (FCN) NARX model and the EB-NARX model

           N = 100           N = 250           N = 500
           FCN    EB-NARX    FCN    EB-NARX    FCN    EB-NARX
σ = 0.1    0.122  0.099      0.069  0.070      0.057  0.054
σ = 0.3    0.398  0.390      0.353  0.354      0.289  0.308
σ = 0.5    0.860  0.869      0.809  0.822      0.754  0.779


Figure: Estimates of $p_\theta(y_t \mid x_t)$ for validation data. (a) Sequence, (b) $t = 56$.


Example 4: Coupled Electric Drives

Figure: Illustration of the CE8 coupled electric drives system

Wigren, T. and Schoukens, M. (2017). Coupled electric drives data set and reference models. Technical Report. Uppsala Universitet, 2017.


Figure: $p_\theta(y_t \mid x_t)$ sequence

Figure: $t = 40$

Figure: $t = 57$

Figure: $t = 60$


Conclusion

• Energy-based NARX learns the full conditional distribution rather than a point estimate.

• The current paper only considers one-step-ahead predictions and not multi-step-ahead predictions.

• Propagate MAP point estimates vs. probabilities.

• Studying energy-based models for nonlinear ARMAX, output error and other types of models that can handle more general noise types.
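The difference between a MAP point estimate and the full distribution can be seen on a toy example. The sketch below assumes a hypothetical bimodal one-step density on a grid (not an output of the talk's model) and compares the MAP value, the mean, and the probability mass near the MAP mode.

```python
import numpy as np

grid = np.linspace(-4.0, 4.0, 1601)
dy = grid[1] - grid[0]

# Hypothetical bimodal one-step density p(y_t | x_t), normalised on the grid.
unnorm = (0.6 * np.exp(-0.5 * ((grid - 1.5) / 0.2) ** 2)
          + 0.4 * np.exp(-0.5 * ((grid + 1.5) / 0.2) ** 2))
p = unnorm / (unnorm.sum() * dy)

y_map = grid[np.argmax(p)]          # MAP point estimate: highest-density value
y_mean = (grid * p).sum() * dy      # mean of the full distribution
mass_near_map = p[np.abs(grid - y_map) < 0.6].sum() * dy

print(y_map, y_mean, mass_near_map)
```

In this toy case the MAP estimate sits on the larger mode and the mean falls between the modes where the density is low, so feeding either single value into the next prediction step would ignore the mass in the second mode.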


Thank you!

To appear in the 19th IFAC Symposium on System Identification.

Paper: https://arxiv.org/abs/2012.04136

Code: https://github.com/jnh277/ebm_arx

