A Statistical Theory of the Kalman...
Transcript of A Statistical Theory of the Kalman...
![Page 1: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/1.jpg)
A Statistical Theory of the Kalman Filter
Vivak PatelDepartment of Statistics, University of Chicago SIAM UQ, Lausanne, Switzerland April 8, 2016
![Page 2: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/2.jpg)
Outline
I. Kalman Filter
II. Single State Estimation
III. Selective Assimilation
IV. Incremental Estimator
V. Numerical Experiment
![Page 3: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/3.jpg)
VI. Conclusions
![Page 4: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/4.jpg)
Kalman FilterLet be a sequence of states in related by the stochastic difference equation
where and .
Our objective is to estimate given and a sequence of noisy observations related to
where and
The Kalman Filter estimates and its covariance from estimates and using:
{χt : t ∈ N} Rd
χt+1 = Dtχt + ηt
Dt ∈ Rd×d ηt ∼ N (0, QM )
χt Dt {Zt : t ∈ N} χt
Zt = Otχt + ξt
Ot ∈ Rd×d′ξt ∼ N (0, QO)
χt+1 Ct+1 χ̂t Ct
χ̂t+1 = arg min {∥Zt+1 − Ot+1χ∥2Q−1
O+ ∥χ − Dtχ̂t∥
2(DtCtD
Tt
+QM )−1}C −1
t+1 = (DtCtDTt + QM )−1 + OT
t+1Q−10 Ot+1
![Page 5: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/5.jpg)
Kalman FilterObservability/Controllability
Kalman, 1960Kalman & Bucy, 1961
Exponential Convergence (Fixed State, Deterministic Observations)
Johnstone, Johnson, Bitmead & Anderson, 1982Bittanti, Bolzern & Campi, 1990Parkum, Poulsen & Holst, 1992Cao & Schwarz, 2003
Nonexpansive in Specialized Metrics
Bougerol, 1993Atar & Zeitouni, 1997Le Gland, et. al., 2004Carli & Sepulchre, 2015
![Page 6: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/6.jpg)
Kalman FilterQuestions important to statisticians:
1. Q: How many observations are needed (with stochastic dynamics and noisy observations) to estimate a single state "well"? A: With probability approaching 1, objective decays like .
2. Q: Does the covariance estimate actually estimate the asymptotic covariance of the parameter estimate? A: Yes, it does.
3. Q: How important are the apriori parameter and covariance estimate? A: ...
d/k
![Page 7: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/7.jpg)
Kalman FilterQuestions important to numerical optimizer:
1. Q: Can we do "better" than the Kalman Filter? A: No.
2. Q: What is the rate of convergence of the Kalman Filter? A: Sublinear to a single state. Presumably, we cannot do better.
3. Q: Do we need to carry around the covariance estimate from state to state? A: ...
4. Q: Can we parallelize computations over relevant dimensions? A: ...
![Page 8: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/8.jpg)
Kalman FilterQuestions important to UQ community (not in other categories):
1. Q: How well does the KF "perform" when the dynamics are just stable (e.v. of 1) or unstable (e.v. > 1)? A: ...
2. Q: How well does the KF "perform" when deterministic dynamics are misspecified? A: ...
3. Q: How well does the KF "perform" when model noise is misspecified/correlated from state to state? A: ...
4. Q: How well does the (E)KF "perform" when the observation function is misspecified?
5. Q: How well does the KF "perform" when the observation model noise is misspecified/correlated from state to state?
![Page 9: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/9.jpg)
Single State EstimationTo get to statistical notation:
Initialization:
Observations, Errors, Covariates:
True Parameter, Data Model:
β0 = Dtχ̂t M0 = DtCtDTt + QM
Yk is the kth component of Zt
ϵk is the kth component of ξt
Xk is the kth row of Ot
β∗ = χt+1 Yk = XTk
β∗ + ϵk
![Page 10: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/10.jpg)
Selective AssimilationQuestion: Suppose is the estimator of after observations are assimilated. How quickly does converge to ?
General Assumptions:
are independent, and explore the entire space regularly
(Iterative) Batch Estimator:
If and and conditioned on then almost surely. However:
are never known a priori. Requires iteration.Batch estimator must be recomputed if is incremented.
β̂k β∗ k β̂k β∗
ϵk E [ϵk|Xk] = 0 supk E [ϵ2k|Xk] < ∞
Xk Rd
β̂k = arg min { k
∑j=1
(Yj − XTj β)2 + ∥β − β0∥2
M −10
}1
σ2j
β0 ∈ Rd M0 ≻ 0 X1, … , Xk β̂k → β∗
σ2j
k
![Page 11: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/11.jpg)
Incremental EstimatorKalman Filter (Single State)
Assumptions:
are independent, and are independent, identically distributed with finite second moment.
for all s.t. .
β̂k+1 = arg min { (Yk+1 − XTk+1β)2 + ∥∥β − β̂k
∥∥2
M −1k
}M −1
k+1 = M −1k + Xk+1XT
k+1
1
γ2k
1
γ2k
ϵk E [ϵk|Xk] = 0 supk E [ϵ2k|Xk] < ∞
X1, X2, …P [|XT
1 v| = 0] < 1 v ∈ Rd ∥v∥2 = 1
![Page 12: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/12.jpg)
Incremental EstimatorTuning Parameter Requirements:
Theorem 1: Conditioned on , almost surely.
Theorem 2: Let
If then for all asymptotically almost surely
0 < δ2 ≤ infk
γ2k
≤ supk
γ2k
≤ Δ2 < ∞
β0 is arbitrary and M0 ≻ 0
X1, X2, … , Xk ∥βk − β∗∥ → 0
Mk = E [(β̂k − β∗) (β̂k − β∗)T ∣∣∣X1, … , Xk]
σ2j = σ2 ∀j ∈ N ϵ > 0
Mk ⪯ Mk ⪯ Mk
1 − ϵ
Δ2
1σ2
1 + ϵ
δ2
![Page 13: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/13.jpg)
Numerical ExperimentData Set
Source: Public Use File from Center of Medicare and Medicaid Services.Described health care expenses, type of visit, patient demographics.Contained million records (too big to fit in my computer's 8GB memory).
Linear Model
Response: Cost of visit.Predictors: Patient's gender, Type of Facility, Patient's age.Intercept term was included, giving unknown variables.
Tuning Parameter
N = 2.8
p = 31
γ21 (k) = 1
k
γ2 = 37000γ3 = 0.001
![Page 14: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/14.jpg)
Numerical ExperimentConvergence of Estimated Covariance.
Recall: .
Estimated covariance tracks well with mean residuals squared.
Mk ≺ Mk ≺ Mk1−ϵ
maxk γ2k
1σ2
1+ϵ
mink γ2k
![Page 15: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/15.jpg)
Conclusions1. Q: How many observations are needed (with stochastic dynamics and noisy observations) to estimate
a single state "well"? A: (The trace of) is sufficient for determining convergence.
2. Q: Does the covariance estimate actually estimate the asymptotic covariance of the parameter estimate? A: Yes.
3. Q: Can we converge faster than the Kalman Filter? A: For a single state and given that is estimating , no. We will incrementally invert the hessian of the objective. In fact, we show that the conditioning has no impact on the rate of convergence.
Mk
Mk ≺ Mk ≺ Mk1−ϵ
maxk γ2k
1σ2
1+ϵ
mink γ2k
Mk Mk
![Page 16: A Statistical Theory of the Kalman Filterpages.cs.wisc.edu/~vrpatel6/resources/Presentations/pdf/A... · 2018-09-08 · This slide deck can be found at galton.uchicago.edu/~vpatel.](https://reader036.fdocuments.in/reader036/viewer/2022081404/5f04eb877e708231d4105d9a/html5/thumbnails/16.jpg)
Thank YouAcknowledgements
Mihai AnitescuGabriel Terejanu & Haiyan ChengUniversity of Chicago, Department of Statistics & SIAM Travel Award
Slides This slide deck can be found at galton.uchicago.edu/~vpatel.
Reference
Patel, Vivak. "Kalmanbased Stochastic Gradient Method with Stop Condition and Insensitivity toConditioning." arXiv preprint arXiv:1512.01139 (2015).