A Split-Step Solution of the Fokker–Planck Equation for ......The Fokker–Planck equation for...

5
A Split-Step Solution of the Fokker–Planck Equation for the Conditional Density (Invited Paper) H. C. Lambert * , F. E. Daum , and J. L. Weatherwax * * MIT Lincoln Laboratory Raytheon Company Abstract—We have developed and tested a new algorithm, which we call the “wave filter,” that solves the nonlinear filtering problem with discrete-time measurements by solving the Fokker– Planck equation for the conditional probability density function using a split-step technique. The wave filter uses fast convolution to compute the effect of process noise at discrete times. Between measurements, the conditional density is propagated by solving a system of ordinary differential equations. Measurement update is carried out via Bayes’ rule. We propose the “adjoint method” to reduce the computational complexity of the algorithm by adaptively varying the mesh size. I. I NTRODUCTION Nonlinear filtering algorithms can be divided into three broad categories. First, there are those filters that compute the sufficient statistics characterizing the conditional probability density function. These filters effectively reduce the problem of solving a partial differential equation (PDE) to that of solving a series of ordinary differential equations (ODEs). This class of filters includes the extended and unscented Kalman filters. Another major class of filters consists of those which approximate the conditional density using Monte Carlo sampling techniques. The particle filter is the quintessence of this class of filters. Finally, there are those filters that compute the conditional density exactly by explicitly solving the Fokker–Planck equation (see e.g. [1]). Note that in general, the computational complexity of both Monte Carlo and exact techniques increases exponentially with the dimension of the state vector, while the computational complexity of filters solving for the sufficient statistics varies polynomially with the dimension of the state vector (see e.g. [2] and [3]). In this paper, we present a new algorithm, which we refer to as the “wave filter 1 ”, that solves for the conditional density exactly and hence falls under the third category of nonlinear filters. Our algorithm relies on using a “split-step” technique for solving the Fokker–Planck equation. We begin by reviewing the nonlinear filtering problem in section II. The split-step solution of the Fokker–Planck equa- tion along with a formulation of the wave filtering algorithm is presented in section III. We give two numerical examples in section IV demonstrating the performance of the wave filter. Finally, in section V, we discuss the possibility of using the 1 The term “wave filter” is more than a mere wordplay and is in fact reminiscent of the wave-particle duality in quantum mechanics. “adjoint method” to reduce the computational complexity of the wave filter. II. PROBLEM STATEMENT In a nutshell, the filtering problem consists of recursively estimating the state of an uncertain variable from noisy mea- surements. The dynamic behavior of the state is described by the system dynamic model [4]: dx t = f t (x t )+ dw t , (1) where the process noise vector w t is a zero-mean Gaussian white noise process with spectral density matrix Q t . The mapping of the state space at time index k to the discrete-time measurement space is given by the measurement model [4]: z k = h k (x k )+ v k , (2) where the measurement noise vector v k is a zero-mean Gaus- sian white noise process with covariance R k . The dimension of the measurement vector is smaller than or equal to the dimension of the state vector. The spectral density Q t and covariance R k of the process and measurement noise vectors are known a-priori and together with the system and measure- ment functions, f t and h k , respectively, serve as “inputs” to the filtering algorithm. What we seek is the conditional probability density function, p(x t k |Z k ), of the state variable at time t k given the sequence of measurements Z k = {z 1 ,..., z k }. Between measurements, the evolution of the conditional density is prescribed by the Fokker–Planck equation [5]: ∂p ∂t = -f T t ∂p x t - tr f t x t p + 1 2 tr Q t 2 p x 2 t (3) with initial condition p(x t k-1 |Z k-1 ). We denote the solution as p(x t k |Z k-1 ). This constitutes the “propagation” step of the solution. At the time of measurement arrival, we update the conditional density using Bayes’ rule: p(x t k |Z k )= p(z k |x t k )p(x t k |Z k-1 ) p(z k |Z k-1 ) (4) where p(z k |x t k ) is the likelihood function of the measurement z k . In the next section, we consider a “split-step” technique for solving the Fokker–Planck equation (3).

Transcript of A Split-Step Solution of the Fokker–Planck Equation for ......The Fokker–Planck equation for...

  • A Split-Step Solution of the Fokker–PlanckEquation for the Conditional Density

    (Invited Paper)

    H. C. Lambert∗, F. E. Daum†, and J. L. Weatherwax∗∗ MIT Lincoln Laboratory† Raytheon Company

    Abstract— We have developed and tested a new algorithm,which we call the “wave filter,” that solves the nonlinear filteringproblem with discrete-time measurements by solving the Fokker–Planck equation for the conditional probability density functionusing a split-step technique. The wave filter uses fast convolutionto compute the effect of process noise at discrete times. Betweenmeasurements, the conditional density is propagated by solvinga system of ordinary differential equations. Measurement updateis carried out via Bayes’ rule. We propose the “adjoint method”to reduce the computational complexity of the algorithm byadaptively varying the mesh size.

    I. I NTRODUCTION

    Nonlinear filtering algorithms can be divided into threebroad categories. First, there are those filters that compute thesufficient statistics characterizing the conditional probabilitydensity function. These filters effectively reduce the problemof solving a partial differential equation (PDE) to that ofsolving a series of ordinary differential equations (ODEs).This class of filters includes the extended and unscentedKalman filters. Another major class of filters consists of thosewhich approximate the conditional density using Monte Carlosampling techniques. The particle filter is the quintessenceof this class of filters. Finally, there are those filters thatcompute the conditional density exactly by explicitly solvingthe Fokker–Planck equation (see e.g. [1]). Note that in general,the computational complexity of both Monte Carlo and exacttechniques increases exponentially with the dimension of thestate vector, while the computational complexity of filterssolving for the sufficient statistics varies polynomially withthe dimension of the state vector (see e.g. [2] and [3]).

    In this paper, we present a new algorithm, which we referto as the “wave filter1”, that solves for the conditional densityexactly and hence falls under the third category of nonlinearfilters. Our algorithm relies on using a “split-step” techniquefor solving the Fokker–Planck equation.

    We begin by reviewing the nonlinear filtering problem insection II. The split-step solution of the Fokker–Planck equa-tion along with a formulation of the wave filtering algorithmis presented in section III. We give two numerical examples insection IV demonstrating the performance of the wave filter.Finally, in section V, we discuss the possibility of using the

    1The term “wave filter” is more than a mere wordplay and is in factreminiscent of the wave-particle duality in quantum mechanics.

    “adjoint method” to reduce the computational complexity ofthe wave filter.

    II. PROBLEM STATEMENT

    In a nutshell, the filtering problem consists of recursivelyestimating the state of an uncertain variable from noisy mea-surements. The dynamic behavior of the state is described bythe system dynamic model [4]:

    dxt = ft(xt) + dwt, (1)

    where the process noise vectorwt is a zero-mean Gaussianwhite noise process with spectral density matrixQt. Themapping of the state space at time indexk to the discrete-timemeasurement space is given by the measurement model [4]:

    zk = hk(xk) + vk, (2)

    where the measurement noise vectorvk is a zero-mean Gaus-sian white noise process with covarianceRk. The dimensionof the measurement vector is smaller than or equal to thedimension of the state vector. The spectral densityQt andcovarianceRk of the process and measurement noise vectorsare known a-priori and together with the system and measure-ment functions,ft and hk, respectively, serve as “inputs” tothe filtering algorithm.

    What we seek is the conditional probability density function,p(xtk |Zk), of the state variable at timetk given the sequenceof measurementsZk = {z1, . . . , zk}. Between measurements,the evolution of the conditional density is prescribed by theFokker–Planck equation [5]:

    ∂p

    ∂t= −fTt

    ∂p

    ∂xt− tr

    (∂ft∂xt

    )p +

    12tr

    (Qt

    ∂2p

    ∂x2t

    )(3)

    with initial condition p(xtk−1 |Zk−1). We denote the solutionasp(xtk |Zk−1). This constitutes the “propagation” step of thesolution. At the time of measurement arrival, we update theconditional density using Bayes’ rule:

    p(xtk |Zk) =p(zk|xtk)p(xtk |Zk−1)

    p(zk|Zk−1) (4)

    wherep(zk|xtk) is the likelihood function of the measurementzk. In the next section, we consider a “split-step” techniquefor solving the Fokker–Planck equation (3).

  • III. SOLUTION OF THE FOKKER–PLANCK EQUATION

    As a foray into solving the Fokker–Planck equation, weexamine two limiting cases. First, we consider the case ofrandom walk dynamics with time-invariant process noise:

    ft = 0 and Qt = Q.

    The Fokker–Planck equation for this case reduces to

    ∂p

    ∂t=

    12tr

    (Q

    ∂2p

    ∂x2t

    ), (5)

    which we recognize as the diffusion equation withQ inter-preted as the diffusion tensor. Taking the Fourier transformwith respect to the state vectorxt, we obtain

    dp†

    dt= −1

    2ξTt Qξtp

    †, (6)

    where p† denotes the Fourier transform of the conditionaldensity, andξt corresponds to the Fourier “wavenumber”variable. Equation (6) is a simple ODE that can be solvedin closed form. Taking the inverse Fourier transform of theresult, we arrive at a solution for the diffusion equation that canbe regarded as the smeared version of the initial density. Weconclude that the diffusion equation (5) can be solved usingefficient algorithms for computing the convolution integralsuch as the fast Fourier transform (FFT).

    Next, we consider the case when there is no process noise:

    Qt = 0.

    For this case, the Fokker–Planck equation reduces to a con-vection equation:

    ∂p

    ∂t+ fTt

    ∂p

    ∂xt= −tr

    (∂ft∂xt

    )p, (7)

    whereft can be interpreted as the drift vector, while the left-hand side of the equation denotes a substantive derivative [6].The PDE (7) can in turn be cast into an equivalent system ofODEs:

    d

    dt

    [xtp

    ]=

    [ft

    −tr(

    ∂ft∂xt

    )p

    ]. (8)

    We solve equation (8) by first forming a grid for the solutiondomain. Subsequently, the state vector valid at a particular gridpoint and the corresponding conditional density are propagatedsimultaneously using a standard ODE solver such as the 4th-order Runge–Kutta integrator [7].

    The results we obtain for these two limiting cases can becombined to solve the Fokker–Planck equation for the generalcase. This is achieved by mandating the process noise spectraldensity matrix to be discrete-time; that is, we consider the casewhen process noise is “injected” into the system at discretetime intervals coinciding with times of measurement. We notethat unlike in physics, process noise is an engineeringdesignparameter. In physics, we have no control over the behaviorof the diffusion tensor. Thus, we let

    Qt =∑

    k

    Qkδ(t− tk), (9)

    whereδ(t − tk) is the Dirac delta function. Hence, immedi-ately after the measurement update step, the Fokker–Planckequation reduces to the diffusion equation (5), while betweenmeasurements, it reduces to the convection equation (7). Werefer to this method for solving the Fokker–Planck equationas the “split-step” technique. Finally, we can always considera diagonal process noise covariance matrix via an appropriatecoordinate transformation and thereby reducing the computa-tional complexity of solving the diffusion equation (5) evenfurther.

    In summary, we solve the nonlinear filtering problem withdiscrete-time measurements in three steps: (1) smear theconditional density valid at time indexk− 1 by convolving itwith a Gaussian function with covarianceQk−1; (2) propagatethe system of ODEs (8) from time indexk − 1 to time indexk for each grid point of the solution domain; and (3) updatethe conditional density with the measurement received at timeindex k using Bayes’ rule (4). We refer to this algorithm asthe “wave filter.”

    IV. N UMERICAL EXAMPLES

    A. Spinning Disk

    In this and the next section, we consider two simple two-dimensional examples demonstrating the performance of thewave filter with measurement updates. First, we consider theso called “spinning disk” problem [8]. We imagine a flat diskspinning about an axis. An observer observing the spinningdisk from an edge measures a feature of the disk that isa nonlinear function of some “metric” property of the disk.Specifically, we consider the state of the disk to be completelyspecified by the two-dimensional vector:

    xt =[θt θ̇t

    ]T, (10)

    whereθt is the angle of a point on the edge of the disk, whileθ̇ is the spin rate of the disk. We consider a linear systemdynamic model corresponding to a simple “constant-velocity”model:

    ft(xt) =[θ̇ 0

    ]T. (11)

    On the other hand, for the measurement model, we considerthe nonlinear model:

    hk = cos(θtk). (12)

    Since cosine is an even function, this measurement modelgives rise to an ambiguity with regards to the sign of the spinrate. In other words, the posterior density function is expectedto be bi-modal.

    Figure 1 summarizes our simulation results. Numericalvalues for the simulation parameters are listed in Table I.Figure 1 shows plots of the conditional density as a function ofthe two states, angle and spin rate. The first simulation “frame”shown in Figure 1a corresponds to the initial density prior tothe acquisition of the measurement. The density is assumed tobe uniform. As shown in Figure 1b, the density shows its bi-modal characteristic immediately after the first measurementupdate. Here, the estimation of spin rate remains poor. After

    2

  • TABLE I

    SPINNING DISK EXAMPLE SIMULATION PARAMETERS

    PARAMETERS VALUE DIMENSIONNumber of grid points per dimension 64 N/ATrue initial angle 65 degTrue initial spin rate 2 deg· s−1Angle process noise variance 0 deg2 · s−1Spin rate process noise variance 0.1 deg2 · s−3Measurement error std. dev. 0.25 N/ASampling time 0.5 s

    a few iterations, the estimation of the spin rate improves (Fig-ure 1c). However, the sign of the spin rate remains ambiguous.The last simulation frame shown in Figure 1d demonstratesthe convergence of the algorithm. We note how the spacing ofthe mesh changes over time. Our numerical integration of theconvection equation implements the notion of the substantivederivative by explicitly propagating each mesh point accordingto the underlying system dynamic model. Here, the systemdynamic model is linear. This is reflected in the shearedappearance of the mesh after a few iterations of the algorithm.

    B. Tracking a Falling Object on Reentry

    In this section, we consider a simple example with anonlinear dynamic model and a linear measurement model(in a way the mirror image of the problem we considered insection IV-A). We consider the problem of tracking a fallingobject on reentry from beyond the Earth’s atmosphere. Thestate vector is given by

    xt =[rt ṙt

    ]T, (13)

    where rt and ṙt refer to the range and range rate from asensor immediately below the falling object on a verticalline. A more general scenario would consist of observing theobject at an angle and would therefore have the completeposition and velocity vectors as the components of the statevector. Furthermore, due to the atmospheric drag, the ballisticcoefficient of the object and the air density would have to beknown in order to model the drag force in the direction op-posite to the object’s velocity vector in addition to the Earth’sgravitational force. In real applications, the ballistic coefficientis modelled as part of the state since it is in general unknown.For air density, a model is usually assumed with variousparameters which depend on the physical characteristics of thelocal atmosphere. These parameters vary randomly with sensorlocation and are dependent on meteorological factors. Here,we assume a simple model for the air density, and we assumethat the ballistic coefficient of the falling object is knownexactly. This scenario is not entirely unrealistic since one canenvisage an experiment consisting of dropping say a bulletwhose ballistic coefficient is specified by the manufacturerfrom a high altitude. Here, we follow the reentry examplegiven in section 5.2 of [9].

    (a)

    angle [deg]

    spin

    rat

    e [d

    eg/s

    ]

    Frame 1 of 41

    −200 −100 0 100 200 300 400 500

    −10

    −8

    −6

    −4

    −2

    0

    2

    4

    6

    8

    10

    (b)

    angle [deg]sp

    in r

    ate

    [deg

    /s]

    Frame 2 of 41

    −200 −100 0 100 200 300 400 500

    −10

    −8

    −6

    −4

    −2

    0

    2

    4

    6

    8

    10

    (c)

    angle [deg]

    spin

    rat

    e [d

    eg/s

    ]

    Frame 23 of 41

    −200 −100 0 100 200 300 400 500

    −10

    −8

    −6

    −4

    −2

    0

    2

    4

    6

    8

    10

    (d)

    angle [deg]

    spin

    rat

    e [d

    eg/s

    ]

    Frame 41 of 41

    −200 −100 0 100 200 300 400 500

    −10

    −8

    −6

    −4

    −2

    0

    2

    4

    6

    8

    10

    Fig. 1. Unnormalized conditional density for the spinning disk problem:(a) initial uniform density with uniform mesh, (b) density after one mea-surement update, (c) density after twenty-two measurement updates, and(d) density after forty measurement updates. The true state is shown as thecross section of the two thick lines. cross-section of the two thick lines.

    3

  • TABLE II

    REENTRY EXAMPLE SIMULATION PARAMETERS

    PARAMETERS VALUE DIMENSIONNumber of grid points per dimension 32 N/ATrue initial range 6.1 kmTrue initial range rate 3048 m· s−1Range process noise variance 0 m2 · s−1Range rate process noise variance 0.01 m2 · s−3Measurement error std. dev. 1 kmSampling time 1 s

    The system dynamic model is given by

    ft(xt) =

    [−ṙt

    −ρ(rt)gṙ2t2β + g

    ](14)

    whereg = 9.81 m · s−2 is the gravitational acceleration, andβ = 19161 kg ·m−1 · s−2 is the ballistic coefficient of thefalling object, which, for simplicity we are assuming is knownexactly. The air density is given by

    ρ(rt) = γe−ηrt (15)

    whereγ = 1.754 kg ·m−3 and η = 1.49× 10−4 m−1. Fromequation (14), it can be shown that the trace of the systemJacobian matrix is given by

    tr(

    ∂ft∂xt

    )=

    ∂f1,t∂rt

    +∂f2,t∂ṙt

    = −ρ(rt)gṙtβ

    (16)

    It follows that the solution of the convection equation forthis problem can be obtained from the following system ofnonlinear ODEs:

    d

    dt

    [xtp

    ]=

    [ft(xt)

    ρ(rt)gṙtpβ

    ], (17)

    which we solve numerically using a standard ODE solver suchas the 4th-order Runge–Kutta integrator. Finally, the linearmeasurement model is given by

    hk = rtk . (18)

    The parameters of the simulation are listed in Table II. Theresults are shown in Figure 2. As in the previous example,it is evident from the Figures that the spacing of the meshchanges over time. Again this is a direct consequence of ournumerical integration of the convection equation. Note that forthis example, the nonlinear system dynamic model results ina highly deformed mesh distribution after a few iterations ofthe algorithm.

    V. A DAPTIVE MESHING

    The algorithm presented in section III suffers from theso called curse of dimensionality; that is, the computationalcomplexity of solving the Fokker–Planck equation increasesexponentially with the dimension of the state vector [3]. Ascan be seen in Figure 1, the value of the conditional densityis negligible for many grid points. It is expected that byadaptively changing the mesh size over the solution domain,

    (a)

    altitude [km]

    velo

    city

    [m

    /s]

    Frame 1 of 31

    0 10 20 30 40 50 60 700

    500

    1000

    1500

    2000

    2500

    3000

    3500

    4000

    (b)

    altitude [km]ve

    loci

    ty [

    m/s

    ]

    Frame 2 of 31

    0 10 20 30 40 50 60 700

    500

    1000

    1500

    2000

    2500

    3000

    3500

    4000

    (c)

    altitude [km]

    velo

    city

    [m

    /s]

    Frame 14 of 31

    0 10 20 30 40 50 60 700

    500

    1000

    1500

    2000

    2500

    3000

    3500

    4000

    (d)

    altitude [km]

    velo

    city

    [m

    /s]

    Frame 21 of 31

    0 10 20 30 40 50 60 700

    500

    1000

    1500

    2000

    2500

    3000

    3500

    4000

    Fig. 2. Unnormalized conditional density for the reentry problem: (a) initialuniform density with uniform mesh, (b) density after one measurement update,(c) density after thirteen measurement updates, and (d) density after twentymeasurement updates. The true state is shown as the cross section of the twothick lines. cross-section of the two thick lines.

    4

  • it is possible to reduce the computational complexity of thealgorithm by concentrating on regions of high probabilitydensity. Many useful adaptive techniques have been developedfor the numerical solution of PDEs (see e.g. [10]). Onepossible technique suggested by Daum and Krichman [11] isthe adjoint method [12].

    We note that for a majority of applications, instead of theconditional density, we are mainly interested in the solutionof expectation integrals of the form:

    J = 〈ψ|p〉 =∫

    ψ(x)p(x|Zk) dx. (19)

    The adjoint method seeks the solution that minimizes the errorin estimating the functionalJ . If we regard the differentialequation (written in operator notation)

    Lp = φ (20)

    as the “prime” problem, then we can define a “dual” problemdescribed by

    L∗q = ψ, (21)

    whereL∗ is the adjoint of theL operator. From the definitionof the adjoint operator, it follows that the expectation inte-gral (19) can be equivalently expressed in terms ofq:

    J = 〈ψ|p〉 = 〈L∗q|p〉 = 〈q|Lp〉 = 〈q|φ〉 . (22)

    It can be shown that the errorδJ in estimating the expectationintegral (19) satisfies the bound [13]:

    ‖δJ‖ ≤∥∥L−1

    ∥∥ ‖Lph − φ‖ ‖L∗qh − ψ‖ , (23)

    where ph and qh refer to the approximate solutions of (20)and (21), respectively. By adaptively changing the mesh sizesuch that the error bound (23) is minimized, it can be shownthat the computational complexity of solving the Fokker–Planck equation can be significantly reduced [12]. We deferto a future study the application of the adjoint method to thewave filter.

    VI. SUMMARY AND CONCLUSION

    We have shown that the Fokker–Planck equation for thenonlinear filtering problem with discrete-time measurementscan be solved efficiently using a split-step technique. However,the computational complexity of the algorithm still increasesexponentially with the dimension of the state vector. Werecommend the adjoint method discussed in section V asa means to adaptively changing the mesh size and therebyreducing the computational complexity of the algorithm.

    ACKNOWLEDGEMENTS

    We wish to thank Dr. K. P. Dunn and Dr. C. B. Chang ofMIT Lincoln Laboratory for helpful discussions.

    REFERENCES

    [1] K. Kastella, “Finite difference methods for nonlinear filtering and auto-matic target recoginition,” inMultitarget/Multisensor Tracking, Y. Bar-Shalom and W. D. Blair, Eds. Artech House, Inc., 2000, vol. III.

    [2] F. E. Daum and J. Huang, “The curse of dimensionality and particlefilters,” in Proceedings of IEEE Aerospace Conference, March 2004.

    [3] F. E. Daum, “Nonlinear filters: beyond the Kalman filter,”IEEE AESSystems Magazine, August 2005.

    [4] A. H. Jazwinski,Stochastic Processes and Filtering Theory. New York:Academy Press, 1970.

    [5] F. E. Daum, “New exact nonlinear filters,” inBayesian Analysis of TimeSeries and Dynamic Models, J. C. Spall, Ed. New York: Marcel Dekker,Inc., 1988.

    [6] D. J. Tritton, Physical Fluid Dynamics, 2nd ed. Oxford SciencesPublications, 1988.

    [7] J. D. Hoffman,Numerical Methods for Engineers and Scientists. Mc-Graw Hill, 1992.

    [8] A. C. Schmidt, “Dynamic Bayesian networks for the classification ofspinning disks,” Master’s thesis, Massachusetts Institute of Technology,2004.

    [9] B. Ristic, S. Arulampalam, and N. Gordon,Beyond the Kalman Filter:Particle Filters for Tracking Applications. Boston: Artech House, 2004.

    [10] K. Eriksson, D. Estep, P. Hansbo, and C. Johnson, “Introduction to adap-tive methods for differential equations,” inActa Numerica. CambridgeUniversity Press, 1995, pp. 105–158.

    [11] F. E. Daum and M. Krichman, “Meshfree adjoint methods for nonlinearfiltering,” in Proccedings of IEEE Aerospace Conference, Big Sky,Montana, March 2006.

    [12] W. Bangerth and R. Rannacher,Adaptive finite element methods fordifferential equations. Birkhauser Inc., 2003.

    [13] N. Pierce and M. B. Giles, “Adjoint and defect error bounding andcorrection for functional estimates,”Journal of Computational Phyiscs,2004.

    5