Thesis for the Degree of Doctor of Philosophy

Data association algorithms and metric design for trajectory estimation

Abu Sajana Rahmathullah

Department of Signals and Systems
Chalmers University of Technology
Göteborg, Sweden
June 2016
Data association algorithms and metric design for trajectory estimation
Abu Sajana Rahmathullah
ISBN 978-91-7597-400-2

© Abu Sajana Rahmathullah, 2016.

Doktorsavhandlingar vid Chalmers tekniska högskola
Ny serie nr 4081
ISSN 0346-718X

Department of Signals and Systems
Chalmers University of Technology
SE-412 96 Göteborg
Sweden
Telephone: +46 (0)31 772 1000

Typeset by the author using LaTeX.

Chalmers Reproservice
Göteborg, Sweden 2016
Abstract

This thesis is concerned with trajectory estimation, which finds applications in various fields such as automotive safety and air traffic surveillance. More specifically, the thesis focuses on the data association part of the problem, for single and multiple targets, and on performance metrics.

Data association for single-trajectory estimation is typically performed using Gaussian mixture smoothing. To limit complexity, pruning or merging approximations are used. In this thesis, we propose systematic ways to perform a combination of merging and pruning for two smoothing strategies: forward-backward smoothing (FBS) and two-filter smoothing (TFS). We present novel solutions to the backward smoothing step of FBS and a likelihood approximation, called smoothed posterior pruning, for the backward filtering in TFS.

For data association in multi-trajectory estimation, we propose two iterative solutions based on expectation maximization (EM). The application of EM enables us to independently address the data association problems at different time instants, in each iteration. In the first solution, the best data association is estimated at each time instant using 2-D assignment, and given the best association, the states of the individual trajectories are immediately computed using Gaussian smoothing. In the second solution, we average the states of the individual trajectories over the data association distribution, which in turn is approximated using loopy belief propagation. Using simulations, we show that both solutions provide good trade-offs between accuracy and computation time compared to multiple hypothesis tracking.

For evaluating the performance of trajectory estimation, we propose two metrics that behave in an intuitive manner, capturing the relevant features in target tracking. First, the generalized optimal sub-pattern assignment metric computes the distance between finite sets of states, and addresses properties such as localization errors and missed and false targets, which are all relevant to target estimation. The second metric computes the distance between sets of trajectories and considers the temporal dimension of trajectories. We refine the concept of track switches, which allows a trajectory from one set to be paired with multiple trajectories in the other set across time, while penalizing it for these multiple assignments in an intuitive manner. We also present a lower bound for the metric that is remarkably accurate while being computable in polynomial time.

Keywords: Trajectory estimation, data association, metrics, Gaussian mixtures, smoothing, expectation maximization
Acknowledgments

This thesis would not have been possible without the support from many. I would like to start by thanking Prof. Mats Viberg and Prof. Tomas McKelvey for giving me the opportunity to work with the Signal Processing group at Chalmers.

I would like to express my immense gratitude to my supervisor Lennart Svensson. You have been a great guide to me in several aspects, from producing this thesis to writing to skiing. I really appreciate your honest, constructive and elaborate feedback. I have learnt a lot from our technical discussions and from the way you can connect seemingly different problems. Your warm and friendly personality made working with you such a pleasure.

Many thanks to my co-supervisor Daniel Svensson for his support over the years. I really appreciate that you have been there to help me with guidance and feedback. I have also enjoyed all the discussions I have had with you.

Thanks also to my co-supervisor Ángel F. García-Fernández for the exhilarating discussions, prompt feedback and for being a great host during my two-month stay in Australia.

I want to thank my collaborator and Master's thesis student Raghavendra Selvan for all the valuable discussions we have had. Thanks for helping me with the papers for this thesis despite the short notice.

I would also like to thank Prof. Ba-Ngu Vo for allowing me to visit his group at Curtin University. My sincere thanks to Karl, Samuel and Erik for their valuable time in proofreading this thesis. Thanks to all my current and former colleagues at Chalmers (Maryam, Lars, Ayça, Anders, Malin, Astrid, Naga, Keerthi, Sathya, Srikar, Rahul) who have made the workplace such a relaxed environment. I have enjoyed my five years at Chalmers; that would not have been possible without you all.

Thanks to all my friends who add a lot of fun to my life outside Chalmers. Thanks go to Ninva, Dipti, Gokul, Sanjna, Sandra, Deepthi, Annelie, Raneesh, Jagadeep, Aravindh and Kavitha.

I cannot thank enough my family (Zerina, Rahmathullah, Ziana, Asan and Hussain), who have always been an immense support for me. Thanks for the unconditional love you have for me. Last but not least, thanks to Rickard for his love and support. Thanks for being there during the ups and downs of my life.
Publications

This thesis is based on the following publications:

Paper I
A.S. Rahmathullah, L. Svensson and D. Svensson, "Merging-based forward-backward smoothing on Gaussian mixtures", in Proc. 17th International Conference on Information Fusion (FUSION), Salamanca, Spain, July 2014.

Paper II
A.S. Rahmathullah, L. Svensson and D. Svensson, "Two-filter Gaussian mixture smoothing with posterior pruning", in Proc. 17th International Conference on Information Fusion (FUSION), Salamanca, Spain, July 2014.

Paper III
A.S. Rahmathullah, R. Selvan and L. Svensson, "A batch algorithm for estimating trajectories of point targets using expectation maximization", accepted for publication in IEEE Transactions on Signal Processing, April 2016.

Paper IV
R. Selvan, A.S. Rahmathullah and L. Svensson, "Expectation maximization with hidden state variables for multi-target tracking", to be submitted to IEEE Transactions on Signal Processing.

Paper V
A.S. Rahmathullah, Á.F. García-Fernández and L. Svensson, "Generalized optimal sub-pattern assignment metric", submitted to IEEE Transactions on Signal Processing, Jan 2016.

Paper VI
A.S. Rahmathullah, Á.F. García-Fernández and L. Svensson, "A metric on the space of finite sets of trajectories for evaluation of multi-target tracking algorithms", submitted to IEEE Transactions on Signal Processing, May 2016.
Contents

Abstract iii
Acknowledgments v
Publications vii
Contents ix

I Introductory chapters xv

1 Introduction 1

2 Models, objectives and conceptual solution 5
2.1 State space representation . . . 5
2.2 Problem statement and conceptual solution . . . 6
2.2.1 Prediction and filtering . . . 6
2.2.2 Smoothing . . . 7
2.3 Models . . . 8

3 Single trajectory estimation 11
3.1 Linear and Gaussian filtering and smoothing . . . 11
3.1.1 Kalman filtering . . . 11
3.1.2 Smoothing . . . 12
3.2 Non-linear models . . . 13
3.2.1 Gaussian filters . . . 14
3.2.2 Particle filters . . . 15
3.3 Gaussian mixture filtering and smoothing . . . 16
3.3.1 Optimal solution . . . 17
3.4 Gaussian mixture reduction . . . 19
3.4.1 Pruning . . . 19
3.4.2 Merging . . . 19
3.4.3 Choice of GMR . . . 20
3.4.4 GMR for FBS and TFS . . . 20

4 Multi-trajectory estimation 23
4.1 Data association . . . 23
4.2 Tracking algorithms . . . 24
4.2.1 Global nearest neighbour . . . 25
4.2.2 Multiple hypothesis tracking . . . 25
4.2.3 Joint probabilistic data association . . . 26
4.2.4 Probabilistic multiple hypothesis tracking . . . 26
4.3 EM for data association . . . 26

5 Metrics 29
5.1 Need for a metric . . . 29
5.2 Metric properties . . . 30
5.3 Common metrics . . . 31
5.3.1 On vector spaces . . . 31
5.3.2 On the space of finite sets of vectors . . . 32
5.3.3 On the space of finite sets of trajectories . . . 33

6 Contributions and future work 37
6.1 Contributions . . . 37
6.2 Future work . . . 40

References 43

II Appended papers 53

Paper I Merging-based forward-backward smoothing on Gaussian mixtures 57
1 Introduction . . . 57
2 Problem formulation and idea . . . 59
2.1 Idea . . . 60
3 Background . . . 61
3.1 Forward-backward smoothing . . . 62
3.2 Optimal Gaussian mixture FBS . . . 62
3.3 Forward filter based on pruning and merging approximations . . . 63
4 Backward smoothing through merged nodes . . . 65
4.1 Notation . . . 67
4.2 Smoothing on merged nodes . . . 69
5 Algorithm description . . . 70
6 Implementation and simulation results . . . 73
6.1 Simulation scenario . . . 73
6.2 Implementation details . . . 73
6.3 Results . . . 74
7 Conclusion . . . 75
A Weight calculation . . . 76
References . . . 77

Paper II Two-filter Gaussian mixture smoothing with posterior pruning 81
1 Introduction . . . 81
2 Problem statement and idea . . . 83
2.1 Idea . . . 83
3 Background . . . 84
3.1 Two-filter smoothing . . . 84
4 Intragroup approximations of the backward likelihood . . . 86
4.1 Structure of the backward likelihood . . . 86
4.2 Normalizability of the BL and intragroup approximations . . . 87
5 Smoothed posterior-based pruning . . . 90
5.1 Posterior-based pruning . . . 90
6 Algorithm . . . 94
7 Implementation and simulation results . . . 94
7.1 Simulation scenario . . . 94
7.2 Implementation details . . . 94
7.3 Results . . . 97
8 Conclusion . . . 99
References . . . 99

Paper III A batch algorithm for estimating trajectories of point targets using expectation maximization 103
1 Introduction . . . 103
2 Problem formulation and background . . . 106
2.1 Nomenclature . . . 106
2.2 Problem formulation . . . 107
2.3 Background . . . 108
3 EM for target tracking . . . 110
3.1 Expectation maximization for MAP estimation . . . 110
3.2 Derivation of the EM algorithm for tracking multiple point targets . . . 112
4 Marginal data association probabilities calculation using loopy belief propagation . . . 115
4.1 Loopy belief propagation algorithm . . . 116
5 Algorithm . . . 118
6 Comparison with PMHT and JPDA . . . 119
6.1 Comparison with PMHT . . . 120
6.2 Comparison with JPDA . . . 122
7 Simulations and results . . . 125
7.1 Simulation scenarios . . . 125
7.2 Handling sensitivity to initialization . . . 127
7.3 Results . . . 130
8 Conclusions and future work . . . 132
A Derivation of the EM algorithm for the Gaussian case . . . 133
B Smoothing with the RTS smoother . . . 134
References . . . 134

Paper IV Expectation maximization with hidden state variables for multi-target tracking 141
1 Introduction . . . 141
2 Problem formulation and background . . . 144
2.1 Problem formulation . . . 144
2.2 Background . . . 145
3 EM for multi-target tracking . . . 148
3.1 Introduction to expectation maximization . . . 148
3.2 Derivation for data association estimation using EM . . . 149
3.3 Discussion . . . 153
4 Algorithm . . . 154
4.1 Description . . . 154
4.2 Initialization . . . 154
5 Track hypothesis switching . . . 156
5.1 Comparing track hypotheses . . . 158
5.2 Detecting targets in close proximity . . . 159
6 Simulation and results . . . 160
6.1 Simulations . . . 160
6.2 Results . . . 161
7 Conclusion and future work . . . 168
A Derivation of the M-step of the EM algorithm . . . 169
B Smoothing with the RTS smoother . . . 170
C Modified auction algorithm . . . 171
References . . . 171

Paper V Generalized optimal sub-pattern assignment metric 179
1 Introduction . . . 179
2 Background . . . 181
2.1 Metric . . . 182
2.2 OSPA metric . . . 182
2.3 Drawback with the normalization in OSPA . . . 184
3 Generalized OSPA metric . . . 184
3.1 Discussion . . . 185
4 Choice of α in MTT . . . 186
4.1 Motivating the choice of α = 2 . . . 186
4.2 Reformulating GOSPA for α = 2 . . . 188
5 Performance evaluation of MTT algorithms . . . 189
6 Illustrations . . . 192
7 Conclusions . . . 194
A Proof of the triangle inequality of GOSPA . . . 195
B Proof of the triangle inequality of OSPA . . . 198
C Proof of the average GOSPA metric . . . 199
References . . . 201

Paper VI A metric on the space of finite sets of trajectories for evaluation of multi-target tracking algorithms 207
1 Introduction . . . 207
2 Problem formulation and background . . . 210
2.1 Fundamental properties . . . 210
2.2 Challenges . . . 211
2.3 Bento's natural and computable metrics . . . 212
3 Assignment metric . . . 214
3.1 Overview . . . 215
3.2 Multi-dimensional assignment metric . . . 215
3.3 Interpretation . . . 217
3.4 Computation . . . 218
4 LP relaxation metric . . . 219
4.1 Binary linear programming formulation . . . 219
4.2 LP relaxation metric . . . 220
4.3 Implementation of the LP relaxation metric using ADMM . . . 221
5 Extension to random sets of trajectories . . . 222
6 Simulation results . . . 223
6.1 Comparison between ADMM and 'linprog' implementations . . . 223
6.2 Example in MTT using sets of trajectories . . . 225
7 Conclusion . . . 228
A Proof for Proposition VI.9 . . . 229
A.1 Proof for computability using LP . . . 229
A.2 Proof for metric properties . . . 229
B Formulation of the LP metric in consensus form . . . 235
References . . . 237
Chapter 1

Introduction

In many applications, the objective is to systematically and sequentially estimate quantities of interest from a dynamic system using indirect and inaccurate sensor observations. For instance, in radar tracking, the aim is often to determine the position and velocity of a moving or stationary aircraft or ship. In communication systems, the concern is to determine the messages transmitted through a noisy channel. In driver assistance systems, the objective is to monitor several features about the driver, the vehicle and the surroundings. There are also several other applications such as forecasting weather or financial trends, predicting house prices, handwriting recognition, speaker identification, and positioning in navigation systems.
The sequential estimation problem can be categorized into three different problem formulations: prediction, filtering and smoothing. The prediction problem is to forecast the values of the parameters of interest, given information up to an earlier time, whereas the filtering problem is about estimating the parameter at the current time, given information up to and including that time. The smoothing problem is to estimate the past state of the parameter using all the observations made. An example from [1] can be used to explain these different problem formulations in layman's terms. Assume that we have received a garbled telegram and that the task is to read it word by word and make sense of what the telegram means. The filtering formulation would be to read each word and understand the meaning so far. The prediction formulation would be to guess the coming words, based on what has been read thus far. In the smoothing formulation, the reader is allowed to look ahead one or more words. Clearly, as the idiom quoted in the book goes, "it is easy to be wise after the event": the smoothing formulation will give the best result on average, given that a delay can be tolerated.
In many of the above-mentioned applications, the aim is not only to estimate the parameters of interest, termed 'states', but also to describe the uncertainties in the states. The uncertainties are used to describe the reliability or trustworthiness of the produced estimates. Mathematically, an estimate and its associated uncertainty are quantified using either a probability density function (for continuous states) or a probability mass function (for discrete states). In sequential estimation, the probability function of the state, which carries the information about its estimate and the corresponding uncertainty, is propagated across time to estimate the subsequent states. For continuous states, one of the most commonly used density functions is the Gaussian density function, which is often referred to as the 'bell-shaped' curve. The famous Kalman filter [2] was developed as a solution to the filtering problem when the uncertainties are modelled using Gaussian density functions. There also exist (analytical) solutions to the smoothing problem with Gaussian densities.
Even though the Gaussian density models and the Kalman filter solutions work well for a wide range of applications, this may not be enough for complex systems. There are many applications where the uncertainty in the evolution of the state or the observation noise cannot be accurately modelled using Gaussian densities. For instance, in the data association problem, observations are often received from objects that are not of interest, and the information regarding which measurement belongs to the target of interest is not available. In such cases, the uncertainty about the state is clustered in several small regions, where each region corresponds to one measurement. This happens in ship surveillance, when false measurements are received from reflections off the sea, and in air traffic surveillance, where extraneous observations from clouds and birds are received. In these kinds of scenarios, instead of a single Gaussian density, the system or the observations are often modelled using what is called a Gaussian mixture density. In essence, the uncertainty of the state can be described using a Gaussian mixture where we have a Gaussian component for each cluster/region around which the uncertainty/data is centered, along with a weight that captures the intensity. The advantage of using a Gaussian mixture (GM) is that it is made up of several Gaussian components, which allows one to extend the Kalman filter solutions to these problems as well. However, in most problems, the number of possibilities, and thus the number of Gaussian components in the mixture, grows with time, which adds to the complexity of the algorithms.
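The need to keep the number of components in check can be illustrated with a toy mixture-reduction step. The sketch below prunes low-weight Gaussian components and renormalizes the remaining weights; the threshold and component cap are arbitrary choices for this illustration, not the reduction schemes proposed in the thesis.

```python
def prune_mixture(weights, means, variances, threshold=1e-3, max_components=10):
    """Sort Gaussian components by weight, keep at most max_components
    whose weight exceeds the threshold, and renormalize the weights.
    Threshold and cap are arbitrary choices for this illustration."""
    comps = sorted(zip(weights, means, variances), key=lambda c: -c[0])
    comps = [c for c in comps[:max_components] if c[0] > threshold]
    total = sum(w for w, _, _ in comps)
    return [(w / total, m, v) for w, m, v in comps]

# A 4-component scalar mixture; two components carry almost no weight.
mixture = prune_mixture(weights=[0.60, 0.35, 0.04, 0.01],
                        means=[0.0, 5.0, 9.0, 12.0],
                        variances=[1.0, 1.0, 1.0, 1.0],
                        threshold=0.05)
print(len(mixture))                   # 2 components survive
print(sum(w for w, _, _ in mixture))  # weights again sum to one
```

Pruning keeps the mixture a valid density (weights renormalized) while discarding components that contribute little probability mass; merging, discussed later in the thesis, instead combines nearby components.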
In the data association problem, the interpretation of a GM density of the state is that we have Gaussian uncertainty about the state for every possibility of matching the objects (also referred to as targets) of interest to the individual observations from the sensors. To give a sense of the number of possibilities, assume that there are k targets and n measurements. Then, the number of associations is n!/(n − k)!, where n! = n × (n − 1) × . . . × 1, assuming that every target produces a measurement at each time instant. Thus, at each time instant, we have a GM with a large number of components. When this density is propagated to the next time step to perform the Bayesian inference we discussed in the beginning of the chapter, the complexity of the problem explodes. Even for a single target, i.e. for k = 1, the computation of the optimal solution becomes intractable. Thus, approximations are inevitable. In this thesis, we provide efficient and effective solutions for both single and multi-target scenarios.
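The count n!/(n − k)! above can be made concrete with a few lines of code; the specific values of n, k and the number of scans are illustrative only.

```python
from math import factorial

def num_associations(n, k):
    """Number of ways to match k targets to n measurements, n!/(n-k)!,
    assuming every target produces exactly one measurement."""
    return factorial(n) // factorial(n - k)

# Per-scan hypothesis counts for n = 10 measurements.
print(num_associations(10, 1))       # 10 hypotheses for a single target
print(num_associations(10, 3))       # 720 hypotheses for three targets
# Over T scans the joint hypotheses multiply, (n!/(n-k)!)**T,
# which is why the Gaussian mixture grows explosively with time.
print(num_associations(10, 3) ** 5)
```

Even modest values of n and k make exact propagation of all hypotheses hopeless after a handful of time steps, which motivates the pruning, merging and EM-based approximations developed in the thesis.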
Another aspect considered in this thesis is the performance evaluation of trajectory estimation algorithms. The main objective of this part is to be able to quantify the similarity between the ground truth and the estimates returned by an algorithm. We might observe that there is a good match between some of the states in the ground truth and the estimates. We will want to quantify the similarity to judge how different algorithms perform. Besides this kind of error, it is possible that there are certain states in the ground truth that do not have any good match in the estimates, or vice versa. We would like to take these kinds of errors into account as well when defining the similarity between the ground truth and the estimates. In this thesis, we have focused on mathematically quantifying these kinds of similarities for trajectory estimation.
The research presented in this thesis has been sponsored by Vinnova (The Swedish Governmental Agency for Innovation Systems), under the research program "Nationella flygtekniska forskningsprogrammet" (NFFP 6), and by Electronic Defence Systems, Saab AB.
Outline of the thesis

The thesis is divided into two parts, where Part I presents the theoretical background of the Gaussian mixture smoothing problem and Part II contains a set of appended papers. Part I contains six chapters, among which Chapter 2 presents the mathematical formulations of the filtering and smoothing problems and the different models used. Chapter 3 discusses in detail the Gaussian mixture smoothing problem in the context of single trajectory estimation, and the difficulties involved. In Chapter 4, we discuss the data association problem in multi-target tracking. Chapter 5 presents a summary of the metric problem. In Chapter 6, we provide a summary of the contributions in the appended papers and also discuss possible future research directions.
In Part II, the contributions of the thesis are presented in Papers I through VI. In Papers I and II, we consider the data association problem in estimating a single target trajectory in the presence of clutter. Typically, this is carried out using pruning approximations on GMs. In the papers, we present an in-depth study of how to design forward-backward smoothing and two-filter smoothing for Gaussian mixtures, based on both merging and pruning approximations. In Papers III and IV, we consider the data association problem when estimating multiple target trajectories in the presence of clutter. We present two iterative solutions based on expectation maximization (EM). The major consequence of applying EM is that the problem of GM smoothing reduces to Gaussian smoothing in each iteration. In Papers V and VI, we address the problem of defining metrics for sets of target states and for sets of trajectories, respectively.
Chapter 2

Models, objectives and conceptual solution

In trajectory estimation, the goal is to sequentially estimate an unknown variable, given noisy observations of the variable. According to the Bayesian estimation principle, which is commonly used for these problems, the idea is that, based on our prior knowledge of the process, we predict the variable with some uncertainty. Then, when new information becomes available, the prediction is updated to get the 'posterior'. For trajectory estimation, the term 'posterior' takes different meanings depending on the set of measurements that we condition on. When the most recent state of the trajectory is newer than the complete collection of measurements over time, we have a prediction problem; for equally new information, we have a filtering problem; and, for a state that is older than the newest piece of information, we have a smoothing problem.

In this chapter, we briefly discuss the mathematical representation of the variables and how the trajectory estimation problem can be posed. We also present a brief summary of the models used in trajectory estimation.
2.1 State space representation

In a state-space representation, the unknown variable to be estimated is termed the 'state'. The state variable at time k is here denoted as x_k ∈ R^n and the observed data as z_k ∈ R^m. The time variability of the state is described by a motion model, while the relation between the state and the measurements is given by a sensor model. The implicit Markovian assumption of the state space is that the state x_k at time k, given all the states until time k − 1, depends only on the state x_{k−1} at time k − 1. The motion model can then be written as

x_k = g_k(x_{k−1}, v_k),   (2.1)

where v_k is the process noise. Using this motion model, one can describe the transition model f_k(x_k | x_{k−1}). It is assumed that we have some knowledge of the state at time 0, defined by the prior density p_0(x_0).

The measurement z_k is given by the sensor model

z_k = h_k(x_k, w_k),   (2.2)

where w_k is the measurement noise random variable. The sensor model is used to obtain the likelihood function p_k(z_k | x_k). In the remainder of the introductory chapters, the subscript k in the notation of the functions g_k(·), h_k(·) and p_k(·|·) will be dropped, without loss of generality and for ease of writing, and these will be represented as g(·), h(·) and p(·|·).
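The models (2.1) and (2.2) can be made concrete with a linear-Gaussian instance, g(x, v) = Fx + v and h(x, w) = Hx + w. The constant-velocity matrices and noise levels below are one common illustrative choice, not a model prescribed by the thesis.

```python
import random

# A scalar constant-velocity instance of (2.1)-(2.2):
#   x_k = F x_{k-1} + v_k,   z_k = H x_k + w_k,
# with state x = [position, velocity]. F, H and the noise standard
# deviations are illustrative choices for this sketch.
F = [[1.0, 1.0], [0.0, 1.0]]   # motion model g: one-step constant velocity
H = [1.0, 0.0]                 # sensor model h: observe position only
Q_STD, R_STD = 0.1, 0.5        # process and measurement noise std devs

def simulate(x0, num_steps, rng):
    """Draw a state trajectory and its noisy measurements."""
    xs, zs = [x0], []
    for _ in range(num_steps):
        p, v = xs[-1]
        x = [F[0][0] * p + F[0][1] * v + rng.gauss(0, Q_STD),  # position
             F[1][0] * p + F[1][1] * v + rng.gauss(0, Q_STD)]  # velocity
        z = H[0] * x[0] + H[1] * x[1] + rng.gauss(0, R_STD)    # sensor model
        xs.append(x)
        zs.append(z)
    return xs, zs

rng = random.Random(0)
states, measurements = simulate([0.0, 1.0], 20, rng)
print(len(states), len(measurements))  # 21 states, 20 measurements
```

The Markovian structure is visible in the code: each new state is drawn from the previous state only, and each measurement depends only on the current state.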
2.2 Problem statement and conceptual solution

The posterior density of the state x_k is used to determine our estimates of the state and also describes our uncertainties in the state. The objective is to recursively compute the posterior density of the state vector x_k using the Bayesian principle [1]. In the prediction problem, the goal is to obtain the density p(x_k | z_{1:l}) for l < k, given the measurements obtained from time 1 to time l, denoted z_{1:l}. In the filtering problem, the goal is to obtain the posterior density p(x_k | z_{1:k}) of the state x_k. In the smoothing problem, we are interested in computing the posterior density p(x_k | z_{1:K}), where K > k. Below, we present the conceptual solutions to these problems.
2.2.1 Prediction and filtering

The prediction and filtering densities can be obtained recursively in two steps, namely prediction and update, using the prior p(x_0), the process model f(x_k | x_{k−1}) and the likelihood p(z_k | x_k). The one-step prediction (where l = k − 1 in p(x_k | z_{1:l})) gives the prediction density at time k by evaluating the integral

p(x_k | z_{1:k−1}) = ∫ p(x_{k−1} | z_{1:k−1}) f(x_k | x_{k−1}) dx_{k−1},   (2.3)

where p(x_{k−1} | z_{1:k−1}) denotes the filtered density at time k − 1, and p(x_k | z_{1:k−1}) the prediction density at time k. An update of the prediction density at time k gives the filtered density at time k as

p(x_k | z_{1:k}) ∝ p(x_k | z_{1:k−1}) p(z_k | x_k).   (2.4)

The constant of proportionality in the above equation is 1/p(z_k | z_{1:k−1}), where

p(z_k | z_{1:k−1}) = ∫ p(x_k | z_{1:k−1}) p(z_k | x_k) dx_k.   (2.5)

It should be mentioned here that equations (2.3), (2.4) and (2.5) provide the theoretical solution but, in practice, these equations are in general not tractable. For instance, the integrals cannot be computed accurately, or the representation of the different densities in these equations can be intractable, and so on.
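One well-known case where (2.3)-(2.5) are tractable is the linear-Gaussian model, for which the recursion becomes the Kalman filter. The sketch below is a minimal scalar instance (random-walk motion, direct observation of the state, illustrative noise variances), not the general algorithm.

```python
def kf_predict(mean, var, q):
    """Prediction step, eq. (2.3), for the scalar random-walk model
    x_k = x_{k-1} + v_k with process-noise variance q."""
    return mean, var + q

def kf_update(mean, var, z, r):
    """Update step, eq. (2.4), for z_k = x_k + w_k with
    measurement-noise variance r; the gain normalizes as in (2.5)."""
    gain = var / (var + r)                 # Kalman gain
    return mean + gain * (z - mean), (1.0 - gain) * var

# Run the predict/update recursion on a few illustrative measurements.
mean, var = 0.0, 1.0                       # prior p(x_0)
for z in [0.9, 1.2, 1.0, 1.1]:
    mean, var = kf_predict(mean, var, q=0.01)
    mean, var = kf_update(mean, var, z, r=0.25)
print(mean)   # posterior mean settles near the measurements
print(var)    # posterior variance shrinks as measurements arrive
```

Each pass through the loop is one application of (2.3) followed by (2.4); for this model the densities stay Gaussian, so propagating a mean and a variance is exact.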
2.2.2 Smoothing
Similar to the �ltering problem, sequential estimation of the smoothed pos-
terior an be obtained using the Bayesian prin iple. Though the approa hes
dis ussed here have been designed towards �xed-interval smoothing, they
are in their ontextual form, appli able to the �xed-lag and �xed point
smoothing as well [1℄. One an also refer to [3℄ for a umulated state den-
sity formulation of the smoothing problem.
The first approach is forward–backward smoothing (FBS) [4]. As the
name suggests, the first step is forward filtering from time 1 to K, to ob-
tain the filtered density p(xk|z1:k) at each k. This is followed by backward
smoothing from time K to time 1. The backward smoothing step at time k
uses the smoothed density at time k+1 together with the filtering density
at time k as

p(xk|z1:K) = p(xk|z1:k) ∫ [p(xk+1|z1:K) / p(xk+1|z1:k)] f(xk+1|xk) dxk+1. (2.6)

The integral in the above equation is proportional to p(zk+1:K|xk), termed
the backward likelihood in this thesis. Therefore, it is possible to in-
terpret the division in the backward smoother as computing the backward
likelihood implicitly.
The second approach to smoothing is the two-filter smoothing (TFS)
method [5]. To obtain the smoothed density at time k by this method,
forward filtering is performed from time 1 to k to get the filtered density
p(xk|z1:k), and backward filtering is run from time K to time k to get the
backward likelihood p(zk+1:K|xk). The product of the two filter outputs
gives the smoothed density,

p(xk|z1:K) ∝ p(xk|z1:k) p(zk+1:K|xk). (2.7)
The backward filtering, similar to the forward filter, is performed recursively
using two steps: update and retrodiction. The update step computes the
likelihood

p(zk+1:K|xk+1) = p(zk+1|xk+1) p(zk+2:K|xk+1) (2.8)

and the retrodiction step computes the backward likelihood as

p(zk+1:K|xk) = ∫ p(zk+1:K|xk+1) f(xk+1|xk) dxk+1. (2.9)
Comparing the individual terms in (2.6) and (2.7) and using (2.9), one
can observe that the difference between the two smoothing methods arises
from the ways in which the term p(zk+1:K|xk+1) is computed. FBS
computes it from the division of the smoothed density by the prediction
density,

p(zk+1:K|xk+1) ∝ p(xk+1|z1:K) / p(xk+1|z1:k), (2.10)

whereas the TFS method uses a filtering approach as in (2.8).
2.3 Models

In this section, we discuss the different models relevant to trajectory esti-
mation. We present them in different categories based on their properties.
These models determine whether or not the integrals in the conceptual
solutions presented in the last section are tractable. For instance,
as we discuss in the next chapter, when we have a single target with lin-
ear process and measurement models, along with Gaussian noise terms and
Gaussian prior densities, the prediction, filtered and smoothed densities are
all Gaussian. In this case, the Kalman filter and the Rauch-Tung-
Striebel (RTS) smoother provide a recursive solution to obtain the mean
and the covariance of these densities in closed form. Below we summarize
some of the models used in the trajectory estimation literature.
The process model g(xk−1, vk) and/or the measurement model h(xk, wk)
can be non-linear functions of the state and the noise variables. A commonly
used non-linear measurement model arises when we have range and bearing
measurements and the state that we are interested in is the Cartesian position.
These kinds of non-linear functions are often handled using the extended Kalman
filter (EKF) [1,6], the unscented Kalman filter (UKF) [7,8], the quadrature Kalman
filter [9], the cubature Kalman filter (CKF) [10] or the particle filter [11–16].
Often in applications, we receive measurements not only from the tar-
get of interest, but also from sources that are not of interest to us. These
measurements are termed clutter and give rise to uncertainty in the data
association: it is not immediately clear which measurements are from the
single target and which are from clutter. The data association problem also
arises when we have multiple targets in the region of interest. Again, we
receive a set of measurements which may not carry the target identities. The
problem is only aggravated when we also have clutter data in addition.
Chapter 3
Single trajectory estimation
In this chapter, we briefly discuss the various challenges in filtering and
smoothing that arise in the single-trajectory estimation problem.
We discuss in detail the Gaussian mixture filtering and smoothing problems,
which are focus areas of the thesis.
3.1 Linear and Gaussian filtering and smoothing
Assume that the prior, p(x0), is a Gaussian density, and that the motion
and sensor models are linear functions of the state vector xk, with additive
Gaussian noise, i.e.,

xk = Fk xk−1 + vk (3.1)

and

zk = Hk xk + wk, (3.2)

where Fk ∈ R^{n×n}, Hk ∈ R^{m×n}, vk ∼ N(0, Qk) and wk ∼ N(0, Rk). Then,
it can be shown that the posterior densities are Gaussian and have closed-
form expressions [2]. Again, for convenience of writing, the subscript k
will be dropped from the matrix notation. In this section, we discuss the
algorithms to obtain the mean and covariance of the filtered and smoothed
densities.
3.1.1 Kalman filtering
Let the prediction density, p(xk|z1:k−1), and filtered density, p(xk|z1:k), be
denoted as N(xk; µk|k−1, Pk|k−1) and N(xk; µk|k, Pk|k), respectively. The
notation N(x; µ, P) denotes a Gaussian density in the variable x with mean
µ and covariance P. The goal of prediction and filtering is then to find the
first two moments of the corresponding Gaussian densities. The ubiquitous
Kalman filter equations [2] provide closed-form expressions for the first two
moments of the prediction and filtered densities in (2.3) and (2.4). The
prediction equations are given by

µk|k−1 = F µk−1|k−1 (3.3)
Pk|k−1 = F Pk−1|k−1 F^T + Q (3.4)

and the update equations by

Sk = H Pk|k−1 H^T + R (3.5)
Kk = Pk|k−1 H^T Sk^{−1} (3.6)
z̃k = zk − H µk|k−1 (3.7)
µk|k = µk|k−1 + Kk z̃k (3.8)
Pk|k = (In − Kk H) Pk|k−1, (3.9)

where In is the n-by-n identity matrix. The so-called innovation z̃k and
the innovation covariance Sk describe the expected measurement distribu-
tion. Kk is the Kalman gain, which can be viewed as the weight given to the
new information in the innovation relative to the prediction.
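As a concrete illustration, the prediction and update equations (3.3)–(3.9) can be sketched in a few lines of NumPy. The constant-velocity model and the noise levels below are illustrative assumptions, not tied to any particular application:

```python
import numpy as np

def kf_predict(mu, P, F, Q):
    """Prediction step, equations (3.3)-(3.4)."""
    return F @ mu, F @ P @ F.T + Q

def kf_update(mu_pred, P_pred, z, H, R):
    """Update step, equations (3.5)-(3.9)."""
    S = H @ P_pred @ H.T + R             # innovation covariance (3.5)
    K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain (3.6)
    innov = z - H @ mu_pred              # innovation (3.7)
    mu = mu_pred + K @ innov
    P = (np.eye(len(mu_pred)) - K @ H) @ P_pred
    return mu, P

# Illustrative scalar constant-velocity example (assumed model).
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.25]])
mu, P = np.zeros(2), np.eye(2)
mu_pred, P_pred = kf_predict(mu, P, F, Q)
mu, P = kf_update(mu_pred, P_pred, np.array([0.5]), H, R)
```

Note that the measurement reduces the position uncertainty: the updated covariance is smaller than the predicted one in the measured component.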
3.1.2 Smoothing
Below we provide the two versions�forward�ba kward smoothing (FBS)
and two��lter smoothing(TFS)�of the smoothing algorithm dis ussed in
Se tion 2.2.2 to obtain the mean and ovarian e of the smoothed density
under linear and Gaussian assumptions.
For the FBS method, the Rauch-Tung-Striebel (RTS) smoother [4] gives
the closed-form expressions for the mean and covariance of the smoothed
density. Using notation similar to that for the prediction and filtered
densities, the smoothed density at time k is denoted as N(xk; µk|K, Pk|K).
The RTS equations are

µk|K = µk|k + Ck (µk+1|K − µk+1|k) (3.10)
Pk|K = Pk|k + Ck (Pk+1|K − Pk+1|k) Ck^T, (3.11)

where

Ck = Pk|k F^T P^{−1}_{k+1|k} (3.12)

is similar to the Kalman gain in the Kalman filter equations (3.3) to (3.9).
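The backward pass (3.10)–(3.12) can be sketched as follows. The filtered and one-step-predicted moments are assumed to have been produced by a Kalman filter run over the batch; the function name and list-based interface are ours:

```python
import numpy as np

def rts_smooth(mus_f, Ps_f, mus_p, Ps_p, F):
    """Backward RTS pass, equations (3.10)-(3.12).

    mus_f[k], Ps_f[k]: filtered moments at time k.
    mus_p[k], Ps_p[k]: one-step predicted moments at time k.
    """
    K = len(mus_f)
    mus_s, Ps_s = mus_f.copy(), Ps_f.copy()  # initialise at the last time
    for k in range(K - 2, -1, -1):
        C = Ps_f[k] @ F.T @ np.linalg.inv(Ps_p[k + 1])  # smoother gain (3.12)
        mus_s[k] = mus_f[k] + C @ (mus_s[k + 1] - mus_p[k + 1])   # (3.10)
        Ps_s[k] = Ps_f[k] + C @ (Ps_s[k + 1] - Ps_p[k + 1]) @ C.T  # (3.11)
    return mus_s, Ps_s
```

At the final time the smoothed and filtered moments coincide, which is why the recursion starts from the filtered quantities at time K.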
For the TFS method, the work in [17] provides the closed-form solu-
tion for the moments of the smoothed density. Let the likelihoods be de-
noted as p(zk+1:K|xk+1) = N(Uk+1 xk+1; ψk+1, Gk+1) and p(zk+1:K|xk) =
N(Jk xk; ηk, Bk). Let the starting conditions at time K be JK = [ ],
ηK = [ ] and BK = [ ]. The update step in (2.8) of the backward filter is
then given by

Uk+1 = [Jk+1; H] (3.13)
ψk+1 = [ηk+1; zk+1] (3.14)
Gk+1 = [Bk+1, 0; 0, R], (3.15)

where [A; B] denotes vertical stacking and [A, 0; 0, B] a block-diagonal
matrix, while the retrodiction step in (2.9) is given by

Jk = Uk+1 F (3.16)
ηk = ψk+1 (3.17)
Bk = Uk+1 Q Uk+1^T + Gk+1. (3.18)

Using the outputs of the forward filter and the backward filter at time k,
the smoothed density in (2.7) is given by

µk|K = µk|k + Wk (ηk − Jk µk|k) (3.19)
Pk|K = (In − Wk Jk) Pk|k, (3.20)

with gain

Wk = Pk|k Jk^T (Jk Pk|k Jk^T + Bk)^{−1}. (3.21)

Note that the above three equations have similarities to the Kalman update
equations in (3.8) and (3.9). Here, the filtering parameters, µk|k and Pk|k,
are updated with the innovation from the future measurements from time
k + 1 to K, whereas in the Kalman filter the prediction parameters are
updated with the innovation from the measurement at time k.
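The stacking in (3.13)–(3.18) can be made concrete with a minimal sketch for a time-invariant model; the loop structure and the ordering of the stacked entries (future likelihood on top, newest measurement at the bottom) are our indexing conventions:

```python
import numpy as np

def tfs_backward(zs, F, H, Q, R):
    """Backward filter of TFS, equations (3.13)-(3.18).

    Returns, for each k, the triplet (J, eta, B) parameterising
    p(z_{k+1:K} | x_k) = N(J x_k; eta, B).
    """
    n, m = F.shape[0], H.shape[0]
    # Start conditions at time K: empty arrays.
    J, eta, B = np.empty((0, n)), np.empty(0), np.empty((0, 0))
    out = []
    for z in reversed(zs):
        # Update (3.13)-(3.15): stack the future likelihood with the
        # measurement model at the current time.
        U = np.vstack([J, H])
        psi = np.concatenate([eta, z])
        r = B.shape[0]
        G = np.zeros((r + m, r + m))
        G[:r, :r] = B
        G[r:, r:] = R
        # Retrodiction (3.16)-(3.18): propagate through the motion model.
        J = U @ F
        eta = psi
        B = U @ Q @ U.T + G
        out.append((J.copy(), eta.copy(), B.copy()))
    out.reverse()
    return out
```

Note how the dimension of the stacked quantities grows by m at every step, reflecting that the backward likelihood accumulates one measurement per scan.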
3.2 Non-linear models
When the motion model g(·) and/or the measurement model h(·) are non-
linear, or when the noise is not additive Gaussian, the posterior density for
the prediction, filtering and smoothing is, in general, not a Gaussian den-
sity. One example is when we obtain range and bearing measurements
from a radar and want to track the position and the velocity of the target.
The optimal solution then becomes intractable as the integrals cannot be
computed in closed form. Approximations and sub-optimal approaches are
therefore inevitable. There are several sub-optimal approaches to estimate
the filtered density in this case, some of which, such as the Gaussian and
particle filters, are discussed in this section. We discuss Gaussian mixtures,
which are applicable when the posterior is multimodal, in more detail
in subsequent sections of this chapter.
The smoothing problem has additional challenges compared to the fil-
tering problem. First, the equation in the FBS method involves division of
densities, which is difficult to compute for arbitrary densities. Second, the
accuracy of the approximations in the forward filtering strongly affects the
backward smoothing and the smoothed density. The TFS method, on the
other hand, does not involve density divisions, and the two filters can ideally
be run independently of each other. However, the likelihood p(zk+1:K|xk)
is not, in general, a normalizable density function, which limits the possi-
bilities to apply conventional approximation techniques for densities during
the backward filtering. Due to these additional complications, applying the
techniques used for non-linear filtering to non-linear smoothing does not
always produce fruitful results. In this section, we also discuss the chal-
lenges in extending techniques such as sequential Monte Carlo sampling,
linearization and sigma-point methods to the smoothing problem.
3.2.1 Gaussian filters

One approach to handling non-linear models is to approximate the posterior
density with a Gaussian density; the methods that use this approach are
aptly called Gaussian filters/smoothers. There are several methods to make
a Gaussian approximation of the filtered density and to compute its first
two moments. One method is based on linearization of the functions g(·)
and h(·), after which the Kalman filter equations in (3.3) to (3.9) can be
used to obtain the mean and covariance of the Gaussian approximation of
the filtered density. The famous extended Kalman filter [1,6,18] and its
many variants are based on this approach. Though these algorithms work
for a good number of models, their performance deteriorates when the
functions are highly non-linear.

Another class of methods used to obtain a Gaussian approximation of the
filtered density is based on sigma-points, such as the unscented Kalman
filter [7,8], the quadrature Kalman filter [9] and the cubature Kalman fil-
ter [10]. In these methods, a handful of points, termed sigma-points, are
chosen deterministically based on the first two moments of the prior den-
sity. The sigma-points are then propagated through the non-linear models
to obtain the means and covariances used to compute the moments of the
Gaussian approximation of the filtered density. The sigma-point methods
also implicitly perform a linearization using statistical linear regression [19].

Analogues of Gaussian filtering methods, such as the extended Kalman
filter and the unscented Kalman filter, exist for TFS of non-linear mod-
els. The extended Kalman smoother [20], similar to its filtering counter-
part, performs poorly when the non-linearity is severe. The unscented
Kalman smoother [21], [20, Chap. 7] needs the inverse of the dynamic model
functions, which may not be available in all scenarios. The unscented RTS
smoother [22] is the FBS version of a Gaussian smoother and is shown to
have performance similar to that of the unscented Kalman smoother, but
without the need to invert the model functions.
3.2.2 Particle filters

Particle filters or sequential Monte Carlo filters [11–16] are based on repre-
senting the density p(x) with a set of randomly drawn samples x(m), termed
`particles', along with their corresponding weights. The particles define the
positions of Dirac delta functions such that the weighted sum of the Dirac
delta functions of the particles provides a good approximation of the true
density:

p(x) ≈ (1/N) Σ_{m=1}^{N} δ(x − x(m)). (3.22)
These methods use the concept of importance sampling, where the parti-
cles are generated from a proposal density, which is simpler to sample
from than the true density. The particles are propagated
through the process model and the weights are updated using the likelihood
to obtain the posterior density. The choice of proposal density is crucial to
particle filters: the proposal density should have the same support as
the true density and should be as similar to the true density as possible.
The advantages of particle filters are that their performance
is unaffected by the severity of the non-linearity in g(·) and h(·), that they
are asymptotically optimal also when the functions are non-linear, and
that they are often easy to implement. However, they can be computa-
tionally demanding as the dimension of the state vector increases. Another
problem with particle filters is that they degenerate, which means that
the weights of most particles become zero. This can be overcome by re-
sampling [16] frequently, where multiple copies of the `good' particles with
significant weights are retained and the `poor' particles are removed.
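A single propagate–reweight–resample iteration of the bootstrap (prior-proposal) particle filter can be sketched as follows; the effective-sample-size resampling rule and the toy models passed in below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_pf_step(particles, weights, z, propagate, likelihood,
                      n_eff_thresh=0.5):
    """One bootstrap particle filter step: propagate through the process
    model, reweight by the likelihood, and resample when degenerate."""
    particles = propagate(particles)              # sample from f(x_k | x_{k-1})
    weights = weights * likelihood(z, particles)  # weight by p(z_k | x_k)
    weights = weights / weights.sum()
    # Resample when the effective sample size drops too low.
    n_eff = 1.0 / np.sum(weights ** 2)
    if n_eff < n_eff_thresh * len(weights):
        idx = rng.choice(len(weights), size=len(weights), p=weights)
        particles = particles[idx]
        weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights

# Toy 1-D example: Gaussian prior particles, random-walk propagation,
# Gaussian likelihood centred at the measurement.
N = 500
p0 = rng.normal(0.0, 1.0, N)
w0 = np.full(N, 1.0 / N)
prop = lambda x: x + rng.normal(0.0, 0.1, x.shape)
lik = lambda z, x: np.exp(-0.5 * (z - x) ** 2)
p1, w1 = bootstrap_pf_step(p0, w0, 1.0, prop, lik)
```

The weighted mean of `p1` approximates the posterior mean, which for this prior and likelihood lies between the prior mean 0 and the measurement 1.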
Similar to the filtering method, sequential Monte Carlo smoothing is
based on approximating the smoothed posterior density using a set of par-
ticles. In particle Markov chain Monte Carlo (MCMC) methods [23], par-
ticle filters are used to approximate the joint posterior distribution, which
is then used to generate proposals for MCMC. In the case of FBS based on
these methods, a vanilla version works well when k ≈ K, where K is the
batch duration. However, when k ≪ K, because of successive resam-
pling steps, the marginal density becomes approximated by a single particle,
which leads to deteriorated performance [24]. This is the degeneracy problem
that is inherent in particle filters [11]. One simple approach is to use the
forgetting properties of the Markov model, i.e., to approximate the fixed-
interval smoothed density p(xk|z1:K) using the fixed-lag smoothed density
p(xk|z1:k+δ) [25,26]. Unfortunately, automatic selection of δ is difficult.

In the case of TFS, it is not straightforward to approximate the output of
the backward filter using particles, as it is not a normalizable density. The
artificial prior method [5] uses the auxiliary probability density p̃(xk|zk+1:K)
instead of the likelihood p(zk+1:K|xk). The auxiliary density is obtained
using what are called artificial prior densities. The choice of the artificial
prior plays a major role in the performance of the TFS algorithm for particle
methods.
3.3 Gaussian mixture filtering and smoothing

There are many applications in which we receive several measurements of
the state variable, where the reliability of the measurements can vary. The
likelihoods in these applications are conveniently modelled as mixtures. In
the classic data association problem, where, upon receiving a set of mea-
surements, we do not know which measurement corresponds to which target,
the posterior density is a Gaussian mixture, even if we assume the motion
and measurement models are linear and Gaussian. We discuss this problem
further in the next chapter.

Gaussian mixtures (GMs) are weighted sums of Gaussian densities, which
usually provide a good approximation of multi-modal densities. In this section,
we explain that when the likelihood and/or the state transition density are
Gaussian mixtures, the true posterior densities after filtering and smoothing
are also Gaussian mixtures. The number of terms in the GM usually grows
exponentially with time, and we therefore need to constrain the number of
terms. In these situations, reduction algorithms can be used to approximate
the posteriors. In this section, we provide a brief overview of the most
commonly used mixture reduction methods and discuss the challenges in
applying these to smoothing problems.
3.3.1 Optimal solution

It was presented in the last chapter that the forward-backward smooth-
ing (FBS) method is based on forward filtering and backward smoothing,
while the two-filter smoothing (TFS) method involves forward filtering and
backward filtering. These steps involve the prediction, update and retrod-
iction steps stated in equations (2.3), (2.4), (2.8) and (2.9). One can notice
that all these equations involve products of functions, which in this case
are Gaussian mixtures. When the state transition densities and the likeli-
hoods are both GMs, one can use the fact that a product of GMs yields a
GM and show that the posterior densities are all Gaussian mixtures. The
number of components in the resulting GM is the product of the numbers of
components in the individual mixtures, which explains why the number of
components grows exponentially with time.
Forward filtering

One can show that, starting with a GM prior, the prediction and the up-
date steps of forward filtering result in a Gaussian mixture posterior density.
Evaluating these steps with GMs is equivalent to running one Kalman filter
for every triplet of Gaussian components in the prior p(xk−1|z1:k−1), the tran-
sition density f(xk|xk−1) and the likelihood p(zk|xk), yielding one Gaussian
term in the posterior p(xk|z1:k). The constant of proportionality
p(zk|z1:k−1) in (2.5) is not calculated explicitly in the update step of
the Kalman filter, which involves a product of Gaussian densities. However,
in the case with GMs, this constant of proportionality is used in the up-
dated weight calculation. The updated weight for the resulting Gaussian
component is given by the product of the individual weights of the compo-
nents in the prediction density and the likelihood, scaled by the constant
of proportionality.
Backward smoothing of FBS

The backward smoothing of FBS involves a division of the smoothed and
the prediction GM densities, as in equation (2.6). Starting from time K,
using the principle of mathematical induction, it can be shown that the
division results in a GM and that, therefore, the smoothed posterior p(xk|z1:K)
is also a GM, which has the same number of components for k = 1, . . . , K.
The weights of the components in the smoothed density at time k are
the same as the weights of the components in the smoothed density at time
k + 1. Instead of performing the division, an equivalent way of obtaining
the smoothed posterior is as follows [3, Sec. V A]: starting from k = K−1,
the RTS recursions are used for every triplet of associated components in
the smoothed density at time k + 1, the prediction density at time k and
the filtered density at time k, to compute the smoothed density at time k.
Backward filter of TFS

In the backward filter of TFS, we need to compute the backward likelihood
as in equations (2.8) and (2.9). The ideas in forward filtering cannot be
applied directly to the backward filter because the likelihoods can often be
of the form

w0 + Σi wi N(Hi xk; µi, Pi), (3.23)

where the different Hi can capture different features of the state xk. Strictly
speaking, these are not Gaussian mixture densities; they are neither Gaus-
sian nor densities in xk. We refer to them as reduced dimension Gaussian
mixtures in this thesis. To compute the product of likelihoods, one can use
the following general product rule:

wi N(Hi x; µi, Pi) × wj N(Hj x; µj, Pj) = wij N(Hij x; µij, Pij), (3.24)

where

wij = wi wj (3.25)
Hij = [Hi; Hj] (3.26)
µij = [µi; µj] (3.27)
Pij = [Pi, 0; 0, Pj], (3.28)

with [A; B] denoting vertical stacking and [A, 0; 0, B] a block-diagonal
matrix. Using this in equations (2.8) and (2.9), one can show that the
output of the backward filter has a structure similar to that of the inputs
in (3.23). The smoothed density is given by the product of the outputs of
the two filters, which can be computed similarly to the update step in the
forward filter of GMs, including the weight update using the proportionality
constant, as discussed in Section 3.3.1.
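The product rule (3.24)–(3.28) is a pure stacking operation, as the following sketch shows; the tuple-based component representation is our own convention:

```python
import numpy as np

def rdgm_product(comp_i, comp_j):
    """Product of two reduced-dimension Gaussian components,
    equations (3.24)-(3.28). Each component is a tuple (w, H, mu, P)."""
    wi, Hi, mui, Pi = comp_i
    wj, Hj, muj, Pj = comp_j
    w = wi * wj                        # (3.25)
    H = np.vstack([Hi, Hj])            # (3.26): stack the H matrices
    mu = np.concatenate([mui, muj])    # (3.27): stack the means
    ni = Pi.shape[0]
    P = np.zeros((ni + Pj.shape[0],) * 2)
    P[:ni, :ni] = Pi                   # (3.28): block-diagonal covariance
    P[ni:, ni:] = Pj
    return (w, H, mu, P)
```

Since the stacked dimension grows with every product, the output of the backward filter inherits the reduced-dimension GM structure of its inputs, as stated above.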
3.4 Gaussian mixture reduction

The number of components in the resulting GM, after update, prediction
and retrodiction iterations, grows exponentially with time. Therefore, ap-
proximations are necessary to reduce the number of components. There are
several Gaussian mixture reduction (GMR) algorithms that are well stud-
ied in the literature, which can be used for filtering and smoothing. The
GMR algorithms are based on pruning insignificant components from the
GM and/or merging similar components.
3.4.1 Pruning

The number of components in the posterior GM can be prevented from
growing exponentially by pruning some of the components after each iter-
ation. There are several pruning strategies that can be adopted. Three
methods that are commonly used are threshold-based pruning [27], M-best
pruning [28–30] and N-scan pruning. In threshold-based pruning, only the
components that have a weight greater than a predefined threshold are
retained and used for prediction in the next iteration. The number of com-
ponents in the resulting GM can vary based on the threshold. The idea
behind the M-best pruning algorithm is that only the components with the
M highest weights (or association probabilities) are retained.

To explain N-scan pruning [27], which is designed for reduction dur-
ing filtering, let us say we are interested in performing pruning at time k.
We pick the component that has the maximum weight. Starting from this
component, we trace backwards N steps to find its parent component at
time k−N. Only the offspring at time k of this parent node at time k−N
are retained. It should be mentioned here that multiple hypothesis tracking
(MHT) filtering [27] is often based on N-scan pruning.
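The first two pruning strategies can be sketched in a few lines; renormalizing the retained weights afterwards is a common convention we assume here:

```python
import numpy as np

def prune_threshold(weights, comps, thresh):
    """Threshold-based pruning: keep components with weight > thresh,
    then renormalize the retained weights."""
    keep = weights > thresh
    w = weights[keep]
    return w / w.sum(), [c for c, k in zip(comps, keep) if k]

def prune_m_best(weights, comps, M):
    """M-best pruning: keep the M components with the highest weights."""
    order = np.argsort(weights)[::-1][:M]
    w = weights[order]
    return w / w.sum(), [comps[i] for i in order]
```

Note that threshold-based pruning yields a data-dependent number of components, whereas M-best pruning yields a fixed number, which is the trade-off discussed above.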
3.4.2 Merging

One can also use merging of similar components to reduce the number
of components in a GM. There are several merging algorithms, such as
Salmond's [31–33], Runnalls' [34] and Williams' [35] algorithms, and many
more [36–39]. These algorithms work based on the following three steps:

1. Find the most suitable pair of components to merge according to a
`merging cost' criterion.

2. Merge the similar pair and replace the pair with the merged Gaussian
component.
3. Check if a stopping criterion is met. Otherwise, set the reduced mix-
ture to the new mixture and go to step 1.

The merging cost in step 1 measures the similarity of the components and
can differ across algorithms. A few of the commonly used merging costs
are the Kullback-Leibler divergence [40] and the integral squared error [41–
43]. The merging of the components in step 2 is usually based on moment
matching [40]. That is, the moments of the GM before and after merging are
the same. The stopping criterion can also vary across algorithms; e.g., it can
be based on whether the number of components in the reduced mixture is
manageable, or, in certain algorithms, on whether the components in the
reduced GM are no longer similar.
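The moment-matching merge of step 2 has a simple closed form for a pair of components, sketched below; the merged covariance includes the spread-of-means term so that the first two moments of the mixture are preserved:

```python
import numpy as np

def merge_pair(w1, mu1, P1, w2, mu2, P2):
    """Moment-matching merge of two Gaussian components: the merged
    component preserves the total weight, the mean and the covariance
    of the pair."""
    w = w1 + w2
    mu = (w1 * mu1 + w2 * mu2) / w
    # Covariance = weighted covariances plus the spread-of-means term.
    d1, d2 = mu1 - mu, mu2 - mu
    P = (w1 * (P1 + np.outer(d1, d1)) + w2 * (P2 + np.outer(d2, d2))) / w
    return w, mu, P
```

This is why merging, unlike pruning, does not underestimate the uncertainty: the separation between the merged means is folded into the covariance.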
3.4.3 Choice of GMR

Two main criteria in choosing the appropriate GMR algorithm are the com-
putational complexity involved and the accuracy. Most pruning al-
gorithms are simpler to implement than merging. However, the covariance
matrices of the pruned components carry information about the uncertainty
of the estimate, so, as a result of pruning, we might have under-
estimated uncertainties. In contrast, with merging, the uncertainty is
preserved because of the moment matching, but the merging algorithms
are more computationally intensive than pruning. As a trade-off between
complexity and accuracy of the uncertainty, it may be preferable to use a
combination of pruning and merging. Pruning ensures that the components
with negligible weights are removed, without being aggressive. Merging re-
duces the number of components further, while keeping the moments of the
retained density the same as before.
3.4.4 GMR for FBS and TFS

Applying GMR, both pruning and merging, to the forward filtering is straight-
forward. When the forward filtering is based on pruning, it is trivial to
perform the backward smoothing of the FBS similarly to the optimal solu-
tion, using the filtered densities. Starting from the last time instant, RTS is
performed backwards on the individual retained components. This method
suffers from degeneracy similar to particle smoothing. This is because, for
k ≪ K, the number of components in the forward filter that correspond
to the components in the smoothed posterior will be one. A solution to
the degeneracy is to perform FBS based on merging, something that has
not been explored much in the literature. The main challenge is that, after
merging, the associations across components in the backward smoothing are no longer
simple enough to use RTS directly and compute the weights of the smoothed den-
sity. In Paper I [44] of this thesis, the problem of FBS based on merging is
investigated.

The literature on TFS for Gaussian mixture densities is also sparse. The
two filters of the TFS can be run independently of each other. This allows
the GMR algorithms to be used in both filters. The difficulty is then in
using the Gaussian mixture reduction techniques in the backward filter,
since its output is not a density function, so the GMR algorithms discussed
here cannot be applied directly. In Paper II [45] of this thesis, we propose a
method called smoothed posterior pruning, through which pruning can be
employed in the backward filter.
Chapter 4
Multi-trajectory estimation
In the previous chapter, we presented a brief discussion of the data as-
sociation problem. Even with linear Gaussian assumptions, the number of
Gaussian terms in the Gaussian mixture form of the posterior density grows
exponentially. In this chapter, we discuss the problem further in the multi-
target setting, where additional challenges are posed by having a set of data
from multiple targets whose identities are not available.
4.1 Data association

In the presence of multiple point targets, each of which follows a linear-
Gaussian process and measurement model, at each time we observe a set
of measurements. If the identity of a target is not available in the mea-
surements, then there is uncertainty about which measurement belongs to
which target. In the case of a single trajectory, each hypothesis is an event of
assigning the target to a measurement from the set. In this case, the number
of possibilities at each time is the same as the number of measurements,
which, when multiplied across time, becomes exponential.
With multiple targets, the problem at each time is even worse. Let
us say we have n targets and m measurements at a particular time. Now,
each hypothesis, often referred to as a global hypothesis, is the event of
associating the n targets to the m measurements (assuming n ≤ m) such
that a target is assigned to at most one measurement and a measurement is
assigned to at most one target. The number of possibilities is combinatorial,
given by m!/(m−n)!; for instance, if we assume n = 2 and m = 3, the number
of possibilities is 6; if we double them, to n = 4 and m = 6, the number of
possibilities is 360. If we also consider the possibility that a target need not
generate a measurement, in other words, that a target can be missed, the
number of possibilities is even higher, and again combinatorial. Across time,
the complexity of the problem multiplies. Therefore, the optimal way of
solving the problem, considering all the possibilities, is intractable. Below
we briefly discuss the traditional approach taken.
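The counts above are easy to verify; the second function, which accounts for missed detections by summing over the number of detected targets, is an illustrative counting argument of our own, not a formula from this chapter:

```python
from math import factorial, comb

def num_global_hypotheses(n, m):
    """Number of ways to assign n targets to distinct measurements out
    of m (n <= m), i.e. m! / (m - n)!."""
    return factorial(m) // factorial(m - n)

def num_hypotheses_with_misses(n, m):
    """Counting variant in which each target may also be missed: choose
    which d targets are detected, then assign those d to measurements."""
    return sum(comb(n, d) * num_global_hypotheses(d, m)
               for d in range(n + 1))
```

For n = 2, m = 3 this gives 6 global hypotheses, and 360 for n = 4, m = 6; allowing missed detections strictly increases the count, as stated above.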
Let K denote the batch duration, k the time index, NK the number of
targets in the entire batch duration and Mk the number of measurements
obtained at time k. Assume the state variable is X = (Xk,i : k = 1, . . . , K,
i = 1, . . . , NK) and the measurement is Z = (Zk,j : k = 1, . . . , K,
j = 1, . . . , Mk), where i stands for the target index and j for the measure-
ment index.

If we know the posterior density p(X|Z), we can estimate the states,
which is tractable and straightforward if there is no uncertainty in the
measurement origin. However, in multi-target tracking problems, the
measurement set comprises the measurements from the targets that are
detected as well as the clutter measurements, and the origins of the mea-
surements in the set are not known. To handle this uncertainty in the
measurement origin, one traditional way is to introduce a data associa-
tion variable φ = {φk,i, ∀k, i}, where φk,i = j denotes the assignment of
target i at time k to the measurement Zk,j. With these variables,
the density of interest becomes p(X, φ|Z), using which the estimates can
be computed. Though the introduction of this variable makes it easier
to represent the measurement uncertainty, the estimation problem is still
intractable due to the sheer number of possibilities for φ. For instance, con-
sider X̂_MAP = argmax_X p(X|Z) = argmax_X max_φ p(X, φ|Z). The optimal
point is searched over all the possibilities of φ, which is exponential in the
number of measurements. Thus, sub-optimal approaches are inevitable. In
the remainder of this section, we give a brief overview of some of the ex-
isting sub-optimal algorithms to estimate X. Adhering to the conventional
terminology, we refer to an instance of φ as a data association hypothesis.
4.2 Tracking algorithms

The existing algorithms for tracking can be broadly categorised into two
groups: those that jointly estimate X and φ, and those that estimate X
while marginalizing φ. Global nearest neighbour (GNN) [27] and multiple
hypothesis tracking (MHT) [27,46,47] belong to the first category, whereas
joint probabilistic data association (JPDA) [27,48–55], probabilistic MHT
(PMHT) [56–59] and their variants belong to the second. There are
also sampling-based algorithms, like Markov chain Monte Carlo data asso-
ciation (MCMCDA) [60]. In this algorithm, an estimate of a hypothesis is
obtained by making several random changes to the existing hypothesis. The
computational and memory requirements of this algorithm are, in general,
very high. In this section, we first focus on the first category of algorithms,
which estimate φ while estimating X, followed by the second category,
which estimate X alone.
4.2.1 Global nearest neighbour

In the global nearest neighbour algorithm, at each time instant k, the best
data association hypothesis is chosen to be the one with the largest
Pr{φk|Z1:k}, where φk stands for the data association at time k and Z1:k
for the set of measurements from time 1 to k. This hypothesis
is propagated to the next time instant and associated with the new set of mea-
surements to form a new set of hypotheses. The best hypothesis is chosen
to be the one with the largest Pr{φ1:k+1|Z1:k+1}, and the whole procedure
is repeated for subsequent time instants. This algorithm is very simple to
implement using 2-D assignment algorithms [61], such as the auction algo-
rithm [62] or Jonker-Volgenant-Castanon (JVC) [63], but since it makes
hard decisions at every time instant, it underestimates the covariance and can
often lead to track loss.
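To make the 2-D assignment step concrete, the sketch below resolves one GNN scan by exhaustive enumeration; this brute force stands in for the auction or JVC algorithms cited above (which scale far better), and the squared-distance cost between predicted and received measurements is a simplifying assumption:

```python
from itertools import permutations
import numpy as np

def gnn_associate(pred_meas, meas):
    """One GNN scan by exhaustive 2-D assignment: pick the assignment of
    targets to distinct measurements minimising the total squared
    distance (only practical for small problems)."""
    n, m = len(pred_meas), len(meas)
    # cost[i, j]: squared distance between target i's predicted
    # measurement and received measurement j.
    cost = ((pred_meas[:, None, :] - meas[None, :, :]) ** 2).sum(axis=2)
    best, best_cost = None, np.inf
    for perm in permutations(range(m), n):  # measurement perm[i] for target i
        c = sum(cost[i, j] for i, j in enumerate(perm))
        if c < best_cost:
            best, best_cost = perm, c
    return {i: j for i, j in enumerate(best)}

# Two targets, three measurements (one of them clutter).
pred = np.array([[0.0, 0.0], [5.0, 5.0]])
Z = np.array([[5.1, 4.9], [10.0, 10.0], [0.2, -0.1]])
assoc = gnn_associate(pred, Z)
```

In this example measurement 1 is left unassigned, playing the role of clutter; the hard decision returned here is exactly what leads to the underestimated covariance mentioned above.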
4.2.2 Multiple hypothesis tracking

Similar to GNN, MHT generally makes hard decisions when estimating φ.
However, unlike GNN, the MHT algorithms make hard decisions based on
multiple scans of data. The commonly used N-scan pruning based track-
oriented MHT algorithm chooses the best set of hypotheses at time k based
on the last N scans of data, propagates them, and repeats the proce-
dure. In essence, the algorithm estimates the best hypothesis at time
k − N based on the information until the current time instant k. The N-
scan pruning algorithm is typically implemented using N-dimensional
assignment algorithms as in [64–66]. Another popular version of MHT is
to retain the M best hypotheses at each time and propagate only these M
hypotheses to the next time instant. This is typically implemented using
Murty's algorithm [28–30,67].
As can be noticed, MHT maintains multiple data association hypotheses at
every time instant, hence the name `multiple hypothesis tracking'. The
larger N (or M) is, the more accurate the estimates are. However, a larger
N leads to higher complexity, and a smaller N leads to the `short history'
problem of MHT. That is, the different hypotheses that are maintained at
each time instant k differ only in the most recent N associations (at times
k − N + 1, . . . , k) and are identical from time 1 to k − N. Therefore, only
one data association sequence is maintained from time 1 to k − N, and any
new information from future measurements cannot be used to update the
data association at those time instants.
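The N-scan pruning step described above can be sketched as follows. The flat dictionary of weighted association histories is a simplification invented for this illustration; a real track-oriented MHT maintains a hypothesis tree and per-track structures.

```python
def n_scan_prune(hypotheses, N):
    """N-scan pruning sketch. `hypotheses` maps an association history
    (a tuple with one entry per scan, all of length k) to its weight
    Pr{phi_{1:k} | Z_{1:k}}. Keep only the histories that agree with the
    best hypothesis on scans 1..k-N, i.e. hard-decide the older past."""
    best = max(hypotheses, key=hypotheses.get)
    k = len(best)
    prefix = best[: max(k - N, 0)]
    return {h: w for h, w in hypotheses.items() if h[: len(prefix)] == prefix}

# Three 3-scan histories; with N = 2 the decision at scan 1 is frozen
# to the best hypothesis's choice "a", discarding the ("b", ...) branch.
hyps = {("a", "a", "b"): 0.5, ("a", "b", "a"): 0.3, ("b", "a", "a"): 0.2}
pruned = n_scan_prune(hyps, N=2)
print(sorted(pruned))  # [('a', 'a', 'b'), ('a', 'b', 'a')]
```

This makes the short-history problem concrete: after pruning, every surviving hypothesis shares the same association at scan 1, so later measurements can never revise it.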
4.2.3 Joint probabilistic data association
Let us now shift our focus to the other class of algorithms, which estimate
X by integrating out φ. The JPDA algorithm integrates out φ at each time
instant and performs moment matching to approximate the distribution
over X by a Gaussian density. That is, the possibly multi-modal density
p(X|Z) = Σ_φ p(X, φ|Z) is approximated by a uni-modal density p̃(X|Z),
where the inclusive KL divergence KL(p(X|Z) || p̃(X|Z)) [68] is minimized.
This algorithm is computationally simpler than MHT, but when the
approximated posterior density is propagated across time, the performance
degrades.
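For Gaussian approximating densities, minimizing the inclusive KL divergence reduces to moment matching. A minimal one-dimensional sketch of the collapsing step, with invented function and variable names:

```python
def moment_match(weights, means, variances):
    """Collapse a 1-D Gaussian mixture into the single Gaussian that
    minimizes the inclusive KL divergence KL(p || p~): match the
    mixture's mean and variance."""
    assert abs(sum(weights) - 1.0) < 1e-12  # weights must sum to one
    mean = sum(w * m for w, m in zip(weights, means))
    # total variance = within-component variance + between-component spread
    var = sum(w * (v + (m - mean) ** 2)
              for w, m, v in zip(weights, means, variances))
    return mean, var

# Two equally likely association hypotheses placing the target at -1 or +1.
m, v = moment_match([0.5, 0.5], [-1.0, 1.0], [0.1, 0.1])
print(m, v)  # 0.0 1.1
```

Note how the collapsed variance (1.1) exceeds each component's variance (0.1): the association uncertainty is folded into the single Gaussian, which is precisely the information that is then coarsened when the approximation is propagated across time.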
4.2.4 Probabilistic multiple hypothesis tracking
Another popular tracking algorithm is the probabilistic multiple hypothesis
tracking (PMHT) algorithm, which aims to compute the maximum a posteriori
(MAP) estimates of the entire sequence of target states. The idea is to obtain
these estimates using the expectation maximization (EM) algorithm, where the
solution is iterative and involves several local optimisations:

    X̂^(n+1) = argmax_X Σ_φ Pr{φ | Z, X̂^(n)} ln p(X, φ, Z).   (4.1)

PMHT allows a target to be associated with multiple measurements, which
enables closed-form expressions for computing the marginal association prob-
abilities of the different trajectories. However, this approximation is more
suitable for extended target models than for the point target model assump-
tions presented in the beginning of this section. Paper III [69] of this thesis
also uses EM to estimate the states, similarly to PMHT; however, we retain
the point-target constraints that a target is assigned to at most one mea-
surement and a measurement is assigned to at most one target.
4.3 EM for data association
Expectation maximization (EM), first discussed in [70, 71], is an iterative
technique, widely used to obtain approximate maximum-likelihood (ML) or
MAP estimates of parameters from observed data. In the EM solution, the
model is assumed to have hidden variables that relate the measurements to
the states. This simplifies the analysis of the posterior density, which is
otherwise intractable. In trajectory estimation, we are interested in
obtaining estimates of the state vector X. In this section, we propose two
versions of EM to obtain the state estimates using the joint density
p(X, φ, Z). To start with, we present a brief introduction to the EM
algorithm for MAP estimation.
To derive EM in its general form, we adhere to notation that is common
in the EM literature: θ is the parameter to be estimated, Z the observed
data and γ the hidden variable. The MAP estimate of θ is given by

    θ̂ = argmax_θ p(θ|Z) = argmax_θ ln ∫ p(θ, Z, γ) dγ.   (4.2)

Note that the logarithm introduced in the maximization is a monotonically
increasing function and does not affect the MAP estimate.
In many applications (including tracking, as will be shown), the integral
in the MAP estimation according to (4.2) is not always tractable. To get a
tractable approximation, a distribution q_γ(γ) over the hidden variable γ
is introduced into the objective function in (4.2):

    ln p(θ, Z) = ln ∫ q_γ(γ) [ p(θ, Z, γ) / q_γ(γ) ] dγ   (4.3)
               ≥ ∫ q_γ(γ) ln [ p(θ, Z, γ) / q_γ(γ) ] dγ   (4.4)
               ≜ F(q_γ(γ), θ).   (4.5)
Jensen's inequality is used to go from (4.3) to (4.4). As can be observed,
the term F(q_γ(γ), θ) on the right-hand side of (4.4) is a lower bound on the
logarithm of the joint density ln p(θ, Z) [71] and is a functional of q_γ and θ.
In EM, this lower bound is increased over the iterations such that the
difference between the bound and the logarithm of the density decreases [70].
This is achieved by performing the following operations in each iteration
(n + 1):

    q_γ^(n+1)(γ) = argmax_{q_γ(γ)} F(q_γ(γ), θ^(n))   (4.6)
    θ^(n+1) = argmax_θ F(q_γ^(n+1)(γ), θ).   (4.7)

The first step is called the E-step, where the best q_γ^(n+1)(γ) is computed
given θ^(n). The second step, called the M-step, computes the best θ^(n+1)
given q_γ^(n+1)(γ).
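The lower bound (4.4) and the E-step can be illustrated numerically for a toy model with a binary hidden variable; all numbers below are invented for the illustration. Any choice of q_γ gives a lower bound on ln p(θ, Z), and the posterior over γ (what the E-step computes) attains it with equality:

```python
from math import log

# Toy model: discrete hidden variable gamma in {0, 1}. For a fixed
# (theta, Z), the joint p(theta, Z, gamma) is just two numbers.
p_joint = [0.3, 0.1]              # p(theta, Z, gamma=0), p(theta, Z, gamma=1)
log_evidence = log(sum(p_joint))  # ln p(theta, Z), the quantity bounded in (4.4)

def lower_bound(q):
    """F(q, theta) = sum_gamma q(gamma) ln [ p(theta, Z, gamma) / q(gamma) ]."""
    return sum(qi * log(pi / qi) for qi, pi in zip(q, p_joint) if qi > 0)

q_arbitrary = [0.5, 0.5]
q_posterior = [pi / sum(p_joint) for pi in p_joint]  # p(gamma | theta, Z)

print(lower_bound(q_arbitrary) <= log_evidence)      # True (Jensen)
print(abs(lower_bound(q_posterior) - log_evidence))  # essentially 0: bound is tight
```

This is the mechanism behind (4.6): maximizing F over q_γ for fixed θ closes the gap to ln p(θ, Z), so the subsequent M-step (4.7) cannot decrease the log-density.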
In Paper III [69] and Paper IV [72], we have used two approaches within EM
to obtain the estimates. In Paper III, we use the EM problem formulation
to estimate the state X directly from the joint density p(X, φ, Z); in other
words, we set θ = X and γ = φ. In Paper IV, we reverse the roles of X and
φ. That is, we use EM to estimate the data association φ, from which we
can obtain the state estimates X immediately.
Chapter 5
Metrics
Metrics are important in multi-target tracking (MTT) for performance eval-
uation and algorithm design. In essence, metrics are necessary to quantify
the closeness between a ground truth and an estimate thereof. When design-
ing metrics for MTT, there are specific challenges that must be addressed,
such as localisation errors, errors due to missed and false targets, and
penalties for track switches.

In this chapter, we discuss the need for metrics in MTT, followed by a
discussion of basic metric properties. We also present a summary of the
commonly used metrics, and briefly discuss the challenges in designing
a metric for trajectory estimation.
5.1 Need for a metric
Metrics are needed in trajectory estimation for two main reasons: designing
algorithms and evaluating performance. In algorithm design, one wants an
estimate that is close to the true state in some sense, and it is reasonable
that a metric is used to define this closeness. When evaluating the performance
of an algorithm, one needs a similarity measure to quantify the error between
the obtained estimates and the ground truth. Once again, it seems natural
that a metric is used to quantify this error.
In multi-target tracking (MTT), the estimation is often formulated as a
Bayesian filtering problem, where the ground truth is a random quantity and
the estimates depend deterministically on the observed data. For perfor-
mance evaluation, in many cases, we average over several realizations of the
data, so the estimates are random as well. In both scenarios discussed
above, designing algorithms and evaluating performance, the objectives
involve averaging the metric over different instances of the random
quantities involved, i.e., the ground truth and the estimate. In other words,
the similarity between the ground truth and the estimate is computed in
an average sense. It is important that this averaged similarity is also a
metric.
5.2 Metric properties
The definition of a metric varies slightly depending on whether the variables
involved are random or not. In this section, we summarize the properties of
a metric on general spaces and on probability spaces. We also briefly discuss
the significance of these properties.
Definition 5.1. A metric d_A(·, ·) on a set A is a function that satisfies the
following properties for any x, y, z ∈ A [73, Sec. 2.15]:
1. Non-negativity: d_A(x, y) ≥ 0.
2. Definiteness: d_A(x, y) = 0 ⇔ x = y.
3. Symmetry: d_A(x, y) = d_A(y, x).
4. Triangle inequality: d_A(x, y) ≤ d_A(x, z) + d_A(z, y).
For metrics on a probability space A, the definiteness between random
variables is in the almost-sure sense [74, Sec. 2.2], as described in the fol-
lowing definition.

Definition 5.2. A metric d_A(·, ·) on a set A is a function that satisfies the
following properties for random variables x, y, z ∈ A:
1. Non-negativity: d_A(x, y) ≥ 0.
2. Definiteness: d_A(x, y) = 0 ⇔ Pr(x = y) = 1.
3. Symmetry: d_A(x, y) = d_A(y, x).
4. Triangle inequality: d_A(x, y) ≤ d_A(x, z) + d_A(z, y).

In the above definition, Pr(x = y) = 1 means that the event that x and y
take the same value has probability 1.
We now proceed to describe and discuss the significance of these prop-
erties, with extra emphasis on the triangle inequality. The non-negativity
property ensures that the distance cannot be negative. The definiteness
property ensures that the distance between a point and itself is 0; for
random variables, this property is in essence ensured for all the points that
have non-zero probability. The symmetry property confirms that the dis-
tance from point x to y is the same as the distance from point y to x.
The triangle inequality, despite its abstractness, has major practical
importance in algorithm assessment [75, Sec. 6.2.1]. Suppose there are two
estimates y and z for a ground truth x. Let us assume that, according to
d_A, the estimate z is close to the ground truth x and is also close to the
other estimate y. Then, intuitively, the second estimate y should also be
close to the ground truth x. This behavior is ensured by the triangle
inequality. The triangle inequality also has practical implications for
ensuring the quality of approximate optimal estimators. Consider x to be
the ground truth, and z to be the optimal estimate according to a certain
criterion. Let us assume that it is difficult to compute the optimal estimate
z, such that we resort to an approximation y of the optimal z; this happens
often in practice. If the triangle inequality did not hold, then even if we had
a good estimate y, close to the optimal z, which in turn is close to the
ground truth x, the distance from y to the ground truth x could still be
large. This is clearly not desirable.
5.3 Common metrics

In this section, we discuss some of the commonly used metrics in the litera-
ture. One way of categorizing the metrics is based on the kind of variables
involved. Below, we summarize the metrics used for vectors, finite sets of
vectors and finite sets of trajectories. Before presenting the metrics, we
first discuss, in each subsection, the scenarios in which those kinds of
metrics are useful.
5.3.1 On vector spaces
Metrics on vector spaces are well studied, with the most common one
being the Euclidean metric. The root mean square error (RMSE) is
based on the Euclidean metric and is commonly used when the involved
quantities are random. Below we present a slightly more general version of
the RMSE.
Definition 5.3. Given two vectors x, y ∈ R^N, the p-norm distance for any
1 ≤ p ≤ ∞ is a metric. It is defined as

    d_p(x, y) ≜ ( Σ_{i=1}^N |x_i − y_i|^p )^{1/p}.   (5.1)
Definition 5.4. If x and y are random vectors with joint distribution
f(x, y), then the following is also a metric:

    d̄_p(x, y) ≜ ( E[ d_p(x, y)^p ] )^{1/p},   (5.2)

where the expectation is taken with respect to the joint distribution f(x, y).
Setting p = 2 in the above definition gives the commonly used RMSE
metric.
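Definitions 5.3 and 5.4 can be sketched in a few lines. In the sketch below, with invented names, the expectation in (5.2) is replaced by a sample average over draws from the joint distribution:

```python
def d_p(x, y, p):
    """p-norm distance (5.1) between two vectors of equal length (finite p)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def d_bar_p(samples, p):
    """Monte-Carlo estimate of (5.2): the p-th root of the average p-th
    power of d_p over sampled pairs (x, y) from the joint f(x, y)."""
    powers = [d_p(x, y, p) ** p for x, y in samples]
    return (sum(powers) / len(powers)) ** (1.0 / p)

# p = 2 recovers the familiar RMSE.
pairs = [((0.0, 0.0), (3.0, 4.0)),   # d_2 = 5
         ((1.0, 1.0), (1.0, 1.0))]   # d_2 = 0
print(d_bar_p(pairs, p=2))  # sqrt((25 + 0) / 2) ≈ 3.5355
```

Averaging the p-th powers before taking the root, rather than averaging the distances, is what makes d̄_p itself satisfy the triangle inequality.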
The Euclidean metric is also commonly used in the trajectory estimation
problem for simple scenarios. For instance, in the single-trajectory estima-
tion problem, where there is no uncertainty in the birth time and the death
time of the trajectory, there is a one-to-one correspondence between the
estimated state and the ground truth at each time instant. In this case,
one can simply use the metric for vectors at each time instant.
5.3.2 On the space of finite sets of vectors
Let us now consider scenarios with multiple targets, where we are inter-
ested in how good the localisation is at each time instant. In this case, at
each time instant, both the ground truth and the estimate are sets of state
vectors, and there is uncertainty about which vector in the estimated set
corresponds to which vector in the ground truth. The quantity of interest
is now a metric between sets of vectors.
The study of metrics on this space is relatively new. In the MTT lit-
erature, several metrics have been proposed for this purpose, such as the
Wasserstein metric [75, 76], the Hausdorff metric [76], the OSPA metric
[77] and many more [78–83]. Among these, the optimal sub-pattern
assignment (OSPA) metric is the most commonly used. The metrics in
[82] and [83] propose a base distance that also takes into account quality
information about the estimated state. Below we present the OSPA metric.
Definition 5.5. Let d(·, ·) denote a metric on R^N such that d(x, y) is the
distance between x, y ∈ R^N, and let d^(c)(x, y) = min(d(x, y), c) be the
cut-off metric associated with d(x, y) [75, Sec. 6.2.1]. We also refer to
d^(c)(x, y) as the base distance. Let Π_n be the set of all permutations of
{1, . . . , n} for any n ∈ N; any element π ∈ Π_n is a sequence
(π(1), . . . , π(n)).

Let X = {x_1, . . . , x_|X|} and Y = {y_1, . . . , y_|Y|} be finite subsets of a
bounded observation window W ⊂ R^N, where |A| denotes the cardinality of
a set A. For 1 ≤ p < ∞ and |X| ≤ |Y|, OSPA [77] is defined as

    d_p^(c)(X, Y) ≜ ( (1/|Y|) [ min_{π ∈ Π_|Y|} Σ_{i=1}^{|X|} d^(c)(x_i, y_π(i))^p + c^p (|Y| − |X|) ] )^{1/p}.   (5.3)
For |X| > |Y|, d_p^(c)(X, Y) = d_p^(c)(Y, X). The ∞-OSPA is defined as

    d_∞^(c)(X, Y) ≜ min_{π ∈ Π_|Y|} max_{1 ≤ i ≤ |X|} d^(c)(x_i, y_π(i)) if |X| = |Y|, and c otherwise.   (5.4)
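A brute-force sketch of (5.3) for small sets of scalars, using the absolute difference as the base metric d(x, y) (the definition allows any metric on R^N). The function name is invented, and practical implementations use optimal assignment solvers rather than enumerating permutations:

```python
from itertools import permutations

def ospa(X, Y, c, p):
    """OSPA distance (5.3) between finite sets of scalars, with
    |x - y| as base metric; brute-force over permutations."""
    if len(X) > len(Y):          # (5.3) is stated for |X| <= |Y|; use symmetry
        X, Y = Y, X
    if not Y:
        return 0.0
    n = len(Y)

    def dc(x, y):                # cut-off base distance d^(c)
        return min(abs(x - y), c)

    best = min(
        sum(dc(x, Y[pi[i]]) ** p for i, x in enumerate(X))
        for pi in permutations(range(n))
    )
    return ((best + c ** p * (n - len(X))) / n) ** (1.0 / p)

# One true target at 0; the estimate reports it plus a false target at 10.
print(ospa([0.0], [0.0, 10.0], c=2.0, p=1))  # → 1.0, i.e. (0 + c·1)/2
```

The cardinality mismatch contributes c^p per unaccounted element, so the false target costs the maximum per-element penalty c, averaged over the larger set.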
In Paper V [84], we discuss the shortcomings of the above formulation.
We propose a new metric that addresses the localisation errors as well as
the missed and false targets that are of interest in MTT. We also extend the
metric to compute equivalents of the RMSE metric for vectors.
5.3.3 On the space of finite sets of trajectories
In many tracking algorithms, such as multiple hypothesis tracking (MHT)
[27, 46, 47] and joint probabilistic data association (JPDA) [49, 50], the out-
put of the algorithm is not just the set of states at each time. Instead, the
output is a set of time sequences of states, i.e., trajectories of states. Note
that the theory for sets of trajectories has been well established in [85]. To
define a metric between sets of trajectories, it is common to use the metric
discussed in Section 5.3.2 or a simple modification of it. However, this
strategy produces strange and counter-intuitive behavior. Below, we discuss
some of those approaches and their shortcomings.
One approach is to use OSPA on the entire sets of trajectories [86, 87]. In
this approach, one uses the OSPA definition with a base metric defined
between two tracks. To discuss the problems with this approach, let us
consider the examples in Figures 5.1 and 5.2. In both figures, the ground
truth is the trajectory shown in blue o's, and the estimates are the ones
shown in red ×'s. According to the metric we just discussed, OSPA picks
the assignment with the minimum base distance between the tracks in the
ground truth and the tracks in the estimate. Assuming ǫ is almost 0 and ρ
is large, the OSPA distance indicates that the estimates in Figures 5.1 and
5.2 have the same distance to the ground truth. This is clearly counter-
intuitive: the estimate in Figure 5.1 is clearly better than the one in
Figure 5.2. This undesirable behavior is due to the property that OSPA
assigns each track in the ground truth to exactly one track in the estimate,
assuming the estimate has more tracks than the ground truth.
[Figure 5.1: state vs. time plot, t = 1, . . . , 6, with state levels ∆ and ∆ + ǫ.]
Figure 5.1: If ǫ is small, the estimate indicated by red ×'s is still a good estimate,
compared to the one in Figure 5.2, for the ground truth in blue o's. The only problem
with this estimate is that it has split a single trajectory into two.

[Figure 5.2: state vs. time plot, t = 1, . . . , 6, with state levels ∆, ∆ + ǫ and ∆ + ρ.]
Figure 5.2: If ǫ is small and ρ is large, the estimate tracks in red ×'s should have
a larger distance to the ground truth in blue o's compared to the estimate in Figure 5.1.
Another common approach is to apply the OSPA metric directly to the sets
of states of the trajectories at each time instant. A shortcoming of this
approach can be illustrated with a simple example. Let us assume that the
ground truth contains a single trajectory of states, i.e., a time sequence of
states, and let us consider two different versions of the estimate:

• Estimate 1: we obtain a time sequence of states exactly identical to
the one in the ground truth;

• Estimate 2: we obtain two time sequences of states, one which ex-
actly matches the first half of the track in the ground truth and a
second which exactly matches the second half of the track in the
ground truth. (This corresponds to ǫ = 0 in the example in
Figure 5.1.)

Using this strategy, we get the exact same value, 0, of the `metric' for
both these estimates. This clearly violates the definiteness property
discussed in the beginning of the chapter.
The lesson from the above two approaches is that one may assign a track
in the ground truth to different tracks at different times, but there should
be an additional cost for being assigned to different tracks. We refer to
this as the switching cost. We have used this approach in Paper VI of the
thesis. In the paper [88], we compare our metric to some of the other
metrics in the literature [89, 90] that use the same approach.
Chapter 6
Contributions and future work
The main objectives of the thesis are to design algorithms for addressing the
data association problem in trajectory estimation and to design a metric to
evaluate the algorithms. The contributions of each paper comprising the
thesis are discussed briefly in this chapter. Furthermore, possible ideas for
future research that arose during the writing of this thesis are presented.
6.1 Contributions
In the following, the contributions of the six papers in the thesis,
and the relations between them, are presented.
Paper I
In this paper, the problem of forward-backward smoothing (FBS) of Gaus-
sian mixture (GM) densities based on merging is addressed. The existing
literature provides practical algorithms for the FBS of GMs that are based
on pruning. The drawback of a pruning strategy is that, as a result of exces-
sive pruning, the forward filtering can result in degeneracy. Backward
smoothing on this degenerate forward filter can lead to highly underesti-
mated data association uncertainties. To overcome this, we propose us-
ing merging of the GM during forward filtering as well as during backward
smoothing. As mentioned before, the forward filter based on merging is well
studied in the literature. A strategy to perform the backward smoothing
on filtered densities with merged components is analysed and explained in
this paper. When compared to FBS based on an N-scan pruning algorithm,
the smoothed densities obtained using the presented merging approxima-
tions show better track-loss, root mean squared error (RMSE) and
normalized estimation error squared (NEES) performance at lower
complexity.
Paper II
The objective of this paper is to obtain an algorithm for two-filter smooth-
ing (TFS) of GM densities based on merging approximations. The TFS
involves two filters, namely the forward filter and the backward filter, where
the former has been studied extensively in the literature. The latter, i.e.,
the backward filter, has a structure similar to a GM, but is not a normaliz-
able density. Therefore, the traditional Gaussian mixture reduction (GMR)
algorithms cannot be applied directly to the backward filter. The existing
literature, though providing an analysis of the backward filter, does not
present a strategy for the involved GMR. This paper presents two strate-
gies by which GMR can be applied to the backward filter. The first is an
intragroup approximation method, which depends on the structure of the
backward filter and presents a way in which GMR can be applied within
certain groups of components. The second method is a smoothed posterior
pruning method, which is similar to the pruning strategy for the (forward)
filtered densities discussed in [91]. In Paper I, the posterior pruning idea
is formulated and proved to be a valid operation for both the forward and
the backward filters. When compared to FBS based on an N-scan pruning
algorithm, the two-filter smoothed densities obtained using the presented
approximations of the backward filter are shown to have better track-loss,
RMSE and NEES performance at lower complexity.
Paper III
This paper addresses the data association problem in multiple trajectory es-
timation. The objective is to obtain the maximum a posteriori (MAP)
estimate of the state X from the joint density p(X, φ, Z). This problem
is not tractable, as the number of possibilities for φ is exponential. In
this paper, we address the problem using expectation maximisation (EM)
to estimate X, while treating φ as a hidden variable. We show that state
estimates can be obtained by running an iterative algorithm, where in each
iteration a Rauch-Tung-Striebel (RTS) smoother is run for each target.
The measurement updates in the filter and smoother are carried out with
composite measurements, which are weighted sums of the measurements
at each time instant. The weights, which are the marginal data association
probabilities, are computed using loopy belief propagation. We show in the
paper that, despite its simplicity, the algorithm's performance is comparable
to that of a multiple hypothesis tracking (MHT) algorithm.
Paper IV
The data association problem is addressed in this paper by estimating the
data association variable φ from the joint density p(X, φ, Z). Once φ is es-
timated, X can be estimated immediately using an RTS smoother. The
strategy is to use EM to estimate φ. This results in an iterative algo-
rithm, where in each iteration one runs an RTS smoother for each target.
The measurements for the RTS smoother are obtained using global nearest
neighbour (GNN) assignment at each time. In the paper, we show that the
algorithm outperforms an MHT implementation in terms of mean optimal
sub-pattern assignment (OSPA) performance.
Paper V
In this paper, we present a metric named generalised OSPA (GOSPA) to
compute the distance between two sets of vectors. We show that, compared
to the OSPA metric, our metric addresses the problem by penalising missed
and false targets, whereas OSPA penalises the cardinality mismatch. We
also show that the GOSPA metric can be extended to random finite sets of
vectors, which is relevant for performance evaluation and algorithm design.
We show that, given a joint distribution over two sets of vectors, the mean
GOSPA and the root mean squared GOSPA are also metrics.
Paper VI
In this paper, we propose a metric based on multidimensional assignments
in the space of sets of trajectories. Besides the localisation cost and the
costs for missed and false targets [84], this metric also addresses track
switches by allowing a trajectory to be assigned to multiple trajectories
across time, while penalising it for these multiple assignments. We introduce
the concepts of half and full switches to quantify the penalty. As this
multidimensional assignment metric belongs to the NP-hard class of problems,
we also propose a lower bound on the metric that is computable in polynomial
time using linear programming (LP). We also show that this lower bound is
a metric in the space of sets of trajectories. From simulations, we have
observed that the lower bound computed using LP often returns the optimal
value of the multidimensional metric. An efficient way to compute the LP
metric using the alternating direction method of multipliers (ADMM), which
scales linearly with time, is also presented. We further adapt this metric to
random sets of trajectories.
6.2 Future work

Besides the ideas and algorithms presented in the thesis, we also obtained
a plethora of ideas to investigate in the future. In this section, we present
and discuss these ideas, which range from extensions of GM smoothing to
more complex scenarios than single-target linear Gaussian process models,
to computationally cheaper GM merging methods and message passing in
generic graphs.
Merging algorithms
The TFS and FBS algorithms presented in the thesis are based on merging.
There are several methods, such as Runnalls', Salmond's and variants of
these, which one can choose for the GM merging implementation. However,
the computational complexity of these methods is a serious limitation when
it comes to practical implementations, where GM merging is necessary at
each time instant. In both reduction algorithms, the merging cost must be
computed for every pair of components, which involves expensive matrix
multiplications. Therefore, the complexity of these algorithms is quadratic,
if not exponential, in the number of components, which is still expensive
considering the prediction, update and retrodiction steps. For the results
presented in the thesis, a significant amount of effort went into devising
practical merging algorithms, which resulted in two strategies. One is a
combination of Runnalls' and Salmond's algorithms, which is used in the
forward filter. The other is a modified version of Salmond's algorithm. A
possible investigation is to make fewer cost computations than computing the
cost for every pair (i, j). One way of reducing the number of merging cost
computations is to obtain bounds on the cost function. Suppose there is an
upper bound on the least possible cost, and suppose that, for some group of
pairs of components, we can compute a lower bound on their costs. If the
group's lower bound is greater than the upper bound on the lowest cost, the
cost computations for the component pairs in that group can be avoided.
The challenge is thus in obtaining the upper bound on the least cost and in
selecting the groups that can be eliminated. A closer analysis of the cost
function is necessary to obtain these bounds and a good choice of groups.
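To make the cost structure concrete, the following sketch computes a Runnalls-style merging cost for scalar Gaussian components (for scalars, the determinant of the merged covariance reduces to the merged variance) and performs the O(n^2) pairwise scan discussed above. All function names and the mixture are illustrative, not the thesis's actual implementation.

```python
from math import log

def merge_two(w1, m1, v1, w2, m2, v2):
    """Moment-preserving merge of two weighted scalar Gaussians."""
    w = w1 + w2
    m = (w1 * m1 + w2 * m2) / w
    v = (w1 * (v1 + (m1 - m) ** 2) + w2 * (v2 + (m2 - m) ** 2)) / w
    return w, m, v

def merge_cost(w1, m1, v1, w2, m2, v2):
    """Runnalls-style upper bound on the KL dissimilarity incurred by
    merging two components (scalar case: det P is just the variance)."""
    _, _, v = merge_two(w1, m1, v1, w2, m2, v2)
    return 0.5 * ((w1 + w2) * log(v) - w1 * log(v1) - w2 * log(v2))

# The O(n^2) scan: every pair's cost is evaluated before choosing one.
# This repeated full scan is the expense the bounding idea would avoid.
comps = [(0.5, 0.0, 1.0), (0.3, 0.1, 1.0), (0.2, 5.0, 1.0)]  # (w, mean, var)
pairs = [(a, b) for a in range(len(comps)) for b in range(a + 1, len(comps))]
i, j = min(pairs, key=lambda ab: merge_cost(*comps[ab[0]], *comps[ab[1]]))
print(i, j)  # 0 1: the two nearby components are cheapest to merge
```

In the vector case each cost evaluation involves the matrix operations mentioned above, which is why skipping whole groups of pairs via cheap lower bounds could pay off.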
Trajectory estimation with random birth and death events

In Paper III and Paper IV, it is assumed that all the tracks are present
during the whole batch duration. It would be interesting to extend the
approach to cases where track births and deaths happen at random times.
One exhaustive approach is to consider all possible birth and death time com-
binations for all the tracks. Similar to the data association problem, this
is a combinatorial problem. A more appealing approach would be to
include these variables in the joint density and use EM to estimate the
birth and death time variables as well.
Online algorithms
The algorithms proposed in Papers I to IV are all batch algorithms. Ex-
tending them to online operation would broaden their scope. There are
several possibilities to investigate here. For instance, one can use a sliding
window approach, where one tunes the width of the window, and also the
overlap across windows, based on the application. Another approach is to
extend the idea of smoothed filtering proposed in [91]. That is, to obtain
the filtered density p(xk|z1:k), one can go back and improve all the
approximations made at all the previous time instants. This improvement in
the approximations can be implemented using an iterative approach in the
papers' algorithms.
Metric for sets of trajectories based on a distance-based
switching cost

In Paper VI, track switches are used to penalise the case when a trajectory
in the ground truth set is assigned to multiple trajectories in the estimate.
In the current version, the penalty for a track switch is a fixed parameter.
But there can be scenarios where this penalty should vary based on the
severity of the switch. For instance, consider a pair of trajectories in the
ground truth that are close to each other for a certain duration and then
move apart. Let us consider two estimates of the ground truth. The first is
an estimate with trajectories such that the track switch happens when the
trajectories in the ground truth are close together. The second estimate has
trajectories such that the track switch happens when the two trajectories
in the ground truth are far apart. Intuitively, the first estimate is better
than the second, as its track switch happens in a region where the targets
might be difficult to resolve. This difference should be possible to address
by defining a penalty for the track switch that depends on the closeness
of the trajectories in the ground truth and their corresponding trajectories
in the estimate. The major challenge here is in defining the penalty in a
consistent way that still retains the metric properties. This would be an
interesting problem to investigate.
References

[1] B. D. Anderson and J. B. Moore, Optimal filtering. Courier Dover
Publications, 2012.
[2] R. E. Kalman, "A new approach to linear filtering and prediction prob-
lems," Journal of Basic Engineering, vol. 82, no. 1, pp. 35–45, 1960.
[3] W. Koch and F. Govaers, "On accumulated state densities with applica-
tions to out-of-sequence measurement processing," IEEE Transactions
on Aerospace and Electronic Systems, vol. 47, no. 4, pp. 2766–2778,
2011.
[4] H. E. Rauch, C. Striebel, and F. Tung, "Maximum likelihood estimates
of linear dynamic systems," AIAA Journal, vol. 3, no. 8, pp. 1445–1450,
1965.
[5] M. Briers, A. Doucet, and S. Maskell, "Smoothing algorithms for
state-space models," Annals of the Institute of Statistical Mathematics,
vol. 62, no. 1, pp. 61–89, 2010.
[6] K. Ito and K. Xiong, "Gaussian filters for nonlinear filtering problems,"
IEEE Transactions on Automatic Control, vol. 45, no. 5, pp. 910–927,
2000.
[7] S. J. Julier, J. K. Uhlmann, and H. F. Durrant-Whyte, "A new ap-
proach for filtering nonlinear systems," in Proceedings of the American
Control Conference, vol. 3. IEEE, 1995, pp. 1628–1632.
[8] S. J. Julier and J. K. Uhlmann, "Unscented filtering and nonlinear
estimation," Proceedings of the IEEE, vol. 92, no. 3, pp. 401–422, 2004.
[9] I. Arasaratnam, S. Haykin, and R. J. Elliott, "Discrete-time nonlinear
filtering algorithms using Gauss-Hermite quadrature," Proceedings of
the IEEE, vol. 95, no. 5, pp. 953–977, 2007.
[10] I. Arasaratnam and S. Haykin, "Cubature Kalman filters," IEEE Trans-
actions on Automatic Control, vol. 54, no. 6, pp. 1254–1269, 2009.
[11℄ A. Dou et and A. M. Johansen, �A tutorial on parti le �ltering and
smoothing: Fifteen years later,� Handbook of Nonlinear Filtering,
vol. 12, pp. 656�704, 2009.
[12℄ W. K. Hastings, �Monte Carlo sampling methods using Markov hains
and their appli ations,� Biometrika, vol. 57, no. 1, pp. 97�109, 1970.
[13℄ A. Smith, A. Dou et, N. de Freitas, and N. Gordon, Sequential Monte
Carlo methods in pra ti e. Springer S ien e & Business Media, 2013.
[14℄ A. Dou et, S. Godsill, and C. Andrieu, �On sequential Monte Carlo
sampling methods for Bayesian �ltering,� Statisti s and omputing,
vol. 10, no. 3, pp. 197�208, 2000.
[15℄ M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, �A tu-
torial on parti le �lters for online nonlinear/non�Gaussian Bayesian
tra king,� IEEE Transa tions on Signal Pro essing, vol. 50, no. 2, pp.
174�188, 2002.
[16℄ N. Gordon, B. Risti , and S. Arulampalam, Beyond the Kalman �lter:
Parti le �lters for tra king appli ations. Arte h House, London, 2004.
[17℄ D. Fraser and J. Potter, �The optimum linear smoother as a ombina-
tion of two optimum linear �lters,� IEEE Transa tions on Automati
Control, vol. 14, no. 4, pp. 387�390, 1969.
[18℄ A. H. Jazwinski, Sto hasti pro esses and �ltering theory. Courier
Dover Publi ations, 2007.
[19℄ Á. F. Gar ía-Fernández, L. Svensson, M. Morelande, and S. Särkkä,
�Posterior linearization �lter: Prin iples and implementation using
sigma points,� IEEE Transa tions on Signal Pro essing, vol. 63, no. 20,
2015.
[20℄ S. S. Haykin, S. S. Haykin, and S. S. Haykin, Kalman �ltering and
neural networks. Wiley Online Library, 2001.
[21] E. A. Wan, R. Van Der Merwe, and A. T. Nelson, "Dual estimation and the unscented transformation," in NIPS, 1999, pp. 666–672.
[22] S. Särkkä, "Unscented Rauch–Tung–Striebel smoother," IEEE Transactions on Automatic Control, vol. 53, no. 3, pp. 845–849, 2008.
[23] C. Andrieu, A. Doucet, and R. Holenstein, "Particle Markov chain Monte Carlo methods," Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 72, no. 3, pp. 269–342, 2010.
[24] G. Kitagawa, "Monte Carlo filter and smoother for non-Gaussian nonlinear state space models," Journal of Computational and Graphical Statistics, vol. 5, pp. 1–25, 1996.
[25] P. Fearnhead, D. Wyncoll, and J. Tawn, "A sequential smoothing algorithm with linear computational cost," Biometrika, vol. 97, no. 2, pp. 447–464, 2010.
[26] S. J. Godsill, A. Doucet, and M. West, "Monte Carlo smoothing for nonlinear time series," Journal of the American Statistical Association, vol. 99, no. 465, 2004.
[27] S. S. Blackman and R. Popoli, Design and analysis of modern tracking systems. Artech House, 1999, vol. 685.
[28] K. G. Murty, "Letter to the editor: An algorithm for ranking all the assignments in order of increasing cost," Operations Research, vol. 16, no. 3, pp. 682–687, 1968.
[29] I. J. Cox and S. L. Hingorani, "An efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 2, pp. 138–150, 1996.
[30] I. J. Cox and M. L. Miller, "On finding ranked assignments with application to multitarget tracking and motion correspondence," IEEE Transactions on Aerospace and Electronic Systems, vol. 31, no. 1, pp. 486–489, 1995.
[31] D. Salmond, "Mixture reduction algorithms for point and extended object tracking in clutter," IEEE Transactions on Aerospace and Electronic Systems, vol. 45, no. 2, pp. 667–686, 2009.
[32] D. J. Salmond, "Mixture reduction algorithms for uncertain tracking," DTIC Document, Tech. Rep., 1988.
[33] ——, "Mixture reduction algorithms for target tracking," in IEE Colloquium on State Estimation in Aerospace and Tracking Applications. IET, 1989, pp. 7/1.
[34] A. R. Runnalls, "Kullback-Leibler approach to Gaussian mixture reduction," IEEE Transactions on Aerospace and Electronic Systems, vol. 43, no. 3, pp. 989–999, 2007.
[35] J. L. Williams and P. S. Maybeck, "Cost-function-based Gaussian mixture reduction for target tracking," in Proceedings of the 6th International Conference on Information Fusion, vol. 2, 2003, pp. 1047–1054.
[36] D. Crouse, P. Willett, K. Pattipati, and L. Svensson, "A look at Gaussian mixture reduction algorithms," in Proceedings of the 14th International Conference on Information Fusion, 2011, pp. 1–8.
[37] D. J. Salmond, "Mixture reduction algorithms for target tracking in clutter," in Proceedings of SPIE: Signal and Data Processing of Small Targets Conference. International Society for Optics and Photonics, 1990, pp. 434–445.
[38] D. J. Petrucci, "Gaussian mixture reduction for Bayesian target tracking in clutter," Master's thesis, Air Force Institute of Technology, 2005. [Online]. Available: http://www.dtic.mil/dtic/tr/fulltext/u2/a443588.pdf
[39] J. L. Williams, "Gaussian mixture reduction for tracking multiple maneuvering targets in clutter," Master's thesis, Air Force Institute of Technology, 2003.
[40] S. Kullback and R. A. Leibler, "On information and sufficiency," The Annals of Mathematical Statistics, vol. 22, no. 1, pp. 79–86, 1951.
[41] M. F. Huber and U. D. Hanebeck, "Progressive Gaussian mixture reduction," in Proceedings of the 11th International Conference on Information Fusion. IEEE, 2008, pp. 1–8.
[42] D. W. Scott, "Parametric statistical modeling by minimum integrated square error," Technometrics, vol. 43, no. 3, pp. 274–285, 2001.
[43] D. W. Scott and W. F. Szewczyk, "From kernels to mixtures," Technometrics, vol. 43, no. 3, pp. 323–335, 2001.
[44] A. S. Rahmathullah, L. Svensson, and D. Svensson, "Merging-based forward-backward smoothing on Gaussian mixtures," in Proceedings of the 17th International Conference on Information Fusion. IEEE, 2014, pp. 1–8.
[45] ——, "Two-filter Gaussian mixture smoothing with posterior pruning," in Proceedings of the 17th International Conference on Information Fusion. IEEE, 2014, pp. 1–8.
[46] D. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, 1979.
[47] T. Kurien and M. Liggins, "Report-to-target assignment in multisensor multitarget tracking," in Proceedings of the 27th IEEE Conference on Decision and Control. IEEE, 1988, pp. 2484–2488.
[48] O. E. Drummond, "Multiple target tracking with multiple frame, probabilistic data association," in Optical Engineering and Photonics in Aerospace Sensing. International Society for Optics and Photonics, 1993, pp. 394–408.
[49] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Multi-target tracking using joint probabilistic data association," in IEEE Conference on Decision and Control including the Symposium on Adaptive Processes, vol. 19. IEEE, 1980, pp. 807–812.
[50] K.-C. Chang and Y. Bar-Shalom, "Joint probabilistic data association for multitarget tracking with possibly unresolved measurements and maneuvers," IEEE Transactions on Automatic Control, vol. 29, no. 7, pp. 585–594, 1984.
[51] L. Svensson, D. Svensson, M. Guerriero, and P. Willett, "Set JPDA filter for multitarget tracking," IEEE Transactions on Signal Processing, vol. 59, no. 10, pp. 4677–4691, 2011.
[52] L. Y. Pao, "Multisensor multitarget mixture reduction algorithms for tracking," Journal of Guidance, Control, and Dynamics, vol. 17, no. 6, pp. 1205–1211, 1994.
[53] Y. Bar-Shalom, F. Daum, and J. Huang, "The probabilistic data association filter," Control Systems Magazine, vol. 29, no. 6, pp. 82–100, 2009.
[54] J. L. Williams and R. A. Lau, "Data association by loopy belief propagation," in Proceedings of the 13th International Conference on Information Fusion, 2010.
[55] J. Williams and R. Lau, "Approximate evaluation of marginal association probabilities with belief propagation," IEEE Transactions on Aerospace and Electronic Systems, vol. 50, no. 4, pp. 2942–2959, 2014.
[56] R. L. Streit and T. E. Luginbuhl, "Probabilistic multi-hypothesis tracking," DTIC Document, Tech. Rep., 1995.
[57] P. Willett, Y. Ruan, and R. Streit, "PMHT: Problems and some solutions," IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 3, pp. 738–754, 2002.
[58] D. F. Crouse, M. Guerriero, and P. Willett, "A critical look at the PMHT," Journal of Advances in Information Fusion, vol. 4, no. 2, pp. 93–116, 2009.
[59] K. J. Molnar and J. W. Modestino, "Application of the EM algorithm for the multitarget/multisensor tracking problem," IEEE Transactions on Signal Processing, vol. 46, no. 1, pp. 115–129, 1998.
[60] S. Oh, S. Russell, and S. Sastry, "Markov chain Monte Carlo data association for general multiple-target tracking problems," in Proceedings of the IEEE Conference on Decision and Control, 2004.
[61] H. W. Kuhn, "Variants of the Hungarian method for assignment problems," Naval Research Logistics Quarterly, vol. 3, no. 4, pp. 253–258, 1956.
[62] D. P. Bertsekas, "The auction algorithm: A distributed relaxation method for the assignment problem," Annals of Operations Research, vol. 14, no. 1, pp. 105–123, 1988.
[63] R. Jonker and A. Volgenant, "A shortest augmenting path algorithm for dense and sparse linear assignment problems," Computing, vol. 38, no. 4, pp. 325–340, 1987.
[64] S. Deb, M. Yeddanapudi, K. Pattipati, and Y. Bar-Shalom, "A generalized S-D assignment algorithm for multisensor-multitarget state estimation," IEEE Transactions on Aerospace and Electronic Systems, vol. 33, no. 2, pp. 523–538, 1997.
[65] A. B. Poore, A. J. Robertson III, and P. J. Shea, "New class of Lagrangian-relaxation-based algorithms for fast data association in multiple hypothesis tracking applications," in SPIE Symposium on OE/Aerospace Sensing and Dual Use Photonics. International Society for Optics and Photonics, 1995, pp. 184–194.
[66] R. L. Popp, K. R. Pattipati, and Y. Bar-Shalom, "m-best S-D assignment algorithm with application to multitarget tracking," IEEE Transactions on Aerospace and Electronic Systems, vol. 37, no. 1, pp. 22–39, 2001.
[67] I. J. Cox and M. L. Miller, "A comparison of two algorithms for determining ranked assignments with application to multitarget tracking and motion correspondence," IEEE Transactions on Aerospace and Electronic Systems, vol. 33, no. 1, pp. 295–301, 1997.
[68] T. Minka, "Divergence measures and message passing," Microsoft Research, Tech. Rep., 2005.
[69] A. S. Rahmathullah, Á. F. García-Fernández, and L. Svensson, "A batch algorithm for estimating trajectories of point targets using expectation maximization," to be published in IEEE Transactions on Signal Processing, 2016.
[70] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society. Series B (Methodological), pp. 1–38, 1977.
[71] M. J. Beal, "Variational algorithms for approximate Bayesian inference," Ph.D. dissertation, University of London, 2003.
[72] A. S. Rahmathullah, Á. F. García-Fernández, and L. Svensson, "Expectation maximization with hidden state variables for multi-target tracking," to be submitted for publication to IEEE Transactions on Signal Processing, 2016.
[73] W. Rudin, Principles of mathematical analysis. McGraw-Hill New York, 1964, vol. 3.
[74] S. T. Rachev, L. Klebanov, S. V. Stoyanov, and F. Fabozzi, The methods of distances in the theory of probability and statistics. Springer Science & Business Media, 2013.
[75] R. P. Mahler, Advances in statistical multisource-multitarget information fusion. Artech House, 2014.
[76] J. R. Hoffman and R. P. Mahler, "Multitarget miss distance via optimal assignment," IEEE Transactions on Systems, Man and Cybernetics - Part A: Systems and Humans, vol. 34, no. 3, pp. 327–336, 2004.
[77] D. Schuhmacher, B.-T. Vo, and B.-N. Vo, "A consistent metric for performance evaluation of multi-object filters," IEEE Transactions on Signal Processing, vol. 56, no. 8, pp. 3447–3457, 2008.
[78] B. E. Fridling and O. E. Drummond, "Performance evaluation methods for multiple-target-tracking algorithms," in Orlando '91, Orlando, FL. International Society for Optics and Photonics, 1991, pp. 371–383.
[79] O. E. Drummond and B. E. Fridling, "Ambiguities in evaluating performance of multiple target tracking algorithms," in Aerospace Sensing. International Society for Optics and Photonics, 1992, pp. 326–337.
[80] O. Fujita, "Metrics based on average distance between sets," Japan Journal of Industrial and Applied Mathematics, vol. 30, no. 1, pp. 1–19, 2013.
[81] A. Gardner, J. Kanno, C. Duncan, and R. Selmic, "Measuring distance between unordered sets of different sizes," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 137–143.
[82] X. He, R. Tharmarasa, T. Kirubarajan, and T. Thayaparan, "A track quality based metric for evaluating performance of multitarget filters," IEEE Transactions on Aerospace and Electronic Systems, vol. 49, no. 1, pp. 610–616, 2013.
[83] S. Nagappa, D. E. Clark, and R. Mahler, "Incorporating track uncertainty into the OSPA metric," in Proceedings of the 14th International Conference on Information Fusion. IEEE, 2011, pp. 1–8.
[84] A. S. Rahmathullah, Á. F. García-Fernández, and L. Svensson, "Generalized optimal sub-pattern assignment metric," submitted for publication to IEEE Transactions on Signal Processing, 2016. [Online]. Available: http://arxiv.org/pdf/1601.05585v1.pdf
[85] L. Svensson and M. Morelande, "Target tracking based on estimation of sets of trajectories," in Proceedings of the 17th International Conference on Information Fusion. IEEE, 2014.
[86] B. Ristic, B.-N. Vo, and D. Clark, "Performance evaluation of multi-target tracking using the OSPA metric," in Proceedings of the 13th International Conference on Information Fusion, 2010.
[87] B. Ristic, B.-N. Vo, D. Clark, and B.-T. Vo, "A metric for performance evaluation of multi-target tracking algorithms," IEEE Transactions on Signal Processing, vol. 59, no. 7, pp. 3452–3457, 2011.
[88] A. S. Rahmathullah, Á. F. García-Fernández, and L. Svensson, "A metric on the space of finite sets of trajectories for evaluation of multi-target tracking algorithms," submitted for publication to IEEE Transactions on Signal Processing, 2016.
[89] T. Vu and R. Evans, "A new performance metric for multiple target tracking based on optimal subpattern assignment," in Proceedings of the 17th International Conference on Information Fusion. IEEE, 2014.
[90] J. Bento, "A metric for sets of trajectories that is practical and mathematically consistent," arXiv preprint arXiv:1601.03094, 2016.
[91] A. S. Rahmathullah, L. Svensson, D. Svensson, and P. Willett, "Smoothed probabilistic data association filter," in Proceedings of the 16th International Conference on Information Fusion. IEEE, 2013, pp. 1296–1303.