ESE 524 Detection and Estimation Theory
Joseph A. O'Sullivan, Samuel C. Sachs Professor
Electronic Systems and Signals Research Laboratory
Electrical and Systems Engineering, Washington University
211 Urbauer Hall, 314-935-4173 (Lynda answers)
jao@wustl.edu
J. A. O'S. ESE 524, Lecture 2, 01/15/09
Detection Theory
- Which model best fits the measured data?
- Decision or detection or hypothesis testing
- Principled approach:
  - Define the concept of a model
  - Define the concept of a decision
  - Define an objective function to be minimized
  - Given these, derive an optimal decision
  - Quantify the performance (as a function of parameters)
Example: Throwing Darts
- Suppose that when people throw darts at a blackboard, there is a distribution of the locations of the darts.
- The first individual is very good and his darts land in the vicinity of the target, with a circularly symmetric distribution.
- The second individual has a systematic error that he does not seem to be able to correct. His darts land in a circularly symmetric distribution offset from the truth.
- The third individual has no systematic error, but his darts, while still landing according to a circularly symmetric distribution around the truth, are more widespread than the first individual's.
- Problem 1: Individual 1 or 2
- Problem 2: Individual 1 or 3
- Other problems
Example: Throwing Darts (continued)
- These problems are still not well posed. Need additional assumptions such as:
  - A1: The distributions are two-dimensional Gaussian distributions around the truth with known covariance matrix.
  - A2: All throws are independent of all other throws.
  - A3: The covariance matrix is diagonal with equal variance in each direction.
[Scatter plots of simulated dart locations for the three individuals omitted.]
Matlab Code for Plots
- Can decide among hypotheses if more "throws" (data) are observed

function errf=dartslecture2
errf=0;
x=randn(2,100);
y=2*ones(2,100)+randn(2,100);
z=3*randn(2,100);
figure
plot(x(1,:),x(2,:),'bx')
hold on
plot(y(1,:),y(2,:),'ro')
plot(z(1,:),z(2,:),'gv')
axis equal;
figure
plot(x(1,:),x(2,:),'bx')
hold on
plot(2+y(1,:),2+y(2,:),'ro')
axis equal;
errf=1;
Probability Density Functions: Matlab Mesh Plots
[Mesh plots of the three densities omitted.]
Matlab Code for pdf Computation

function errf=dartslecture2pdf
errf=0;
x=-5:.1:5;
[x,y]=meshgrid(x);
p1=exp(-(x.^2+y.^2)/2)/(2*pi);
p2=exp(-((x-2).^2+(y-2).^2)/2)/(2*pi);
p3=exp(-(x.^2+y.^2)/18)/(18*pi);
figure
mesh(x,y,p1)
figure
mesh(x,y,p2)
figure
mesh(x,y,p3)
figure
mesh(x,y,max(max(p1,p2),p3))
errf=1;
Example: Throwing (One-Dim) Darts
- Suppose that one of two possible known signals is transmitted during the interval [0,T]: s(t) or -s(t)
- The signal is measured in white Gaussian noise:
  r(t) = a\sqrt{E}\, s(t) + w(t), \quad 0 \le t \le T, \quad a \in \{+1, -1\}
- The receiver first computes the integral of r(t) times s(t) (unit energy) over the interval to get
  r = a\sqrt{E} + n, \qquad n \sim N(0, N_0/2)
- Problem: a = +1 or a = -1
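The antipodal problem above can be simulated directly. The following Python sketch (the energy E, noise level N0, and trial count are illustrative assumptions, not values from the slides) applies the natural sign detector to the correlator output and compares the measured error rate to the closed-form Gaussian tail probability:

```python
import math
import random

# Assumed illustration values: signal energy E and noise density N0,
# so that n ~ N(0, N0/2) as on the slide.
E = 1.0
N0 = 0.5
trials = 200_000

random.seed(0)
errors = 0
for _ in range(trials):
    a = random.choice([+1, -1])                          # equally likely hypotheses
    r = a * math.sqrt(E) + random.gauss(0.0, math.sqrt(N0 / 2))
    a_hat = +1 if r >= 0 else -1                         # sign detector
    errors += (a_hat != a)

p_err = errors / trials
# Closed form: Q(sqrt(2E/N0)) = 0.5 * erfc(sqrt(E/N0))
p_theory = 0.5 * math.erfc(math.sqrt(E / N0))
print(p_err, p_theory)
```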
Example: Throwing 2D Darts
- In QAM (quadrature amplitude modulation), one of four signals is sent over [0,T]:
  s(t) \in \left\{ \sqrt{\tfrac{2}{T}}\cos\omega_c t,\; -\sqrt{\tfrac{2}{T}}\cos\omega_c t,\; \sqrt{\tfrac{2}{T}}\sin\omega_c t,\; -\sqrt{\tfrac{2}{T}}\sin\omega_c t \right\}
- Assume measurements are in white Gaussian noise:
  r(t) = \sqrt{E}\, s(t) + w(t), \quad 0 \le t \le T
- Assuming the cosine and sine are orthogonal over the interval, the measurements are integrated against each to get two measurements:
  r_I = \int_0^T r(t)\sqrt{\tfrac{2}{T}}\cos(\omega_c t)\,dt = \sqrt{E}\int_0^T s(t)\sqrt{\tfrac{2}{T}}\cos(\omega_c t)\,dt + n_I
  r_Q = \int_0^T r(t)\sqrt{\tfrac{2}{T}}\sin(\omega_c t)\,dt = \sqrt{E}\int_0^T s(t)\sqrt{\tfrac{2}{T}}\sin(\omega_c t)\,dt + n_Q
  n_I \sim N(0, N_0/2), \quad n_Q \sim N(0, N_0/2), \quad n_I \perp n_Q
  \begin{bmatrix} r_I \\ r_Q \end{bmatrix} \in \sqrt{E}\left\{ \begin{bmatrix}1\\0\end{bmatrix}, \begin{bmatrix}-1\\0\end{bmatrix}, \begin{bmatrix}0\\1\end{bmatrix}, \begin{bmatrix}0\\-1\end{bmatrix} \right\} + \begin{bmatrix} n_I \\ n_Q \end{bmatrix}
- Problem: which of four signals was sent?
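As a sketch of the reduced two-dimensional problem, the Python snippet below (all numeric values are assumptions chosen for illustration) simulates the two correlator outputs and applies a nearest-mean decision, which is the natural rule for equal priors and equal-variance Gaussian noise:

```python
import math
import random

# Assumed values: after correlation the problem is a 2-D mean vector in
# sqrt(E)*{(1,0),(-1,0),(0,1),(0,-1)} plus independent N(0, N0/2) noise.
E, N0 = 4.0, 0.5
means = [(math.sqrt(E), 0.0), (-math.sqrt(E), 0.0),
         (0.0, math.sqrt(E)), (0.0, -math.sqrt(E))]
sigma = math.sqrt(N0 / 2)

random.seed(1)
trials = 100_000
correct = 0
for _ in range(trials):
    k = random.randrange(4)                          # transmitted signal index
    rI = means[k][0] + random.gauss(0.0, sigma)      # in-phase correlator output
    rQ = means[k][1] + random.gauss(0.0, sigma)      # quadrature correlator output
    # Equal priors + white Gaussian noise: decide the nearest mean vector
    k_hat = min(range(4),
                key=lambda j: (rI - means[j][0])**2 + (rQ - means[j][1])**2)
    correct += (k_hat == k)

acc = correct / trials
print(acc)
```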
Finite-Dimensional Binary Detection Theory
- Problem: Given a measurement of a random vector r, decide which of two models it is drawn from.
  [Block diagram: Source --(H0 or H1)--> System --(r)--> Decision --> H0 or H1]
- Source produces H0 or H1 with probabilities P0 and P1 = 1 - P0. Probabilities may not be known.
- Random measurement vector results. Transition probabilities (system) are often assumed known.
- Decision rule is a partition of the measurement space into two sets Z0 and Z1 corresponding to decisions H0 and H1.
Outcomes, Bayes Criterion
- There are four possible outcomes:
  - Decide H0 and H1 is true: P1(1 - PD)
  - Decide H1 and H1 is true: P1 PD
  - Decide H0 and H0 is true: P0(1 - PF)
  - Decide H1 and H0 is true: P0 PF
- Probability of false alarm PF: decide H1 given H0 is true
- Probability of detection PD: decide H1 given H1 is true
- Probability of miss PM = 1 - PD: decide H0 given H1 is true
- Error probabilities equal integrals of conditional (or transition) probability density functions over decision regions
Bayes Criterion
- Assumptions:
  - Source prior probabilities P0 and P1 are known
  - Transition densities are known
  - There are costs of decision outcomes and these are known: Cij is the cost of deciding Hi when Hj is true
    - C01 - C11 > 0 is the relative cost of a miss
    - C10 - C00 > 0 is the relative cost of a false alarm
- Bayes decision criterion: pick the decision rule to minimize the average risk (cost)
  \mathcal{R} = C_{00} P_0 (1 - P_F) + C_{01} P_1 (1 - P_D) + C_{10} P_0 P_F + C_{11} P_1 P_D
     = (C_{01} - C_{11}) P_1 (1 - P_D) + (C_{10} - C_{00}) P_0 P_F + C_{00} P_0 + C_{11} P_1
     = C_M P_1 P_M + C_F P_0 P_F + C_{00} P_0 + C_{11} P_1
  (simplified notation: C_M = C_{01} - C_{11}, C_F = C_{10} - C_{00})
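The algebra behind the simplified notation can be checked numerically. This short Python snippet (cost and probability values are assumed, chosen only for the check) evaluates both forms of the risk:

```python
# Assumed illustration values for costs and probabilities.
C00, C01, C10, C11 = 0.0, 5.0, 2.0, 1.0
P0, PF, PD = 0.6, 0.2, 0.8
P1, PM = 1 - P0, 1 - PD

# Four-term Bayes risk, one term per decision outcome
risk_terms = C00*P0*(1-PF) + C01*P1*(1-PD) + C10*P0*PF + C11*P1*PD
# Simplified form with C_M = C01-C11 and C_F = C10-C00
risk_simpl = (C01-C11)*P1*PM + (C10-C00)*P0*PF + C00*P0 + C11*P1
print(risk_terms, risk_simpl)
```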
Results
- Theorem: The likelihood ratio test is optimal.
- Corollary: The log-likelihood ratio test is optimal.
- For proofs, see notes and text.
- Example: deterministic signal in Gaussian noise.
Decision Rules

  Criterion                            | Transition densities | Priors P_0 & P_1 | Costs C_ij
  Bayes: minimize expected risk (cost) | Yes                  | Yes              | Yes
  Minimum probability of error         | Yes                  | Yes              | No
  Minimax                              | Yes                  | No               | Yes
  Neyman-Pearson                       | Yes                  | No               | No

- The entire space is divided into two regions: decide H_0 or H_1.
- The integrals of the probability density functions determine the probabilities of error; integrate over the "other" or "wrong" region:
  P_M = \int_{Z_0} p_{r|H_1}(R|H_1)\,dR, \qquad P_F = \int_{Z_1} p_{r|H_0}(R|H_0)\,dR
  \mathcal{R} = C_M P_1 P_M + C_F P_0 P_F
J. A. O'S. ESE 524, Lecture 4, 01/25/07
Bayes Criterion
- Define the two decision regions. Minimize the Bayes risk over the decision regions.
- Every point in the space must be included in one integral or the other.
- Put a point in a region if it contributes less to that integral than to the other.
  \mathcal{R} = C_M P_1 P_M + C_F P_0 P_F
     = C_M P_1 \int_{Z_0} p_{r|H_1}(R|H_1)\,dR + C_F P_0 \int_{Z_1} p_{r|H_0}(R|H_0)\,dR
- Introduce new notation for the result of the pointwise comparison of C_M P_1 p_{r|H_1}(R|H_1) and C_F P_0 p_{r|H_0}(R|H_0).
Bayes Criterion
- Compare C_M P_1 p_{r|H_1}(R|H_1) and C_F P_0 p_{r|H_0}(R|H_0).
- Introduce new notation for the result of the comparison:
  - If the left side is bigger, decide H_1.
  - If the right side is bigger, decide H_0.
  - For continuous pdfs, do not worry about equality.
  C_M P_1 p_{r|H_1}(R|H_1) \overset{H_1}{\underset{H_0}{\gtrless}} C_F P_0 p_{r|H_0}(R|H_0)
Bayes Criterion: Derivation of Likelihood Ratio Test
- Compare C_M P_1 p_{r|H_1}(R|H_1) and C_F P_0 p_{r|H_0}(R|H_0).
- There are several equivalent ways to compare these values, including the likelihood ratio and the log-likelihood ratio:
  C_M P_1 p_{r|H_1}(R|H_1) \overset{H_1}{\underset{H_0}{\gtrless}} C_F P_0 p_{r|H_0}(R|H_0)
  \Lambda(R) = \frac{p_{r|H_1}(R|H_1)}{p_{r|H_0}(R|H_0)} \overset{H_1}{\underset{H_0}{\gtrless}} \frac{C_F P_0}{C_M P_1} = \eta
  l(R) = \ln \frac{p_{r|H_1}(R|H_1)}{p_{r|H_0}(R|H_0)} \overset{H_1}{\underset{H_0}{\gtrless}} \ln \frac{C_F P_0}{C_M P_1} = \gamma
Bayes Criterion: Computation of Error Probabilities
- The likelihood ratio is a nonnegative-valued function of the observed data.
- Evaluated at a random realization of the data, the likelihood ratio is a random variable.
- Comparing the likelihood ratio to a threshold is the optimal test.
- The probabilities of error can be computed in terms of the probability density functions of either the likelihood ratio or the log-likelihood ratio.
Optimal Decision Rules
- Likelihood ratio test:
  \Lambda(R) = \frac{p_{r|H_1}(R|H_1)}{p_{r|H_0}(R|H_0)} \overset{H_1}{\underset{H_0}{\gtrless}} \eta, \qquad l(R) = \ln \Lambda(R) \overset{H_1}{\underset{H_0}{\gtrless}} \ln \eta = \gamma
- Threshold for Bayes rule:
  \eta = \frac{C_F P_0}{C_M P_1} = \frac{(C_{10} - C_{00}) P_0}{(C_{01} - C_{11}) P_1}
- Computations of the error probabilities:
  P_M = \int_{-\infty}^{\gamma} p_{l|H_1}(L|H_1)\,dL, \qquad P_F = \int_{\gamma}^{\infty} p_{l|H_0}(L|H_0)\,dL
  P_M = \int_{0}^{\eta} p_{\Lambda|H_1}(\Lambda|H_1)\,d\Lambda, \qquad P_F = \int_{\eta}^{\infty} p_{\Lambda|H_0}(\Lambda|H_0)\,d\Lambda
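As a sketch of computing error probabilities from the density of the log-likelihood ratio, assume a scalar Gaussian setting (H1: r ~ N(d,1), H0: r ~ N(0,1); the values of d and the threshold are illustrative, not from the slides). Then l(r) = d r - d^2/2 is itself Gaussian under each hypothesis, and the integrals above have closed forms:

```python
import math

def Q(x):                          # Gaussian tail probability
    return 0.5 * math.erfc(x / math.sqrt(2))

# Assumed values: separation d and log-threshold gamma
d = 1.5
gamma = 0.2

# l|H0 ~ N(-d^2/2, d^2) and l|H1 ~ N(+d^2/2, d^2), so
# P_F = P(l > gamma | H0), P_M = P(l < gamma | H1):
PF = Q((gamma + d * d / 2) / d)
PM = 1 - Q((gamma - d * d / 2) / d)

# Cross-check directly in r: l > gamma  <=>  r > gamma/d + d/2
r_thresh = gamma / d + d / 2
PF_direct = Q(r_thresh)            # r ~ N(0,1) under H0
PM_direct = 1 - Q(r_thresh - d)    # r ~ N(d,1) under H1
print(PF, PF_direct, PM, PM_direct)
```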
Minimum Probability of Error
- For minimum probability of error, the costs are C_M = C_F = 1 and the risk reduces to the probability of error:
  P_e = P_1 P_M + P_0 P_F
- Threshold:
  \eta = \frac{P_0}{P_1}
- Log-likelihood ratio test: compare l(R) to \gamma = \ln(P_0/P_1).
Minimum Probability of Error
- Alternative view of the decision rule: compute posterior probabilities on the hypotheses and select the most likely hypothesis.
  p_{r|H_1}(R|H_1) P_1 \overset{H_1}{\underset{H_0}{\gtrless}} p_{r|H_0}(R|H_0) P_0
- Dividing both sides by any g(R) > 0 does not change the decision:
  \frac{p_{r|H_1}(R|H_1) P_1}{g(R)} \overset{H_1}{\underset{H_0}{\gtrless}} \frac{p_{r|H_0}(R|H_0) P_0}{g(R)}
- In particular, dividing by
  p_r(R) = p_{r|H_1}(R|H_1) P_1 + p_{r|H_0}(R|H_0) P_0
  gives a comparison of posterior probabilities:
  P(H_1|R) \overset{H_1}{\underset{H_0}{\gtrless}} P(H_0|R)
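The posterior view can be made concrete with a small Python sketch (the two Gaussian models and the priors are assumed toy values): comparing the weighted densities p(R|Hi)Pi gives the same decision as comparing posteriors, since both are divided by the same p(R):

```python
import math

# Assumed toy models and priors
def p0(r): return math.exp(-r**2 / 2) / math.sqrt(2 * math.pi)          # H0: N(0,1)
def p1(r): return math.exp(-(r - 2)**2 / 2) / math.sqrt(2 * math.pi)    # H1: N(2,1)
P0, P1 = 0.7, 0.3

for r in [-1.0, 0.5, 1.0, 1.5, 3.0]:
    joint1, joint0 = p1(r) * P1, p0(r) * P0
    pr = joint1 + joint0                     # p(R), the normalizer
    post1, post0 = joint1 / pr, joint0 / pr  # posteriors, sum to 1
    # Weighted-density comparison and posterior comparison must agree
    assert (joint1 > joint0) == (post1 > post0)
    print(f"r={r:+.1f}  P(H1|r)={post1:.3f}  decide", "H1" if post1 > post0 else "H0")
```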
Gaussian Example
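The Gaussian example appears on the slide only as a figure. The following hedged Python sketch works a plausible two-dimensional version tied to darts problem 1 (the mean vectors, identity covariance, and zero threshold are assumptions for illustration): the log-likelihood ratio is linear in the observation, and its error probabilities have closed forms to check against:

```python
import math
import random

# Assumed models: H0: N((0,0), I), H1: N(m1, I) with m1 = (2,2)
m1 = (2.0, 2.0)

def llr(x, y):
    # Log-likelihood ratio for equal identity covariances:
    # l(x) = m1 . x - |m1|^2 / 2
    return m1[0] * x + m1[1] * y - (m1[0]**2 + m1[1]**2) / 2

gamma = 0.0      # threshold for equal priors and 0/1 costs (assumed)
Q = lambda z: 0.5 * math.erfc(z / math.sqrt(2))   # Gaussian tail

random.seed(3)
trials = 100_000
pf = sum(llr(random.gauss(0, 1), random.gauss(0, 1)) > gamma
         for _ in range(trials)) / trials
pd = sum(llr(m1[0] + random.gauss(0, 1), m1[1] + random.gauss(0, 1)) > gamma
         for _ in range(trials)) / trials

# Closed form: l|H0 ~ N(-|m1|^2/2, |m1|^2), so PF = Q(|m1|/2), PD = Q(-|m1|/2)
d = math.hypot(*m1)
pf_theory, pd_theory = Q(d / 2), Q(-d / 2)
print(pf, pf_theory, pd, pd_theory)
```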
Minimax Decision Rule
- Find the decision rule that minimizes the maximum Bayes risk over all possible priors:
  \min_{Z_1} \max_{P_1} \mathcal{R}

  Criterion                            | Transition densities | Priors P_0 & P_1 | Costs C_ij
  Bayes: minimize expected risk (cost) | Yes                  | Yes              | Yes
  Minimum probability of error         | Yes                  | Yes              | No
  Minimax                              | Yes                  | No               | Yes
  Neyman-Pearson                       | Yes                  | No               | No
Minimax Decision Rule: Analysis
- For any fixed decision rule, the risk is linear in P_1.
- The maximum over P_1 is achieved at an end point.
- To make that end point as low as possible, the risk should be constant with respect to P_1.
- To minimize that constant value, the risk should achieve the minimum risk at some P_1^*.
- At that value of the prior, the best decision rule is a likelihood ratio test.
  \mathcal{R} = C_M P_1 P_M + C_F P_0 P_F
  P_M = \int_{Z_0} p_{r|H_1}(R|H_1)\,dR, \qquad P_F = \int_{Z_1} p_{r|H_0}(R|H_0)\,dR
Minimax Decision Rule: Analysis
- The Bayes risk is concave in the prior (it always lies below its tangent lines).
- The minimax point is achieved at an end point or at an interior point on the Bayes risk curve where the tangent is horizontal (slope zero).
Minimax Decision Rule: Example
  H_0: \; p_{x|H_0}(X|H_0) = 5e^{-5X}, \quad X \ge 0
  H_1: \; p_{x|H_1}(X|H_1) = e^{-X}, \quad X \ge 0
  l(X) = 4X - \ln 5
  P_F = \int_{\gamma'}^{\infty} 5e^{-5X}\,dX = e^{-5\gamma'}, \qquad P_M = \int_0^{\gamma'} e^{-X}\,dX = 1 - e^{-\gamma'}
  P_D = 1 - P_M = e^{-\gamma'} = P_F^{0.2}
  Minimax condition: C_M P_M = C_F P_F, i.e. C_M (1 - P_D) = C_F P_F
[ROC plot (P_D versus P_F) and risk-versus-P_1 plot omitted.]

Matlab Code
pf=0:0.01:1;
pd=pf.^0.2;
eta=0.2*(pf.^(-0.8)); % eta=dP_D/dP_F
figure
plot(pf,pd);xlabel('P_F');ylabel('P_D')
cm=10;cf=100;
p1star=1./(1+cm*eta/cf);
riskoptimal=cm*(1-pd).*p1star+cf*pf.*(1-p1star);
figure
plot(p1star,riskoptimal,'b'), hold on
p1=0:0.01:1;
r1=cm*(1-pd(10))*p1+cf*pf(10)*(1-p1);
plot(p1,r1,'r'), hold on
r2=cm*(1-pd(20))*p1+cf*pf(20)*(1-p1);
plot(p1,r2,'g'), hold on
r3=cm*(1-pd(30))*p1+cf*pf(30)*(1-p1);
plot(p1,r3,'c')
xlabel('P_1');ylabel('Risk')
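The minimax operating point for this example can also be found directly by searching the ROC for where C_M P_M = C_F P_F. This Python snippet is a re-rendering of the example (using the slide's ROC P_D = P_F^0.2 and costs cm=10, cf=100; the search grid is an implementation choice), not the original code:

```python
# Costs from the example: cm for a miss, cf for a false alarm
cm, cf = 10.0, 100.0

# Search the ROC PD = PF**0.2 for the point where cm*PM = cf*PF,
# i.e. cm*(1 - PF**0.2) = cf*PF  (the minimax equalizer condition)
best = min((abs(cm * (1 - pf**0.2) - cf * pf), pf)
           for pf in [i / 100000 for i in range(1, 100001)])
pf_star = best[1]
pd_star = pf_star**0.2
print(pf_star, pd_star, cm * (1 - pd_star), cf * pf_star)
```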
Neyman-Pearson Decision Rule
- Minimize P_M subject to P_F \le \alpha
- Variational approach; the upper bound is usually achieved
- Likelihood ratio test; threshold?
  F = P_M + \eta (P_F - \alpha) = \int_{Z_0} p_{r|H_1}(R|H_1)\,dR + \eta \left( \int_{Z_1} p_{r|H_0}(R|H_0)\,dR - \alpha \right)

  Criterion                            | Transition densities | Priors P_0 & P_1 | Costs C_ij
  Bayes: minimize expected risk (cost) | Yes                  | Yes              | Yes
  Minimum probability of error         | Yes                  | Yes              | No
  Minimax                              | Yes                  | No               | Yes
  Neyman-Pearson                       | Yes                  | No               | No
Neyman-Pearson Decision Rule
- Minimize P_M subject to P_F \le \alpha
- Variational approach; the upper bound is usually achieved
- Likelihood ratio test; threshold?
  F = P_M + \eta (P_F - \alpha) = \int_{Z_0} p_{r|H_1}(R|H_1)\,dR + \eta \left( \int_{Z_1} p_{r|H_0}(R|H_0)\,dR - \alpha \right)
- Plot the ROC: P_D versus P_F for the family of likelihood ratio tests.
- Draw a vertical line where P_F = \alpha and find the corresponding P_D.
- At that point, the threshold equals the derivative of the ROC:
  \eta = \frac{dP_D}{dP_F}
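The Neyman-Pearson recipe can be sketched on the one-dimensional exponential example used earlier in these slides (assuming, as reconstructed there, P_F = e^{-5 gamma'} and P_D = P_F^{0.2}; the level alpha below is an assumed value):

```python
import math

# Assumed constraint level
alpha = 0.1

# Invert P_F = exp(-5 * gamma') to hit the constraint with equality
gamma_prime = -math.log(alpha) / 5
PF = math.exp(-5 * gamma_prime)       # equals alpha
PD = math.exp(-gamma_prime)           # equals PF**0.2 on this ROC
eta = 0.2 * PF**(-0.8)                # ROC slope dPD/dPF = threshold at this point
print(gamma_prime, PF, PD, eta)
```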
Summary
- Several decision rules
- The likelihood (and log-likelihood) ratio test is optimal
- The receiver operating characteristic (ROC) plots probability of detection versus probability of false alarm with the threshold as a parameter; it captures all possible optimal performance
- Neyman-Pearson is a point on the ROC (P_F = \alpha)
- Minimax is a point on the ROC (P_F C_F = P_M C_M)
- Probability of error is a point on the ROC (slope \eta = (1 - P_1)/P_1)