Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific...
Giansalvo EXIN Cirrincione
unit #3
parametric methods
non-parametric methods
semi-parametric methods
finite number of training samples
PROBABILITY DENSITY ESTIMATION
• labelled
• unlabelled
Parametric methods: a specific functional form for the density model is assumed. This contains a number of parameters which are then optimized by fitting the model to the training set.
Drawback: the chosen form may not be correct.
Non-parametric methods: no particular functional form is assumed; the form of the density is determined entirely by the data.
Drawback: the number of parameters grows with the size of the training set (TS).
Semi-parametric methods: a very general class of functional forms is allowed, in which the number of adaptive parameters can be increased in a systematic way to build ever more flexible models, but where the total number of parameters in the model can be varied independently of the size of the data set.
parametric methods: maximum likelihood, Bayesian inference, stochastic techniques for on-line learning
Parametric model: normal or Gaussian distribution
number of independent parameters: $d(d+3)/2$
Mahalanobis distance
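The squared Mahalanobis distance $\Delta^2 = (\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})$ appears in the exponent of the normal density. The following is a minimal sketch for the two-dimensional case only; the function names are illustrative and not taken from the slides.

```python
import math

def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance (x - mu)^T cov^{-1} (x - mu) for d = 2."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # The inverse of the 2x2 covariance matrix is written out explicitly.
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

def gaussian_pdf_2d(x, mu, cov):
    """Normal density in d = 2: exp(-Delta^2 / 2) / (2 pi |cov|^(1/2))."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    return math.exp(-0.5 * mahalanobis_sq(x, mu, cov)) / (2.0 * math.pi * math.sqrt(det))

identity = ((1.0, 0.0), (0.0, 1.0))
peak = gaussian_pdf_2d((0.0, 0.0), (0.0, 0.0), identity)
ratio = gaussian_pdf_2d((1.0, 0.0), (0.0, 0.0), identity) / peak
```

On the surface where the Mahalanobis distance equals one, `ratio` is the factor exp(-1/2) by which the density is smaller than at the mean.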
contour of constant probability density (smaller by a factor exp(-1/2))
$\Sigma \mathbf{u}_i = \lambda_i \mathbf{u}_i$
diagonal covariance matrix: $2d$ parameters
The components of x are statistically independent
$\sigma_j = \sigma$ for all $j$: $d+1$ parameters
Some properties:
• any moment can be expressed as a function of $\boldsymbol{\mu}$ and $\Sigma$
• under general assumptions, the mean of M random variables tends to be distributed normally, in the limit as M tends to infinity (central limit theorem). Example: sum of a set of variables drawn independently from the same distribution
• under any non-singular linear transformation of the coordinate system, the pdf is again normal, but with different parameters
• the marginal and conditional densities are normal.
discriminant function
independent normal class-conditional pdf’s
quadratic decision boundary
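A minimal one-dimensional sketch of classification with normal class-conditional pdf's. It assumes the usual discriminant $y_k(x) = \ln p(x \mid C_k) + \ln P(C_k)$, which is not spelled out in the transcript; the function names and the numbers in the example are hypothetical.

```python
import math

def discriminant(x, mu, sigma, prior):
    """y_k(x) = ln p(x|C_k) + ln P(C_k) for a 1-D normal class-conditional pdf."""
    log_pdf = (-0.5 * ((x - mu) / sigma) ** 2
               - math.log(sigma) - 0.5 * math.log(2.0 * math.pi))
    return log_pdf + math.log(prior)

def classify(x, classes):
    """Index of the class with the largest discriminant.

    classes is a list of (mu, sigma, prior) tuples.
    """
    scores = [discriminant(x, mu, sigma, prior) for (mu, sigma, prior) in classes]
    return scores.index(max(scores))

# Same mean, different variances: the decision regions are not one interval each.
unequal = [(0.0, 1.0, 0.5), (0.0, 3.0, 0.5)]
# Same variance, equal priors: a single boundary halfway between the means.
equal = [(0.0, 1.0, 0.5), (4.0, 1.0, 0.5)]
```

With `unequal`, points near 0 go to the narrow class and points far away on either side go to the broad class, which is the one-dimensional trace of a quadratic boundary. With `equal`, the boundary is the single point x = 2.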
independent normal class-conditional pdf’s with $\Sigma_k = \Sigma$
linear decision boundary
P(C1) = P(C2)
P(C1) = P(C2) = P(C3)
template matching
$\Sigma_k = \Sigma = \sigma^2 I$
ML finds the optimum values for the parameters by maximizing a likelihood function derived from the training data.
The training samples are assumed to be drawn independently from the required distribution.
TS joint probability density: $p(\mathcal{X} \mid \boldsymbol{\theta}) = \prod_{n=1}^{N} p(\mathbf{x}^n \mid \boldsymbol{\theta}) \equiv L(\boldsymbol{\theta})$
Likelihood of $\boldsymbol{\theta}$ for the given TS
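error function: $E = -\ln L(\boldsymbol{\theta}) = -\sum_{n=1}^{N} \ln p(\mathbf{x}^n \mid \boldsymbol{\theta})$
Gaussian pdf (homework): the ML estimates are the sample averages
$\hat{\boldsymbol{\mu}} = \frac{1}{N} \sum_{n=1}^{N} \mathbf{x}^n, \qquad \hat{\Sigma} = \frac{1}{N} \sum_{n=1}^{N} (\mathbf{x}^n - \hat{\boldsymbol{\mu}})(\mathbf{x}^n - \hat{\boldsymbol{\mu}})^T$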
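As a small illustration of the sample-average result in one dimension (a sketch; the function name is illustrative):

```python
def ml_gaussian_1d(samples):
    """ML estimates of the mean and variance of a 1-D normal distribution."""
    n = len(samples)
    mu = sum(samples) / n
    # The ML variance divides by N, not N - 1, so it is a biased estimate.
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var

mu_hat, var_hat = ml_gaussian_1d([1.0, 2.0, 3.0])
```

For the three samples above the estimates are a mean of 2 and a variance of 2/3.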
Uncertainty in the values of the parameters
weighting factor (posterior distribution)
drawn independently from the underlying distribution
A prior which gives rise to a posterior having the same functional form is said to be a conjugate prior (reproducing densities, e.g. Gaussian).
For large numbers of observations, the Bayesian representation of the density approaches the maximum likelihood solution.
Example: normal distribution
Assume $\sigma$ known. Find $\mu$ given the TS (homework).
sample mean
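A sketch of this example under an explicit assumption: the prior on the mean is taken to be normal, with hypothetical parameters `mu0` and `sigma0` that do not appear in the transcript. With that conjugate prior the posterior over the mean is again normal, and its mean is a weighted combination of the prior mean and the sample mean.

```python
def posterior_of_mean(samples, sigma, mu0, sigma0):
    """Posterior N(mu_n, var_n) over the mean of a normal with known sigma.

    Assumes a normal prior N(mu0, sigma0^2) on the mean (an assumption of this sketch).
    """
    n = len(samples)
    xbar = sum(samples) / n                      # sample mean
    mu_n = (n * sigma0 ** 2 * xbar + sigma ** 2 * mu0) / (n * sigma0 ** 2 + sigma ** 2)
    var_n = 1.0 / (n / sigma ** 2 + 1.0 / sigma0 ** 2)
    return mu_n, var_n

few = posterior_of_mean([2.0], sigma=1.0, mu0=0.0, sigma0=1.0)
many = posterior_of_mean([2.0] * 1000, sigma=1.0, mu0=0.0, sigma0=1.0)
```

With one observation the posterior mean sits halfway between prior mean and sample mean; with many observations it approaches the sample mean and the posterior variance shrinks, in line with the large-N statement about the maximum likelihood solution.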
Iterative techniques:
• no storage of a complete TS
• on-line learning in real-time adaptive systems
• tracking of slowly varying systems
From the ML estimate of the mean of a normal distribution
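Separating the contribution of the newest sample from the sample mean gives the recursion $\hat{\mu}_{N+1} = \hat{\mu}_N + \frac{1}{N+1}(x^{N+1} - \hat{\mu}_N)$. A minimal sketch (the generator name is illustrative):

```python
def running_mean(stream):
    """Yield the ML estimate of the mean after each new sample, storing no past data."""
    mu = 0.0
    for n, x in enumerate(stream):
        # n samples have been seen so far; the correction shrinks as 1 / (n + 1).
        mu += (x - mu) / (n + 1)
        yield mu

estimates = list(running_mean([1.0, 3.0, 5.0]))
```

Each yielded value equals the batch mean of the samples seen so far.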
The Robbins-Monro algorithm
Consider a pair of random variables g and $\theta$ which are correlated.
regression function: $f(\theta) \equiv E[g \mid \theta]$
Assume g has finite variance: $E[(g - f)^2 \mid \theta] < \infty$
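$\theta_{N+1} = \theta_N + a_N\, g(\theta_N)$, with positive coefficients $a_N$ such that:
• $\lim_{N \to \infty} a_N = 0$: successive corrections decrease in magnitude, for convergence
• $\sum_{N=1}^{\infty} a_N = \infty$: corrections are sufficiently large that the root is found
• $\sum_{N=1}^{\infty} a_N^2 < \infty$: the accumulated noise has finite variance (noise doesn’t spoil convergence)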
The Robbins-Monro algorithm
The ML parameter estimate can be formulated as a sequential update method using the Robbins-Monro formula.
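A sketch of the Robbins-Monro iteration on a toy problem chosen for this illustration (it is not the example used in the slides): finding the median of a distribution as the root of the regression function $f(\theta) = E[\mathrm{sign}(x - \theta)]$. The coefficient schedule $a_N = c/N$ and all names are assumptions of the sketch.

```python
import random

def robbins_monro(sample_g, theta0, n_steps, c=1.0):
    """Iterate theta <- theta + a_N * g(theta) with a_N = c / N.

    a_N tends to 0, sum a_N diverges and sum a_N^2 is finite.
    """
    theta = theta0
    for n in range(1, n_steps + 1):
        theta += (c / n) * sample_g(theta)
    return theta

rng = random.Random(0)

def noisy_g(theta):
    """Noisy observation g whose expectation is positive below the median."""
    x = rng.gauss(1.5, 1.0)
    return 1.0 if x > theta else -1.0

median_estimate = robbins_monro(noisy_g, theta0=0.0, n_steps=20000, c=2.0)
```

The samples are drawn from a normal distribution with mean 1.5, so the root is the median 1.5, and the estimate settles close to that value although every individual observation of g is only +1 or -1.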
homework
Consider the case where the pdf is taken to be a normal distribution, with known standard deviation $\sigma$ and unknown mean $\mu$. Show that, by choosing $a_N = \sigma^2 / (N+1)$, the one-dimensional iterative version of the ML estimate of the mean is recovered by using the Robbins-Monro formula for sequential ML. Obtain the corresponding formula for the iterative estimate of $\sigma^2$ and repeat the same analysis.
$g = (x - \hat{\mu}) / \sigma^2, \qquad f = (\mu - \hat{\mu}) / \sigma^2$
SUPERVISED LEARNING
histograms
We can choose both the number of bins M and their starting position on the axis. The number of bins (viz. the bin width) acts as a smoothing parameter.
Curse of dimensionality ($M^d$ bins)
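A minimal one-dimensional histogram estimator, written as a sketch; the range `[lo, hi)` and the function name are choices made for this illustration.

```python
def histogram_density(samples, lo, hi, n_bins):
    """Piecewise-constant density estimate on [lo, hi) with n_bins equal bins."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in samples:
        if lo <= x < hi:
            # Clamp guards against rounding pushing a point just below hi out of range.
            counts[min(int((x - lo) / width), n_bins - 1)] += 1
    n = len(samples)
    # Dividing by N * width makes the estimate integrate to one over [lo, hi).
    return [c / (n * width) for c in counts]

density = histogram_density([0.1, 0.2, 0.6, 0.9], lo=0.0, hi=1.0, n_bins=2)
cells_in_10_dims = 10 ** 10   # M = 10 bins per axis in d = 10 dimensions
```

The last line shows why the approach does not scale: ten bins per axis already give ten billion cells in ten dimensions.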
Density estimation in general
The probability that a new vector x, drawn from the unknown pdf p(x), will fall inside some region R of x-space is given by:
$$P = \int_R p(\mathbf{x}')\,d\mathbf{x}'$$
If we have N points drawn independently from p(x), the probability that K of them will fall within R is given by the binomial law:
$$\Pr(K) = \frac{N!}{K!\,(N-K)!}\,P^K (1-P)^{N-K}$$
The distribution is sharply peaked as N tends to infinity, so that $P \simeq K/N$.
Assume p(x) is continuous and varies only slightly over the region R of volume V, so that $P \simeq p(\mathbf{x})\,V$ and hence $p(\mathbf{x}) \simeq K/(NV)$.
Assumption #1: R relatively large, so that P will be large and the binomial distribution will be sharply peaked.
Assumption #2: R small, which justifies the assumption of p(x) nearly constant inside the integration region.
K-nearest-neighbours: K FIXED, V DETERMINED FROM DATA
Kernel-based methods: K DETERMINED FROM DATA, V FIXED
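Kernel-based methods
R is a hypercube of side h centred on x, with volume $V = h^d$.
We can find an expression for K by defining a kernel function H(u), also known as a Parzen window, given by:
$$H(\mathbf{u}) = \begin{cases} 1 & |u_j| < 1/2,\; j = 1, \dots, d \\ 0 & \text{otherwise} \end{cases}$$
$$K = \sum_{n=1}^{N} H\!\left(\frac{\mathbf{x} - \mathbf{x}^n}{h}\right), \qquad \tilde{p}(\mathbf{x}) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{h^d}\, H\!\left(\frac{\mathbf{x} - \mathbf{x}^n}{h}\right)$$
Superposition of N cubes of side h with each cube centred on one of the data points.
interpolation function (ZOH)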
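A one-dimensional sketch of the Parzen estimator with two kernels: the hypercube (zero-order hold) window and, as a smoother alternative, a Gaussian window. Function names and the toy data are illustrative.

```python
import math

def hypercube_kernel(u):
    """H(u) = 1 inside the unit cube centred on the origin, 0 otherwise (d = 1)."""
    return 1.0 if abs(u) < 0.5 else 0.0

def gaussian_kernel(u):
    """Standard normal window, giving a smoother estimate."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def parzen_density(x, data, h, kernel=hypercube_kernel):
    """Kernel estimate (1/N) * sum_n (1/h) * kernel((x - x_n) / h) in d = 1."""
    return sum(kernel((x - xn) / h) for xn in data) / (len(data) * h)

data = [0.0, 1.0]
box_estimate = parzen_density(0.25, data, h=1.0)
smooth_estimate = parzen_density(0.5, data, h=1.0, kernel=gaussian_kernel)
```

Note that every call loops over the whole data set: the method needs all of the data points at evaluation time.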
smoother estimate
[Figure: kernel density estimates from 30 samples, ZOH kernel vs. Gaussian kernel]
Kernel-based methods
Expectation over different selections of data points $\mathbf{x}^n$:
The expectation of the estimated density is a convolution of the true pdf with the kernel function and so represents a smoothed version of the pdf.
All of the data points must be stored !
For a finite data set, there is no non-negative estimator which is unbiased for all continuous pdf’s (Rosenblatt, 1956)
K-nearest neighbours
One of the potential problems with the kernel-based approach arises from the use of a fixed width parameter (h) for all of the data points. If h is too large, there may be regions of x-space in which the estimate is oversmoothed. Reducing h may lead to problems in regions of lower density where the model density will become noisy.
The optimum choice of h may be a function of position.
Consider a small hypersphere centred at a point x and allow the radius of the sphere to grow until it contains precisely K data points. The estimate of the density is then given by K / NV.
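A one-dimensional sketch of the K / NV estimate, where the "hypersphere" of radius r is the interval of length 2r around x. The function name is illustrative.

```python
def knn_density(x, data, k):
    """K-nearest-neighbour density estimate K / (N V) in one dimension."""
    dists = sorted(abs(x - xn) for xn in data)
    r = dists[k - 1]          # radius that just reaches the K-th nearest point
    volume = 2.0 * r          # in d = 1 the sphere is an interval of length 2r
    # If x coincides with a data point and k = 1, r is zero and this divides by zero.
    return k / (len(data) * volume)

estimate = knn_density(1.5, [0.0, 1.0, 2.0, 3.0], k=2)
```

Here the two nearest points lie at distance 0.5, so V = 1 and the estimate is 2 / (4 * 1) = 0.5.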
The estimate is not a true probability density since its integral over all x-space diverges.
All of the data points must be stored !
Branch-and-bound
K-nearest neighbour classification rule
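The data set contains $N_k$ points in class $C_k$ and N points in total.
Draw a hypersphere around x which encompasses K points irrespective of their class ($K_k$ of them belong to class $C_k$; V is the volume of the hypersphere).
$$p(\mathbf{x} \mid C_k) = \frac{K_k}{N_k V}, \qquad p(\mathbf{x}) = \frac{K}{NV}, \qquad P(C_k) = \frac{N_k}{N}$$
$$P(C_k \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid C_k)\,P(C_k)}{p(\mathbf{x})} = \frac{K_k}{K}$$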
$$P(C_k \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid C_k)\,P(C_k)}{p(\mathbf{x})} = \frac{K_k}{K}$$
Find a hypersphere around x which contains K points and then assign x to the class having the majority inside the hypersphere.
K = 1: nearest-neighbour rule
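The rule can be sketched as a majority vote among the K closest training points. The function name, the use of squared Euclidean distance and the toy data are assumptions of this illustration.

```python
from collections import Counter

def knn_classify(x, points, labels, k):
    """Assign x to the class holding the majority among its k nearest points."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(x, points[i]))
    nearest = sorted(range(len(points)), key=sq_dist)[:k]
    votes = Counter(labels[i] for i in nearest)
    # Ties are resolved in favour of the class encountered first among the neighbours.
    return votes.most_common(1)[0][0]

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["a", "a", "b", "b"]
three_nn = knn_classify((0.2, 0.3), points, labels, k=3)
one_nn = knn_classify((5.5, 5.2), points, labels, k=1)
```

With `k=1` the function reduces to the nearest-neighbour rule.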
Samples that are close in feature space are likely to belong to the same class.
1-NNR
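Measure of the distance between two density functions
Kullback-Leibler distance or asymmetric divergence:
$$L = -\int p(\mathbf{x}) \ln \frac{\tilde{p}(\mathbf{x})}{p(\mathbf{x})}\,d\mathbf{x}$$
where $p$ is the true density and $\tilde{p}$ the model density.
$L \ge 0$, with equality iff the two pdf’s are equal.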
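A sketch of the same quantity for discrete distributions, where the integral becomes a sum. This discretisation and the function name are assumptions of the illustration.

```python
import math

def kl_divergence(p_true, p_model):
    """L = -sum_i p_i ln(q_i / p_i) for discrete distributions p (true) and q (model)."""
    return -sum(pt * math.log(pm / pt)
                for pt, pm in zip(p_true, p_model) if pt > 0.0)

same = kl_divergence([0.5, 0.5], [0.5, 0.5])
forward = kl_divergence([0.5, 0.5], [0.9, 0.1])
backward = kl_divergence([0.9, 0.1], [0.5, 0.5])
```

`forward` and `backward` differ, which is why the measure is called an asymmetric divergence rather than a true distance.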
homework
Techniques not restricted to specific functional forms, where the size of the model only grows with the complexity of the problem being solved, and not simply with the size of the data set.
computationally intensive
MIXTURE MODEL
Training methods based on ML:
• nonlinear optimization
• re-estimation (EM algorithm)
• stochastic sequential estimation
MIXTURE DISTRIBUTION
$$p(\mathbf{x}) = \sum_{j=1}^{M} p(\mathbf{x} \mid j)\,P(j)$$
mixing parameters $P(j)$, with $\sum_{j=1}^{M} P(j) = 1$ and $0 \le P(j) \le 1$: prior probability of the data point having been generated from component j of the mixture
To generate a data point from the pdf, one of the components j is first selected at random with probability P(j) and then a data point is generated from the corresponding component density p(x|j).
It can approximate any CONTINUOUS density to arbitrary accuracy provided the model has a sufficiently large number of components, and provided the parameters of the model are chosen correctly.
incomplete data (no component label)
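The two-stage generation procedure can be sketched as follows for one-dimensional Gaussian components; the function name and the parameter values are hypothetical.

```python
import random

def sample_mixture(priors, means, sigmas, rng):
    """Draw (j, x): pick component j with probability P(j), then x from p(x|j)."""
    j = rng.choices(range(len(priors)), weights=priors)[0]
    return j, rng.gauss(means[j], sigmas[j])

rng = random.Random(0)
draws = [sample_mixture([0.3, 0.7], [-5.0, 5.0], [1.0, 1.0], rng) for _ in range(5000)]
fraction_from_second = sum(1 for j, _ in draws if j == 1) / len(draws)
```

In a real training set only the x values are observed and the label j is discarded, which is what makes the data incomplete.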
posterior probability: $P(j \mid \mathbf{x}) = \dfrac{p(\mathbf{x} \mid j)\,P(j)}{p(\mathbf{x})}$
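spherical Gaussian: $p(\mathbf{x} \mid j) = \dfrac{1}{(2\pi\sigma_j^2)^{d/2}} \exp\!\left(-\dfrac{\|\mathbf{x} - \boldsymbol{\mu}_j\|^2}{2\sigma_j^2}\right)$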
MAXIMUM LIKELIHOOD
Adjustable parameters: $P(j)$, $\boldsymbol{\mu}_j$, $\sigma_j$, $j = 1, \dots, M$
Problems:
• singular solutions (likelihood goes to infinity): one of the Gaussian components collapses onto one of the data points
• local minima
Possible solutions:
• constrain the components to have equal variance
• minimum (underflow) threshold for the variance
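softmax or normalized exponential: $P(j) = \dfrac{\exp(\gamma_j)}{\sum_{k=1}^{M} \exp(\gamma_k)}$, with auxiliary variables $\gamma_j$, so that the mixing parameters stay positive and sum to one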
Expressions for the parameters at a minimum of E
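$$\hat{\boldsymbol{\mu}}_j = \frac{\sum_n P(j \mid \mathbf{x}^n)\,\mathbf{x}^n}{\sum_n P(j \mid \mathbf{x}^n)}$$
Mean of the data vectors weighted by the posterior probabilities that the corresponding data points were generated from that component.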
![Page 61: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/61.jpg)
Variance of the data w.r.t. the mean of that component, again weighted by the posterior probabilities.
![Page 62: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/62.jpg)
Posterior probabilities for that component, averaged over the data set.
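The three descriptions above correspond to the standard stationarity conditions for a mixture of spherical Gaussians. The formulas below are a reconstruction under that assumption (the slide equations themselves are not preserved), with $P(j \mid \mathbf{x}^n)$ denoting the posterior probability of component $j$ given data point $\mathbf{x}^n$, $N$ the number of data points and $d$ the dimension of $\mathbf{x}$:

```latex
\hat{\boldsymbol{\mu}}_j = \frac{\sum_{n} P(j \mid \mathbf{x}^n)\, \mathbf{x}^n}{\sum_{n} P(j \mid \mathbf{x}^n)}
\qquad
\hat{\sigma}_j^2 = \frac{1}{d}\, \frac{\sum_{n} P(j \mid \mathbf{x}^n)\, \lVert \mathbf{x}^n - \hat{\boldsymbol{\mu}}_j \rVert^2}{\sum_{n} P(j \mid \mathbf{x}^n)}
\qquad
\hat{P}(j) = \frac{1}{N} \sum_{n=1}^{N} P(j \mid \mathbf{x}^n)
```

The posteriors on the right-hand sides depend on the very parameters being estimated, so these expressions do not give a closed-form solution.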
![Page 63: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/63.jpg)
These expressions are highly non-linear coupled equations.
![Page 64: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/64.jpg)
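Expectation-maximization (EM) algorithm
The error function decreases at each iteration until a local minimum is found.

$$\boldsymbol{\mu}_j^{new} = \frac{\sum_n P^{old}(j \mid \mathbf{x}^n)\, \mathbf{x}^n}{\sum_n P^{old}(j \mid \mathbf{x}^n)}$$

$$(\sigma_j^{new})^2 = \frac{1}{d}\, \frac{\sum_n P^{old}(j \mid \mathbf{x}^n)\, \lVert \mathbf{x}^n - \boldsymbol{\mu}_j^{new} \rVert^2}{\sum_n P^{old}(j \mid \mathbf{x}^n)}$$

$$P(j)^{new} = \frac{1}{N} \sum_n P^{old}(j \mid \mathbf{x}^n)$$

Here $P^{old}(j \mid \mathbf{x}^n)$ is the posterior probability of component $j$ computed with the old parameter values.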
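The iteration can be sketched in a few lines. This is a minimal illustration, assuming 1-D data ($d = 1$), a hypothetical two-cluster toy data set, and a small variance floor (`var_floor`, an assumed safeguard against collapsing components); it is not the implementation behind the slides.

```python
import math
import random

def gauss1d(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def error(data, priors, mus, variances):
    """E = -sum_n ln p(x^n) for the mixture."""
    return -sum(
        math.log(sum(P * gauss1d(x, m, v) for P, m, v in zip(priors, mus, variances)))
        for x in data
    )

def em_step(data, priors, mus, variances, var_floor=1e-6):
    """One EM cycle: posteriors from the old parameters, then new parameters."""
    M, N = len(priors), len(data)
    # E-step: P_old(j | x^n) from Bayes' theorem
    post = []
    for x in data:
        joint = [P * gauss1d(x, m, v) for P, m, v in zip(priors, mus, variances)]
        px = sum(joint)
        post.append([q / px for q in joint])
    # M-step: posterior-weighted mean, variance and prior for each component
    new_priors, new_mus, new_vars = [], [], []
    for j in range(M):
        resp = sum(post[n][j] for n in range(N))
        mu = sum(post[n][j] * data[n] for n in range(N)) / resp
        var = sum(post[n][j] * (data[n] - mu) ** 2 for n in range(N)) / resp
        new_priors.append(resp / N)
        new_mus.append(mu)
        new_vars.append(max(var, var_floor))  # assumed variance floor
    return new_priors, new_mus, new_vars

random.seed(0)
data = [random.gauss(-2.0, 0.5) for _ in range(100)] + \
       [random.gauss(3.0, 1.0) for _ in range(100)]
params = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])  # initial guess
history = [error(data, *params)]
for _ in range(20):
    params = em_step(data, *params)
    history.append(error(data, *params))
print(history[0], history[-1], params)
```

Recording `history` makes it easy to check that the error does not increase from one cycle to the next, as long as the variance floor is never triggered.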
![Page 65: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/65.jpg)
proof
Given a set of non-negative numbers $\lambda_j$ that sum to one, Jensen's inequality states:

$$\ln\left( \sum_j \lambda_j x_j \right) \ge \sum_j \lambda_j \ln x_j$$
![Page 66: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/66.jpg)
$$E^{new} \le E^{old} + Q$$

Minimizing $Q$ with respect to the new parameters leads to a decrease in the value of $E^{new}$, unless $E^{new}$ is already at a local minimum.
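The step from Jensen's inequality to the bound can be written out as follows. This is a reconstruction of the standard argument, assuming $E = -\sum_n \ln p(\mathbf{x}^n)$ with $p(\mathbf{x}) = \sum_j P(j)\, p(\mathbf{x} \mid j)$, and taking $\lambda_j = P^{old}(j \mid \mathbf{x}^n)$:

```latex
\begin{aligned}
E^{new} - E^{old}
  &= -\sum_n \ln \frac{p^{new}(\mathbf{x}^n)}{p^{old}(\mathbf{x}^n)} \\
  &= -\sum_n \ln \left\{ \sum_j P^{old}(j \mid \mathbf{x}^n)\,
       \frac{P^{new}(j)\, p^{new}(\mathbf{x}^n \mid j)}
            {p^{old}(\mathbf{x}^n)\, P^{old}(j \mid \mathbf{x}^n)} \right\} \\
  &\le -\sum_n \sum_j P^{old}(j \mid \mathbf{x}^n)\,
       \ln \left\{ \frac{P^{new}(j)\, p^{new}(\mathbf{x}^n \mid j)}
                        {p^{old}(\mathbf{x}^n)\, P^{old}(j \mid \mathbf{x}^n)} \right\}
   \;\equiv\; Q
\end{aligned}
```

Only the numerator inside the logarithm depends on the new parameters, so minimizing $Q$ is the same as minimizing $\tilde{Q} = -\sum_n \sum_j P^{old}(j \mid \mathbf{x}^n) \ln \{ P^{new}(j)\, p^{new}(\mathbf{x}^n \mid j) \}$.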
![Page 67: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/67.jpg)
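Gaussian mixture model
Minimize:

$$\tilde{Q} = -\sum_n \sum_j P^{old}(j \mid \mathbf{x}^n) \left\{ \ln P^{new}(j) - d \ln \sigma_j^{new} - \frac{\lVert \mathbf{x}^n - \boldsymbol{\mu}_j^{new} \rVert^2}{2 (\sigma_j^{new})^2} \right\} + \text{const}$$

end proof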
![Page 68: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/68.jpg)
example
EM algorithm
• 1000 data points
• uniform distribution
• seven components
[Figure: contours of constant probability density, after 20 cycles]
![Page 69: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/69.jpg)
$$p(\mathbf{x}) = \sum_{k=1}^{c} p(\mathbf{x} \mid C_k)\, P(C_k)$$
Why expectation-maximization?
Hypothetical complete data set: for each data point $\mathbf{x}^n$, introduce a variable $z^n$, an integer in the range $1, \dots, M$, specifying which component of the mixture generated $\mathbf{x}^n$.
The distribution of the $z^n$ is unknown.
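The idea of a complete data set can be made concrete by generating one. The sketch below assumes a 1-D Gaussian mixture with hypothetical parameters: each point is produced by first drawing the label $z^n$ from the priors and then drawing $x^n$ from the chosen component. In density estimation only the $x^n$ are observed; the labels are the missing part.

```python
import random

def sample_complete_data(n_points, priors, mus, sigmas, seed=0):
    """Draw (x, z) pairs: z is the component label in 1..M, x the data point."""
    rng = random.Random(seed)
    components = list(range(1, len(priors) + 1))  # labels 1..M
    pairs = []
    for _ in range(n_points):
        z = rng.choices(components, weights=priors)[0]  # which component generates x
        x = rng.gauss(mus[z - 1], sigmas[z - 1])        # sample from that component
        pairs.append((x, z))
    return pairs

pairs = sample_complete_data(1000, [0.2, 0.5, 0.3], [-3.0, 0.0, 4.0], [0.5, 1.0, 0.7])
observed = [x for x, _ in pairs]  # what the estimator actually sees
print(pairs[:3])
```

If the labels were available, each component could be fitted separately to its own points; EM replaces the unknown labels with their posterior probabilities.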
![Page 70: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/70.jpg)
First we guess some values for the parameters of the mixture model (the old parameter values). We then use these, together with Bayes' theorem, to find the probability distribution of the $\{z^n\}$, and compute the expectation of $E^{comp}$ w.r.t. this distribution. This is the expectation or E-step of the EM algorithm. The new parameter values are then found by minimizing this expected error w.r.t. the parameters. This is the maximization or M-step of the EM algorithm (minimizing $E$ is equivalent to maximizing the likelihood).
![Page 71: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/71.jpg)
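$P^{old}(z^n \mid \mathbf{x}^n)$ is the probability for $z^n$, given the value of $\mathbf{x}^n$ and the old parameter values. Thus, the expectation of $E^{comp}$ over the complete set of $\{z^n\}$ values is given by:

$$\mathcal{E}[E^{comp}] = \sum_{z^1=1}^{M} \cdots \sum_{z^N=1}^{M} E^{comp} \prod_{n=1}^{N} P^{old}(z^n \mid \mathbf{x}^n)$$

where the product is the probability distribution for the $\{z^n\}$.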
![Page 72: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/72.jpg)
Homework: evaluate the expectation of $E^{comp}$ over the $\{z^n\}$.
![Page 73: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/73.jpg)
The expectation of $E^{comp}$ over the $\{z^n\}$ is equal to $\tilde{Q}$.
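A sketch of how the result follows, assuming the complete-data error has the usual form $E^{comp} = -\sum_n \ln \{ P^{new}(z^n)\, p^{new}(\mathbf{x}^n \mid z^n) \}$ (the slide's own definition is not preserved). The Kronecker delta $\delta_{j z^n}$ is introduced here only as a bookkeeping device:

```latex
\begin{aligned}
E^{comp} &= -\sum_{n=1}^{N} \sum_{j=1}^{M} \delta_{j z^n}
            \ln \left\{ P^{new}(j)\, p^{new}(\mathbf{x}^n \mid j) \right\} \\
\sum_{z^1=1}^{M} \cdots \sum_{z^N=1}^{M} \delta_{j z^n} \prod_{m=1}^{N} P^{old}(z^m \mid \mathbf{x}^m)
         &= P^{old}(j \mid \mathbf{x}^n) \prod_{m \ne n} \sum_{z^m=1}^{M} P^{old}(z^m \mid \mathbf{x}^m)
          = P^{old}(j \mid \mathbf{x}^n) \\
\mathcal{E}[E^{comp}] &= -\sum_{n=1}^{N} \sum_{j=1}^{M} P^{old}(j \mid \mathbf{x}^n)
            \ln \left\{ P^{new}(j)\, p^{new}(\mathbf{x}^n \mid j) \right\} = \tilde{Q}
\end{aligned}
```

The middle line uses the fact that each posterior distribution sums to one over its own $z^m$, so every factor with $m \ne n$ drops out.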
![Page 74: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/74.jpg)
Stochastic estimation of parameters
The exact sequential update requires the storage of all previous data points.
![Page 75: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/75.jpg)
No singular solutions in on-line problems.
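A minimal sketch of a sequential update for one component mean, under stated assumptions: the step size is taken as the new point's posterior divided by a running sum of posteriors, so no earlier data points need to be stored. This is an approximation, because the earlier posteriors were computed with older parameter values. The function and argument names are hypothetical.

```python
def online_mean_update(mu, resp_sum, x, resp):
    """One sequential update of a component mean.

    mu       : current mean estimate of component j
    resp_sum : running sum of posteriors P(j | x^n) seen so far
    x        : new data point
    resp     : posterior P(j | x) of the new point under the current parameters
               (must be positive on the first call, otherwise resp_sum is zero)
    """
    resp_sum = resp_sum + resp
    eta = resp / resp_sum       # step size for this point
    mu = mu + eta * (x - mu)    # move the mean towards the new point
    return mu, resp_sum

# With a single component every posterior equals 1, and the update
# reduces to the running sample mean.
mu, resp_sum = 0.0, 0.0
for x in [1.0, 2.0, 3.0, 4.0]:
    mu, resp_sum = online_mean_update(mu, resp_sum, x, resp=1.0)
print(mu, resp_sum)
```

Only two numbers per component are kept between updates, which is what makes the scheme usable for on-line problems.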
![Page 76: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/76.jpg)
![Page 77: Giansalvo EXIN Cirrincione unit #3 PROBABILITY DENSITY ESTIMATION labelled unlabelled A specific functional form for the density model is assumed. This.](https://reader036.fdocuments.in/reader036/viewer/2022062519/56649c985503460f949542f0/html5/thumbnails/77.jpg)