7/2/07 R.P.W. Duin 1
Non-Euclidean Problems in Human Centered Information Processing
Robert P.W. Duin, Delft University of Technology
(in cooperation with Elżbieta Pękalska)
London, 7 February 2007
r.p.w.duin@tudelft.nl, http://ict.ewi.tudelft.nl/~duin/
Contents
The Problem
Possible (initial) solutions
Further analysis
Conclusion
The Problem
Measuring human relevant information
The nearest neighbors sorted:
Euclidean distances used
Spatial connectivity is lost.
Dependent (connected) measurements are represented independently; the dependency has to be rediscovered from the data.
The Connectivity Problem in the Pixel Representation
Feature space
Training set
Test object
Spatial connectivity is lost
The Connectivity Problem in the Pixel Representation
Reshuffle Pixels
Feature space
Reshuffling pixels will not change the classification
Training set
Test object
Spatial connectivity is lost
The Connectivity Problem in the Pixel Representation
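The reshuffling claim can be checked numerically: one fixed pixel permutation applied to every image is an isometry of the pixel feature space, so all pairwise Euclidean distances, and hence any distance-based classification, stay the same. A minimal sketch; the random "images" and the permutation are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
images = rng.random((10, 64))   # 10 toy 'images', each a 64-pixel vector
perm = rng.permutation(64)      # one fixed reshuffling of the pixels
shuffled = images[:, perm]      # the same permutation applied to every image

def dists(X):
    """All pairwise Euclidean distances between the rows of X."""
    return np.linalg.norm(X[:, None] - X[None, :], axis=2)

# Permuting coordinates preserves distances: the matrices coincide.
print(np.allclose(dists(images), dists(shuffled)))  # True
```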
Interpolation does not yield valid objects
[Figure: images image_1, image_2, image_3 in feature space; interpolating between them leaves the class subspace]
The Connectivity Problem in the Pixel Representation
Another Metric is Needed
Learn from Human Observed Similarities
World
human expert
representation
feature 1
feature 2
sensor
Examples
Image database retrieval
Man-machine interaction
Consumer preferences
Computer vision
Pattern recognition
Machine learning
World
human expert
representation in a non-Euclidean (non-metric) space
feature 1
feature 2
sensor
Deformable Templates
A.K. Jain, D. Zongker, Representation and recognition of handwritten digits using deformable templates, IEEE Trans. PAMI, vol. 19, no. 12, 1997, pp. 1386-1391.
Matching new objects x to various templates y
class(x) = class(argmin_y D(x, y))
Examples of deformed templates
Dissimilarity measure appears to be non-metric
The Problem
Non-Metric Data
Possible Solution
A New Representation
The Pattern Recognition System
features x1, x2 (area, perimeter)
Class A
Class B
Objects
Training Set
Representation
Feature Space
Classifier
Generalization
Test object classified as 'B'
Dissimilarity Based Representation
dissimilarity space with axes d1 and d2
1. Positivity: dij ≥ 0
2. Reflexivity: dii = 0
3. Definiteness: dij = 0 ⇒ objects i and j are identical
4. Symmetry: dij = dji
5. Triangle inequality: dij ≤ dik + dkj
6. Compactness: if objects i and j are very similar, then dij < δ
7. True representation: if dij < δ, then objects i and j are very similar
8. Continuity of d
Metric Dissimilarity - Assumptions
A B
X
Given labeled training set T
Unlabeled object x to be classified
The traditional Nearest Neighbour rule (template matching) just finds label(argmin_i(d_i)) over the training set, without using DT. Can we do any better?
dx = (d1 d2 d3 d4 d5 d6 d7)
DT =
d11 d12 d13 d14 d15 d16 d17
d21 d22 d23 d24 d25 d26 d27
d31 d32 d33 d34 d35 d36 d37
d41 d42 d43 d44 d45 d46 d47
d51 d52 d53 d54 d55 d56 d57
d61 d62 d63 d64 d65 d66 d67
d71 d72 d73 d74 d75 d76 d77
Define dissimilarity measure dij between raw data of objects i and j
Dissimilarity Representation
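The setup above (a dissimilarity matrix DT over the training set, a vector dx for a new object, and the plain nearest-neighbour rule as baseline) can be sketched as follows; the Euclidean measure and the toy objects are stand-ins for whatever raw-data dissimilarity dij is actually defined:

```python
import numpy as np

def dissimilarity_matrix(objects, d):
    """DT: pairwise dissimilarities d_ij between all training objects."""
    n = len(objects)
    return np.array([[d(objects[i], objects[j]) for j in range(n)]
                     for i in range(n)])

def nn_label(x, trainset, labels, d):
    """Nearest-neighbour rule: label(argmin_i d_i), without using DT."""
    dx = np.array([d(x, t) for t in trainset])  # the vector d_x
    return labels[int(np.argmin(dx))]

# Toy stand-in: Euclidean distance on 2D 'raw data'.
d = lambda a, b: float(np.linalg.norm(np.asarray(a) - np.asarray(b)))
T = [[0.0, 0.0], [0.1, 0.2], [3.0, 3.0], [3.2, 2.9]]
y = ['A', 'A', 'B', 'B']
D_T = dissimilarity_matrix(T, d)        # 4 x 4, zero diagonal
print(nn_label([3.1, 3.1], T, y, d))    # 'B'
```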
features
objects
Consider dissimilarities as ’features’
⇒ n objects given by n features ⇒ overtrained
⇒ select ’features’, i.e. representation objects,
by
- regularisation
- systematic selection
- random selection
Approach 1: Dissimilarity Space
DT =
d11 d12 d13 d14 d15 d16 d17
d21 d22 d23 d24 d25 d26 d27
d31 d32 d33 d34 d35 d36 d37
d41 d42 d43 d44 d45 d46 d47
d51 d52 d53 d54 d55 d56 d57
d61 d62 d63 d64 d65 d66 d67
d71 d72 d73 d74 d75 d76 d77
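Approach 1 in code: treat the dissimilarities to a small, randomly selected representation set as 'features' and train an ordinary linear classifier on them. The toy data, the random selection and the least-squares classifier below are illustrative assumptions, not the slide's experiments:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy raw data: two well-separated 2D classes.
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(3.0, 0.5, (20, 2))])
y = np.array([-1.0] * 20 + [1.0] * 20)

# Random selection of 5 representation objects (one of the options above);
# using all n objects as 'features' would overtrain.
R = X[rng.choice(len(X), size=5, replace=False)]

def dis_space(objs, R):
    """Represent each object by its Euclidean dissimilarities to R."""
    return np.linalg.norm(objs[:, None, :] - R[None, :, :], axis=2)

# Least-squares linear classifier in the 5D dissimilarity space.
D = dis_space(X, R)
A = np.hstack([D, np.ones((len(D), 1))])
w = np.linalg.lstsq(A, y, rcond=None)[0]

pred = np.sign(A @ w)
print((pred == np.sign(y)).mean())  # training accuracy
```

Keeping the representation set small is exactly the regularisation-by-selection idea listed above.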
dx = ( d1 d2 d3 d4 d5 d6 d7)
DT =
d11 d12 d13 d14 d15 d16 d17
d21 d22 d23 d24 d25 d26 d27
d31 d32 d33 d34 d35 d36 d37
d41 d42 d43 d44 d45 d46 d47
d51 d52 d53 d54 d55 d56 d57
d61 d62 d63 d64 d65 d66 d67
d71 d72 d73 d74 d75 d76 d77
Dissimilarities: select 3 objects r1, r2, r3 for representation (here providing the dissimilarities d2, d4 and d7); classify in the resulting 3D dissimilarity space R3.
Given labeled training set T; unlabeled object x to be classified.
Approach 1: Dissimilarity Space
Examples of the raw data
Example Dissimilarity Space: NIST Digits 3 and 8
NIST digits: Hamming distances of 2 × 200 digits, shown in the dissimilarity space spanned by d10 and d300
Example of Dissimilarity Space
[Figure: classification error vs. training set size per class; Modified-Hausdorff distances on digit contours; Fisher LD on representation sets of size 20-90 (partly by CNN selection) compared with nearest neighbour results]
Dissimilarity-based classification outperforms the nearest neighbour rule.
Dissimilarity Space Classification ⇔ Nearest Neighbour rule
A B
Training set
Is there a feature space X for which Dist(X,X) = D?
Dissimilarity matrix D → a configuration X in Rk with Euclidean distances D?
Approach 2: Embedding the Dissimilarity Representation
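For a Euclidean D the question has a constructive answer: classical scaling (multidimensional scaling) recovers such an X from D alone. A minimal sketch with a toy distance matrix that embeds exactly; the four planar points are an assumption for the demo:

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical scaling: B = -1/2 J D^(2) J, then keep the k largest
    positive eigenvalues and scale the eigenvectors by their sqrt."""
    m = D.shape[0]
    J = np.eye(m) - np.ones((m, m)) / m
    B = -0.5 * J @ (D ** 2) @ J
    lam, Q = np.linalg.eigh(B)
    order = np.argsort(lam)[::-1][:k]
    return Q[:, order] * np.sqrt(np.maximum(lam[order], 0.0))

# Points on a plane: their Euclidean distance matrix embeds exactly.
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
D = np.linalg.norm(P[:, None] - P[None, :], axis=2)
X = classical_mds(D, k=2)
D2 = np.linalg.norm(X[:, None] - X[None, :], axis=2)
print(np.allclose(D, D2))  # True: distances reproduced (up to rotation)
```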
[Figure: 2D map of 13 New Zealand towns (Alexandra, Balclutha, Blenheim, Christchurch, Dunedin, Franz Josef, Greymouth, Invercargill, Milford, Nelson, Queenstown, Te Anau, Timaru) reconstructed from their 13 × 13 distance matrix]
Euclidean Embedding: Multidimensional Scaling
Given the training set with dissimilarity matrix D (entries dij, dik, dkj for objects i, j, k):
If the dissimilarity matrix cannot be explained from a vector space (e.g. Hausdorff and Hamming distances), or if sometimes dij > dik + dkj (triangle inequality not satisfied), then embedding in a Euclidean space is not possible → pseudo-Euclidean embedding.
(Linear) Euclidean Embedding
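Whether a given dissimilarity matrix satisfies the triangle inequality can be checked by brute force over all triples; a small sketch with two made-up matrices:

```python
import numpy as np

def triangle_violations(D, tol=1e-12):
    """Count triples (i, k, j) with d_ij > d_ik + d_kj."""
    n = D.shape[0]
    v = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if D[i, j] > D[i, k] + D[k, j] + tol:
                    v += 1
    return v

D_metric = np.array([[0., 1., 2.], [1., 0., 1.], [2., 1., 0.]])
D_nonmetric = np.array([[0., 1., 5.], [1., 0., 1.], [5., 1., 0.]])
print(triangle_violations(D_metric))     # 0
print(triangle_violations(D_nonmetric))  # > 0: d13 = 5 > d12 + d23 = 2
```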
Metric non-Euclidean Example
Four objects: A, B and C pairwise at distance 10; D at distance 5.1 from each of A, B and C.
Triangle inequality is satisfied, but Euclidean embedding is impossible.
Examples: Hausdorff, L1-norm
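That this configuration cannot live in any Euclidean space shows up as a negative eigenvalue of the centred inner-product matrix B = -(1/2) J D^(2) J, computed here directly for the four distances above:

```python
import numpy as np

# A, B, C pairwise at distance 10; D at 5.1 from each of them.
D = np.array([[0.0, 10.0, 10.0, 5.1],
              [10.0, 0.0, 10.0, 5.1],
              [10.0, 10.0, 0.0, 5.1],
              [5.1, 5.1, 5.1, 0.0]])

m = D.shape[0]
J = np.eye(m) - np.ones((m, m)) / m
B = -0.5 * J @ (D ** 2) @ J
lam = np.linalg.eigvalsh(B)
print(lam.min() < -1e-9)  # True: a negative eigenvalue -> not Euclidean
```

Geometrically: any point equidistant from the vertices of an equilateral triangle of side 10 must be at least 10/√3 ≈ 5.77 away, so 5.1 is impossible in a Euclidean space even though the triangle inequality holds.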
Non-Metric Distances
Weighted edit-distance for strings (Bunke's Chicken Dataset): distances 14.9, 7.8 and 4.1 between objects 78, 419 and 425, so that D(A,C) > D(A,B) + D(B,C).
Single Linkage Clustering: for clusters A, B, C it may happen that D(A,C) > D(A,B) + D(B,C).
The Fisher Criterion:
J(A,B) = (μA - μB)^2 / (σA^2 + σB^2)
For classes A, B, C on one axis x it may happen that J(A,C) = 0 while J(A,B) is large and J(C,B) is small ≠ J(A,B).
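The Fisher criterion itself is a one-liner; the two toy class samples below are illustrative:

```python
import numpy as np

def fisher_criterion(a, b):
    """J(A,B) = (mu_A - mu_B)^2 / (sigma_A^2 + sigma_B^2)."""
    return (np.mean(a) - np.mean(b)) ** 2 / (np.var(a) + np.var(b))

a = np.array([0.0, 1.0, 2.0])   # mean 1, variance 2/3
b = np.array([4.0, 5.0, 6.0])   # mean 5, variance 2/3
print(fisher_criterion(a, b))   # 16 / (4/3) = 12.0
```

J is symmetric in its arguments, but as a 'distance' between class pairs it can violate the triangle inequality, which is the point of this slide.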
(Pseudo-)Euclidean Embedding
D is a given, imperfect m × m dissimilarity matrix of training objects.
Construct the inner-product matrix: B = -(1/2) J D^(2) J, with J = I - (1/m) 1 1^T
Eigenvalue decomposition: B = Q Λ Q^T
Select k eigenvectors: X = Q_k Λ_k^(1/2) (problem: Λ_k(i,i) < 0 may occur)
Let ℑ_k be a k × k diagonal matrix, ℑ_k(i,i) = sign(Λ_k(i,i)); then X = Q_k |Λ_k|^(1/2) ℑ_k
Λ_k(i,i) < 0 → pseudo-Euclidean
D_z is the n × m dissimilarity matrix between n new objects and the training set.
Their inner-product matrix: B_z = -(1/2) (D_z^(2) J - (1/n) 1 1^T D^(2) J)
The embedded objects: Z = B_z Q_k |Λ_k|^(-1/2) ℑ_k
If D is non-Euclidean, B has p positive and q negative eigenvalues.
Solutions:
• Remove all eigenvectors with small and negative eigenvalues
• Take absolute values of the eigenvalues and proceed
• Construct a pseudo-Euclidean space
Pseudo-Euclidean Embedding
[Figure: eigenvalue spectrum of B, showing a tail of negative eigenvalues]
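The training-side embedding recipe can be sketched directly from the formulas; applied to the metric-but-non-Euclidean four-point example from the earlier slide it yields a signature with q > 0. This is a sketch under those assumptions, not a reference implementation:

```python
import numpy as np

def pe_embed(D, k=None, tol=1e-9):
    """Pseudo-Euclidean embedding of an m x m dissimilarity matrix D:
    B = -1/2 J D^(2) J, B = Q Lam Q^T, X = Q_k |Lam_k|^(1/2).
    Returns X, the kept eigenvalues and the signature (p, q)."""
    m = D.shape[0]
    J = np.eye(m) - np.ones((m, m)) / m
    B = -0.5 * J @ (D ** 2) @ J
    lam, Q = np.linalg.eigh(B)
    keep = np.abs(lam) > tol          # drop (numerically) zero eigenvalues
    lam, Q = lam[keep], Q[:, keep]
    order = np.argsort(-np.abs(lam))  # largest |eigenvalue| first
    lam, Q = lam[order], Q[:, order]
    if k is not None:
        lam, Q = lam[:k], Q[:, :k]
    X = Q * np.sqrt(np.abs(lam))
    return X, lam, (int((lam > 0).sum()), int((lam < 0).sum()))

# Metric but non-Euclidean example from the earlier slide.
D = np.array([[0.0, 10.0, 10.0, 5.1], [10.0, 0.0, 10.0, 5.1],
              [10.0, 10.0, 0.0, 5.1], [5.1, 5.1, 5.1, 0.0]])
X, lam, (p, q) = pe_embed(D)
print(p, q)  # q > 0: the embedding space is pseudo-Euclidean
```

With the signs of the eigenvalues kept, the signed squared distances Σ_k sign(λ_k)(x_ik - x_jk)^2 reproduce D^2 exactly.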
Pseudo-Euclidean Space (Krein Space)
If D is non-Euclidean, B has p positive and q negative eigenvalues. A pseudo-Euclidean space ε with signature (p,q), k = p + q, is a non-degenerate inner product space ℜ^k = ℜ^p ⊕ ℜ^q such that:
⟨x,y⟩_ε = x^T ℑ_pq y = Σ_{i=1..p} x_i y_i - Σ_{i=p+1..p+q} x_i y_i, with ℑ_pq = diag(I_{p×p}, -I_{q×q})
d_ε^2(x,y) = ⟨x-y, x-y⟩_ε = d_p^2(x,y) - d_q^2(x,y)
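The indefinite inner product, and with it the 'squared distance', can be negative. A direct transcription of the definition; the signature (p, q) = (2, 1) and the vectors are made up for illustration:

```python
import numpy as np

def pe_inner(x, y, p, q):
    """Indefinite inner product in R^p (+) R^q:
    <x, y> = sum_{i<=p} x_i y_i - sum_{i>p} x_i y_i = x^T J_pq y."""
    Jpq = np.diag([1.0] * p + [-1.0] * q)
    return float(x @ Jpq @ y)

def pe_dist2(x, y, p, q):
    """d^2(x, y) = <x - y, x - y> = d_p^2(x, y) - d_q^2(x, y);
    may be negative, unlike a squared Euclidean distance."""
    return pe_inner(x - y, x - y, p, q)

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.0, 0.0, 0.0])
print(pe_dist2(x, y, p=2, q=1))  # 1 + 4 - 9 = -4
```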
[Figure: a pseudo-Euclidean plane R^p ⊕ R^q with regions where ⟨x,x⟩_ε > 0 and ⟨x,x⟩_ε < 0; d^2(x,y) = d_p^2(x,y) - d_q^2(x,y)]
Dissimilarity Representation ⇔ Kernels
Let R = {x1, x2, ..., xn}
Kernel classifier: f(x) = w^T K(x,R) + w0 = Σ_{i ∈ SV} w_i K(x, x_i) + w0
Linear classifier in a dissimilarity space: f(D(x,R)) = w^T D(x,R) + w0
If D behaves Euclidean, then
K = -(1/2) J D*^(2) J or K = exp(-D*^(2) / σ^2)
are Mercer kernels: the SVM can be applied directly. Otherwise:
• Transform D appropriately or regularize K
• Treat K as a kernel in a pseudo-Euclidean space: indefinite SVM
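Whether a given D yields a Mercer kernel can be tested by checking that K = -(1/2) J D^(2) J is positive semi-definite. A sketch using a genuinely Euclidean distance matrix and the non-Euclidean four-point example from the earlier slide:

```python
import numpy as np

def centred_kernel(D):
    """K = -1/2 J D^(2) J; a Mercer kernel iff D is Euclidean (K psd)."""
    m = D.shape[0]
    J = np.eye(m) - np.ones((m, m)) / m
    return -0.5 * J @ (D ** 2) @ J

def is_mercer(K, tol=1e-9):
    """Positive semi-definiteness check via the smallest eigenvalue."""
    return bool(np.linalg.eigvalsh(K).min() >= -tol)

# A Euclidean distance matrix -> Mercer; the non-Euclidean example -> not.
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
D_euc = np.linalg.norm(P[:, None] - P[None, :], axis=2)
D_non = np.array([[0.0, 10.0, 10.0, 5.1], [10.0, 0.0, 10.0, 5.1],
                  [10.0, 10.0, 0.0, 5.1], [5.1, 5.1, 5.1, 0.0]])
print(is_mercer(centred_kernel(D_euc)))  # True
print(is_mercer(centred_kernel(D_non)))  # False
```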
1. Nearest neighbour rule
2. Reduce training set to representation set ⇒ dissimilarity space
3. Embedding: select large Λii > 0 ⇒ Euclidean space; select large |Λii| > 0 → pseudo-Euclidean space; build a discriminant function there
Training set: dissimilarity matrix D. Test object: dissimilarities dx with the training set.
Dissimilarity Based Classification
Dissimilarity space and embedding perform equivalently, and better than the nearest neighbour rule.
Three Approaches Compared for the Zongker Data
[Figure: averaged generalization error vs. size of the representation set R for the digit data; curves: RLDC on the representation set, LP on the representation set, RLDC after embedding, 1-NN and 3-NN; dissimilarity space and embedding outperform the nearest neighbour rule]
Convex heptagons and pentagons; minimum edge length: 0.1 of the maximum edge length. Polygons are scaled and centered.
dij = distance between vertex i of polygon_1 and vertex j of polygon_2.
Hausdorff distance: D = max{ maxi(minj(dij)), maxj(mini(dij)) }: take the largest of the smallest vertex distances.
Modified Hausdorff: D = max{ meani(minj(dij)), meanj(mini(dij)) } (not a metric!)
No class overlap → zero error.
Example 2: Polygons
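Both measures are direct to implement for vertex sets; a sketch with a toy square and triangle standing in for the generated polygons:

```python
import numpy as np

def hausdorff(P1, P2):
    """D = max{ max_i min_j d_ij , max_j min_i d_ij } for vertex sets."""
    d = np.linalg.norm(P1[:, None] - P2[None, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def modified_hausdorff(P1, P2):
    """Replace the outer max_i / max_j by means; no longer a metric."""
    d = np.linalg.norm(P1[:, None] - P2[None, :], axis=2)
    return max(d.min(axis=1).mean(), d.min(axis=0).mean())

sq = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
tri = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
print(hausdorff(sq, tri), modified_hausdorff(sq, tri))  # 0.5 0.25
```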
[Figure: averaged generalization error vs. size of the representation set R for the polygon data; curves: RLDC on the representation set, LP on the representation set, LDC after embedding, 1-NN and 3-NN; dissimilarity space and embedding outperform the nearest neighbour rule]
Zero error difficult to reach
Dissimilarity Based Classification of Polygons
Prototype Selection: Polygon Dataset
[Figure: average classification error (%) vs. number of prototypes (4-200); Polydistm, 1000 objects, BayesNQ classifier; curves for 1-NN, k-NN, k-NN in the dissimilarity space, and prototype selection by Random, RandomC, ModeSeek, KCentres, FeatSel, KCentres-LP, LinProg and EdiCon]
The classification performance of the quadratic Bayes normal classifier and the k-NN rule in dissimilarity spaces, and of the direct k-NN rule, as a function of the number of selected prototypes. Note that already with 10-20 prototypes better results are obtained than with 1000 objects in the NN rules.
Possible Solution
The Dissimilarity Representation
E. Pękalska and R.P.W. Duin, The Dissimilarity Representation for Pattern Recognition: Foundations and Applications, World Scientific, Singapore, 2005, ISBN 981-256-530-2.
An Analysis
Causes of Non-Metric Data
Non-Metric Measurements
Weighted edit-distance for strings (Bunke's Chicken Dataset): distances 14.9, 7.8 and 4.1 between objects 78, 419 and 425.
Representation by Ordered Sets (Strings); Edit Distance
X = (x1, x2, ..., xk), Y = (y1, y2, ..., yn)
DE(X,Y): Σ edit operations X ⇒ Y (insertions, deletions, substitutions)
DE(snert, meer) = 3: snert ⇒ seert ⇒ seer ⇒ meer
DE(ner, meer) = 2: ner ⇒ mer ⇒ meer
Possibly weighted.
Triangle inequality ⇒ computationally feasible.
Length normalisation problem: D(aa,bb) < D(abcdef,bcdd)
See Marzal & Vidal, IEEE PAMI-15, 1993, 926-932
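The (weighted) edit distance is a standard dynamic program; with unit weights it reproduces the two examples above:

```python
def edit_distance(x, y, w_ins=1.0, w_del=1.0, w_sub=1.0):
    """Weighted edit distance between two strings: minimal total cost
    of insertions, deletions and substitutions turning x into y."""
    n, m = len(x), len(y)
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i * w_del          # delete all of x[:i]
    for j in range(1, m + 1):
        D[0][j] = j * w_ins          # insert all of y[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = D[i - 1][j - 1] + (0.0 if x[i - 1] == y[j - 1] else w_sub)
            D[i][j] = min(sub, D[i - 1][j] + w_del, D[i][j - 1] + w_ins)
    return D[n][m]

print(edit_distance('snert', 'meer'))  # 3.0, as on the slide
print(edit_distance('ner', 'meer'))    # 2.0
```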
Better Measurements are Sometimes more Non-Euclidean
E.Pekalska et al., Non-Euclidean or non-metric measures can be informative, SSSPR 2006, Springer, LNCS 4109.
[Figure: classification error vs. non-Euclideanness of the dissimilarity measure]
1800: Crossing the Jostedalsbreen was impossible.
Travelling around (200 km) lasted 5 days,
until the shared point X was found:
then people could visit each other in 8 hours.
D(V,J) = 5 days
D(V,X) = 4 hours
D(X,J) = 4 hours } Non-Metric
Cause of non-metric data - 1
Weighted edit-distance for strings (Bunke's Chicken Dataset): distances 14.9, 7.8 and 4.1 between objects 78, 419 and 425.
Large distances are overestimated, due to computational problems.
Cause of non-metric data - 2
Non-metric data due to partially observed projections: small distances are underestimated.
Example: consumer preferences for recommendation systems. Rows: consumers; columns: preference ranks for the items.
S =
5 3 1 2 4
2 1 3 4 5
5 3 1 4 2
5 2 4 1 3
4 2 3 5 1
1 3 2 5 4
1 4 3 5 2
3 5 1 4 2
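One way to quantify dissimilarity between such preference profiles is a rank distance; the Spearman footrule below is an illustrative choice, not something prescribed by the slide:

```python
import numpy as np

# Each row: one consumer's ranks for the 5 items (the matrix S above).
S = np.array([[5, 3, 1, 2, 4], [2, 1, 3, 4, 5],
              [5, 3, 1, 4, 2], [5, 2, 4, 1, 3],
              [4, 2, 3, 5, 1], [1, 3, 2, 5, 4],
              [1, 4, 3, 5, 2], [3, 5, 1, 4, 2]])

def footrule(r1, r2):
    """Spearman footrule: L1 distance between two rank vectors."""
    return int(np.abs(r1 - r2).sum())

# Consumer-by-consumer dissimilarity matrix.
D = np.array([[footrule(a, b) for b in S] for a in S])
print(D[0, 1])  # 10: consumers 1 and 2 disagree strongly
```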
Cause of non-metric data - 3
Distance(Table,Book) = 0
Distance(Table,Cup) = 0
Distance(Book,Cup) = 0.75
Single Linkage Clustering: D(A,C) > D(A,B) + D(B,C) may occur.
Non-Euclidean Human Relations
Causes of non-metric data
1. Overestimated large distances (too difficult to compute)
2. Underestimated small distances (one-sided view of objects)
caused by the construction of complicated measures, needed to correspond with human observations.
3. Essential non-metric distance definitions
as the human concept of distance differs from the mathematical one.
Conclusion
The Human World is Non-Metric
Solutions
Neglect it: Use Euclidean Distances only
Solutions
Live with it: Heuristic Corrections
Solutions
Adapt to it: Dissimilarity Space
dissimilarity space with axes d1 and d2
Solutions
Adapt to it: Krein Space
Solutions
Adapt to it: Non-Point Representations
Blob Representation; String Representation
Thank You!