Unsupervised Clustering - sites.cs.ucsb.edu
Unsupervised Clustering
PR, ANN, & ML
Unsupervised Clustering
Training samples are not labeled
We may not know
  how many classes there are
  the prior probabilities P(ω_i)
  the class-conditional densities p(x | ω_i)
Goal: automatic discovery of structure
Intuitively, objects in the same class stick
together and form clusters
Unsupervised Clustering (cont.)
Locating groups (clusters) having similar measurements
Given n unlabelled samples {x_1, x_2, ..., x_n},
partition them into C clusters ω_1, ω_2, ..., ω_C
Similarity Measure
Need a similarity measurement s(x,x’)
(e.g., distance between x and x’ )
Minkowski metric (q >= 1):
  d(x, y) = ( Σ_{k=1..d} |x_k - y_k|^q )^(1/q)
Mahalanobis distance:
  d(x, y) = (x - y)^T Σ^(-1) (x - y)
Normalized inner product:
  s(x, y) = x^T y / (|x| |y|)
Commonality (for binary features):
  s(x, y) = x^T y / d
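These measures translate directly to NumPy; note that the 1/d normalization in `commonality` is an assumption about the garbled slide formula, and `mahalanobis` returns the squared distance:

```python
import numpy as np

def minkowski(x, y, q=2):
    # Minkowski metric: d(x, y) = (sum_k |x_k - y_k|^q)^(1/q), q >= 1
    return float(np.sum(np.abs(x - y) ** q) ** (1.0 / q))

def mahalanobis(x, y, cov):
    # Squared Mahalanobis distance: (x - y)^T Sigma^{-1} (x - y)
    diff = x - y
    return float(diff @ np.linalg.inv(cov) @ diff)

def normalized_inner_product(x, y):
    # s(x, y) = x^T y / (|x| |y|), i.e. cosine similarity
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def commonality(x, y):
    # For binary feature vectors: shared 1s divided by the dimension d
    return float(x @ y) / len(x)
```
With q = 2 the Minkowski metric reduces to the usual Euclidean distance, and with Σ = I the Mahalanobis distance reduces to its square.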
Similarity Measure (cont.)
How to set the threshold?
  Too large: all samples are assigned to a single class
  Too small: each sample ends up in its own class
How to properly weight each feature?
  e.g., how do the brightness of a fish and the length
  of a fish relate?
How to scale (or normalize) different measures?
Axis Scaling
Axis Scaling (cont.)
Threshold
Criteria for Clustering
A criterion function for clustering, e.g. the sum-of-squared-error:
  J_e = Σ_{i=1..c} Σ_{x∈ω_i} |x - m_i|^2,   m_i = (1/n_i) Σ_{x∈ω_i} x
or, equivalently, in terms of pairwise distances within each cluster:
  J_e = Σ_{i=1..c} (1/(2 n_i)) Σ_{x,x'∈ω_i} |x - x'|^2
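The sum-of-squared-error criterion can be sketched in a few lines of Python (NumPy assumed; the function name is hypothetical):

```python
import numpy as np

def sse_criterion(X, labels):
    # J_e = sum over clusters i of sum_{x in omega_i} |x - m_i|^2,
    # where m_i is the sample mean of cluster i
    J = 0.0
    for c in np.unique(labels):
        pts = X[labels == c]
        m = pts.mean(axis=0)          # cluster mean m_i
        J += float(np.sum((pts - m) ** 2))
    return J
```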
Criteria for Clustering (cont.)
Within- and between-group variance
Minimize (within-group) and maximize (between-group) the
trace or determinant of the appropriate scatter matrix
  trace: square of the scattering radius
  determinant: square of the scattering volume
  S_W = Σ_{i=1..c} Σ_{x∈ω_i} (x - m_i)(x - m_i)^T
  S_B = Σ_{i=1..c} n_i (m_i - m)(m_i - m)^T
where m is the mean of all samples
Pathology: when class sizes differ
  J_e = Σ_{i=1..c} Σ_{x∈ω_i} |x - m_i|^2
Relaxed constraint: allows the large class to
grow into the smaller class
Stringent constraint: disallows the large class and
splits it
K-Means Algorithm
(fixed # of clusters)
1. Arbitrarily pick N cluster centers and assign
   each sample to the nearest center
2. Compute the sample mean of each cluster
3. Reassign every sample to the cluster with the
   nearest mean
4. If any assignment changed, repeat from 2; otherwise stop
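The steps above can be sketched in NumPy; initializing the centers by sampling k data points is an assumption (the slide only says "arbitrarily pick"):

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Arbitrarily pick k of the samples as initial cluster centers
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = None
    for _ in range(max_iter):
        # 2./3. Assign every sample to the cluster with the nearest center
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dist.argmin(axis=1)
        # 4. Stop when no assignment changes
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Recompute the sample mean of each (non-empty) cluster
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers
```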
K-Means
In some sense, it is a simplified version of
case II of learning parametric forms:
  P(ω_i | x_k, θ) = p(x_k | ω_i, θ_i) P(ω_i) / Σ_{j=1..c} p(x_k | ω_j, θ_j) P(ω_j)
Instead of fractional membership, give the
sample entirely to the class whose mean is
closest to it:
  P(ω_i | x_k, θ) = 1 for the class with the closest mean, 0 otherwise
Fuzzy K-Means
In Matlab, k-means (there called c-means) can be
found in the Fuzzy Logic Toolbox
It implements fuzzy k-means,
where the membership function is not 1 or 0 but
can be fractional
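A minimal fuzzy c-means sketch in NumPy, assuming the standard fractional-membership update with fuzzifier m = 2 (the update formulas are not spelled out on the slide):

```python
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    # Random initial fractional memberships u[i, k]; each column sums to 1
    u = rng.random((c, len(X)))
    u /= u.sum(axis=0)
    for _ in range(n_iter):
        um = u ** m
        # Centers: membership-weighted means of the samples
        centers = um @ X / um.sum(axis=1, keepdims=True)
        d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1))
        inv = d ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=0)
    return u, centers
```
Taking the argmax of each membership column recovers hard k-means-style labels.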
Iterative K-means
Iterative refinement of clusters to minimize
  J_e = Σ_{i=1..c} Σ_{x∈ω_i} |x - m_i|^2
Each step: move one sample x̂ from cluster ω_i to cluster ω_j (j ≠ i).
The means change to
  m̂_j = m_j + (x̂ - m_j) / (n_j + 1)
  m̂_i = m_i - (x̂ - m_i) / (n_i - 1)
and the per-cluster criterion changes to
  Ĵ_j = J_j + (n_j / (n_j + 1)) |x̂ - m_j|^2
  Ĵ_i = J_i - (n_i / (n_i - 1)) |x̂ - m_i|^2
so the move decreases J_e whenever
  (n_j / (n_j + 1)) |x̂ - m_j|^2 < (n_i / (n_i - 1)) |x̂ - m_i|^2
1. Select an initial partition of the n samples into c clusters and
   compute J_e and the means m_1, ..., m_c
2. Select the next candidate sample x̂ (currently in cluster ω_i)
3. If the current cluster has more than one sample, compute
     ρ_j = (n_j / (n_j + 1)) |x̂ - m_j|^2   for j ≠ i
     ρ_i = (n_i / (n_i - 1)) |x̂ - m_i|^2
4. Transfer x̂ to ω_k if ρ_k ≤ ρ_j for all j
5. Update J_e, m_i, and m_k
6. If J_e has not changed in n attempts, stop; otherwise
   go back to 2
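The quantities ρ_j from step 3 can be computed as follows (a sketch; `transfer_gain` and its argument names are hypothetical):

```python
import numpy as np

def transfer_gain(x, means, counts, i):
    # rho_j measures the change in J_e if sample x (currently in cluster i,
    # which must have n_i > 1) were moved to cluster j:
    #   rho_j = n_j/(n_j + 1) * |x - m_j|^2   for j != i  (increase in J_j)
    #   rho_i = n_i/(n_i - 1) * |x - m_i|^2               (decrease in J_i)
    rho = np.empty(len(means))
    for j, (m, n) in enumerate(zip(means, counts)):
        d2 = float(np.sum((x - m) ** 2))
        rho[j] = n / (n - 1) * d2 if j == i else n / (n + 1) * d2
    return rho  # step 4: transfer x to the cluster k = argmin(rho)
```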
Hierarchical Clustering
K-means assumes a "flat" data description
Hierarchical descriptions of the data occur more
frequently in practice
Hierarchical clustering (cont.)
Samples in the same cluster will remain so
at a higher level
[Dendrogram: samples x1 ... x6 merged across levels 1 to 5]
Hierarchical clustering Options
Bottom-up: agglomerative
Top-down: divisive
Agglomerative procedure:
1. Let ĉ = n and D_i = {x_i}, i = 1, ..., n
2. If ĉ ≤ c, stop
3. Find the nearest pair of distinct clusters, D_i and D_j
4. Merge D_i and D_j, delete D_j, and decrement ĉ by 1
5. Go to 2
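The five steps above, sketched in Python with distance-between-cluster-means as the nearest-pair criterion (that particular choice is an assumption; the following slides discuss several alternatives):

```python
import numpy as np

def agglomerate(X, c):
    # Start with n singleton clusters D_i = {x_i}; repeatedly merge the
    # nearest pair (smallest distance between cluster means) until c remain
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > c:
        means = [X[idx].mean(axis=0) for idx in clusters]
        best, best_d = None, np.inf
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = np.linalg.norm(means[i] - means[j])
                if d < best_d:
                    best, best_d = (i, j), d
        i, j = best
        clusters[i] += clusters.pop(j)  # merge D_j into D_i, delete D_j
    return clusters
```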
Hierarchical clustering Procedure
At a certain step, we have c clusters with
  n_i = |ω_i|,   m_i = (1/n_i) Σ_{x∈ω_i} x
Merging two clusters i and j gives
  n_ij = n_i + n_j
  m_ij = (1/n_ij) Σ_{x∈ω_i∪ω_j} x = (n_i m_i + n_j m_j) / (n_i + n_j)
Hierarchical clustering Procedure (cont.)
Then a certain criterion function will increase:
  E_ij = Σ_{x∈ω_i∪ω_j} |x - m_ij|^2 - Σ_{x∈ω_i} |x - m_i|^2 - Σ_{x∈ω_j} |x - m_j|^2
       = (n_i n_j / (n_i + n_j)) |m_i - m_j|^2
i*, j* are chosen so as to minimize this increase:
  i*, j* = argmin_{i,j} E_ij = argmin_{i,j} d(i, j),
  where d(i, j) = sqrt(n_i n_j / (n_i + n_j)) |m_i - m_j|
Keep merging until no pair exists whose merger increases the
criterion function by less than a certain preset threshold
Criterion Function
In fact, the criterion function can be chosen
in many different ways, resulting in
different clustering behaviors:
  Nearest neighbor (elongated classes):
    d_min(ω_i, ω_j) = min_{x∈ω_i, y∈ω_j} |x - y|
  Furthest neighbor (compact classes):
    d_max(ω_i, ω_j) = max_{x∈ω_i, y∈ω_j} |x - y|
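The two cluster distances, as direct NumPy translations (function names are hypothetical):

```python
import numpy as np

def d_min(A, B):
    # Nearest-neighbor (single-link) distance: favors elongated classes
    return min(np.linalg.norm(a - b) for a in A for b in B)

def d_max(A, B):
    # Furthest-neighbor (complete-link) distance: favors compact classes
    return max(np.linalg.norm(a - b) for a in A for b in B)
```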
•Start with every sample in its own cluster
•Merge clusters according to some criterion function
•Repeat until two clusters exist
In Reality
With n objects, the distance matrix is n × n
For large databases it is computationally
expensive to compute and store this matrix
Solution: store only the k nearest clusters per
row of the distance matrix; the complexity drops to k × n
Divisive Clustering
Less often used
Note that the criterion function usually decreases
monotonically (the clusters become purer with
each split)
A natural grouping is one where a large
drop in impurity occurs
[Plot: impurity vs. iteration]
ISODATA Algorithm
Iterative Self-Organizing Data Analysis
Technique (A)
A tried-and-true clustering algorithm
Dynamically updates the # of clusters
Can go in both top-down (split) and bottom-up
(merge) directions
Notation:
  T:      threshold on the number of samples in a cluster
  N_D:    approximate (desired) number of clusters
  σ_s:    maximum spread for splitting
  D_m:    maximum distance for merging
  N_max:  maximum number of clusters that can be merged
Algorithm
1. Cluster the existing data into N_c clusters, but eliminate any data
   and classes with fewer than T members (decreasing N_c). Exit when the
   classification of the samples has not changed.
2. If the iteration is odd and N_c ≤ N_D/2:
   split clusters whose samples are sufficiently disjoint, increasing N_c.
   If any clusters have been split, go to 1.
3. If the iteration is even or N_c ≥ 2 N_D:
   merge any pair of clusters whose samples are sufficiently close,
   decreasing N_c.
4. Go to step 1.
Step 1 (parameter: T)
1. Classify samples according to the nearest mean
2. Discard samples in clusters with < T samples, decreasing N_c
3. Recompute the sample mean of each cluster
4. If the classification changed, go back to 1; otherwise exit
Step 2
Compute for each cluster k:
  d_k = (1/N_k) Σ_{y∈ω_k} |y - μ_k|                      (average distance to the mean)
  σ^2_{k,max} = max_j (1/N_k) Σ_{y∈ω_k} (y_j - μ_{k,j})^2   (largest per-axis spread)
Compute globally:
  d̄ = (1/N) Σ_{k=1..N_c} N_k d_k
Split test for each cluster k (parameters: σ_s, T, N_D):
  if σ^2_{k,max} > σ_s^2
  and ( (d_k > d̄ and N_k > 2(T + 1)) or N_c ≤ N_D/2 ):
    split cluster k; N_c = N_c + 1
    the two new cluster centers are displaced in
    opposite directions along the axis of largest variance
  otherwise, leave cluster k intact
Step 3
Compute for each pair of clusters:
  d_ij = |μ_i - μ_j|
Sort the distances less than D_m from smallest to largest
Taking the pairs (i, j) with d_ij < D_m in order: if neither
cluster i nor cluster j has already been merged, and fewer than
N_max merges have been made, merge i and j:
  μ' = (N_i μ_i + N_j μ_j) / (N_i + N_j)
  N_c = N_c - 1
Spectral Clustering
A graph-theoretical approach
(Advanced) linear algebra is really important here
A field by itself; we cover only the basics
Graph notations
Undirected graph G = (V, E)
V = {v_1, ..., v_n} are nodes (samples)
E = {e_ij | 0 ≤ i, j < n} are edges, with weights
(similarities) w_ij
W: adjacency matrix with entries w_ij
D: degree matrix (diagonal) with entries d_i = Σ_j w_ij
A: subset of vertices; Ā: its complement V \ A
1_A = (f_1, ..., f_n)^T, where f_i = 1 if v_i ∈ A and 0 otherwise
More graph notations
Size of a subset A of V
  based on # of vertices: |A|
  based on its connections: vol(A) = Σ_{i∈A} d_i
Connected component A
  any two vertices in A can be joined by a path
  where all intermediate points lie in A
  no connection between A and Ā
A_1, ..., A_k form a partition if A_i ∩ A_j = ∅ and A_1 ∪ ... ∪ A_k = V
Similarity Definitions
RBF (fully connected or thresholded):
  e.g., a Gaussian kernel gives w_ij = exp(-|x_i - x_j|^2 / (2σ^2))
ε-neighborhood graph:
  connect all points whose pairwise distance is less than ε
k-nearest neighbor:
  v_i and v_j are neighbors if either one is a kNN of the other
Mutual k-nearest neighbor:
  v_i and v_j are neighbors if both are a kNN of the other
Graph Laplacian: L = D - W
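Building L from an adjacency matrix is a one-liner, and its standard properties (zero row sums, positive semi-definiteness) can be checked directly (NumPy assumed):

```python
import numpy as np

def graph_laplacian(W):
    # L = D - W, where D is the diagonal degree matrix with d_i = sum_j w_ij
    return np.diag(W.sum(axis=1)) - W
```
Because every row of L sums to zero, the constant vector is always an eigenvector with eigenvalue 0.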
One connected component: eigenvalue 0 of L has multiplicity 1,
and the constant one vector is the corresponding eigenvector
Multiple connected components A_1, ..., A_k: the multiplicity of
eigenvalue 0 equals the number of components, and the eigenspace
is spanned by the indicator vectors 1_{A_1}, ..., 1_{A_k}
Normalized Laplacian
  L_sym = D^(-1/2) L D^(-1/2) = I - D^(-1/2) W D^(-1/2)
  L_rw = D^(-1) L = I - D^(-1) W
Unnormalized Spectral Clustering
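The slide body is an image, so here is a hedged sketch of the usual unnormalized algorithm: embed the samples as rows of the first k eigenvectors of L, then cluster those rows (the simple Lloyd-style k-means loop inside is an assumption, not part of the slide):

```python
import numpy as np

def spectral_clustering(W, k, seed=0):
    # Unnormalized Laplacian L = D - W
    L = np.diag(W.sum(axis=1)) - W
    # Columns of U: eigenvectors for the k smallest eigenvalues of L;
    # row i of U is the spectral embedding of sample i
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :k]
    # Cluster the embedded rows with a plain Lloyd-style k-means
    rng = np.random.default_rng(seed)
    centers = U[rng.choice(len(U), size=k, replace=False)]
    labels = np.zeros(len(U), dtype=int)
    for _ in range(100):
        labels = np.linalg.norm(U[:, None] - centers[None], axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels
```
On a graph with k disconnected components the embedding is piecewise constant, so the k-means step recovers the components exactly.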
Normalized Spectral Clustering
Toy Example
Random sample of 200 points drawn from 4
Gaussians
Similarity based on
Graph: fully connected or 10 nearest neighbors
Min-Cut Formulation
If the edge weight represents degree of similarity, the
optimal bi-partitioning of the graph is the one that
minimizes the cut (the so-called min-cut problem):
  cut(A, B) = Σ_{u∈A, v∈B} w(u, v)
Intuition: cut the connections between dissimilar
samples, where the edge weights (similarities) are small
Problem with Min-cut
Tends to cut out small regions
Total weight (green + red + blue) = constant
Min-cut minimizes the sum of the weights of the blue
edges, with no regard to the green and red ones (the
rest of the picture)
Remedy: distribute the total weight so that
  the sum of the weights of the blue edges is minimized
    (max between-group variance)
  the sum of the weights of the red (green) edges is maximized
    (min within-group variance)
Normalized Cut
Penalize cutting out small, isolated clusters:
  Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V)
where V is the full vertex set and
  cut(A, B)   = Σ_{u∈A, v∈B} w(u, v)    (blue)
  assoc(A, V) = Σ_{u∈A, v∈V} w(u, v)    (green + blue = total - red)
  assoc(B, V) = Σ_{u∈B, v∈V} w(u, v)    (red + blue = total - green)
Normalized Cut (cont.)
Penalizes cutting out small, isolated clusters:
  Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V)
             = blue/(total - red) + blue/(total - green)
Cutting out a small, isolated cluster gives
  a small blue (cut) term, but also
  a small red (or green) association, so the ratio stays large
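The Ncut value of a given bipartition can be computed directly from the adjacency matrix W (a sketch; the boolean-mask interface is a hypothetical choice):

```python
import numpy as np

def ncut(W, in_A):
    # Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V)
    in_A = np.asarray(in_A, dtype=bool)
    in_B = ~in_A
    cut = W[np.ix_(in_A, in_B)].sum()   # total weight of edges crossing the cut
    assoc_A = W[in_A, :].sum()          # total connection from A to all vertices
    assoc_B = W[in_B, :].sum()
    return cut / assoc_A + cut / assoc_B
```
If A is tiny, assoc(A, V) is small and the first ratio is large, which is exactly the penalty the slide describes.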
Intuition
  assoc(A, V) = Σ_{u∈A, v∈V} w(u, v)
              = Σ_{u∈A, v∈A} w(u, v) + Σ_{u∈A, v∈B} w(u, v)
              = assoc(A, A) + cut(A, B)
so
  cut(A, B)/assoc(A, V) = 1 - assoc(A, A)/assoc(A, V)
Define
  Nassoc(A, B) = assoc(A, A)/assoc(A, V) + assoc(B, B)/assoc(B, V)
               = green/(green + blue) + red/(red + blue)
Then
  Ncut(A, B) = blue/(green + blue) + blue/(red + blue)
             = (1 - assoc(A, A)/assoc(A, V)) + (1 - assoc(B, B)/assoc(B, V))
             = 2 - Nassoc(A, B)
Nassoc reflects the intra-class connections, which
should be maximized
Ncut represents the inter-class connections, which
should be minimized
Solution
How do you define similarity?
  Multiple measurements (i.e., a feature vector)
  can be used, e.g.
    d(i, j) = Σ_k w_k |v_i^k - v_j^k|
How do you find the minimal normalized cut?
  The solution turns out to be a generalized
  eigenvalue problem!
Setup (N = |V| samples):
  x: indicator vector, x_i = 1 if i ∈ A and x_i = -1 if i ∈ B
  d_i = Σ_j w(i, j)
  D: N × N diagonal matrix with d_1, ..., d_N on its diagonal
  W: N × N symmetric matrix with W(i, j) = w(i, j)
  k = ( Σ_{x_i > 0} d_i ) / ( Σ_i d_i )
  1: the N-vector of all ones
With these definitions,
  Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V)
             = ( Σ_{x_i>0, x_j<0} -w_ij x_i x_j ) / ( Σ_{x_i>0} d_i )
             + ( Σ_{x_i<0, x_j>0} -w_ij x_i x_j ) / ( Σ_{x_i<0} d_i )
Setting y = (1 + x) - b(1 - x), with b = k/(1 - k), it can be shown that
  min_x Ncut(x) = min_y ( y^T (D - W) y ) / ( y^T D y )
subject to y_i ∈ {1, -b} and y^T D 1 = 0
Relaxing y to take real values, the minimization
  min_y ( y^T (D - W) y ) / ( y^T D y )
is solved by the generalized eigenvalue system
  (D - W) y = λ D y
Substituting z = D^(1/2) y gives a standard eigenvalue problem:
  D^(-1/2) (D - W) D^(-1/2) z = λ z
D^(-1/2) (D - W) D^(-1/2) is a symmetric positive semi-definite matrix:
  real eigenvalues ≥ 0
  orthogonal eigenvectors
  z_0 = D^(1/2) 1 is an eigenvector with eigenvalue 0
Hence, the second smallest eigenvector
contains the minimal-cut solution (in
floating-point form)
Even though computing all eigenvectors and
eigenvalues is expensive, O(n^3),
computing a small number of them is not
that expensive (Lanczos method)
The procedure can be applied recursively to
obtain more than two clusters
Results