Graph Partitioning and Clustering
Similarity Graph
Represent the dataset as a weighted graph $G(V,E)$. Example dataset: $\{x_1, x_2, \ldots, x_6\}$
$V = \{x_i\}$: Set of $n$ vertices representing data points
$E = \{w_{ij}\}$: Set of weighted edges indicating pairwise similarity between points
[Figure: example similarity graph on vertices 1 to 6 with edge weights 0.1, 0.2, 0.6, 0.7, 0.8, 0.8, 0.8, 0.8]
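As a rough illustration of this representation, the sketch below stores the example graph as a symmetric weight matrix. The eight weights are those of the example figure, but the assignment of each weight to a specific pair of vertices is an assumption made for illustration, and `EDGES` and `build_similarity_matrix` are hypothetical names.

```python
# Hypothetical edge list: the weights come from the example figure, but which
# pair of vertices carries which weight is assumed for illustration.
EDGES = {
    (1, 2): 0.8, (1, 3): 0.6, (2, 3): 0.8,
    (4, 5): 0.8, (4, 6): 0.7, (5, 6): 0.8,
    (1, 5): 0.1, (3, 4): 0.2,
}


def build_similarity_matrix(n, edges):
    """Return the symmetric n x n weight matrix W with W[i][j] = w_ij."""
    W = [[0.0] * n for _ in range(n)]
    for (i, j), w in edges.items():
        W[i - 1][j - 1] = w  # vertices are numbered from 1 in the slides
        W[j - 1][i - 1] = w  # similarity is symmetric
    return W


W = build_similarity_matrix(6, EDGES)
for row in W:
    print(" ".join(f"{w:.1f}" for w in row))
```

A zero entry means the two points are not connected in the similarity graph.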
Graph Partitioning
Clustering can be viewed as partitioning a similarity graph.
Bi-partitioning task: Divide vertices into two disjoint groups $(A,B)$
[Figure: the example graph on vertices 1 to 6 divided into two groups, A and B]
Relevant Issues:
How can we define a “good” partition of the graph?
How can we efficiently identify such a partition?
Clustering Objectives
Traditional definition of a “good” clustering:
1. Points assigned to the same cluster should be highly similar.
2. Points assigned to different clusters should be highly dissimilar.
Apply these objectives to our graph representation:
1. Maximise weight of within-group connections
2. Minimise weight of between-group connections
[Figure: example graph on vertices 1 to 6; within-group edge weights 0.6, 0.7, 0.8, 0.8, 0.8, 0.8; between-group edge weights 0.1, 0.2]
Graph Cuts
Express partitioning objectives as a function of the “edge cut” of the partition.
Cut: Set of edges with exactly one endpoint in each group.
$$cut(A,B) = \sum_{i \in A,\, j \in B} w_{ij}$$
[Figure: the example graph on vertices 1 to 6 partitioned into groups A and B]
cut(A,B) = 0.3
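The cut value can be checked with a few lines of code. This is a minimal sketch, assuming the two between-group edges are 1-5 (weight 0.1) and 3-4 (weight 0.2) and that the groups are A = {1, 2, 3} and B = {4, 5, 6}; the edge assignment and the helper name `cut` are illustrative choices, not taken from the slides.

```python
# Hypothetical edge list: weights from the example figure, vertex pairs assumed.
EDGES = {
    (1, 2): 0.8, (1, 3): 0.6, (2, 3): 0.8,
    (4, 5): 0.8, (4, 6): 0.7, (5, 6): 0.8,
    (1, 5): 0.1, (3, 4): 0.2,
}


def cut(A, B, edges):
    """Sum the weights of edges with one endpoint in A and the other in B."""
    A, B = set(A), set(B)
    total = 0.0
    for (i, j), w in edges.items():
        if (i in A and j in B) or (i in B and j in A):
            total += w
    return total


value = cut({1, 2, 3}, {4, 5, 6}, EDGES)
print(round(value, 2))
```

Only the two crossing edges contribute, so the result agrees with the value 0.3 stated above.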
Graph Cut Criteria
Criterion: Minimum-cut
Minimise weight of connections between groups
$$\min cut(A,B)$$
Problem:
Only considers external cluster connections
Does not consider internal cluster density
Degenerate case: [Figure: a partition labelled “Optimal cut” contrasted with one labelled “Minimum cut”]
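One way to see the degenerate behaviour is to add a weakly attached outlier to a graph with two tight groups. The sketch below uses a made-up seven-vertex graph (not the one in the slides) and an exhaustive search over bipartitions, which is only practical for very small graphs; all names are hypothetical.

```python
from itertools import combinations

# Hypothetical graph: two tight groups {1,2,3} and {4,5,6} joined by two weak
# edges, plus an outlier vertex 7 attached to vertex 6 by a very weak edge.
EDGES = {
    (1, 2): 0.8, (1, 3): 0.6, (2, 3): 0.8,
    (4, 5): 0.8, (4, 6): 0.7, (5, 6): 0.8,
    (1, 5): 0.1, (3, 4): 0.2,
    (6, 7): 0.05,
}
VERTICES = [1, 2, 3, 4, 5, 6, 7]


def cut(A, B, edges):
    """Sum the weights of edges with one endpoint in A and the other in B."""
    return sum(w for (i, j), w in edges.items()
               if (i in A and j in B) or (i in B and j in A))


def minimum_cut_bruteforce(vertices, edges):
    """Try every bipartition (A always contains the first vertex)."""
    first, rest = vertices[0], vertices[1:]
    best = None
    for r in range(len(rest)):  # r < len(rest), so B is never empty
        for extra in combinations(rest, r):
            A = {first, *extra}
            B = set(vertices) - A
            value = cut(A, B, edges)
            if best is None or value < best[0]:
                best = (value, A, B)
    return best


best_cut, best_A, best_B = minimum_cut_bruteforce(VERTICES, EDGES)
natural_cut = cut({1, 2, 3}, {4, 5, 6, 7}, EDGES)
print("minimum cut:", sorted(best_A), sorted(best_B), best_cut)
print("cut between the two tight groups:", round(natural_cut, 2))
```

In this toy graph the minimum cut separates the single outlier from everything else, because removing one very weak edge is cheaper than removing the two edges between the tight groups.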
Graph Cut Criteria (continued)
Criterion: Normalised-cut (Shi & Malik, ’97)
Consider the connectivity between groups relative to the density of each group.
Normalise the association between groups by volume.
vol(A): The total weight of the edges originating from group A.
$$\min Ncut(A,B) = \frac{cut(A,B)}{vol(A)} + \frac{cut(A,B)}{vol(B)}$$
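To make the formula concrete, the following sketch evaluates it for the split A = {1, 2, 3}, B = {4, 5, 6} of the example graph. The assignment of weights to vertex pairs is assumed for illustration, and `vol` is implemented as the sum of the weighted degrees of the vertices in a group, which is one reading of "total weight of the edges originating from the group".

```python
# Hypothetical edge list: weights from the example figure, vertex pairs assumed.
EDGES = {
    (1, 2): 0.8, (1, 3): 0.6, (2, 3): 0.8,
    (4, 5): 0.8, (4, 6): 0.7, (5, 6): 0.8,
    (1, 5): 0.1, (3, 4): 0.2,
}


def cut(A, B, edges):
    """Sum the weights of edges with one endpoint in A and the other in B."""
    return sum(w for (i, j), w in edges.items()
               if (i in A and j in B) or (i in B and j in A))


def vol(A, edges):
    """Sum of weighted degrees of the vertices in A."""
    total = 0.0
    for (i, j), w in edges.items():
        # An edge inside A is counted once for each of its two endpoints.
        total += w * ((i in A) + (j in A))
    return total


def ncut(A, B, edges):
    """Normalised cut: the cut weight relative to the volume of each group."""
    c = cut(A, B, edges)
    return c / vol(A, edges) + c / vol(B, edges)


A, B = {1, 2, 3}, {4, 5, 6}
value = ncut(A, B, EDGES)
print(f"vol(A)={vol(A, EDGES):.1f}  vol(B)={vol(B, EDGES):.1f}  Ncut={value:.3f}")
```

Under these assumptions both groups have similar volume, so the normalised cut stays small; a split that isolates a single vertex would have a cut-to-volume ratio of 1 on that side.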
Why use this criterion?
Minimising the normalised cut is equivalent to maximising normalised association.
Produces more balanced partitions.
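The equivalence can be sketched in a few lines. The symbols $assoc$ and $Nassoc$ are not defined on the slides; the derivation below assumes the usual definitions from Shi & Malik, with sums taken over ordered pairs of vertices and $V = A \cup B$.

```latex
\begin{align*}
assoc(A,A) &= \sum_{i \in A,\, j \in A} w_{ij}, \qquad
vol(A) = \sum_{i \in A,\, j \in V} w_{ij} = assoc(A,A) + cut(A,B) \\
Nassoc(A,B) &= \frac{assoc(A,A)}{vol(A)} + \frac{assoc(B,B)}{vol(B)} \\
Ncut(A,B) &= \frac{vol(A) - assoc(A,A)}{vol(A)} + \frac{vol(B) - assoc(B,B)}{vol(B)}
           = 2 - Nassoc(A,B)
\end{align*}
```

Since the two quantities always sum to 2, any partition that minimises $Ncut$ also maximises $Nassoc$.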
How do we efficiently identify a “good” partition?
Problem: Computing an optimal normalised cut is NP-hard
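For a sense of scale, an exhaustive search has to score every bipartition, and a graph with n vertices has 2^(n-1) - 1 of them. The sketch below does this for the six-vertex example, which is feasible only because the graph is tiny. The edge assignment is assumed for illustration, the function names are hypothetical, and the code assumes every vertex has at least one edge so that no volume is zero.

```python
from itertools import combinations

# Hypothetical edge list: weights from the example figure, vertex pairs assumed.
EDGES = {
    (1, 2): 0.8, (1, 3): 0.6, (2, 3): 0.8,
    (4, 5): 0.8, (4, 6): 0.7, (5, 6): 0.8,
    (1, 5): 0.1, (3, 4): 0.2,
}
VERTICES = [1, 2, 3, 4, 5, 6]


def ncut(A, edges):
    """Normalised cut of the bipartition (A, V \\ A)."""
    cut_w = vol_a = vol_b = 0.0
    for (i, j), w in edges.items():
        in_a = (i in A) + (j in A)  # number of endpoints in A: 0, 1 or 2
        vol_a += w * in_a
        vol_b += w * (2 - in_a)
        if in_a == 1:               # exactly one endpoint in A: a cut edge
            cut_w += w
    return cut_w / vol_a + cut_w / vol_b


def best_bipartition(vertices, edges):
    """Score every bipartition; A always contains the first vertex."""
    first, rest = vertices[0], vertices[1:]
    best, tried = None, 0
    for r in range(len(rest)):  # r < len(rest), so the other group is never empty
        for extra in combinations(rest, r):
            A = {first, *extra}
            tried += 1
            value = ncut(A, edges)
            if best is None or value < best[0]:
                best = (value, A, set(vertices) - A)
    return best, tried


(best_value, best_A, best_B), tried = best_bipartition(VERTICES, EDGES)
print(tried, sorted(best_A), sorted(best_B), round(best_value, 3))
```

The number of candidates doubles with every added vertex, so this approach does not scale beyond toy graphs.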