Transcript of "Nonlinear Adaptive Distance Metric Learning for Clustering"
Intelligent Database Systems Lab
National Yunlin University of Science and Technology
Nonlinear Adaptive Distance Metric Learning for Clustering
Jianhui Chen, Zheng Zhao, Jieping Ye, Huan Liu
KDD, 2007
Reported by Wen-Chung Liao, 2010/01/19
Outline
─ Motivation
─ Objective
─ Adaptive Distance Metric Learning: The Linear Case
─ Adaptive Distance Metric Learning: The Nonlinear Case
─ NAML
─ Experiments
─ Conclusions
─ Comments
In distance metric learning, the goal is to achieve better compactness (reduced dimensionality) and better separability (larger inter-cluster distance) in the data than standard distance metrics, such as the Euclidean distance, provide.
Motivation
Traditionally, dimensionality reduction and clustering are applied in two separate steps.
─ If distance metric learning (via dimensionality reduction) and clustering are performed together, the cluster separability of the data can be better maximized in the dimensionality-reduced space.
Many real-world applications involve data with nonlinear and complex patterns.
─ Kernel methods can handle such data.
Objectives
Propose NAML for simultaneous distance metric learning and clustering. NAML
─ first maps the data to a high-dimensional space through a kernel function;
─ next applies a linear projection to find a low-dimensional manifold;
─ and then performs clustering in the low-dimensional space. The key idea of NAML is to integrate
─ kernel learning, ─ dimensionality reduction, ─ clustering
in a joint framework so that the separability of the data is maximized in the low-dimensional space.
ADAPTIVE DISTANCE METRIC LEARNING: THE LINEAR CASE
Data set: $\{x_i\}_{i=1}^n \subset \mathbb{R}^d$; data matrix: $X = [x_1, \dots, x_n] \in \mathbb{R}^{d \times n}$
Mahalanobis distance measure: $d_M(x_i, x_j) = \sqrt{(x_i - x_j)^\top M (x_i - x_j)}$, with $M \succeq 0$
Linear transformation $W \in \mathbb{R}^{d \times m}$: with $M = W W^\top$, $d_M(x_i, x_j)$ equals the Euclidean distance between the projected points $W^\top x_i$ and $W^\top x_j$
Minimizing the within-cluster distances under $M$ over both $W$ and the clustering is equivalent to a trace maximization problem (Min ≡ Max)
Cluster indicator matrix $F \in \{0, 1\}^{n \times k}$; weighted cluster indicator matrix $L = F (F^\top F)^{-1/2}$
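A minimal numpy sketch (illustrative, not code from the paper) verifying the identity above: the Mahalanobis distance with $M = W W^\top$ equals the Euclidean distance between the linearly transformed points.

```python
import numpy as np

def mahalanobis(x, y, W):
    """Distance under M = W W^T; equals ||W^T x - W^T y||_2."""
    diff = x - y
    M = W @ W.T                                  # PSD metric matrix induced by W
    d_metric = float(np.sqrt(diff @ M @ diff))   # Mahalanobis distance under M
    d_projected = float(np.linalg.norm(W.T @ diff))  # Euclidean distance after projection
    return d_metric, d_projected
```

Learning the metric $M$ thus amounts to learning the linear projection $W$.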
ADAPTIVE DISTANCE METRIC LEARNING: THE NONLINEAR CASE
Kernel method: a nonlinear mapping $\psi$ into a Hilbert space $\mathcal{H}$ (the feature space)
Symmetric kernel function $K$: $K(x_i, x_j) = \langle \psi(x_i), \psi(x_j) \rangle$ (inner product); kernel Gram matrix $G$: $G_{ij} = K(x_i, x_j)$
$\psi_K(X)$: the data matrix in the feature space
For a given kernel function $K$, the nonlinear adaptive metric learning problem is formulated by applying the linear formulation to $\psi_K(X)$ in place of $X$
$G$ is restricted to a convex combination of $p$ kernel matrices: $G = \sum_{\ell=1}^{p} \theta_\ell G_\ell$, with $\theta_\ell \ge 0$ and $\sum_{\ell} \theta_\ell = 1$
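A small numpy sketch (an illustration, not the paper's code) of the kernel construction: several RBF Gram matrices are combined with convex weights, which preserves symmetry and positive semidefiniteness.

```python
import numpy as np

def rbf_gram(X, gamma):
    # Gram matrix G_ij = exp(-gamma * ||x_i - x_j||^2) over rows of X
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def combined_gram(Gs, theta):
    # convex combination G = sum_l theta_l G_l (theta_l >= 0, sum = 1)
    theta = np.asarray(theta, dtype=float)
    assert np.all(theta >= 0) and np.isclose(theta.sum(), 1.0)
    return sum(t * G for t, G in zip(theta, Gs))
```

NAML learns the weights theta; here they would simply be supplied by the caller.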
3.1 The computation of L for given Q and G
For fixed Q and G, the problem reduces to a trace maximization over L, i.e., clustering (kernel K-means) of the data projected through Q.
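A sketch of the standard spectral relaxation assumed here (not the paper's own code): dropping the discreteness constraint on L, maximizing trace(Lᵀ G̃ L) subject to LᵀL = I is solved by the top-k eigenvectors of the symmetric projected matrix G̃ (Ky Fan theorem).

```python
import numpy as np

def relaxed_indicator(G_proj, k):
    # top-k eigenvectors of the symmetric matrix G_proj maximize
    # trace(L^T G_proj L) subject to L^T L = I; np.linalg.eigh
    # returns eigenvalues in ascending order, so take the last k columns
    w, V = np.linalg.eigh(G_proj)
    return V[:, -k:]
```

The relaxed L is then rounded back to a discrete clustering (e.g., by K-means on its rows).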
3.2 The computation of Q for given L and G
For fixed L and G, the optimal Q is obtained from a trace maximization solved by an eigen-decomposition.
3.3 The computation of G for given Q and L
For fixed Q and L, the optimal weights of the convex kernel combination are obtained by solving a convex optimization problem (using the MOSEK package).
NAML
Time complexity: $O(p k^3 n^3)$
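A heavily simplified, runnable sketch of the alternating (EM-style) structure of NAML. It is not the paper's algorithm: the kernel weights theta are held fixed and equal instead of being optimized by the convex program, and Q is taken as the top-k eigenvectors of the regularized combined Gram matrix; the function names and parameters are illustrative.

```python
import numpy as np

def rbf_kernel(X, gamma):
    # Gram matrix for an RBF kernel on the rows of X
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kmeans_labels(Z, k, iters=50, seed=0):
    # plain Lloyd's k-means on the rows of Z
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), k, replace=False)].copy()
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((Z[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

def naml_sketch(X, k, gammas, lam=1e-3, outer=5):
    # alternating loop: combined kernel -> projection Q -> clustering L
    Gs = [rbf_kernel(X, g) for g in gammas]
    theta = np.ones(len(Gs)) / len(Gs)   # fixed equal weights; NAML learns these
    labels = None
    for _ in range(outer):
        G = sum(t * Gi for t, Gi in zip(theta, Gs))
        w, V = np.linalg.eigh(G + lam * np.eye(len(X)))
        Q = V[:, -k:]                    # simplified stand-in for the paper's Q step
        Z = G @ Q                        # data projected to k dimensions
        labels = kmeans_labels(Z, k)
    return labels
```

The cubic factors in the $O(p k^3 n^3)$ bound come from the repeated eigen-decompositions and the kernel-weight optimization over $n \times n$ matrices.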
Experiments
• K-means algorithm as the baseline for comparison
• Three representative unsupervised distance metric learning algorithms: Principal Component Analysis (PCA), Locally Linear Embedding (LLE), and Laplacian Eigenmap (Leigs)
Performance Measures
─ c_i: the obtained cluster indicator of sample i; y_i: the true class label of sample i
─ C: the set of cluster indicators; Y: the set of class labels
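The slide's formula image is lost; a common clustering accuracy measure consistent with these definitions maps each cluster indicator to the best-matching class label and counts the agreements. A brute-force sketch (an assumption about the intended measure, adequate for small numbers of clusters):

```python
from itertools import permutations

def clustering_accuracy(y_true, y_pred):
    # fraction of samples whose mapped cluster indicator matches the
    # true class label, maximized over all cluster-to-class mappings
    clusters = sorted(set(y_pred))
    classes = sorted(set(y_true))
    best = 0.0
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        acc = sum(mapping[c] == y for c, y in zip(y_pred, y_true)) / len(y_true)
        best = max(best, acc)
    return best
```

For larger numbers of clusters the optimal mapping is usually found with the Hungarian algorithm instead of enumeration.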
Experimental Results (10 RBF kernels for NAML)
Sensitivity Study: the effect of the input kernels and the regularization parameter λ
NAML provides a way to learn from multiple input kernels and generate a metric with which an unsupervised learning algorithm, like K-means, is more likely to perform as well as with the best input kernel.
[Figure: K-means using each of the 10 input kernels vs. NAML, including cases where the quality of the initial kernel is low]
A series of different λ values ranging from 10^-8 to 10^5
• A λ value in the range [10^-4, 10^2] is helpful in most cases
Conclusions
The joint kernel learning, metric learning, and clustering in NAML can be formulated as a trace maximization problem, which can be solved iteratively in an EM framework.
NAML is effective in learning a good distance metric and improving the clustering performance.
The framework naturally handles multiple types of biological data, e.g., amino acid sequences, hydropathy profiles, and gene expression data.
Future work includes studying how to combine a set of pre-specified Laplacian matrices to achieve better performance in spectral clustering.
Comments
Advantage
─ A joint framework for kernel learning, distance metric learning, and clustering
Shortcomings
Applications
─ Clustering