Spoken Language Group
Chinese Information Processing Lab.
Institute of Information Science
Academia Sinica, Taipei, Taiwan
http://sovideo.iis.sinica.edu.tw/SLG/index.htm
Multiple Parameter Selection of Support Vector Machine
Hung-Yi Lo
2007/07/11
Outline
Phonetic Boundary Refinement Using Support Vector Machine (ICASSP’07, ICSLP’07)
Automatic Model Selection for Support Vector Machine (Distance Metric Learning for Support Vector Machine)
Automatic Model Selection for Support Vector Machine (Distance Metric Learning for Support Vector Machine)
Automatic Model Selection for SVM
The problem of choosing a good parameter or model setting for better generalization ability is called model selection.
The support vector machine has two parameters: the regularization parameter C and the Gaussian kernel width parameter γ.
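Support vector machine formulation:
$$\min_{(w,b,\xi)\in R^{n+1+m}} \; C e^\top \xi + \frac{1}{2} w^\top w \quad \text{s.t.} \quad D(Aw + eb) + \xi \ge e, \;\; \xi \ge 0 \qquad \text{(QP)}$$
Gaussian kernel:
$$K(x,y) = e^{-\gamma \sum_{i=1}^{n} (x_i - y_i)^2}$$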
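As a minimal sketch of the Gaussian kernel on this slide, the following evaluates K(x, y) for two feature vectors. The function name `gaussian_kernel` is an illustrative choice, not something from the slides.

```python
import math

def gaussian_kernel(x, y, gamma):
    """Gaussian kernel K(x, y) = exp(-gamma * sum_i (x_i - y_i)^2)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same dimension")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Identical points give the maximum kernel value; it decays with distance.
k_same = gaussian_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)
k_far = gaussian_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)
```

A larger γ makes the kernel value fall off faster with distance, which is why γ has to be tuned together with C.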
Automatic Model Selection for SVM
C.-M. Huang, Y.-J. Lee, Dennis K. J. Lin and S.-Y. Huang. "Model Selection for Support Vector Machines via Uniform Design", special issue on Machine Learning and Robust Data Mining, Computational Statistics and Data Analysis. (To appear)
Automatic Model Selection for SVM
Strength:
  Automates the SVM training process; almost no human effort is needed.
  The objective of the model selection procedure is directly related to test performance. In my experiments, test accuracy was always better than with manual tuning.
  The nested uniform-design-based method is much faster than exhaustive grid search.
Weakness:
  There is no closed-form solution, so an experimental search is needed.
  Time consuming.
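The exhaustive grid search used as the baseline here can be sketched as follows. This is only an illustration: `toy_score` is a hypothetical stand-in for the cross-validation accuracy of an SVM trained with a given (C, γ), and the log2-spaced grid ranges are assumed values. The nested uniform-design method itself is not reproduced.

```python
import math

def grid_search(score, log2_c_values, log2_gamma_values):
    """Exhaustively evaluate score(C, gamma) on a log2-spaced grid.

    Returns (best_score, best_log2_c, best_log2_gamma, n_evaluations).
    """
    best = None
    n_evaluations = 0
    for log2_c in log2_c_values:
        for log2_gamma in log2_gamma_values:
            s = score(2.0 ** log2_c, 2.0 ** log2_gamma)
            n_evaluations += 1
            if best is None or s > best[0]:
                best = (s, log2_c, log2_gamma)
    return best + (n_evaluations,)

def toy_score(C, gamma):
    # Hypothetical validation score, peaked at C = 2^3 and gamma = 2^-5.
    # In practice this would train an SVM and return its validation accuracy.
    return 1.0 - 0.01 * ((math.log2(C) - 3) ** 2 + (math.log2(gamma) + 5) ** 2)

result = grid_search(toy_score, range(-5, 16, 2), range(-15, 4, 2))
```

Every grid point costs one full training and validation run, so the 11 x 10 grid above already needs 110 runs. That cost is the "time consuming" weakness listed on the slide.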
Distance Metric Learning
L. Yang, "Distance Metric Learning: A Comprehensive Survey", Ph.D. survey
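Much work has been done on learning a quadratic (Mahalanobis) distance measure:
$$d_{ij} = (x_i - x_j)^\top Q \,(x_i - x_j)$$
where x_i is the input vector for the i-th training case and Q is a symmetric, positive semi-definite matrix.
Distance metric learning is equivalent to a feature transformation. With Q = A^T A and y_i = A x_i:
$$\begin{aligned} d_{ij} &= (x_i - x_j)^\top A^\top A \,(x_i - x_j) \\ &= (A x_i - A x_j)^\top (A x_i - A x_j) \\ &= (y_i - y_j)^\top (y_i - y_j) \end{aligned}$$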
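The equivalence between the quadratic distance and a feature transformation can be checked numerically. The sketch below assumes the factorization Q = A^T A with y = A x; the matrix A and the two points are arbitrary illustrative values.

```python
def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(M, N):
    """Matrix-matrix product for list-of-rows matrices."""
    cols = list(zip(*N))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def quadratic_distance(xi, xj, Q):
    """d_ij = (xi - xj)^T Q (xi - xj)."""
    diff = [a - b for a, b in zip(xi, xj)]
    return sum(d * q for d, q in zip(diff, matvec(Q, diff)))

def squared_euclidean(yi, yj):
    return sum((a - b) ** 2 for a, b in zip(yi, yj))

A = [[2.0, 0.0], [1.0, 1.0]]       # illustrative transformation
Q = matmul(transpose(A), A)        # Q = A^T A is symmetric positive semi-definite
xi, xj = [1.0, 2.0], [3.0, -1.0]

d_metric = quadratic_distance(xi, xj, Q)
d_transformed = squared_euclidean(matvec(A, xi), matvec(A, xj))
```

Both values agree, so learning Q amounts to learning a linear map A and then using the ordinary Euclidean distance.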
Supervised Distance Metric Learning
  Global Distance Metric Learning by Convex Programming
  Local
    Local Adaptive Distance Metric Learning
    Neighborhood Components Analysis
    Relevant Component Analysis
Unsupervised Distance Metric Learning
  Linear embedding
    PCA, MDS
  Nonlinear embedding
    LLE, ISOMAP, Laplacian Eigenmaps
Distance Metric Learning based on SVM
  Large Margin Nearest Neighbor Based Distance Metric Learning
  Cast Kernel Margin Maximization into a SDP problem
Kernel Methods for Distance Metrics Learning
  Kernel Alignment with SDP
  Learning with Idealized Kernel
Distance Metric Learning
Strength:
  Usually has a closed-form solution.
Weakness:
  The objective of distance metric learning is based on a data-distribution criterion, not on evaluation performance.
Automatic Multiple Parameter Selection for SVM
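Gaussian kernel:
$$K(x,y) = e^{-\gamma \sum_{i=1}^{n} (x_i - y_i)^2}$$
Traditionally, each dimension of the feature vector is normalized to zero mean and unit standard deviation, so each dimension contributes equally to the kernel.
However, some features should be more important, so each dimension can be given its own width parameter:
$$K(x,y) = e^{-\sum_{i=1}^{n} \gamma_i (x_i - y_i)^2}$$
which is equivalent to diagonal distance metric learning:
$$K(x,y) = e^{-(x-y)^\top Q \,(x-y)}$$
where Q is the diagonal matrix with entries γ_1, ..., γ_n.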
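A small sketch of the per-feature kernel and its quadratic-form equivalent, assuming Q is the diagonal matrix of the per-feature widths. The function names and numeric values are illustrative only.

```python
import math

def weighted_gaussian_kernel(x, y, gammas):
    """K(x, y) = exp(-sum_i gamma_i * (x_i - y_i)^2), one width per feature."""
    return math.exp(-sum(g * (a - b) ** 2 for g, a, b in zip(gammas, x, y)))

def quadratic_form_kernel(x, y, Q):
    """K(x, y) = exp(-(x - y)^T Q (x - y)) for a list-of-rows matrix Q."""
    diff = [a - b for a, b in zip(x, y)]
    Qd = [sum(q * d for q, d in zip(row, diff)) for row in Q]
    return math.exp(-sum(d * v for d, v in zip(diff, Qd)))

gammas = [0.5, 2.0]                 # the second feature is weighted more heavily
Q = [[0.5, 0.0], [0.0, 2.0]]        # diagonal matrix built from the same widths
x, y = [1.0, 0.0], [0.0, 1.0]

k_weighted = weighted_gaussian_kernel(x, y, gammas)
k_quadratic = quadratic_form_kernel(x, y, Q)
```

When all γ_i are equal, this reduces to the ordinary single-width Gaussian kernel.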
Automatic Multiple Parameter Selection for SVM
$$K(x,y) = e^{-(x-y)^\top Q \,(x-y)}$$
I would like to approach this task by experimental search, incorporating a data-distribution criterion as a heuristic. This is much more time consuming and might only be applicable to small data sets.
Feature selection is a similar task that can also be solved by experimental search; it corresponds to restricting the diagonal entries of Q to zero or one. It is applicable to large data sets, but many publications on it already exist.
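Feature selection as a search over zero/one diagonals can be sketched as below. Note the assumptions: the evaluation criterion is leave-one-out nearest-neighbor accuracy under the masked kernel, used as a cheap stand-in for training and validating an SVM, and the four-point data set is made up so that only the first feature is informative.

```python
import math
from itertools import product

def masked_kernel(x, y, mask, gamma=1.0):
    """Gaussian kernel whose diagonal Q has entries gamma * mask_i, mask_i in {0, 1}."""
    d = sum(m * (a - b) ** 2 for m, a, b in zip(mask, x, y))
    return math.exp(-gamma * d)

def loo_accuracy(X, labels, mask):
    """Leave-one-out accuracy when each point takes the label of its most similar neighbor."""
    correct = 0
    for i, xi in enumerate(X):
        others = [j for j in range(len(X)) if j != i]
        nearest = max(others, key=lambda j: masked_kernel(xi, X[j], mask))
        correct += labels[nearest] == labels[i]
    return correct / len(X)

def best_mask(X, labels):
    """Exhaustive experimental search over all non-empty 0/1 diagonals."""
    n = len(X[0])
    candidates = [m for m in product((0, 1), repeat=n) if any(m)]
    return max(candidates, key=lambda m: loo_accuracy(X, labels, m))

# Toy data: feature 0 separates the classes, feature 1 is large-scale noise.
X = [(0.0, 5.0), (0.2, -5.0), (1.0, 5.2), (1.2, -5.2)]
labels = [0, 0, 1, 1]
selected = best_mask(X, labels)
```

The number of candidate masks grows as 2^n - 1, so an exhaustive search like this is only feasible for a handful of features; larger problems need a heuristic search.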
Thank you!