Presenter : JHOU, YU-LIANG Authors : Yiu-ming Cheung, Hong Jia 2013,PR

Post on 21-Mar-2016

Categorical-and-numerical-attribute data clustering based on a unified similarity metric without knowing cluster number. Outlines: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments.

Intelligent Database Systems Lab

Presenter : JHOU, YU-LIANG

Authors : Yiu-ming Cheung, Hong Jia

2013,PR

Categorical-and-numerical-attribute data clustering based on a unified similarity metric without knowing cluster number

Outlines
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments

Motivation
• It is a nontrivial task to perform clustering on mixed data because there exists an awkward gap between the similarity metrics for categorical and numerical data.

Objectives
• This paper presents a general clustering framework based on the concept of object-cluster similarity and gives a unified similarity metric that can be applied to data with categorical, numerical, and mixed attributes.

Methodology
Object-cluster similarity metric
• categorical attributes

Methodology
Object-cluster similarity metric
• numerical attributes
• mixed data
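
The formulas on these slides did not survive in the transcript, so the following is only an illustrative sketch of what an object-cluster similarity for mixed data could look like, not the authors' definition. It assumes (hypothetically) that the categorical part is the fraction of cluster members sharing the object's value on each attribute, that the numerical part is a normalized exponential of squared Euclidean distance to cluster centroids, and that the two parts are combined with weights proportional to the number of categorical attributes. All function names are made up for this example.

```python
import math


def categorical_similarity(obj_cat, cluster_cat):
    """Mean, over categorical attributes, of the fraction of cluster
    members that share the object's value (assumed equal weights)."""
    if not obj_cat or not cluster_cat:
        return 0.0
    n = len(cluster_cat)
    total = 0.0
    for r, value in enumerate(obj_cat):
        matches = sum(1 for member in cluster_cat if member[r] == value)
        total += matches / n
    return total / len(obj_cat)


def numerical_similarity(obj_num, centroids, j):
    """Assumed form: exp(-0.5 * d_j) / sum_t exp(-0.5 * d_t), where d_t is
    the squared Euclidean distance from the object to centroid t."""
    dists = [sum((a - b) ** 2 for a, b in zip(obj_num, c)) for c in centroids]
    d_min = min(dists)  # shift by the minimum for numerical stability
    weights = [math.exp(-0.5 * (d - d_min)) for d in dists]
    return weights[j] / sum(weights)


def mixed_similarity(obj_cat, obj_num, cluster_cat, centroids, j):
    """Assumed combination: each categorical attribute counts once and the
    whole numerical vector counts once."""
    d_c = len(obj_cat)
    s_c = categorical_similarity(obj_cat, cluster_cat)
    s_n = numerical_similarity(obj_num, centroids, j)
    return (d_c * s_c + s_n) / (d_c + 1)
```

For an object ("red", "small") and a cluster whose members are ("red", "small") and ("red", "large"), the categorical part is (2/2 + 1/2) / 2 = 0.75 under these assumptions.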

Methodology
Iterative clustering algorithm
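
The algorithm itself is not reproduced in the transcript. As a rough sketch of the general idea of iterative clustering driven by object-cluster similarity, the code below repeatedly assigns each object to the cluster it is most similar to until no assignment changes. The similarity function (a simple frequency match on categorical values), the round-robin initialization, and all names are assumptions made for this example, not the authors' procedure.

```python
def similarity(obj, members):
    """Hypothetical object-cluster similarity: mean fraction of cluster
    members sharing the object's value on each categorical attribute."""
    if not members:
        return 0.0
    n = len(members)
    return sum(
        sum(1 for m in members if m[r] == v) / n for r, v in enumerate(obj)
    ) / len(obj)


def iterative_clustering(data, k, max_iter=100):
    """Assign each object to its most similar cluster until stable."""
    # Assumed initialization: round-robin labels.
    labels = [i % k for i in range(len(data))]
    for _ in range(max_iter):
        changed = False
        for i, obj in enumerate(data):
            # Rebuild cluster memberships from the current labels.
            clusters = [
                [data[n] for n, lab in enumerate(labels) if lab == j]
                for j in range(k)
            ]
            best = max(range(k), key=lambda j: similarity(obj, clusters[j]))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels
```

Here the number of clusters k is fixed in advance; the automatic selection of the cluster number discussed on the following slides is not part of this sketch.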

Methodology
Automatic selection of cluster number
• Competition mechanism

Methodology
Automatic selection of cluster number
• Penalized mechanism

Experiments
Data sets

Experiments
Mixed data

Experiments
Categorical data

Conclusions
• The proposed approach reduces the time consumed by the clustering process, improves its efficiency, and overcomes the cluster number selection problem.

Comments
• Advantages: saves time and improves efficiency.
• Applications: clustering.