Cluster analysis. Partition Methods Divide data into disjoint clusters Hierarchical Methods Build a...
Date posted: 20-Dec-2015
Cluster analysis
• Partition Methods
Divide data into disjoint clusters.
• Hierarchical Methods
Build a hierarchy of the observations and deduce the clusters from it.
K-means
Criteria
Same criteria with multivariate data:
Justifying the criteria
• ANOVA: decomposition of the variance.
Univariate:
SST = SSW + SSB
Multivariate:
Since the total variance is fixed for a given data set, minimizing the within-cluster variance is equivalent to maximizing the between-cluster variance (the difference between clusters).
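The univariate decomposition can be checked numerically. The sketch below is only an illustration: the function name `variance_decomposition` and the toy data are assumptions, not taken from the slides. It computes SST, SSW and SSB for two partitions of the same six values and shows that the partition with the smaller SSW has the larger SSB.

```python
from statistics import mean


def variance_decomposition(clusters):
    """Return (SST, SSW, SSB) for a list of clusters of univariate values."""
    all_values = [x for cluster in clusters for x in cluster]
    grand_mean = mean(all_values)
    # Total sum of squares: deviations from the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Within sum of squares: deviations from each cluster's own mean.
    ssw = sum((x - mean(c)) ** 2 for c in clusters for x in c)
    # Between sum of squares: cluster means around the grand mean, weighted by size.
    ssb = sum(len(c) * (mean(c) - grand_mean) ** 2 for c in clusters)
    return sst, ssw, ssb


# Hypothetical toy data: the same six values split in two different ways.
good = variance_decomposition([[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]])
bad = variance_decomposition([[1.0, 2.0, 10.0], [3.0, 11.0, 12.0]])

for name, (sst, ssw, ssb) in (("good", good), ("bad", bad)):
    print(f"{name}: SST={sst:.2f} SSW={ssw:.2f} SSB={ssb:.2f}")
```

SST is the same for both partitions, so any reduction in SSW appears as an equal increase in SSB.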
K-means algorithm
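The slide body is not preserved in this transcript, so the following is a minimal sketch of the standard (Lloyd-style) k-means iteration, not necessarily the exact variant presented. The names `k_means`, `squared_distance` and `centroid`, the random initialization and the toy points are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iter=100, seed=0):
    """Alternate assignment and centroid update until the labels stop changing."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: start from k random observations
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each center becomes the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = centroid(members)
    return labels, centers


# Hypothetical toy data: two well-separated groups in the plane.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = k_means(data, k=2)
print(labels, centers)
```

Each iteration cannot increase the within-cluster sum of squares, but the result can depend on the starting centers, so in practice the procedure is usually run from several initializations.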
Number of clusters
Consequences of standardization
Ruspini example
Problems of k-means
• Very sensitive to outliers
• Euclidean distances not appropriate for elliptical clusters
• It does not give the number of clusters.
Hierarchical Algorithms
Agglomerative algorithms
Nearest neighbour distance
Farthest neighbour distance
Average distance
Centroid method distance
Ward’s method distance
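The formulas for these five between-cluster distances are not preserved in the transcript. As a sketch, assuming the usual textbook definitions with Euclidean distance, they can be written as below. The function names and the two toy clusters are hypothetical, and the Ward quantity is taken to be the increase in within-cluster sum of squares caused by merging the two clusters.

```python
from itertools import product
from math import dist


def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def nearest_neighbour(a, b):
    """Single linkage: smallest distance between a point of a and a point of b."""
    return min(dist(p, q) for p, q in product(a, b))


def farthest_neighbour(a, b):
    """Complete linkage: largest distance between a point of a and a point of b."""
    return max(dist(p, q) for p, q in product(a, b))


def average_distance(a, b):
    """Average linkage: mean of all pairwise distances between a and b."""
    return sum(dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))


def centroid_distance(a, b):
    """Centroid method: distance between the two cluster centroids."""
    return dist(centroid(a), centroid(b))


def ward_increase(a, b):
    """Ward's method: increase in within-cluster sum of squares if a and b merge."""
    n_a, n_b = len(a), len(b)
    return n_a * n_b / (n_a + n_b) * dist(centroid(a), centroid(b)) ** 2


# Hypothetical toy clusters: two vertical pairs of points, three units apart.
a = [(0, 0), (0, 2)]
b = [(3, 0), (3, 2)]
print(nearest_neighbour(a, b), farthest_neighbour(a, b), average_distance(a, b))
print(centroid_distance(a, b), ward_increase(a, b))
```

An agglomerative algorithm merges, at each step, the pair of clusters for which the chosen distance is smallest, so the choice among these functions determines the shape of the resulting hierarchy.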
Dendrograms
Example
Problems of hierarchical clustering
• Slow when n is large: each merge step compares all pairs of current clusters, n(n-1)/2 pairs at the first step.
• Euclidean distances not always appropriate
• If n is large, the dendrogram is difficult to interpret
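To give a sense of scale for the n(n-1)/2 figure, the sketch below counts pairwise comparisons. The second function assumes a naive implementation that rescans every pair of remaining clusters at each merge; that is an assumption for illustration, and practical implementations avoid part of this work by updating a stored distance matrix.

```python
from math import comb


def initial_pair_count(n):
    """Number of pairs among n observations: n(n-1)/2."""
    return n * (n - 1) // 2


def naive_total_pair_count(n):
    """Pairs examined over all merges if every step rescans all current clusters.

    With m clusters remaining there are m(m-1)/2 pairs; m runs from n down to 2.
    """
    return sum(m * (m - 1) // 2 for m in range(2, n + 1))


print(initial_pair_count(1000))      # 499500
print(naive_total_pair_count(1000))  # 166666500
```

The naive total equals the binomial coefficient C(n+1, 3), so it grows with the cube of n.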
Clustering by variables
Distances between quantitative variables
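The formula used on this slide is not preserved in the transcript. One common choice, assumed here and not necessarily the one the slides use, is a correlation-based distance such as 1 - r², which treats strongly correlated variables as close regardless of the sign of the correlation. The function names and data below are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_distance(x, y):
    """Assumed distance between variables: 1 - r^2 (0 = perfectly correlated)."""
    return 1.0 - pearson_r(x, y) ** 2


# Hypothetical variables measured on the same four observations.
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]    # perfectly positively correlated with x
z = [8, 6, 4, 2]    # perfectly negatively correlated with x
w = [1, -1, -1, 1]  # uncorrelated with x

print(correlation_distance(x, y), correlation_distance(x, z), correlation_distance(x, w))
```

With a distance matrix built this way, the variables can be fed to the same agglomerative procedure used for observations.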
Distances between qualitative variables
Similarity between attributes