Cluster Analysis


Transcript of Cluster Analysis

Page 1: Cluster Analysis

Cluster Analysis

• Market Segmentation
• Document Similarity

Page 2: Cluster Analysis

Segment Members

Page 3: Cluster Analysis

Segment Members

Biz

Tech Math

= 64

Main Groups

Page 4: Cluster Analysis

• Each object is assigned to its own cluster and then the algorithm proceeds iteratively, at each stage joining the two most similar clusters, continuing until there is just a single cluster.

• At each stage distances between clusters are recomputed by the Lance–Williams dissimilarity update formula according to the particular clustering method being used.

Hierarchical Clustering
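
The procedure described on this slide can be sketched in a few lines of R. This is an illustrative, deliberately naive version, not the code that hclust() actually runs. The function names lance_williams and naive_agglomerate are made up for this example, the data are random toy points rather than the survey file, and the coefficients shown are those for complete linkage (alpha_i = alpha_j = 1/2, beta = 0, gamma = 1/2), which make the Lance–Williams update equal to the larger of the two distances.

```r
# Lance-Williams update: distance from cluster k to the merger of i and j.
# Default coefficients are for complete linkage.
lance_williams <- function(d_ki, d_kj, d_ij,
                           alpha_i = 0.5, alpha_j = 0.5,
                           beta = 0, gamma = 0.5) {
  alpha_i * d_ki + alpha_j * d_kj + beta * d_ij + gamma * abs(d_ki - d_kj)
}

# Naive agglomerative clustering: returns the merge heights in merge order.
naive_agglomerate <- function(d) {
  D <- as.matrix(d)
  diag(D) <- Inf                    # never merge a cluster with itself
  active <- rep(TRUE, nrow(D))      # every object starts as its own cluster
  heights <- numeric(0)
  while (sum(active) > 1) {
    idx <- which(active)
    sub <- D[idx, idx, drop = FALSE]
    pos <- which(sub == min(sub), arr.ind = TRUE)[1, ]
    i <- idx[pos[1]]
    j <- idx[pos[2]]
    heights <- c(heights, D[i, j])  # join the two most similar clusters
    # Recompute distances from each remaining cluster k to the merged cluster,
    # keeping the merged cluster in slot i and retiring slot j.
    for (k in setdiff(idx, c(i, j))) {
      D[i, k] <- D[k, i] <- lance_williams(D[k, i], D[k, j], D[i, j])
    }
    active[j] <- FALSE
  }
  heights
}

set.seed(1)
x <- matrix(rnorm(20), ncol = 2)    # toy data, 10 points in 2 dimensions
d_toy <- dist(x)

# Compare the naive merge heights with those reported by hclust()
all.equal(naive_agglomerate(d_toy), hclust(d_toy)$height)
```

Swapping in other coefficient values gives other linkage methods; for example gamma = -1/2 turns the update into the smaller of the two distances (single linkage).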

Page 5: Cluster Analysis
Page 6: Cluster Analysis

biztech <- read.csv("survey-biztech.csv")
biztech <- as.matrix(biztech)

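# hierarchical clustering
d <- dist(as.matrix(biztech))   # pairwise distances between rows (Euclidean by default)
dm <- as.matrix(d)              # expand the dist object into a full square matrix
write.csv(dm, "distance_matrix.csv")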

Hierarchical Clustering
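
To see what dist() produces, here is a small sketch on three made-up points (pts is toy data, not the survey). With the default method the result holds Euclidean distances between rows, stored as a compact lower triangle; as.matrix() expands it to a square, symmetric matrix.

```r
# Three toy points on a line through the origin
pts <- rbind(a = c(0, 0), b = c(3, 4), c = c(6, 8))

d_pts <- dist(pts)        # default method = "euclidean"
dm_pts <- as.matrix(d_pts)

dm_pts["a", "b"]          # sqrt(3^2 + 4^2) = 5
dm_pts["a", "c"]          # sqrt(6^2 + 8^2) = 10
isSymmetric(dm_pts)       # distance from a to b equals distance from b to a
```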

Page 7: Cluster Analysis

hc <- hclust(d)                        # complete linkage is the default method
plot(hc)                               # draw the dendrogram
rect.hclust(hc, k=6, border="red")     # outline the 6 clusters in red
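
The call above relies on the default linkage. As a hedged sketch of how the linkage can be chosen explicitly, the example below uses random toy data (toy is an assumption for illustration, not the survey file) and the method argument of hclust().

```r
set.seed(42)
toy <- matrix(rnorm(40), ncol = 2)    # 20 toy observations
d_toy <- dist(toy)

hc_complete <- hclust(d_toy)                      # same as method = "complete"
hc_average  <- hclust(d_toy, method = "average")  # average linkage (UPGMA)
hc_ward     <- hclust(d_toy, method = "ward.D2")  # Ward's minimum variance criterion

hc_complete$method
hc_average$method
```

Different linkages can produce noticeably different dendrograms on the same distances, so it is worth plotting more than one before settling on a value of k.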

Page 8: Cluster Analysis

Hierarchical Clustering

ct <- cutree(hc, k=6)
# write to file
write.csv(ct, "survey-hclust.csv")
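
cutree() returns one cluster label per row. A minimal sketch of how those labels might be inspected, again on random toy data (toy, hc_toy and ct_toy are names invented for this example):

```r
set.seed(7)
toy <- matrix(rnorm(60), ncol = 3)    # 20 toy observations, 3 variables
hc_toy <- hclust(dist(toy))
ct_toy <- cutree(hc_toy, k = 6)       # integer label 1..6 for each row

table(ct_toy)                         # how many rows fall in each cluster

# Mean of each variable within each cluster
aggregate(as.data.frame(toy), by = list(cluster = ct_toy), FUN = mean)
```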

Page 9: Cluster Analysis
Page 10: Cluster Analysis
Page 11: Cluster Analysis

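• Hierarchical clustering is expensive in time and memory: it needs all n(n-1)/2 pairwise distances, and the naive agglomerative algorithm takes O(n^3) time.

• In return, it often gives a more informative result: the full dendrogram can be cut at any number of clusters, so k need not be fixed in advance.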
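
To make the cost concrete, this short sketch counts how many pairwise distances dist() has to store as the number of rows grows. The placeholder matrices are all zeros because only the size of the result matters here.

```r
n_values <- c(10, 100, 1000)

# Number of entries in the dist object for n rows
n_pairs <- sapply(n_values, function(n) {
  length(dist(matrix(0, nrow = n, ncol = 2)))
})

data.frame(n = n_values, pairwise_distances = n_pairs)

# The counts match n * (n - 1) / 2
all(n_pairs == choose(n_values, 2))
```

Multiplying the number of rows by 10 multiplies the number of stored distances by roughly 100, which is why the method becomes impractical for very large data sets.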

Page 12: Cluster Analysis

Cold Weather