Cluster Analysis
• Market Segmentation
• Document Similarity
[Figure: Segment Members (Biz, Tech, Math); Main Groups = 64]
• Each object is assigned to its own cluster and then the algorithm proceeds iteratively, at each stage joining the two most similar clusters, continuing until there is just a single cluster.
• At each stage distances between clusters are recomputed by the Lance–Williams dissimilarity update formula according to the particular clustering method being used.
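The update step described above can be sketched in R. This is a minimal illustration, not the internals of any package: `lance_williams()` is a hypothetical helper, and only the single, complete, and average linkage coefficients are shown. It computes the distance from some cluster k to the cluster formed by merging i and j, using only the three old distances and the cluster sizes.

```r
# Lance-Williams update: distance from cluster k to the merged cluster (i u j).
# lance_williams() is a hypothetical helper written for illustration only.
lance_williams <- function(d_ki, d_kj, d_ij, n_i = 1, n_j = 1,
                           method = c("single", "complete", "average")) {
  method <- match.arg(method)
  # Coefficients (alpha_i, alpha_j, beta, gamma) for each linkage method
  coef <- switch(method,
    single   = list(ai = 0.5, aj = 0.5, beta = 0, gamma = -0.5),
    complete = list(ai = 0.5, aj = 0.5, beta = 0, gamma = 0.5),
    average  = list(ai = n_i / (n_i + n_j), aj = n_j / (n_i + n_j),
                    beta = 0, gamma = 0)
  )
  coef$ai * d_ki + coef$aj * d_kj + coef$beta * d_ij +
    coef$gamma * abs(d_ki - d_kj)
}

# With d(k,i) = 2 and d(k,j) = 5:
lance_williams(2, 5, 1, method = "single")    # reduces to min(2, 5)
lance_williams(2, 5, 1, method = "complete")  # reduces to max(2, 5)
lance_williams(2, 5, 1, n_i = 1, n_j = 3, method = "average")  # size-weighted mean
```

The point of the formula is that the original data are never revisited: each merge only needs the previous distance matrix.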
Hierarchical Clustering
biztech <- read.csv("survey-biztech.csv")
biztech <- as.matrix(biztech)
# hierarchical clustering: pairwise Euclidean distances between rows
d <- dist(biztech)
dm <- as.matrix(d)  # full square distance matrix, for export
write.csv(dm, "distance_matrix.csv")
hc <- hclust(d)  # complete linkage by default
plot(hc)  # draw the dendrogram
rect.hclust(hc, k=6, border="red")  # outline the 6 clusters
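The linkage method passed to `hclust()` changes how distances between clusters are measured, and so changes the tree. The sketch below uses five made-up points on a line (`toy` is illustrative data, not the survey) to compare the default complete linkage with single linkage.

```r
# Five made-up points on the x-axis (illustrative data only)
toy <- matrix(c(0, 0,
                1, 0,
                5, 0,
                6, 0,
                20, 0), ncol = 2, byrow = TRUE)
d_toy <- dist(toy)

hc_complete <- hclust(d_toy)                     # method = "complete" (default)
hc_single   <- hclust(d_toy, method = "single")  # nearest-neighbour linkage

# Merge heights: the first two merges agree, the later ones differ
hc_complete$height
hc_single$height
```

Comparing the `height` vectors is a quick way to see how much the choice of method matters before plotting both dendrograms.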
ct <- cutree(hc, k=6)
#write to file
write.csv(ct, "survey-hclust.csv")
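Before writing the labels out, it can help to check how many rows fall into each cluster. The sketch below runs on randomly generated stand-in data, since the survey file is not available here; `toy`, `hc_toy`, and `ct_toy` are hypothetical names.

```r
set.seed(1)
# 20 made-up rows with 2 numeric columns each (stand-in for the survey data)
toy <- matrix(rnorm(40), nrow = 20)

hc_toy <- hclust(dist(toy))
ct_toy <- cutree(hc_toy, k = 6)  # one label in 1..6 per row

# Cluster sizes: a very small cluster may indicate outliers
table(ct_toy)
```

The same `table()` call applied to `ct` would give the segment sizes for the survey.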
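• Hierarchical clustering is expensive: it needs all n(n-1)/2 pairwise distances, so time and memory grow at least quadratically with the number of objects
• In return it can give better results, since the full dendrogram lets the number of clusters be chosen after the fact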
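A rough sense of the cost comes from counting the distances that `dist()` has to store. The sketch below is only arithmetic plus a small check on made-up data; the sizes in `n` are arbitrary examples.

```r
# Number of pairwise distances for a few example data set sizes
n <- c(100, 1000, 10000)
pairs <- n * (n - 1) / 2
data.frame(n = n, pairwise_distances = pairs)

# Check against dist() on 50 made-up rows: it stores one value per pair
x <- matrix(rnorm(50 * 3), nrow = 50)
length(dist(x)) == 50 * 49 / 2
```

Multiplying the number of rows by 10 multiplies the number of stored distances by roughly 100, which is why large data sets are often sampled or pre-clustered first.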