Cluster Analysis

• Market Segmentation
• Document Similarity

[Figure: segment members in the main groups Biz, Tech, and Math (= 64)]

• Each object is assigned to its own cluster and then the algorithm proceeds iteratively, at each stage joining the two most similar clusters, continuing until there is just a single cluster.

• At each stage distances between clusters are recomputed by the Lance–Williams dissimilarity update formula according to the particular clustering method being used.
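The two bullets above can be sketched in a few lines of R. The general Lance–Williams update for the distance from a cluster k to the merged cluster i∪j is d(k, i∪j) = α_i d(k,i) + α_j d(k,j) + β d(i,j) + γ |d(k,i) − d(k,j)|. The sketch below assumes complete linkage (α_i = α_j = 1/2, β = 0, γ = 1/2), which is the default method of `hclust()`. The function name `naive_agglomerate` is hypothetical, and the loop is written for readability, not speed.

```r
# Naive agglomerative clustering with a Lance-Williams update.
# Assumption: complete linkage (alpha_i = alpha_j = 1/2, beta = 0, gamma = 1/2).
# Returns the merge heights, one per merge, in merge order.
naive_agglomerate <- function(d) {
  D <- as.matrix(d)
  diag(D) <- Inf                       # a cluster is never merged with itself
  active <- rep(TRUE, nrow(D))         # every object starts as its own cluster
  heights <- numeric(0)
  while (sum(active) > 1) {
    idx <- which(active)
    sub <- D[idx, idx, drop = FALSE]
    pos <- which(sub == min(sub), arr.ind = TRUE)[1, ]
    i <- idx[pos[1]]                   # the two most similar clusters
    j <- idx[pos[2]]
    heights <- c(heights, D[i, j])
    # Lance-Williams update: distance from every other cluster to the merged one
    others <- setdiff(idx, c(i, j))
    new_d <- 0.5 * D[others, i] + 0.5 * D[others, j] +
      0.5 * abs(D[others, i] - D[others, j])
    D[others, i] <- new_d              # row/column i now represents the merged cluster
    D[i, others] <- new_d
    active[j] <- FALSE                 # cluster j has been absorbed into i
  }
  heights
}

# Usage on simulated data (not the survey data): the merge heights
# should agree with those of hclust() when there are no tied distances.
set.seed(1)
x <- matrix(rnorm(40), ncol = 2)
d <- dist(x)
all.equal(naive_agglomerate(d), hclust(d, method = "complete")$height)
```

With γ = 1/2 the update reduces to max(d(k,i), d(k,j)); switching to γ = −1/2 would give the minimum, that is, single linkage.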

Hierarchical Clustering

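# Load the survey data and convert it to a matrix
biztech <- read.csv("survey-biztech.csv")
biztech <- as.matrix(biztech)

# Hierarchical clustering: pairwise distances between rows (Euclidean by default)
d <- dist(biztech)
dm <- as.matrix(d)
write.csv(dm, "distance_matrix.csv")

# Build the tree (complete linkage by default), plot the dendrogram,
# and outline the 6 clusters in red
hc <- hclust(d)
plot(hc)
rect.hclust(hc, k = 6, border = "red")

# Cut the tree into 6 clusters and write the assignments to file
ct <- cutree(hc, k = 6)
write.csv(ct, "survey-hclust.csv")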
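Before writing the assignments to file, it can help to check how many members each cluster has. The sketch below shows one way to do that. Because survey-biztech.csv is not available here, it uses R's built-in `USArrests` data as a stand-in, and the scaling step is an assumption about preprocessing, not part of the original script.

```r
# Stand-in data: USArrests ships with R (50 rows, 4 numeric columns).
# Assumption: columns are scaled so that no single variable dominates the distance.
x <- scale(USArrests)

hc <- hclust(dist(x))        # complete linkage by default
ct <- cutree(hc, k = 6)      # integer cluster label for every row

sizes <- table(ct)           # number of members in each of the 6 clusters
print(sizes)

# Attach the labels to the row names so the output file is self-describing
assignments <- data.frame(member = names(ct), cluster = as.integer(ct))
head(assignments)
```

Writing `assignments` with `write.csv(assignments, ..., row.names = FALSE)` gives a two-column file that is easier to join back to the original data.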

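• Hierarchical clustering is expensive in time and memory: the pairwise distance matrix alone grows quadratically with the number of objects.

• In exchange, it can give better results, and the number of clusters does not have to be fixed in advance.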
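To get a feel for the cost, the sketch below counts the pairwise distances that `dist()` has to store for n objects, which is n(n − 1)/2. The helper `pair_count` and the sample sizes are illustrative assumptions, and the memory figure assumes 8 bytes per stored distance (one double each).

```r
# Number of pairwise distances for n objects: n * (n - 1) / 2
pair_count <- function(n) n * (n - 1) / 2

n <- c(100, 1000, 10000, 100000)
cost <- data.frame(
  objects   = n,
  distances = pair_count(n),
  approx_MB = pair_count(n) * 8 / 1e6   # assumes 8 bytes per distance
)
print(cost)

# Check against an actual dist object for a small case
x <- matrix(rnorm(100 * 3), ncol = 3)
length(dist(x)) == pair_count(100)
```

The distance count grows roughly a hundredfold for every tenfold increase in the number of objects, which is why the approach becomes impractical for very large data sets.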

Cold Weather