R Programming/Clustering

Basic clustering

k-means clustering is available through the kmeans() function.

First, create some data: 100 observations (rows) of 10 variables (columns)

> dat <- matrix(rnorm(1000), nrow=100, ncol=10)

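To apply kmeans(), you must specify the number of clusters. Because both the data and the starting centres are random, the counts you obtain will differ from those shown here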

> cl <- kmeans(dat, 3) # here 3 is the number of clusters
> table(cl$cluster)
 1  2  3 
38 44 18
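
The object returned by kmeans() holds more than the cluster labels. The following is a minimal sketch of how it might be inspected; the seed value, the 100 x 10 data matrix and the choice of nstart = 20 are assumptions made for illustration only, not recommendations from the text above.

```r
set.seed(42)                                        # arbitrary seed, only to make the run repeatable
dat <- matrix(rnorm(1000), nrow = 100, ncol = 10)   # 100 observations, 10 variables

# nstart = 20 runs the algorithm from 20 random starts and keeps the best fit
cl <- kmeans(dat, centers = 3, nstart = 20)

length(cl$cluster)   # one cluster label per row of dat
dim(cl$centers)      # one centre per cluster, one column per variable
cl$size              # cluster sizes, the same counts as table(cl$cluster)
cl$tot.withinss      # total within-cluster sum of squares (lower is tighter)
```

Using several random starts is a common way to reduce the dependence of the result on the initial centres.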

The basic hierarchical clustering function is hclust(), which operates on the dissimilarity structure produced by the dist() function

> hc <- hclust(dist(dat)) # data matrix from the example above
> plot(hc)
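
Both dist() and hclust() take a method argument. The sketch below, which assumes a freshly generated 100 x 10 matrix and an arbitrary seed, spells out the defaults (Euclidean distance, complete linkage) and fits an average-linkage tree for comparison; the variable names are illustrative.

```r
set.seed(42)                                        # arbitrary seed, only to make the run repeatable
dat <- matrix(rnorm(1000), nrow = 100, ncol = 10)   # 100 observations, 10 variables

d <- dist(dat, method = "euclidean")           # pairwise distances between rows (the default)
hc_complete <- hclust(d)                       # default linkage is "complete"
hc_average  <- hclust(d, method = "average")   # average linkage on the same distances

attr(d, "Size")           # number of observations behind the distance object
dim(hc_complete$merge)    # n - 1 merge steps, two columns each
range(hc_complete$height) # lowest and highest merge heights in the tree
```

The merge heights are what cutree() compares against when a tree is cut at a given height.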

The resulting tree can be cut with the cutree() function.

Cut it at a given height

> cl <- cutree(hc, h=5.1)
> table(cl)
cl
 1  2  3  4  5 
23 33 29  4 11

Cut it to obtain a given number of clusters

> cl <- cutree(hc, k=5)
> table(cl)
cl
 1  2  3  4  5 
23 33 29  4 11
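
The two ways of cutting are related: a height that falls between two consecutive merge heights corresponds to a fixed number of clusters. The following sketch, under the assumption of an arbitrary seed and a freshly generated 100 x 10 matrix, picks such a height for k = 5 and checks that both calls give the same labels. The names n, k and h_mid are illustrative.

```r
set.seed(42)                                        # arbitrary seed, only to make the run repeatable
dat <- matrix(rnorm(1000), nrow = 100, ncol = 10)   # 100 observations, 10 variables
hc <- hclust(dist(dat))

n <- nrow(dat)
k <- 5

# After n - k merges exactly k clusters remain, so any height between
# merge n - k and merge n - k + 1 cuts the tree into k clusters.
h_mid <- mean(hc$height[c(n - k, n - k + 1)])

by_k <- cutree(hc, k = k)
by_h <- cutree(hc, h = h_mid)

all(by_k == by_h)      # the two cuts agree
table(by_k)            # cluster sizes

# Passing a vector of k values returns one column of labels per value
several <- cutree(hc, k = 2:4)
dim(several)
```

This also suggests why the two examples above can print identical tables: the chosen height happened to correspond to five clusters.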