了解R中的kmeans聚类 [英] Understanding kmeans clustering in r

查看：232 发布时间：2020/4/26 10:26:28 r k-means

本文介绍了了解R中的kmeans聚类的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

下面的代码(减去我的问题)生成了这张图:

Below code (minus my questions) generates this graph :

我用->"标记了4个混乱的地方

I have marked 4 areas of confusion with "->"

> m <- matrix(c(1,1,1) , ncol=3)
> 
> x <- rbind(matrix(c(1,0,1) , ncol=3),
+            matrix(c(1,1,1) , ncol=3),
+            matrix(c(1,1,0) , ncol=3),
+            matrix(c(0,1,1) , ncol=3),
+            matrix(c(0,0,1) , ncol=3),
+            matrix(c(0,0,0) , ncol=3),
+            matrix(c(1,1,1) , ncol=3),
+            matrix(c(1,1,1) , ncol=3),
+            matrix(c(1,1,0) , ncol=3),
+            matrix(c(1,0,0) , ncol=3),
+            matrix(c(0,0,1) , ncol=3),
+            matrix(c(0,0,0) , ncol=3),
+            matrix(c(0,0,1) , ncol=3),
+            matrix(c(0,1,1) , ncol=3),
+            matrix(c(1,0,1) , ncol=3),
+            matrix(c(0,1,0) , ncol=3))
> colnames(x) <- c("google", "stackoverflow", "tester")
> (cl <- kmeans(x, 3))

K-means clustering with 3 clusters of sizes 3, 10, 3
-> Where are sizes 3, 10 3 appearing  ?

Cluster means:
     google stackoverflow tester
1 0.6666667           1.0      0
2 0.5000000           0.5      1
3 0.3333333           0.0      0

-> There are three clusters, but what does each number signify ?

Clustering vector:
 [1] 2 2 1 2 2 3 2 2 1 3 2 3 2 2 2 1

-> This looks to be created by summing the values of each matrix but seems to be unordered as second element in this vector is '2' but second element in 'x' is matrix(c(1,1,1) , ncol=3) which is '3'

Within cluster sum of squares by cluster:
[1] 0.6666667 5.0000000 0.6666667
 (between_SS / total_SS =  46.1 %)

-> what are between_SS & total_SS ?

Available components:

[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"        
> plot(x, col = cl$cluster)
> points(cl$centers, col = 1:5, pch = 8, cex = 2)
>

通过阅读该算法的实现，可以提供这些问题的答案( http://en.wikipedia.org/wiki/K-means_clustering )我看不到r如何计算这些值

Can provide answers to these questions as from reading the implementation of this algorithm (http://en.wikipedia.org/wiki/K-means_clustering) I fail to see how r is computing these values

了解R中的kmeans聚类 [英] Understanding kmeans clustering in r

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

了解R中的kmeans聚类 [英] Understanding kmeans clustering in r

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭