Getting the coordinates of every observation at each iteration of kmeans in R

Question

I would like to construct an animation of the kmeans clustering algorithm in R. The animation would show each of the observations (rows) in the dataset plotted in 2 (or 3) dimensions and then have them move into their clusters as each iteration ticks by.

For this I would need to access the coordinates of the observations at each iteration. Where in the kmeans package can I access these?

Thanks

Recommended answer

I don't think kmeans() outputs this kind of tracing information. Your best bet may be to re-run kmeans() multiple times with iter.max=1, carrying over the cluster centers from each run to the next.

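# Run kmeans() one iteration at a time, feeding the centers of each run
# into the next one. With iter.max = 1, kmeans() may warn that it did not
# converge; that is expected here.
set.seed(1)
clus.1 <- kmeans(iris[, 1:2], centers = 5, iter.max = 1)
clus.2 <- kmeans(iris[, 1:2], centers = clus.1$centers, iter.max = 1)
clus.3 <- kmeans(iris[, 1:2], centers = clus.2$centers, iter.max = 1)

# Observations whose cluster label differs between any of the three runs
changing <- which(apply(cbind(clus.1$cluster, clus.2$cluster, clus.3$cluster), 1, sd) > 0)
changing

# One panel per iteration; circle the observations that change cluster
opar <- par(mfrow = c(1, 3))
plot(iris[, c(1, 2)], col = clus.1$cluster, pch = 19, main = "Iteration 1")
points(iris[changing, c(1, 2)], pch = 21, cex = 2)
plot(iris[, c(1, 2)], col = clus.2$cluster, pch = 19, main = "Iteration 2")
points(iris[changing, c(1, 2)], pch = 21, cex = 2)
plot(iris[, c(1, 2)], col = clus.3$cluster, pch = 19, main = "Iteration 3")
points(iris[changing, c(1, 2)], pch = 21, cex = 2)
par(opar)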

I circle the points that change cluster membership; unfortunately, only one does so, because kmeans() just converges so darn fast ;-)
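
The same idea can be wrapped in a loop so that every intermediate state is kept. What follows is only a sketch under assumptions: `kmeans_trace` is a hypothetical helper (not part of the stats package), the starting centers are drawn from the distinct rows of the data, and `algorithm = "Lloyd"` is chosen here (the answer above uses the default algorithm) so that each call amounts to one assignment step followed by one center update.

```r
# Hypothetical helper, not part of base R or the stats package:
# run kmeans() one iteration at a time and keep every intermediate result.
kmeans_trace <- function(x, k, max_steps = 20, seed = 1) {
  set.seed(seed)
  # Start from k distinct rows; kmeans() rejects duplicated initial centers.
  ux <- unique(x)
  centers <- as.matrix(ux[sample(nrow(ux), k), , drop = FALSE])
  steps <- list()
  prev <- NULL
  for (i in seq_len(max_steps)) {
    # iter.max = 1 makes kmeans() warn about non-convergence; expected here.
    fit <- suppressWarnings(
      kmeans(x, centers = centers, iter.max = 1, algorithm = "Lloyd")
    )
    steps[[i]] <- list(cluster = fit$cluster, centers = fit$centers)
    # Stop once no observation changes cluster between two steps.
    if (!is.null(prev) && identical(fit$cluster, prev)) break
    prev <- fit$cluster
    centers <- fit$centers
  }
  steps
}

steps <- kmeans_trace(iris[, 1:2], k = 5)
length(steps)  # number of single-iteration steps that were run

# Number of observations that switch cluster from one step to the next
vapply(seq_along(steps)[-1], function(i) {
  sum(steps[[i]]$cluster != steps[[i - 1]]$cluster)
}, integer(1))
```

Each element of `steps` holds the cluster labels and the updated centers for one step, which is the per-iteration information that kmeans() itself does not return.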

You write that you would like to "have them move into their clusters as each iteration ticks by". Of course points don't move in clustering algorithms. So a color-coded representation like this one is your best bet.
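
What does move from one iteration to the next are the cluster centers, so an animation can redraw the same fixed points with updated colors and centers. A minimal sketch, assuming 2-D data; `animate_kmeans` is a hypothetical helper, and `algorithm = "Lloyd"`, the marker styles and the pause length are illustrative choices rather than anything taken from the answer.

```r
# Hypothetical helper (not part of any package): draw one frame per k-means
# step. Observations stay where they are; only colors and centers change.
animate_kmeans <- function(x, k, steps = 6, pause = 0.5, seed = 1) {
  set.seed(seed)
  ux <- unique(x)
  centers <- as.matrix(ux[sample(nrow(ux), k), , drop = FALSE])
  history <- vector("list", steps)
  for (i in seq_len(steps)) {
    # One assignment + one center update; the warning is expected.
    fit <- suppressWarnings(
      kmeans(x, centers = centers, iter.max = 1, algorithm = "Lloyd")
    )
    plot(x, col = fit$cluster, pch = 19, main = paste("Iteration", i))
    # Crosses: centers used for this assignment; stars: updated centers
    points(centers, pch = 4, cex = 2, lwd = 2)
    points(fit$centers, pch = 8, cex = 2, lwd = 2, col = seq_len(k))
    history[[i]] <- fit$centers
    centers <- fit$centers
    Sys.sleep(pause)
  }
  invisible(history)  # centers after each step, one matrix per frame
}

# Frames are only useful on an interactive graphics device
if (interactive()) animate_kmeans(iris[, 1:2], k = 3)
```

To save the frames instead of watching them, the same function could be called between opening and closing a file device such as `png()` with a numbered file name pattern.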

In more than two dimensions, you can try pairs(), or just concentrate on two dimensions. Be prepared to explain why n-dimensional clusters don't look "cluster-like" when projected to two dimensions.
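
For the pairs() route, a short sketch on all four numeric iris columns. The choice of 3 clusters, `nstart = 10` and the `plot_all_pairs` wrapper are assumptions made for illustration only.

```r
set.seed(1)
x <- iris[, 1:4]
fit <- kmeans(x, centers = 3, nstart = 10)

# Hypothetical wrapper: one scatterplot per pair of variables, with the same
# cluster colors in every panel.
plot_all_pairs <- function(x, cluster) {
  pairs(x, col = cluster, pch = 19,
        main = "k-means clusters in every 2-D projection")
}

if (interactive()) plot_all_pairs(x, fit$cluster)
```

Clusters that are separated in four dimensions can overlap in some panels, because each panel drops two of the variables used for the clustering.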
