Getting the coordinates of every observation at each iteration of kmeans in R
Problem description
I would like to construct an animation of the kmeans clustering algorithm in R. The animation would show each of the observations (rows) in the dataset plotted in 2 (or 3) dimensions and then have them move into their clusters as each iteration ticks by.
For this I would need to access the coordinates of the observations at each iteration. Where in the kmeans package can I access these?
Thanks
Recommended answer
I don't think kmeans() outputs this kind of tracing information. Your best bet may be to re-run kmeans() multiple times with iter.max=1, carrying the cluster centers over from one run to the next.
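set.seed(1)
# Run kmeans() one iteration at a time; each run starts from the previous centers.
# kmeans() may warn that it did not converge in 1 iteration, which is expected here.
clus.1 <- kmeans(iris[,1:2],5,iter.max=1)
clus.2 <- kmeans(iris[,1:2],centers=clus.1$centers,iter.max=1)
clus.3 <- kmeans(iris[,1:2],centers=clus.2$centers,iter.max=1)
# Rows whose cluster label differs in at least one of the three runs
changing <- which(apply(cbind(clus.1$cluster,clus.2$cluster,clus.3$cluster),1,sd)>0)
changing
# One panel per iteration, colored by cluster; changing points get a larger open circle
opar <- par(mfrow=c(1,3))
plot(iris[,c(1,2)],col=clus.1$cluster,pch=19,main="Iteration 1")
points(iris[changing,c(1,2)],pch=21,cex=2)
plot(iris[,c(1,2)],col=clus.2$cluster,pch=19,main="Iteration 2")
points(iris[changing,c(1,2)],pch=21,cex=2)
plot(iris[,c(1,2)],col=clus.3$cluster,pch=19,main="Iteration 3")
points(iris[changing,c(1,2)],pch=21,cex=2)
# Restore the previous graphics settings
par(opar)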
I indicate the points that do change cluster membership; unfortunately, only one does, because kmeans() just converges so darn fast ;-)
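The three hand-written runs above can be generalized into a loop. The following is only a sketch under the same assumptions as the code above (two columns of iris, 5 clusters, iter.max=1); the names n_steps, steps, assignments and changed are made up for illustration and are not part of kmeans().

```r
# Sketch: repeat single-iteration kmeans() runs, carrying the centers over,
# and keep the cluster labels and centers from every step.
set.seed(1)
x <- iris[, 1:2]
n_steps <- 5  # hypothetical number of single-iteration runs to record

# suppressWarnings() hides the expected "did not converge" warning from iter.max = 1
fit <- suppressWarnings(kmeans(x, centers = 5, iter.max = 1))
steps <- vector("list", n_steps)
steps[[1]] <- list(cluster = fit$cluster, centers = fit$centers)

for (i in 2:n_steps) {
  # Start each run from the centers produced by the previous run
  fit <- suppressWarnings(kmeans(x, centers = fit$centers, iter.max = 1))
  steps[[i]] <- list(cluster = fit$cluster, centers = fit$centers)
}

# One row per observation, one column per step
assignments <- sapply(steps, function(s) s$cluster)

# Observations whose label differs in at least one step
changed <- which(apply(assignments, 1, function(r) length(unique(r)) > 1))
changed
```

The per-step centers in steps[[i]]$centers are the only coordinates that change between iterations; the observations themselves stay where they are.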
You write that you would like to "have them move into their clusters as each iteration ticks by". Of course points don't move in clustering algorithms. So a color-coded representation like this one is your best bet.
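For an animation-like result, one option is to draw one color-coded frame per iteration and flip through them. The sketch below writes each frame to a page of a multi-page PDF; the file name, n_frames and the choice to mark the centers with crosses are illustrative assumptions, not something the question or kmeans() prescribes.

```r
# Sketch: one color-coded plot per single-iteration kmeans() run,
# written as consecutive pages of a PDF that can be flipped through.
set.seed(1)
x <- iris[, 1:2]
k <- 5
n_frames <- 4  # hypothetical number of frames

out_file <- file.path(tempdir(), "kmeans_frames.pdf")  # hypothetical output path
pdf(out_file)  # each plot() call below starts a new page

centers <- k  # first run: let kmeans() pick k random starting centers
for (i in seq_len(n_frames)) {
  # iter.max = 1 is expected to trigger a non-convergence warning
  fit <- suppressWarnings(kmeans(x, centers = centers, iter.max = 1))
  centers <- fit$centers  # carry the centers over to the next frame

  plot(x, col = fit$cluster, pch = 19, main = paste("Iteration", i))
  # The centers are what actually moves between frames
  points(fit$centers, col = seq_len(k), pch = 4, cex = 2, lwd = 2)
}

dev.off()
```

On an interactive graphics device, the same loop without pdf() and dev.off(), plus a Sys.sleep() call at the end of each pass, would show the frames one after another.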
In more than two dimensions, you can try pairs(), or just concentrate on two dimensions. Be prepared to explain why n-dimensional clusters don't look "cluster-like" when projected to two dimensions.
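As a rough illustration of the pairs() suggestion, the sketch below clusters all four numeric columns of iris and colors every pairwise scatterplot by cluster. The choice of 3 clusters is an assumption made only for this example.

```r
# Sketch: cluster in four dimensions, then look at every 2-D projection.
set.seed(1)
x4 <- iris[, 1:4]
fit4 <- kmeans(x4, centers = 3)  # 3 clusters is an arbitrary choice here

# Each panel is a projection onto two of the four variables,
# so clusters that are separated in 4-D may overlap in a given panel.
pairs(x4, col = fit4$cluster, pch = 19)
```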