Hierarchical clustering with R

Question

Consider the following points:

A = (1, 2.5), B = (5, 10), C = (23, 34), D = (45, 47), E = (4, 17), F = (18, 4)

How can I perform hierarchical clustering on them with R?
I've read the Cluster Analysis example, but I'm not sure how to enter these values as points rather than just regular numbers.

When I do this:

x <- c(...) #x values
y <- c(...) #y values

I can use

plot(x,y)

But how can I specify those values like in the example:

mydata <- scale(mydata)

If I try

mydata <- scale(x,y)

I receive the following error:

Error in scale.default(x, y) : 
  length of 'center' must equal the number of columns of 'x'

Recommended answer

Something like this?

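# Each point is a length-2 vector: (x, y)
A = c(1, 2.5); B = c(5, 10); C = c(23, 34)
D = c(45, 47); E = c(4, 17); F = c(18, 4)
# One row per point; the row names A..F become the dendrogram labels
df <- data.frame(rbind(A,B,C,D,E,F))
colnames(df) <- c("x","y")
# Distance matrix, then hierarchical clustering on it
hc <- hclust(dist(df))
plot(hc)   # draw the dendrogram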

This puts the points into a data frame with two columns, x and y, then calculates the distance matrix (the pairwise distance between every point and every other point; dist() uses Euclidean distance by default), and does the hierarchical cluster analysis on that (hclust() uses complete linkage by default).
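The `scale(x, y)` call in the question fails because the second argument of `scale()` is `center`, not a second column of data. The following is a minimal sketch, assuming you want to standardize the coordinates before clustering as the tutorial in the question does: combine the coordinates into one matrix first, then scale that. The names `pts`, `pts_scaled`, `d` and `hc_scaled` are illustrative and not part of the original answer.

```r
# Coordinates from the question, one row per point
pts <- rbind(A = c(1, 2.5), B = c(5, 10), C = c(23, 34),
             D = c(45, 47), E = c(4, 17), F = c(18, 4))
colnames(pts) <- c("x", "y")

# scale() takes a single matrix; its second argument is `center`,
# which is why scale(x, y) raised the error in the question
pts_scaled <- scale(pts)

# Pairwise Euclidean distances (the default method of dist())
d <- dist(pts_scaled)

# Complete-linkage clustering (the default method of hclust())
hc_scaled <- hclust(d)
plot(hc_scaled)
```

Whether to scale at all is a judgment call: it matters mostly when the columns are in different units, and it can change which points end up grouped together.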

We can then plot the data, coloring each point by its cluster.

df$cluster <- cutree(hc,k=2)    # identify 2 clusters
plot(y~x,df,col=cluster)
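To see which point landed in which cluster without reading it off the colors, the memberships returned by `cutree()` can be inspected directly. This is a minimal self-contained sketch; `pts`, `hc2` and `groups` are illustrative names, and the choice of `k = 2` simply mirrors the call above.

```r
# Same six points, built directly as a data frame with row names
pts <- data.frame(x = c(1, 5, 23, 45, 4, 18),
                  y = c(2.5, 10, 34, 47, 17, 4),
                  row.names = c("A", "B", "C", "D", "E", "F"))

hc2 <- hclust(dist(pts))
groups <- cutree(hc2, k = 2)   # named integer vector, one entry per point

print(groups)          # cluster id for each of A..F
print(table(groups))   # number of points in each cluster

# Scatter plot colored by cluster, with each point labeled by name
plot(y ~ x, pts, col = groups, pch = 19)
text(pts$x, pts$y, labels = rownames(pts), pos = 3)
```

Passing `h = <height>` to `cutree()` instead of `k` cuts the dendrogram at a given height rather than asking for a fixed number of clusters.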
