How to convert two factors to an adjacency matrix in R?
Question
I have a data frame with two columns (key and value), each of which is a factor:
df = data.frame(gl(3,4,labels=c('a','b','c')), gl(6,2))
colnames(df) = c("key", "value")
   key value
1    a     1
2    a     1
3    a     2
4    a     2
5    b     3
6    b     3
7    b     4
8    b     4
9    c     5
10   c     5
11   c     6
12   c     6
I want to convert it to an adjacency matrix (3x6 in this case), like:
  1 2 3 4 5 6
a 1 1 0 0 0 0
b 0 0 1 1 0 0
c 0 0 0 0 1 1
That way I can run clustering on it (grouping keys that have similar values) with either kmeans or hclust.
The closest I was able to get was using model.matrix(~ value, df), which results in:
   (Intercept) value2 value3 value4 value5 value6
1            1      0      0      0      0      0
2            1      0      0      0      0      0
3            1      1      0      0      0      0
4            1      1      0      0      0      0
5            1      0      1      0      0      0
6            1      0      1      0      0      0
7            1      0      0      1      0      0
8            1      0      0      1      0      0
9            1      0      0      0      1      0
10           1      0      0      0      1      0
11           1      0      0      0      0      1
12           1      0      0      0      0      1
But this result is not yet grouped by key.
On the other hand, I can collapse this dataset into groups using:
aggregate(df$value, by=list(df$key), unique)
  Group.1 x.1 x.2
1       a   1   2
2       b   3   4
3       c   5   6
But I don't know what to do next...
Can someone help me solve this?
Recommended answer
In base R:
res <- table(df)
res[res > 0] <- 1
res
#    value
#key 1 2 3 4 5 6
#  a 1 1 0 0 0 0
#  b 0 0 1 1 0 0
#  c 0 0 0 0 1 1
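This works because table() cross-tabulates the two factors, giving a count for every (key, value) pair. Capping the counts at 1 turns that into a 0/1 matrix. The following is a minimal sketch of the same idea carried through to the clustering step the question mentions. The names tab, adj, and hc and the choice of the "binary" distance are assumptions made for illustration, not part of the original answer.

```r
# Rebuild the example data from the question
df <- data.frame(key   = gl(3, 4, labels = c("a", "b", "c")),
                 value = gl(6, 2))

# Count how often each (key, value) pair occurs
tab <- table(df)

# Drop the "table" class to get a plain integer matrix,
# then reduce counts to presence/absence (0/1)
adj <- unclass(tab)
adj[adj > 0] <- 1L

# Cluster the keys (rows). The "binary" distance is an assumed choice
# that suits 0/1 data; other distances may fit other data better.
hc <- hclust(dist(adj, method = "binary"))
```

In this toy data no two keys share a value, so the clustering itself is not informative. With real data, plot(hc) or cutree(hc, k = ...) could be used to inspect the groups.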