Couple the data in all possible combinations
Question
I have data in two columns like this:
Id Value
1 a
2 f
1 c
1 h
2 a
and I'd like to couple the data of the 'Value' column in all possible combinations among rows that share the same Id, such as
(a,c)
(a,h)
(c,h)
(f,a)
Is there any R, Python, or VBA code that can do this?
Recommended answer
To return a character matrix with these combinations using base R, try
do.call(rbind, t(sapply(split(df, df$Id), function(i) t(combn(i$Value, 2)))))
[,1] [,2]
[1,] "a" "c"
[2,] "a" "h"
[3,] "c" "h"
[4,] "f" "a"
Each row is one of the desired combinations.
To break this down a bit, split splits the data.frame by Id into a list of two data.frames. Then sapply is fed this list and a function that calls combn to find the pairwise combinations within each data.frame. The result for each data.frame (which is a matrix with one combination per column) is transposed with t so that each combination becomes a row, matching your desired structure. Finally, this list of matrices is fed to do.call, which uses rbind to stack them into the final matrix.
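Since the question also asks about Python, the same split-then-combine idea can be sketched with the standard library. This is only a minimal sketch and not part of the original answer: the function name pairs_within_groups and the list-of-tuples input format are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def pairs_within_groups(rows):
    """Return every 2-element combination of values that share an Id.

    `rows` is assumed to be an iterable of (id, value) tuples.
    """
    # Step 1: group the values by Id (the role split plays in the R answer).
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)

    # Step 2: build the pairwise combinations inside each group
    # (the role combn plays), visiting the Ids in sorted order.
    pairs = []
    for key in sorted(groups):
        pairs.extend(combinations(groups[key], 2))
    return pairs


# The data from the question, as (Id, Value) tuples.
rows = [(1, "a"), (2, "f"), (1, "c"), (1, "h"), (2, "a")]
result = pairs_within_groups(rows)
for pair in result:
    print(pair)
```

A group that contains a single value simply contributes no pairs, because combinations yields nothing when fewer than two items are available.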
Note: This assumes that the Value column is character (not the pesky factor variable type). That is easy to arrange in the read. family of functions, like read.csv and read.table, by adding the as.is=TRUE argument to your read function (or the longer stringsAsFactors=FALSE, which is already the default since R 4.0.0). If the variable is already a factor, you can wrap i$Value near the end of the call in as.character, as in as.character(i$Value), and it will run as desired.