Iterate over unique combinations of groups in a data frame


Question

I have 3 different groups summarised in a data frame. The data frame looks like this:

d <- data.frame(v1 = c("A","A","A","B","B","B","C","C","C"),
                v2 = 1:9, stringsAsFactors = FALSE)

I want to compare the values of A against the values of B, then the values of A against the values of C, and finally the values of B against the values of C.

I constructed two nested for loops that iterate over v1 to extract the groups to compare. However, the loops give me every possible ordered pair, including self-comparisons and reversed duplicates:

A vs. A
A vs. B
A vs. C
B vs. A
B vs. B
B vs. C
C vs. A and so on...

Here are my for loops:

for(i in unique(d$v1)) {
    for(j in unique(d$v1)) {
        cat("i = ", i, "j = ", j, "\n")

        group1 <- d[which(d$v1 == i), ]
        group2 <- d[which(d$v1 == j), ]

        print(group1)
        print(group2)

        cat("---------------------\n\n")
    }
}

How can I iterate over data frame d so that in the first iteration group1 contains the values of A and group2 contains the values of B, in the second iteration group1 contains the values of A and group2 the values of C, and in the last iteration group1 contains the values of B and group2 the values of C?

I am totally stuck on this problem and hope to find an answer here.

Cheers!

Recommended answer

Perhaps something like this would work for you. With some more work, the output can be tidied up a bit too.

We'll use combn to find the unique pairs of groups, and lapply to subset the data for each pair:

temp = combn(unique(d$v1), 2)
temp
#     [,1] [,2] [,3]
# [1,] "A"  "A"  "B" 
# [2,] "B"  "C"  "C" 
lapply(1:ncol(temp), function(x) cbind(d[d$v1 == temp[1, x], ],
                                       d[d$v1 == temp[2, x], ]))
# [[1]]
#   v1 v2 v1 v2
# 1  A  1  B  4
# 2  A  2  B  5
# 3  A  3  B  6
# 
# [[2]]
#   v1 v2 v1 v2
# 1  A  1  C  7
# 2  A  2  C  8
# 3  A  3  C  9
# 
# [[3]]
#   v1 v2 v1 v2
# 4  B  4  C  7
# 5  B  5  C  8
# 6  B  6  C  9
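
If you would rather keep the loop structure from the question, with separate group1 and group2 objects in each iteration, the same combn result can drive a single for loop. The following is only a sketch under a few assumptions: the names pairs and results are made up for illustration, and the difference of group means is a placeholder for whatever comparison you actually need.

```r
# Example data from the question
d <- data.frame(v1 = c("A","A","A","B","B","B","C","C","C"),
                v2 = 1:9, stringsAsFactors = FALSE)

# Each column of `pairs` holds one unordered pair of group labels
pairs <- combn(unique(d$v1), 2)

# `results` is a hypothetical container, used here only for illustration
results <- vector("list", ncol(pairs))

for (k in seq_len(ncol(pairs))) {
    i <- pairs[1, k]
    j <- pairs[2, k]
    cat("i = ", i, "j = ", j, "\n")

    group1 <- d[d$v1 == i, ]
    group2 <- d[d$v1 == j, ]

    # Placeholder comparison (assumption): difference of the group means
    results[[k]] <- mean(group1$v2) - mean(group2$v2)
}

# Label each result with the pair it belongs to
names(results) <- paste(pairs[1, ], "vs.", pairs[2, ])
```

Because combn returns each pair only once and never pairs a group with itself, the loop body runs three times for this data instead of the nine iterations of the nested loops.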
