Using dplyr to group_by and conditionally mutate a dataframe by group


Question

I'd like to use dplyr functions to group_by and conditionally mutate a df. Given this sample data:

A   B   C   D
1   1   1   0.25
1   1   2   0
1   2   1   0.5
1   2   2   0
1   3   1   0.75
1   3   2   0.25
2   1   1   0
2   1   2   0.5
2   2   1   0
2   2   2   0
2   3   1   0
2   3   2   0
3   1   1   0.5
3   1   2   0
3   2   1   0.25
3   2   2   1
3   3   1   0
3   3   2   0.75

I want to use new column E to categorize A by whether B == 1, C == 2, and D > 0. For each unique value of A for which all of these conditions hold true, then E = 1, else E = 0. So, the output should look like this:

A   B   C   D    E
1   1   1   0.25 0
1   1   2   0    0
1   2   1   0.5  0
1   2   2   0    0
1   3   1   0.75 0
1   3   2   0.25 0
2   1   1   0    1
2   1   2   0.5  1
2   2   1   0    1
2   2   2   0    1
2   3   1   0    1
2   3   2   0    1
3   1   1   0.5  0
3   1   2   0    0
3   2   1   0.25 0
3   2   2   1    0
3   3   1   0    0
3   3   2   0.75 0

I initially tried this code but the conditionals don't seem to be working right:

 foo$E <- foo %>% 
    group_by(A) %>% 
    mutate(E = {if (B == 1 & C == 2 & D > 0) 1 else 0})

Any insights appreciated. Thanks!

Recommended answer

@eipi10's answer works. However, I think you should use case_when instead of ifelse. It is vectorised and will be much faster on larger datasets.

foo %>% group_by(A) %>%
  mutate(E = case_when(any(B == 1 & C == 2 & D > 0) ~ 1, TRUE ~ 0))
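
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same grouped any() test. The name result and the as.integer() shortcut are my own choices for illustration, not part of the original answer. The key point is that any() reduces the row-wise condition to a single TRUE/FALSE per group, whereas a bare if () expects a single value (and in recent R versions raises an error when given a longer vector). Note also that the original attempt assigned the whole piped data frame to foo$E rather than to foo.

```r
library(dplyr)

# Sample data from the question, rebuilt so the snippet is self-contained.
foo <- data.frame(
  A = rep(1:3, each = 6),
  B = rep(rep(1:3, each = 2), times = 3),
  C = rep(1:2, times = 9),
  D = c(0.25, 0, 0.5, 0, 0.75, 0.25,
        0, 0.5, 0, 0, 0, 0,
        0.5, 0, 0.25, 1, 0, 0.75)
)

result <- foo %>%
  group_by(A) %>%
  # any() collapses the row-wise test to one TRUE/FALSE per group;
  # mutate() then recycles that single value to every row of the group.
  mutate(E = as.integer(any(B == 1 & C == 2 & D > 0))) %>%
  ungroup()

# Which values of A were flagged with E = 1?
unique(result$A[result$E == 1])
```

With the sample data, only A = 2 has a row where B == 1, C == 2 and D > 0, so that is the only group flagged, which matches the expected output in the question.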
