COUNTIF equivalent in dplyr summarise

Question

I have a data frame listing, for each group (ID), the total number of students (Stu) and the number of students taking part in an activity (Sub):

     ID   Stu   Sub
  (int) (int) (int)
1   101    80    NA
2   102   130    NA
3   103    10    NA
4   104   210    20
5   105   180    NA
6   106   150    NA

I would like to know the number of groups in each size band (>400, >200, >100, >0) that are either involved in an activity (Sub > 0) or not (Sub is NA).

library(dplyr)

output <- structure(list(ID = c(101L, 102L, 103L, 104L, 105L, 106L), 
                       Stu = c(80L, 130L, 10L, 210L, 180L, 150L), 
                       Sub = c(NA,NA, NA, 20L, NA, NA)), 
                  .Names = c("ID", "Stu", "Sub"), 
                  class = c("tbl_df", "data.frame"), 
                  row.names = c(NA, -6L))

temp <- output %>% 
  mutate(Stu = ifelse(Stu >= 400, 400,
                      ifelse(Stu >= 200, 200,
                             ifelse(Stu >= 100, 100, 0)))) %>%
  group_by(Stu) %>%
  summarise(entries = length(!is.na(Sub)),
            noentries = length(is.na(Sub)))

The result should be:

    Stu entries noentries
  (dbl)   (int)     (int)
1     0       0         2
2   100       0         3
3   200       1         0

But I get:

    Stu entries noentries
  (dbl)   (int)     (int)
1     0       2         2
2   100       3         3
3   200       1         1

How can I make the length function inside summarise act like a COUNTIF?

Recommended answer

length() returns the number of elements in a vector whether they are TRUE or FALSE, so both columns simply report the group size. sum() counts only the TRUE values, so sum instead of length does the job:

output %>% 
  mutate(Stu = ifelse(Stu >= 400, 400,
                      ifelse(Stu >= 200, 200,
                             ifelse(Stu >= 100, 100, 0
                             )))) %>%
  group_by(Stu) %>% 
  summarise(entries = sum(!is.na(Sub)),
            noentries = sum(is.na(Sub)))

Source: local data frame [3 x 3]

    Stu entries noentries
  (dbl)   (int)     (int)
1     0       0         2
2   100       0         3
3   200       1         0
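
The difference between length and sum on a logical vector can be checked in base R without dplyr. The snippet below is a minimal sketch; the vector `sub` is a hypothetical stand-in for the Sub values of one band and is not part of the original data.

```r
# Hypothetical vector mirroring the Sub column for one band of groups
sub <- c(NA, 20L, NA)

# length() counts elements, regardless of whether they are TRUE or FALSE
n_elements <- length(!is.na(sub))

# sum() treats TRUE as 1 and FALSE as 0, so it counts matches (COUNTIF-style)
n_entries   <- sum(!is.na(sub))
n_noentries <- sum(is.na(sub))

# A condition on the values themselves needs na.rm = TRUE,
# because comparing NA with a number yields NA
n_positive <- sum(sub > 0, na.rm = TRUE)

print(c(n_elements, n_entries, n_noentries, n_positive))
```

The same pattern, sum(condition), should work inside summarise for any logical condition, with na.rm = TRUE added whenever the condition itself can evaluate to NA.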
