dplyr: count the number of occurrences of one specific value of a variable
Question
Say I have a dataset like this:
id <- c(1, 1, 2, 2, 3, 3)
code <- c("a", "b", "a", "a", "b", "b")
dat <- data.frame(id, code)
i.e.,
id code
1 1 a
2 1 b
3 2 a
4 2 a
5 3 b
6 3 b
Using dplyr, how would I get a count of how many a's there are for each id?
i.e.,
id countA
1 1 1
2 2 2
3 3 0
I'm trying things like this, which isn't working:
countA<- dat %>%
group_by(id) %>%
summarise(cip.completed= count(code == "a"))
The above gives me an error: "Error: no applicable method for 'group_by_' applied to an object of class "logical""
Thanks for your help!
Recommended answer
Try the following:
library(dplyr)
dat %>% group_by(id) %>%
summarise(cip.completed= sum(code == "a"))
Source: local data frame [3 x 2]
id cip.completed
(dbl) (int)
1 1 1
2 2 2
3 3 0
This works because the logical condition code == "a" yields a vector of TRUE/FALSE values, which sum() treats as ones and zeros, so the sum of this vector is the number of occurrences.
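To make the coercion concrete, here is a minimal base R sketch using the same code vector as in the question. The name is_a is only an illustrative helper and does not appear in the original answer.

```r
code <- c("a", "b", "a", "a", "b", "b")

# Comparing a character vector with "a" gives a logical vector
is_a <- code == "a"

# Logical values coerce to 1 (TRUE) and 0 (FALSE)
as.integer(is_a)  # 1 0 1 1 0 0

# Summing the logical vector therefore counts the matches
sum(is_a)         # 3
```

Inside summarise, the same sum is simply evaluated once per id group.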
Note that you would not use dplyr::count inside summarise anyway, as count is itself a wrapper around summarise that calls either n() or sum(). See ?dplyr::count. If you really want to use count, I guess you could first filter the dataset to retain only the rows in which code == "a"; count would then give you only the strictly positive (i.e. non-zero) counts, so any id without an a is dropped from the result. For instance,
dat %>% filter(code == "a") %>% count(id)
Source: local data frame [2 x 2]
id n
(dbl) (int)
1 1 1
2 2 2
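If the zero counts should be kept while still using count, one possible variation (not part of the original answer) is to pass the logical condition as the wt argument, so that count sums it within each id. This is a sketch that assumes a reasonably recent dplyr in which count() accepts an expression for wt and has a name argument; the names count_a and countA are chosen here for illustration.

```r
library(dplyr)

dat <- data.frame(
  id   = c(1, 1, 2, 2, 3, 3),
  code = c("a", "b", "a", "a", "b", "b")
)

# wt is summed within each id, so a logical weight counts the TRUE rows;
# ids with no matching rows are kept, with a count of 0.
count_a <- dat %>% count(id, wt = code == "a", name = "countA")
count_a
```

Unlike the filter-then-count version, no rows are discarded before grouping, which is why id 3 remains in the output.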