Using group_by to create aggregated counts conditional on value
Question
I have a data table that looks like this:
   serialno state type type2
1       100    FL    A     C
2       100    CA    A     D
3       101    CA    B     D
4       102    GA    A     C
5       103    WA    A     C
6       103    PA    B     C
7       104    CA    B     D
8       104    CA    B     C
9       105    NY    A     D
10      105    NJ    B     C
I need to create a new data table that is aggregated by serialno but holds, for each serialno, the count of every distinct value found in the existing variables. The end result would look like this:
FL CA GA A B C D
100 1 1 2 1 1
101 1 1 1 1
102 1 1
103 1 1 1 1 2
104 2 2 1 1
105 1 1 1 1 1 1
I'm sure there is a solution using some combination of group_by and summarize, but I haven't been able to figure it out. Is the easiest solution to first spread out the state, type, and type2 columns and then use summarize to create the counts?
Recommended answer
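The snippets in this answer assume the question's table is stored in a data frame named dat. A minimal sketch of how it could be rebuilt for testing follows; stringsAsFactors = FALSE is an assumption made here to keep the value columns as plain character vectors, and it is not stated in the question.

```r
# Rebuild the question's example table as a plain data frame.
# Assumption: the three value columns are character, not factor.
dat <- data.frame(
  serialno = c(100, 100, 101, 102, 103, 103, 104, 104, 105, 105),
  state    = c("FL", "CA", "CA", "GA", "WA", "PA", "CA", "CA", "NY", "NJ"),
  type     = c("A", "A", "B", "A", "A", "B", "B", "B", "A", "B"),
  type2    = c("C", "D", "D", "C", "C", "C", "D", "C", "D", "C"),
  stringsAsFactors = FALSE
)

# Quick structural check: 10 rows, 4 columns.
str(dat)
```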
Using:
library(reshape2)
recast(dat, serialno ~ value, id.var = 'serialno', fun.aggregate = length)
gives:
  serialno A B C CA D FL GA NJ NY PA WA
1      100 2 0 1  1 1  1  0  0  0  0  0
2      101 0 1 0  1 1  0  0  0  0  0  0
3      102 1 0 1  0 0  0  1  0  0  0  0
4      103 1 1 2  0 0  0  0  0  0  1  1
5      104 0 2 1  2 1  0  0  0  0  0  0
6      105 1 1 1  0 1  0  0  1  1  0  0
Or:
library(dplyr)
library(tidyr)
dat %>%
  gather(key, val, state:type2) %>%
  group_by(serialno, val) %>%
  tally() %>%
  spread(val, n, fill = 0)
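In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The following is a sketch of the same pipeline with the newer verbs and is not part of the original answer. It assumes the question's table with character value columns, rebuilt here so the block runs on its own, and it uses count() as shorthand for group_by() plus tally().

```r
library(dplyr)
library(tidyr)

# Assumption: the question's table, with character (not factor) columns.
dat <- data.frame(
  serialno = c(100, 100, 101, 102, 103, 103, 104, 104, 105, 105),
  state    = c("FL", "CA", "CA", "GA", "WA", "PA", "CA", "CA", "NY", "NJ"),
  type     = c("A", "A", "B", "A", "A", "B", "B", "B", "A", "B"),
  type2    = c("C", "D", "D", "C", "C", "C", "D", "C", "D", "C"),
  stringsAsFactors = FALSE
)

res <- dat %>%
  # Stack state, type and type2 into one long "val" column.
  pivot_longer(state:type2, names_to = "key", values_to = "val") %>%
  # Count occurrences of each value within each serialno.
  count(serialno, val) %>%
  # One column per distinct value; missing combinations become 0.
  pivot_wider(names_from = val, values_from = n, values_fill = list(n = 0L))

print(res)
```

By default pivot_wider() orders the new columns by first appearance rather than alphabetically, so the column order may differ from the spread() output even though the counts are the same.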
Or:
library(data.table)
dcast(melt(setDT(dat), id = 1), serialno ~ value, fun.aggregate = length)
Or (per @Frank's suggestion):
res <- table(melt(dat, id="serialno")[, c("serialno","value")])
print(res, zero.print="")
The last one results in:
        value
serialno A B C CA D FL GA NJ NY PA WA
     100 2   1  1 1  1
     101   1    1 1
     102 1   1          1
     103 1 1 2                   1  1
     104   2 1  2 1
     105 1 1 1    1        1  1
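The table() approach can also be sketched in base R alone, without melt(), for cases where neither reshape2 nor data.table is available. This variant is an illustration and not part of the original answer. It assumes the question's table (rebuilt here so the block runs on its own) and relies on unlist() concatenating the three value columns one after another, so serialno is repeated three times to line up with them.

```r
# Assumption: the question's table, with character (not factor) columns.
dat <- data.frame(
  serialno = c(100, 100, 101, 102, 103, 103, 104, 104, 105, 105),
  state    = c("FL", "CA", "CA", "GA", "WA", "PA", "CA", "CA", "NY", "NJ"),
  type     = c("A", "A", "B", "A", "A", "B", "B", "B", "A", "B"),
  type2    = c("C", "D", "D", "C", "C", "C", "D", "C", "D", "C"),
  stringsAsFactors = FALSE
)

value_cols <- c("state", "type", "type2")

# unlist() stacks the columns end to end (state, then type, then type2),
# so repeating serialno once per column keeps each value paired with its id.
res <- table(
  serialno = rep(dat$serialno, times = length(value_cols)),
  value    = unlist(dat[value_cols], use.names = FALSE)
)

# Print zero counts as blanks, as in the answer above.
print(res, zero.print = "")
```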