Using group_by to create aggregated counts conditional on value


Question

I have a data table that looks like this:

    serialno state type type2
1       100    FL    A     C
2       100    CA    A     D
3       101    CA    B     D
4       102    GA    A     C
5       103    WA    A     C
6       103    PA    B     C
7       104    CA    B     D
8       104    CA    B     C
9       105    NY    A     D
10      105    NJ    B     C
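
For anyone who wants to reproduce this, the table above could be rebuilt roughly as follows. This is only a sketch: the name `dat` is chosen to match the answers, and the column types (numeric `serialno`, character for the rest) are an assumption, since the post shows only the printed table.

```r
# Rebuild the sample data shown above.
# Assumption: serialno is numeric and the other columns are plain character.
dat <- data.frame(
  serialno = c(100, 100, 101, 102, 103, 103, 104, 104, 105, 105),
  state    = c("FL", "CA", "CA", "GA", "WA", "PA", "CA", "CA", "NY", "NJ"),
  type     = c("A", "A", "B", "A", "A", "B", "B", "B", "A", "B"),
  type2    = c("C", "D", "D", "C", "C", "C", "D", "C", "D", "C"),
  stringsAsFactors = FALSE
)

print(dat)
```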

I need to create a new data table that is aggregated by serialno and counts how many times each value of the existing variables occurs. The end result would look like this:

    FL  CA  GA  A   B   C   D
100 1   1       2       1   1
101     1   1       1       1
102             1       1   
103 1       1   1   1   2   
104     2           2   1   1
105 1   1       1   1   1   1

I'm sure there is a solution using some combination of group_by and summarize, but I haven't been able to figure it out. Is the easiest solution to first spread out the state, type, and type2 columns and then use summarize to create the counts?

Recommended answer

Using:

library(reshape2)
recast(dat, serialno ~ value, id.var = 'serialno', fun.aggregate = length)

gives:

  serialno A B C CA D FL GA NJ NY PA WA
1      100 2 0 1  1 1  1  0  0  0  0  0
2      101 0 1 0  1 1  0  0  0  0  0  0
3      102 1 0 1  0 0  0  1  0  0  0  0
4      103 1 1 2  0 0  0  0  0  0  1  1
5      104 0 2 1  2 1  0  0  0  0  0  0
6      105 1 1 1  0 1  0  0  1  1  0  0

Or:

library(dplyr)
library(tidyr)
dat %>% 
  gather(key, val, state:type2) %>% 
  group_by(serialno, val) %>% 
  tally() %>% 
  spread(val, n, fill = 0)
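
The `gather()`/`spread()` pair above still works, but the tidyr documentation marks both as superseded in favor of `pivot_longer()`/`pivot_wider()`. A roughly equivalent sketch, assuming tidyr 1.1.0 or later (for the scalar `values_fill` and for `names_sort`) and with the sample data rebuilt here as `dat`; the name `wide` is just for illustration:

```r
library(dplyr)
library(tidyr)

# Sample data from the question (column types are an assumption).
dat <- data.frame(
  serialno = c(100, 100, 101, 102, 103, 103, 104, 104, 105, 105),
  state    = c("FL", "CA", "CA", "GA", "WA", "PA", "CA", "CA", "NY", "NJ"),
  type     = c("A", "A", "B", "A", "A", "B", "B", "B", "A", "B"),
  type2    = c("C", "D", "D", "C", "C", "C", "D", "C", "D", "C"),
  stringsAsFactors = FALSE
)

wide <- dat %>%
  # Stack state, type and type2 into one long "val" column.
  pivot_longer(state:type2, names_to = "key", values_to = "val") %>%
  # Count occurrences of each value per serialno (like group_by + tally).
  count(serialno, val) %>%
  # One column per distinct value; missing combinations become 0.
  pivot_wider(names_from = val, values_from = n,
              values_fill = 0, names_sort = TRUE)

print(wide)
```

Without `names_sort = TRUE`, the value columns come out in order of first appearance rather than alphabetically.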

Or:

library(data.table)
dcast(melt(setDT(dat), id = 1), serialno ~ value, fun.aggregate = length)

Or (per @Frank's suggestion):

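library(reshape2)  # melt() comes from reshape2 (data.table's melt() also works)
res <- table(melt(dat, id = "serialno")[, c("serialno", "value")])
print(res, zero.print = "")  # print zero counts as blanks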

The last one gives:

        value
serialno A B C CA D FL GA NJ NY PA WA
     100 2   1  1 1  1               
     101   1    1 1                  
     102 1   1          1            
     103 1 1 2                   1  1
     104   2 1  2 1                  
     105 1 1 1    1        1  1      
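
If adding a package is not an option, the same long-then-count idea can be sketched in base R alone. This is a variation on the `table()` approach rather than part of the original answer, and the names `value_cols` and `long` are made up for illustration:

```r
# Sample data from the question (column types are an assumption).
dat <- data.frame(
  serialno = c(100, 100, 101, 102, 103, 103, 104, 104, 105, 105),
  state    = c("FL", "CA", "CA", "GA", "WA", "PA", "CA", "CA", "NY", "NJ"),
  type     = c("A", "A", "B", "A", "A", "B", "B", "B", "A", "B"),
  type2    = c("C", "D", "D", "C", "C", "C", "D", "C", "D", "C"),
  stringsAsFactors = FALSE
)

value_cols <- c("state", "type", "type2")

# Stack the value columns on top of each other, repeating serialno
# once per stacked column so the rows stay aligned.
long <- data.frame(
  serialno = rep(dat$serialno, times = length(value_cols)),
  value    = unlist(dat[value_cols], use.names = FALSE)
)

# Cross-tabulate serialno against value to get the counts.
res <- table(long)
print(res, zero.print = "")
```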
