Faster ways to calculate frequencies and cast from long to wide


Question

I am trying to obtain counts of each combination of levels of two variables, "week" and "id". I'd like the result to have "id" as rows, "week" as columns, and the counts as the values.

Example of what I've tried so far (I also tried a number of other things, including adding a dummy variable equal to 1 and then using fun.aggregate = sum over it):

library(plyr)
ddply(data, .(id), dcast, id ~ week, value_var = "id", 
        fun.aggregate = length, fill = 0, .parallel = TRUE)

However, I must be doing something wrong, because this call never finishes. Is there a better way to do this?

Input:

id      week
1       1
1       2
1       3
1       1
2       3

Output:

  1  2  3
1 2  1  1
2 0  0  1

Recommended answer

You don't need ddply for this. dcast from reshape2 is sufficient on its own:

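# Sample data matching the input shown in the question
dat <- data.frame(
    id = c(rep(1, 4), 2),
    week = c(1:3, 1, 3)
)

library(reshape2)
# Rows are id, columns are week; length counts the rows in each cell
dcast(dat, id~week, fun.aggregate=length)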

  id 1 2 3
1  1 2 1 1
2  2 0 0 1
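
The question mentions trying a dummy variable with fun.aggregate = sum. As a minimal sketch of that variant (the column name n is a hypothetical helper, not something from the original post), the same counts can be obtained without ddply by summing a column of ones:

```r
library(reshape2)

dat <- data.frame(
    id = c(rep(1, 4), 2),
    week = c(1:3, 1, 3)
)

# Hypothetical helper column (assumed name "n"): one unit per observed row
dat$n <- 1

# Summing the helper column within each (id, week) cell yields the counts;
# fill = 0 covers combinations that never occur in the data
wide <- dcast(dat, id ~ week, value.var = "n", fun.aggregate = sum, fill = 0)
wide
```

Naming value.var explicitly also avoids dcast having to guess which column holds the values.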


Edit: For a base R solution (other than table, as posted by Joshua Ulrich), try xtabs:

xtabs(~id+week, data=dat)

   week
id  1 2 3
  1 2 1 1
  2 0 0 1
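
The table approach is mentioned above but not shown. A minimal sketch of what it might look like follows; this is an illustration, not necessarily the code from the other answer, and the names counts and wide are assumptions introduced here:

```r
dat <- data.frame(
    id = c(rep(1, 4), 2),
    week = c(1:3, 1, 3)
)

# Cross-tabulate id (rows) against week (columns)
counts <- table(id = dat$id, week = dat$week)

# Convert the contingency table to a wide data frame,
# with one row per id and one column per week
wide <- as.data.frame.matrix(counts)
wide
```

Both table and xtabs return a contingency table rather than a data frame, so the as.data.frame.matrix step is only needed if a data frame is required downstream.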
