Faster ways to calculate frequencies and cast from long to wide


Problem description

I am trying to obtain counts of each combination of levels of two variables, "week" and "id". I'd like the result to have "id" as rows, "week" as columns, and the counts as the values.

Example of what I've tried so far (I also tried a number of other things, including adding a dummy variable equal to 1 and then using fun.aggregate = sum over it):

library(plyr)
ddply(data, .(id), dcast, id ~ week, value_var = "id", 
        fun.aggregate = length, fill = 0, .parallel = TRUE)

However, I must be doing something wrong, because this call never finishes. Is there a better way to do this?

Input:

id      week
1       1
1       2
1       3
1       1
2       3

Output:

  1  2  3
1 2  1  1
2 0  0  1

Recommended answer

You don't need ddply for this. dcast from reshape2 is sufficient:

dat <- data.frame(
    id = c(rep(1, 4), 2),
    week = c(1:3, 1, 3)
)

library(reshape2)
dcast(dat, id~week, fun.aggregate=length)

  id 1 2 3
1  1 2 1 1
2  2 0 0 1


Edit: For a base R solution (other than table, as posted by Joshua Ulrich), try xtabs:

xtabs(~id+week, data=dat)

   week
id  1 2 3
  1 2 1 1
  2 0 0 1
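
The table alternative mentioned above is not shown in this answer, so here is a minimal sketch of what it might look like on the same sample data. The names counts and wide are chosen only for illustration, and the conversion to a data frame is an optional extra step, not part of the original answers:

counts_sketch <- function() {
    # Same sample data as in the answer above
    dat <- data.frame(
        id = c(rep(1, 4), 2),
        week = c(1:3, 1, 3)
    )

    # Cross-tabulate id (rows) against week (columns);
    # combinations that never occur are filled with 0
    counts <- table(id = dat$id, week = dat$week)

    # Optional: turn the contingency table into a plain wide data frame,
    # with ids as row names and weeks as column names
    wide <- as.data.frame.matrix(counts)

    list(counts = counts, wide = wide)
}

res <- counts_sketch()
res$counts
res$wide

Because table only counts occurrences and does no per-group reshaping, it should avoid the overhead of calling dcast once per id through ddply, though I have not timed it on large data.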
