Faster ways to calculate frequencies and cast from long to wide
Problem description
I am trying to obtain counts of each combination of levels of two variables, "week" and "id". I'd like the result to have "id" as rows, and "week" as columns, and the counts as the values.
Example of what I've tried so far (I also tried a bunch of other things, including adding a dummy variable = 1 and then using fun.aggregate = sum over that):
library(plyr)
ddply(data, .(id), dcast, id ~ week, value_var = "id",
fun.aggregate = length, fill = 0, .parallel = TRUE)
However, I must be doing something wrong because this function is not finishing. Is there a better way to do this?
Input:
id week
1 1
1 2
1 3
1 1
2 3
Output:
   1 2 3
1  2 1 1
2  0 0 1
Recommended answer
You don't need ddply for this. The dcast function from reshape2 is sufficient:
dat <- data.frame(
id = c(rep(1, 4), 2),
week = c(1:3, 1, 3)
)
library(reshape2)
dcast(dat, id~week, fun.aggregate=length)
  id 1 2 3
1  1 2 1 1
2  2 0 0 1
Edit: For a base R solution (other than table, as posted by Joshua Uhlrich), try xtabs:
xtabs(~id+week, data=dat)
   week
id  1 2 3
  1 2 1 1
  2 0 0 1
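For comparison, the table approach mentioned above might look roughly like the sketch below. It reuses the same dat example; the conversion step with as.data.frame.matrix and the names counts and wide are illustrative choices of mine, not something from the original answers.

```r
# Same example data as in the answer above
dat <- data.frame(
  id = c(rep(1, 4), 2),
  week = c(1:3, 1, 3)
)

# Cross-tabulate id (rows) against week (columns);
# combinations that never occur get a count of 0
counts <- table(dat$id, dat$week, dnn = c("id", "week"))

# Optional: turn the contingency table into a wide data frame,
# with id values as row names and one column per week
wide <- as.data.frame.matrix(counts)
```

Both table and xtabs return a contingency table rather than a data frame, so a conversion step like the last line is only needed if later code expects a data frame.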