Faster ways to calculate frequencies and cast from long to wide
Question
I am trying to obtain counts for each combination of levels of two variables, "week" and "id". I'd like the result to have "id" as rows, "week" as columns, and the counts as the values.
Here is an example of what I've tried so far (I also tried a bunch of other things, including adding a dummy variable equal to 1 and then using fun.aggregate = sum over it):
library(plyr)
ddply(data, .(id), dcast, id ~ week, value_var = "id",
fun.aggregate = length, fill = 0, .parallel = TRUE)
However, I must be doing something wrong, because this call never finishes. Is there a better way to do this?
Input:
id week
1 1
1 2
1 3
1 1
2 3
Desired output:
  1 2 3
1 2 1 1
2 0 0 1
Recommended answer
You don't need ddply for this. The dcast function from reshape2 is sufficient:
dat <- data.frame(
id = c(rep(1, 4), 2),
week = c(1:3, 1, 3)
)
library(reshape2)
dcast(dat, id~week, fun.aggregate=length)
  id 1 2 3
1  1 2 1 1
2  2 0 0 1
Edit: For a base R solution (other than table, as posted by Joshua Uhlrich), try xtabs:
xtabs(~id+week, data=dat)
   week
id  1 2 3
  1 2 1 1
  2 0 0 1
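The answer mentions table as another base R option but does not show it. A minimal sketch, assuming the same dat data frame as above, might look like the following. The variable names counts and wide are illustrative only, and the conversion to a data frame is an optional extra step, not part of the original answer.

# Same sample data as in the answer above.
dat <- data.frame(
  id = c(rep(1, 4), 2),
  week = c(1:3, 1, 3)
)

# table() cross-tabulates the two columns; id/week combinations
# that never occur are reported as 0.
counts <- table(id = dat$id, week = dat$week)
print(counts)

# Optional: turn the contingency table into a wide data frame,
# with id as row names and one column per week.
wide <- as.data.frame.matrix(counts)
print(wide)

Like xtabs, this avoids splitting the data by id first, which is probably why the ddply call in the question is so slow: a single cross-tabulation over the whole data frame is enough.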