Equivalent to ddply(...,transform,...) in data.table
Problem description
I have the following code using ddply from the plyr package:
ddply(mtcars,.(cyl),transform,freq=length(cyl))
The data.table version of this is:
DT<-data.table(mtcars)
DT[,freq:=.N,by=cyl]
How can I extend this when I have more than one function, like the ones below?
Now I want the same result from both ddply and data.table:
ddply(mtcars,.(cyl),transform,freq=length(cyl),sum=sum(mpg))
DT[,list(freq=.N,sum=sum(mpg)),by=cyl]
But data.table gives me only three columns: cyl, freq, and sum. Well, I can do it like this:
DT[,list(freq=.N,sum=sum(mpg),mpg,disp,hp,drat,wt,qsec,vs,am,gear,carb),by=cyl]
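To make the difference concrete, here is a minimal sketch (assuming the data.table package is installed; the name agg is introduced only for illustration) that shows how list(...) in j collapses the data to one row per group, while the original mtcars data has 32 rows and 11 columns:

```r
library(data.table)

DT <- as.data.table(mtcars)

# list(...) in j aggregates: one row per value of cyl,
# and only the grouping column plus the listed columns are returned
agg <- DT[, list(freq = .N, sum = sum(mpg)), by = cyl]

dim(agg)     # rows and columns of the aggregated result
names(agg)   # "cyl" "freq" "sum"
dim(mtcars)  # rows and columns of the original data
```

This is why every other column has to be named by hand in the call above to get it back.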
But I have a large number of variables in the data I read in, and I want all of them to be kept, as with ddply(...,transform,...). Is there a shortcut in data.table, like using := when we have only one function (as above), or something like paste(names(mtcars),collapse=",") within data.table?
Note: I also have a large number of functions to run, so I can't repeat := many times (but I would prefer it if lapply can be applied here).
Recommended answer
Use the backquoted := like this:
DT[ , `:=`( freq = .N , sum = sum(mpg) ) , by=cyl ]
head( DT , 3 )
#    mpg cyl disp  hp drat    wt  qsec vs am gear carb freq   sum
#1: 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4    7 138.2
#2: 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4    7 138.2
#3: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   11 293.3
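The question also asks whether lapply can be used when there are many functions. One possible approach, sketched below, is to put a character vector of column names on the left of := and an lapply() result on the right. The list funs, the vector new_cols, and the "mpg_" naming scheme are hypothetical choices for this illustration, not part of the original answer.

```r
library(data.table)

DT <- as.data.table(mtcars)

# Hypothetical named list of summary functions (an assumption for this sketch)
funs <- list(sum = sum, mean = mean, max = max)

# Names for the new columns: "mpg_sum", "mpg_mean", "mpg_max"
new_cols <- paste0("mpg_", names(funs))

# For each cyl group, lapply() returns one length-1 value per function;
# := recycles each value over the rows of the group and adds the
# columns by reference, so all original columns are kept
DT[, (new_cols) := lapply(funs, function(f) f(mpg)), by = cyl]

# .N is a special symbol rather than a function, so it is added separately
DT[, freq := .N, by = cyl]

head(DT, 3)
```

The parentheses around new_cols tell data.table to treat it as a vector of column names rather than as a single column literally called new_cols.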