data.table: Using with=FALSE and transforming function/summary function?


Question

I want to summarise several variables in a data.table and get the output in wide format, possibly as a list with one element per variable. Since several other approaches did not work, I tried an outer lapply that passes the variable names in as character vectors, which I wanted to use together with with=FALSE.

carsx=as.data.table(cars)
lapply( list(speed="speed",dist= "dist"), #error object 'ansvals' not found
    function(x)  carsx[,list(mean(x), min(x), max(x) ), with=FALSE ] ) 

Since this does not work, I tried the simpler approach without lapply.

carsx[,list(mean("speed"), min("speed"), max("speed") ), with=FALSE ] #error object 'ansvals' not found

This does not work either. Is there any way to do something like this? Is this behaviour of 'with' intended? (I am aware that ?data.table mentions with only for selecting columns, but in my case it would be useful to be able to transform them as well.)


When with=FALSE, j is a vector of names or positions to select, similar to a data.frame. with=FALSE is often useful in data.table to select columns dynamically.
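As a minimal sketch of what that documentation passage describes, the following shows with=FALSE used purely for selection. The variable name cols is an assumption introduced for illustration; the point is that j is treated as a vector of column names, not as an expression to evaluate, which is why wrapping the names in mean() or min() does not compute anything on the column.

```r
library(data.table)

carsx <- as.data.table(cars)

# 'cols' is an illustrative variable holding a column name as a string
cols <- "speed"

# with = FALSE: j is read as a vector of column names to select
sel <- carsx[, cols, with = FALSE]

names(sel)   # the selected column name
nrow(sel)    # all rows are kept; nothing is summarised
```

Selection by name is all that with=FALSE offers here; any transformation of the selected column has to happen in a separate step or through another mechanism.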

EDIT: My aim is to get a summary per group in wide format for several variables. I tried to extend the following, which works for a single variable, to a list of variables.

carsx[,list(mean(speed), min(speed), max(speed) ) ,by=(dist>50)]

Unfortunately, SO doesn't let me post my other question. There I described that I want output similar to:

lapply( list(speed="speed",dist= "dist"),
        function(x) do.call("as.data.frame", aggregate(cars[,x], list(class=cars$dist>50), FUN=summary) ) )

The expected output would look similar to:

$speed 
         V1       V2 V3
1: FALSE 12.96970  4 20
2:  TRUE 20.11765 14 25

$dist
         V1       V2 V3
1: FALSE 12.96970  4 20
2:  TRUE 20.11765 14 25


Recommended answer

Building on Sven's answer, a combination of .SDcols, rbindlist, and an outer and inner lapply did the trick. The inner lapply is necessary to access .SD.

lapply( list(speed = "speed", dist = "dist"),
    # x is the column name; the inner function receives the column values as v
    function(x) carsx[ , rbindlist(lapply(.SD, function(v) list(mean = mean(v), min = min(v), max = max(v)))),
                       .SDcols = x, by = (dist > 50)] )

Result:

$speed
    dist     mean min max
1: FALSE 12.96970   4  20
2:  TRUE 20.11765  14  25

$dist
    dist     mean min max
1: FALSE 27.84848   2  50
2:  TRUE 72.35294  52 120
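For comparison, here is a hedged sketch of a different route to the same per-variable summaries, not the method from the answer above: get() turns the column name held in a string into the column itself inside j. The helper name summarise_by_far and the grouping column name far are assumptions made for this example. The grouping column is named far so that it does not clash with the dist column being summarised.

```r
library(data.table)

carsx <- as.data.table(cars)

# summarise_by_far is a hypothetical helper, not part of data.table.
# 'col' is a column name given as a string; get(col) looks the column up in j.
summarise_by_far <- function(col) {
  carsx[, list(mean = mean(get(col)),
               min  = min(get(col)),
               max  = max(get(col))),
        by = list(far = dist > 50)]
}

# One summary table per variable, returned as a named list
res <- lapply(c(speed = "speed", dist = "dist"), summarise_by_far)
res$speed
res$dist
```

Whether this reads better than the .SDcols version is a matter of taste; .SDcols scales more naturally when several columns should be summarised in a single call.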
