Can the value.var in dcast be a list or have multiple value variables?

Problem description

In the help files for dcast.data.table, there is a note stating that a new feature has been implemented: "dcast.data.table allows value.var column to be of type list"

I take this to mean that one can have multiple value variables within a list, i.e. in this format:

dcast.data.table(dt, x1~x2, value.var=list('var1','var2','var3'))

But we get an error: 'value.var' must be a character vector of length 1.

Is there such a feature, and if not, what would be other one-liner alternatives?

EDIT: In reply to the comments below

There are situations where you have multiple variables that you want to treat as the value.var. Imagine for example that x2 consists of 3 different weeks, and you have 2 value variables such as salt and sugar consumption and you want to cast those variables across the different weeks. Sure, you can 'melt' the 2 value variables into a single column, but why do something using two functions, when you can do it in one function like reshape does?
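For reference, the two-function route mentioned above (melt first, then cast) might look like the following. This is only a sketch: the table `dt` and its values are made up for illustration, with `x2` standing in for the week and `salt` and `sugar` as the two value variables.

```r
library(data.table)

# Hypothetical data: 3 ids (x1), 3 weeks (x2), two value variables.
dt <- data.table(
  x1    = rep(1:3, each = 3),
  x2    = rep(1:3, times = 3),
  salt  = c(3, 4, 6, 10, 3, 9, 10, 7, 7),
  sugar = c(1, 2, 2, 5, 3, 6, 4, 6, 7)
)

# Step 1: stack salt and sugar into a single "value" column,
# with a "variable" column recording which one each row came from.
long <- melt(dt, id.vars = c("x1", "x2"), measure.vars = c("salt", "sugar"))

# Step 2: cast on variable and x2 together, giving one column per
# variable/week combination (salt_1, salt_2, ..., sugar_3).
wide_two_step <- dcast(long, x1 ~ variable + x2, value.var = "value")
print(wide_two_step)
```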

(Note: I've also noticed that reshape cannot treat multiple variables as the time variable as dcast does.)

So my point is that I don't understand why these functions don't allow for the flexibility to include multiple variables within the value.var or the time.var just as we allow for multiple variables for the id.var.

Solution

From v1.9.6 of data.table, we can cast multiple value.var columns simultaneously (and also use multiple aggregation functions in fun.aggregate). Please see ?dcast and the Efficient reshaping using data.tables vignette for more.

Here's how we could use dcast:

dcast(setDT(mydf), x1 ~ x2, value.var=c("salt", "sugar"))
#    x1 salt_1 salt_2 salt_3 sugar_1 sugar_2 sugar_3
# 1:  1      3      4      6       1       2       2
# 2:  2     10      3      9       5       3       6
# 3:  3     10      7      7       4       6       7
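Since `mydf` itself is not shown, here is a self-contained sketch that could be used to try this out. The data frame below is an assumption, reconstructed from the output above so that the cast gives the same table; the second call illustrates passing several functions to `fun.aggregate`, which should yield one column per combination of value variable, function, and `x2` level.

```r
library(data.table)

# Assumed input: not from the original post, rebuilt to match the output above.
mydf <- data.frame(
  x1    = rep(1:3, each = 3),
  x2    = rep(1:3, times = 3),
  salt  = c(3, 4, 6, 10, 3, 9, 10, 7, 7),
  sugar = c(1, 2, 2, 5, 3, 6, 4, 6, 7)
)

# Cast both value columns at once (requires data.table >= 1.9.6).
wide <- dcast(setDT(mydf), x1 ~ x2, value.var = c("salt", "sugar"))
print(wide)

# Several aggregation functions can be supplied as a list.
# Each x1/x2 cell holds a single row here, so sum and mean coincide;
# the call is shown only to illustrate the shape of the result.
wide_agg <- dcast(mydf, x1 ~ x2,
                  value.var = c("salt", "sugar"),
                  fun.aggregate = list(sum, mean))
print(names(wide_agg))
```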
