Why can't one have several `value.var` in `dcast`?


Question

Why can't multiple variables be passed to value.var in dcast? From ?dcast:


value.var: name of column which stores values, see guess_value for default strategies to figure this out.

The documentation does not explicitly say that only a single variable can be passed as value.var. If I try passing two, however, I get an error:

> library("reshape2")
> library("MASS")
> 
> dcast(Cars93, AirBags ~ DriveTrain, mean, value.var=c("Price", "Weight"))
Error in .subset2(x, i, exact = exact) : subscript out of bounds
In addition: Warning message:
In if (!(value.var %in% names(data))) { :
  the condition has length > 1 and only the first element will be used

So is there a good reason for imposing this limitation? And is it possible to work around it (perhaps using reshape, etc.)?

Recommended answer

This question is closely related to your other question from earlier today.

@beginneR wrote in the comments that "As long as the existing data is already in long-format, I don't see any general need to melt it before casting." In my answer to your other question, I gave an example of when melt would be required, or rather, how to decide whether your data are long enough.

This question is another example of when further melting is required, since point 3 in my answer (stackoverflow.com/a/25143917/1270695) is not satisfied.

To get the behavior you want, try the following:

C93L <- melt(Cars93, measure.vars = c("Price", "Weight"))
dcast(C93L, AirBags ~ DriveTrain + variable, mean, value.var = "value")
#              AirBags 4WD_Price 4WD_Weight Front_Price Front_Weight
# 1 Driver & Passenger       NaN        NaN    26.17273     3393.636
# 2        Driver only     21.38       3623    18.69286     2996.250
# 3               None     13.88       2987    12.98571     2703.036
#   Rear_Price Rear_Weight
# 1      33.20      3515.0
# 2      28.23      3463.5
# 3      14.90      3610.0
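To see why melting first makes this work, it can help to look at the intermediate long table. The following is a minimal sketch on a made-up toy data frame (the `cars` object and its values are assumptions for illustration, not rows from Cars93). After melt, both measures live in a single value column, so dcast needs only one value.var, and the variable column can go into the formula:

```r
library(reshape2)

# Toy stand-in for Cars93 (made-up values, same column names as in the question).
cars <- data.frame(
  AirBags    = c("None", "None", "Driver only", "Driver only"),
  DriveTrain = c("Front", "Rear", "Front", "Front"),
  Price      = c(10, 14, 18, 22),
  Weight     = c(2500, 3500, 2900, 3100)
)

# melt() stacks Price and Weight into one "value" column and records
# which measure each row came from in a "variable" column.
long <- melt(cars, measure.vars = c("Price", "Weight"))
names(long)

# With a single value column, dcast() can spread both measures at once
# by putting "variable" on the right-hand side of the formula.
wide <- dcast(long, AirBags ~ DriveTrain + variable, mean, value.var = "value")
wide
```

Combinations with no rows (here "Driver only" with "Rear") are filled with the result of calling mean on an empty vector, which is why NaN appears in the Cars93 output above.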






An alternative is to use aggregate to calculate the means, and then use reshape or dcast to go from "long" to "wide". Both steps are required, since reshape does not perform any aggregation:

temp <- aggregate(cbind(Price, Weight) ~ AirBags + DriveTrain, 
                  Cars93, mean)
temp
#              AirBags DriveTrain    Price   Weight
# 1        Driver only        4WD 21.38000 3623.000
# 2               None        4WD 13.88000 2987.000
# 3 Driver & Passenger      Front 26.17273 3393.636
# 4        Driver only      Front 18.69286 2996.250
# 5               None      Front 12.98571 2703.036
# 6 Driver & Passenger       Rear 33.20000 3515.000
# 7        Driver only       Rear 28.23000 3463.500
# 8               None       Rear 14.90000 3610.000

reshape(temp, direction = "wide", 
        idvar = "AirBags", timevar = "DriveTrain")
#              AirBags Price.4WD Weight.4WD Price.Front Weight.Front
# 1        Driver only     21.38       3623    18.69286     2996.250
# 2               None     13.88       2987    12.98571     2703.036
# 3 Driver & Passenger        NA         NA    26.17273     3393.636
#   Price.Rear Weight.Rear
# 1      28.23      3463.5
# 2      14.90      3610.0
# 3      33.20      3515.0
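The answer mentions dcast as an option for the second step but only shows reshape. A possible sketch of the dcast route follows, assuming reshape2 and MASS are installed. The names `means` and `means_long` are introduced here for illustration. Because the aggregated table still has two value columns, it is melted once more before casting, and no aggregation function is needed since each cell already holds a single mean:

```r
library(reshape2)
library(MASS)  # provides Cars93

# Step 1: aggregate in base R, one row per AirBags/DriveTrain combination.
means <- aggregate(cbind(Price, Weight) ~ AirBags + DriveTrain, Cars93, mean)

# Step 2: stack the two mean columns, then spread them with dcast().
means_long <- melt(means, id.vars = c("AirBags", "DriveTrain"))
wide <- dcast(means_long, AirBags ~ DriveTrain + variable, value.var = "value")
wide
```

With no aggregation function supplied, combinations that are absent from the data should come out as NA rather than NaN, matching the reshape output above.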
