Why can't one have several `value.var` in `dcast`?
Question
Why can't one have multiple variables passed to value.var in dcast? From ?dcast:
value.var: name of column which stores values, see guess_value for default strategies to figure this out.
The documentation doesn't explicitly say that only a single variable can be passed as value.var. However, if I try passing two, I get an error:
> library("reshape2")
> library("MASS")
>
> dcast(Cars93, AirBags ~ DriveTrain, mean, value.var=c("Price", "Weight"))
Error in .subset2(x, i, exact = exact) : subscript out of bounds
In addition: Warning message:
In if (!(value.var %in% names(data))) { :
the condition has length > 1 and only the first element will be used
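The warning text points at the likely mechanism: the quoted check is written for a single column name. Below is a minimal base R sketch of that kind of check. It uses a made-up data frame `df` and reproduces nothing of reshape2's source beyond the line quoted in the warning.

```r
# Hypothetical toy data; only the column names matter here.
df <- data.frame(Price = c(10, 20), Weight = c(2500, 3000))

# With one name, %in% yields a single TRUE/FALSE, which `if` can use.
one <- "Price" %in% names(df)
length(one)

# With two names, %in% yields a logical vector of length 2.
two <- c("Price", "Weight") %in% names(df)
length(two)

# `if` expects a length-one condition. Older R versions warn and use only
# the first element; R 4.2.0 and later raise an error instead.
```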
So is there a good reason for imposing this limitation? And is it possible to work around it (perhaps using reshape, etc.)?
Recommended answer
This question is related to your other question from earlier today.
@beginneR wrote in the comments that "As long as the existing data is already in long-format, I don't see any general need to melt it before casting." In my answer posted at your other question, I gave an example of when melt would be required, or rather, how to decide whether your data are long enough.
This question is another example of when further melting would be required, since point 3 in my answer (stackoverflow.com/a/25143917/1270695) is not satisfied.
To get the behavior you want, try the following:
C93L <- melt(Cars93, measure.vars = c("Price", "Weight"))
dcast(C93L, AirBags ~ DriveTrain + variable, mean, value.var = "value")
#              AirBags 4WD_Price 4WD_Weight Front_Price Front_Weight
# 1 Driver & Passenger       NaN        NaN    26.17273     3393.636
# 2        Driver only     21.38       3623    18.69286     2996.250
# 3               None     13.88       2987    12.98571     2703.036
#   Rear_Price Rear_Weight
# 1      33.20      3515.0
# 2      28.23      3463.5
# 3      14.90      3610.0
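To see what the melt step changes, here is a small sketch of the same pattern on a made-up data frame. The names `toy`, `group`, `type`, `v1` and `v2` are hypothetical stand-ins for Cars93 and its columns. After melting, the two measures share one `value` column, so a single `value.var` is enough and the former column names move into the formula through `variable`.

```r
library(reshape2)

# Hypothetical toy data standing in for Cars93.
toy <- data.frame(
  group = c("a", "a", "b", "b"),
  type  = c("x", "y", "x", "y"),
  v1    = c(1, 2, 3, 4),
  v2    = c(10, 20, 30, 40)
)

# Stack v1 and v2 into a single "value" column, labelled by "variable".
long <- melt(toy, id.vars = c("group", "type"),
             measure.vars = c("v1", "v2"))

# One value column now suffices; "variable" becomes part of the formula.
wide <- dcast(long, group ~ type + variable, mean, value.var = "value")
wide
```

Each `type`/`variable` combination becomes its own column in `wide`, just as each `DriveTrain`/`variable` pair does in the Cars93 output.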
An alternative is to use aggregate to calculate the means, and then use reshape or dcast to go from "long" to "wide". Both steps are required since reshape does not perform any aggregation:
temp <- aggregate(cbind(Price, Weight) ~ AirBags + DriveTrain,
                  Cars93, mean)
temp
#              AirBags DriveTrain    Price   Weight
# 1        Driver only        4WD 21.38000 3623.000
# 2               None        4WD 13.88000 2987.000
# 3 Driver & Passenger      Front 26.17273 3393.636
# 4        Driver only      Front 18.69286 2996.250
# 5               None      Front 12.98571 2703.036
# 6 Driver & Passenger       Rear 33.20000 3515.000
# 7        Driver only       Rear 28.23000 3463.500
# 8               None       Rear 14.90000 3610.000
reshape(temp, direction = "wide",
        idvar = "AirBags", timevar = "DriveTrain")
#              AirBags Price.4WD Weight.4WD Price.Front Weight.Front
# 1        Driver only     21.38       3623    18.69286     2996.250
# 2               None     13.88       2987    12.98571     2703.036
# 3 Driver & Passenger        NA         NA    26.17273     3393.636
#   Price.Rear Weight.Rear
# 1      28.23      3463.5
# 2      14.90      3610.0
# 3      33.20      3515.0
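The point that reshape does not aggregate can be checked on a tiny made-up data frame (the names `toy`, `id`, `time` and `val` are hypothetical). This is only a sketch in base R: when several rows share the same id/time pair, reshape keeps the first one and warns, so the mean has to be computed beforehand.

```r
# Hypothetical toy data: id "a" has two rows for time "x".
toy <- data.frame(
  id   = c("a", "a", "a", "b"),
  time = c("x", "x", "y", "x"),
  val  = c(1, 3, 5, 7)
)

# Without aggregating first, reshape() keeps only the first matching row
# per id/time pair (and warns); it does not average 1 and 3.
w1 <- suppressWarnings(
  reshape(toy, direction = "wide", idvar = "id", timevar = "time")
)

# Aggregating first leaves one row per id/time pair, so reshape() is safe.
agg <- aggregate(val ~ id + time, toy, mean)
w2 <- reshape(agg, direction = "wide", idvar = "id", timevar = "time")

w1$val.x[w1$id == "a"]  # first value taken
w2$val.x[w2$id == "a"]  # mean of the two rows
```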