Referring to a newly defined variable within attach()
Question
I would like to perform many modifications on the columns of a data frame. However, given the large number of columns and transformations required, I would like to avoid typing the data frame name over and over.
In a SAS data step, you can create a variable and refer to it right after defining it, all within the same step:
data A;
set A;
varA = varB > 1;
varC = varA + varB;
....
run;
Is it possible to do this in R?
One way I can think of is to use attach(), create hundreds of arrays, and cbind() them before calling detach(). I know many R veterans advise against attach(), but I need to do heavy data manipulation (hundreds of new variables), and calling transform(df, ...) for each of them in turn would be quite cumbersome.
For example:
attach(A)
varA <- varB > 1
varC <- varA + varB
A <- cbind(varA, varB, varC)
detach()
But I am not sure whether this is the best way to do it in R.
Recommended answer
You can use mutate from plyr.
A <- data.frame(varB = 1:5)
library(plyr)
A <- mutate(A, varA = varB>1, varC = varA + varB)
A
varB varA varC
1 1 FALSE 1
2 2 TRUE 3
3 3 TRUE 4
4 4 TRUE 5
5 5 TRUE 6
Or use within in base R. Note that within returns the columns you create in reverse order.
A <- data.frame(varB = 1:5)
A <- within(A, {varA <- varB>1; varC <- varA + varB})
A
varB varC varA
1 1 1 FALSE
2 2 3 TRUE
3 3 4 TRUE
4 4 5 TRUE
5 5 6 TRUE
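If the reversed column order matters, one simple option is to reorder the columns by name after the call. This is only a sketch using the same toy data as above; the explicit column list is something you would have to maintain yourself.

```r
A <- data.frame(varB = 1:5)
A <- within(A, {
  varA <- varB > 1      # logical column
  varC <- varA + varB   # varA can be used right after it is defined
})

# within() appended the new columns in reverse order of creation,
# so select them by name to get the order you want
A <- A[, c("varB", "varA", "varC")]
names(A)
```

Indexing by name keeps the data unchanged and only rearranges the columns.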
By far my favourite is data.table and :=
library(data.table)
DA <- data.table(varB = 1:5)
DA[,varA := varB >1 ][, varC := varA + varB]
DA
varB varA varC
1: 1 FALSE 1
2: 2 TRUE 3
3: 3 TRUE 4
4: 4 TRUE 5
5: 5 TRUE 6
Currently, := is most easily used only once per call to [. There are ways around this, but I think the string of [ calls is not too hard to follow (and it will be much, much faster than mutate or any approach that uses data.frames).
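One possible workaround, sketched here on the assumption that the data.table package is installed, is to assign several columns in a single call by computing them inside a braced expression. The temporary name a is purely illustrative and does not become a column.

```r
library(data.table)

DA <- data.table(varB = 1:5)

# Assign two columns in one call to [ : the braces evaluate in order,
# so the intermediate result 'a' can be reused for the second column
DA[, c("varA", "varC") := {
  a <- varB > 1
  list(a, a + varB)
}]

names(DA)
```

Whether this reads better than a chain of [ calls is a matter of taste; the chained form keeps each new column on its own line.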