Functionally creating variables using string names
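Problem description
I'm trying to write a function that creates a bunch of columns on a data frame, all following the same naming convention and using the same logic. Unfortunately, I've bumped into some weird behavior when creating the variables, and I'm hoping someone can explain what's going on here.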
df <- data.frame(var1 = c(1,2,3), var2 = c(3,4,5), var3 = c("foo", "bar", "baz"))
DoesNotWork <- function(df, varname){
  df[paste(varname, "_square", sep = "")] <- df[varname]^2
  return(df)
}
dfBad <- DoesNotWork(df, "var1")
dfBad
  var1 var2 var3 var1
1    1    3  foo    1
2    2    4  bar    4
3    3    5  baz    9
dfBad here has two variables called var1, rather than one variable called var1 and one variable called var1_square, as I had hoped.

The function below hacks around this problem by assigning all of the values of the original variable to the new variable name, then performing the same operation on only the new variable. This is sort of obnoxious, and I'm not sure what would happen if I needed to use logic from multiple variables.

Works <- function(df, varname){
  df[paste(varname, "_square", sep = "")] <- df[varname]
  df[paste(varname, "_square", sep = "")] <- df[paste(varname, "_square", sep = "")]^2
  return(df)
}
dfGood <- Works(df, "var1")
dfGood
  var1 var2 var3 var1_square
1    1    3  foo           1
2    2    4  bar           4
3    3    5  baz           9

Any guidance here would be greatly appreciated, especially if there's a nicer way to switch between strings for variable names and references to the column objects.

Answer
You're missing the commas: index with df[, varname] rather than df[varname].

df <- data.frame(var1 = c(1,2,3), var2 = c(3,4,5), var3 = c("foo", "bar", "baz"))
NowItWorks <- function(df, varname){
  df[, paste(varname, "_square", sep = "")] <- df[, varname]^2
  return(df)
}
NowItWorks(df, "var1")
  var1 var2 var3 var1_square
1    1    3  foo           1
2    2    4  bar           4
3    3    5  baz           9

EDIT: OK, so my answer above does work, but it does not really answer the question of why the second one works. For example, multiplication works with the original indexing:

MultiplicationWorks <- function(df, varname){
  df[paste(varname, "_square", sep = "")] <- df[varname]*2
  return(df)
}

as do all the other non-exponential operators. If we look at the data.frame operators source code (Ops.data.frame), we see this interesting bit at the bottom:

...
if (.Generic %in% c("+", "-", "*", "/", "%%", "%/%")) {
    names(value) <- cn
    data.frame(value, row.names = rn, check.names = FALSE,
               check.rows = FALSE)
}
else matrix(unlist(value, recursive = FALSE, use.names = FALSE),
            nrow = nr, dimnames = list(rn, cn))
...

Basically, this says that if the operator is one of those listed, a data.frame with the given names is returned; otherwise a matrix with the given names is returned. For some reason, "^" is the only arithmetic operator not listed. We can confirm this pretty easily:

df <- data.frame(var1 = c(1,2,3), var2 = c(3,4,5), var3 = c("foo", "bar", "baz"))
class(df["var1"]*2)
[1] "data.frame"
class(df["var1"]^2)
[1] "matrix"

With exponentiation, and only with exponentiation, the dimnames of the matrix overrule the new column name of your data.frame when you assign it. R is weird.

Comically, this means that you could also get your code to work by wrapping an as.data.frame() around your exponentiation part.

If you want to see something really strange using your initial function:

❥ names(dfBad)
[1] "var1"        "var2"        "var3"        "var1_square"
❥ dfBad
  var1 var2 var3 var1
1    1    3  foo    1
2    2    4  bar    4
3    3    5  baz    9
❥ str(dfBad)
'data.frame':	3 obs. of  4 variables:
 $ var1       : num  1 2 3
 $ var2       : num  3 4 5
 $ var3       : Factor w/ 3 levels "bar","baz","foo": 3 1 2
 $ var1_square: num [1:3, 1] 1 4 9
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr "var1"

R knows the column's correct name, but shows you the name of the matrix you stuck into it.
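
On the second part of the question (a nicer way to go from a string to a column, including logic that uses several columns), one option is to extract with [[ ]], which returns the column as a plain vector, so the arithmetic never goes through the data-frame operators at all. The following is a minimal sketch, assuming numeric input columns. The helper names add_square, add_square_wrapped and add_product are hypothetical and do not come from the original posts. The matrix-versus-data.frame result of ^ described above may also differ between R versions, so the sketch does not rely on it.

```r
df <- data.frame(var1 = c(1, 2, 3), var2 = c(3, 4, 5),
                 var3 = c("foo", "bar", "baz"))

# Hypothetical helper: df[[varname]] is a plain numeric vector, so the
# result of `^` carries no dimnames that could override the column name.
add_square <- function(df, varname) {
  df[[paste0(varname, "_square")]] <- df[[varname]]^2
  df
}

# Hypothetical helper: the as.data.frame() wrapping mentioned in the answer,
# keeping the original single-bracket indexing.
add_square_wrapped <- function(df, varname) {
  df[paste0(varname, "_square")] <- as.data.frame(df[varname]^2)
  df
}

# Hypothetical helper: logic that combines two columns named by strings.
add_product <- function(df, a, b) {
  df[[paste(a, b, "product", sep = "_")]] <- df[[a]] * df[[b]]
  df
}

out <- add_product(add_square(df, "var1"), "var1", "var2")
wrapped <- add_square_wrapped(df, "var1")
names(out)
```

Because [[ ]] always yields the underlying vector, the same pattern extends to any number of input columns without the copy-then-modify workaround used in Works.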