R variable names in loop, get, etc


Problem description

I'm still relatively new to R. I'm trying to use dynamic variable names in a loop but am running into all sorts of problems. The initial code looks something like this (but bigger):

data.train$Pclass_F <- as.factor(data.train$Pclass)
data.test$Pclass_F <- as.factor(data.test$Pclass)

which I'm trying to build into a loop, imagining something like this:

datalist <- c("data.train", "data.test")
for (i in datalist){
  i$Pclass_F <- as.factor(i$Pclass)
}

which doesn't work. A little research implies that in order to convert the strings in datalist into variables I need to use the get function. So my next attempt was:

datalist <- c("data.train", "data.test")
for (i in datalist){
  get(i$Pclass_F) <- as.factor(get(i$Pclass))
}

which still doesn't work: Error in i$Pclass : $ operator is invalid for atomic vectors. Then I tried:

datalist <- c("data.train", "data.test")
for (i in datalist){
  get(i)$Pclass_F <- as.factor(get(i)$Pclass)
}

which still doesn't work: Error in get(i)$Pclass_F <- as.factor(get(i)$Pclass) : could not find function "get<-". I even tried:

datalist <- c("data.train", "data.test")
for (i in datalist){
  get(i[Pclass_F]) <- as.factor(get(i[Pclass]))
}

which still doesn't work: Error in get(i[Pclass]) : object 'Pclass' not found. Then I tried:

datalist <- c("data.train", "data.test")
for (i in datalist){
  get(i)[Pclass_F] <- as.factor(get(i)[Pclass])
}

which still doesn't work: Error in '[.data.frame'(get(i), Pclass) : object 'Pclass' not found

I now realize I never included any data, so nobody can run this themselves, but just to show it's not a data problem:

> class(data.train$Pclass)
[1] "integer"
> class(data.test$Pclass)
[1] "integer"
> datalist
[1] "data.train" "data.test" 

Solution

The problem you have relates to the way data frames and most other objects are treated in R. In many programming languages, objects are (or at least can be) passed to functions by reference. In C++ if I pass a pointer to an object to a function which manipulates that object, the original is modified. This is not the way things work for the most part in R.

When an object is created like this:

x <- list(a = 5, b = 9)

And then copied like this:

y <- x

Initially y and x will point to the same object in RAM. But as soon as y is modified at all, a copy is created. So assigning y$c <- 12 has no effect on x.
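
The copy-on-modify behaviour described above can be checked directly. This is a minimal sketch using the same x and y as in the answer:

```r
# Copy-on-modify: y starts out sharing x's data, but changing y leaves x alone.
x <- list(a = 5, b = 9)
y <- x        # no copy yet; both names refer to the same object
y$c <- 12     # modifying y triggers a copy, so x is not affected

names(x)      # "a" "b"
names(y)      # "a" "b" "c"
```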

get() doesn't return the named object in a way that can be modified without first assigning it to another variable (which would mean the original variable is left unaltered).
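
This also explains the "get<-" error in the question: R rewrites an assignment such as get(i)$Pclass_F <- value into a call to a replacement function named `get<-`, and no such function exists. The sketch below shows the other half of the point, that modifying the result of get() only changes a copy. The toy data frame is an assumption standing in for the real data.train, which was not posted.

```r
# Toy stand-in for data.train (hypothetical values, not from the question)
data.train <- data.frame(Pclass = c(1L, 3L, 2L))

i <- "data.train"
tmp <- get(i)                          # tmp receives the value of data.train
tmp$Pclass_F <- as.factor(tmp$Pclass)  # this changes tmp only

"Pclass_F" %in% names(tmp)         # TRUE
"Pclass_F" %in% names(data.train)  # FALSE: the original is unchanged
```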

The correct way of doing this in R is storing your data frames in a named list. You can then loop through the list and use the replacement syntax to change the columns.

datalist <- list(data.train = data.train, data.test = data.test)
for (df in names(datalist)){
  datalist[[df]]$Pclass_F <- as.factor(datalist[[df]]$Pclass)
}
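
To try the named-list loop without the original data, here is a sketch on two toy data frames (the values are made up for illustration). It reads from the Pclass column, as in the question, and writes the factor to Pclass_F.

```r
# Toy stand-ins for the question's data frames (hypothetical values)
data.train <- data.frame(Pclass = c(1L, 3L, 2L))
data.test  <- data.frame(Pclass = c(2L, 2L, 1L))

datalist <- list(data.train = data.train, data.test = data.test)
for (df in names(datalist)) {
  datalist[[df]]$Pclass_F <- as.factor(datalist[[df]]$Pclass)
}

class(datalist$data.train$Pclass_F)  # "factor"
class(data.train$Pclass_F)           # "NULL": the standalone data frame is untouched
```

Note that the list holds modified copies. Either keep working with the list, or copy an element back with something like data.train <- datalist$data.train.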

You could also use:

datalist <- setNames(lapply(list(data.train, data.test), function(data) {
  data$Pclass_F <- as.factor(data$Pclass)
  data
}), c("data.train", "data.test"))

This is using lapply to process each member of the list, returning a new list with the modified columns.
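
A possible variation, sketched here on toy data: lapply keeps the names of a named list, so passing a named list in makes the setNames call unnecessary. The helper name add_factor and the data values are assumptions for illustration.

```r
# Toy stand-ins for the question's data frames (hypothetical values)
data.train <- data.frame(Pclass = c(1L, 3L, 2L))
data.test  <- data.frame(Pclass = c(2L, 2L, 1L))

# Hypothetical helper: returns a copy of the data frame with a factor column added
add_factor <- function(data) {
  data$Pclass_F <- as.factor(data$Pclass)
  data
}

# lapply preserves the names of the input list
datalist <- lapply(list(data.train = data.train, data.test = data.test), add_factor)

names(datalist)                      # "data.train" "data.test"
levels(datalist$data.test$Pclass_F)  # "1" "2"
```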

In theory, you could achieve what you were originally trying to do by using the [[ operator on the global environment, but it would be an unconventional way of doing things and may lead to confusion later on.
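
For completeness, a sketch of what that unconventional approach might look like. Environments support [[ with a string name, so the variables can be modified in place by name. The toy data is an assumption, and the environment is first stored in a variable (env) because the target of a replacement like this has to start from a name.

```r
# Toy stand-ins for the question's data frames (hypothetical values)
data.train <- data.frame(Pclass = c(1L, 3L, 2L))
data.test  <- data.frame(Pclass = c(2L, 2L, 1L))

env <- environment()  # at top level this is the global environment
for (i in c("data.train", "data.test")) {
  # Look the variable up by name in env and replace it there
  env[[i]]$Pclass_F <- as.factor(env[[i]]$Pclass)
}

is.factor(data.train$Pclass_F)  # TRUE: the original variable was modified
```

This works, but it ties the loop to whatever variable names happen to exist in the environment, which is why the named list is usually the easier structure to maintain.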
