Loop through columns in a data.table and transform those columns

Problem description

I have a data.table DT with a column named RF and many columns with an underscore _ in their names. I want to loop through all those columns and subtract the RF column from each. However, I'm stuck. It seems that nothing on the RHS of the := operator in a data.table works with dynamic variables.

Here is my DT and the desired output (hardcoded):

library(data.table)
DT <- data.table(RF  = 1:10,
                 S_1 = 11:20,
                 S_2 = 21:30)
#Desired output
DT[ , S_1 := S_1 - RF]
DT[ , S_2 := S_2 - RF]
DT
      RF S_1 S_2
 [1,]  1  10  20
 [2,]  2  10  20
 [3,]  3  10  20
...

However, I want this to be more flexible, i.e. loop through every column with "_" in its name and subtract RF:

#1. try: Does not work; Interestingly, the i on the LHS of := is interpreted as the column i, but on the RHS of
#:= it is interpreted as 2 and 3, respectively
for (i in grep("_", names(DT))){
  DT[ , i:= i - 1, with=FALSE]
}
DT
      RF S_1 S_2
 [1,]  1   1   2
 [2,]  2   1   2
 [3,]  3   1   2
...

#2. try: Work with parse and eval
for (i in grep("_", names(DT), value=TRUE)){
  DT[ , eval(parse(text=i)):= eval(parse(text=i)) - RF]
}
#Error in eval(expr, envir, enclos) : object 'S_1' not found

Any hints on how to do that would be great.

EDIT: As soon as I posted the question, I thought to myself: Why are you working with the := operator in the first place, and sure enough, I just realized I don't have to. This does work and doesn't need a loop:

DT[, grep("_", names(DT)), with=FALSE] - DT[, RF]

Sorry for that. However, I leave the question open because I'm still interested in why my approach with the := operator doesn't work. So maybe someone can help me there.

Solution

You were on the right track with your second attempt. Here is an approach that uses substitute to build the expression that gets passed in as the 'j' argument in DT[ , j ].

for (i in grep("_", names(DT), value=TRUE)){
    e <- substitute(X := X - RF, list(X = as.symbol(i)))
    DT[ , eval(e)]
}
DT
#     RF S_1 S_2
# [1,]  1  10  20
# [2,]  2  10  20
# [3,]  3  10  20
# [4,]  4  10  20
# [5,]  5  10  20
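
To see what the loop hands to DT[ , j ], the expression can be built and inspected outside of data.table. This is a minimal base-R sketch, not part of the original answer; the column name "S_1" is taken from the example, and nothing is evaluated against a table here.

```r
# Build the unevaluated call for one column name, as the loop body does
i <- "S_1"
e <- substitute(X := X - RF, list(X = as.symbol(i)))

# e is a call to `:=` whose two arguments are the symbol S_1
# and the call S_1 - RF; the string "S_1" has become a symbol
class(e)     # the class of the object built by substitute
as.list(e)   # function name followed by its two arguments
```

Because X is replaced by a symbol before data.table sees the expression, both sides of := refer to the column by name.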


Or now (1 year later), since with=FALSE applies to the LHS of :=, this works too:

for (i in grep("_", names(DT), value=TRUE))
    DT[, i:=get(i)-RF, with=FALSE]

Or with=FALSE can be avoided by making the LHS an expression rather than a symbol:

for (i in grep("_", names(DT), value=TRUE))
    DT[, (i):=get(i)-RF]
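
The same (i) := idea also accepts a character vector of several column names, so the loop can be dropped. The following is a sketch that goes beyond the original answer: it assumes the example table from the question, and the helper variable cols is introduced here only for illustration.

```r
library(data.table)

# Same example table as in the question
DT <- data.table(RF = 1:10, S_1 = 11:20, S_2 = 21:30)

# Names of all columns containing an underscore (cols is a hypothetical name)
cols <- grep("_", names(DT), value = TRUE)

# Assign to all matching columns at once; .SD holds only the columns
# listed in .SDcols, while RF is still visible as an ordinary column in j
DT[, (cols) := lapply(.SD, function(x) x - RF), .SDcols = cols]
```

Whether this reads better than the explicit loop is a matter of taste; both update DT by reference.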
