Operations on data table with variable column name


Problem description

I am trying to perform operations on a data.table's column that is passed as a variable.

Here is a toy example:

library(data.table)
set.seed(2)
DT <- data.table(replicate(3, runif(4)))
> DT
          V1        V2        V3
1: 0.1848823 0.9438393 0.4680185
2: 0.7023740 0.9434750 0.5499837
3: 0.5733263 0.1291590 0.5526741
4: 0.1680519 0.8334488 0.2388948

Say the column of interest is passed as the value of a variable:

> print(target.column <- sample(colnames(DT), 1))
[1] "V3"

So I would like to perform some operation on column V3, say, flooring the value at 0.5 for simplicity. I have successfully made this work by using the dreaded paste, parse and eval:

> eval(parse(text = paste0("DT[", target.column, " < 0.5, ", target.column, " := 0.5, ]")))
          V1        V2        V3
1: 0.1848823 0.9438393 0.5000000
2: 0.7023740 0.9434750 0.5499837
3: 0.5733263 0.1291590 0.5526741
4: 0.1680519 0.8334488 0.5000000

But have been unsuccessful in my other attempts:

> DT[eval(target.column) < 0.5, eval(target.column) := 0.5, ]
          V1        V2        V3
1: 0.1848823 0.9438393 0.4680185
2: 0.7023740 0.9434750 0.5499837
3: 0.5733263 0.1291590 0.5526741
4: 0.1680519 0.8334488 0.2388948
> DT[as.name(target.column) < 0.5, as.name(target.column) := 0.5, ]
          V1        V2        V3
1: 0.1848823 0.9438393 0.4680185
2: 0.7023740 0.9434750 0.5499837
3: 0.5733263 0.1291590 0.5526741
4: 0.1680519 0.8334488 0.2388948
> DT[deparse(substitute(target.column)) < 0.5, deparse(substitute(target.column)) := 0.5, ]
          V1        V2        V3
1: 0.1848823 0.9438393 0.4680185
2: 0.7023740 0.9434750 0.5499837
3: 0.5733263 0.1291590 0.5526741
4: 0.1680519 0.8334488 0.2388948

I have looked for solutions on SO and the ol' interweb but have not been able to find anything useful... is there a "data.table" way to do this?

Solution

You can use

DT[ get(target.column) < .5, (target.column) := .5]

which gives the desired result.
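
For anyone who wants to run this end to end, here is a minimal sketch of the approach above. The target column is fixed to "V3" instead of being sampled so the run is reproducible. The helper `floor_column` and the second table `DT2` are hypothetical names introduced only for illustration and are not part of the original answer; the helper shows one possible alternative using data.table's `set()`.

```r
library(data.table)

set.seed(2)
DT <- data.table(replicate(3, runif(4)))
target.column <- "V3"  # assumption: fixed here rather than drawn with sample()

# In `i`, get() looks up the column whose name is stored in the string.
# In `j`, the parentheses around target.column tell := to use the value
# of the variable ("V3") as the column name, not the literal symbol.
DT[get(target.column) < 0.5, (target.column) := 0.5]

# Hypothetical helper: the same operation written with set(),
# which also updates the table by reference.
floor_column <- function(dt, col, floor_value) {
  rows <- which(dt[[col]] < floor_value)  # row numbers below the floor
  if (length(rows) > 0) {
    set(dt, i = rows, j = col, value = floor_value)
  }
  invisible(dt)
}

DT2 <- data.table(replicate(3, runif(4)))
floor_column(DT2, "V1", 0.5)
```

Both forms modify the table in place, so no reassignment is needed. The `get()` form stays closest to ordinary data.table syntax, while the `set()` form may be handy inside loops over many column names.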
