Using eval in data.table


Question




I'm trying to understand how eval behaves when a data.table is used as the evaluation "frame".

With the following data.table:

library(data.table)
set.seed(1)
foo = data.table(var1 = sample(1:3, 1000, replace = TRUE), var2 = rnorm(1000), var3 = sample(letters[1:5], 1000, replace = TRUE))

I'm trying to replicate this instruction:

foo[var1==1 , sum(var2) , by=var3]

using a function that wraps eval:

eval1 = function(s) eval( parse(text=s) ,envir=sys.parent() )

As you can see, test 1 and 3 are working, but I don't understand which is the "correct" envir to set in eval for test 2:

var_i="var1"
var_j="var2"
var_by="var3"

# test 1 works
foo[eval1(var_i)==1 , sum(var2) , by=var3 ]

# test 2 doesn't work
foo[var1==1 , sum(eval1(var_j)) , by=var3]

# test 3 works
foo[var1==1 , sum(var2) , by=eval1(var_by)]

Solution

The j-expression looks up its variables in the environment of .SD, which stands for Subset of Data. .SD is itself a data.table that holds the columns for that group.

When you do:

foo[var1 == 1, sum(eval(parse(text=var_j))), by=var3]

directly, the j-exp gets internally optimised/replaced to sum(var2). But sum(eval1(var_j)) doesn't get optimised, and stays as it is.
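As a quick check of the inline form described above, the following sketch compares it with the hard-coded call. It assumes the data.table package is installed; the names direct and via_eval are introduced here for illustration only and are not part of the original answer.

```r
library(data.table)

# Same data as in the question
set.seed(1)
foo <- data.table(
  var1 = sample(1:3, 1000, replace = TRUE),
  var2 = rnorm(1000),
  var3 = sample(letters[1:5], 1000, replace = TRUE)
)
var_j <- "var2"

# Hard-coded column name in j
direct <- foo[var1 == 1, sum(var2), by = var3]

# eval(parse(...)) written directly inside j, with no wrapper function
via_eval <- foo[var1 == 1, sum(eval(parse(text = var_j))), by = var3]

# Compare the two results
same <- isTRUE(all.equal(direct, via_eval))
print(same)
```

The difference from test 2 is only that eval is called directly inside j rather than from within a separate function with its own frame.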

Then, when it is evaluated for each group, it has to find var2, which does not exist in the parent.frame() from which the function is called; it exists only in .SD. As an example, let's do this:

eval1 <- function(s) eval(parse(text=s), envir=parent.frame())
foo[var1 == 1, { var2 = 1L; eval1(var_j) }, by=var3]
#    var3 V1
# 1:    e  1
# 2:    c  1
# 3:    a  1
# 4:    b  1
# 5:    d  1

It finds var2 in its parent frame. That is, we have to point eval to the right environment to evaluate in, by passing an additional argument with value .SD.

eval1 <- function(s, env) eval(parse(text=s), envir = env, enclos = parent.frame())
foo[var1 == 1, sum(eval1(var_j, .SD)), by=var3]
#    var3         V1
# 1:    e  11.178035
# 2:    c -12.236446
# 3:    a  -8.984715
# 4:    b  -2.739386
# 5:    d  -1.159506
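How envir and enclos divide the work in the final eval1 can be seen in base R alone. This is a minimal sketch with hypothetical stand-ins: the list group plays the role of .SD, offset is a variable that exists only in the calling frame, and eval_in is an illustrative name for the two-argument eval1.

```r
# Same shape as the two-argument eval1 above (eval_in is an illustrative name)
eval_in <- function(s, env) {
  eval(parse(text = s), envir = env, enclos = parent.frame())
}

# Hypothetical stand-in for .SD: a list holding one group's columns
group <- list(var2 = c(1, 2, 3))

# A variable that exists only in the calling frame, not in `group`
offset <- 10

# var2 is resolved in `group` (envir); offset and sum() are resolved
# through the caller's frame (enclos)
res <- eval_in("sum(var2) + offset", group)
print(res)
```

When envir is a list or data frame, eval looks names up there first and falls back to enclos for anything it does not find, which is why passing .SD as envir makes the group's columns visible while ordinary variables and functions still resolve from the caller.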
