Using lm(my_formula) inside [.data.table's j


Problem description

I have gotten in the habit of accessing data.table columns in j even when I do not need to:

require(data.table)
set.seed(1); n = 10
DT <- data.table(x=rnorm(n),y=rnorm(n))

frm <- formula(x~y)

DT[,lm(x~y)]         # 1 works
DT[,lm(frm)]         # 2 fails
lm(frm,data=DT)      # 3 what I'll do instead

I expected # 2 to work, since lm should search for variables in DT and then in the global environment... Is there an elegant way to get something like # 2 to work?

In this case, I'm using lm, which takes a "data" argument, so # 3 works just fine.

EDIT. Note that this works:

x1 <- DT$x
y1 <- DT$y
frm1 <- formula(x1~y1)
lm(frm1)

and this, too:

rm(x1,y1)
bah <- function(){
    x1 <- DT$x
    y1 <- DT$y
    frm1 <- formula(x1~y1)
    lm(frm1)
}
bah()

EDIT2. However, this fails, illustrating @eddi's answer:

frm1 <- formula(x1~y1)
bah1 <- function(){
    x1 <- DT$x
    y1 <- DT$y
    lm(frm1)
}
bah1()

Solution

When no data argument is given, lm looks for the variables in the environment of the formula supplied (and that environment's parents). Since you create your formula in the global environment, it's not going to look in the j-expression environment, so the only way to make the exact expression lm(frm) work would be to add the appropriate variables to the correct environment:

DT[, {assign('x', x, environment(frm));
      assign('y', y, environment(frm));
      lm(frm)}]

Now obviously this is not a very good solution (it writes x and y into the global environment, since that is where environment(frm) points), and both Arun's and Josh's suggestions are much better. I'm just putting it here for the understanding of the problem at hand.
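For comparison, here is a minimal sketch of one commonly used alternative that avoids touching any environment: passing the current data to lm through its data argument, using data.table's .SD inside j. This is an assumption about what a cleaner route might look like, not necessarily what the answers mentioned above propose (they are not reproduced here). The names coef_sd and coef_ref are made up for illustration.

```r
library(data.table)

set.seed(1); n <- 10
DT <- data.table(x = rnorm(n), y = rnorm(n))

frm <- x ~ y   # created outside j, as in the question

# .SD holds the columns of DT inside j, so lm() finds x and y in `data`
# first and never needs to search the formula's environment for them.
coef_sd <- DT[, coef(lm(frm, data = .SD))]

# Reference fit outside of j (case #3 in the question).
coef_ref <- coef(lm(frm, data = DT))
```

Both calls should give the same coefficients, since they fit the same formula to the same columns.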

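Edit: another (possibly more perverted, and quite fragile, since it reaches for SDenv, an internal variable of [.data.table) way would be to change the environment of the formula at hand. I do it permanently here, because setattr modifies frm by reference, but you could revert it afterwards, or copy the formula first and change the copy: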

DT[, {setattr(frm, '.Environment', get('SDenv', parent.frame(2))); lm(frm)}]

By the way, a funny thing is happening here: whenever you use get in a j-expression, all of the columns get constructed (so don't use it if you can avoid it), and this is why I don't need to also use x and y in some way for data.table to know that those variables are needed.
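The "copy it and then do it" idea can be sketched in base R, without data.table, to show both the lookup rule and a non-permanent change of environment. This is only an illustration under stated assumptions: the helper fit_in_function and the variable names resp and pred are hypothetical, and no objects called resp or pred are assumed to be visible from where the formula is created. Unlike setattr, the replacement form environment(f) <- ... only changes the function's local copy of the formula.

```r
frm <- resp ~ pred   # created here, where resp and pred are not defined

fit_in_function <- function(f, rehome) {
  pred <- c(1, 2, 3, 4, 5)
  resp <- c(2.9, 5.1, 7.0, 9.2, 10.8)
  if (rehome) {
    # Point the local copy of the formula at this function's frame,
    # where pred and resp live. The caller's formula is not modified.
    environment(f) <- environment()
  }
  # Return the fit, or the error message if the variables cannot be found.
  tryCatch(lm(f), error = function(e) conditionMessage(e))
}

env_before <- environment(frm)

without_rehome <- fit_in_function(frm, rehome = FALSE)  # error message (a string)
with_rehome    <- fit_in_function(frm, rehome = TRUE)   # an object of class "lm"

# The original formula still has the environment it was created in.
unchanged <- identical(environment(frm), env_before)
```

Without the change of environment, lm never looks at the local pred and resp, which is the same failure as bah1() in the question.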
