update() a model inside a function with local covariate
Question
I need to update a regression model from inside a function. Ideally, the function should work with any kind of model (lm, glm, multinom, clm). More precisely, I need to add one or several covariates that are defined inside the function. Here is an example.
MyUpdate <- function(model){
  randData <- data.frame(var1=rnorm(length(model$residuals)))
  model2 <- update(model, ".~.+randData$var1")
  return(model2)
}
Here is an example use:
data(iris)
model1 <- lm(Sepal.Length~Species, data=iris)
model2 <- MyUpdate(model1)
Error in eval(expr, envir, enclos) : object 'randData' not found
Here is another example with glm
model1 <- glm(Sepal.Length>5~Species, data=iris, family=binomial)
model2 <- MyUpdate(model1)
Any ideas?
Recommended answer
The problem is that var1 is looked up in the data frame and in the model's environment, but not in the environment of MyUpdate.
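To make the lookup rule concrete, here is a minimal sketch that reproduces the failure with a bare local variable. FailingUpdate is a hypothetical name used only for illustration, and the error is captured with tryCatch so the script keeps running.

```r
model1 <- lm(Sepal.Length ~ Species, data = iris)

# Hypothetical helper: var1 exists only in this function's frame.
FailingUpdate <- function(model) {
  var1 <- rnorm(nrow(model.frame(model)))
  # The updated formula keeps the environment of the ORIGINAL formula,
  # so the refit searches `data` and then that environment, not this frame.
  update(model, . ~ . + var1)
}

# Capture the condition instead of stopping.
res <- tryCatch(FailingUpdate(model1), error = function(e) e)
inherits(res, "error")
conditionMessage(res)
```

The refit is expected to fail as long as no object named var1 is visible from the place where model1 was created.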
1) To avoid this problem, update the model with not only the revised formula but also a revised data frame containing var1:
MyUpdate <- function(model) {
  mf <- model.frame(model)
  n <- nrow(mf)
  var1 <- rnorm(n)
  update(model, formula = . ~ . + var1, data = data.frame(mf, var1))
}
The above is probably the best of the solutions presented in this answer, as it avoids mucking around with internal structures. It seems to work for lm, glm, multinom and clm. The other solutions below do muck around with internal structures and are therefore less general across model fitting routines. They all work with lm but may not work with other model types.
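Since the question asks for "one or several" covariates, solution (1) can be extended along the following lines. This is only a sketch: AddCovariates and its covs argument are hypothetical names, not part of the original answer, and it assumes covs is a data frame with one row per row of the model frame.

```r
# Hypothetical generalisation of solution (1) to several new covariates.
AddCovariates <- function(model, covs) {
  mf <- model.frame(model)
  stopifnot(is.data.frame(covs), nrow(covs) == nrow(mf))
  # Build ". ~ . + z1 + z2 + ..." from the column names of covs.
  new_formula <- as.formula(
    paste(". ~ . +", paste(names(covs), collapse = " + "))
  )
  # Pass the revised data alongside the revised formula, as in (1).
  update(model, formula = new_formula, data = data.frame(mf, covs))
}

model1 <- lm(Sepal.Length ~ Species, data = iris)
covs <- data.frame(z1 = rnorm(nrow(iris)), z2 = rnorm(nrow(iris)))
model2 <- AddCovariates(model1, covs)
names(coef(model2))
```

Inside a function, covs would be built locally in the same way var1 is built in (1).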
Test. Here is a test that runs without errors on each of the model fitting functions mentioned in the question when MyUpdate is defined as above. The solutions in (2) also run all of these tests without error. Solution (3) works at least with lm.
model.lm <- lm(Sepal.Length~Species, data=iris)
MyUpdate(model.lm)
model.glm <- glm(Sepal.Length~Species, data=iris)
MyUpdate(model.glm)
library(nnet)
example(multinom)
MyUpdate(bwt.mu)
library(ordinal)
model.clm <- clm(rating ~ temp * contact, data = wine)
MyUpdate(model.clm)
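The tests above all use an untransformed response. The question's second example, with Sepal.Length > 5 on the left-hand side, may be harder for (1): model.frame() stores the already evaluated column rather than Sepal.Length itself, so the refit may not find the raw variable. One possible variant, sketched here under the assumption that the model was fitted with a data argument that can still be evaluated, adds var1 to the original data instead. MyUpdateFromData is a hypothetical name.

```r
# Hypothetical variant: rebuild from the original data rather than the model frame.
MyUpdateFromData <- function(model) {
  # Re-evaluate the `data` argument of the stored call where the formula was created.
  dat <- eval(getCall(model)$data, environment(formula(model)))
  stopifnot(is.data.frame(dat))   # assumption: a data frame was supplied
  dat$var1 <- rnorm(nrow(dat))
  update(model, formula = . ~ . + var1, data = dat)
}

model1 <- glm(Sepal.Length > 5 ~ Species, data = iris, family = binomial)
model2 <- MyUpdateFromData(model1)
names(coef(model2))
```

The trade-off is that this reaches into the stored call, so it is less self-contained than (1) and will stop if the model was fitted without a data argument.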
The remaining solutions perform more direct access of internals making them less robust to changing the model function.
2) Messing with environments
In addition, here are three solutions that involve messing with environments. The first is the cleanest, followed by the second and then the third. The third is the least acceptable since it actually writes var1 into the model's environment (dangerously overwriting any var1 there), but it is the shortest. They work with lm, glm, multinom and clm.
Note that we do not really need to put var1 into a data frame, nor is it necessary to put the updating formula in quotes, and we have changed both in all examples below. The return statement can also be removed, and we have done that too.
2a) The following modifies the environment of the original model to point to a new proxy proto object containing var1, whose parent is the original model environment. Here proto(p, var1 = var1) is the proxy proto object (a proto object is an environment with differing semantics) and p is the parent of the proxy.
library(proto)
MyUpdate <- function(model){
  mf <- model.frame(model)
  n <- nrow(mf)
  var1 <- rnorm(n)
  p <- environment(formula(model))
  if (is.null(model$formula)) {
    attr(model$terms, ".Environment") <- proto(p, var1 = var1)
  } else environment(model$formula) <- proto(p, var1 = var1)
  update(model, . ~ . + var1)
}
For more information read the Proxies section in this document: http://r-proto.googlecode.com/files/prototype_approaches.pdf
2b) This could alternatively be done without proto, at the expense of some additional ugly environment manipulations in place of the proto() call. Here e is the proxy environment.
MyUpdate <- function(model){
  mf <- model.frame(model)
  n <- nrow(mf)
  var1 <- rnorm(n)
  p <- environment(formula(model))
  e <- new.env(parent = p)
  e$var1 <- var1
  if (is.null(model$formula)) {
    attr(model$terms, ".Environment") <- e
  } else {
    environment(model$formula) <- e
  }
  update(model, . ~ . + var1)
}
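The proxy idea can be inspected on its own. The sketch below (the variable names are illustrative) builds the same kind of proxy environment outside any function and checks where var1 actually lives.

```r
model1 <- lm(Sepal.Length ~ Species, data = iris)

p <- environment(formula(model1))   # environment the model formula was created in
e <- new.env(parent = p)            # proxy: its own bindings first, then those of p
e$var1 <- rnorm(nrow(model.frame(model1)))

# var1 is bound in the proxy itself ...
in_proxy <- exists("var1", envir = e, inherits = FALSE)
# ... while the original environment is left untouched
# (assuming no var1 was already defined there).
in_model_env <- exists("var1", envir = p, inherits = FALSE)
# Anything else is still resolved through the original environment.
same_parent <- identical(parent.env(e), p)

c(in_proxy = in_proxy, in_model_env = in_model_env, same_parent = same_parent)
```

Because the proxy only adds a layer in front of the original environment, anything the model could see before is still visible, and var1 does not leak into the caller's workspace.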
2c) The shortest but most hackish approach is to stick var1 directly into the original model environment:
MyUpdate <- function(model){
  mf <- model.frame(model)
  n <- nrow(mf)
  var1 <- rnorm(n)
  if (is.null(model$formula)) {
    attr(model$terms, ".Environment")$var1 <- var1
  } else {
    environment(model$formula)$var1 <- var1
  }
  update(model, . ~ . + var1)
}
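The overwriting risk mentioned for this approach can be seen in a small sketch. Here the model is deliberately fitted in a scratch environment, work, that already holds an unrelated var1. HackUpdate is a hypothetical, simplified stand-in for the 2c idea that writes into environment(formula(model)) with assign().

```r
work <- new.env()
work$var1 <- "user data that happens to be called var1"
# Fit the model so that the formula's environment is `work`.
model <- eval(quote(lm(Sepal.Length ~ Species, data = iris)), work)

# Hypothetical simplified version of 2c.
HackUpdate <- function(model) {
  env <- environment(formula(model))
  # Environments are reference objects, so this changes `work` itself.
  assign("var1", rnorm(nrow(model.frame(model))), envir = env)
  update(model, . ~ . + var1)
}

before <- class(work$var1)
model2 <- HackUpdate(model)
after <- class(work$var1)
c(before = before, after = after)
```

The refit succeeds, but the original var1 in work has been silently replaced, which is why the proxy approaches in 2a and 2b are preferable.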
3) eval/substitute. This solution uses eval, which is sometimes frowned upon. It works on lm and glm. On clm it also works, except that the output does not display var1 but rather the expression that computes it.
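MyUpdate <- function(model) {
  # Build the update() call around the caller's own expression and evaluate it there
  m <- eval.parent(substitute(update(model, . ~ . + rnorm(nrow(model.frame(model))))))
  # Relabel the stored formula and the last coefficient as var1
  m$call$formula <- update(formula(model), . ~ . + var1)
  names(m$coefficients)[length(m$coefficients)] <- "var1"
  m
}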
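To see what substitute() contributes here, the following sketch returns the substituted call instead of evaluating it. ShowSubstituted is a hypothetical helper for illustration only.

```r
# Hypothetical helper: return the call that solution (3) would evaluate.
ShowSubstituted <- function(model) {
  substitute(update(model, . ~ . + rnorm(nrow(model.frame(model)))))
}

model.lm <- lm(Sepal.Length ~ Species, data = iris)
cl <- ShowSubstituted(model.lm)
print(cl)   # `model` has been replaced by the caller's expression, model.lm

# Evaluating the call where model.lm lives refits the model; the new term is
# labelled by its expression, which is why (3) renames it afterwards.
m <- eval(cl)
tail(names(coef(m)), 1)
```

Because the call refers to model.lm rather than to the function's local model, it can be evaluated in the caller's frame, which is what eval.parent does in solution (3).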
REVISED Added additional solutions, simplified (1), got solutions in (2) to run all examples in test section.