glm starting values not accepted log-link

Problem description

I want to run a Gaussian GLM with a log link and an offset. The following problem arises:

y <- c(1,1,0,0)
t <- c(5,3,2,4)

This works fine:

exp(coef(glm(y~1 +  offset(log(t)), family=poisson)))

With family=gaussian, starting values need to be specified. It works here:

exp(coef(glm(y~1, family=gaussian(link=log), start=0)))

But it does not work here:

exp(coef(glm(y~1 +  offset(log(t)), family=gaussian(link=log), start=0)))

Error in eval(expr, envir, enclos) : cannot find valid starting values: please specify some

Does anyone see what's wrong (hopefully just in my coding)?

Recommended answer

Here are the results of some archaeology that explain what's going on deep within the glm function:

Debugging (with debug("glm")) and stepping through the function shows that it fails at the following call:

if (length(offset) && attr(mt, "intercept") > 0L) {
  fit$null.deviance <- eval(call(if (is.function(method)) "method" else method, 
    x = X[, "(Intercept)", drop = FALSE], y = Y, weights = weights, 
    offset = offset, family = family, control = control, 
    intercept = TRUE))$deviance
}

This is an attempt to calculate the null deviance for the model. It's only evaluated if there is both an intercept term and an offset term (I'm not sure why; it may be that the default null deviance calculated by the earlier fit inside glm is wrong in that case and must be recalculated?). It calls glm.fit (the default value of method), but without starting values, because these are usually unnecessary for the intercept-only model.

Now debugging inside glm.fit to see what happens, we arrive (within code supplied by the family function, gaussian()) at:

  if (is.null(etastart) && is.null(start) && is.null(mustart) && 
    ((family$link == "inverse" && any(y == 0)) || (family$link == 
        "log" && any(y <= 0))))
    stop("cannot find valid starting values: please specify some")

We see that the fit fails because the starting values were not passed through, a log link is used, and some y values are equal to zero. So this should happen if (and only if?) an offset and an intercept are both specified, a log link is used, and there are zero values in the response.
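
The check can be reproduced without stepping through glm. This is a minimal sketch, assuming the question's data, that calls glm.fit directly with only the intercept column and no starting values, much as the null-deviance call does. The names X0 and msg are illustrative and not part of glm.

```r
# Data from the question
y <- c(1, 1, 0, 0)
t <- c(5, 3, 2, 4)

# Intercept-only design matrix, mirroring X[, "(Intercept)", drop = FALSE]
X0 <- matrix(1, nrow = length(y), ncol = 1,
             dimnames = list(NULL, "(Intercept)"))

# No start, etastart or mustart is supplied, so the family's initialization
# check rejects the zeros in y under the log link.
msg <- tryCatch(
  {
    glm.fit(x = X0, y = y, offset = log(t),
            family = gaussian(link = "log"), intercept = TRUE)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

Capturing the message with tryCatch keeps the script running, so the failure can be inspected rather than aborting the session.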

If you dump("glm", file="glmtemp.R"); add the lines

    start = start[1], etastart = etastart[1], mustart = mustart[1],

to the call that fits the null deviance (i.e. the one shown above); and source("glmtemp.R"), it seems to work OK. I think this should be a reasonable general solution. If anyone wants to bring this issue up on the R development list, feel free.
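
If editing glm is not convenient, one possible alternative (my own assumption, not something tested as part of the debugging above) is to call glm.fit directly and hand it the starting value. Only the fitted coefficient is used here; the other components of the returned list are not examined. The names X0, fit, rate and rate_ls are illustrative. Because the model is mu = rate * t with Gaussian errors, the estimate can be compared against the least-squares value sum(y * t) / sum(t^2).

```r
# Data from the question
y <- c(1, 1, 0, 0)
t <- c(5, 3, 2, 4)

# Intercept-only design matrix with the column name glm would use
X0 <- matrix(1, nrow = length(y), ncol = 1,
             dimnames = list(NULL, "(Intercept)"))

# Supplying start = 0 satisfies the family's starting-value check
fit <- glm.fit(x = X0, y = y, offset = log(t),
               family = gaussian(link = "log"), start = 0)

# Back-transform the intercept to the rate scale
rate <- exp(fit$coefficients[["(Intercept)"]])

# Least-squares estimate of the same rate, for comparison
rate_ls <- sum(y * t) / sum(t^2)

print(c(glm_fit = rate, least_squares = rate_ls))
```

Note that glm.fit returns a plain list rather than an object of class "glm", so methods such as summary() are not available on it without further work.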
