Calculating CIs of fixed effects using confint in R

Question

I would like to perform bootstrapping to obtain 95% CIs for the fixed effects in a binomial GLMM:

m <- glmer(cbind(Valid.detections, Missed.detections) ~ distance +
               Habitat + Replicate + transmitter.depth + receiver.depth +
               wind.speed + wtc + Transmitter + (1 | Unit) +
               (1 | SUR.ID) + distance:Transmitter +
               distance:Habitat + distance:transmitter.depth + distance:receiver.depth +
               distance:wind.speed,
           data = df, family = binomial(link = logit),
           control = glmerControl(calc.derivs = FALSE))

I found that the confint() function is able to achieve this, so I specified the function:

confint(m, method = "boot", boot.type = "basic", seed = 123, nsim = 1000)

The function had been running for more than 8 hours before I decided to terminate it. On termination, I got the following warning message (10 times):

Warning messages:
1: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf,  :
  failure to converge in 10000 evaluations

My questions are: 1) Do I have to worry about these warning messages, and if so, how should I deal with them? 2) Because the function was still running after 8 hours, I have no idea how long it takes to complete, so it would be nice to have some sort of progress bar while it runs. I read that confint() can pass arguments on to bootMer, so I included the argument .progress = "txt", resulting in:

confint(m, method = "boot", boot.type = "basic", seed = 123, nsim = 1000, .progress = "txt")

but it doesn't seem to work. Alternatively, if there are better ways to achieve the same goal, I'm open to suggestions.

Thanks for your help.

Recommended answer

Here is an example:

library("lme4")
(t1 <- system.time(
    gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                 data = cbpp, family = binomial)))
##    user  system elapsed 
##   0.188   0.000   0.186

nranpars <- length(getME(gm1,"theta"))
nfixpars <- length(fixef(gm1))

(t2 <- system.time(c1 <- confint(gm1,method="boot", nsim=1000,
                  parm=(nranpars+1):(nranpars+nfixpars),
                  .progress="txt")))

##    user  system elapsed 
## 221.958   0.164 222.187

Note that this progress bar is only 80 characters long, so it advances only once every 1000/80 = 12.5 bootstrap iterations. If your original model took an hour to fit, you shouldn't expect to see any progress-bar activity until roughly 12 hours later ...
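The arithmetic above can be turned into a rough planning estimate before starting a long run. The sketch below is only an illustration and not part of lme4: estimate_boot_time is a hypothetical helper, and it assumes that every bootstrap refit takes about as long as the original fit and that any parallel speed-up is perfect, neither of which holds exactly in practice.

```r
## Hypothetical helper (not an lme4 function): rough wall-clock estimate
## for a parametric bootstrap, assuming one refit per replicate and that
## each refit costs about as much as the original fit.
estimate_boot_time <- function(fit_seconds, nsim, ncpus = 1, bar_width = 80) {
  total_seconds <- fit_seconds * nsim / ncpus
  c(total_hours    = total_seconds / 3600,
    ## time between progress-bar ticks; only meaningful for serial runs,
    ## since the progress bar is not available when running in parallel
    hours_per_tick = total_seconds / bar_width / 3600)
}

## a model that takes one hour to fit, 1000 replicates, one core
est <- estimate_boot_time(fit_seconds = 3600, nsim = 1000)
est
```

For the cbpp example, the same estimate (0.186 s per fit, 1000 replicates) comes to about 186 seconds, against the 222 seconds actually measured, so it should be treated as an order-of-magnitude guide only.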

## no method specified, so confint() uses its default (profile likelihood)
(t3 <- system.time(c2 <- confint(gm1,
                  parm=(nranpars+1):(nranpars+nfixpars))))

##    user  system elapsed 
##   5.212   0.012   5.236 

1000 bootstrap reps is probably overkill -- if your model fit is slow, you can probably get reasonable CIs from 200 bootstrap reps.
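A smaller run can also be made by calling bootMer() directly, which is the function that confint(..., method="boot") relies on. The following is a minimal sketch under stated assumptions rather than a recommended recipe: it draws 200 replicates of the fixed effects and computes simple percentile intervals with quantile(). The names boot_fix and ci_perc are illustrative, and the intervals may differ slightly from what confint() reports because the quantiles are interpolated differently.

```r
library(lme4)

gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)

## 200 parametric bootstrap replicates of the fixed-effect estimates
nsim <- 200
boot_fix <- bootMer(gm1, FUN = fixef, nsim = nsim, seed = 123)

## percentile intervals: 2.5% and 97.5% quantiles of each column of $t
## (na.rm = TRUE skips any replicate whose refit failed)
ci_perc <- t(apply(boot_fix$t, 2, quantile,
                   probs = c(0.025, 0.975), na.rm = TRUE))
rownames(ci_perc) <- names(fixef(gm1))
ci_perc
```

Keeping the bootMer object around means the replicates can be inspected, or a different interval type computed, without rerunning the bootstrap.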

I tried this with optimizer="nloptwrap" as well, hoping it would speed things up. It did, although there is a drawback (see below).

(t4 <- system.time(
    gm1B <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                 data = cbpp, family = binomial, 
                 control=glmerControl(optimizer="nloptwrap"))))
##   user  system elapsed 
##  0.064   0.008   0.075 

(t5 <- system.time(c3 <- confint(gm1B,method="boot", nsim=1000,
                  parm=(nranpars+1):(nranpars+nfixpars),
                  .progress="txt",PBargs=list(style=3))))
##
##   user  system elapsed 
## 65.264   2.160  67.504

This is much faster, but gives warnings (37 in this case) about convergence. According to all.equal(), the confidence intervals calculated this way differed by only about 2% from the earlier ones. (There are still some wrinkles to sort out in the package itself ...)
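For readers unfamiliar with all.equal(), the sketch below shows how such a comparison behaves. The two matrices are made-up numbers chosen for illustration, not output from the models above, and the names ci_a and ci_b are hypothetical.

```r
## made-up interval matrices for illustration only (not output from gm1/gm1B)
ci_a <- matrix(c(-2.0, -1.8, -0.8, -0.2), nrow = 2,
               dimnames = list(c("(Intercept)", "period2"),
                               c("2.5 %", "97.5 %")))
ci_b <- ci_a * 1.02   # every bound shifted by 2%

## with the default tolerance (about 1.5e-8), all.equal() does not return
## FALSE; it returns a character string reporting the mean relative difference
diff_msg <- all.equal(ci_a, ci_b)
diff_msg

## with a looser tolerance the same comparison counts as equal
same_within_5pct <- isTRUE(all.equal(ci_a, ci_b, tolerance = 0.05))
same_within_5pct
```

Because all.equal() returns a message rather than FALSE when the objects differ, it should be wrapped in isTRUE() whenever the result is used in a condition.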

Your best bet for speeding this up will be to parallelize -- unfortunately, this way you lose the ability to use a progress bar ...

(t6 <- system.time(c4 <- confint(gm1,method="boot", nsim=1000,
                  parm=(nranpars+1):(nranpars+nfixpars),
                  parallel="multicore", ncpus=4)))

## 
##     user  system elapsed 
##  310.355   0.916 116.917

This takes more user time (it counts the time used on all cores), but the elapsed time is cut in half. (It would be nice to do better with 4 cores but twice as fast is still good. These are virtual cores on a virtual Linux machine, real (non-virtual) cores might have better performance.)
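The parallel="multicore" option relies on forking, which is not available on Windows; there, parallel="snow" is the usual alternative. The sketch below picks a backend from the platform and a core count from parallel::detectCores(). The variable names backend and ncpus are illustrative, and nsim is reduced to 200 only to keep the example quick.

```r
library(lme4)
library(parallel)

gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)

## forking ("multicore") is unavailable on Windows; use a socket cluster there
backend <- if (.Platform$OS.type == "windows") "snow" else "multicore"

## leave one core free; detectCores() can return NA on some systems
n_avail <- detectCores()
ncpus <- if (is.na(n_avail)) 1L else max(1L, n_avail - 1L)

## parm = "beta_" restricts the intervals to the fixed effects
ci_par <- confint(gm1, method = "boot", nsim = 200, parm = "beta_",
                  parallel = backend, ncpus = ncpus)
ci_par
```

Leaving one core free is a design choice to keep the machine responsive. On a dedicated machine all cores could be used instead.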

With nloptwrap and multicore combined I can get the time down to 91 seconds (user)/ 36 seconds (elapsed).
