(MuMIn)全球混合效应模型等级不足时的挖泥机 [英] (MuMIn) Dredge when global mixed-effects model is rank deficient
问题描述
我正在尝试使用 glmer()
和 dredge()
在Poisson混合效应模型上运行变量选择.由于几个变量是共线的,因此我使用了 dredge
的子集函数来避免相关变量.但是,要有效使用 dredge()
,则需要一个包含所有术语的完整模型-这可能导致完整模型排名不足.
I am trying to run variable selection on Poisson mixed-effect models using glmer()
and dredge()
. Since several variables are collinear I use the subsetting function of dredge
to avoid correlated variables. However, to use dredge()
effectively one needs to have a full model including all terms - which can lead to full model to be rank-deficient.
[edited Feb 15 2016] 为了给出一个可重复的例子,让我们生成一个随机数据集:
[edited Feb 15 2016] To give a reproducible example, let's generate a random data set:
dfdat<-data.frame(replicate(6, round(rnorm(6),2)))
dfdat$group<-factor(sample(1:2,nrow(dfdat),replace=T))
dfdat$Y<-rpois(nrow(dfdat),10)+rpois(nrow(dfdat),as.numeric(dfdat$group))
dfdat
X1 X2 X3 X4 X5 X6 group Y
1 -0.88 0.05 1.33 -1.51 0.61 -0.09 2 8
2 -0.12 -0.57 0.05 -1.12 0.60 -0.41 1 7
3 0.14 -0.97 -1.04 0.40 0.87 0.27 1 9
4 -1.04 -0.26 -1.33 0.77 -1.84 1.67 1 11
5 -1.06 1.10 -0.09 0.50 -2.62 2.15 1 10
6 -1.74 -0.61 0.72 -0.29 -0.30 -0.93 1 8
尝试运行全部6个术语的模型不起作用,因为该模型排名不足:
Trying to run a model with all 6 terms does not work as the model is rank-deficient:
#library(MuMIn) # not run
#library(lme4) # not run
vars<-names(dfdat)[1:6]
form<-formula(paste0('Y~',paste0(vars,collapse='+'),'+(1|group)'))
fmod<-glmer(form,data=dfdat,family='poisson')
fixed-effect model matrix is rank deficient so dropping 1 column / coefficient
Error: pwrssUpdate did not converge in (maxit) iterations
在 fmod
上使用 dredge
将导致始终排除由 glmer
删除的一个变量.
Using dredge
on fmod
would lead to the one variable dropped by glmer
being always excluded.
在此处建议的解决方案建议.运行收敛的模型,然后2.trick挖泥船通过更改收敛模型中的公式来考虑变量的完整列表.
The solution, suggested here seems to 1. run a model that converges, and 2.trick dredge into considering the full list of variables by changing the formula in the converged model.
## full model is rank deficient, so use smaller subset
vars.red<-vars[1:3]
form.red<-formula(paste0('Y~',paste0(vars.red,collapse='+'),'+(1|group)'))
fmod.red<-glmer(form.red,data=dfdat,family='poisson')
此新模型 fmod.red
收敛,但仅包含变量X1,X2和X3.
This new model fmod.red
converges, but only includes variables X1,X2 and X3.
现在进入挖泥船"部分.上面链接的页面上提出的解决方案不适用于 glmer
,因为 mermod
的结构与游戏不同.所以我尝试使用:
Now to the "tricking dredge" part. The solution proposed on the page linked above didn't work as such with glmer
as the structure of mermod
is different from gamms. So I tried to use:
fom.red@call$formula<-form
其中 form
具有我所有的协变量(将被子集化).这没有用,但是使用下面的KamilBartoń所建议的在frame元素中使用公式就可以了:
where form
has all my covariates (to be subsetted).
This didn't work, but using the formula in the frame element, as suggested by Kamil Bartoń below, did work:
# replace formula in the frame element of fmod.red
attr(fmod.red@frame,"formula")<-form
# check
formula(fmod.red)
# now apply dredge function with covariates
# exclude variable combinations (randomly chosen for the sake of example)
sexpr<-expression(!((X1 && X3) || (X1&&X6) || (X4 && X6) || (X4 && X5)))
# run dredge()
options(na.action = na.fail)
ms<-dredge(fmod.red,subset=sexpr)
更新
虽然 ms
似乎包含了所有变量,如下所示:
UPDATE
While ms
seemed to include all variables, as shown by:
names(ms)
[1] "(Intercept)" "X1" "X2" "X3" "X4" "X5" "X6"
[8] "df" "logLik" "AICc" "delta" "weight"
实际上从未包含新变量(X4,X5,X6)(到处都是NA):
the new variables (X4,X5,X6) were never actually included (NAs everywhere):
summary(ms)
(Intercept) X1 X2 X3 X4 X5 X6
Min. :2.407 Min. :0.09698 Min. :-0.4026 Min. :-0.42078 + : 0 + : 0 + : 0
1st Qu.:2.443 1st Qu.:0.22688 1st Qu.:-0.3204 1st Qu.:-0.35303 NA's:26 NA's:26 NA's:26
Median :2.474 Median :0.27361 Median :-0.2980 Median :-0.22444
Mean :2.535 Mean :0.27539 Mean :-0.3059 Mean :-0.23517
3rd Qu.:2.515 3rd Qu.:0.32357 3rd Qu.:-0.2718 3rd Qu.:-0.17472
Max. :3.009 Max. :0.45664 Max. :-0.2177 Max. : 0.08802
NA's :20 NA's :13 NA's :16
发生了什么事?
推荐答案
在"merMod"
对象中,首先在 attr(< object> @frame,公式)
(请参见 getS3method(" formula," merMod)
的功能代码).因此,将其替换为call元素无效,可以使用 formula()
或 getAllTerms()
进行测试.替换 @frame
的公式"
属性.
In "merMod"
objects, formula is first looked for at attr(<object>@frame, "formula")
(see the function code of getS3method("formula", "merMod")
). So, replacing it in a call element was not effective, which can be tested with formula()
or getAllTerms()
. Replace the "formula"
attribute of @frame
.
编辑:事实证明欺骗挖泥
不是那么容易,因为它还会查看 coef
(或在这种情况下,使用fixef
).要解决此问题,请首先生成调用 eval
uate,然后使用 model.sel
构建表:
Edit: it turns out it isn't that easy to trick dredge
, because it also looks at coef
(or fixef
in this case) when building the table. To work that around, first generate calls, eval
uate, then build the table with model.sel
:
model.sel(lapply(dredge(..., evaluate = FALSE), eval), ...)
这篇关于(MuMIn)全球混合效应模型等级不足时的挖泥机的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!