在新级别上使用lme4进行预测 [英] Prediction with lme4 on new levels

查看:170
本文介绍了在新级别上使用lme4进行预测的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试拟合混合效果模型,然后使用该模型对可能具有不同级别的新数据集生成估计.我期望新数据集上的估计将使用估计参数的平均值,但事实并非如此.这是一个最小的工作示例:

I'm trying to fit a mixed effects model and then use that model to generate estimates on a new dataset that may have different levels. I expected that the estimates on a new dataset would use the mean value of the estimated parameters, but that doesn't seem to be the case. Here's a minimum working example:

library(lme4)
d = data.frame(x = rep(1:10, times = 3),
               y = NA,
               grp = rep(1:3, each = 10))
d$y[d$grp == 1] = 1:10 + rnorm(10)
d$y[d$grp == 2] = 1:10 * 1.5 + rnorm(10)
d$y[d$grp == 3] = 1:10 * 0.5 + rnorm(10)
fit = lmer(y ~ (1+x)|grp, data = d)
newdata = data.frame(x = 1:10, grp = 4)
predict(fit, newdata = newdata, allow.new.levels = TRUE)

在此示例中,我实质上是使用不同的回归方程式(斜率分别为1、1.5和0.5)定义三个组.但是,当我尝试以未知水平对新数据集进行预测时,我得到的估计值是恒定的.我希望斜率和截距的期望值可用于生成此新数据的预测.我期待错了吗?或者,我的代码在做什么错了?

In this example, I'm essentially defining three groups with different regression equations (slopes of 1, 1.5 and 0.5). However, when I try to predict on a new dataset with an unseen level, I get a constant estimate. I would have expected the expected value of the slope and intercept to be used to generate predictions for this new data. Am I expecting the wrong thing? Or, what am I doing wrong with my code?

推荐答案

我通常不会在不包括固定斜率的情况下包括随机斜率.似乎predict.merMod与我同意,因为它似乎仅使用固定效果来预测新的水平.该文档说:预测将使用无条件(人口级别)的值来处理以前未曾观测到的数据",但是这些值似乎无法用您的模型规范来估计.

I generally wouldn't include a random slope without including a fixed slope. It seems like predict.merMod agrees with me, because it seems to simply use only the fixed effects to predict for new levels. The documentation says "the prediction will use the unconditional (population-level) values for data with previously unobserved levels", but these values don't seem to be estimated with your model specification.

因此,我建议使用以下模型:

Thus, I suggest this model:

fit = lmer(y ~ x + (x|grp), data = d)
newdata = data.frame(x = 1:10, grp = 4)
predict(fit, newdata = newdata, allow.new.levels = TRUE)
#       1         2         3         4         5         6         7         8         9        10 
#1.210219  2.200685  3.191150  4.181616  5.172082  6.162547  7.153013  8.143479  9.133945 10.124410

这与仅使用模型的固定效果部分相同:

This is the same as only using the fixed effects part of the model:

t(cbind(1, newdata$x) %*% fixef(fit))
#         [,1]     [,2]    [,3]     [,4]     [,5]     [,6]     [,7]     [,8]     [,9]    [,10]
#[1,] 1.210219 2.200685 3.19115 4.181616 5.172082 6.162547 7.153013 8.143479 9.133945 10.12441

这篇关于在新级别上使用lme4进行预测的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆