R error which says "Models were not all fitted to the same size of dataset"


Question

I have created two generalised linear models as follows:

glm1 <- glm(Y ~ X1 + X2 + X3, family = binomial(link = logit))

glm2 <- glm(Y ~ X1 + X2, family = binomial(link = logit))

I then use the anova function:

anova(glm2,glm1)

But I get this error message:

"Error in anova.glmlist(c(list(object), dotargs), dispersion = dispersion, :
models were not all fitted to the same size of dataset"

What does this mean, and how can I fix it? I attached the dataset at the start of my code, so both models are working from the same dataset.

Recommended answer

The main cause of this error is missing values in one or more of the predictor variables. In recent versions of R, the default action is to omit every row that has a missing value in any variable the model uses (the previous default was to produce an error). So, for example, if the data frame has 100 rows and X3 has one missing value, glm1 is fitted to 99 rows (the row where X3 is missing is dropped), while glm2 is fitted to all 100 rows (it does not use X3, so no rows need to be dropped).
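One way to check whether this is what happened is to compare the number of observations each fit actually used. The sketch below is illustrative only: the data frame `dat` and its simulated values are assumptions, and only the variable names Y, X1, X2 and X3 come from the question.

```r
# Simulated data, hypothetical and for illustration only
set.seed(1)
dat <- data.frame(
  Y  = rbinom(100, 1, 0.5),
  X1 = rnorm(100),
  X2 = rnorm(100),
  X3 = rnorm(100)
)
dat$X3[5] <- NA  # introduce a single missing value in X3

glm1 <- glm(Y ~ X1 + X2 + X3, family = binomial(link = "logit"), data = dat)
glm2 <- glm(Y ~ X1 + X2, family = binomial(link = "logit"), data = dat)

# Number of rows each model was fitted to; they differ when rows were dropped
n1 <- nobs(glm1)
n2 <- nobs(glm2)
print(c(glm1 = n1, glm2 = n2))

# Count of missing values per column, to locate the culprit variable
print(colSums(is.na(dat)))
```

If the two counts differ, anova(glm2, glm1) will stop with the error quoted in the question.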

The anova function then gives you an error because the two models were fitted to different datasets (and it is unclear how the degrees of freedom, etc. would be computed in that case).

One solution is to create a new data frame that contains only the columns used in at least one of your models, remove every row with any missing value (the na.omit or na.exclude function makes this easy), and then fit both models to that same data frame, which has no missing values.
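A minimal sketch of this approach follows. The data frame `dat`, its simulated values, and the name `dat_complete` are assumptions made for illustration; substitute your own data frame and column names.

```r
# Simulated data, hypothetical and for illustration only
set.seed(1)
dat <- data.frame(
  Y  = rbinom(100, 1, 0.5),
  X1 = rnorm(100),
  X2 = rnorm(100),
  X3 = rnorm(100)
)
dat$X3[5] <- NA  # one missing value in X3

# Keep only the columns used by at least one model, then drop incomplete rows
vars <- c("Y", "X1", "X2", "X3")
dat_complete <- na.omit(dat[, vars])

# Fit both models to the same complete-case data frame
glm1 <- glm(Y ~ X1 + X2 + X3, family = binomial(link = "logit"),
            data = dat_complete)
glm2 <- glm(Y ~ X1 + X2, family = binomial(link = "logit"),
            data = dat_complete)

# Both fits now use the same rows, so the comparison runs
cmp <- anova(glm2, glm1, test = "Chisq")
print(cmp)
```

Passing the data frame through the data argument, rather than relying on attach, makes it explicit that both fits see exactly the same rows.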

Other options would be to look at tools for multiple imputation or other ways of dealing with missing data.
