R MICE imputation failing

Problem description

I am really baffled about why my imputation is failing in the mice package (version 2.22) for R. I am attempting a very simple operation with the following data frame:

> dfn
   a b c  d
1  0 1 0  1
2  1 0 0  0
3  0 0 0  0
4 NA 0 0  0
5  0 0 0 NA

I then use mice in the following way to perform a simple mean imputation:

imp <- mice(dfn, method = "mean", m = 1, maxit = 1)
filled <- complete(imp)

However, my completed data looks like this:

> filled
     a b c  d
1 0.00 1 0  1
2 1.00 0 0  0
3 0.00 0 0  0
4 0.25 0 0  0
5 0.00 0 0 NA

Why am I still getting this trailing NA? This is the simplest failing example I could construct, but my real data set is much larger and I am just trying to get a sense of where things are going wrong. Any help would be greatly appreciated!

Recommended answer

I'm not sure how accurate this is, but here is an attempt. Even though method="mean" is supposed to impute the unconditional mean, it appears from the documentation that the predictorMatrix is not changed accordingly.

Normally, leftover NAs occur because the predictors suffer from multicollinearity or because there are too few cases per variable (so that the imputation model cannot be estimated). However, method="mean" shouldn't behave that way.
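As a rough illustration of the multicollinearity point, the base-R sketch below checks the example data for constant columns and for a perfectly correlated pair. It does not call mice and makes no claim about what mice checks internally; the helper name `is_constant` is hypothetical and used only for illustration.

```r
# Example data from the question
dfn <- data.frame(
  a = c(0, 1, 0, NA, 0),
  b = c(1, 0, 0, 0, 0),
  c = c(0, 0, 0, 0, 0),
  d = c(1, 0, 0, 0, NA)
)

# Hypothetical helper: TRUE when a column has at most one distinct observed value
is_constant <- function(x) length(unique(x[!is.na(x)])) <= 1

# Names of columns that carry no variation at all
constant_cols <- names(dfn)[sapply(dfn, is_constant)]
print(constant_cols)

# Correlation of b and d over the rows where both are observed
bd_cor <- cor(dfn$b, dfn$d, use = "complete.obs")
print(bd_cor)
```

In this sample, c is constant and b and d agree on every row where both are observed, so d carries no information beyond b. If your version of mice exposes it, the loggedEvents component of the returned object (imp$loggedEvents) may also show whether any variables were dropped.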

Here is what I did:

library(mice)

# Rebuild the example data frame from the question
dfn <- read.table(text="a b c  d
 0 1 0  1
 1 0 0  0
 0 0 0  0
NA 0 0  0
 0 0 0 NA", header=TRUE)

# Mean imputation with a diagonal predictor matrix
imp <- mice(dfn, method="mean", predictorMatrix=diag(ncol(dfn)))
complete(imp)

#      a b c    d
# 1 0.00 1 0 1.00
# 2 1.00 0 0 0.00
# 3 0.00 0 0 0.00
# 4 0.25 0 0 0.00
# 5 0.00 0 0 0.25

You can try this using your actual data set, but you should check the results carefully. For example, do:

sapply(dfn, function(x) mean(x,na.rm=TRUE))

The mean of each variable should be identical to the value imputed for it. Please let me know if this solves your problem.
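To make that comparison concrete, here is a minimal base-R sketch that fills each column with its own observed mean and prints the column means alongside. It is not how mice implements method="mean" internally; `impute_mean` is a hypothetical helper introduced only for this check.

```r
# Example data from the question
dfn <- data.frame(
  a = c(0, 1, 0, NA, 0),
  b = c(1, 0, 0, 0, 0),
  c = c(0, 0, 0, 0, 0),
  d = c(1, 0, 0, 0, NA)
)

# Hypothetical helper: replace NAs in a numeric vector
# with the mean of its observed values
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# Apply the helper column by column
manual <- as.data.frame(lapply(dfn, impute_mean))

# Observed means, the same quantity as the sapply() check above
col_means <- sapply(dfn, function(x) mean(x, na.rm = TRUE))

print(manual)
print(col_means)
```

Every filled-in cell in manual should equal the matching entry of col_means, and the same cells in complete(imp) can be compared against it by eye.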
