R MICE imputation failing
Problem description
I am baffled as to why my imputation is failing in version 2.22 of R's mice package. I am attempting a very simple operation on the following data frame:
> dfn
   a b c  d
1  0 1 0  1
2  1 0 0  0
3  0 0 0  0
4 NA 0 0  0
5  0 0 0 NA
I then use mice as follows to perform a simple mean imputation:
imp <- mice(dfn, method = "mean", m = 1, maxit = 1)
filled <- complete(imp)
However, my completed data looks like this:
> filled
     a b c  d
1 0.00 1 0  1
2 1.00 0 0  0
3 0.00 0 0  0
4 0.25 0 0  0
5 0.00 0 0 NA
Why am I still getting this trailing NA? This is the simplest failing example I could construct; my real data set is much larger, and I am just trying to get a sense of where things are going wrong. Any help would be greatly appreciated!
Recommended answer
I'm not sure how accurate this is, but here is an attempt. Even though method="mean" is supposed to impute the unconditional mean, it appears from the documentation that the predictorMatrix is not changed accordingly.
Normally, leftover NAs occur because the predictors suffer from multicollinearity or because there are too few cases per variable, so that the imputation model cannot be estimated. However, method="mean" shouldn't behave that way.
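To see whether the example data shows the kind of degeneracy described above, a quick base-R check can help. This is only a diagnostic sketch: it looks for constant columns and for pairs of columns that are perfectly correlated on their jointly observed rows. It does not reproduce mice's internal checks, and the variable names (is_constant, varying, cors, perfect) are made up for illustration.

```r
# Example data from the question
dfn <- read.table(text = "a b c d
0 1 0 1
1 0 0 0
0 0 0 0
NA 0 0 0
0 0 0 NA", header = TRUE)

# Columns with at most one distinct observed value (no variance)
is_constant <- sapply(dfn, function(x) length(unique(na.omit(x))) <= 1)
names(dfn)[is_constant]

# Pairwise correlations among the non-constant columns, using only the
# rows where both columns are observed
varying <- dfn[, !is_constant, drop = FALSE]
cors <- cor(varying, use = "pairwise.complete.obs")

# Pairs that are perfectly correlated on their jointly observed rows
perfect <- which(abs(cors) > 1 - 1e-8 & upper.tri(cors), arr.ind = TRUE)
data.frame(var1 = rownames(cors)[perfect[, "row"]],
           var2 = colnames(cors)[perfect[, "col"]])
```

On the example data this flags c as constant and b and d as perfectly correlated on the rows they share. Whether that is what leaves the NA in d is an assumption, not something confirmed here.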
Here is what I did:
library(mice)
dfn <- read.table(text="a b c d
0 1 0 1
1 0 0 0
0 0 0 0
NA 0 0 0
0 0 0 NA", header=TRUE)
imp <- mice( dfn, method="mean", predictorMatrix=diag(ncol(dfn)) )
complete(imp)
# 1 0.00 1 0 1.00
# 2 1.00 0 0 0.00
# 3 0.00 0 0 0.00
# 4 0.25 0 0 0.00
# 5 0.00 0 0 0.25
You can try this on your actual data set, but you should check the results carefully. For example, run:
sapply(dfn, function(x) mean(x,na.rm=TRUE))
The mean of each variable should be identical to the value imputed for it. Please let me know if this solves your problem.
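The check can also be done without mice. The sketch below fills each NA with its column mean in base R, giving a reference result to compare against. impute_mean and manual are hypothetical names introduced here for illustration, not part of mice.

```r
# Example data from the question
dfn <- read.table(text = "a b c d
0 1 0 1
1 0 0 0
0 0 0 0
NA 0 0 0
0 0 0 NA", header = TRUE)

# Observed mean of each column, ignoring NA
col_means <- sapply(dfn, function(x) mean(x, na.rm = TRUE))

# Hypothetical helper: replace every NA in a vector with that vector's mean
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}
manual <- as.data.frame(lapply(dfn, impute_mean))

# Mean imputation leaves no NA and does not change the column means
stopifnot(!anyNA(manual))
stopifnot(isTRUE(all.equal(colMeans(manual), col_means)))

manual
```

If the result of complete(imp) differs from manual, for example by still containing an NA, the imputation did not behave as an unconditional mean imputation for that column.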