Replacing NAs in each column of a matrix with the median of that column
Problem description
I am trying to replace the NAs in each column of a matrix with the median of that column. When I try to use lapply or sapply I get an error, but the code works when I use a for loop or change one column at a time. What am I doing wrong?
Example:
set.seed(1928)
mat <- matrix(rnorm(100*110), ncol = 110)
mat[sample(1:length(mat), 700, replace = FALSE)] <- NA
mat1 <- mat2 <- mat

# Attempt with lapply (this is the version that fails)
mat1 <- lapply(mat1,
  function(n) {
    mat1[is.na(mat1[,n]),n] <- median(mat1[,n], na.rm = TRUE)
  }
)

# for-loop version (this one works)
for (n in 1:ncol(mat2)) {
  mat2[is.na(mat2[,n]),n] <- median(mat2[,n], na.rm = TRUE)
}
Recommended answer
I would suggest vectorizing this with the matrixStats package instead of calculating a median per column using either of the loops (sapply is also a loop, in the sense that it evaluates a function in each iteration).
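As an aside on the loop versions: lapply() applied directly to a matrix visits its individual elements rather than its columns, and an assignment inside the function only changes a local copy. A minimal sketch of a column-wise variant, assuming a small made-up matrix m (not the data from the question), might look like this:

```r
# Small illustrative matrix (hypothetical data, not from the question)
m <- matrix(c(1, NA, 3,
              4, 5, NA), ncol = 2)

# lapply() on a matrix visits every element, so this list has
# one entry per cell rather than one entry per column
n_visited <- length(lapply(m, identity))

# Iterate over column indices instead and return each repaired column;
# sapply() binds the returned columns back into a matrix
filled <- sapply(seq_len(ncol(m)), function(n) {
  col <- m[, n]
  col[is.na(col)] <- median(col, na.rm = TRUE)
  col
})
filled
```

The function returns the whole repaired column, so the result of sapply() can be used directly as the new matrix.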
First, we will create an index of the NAs:
indx <- which(is.na(mat), arr.ind = TRUE)
Then, replace the NAs using the precalculated column medians, according to the index:
mat[indx] <- matrixStats::colMedians(mat, na.rm = TRUE)[indx[, 2]]
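If installing matrixStats is not an option, the same index-based replacement can be sketched in base R, with apply(x, 2, median, na.rm = TRUE) standing in for colMedians. The matrix x below is a small made-up example, and the final line compares the result against the for-loop approach from the question.

```r
# Small illustrative matrix (hypothetical data)
x <- matrix(c(1, NA, 3, 7,
              NA, 2, 4, 6), ncol = 2)

# Row/column positions of every NA
idx <- which(is.na(x), arr.ind = TRUE)

# Column medians in base R (substituted here for matrixStats::colMedians)
med <- apply(x, 2, median, na.rm = TRUE)

# Fill each NA with the median of its own column:
# idx[, 2] holds the column number of each NA
x_filled <- x
x_filled[idx] <- med[idx[, 2]]

# Cross-check against the for-loop version
x_loop <- x
for (n in seq_len(ncol(x_loop))) {
  x_loop[is.na(x_loop[, n]), n] <- median(x_loop[, n], na.rm = TRUE)
}
all.equal(x_filled, x_loop)
```

Indexing a matrix with a two-column row/column matrix, as idx is used here, addresses each NA cell individually, which is what lets the replacement happen in a single assignment.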