Replace NA rows with a non-NA value from the previous row and a given column


Question

I have a matrix in which some rows are NA in every column. I want to replace each of these all-NA rows with the non-NA value found in the K-th column of the previous row.

For example, this matrix:

      [,1] [,2]
 [1,]   NA   NA
 [2,]   NA   NA
 [3,]    1    2
 [4,]    2    3
 [5,]   NA   NA
 [6,]   NA   NA
 [7,]   NA   NA
 [8,]    6    7
 [9,]    7    8
[10,]    8    9

must be transformed into the following matrix, where column 2 is used for the replacement (the leading NA rows have no previous non-NA row, so they stay NA):

      [,1] [,2]
 [1,]   NA   NA
 [2,]   NA   NA
 [3,]    1    2
 [4,]    2    3
 [5,]    3    3
 [6,]    3    3
 [7,]    3    3
 [8,]    6    7
 [9,]    7    8
[10,]    8    9

I wrote a function for this, but it uses a loop:

# Replaces each all-NA row with the value from column k of the previous row,
# provided the previous row contains no NAs.
na.replace <- function(x, k) {
    cols <- ncol(x)
    # seq_len(...)[-1] is empty for a one-row matrix, unlike 2:nrow(x)
    for (i in seq_len(nrow(x))[-1]) {
        if (sum(is.na(x[i - 1, ])) == 0 && sum(is.na(x[i, ])) == cols) {
            x[i, ] <- x[i - 1, k]
        }
    }
    x
}

This function seems to work correctly, but I would like to avoid the loop. Can anyone advise how to do this replacement without loops?

Update

agstudy suggested a vectorized, loop-free solution:

na.replace <- function(mat, k){
  idx       <-  which(rowSums(is.na(mat)) == ncol(mat))
  mat[idx,] <- mat[ifelse(idx > 1, idx-1, 1), k]
  mat
}

But this solution returns different, incorrect results compared with my loop-based solution. Why does this happen? In theory, the loop and non-loop solutions should be identical.
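One likely explanation is that the two versions are not equivalent when several NA rows follow each other. The loop fills row 5 first, so row 6 then sees an already-filled previous row, whereas the vectorized version reads every "previous row" from the original, unmodified matrix. The sketch below compares the two on the example matrix; the names `na.replace.loop`, `na.replace.vec`, `res.loop` and `res.vec` are introduced here only for the comparison.

```r
# Example matrix from the question
m <- cbind(c(NA, NA, 1, 2, NA, NA, NA, 6, 7, 8),
           c(NA, NA, 2, 3, NA, NA, NA, 7, 8, 9))

# Loop version: each iteration sees the rows already filled by earlier ones
na.replace.loop <- function(x, k) {
    cols <- ncol(x)
    for (i in seq_len(nrow(x))[-1]) {
        if (sum(is.na(x[i - 1, ])) == 0 && sum(is.na(x[i, ])) == cols) {
            x[i, ] <- x[i - 1, k]
        }
    }
    x
}

# Vectorized version: the right-hand side is evaluated once, on the
# original matrix, so rows 6 and 7 look up rows 5 and 6, which are still NA
na.replace.vec <- function(mat, k) {
    idx <- which(rowSums(is.na(mat)) == ncol(mat))
    mat[idx, ] <- mat[ifelse(idx > 1, idx - 1, 1), k]
    mat
}

res.loop <- na.replace.loop(m, 2)
res.vec  <- na.replace.vec(m, 2)

res.loop[5:7, ]  # every entry is 3
res.vec[5:7, ]   # row 5 is filled with 3, rows 6 and 7 are still NA
```

A vectorized solution therefore has to carry the last non-NA value forward across the whole run of NA rows, not just look one row back.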

Recommended answer

In the end I implemented my own vectorized version, using na.locf from the zoo package. It returns the expected output:

library(zoo)

na.replace <- function(x, k) {
    isNA <- is.na(x[, k])
    # carry the last non-NA value of column k forward, then fill the NA rows with it
    x[isNA, ] <- na.locf(x[, k], na.rm = FALSE)[isNA]
    x
}

Update

A better solution that does not need any packages:

na.lomf <- function(x) {
    if (length(x) > 0L) {
        non.na.idx <- which(!is.na(x))
        if (is.na(x[1L])) {
            non.na.idx <- c(1L, non.na.idx)
        }
        rep.int(x[non.na.idx], diff(c(non.na.idx, length(x) + 1L)))
    }
}

na.lomf(c(NA, 1, 2, NA, NA, 3, NA, NA, 4, NA))
# [1] NA  1  2  2  2  3  3  3  4  4
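The answer does not show how na.lomf plugs into the matrix replacement, so here is a minimal sketch of one way to do it, assuming the same row-selection logic as the zoo-based version above. The name `na.replace.base` is hypothetical and used only for this illustration.

```r
# na.lomf as defined in the answer: last observation carried forward
na.lomf <- function(x) {
    if (length(x) > 0L) {
        non.na.idx <- which(!is.na(x))
        if (is.na(x[1L])) {
            non.na.idx <- c(1L, non.na.idx)
        }
        rep.int(x[non.na.idx], diff(c(non.na.idx, length(x) + 1L)))
    }
}

# Hypothetical wrapper: fill every row whose k-th entry is NA with the
# carried-forward value of column k
na.replace.base <- function(x, k) {
    isNA <- is.na(x[, k])
    x[isNA, ] <- na.lomf(x[, k])[isNA]
    x
}

m <- cbind(c(NA, NA, 1, 2, NA, NA, NA, 6, 7, 8),
           c(NA, NA, 2, 3, NA, NA, NA, 7, 8, 9))
res <- na.replace.base(m, 2)
res[5:7, ]  # every entry is 3
```

Note that this selects rows by an NA in column k alone, which matches the all-NA rows in the example but would also overwrite a row that is only partially NA.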
