如果最后一个和下一个非NA值相同,则替换NA值 [英] Replace NA values if last and next non-NA value are the same

查看:74
本文介绍了如果最后一个和下一个非NA值相同,则替换NA值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我试图根据上一个和最后一个NA值是否相同来填充丢失的数据.例如,这是虚拟数据集:

I am trying to fill missing data based on whether the previous and last NA value are the same. For example, this is the dummy dataset:

df <- data.frame(ID = c(rep(1, 6), rep(2, 6), rep(3, 6), rep(4, 6), rep(5, 6), rep(6, 6), 
                    rep(7, 6), rep(8, 6), rep(9, 6), rep(10, 6)), 
             with_missing = c("a", "a", NA, NA, "a", "a", 
                              "a", "a", NA, "b", "b", "b", 
                              "a", NA, NA, NA, "c", "c", 
                              "b", NA, "a", "a", "a", "a", 
                              "a", NA, NA, NA, NA, "a", 
                              "a", "a", NA, "b", "a", "a", 
                              "a", "a", NA, NA, "a", "a", 
                              "a", "a", NA, "b", "b", "b", 
                              "a", NA, NA, NA, "c", "c", 
                              "b", NA, "a", "a", "a", "a"),
             desired_result = c("a", "a", "a", "a", "a", "a", 
                                "a", "a", NA, "b", "b", "b", 
                                "a", NA, NA, NA, "c", "c", 
                                "b", NA, "a", "a", "a", "a", 
                                "a", "a", "a", "a", "a", "a", 
                                "a", "b", "b", "b", "a", "a", 
                                "a", "a", "a", "a", "a", "a", 
                                "a", "a", NA, "b", "b", "b", 
                                "a", NA, NA, NA, "c", "c", 
                                "b", NA, "a", "a", "a", "a")) 

因此,例如,如果有四行的间隙,但是间隙之前和之后的值相同,那么我希望用那些相同的值来填充间隙;相反,如果NA之前和之后的值不同,则我不想填写它.另外,我需要按ID变量对数据进行分组.

So if there is a gap of four rows, for example, but the value before and after the gap are the same, then I want the gap to be filled with those same values; whereas if the values before and after the NA are different, I don't want to fill it. In addition, I need to group the data by the ID variable.

我已经尝试过na.locf,但是在如果它们在NA之前和之后是否相同"的条件下,我不知道如何添加.

I've tried na.locf but I can't work out how to add in the condition of "if they're the same before and after NA".

谢谢.

推荐答案

您可以前后填充,然后将不匹配的行设置为NA.

You can fill forwards and backwards, then set the rows where they don't match to NA.

library(zoo)
library(dplyr)

df %>% 
  mutate_if(is.factor, as.character) %>% 
  group_by(ID) %>%
  mutate(result = na.locf(with_missing, fromLast = T),
         result = ifelse(result == na.locf(with_missing), result, NA))

#    ID with_missing desired_result result
# 1   1            a              a      a
# 2   1            a              a      a
# 3   1         <NA>              a      a
# 4   1         <NA>              a      a
# 5   1            a              a      a
# 6   1            a              a      a
# 7   2            a              a      a
# 8   2            a              a      a
# 9   2         <NA>           <NA>   <NA>
# 10  2            b              b      b
# 11  2            b              b      b
# 12  2            b              b      b
# 13  3            a              a      a
# 14  3         <NA>           <NA>   <NA>
# 15  3         <NA>           <NA>   <NA>
# 16  3         <NA>           <NA>   <NA>
# 17  3            c              c      c
# 18  3            c              c      c
# 19  4            b              b      b
# 20  4         <NA>           <NA>   <NA>
# 21  4            a              a      a
# 22  4            a              a      a
# 23  4            a              a      a
# 24  4            a              a      a
# 25  5            a              a      a
# 26  5         <NA>              a      a
# 27  5         <NA>              a      a
# 28  5         <NA>              a      a
# 29  5         <NA>              a      a
# 30  5            a              a      a
# 31  6            a              a      a
# 32  6            a              b      a
# 33  6         <NA>              b   <NA>
# 34  6            b              b      b
# 35  6            a              a      a
# 36  6            a              a      a
# 37  7            a              a      a
# 38  7            a              a      a
# 39  7         <NA>              a      a
# 40  7         <NA>              a      a
# 41  7            a              a      a
# 42  7            a              a      a
# 43  8            a              a      a
# 44  8            a              a      a
# 45  8         <NA>           <NA>   <NA>
# 46  8            b              b      b
# 47  8            b              b      b
# 48  8            b              b      b
# 49  9            a              a      a
# 50  9         <NA>           <NA>   <NA>
# 51  9         <NA>           <NA>   <NA>
# 52  9         <NA>           <NA>   <NA>
# 53  9            c              c      c
# 54  9            c              c      c
# 55 10            b              b      b
# 56 10         <NA>           <NA>   <NA>
# 57 10            a              a      a
# 58 10            a              a      a
# 59 10            a              a      a
# 60 10            a              a      a

这篇关于如果最后一个和下一个非NA值相同,则替换NA值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆