如果最后一个和下一个非NA值相同,则替换NA值 [英] Replace NA values if last and next non-NA value are the same
问题描述
我试图根据上一个和最后一个NA值是否相同来填充丢失的数据.例如,这是虚拟数据集:
I am trying to fill missing data based on whether the previous and last NA value are the same. For example, this is the dummy dataset:
df <- data.frame(ID = c(rep(1, 6), rep(2, 6), rep(3, 6), rep(4, 6), rep(5, 6), rep(6, 6),
rep(7, 6), rep(8, 6), rep(9, 6), rep(10, 6)),
with_missing = c("a", "a", NA, NA, "a", "a",
"a", "a", NA, "b", "b", "b",
"a", NA, NA, NA, "c", "c",
"b", NA, "a", "a", "a", "a",
"a", NA, NA, NA, NA, "a",
"a", "a", NA, "b", "a", "a",
"a", "a", NA, NA, "a", "a",
"a", "a", NA, "b", "b", "b",
"a", NA, NA, NA, "c", "c",
"b", NA, "a", "a", "a", "a"),
desired_result = c("a", "a", "a", "a", "a", "a",
"a", "a", NA, "b", "b", "b",
"a", NA, NA, NA, "c", "c",
"b", NA, "a", "a", "a", "a",
"a", "a", "a", "a", "a", "a",
"a", "b", "b", "b", "a", "a",
"a", "a", "a", "a", "a", "a",
"a", "a", NA, "b", "b", "b",
"a", NA, NA, NA, "c", "c",
"b", NA, "a", "a", "a", "a"))
因此,例如,如果有四行的间隙,但是间隙之前和之后的值相同,那么我希望用那些相同的值来填充间隙;相反,如果NA之前和之后的值不同,则我不想填写它.另外,我需要按ID变量对数据进行分组.
So if there is a gap of four rows, for example, but the value before and after the gap are the same, then I want the gap to be filled with those same values; whereas if the values before and after the NA are different, I don't want to fill it. In addition, I need to group the data by the ID variable.
我已经尝试过na.locf,但是在如果它们在NA之前和之后是否相同"的条件下,我不知道如何添加.
I've tried na.locf but I can't work out how to add in the condition of "if they're the same before and after NA".
谢谢.
推荐答案
您可以前后填充,然后将不匹配的行设置为NA
.
You can fill forwards and backwards, then set the rows where they don't match to NA
.
library(zoo)
library(dplyr)
df %>%
mutate_if(is.factor, as.character) %>%
group_by(ID) %>%
mutate(result = na.locf(with_missing, fromLast = T),
result = ifelse(result == na.locf(with_missing), result, NA))
# ID with_missing desired_result result
# 1 1 a a a
# 2 1 a a a
# 3 1 <NA> a a
# 4 1 <NA> a a
# 5 1 a a a
# 6 1 a a a
# 7 2 a a a
# 8 2 a a a
# 9 2 <NA> <NA> <NA>
# 10 2 b b b
# 11 2 b b b
# 12 2 b b b
# 13 3 a a a
# 14 3 <NA> <NA> <NA>
# 15 3 <NA> <NA> <NA>
# 16 3 <NA> <NA> <NA>
# 17 3 c c c
# 18 3 c c c
# 19 4 b b b
# 20 4 <NA> <NA> <NA>
# 21 4 a a a
# 22 4 a a a
# 23 4 a a a
# 24 4 a a a
# 25 5 a a a
# 26 5 <NA> a a
# 27 5 <NA> a a
# 28 5 <NA> a a
# 29 5 <NA> a a
# 30 5 a a a
# 31 6 a a a
# 32 6 a b a
# 33 6 <NA> b <NA>
# 34 6 b b b
# 35 6 a a a
# 36 6 a a a
# 37 7 a a a
# 38 7 a a a
# 39 7 <NA> a a
# 40 7 <NA> a a
# 41 7 a a a
# 42 7 a a a
# 43 8 a a a
# 44 8 a a a
# 45 8 <NA> <NA> <NA>
# 46 8 b b b
# 47 8 b b b
# 48 8 b b b
# 49 9 a a a
# 50 9 <NA> <NA> <NA>
# 51 9 <NA> <NA> <NA>
# 52 9 <NA> <NA> <NA>
# 53 9 c c c
# 54 9 c c c
# 55 10 b b b
# 56 10 <NA> <NA> <NA>
# 57 10 a a a
# 58 10 a a a
# 59 10 a a a
# 60 10 a a a
这篇关于如果最后一个和下一个非NA值相同,则替换NA值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!