Iterate over a column ignoring but retaining NA values in R

Problem description

I have a time series data frame in R with a column, V1, that consists of integers with a few NAs interspersed throughout. For each row, I want to subtract the previous value of V1 from the current one. However, I want to skip over the NA values and use the last non-NA value in the subtraction. If the current value of V1 is itself NA, the difference should be NA. See the example below:

V1 <- c(1, 3, 4, NA, NA, 6, 9, NA, 10)
time <- seq_along(V1)
dat <- data.frame(time = time,
                  V1 = V1)
lag_diff <- c(NA, 2, 1, NA, NA, 2, 3, NA, 1) # The result I want
diff(dat$V1) # Not the result I want: 2 1 NA NA NA 3 NA NA

I'd prefer not to do this with loops because I have hundreds of data frames, each with more than 10,000 rows.

My first thought was to filter out the NA rows, compute the differences on what remains, and then reinsert the rows that were filtered out, but I can't think of a way to do that. It doesn't seem very "tidy" to do it that way either, and I'm not sure it would be faster than looping. Any help is appreciated; bonus points if the solution uses tidyverse functions.

Recommended answer

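# Compute diff() on the non-NA values only and write the result back into
# those same rows; rows where V1 is NA are never assigned and stay NA.
dat[!is.na(dat$V1), 'lag_diff'] <- c(NA, diff(dat[!is.na(dat$V1), 'V1']))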
#   time V1 lag_diff
# 1    1  1       NA
# 2    2  3        2
# 3    3  4        1
# 4    4 NA       NA
# 5    5 NA       NA
# 6    6  6        2
# 7    7  9        3
# 8    8 NA       NA
# 9    9 10        1
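
Since the question mentions hundreds of data frames, the same indexing idea can be wrapped in a small function and applied to each one. The following is only a sketch: the function name lag_diff_skip_na and the list dat_list are made up for illustration, and it assumes the data frames are already collected in a list and each has a V1 column.

```r
# Hypothetical helper (the name is ours, not from the answer): difference from
# the last non-NA value, returning NA wherever the input itself is NA.
lag_diff_skip_na <- function(x) {
  out <- rep(NA_real_, length(x))
  keep <- !is.na(x)
  if (any(keep)) {
    # diff() over the non-NA values only, written back to their positions
    out[keep] <- c(NA, diff(x[keep]))
  }
  out
}

# Assumed setup: a named list of data frames, each with a V1 column
dat_list <- list(
  a = data.frame(time = 1:9, V1 = c(1, 3, 4, NA, NA, 6, 9, NA, 10)),
  b = data.frame(time = 1:4, V1 = c(NA, 5, NA, 8))
)

# Add the lag_diff column to every data frame in the list
dat_list <- lapply(dat_list, function(d) {
  d$lag_diff <- lag_diff_skip_na(d$V1)
  d
})
```

The work inside the helper is vectorized; lapply only iterates over the data frames, not over their rows.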

Or with data.table (same result):

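library(data.table)
setDT(dat) # convert dat to a data.table by reference

# Within the non-NA rows only, subtract the previous value (shift() lags by one);
# rows where V1 is NA are not touched and get NA in the new column.
dat[!is.na(V1), lag_diff := V1 - shift(V1)]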

#    time V1 lag_diff
# 1:    1  1       NA
# 2:    2  3        2
# 3:    3  4        1
# 4:    4 NA       NA
# 5:    5 NA       NA
# 6:    6  6        2
# 7:    7  9        3
# 8:    8 NA       NA
# 9:    9 10        1
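
The question also asked about a tidyverse version, which the answer does not provide. One possible sketch uses dplyr's mutate() together with base R's replace() and diff(), applying the same subsetting idea as above. The name dat_tidy is ours, and this is an untimed illustration rather than a benchmarked recommendation.

```r
library(dplyr)

dat <- data.frame(time = 1:9, V1 = c(1, 3, 4, NA, NA, 6, 9, NA, 10))

# replace() fills only the non-NA positions of V1 with the differences
# computed over the non-NA values; NA positions are left as NA.
dat_tidy <- dat %>%
  mutate(lag_diff = replace(V1, !is.na(V1), c(NA, diff(V1[!is.na(V1)]))))
```

Here dat is rebuilt as a plain data frame so the example does not depend on the earlier setDT() call.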
