Replace NA with average of the case before and after the NA

Question

Say I have the following data.frame:

t<-c(1,1,2,4,5,4)
u<-c(1,3,4,5,4,2)
v<-c(2,3,4,5,NA,2)
w<-c(NA,3,4,5,2,3)
x<-c(2,3,4,5,6,NA)

df<-data.frame(t,u,v,w,x)

I would like to replace each NA with the average of the cases before and after it, unless a row starts (row 4) or ends (row 5) with an NA. When the row begins with an NA, I would like to substitute the NA with the following case. When the row ends with an NA, I would like to substitute the NA with the previous case.

Thus, I would like my output to look like:

t<-c(1,1,2,4,5,4)
u<-c(1,3,4,5,4,2)
v<-c(2,3,4,5,3.5,2)
w<-c(3,3,4,5,2,3)
x<-c(2,3,4,5,6,6)

df<-data.frame(t,u,v,w,x)

Recommended answer

The question refers to row 4 starting with NA and row 5 ending in NA, but in fact it is column 4 of the input df that starts with an NA and column 5 that ends with an NA; neither row 4 nor row 5 of the input starts or ends with an NA. We will therefore assume that column was meant, not row. Also, the question contains two data frames both named df. Evidently one represents the input and the other the desired output, but for complete clarity the definition of the df used here is repeated in the Note at the end.

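na.approx in the zoo package pretty much does this. It fills NAs by linear interpolation, which for a single NA between two values is their average, and rule = 2 fills a leading or trailing NA with the nearest non-NA value. (If a matrix result is OK, omit the data.frame() part.)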

library(zoo)

data.frame(na.approx(df, rule = 2))

giving:

  t u   v w x
1 1 1 2.0 3 2
2 1 3 3.0 3 3
3 2 4 4.0 4 4
4 4 5 5.0 5 5
5 5 4 3.5 2 6
6 4 2 2.0 3 6
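
To make the behaviour above easier to see, here is a minimal base R sketch of the same idea that does not need zoo. It applies stats::approx column by column with rule = 2. The helper name fill_na is hypothetical, and the sketch assumes every column has at least two non-NA values, as the example data does.

```r
# Input data frame from the question
df <- data.frame(
  t = c(1, 1, 2, 4, 5, 4),
  u = c(1, 3, 4, 5, 4, 2),
  v = c(2, 3, 4, 5, NA, 2),
  w = c(NA, 3, 4, 5, 2, 3),
  x = c(2, 3, 4, 5, 6, NA)
)

# fill_na is a hypothetical helper, not part of zoo.
# It interpolates linearly over the positions of the non-NA values;
# rule = 2 makes approx() reuse the nearest observed value at either end.
fill_na <- function(x) {
  ok <- !is.na(x)
  approx(which(ok), x[ok], xout = seq_along(x), rule = 2)$y
}

# Apply the helper to each column and rebuild the data frame
filled <- as.data.frame(lapply(df, fill_na))
filled
```

On this input the v, w and x columns come out as in the table above: the interior NA in v becomes 3.5, and the leading and trailing NAs in w and x take the neighbouring values 3 and 6.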

Note: For clarity, we used this data frame as input above:

df <- structure(list(t = c(1, 1, 2, 4, 5, 4), u = c(1, 3, 4, 5, 4, 
2), v = c(2, 3, 4, 5, NA, 2), w = c(NA, 3, 4, 5, 2, 3), x = c(2, 
3, 4, 5, 6, NA)), .Names = c("t", "u", "v", "w", "x"), row.names = c(NA, 
-6L), class = "data.frame")
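
For comparison, the rule as worded in the question (average of the immediate neighbours, or the single available neighbour at either end of a column) can also be written out literally. This is only a sketch under that reading; the helper name fill_neighbors is hypothetical and is not something the answer uses.

```r
# Input data frame, same values as in the Note
df <- data.frame(
  t = c(1, 1, 2, 4, 5, 4),
  u = c(1, 3, 4, 5, 4, 2),
  v = c(2, 3, 4, 5, NA, 2),
  w = c(NA, 3, 4, 5, 2, 3),
  x = c(2, 3, 4, 5, 6, NA)
)

# fill_neighbors is a hypothetical helper.
# Each NA is replaced by the mean of the original values directly
# before and after it; a missing neighbour is dropped by na.rm = TRUE,
# so an NA at the start or end copies its only neighbour.
fill_neighbors <- function(x) {
  n <- length(x)
  out <- x
  for (i in which(is.na(x))) {
    prev <- if (i > 1) x[i - 1] else NA
    nxt <- if (i < n) x[i + 1] else NA
    out[i] <- mean(c(prev, nxt), na.rm = TRUE)
  }
  out
}

# Apply column by column
filled2 <- as.data.frame(lapply(df, fill_neighbors))
filled2
```

The two approaches agree on this data but differ when several NAs are adjacent: for c(1, NA, NA, 4), linear interpolation gives 2 and 3, while this literal version gives 1 and 4, and it returns NaN for an NA whose neighbours are both missing.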
