Fill missing values in the data.frame with the data from the same data frame


Problem description




I'm trying to fill the gaps in a fully outer-joined table with the nearest preceding value in each column.

The data frame I have looks like this (no row has NA on both sides, and the table is sorted by date):

              date     X         Y
2012-07-05 00:01:19   0.0122     NA
2012-07-05 03:19:34   0.0121     NA
2012-07-05 03:19:56   0.0121   0.027
2012-07-05 03:20:31   0.0121     NA
2012-07-05 04:19:56   0.0121   0.028
2012-07-05 04:20:31   0.0121     NA
2012-07-05 04:20:50   0.0121     NA
2012-07-05 04:22:29   0.0121   0.027
2012-07-05 04:24:37   0.0121     NA
2012-07-05 20:48:45   0.0121     NA
2012-07-05 23:02:34    NA      0.029
2012-07-05 23:30:45    NA      0.029

With this, I'm looking to:

  1. Leave rows with no missing data as they are.
  2. If either side is missing (NA), fill it with the value from the nearest preceding row that has a valid value on that side.

As a result, I would like the table to look like this:

              date     X         Y
2012-07-05 00:01:19   0.0122     NA
2012-07-05 03:19:34   0.0121     NA
2012-07-05 03:19:56   0.0121   0.027
2012-07-05 03:20:31   0.0121   0.027
2012-07-05 04:19:56   0.0121   0.028
2012-07-05 04:20:31   0.0121   0.028
2012-07-05 04:20:50   0.0121   0.028
2012-07-05 04:22:29   0.0121   0.027
2012-07-05 04:24:37   0.0121   0.027
2012-07-05 20:48:45   0.0121   0.027
2012-07-05 23:02:34   0.0121   0.029
2012-07-05 23:30:45   0.0121   0.029

What kind of R commands can I use to achieve this?

Solution

Use na.locf (last observation carried forward) from the zoo package:

dat <- read.table(text="2012-07-05 00:01:19   0.0122     NA
2012-07-05 03:19:34   0.0121     NA
2012-07-05 03:19:56   0.0121   0.027
2012-07-05 03:20:31   0.0121     NA
2012-07-05 04:19:56   0.0121   0.028
2012-07-05 04:20:31   0.0121     NA
2012-07-05 04:20:50   0.0121     NA
2012-07-05 04:22:29   0.0121   0.027
2012-07-05 04:24:37   0.0121     NA
2012-07-05 20:48:45   0.0121     NA
2012-07-05 23:02:34    NA      0.029
2012-07-05 23:30:45    NA      0.029")

library(zoo)
# na.rm = FALSE keeps the leading rows, whose NA has no earlier value to copy
na.locf(dat, na.rm = FALSE)
#           V1       V2     V3    V4
#1  2012-07-05 00:01:19 0.0122  <NA>
#2  2012-07-05 03:19:34 0.0121  <NA>
#3  2012-07-05 03:19:56 0.0121 0.027
#4  2012-07-05 03:20:31 0.0121 0.027
#5  2012-07-05 04:19:56 0.0121 0.028
#6  2012-07-05 04:20:31 0.0121 0.028
#7  2012-07-05 04:20:50 0.0121 0.028
#8  2012-07-05 04:22:29 0.0121 0.027
#9  2012-07-05 04:24:37 0.0121 0.027
#10 2012-07-05 20:48:45 0.0121 0.027
#11 2012-07-05 23:02:34 0.0121 0.029
#12 2012-07-05 23:30:45 0.0121 0.029
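
A variant of the same idea, as a minimal sketch rather than a definitive implementation: it names the columns, builds one date-time column, and forward-fills each value column separately with na.locf. The column names `day`, `time` and `date`, the `value_cols` variable, the shortened five-row sample, and the UTC time zone are all assumptions made for illustration; they are not part of the original answer.

```r
library(zoo)

# Column names are supplied here; the date and time arrive as two fields
# because read.table splits on whitespace. (Names are illustrative.)
dat <- read.table(text = "
2012-07-05 00:01:19 0.0122 NA
2012-07-05 03:19:34 0.0121 NA
2012-07-05 03:19:56 0.0121 0.027
2012-07-05 03:20:31 0.0121 NA
2012-07-05 23:02:34 NA     0.029
", col.names = c("day", "time", "X", "Y"), stringsAsFactors = FALSE)

# Combine the two fields into a single date-time column (UTC is an assumption).
dat$date <- as.POSIXct(paste(dat$day, dat$time), tz = "UTC")

# Forward-fill each value column on its own. na.rm = FALSE leaves leading NAs
# in place instead of dropping them, so the number of rows is unchanged.
value_cols <- c("X", "Y")
dat[value_cols] <- lapply(dat[value_cols], na.locf, na.rm = FALSE)

dat[c("date", "X", "Y")]
```

Filling column by column keeps X and Y numeric and leaves the first rows of Y as NA, since no earlier value exists to carry forward. How na.locf treats rows that still contain NA when it is called on a whole data frame is worth checking against the installed zoo version.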
