How to replace NaN value with previous non-NaN within group
Question
I need to replace the NaN values with the previous non-NaN value within each group.
Here is an example:
+-------+------------+-------+
| ts_id | date | value |
+-------+------------+-------+
| 2 | 01/10/2014 | 18 |
| 2 | 01/11/2014 | 15 |
| 2 | 01/12/2014 | NaN |
| 2 | 01/01/2015 | NaN |
| 2 | 01/02/2015 | NaN |
| 3 | 01/03/2015 | 19 |
| 3 | 01/04/2015 | 20 |
| 3 | 01/10/2015 | 12 |
| 3 | 01/11/2015 | 17 |
| 3 | 01/12/2015 | NaN |
| 3 | 01/01/2016 | NaN |
| 3 | 01/08/2016 | 7 |
| 3 | 01/09/2016 | NaN |
| 3 | 01/10/2016 | NaN |
| 3 | 01/11/2016 | NaN |
| 3 | 01/12/2016 | NaN |
| 3 | 01/01/2017 | NaN |
+-------+------------+-------+
Data:
data <- structure(list(ts_id = c(2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3), date = structure(c(16344, 16375, 16405, 16436,
16467, 16495, 16526, 16709, 16740, 16770, 16801, 17014, 17045,
17075, 17106, 17136, 17167), class = "Date"), value = c(18, 15,
NaN, NaN, NaN, 19, 20, 12, 17, NaN, NaN, 7, NaN, NaN, NaN, NaN,
NaN)), row.names = c(NA, -17L), vars = "ts_id", drop = TRUE, indices = list(
0:16), group_sizes = 17L, biggest_group_size = 17L, labels = structure(list(
ts_id = 3L), row.names = c(NA, -1L), class = "data.frame", vars = "ts_id", drop = TRUE), class = "data.frame")
Within each group (identified by ts_id), NaN values can occur at any date. I need to replace each NaN with the most recent non-NaN value from the same group.
The result should look like this:
+-------+------------+-------+
| ts_id | date | value |
+-------+------------+-------+
| 2 | 01/10/2014 | 18 |
| 2 | 01/11/2014 | 15 |
| 2 | 01/12/2014 | 15 |
| 2 | 01/01/2015 | 15 |
| 2 | 01/02/2015 | 15 |
| 3 | 01/03/2015 | 19 |
| 3 | 01/04/2015 | 20 |
| 3 | 01/10/2015 | 12 |
| 3 | 01/11/2015 | 17 |
| 3 | 01/12/2015 | 17 |
| 3 | 01/01/2016 | 17 |
| 3 | 01/08/2016 | 7 |
| 3 | 01/09/2016 | 7 |
| 3 | 01/10/2016 | 7 |
| 3 | 01/11/2016 | 7 |
| 3 | 01/12/2016 | 7 |
| 3 | 01/01/2017 | 7 |
+-------+------------+-------+
Thanks in advance.
Recommended answer
You can use na.locf() from the zoo package inside a grouped dplyr mutate():
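library(dplyr)
library(zoo) # for the na.locf function
data %>%
  group_by(ts_id) %>% # fill within each ts_id group only
  # carry the last non-missing value forward; na.rm = FALSE keeps any
  # leading missing values so the column length stays the same
  mutate(value = na.locf(value, na.rm = FALSE)) %>%
  head()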
# # A tibble: 6 x 3
# # Groups: ts_id [2]
# ts_id date value
# <dbl> <date> <dbl>
# 1 2 2014-10-01 18
# 2 2 2014-11-01 15
# 3 2 2014-12-01 15
# 4 2 2015-01-01 15
# 5 2 2015-02-01 15
# 6 3 2015-03-01 19
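If you would rather not depend on zoo, the same "last observation carried forward" idea can be sketched in base R. This is only an illustration, not part of the original answer: the helper name locf and the small data frame df are made up for the example. It relies on is.na() being TRUE for NaN as well as NA, and on ave() applying a function separately within each group.

```r
# Hypothetical helper (not from zoo): carry the last non-missing value forward.
locf <- function(x) {
  ok  <- !is.na(x)      # is.na() is TRUE for both NA and NaN
  idx <- cumsum(ok)     # position of the most recent non-missing value, 0 if none yet
  c(NA, x[ok])[idx + 1] # index 0 maps to the leading NA placeholder
}

# Made-up sample data: group 3 starts with a NaN to show the edge case.
df <- data.frame(
  ts_id = c(2, 2, 2, 3, 3, 3),
  value = c(18, NaN, NaN, NaN, 7, NaN)
)

# ave() applies locf() within each ts_id group and keeps the row order.
df$value <- ave(df$value, df$ts_id, FUN = locf)
print(df)
```

Note that a missing value at the start of a group has nothing to carry forward, so it stays missing (this sketch returns it as NA rather than NaN), and the value 18 from group 2 is not carried into group 3.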