na.locf fill NAs up to maxgap even if gap > maxgap, with groups

Views: 115 Posted: 2017/3/12 11:42:36 Tags: r data.table na


Question

I've seen a solution to this (Fill NA in a time series only to a limited number), but I can't get it to work for groups, and I suspect there is a neater way to do it.

Say I have the following dt:

library(data.table)

dt <- data.table(ID = c(rep("A", 10), rep("B", 10)), Price = c(seq(1, 10, 1), seq(11, 20, 1)))
dt[c(1:2, 5:10), 2] <- NA 
dt[c(11:13, 15:19) ,2] <- NA 
dt
    ID Price
 1:  A    NA
 2:  A    NA
 3:  A     3
 4:  A     4
 5:  A    NA
 6:  A    NA
 7:  A    NA
 8:  A    NA
 9:  A    NA
10:  A    NA
11:  B    NA
12:  B    NA
13:  B    NA
14:  B    14
15:  B    NA
16:  B    NA
17:  B    NA
18:  B    NA
19:  B    NA
20:  B    20

What I would like to do is fill NAs both forward and backward from the nearest non-NA value, but only up to a maximum of two rows in either direction.

I also need this to be done by group (ID).

I have tried na.locf/na.approx with maxgap = x etc., but these do not fill any NAs when the gap between non-NA values is greater than maxgap. I want to fill forward and back even when the gap between non-NA values is greater than maxgap, but only by two rows.
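To make the behaviour concrete, here is a minimal sketch using group A's prices as a plain vector. It assumes the zoo package is installed; `price` and `res` are names chosen only for this illustration.

```r
library(zoo)

# Group A's prices from the example: two leading NAs, then a run of six NAs.
price <- c(NA, NA, 3, 4, NA, NA, NA, NA, NA, NA)

# With maxgap = 2 the run of six NAs is longer than maxgap, so none of it
# is filled. The leading NAs have no earlier value to carry forward.
res <- na.locf(price, maxgap = 2, na.rm = FALSE)
res
# NA NA  3  4 NA NA NA NA NA NA
```

Rows 5 and 6 stay NA here, whereas the desired result fills them with 4.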

The end result should look something like this:

    ID Price Price_Fill
 1:  A    NA          3
 2:  A    NA          3
 3:  A     3          3
 4:  A     4          4
 5:  A    NA          4
 6:  A    NA          4
 7:  A    NA         NA
 8:  A    NA         NA
 9:  A    NA         NA
10:  A    NA         NA
11:  B    NA         NA
12:  B    NA         14
13:  B    NA         14
14:  B    14         14
15:  B    NA         14
16:  B    NA         14
17:  B    NA         NA
18:  B    NA         20
19:  B    NA         20
20:  B    20         20

In reality, my data set is massive, and I want to be able to fill NAs forward and back for up to 672 rows, but no more, by group.

Recommended answer

For the example shown, we group by 'ID' and take the shift of 'Price' with n = 0:2 and type = "lead", which creates three temporary columns, then take the pmax of these. We then shift that output with the same n and type = "lag" (the default), take the pmin, and assign the result as 'Price_Fill'.
dt[, Price_Fill := do.call(pmin, c(shift(do.call(pmax, c(shift(Price, n = 0:2, 
                  type = "lead"), na.rm=TRUE)), n= 0:2), na.rm = TRUE)) , by = ID]
dt
#    ID Price Price_Fill
#1:  A    NA          3
#2:  A    NA          3
#3:  A     3          3
#4:  A     4          4
#5:  A    NA          4
#6:  A    NA          4
#7:  A    NA         NA
#8:  A    NA         NA
#9:  A    NA         NA
#10: A    NA         NA
#11: B    NA         NA
#12: B    NA         14
#13: B    NA         14
#14: B    14         14
#15: B    NA         14
#16: B    NA         14
#17: B    NA         NA
#18: B    NA         20
#19: B    NA         20
#20: B    20         20
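The one-liner above can be unpacked into its two steps. The following is only an illustrative sketch on group A's prices as a plain vector; the intermediate names (`leads`, `back`, `lags`, `filled`) are not part of the original answer.

```r
library(data.table)

# Group A's prices from the example above.
price <- c(NA, NA, 3, 4, NA, NA, NA, NA, NA, NA)

# Step 1: the value itself plus the next two values (leads 0, 1, 2).
leads <- shift(price, n = 0:2, type = "lead")

# Element-wise maximum, ignoring NAs: pulls a value back by up to two rows.
back <- do.call(pmax, c(leads, na.rm = TRUE))
# back is 3 4 4 4 NA NA NA NA NA NA

# Step 2: lags 0, 1, 2 of that result, then the element-wise minimum.
# This pushes values forward by up to two rows and undoes the overshoot
# that pmax introduced in rows 2 and 3.
lags <- shift(back, n = 0:2, type = "lag")
filled <- do.call(pmin, c(lags, na.rm = TRUE))
# filled is 3 3 3 4 4 4 NA NA NA NA
```

The pmax/pmin pairing only recovers the right value here because the example prices increase with the row number.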

A more general approach is to apply pmin/pmax to the row index .I rather than to 'Price' itself, because real prices will generally not follow the row sequence as they do in the OP's example.
i1 <- dt[,  do.call(pmin, c(shift(do.call(pmax, c(shift(NA^(is.na(Price))* 
    .I, n = 0:2, type = "lead"), na.rm = TRUE)), n = 0:2), na.rm = TRUE)), ID]$V1

dt$Price_Fill <- dt$Price[i1]
dt$Price_Fill
#[1]  3  3  3  4  4  4 NA NA NA NA NA 14 14 14 14 14 NA 20 20 20
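The expression `NA^(is.na(Price)) * .I` relies on the fact that in R `NA^0` is 1 while `NA^1` is NA, so it keeps the row index where 'Price' is present and yields NA elsewhere. A small base R sketch, using `seq_along()` as a stand-in for `.I` and illustrative variable names:

```r
price <- c(NA, NA, 3, 4, NA)

# is.na() gives TRUE/FALSE, which act as 1/0 in the exponent:
# NA^1 is NA (value missing), NA^0 is 1 (value present).
mask <- NA^is.na(price)
# mask is NA NA 1 1 NA

# Multiplying by the positions keeps an index only where price is non-NA.
idx <- mask * seq_along(price)
# idx is NA NA 3 4 NA

# A more explicit way to write the same thing.
idx2 <- ifelse(is.na(price), NA, seq_along(price))
```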

That is, if we change 'Price' so that it no longer follows the row sequence, the index-based fill still picks up the correct values:
dt$Price[3] <- 10
dt$Price[14] <- 7
dt$Price_Fill <- dt$Price[i1]
dt$Price_Fill
#[1] 10 10 10  4  4  4 NA NA NA NA NA  7  7  7  7  7 NA 20 20 20
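For the 672-row limit mentioned in the question, `shift(..., n = 0:672)` would build 673 temporary vectors at each step, which may be heavy on a massive table. The sketch below is an alternative technique, not the answer's method: it locates the nearest non-NA position on each side with cummax/cummin and fills only when that position is within `k` rows. The helper name `fill_limited` is hypothetical, and preferring the earlier neighbour when both are in range is an assumption. By construction it never overwrites an existing non-NA value, which is worth checking for the shift-based version on groups whose first rows are non-NA (for example `c(1, 2, 3)`).

```r
# Hypothetical helper (not from the original answer): fill NAs from the
# nearest non-NA value, looking at most k positions back, then at most k ahead.
fill_limited <- function(x, k) {
  n   <- length(x)
  idx <- seq_len(n)
  ok  <- !is.na(x)

  # Position of the last non-NA at or before each element (0 if none).
  prev <- cummax(ifelse(ok, idx, 0L))
  # Position of the next non-NA at or after each element (n + 1 if none).
  nxt  <- rev(cummin(rev(ifelse(ok, idx, n + 1L))))

  use_prev <- prev > 0L & (idx - prev) <= k
  use_next <- nxt <= n & (nxt - idx) <= k

  # Assumption: when both neighbours are within k, the earlier one wins.
  src <- ifelse(use_prev, prev, ifelse(use_next, nxt, NA_integer_))
  x[src]
}

a <- c(NA, NA, 3, 4, NA, NA, NA, NA, NA, NA)
b <- c(NA, NA, NA, 14, NA, NA, NA, NA, NA, 20)
fill_limited(a, 2)  # 3 3 3 4 4 4 NA NA NA NA
fill_limited(b, 2)  # NA 14 14 14 14 14 NA 20 20 20
```

With data.table this would be applied per group as `dt[, Price_Fill := fill_limited(Price, 672L), by = ID]`.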
