Idiomatic way to copy cell values "down" in an R vector
Question
Possible Duplicate:
Populate NAs in a vector using prior non-NA values?
Is there an idiomatic way to copy cell values "down" in an R vector? By "copying down", I mean replacing NAs with the closest previous non-NA value.
While I can do this very simply with a for loop, it runs very slowly. Any advice on how to vectorise this would be appreciated.
# Test code
# Set up test data
len <- 1000000
data <- rep(c(1, rep(NA, 9)), len %/% 10) * rep(1:(len %/% 10), each=10)
head(data, n=25)
tail(data, n=25)
# Time naive method
system.time({
  data.clean <- data
  for (i in 2:length(data.clean)) {
    if (is.na(data.clean[i])) data.clean[i] <- data.clean[i - 1]
  }
})
# Print results
head(data.clean, n=25)
tail(data.clean, n=25)
Test run output:
> # Set up test data
> len <- 1000000
> data <- rep(c(1, rep(NA, 9)), len %/% 10) * rep(1:(len %/% 10), each=10)
> head(data, n=25)
[1] 1 NA NA NA NA NA NA NA NA NA 2 NA NA NA NA NA NA NA NA NA 3 NA NA NA NA
> tail(data, n=25)
[1] NA NA NA NA NA 99999 NA NA NA NA
[11] NA NA NA NA NA 100000 NA NA NA NA
[21] NA NA NA NA NA
>
> # Time naive method
> system.time({
+ data.clean <- data;
+ for (i in 2:length(data.clean)){
+ if(is.na(data.clean[i])) data.clean[i] <- data.clean[i-1]
+ }
+ })
user system elapsed
3.09 0.00 3.09
>
> # Print results
> head(data.clean, n=25)
[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3
> tail(data.clean, n=25)
[1] 99998 99998 99998 99998 99998 99999 99999 99999 99999 99999
[11] 99999 99999 99999 99999 99999 100000 100000 100000 100000 100000
[21] 100000 100000 100000 100000 100000
>
Recommended answer
Use zoo::na.locf
Wrapping your code in function f (including returning data.clean at the end):
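# f wraps the loop from the question and returns data.clean
f <- function(data) {
  data.clean <- data
  for (i in 2:length(data.clean)) {
    if (is.na(data.clean[i])) data.clean[i] <- data.clean[i - 1]
  }
  data.clean
}
library(rbenchmark)
library(zoo)
identical(f(data), na.locf(data))
## [1] TRUE
benchmark(f(data), na.locf(data), replications=10, columns=c("test", "elapsed", "relative"))
##            test elapsed relative
## 1       f(data)  21.460   14.471
## 2 na.locf(data)   1.483    1.000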
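If adding a dependency on zoo is not an option, the same fill-down can probably be vectorised in base R alone. The sketch below is not part of the original answer, and the helper name fill_down is made up for illustration. It records the position of every non-NA element, takes a running maximum with cummax so that each slot holds the position of the most recent non-NA value, and then indexes the vector with those positions.

```r
# Hypothetical helper (the name is an assumption, not from the original answer):
# carry the last non-NA value forward using only base R.
fill_down <- function(x) {
  # Position of each non-NA element, 0 where the element is NA
  idx <- seq_along(x) * !is.na(x)
  # Running maximum = position of the most recent non-NA element
  idx <- cummax(idx)
  # Leading NAs have no earlier value; index with NA so they stay NA
  idx[idx == 0] <- NA
  x[idx]
}

x <- c(NA, 1, NA, NA, 2, NA, 3)
fill_down(x)
# NA 1 1 1 2 2 3
```

Unlike na.locf with its default na.rm = TRUE, this sketch keeps leading NAs in place, so the result always has the same length as the input. That matches the behaviour of the loop in the question.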