Idiomatic way to copy cell values "down" in an R vector

Question

Possible Duplicate:
Populate NAs in a vector using prior non-NA values?

Is there an idiomatic way to copy cell values "down" in an R vector? By "copying down", I mean replacing NAs with the closest previous non-NA value.

While I can do this very simply with a for loop, it runs very slowly. Any advice on how to vectorise this would be appreciated.

# Test code
# Set up test data
len <- 1000000
data <- rep(c(1, rep(NA, 9)), len %/% 10) * rep(1:(len %/% 10), each=10)
head(data, n=25)
tail(data, n=25)

# Time naive method
system.time({
  data.clean <- data;
  for (i in 2:length(data.clean)){
    if(is.na(data.clean[i])) data.clean[i] <- data.clean[i-1]
  }
})

# Print results
head(data.clean, n=25)
tail(data.clean, n=25)

Test run output:

> # Set up test data
> len <- 1000000
> data <- rep(c(1, rep(NA, 9)), len %/% 10) * rep(1:(len %/% 10), each=10)
> head(data, n=25)
 [1]  1 NA NA NA NA NA NA NA NA NA  2 NA NA NA NA NA NA NA NA NA  3 NA NA NA NA
> tail(data, n=25)
 [1]     NA     NA     NA     NA     NA  99999     NA     NA     NA     NA
[11]     NA     NA     NA     NA     NA 100000     NA     NA     NA     NA
[21]     NA     NA     NA     NA     NA
> 
> # Time naive method
> system.time({
+   data.clean <- data;
+   for (i in 2:length(data.clean)){
+     if(is.na(data.clean[i])) data.clean[i] <- data.clean[i-1]
+   }
+ })
   user  system elapsed 
   3.09    0.00    3.09 
> 
> # Print results
> head(data.clean, n=25)
 [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3
> tail(data.clean, n=25)
 [1]  99998  99998  99998  99998  99998  99999  99999  99999  99999  99999
[11]  99999  99999  99999  99999  99999 100000 100000 100000 100000 100000
[21] 100000 100000 100000 100000 100000
> 

Recommended answer

Use zoo::na.locf.

Wrapping your code in function f (including returning data.clean at the end):

library(rbenchmark)
library(zoo)

identical(f(data), na.locf(data))
## [1] TRUE

benchmark(f(data), na.locf(data), replications=10, columns=c("test", "elapsed", "relative"))
##            test elapsed relative
## 1       f(data)  21.460   14.471
## 2 na.locf(data)   1.483    1.000
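
The definition of f is not shown above. A minimal sketch of what it might look like, assuming it wraps the question's loop unchanged, is given below. It is followed by a base-R alternative that needs no package. The name fill_down is hypothetical and the cumsum indexing trick is not part of the original answer; it is included only to illustrate how the fill can be vectorised.

```r
# f: the question's for loop wrapped in a function (assumed shape, since the
# answer describes it but does not show it). Returns the filled vector.
f <- function(data) {
  data.clean <- data
  # seq_along(...)[-1] starts at 2 and is empty for length-0/1 input
  for (i in seq_along(data.clean)[-1]) {
    if (is.na(data.clean[i])) data.clean[i] <- data.clean[i - 1]
  }
  data.clean
}

# fill_down: hypothetical base-R alternative, not from the original answer.
fill_down <- function(x) {
  keep <- !is.na(x)
  # For each position, count how many non-NA values have been seen so far.
  # That count is the index of the most recent non-NA value within x[keep].
  idx <- cumsum(keep)
  # Prepend NA so that positions before the first non-NA value (idx == 0)
  # map to NA instead of being dropped.
  c(NA, x[keep])[idx + 1]
}

x <- c(1, NA, NA, 2, NA)
identical(f(x), fill_down(x))
```

One difference to be aware of: with its default na.rm = TRUE, na.locf drops leading NAs, so its result can be shorter than the input when the vector starts with NA. Both functions in this sketch keep leading NAs in place. The question's test data starts with a non-NA value, so this does not affect the comparison above.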
