避免在R中使用两个for循环 [英] Avoid two for loops in R
问题描述
我有一个R代码,可以对两个函数进行卷积...
I have a R code that can do convolution of two functions...
convolveSlow <- function(x, y) {
nx <- length(x); ny <- length(y)
xy <- numeric(nx + ny - 1)
for(i in seq(length = nx)) {
xi <- x[[i]]
for(j in seq(length = ny)) {
ij <- i+j-1
xy[[ij]] <- xy[[ij]] + xi * y[[j]]
}
}
xy
}
是否可以删除两个for循环并使代码运行更快?
Is there a way to remove the two for loops and make the code run faster?
谢谢 San
推荐答案
由于R在计算向量运算方面非常快,因此在进行性能编程时要记住的最重要的事情是对尽可能多的向量进行向量化.
Since R is very fast at computing vector operations, the most important thing to keep in mind when programming for performance is to vectorise as many of your operations as possible.
这意味着要认真考虑用向量运算替换循环.这是我的快速卷积解决方案(对于每个长度为1000的输入向量,速度要快50倍):
This means thinking hard about replacing loops with vector operations. Here is my solution for fast convolution (50 times faster with input vectors of length 1000 each):
convolveFast <- function(x, y) {
nx <- length(x)
ny <- length(y)
xy <- nx + ny - 1
xy <- rep(0, xy)
for(i in (1:nx)){
j <- 1:ny
ij <- i + j - 1
xy[i+(1:ny)-1] <- xy[ij] + x[i] * y
}
xy
}
您会注意到内部循环(对于...中的j)已消失.相反,我将其替换为向量运算.现在将j定义为一个向量(j <-1:1:ny).还请注意,我引用的是整个向量y,而不是对其进行子集化(即,用y代替y [j]).
You will notice that the inner loop (for j in ...) has disappeared. Instead, I replaced it with a vector operation. j is now defined as a vector (j <- 1:ny). Notice also that I refer to the entire vector y, rather than subsetting it (i.e. y instead of y[j]).
j <- 1:ny
ij <- i + j - 1
xy[i+(1:ny)-1] <- xy[ij] + x[i] * y
我写了一个测量性能的小函数:
I wrote a small function to measure peformance:
measure.time <- function(fun1, fun2, ...){
ptm <- proc.time()
x1 <- fun1(...)
time1 <- proc.time() - ptm
ptm <- proc.time()
x2 <- fun2(...)
time2 <- proc.time() - ptm
ident <- all(x1==x2)
cat("Function 1\n")
cat(time1)
cat("\n\nFunction 2\n")
cat(time2)
if(ident) cat("\n\nFunctions return identical results")
}
对于两个长度为1000的向量,我的性能提高了98%:
For two vectors of length 1000 each, I get a 98% performance improvement:
x <- runif(1000)
y <- runif(1000)
measure.time(convolveSlow, convolveFast, x, y)
Function 1
7.07 0 7.59 NA NA
Function 2
0.14 0 0.16 NA NA
Functions return identical results
这篇关于避免在R中使用两个for循环的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!