Help speeding up a loop in R
Problem description
Basically, I want to perform diagonal averaging in R. Below is some code, adapted from the simsalabim package, that does the diagonal averaging, but it is slow.
Any suggestions for vectorizing this instead of using sapply?
reconSSA <- function(S, v, group = 1) {
  ### S : matrix
  ### v : vector
  N <- length(v)
  L <- nrow(S)
  K <- N - L + 1
  XX <- matrix(0, nrow = L, ncol = K)
  IND <- row(XX) + col(XX) - 1
  XX <- matrix(v[row(XX) + col(XX) - 1], nrow = L, ncol = K)
  XX <- S[, group] %*% t(t(XX) %*% S[, group])
  ## Diagonal averaging
  .intFun <- function(i, x, ind) mean(x[ind == i])
  RC <- sapply(1:N, .intFun, x = XX, ind = IND)
  return(RC)
}
For data you could use the following:
data(AirPassengers)
v <- AirPassengers
L <- 30
T <- length(v)
K <- T-L+1
x.b <- matrix(nrow=L,ncol=K)
x.b <- matrix(v[row(x.b)+col(x.b)-1],nrow=L,ncol=K)
S <- eigen(x.b %*% t(x.b))[["vectors"]]
out <- reconSSA(S, v, 1:10)
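For readers unfamiliar with the term, diagonal averaging here means averaging the entries of a matrix along each anti-diagonal, that is, over all cells (i, j) that share the same value of i + j. A minimal sketch on a toy 3 x 4 matrix follows; the matrix `X` and the name `diag_means` are illustrative assumptions and not part of the original code.

```r
# Toy illustration of anti-diagonal averaging (hypothetical 3 x 4 input).
X <- matrix(1:12, nrow = 3, ncol = 4)

# Anti-diagonal index of every cell: values run from 1 to nrow + ncol - 1.
IND <- row(X) + col(X) - 1

# Same approach as the sapply() call in reconSSA: one mean per index value.
diag_means <- sapply(seq_len(max(IND)), function(i) mean(X[IND == i]))
print(diag_means)
# 1 3 5 8 10 12
```

The comparison `IND == i` scans the whole matrix once per diagonal, which is why this approach gets slow as the series length grows.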
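Recommended answer
You can speed up the computation by almost 10 times with the help of a very specialized trick with rowsum. It sums the entries grouped by their anti-diagonal index in a single pass, and dividing by the number of entries on each diagonal then gives the means: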
reconSSA_1 <- function(S, v, group = 1) {
  ### S : matrix
  ### v : vector
  N <- length(v)
  L <- nrow(S)
  K <- N - L + 1
  XX <- matrix(0, nrow = L, ncol = K)
  IND <- row(XX) + col(XX) - 1
  XX <- matrix(v[row(XX) + col(XX) - 1], nrow = L, ncol = K)
  XX <- S[, group] %*% t(t(XX) %*% S[, group])
  ## Diagonal averaging
  SUMS <- rowsum.default(c(XX), c(IND))
  counts <- if (L <= K) c(1:L, rep(L, K - L - 1), L:1)
            else c(1:K, rep(K, L - K - 1), K:1)
  c(SUMS / counts)
}
all.equal(reconSSA(S, v, 1:10), reconSSA_1(S, v, 1:10))
[1] TRUE
library(rbenchmark)
benchmark(SSA = reconSSA(S, v, 1:10),
SSA_1 = reconSSA_1(S, v, 1:10),
columns = c( "test", "elapsed", "relative"),
order = "relative")
   test elapsed relative
2 SSA_1    0.23   1.0000
1   SSA    2.08   9.0435
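The rowsum idea can be isolated from the SSA-specific code. The sketch below is an assumption-laden illustration rather than the answer's exact method: the helper name `diag_average` is hypothetical, and the diagonal sizes are computed with `pmin()` instead of the if/else formula. One reason to consider this is that `rep()` with a negative count raises an error, which is what the if/else formula would hit when L equals K.

```r
# Sketch: anti-diagonal means of any numeric matrix X via rowsum().
# `diag_average` is a hypothetical helper name, not from the original answer.
diag_average <- function(X) {
  L <- nrow(X)
  K <- ncol(X)
  N <- L + K - 1
  IND <- c(row(X) + col(X) - 1L)        # anti-diagonal index per cell
  sums <- rowsum(c(X), IND)             # N x 1 matrix, one row per index 1..N
  # Diagonal i holds min(i, L, K, N - i + 1) cells; also valid when L == K.
  counts <- pmin(seq_len(N), L, K, rev(seq_len(N)))
  c(sums) / counts
}

diag_average(matrix(1:12, nrow = 3))    # 3 x 4 example
diag_average(matrix(1:16, nrow = 4))    # square case, L == K
```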
[Update: As Joshua suggested, it can be sped up even further by using the crux of the rowsum code:
reconSSA_2 <- function(S, v, group = 1) {
  ### S : matrix
  ### v : vector
  N <- length(v)
  L <- nrow(S)
  K <- N - L + 1
  XX <- matrix(0, nrow = L, ncol = K)
  IND <- c(row(XX) + col(XX) - 1L)
  XX <- matrix(v[row(XX) + col(XX) - 1], nrow = L, ncol = K)
  XX <- c(S[, group] %*% t(t(XX) %*% S[, group]))
  ## Diagonal averaging
  SUMS <- .Call("Rrowsum_matrix", XX, 1L, IND, 1:N,
                TRUE, PACKAGE = "base")
  counts <- if (L <= K) c(1:L, rep(L, K - L - 1), L:1)
            else c(1:K, rep(K, L - K - 1), K:1)
  c(SUMS / counts)
}
   test elapsed  relative
3 SSA_2   0.156  1.000000
2 SSA_1   0.559  3.583333
1   SSA   5.389 34.544872
That is a speedup of 34.5x compared to the original code!
]
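The `.Call("Rrowsum_matrix", ...)` line targets an internal entry point of base R. It is not part of the documented API, so it may not be available under that name in other R versions. A possible portable variant is sketched below, with no timing claims attached. It uses the documented `rowsum()` with `reorder = FALSE`, which skips the sort of group labels. That is safe here only because the column-major index vector meets the labels 1, 2, ..., N in increasing order of first appearance. The helper name `diag_average_portable` is hypothetical.

```r
# Sketch: documented-API alternative to the internal .Call (untimed).
# `diag_average_portable` is a hypothetical helper name.
diag_average_portable <- function(X) {
  L <- nrow(X)
  K <- ncol(X)
  N <- L + K - 1L
  IND <- c(row(X) + col(X) - 1L)              # integer anti-diagonal indices
  # reorder = FALSE keeps groups in order of first appearance, which for
  # this index vector is already 1, 2, ..., N.
  sums <- rowsum(c(X), IND, reorder = FALSE)
  counts <- tabulate(IND, nbins = N)          # cells per anti-diagonal
  c(sums) / counts
}

diag_average_portable(matrix(1:12, nrow = 3))
```

Using `tabulate()` for the counts avoids the if/else on L and K altogether, at the cost of one extra pass over the index vector.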