Apply function too slow in R
Question
For a large number of species, I have to compute a specific formula for each row. The formula multiplies each abundance value by the corresponding value in the last row of the data frame, and then sums all of these products.
My current script uses an apply function, which appears to be as slow as the for loop I started with.
I simplified the problem in the following script, using a small data frame called az:
az=data.frame(c(1,2,10),c(2,4,20),c(3,6,30))
colnames(az)=c("a","b","c")
# Initial for loop
prov=0 # prov for provisional number
for (i in 1:nrow(az)){
  for (j in 1:ncol(az)){
    # multiply each value by the value in the last row of the same column
    prov=prov+az[i,j]*az[nrow(az),j]
  }
  print(prov)
  prov=0 # reset before the next row
}
# Apply solution
apply(az[,], 1, function(x) {sum(x*az[nrow(az),], na.rm=TRUE)})
Both solutions work, but they are quite slow with my original data frame, and I have to repeat the operation for a huge number of species. Does anyone have a more efficient solution, perhaps using vectorized expressions?
Kind regards.
Recommended answer
The fastest solution is probably matrix algebra:
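# Original apply solution, for reference
apply(az[,], 1, function(x) {sum(x*az[nrow(az),], na.rm=TRUE)})
#[1] 140 280 1400
# Matrix version: multiply the whole matrix by its last row in one step
m <- as.matrix(az)
m[is.na(m)] <- 0 # replace NA with 0 so it adds nothing to the sums
as.vector(m %*% m[nrow(m),])
#[1] 140 280 1400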
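If the operation has to be repeated for many data frames, the matrix approach above could be wrapped in a small function. The following is only a sketch: the name weighted_row_sums is hypothetical (it does not come from the question or the answer), and it assumes every column is numeric and that replacing NA with 0 is an acceptable stand-in for na.rm=TRUE.

```r
# Hypothetical helper (name is an assumption, not from the original answer):
# for each row, sum the products of its values with the last row's values.
weighted_row_sums <- function(df) {
  m <- as.matrix(df)      # assumes all columns are numeric
  m[is.na(m)] <- 0        # NA contributes nothing, like na.rm = TRUE
  w <- m[nrow(m), ]       # weights taken from the last row
  as.vector(m %*% w)      # one matrix-vector product instead of a loop
}

az <- data.frame(a = c(1, 2, 10), b = c(2, 4, 20), c = c(3, 6, 30))
res <- weighted_row_sums(az)

# Cross-check against the apply() version from the question
ref <- apply(as.matrix(az), 1,
             function(x) sum(x * unlist(az[nrow(az), ]), na.rm = TRUE))
stopifnot(isTRUE(all.equal(res, as.vector(ref))))
```

The gain comes from replacing one R-level function call per row with a single matrix-vector product; how much faster it is in practice depends on the size of the real data, so it is worth timing both versions with system.time() on the original data frame.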