Optimize variance calculation, for loop too slow


Problem description

This is a follow-up to the question answered at this link [Apply function too slow in r

I have to calculate a specific formula per row for a large number of species. The formula corresponds to a variance calculation and therefore needs the result obtained in the question linked above.

My current script uses a for loop, which is very slow. I have simplified the problem in the script below, using a small data frame called az.

az=data.frame(c(1,2,10),c(2,4,20),c(3,6,30))
colnames(az)=c("a","b","c")

# Necessary number calculated in step 1 (see link above)
m <- as.matrix(az)
m[is.na(m)] <- 0 #remove NA from sums
step1 = as.vector(m %*% m[nrow(m),])

# Initial for loop
prov=0 # prov for provisional number
for (i in 1:nrow(az)){
    for (j in 1:ncol(az)){
        prov=prov+az[i,j]*az[nrow(az),j]
        prov=prov+az[i,j]*(az[nrow(az),j]-step1[i])^2
    }
    print(prov)
    prov=0
}
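
Reading the loop above term by term, the value printed for each row i appears to be the following. The symbols are introduced here for illustration only and are not in the original post: a_{ij} is the entry az[i,j], n is the last row, and s_i is step1[i].

```latex
s_i = \sum_{j} a_{ij}\, a_{nj},
\qquad
\mathrm{prov}_i
  = \sum_{j} a_{ij}\, a_{nj} + \sum_{j} a_{ij}\,\bigl(a_{nj} - s_i\bigr)^2
  = s_i + \sum_{j} a_{ij}\,\bigl(a_{nj} - s_i\bigr)^2 .
```

For the first row of az this gives s_1 = 1*10 + 2*20 + 3*30 = 140 and prov_1 = 140 + 1*130^2 + 2*120^2 + 3*110^2 = 82140.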

As I have to repeat the operation for a huge number of species, I was wondering if anyone has a more efficient solution, perhaps using vectorized expressions.

Kind regards.

Recommended answer

This code returns the same values that your code prints, but more efficiently.

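> n<-nrow(m)
> # transpose so that each column of mm is one row of the original data
> mm<-t(m)
> # mm[,n] (the original last row) is recycled down every column
> prov<-mm*mm[,n]
> # step1[col(mm)] supplies step1[i] to every element of column i
> prov<-prov+mm*(mm[,n]-step1[col(mm)])^2
> # one total per original row
> colSums(prov)
[1]     82140    791480 113717400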
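
As a cross-check, here is a minimal sketch of the same calculation without the transpose. It is not part of the original answer: it assumes that base R's outer() is an acceptable way to build the matrix of squared deviations, and the names last, sq_dev, loop_result and vec_result are made up for illustration. The loop is kept only as a reference to compare against.

```r
# Example data from the question
az <- data.frame(a = c(1, 2, 10), b = c(2, 4, 20), c = c(3, 6, 30))
m <- as.matrix(az)
m[is.na(m)] <- 0                 # treat NA as 0, as in the question
last <- m[nrow(m), ]             # last row of the data (illustrative name)
step1 <- as.vector(m %*% last)   # step 1 result: one value per row

# Reference: the original double loop, with the inner loop written as sums
loop_result <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  loop_result[i] <- sum(m[i, ] * last) + sum(m[i, ] * (last - step1[i])^2)
}

# Vectorized: sq_dev[i, j] = (last[j] - step1[i])^2
sq_dev <- outer(step1, last, function(s, l) (l - s)^2)
vec_result <- step1 + unname(rowSums(m * sq_dev))

# Both approaches should agree
stopifnot(isTRUE(all.equal(loop_result, vec_result)))
print(vec_result)
```

On the example data this should print the same three values as the answer above. Note that sq_dev has as many elements as m, so memory use grows with the size of the data, as it does for the transposed version.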
