How to make Rcpp code efficient with multiple for loops?

Question

I am trying to run the following Rcpp code by calling it from R. The computation is extremely slow, and many nested for loops are involved.

#include <RcppArmadillo.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppArmadillo)]]

// [[Rcpp::export]]
arma::mat qpart(
      const int& n,
      const int& p,
      const int& m,
      arma::vec& G,
      arma::vec& ftime,
      arma::vec& cause,
      arma::mat& covs,
      arma::mat& S1byS0hat,
      arma::vec& S0hat,
      arma::vec& expz){

      arma::mat q(n,p);
      q.zeros();
      for(int u=0;u<n;++u){
         arma::mat q1(1,p);
         q1.zeros();
         for(int iprime=0;iprime<n;++iprime){
            for(int i=0;i<n;++i){
               if(cause(iprime)==1 & cause(i)>1 & (ftime(i) < ftime(u)) & (ftime(u) <= ftime(iprime))){
                   q1 += (covs.row(i) - S1byS0hat.row(iprime))*G(iprime)/G(i)*expz(i)/S0hat(iprime);
               }
             }
         }
         q.row(u) = q1/(m*m);
      }
return q;
}

The following is an example call from R.

#### In R ########
n = 2000
m = 500
p=3
G = runif(n)
ftime = runif(n,0.01,5)
cause = c(rep(0,600),rep(1,1000),rep(2,400))
covs = matrix(rnorm(n*p),n,p)
S1byS0hat = matrix(rnorm(n*p),p,n)
S0hat = rnorm(n)
expz = rnorm(n)

system.time( qpart(n,p,m,G,ftime,cause,covs,t(S1byS0hat),S0hat,expz))
user  system elapsed 
   21.5     0.0    21.5 

As we can see, the computing time is very high.

The same computation implemented in plain R is also very slow:

q = matrix(0,n,p)
for(u in 1 : n){
  q1 <- matrix(0,p,1)
  for(iprime in 1 : n){
    for(i in 1 : n){
      # ftime is the vector defined in the example above (`time` is a base R function, not data)
      if(cause[iprime]==1 && cause[i]>1 && (ftime[i]<ftime[u]) && (ftime[u] <= ftime[iprime])){
        q1 = q1 + (covs[i,] - S1byS0hat[,iprime])*G[iprime]/G[i]*expz[i]/S0hat[iprime]
      }
    }
  }
  q[u,] = q1/(m*m)
}

The following is the formula that I am trying to implement.
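
The formula itself is not reproduced in this copy of the question. Read off the loops in the code above, the quantity being computed appears to be the following; this is a reconstruction from the code using the code's own variable names, not the asker's original notation:

```latex
q_u \;=\; \frac{1}{m^2} \sum_{i'=1}^{n} \sum_{i=1}^{n}
  \mathbf{1}\{\mathrm{cause}_{i'} = 1,\ \mathrm{cause}_i > 1,\
              \mathrm{ftime}_i < \mathrm{ftime}_u \le \mathrm{ftime}_{i'}\}\,
  \bigl(\mathrm{covs}_i - \mathrm{S1byS0hat}_{i'}\bigr)\,
  \frac{G_{i'}}{G_i}\,\frac{\mathrm{expz}_i}{\mathrm{S0hat}_{i'}},
  \qquad u = 1, \dots, n
```

Here covs_i and S1byS0hat_{i'} denote rows of length p, so each q_u is a row of the n-by-p result.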

Recommended answer

Some conditions depend only on u and iprime, so you can check them earlier, before entering the innermost loop over i. You can also precompute the ratios that do not change between iterations (G / S0hat and expz / G). This gives:

#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]

// [[Rcpp::export]]
arma::mat qpart2(
    double m,
    arma::vec& ftime,
    arma::vec& cause,
    arma::mat& covs,
    arma::mat& S1byS0hat,
    arma::vec& G_div_S0hat,
    arma::vec& expz_div_G){

  double m2 = m * m;

  int n = covs.n_rows;
  int p = covs.n_cols;

  arma::mat q(n, p, arma::fill::zeros);

  for (int u = 0; u < n; u++) {
    double ftime_u = ftime(u);
    for (int iprime = 0; iprime < n; iprime++) {
      if (cause(iprime) == 1 && ftime_u <= ftime(iprime)) {
        for (int i = 0; i < n; i++) {
          if (cause(i) > 1 && ftime(i) < ftime_u) {
            double coef = G_div_S0hat(iprime) * expz_div_G(i);
            for (int j = 0; j < p; j++) {
              q(u, j) += (covs(i, j) - S1byS0hat(iprime, j)) * coef;
            }
          }
        }
      }
    }
    for (int j = 0; j < p; j++)  q(u, j) /= m2;
  }

  return q;
}

Calling qpart2(m, ftime, cause, covs, t(S1byS0hat), G / S0hat, expz / G) takes 3.7 sec (vs 32 sec for your code).
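
A possible further step, which the answer does not propose: for a fixed u, the condition on iprime and the condition on i are independent, so the double sum factors into products of single sums and the cost per u drops from O(n^2) to O(n). The sketch below is plain C++ so that it compiles without R. `Mat`, `qpart_nested`, `qpart_factored`, `demo_max_diff` and the toy data are hypothetical names made up for illustration; `a` stands for G / S0hat and `b` for expz / G. The two versions should agree up to floating-point rounding, not bit for bit.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal row-major matrix; a stand-in for arma::mat so the sketch compiles
// without R or Armadillo (hypothetical helper, not part of the original code).
struct Mat {
  std::size_t n, p;
  std::vector<double> v;
  Mat(std::size_t n_, std::size_t p_) : n(n_), p(p_), v(n_ * p_, 0.0) {}
  double& operator()(std::size_t r, std::size_t c) { return v[r * p + c]; }
  double operator()(std::size_t r, std::size_t c) const { return v[r * p + c]; }
};

// Reference version: same loop structure as qpart2.
Mat qpart_nested(double m, const std::vector<double>& ftime,
                 const std::vector<int>& cause, const Mat& covs, const Mat& S,
                 const std::vector<double>& a, const std::vector<double>& b) {
  const std::size_t n = covs.n, p = covs.p;
  Mat q(n, p);
  for (std::size_t u = 0; u < n; ++u) {
    for (std::size_t ip = 0; ip < n; ++ip) {
      if (cause[ip] == 1 && ftime[u] <= ftime[ip]) {
        for (std::size_t i = 0; i < n; ++i) {
          if (cause[i] > 1 && ftime[i] < ftime[u]) {
            const double coef = a[ip] * b[i];
            for (std::size_t j = 0; j < p; ++j)
              q(u, j) += (covs(i, j) - S(ip, j)) * coef;
          }
        }
      }
    }
    for (std::size_t j = 0; j < p; ++j) q(u, j) /= m * m;
  }
  return q;
}

// Factored version: sum_{ip,i} (covs_i - S_ip) a_ip b_i
//   = A * CB - SA * B, where A = sum a_ip, SA = sum a_ip S_ip (over valid ip)
//   and B = sum b_i, CB = sum b_i covs_i (over valid i).
Mat qpart_factored(double m, const std::vector<double>& ftime,
                   const std::vector<int>& cause, const Mat& covs, const Mat& S,
                   const std::vector<double>& a, const std::vector<double>& b) {
  const std::size_t n = covs.n, p = covs.p;
  assert(ftime.size() == n && cause.size() == n && a.size() == n && b.size() == n);
  Mat q(n, p);
  std::vector<double> SA(p), CB(p);
  for (std::size_t u = 0; u < n; ++u) {
    double A = 0.0, B = 0.0;
    std::fill(SA.begin(), SA.end(), 0.0);
    std::fill(CB.begin(), CB.end(), 0.0);
    for (std::size_t k = 0; k < n; ++k) {
      if (cause[k] == 1 && ftime[u] <= ftime[k]) {  // k plays the role of iprime
        A += a[k];
        for (std::size_t j = 0; j < p; ++j) SA[j] += a[k] * S(k, j);
      }
      if (cause[k] > 1 && ftime[k] < ftime[u]) {    // k plays the role of i
        B += b[k];
        for (std::size_t j = 0; j < p; ++j) CB[j] += b[k] * covs(k, j);
      }
    }
    for (std::size_t j = 0; j < p; ++j)
      q(u, j) = (A * CB[j] - SA[j] * B) / (m * m);
  }
  return q;
}

// Builds small made-up inputs and returns the largest absolute difference
// between the two versions.
double demo_max_diff() {
  const std::size_t n = 8, p = 2;
  std::vector<double> ftime(n), a(n), b(n);
  std::vector<int> cause(n);
  Mat covs(n, p), S(n, p);
  for (std::size_t k = 0; k < n; ++k) {
    ftime[k] = 0.5 + 0.37 * static_cast<double>((k * 5) % 8);
    cause[k] = static_cast<int>(k % 3);
    a[k] = 1.0 + 0.1 * k;
    b[k] = 0.5 + 0.05 * k;
    for (std::size_t j = 0; j < p; ++j) {
      covs(k, j) = 0.3 * k - 1.0 * j;
      S(k, j) = 0.2 * j + 0.01 * k * k;
    }
  }
  const Mat q1 = qpart_nested(2.0, ftime, cause, covs, S, a, b);
  const Mat q2 = qpart_factored(2.0, ftime, cause, covs, S, a, b);
  double d = 0.0;
  for (std::size_t t = 0; t < q1.v.size(); ++t)
    d = std::max(d, std::fabs(q1.v[t] - q2.v[t]));
  return d;
}
```

Calling demo_max_diff() from a main function and checking that the result is tiny is a quick way to convince yourself the algebra holds before porting the same idea to arma types. No timing is claimed here; the point is only the reduction in loop depth.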

Small remarks:

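  • Is there a reason you are using arma structures rather than Rcpp structures?
  • You should access matrices by column rather than by row; it should be a bit faster because they are stored column by column.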
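
To illustrate the second remark, here is a minimal standalone sketch of what "stored column by column" means for loop order. It uses a flat `std::vector` laid out column-major, as R and Armadillo matrices are; `make_colmajor`, `sum_by_column` and `sum_by_row` are hypothetical names for this illustration only. Both functions return the same sum, but the first one walks memory contiguously while the second jumps by n elements at each step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major n x p matrix in a flat buffer: element (i, j) lives at x[i + j * n].
// Toy values i + 10 * j are chosen so all sums are exact in floating point.
std::vector<double> make_colmajor(std::size_t n, std::size_t p) {
  std::vector<double> x(n * p);
  for (std::size_t j = 0; j < p; ++j)
    for (std::size_t i = 0; i < n; ++i)
      x[i + j * n] = static_cast<double>(i + 10 * j);
  return x;
}

// Column in the outer loop, row in the inner loop: consecutive iterations
// touch consecutive memory addresses.
double sum_by_column(const std::vector<double>& x, std::size_t n, std::size_t p) {
  assert(x.size() == n * p);
  double s = 0.0;
  for (std::size_t j = 0; j < p; ++j)
    for (std::size_t i = 0; i < n; ++i)
      s += x[i + j * n];
  return s;
}

// Row in the outer loop, column in the inner loop: consecutive iterations
// are n elements apart in memory, which is less cache-friendly for large n.
double sum_by_row(const std::vector<double>& x, std::size_t n, std::size_t p) {
  assert(x.size() == n * p);
  double s = 0.0;
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < p; ++j)
      s += x[i + j * n];
  return s;
}
```

In qpart2 the innermost loop runs over j for a fixed row, which is the strided pattern. One way to follow the remark, as an assumption rather than something the answer spells out, would be to pass the matrices transposed so that each observation is a column.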
