我怎样才能加速这个 Rcpp 代码? [英] How could I speed up this Rcpp code?

查看:39
本文介绍了我怎样才能加速这个 Rcpp 代码?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在 R 中实现了一个运行时间很长的函数.我已经成功地在 R 中改进了它,但现在我想通过使用 Rcpp 包来加快它的速度.

I had implemented a function in R which was long to run. I have succeeded in improving it in R but now I would like to speed it up more by using Rcpp package.

我创建了以下 Rcpp 代码.不幸的是,它的运行时间与 R 代码大致相同.我想因此改进它.有没有人知道如何改进这段代码?

I have created the following Rcpp code. Unfortunately, it takes approximately the same time to run as the R code. I would like thus to improve it. Has anyone an idea on how to improve this piece of code?

非常感谢!

#include <math.h>
#include <Rcpp.h>
using namespace Rcpp;


// [[Rcpp::export]]
double kernelcpp(NumericVector a, NumericVector b, int N){
  int i;
  double sum=0.0;
  for (i=0;i<N;i++){
    if (a[i] > b[i])
      sum+= a[i] - b[i];
    else
      sum+= b[i] - a[i];
  }
  return(exp( - sum));
}
// [[Rcpp::export]]
NumericVector testFromontcpp(NumericMatrix z1, NumericMatrix z2, int Nbootstrap){
  // first element of TKeps = TK
  int i,j,k,t;
  int dim1 = z1.nrow();
  int dim2 = z2.nrow();
  double n1 = (double) dim1;
  double n2 = (double) dim2;
  int dimension = z1.ncol();
  int N = dim1 + dim2;
  NumericVector TKeps(Nbootstrap+1);
  Rcpp::NumericMatrix bb(N,N);

  double cc = 1 / (n1*n2*(n1+n2-2));
  double a = sqrt(1/(n1*n1-n1)-cc);
  double b = - sqrt(1/(n2*n2-n2)-cc);

  for (i=0 ; i<N ; i++){
    for (j=0 ; j<N ; j++){
    if (i != j){
      if (i < dim1) {
        if (j < dim1){
          bb(i,j) = kernelcpp(z1(i,_),z1(j,_),dimension);
        } else {
          bb(i,j) = kernelcpp(z1(i,_),z2(j-dim1,_),dimension);
        }
      }
      else{
        if (j < dim1){
          bb(i,j) = kernelcpp(z2(i-dim1,_),z1(j,_),dimension);
        } else {
          bb(i,j) = kernelcpp(z2(i-dim1,_),z2(j-dim1,_),dimension);
        }
      }
    }
    }
  }

  TKeps(0)=0.0;
  for (i=0 ; i<N ; i++){
    for (j=0 ; j<N ; j++){
    if (i != j){
      if (i < dim1) {
        if (j < dim1){
          TKeps(0) += bb(i,j)* (a*a + cc);
        } else {
          TKeps(0) += bb(i,j) * (a*b + cc);
        }
      }
      else{
        if (j < dim1){
          TKeps(0) += bb(i,j) * (a*b + cc);
        } else {
          TKeps(0) += bb(i,j) * (b*b + cc);
        }
      }
    }
    }
  }


  for (k=1 ; k<=Nbootstrap ; k++){
    TKeps(k)=0;
    int R[N];
    for (i = 0 ; i < N ; i++)
      R[i] = i;
    for (i = 0; i < N - 1 ; i++) {
      int j = i + rand() / (RAND_MAX / (N - i) + 1);
      t = R[j];
      R[j] = R[i];
      R[i] = t;
    }

    for (i=0 ; i<N ; i++){
      for (j=0 ; j<N ; j++){
        if (i != j){ 
          if (R[i] < n1) {
            if (R[j] < n1){
              TKeps(k) += bb(i,j) * (a*a + cc);
            } else {
              TKeps(k) += bb(i,j) * (a*b + cc);
            }
          } else{
            if (R[j] < n1){
              TKeps(k) += bb(i,j) * (b*a + cc);
            } else {
              TKeps(k) += bb(i,j) * (b*b + cc);
            }
          }
        }
      }
    }
  }
  return(TKeps);
}

推荐答案

由于我不完全知道你的代码是做什么的,所以我可以从头看到两件事:

Since I do not exactly know what your code does, I can see two things from the scratch:

  • 您从 R 环境调用的函数是 testFromontcpp(...).我建议这个函数应该有 SEXP 值作为参数.那些 S-Expressions 是指向 R 内存的指针. 如果你不使用 SEXP,那么两个矩阵都将被复制:考虑一个 1000x1000 矩阵,这意味着您在 R 中保存了 100 万个条目,这些条目被复制到 C++.这样做写:

  • The function you call from your R environment is testFromontcpp(...). I suggest that this function should have SEXP values as parameters. Those S-Expressions are pointer to the memory of R. If you don't use SEXP, then both matrices will be copied: Consider a 1000x1000 matrix, this means you have 1 million entries saved in R, which are copied to C++. To do so write:

testFromontcpp(SEXP x, SEXP y, SEXP z) {

testFromontcpp(SEXP x, SEXP y, SEXP z) {

数值矩阵 z1(x), z2(y);

NumericMatrix z1(x), z2(y);

int *Nbootstrap = INTEGER(z);

int *Nbootstrap = INTEGER(z);

...}

注意:在 for 循环中不能使用 i<Nbootstrap.你必须写 i<*Nbootstrap!!!

Be careful: In the for-loop you cannot use i<Nbootstrap. You have to write i<*Nbootstrap!!!

  • 其次...而且更重要的是:由于 R 的矩阵保存为指向列的指针和从列指针到行的指针,因此 C 的矩阵以相反的方式保存.我想说的是,跳入内存并跳回整个时间而不是遵循内存路径需要花费很多.我对此的建议是:切换 for 循环,因此首先迭代特定列的行,而不是相反.

最后一点:在大学的一项任务中,我也遇到了迭代矩阵的问题.就我而言,转置矩阵然后进行计算要便宜得多.

To the last point: In a task at university I had the problem with iterating over matrices, too. In my case it was way cheaper to transpose the matrix and then do calculations.

希望能帮到你.

最好的,迈克尔

PS:参考第 1 点...我刚刚使用您的实现和使用 SEXP 对您的代码进行了基准测试.对于具有 1 到 10 之间随机数的 100x100 矩阵,使用 SEXP 会稍微快一些.

PS: Referring to point 1...I just benchmarked your code with your implementation and with using SEXP. With SEXP it is slightly quicker for a 100x100 matrix with random numbers between 1 to 10.

这篇关于我怎样才能加速这个 Rcpp 代码?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆