Rcpp programming efficiency
Question
I am a beginner with Rcpp. I have written some Rcpp code that operates on two 3-dimensional arrays, Array1 and Array2. Suppose Array1 has dimensions (1000, 100, 40) and Array2 has dimensions (1000, 96, 40).
I would like to perform wilcox.test using:
wilcox.test(Array1[i, j,], Array2[i,,])
In R, I wrote nested for loops that completed the calculation in about half an hour.
Then I rewrote it in Rcpp. The Rcpp version took an hour to produce the same results. I thought it would be faster since it is written in C++. I suspect that my coding style is the cause of the poor efficiency.
The following is my Rcpp code. Would you mind helping me find out what improvements I should make? I appreciate it!
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector Cal(NumericVector Array1, NumericVector Array2, Function wilc) {
  NumericVector vecArray1(Array1);
  IntegerVector arrayDims1 = vecArray1.attr("dim");
  NumericVector vecArray2(Array2);
  IntegerVector arrayDims2 = vecArray2.attr("dim");

  // Wrap the R memory as Armadillo cubes without copying (copy_aux_mem = false).
  arma::cube cubeArray1(vecArray1.begin(), arrayDims1[0], arrayDims1[1], arrayDims1[2], false);
  arma::cube cubeArray2(vecArray2.begin(), arrayDims2[0], arrayDims2[1], arrayDims2[2], false);
  arma::mat STORE = arma::mat(arrayDims1[0], arrayDims1[1]);

  for (int i = 0; i < arrayDims1[1]; i++) {
    for (int j = 0; j < arrayDims1[0]; j++) {
      arma::vec v_cl = cubeArray1.subcube(arma::span(j), arma::span(i), arma::span::all);
      //arma::mat tem=cubeArray2.subcube(arma::span(j),arma::span::all,arma::span::all);
      //arma::vec v_ct=arma::vectorise(tem);
      arma::vec v_ct = arma::vectorise(cubeArray2.subcube(arma::span(j), arma::span::all, arma::span::all));
      // Each iteration calls back into R to run the test.
      Rcpp::List resu = wilc(v_cl, v_ct);
      STORE(j, i) = resu[2];
    }
  }
  return(Rcpp::wrap(STORE));
}
The function wilc will be wilcox.test from R.
The following is part of my R code implementing the above idea, where CELLS and CTRLS are two 3D arrays in R.
for (i in 1:ncol(CELLS)) {
  if (T) { print(i) }
  for (j in 1:dim(CELLS)[1]) {
    wtest = wilcox.test(CELLS[j,i,], CTRLS[j,,])
    TSTAT_clcl[j,i] = wtest$p.value
  }
}
Recommended answer
"Then, I wrote it into Rcpp. The calculation within Rcpp took an hour to achieve the same results. I thought it should be faster since it is written in C++ language."
Obligatory disclaimer:
Embedding R code in C++ and expecting a speed-up is a fool's game. You will need to rewrite wilcox.test fully in C++ instead of making a call to R. Otherwise, you lose whatever speed-up advantage you gained.
In particular, I wrote up a post illustrating this conundrum using the diff function in R. In the post, I compared a pure C++ implementation, a C++ implementation that calls an R function within the routine, and a pure R implementation. Borrowing the microbenchmark from that post illustrates the issue.
expr         min       lq      mean   median       uq      max neval
arma_fun  26.117   27.318  37.54248   28.218   29.869  751.087   100
r_fun    127.883  134.187 212.81091  138.390  151.148 1012.856   100
rcpp_fun 250.663  265.972 356.10870  274.228  293.590 1430.426   100
Thus, the pure C++ implementation had the largest speed-up. Going by the row names, that is arma_fun; the slowest row, rcpp_fun, is slower even than r_fun.
Hence, the takeaway is that the wilcox.test R routine needs to be translated into a pure C++ implementation to reduce the run time. Otherwise, it is pointless to write the code in C++, because the C++ component must stop and wait for results from R before continuing. Each such call traditionally carries a lot of overhead to ensure the data is well protected.
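To make the suggestion concrete, here is a minimal sketch of what a pure C++ replacement might look like, written in standard C++ with no Rcpp types so that it stands alone. It is not a port of R's wilcox.test source. It assumes the two-sided, unpaired test using the normal approximation with tie and continuity corrections, which is the branch R normally takes when a sample has 50 or more values (as the pooled control sample here does). The names rank_sum_test, RankSumResult, and all_p_values are hypothetical, and NA handling and exact p-values are left out.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical result type: W statistic and two-sided p-value.
struct RankSumResult {
    double w;
    double p_value;
};

// Sketch of a two-sided Wilcoxon rank-sum (Mann-Whitney) test using the
// normal approximation with tie and continuity corrections.
// Assumes no missing values and at least one pair of distinct values.
RankSumResult rank_sum_test(const std::vector<double>& x,
                            const std::vector<double>& y) {
    const std::size_t nx = x.size(), ny = y.size(), n = nx + ny;
    std::vector<double> pooled(x);
    pooled.insert(pooled.end(), y.begin(), y.end());

    // Sort indices so we know which sample each sorted value came from.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(),
              [&](std::size_t a, std::size_t b) { return pooled[a] < pooled[b]; });

    double rank_sum_x = 0.0;  // sum of mid-ranks of the x sample
    double tie_term = 0.0;    // sum over tie groups of t^3 - t
    for (std::size_t i = 0; i < n;) {
        std::size_t j = i;
        while (j < n && pooled[order[j]] == pooled[order[i]]) ++j;
        const double t = static_cast<double>(j - i);
        const double mid_rank = (i + 1 + j) / 2.0;  // average of ranks i+1..j
        for (std::size_t k = i; k < j; ++k)
            if (order[k] < nx) rank_sum_x += mid_rank;
        tie_term += t * t * t - t;
        i = j;
    }

    const double dx = static_cast<double>(nx);
    const double dy = static_cast<double>(ny);
    const double dn = static_cast<double>(n);
    const double w = rank_sum_x - dx * (dx + 1.0) / 2.0;
    double z = w - dx * dy / 2.0;
    const double sigma =
        std::sqrt(dx * dy / 12.0 * ((dn + 1.0) - tie_term / (dn * (dn - 1.0))));
    const double correction = (z > 0.0) ? 0.5 : (z < 0.0 ? -0.5 : 0.0);
    z = (z - correction) / sigma;
    // Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    return {w, std::erfc(std::fabs(z) / std::sqrt(2.0))};
}

// Hypothetical driver over flat, column-major arrays (R's storage order):
// a1 has dims (rows, cols1, depth), a2 has dims (rows, cols2, depth).
// Returns a rows x cols1 matrix of p-values, also column-major.
std::vector<double> all_p_values(const std::vector<double>& a1,
                                 const std::vector<double>& a2,
                                 std::size_t rows, std::size_t cols1,
                                 std::size_t cols2, std::size_t depth) {
    std::vector<double> out(rows * cols1);
    std::vector<double> x(depth), y(cols2 * depth);
    for (std::size_t j = 0; j < rows; ++j) {
        // The pooled control sample a2[j,,] depends only on j: build it once.
        for (std::size_t k = 0; k < depth; ++k)
            for (std::size_t c = 0; c < cols2; ++c)
                y[c + cols2 * k] = a2[j + rows * (c + cols2 * k)];
        for (std::size_t i = 0; i < cols1; ++i) {
            for (std::size_t k = 0; k < depth; ++k)
                x[k] = a1[j + rows * (i + cols1 * k)];
            out[j + rows * i] = rank_sum_test(x, y).p_value;
        }
    }
    return out;
}
```

With the dimensions from the question, the call would be all_p_values(a1, a2, 1000, 100, 96, 40), and no step of the loop has to wait on R. For small samples without ties, R computes an exact p-value instead, so results from this sketch can differ from wilcox.test in that case.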