Matlab taking too much time in computing mex function for large array

Question

I wrote a MATLAB script in which I pass a few scalars and one row vector as input arguments to a MEX function; after some calculation, it returns a scalar as output. This has to be done for every element of an array of size 1 x 1638400. Below is the corresponding code:

ans=0;
for i=0:1638400-1
    temp = sub_imed(r,i,diff);
    ans  = ans + temp*diff(i+1); 
end

where r and i are scalars, diff is a vector of size 1 x 1638400, and sub_imed is a MEX function that does the following:

void sub_imed(double r,mwSize base, double* diff, mwSize dim, double* ans)              
{                                                                                           
     mwSize i,k,l,k1,l1;
     double d,g,temp = 0.0;  /* accumulator must start at zero */

     for(i=0; i<dim; i++)
     {   
          k = (base/200) + 1;
          l = (base%200) + 1;
          k1 = (i/200) + 1;
          l1 = (i%200) + 1;

          d = sqrt(pow((k-k1),2) + pow((l-l1),2));

          g=(1/(2*pi*pow(r,2)))*exp(-(pow(d,2))/(2*(pow(r,2))));   

          temp = temp + diff[i]*g;
     }

     *ans  = temp;
}

void mexFunction(int nlhs,mxArray *plhs[],int nrhs,const mxArray *prhs[]) 
{
    double *diff;           /* Input data vectors */
    double r;               /* Value of r (input) */
    double* ans;            /* Output ImED distance */
    size_t base,ncols;      /* For storing the size of input vector and base */

    /* Checking for proper number of arguments */
    if(nrhs!=3) 
       mexErrMsgTxt("Error..Three inputs required.");

    if(nlhs!=1) 
       mexErrMsgTxt("Error..Only one output required.");

    /* make sure the first input argument(value of r) is scalar */
    if( !mxIsDouble(prhs[0]) || mxIsComplex(prhs[0]) || mxGetNumberOfElements(prhs[0])!=1) 
       mexErrMsgTxt("Error..Value of r must be a scalar."); 

    /* make sure that the input value of base is a scalar */
    if( !mxIsDouble(prhs[1]) || mxIsComplex(prhs[1]) || mxGetNumberOfElements(prhs[1])!=1) 
       mexErrMsgTxt("Error..Value of base must be a scalar."); 

    /* make sure that the input vector diff is of type double */
    if(!mxIsDouble(prhs[2]) || mxIsComplex(prhs[2]))    
       mexErrMsgTxt("Error..Input vector must be of type double.");

    /* check that number of rows in input arguments is 1 */
    if(mxGetM(prhs[2])!=1) 
       mexErrMsgTxt("Error..Inputs must be row vectors."); 

    /* Get the value of r */
    r = mxGetScalar(prhs[0]);
    base = mxGetScalar(prhs[1]);

    /* Getting the input vectors */
    diff = mxGetPr(prhs[2]);
    ncols = mxGetN(prhs[2]);

    /* Creating link for the scalar output */
    plhs[0] = mxCreateDoubleMatrix(1,1,mxREAL);
    ans = mxGetPr(plhs[0]); 

    sub_imed(r,base,diff,(mwSize)ncols,ans);
}

For more details about the problem and the underlying algorithm, please see the thread "Euclidean distance between images".

I profiled my MATLAB script and found that it takes 63 seconds for just 387 calls to the sub_imed() MEX function. At that rate, 1638400 calls to sub_imed would take around 74 hours, which is far too long.

Can someone please help me optimize the code by suggesting alternative ways to reduce the computation time?

Thanks in advance.

Recommended answer

I ported your code back to MATLAB and made some small adjustments; the results should stay the same. I introduced the following constants:

N = 8192;
step = 0.005;

Note that N / step = 1638400. With that, you can rewrite your variable k (and rename it to baseDiv):

baseDiv = 1 + (0 : step : (N-step)).';

i.e. it runs from 1 up to (but not including) 8193 in steps of 0.005. Similarly, l is 1:200 (= 1:(1/0.005)), repeated 8192 times in a row, which is (now called baseMod):

baseMod = (repmat(1:1:(1/step), 1, N)).';

Your variables k1 and l1 are simply the ith element of k and l, i.e. baseDiv(i) and baseMod(i).
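
For reference, the index arithmetic that the question's C code applies to both base and i can be isolated in a small helper. This is only a sketch: the names grid_pos and index_to_pos are hypothetical, and the width of 200 is the value hard-coded in sub_imed. Because mwSize is an integer type, `/` truncates here, so the row value changes only once every 200 elements.

```c
#include <stddef.h>

/* Hypothetical helper type: 1-based row/column position of a linear index. */
typedef struct {
    size_t row;  /* corresponds to k (or k1) in sub_imed */
    size_t col;  /* corresponds to l (or l1) in sub_imed */
} grid_pos;

/* Map a 0-based linear index to a 1-based (row, col) pair, assuming
   `width` elements per row (200 in the question's code). */
grid_pos index_to_pos(size_t i, size_t width)
{
    grid_pos p;
    p.row = i / width + 1;  /* integer division: floor(i / width) + 1 */
    p.col = i % width + 1;  /* remainder selects the column */
    return p;
}
```

With width = 200, the last index 1638399 maps to row 8192 and column 200, which matches N = 8192 rows of 200 elements each.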

With the vectors baseDiv and baseMod, one can calculate d, g and your temporary variable tmp for a given index k (diffVec is your vector diff) with

d = sqrt((baseDiv(k)-baseDiv).^2 + (baseMod(k)-baseMod).^2);
g = 1/(2*pi*r^2) * exp(-(d.^2) / (2*r^2));
tmp = sum(diffVec .* g);

We can put this into your MATLAB for loop, so the whole program becomes

% Constants
N = 8192;
step = 0.005;

% Some example data
r = 2;
diffVec = rand(N/step,1);

base = (0:(numel(diffVec)-1)).';    
baseDiv = (1:step:1+N-step).';
baseMod = (repmat(1:1:(1/step), 1, N)).';

res = 0;
for k=1:(N/step)
    d = sqrt((baseDiv(k)-baseDiv).^2 + (baseMod(k)-baseMod).^2);
    g = 1/(2*pi*r^2) * exp(-(d.^2) / (2*r^2));
    tmp = sum(diffVec .* g);
    res = res + tmp * diffVec(k);
end

With the inner for loop eliminated and computed in a vectorized fashion, you still need about 11 seconds per 1000 iterations, for a total runtime of about 5 hours. That is still a speed-up of more than 10x. To speed things up further, you have two possibilities:

1) Complete vectorization: You can easily vectorize the remaining for loop by using bsxfun(@minus, baseDiv, baseDiv.') and summing over the columns to calculate all values at the same time. Unfortunately, there is a small problem: a 1638400-by-1638400 double matrix would take up about 20 TB of RAM, which, I assume, you don't have in your laptop ;-)

2) Fewer samples: You are doing some mathematical transform with a resolution of step=0.005. Check whether you really need this precision! If you use 1/10 of the precision (step=0.05), you are 100 times faster and finish within 3 minutes!
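
The figures quoted in this thread can be checked with a little arithmetic. The sketch below uses hypothetical helper names and assumes that timings extrapolate linearly in the number of calls, and that the cost of the whole computation grows with the square of the number of samples (one pass over all samples for each sample).

```c
#include <stddef.h>

/* Bytes needed for a dense n-by-n matrix of doubles. */
double full_matrix_bytes(double n)
{
    return n * n * (double)sizeof(double);
}

/* Extrapolate a measured timing linearly to a larger number of calls
   and convert the result from seconds to hours. */
double extrapolated_hours(double seconds_measured,
                          double calls_measured,
                          double calls_total)
{
    return seconds_measured / calls_measured * calls_total / 3600.0;
}

/* Speed-up when the number of samples shrinks by `factor`: the outer loop
   and the inner sum both get shorter, so the work falls by factor^2. */
double speedup_from_fewer_samples(double factor)
{
    return factor * factor;
}
```

With n = 1638400, full_matrix_bytes gives roughly 2.1e13 bytes, i.e. the "about 20 TB" mentioned in 1). extrapolated_hours(63, 387, 1638400) comes out at about 74 hours for the original MEX loop, and extrapolated_hours(11, 1000, 1638400) at about 5 hours for the vectorized version. speedup_from_fewer_samples(10) returns 100, and 5 hours divided by 100 is about 3 minutes.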
