如何优化该索引算法 [英] How can I optimize this indexing algorithm

查看:144
本文介绍了如何优化该索引算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

  • 反正是有,我可以加速这种计算?
  • 有没有更好的算法或实现,我可以使用来计算相同的值?

我有一个复杂的索引的问题,我竭力要解决的有效途径。

I have a complex indexing problem that I'm struggling to solve in an efficient way.

我们的目标是计算矩​​阵 w_prime 使用值值的组合在相同大小的矩阵是W DY dX的

The goal is to calculate the matrix w_prime using values a combination of values from the equally sized matrices w, dY, and dX.

w_prime(I,J)的计算公式为平均的值(W(INDY&安培; INDX)),其中 INDX DY的指数 dX的是分别等于学家

The value of w_prime(i,j) is calculated as mean( w( indY & indX ) ), where indY and indX are the indices of dY and dX that are equal to i and j respectively.

下面是在MATLAB算法来计算一个简单的实施 w_prime

Here's a simple implementation in matlab of an algorithm to compute w_prime:

for i = 1:size(w_prime,1)
  indY = dY == i;
  for j = 1:size(w_prime,2)
    indX = dX == j; 
    w_prime(ind) = mean( w( indY & indX ) );
  end
end

性能问题

这是实现足够在下面的例子情况;然而,在我的实际使用情况是W DY dX的 3000x3000 是〜<$ C $>和 w_prime 是〜 60X900 。这意味着每个指标的计算是发生在一〜900万的元素。不用这种实现是太慢,是可用的。此外,我还需要运行这个codeA几十倍。

Performance Problems

This implementation is sufficient in example case below; however, in my actual use case w, dY, dX are ~3000x3000 and w_prime is ~60X900. Meaning that each index calculation is happening on a ~9 million elements. Needless this implementation is too slow to be usable. Additionally I'll need to run this code a few dozen times.

如果我想计算 W(1,1)

  • 找到 DY 即等于1,另存为
  • 的指数
  • 找到的 dX的的指数即等于1,另存为 INDX
  • Find the indices of dY that equal 1, save as indY
  • Find the indices of dX that equal 1, save as indX

  • 找到 INDX 另存为 IND
  • Find intersection of indY and indX save as ind

  • 保存平均值(W(IND)) w_prime(1,1)

我有两个向量 X T 定义一组点,两者都是1XN其中N是〜 3000。此外X和T的值是整数,分别由间隔(1 60)和(1 900)的约束。

I have a set points defined by two vectors X, and T, both are 1XN where N is ~3000. Additionally the values of X and T are integers bound by the intervals (1 60) and (1 900) respectively.

该矩阵 dX的的dT ,只是距离矩阵,这意味着它们所包含的点之间的成对距离。即 DX(I,J)等于 ABS(X(我) - X(J))。

The matrices dX and dT, are simply distance matrices, meaning that they contain the pairwise distances between the points. Ie dx(i,j) is equal abs( x(i) - x(j) ).

他们利用计算: DX = pdist(X);

矩阵是W 可以被看作是一个权重矩阵,描述多少影响一点对另一回事。

The matrix w can be thought of as a weight matrix that describes how much influence one point has on another.

计算 w_prime(A,B)是确定的子集点是由分离之间的平均重量的目的 T X 尺寸和 B code>尺寸。

The purpose of calculating w_prime(a,b) is to determine the average weight between the sub-set of points that are separated by a in the X dimension and b in the T dimension.

这可能是pssed EX $ P $如下:

This can be expressed as follows:

推荐答案

这是直接与 ACCUMARRAY

nx = max(dX(:));
ny = max(dY(:));

w_prime = accumarray([dX(:),dY(:)],w(:),[nx,ny],@mean,NaN)

输出将是一个 NX -by - 纽约大小的数组与NaN的地方也没有相应的一对指数。如果你肯定会有指数全面补充所有的时间,可以简化上述计算以

The output will be a nx-by-ny sized array with NaNs wherever there was no corresponding pair of indices. If you're sure that there will be a full complement of indices all the time, you can simplify the above calculation to

w_prime = accumarray([dX(:),dY(:)],w(:),[],@mean)

那么,什么是accumarray办?它着眼于的行[DX(:),DY(:)] 。每一行给出了(I,J)坐标对 w_prime 来该行作出贡献。对于所有对(1,1),它适用的功能( @mean )的相应条目 W(:),并写入到输出 w_prime(1,1)

So, what does accumarray do? It looks at the rows of [dX(:),dY(:)]. Each row gives the (i,j) coordinate pair in w_prime to which the row contributes. For all pairs (1,1), it applies the function (@mean) to the corresponding entries in w(:), and writes the output into w_prime(1,1).

这篇关于如何优化该索引算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆