如何优化该索引算法 [英] How can I optimize this indexing algorithm
问题描述
- 反正是有,我可以加速这种计算?
- 有没有更好的算法或实现,我可以使用来计算相同的值?
我有一个复杂的索引的问题,我竭力要解决的有效途径。
I have a complex indexing problem that I'm struggling to solve in an efficient way.
我们的目标是计算矩阵 w_prime
使用值值的组合在相同大小的矩阵是W
, DY
和 dX的
。
The goal is to calculate the matrix w_prime
using values a combination of values from the equally sized matrices w
, dY
, and dX
.
w_prime(I,J)
的计算公式为平均的值(W(INDY&安培; INDX))
,其中印
和 INDX
是 DY的指数
和 dX的
是分别等于我
和学家
The value of w_prime(i,j)
is calculated as mean( w( indY & indX ) )
, where indY
and indX
are the indices of dY
and dX
that are equal to i
and j
respectively.
下面是在MATLAB算法来计算一个简单的实施 w_prime
:
Here's a simple implementation in matlab of an algorithm to compute w_prime
:
for i = 1:size(w_prime,1)
indY = dY == i;
for j = 1:size(w_prime,2)
indX = dX == j;
w_prime(ind) = mean( w( indY & indX ) );
end
end
性能问题
这是实现足够在下面的例子情况;然而,在我的实际使用情况是W
, DY
, dX的
3000x3000 是〜<$ C $>和 w_prime
是〜 60X900
。这意味着每个指标的计算是发生在一〜900万的元素。不用这种实现是太慢,是可用的。此外,我还需要运行这个codeA几十倍。
Performance Problems
This implementation is sufficient in example case below; however, in my actual use case w
, dY
, dX
are ~3000x3000
and w_prime
is ~60X900
. Meaning that each index calculation is happening on a ~9 million elements. Needless this implementation is too slow to be usable. Additionally I'll need to run this code a few dozen times.
如果我想计算 W(1,1)
- 找到
DY
即等于1,另存为印
的指数
- 找到的
dX的
的指数即等于1,另存为INDX
- Find the indices of
dY
that equal 1, save asindY
- Find the indices of
dX
that equal 1, save asindX
- 找到
印
和INDX
另存为IND
- Find intersection of
indY
andindX
save asind
- 保存
平均值(W(IND))
到w_prime(1,1)
我有两个向量 X
和 T
定义一组点,两者都是1XN其中N是〜 3000。此外X和T的值是整数,分别由间隔(1 60)和(1 900)的约束。
I have a set points defined by two vectors X
, and T
, both are 1XN where N is ~3000. Additionally the values of X and T are integers bound by the intervals (1 60) and (1 900) respectively.
该矩阵 dX的
和的dT
,只是距离矩阵,这意味着它们所包含的点之间的成对距离。即 DX(I,J)
等于 ABS(X(我) - X(J))。
The matrices dX
and dT
, are simply distance matrices, meaning that they contain the pairwise distances between the points. Ie dx(i,j)
is equal abs( x(i) - x(j) )
.
他们利用计算: DX = pdist(X);
矩阵是W
可以被看作是一个权重矩阵,描述多少影响一点对另一回事。
The matrix w
can be thought of as a weight matrix that describes how much influence one point has on another.
计算 w_prime(A,B)
是确定的子集点是由分离之间的平均重量的目的
在 T 的
X
尺寸和 B
code>尺寸。
The purpose of calculating w_prime(a,b)
is to determine the average weight between the sub-set of points that are separated by a
in the X
dimension and b
in the T
dimension.
这可能是pssed EX $ P $如下:
This can be expressed as follows:
推荐答案
这是直接与 ACCUMARRAY 一>:
nx = max(dX(:));
ny = max(dY(:));
w_prime = accumarray([dX(:),dY(:)],w(:),[nx,ny],@mean,NaN)
输出将是一个 NX
-by - 纽约
大小的数组与NaN的地方也没有相应的一对指数。如果你肯定会有指数全面补充所有的时间,可以简化上述计算以
The output will be a nx
-by-ny
sized array with NaNs wherever there was no corresponding pair of indices. If you're sure that there will be a full complement of indices all the time, you can simplify the above calculation to
w_prime = accumarray([dX(:),dY(:)],w(:),[],@mean)
那么,什么是accumarray办?它着眼于的行[DX(:),DY(:)]
。每一行给出了(I,J)
坐标对 w_prime
来该行作出贡献。对于所有对(1,1)
,它适用的功能( @mean
)的相应条目 W(:)
,并写入到输出 w_prime(1,1)
。
So, what does accumarray do? It looks at the rows of [dX(:),dY(:)]
. Each row gives the (i,j)
coordinate pair in w_prime
to which the row contributes. For all pairs (1,1)
, it applies the function (@mean
) to the corresponding entries in w(:)
, and writes the output into w_prime(1,1)
.
这篇关于如何优化该索引算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!