Is MATLAB's bsxfun the best? Python's numpy.einsum?
Question
I have a very large multiply-and-sum operation that I need to implement as efficiently as possible. The best method I've found so far is bsxfun in MATLAB, where I formulate the problem as:
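L = 10000;
x = rand(4,1,L+1);      % column vectors stacked along the third dimension
A_k = rand(4,4,L);      % 4-by-4 matrices stacked along the third dimension
tic
for k = 2:L
    i = 2:k;
    % multiply with singleton expansion, then sum over dimensions 2 and 3
    x(:,1,k+1) = x(:,1,k+1)+sum(sum(bsxfun(@times,A_k(:,:,2:k),x(:,1,k+1-i)),2),3);
end
toc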
Note that L will be larger in practice. Is there a faster method? It's strange that I need to first add the singleton dimension to x and then sum over it, but I can't get it to work otherwise.
It's still much faster than any other method I've tried, but not fast enough for our application. I've heard rumors that the Python function numpy.einsum may be more efficient, but I wanted to ask here first before I consider porting my code.
I'm using MATLAB R2017b.
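On the singleton dimension: bsxfun compares array sizes dimension by dimension and expands only dimensions of size 1. A 4-by-L array would have L matched against the second dimension of A_k, which is why x needs the shape 4-by-1-by-L here. The following is a minimal sketch on small, made-up arrays (the names A, xs, P1, P2 are illustrative only). It also shows the implicit-expansion form that is available from R2016b onward, so it should apply to R2017b.

```matlab
% Illustrative sizes only; not the sizes used in the question.
A  = rand(4,4,3);              % stack of three 4-by-4 matrices
xs = rand(4,1,3);              % stack of three column vectors (singleton 2nd dimension)

P1 = bsxfun(@times, A, xs);    % xs is expanded along dimension 2
P2 = A .* xs;                  % same product via implicit expansion (R2016b and later)

size_P1 = size(P1);            % size of the expanded product
same    = isequal(P1, P2);     % both forms compute the same element-wise products
```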
Recommended answer
I believe both of your summations can be removed, but I only removed the easier one for the time being. The summation over the second dimension is trivial, since it only affects the A_k array:
B_k = sum(A_k,2);       % sum over the second dimension once, outside the loop
for k = 2:L
    i = 2:k;
    x(:,1,k+1) = x(:,1,k+1) + sum(bsxfun(@times,B_k(:,1,2:k),x(:,1,k+1-i)),3);
end
With this single change the runtime is reduced from ~8 seconds to ~2.5 seconds on my laptop.
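As a sanity check, the two loops can be run on the same random data and compared. This is only a sketch: L is reduced to 50 so that it runs quickly, the seed is fixed with rng, and the names x0, x1, x2, ii and max_rel_diff are introduced here for illustration. The comparison is relative because the summation order differs between the two versions.

```matlab
rng(0);                        % fixed seed so both loops see the same data
L   = 50;                      % assumption: small L, just for a quick check
x0  = rand(4,1,L+1);
A_k = rand(4,4,L);

% Formulation from the question: sum over dimensions 2 and 3 inside the loop.
x1 = x0;
for k = 2:L
    ii = 2:k;
    x1(:,1,k+1) = x1(:,1,k+1) + ...
        sum(sum(bsxfun(@times, A_k(:,:,2:k), x1(:,1,k+1-ii)), 2), 3);
end

% Reduced formulation: A_k is summed over dimension 2 once, before the loop.
B_k = sum(A_k, 2);             % size 4-by-1-by-L
x2 = x0;
for k = 2:L
    ii = 2:k;
    x2(:,1,k+1) = x2(:,1,k+1) + ...
        sum(bsxfun(@times, B_k(:,1,2:k), x2(:,1,k+1-ii)), 3);
end

% Largest relative difference between the two results.
max_rel_diff = max(abs(x1(:) - x2(:)) ./ abs(x1(:)));
```

The pre-summation is valid because x(:,1,k+1-i) does not vary along the second dimension, so the sum over that dimension acts on A_k alone.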
The second summation could also be removed by transforming times+sum into a matrix-vector product. It needs some singleton fiddling to get the dimensions right, but if you define an auxiliary array C_k that is B_k with the second dimension reversed, you can generate the remaining sum as ~x*C_k with this auxiliary array, give or take a few calls to reshape.
So after a closer look I realized that my original assessment was overly optimistic: you have multiplications in both dimensions in your remaining term, so it's not a simple matrix product. Anyway, we can rewrite that term to be the diagonal of a matrix product. This implies that we're computing a bunch of unnecessary matrix elements, but this still seems to be slightly faster than the bsxfun approach, and we can get rid of your pesky singleton dimension too:
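L = 10000;
x = rand(4,L+1);                    % plain 2-D array, no singleton dimension
A_k = rand(4,4,L);
B_k = squeeze(sum(A_k,2)).';        % L-by-4 after the transpose
tic
for k = 2:L
    ii = 1:k-1;
    % only the diagonal of the 4-by-4 product is needed
    x(:,k+1) = x(:,k+1) + diag(x(:,ii)*B_k(k+1-ii,:));
end
toc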
This runs in ~2.2 seconds on my laptop, somewhat faster than the ~2.5 seconds obtained previously.
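The diagonal of a matrix product is the row-wise sum of an element-wise product, so the unnecessary off-diagonal elements can be skipped. The sketch below is a different formulation from the diag version above, not the one that was timed. The names Bt, x0, xd, xs and max_rel_diff are introduced here for illustration, and L is kept small so the comparison runs quickly. Whether this variant is actually faster at L = 10000 has not been measured.

```matlab
rng(0);                             % fixed seed so both loops see the same data
L   = 50;                           % assumption: small L, just for a quick check
x0  = rand(4,L+1);
A_k = rand(4,4,L);

% diag-of-product version, as in the answer above.
B_k = squeeze(sum(A_k,2)).';        % L-by-4
xd = x0;
for k = 2:L
    ii = 1:k-1;
    xd(:,k+1) = xd(:,k+1) + diag(xd(:,ii)*B_k(k+1-ii,:));
end

% Row-wise dot product: only the diagonal entries are ever computed.
Bt = squeeze(sum(A_k,2));           % 4-by-L, no transpose needed
xs = x0;
for k = 2:L
    % columns k, k-1, ..., 2 of Bt pair with columns 1, 2, ..., k-1 of xs
    xs(:,k+1) = xs(:,k+1) + sum(xs(:,1:k-1) .* Bt(:,k:-1:2), 2);
end

% Largest relative difference between the two results.
max_rel_diff = max(abs(xd(:) - xs(:)) ./ abs(xd(:)));
```

Both operands of the element-wise product have the same size, so this form does not rely on bsxfun or on implicit expansion.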