是否可以加快此MATLAB脚本的速度? [英] Is it possible to speed up this MATLAB script?
问题描述
我遇到了一些性能问题,因此我想加快那些运行缓慢的脚本.但是我对如何加快速度没有更多的想法.因为我发现我经常被索引挡住了.我发现抽象思维对我来说非常困难.
I've encountered some performance problems thus I want to speed up those running-slow scripts. But I have no more ideas on how to speed up them. Because I found I was often blocked with the indices. I found the abstract thinking is very difficult for me.
脚本是
tic,
n = 1000;
d = 500;
X = rand(n, d);
R = rand(n, n);
F = zeros(d, d);
for i=1:n
for j=1:n
F = F + R(i,j)* ((X(i,:)-X(j,:))' * (X(i,:)-X(j,:)));
end
end
toc
推荐答案
讨论&解决方案代码
这里很少建议使用bsxfun
的方法.另外,请继续阅读以了解如何在这样的问题上实现 30x+
加速!
Discussion & Solution Codes
Few approaches with bsxfun
could be suggested here. Also, read on to see how one can get 30x+
speedup on a problem like this!
方法1(天真向量化方法)
为适应X
行之间相减的两个操作,然后适应它们之间的后续逐元素乘法,基于朴素的bsxfun
的方法将导致一个与((X(i,:)-X(j,:))' * (X(i,:)-X(j,:)))
相对应的4D中间数组.之后,需要乘以R
以获得最终输出F
.如下所示实现-
To accommodate the two operations of subtractions between rows of X
and then the subsequent element-wise multiplications between them, a naive bsxfun
based approach would lead to a 4D intermediate array which would correspond to ((X(i,:)-X(j,:))' * (X(i,:)-X(j,:)))
. After that, one needs to multiply R
to have the final output F
. This is implemented as shown next -
v1 = bsxfun(@minus,X,permute(X,[3 2 1]));
v2 = bsxfun(@times,permute(v1,[1 3 2]),permute(v1,[1 3 4 2]));
F = reshape(R(:).'*reshape(v2,[],d^2),d,[]);
方法2(不是天真的矢量化方法)
前面提到的方法进入4D,这可能会减慢速度.因此,您可以通过重塑将中间数据保留到3D.接下来列出-
The earlier mentioned approach goes into 4D which could slow down things. So, instead you can keep the intermediate data until 3D by reshaping. This is listed next -
sub1 = bsxfun(@minus,X,permute(X,[3 2 1]));
sub1_2d = reshape(permute(sub1,[1 3 2]),n^2,[])
mult1 = bsxfun(@times,sub1_2d,permute(sub1_2d,[1 3 2]))
F = reshape(R(:).'*reshape(mult1,[],d^2),d,[])
方法3(混合方法)
现在,您可以基于方法2 (vectorized subtractions
+ loopy multiplications
)进行混合处理.这种方法的好处是它使用fast matrix multiplication
进行乘法运算,并将复杂度从早期的O(n ^ 2)降低到O(n),这应该使其效率更高.感谢@ Dev-iL,提出了这个想法!这是代码-
Now, you can make a hybrid approach based on Approach #2 (vectorized subtractions
+ loopy multiplications
). Benefit of this approach would be that it uses the fast matrix multiplication
to perform the multiplications and reduces the complexity to O(n) from the earlier O(n^2) and this should make it much more efficient. Thanks to @Dev-iL, for suggesting this idea! Here's the code -
sub1 = bsxfun(@minus,X,permute(X,[3 2 1]));
sub1 = bsxfun(@times,sub1,permute(sqrt(R),[1 3 2]));
F = zeros(d);
for k = 1:size(sub1,3)
blk = sub1(:,:,k);
F = F + blk.'*blk;
end
基准化
基准化代码,将原始方法与方法#3
Benchmarking
Benchmarking code comparing the original approach against Approach #3
%// Parameters
n = 500;
d = 250;
X = rand(n, d);
R = rand(n, n);
%// Warm up tic/toc.
for k = 1:100000
tic(); elapsed = toc();
end
disp('------------------------------ With Original Approach')
tic
F1 = zeros(d, d);
for i=1:n
for j=1:n
F1 = F1 + R(i,j)*((X(i,:)-X(j,:))' * (X(i,:)-X(j,:)));
end
end
toc, clear F1 i j
disp('------------------------------ With Proposed Approach #3')
tic
sub1 = bsxfun(@minus,X,permute(X,[3 2 1]));
sub1 = bsxfun(@times,sub1,permute(sqrt(R),[1 3 2]));
F = zeros(d);
for k = 1:size(sub1,3)
blk = sub1(:,:,k);
F = F + blk.'*blk;
end
toc
运行时结果
------------------------------ With Original Approach
Elapsed time is 29.728571 seconds.
------------------------------ With Proposed Approach #3
Elapsed time is 0.839726 seconds.
那么,谁准备好进行 30x + 的提速!?
So, who's ready for a 30x+ speedup!?
这篇关于是否可以加快此MATLAB脚本的速度?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!