是否可以加快此MATLAB脚本的速度? [英] Is it possible to speed up this MATLAB script?

查看:70
本文介绍了是否可以加快此MATLAB脚本的速度?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我遇到了一些性能问题,因此我想加快那些运行缓慢的脚本.但是我对如何加快速度没有更多的想法.因为我发现我经常被索引挡住了.我发现抽象思维对我来说非常困难.

I've encountered some performance problems thus I want to speed up those running-slow scripts. But I have no more ideas on how to speed up them. Because I found I was often blocked with the indices. I found the abstract thinking is very difficult for me.

脚本是

    tic,
    n = 1000;
    d = 500;
    X = rand(n, d);
    R = rand(n, n);
    F = zeros(d, d);
    for i=1:n
        for j=1:n
           F = F + R(i,j)* ((X(i,:)-X(j,:))' * (X(i,:)-X(j,:)));
        end
    end
    toc

推荐答案

讨论&解决方案代码

这里很少建议使用bsxfun的方法.另外,请继续阅读以了解如何在这样的问题上实现 30x+ 加速!

Discussion & Solution Codes

Few approaches with bsxfun could be suggested here. Also, read on to see how one can get 30x+ speedup on a problem like this!

方法1(天真向量化方法)

为适应X行之间相减的两个操作,然后适应它们之间的后续逐元素乘法,基于朴素的bsxfun的方法将导致一个与((X(i,:)-X(j,:))' * (X(i,:)-X(j,:)))相对应的4D中间数组.之后,需要乘以R以获得最终输出F.如下所示实现-

To accommodate the two operations of subtractions between rows of X and then the subsequent element-wise multiplications between them, a naive bsxfun based approach would lead to a 4D intermediate array which would correspond to ((X(i,:)-X(j,:))' * (X(i,:)-X(j,:))). After that, one needs to multiply R to have the final output F. This is implemented as shown next -

v1 = bsxfun(@minus,X,permute(X,[3 2 1]));
v2 = bsxfun(@times,permute(v1,[1 3 2]),permute(v1,[1 3 4 2]));
F = reshape(R(:).'*reshape(v2,[],d^2),d,[]);

方法2(不是天真的矢量化方法)

前面提到的方法进入4D,这可能会减慢速度.因此,您可以通过重塑将中间数据保留到3D.接下来列出-

The earlier mentioned approach goes into 4D which could slow down things. So, instead you can keep the intermediate data until 3D by reshaping. This is listed next -

sub1 = bsxfun(@minus,X,permute(X,[3 2 1]));
sub1_2d = reshape(permute(sub1,[1 3 2]),n^2,[])
mult1 = bsxfun(@times,sub1_2d,permute(sub1_2d,[1 3 2]))
F = reshape(R(:).'*reshape(mult1,[],d^2),d,[])

方法3(混合方法)

现在,您可以基于方法2 (vectorized subtractions + loopy multiplications)进行混合处理.这种方法的好处是它使用fast matrix multiplication进行乘法运算,并将复杂度从早期的O(n ^ 2)降低到O(n),这应该使其效率更高.感谢@ Dev-iL,提出了这个想法!这是代码-

Now, you can make a hybrid approach based on Approach #2 (vectorized subtractions + loopy multiplications). Benefit of this approach would be that it uses the fast matrix multiplication to perform the multiplications and reduces the complexity to O(n) from the earlier O(n^2) and this should make it much more efficient. Thanks to @Dev-iL, for suggesting this idea! Here's the code -

sub1 = bsxfun(@minus,X,permute(X,[3 2 1]));
sub1 = bsxfun(@times,sub1,permute(sqrt(R),[1 3 2]));

F = zeros(d);
for k = 1:size(sub1,3)
    blk = sub1(:,:,k);    
    F = F + blk.'*blk;
end


基准化

基准化代码,将原始方法与方法#3


Benchmarking

Benchmarking code comparing the original approach against Approach #3

%// Parameters
n = 500;
d = 250;
X = rand(n, d);
R = rand(n, n);

%// Warm up tic/toc.
for k = 1:100000
    tic(); elapsed = toc();
end

disp('------------------------------ With Original Approach')
tic
F1 = zeros(d, d);
for i=1:n
    for j=1:n
        F1 = F1 + R(i,j)*((X(i,:)-X(j,:))' * (X(i,:)-X(j,:)));
    end
end
toc, clear F1 i j

disp('------------------------------ With Proposed Approach #3')
tic
sub1 = bsxfun(@minus,X,permute(X,[3 2 1]));
sub1 = bsxfun(@times,sub1,permute(sqrt(R),[1 3 2]));

F = zeros(d);
for k = 1:size(sub1,3)
    blk = sub1(:,:,k);    
    F = F + blk.'*blk;
end
toc

运行时结果

------------------------------ With Original Approach
Elapsed time is 29.728571 seconds.
------------------------------ With Proposed Approach #3
Elapsed time is 0.839726 seconds.

那么,谁准备好进行 30x + 的提速!?

So, who's ready for a 30x+ speedup!?

这篇关于是否可以加快此MATLAB脚本的速度?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆