Making Matlab code involving loops over big dimensions faster

Problem description

I have this piece of Matlab code that I want to make more efficient if possible. In particular, I want to speed up the two bits (called BIT 1 and BIT 2) inside the loop over n, which may needlessly take a lot of time when n_m and n_w are large.

clear
N=[3 4; 100 200; 300 400; 2000 2000; 100000 100000];
output1=cell(size(N,1),1);
output2=cell(size(N,1),1);
for n=1:size(N,1)
    n_m=N(n,1);
    n_w=N(n,2);
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%      
    %BIT 1
    temp=zeros(n_w+1, n_m);
    for i=1:n_m
        temp(:,i)=(i:n_m:n_m*n_w+i).'; 
    end
    output1{n}=temp(:).'; %1x(n_m*(n_w+1))
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    %BIT 2
    temp=zeros(n_m+1,n_w);
    for j=1:n_w
        temp(:,j)=[(j-1)*n_m+1:j*n_m n_m*n_w+n_m+j].';
    end
    output2{n}=temp(:).'; %1x(n_w*(n_m+1))
end

Do you have suggestions for something faster?

A brief explanation of BIT 1: for a given n_m and n_w, BIT 1 creates a row vector of dimension 1x(n_m*(n_w+1)) which can be split into n_m sub-rows, each of dimension 1x(n_w+1). Sub-row i contains the integers (i:n_m:n_m*n_w+i).
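As a small worked example of this description, the sketch below builds BIT 1 for n_m=3 and n_w=4 both with the loop from the question and without a loop. The loop-free variant uses bsxfun, which is an assumption made here so that the sketch also runs on releases before R2016b; it is not part of the original question, and the names out_loop and out_vec are illustrative.

```matlab
n_m = 3; n_w = 4;

% Loop version from the question
temp = zeros(n_w+1, n_m);
for i = 1:n_m
    temp(:,i) = (i:n_m:n_m*n_w+i).';
end
out_loop = temp(:).';

% Loop-free sketch: entry (k,i) of the matrix equals i + (k-1)*n_m
temp_vec = bsxfun(@plus, n_m*(0:n_w).', 1:n_m);   % (n_w+1) x n_m
out_vec = temp_vec(:).';

disp(out_vec)                     % 1 4 7 10 13 2 5 8 11 14 3 6 9 12 15
disp(isequal(out_loop, out_vec))  % 1
```

The three sub-rows are 1:3:13, 2:3:14 and 3:3:15, each of length n_w+1 = 5.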

A brief explanation of BIT 2: for a given n_m and n_w, BIT 2 creates a row vector of dimension 1x(n_w*(n_m+1)) which can be split into n_w sub-rows, each of dimension 1x(n_m+1). Sub-row j contains the integers [(j-1)*n_m+1:j*n_m, n_m*n_w+n_m+j].
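The same kind of worked example for BIT 2, again with n_m=3 and n_w=4. This is only a sketch: the loop-free variant stacks a reshape of 1:n_m*n_w on top of the row of trailing values, and out_loop and out_vec are illustrative names.

```matlab
n_m = 3; n_w = 4;

% Loop version from the question
temp = zeros(n_m+1, n_w);
for j = 1:n_w
    temp(:,j) = [(j-1)*n_m+1:j*n_m n_m*n_w+n_m+j].';
end
out_loop = temp(:).';

% Loop-free sketch: the first n_m rows hold 1..n_m*n_w in column-major
% order, the last row holds the trailing values n_m*n_w+n_m+(1:n_w)
temp_vec = [reshape(1:n_m*n_w, n_m, n_w); n_m*n_w + n_m + (1:n_w)];
out_vec = temp_vec(:).';

disp(out_vec)                     % 1 2 3 16 4 5 6 17 7 8 9 18 10 11 12 19
disp(isequal(out_loop, out_vec))  % 1
```

Each sub-row is a run of n_m consecutive integers followed by one value from 16 to 19.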

Here I compare the loop version with a reshape-based version: reshape does not help.

clear
N=[3 5; 100 200; 300 400; 2000 2000; 5000 5000; 10000 10000; 20000 20000];
output1=cell(size(N,1),1);
output2=cell(size(N,1),1);
output3=cell(size(N,1),1); %alternative of output1 with reshape
output4=cell(size(N,1),1); %alternative of output2 with reshape
time=zeros(size(N,1),4);
for n=1:size(N,1)
    n_m=N(n,1);
    n_w=N(n,2);
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
    %BIT 1 with loop  
    tic
    temp=zeros(n_w+1, n_m);
    for i=1:n_m
        temp(:,i)=(i:n_m:n_m*n_w+i).'; 
    end
    output1{n}=temp(:).'; 
    time(n,1)=toc;
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%    
    %BIT 1 with reshape
    tic
    tempor=reshape(1:1:n_m*(n_w+1), n_m, n_w+1);
    temp1=tempor.'; 
    output3{n}=temp1(:).';
    time(n,3)=toc;
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    %BIT 2 with loop
    tic
    temp=zeros(n_m+1,n_w);
    for j=1:n_w
        temp(:,j)=[(j-1)*n_m+1:j*n_m n_m*n_w+n_m+j].';
    end
    output2{n}=temp(:).'; 
    time(n,2)=toc;
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
    %BIT 2 with reshape   
    tic
    temp1=tempor(:,1:end-1);
    temp2=n_m*n_w+n_m+1:n_m*n_w+n_m+n_w;
    temp=[temp1; temp2];
    output4{n}=temp(:).';
    time(n,4)=toc;
end

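I get (times in seconds; columns: BIT 1 with loop, BIT 2 with loop, BIT 1 with reshape, BIT 2 with reshape):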

time=
0.0003    0.0006    0.0001    0.0001
0.0005    0.0005    0.0003    0.0002
0.0021    0.0011    0.0029    0.0006
0.0159    0.0189    0.0230    0.0189
0.0915    0.1068    0.1503    0.1260
0.3015    0.3757    0.6035    0.5211
1.1501    1.3801    2.4459    2.0828

(The third and fourth columns are slower. I have tried to go above 20000, but the reshape version runs forever.)
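One possible reason for this behaviour, offered here as an assumption that the question does not verify, is memory pressure: every intermediate array is a full double matrix with n_m*(n_w+1) elements, and the reshape version keeps tempor, its transpose and the output alive at the same time. The sketch below only estimates the size of one such array, at 8 bytes per double; the variable names are illustrative.

```matlab
N = [2000 2000; 20000 20000; 100000 100000];
bytes_per_double = 8;

gib = zeros(size(N,1), 1);
for n = 1:size(N,1)
    n_m = N(n,1);
    n_w = N(n,2);
    % size of one n_m x (n_w+1) double array in GiB
    gib(n) = n_m*(n_w+1)*bytes_per_double / 2^30;
    fprintf('%6d x %6d: %.2f GiB per array\n', n_m, n_w, gib(n));
end
```

For 20000 x 20000 this is about 3 GiB per array, and for 100000 x 100000 about 75 GiB, so at those sizes the machine's available RAM rather than the loop itself may dominate the run time.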

Recommended answer

I use repmat, which is better than for loops for smaller matrices.

function testf(k, N)


n_m=N(1);
n_w=N(2);

switch k
    case 1
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%      
        %BIT 1
        tempA = ones(n_w+1,1) + (0:n_w).'*n_m;
        tempB = repmat( 0:(n_m-1), n_w+1, 1);
        tempC = tempB(:) + repmat(tempA, n_m, 1);
        output1=tempC(:).'; %1x(n_m*(n_w+1))
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        %BIT 2
        tempC = zeros(n_m+1,n_w);
        tempA = repmat((1:n_m).', 1,n_w);
        tempB = repmat( 0:(n_w-1), n_m, 1)*(n_m);
        tempC(1:end-1, :) = tempA + tempB;
        tempC(end, :) = (1:n_w) + (n_w+1)*n_m;
        output2=tempC(:).'; %1x(n_w*(n_m+1))
    case 2
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%      
        %BIT 1
        temp=zeros(n_w+1, n_m);
        for i=1:n_m
            temp(:,i)=(i:n_m:n_m*n_w+i).'; 
        end
        output1=temp(:).'; %1x(n_m*(n_w+1))
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        %BIT 2
        temp=zeros(n_m+1,n_w);
        for j=1:n_w
            temp(:,j)=[(j-1)*n_m+1:j*n_m n_m*n_w+n_m+j].';
        end
        output2=temp(:).'; %1x(n_w*(n_m+1))
end

end

Test code

figure
N = [100,150; 150,180; 200,250; 250,300; 300,350; 400,500; 450, 550];
T = zeros(size(N,1),size(N,2),10);
for mm = 1:10
    for nn = 1:size(N,1)
        T(nn,:,mm) = [timeit(@() testf(1, N(nn,:))), timeit(@() testf(2, N(nn,:)))];
    end
end
T = mean(T,3);
plot(T)

Running on Matlab R2015b.

I noticed that even timeit does not give a stable run time, so I added a for loop to run timeit several times and average the results.
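A minimal sketch of that repeat-and-aggregate idea on its own. The workload f is a hypothetical stand-in for testf, and reporting the median next to the mean is a suggestion made here, not something the answer does.

```matlab
f = @() sum(sum(rand(500)));   % hypothetical workload standing in for testf
n_rep = 5;                     % number of timeit repetitions

t = zeros(n_rep, 1);
for r = 1:n_rep
    t(r) = timeit(f);          % timeit already runs f several times internally
end

fprintf('mean %.3g s, median %.3g s, min %.3g s, max %.3g s\n', ...
    mean(t), median(t), min(t), max(t));
```

The spread between min and max gives a rough idea of how noisy the measurement is on a given machine.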

Reply to comments.

Interesting! Is ones(n_w+1,1) + (0:n_w).'*n_m really faster than (1:n_m:n_m*n_w+1).'? Also, repmat( 0:(n_w-1), n_m, 1)*(n_m) should be slower than repmat( 0:(n_w-1)*n_m, n_m, 1) just because there are many more multiplications done. – Cris Luengo
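The first question in this comment can be checked directly. The sketch below confirms that the two expressions produce the same column vector and times each one with timeit; the sizes are arbitrary choices made here, and the timings will differ between machines and releases.

```matlab
n_m = 250; n_w = 300000;                   % arbitrary sizes for illustration

a = ones(n_w+1,1) + (0:n_w).'*n_m;         % expression used in the answer
b = (1:n_m:n_m*n_w+1).';                   % expression proposed in the comment
same = isequal(a, b);                      % both are 1, 1+n_m, ..., 1+n_m*n_w

t_a = timeit(@() ones(n_w+1,1) + (0:n_w).'*n_m);
t_b = timeit(@() (1:n_m:n_m*n_w+1).');
fprintf('equal: %d, answer: %.3g s, comment: %.3g s\n', same, t_a, t_b);
```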

First question: yes. I commented out everything after the first tempA in my method, and everything after the first for loop in the OP's code. The result is below.

But the comparison is a bit unfair, because there is only one line in the for loop while my method has three lines. Anyway, my original motivation was to save time by not generating the vectors one at a time in a for loop: I can generate a bunch of vectors at once.

For the multiplications, I have compared the two strategies. Surprisingly, for small matrices like 250x300, there is barely any difference between the two. For bigger matrices, the time saved on multiplications is far less than the cost of just storing the results, so the timing plot doesn't really change.
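The two multiplication strategies can be compared with a sketch like the following. One detail worth noting: the colon operator has lower precedence than multiplication in Matlab, so the multiply-first variant needs the extra parentheses (0:(n_w-1))*n_m; written as 0:(n_w-1)*n_m it would be a unit-step range ending at (n_w-1)*n_m. The variable names are illustrative.

```matlab
n_m = 250; n_w = 300;                      % the "small matrix" size from the text

c1 = repmat(0:(n_w-1), n_m, 1)*n_m;        % replicate first, then multiply
c2 = repmat((0:(n_w-1))*n_m, n_m, 1);      % multiply first, then replicate
same = isequal(c1, c2);

% Without the extra parentheses the colon is applied last, giving a
% much longer unit-step vector than the n_w offsets that are wanted.
len_unparenthesized = numel(0:(n_w-1)*n_m);   % (n_w-1)*n_m + 1 elements

t1 = timeit(@() repmat(0:(n_w-1), n_m, 1)*n_m);
t2 = timeit(@() repmat((0:(n_w-1))*n_m, n_m, 1));
fprintf('equal: %d, replicate-then-multiply: %.3g s, multiply-then-replicate: %.3g s\n', ...
    same, t1, t2);
```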

I really care about large N (above 500), and your answer seems to suggest that there is nothing better than looping? – user3285148

This is the challenging part. If you really care about the speed of a piece of Matlab code... well, here is what I can think of. The idea is that the block-wise approach is faster than a for loop only when the block is small enough. So you could chop the big matrix into smaller chunks and apply the repmat style to each small piece. Obviously you will need a for loop to stitch all the pieces together, but my bet is that this would be faster. You will also have to consider how to trim the big matrix to the actual size efficiently: say you have a 1234x5678 matrix and your automated code makes blocks of 100x100.
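A rough sketch of what such a block-wise scheme could look like for BIT 1. It is an illustration under assumptions, not the answer's code: the blocks here are groups of B whole columns rather than 100x100 tiles, bsxfun stands in for the repmat style, and the block width B is a hypothetical tuning parameter. The trimming problem is handled by letting the last block be narrower.

```matlab
n_m = 1234; n_w = 5678;      % sizes borrowed from the example in the text
B = 100;                     % hypothetical block width (columns per block)

temp = zeros(n_w+1, n_m);
base = n_m*(0:n_w).';        % row offsets shared by every block
for first = 1:B:n_m
    last = min(first+B-1, n_m);    % the final block may have fewer than B columns
    % fill columns first..last at once: entry (k,i) = i + (k-1)*n_m
    temp(:, first:last) = bsxfun(@plus, base, first:last);
end
output1 = temp(:).';         % 1x(n_m*(n_w+1))
```

Whether this beats the plain loop depends on B and on the machine, so it would need to be timed in the same way as the other variants.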

One way to do it could be like this:

    case 3
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%      
        %BIT 1
        temp=zeros(n_w+1, n_m);
        vec = (1:n_m:(n_m*n_w+1)).';
        for ii=1:n_m
            temp(:,ii) = vec;
            vec = vec + 1;
        end
        output1=temp(:).'; %1x(n_m*(n_w+1))
        %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
        %BIT 2
        temp=zeros(n_m+1,n_w);
        vec = (1:n_m).';
        for jj = 1:n_w
            temp(1:end-1,jj) = vec;
            vec = vec + n_m;
        end
        temp(end,:) = n_m*n_w+n_m + (1:n_w);
        output2=temp(:).'; %1x(n_w*(n_m+1))

The test code looks like this:

figure
N = [100,150; 150,180; 200,250; 250,300; 300,350; 400,500; 450, 550;
    550,650; 700,800; 800,1000];
T = zeros(size(N,1),3,10);
for mm = 1:10
    for nn = 1:size(N,1)
        T(nn,:,mm) = [timeit(@() testf(1, N(nn,:))), ...
            timeit(@() testf(2, N(nn,:))), ...
            timeit(@() testf(3, N(nn,:)))];
    end
end
T = mean(T,3);
plot(T)

The resulting timing plot shows a time saving of about 20%.
