MATLAB fast (componentwise) vector operations are... really fast


Question

I have been writing MATLAB scripts for some time and I still do not understand how it works "under the hood". Consider the following script, which does some computation on (big) vectors in three different ways:

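  1. MATLAB vector operations;
  2. a simple "for" loop that does the same computation component by component;
  3. an optimized loop that is supposed to be faster than 2, since it avoids some allocations and some assignments.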

Here is the code:

N = 10000000;

A = linspace(0,100,N);
B = linspace(-100,100,N);
C = linspace(0,200,N);
D = linspace(100,200,N);

% 1. MATLAB Operations
tic
C_ = C./A;
D_ = D./B;

G_ = (A+B)/2;
H_ = (C_+D_)/2;
I_ = (C_.^2+D_.^2)/2;

X = G_ .* H_;
Y = G_ .* H_.^2 + I_;
toc
tic
X;
Y;
toc

% 2. Simple cycle
tic
C_ = zeros(1,N);
D_ = zeros(1,N);
G_ = zeros(1,N);
H_ = zeros(1,N);
I_ = zeros(1,N);
X = zeros(1,N);
Y = zeros(1,N);
for i = 1:N,
  C_(i) = C(i)/A(i);
  D_(i) = D(i)/B(i);

  G_(i) = (A(i)+B(i))/2;
  H_(i) = (C_(i)+D_(i))/2;
  I_(i) = (C_(i)^2+D_(i)^2)/2;

  X(i) = G_(i) * H_(i);
  Y(i) = G_(i) * H_(i)^2 + I_(i);
end
toc
tic
X;
Y;
toc

% 3. Optimized cycle
tic
X = zeros(1,N);
Y = zeros(1,N);
for i = 1:N,
  X(i) = (A(i)+B(i))/2 * (( C(i)/A(i) + D(i)/B(i) ) /2);
  Y(i) = (A(i)+B(i))/2 * (( C(i)/A(i) + D(i)/B(i) ) /2)^2 +  ( (C(i)/A(i))^2 + (D(i)/B(i))^2 ) / 2;
end
toc
tic
X;
Y;
toc

I know that one should always try to vectorize computations, since MATLAB is built around matrices/vectors (although, nowadays, vectorizing is not always the best choice), so I expect that something like:

C = A .* B;

is faster than:

for i = 1:N
  C(i) = A(i) * B(i);
end

What I did not expect is that it is actually faster even in the script above, even though the second and third methods go through only one loop, whereas the first method performs many vector operations (each of which is, in theory, a "for" loop of its own). This forces me to conclude that MATLAB has some magic that allows (for example):

C = A .* B;
D = C .* C;

to run faster than a single "for" loop with some operations inside it.

So:

  1. What is the magic that lets the first part execute so fast?
  2. When you write "D = A .* B", does MATLAB actually do a componentwise computation with a "for" loop, or does it simply keep track of the fact that D contains some product of "bla" and "bla"?
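
The two-statement example from the question can be tried in isolation with a sketch like the one below. It is only a rough illustration: the size `N`, the scalar temporary `c`, and the variable names `D1`, `D2`, `tVec`, and `tLoop` are choices made here, not part of the original script, and single `tic`/`toc` measurements are noisy.

```matlab
% Sketch: two chained vector operations versus one fused loop.
N = 1e6;                         % assumed size, smaller than the question's N
A = linspace(0, 100, N);
B = linspace(-100, 100, N);

% Vectorized: two full passes over the data, with a temporary vector C.
tic
C  = A .* B;
D1 = C .* C;
tVec = toc;

% Loop: a single pass, with a scalar temporary instead of a full vector.
D2 = zeros(1, N);                % preallocate the result
tic
for i = 1:N
  c = A(i) * B(i);
  D2(i) = c * c;
end
tLoop = toc;

% Both variants should agree up to floating-point rounding.
assert(max(abs(D1 - D2)) <= 1e-9 * max(abs(D1)));
fprintf('vectorized: %.4f s, loop: %.4f s\n', tVec, tLoop);
```

Which variant wins, and by how much, depends on the MATLAB release and the machine, so the printed times are something to measure rather than assume.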

EDIT

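  1. Suppose I want to implement the same computation in C++ (maybe using some library). Will the first MATLAB method be faster even than the third method implemented in C++? (I'll answer this question myself, just give me some time.)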

EDIT 2

As requested, here are the experiment runtimes (in seconds, as reported by toc):

Part 1: 0.237143

Part 2: 4.440132, of which 0.195154 for allocation

Part 3: 2.280640, of which 0.057500 for allocation

And without JIT:

Part 1: 0.337259

Part 2: 149.602017, of which 0.033886 for allocation

Part 3: 82.167713, of which 0.010852 for allocation
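
To put the reported figures side by side, the short sketch below computes how much slower each loop is than the vectorized part, and how much each part slows down without JIT. The numbers are copied from the runtimes above; the variable names are illustrative only.

```matlab
% Runtimes in seconds, copied from the figures reported above.
withJit    = [0.237143,   4.440132,  2.280640];   % Part 1, Part 2, Part 3
withoutJit = [0.337259, 149.602017, 82.167713];   % Part 1, Part 2, Part 3

% Slowdown of each loop (Parts 2 and 3) relative to the vectorized Part 1.
slowdownWithJit    = withJit(2:3)    / withJit(1);
slowdownWithoutJit = withoutJit(2:3) / withoutJit(1);

% Factor by which each part slows down when JIT is switched off.
jitFactor = withoutJit ./ withJit;

fprintf('loops vs. Part 1, with JIT:    %.1fx  %.1fx\n', slowdownWithJit);
fprintf('loops vs. Part 1, without JIT: %.1fx  %.1fx\n', slowdownWithoutJit);
fprintf('slowdown without JIT, Parts 1 to 3: %.1fx  %.1fx  %.1fx\n', jitFactor);
```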

Recommended answer

The first one is the fastest because vectorized code can easily be translated into a small number of calls to optimized C++ libraries. MATLAB could also optimize it at a higher level, for example by replacing G*H+I with an optimized mul_add(G,H,I) instead of add(mul(G,H),I) in its core.
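
One way to see the "small number of library calls" view is that MATLAB's operators have documented function forms: `.*` corresponds to `times` and `+` to `plus`. The sketch below, with small made-up inputs, writes the same expression both ways. Note that `mul_add` in the answer is a hypothetical fused routine used for illustration, not a MATLAB function, and whether MATLAB fuses such operations internally is not something this sketch can show.

```matlab
% Sketch: an elementwise expression written as explicit whole-array calls.
G = linspace(0, 1, 5);           % small illustrative inputs
H = linspace(1, 2, 5);
I = linspace(2, 3, 5);

Y1 = G .* H + I;                 % operator form
Y2 = plus(times(G, H), I);       % function form: one call per operator

% Both forms should agree up to floating-point rounding.
assert(max(abs(Y1 - Y2)) < 1e-12);
disp(Y1);
```

Each call processes the whole array inside compiled code, so the interpreter's per-statement overhead is paid once per operator rather than once per element.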

The second one can't easily be converted to C++ calls. It has to be interpreted or compiled. The most modern approach for scripting languages is JIT compilation. The MATLAB JIT compiler is not very good, although it doesn't have to be that way; I don't know why MathWorks doesn't improve it. As a result, #2 runs so slowly that #1 is faster even though it performs more "mathematical" operations.

The Julia language was invented as a compromise between MATLAB's expressiveness and C++ speed. The same non-vectorized code (julia vs matlab) runs very fast there because its JIT compilation is very good.
