了解如何计算FLOP [英] Understanding how to count FLOPs

查看:602
本文介绍了了解如何计算FLOP的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我很难掌握如何计算FLOP。有一刻我想我明白了,接下来对我来说就没有意义了。一些帮助解释这将不胜感激。我已经看过关于这个主题的所有其他帖子,没有一个完全用我熟悉的编程语言(我知道一些MATLAB和FORTRAN)来解释。

这是一个例如,从我的一本书中,我正在尝试做什么。

对于下面的代码片段,总的触发器可以写成<$ c (n *(n-1)/ 2)+(n *(n + 1)/ 2)相当于 n ^ 2 + O(n )

  [m,n] =大小(A)
nb = n +1;
Aug = [A b];
x =零(n,1);
x(n)= Aug(n,nb)/ Aug(n,n);
(i = n-1):-1:1
x(i)=(Aug(i,nb)-Aug(i,i + 1:n)* x(i + 1:n) )/ 8(I,I);
end

我试图使用上面的相同原理来查找FLOP的总数作为以下代码(MATLAB)中方程数 n 的函数。

 n =长度(f); 

%远期消除
(对于k = 2):n
因子= e(k)/ f(k-1);
f(k)= f(k) - factor * g(k-1); (k)= r(k) - 因子* r(k-1);
结束

返回替换
x(n)= r(n)/ f(n); (k)= g(k)* x(k + 1))/ f(k);对于k = n-1:
end


解决方案

MATLAB的专家,但我会去。



我注意到你的代码索引范围没有你的向量。好,那意味着我看到的每一个操作都涉及到一对数字。所以我认为第一个循环是每次迭代5个FLOPS,第二个循环是每个迭代3个。然后中间就有一个单一的操作。

但是,MATLAB将所有的东西都默认保存为double。所以循环变量k本身是在每个循环上运行一次,然后每次从中计算一个索引。所以这是第一个循环的额外4和第二个2。但是等待 - 第一个循环有两次k-1,所以理论上可以通过计算和存储来优化一点,通过减少FLOP的数量每次迭代一次。 MATLAB解释器可能能够发现自己的那种优化。而且我知道可以证明k可以是一个整数,一切都还可以。

所以你的问题的答案是,这取决于。你想知道CPU的FLOP数量,或者你的代码中表达的最小数量(也就是你自己的矢量上的操作数量),还是只有在没有优化的情况下,MATLAB才能执行的FLOP的严格数目? MATLAB曾经有一个flops()函数来计算这种事情,但现在不存在了。我并不是MATLAB方面的专家,但我怀疑flops()已经走了,因为解释器变得太聪明了,而且做了很多的优化。

我有点好奇,想知道你为什么想知道。我曾经使用flops()来计算一块数学做了多少操作,作为一个粗略的方法来估计我需要使用C语言实时工作多少计算。



现在我看看原语本身(例如,根据库数据表,有一个1k复数的FFT,在CPU上是7us,有一个2k的矢量乘,这将是2.5us等等)。这有点棘手,因为必须考虑缓存速度,数据集大小等。数学库(例如fftw)本身是非常不透明的,这是所有人都可以做到的。



所以,如果你为此计算FLOP,你可能得不到一个很好的答案。

I am having a hard time grasping how to count FLOPs. One moment I think I get it, and the next it makes no sense to me. Some help explaining this would greatly be appreciated. I have looked at all other posts about this topic and none have completely explained in a programming language I am familiar with (I know some MATLAB and FORTRAN).

Here is an example, from one of my books, of what I am trying to do.

For the following piece of code, the total number of flops can be written as (n*(n-1)/2)+(n*(n+1)/2) which is equivalent to n^2 + O(n).

[m,n]=size(A)
nb=n+1;
Aug=[A b];
x=zeros(n,1);
x(n)=Aug(n,nb)/Aug(n,n);
for i=n-1:-1:1
    x(i) = (Aug(i,nb)-Aug(i,i+1:n)*x(i+1:n))/Aug(i,i);
end

I am trying to apply the same principle above to find the total number of FLOPs as a function of the number of equations n in the following code (MATLAB).

% e = subdiagonal vector
% f = diagonal vector
% g = superdiagonal vector
% r = right hand side vector
% x = solution vector

n=length(f);

% forward elimination
for k = 2:n
    factor = e(k)/f(k­‐1);
    f(k) = f(k) – factor*g(k‐1);
    r(k) = r(k) – factor*r(k‐1);
end

% back substitution
x(n) = r(n)/f(n);
for k = n‐1:­‐1:1
    x(k) = (r(k)‐g(k)*x(k+1))/f(k);
end

解决方案

I'm by no means expert at MATLAB but I'll have a go.

I notice that none of the lines of your code index ranges of your vectors. Good, that means that every operation I see before me is involving a single pair of numbers. So I think the first loop is 5 FLOPS per iteration, and the second is 3 per iteration. And then there's that single operation in the middle.

However, MATLAB stores everything by default as a double. So the loop variable k is itself being operated on once per loop and then every time an index is calculated from it. So that's an extra 4 for the first loop and 2 for the second.

But wait - the first loop has 'k-1' twice, so in theory one could optimise that a bit by calculating and storing that, reducing the number of FLOPs by one per iteration. The MATLAB interpreter is probably able to spot that sort of optimisation for itself. And for all I know it can work out that k could in fact be an integer and everything is still okay.

So the answer to your question is that it depends. Do you want to know the number of FLOPs the CPU does, or the minimum number expressed in your code (ie the number of operations on your vectors alone), or the strict number of FLOPs that MATLAB would perform if it did no optimisation at all? MATLAB used to have a flops() function to count this sort of thing, but it's not there anymore. I'm not an expert in MATLAB by any means, but I suspect that flops() has gone because the interpreter has gotten too clever and does a lot of optimisation.

I'm slightly curious to know why you wish to know. I used to use flops() to count how many operations a piece of maths did as a crude way of estimating how much computing grunt I'd need to make it work in real time written in C.

Nowadays I look at the primitives themselves (eg there's a 1k complex FFT, that'll be 7us on that CPU according to the library datasheet, there's a 2k vector multiply, that'll be 2.5us, etc). It gets a bit tricky because one has to consider cache speeds, data set sizes, etc. The maths libraries (eg fftw) themselves are effectively opaque so that's all one can do.

So if you're counting the FLOPs for that reason you'll probably not get a very good answer.

这篇关于了解如何计算FLOP的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆