两个环体或一(结果相同) [英] Two loop bodies or one (result identical)

查看:117
本文介绍了两个环体或一(结果相同)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直想知道是关于更好地利用CPU缓存(这是众所周知的,从引用的局部性受益)更有效 - 两个回路的每个迭代在同一数学组数字,每一个不同的循环体,或具有一个循环,会连接这两个机构成一体,从而完成相同总结果,但都在本身?

在我看来,有两个循环,因为更多的指令和数据所使用的循环适合在高速缓存会带来更少的缓存未命中和驱逐。我说得对不对?

假设:


  1. 成本˚F先按g 相比,在完成包含每个
  2. ˚F先按g 最常使用的每个高速缓存的本身,所以缓存将由一个失效相继被调用(这将是一个单环体版本的情况下)

  3. 英特尔酷睿双核CPU

  4. C语言源程序code

  5. GCC 编译器,没有开关

的集合被迭代是一个数学集,而不是数字的在像载体或列表存储器的容器。请参见下面的例子。

请没有的答案,premature优化是邪恶的字符: - )

的双回路的版本,我主张的一个例子:

  INT J = 0,K = 0;的for(int i = 0; I< 1000000;我++)
{
    J + = F(ⅰ);
}的for(int i = 0; I< 1000000;我++)
{
    K + =克(ⅰ);
}


解决方案

我可以看到三个变量(甚至在code的一个看似简单的块):


  • 做什么 F()克()吗?其中一人可以全部无效指令高速缓存行(有效推动其他一出)的?可以在发生在二级指令缓存太(不太可能)?然后只保持其中的一个中可能是有益的。 注意:逆并不意味着有一个循环,这是因为:

  • F()克()对大量的数据进行操作,根据 I ?然后,它会是不错的知道,如果他们经营的相同的数据集 - 再次,你必须考虑是否通过高速缓存未命中的两套不同的螺丝您操作

  • 如果 F()克()确实是因为你先状态原始,和我无论是在code尺寸的假设以及运行时间和code的复杂性,缓存局域性问题不会在code这样的小块出现 - 你最关心的是,如果一些其他进程用计划实际的工作要做,而无效所有缓存,直到它是你的进程转向运行。

最后一个想法:既然像上面这样的过程可能是在你的系统中很少出现(我用罕见相当宽松),你可以考虑让你的两个内嵌的功能,让编译器展开循环。这是因为指令缓存,错误回L2也没什么大不了的,而且那将会包含一个高速缓存行 I,J,K 将失效的概率在循环看起来不那么可怕。但是,如果不是这种情况,一些细节将是有益的。

I have long wondered what is more efficient with regards to making better use of CPU caches (which are known to benefit from locality of reference) - two loops each iterating over the same mathematical set of numbers, each with a different loop body, or having one loop that "concatenates" the two bodies into one, and thus accomplishes identical total result, but all in itself?

In my opinion, having two loops would introduce fewer cache misses and evictions because more instructions and data used by the loop fit in the cache. Am I right?

Assuming:

  1. Cost of f and g each is negligible compared to cost of completing the entire loop containing each
  2. f and g use most of the cache each by itself, and so the cache will be invalidated by one being called after another (which would be the case with a single-loop-body version)
  3. Intel Core Duo CPU
  4. C language source code
  5. gcc compiler, no switches

The set being iterated over is a mathematical set, not a container of numbers in memory like a vector or a list. See the example below.

Please no answers of the "premature optimization is evil" character :-)

An example of the two-loops version that I am advocating for:

int j = 0, k = 0;

for(int i = 0; i < 1000000; i++)
{
    j += f(i);
}

for(int i = 0; i < 1000000; i++)
{
    k += g(i);
}

解决方案

I can see three variables (even in a seemingly simple chunk of code):

  • What do f() and g() do? Can one of them invalidate all of the instruction cache lines (effectively pushing the other one out)? Can that happen in L2 instruction cache too (unlikely)? Then keeping only one of them in it might be beneficial. Note: The inverse does not imply "have a single loop", because:
  • Do f() and g() operate on large amounts of data, according to i? Then, it'd be nice to know if they operate on the same set of data - again you have to consider whether operating on two different sets screws you up via cache misses.
  • If f() and g() are indeed that primitive as you first state, and I'm assuming both in code size as well as running time and code complexity, cache locality issues won't arise in little chunks of code like this - your biggest concern would be if some other process were scheduled with actual work to do, and invalidated all the caches until it were your process' turn to run.

A final thought: given that such processes like above might be a rare occurrence in your system (and I'm using "rare" quite liberally), you could consider making both your functions inline, and let the compiler unroll the loop. That is because for the instruction cache, faulting back to L2 is no big deal, and the probability that the single cache line that'd contain i, j, k would be invalidated in that loop doesn't look so horrible. However, if that's not the case, some more details would be useful.

这篇关于两个环体或一(结果相同)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆