When, if ever, is loop unrolling still useful?

Question

I've been trying to optimize some extremely performance-critical code (a quicksort that is called millions and millions of times inside a Monte Carlo simulation) by unrolling loops. Here's the inner loop I'm trying to speed up:

// Search for elements to swap.
while(myArray[++index1] < pivot) {}
while(pivot < myArray[--index2]) {}

I tried unrolling it to something like this:

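// Each exit test is the negation of the original loop condition,
// so the scan stops at the same element as the simple while loop.
while(true) {
    if(!(myArray[++index1] < pivot)) break;
    if(!(myArray[++index1] < pivot)) break;
    // More unrolling
}

while(true) {
    if(!(pivot < myArray[--index2])) break;
    if(!(pivot < myArray[--index2])) break;
    // More unrolling
}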

This made absolutely no difference, so I changed it back to the more readable form. I've had similar experiences other times I've tried loop unrolling. Given the quality of branch predictors on modern hardware, when, if ever, is loop unrolling still a useful optimization?
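As a way to check that an unrolled scan behaves like the simple loop, here is a minimal C sketch. The function names `scan_simple` and `scan_unrolled` are hypothetical, the element type `int` is an assumption, and both functions assume the array contains at least one element that is not less than `pivot` after the starting position, so the scan terminates.

```c
/* Simple form: advance index1 until an element is not less than pivot.
   Assumes such an element exists (a sentinel), as in the question. */
static int scan_simple(const int *myArray, int index1, int pivot)
{
    while (myArray[++index1] < pivot) {}
    return index1;
}

/* Unrolled by two: each exit test negates the loop condition above,
   so the function stops at the same index as scan_simple. */
static int scan_unrolled(const int *myArray, int index1, int pivot)
{
    for (;;) {
        if (!(myArray[++index1] < pivot)) break;
        if (!(myArray[++index1] < pivot)) break;
    }
    return index1;
}
```

Both functions are meant to be called with the index just before the first element to examine (for example `-1` to start at element 0). Each test still depends on the previous increment of `index1`, so unrolling here only removes a jump per iteration, which is consistent with the question's observation that it made no measurable difference.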

Recommended answer

Loop unrolling makes sense if you can break dependency chains. This gives an out-of-order or superscalar CPU the opportunity to schedule instructions better and thus run faster.

A simple example:

for (int i=0; i<n; i++)
{
  sum += data[i];
}

Here every addition depends on the result of the previous one through sum, so the loop forms a single dependency chain. If you get a stall because of a cache miss on the data array, the CPU cannot do anything but wait.

On the other hand, this code:

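// sum1..sum4 are separate accumulators, all initialized to 0.
int i = 0;
for (; i + 3 < n; i += 4)
{
  sum1 += data[i+0];
  sum2 += data[i+1];
  sum3 += data[i+2];
  sum4 += data[i+3];
}
sum = sum1 + sum2 + sum3 + sum4;
for (; i < n; i++)  // leftover elements when n is not a multiple of 4
{
  sum += data[i];
}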

could run faster. If you get a cache miss or other stall in one calculation, there are still three other dependency chains that don't depend on the stall. An out-of-order CPU can execute these.
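The two variants can be written as complete C functions so they can be compared or timed. This is only a sketch under stated assumptions: the function names `sum_single` and `sum_four_chains`, the `int32_t` element type, and the `int64_t` accumulators are illustrative choices, not part of the original answer.

```c
#include <stddef.h>
#include <stdint.h>

/* One accumulator: each addition must wait for the previous one. */
static int64_t sum_single(const int32_t *data, size_t n)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += data[i];
    }
    return sum;
}

/* Four accumulators: four independent dependency chains.
   The tail loop handles the last n % 4 elements. */
static int64_t sum_four_chains(const int32_t *data, size_t n)
{
    int64_t sum1 = 0, sum2 = 0, sum3 = 0, sum4 = 0;
    size_t i = 0;
    for (; i + 3 < n; i += 4) {
        sum1 += data[i + 0];
        sum2 += data[i + 1];
        sum3 += data[i + 2];
        sum4 += data[i + 3];
    }
    int64_t sum = sum1 + sum2 + sum3 + sum4;
    for (; i < n; i++) {
        sum += data[i];
    }
    return sum;
}
```

With integer data both functions return the same value. With floating-point data the four-accumulator version adds the elements in a different order, so the result can differ slightly in the last bits. Whether the second version is actually faster depends on the compiler and the CPU, so it is worth measuring on the target machine.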
