Performance of breaking apart one loop into two loops
Question
Good day,
Suppose that you have a simple for loop like the one below...
for (int i = 0; i < 10; i++)
{
    // statement 1
    // statement 2
}
Assume that statement 1 and statement 2 are each O(1). Apart from the small overhead of "starting" another loop, would breaking that for loop into two (sequential, not nested) loops be equally fast? For example...
for (int i = 0; i < 10; i++)
{
    // statement 1
}
for (int i = 0; i < 10; i++)
{
    // statement 2
}
The reason I ask such a silly question is that I have a Collision Detection System (CDS) that has to loop through all the objects. I want to "compartmentalize" the functionality of my CDS so that I can simply call
cds.update(objectlist);
instead of having to break my CDS up. (Don't worry too much about my CDS implementation. I think I know what I am doing, I just don't know how to explain it. What I really need to know is whether I take a huge performance hit for looping through all my objects again.)
Recommended answer
It depends on your application.
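Possible drawbacks (of splitting):
- your data does not fit into the L1 data cache, so you load it once in the first loop and then have to reload it in the second loop
Possible gains (of splitting):
- your loop contains many variables; splitting helps reduce register/stack pressure, and the optimizer turns each loop into better machine code
- the functions you use trash the L1 instruction cache, so the cache is reloaded at every iteration, whereas by splitting you manage to load it only once, at the first iteration of each loop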
These lists are certainly not comprehensive, but you can already sense that there is a tension between code and data. So it is difficult for us to make an educated guess, or even a wild one, when we know neither.
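To make the trade-off concrete, here is a minimal sketch of the two shapes being compared. The `Object` struct, its fields, and the function names are hypothetical and are not taken from the asker's CDS; the two statements are stand-ins for whatever per-object work the real loop does. Splitting like this is only safe when the second statement for object i depends on nothing but the first statement's result for that same object, which holds here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object type; the fields are illustrative only.
struct Object {
    float x;    // position
    float vx;   // velocity
    bool  hit;  // set by the "collision" statement
};

// Fused: both statements run in a single pass over the data,
// so each object is brought into cache once.
void update_fused(std::vector<Object>& objs, float dt, float wall) {
    for (std::size_t i = 0; i < objs.size(); ++i) {
        objs[i].x += objs[i].vx * dt;      // statement 1
        objs[i].hit = objs[i].x >= wall;   // statement 2
    }
}

// Split: two sequential passes. If objs is larger than the data cache,
// the second loop has to reload objects the first loop already touched.
void update_split(std::vector<Object>& objs, float dt, float wall) {
    for (std::size_t i = 0; i < objs.size(); ++i) {
        objs[i].x += objs[i].vx * dt;      // statement 1
    }
    for (std::size_t i = 0; i < objs.size(); ++i) {
        objs[i].hit = objs[i].x >= wall;   // statement 2
    }
}

// Sanity check: both variants must leave the objects in the same state.
void check_equivalent(const std::vector<Object>& objs, float dt, float wall) {
    std::vector<Object> a = objs;
    std::vector<Object> b = objs;
    update_fused(a, dt, wall);
    update_split(b, dt, wall);
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        assert(a[i].x == b[i].x && a[i].hit == b[i].hit);
    }
}
```

Which variant is faster depends on the size of the object list and on how heavy each statement is, which is why it has to be measured rather than guessed.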
When in doubt: profile. Use callgrind, check the cache misses in each case, and check the number of instructions executed. Measure the time spent.
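For the "measure the time spent" part, a small wall-clock helper is often enough for a first comparison. The sketch below is one possible way to do it with `std::chrono`; the name `best_time_ms` and the default of five runs are assumptions made for illustration. For cache-miss and instruction counts, callgrind can be run along the lines of `valgrind --tool=callgrind --cache-sim=yes ./your_program`, where `./your_program` is a placeholder for your own binary.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <limits>

// Runs `work` `runs` times and returns the fastest wall-clock time in
// milliseconds. Taking the minimum reduces noise from other processes.
template <typename F>
double best_time_ms(F&& work, int runs = 5) {
    assert(runs > 0);
    double best = std::numeric_limits<double>::max();
    for (int r = 0; r < runs; ++r) {
        auto start = std::chrono::steady_clock::now();
        work();
        auto stop = std::chrono::steady_clock::now();
        std::chrono::duration<double, std::milli> elapsed = stop - start;
        best = std::min(best, elapsed.count());
    }
    return best;
}
```

A call might look like `double t = best_time_ms([&] { cds.update(objectlist); });`, run once for the single-loop version and once for the split version. Build with optimizations enabled and use a realistic number of objects; a 10-element loop is too small to show any difference.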