How to tell the compiler to unroll this loop
Question
I have the following loop that I am running on an ARM processor.
// pin here is a pointer into some part of an array
for (i = 0; i < v->numelements; i++)
{
    pe = pptr[i];
    peParent = pe->parent;
    SPHERE *ps = (SPHERE *)(pe->data);
    pin[0] = FLOAT2FIX(ps->rad2);
    pin[1] = *peParent->procs->pe_intersect == &SphPeIntersect;
    fixifyVector( &pin[2], ps->center ); // Is an inline function
    pin = pin + 5;
}
Judging by the slow performance of the loop, the compiler was unable to unroll it: when I unroll it manually, it becomes quite fast. I think the compiler is getting confused by the pin pointer. Can we use the restrict keyword to help the compiler here, or is restrict reserved for function parameters only? In general, how can we tell the compiler to unroll the loop and not worry about the pin pointer?
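Recommended answer
To tell gcc to unroll loops throughout the compilation, you can use the optimization flag -funroll-loops. It unrolls loops whose number of iterations can be determined at compile time or upon entry to the loop.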
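To unroll loops in only a specific function rather than the whole program, you can put this attribute on the function that contains the loop (it applies to the function as a whole, not to an individual loop):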
__attribute__((optimize("unroll-loops")))
See this answer for more details.
Edit
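As a rough sketch of where the attribute goes, here is a small example. The functions scale_all and scale_all_pragma are hypothetical and are not from the question; they only stand in for a function that contains a hot loop. The second variant uses #pragma GCC unroll, which GCC 8 and later accept directly in front of a single loop; the factor 4 is an arbitrary choice for illustration. Both forms are GCC-specific requests, and whether the loop is actually unrolled still depends on the compiler version and optimization level.

```c
#include <stddef.h>

/* Hypothetical helper: multiplies n ints from src by k and stores them in dst.
   The attribute asks GCC to compile this one function as if -funroll-loops
   had been passed on the command line. */
__attribute__((optimize("unroll-loops")))
void scale_all(int *restrict dst, const int *restrict src, size_t n, int k)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

/* Same loop, using the per-loop pragma (GCC 8 or later).
   The pragma must come immediately before the loop it applies to. */
void scale_all_pragma(int *restrict dst, const int *restrict src, size_t n, int k)
{
#pragma GCC unroll 4
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}
```

To check the effect, compare the assembly produced by gcc -O2 -S with and without the attribute or pragma.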
If the compiler cannot determine the number of iterations of the loop upon entry, you will need to use -funroll-all-loops. Note that the documentation says: "Unroll all loops, even if their number of iterations is uncertain when the loop is entered. This usually makes programs run more slowly."
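On the restrict part of the question: in C99, restrict is not limited to function parameters and can qualify any pointer object, including a local one. The sketch below shows the syntax on a simplified stand-in for the loop in the question; fill_records and its parameters are hypothetical, and the values written are placeholders. restrict is a promise to the compiler about aliasing, so it is only safe if the memory written through pin is not accessed through any other pointer in that scope. It may help the optimizer, but it does not by itself guarantee that the loop gets unrolled.

```c
#include <stddef.h>

/* Hypothetical stand-in for the loop in the question: writes 5 ints per
   element into out, advancing the output pointer by 5 each iteration. */
void fill_records(int *out, const int *radii, size_t n)
{
    /* restrict on a block-scope pointer: within this function, the memory
       written through pin is accessed only through pin. */
    int *restrict pin = out;

    for (size_t i = 0; i < n; i++) {
        pin[0] = radii[i];
        pin[1] = 1;   /* placeholder for the comparison result */
        pin[2] = 0;   /* placeholders for the three vector components */
        pin[3] = 0;
        pin[4] = 0;
        pin += 5;
    }
}
```

Copying v->numelements into a local variable before the loop is a related manual step: it makes it plain that the stores through pin cannot change the trip count.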