How deep do compilers inline functions?

Question

Say I have some functions, each about two simple lines of code, and they call each other like this: A calls B calls C calls D ... calls K. (So basically it's a long series of short function calls.) How deep will compilers usually go in the call tree when inlining these functions?

Recommended answer

The question, as posed, is not meaningful.

If you think about inlining and its consequences, you will realise that it:


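  • avoids a function call (with all the register saving and frame adjustment)

  • exposes more context to the optimizer

  • duplicates code at each call site, which can bloat the binary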

When deciding whether to inline, the compiler therefore performs a balancing act between the potential code bloat and the expected speed gain. This balance is affected by options: for gcc, -O3 means optimize for speed while -Os means optimize for size (clang, and newer gcc releases, also accept -Oz). With regard to inlining, the two have almost opposite behaviors!

Therefore, what matters is not the "nesting level" but the number of instructions (possibly weighted, as not all instructions are created equal).

This means that a simple forwarding function:

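int foo(int a, int b, int c);  // three-argument overload, defined elsewhere
int foo(int a, int b) { return foo(a, b, 3); }  // forwards with a default third argument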

is essentially "transparent" from the inlining point of view.
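To relate this back to the question, here is a minimal sketch of a chain of tiny functions. The names a, b, c, d and compute are hypothetical and not from the original post. With optimization enabled (for example -O2), gcc and clang will typically flatten such a chain into a single arithmetic expression in compute, but that depends on the compiler's heuristics and is not guaranteed; inspecting the generated assembly is the way to confirm it.

```cpp
// Hypothetical chain of tiny functions: a -> b -> c -> d.
// Each one is a forwarding call plus one cheap operation, so the
// instruction count per function is very small.
static int d(int x) { return x + 1; }
static int c(int x) { return d(x) * 2; }
static int b(int x) { return c(x) + 3; }
static int a(int x) { return b(x) * 4; }

// The only externally visible entry point. Because the helpers are
// static and each has a single call site, inlining them creates no
// duplication.
int compute(int x) { return a(x); }
```

The chain is four levels deep, yet depth alone is unlikely to stop inlining here, because each level adds only a couple of instructions.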

On the other hand, a function with a hundred lines of code is unlikely to get inlined. The exception is a static free function that is called only once: it is almost systematically inlined, since inlining it does not create any duplication.

From these two examples we get a hunch of how the heuristics behave:


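  • the fewer instructions the function has, the better for inlining

  • the fewer places it is called from, the better for inlining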

Beyond that, there are parameters you should be able to set to influence the decision one way or the other (MSVC has __forceinline, which hints strongly at inlining; gcc has the -finline-limit flag to raise the threshold on the instruction count; and so on).
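As an illustration of such knobs at the source level, the sketch below wraps the compiler-specific keywords in two macros. FORCE_INLINE, NO_INLINE, clamp_to_byte and sum_clamped are hypothetical names chosen for this example. __forceinline and __declspec(noinline) are MSVC keywords, and always_inline and noinline are GCC/Clang attributes. Even these are strong requests rather than absolute guarantees in every situation.

```cpp
// Hypothetical portability macros; the names are illustrative only.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#  define NO_INLINE    __declspec(noinline)
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#  define NO_INLINE    __attribute__((noinline))
#else
#  define FORCE_INLINE inline   // fall back to the plain hint
#  define NO_INLINE
#endif

// Small helper we ask the compiler to inline at every call site.
FORCE_INLINE int clamp_to_byte(int v) {
  return v < 0 ? 0 : (v > 255 ? 255 : v);
}

// Larger routine we ask the compiler to keep as a real function call.
NO_INLINE int sum_clamped(const int* values, int count) {
  int total = 0;
  for (int i = 0; i < count; ++i) {
    total += clamp_to_byte(values[i]);
  }
  return total;
}
```

Such annotations override the heuristics for individual functions, whereas a flag like -finline-limit shifts the threshold for the whole translation unit.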

On a tangent: do you know about partial inlining?

It was introduced in gcc 4.6. The idea, as the name suggests, is to inline only part of a function. This mostly serves to avoid the overhead of a function call when the function is "guarded" and may (in some cases) return almost immediately.

For example:

void foo(Bar* x) {
  if (not x) { return; } // null pointer, pfff!

  // ... BIG BLOCK OF STATEMENTS ...
}

void bar(Bar* x) {
  // DO 1
  foo(x);
  // DO 2
}

could get "optimized" as:

void foo@0(Bar* x) {
  // ... BIG BLOCK OF STATEMENTS ...
}

void bar(Bar* x) {
  // DO 1
  if (x) { foo@0(x); }
  // DO 2
}

Of course, the usual inlining heuristics still apply here.
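The same effect can be approximated by hand, which may help make the transformation concrete. The following is a sketch under assumptions, not what gcc emits: Bar's layout, foo_slow_path and the placeholder body are all hypothetical. In gcc the automatic version is controlled by the -fpartial-inlining option.

```cpp
struct Bar { int value; };  // hypothetical payload type

// Stand-in for the "big block of statements"; deliberately kept out of line.
#if defined(__GNUC__) || defined(__clang__)
__attribute__((noinline))
#endif
void foo_slow_path(Bar* x) {
  x->value += 1;  // placeholder for the expensive work
}

// Only the cheap guard lives in the inlinable wrapper, so a call with a
// null pointer costs one comparison instead of a full function call.
inline void foo(Bar* x) {
  if (!x) { return; }
  foo_slow_path(x);
}

// Caller: returns the updated value, or -1 when there was nothing to do.
int bar(Bar* x) {
  foo(x);
  return x ? x->value : -1;
}
```

Splitting manually like this fixes the boundary between the fast and slow path yourself, whereas the compiler's partial inlining chooses it from its own cost estimates.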

And finally, unless you use WPO (Whole Program Optimization) or LTO (Link Time Optimization), functions can only be inlined if their definition is in the same TU (Translation Unit) as the call site.
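A common way to satisfy that constraint without LTO is to define small functions in a header, so that every TU including it sees the body. The sketch below simulates this in a single file. The header name scale.hpp and the functions scale, scale_opaque and use_scale are hypothetical.

```cpp
// --- would live in a header, e.g. a hypothetical "scale.hpp" ---
// Defined `inline` in the header: each TU that includes it sees the
// full body, so the compiler is free to inline calls to it.
inline int scale(int x) { return x * 3; }

// --- by contrast, a declaration whose body lives in another .cpp file ---
// int scale_opaque(int x);
// A call to scale_opaque from this TU could not be inlined unless
// WPO/LTO is enabled (for example -flto with gcc or clang).

// --- call site in this TU ---
int use_scale(int x) { return scale(x) + 1; }
```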
