Will the C++ linker automatically inline functions (without the "inline" keyword, without an implementation in the header)?

Question

Will the C++ linker automatically inline "pass-through" functions that are NOT defined in the header and NOT explicitly requested to be inlined through the inline keyword?

For example, the following pattern is so common, and should so consistently benefit from inlining, that it seems every compiler vendor should handle it automatically by inlining at link time (in those cases where it is possible):

//FILE: MyA.hpp
class MyA
{
  public:
    int foo(void) const;
};

//FILE: MyB.hpp
#include "MyA.hpp"  // MyA must be a complete type to be a member of MyB
class MyB
{
  private:
    MyA my_a_;
  public:
    int foo(void) const;
};

//FILE: MyB.cpp
#include "MyB.hpp"

// PLEASE SAY THIS FUNCTION IS "INLINED" BY THE LINKER, EVEN THOUGH
// IT WAS NOT IMPLICITLY/EXPLICITLY REQUESTED TO BE "INLINED"?
int MyB::foo(void) const  // const added to match the declaration in MyB.hpp
{
  return my_a_.foo();
}

I'm aware that the MSVS linker will perform some inlining through its Link Time Code Generation (LTCG), and that the GCC toolchain also supports Link Time Optimization (LTO) (see: Can the linker inline functions?).

Further, I'm aware that there are cases where this cannot be inlined, such as when the implementation is not available to the linker (e.g., across shared library boundaries, where separate linking occurs).

However, if this code is linked into a single executable that does not cross DLL/shared-lib boundaries, I'd expect the compiler/linker vendor to inline the function automatically, as a simple and obvious optimization (benefiting both performance and size).

Are my hopes too naive?

Recommended answer

Here's a quick test of your example (with a MyA::foo() implementation that simply returns 42). All of these tests used 32-bit targets; 64-bit targets may give different results. It's also worth noting that using the -flto option (GCC) or the /GL option (MSVC) results in full optimization: wherever MyB::foo() is called, the call is simply replaced with 42.

With GCC (MinGW 4.5.1):

gcc -g -O3 -o test.exe myb.cpp mya.cpp test.cpp

The call to MyB::foo() was not optimized away. MyB::foo() itself was slightly optimized to:

Dump of assembler code for function MyB::foo() const:
   0x00401350 <+0>:     push   %ebp
   0x00401351 <+1>:     mov    %esp,%ebp
   0x00401353 <+3>:     sub    $0x8,%esp
=> 0x00401356 <+6>:     leave
   0x00401357 <+7>:     jmp    0x401360 <MyA::foo() const>

That is, the entry prologue is left in place but immediately undone (the leave instruction), and the code then jumps to MyA::foo() to do the real work. However, this is an optimization performed by the compiler (not the linker), since it realizes that MyB::foo() simply returns whatever MyA::foo() returns. I'm not sure why the prologue is left in.

MSVC 16 (from VS 2010) handled things a little differently:

MyB::foo() ended up as two jumps - one to a 'thunk' of some sort:

0:000> u myb!MyB::foo
myb!MyB::foo:
001a1030 e9d0ffffff      jmp     myb!ILT+0(?fooMyAQBEHXZ) (001a1005)

And the thunk simply jumped to MyA::foo():

myb!ILT+0(?fooMyAQBEHXZ):
001a1005 e936000000      jmp     myb!MyA::foo (001a1040)

Again, this was largely (entirely?) performed by the compiler: if you look at the object code produced before linking, MyB::foo() is compiled to a plain jump to MyA::foo().

So to boil all this down: without explicitly invoking LTO/LTCG, today's linkers appear unwilling or unable to remove the call to MyB::foo() altogether, even when MyB::foo() is nothing more than a jump to MyA::foo().

So if you want link-time optimization, use -flto (for GCC), or /GL (for the MSVC compiler) together with /LTCG (for the MSVC linker).
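
To reproduce the comparison, a setup along the following lines should work. This is only a sketch: the contents of mya.cpp and test.cpp are not shown in the answer, so the MyA::foo() body (returning 42, as described) and the call_site() and check_call_site() helpers are assumptions. The three source files are collapsed into one listing so that it compiles on its own; in a single translation unit the compiler can inline everything without LTO, so split the listing at the FILE markers before comparing builds.

```cpp
// Sketch only: three source files collapsed into one listing.
// Split at the "FILE" markers to reproduce the cross-file situation.
//
// Assumed build commands (using the flags named in the answer):
//   g++ -O3       -o test.exe myb.cpp mya.cpp test.cpp   (no LTO)
//   g++ -O3 -flto -o test.exe myb.cpp mya.cpp test.cpp   (GCC LTO)
//   cl /O2 /GL myb.cpp mya.cpp test.cpp /link /LTCG      (MSVC LTCG)

#include <cassert>

// FILE: MyA.hpp
class MyA {
public:
    int foo() const;
};

// FILE: MyB.hpp
class MyB {
    MyA my_a_;
public:
    int foo() const;
};

// FILE: mya.cpp (assumed body; the answer only says it returns 42)
int MyA::foo() const { return 42; }

// FILE: myb.cpp (the pass-through function from the question)
int MyB::foo() const { return my_a_.foo(); }

// FILE: test.cpp (hypothetical call site; not shown in the answer)
int call_site(const MyB& b) {
    return b.foo();  // with LTO/LTCG this call can collapse to the constant 42
}

// Hypothetical sanity check: the returned value is the same with or
// without LTO; only the generated machine code differs.
void check_call_site() {
    MyB b;
    assert(call_site(b) == 42);
}
```

Disassembling call_site() in each build (for example with objdump -d or in a debugger) should show whether the call to MyB::foo() survives.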
