Functions only getting inlined if defined in a header. Am I missing something?

Question

Using gcc v4.8.1

If I do this:

//func.hpp

#ifndef FUNC_HPP
#define FUNC_HPP

int func(int);

#endif

//func.cpp

#include "func.hpp"

int func(int x){
    return 5*x+7;
}

//main.cpp

#include <iostream>

#include "func.hpp"

using std::cout;
using std::endl;

int main(){
    cout<<func(5)<<endl;
    return 0;
}

Even the simple function func will not get inlined. No combination of inline, extern, static, and __attribute__((always_inline)) on the prototype and/or the definition changes this (obviously some combinations of these specifiers cause it to not even compile and/or produce warnings; I'm not talking about those). I'm using g++ *.cpp -O3 -o run to build and g++ *.cpp -O3 -S for assembly output. When I look at the assembly output, I still see call func. It appears the only way I can get the function to be properly inlined is to have the prototype (probably not necessary) and the definition of the function in the header file. If the header is only included by one file in the whole program (only main.cpp, for example), it will compile and the function will be properly inlined without even needing the inline specifier. If the header is to be included by multiple files, the inline specifier appears to be needed to resolve multiple-definition errors, and that appears to be its only purpose. The function is of course inlined properly in that case too.
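For reference, the header-only arrangement described above might look like the following sketch. It reuses the func.hpp name and the function body from the example; the comments are an editorial reading of why this variant behaves differently, not something verified against gcc 4.8.1 specifically.

```cpp
// func.hpp (header-only variant of the example above)
#ifndef FUNC_HPP
#define FUNC_HPP

// The definition is now visible in every translation unit that includes
// this header, so the optimizer can inline calls to it without needing
// link-time optimization.
//
// `inline` allows the same definition to appear in several translation
// units without a multiple-definition link error. It does not, by itself,
// force the compiler to inline the call.
inline int func(int x) {
    return 5 * x + 7;
}

#endif
```

With this header, func.cpp is no longer needed, and main.cpp stays unchanged apart from the include it already has.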

So my question is: am I doing something wrong? Am I missing something? Whatever happened to:

"The compiler is smarter than you. It knows when a function should be inlined better than you do. And never ever use C arrays. Always use std::vector!"

-Every other StackOverflow user

Really? So calling func(5) and printing the result is faster than just printing 32? I will blindly follow you off the edge of a cliff, almighty, all-knowing, and all-wise gcc.

For the record, the above code is just an example. I am writing a ray tracer and when I moved all of the code of my math and other utility classes to their header files and used the inline specifier, I saw massive performance gains. Literally like 10 times faster for some scenes.

Recommended answer

Recent GCC is able to inline across compilation units through link-time optimization (LTO). You need to compile, and link, with -flto; see "Link-time optimization and inline" and the GCC optimize options.

(Actually, LTO is done at link time by lto1, a special variant of the compiler. LTO works by serializing some of GCC's internal representations inside the object files, where lto1 reads them back. So when you compile src1.c with -flto, the generated src1.o contains the GIMPLE representation in addition to the object binary; and when you link with gcc -flto src*.o, the lto1 "front-end" extracts those GIMPLE representations from inside the src*.o files and recompiles almost everything again...)

You need to explicitly pass -flto both at compile time AND at link time. If using a Makefile you could try make CC='gcc -flto'; otherwise, compile each translation unit with e.g. gcc -Wall -flto -O2 -c src1.c (and likewise for src2.c etc.) and link all of your program (or library) with gcc -Wall -flto -O2 src1.o src2.o -o prog -lsomelib.

Notice that -flto will significantly slow down your build (it is not implied by -O3, so you need to pass it explicitly, and you need to link with it too). Often you get a 5% or 10% performance improvement in the built program, at the expense of nearly doubling the build time. Sometimes you can get more.
