GCC, -flto, -fno-builtin and custom function implementation of glibc functions


Question

I'm observing unexpected behaviour (at least I can't find an explanation for it) with the GCC flag -flto and jemalloc/tcmalloc. Once -flto is used and I link with the above libraries, malloc/calloc and friends are not replaced by the je/tc malloc implementation; the glibc implementation is called instead. Once I remove the -flto flag, everything works as expected. I tried using -fno-builtin/-fno-builtin-* together with -flto, but it still doesn't pick the je/tc malloc implementation.

How does the -flto machinery work? Why doesn't the binary pick the new implementation? How does it even link with -fno-builtin when it should fail on an unresolved external for, say, printf?

EDIT001:
GCC 7.3
Sample code

int main()
{
    auto p = malloc(1024);
    free(p);
    return 0;
}

Compile:

/usr/bin/c++ -O2 -g -DNDEBUG -flto -std=gnu++14 -o CMakeFiles/flto.dir/main.cpp.o -c /home/user/Development/CPPJunk/flto/main.cpp

Link:

/usr/bin/c++ -O2 -g -DNDEBUG -flto CMakeFiles/flto.dir/main.cpp.o -o flto -L/home/user/Development/jemalloc -Wl,-rpath,/home/user/Development/jemalloc -ljemalloc

EDIT002:
More suitable sample code

#include <cstdlib>

int main()
{
    auto p = malloc(1024);
    if (p) {
        free(p);
    }

    auto p1 = new int;
    if (p1) {
        delete p1;
    }

    auto p2 = new int[32];
    if (p2) {
        delete[] p2;
    }
    return 0;
}


Recommended answer

First, your sample code is wrong. Read the C11 standard n1570 carefully. When you want to use the standard malloc, you should #include <stdlib.h>.

In C++11 (read n3337) malloc is frowned upon and should not be used (prefer new). If you still want to use std::malloc in C++, you should #include <cstdlib> (which, in GCC, internally includes <stdlib.h>).

Then your sample code is almost C code (once you replace auto with void*), not C++. It could be optimized (once you include <stdlib.h>), even without -flto but with just -O3, according to the as-if rule, to an empty main. (I have even written a public report, bismon-chariot-doc.pdf, whose §1.4.2 explains over several pages how that optimization happens.)

To optimize around malloc and free, GCC relies on a function attribute, __attribute__((malloc)), in the declaration of malloc (inside <stdlib.h>).
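As a rough illustration of that attribute (not taken from glibc's actual header), the sketch below declares a hypothetical wrapper my_alloc with __attribute__((malloc)). The names my_alloc and allocate_and_release are assumptions made up for this example. With optimization enabled, the compiler is allowed, though not required, to drop the allocate/release pair under the as-if rule.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical allocator wrapper (name invented for this sketch).
// The attribute tells GCC/Clang that the returned pointer does not
// alias any other pointer that is live when the function returns.
__attribute__((malloc)) void* my_alloc(std::size_t n)
{
    return std::malloc(n);
}

// The allocation is never observed, so with -O2 or -O3 the compiler
// may (but does not have to) remove both calls entirely.
int allocate_and_release()
{
    void* p = my_alloc(1024);
    std::free(p);
    return 0;
}
```

Whether the pair is really removed depends on the compiler version and flags; inspecting the generated assembly (for example with g++ -O2 -S) is the way to check.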


How the -flto machinery works?



LTO is explained in GCC internals §25.

It works by using some internal (GIMPLE-like and/or SSA-like) representation of the code both at "compile" and at "link" time (actually, the linking step becomes another compilation with whole-program optimization, so your code gets "compiled" twice in practice).

In practice, LTO should always be used with some optimization flag (e.g. -O2 or even -O3), both at compile time and at link time. So you should compile and link with g++ -flto -O2: it makes no practical sense to use -flto without at least -O2, and the exact same optimization flags should be used at compile and at link time.

More precisely, -flto also embeds in the object files some internal (GIMPLE-like) representation of the source code, which is used again "at link time" (notably for the optimization and inlining that happen again when "linking" your entire program, re-using its GIMPLE). GCC actually contains an LTO front-end and compiler called lto1 (in addition to the C++ front-end and compiler called cc1plus), and when you link with g++ -flto -O2, lto1 is run at link time to reprocess these GIMPLE representations.

Probably, libjemalloc has its own headers, and might have inline (or inlinable) functions. In that case you also need to use -flto -O2 when compiling that library from its source code (so that its GIMPLE is stored in the library).

Lastly, the fact that the usual malloc gets called is independent of -flto. It is a linker issue, not a compiler one. You could try to link -ljemalloc statically (and then you had better build that library with gcc -flto -O2 as well; if you don't build it like that, you won't get LTO optimizations across malloc calls).

You could also pass -v to your compilation and linking commands to understand what g++ is doing. You could even pass -Wl,--verbose to ask ld (started by g++) to be verbose.
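Besides the verbose flags, one way to check at run time which loaded object actually supplies malloc is to ask the dynamic loader. The sketch below is not part of the original answer: symbol_provider is a hypothetical helper name, and it assumes a glibc system, since dladdr and RTLD_DEFAULT are GNU extensions (on glibc older than 2.34 you would also need to link with -ldl).

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   // dladdr and RTLD_DEFAULT are GNU extensions
#endif
#include <dlfcn.h>
#include <string>

// Hypothetical helper (name invented for this sketch): returns the path
// of the loaded object that provides `symbol` in the global lookup
// scope, or an empty string if that cannot be determined.
std::string symbol_provider(const char* symbol)
{
    // Resolve the symbol the same way the default lookup order would.
    void* addr = dlsym(RTLD_DEFAULT, symbol);
    if (addr == nullptr) {
        return {};
    }

    // Map the resolved address back to the object that contains it.
    Dl_info info{};
    if (dladdr(addr, &info) == 0 || info.dli_fname == nullptr) {
        return {};
    }
    return info.dli_fname;
}
```

Printing symbol_provider("malloc") from main should show a libjemalloc (or libtcmalloc) path when the replacement took effect through a shared library, and a libc path when the glibc implementation is still the one being called.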

Notice that LTO (and the internal representations it uses) is compiler- and version-specific. The internal (GIMPLE & SSA) representation is slightly different between GCC 7 and GCC 8 (and in Clang it is very different, so of course incompatible). The dynamic linker ld-linux(8) does not know about LTO.

PS. You could install the libjemalloc-dev package and add #include <jemalloc/jemalloc.h> in your code. See also the jemalloc(3) man page. Probably libjemalloc could be configured or patched to define some je_malloc symbol as a replacement for malloc. Then it would be simpler (for LTO) to use je_malloc in your code (to avoid conflicts between several malloc ELF symbols). To learn more about symbols in shared libraries, read Drepper's How to Write Shared Libraries paper. And of course you should expect LTO to change the behavior of linking!
