Does source code amalgamation really increase the performance of a C or C++ program?


Problem description

Code amalgamation consists of copying the whole source code into one single file.

For instance, it is done by SQLite to reduce the compile time and increase the performance of the resulting executable. In SQLite's case, it results in a single file of 184K lines of code.

My question is not about compile time (already answered in this question: stackoverflow.com/questions/543697/include-all-cpp-files-into-a-single-compilation-unit), but about the efficiency of the executable.

SQLite developers say:

In addition to making SQLite easier to incorporate into other projects, the amalgamation also makes it run faster. Many compilers are able to do additional optimizations on code when it is contained within a single translation unit such as it is in the amalgamation. We have measured performance improvements of between 5 and 10% when we use the amalgamation to compile SQLite rather than individual source files. The downside of this is that the additional optimizations often take the form of function inlining which tends to make the size of the resulting binary image larger.

From what I understood, this is due to interprocedural optimization (IPO), an optimization made by the compiler.
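
As an illustration, here is a minimal sketch (hypothetical file names, not SQLite's actual code) of a call that crosses a translation-unit boundary:

    /* util.c -- compiled as its own translation unit */
    int get_flag(void) { return 1; }

    /* main.c -- sees only the declaration of get_flag() */
    int get_flag(void);

    int main(void) {
        long total = 0;
        for (long i = 0; i < 1000000; ++i)
            total += get_flag();   /* out-of-line call in a separate-file build */
        return (int)(total & 0xff);
    }

When util.c and main.c are compiled separately (and without link-time optimization), the compiler sees only a declaration at the call site and must emit a real call. When both definitions live in one translation unit, it can inline get_flag() and may even fold the whole loop down to a constant; that cross-function visibility is the kind of optimization the SQLite developers describe.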

GCC developers also say this (thanks @nwp for the link):

The compiler performs optimization based on the knowledge it has of the program. Compiling multiple files at once to a single output file mode allows the compiler to use information gained from all of the files when compiling each of them.

But they do not quantify the gain to be expected from this.

Are there any measurements, apart from those of SQLite, which confirm or refute the claim that IPO with amalgamation produces faster executables than IPO without amalgamation when compiled with gcc?
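
For reference, the comparison asked about would be between builds along these lines (same hypothetical file names as above; -c, -O2 and -flto are standard GCC options):

    # 1. Separate compilation: each .c file is its own translation unit,
    #    so no interprocedural optimization happens across files.
    gcc -O2 -c util.c
    gcc -O2 -c main.c
    gcc -O2 util.o main.o -o prog_separate

    # 2. Amalgamation (unity build): one translation unit, so the compiler
    #    sees every definition at every call site.
    #    (a hand-made amalgamation.c is sketched below)
    gcc -O2 amalgamation.c -o prog_amalgamated

    # 3. Link-time optimization: files stay separate, but cross-module
    #    optimization is deferred to link time.
    gcc -O2 -flto -c util.c
    gcc -O2 -flto -c main.c
    gcc -O2 -flto util.o main.o -o prog_lto

A benchmark answering the question would time the same workload on prog_amalgamated and prog_lto; the SQLite figure quoted above essentially compares builds 1 and 2.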

As a side question, with regard to this optimization, is doing code amalgamation the same thing as #include-ing all the .cpp (or .c) files into one single file?
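
A minimal sketch of that second approach, continuing with the hypothetical files from above: a hand-written "unity build" file that does nothing except include the other sources, yielding a single translation unit much like a generated amalgamation.

    /* amalgamation.c -- hand-made single translation unit.
       Including the .c files (not just the headers) makes every function
       definition visible to every call site, as in a generated
       amalgamation. */
    #include "util.c"
    #include "main.c"

With respect to the optimization discussed here the effect is the same; the practical differences are mostly about name collisions, since file-scope statics and macros from one file become visible to all the others, something a generated amalgamation like SQLite's is arranged to cope with.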

Solution

The organization of the source-code files will not "produce a more efficient binary," and the cost of reading from multiple source files during compilation is negligible.

A version control system will take deltas of any file regardless of size.

Ordinarily, separate components such as these are separately compiled to produce binary libraries containing the associated object code: the source code is not recompiled each time. When an "application A" uses a "library B" that is changed, then "application A" must be re-linked but it does not have to be recompiled if the library's API has not changed.

And, in terms of the library itself, if it consists of (hundreds of) separate source-files, only the files that have been changed have to be recompiled before the library is re-linked. (Any Makefile will do this.) If the source-code were "one huge thing," you'd have to recompile all of it every time, and that could take a long time ... basically, a waste of time.
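
As a minimal sketch (hypothetical file names), a Makefile expressing that incremental rebuild could look like this; only objects whose sources changed are rebuilt, then the library is re-created:

    # Recipe lines must be indented with a tab character.
    CFLAGS = -O2
    OBJS = a.o b.o c.o

    libfoo.a: $(OBJS)
    	ar rcs $@ $(OBJS)

    %.o: %.c
    	$(CC) $(CFLAGS) -c $< -o $@

Running make after editing b.c recompiles only b.o and then re-runs ar; the other objects are left alone.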

There are two ways in which the object-code from a library (once it has been built ...) can be incorporated into an executable: static linking, and dynamic. If static linking is used, the necessary parts of the library will be copied into the executable ... but, not all of it. The library-file does not have to be present when the executable is run.

If dynamic linking is used, the entire library exists in a separate file (e.g. .DLL or .so) which does have to be present at runtime but which will be shared by every application that is using it at the same time.
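
To make the two cases concrete, a minimal sketch (hypothetical names) of building and linking against each kind of library:

    # Static library: object code is archived; the linker copies the
    # needed parts into the executable.
    gcc -O2 -c b.c
    ar rcs libb.a b.o
    gcc main.o -L. -lb -o app_static      # libb.a not needed at run time

    # Shared library: one .so file on disk, loaded at run time and
    # shared by every running application that uses it.
    gcc -O2 -fPIC -c b.c
    gcc -shared b.o -o libb.so
    gcc main.o -L. -lb -o app_dynamic     # libb.so must be present at run time

(The two cases are shown independently; if both libb.a and libb.so exist in the same directory, the linker normally prefers the shared one unless told otherwise.)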

I recommend that you primarily view this as a source-code management issue, not as something that will confer any sort of technical or runtime advantages. (It will not.) I find it difficult to see a compelling reason to do this at all.
