Which gcc optimization flags should I use?

Question

If I want to minimize the run time of my C programs, which optimization flags should I use? (I want to keep the code standard-conforming too.)

Currently I'm using:

 -Wall -Wextra -pedantic -ansi -O3

Should I also use

-std=c99

for instance?

And is there a specific order in which I should put those flags in my makefile? Does it make any difference?

Also, is there any reason not to use all the optimization flags I can find? Do they ever counteract each other or something like that?

Recommended answer

I'd recommend compiling new code with -std=gnu11, or -std=c11 if needed. Fixing your code until -Wall no longer warns is usually a good idea, IIRC. -Wextra warns about some things you might not want to change.

A good way to check how something compiles is to look at the compiler's asm output. http://gcc.godbolt.org/ formats the asm output nicely (stripping out the noise). Putting some key functions up there and looking at what different compiler versions do with them is useful if you understand asm at all.

Use a recent compiler version. gcc and clang have both improved significantly in newer versions. gcc 5.3 and clang 3.8 were the current releases when this was written. gcc5 makes noticeably better code than gcc 4.9.3 in some cases.

If you only need the binary to run on your own machine, you should use -O3 -march=native.

If you need the binary to run on other machines, choose the baseline for instruction-set extensions with options like -mssse3 -mpopcnt. You can use -mtune=haswell to optimize for Haswell while still producing code that runs on older CPUs (the minimum is determined by -march).
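To see what a baseline option buys you, here is a minimal sketch of a bit-counting helper. The name count_set_bits is made up for illustration; __builtin_popcountll is the GCC/clang builtin. With -mpopcnt (or a -march setting that implies it) the compiler can usually turn the call into a single popcnt instruction, and without it you get a slower generic fallback. Check the asm output to confirm what your compiler version actually does.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original answer): counts the set bits
   in a 64-bit value via the GCC/clang builtin. Whether this becomes one
   popcnt instruction depends on the -m / -march flags in effect. */
static inline int count_set_bits(uint64_t x)
{
    return __builtin_popcountll(x);
}
```

Compiling this once with and once without -mpopcnt and comparing the two asm listings shows the difference directly.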

If your program doesn't depend on strict FP rounding behaviour, use -ffast-math. If it does, you can usually still use -fno-math-errno and similar options without enabling -funsafe-math-optimizations. Some FP code gets big speedups from fast-math, for example because it allows auto-vectorization.
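A typical case where this matters is a plain floating-point reduction, sketched below with a hypothetical function name. Under strict FP rules the additions have to happen in source order, because FP addition is not associative, so the loop generally stays scalar. With -ffast-math the compiler is allowed to reorder the additions and can split the sum across vector lanes. The result may then differ in the last bits for general inputs.

```c
#include <stddef.h>

/* Hypothetical example: sums n floats. The loop-carried dependency on
   `total` blocks vectorization unless the compiler may reassociate FP
   additions (which -ffast-math permits). */
float sum_floats(const float *a, size_t n)
{
    float total = 0.0f;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}
```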

If you can usefully do a test run of your program that exercises most of the code paths that need to be optimized for a real run, then use profile-directed optimization:

gcc  -fprofile-generate -Wall -Wextra -std=gnu11 -O3 -ffast-math -march=native -fwhole-program *.c -o my_program
./my_program -option1 < test_input1
./my_program -option2 < test_input2
gcc  -fprofile-use      -Wall -Wextra -std=gnu11 -O3 -ffast-math -march=native -fwhole-program *.c -o my_program

-fprofile-use enables -funroll-loops, since the profile gives the compiler enough information to decide when unrolling is actually worthwhile. Unrolling loops all over the place can make things worse. Even so, it's worth trying -funroll-loops without a profile to see if it helps.

If your test runs don't cover all the code paths, then some important ones will be marked as "cold" and optimized less.

-O3 enables auto-vectorization, which -O2 doesn't in the gcc versions discussed here (gcc 12 and later enable a limited form of it at -O2). This can give big speedups.
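For reference, this is the kind of loop the auto-vectorizer handles well. It's only a sketch, and the function name is made up for illustration. The restrict qualifiers promise the compiler that x and y don't overlap, which saves it from emitting a runtime overlap check or giving up on vectorizing. gcc's -fopt-info-vec option reports which loops it vectorized.

```c
#include <stddef.h>

/* Hypothetical example: y[i] += k * x[i] for i in [0, n).
   Independent iterations and non-aliasing pointers make this an easy
   target for auto-vectorization at -O3. */
void scale_add(size_t n, float k, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += k * x[i];
}
```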

-fwhole-program allows cross-file inlining, but only works when you put all the source files on one gcc command line. -flto (Link-Time Optimization) is another way to get the same effect. clang supports -flto but not -fwhole-program.

-fomit-frame-pointer has been the default for a while now for x86-64, and more recently for x86 (32-bit).

As well as gcc, try compiling your program with clang. Clang sometimes makes better code than gcc, sometimes worse. Try both and benchmark.
