Does the --force-lto gem5 scons build option speed up simulation significantly and how does it compare to a gem5.fast build?
Question
While looking for ways to speed up my simulation, I came across the --force-lto option.
I've heard about LTO (Link Time Optimization) before, which made me wonder why --force-lto isn't the default when building gem5.
Would it make a simulation run much faster than a gem5.opt build, and how does it compare to a gem5.fast build?
Recommended answer
In gem5 fe15312aae8007967812350f8cdac9ad766dcff7 (2019), the gem5.fast build already enables LTO by default, so you generally never want to use that option explicitly, but rather just want to build gem5.fast.
Other things that the .fast build does:
- it removes -g, so you get no debug symbols. I wonder why, since that does not make runs any faster.
- it turns on NDEBUG, which has the standard library effect of disabling asserts entirely, plus some gem5-specific effects spread throughout the code with #ifndef NDEBUG checks
- it disables TRACING_ON, which makes DPRINTF and family become empty statements, as seen in src/base/trace.hh
Those effects can be seen easily in src/SConstruct.
The --force-lto option exists because the more common gem5.opt build uses partial linking, which in some versions of GCC was incompatible with LTO.
Therefore, as its name suggests, --force-lto forces the use of LTO together with partial linking, which might not be stable. That's why I recommend that you use gem5.fast rather than touching --force-lto.
The goal of partial linking is presumably to speed up the link step, which can easily be the bottleneck in a "change a file, rebuild, relink, test" loop, although in my experiments it was not clear that it is effective at doing that. Today it might just be a relic from the past.
To try to speed up linking, I recommend that you try scons --gold-linker instead, which uses the gold linker instead of ld. Note, however, that this option was more noticeably effective for gem5.debug.
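The build variants discussed in this answer differ only in the scons target and flags. As a rough sketch, a small Python helper can assemble the command lines. The helper name scons_command, the build/X86/ target directory, and the -j job count are assumptions chosen for illustration; --gold-linker is the option named in this answer and may not exist in every gem5 revision.

```python
import shlex

# Only the build variants mentioned in this answer.
VARIANTS = ("debug", "opt", "fast")

def scons_command(variant, isa="X86", jobs=4, gold_linker=False):
    """Return the scons argument list for a gem5 build variant.

    `isa` and `jobs` are illustrative defaults, not recommendations.
    """
    if variant not in VARIANTS:
        raise ValueError(f"unknown gem5 build variant: {variant}")
    cmd = ["scons", f"build/{isa}/gem5.{variant}", f"-j{jobs}"]
    if gold_linker:
        # Option named in the answer; check that your gem5 revision has it.
        cmd.append("--gold-linker")
    return cmd

if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe anywhere.
    print(shlex.join(scons_command("fast")))
    print(shlex.join(scons_command("debug", gold_linker=True)))
```

Printing rather than executing keeps the sketch harmless outside a gem5 checkout; to actually build, the returned list could be passed to subprocess.run from the gem5 source root.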
I have found that gem5.fast is generally about 20% faster than gem5.opt for Atomic CPUs.
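To check a figure like this on your own workload, one option is to time the same simulation under both binaries. The following is a minimal sketch, not a rigorous benchmark. The function names are made up for illustration, and the gem5 binary and config script paths in the comments are placeholders. The answer does not say which definition of "faster" it uses; this sketch assumes the throughput reading.

```python
import subprocess
import sys
import time

def wall_time(cmd, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs of cmd."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best

def percent_faster(baseline_seconds, candidate_seconds):
    """Throughput gain of the candidate over the baseline, in percent.

    20.0 means the candidate finishes in baseline / 1.2 seconds.
    """
    return (baseline_seconds / candidate_seconds - 1.0) * 100.0

if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. For a real comparison,
    # replace it with placeholders such as:
    #   ["build/X86/gem5.opt", "path/to/config.py"]
    #   ["build/X86/gem5.fast", "path/to/config.py"]
    demo = [sys.executable, "-c", "pass"]
    baseline = wall_time(demo)
    candidate = wall_time(demo)
    print(f"baseline {baseline:.3f} s, candidate {candidate:.3f} s")
    print(f"candidate is {percent_faster(baseline, candidate):.1f}% faster")
```

Taking the best of several runs reduces noise from other processes on the host, though results for short simulations will still vary.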