Does the --force-lto gem5 scons build option speed up simulation significantly and how does it compare to a gem5.fast build?

Question

While looking for ways to speed up my simulation, I came across the --force-lto option.

I've heard about LTO (Link Time Optimization) before, which made me wonder: why isn't --force-lto the default when building gem5?

Would it make a simulation run much faster than a gem5.opt build, and how does that compare to a gem5.fast build?

Recommended answer

As of gem5 fe15312aae8007967812350f8cdac9ad766dcff7 (2019), the gem5.fast build already enables LTO by default, so you generally never want to use that option explicitly, but rather want just gem5.fast.

Other things to know about .fast:

  • it also removes -g and so you get no debug symbols. I wonder why, since that does not make runs any faster.
  • it also turns on NDEBUG, which has the standard library effect of disabling asserts entirely, plus some gem5-specific effects spread throughout the code behind #ifndef NDEBUG checks
  • it disables TRACING_ON, which makes DPRINTF and family become empty statements as seen at: src/base/trace.hh
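The NDEBUG and TRACING_ON effects listed above can be illustrated with a minimal sketch. This is not gem5's actual code: MY_DPRINTF, in_range, step and checks_evaluated are hypothetical names invented for the illustration, and the TRACING_ON handling only imitates the general idea of a trace macro that compiles to an empty statement.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for a tracing switch; a build system would normally
// set it on the compiler command line (-DTRACING_ON=0 or -DTRACING_ON=1).
#ifndef TRACING_ON
#define TRACING_ON 1
#endif

#if TRACING_ON
// Tracing enabled: forward the arguments to fprintf on stderr.
#define MY_DPRINTF(...) std::fprintf(stderr, __VA_ARGS__)
#else
// Tracing disabled: the macro becomes an empty statement.
#define MY_DPRINTF(...) do {} while (0)
#endif

// Counts how many times the range check was actually evaluated.
int checks_evaluated = 0;

bool in_range(int v)
{
    ++checks_evaluated;
    return v >= 0 && v < 100;
}

int step(int v)
{
    // With -DNDEBUG, assert() expands to nothing, so in_range() is never
    // called and checks_evaluated stays unchanged.
    assert(in_range(v));
#ifndef NDEBUG
    // Debug-only code in the style of an #ifndef NDEBUG block.
    MY_DPRINTF("step(%d) passed its range check\n", v);
#endif
    return v + 1;
}
```

Compiled normally, each call to step() evaluates the check and may print a trace line. Compiled with -DNDEBUG -DTRACING_ON=0, both the check and the trace call disappear from the generated code, which is the kind of saving a .fast-style build relies on.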

Those effects can be seen easily at src/SConscript.

That option exists because the more common gem5.opt build also uses partial linking, which in some versions of GCC was incompatible with LTO.

Therefore, as its name suggests, --force-lto forces the use of LTO together with partial linking, which might not be stable. That's why I recommend that you use gem5.fast rather than touching --force-lto.

The goal of partial linking is presumably to speed up the link step, which can easily be the bottleneck in a "change one file, rebuild, relink, test" loop, although in my experiments it is not clear that it actually achieves that. Today it might just be a relic from the past.

To speed up linking, I recommend trying scons --gold-linker instead, which uses the gold linker rather than ld. Note, however, that this option was more noticeably effective for gem5.debug.

I have found that gem5.fast is generally 20% faster than gem5.opt for Atomic CPUs.
