Why can't clang and gcc optimize away this int-to-float conversion?

Question

Consider the following code:

void foo(float* __restrict__ a)
{
    int i; float val;
    for (i = 0; i < 100; i++) {
        val = 2 * i;
        a[i] = val;
    }
}

void bar(float* __restrict__ a)
{
    int i; float val = 0.0;
    for (i = 0; i < 100; i++) {
        a[i] = val;
        val += 2.0;
    }
}

They're based on Examples 7.26a and 7.26b in Agner Fog's Optimizing software in C++ and should do the same thing. bar is more "efficient" as written, in the sense that it doesn't perform an integer-to-float conversion at every iteration but rather a float addition, which is cheaper (on x86_64).

Here are the clang and gcc results on these two functions (with vectorization and unrolling disabled).

Question: It seems to me that replacing a multiplication by the loop index with the addition of a constant value, when this is beneficial, is an optimization compilers should carry out, even if (or perhaps especially if) a type conversion is involved. Why is this not happening for these two functions?

Note that if we use ints rather than floats:

void foo(int* __restrict__ a)
{
    int i; int val = 0;
    for (i = 0; i < 100; i++) {
        val = 2 * i;
        a[i] = val;
    }
}

void bar(int* __restrict__ a)
{
    int i; int val = 0;
    for (i = 0; i < 100; i++) {
        a[i] = val;
        val += 2;
    }
}

Both clang and gcc perform the expected optimization, albeit not quite in the same way (see this question).

Recommended answer

You are looking for induction variable optimization applied to floating-point numbers. This optimization is generally unsafe in floating point because it changes program semantics: repeated addition rounds at every step, whereas converting the integer index rounds only once, so the two forms can produce different values. In your example it would work, because the initial value (0.0), the step (2.0), and every partial sum up to 198 can be represented exactly in IEEE format, but this is a rare case in practice.
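To make the semantic difference concrete, here is a minimal sketch in C that counts how often the two loop forms disagree. The helper name count_mismatches and the generic step parameter are illustrative and not part of the original question. The sketch assumes IEEE 754 single-precision float arithmetic with no excess precision and no -ffast-math.

```c
#include <stddef.h>

/* Hypothetical helper: count the indices in [0, n) where the two loop
   forms would store different values.
   converted   mirrors foo: the index is converted and scaled, rounding once.
   accumulated mirrors bar: the value is built by repeated addition,
               rounding after every step. */
size_t count_mismatches(float step, size_t n)
{
    size_t mismatches = 0;
    float accumulated = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float converted = (float)i * step;
        if (converted != accumulated)
            mismatches++;
        accumulated += step;
    }
    return mismatches;
}
```

Under those assumptions, count_mismatches(2.0f, 100) returns 0, because every value involved is an exactly representable integer. count_mismatches(0.1f, 100) is nonzero: the two forms first diverge at i = 7, where the accumulated sum has picked up rounding error that the single multiplication has not. A compiler would have to prove the first situation holds before it could legally rewrite foo into bar.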

It could be enabled under -ffast-math, but it seems this wasn't considered an important case in GCC, which rejects non-integral induction variables early on (see tree-scalar-evolution.c).

If you believe this is an important use case, you might consider filing a request at GCC Bugzilla.
