Does ICC satisfy C99 specs for multiplication of complex numbers?


Question

Consider the following simple code:

#include <complex.h>
complex float f(complex float x) {
  return x*x;
}

If you compile it with the Intel compiler using -O3 -march=core-avx2 -fp-model strict, you get:

f:
        vmovsldup xmm1, xmm0                                    #3.12
        vmovshdup xmm2, xmm0                                    #3.12
        vshufps   xmm3, xmm0, xmm0, 177                         #3.12
        vmulps    xmm4, xmm1, xmm0                              #3.12
        vmulps    xmm5, xmm2, xmm3                              #3.12
        vaddsubps xmm0, xmm4, xmm5                              #3.12
        ret 

This is much simpler code than you get from either gcc or clang, and also much simpler than the code you will find online for multiplying complex numbers. For example, it does not appear to deal explicitly with complex NaNs or infinities.

Does this assembly meet the specs for C99 complex multiplication?

Recommended answer

The code is non-conforming.

Annex G, Section 5.1, Paragraph 4 reads:

The * and / operators satisfy the following infinity properties for all real, imaginary, and complex operands:

- if one operand is an infinity and the other operand is a nonzero finite number or an infinity, then the result of the * operator is an infinity;

So if z = a + ib is infinite and w = c + id is infinite, the product z * w must be infinite.

Section 3, Paragraph 1 of the same annex defines what it means for a complex number to be infinite:

A complex or imaginary value with at least one infinite part is regarded as an infinity (even if its other part is a NaN).

So z is infinite if either a or b is.
This is indeed a sensible choice, as it reflects the mathematical framework [1].
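The Annex G definition of an infinite complex value can be sketched in C as follows. This is only an illustration: the helper names make_complex and complex_is_infinite are hypothetical (they are not part of the standard library), and the parts are stored with memcpy so that building the test values involves no arithmetic that could itself turn an infinity into a NaN.

```c
#include <complex.h>
#include <math.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: build a float complex from its two parts without
   doing any arithmetic, so infinities and NaNs are stored exactly as given.
   A complex type has the same layout as an array of two floats. */
float complex make_complex(float re, float im) {
    float parts[2] = { re, im };
    float complex z;
    memcpy(&z, parts, sizeof z);
    return z;
}

/* Annex G, Section 3: a complex value is an infinity if at least one part
   is infinite, even if the other part is a NaN. */
bool complex_is_infinite(float complex z) {
    return isinf(crealf(z)) || isinf(cimagf(z));
}
```

With these helpers, complex_is_infinite(make_complex(NAN, INFINITY)) is true, while a value such as NaN + i1 is not an infinity.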

However, if we let z = ∞ + i∞ (an infinite value) and w = i∞ (also an infinite value), the result for the Intel code is z * w = NaN + iNaN, due to the ∞ · 0 intermediates [2].

This suffices to label it as non-conforming.
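To see where the NaNs come from, here is a minimal sketch of the textbook formula (a + ib)(c + id) = (ac - bd) + i(ad + bc) with no special-case handling, which is the arithmetic the short instruction sequence appears to perform. The struct parts type and the function naive_mul are hypothetical names used only for this illustration; the operands are passed as separate floats so that no complex arithmetic is needed to construct them.

```c
#include <math.h>

/* Hypothetical pair type holding the real and imaginary parts. */
struct parts { float re, im; };

/* Textbook complex product with no checks for infinities or NaNs:
   (a + ib)(c + id) = (ac - bd) + i(ad + bc). */
struct parts naive_mul(float a, float b, float c, float d) {
    struct parts r;
    r.re = a * c - b * d;
    r.im = a * d + b * c;
    return r;
}
```

For z = ∞ + i∞ and w = 0 + i∞, that is naive_mul(INFINITY, INFINITY, 0.0f, INFINITY), the real part is ∞ · 0 - ∞ · ∞ and the imaginary part is ∞ · ∞ + ∞ · 0. Each contains an ∞ · 0 term, which is NaN, so both parts come out as NaN even though both operands are infinities.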

We can further confirm this by looking at the footnote attached to the first quote (the footnote is not reproduced here), which mentions the CX_LIMITED_RANGE pragma directive.

Section 7.3.4, Paragraph 1 reads:

The usual mathematical formulas for complex multiply, divide, and absolute value are problematic because of their treatment of infinities and because of undue overflow and underflow. The CX_LIMITED_RANGE pragma can be used to inform the implementation that (where the state is "on") the usual mathematical formulas [which may produce NaNs] are acceptable.

Here the standards committee is trying to alleviate the large amount of work required for complex multiplication (and division).
In fact, GCC has a flag to control this behaviour:

-fcx-limited-range
When enabled, this option states that a range reduction step is not needed when performing complex division. Also, there is no checking whether the result of a complex multiplication or division is NaN + I*NaN, with an attempt to rescue the situation in that case. The default is -fno-cx-limited-range, but is enabled by -ffast-math.
This option controls the default setting of the ISO C99 CX_LIMITED_RANGE pragma.

It is this option alone, in its default disabled state, that makes GCC generate the slower code with the additional checks. With -fcx-limited-range enabled, the code GCC generates has the same flaws as Intel's (I translated the source to C++):

f(std::complex<float>):
        movq    QWORD PTR [rsp-8], xmm0
        movss   xmm0, DWORD PTR [rsp-8]
        movss   xmm2, DWORD PTR [rsp-4]
        movaps  xmm1, xmm0
        movaps  xmm3, xmm2
        mulss   xmm1, xmm0
        mulss   xmm3, xmm2
        mulss   xmm0, xmm2
        subss   xmm1, xmm3
        addss   xmm0, xmm0
        movss   DWORD PTR [rsp-16], xmm1
        movss   DWORD PTR [rsp-12], xmm0
        movq    xmm0, QWORD PTR [rsp-16]
        ret

Without -fcx-limited-range (the default), the code is:

f(std::complex<float>):
        sub     rsp, 40
        movq    QWORD PTR [rsp+24], xmm0
        movss   xmm3, DWORD PTR [rsp+28]
        movss   xmm2, DWORD PTR [rsp+24]
        movaps  xmm1, xmm3
        movaps  xmm0, xmm2
        call    __mulsc3
        movq    QWORD PTR [rsp+16], xmm0
        movss   xmm0, DWORD PTR [rsp+16]
        movss   DWORD PTR [rsp+8], xmm0
        movss   xmm0, DWORD PTR [rsp+20]
        movss   DWORD PTR [rsp+12], xmm0
        movq    xmm0, QWORD PTR [rsp+8]
        add     rsp, 40
        ret

The __mulsc3 function is practically the same as the one the C99 standard recommends for complex multiplication.
It includes the above-mentioned checks.
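The checks in question can be sketched as follows. This is an illustrative float version modeled on the example multiplication routine given in Annex G, Section 5.1 of the standard, not the actual source of __mulsc3; the names struct cparts, box, denan and checked_mul are hypothetical, and the operands are again passed as separate floats.

```c
#include <math.h>

/* Hypothetical pair type holding the real and imaginary parts. */
struct cparts { float re, im; };

/* "Box" a part: an infinite part becomes +-1, anything else becomes +-0. */
static float box(float v) { return copysignf(isinf(v) ? 1.0f : 0.0f, v); }

/* Replace a NaN by a zero carrying the NaN's sign; leave other values alone. */
static float denan(float v) { return isnan(v) ? copysignf(0.0f, v) : v; }

/* (a + ib)(c + id), recovering an infinite result when the textbook
   formula yields NaN + iNaN. */
struct cparts checked_mul(float a, float b, float c, float d) {
    float ac = a * c, bd = b * d, ad = a * d, bc = b * c;
    struct cparts r = { ac - bd, ad + bc };
    if (isnan(r.re) && isnan(r.im)) {
        int recalc = 0;
        if (isinf(a) || isinf(b)) {        /* first operand is infinite */
            a = box(a); b = box(b);
            c = denan(c); d = denan(d);
            recalc = 1;
        }
        if (isinf(c) || isinf(d)) {        /* second operand is infinite */
            c = box(c); d = box(d);
            a = denan(a); b = denan(b);
            recalc = 1;
        }
        if (!recalc && (isinf(ac) || isinf(bd) || isinf(ad) || isinf(bc))) {
            /* finite operands whose partial products overflowed */
            a = denan(a); b = denan(b); c = denan(c); d = denan(d);
            recalc = 1;
        }
        if (recalc) {
            r.re = INFINITY * (a * c - b * d);
            r.im = INFINITY * (a * d + b * c);
        }
    }
    return r;
}
```

For the operands discussed above, checked_mul(INFINITY, INFINITY, 0.0f, INFINITY) takes the recovery path and returns -∞ + i∞, which is an infinity as Annex G requires. The extra branches only run when both parts of the first attempt are NaN, so the common finite case costs one additional test.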

[1] Here the modulus of a number is extended from the real case |z| to the complex one ‖z‖, keeping the definition of infinity as the result of unbounded limits. Simply put, in the complex plane there is a whole circumference of infinite values, and it takes just one "coordinate" being infinite to get an infinite modulus.

[2] The situation gets worse if we remember that z = NaN + i∞ and z = ∞ + iNaN are valid infinite values.
