How to control whether C math uses SSE2?


Question


I stepped into the assembly of the C library's transcendental math functions with MSVC in fp:strict mode. They all seem to follow the same pattern; here's what happens for sin.

First there is a dispatch routine from a file called "disp_pentium4.inc". It checks if the variable ___use_sse2_mathfcns has been set; if so, calls __sin_pentium4, otherwise calls __sin_default.

__sin_pentium4 (in "sin_pentium4.asm") starts by transferring the argument from the x87 FPU to the xmm0 register, performs the calculation using SSE2 instructions, and loads the result back onto the FPU.

__sin_default (in "sin.asm") keeps the variable on the x87 stack and simply calls fsin.

So in both cases, the operand is pushed on the x87 stack and the result is returned on it as well, which makes the choice transparent to the caller. If ___use_sse2_mathfcns is set, however, the operation is actually performed in SSE2 rather than x87.
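The dispatch pattern described above can be pictured roughly as follows. This is only an illustrative sketch in C: the real routines are written in assembly, and every name here (use_sse2_flag, set_flag_sketch, sin_dispatch_sketch, the two stubs) is hypothetical. The stubs only report which path was chosen; they do not compute a sine.

```c
/* Hypothetical stand-in for the CRT's internal ___use_sse2_mathfcns flag. */
static int use_sse2_flag = 0;

/* Which back end the dispatcher selected. */
enum math_path { PATH_X87_DEFAULT = 0, PATH_SSE2_PENTIUM4 = 1 };

/* Placeholder back ends: a real CRT would compute sin here. */
static enum math_path sin_default_stub(void)  { return PATH_X87_DEFAULT; }
static enum math_path sin_pentium4_stub(void) { return PATH_SSE2_PENTIUM4; }

/* Hypothetical setter that normalizes any nonzero value to 1. */
void set_flag_sketch(int enabled)
{
    use_sse2_flag = (enabled != 0);
}

/* The dispatcher checks the flag on every call and forwards to one back end. */
enum math_path sin_dispatch_sketch(void)
{
    return use_sse2_flag ? sin_pentium4_stub() : sin_default_stub();
}
```

Because the caller only ever sees the dispatcher, switching the flag changes the code path without changing the calling convention.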

This behavior is very interesting to me because the x87 transcendental functions are notorious for having slightly different behaviors depending on the implementation, whereas a given piece of SSE2 code should always give reproducible results.

Is there a way to determine for certain, either at compile time or at run time, that the SSE2 code path will be used? I am not proficient at writing assembly, so if this involves writing any, a code example would be appreciated.

Solution

I found the answer through careful investigation of math.h. The behavior is controlled by a function called _set_SSE2_enable. It is a public symbol, and its documentation reads:

Enables or disables the use of Streaming SIMD Extensions 2 (SSE2) instructions in CRT math routines. (This function is not available on x64 architectures because SSE2 is enabled by default.)

This causes the aforementioned ___use_sse2_mathfcns flag to be set to the provided value, effectively enabling or disabling use of the _pentium4 SSE2 routines.
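As a minimal sketch of how this might be used from C: _set_SSE2_enable is declared in math.h on MSVC and returns nonzero when the SSE2 routines are enabled, so the return value doubles as a run-time check. The wrapper name request_sse2_math and its -1 "not applicable" result are assumptions made for illustration, as is the preprocessor guard that restricts the call to 32-bit x86 MSVC builds.

```c
#include <math.h>

/* Hypothetical wrapper around _set_SSE2_enable.
   Returns  1 if the CRT reports its SSE2 math routines as enabled,
            0 if it reports them as disabled (e.g. the CPU lacks SSE2),
           -1 if the function does not apply to this build
              (not MSVC, or not a 32-bit x86 target). */
int request_sse2_math(void)
{
#if defined(_MSC_VER) && defined(_M_IX86)
    /* Nonzero return means SSE2 is enabled for the CRT math routines. */
    return _set_SSE2_enable(1) != 0 ? 1 : 0;
#else
    return -1;
#endif
}
```

Calling request_sse2_math() once at startup and checking for a result of 1 would be one way to confirm the SSE2 path before relying on reproducible results.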

The documentation mentions that this affects only certain transcendental functions, but judging from the disassembly, it seems to affect every one of them.

Edit: stepping into every function reveals that they're all available in SSE2 except for the following:

  • fmod
  • sinh
  • cosh
  • tanh
  • sqrt

sqrt is the biggest offender, but it's trivial to implement in SSE2 using intrinsics. For the others, there's no simple solution except perhaps using a third-party library, but I can probably do without them.
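For reference, an SSE2 square root built from intrinsics might look like the sketch below. The function name sqrt_sse2 is hypothetical, and the preprocessor guard (with its fallback to the library sqrt when SSE2 is not a compile-time target) is an assumption added so the snippet still builds elsewhere. The intrinsics themselves (_mm_set_sd, _mm_sqrt_sd, _mm_store_sd) come from emmintrin.h.

```c
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: square root computed with the SSE2 sqrtsd instruction. */
double sqrt_sse2(double x)
{
    double out;
    __m128d v = _mm_set_sd(x);      /* x in the low lane, 0.0 in the high lane */
    __m128d r = _mm_sqrt_sd(v, v);  /* low lane = sqrt(low lane of 2nd argument) */
    _mm_store_sd(&out, r);          /* copy the low lane back to memory */
    return out;
}
#else
#include <math.h>

/* Fallback when SSE2 is not enabled at compile time: defer to the library. */
double sqrt_sse2(double x)
{
    return sqrt(x);
}
#endif
```

Since IEEE 754 requires square root to be correctly rounded, sqrt_sse2(2.25) returns exactly 1.5 on the SSE2 path.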
