How to determine whether C math uses SSE2?
Question
I stepped into the assembly of the transcendental math functions of the C library with MSVC in /fp:strict mode. They all seem to follow the same pattern; here is what happens for sin.
First there is a dispatch routine from a file called "disp_pentium4.inc". It checks whether the variable ___use_sse2_mathfcns has been set; if so, it calls __sin_pentium4, otherwise it calls __sin_default.
__sin_pentium4 (in "sin_pentium4.asm") starts by transferring the argument from the x87 FPU to the xmm0 register, performs the calculation using SSE2 instructions, and loads the result back into the FPU.
__sin_default (in "sin.asm") keeps the variable on the x87 stack and simply executes the fsin instruction.
So in both cases the operand is pushed onto the x87 stack and the result is returned on it as well, which makes the choice transparent to the caller. But if ___use_sse2_mathfcns is set, the operation is actually performed in SSE2 rather than x87.
This behavior is very interesting to me because the x87 transcendental instructions are notorious for behaving slightly differently depending on the processor implementation, whereas a given piece of SSE2 code should always give reproducible results.
Is there a way to determine for certain, either at compile time or at run time, that the SSE2 code path will be used? I am not proficient at writing assembly, so if this involves writing any assembly, a code example would be appreciated.
Recommended answer
I found the answer through careful investigation of math.h. This is controlled by a function called _set_SSE2_enable. This is a public symbol documented here:
Enables or disables the use of Streaming SIMD Extensions 2 (SSE2) instructions in CRT math routines. (This function is not available on x64 architectures because SSE2 is enabled by default.)
Calling it sets the aforementioned ___use_sse2_mathfcns flag to the provided value, effectively enabling or disabling use of the _pentium4 SSE2 routines.
The documentation mentions that this affects only certain transcendental functions, but judging from the disassembly, it seems to affect every one of them.
Edit: stepping into every function reveals that they are all available in SSE2 except for the following:
- fmod
- sinh
- cosh
- tanh
- sqrt
sqrt is the biggest offender, but it is trivial to implement in SSE2 using intrinsics. For the others, there is no simple solution except perhaps using a third-party library, but I can probably do without them.