How to determine whether C math uses SSE2?

Question

I stepped into the assembly of the C library's transcendental math functions with MSVC in /fp:strict mode. They all seem to follow the same pattern; here's what happens for sin.

First there is a dispatch routine from a file called "disp_pentium4.inc". It checks if the variable ___use_sse2_mathfcns has been set; if so, calls __sin_pentium4, otherwise calls __sin_default.

__sin_pentium4 (in "sin_pentium4.asm") starts by transferring the argument from the x87 FPU to the xmm0 register, performs the calculation using SSE2 instructions, and loads the result back onto the x87 FPU.

__sin_default (in "sin.asm") keeps the variable on the x87 stack and simply executes fsin.

So in both cases, the operand is pushed onto the x87 stack and the result is returned on it as well, which makes the choice transparent to the caller. But if ___use_sse2_mathfcns is set, the operation is actually performed in SSE2 rather than x87.
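
To make the dispatch pattern concrete, here is a rough C sketch of what the routine appears to do. Every name in it is hypothetical: the flag stands in for ___use_sse2_mathfcns, and the two stubs stand in for __sin_pentium4 and __sin_default. The stubs do not compute a sine; they only record which path was taken.

```c
/* Hypothetical stand-in for the CRT's internal ___use_sse2_mathfcns flag. */
static int use_sse2_mathfcns = 0;

typedef enum { PATH_X87, PATH_SSE2 } math_path;

/* Records which implementation the last call went through. */
static math_path last_path = PATH_X87;

/* Stub for the SSE2 routine: returns x unchanged, NOT a real sine. */
static double sin_sse2_stub(double x)
{
    last_path = PATH_SSE2;
    return x;
}

/* Stub for the x87 routine: returns x unchanged, NOT a real sine. */
static double sin_x87_stub(double x)
{
    last_path = PATH_X87;
    return x;
}

/* The dispatcher: one flag test, then a call to one of two implementations.
   The caller sees the same signature either way. */
double sin_dispatch(double x)
{
    return use_sse2_mathfcns ? sin_sse2_stub(x) : sin_x87_stub(x);
}
```

The point of the sketch is that the caller cannot tell which branch ran from the call itself, so the only way to know is to inspect or control the flag.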

This behavior is very interesting to me because the x87 transcendental functions are notorious for having slightly different behaviors depending on the implementation, whereas a given piece of SSE2 code should always give reproducible results.

Is there a way to determine for certain, either at compile time or at run time, that the SSE2 code path will be used? I am not proficient at writing assembly, so if this involves any assembly, a code example would be appreciated.

Recommended answer

I found the answer through careful investigation of math.h. This is controlled by a function called _set_SSE2_enable. It is a public symbol, documented as follows:

Enables or disables the use of Streaming SIMD Extensions 2 (SSE2) instructions in CRT math routines. (This function is not available on x64 architectures because SSE2 is enabled by default.)

Calling it sets the aforementioned ___use_sse2_mathfcns flag to the provided value, effectively enabling or disabling use of the _pentium4 SSE2 routines.
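
As a minimal sketch of how this could be checked at run time: _set_SSE2_enable is declared in math.h and, as I understand the documentation, returns nonzero if the SSE2 implementation is enabled and zero otherwise. The wrapper name crt_math_uses_sse2 and the -1 "not applicable" return value are my own, purely for illustration.

```c
#include <math.h>

/* Hypothetical helper (not part of the CRT).
   Returns 1 if the CRT math routines will use SSE2, 0 if they will not,
   and -1 if the question does not apply to this toolchain. */
int crt_math_uses_sse2(void)
{
#if defined(_MSC_VER) && defined(_M_IX86)
    /* 32-bit MSVC: request the SSE2 routines; a nonzero return value
       means the SSE2 implementation is enabled. */
    return _set_SSE2_enable(1) != 0;
#elif defined(_MSC_VER) && defined(_M_X64)
    /* x64 MSVC: _set_SSE2_enable is not available because SSE2 is
       enabled by default. */
    return 1;
#else
    /* Other compilers or targets: no equivalent switch is assumed here. */
    return -1;
#endif
}
```

Calling this once at startup and asserting on the result would turn the assumption "SSE2 is in use" into something the program verifies.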

The documentation mentions that this affects only certain transcendental functions, but looking at the disassembly, it seems to affect every one of them.

Edit: stepping into every function reveals that they're all available in SSE2 except for the following:


  • fmod

  • sinh

  • cosh

  • tanh

  • sqrt

sqrt is the biggest offender, but it's trivial to implement in SSE2 using intrinsics. For the others, there's no simple solution except perhaps using a third-party library, but I can probably do without them.
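
For reference, a square root via SSE2 intrinsics could look something like the sketch below. It assumes an x86 or x64 target whose compiler provides emmintrin.h; the function name sse2_sqrt is made up for this example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: scalar double-precision square root computed with
   the SSE2 sqrtsd instruction instead of the x87 fsqrt. */
static double sse2_sqrt(double x)
{
    double result;
    __m128d v = _mm_set_sd(x);   /* low lane = x, high lane = 0.0 */
    v = _mm_sqrt_sd(v, v);       /* low lane = sqrt(low lane of v) */
    _mm_store_sd(&result, v);    /* write the low lane back to memory */
    return result;
}
```

Whether the compiler keeps the value in an SSE register across the call depends on the calling convention and optimization settings, so it is worth checking the disassembly if reproducibility is the goal.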
