Where is Clang's '_mm256_pow_ps' intrinsic?


Question

I can't seem to find the intrinsics for either _mm_pow_ps or _mm256_pow_ps, both of which are supposed to be included with 'immintrin.h'.

Does Clang not define these, or are they in a header I'm not including?

Recommended answer

That's not an intrinsic; it's an Intel SVML library function name that confusingly uses the same naming scheme as actual intrinsics. There's no vpowps instruction. (AVX512ER on Xeon Phi does have the semi-related vexp2ps instruction...)

IDK if this naming scheme is meant to trick people into depending on Intel tools when writing SIMD code with their compiler (which comes with SVML), or if it's because their compiler does treat it like an intrinsic/builtin for doing constant propagation when inputs are known, or some other reason.

For functions like that and _mm_sin_ps to be usable, you need Intel's Short Vector Math Library (SVML). Most people just avoid using them. If it has an implementation of something you want, though, it's worth looking into. IDK what other vector pow implementations exist.

In the intrinsics finder, you can avoid seeing these non-portable functions in your search results if you leave the SVML box unchecked.

There are some "composite" intrinsics like _mm_set_epi8() that typically compile to multiple loads and shuffles. These are portable across compilers, and they inline instead of being calls to library functions.
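As a small illustration of such a composite intrinsic, here is a sketch using _mm_set_epi8() from SSE2. The function name store_iota_bytes is made up for this example. With constant arguments like these, a compiler will usually just load a constant; with runtime values it has to emit several instructions.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_set_epi8, _mm_storeu_si128 */
#include <stdint.h>

/* Writes the bytes 0..15 to out in memory order.
   _mm_set_epi8 takes its arguments from the highest element down to
   the lowest, so the LAST argument ends up in byte 0. */
static void store_iota_bytes(int8_t out[16])
{
    __m128i v = _mm_set_epi8(15, 14, 13, 12, 11, 10, 9, 8,
                             7, 6, 5, 4, 3, 2, 1, 0);
    _mm_storeu_si128((__m128i *)out, v);  /* unaligned store is fine here */
}
```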

Also note that sqrtps is a native machine instruction, so _mm_sqrt_ps() is a real intrinsic. IEEE 754 specifies mul, div, add, sub, and sqrt as "basic" operations that are required to produce correctly-rounded results (error <= 0.5ulp), so sqrt() is special and does have direct hardware support, unlike most other "math library" functions.
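For contrast with pow, a real intrinsic like _mm_sqrt_ps() needs nothing beyond the header. A minimal sketch (the wrapper name sqrt4 is just for illustration):

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_loadu_ps, _mm_sqrt_ps, _mm_storeu_ps */

/* Square root of 4 floats at once. The square root itself maps to one
   sqrtps (or vsqrtps) instruction; no math library is involved.
   in and out need no particular alignment. */
static void sqrt4(const float in[4], float out[4])
{
    __m128 v = _mm_loadu_ps(in);
    _mm_storeu_ps(out, _mm_sqrt_ps(v));
}
```

Because the result is correctly rounded, inputs with exactly representable roots (4.0f, 2.25f, ...) give exact outputs.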

There are various libraries of SIMD math functions. Some of them come with C++ wrapper libraries that allow a+b instead of _mm_add_ps(a,b).

  • glibc libmvec - since glibc 2.22, to support OpenMP 4.0 vector math functions. GCC knows how to auto-vectorize some functions like cos(), sin(), and probably pow() using it. This answer shows one inconvenient way of using it explicitly for manual vectorization. (Hopefully better ways are possible that don't have mangled names in the source code.)

  • Agner Fog's VCL has some math functions like exp and log. (Formerly GPL licensed, now Apache.)

  • https://sleef.org/ - apparently great performance, with variable accuracy you can choose. Formerly only supported on MSVC on Windows; the support matrix on its web site now includes GCC and Clang for x86-64 GNU/Linux and AArch64.

  • Intel's own SVML (comes with ICC; ICC auto-vectorizes with SVML by default). Confusingly has its prototypes in immintrin.h along with actual intrinsics. Maybe they want to trick people into writing code that's dependent on Intel tools/libraries. Or maybe they think fewer includes are better and that everyone should use their compiler...

    Also related: Intel MKL (Math Kernel Library), with matrix BLAS functions.

  • AMD ACML - end-of-life closed-source freeware. I think it just has functions that loop over arrays/matrices (like Intel MKL), not functions for single SIMD vectors.

  • sse_mathfun (zlib license) - SSE2 and ARM NEON. Hasn't been updated since about 2011, it seems, but it does have implementations of single-vector math / trig functions.
