Using SIMD to right shift 32 bit packed negative number

Question

I'm writing some SSE/AVX code and there's a task to divide packed signed 32-bit integers (2's complement) by a power of 2 using a right shift. When the values are positive this shift works fine, however it produces wrong results for negative values, because the sign bit gets shifted.
Is there any SIMD operation that lets me shift while preserving the position of the sign bit? Thanks

Recommended answer

SSE2/AVX2 has a choice of arithmetic (see footnote 1) vs. logical right shifts for 16-bit and 32-bit element sizes. (For 64-bit elements, only logical is available until AVX512.)

Use _mm_srai_epi32 (psrad) instead of _mm_srli_epi32 (psrld).
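
As a minimal sketch of the difference, assuming an x86 target with SSE2 enabled, the two helpers below shift four packed 32-bit integers right by one bit. The function names halve_arith and halve_logical are made up for this illustration; only the intrinsics come from the answer.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: arithmetic right shift by 1 (psrad).
   Copies of the sign bit are shifted in, so negative inputs stay negative. */
static void halve_arith(const int32_t in[4], int32_t out[4]) {
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    __m128i r = _mm_srai_epi32(v, 1);
    _mm_storeu_si128((__m128i *)out, r);
}

/* Hypothetical helper: logical right shift by 1 (psrld).
   Zeros are shifted in, so negative inputs turn into large positive values. */
static void halve_logical(const int32_t in[4], int32_t out[4]) {
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    __m128i r = _mm_srli_epi32(v, 1);
    _mm_storeu_si128((__m128i *)out, r);
}
```

For the input {8, -8, 7, -7}, halve_arith gives {4, -4, 3, -4}, while halve_logical gives {4, 2147483644, 3, 2147483644}.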

See Intel's intrinsics guide, and other links in the SSE tag wiki https://stackoverflow.com/tags/sse/info. (Filter it to exclude AVX512 if you want, because it's pretty cluttered these days with all the masked versions for all 3 sizes...)

Or just look at the asm instruction-set reference, which includes intrinsics for instructions that have them. Searching for "arithmetic" in http://felixcloutier.com/x86/index.html finds the shifts you want.

Note the a=arithmetic vs. l=logical in the intrinsic names (srai vs. srli), instead of the usual intrinsics naming scheme of epu32 for unsigned. The asm mnemonics are simple and consistent (e.g. Packed Shift Right Arithmetic Dword = psrad).

Arithmetic right shifts are also available for AVX2 variable shifts (vpsravd), and for the one-variable-for-all-elements version of the immediate shifts.
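
A possible sketch of the one-variable-for-all-elements form, assuming SSE2: _mm_sra_epi32 takes the shift count from the low 64 bits of a second vector, so the count can be a run-time value. The helper name shift_right_arith_n is hypothetical. The per-element AVX2 form (vpsravd, exposed as _mm_srav_epi32 / _mm256_srav_epi32) is left out here so the sketch builds without AVX2.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: arithmetic right shift of four int32 values by a
   run-time count that is shared by all elements (psrad xmm, xmm). */
static void shift_right_arith_n(const int32_t in[4], int32_t out[4], int count) {
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    __m128i c = _mm_cvtsi32_si128(count);   /* count in the low bits, rest zeroed */
    __m128i r = _mm_sra_epi32(v, c);
    _mm_storeu_si128((__m128i *)out, r);
}
```

With count = 3, the input {-16, 16, -1, 1} becomes {-2, 2, -1, 0}.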

Footnote 1:

Arithmetic right shifts shift in copies of the sign bit, instead of zeros.

This correctly implements 2's complement signed division by powers of 2 with rounding towards negative infinity, unlike the truncation toward zero you get from C signed division. Look at the asm output for int foo(int a){return a/4;} to see how compilers implement signed division semantics in terms of shifts.
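
To illustrate the rounding difference, here is a sketch assuming SSE2. div4_floor is the plain arithmetic shift. div4_trunc adds a bias of 3 to negative elements before shifting, which is one common way to get C's truncating a/4 out of shifts; it is not necessarily the exact sequence a given compiler emits. Both function names are hypothetical.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: divide by 4, rounding toward negative infinity. */
static void div4_floor(const int32_t in[4], int32_t out[4]) {
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, _mm_srai_epi32(v, 2));
}

/* Hypothetical helper: divide by 4, truncating toward zero like C's a / 4. */
static void div4_trunc(const int32_t in[4], int32_t out[4]) {
    __m128i v    = _mm_loadu_si128((const __m128i *)in);
    __m128i sign = _mm_srai_epi32(v, 31);      /* 0 for non-negative, -1 for negative */
    __m128i bias = _mm_srli_epi32(sign, 30);   /* 0 for non-negative, 3 for negative */
    __m128i r    = _mm_srai_epi32(_mm_add_epi32(v, bias), 2);
    _mm_storeu_si128((__m128i *)out, r);
}
```

For the input {-7, 7, -8, -1}, div4_floor gives {-2, 1, -2, -1} and div4_trunc gives {-1, 1, -2, 0}, the latter matching in[i] / 4 in C.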