How to simulate pcmpgtq on SSE2?


Question

PCMPGTQ was introduced in SSE4.2. It performs a signed greater-than comparison on packed 64-bit integers and yields a mask.

How can this functionality be supported on instruction sets that predate SSE4.2?

Update: The same question applies to ARMv7 with Neon, which also lacks a 64-bit comparator. The sister question is: What is the most efficient way to support CMGT with 64bit signed comparisons on ARMv7a with Neon?

Recommended answer

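#include <emmintrin.h>  /* SSE2 intrinsics */

__m128i pcmpgtq_sse2 (__m128i a, __m128i b) {
    /* Where dwords are equal, keep (b - a); its high dword is all ones iff a.lo > b.lo (unsigned). */
    __m128i r = _mm_and_si128(_mm_cmpeq_epi32(a, b), _mm_sub_epi64(b, a));
    /* Where a's dword is greater than b's (signed), set the dword to all ones. */
    r = _mm_or_si128(r, _mm_cmpgt_epi32(a, b));
    /* Copy each qword's high-dword mask into its low dword. */
    return _mm_shuffle_epi32(r, _MM_SHUFFLE(3,3,1,1));
}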

SSE2 provides 32-bit signed comparison intrinsics, so treat each packed qword as a pair of dwords (high and low).

If the high dword of a is greater than the high dword of b, there is no need to compare the low dwords.

if (a.hi > b.hi) { r.hi = 0xFFFFFFFF; }
if (a.hi <= b.hi) { r.hi = 0x00000000; }

If the high dword of a equals the high dword of b, a 64-bit subtraction will either clear or set all 32 high bits of the result. Because the equal high dwords cancel each other out, the subtraction is effectively an unsigned comparison of the low dwords, with the result placed in the high dword.

if (a.hi == b.hi) { r = (b - a) & 0xFFFFFFFF00000000; }
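
To see why the subtraction behaves this way, here is a minimal scalar sketch in C. The helper name `high_dword_of_diff` is made up for illustration and is not part of the answer. It builds two 64-bit values that share the same high dword, subtracts them the way `_mm_sub_epi64(b, a)` does per lane, and returns the high 32 bits of the difference. When a's low dword is larger (as an unsigned value), the subtraction borrows from the high half and the high dword becomes 0xFFFFFFFF; otherwise it is 0.

```c
#include <stdint.h>

/* Hypothetical helper: high 32 bits of (b - a) for two 64-bit values
   that share the same high dword `hi`. Unsigned arithmetic wraps
   modulo 2^64, which matches the per-lane behaviour of _mm_sub_epi64. */
uint32_t high_dword_of_diff(uint32_t hi, uint32_t a_lo, uint32_t b_lo) {
    uint64_t a = ((uint64_t)hi << 32) | a_lo;
    uint64_t b = ((uint64_t)hi << 32) | b_lo;
    return (uint32_t)((b - a) >> 32);
}
```

For example, `high_dword_of_diff(5, 2, 1)` yields 0xFFFFFFFF because a.lo > b.lo, while `high_dword_of_diff(5, 1, 2)` and `high_dword_of_diff(5, 7, 7)` both yield 0.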

Copy the comparison mask from the high 32 bits to the low 32 bits.

r.lo = r.hi
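
The three steps above can be put together in a scalar model of a single 64-bit lane. This is only a sketch for reasoning about the vector code: the function name `cmpgt64_lane_model` is an assumption, and it relies on the usual two's-complement conversion from `uint32_t` to `int32_t`, which mainstream compilers provide.

```c
#include <stdint.h>

/* Illustrative scalar model of one 64-bit lane of the SSE2 emulation.
   Returns all ones if a > b (signed), otherwise zero. */
uint64_t cmpgt64_lane_model(int64_t a, int64_t b) {
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;
    int32_t a_hi = (int32_t)(uint32_t)(ua >> 32);
    int32_t b_hi = (int32_t)(uint32_t)(ub >> 32);

    /* High dword of _mm_cmpeq_epi32(a, b). */
    uint32_t eq_hi = (a_hi == b_hi) ? 0xFFFFFFFFu : 0u;
    /* High dword of _mm_sub_epi64(b, a). */
    uint32_t sub_hi = (uint32_t)((ub - ua) >> 32);
    /* High dword of _mm_cmpgt_epi32(a, b), a signed comparison. */
    uint32_t gt_hi = (a_hi > b_hi) ? 0xFFFFFFFFu : 0u;

    uint32_t r_hi = (eq_hi & sub_hi) | gt_hi;
    /* The shuffle step: r.lo = r.hi. */
    return ((uint64_t)r_hi << 32) | r_hi;
}
```

The low dwords of the intermediate results are never inspected, which is why the model only tracks the high dword before duplicating it.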

Update: Here's the Godbolt for SSE2 and ARMv7+Neon.
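
For a quick local check, something like the following sketch could be used, assuming an x86 target where `<emmintrin.h>` is available. The function from the answer is repeated so the snippet compiles on its own. The helper `count_pcmpgtq_mismatches` and its sample values are made up for illustration: it compares each lane of the SSE2 result against the plain `int64_t` comparison and counts disagreements.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* The function from the answer, repeated so this snippet is self-contained. */
__m128i pcmpgtq_sse2(__m128i a, __m128i b) {
    __m128i r = _mm_and_si128(_mm_cmpeq_epi32(a, b), _mm_sub_epi64(b, a));
    r = _mm_or_si128(r, _mm_cmpgt_epi32(a, b));
    return _mm_shuffle_epi32(r, _MM_SHUFFLE(3, 3, 1, 1));
}

/* Hypothetical test helper: returns the number of lanes where the SSE2
   result differs from the scalar signed comparison. */
int count_pcmpgtq_mismatches(void) {
    static const int64_t samples[] = {
        0, 1, -1, 2, -2, 0x7FFFFFFF, 0x80000000LL, 0xFFFFFFFFLL,
        0x100000000LL, -0x100000000LL, INT64_MAX, INT64_MIN
    };
    const int n = (int)(sizeof samples / sizeof samples[0]);
    int mismatches = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            /* Lane 0 tests (i, j); lane 1 tests the swapped pair. */
            int64_t av[2] = { samples[i], samples[j] };
            int64_t bv[2] = { samples[j], samples[i] };
            uint64_t out[2];
            __m128i a = _mm_loadu_si128((const __m128i *)av);
            __m128i b = _mm_loadu_si128((const __m128i *)bv);
            _mm_storeu_si128((__m128i *)out, pcmpgtq_sse2(a, b));
            for (int k = 0; k < 2; k++) {
                uint64_t expected = (av[k] > bv[k]) ? UINT64_MAX : 0;
                if (out[k] != expected) mismatches++;
            }
        }
    }
    return mismatches;
}
```

Calling `count_pcmpgtq_mismatches()` from a test and checking that it returns 0 exercises values on both sides of the 32-bit boundary as well as the signed extremes.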
