Can PTEST be used to test if two registers are both zero or some other condition?

Posted: 2022/1/6 13:05:24. Tags: assembly, x86, sse, intrinsics, sse4

Question

What can you do with SSE4.1 ptest other than testing if a single register is all-zero?

Can you use a combination of ZF and CF to test anything useful about two unknown input registers?

What is PTEST good for? You'd think it would be good for checking the result of a packed-compare (like PCMPEQD or CMPPS), but at least on Intel CPUs, it costs more uops to compare-and-branch using PTEST + JCC than with PMOVMSK(B/PS/PD) + macro-fused CMP+JCC.

See also: Checking that two SSE registers are not both zero without destroying them.

Recommended answer

No, unless I'm missing something clever, ptest with two unknown registers is generally not useful for checking some property about both of them. (Other than obvious stuff you'd already want a bitwise-AND for, like intersection between two bitmaps).

To test two registers for both being all-zero, OR them together and PTEST that against itself.
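
A minimal sketch of that OR-then-PTEST idea with intrinsics. The helper name both_all_zero and the SSE41 macro are my own labels, not part of any library, and the target attribute assumes a GCC/Clang-style toolchain.

```c
#include <emmintrin.h>   /* SSE2: __m128i, _mm_or_si128 */
#include <smmintrin.h>   /* SSE4.1: _mm_testz_si128 (PTEST) */

/* Assumption: GCC/Clang-style target attribute, so this builds even
   without -msse4.1. Other compilers need their own SSE4.1 switch. */
#if defined(__GNUC__)
#define SSE41 __attribute__((target("sse4.1")))
#else
#define SSE41
#endif

/* Hypothetical helper: returns 1 only if every bit of both a and b is zero. */
SSE41 static int both_all_zero(__m128i a, __m128i b)
{
    __m128i either = _mm_or_si128(a, b);      /* POR: bits set in a or in b */
    return _mm_testz_si128(either, either);   /* PTEST: ZF = ((either & either) == 0) */
}
```

The OR costs one extra instruction and a temporary register, but both inputs are left intact.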

ptest xmm0, xmm1 produces two results:

ZF = is xmm0 & xmm1 all-zero?
CF = is (~xmm0) & xmm1 all-zero?

If the second vector is all-zero, the flags don't depend at all on the bits in the first vector.

It may be useful to think of the "is-all-zero" checks as a NOT(bitwise horizontal-OR()) of the AND and ANDNOT results. But probably not, because that's too many steps for my brain to think through easily. That sequence of vertical-AND and then horizontal-OR does maybe make it easier to understand why PTEST doesn't tell you much about a combination of two unknown registers, just like the integer TEST instruction.

Here's a truth table for a 2-bit ptest a,mask. Hopefully this helps in thinking about mixes of zeros and ones with 128b inputs.

Note that CF(a,mask) == ZF(~a,mask).

a    mask     ZF    CF
00   00       1     1
01   00       1     1
10   00       1     1
11   00       1     1

00   01       1     0
01   01       0     1
10   01       1     0
11   01       0     1

00   10       1     0
01   10       1     0
10   10       0     1
11   10       0     1

00   11       1     0
01   11       0     0
10   11       0     0
11   11       0     1
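
The table can be regenerated with a small scalar model. This is only an illustration on 2-bit values in portable C, not the real 128-bit instruction, and the function names are made up for the sketch.

```c
#include <stdio.h>

/* Scalar model of PTEST's ZF: is (a & mask) all-zero? */
static int model_zf(unsigned a, unsigned mask)
{
    return (a & mask) == 0;
}

/* Scalar model of PTEST's CF: is (~a & mask) all-zero?
   High bits of ~a don't matter because mask has none of them set. */
static int model_cf(unsigned a, unsigned mask)
{
    return (~a & mask) == 0;
}

/* Prints the 2-bit truth table.
   Returns 1 if CF(a,mask) == ZF(~a,mask) held on every row. */
static int print_table_2bit(void)
{
    int identity_holds = 1;
    puts("a    mask     ZF    CF");
    for (unsigned mask = 0; mask < 4; mask++) {
        for (unsigned a = 0; a < 4; a++) {
            int zf = model_zf(a, mask);
            int cf = model_cf(a, mask);
            printf("%u%u   %u%u       %d     %d\n",
                   (a >> 1) & 1u, a & 1u,
                   (mask >> 1) & 1u, mask & 1u, zf, cf);
            if (cf != model_zf(~a & 3u, mask))
                identity_holds = 0;
        }
    }
    return identity_holds;
}
```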

Intel's intrinsics guide lists 2 interesting intrinsics for it. Note the naming of the args: a and mask are a clue that they tell you about the parts of a selected by a known AND-mask.

_mm_test_mix_ones_zeros (__m128i a, __m128i mask): returns (ZF == 0 && CF == 0)
_mm_test_all_zeros (__m128i a, __m128i mask): returns ZF

There are also the more simply named versions:

int _mm_testc_si128 (__m128i a, __m128i b): returns CF
int _mm_testnzc_si128 (__m128i a, __m128i b): returns (ZF == 0 && CF == 0)
int _mm_testz_si128 (__m128i a, __m128i b): returns ZF

There are AVX2 __m256i versions of those intrinsics, but the guide only lists the all_zeros and mix_ones_zeros alternate-name versions for __m128i operands.

If you want to test some other condition from C or C++, you should use testc and testz with the same operands, and hope that your compiler realizes that it only needs to do one PTEST, and hopefully even use a single JCC, SETCC, or CMOVCC to implement your logic. (I'd recommend checking the asm, at least for the compiler you care about most.)
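
As a sketch of that pattern, here is one way to classify the bits of a selected by mask using testz and testc on the same operands. The enum and the function name are hypothetical, and whether the compiler emits a single PTEST for the two intrinsic calls is not guaranteed, so it is worth checking the generated asm.

```c
#include <emmintrin.h>   /* SSE2: __m128i */
#include <smmintrin.h>   /* SSE4.1: _mm_testz_si128, _mm_testc_si128 */

/* Assumption: GCC/Clang-style target attribute, so this builds even
   without -msse4.1. Other compilers need their own SSE4.1 switch. */
#if defined(__GNUC__)
#define SSE41 __attribute__((target("sse4.1")))
#else
#define SSE41
#endif

/* Hypothetical result type for this illustration. */
enum mask_bits { BITS_NONE_SET, BITS_ALL_SET, BITS_MIXED };

/* Classifies the bits of a that mask selects. */
SSE41 static enum mask_bits classify_masked_bits(__m128i a, __m128i mask)
{
    int zf = _mm_testz_si128(a, mask);   /* ZF: (a & mask) == 0 */
    int cf = _mm_testc_si128(a, mask);   /* CF: (~a & mask) == 0 */

    if (zf)
        return BITS_NONE_SET;   /* also covers an all-zero mask, where CF is 1 too */
    if (cf)
        return BITS_ALL_SET;
    return BITS_MIXED;          /* same condition as _mm_testnzc_si128(a, mask) */
}
```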

Note that _mm_testz_si128(v, set1(0xff)) is always the same as _mm_testz_si128(v,v), because that's how AND works. But that's not true for the CF result.
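
To make the ZF/CF difference concrete, a small sketch with four hypothetical wrappers (the names are mine). The two ZF variants should agree for every v, while the two CF variants do not.

```c
#include <emmintrin.h>   /* SSE2: __m128i, _mm_set1_epi8 */
#include <smmintrin.h>   /* SSE4.1: _mm_testz_si128, _mm_testc_si128 */

/* Assumption: GCC/Clang-style target attribute, so this builds even
   without -msse4.1. Other compilers need their own SSE4.1 switch. */
#if defined(__GNUC__)
#define SSE41 __attribute__((target("sse4.1")))
#else
#define SSE41
#endif

/* ZF against itself: (v & v) == 0, i.e. v is all-zero. */
SSE41 static int zf_self(__m128i v)
{
    return _mm_testz_si128(v, v);
}

/* ZF against all-ones: (v & ones) == 0, the same condition. */
SSE41 static int zf_vs_ones(__m128i v)
{
    return _mm_testz_si128(v, _mm_set1_epi8((char)0xff));
}

/* CF against itself: (~v & v) == 0, which is true for every v. */
SSE41 static int cf_self(__m128i v)
{
    return _mm_testc_si128(v, v);
}

/* CF against all-ones: (~v & ones) == 0, true only when v is all-ones. */
SSE41 static int cf_vs_ones(__m128i v)
{
    return _mm_testc_si128(v, _mm_set1_epi8((char)0xff));
}
```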

You can check for a vector being all-ones using

bool is_all_ones = _mm_testc_si128(v, _mm_set1_epi8(0xff));

This is probably no faster, but smaller code-size, than a PCMPEQB against a vector of all-ones, then the usual movemask + cmp. It doesn't avoid the need for a vector constant.
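
For comparison, the two approaches side by side as a sketch. The function names are hypothetical, and I have not measured either version.

```c
#include <emmintrin.h>   /* SSE2: _mm_cmpeq_epi8, _mm_movemask_epi8, _mm_set1_epi8 */
#include <smmintrin.h>   /* SSE4.1: _mm_testc_si128 */

/* Assumption: GCC/Clang-style target attribute, so this builds even
   without -msse4.1. Other compilers need their own SSE4.1 switch. */
#if defined(__GNUC__)
#define SSE41 __attribute__((target("sse4.1")))
#else
#define SSE41
#endif

/* PTEST version: CF = ((~v & ones) == 0), set only when v is all-ones. */
SSE41 static int all_ones_ptest(__m128i v)
{
    return _mm_testc_si128(v, _mm_set1_epi8((char)0xff));
}

/* SSE2-only version: PCMPEQB, then PMOVMSKB, then an integer compare. */
static int all_ones_movemask(__m128i v)
{
    __m128i eq = _mm_cmpeq_epi8(v, _mm_set1_epi8((char)0xff)); /* 0xff in each matching byte */
    return _mm_movemask_epi8(eq) == 0xffff;                    /* all 16 bytes matched */
}
```

Both versions still load the same all-ones constant.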

PTEST does have the advantage that it doesn't destroy either input operand, even without AVX.
