Comparing 2 vectors in AVX/AVX2 (c)

Published: 2021/4/12 20:53:56  Tags: c simd avx avx2

Question

I have two __m256i vectors (each containing chars), and I want to find out if they are completely identical or not. All I need is true if all bits are equal, and 0 otherwise.

What's the most efficient way of doing that? Here's the code loading the arrays:

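const char *a1 = "abcdefhgabcdefhgabcdefhgabcdefhg";
// Unaligned load: a string literal is not guaranteed to be 32-byte aligned.
__m256i r1 = _mm256_loadu_si256((const __m256i *) a1);

const char *a2 = "abcdefhgabcdefhgabcdefhgabcdefhg";
__m256i r2 = _mm256_loadu_si256((const __m256i *) a2);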

Recommended answer

The most efficient way on current Intel and AMD CPUs is an element-wise comparison for equality, and then check that the comparison was true for all elements.

This compiles to multiple instructions, but they're all cheap and (if you branch on the result) the compare+branch even macro-fuses into a single uop.

#include <immintrin.h>
#include <stdbool.h>

bool vec_equal(__m256i a, __m256i b) {
    __m256i pcmp = _mm256_cmpeq_epi32(a, b);  // epi8 is fine too
    unsigned bitmask = _mm256_movemask_epi8(pcmp);
    return (bitmask == 0xffffffffU);
}

The resulting asm should be vpcmpeqd / vpmovmskb / cmp 0xffffffff / je, which is only 3 uops on Intel CPUs.
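To tie this back to the arrays in the question, here is a minimal sketch that loads two 32-byte buffers and applies the same compare + movemask test. The names `equal32` and `equal32_avx2` are hypothetical helpers, not part of the original answer. The sketch assumes GCC or Clang: the `target("avx2")` attribute and `__builtin_cpu_supports` are compiler-specific, and are used only so the file builds without `-mavx2` and falls back to `memcmp` on CPUs without AVX2. Unaligned loads are used because string literals are not guaranteed to be 32-byte aligned.

```c
#include <immintrin.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper. GCC/Clang-specific attribute: enables AVX2 for this
   one function. When building with -mavx2 the attribute can be dropped. */
__attribute__((target("avx2")))
static bool equal32_avx2(const void *p, const void *q) {
    /* loadu has no alignment requirement, unlike _mm256_load_si256 */
    __m256i a = _mm256_loadu_si256((const __m256i *)p);
    __m256i b = _mm256_loadu_si256((const __m256i *)q);

    /* Each byte of pcmp is 0xFF where a and b match, 0x00 where they differ */
    __m256i pcmp = _mm256_cmpeq_epi8(a, b);

    /* One mask bit per byte: all 32 bits set means all 32 bytes matched */
    unsigned bitmask = (unsigned)_mm256_movemask_epi8(pcmp);
    return bitmask == 0xffffffffU;
}

/* Hypothetical wrapper: compares exactly 32 bytes at p and q.
   Uses the AVX2 path only when the running CPU reports AVX2 support. */
static bool equal32(const void *p, const void *q) {
    if (__builtin_cpu_supports("avx2"))
        return equal32_avx2(p, q);
    return memcmp(p, q, 32) == 0;
}
```

With the question's data, `equal32(a1, a2)` would return true, since both literals hold the same 32 characters; the trailing NUL is not read.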

vptest is 2 uops and doesn't macro-fuse with jcc, so it is equal to or worse than movmsk / cmp for testing the result of a packed compare. (See http://agner.org/optimize/)
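For comparison, a vptest-based check could be sketched as follows: XOR the two vectors and test whether the result is all zero. This is only an illustration of the alternative being discussed, not something the answer recommends. The names `equal32_vptest` and `equal32_vptest_avx2` are hypothetical, and the same GCC/Clang assumptions apply (the `target` attribute and `__builtin_cpu_supports`).

```c
#include <immintrin.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper. GCC/Clang-specific attribute so the file builds
   without -mavx2. */
__attribute__((target("avx2")))
static bool equal32_vptest_avx2(const void *p, const void *q) {
    __m256i a = _mm256_loadu_si256((const __m256i *)p);
    __m256i b = _mm256_loadu_si256((const __m256i *)q);

    /* diff is all-zero exactly when a and b are bitwise identical */
    __m256i diff = _mm256_xor_si256(a, b);

    /* _mm256_testz_si256 (vptest) returns 1 when (diff & diff) == 0 */
    return _mm256_testz_si256(diff, diff) != 0;
}

/* Hypothetical wrapper with a memcmp fallback for CPUs without AVX2 */
static bool equal32_vptest(const void *p, const void *q) {
    if (__builtin_cpu_supports("avx2"))
        return equal32_vptest_avx2(p, q);
    return memcmp(p, q, 32) == 0;
}
```

Both variants give the same answer; the difference the answer points to is only in uop count and macro-fusion when the result feeds a branch.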
