Horizontal minimum and position in SSE for unsigned 32-bit integers

Question

I am looking for a way to find the minimum and its position in SSE for unsigned 32-bit integers (similar to _mm_minpos_epu16). I know I can find the minimum through a series of _mm_min_epu32 and shuffles/shifts but that doesn't get me the position.

Does anyone have a good way to do this?

Recommended answer

There is probably a cleverer method, but for now here's a brute force approach:

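#include <stdio.h>
#include <stdint.h>
#include <smmintrin.h> // SSE4.1 (_mm_min_epu32); also provides SSSE3 (_mm_alignr_epi8)

// Print the four 32-bit elements of v as unsigned values
// (portable replacement for the non-standard %vlu printf format)
static void print_u32x4(const char *label, __m128i v)
{
    uint32_t a[4];

    _mm_storeu_si128((__m128i *)a, v);
    printf("%s = %u %u %u %u\n", label,
           (unsigned)a[0], (unsigned)a[1], (unsigned)a[2], (unsigned)a[3]);
}

int main(void)
{
    __m128i v = _mm_setr_epi32(42, 1, 43, 2);

    print_u32x4("v    ", v);

    __m128i vmin = v;

    vmin = _mm_min_epu32(vmin, _mm_alignr_epi8(vmin, vmin, 4));
    vmin = _mm_min_epu32(vmin, _mm_alignr_epi8(vmin, vmin, 8));
                                                   // get min value in all elements of vmin

    print_u32x4("vmin ", vmin);

    __m128i vmask = _mm_cmpeq_epi32(v, vmin);      // set min element(s) in mask to all ones
                                                   // (0xFFFFFFFF), all others to 0 [1]

    print_u32x4("vmask", vmask);

    int mask = _mm_movemask_epi8(vmask);           // get mask as scalar, one bit per byte,
                                                   // i.e. 4 bits per element [2]

    printf("mask  = %#x\n", mask);

    int pos = __builtin_ctz(mask) >> 2;            // convert scalar mask to index [3]
                                                   // (mask is never 0 here)

    printf("pos   = %d\n", pos);

    return 0;
}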

If you can use a mask which is set at the position(s) of the minimum element(s) then you can just stop at [1], otherwise continue to [3] to get the index of the (least significant) minimum element.
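As a variation on steps [2] and [3], the vector mask can be reinterpreted as four floats so that `_mm_movemask_ps` yields one bit per element instead of four, which removes the `>> 2`. The following is only a sketch under some assumptions: `min_pos_epu32` is a hypothetical helper name, and the code assumes GCC or Clang (it uses `__builtin_ctz` and the `target` function attribute; compiling with `-msse4.1` instead of the attribute would also work).

```c
#include <stdint.h>
#include <smmintrin.h> // SSE4.1

// Hypothetical helper (not part of any library): returns the index (0..3) of
// the first minimum element of v and stores the minimum value in *min_out.
__attribute__((target("sse4.1")))
static int min_pos_epu32(__m128i v, uint32_t *min_out)
{
    // Rotate by one element, then by two, so every element ends up holding the minimum
    __m128i vmin = _mm_min_epu32(v, _mm_alignr_epi8(v, v, 4));
    vmin = _mm_min_epu32(vmin, _mm_alignr_epi8(vmin, vmin, 8));

    // All-ones in every element equal to the minimum, zero elsewhere
    __m128i vmask = _mm_cmpeq_epi32(v, vmin);

    // Take the sign bit of each 32-bit element: a 4-bit mask, one bit per element
    int mask = _mm_movemask_ps(_mm_castsi128_ps(vmask));

    *min_out = (uint32_t)_mm_cvtsi128_si32(vmin);

    // mask is never 0 because at least one element equals the minimum
    return __builtin_ctz((unsigned)mask);
}
```

For the vector used in the answer, `_mm_setr_epi32(42, 1, 43, 2)`, this helper would report index 1 with minimum value 1. When several elements tie for the minimum, the lowest index is returned.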

Note also that __builtin_ctz is a gcc-specific intrinsic (although it's found in other gcc-compatible compilers too). If you're using MSVC then you'll need to use the equivalent Microsoft intrinsic (_BitScanForward).
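One way to hide that compiler difference is a small wrapper around the bit scan. This is a minimal sketch: `lowest_set_bit` is a hypothetical name chosen for illustration, and the wrapper assumes the mask is nonzero, which always holds in the minimum search because at least one element equals the minimum.

```c
#include <stdint.h>
#if defined(_MSC_VER)
#include <intrin.h>   // _BitScanForward
#endif

// Hypothetical helper: index of the lowest set bit of a nonzero mask.
// The result is undefined for mask == 0 on both code paths.
static inline int lowest_set_bit(uint32_t mask)
{
#if defined(_MSC_VER)
    unsigned long index;
    _BitScanForward(&index, mask);   // writes the bit index into index
    return (int)index;
#else
    return __builtin_ctz(mask);      // GCC, Clang and compatible compilers
#endif
}
```

With this in place, step [3] of the answer could be written as `int pos = lowest_set_bit((uint32_t)mask) >> 2;`.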
