ARM NEON: conditional store suggestion
Question
I'm trying to figure out how to generate a conditional store in ARM NEON. What I would like to do is the equivalent of this SSE intrinsic:
void _mm_maskmoveu_si128(__m128i d, __m128i n, char *p);
which conditionally stores byte elements of d to address p. The high bit of each byte in the selector n determines whether the corresponding byte in d will be stored.
Any suggestions on how to do this with NEON intrinsics? Thank you.
This is what I did:
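uint8x16_t store_mask = {0,0,0,0,0,0,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff};
uint8x16_t tmp_dest = vld1q_u8((uint8_t*)p_dest);   // load the current destination bytes
// take source bytes where the mask is 0xff, keep destination bytes where it is 0x00
tmp_dest = vbslq_u8(store_mask, source, tmp_dest);
vst1q_u8((uint8_t*)p_dest, tmp_dest);               // write the merged vector back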
Recommended answer
Assuming vectors of 16 x 1-byte elements, you would set up a mask vector where each element is either all 0s (0x00) or all 1s (0xff) to determine whether the element should be stored or not. Then you need to do the following (pseudo code):
init mask vector = 0x00/0xff in each element
init source vector = data to be selectively stored
load dest vector from dest location
apply `vbslq_u8` (bitwise select; the compiler may emit `vbsl`, `vbit` or `vbif`) with mask vector, source vector and dest vector: dest = vbslq_u8(mask, source, dest)
store dest vector back to dest location
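
The steps above could be sketched in C roughly as follows. This is only an illustration under stated assumptions: the helper name `masked_store_u8`, its parameters, and the `HAVE_NEON` macro are hypothetical and do not come from the question, and the scalar branch is there only so the same file also builds on non-ARM targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: merge 16 bytes of src into dst under a byte mask.
 * Each mask[i] is expected to be 0x00 (keep dst[i]) or 0xff (take src[i]). */
static void masked_store_u8(uint8_t *dst, const uint8_t *src, const uint8_t *mask)
{
    assert(dst != NULL && src != NULL && mask != NULL);
#if HAVE_NEON
    uint8x16_t m = vld1q_u8(mask);   /* mask vector: 0x00/0xff per element */
    uint8x16_t s = vld1q_u8(src);    /* data to be selectively stored */
    uint8x16_t d = vld1q_u8(dst);    /* current contents of the destination */
    d = vbslq_u8(m, s, d);           /* bitwise select: (m & s) | (~m & d) */
    vst1q_u8(dst, d);                /* write all 16 bytes back */
#else
    /* Scalar fallback with the same bitwise-select semantics. */
    for (size_t i = 0; i < 16; i++)
        dst[i] = (uint8_t)((mask[i] & src[i]) | (~mask[i] & dst[i]));
#endif
}
```

Two differences from `_mm_maskmoveu_si128` are worth keeping in mind. The select is bitwise, so a mask in the SSE convention (only the high bit of each byte significant) would first have to be expanded to full bytes, for example with a signed compare against zero such as `vcltq_s8`. Also, this is a read-modify-write of all 16 bytes, so the whole destination range must be readable and writable, and bytes that are masked out are still rewritten with their old values.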