ARM Neon: conditional store suggestion

Question

I'm trying to figure out how to generate a conditional store in ARM NEON. What I would like to do is the equivalent of this SSE intrinsic:

void _mm_maskmoveu_si128(__m128i d, __m128i n, char *p);

It conditionally stores the byte elements of d to address p. The high bit of each byte in the selector n determines whether the corresponding byte in d is stored.

Any suggestions on how to do this with NEON intrinsics? Thank you.

This is what I did:

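// 0xff = overwrite this byte with source, 0x00 = keep the byte already at p_dest
uint8x16_t store_mask = {0,0,0,0,0,0,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff};

uint8x16_t tmp_dest = vld1q_u8((uint8_t*)p_dest);
// vbslq_u8(mask, a, b) returns the result: bits of a where mask is 1, bits of b elsewhere
tmp_dest = vbslq_u8(store_mask, source, tmp_dest);
vst1q_u8((uint8_t*)p_dest, tmp_dest);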

Recommended answer

Assuming vectors of 16 x 1-byte elements, you would set up a mask vector where each element is either all 0s (0x00) or all 1s (0xff) to determine whether the element should be stored or not. Then you need to do the following (pseudo code):

 init mask vector = 0x00/0xff in each element
 init source vector = data to be selectively stored
 load dest vector from dest location
 dest vector = `vbslq_u8`(mask vector, source vector, dest vector)   (bitwise select; the compiler may emit `vbsl`, `vbit` or `vbif`)
 store dest vector back to dest location
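
The steps above can be sketched in C roughly as follows. This is only an illustration, not code from the original answer: the helper name `masked_store_u8` and its pointer-based signature are assumptions made for the example. It also converts an SSE-style selector (only the high bit of each byte matters) into the 0x00/0xff mask that `vbslq_u8` expects, using a signed "less than zero" compare. A scalar fallback is included so the sketch builds on targets without NEON.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper (name and signature are assumptions for this sketch):
 * write src[i] to dst[i] only where the high bit of sel[i] is set,
 * mirroring the selector rule of _mm_maskmoveu_si128. */
static void masked_store_u8(uint8_t *dst, const uint8_t *src, const uint8_t *sel)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint8x16_t source   = vld1q_u8(src);
    uint8x16_t selector = vld1q_u8(sel);

    /* A byte is negative as int8 exactly when its high bit is set, so the
     * compare yields 0xff in selected lanes and 0x00 in the others. */
    uint8x16_t mask = vcltq_s8(vreinterpretq_s8_u8(selector), vdupq_n_s8(0));

    /* Read-modify-write: load the current destination, blend, store back. */
    uint8x16_t dest = vld1q_u8(dst);
    dest = vbslq_u8(mask, source, dest);  /* source where mask=1, dest where mask=0 */
    vst1q_u8(dst, dest);
#else
    /* Scalar fallback so the sketch also compiles without NEON. */
    for (int i = 0; i < 16; ++i) {
        if (sel[i] & 0x80) {
            dst[i] = src[i];
        }
    }
#endif
}
```

Note that the NEON path reads and rewrites all 16 bytes at the destination, including the unselected ones, so the whole 16-byte range must be valid memory and the update is not atomic with respect to other writers. If the mask already holds 0x00/0xff per byte, as in the question, the compare step can be dropped and the mask passed to `vbslq_u8` directly.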
