How to load a byte array into __m128i?


Question

Hi all.

I defined a byte array as follows.

BYTE buffer[16] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};

Then I call '_mm_load_si128()'.

__m128i current_0 = _mm_load_si128((__m128i*)buffer);

But it fails with an 'Access Violation' error.

Is this wrong? I really don't know.

Please give me some advice. Thank you :)

What I have tried:

I tried the following code.

BYTE buffer[16] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__m128i current_0 = _mm_loadu_si128((__m128i*)buffer);

It worked.
But I don't know whether '_mm_loadu_si128' and '_mm_load_si128' differ in performance when the image is large.

Recommended answer

See _mm_loadu_si128. _mm_load_si128 requires the address it loads from to be 16-byte aligned, and a plain BYTE array carries no such guarantee, which is why the aligned load faults here. _mm_loadu_si128 has no alignment requirement, but loads from a non-aligned array can be slower.
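
The alignment requirement described in the answer can be sketched as follows. This is a minimal illustration, not the original poster's code: it assumes a C++11 compiler (for alignas) targeting x86 with SSE2, and uses std::uint8_t in place of the Windows BYTE typedef. The helper names is_aligned16, load_aligned, load_unaligned, and sum_bytes are hypothetical and exist only for this example. On older MSVC versions, __declspec(align(16)) would play the role of alignas(16).

```cpp
#include <emmintrin.h>  // SSE2: __m128i, _mm_load_si128, _mm_loadu_si128
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if p sits on a 16-byte boundary.
inline bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Aligned load: the caller must guarantee 16-byte alignment,
// otherwise _mm_load_si128 may fault (the 'Access Violation' above).
inline __m128i load_aligned(const std::uint8_t* p) {
    assert(is_aligned16(p));
    return _mm_load_si128(reinterpret_cast<const __m128i*>(p));
}

// Unaligned load: valid for any address.
inline __m128i load_unaligned(const std::uint8_t* p) {
    return _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
}

// Hypothetical helper used only to check the loaded value:
// adds up the 16 unsigned bytes in v.
inline int sum_bytes(__m128i v) {
    // _mm_sad_epu8 against zero yields two partial sums,
    // one in the low 16 bits of each 64-bit half.
    __m128i s = _mm_sad_epu8(v, _mm_setzero_si128());
    return _mm_cvtsi128_si32(s) + _mm_extract_epi16(s, 4);
}
```

Declaring the array as alignas(16) std::uint8_t buffer[16] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}; makes load_aligned(buffer) safe, while load_unaligned accepts any pointer, such as one offset by a single byte into a larger buffer. For large images, a common approach is to allocate the pixel data with 16-byte alignment so the aligned load can be used throughout, and to fall back to the unaligned load only where alignment cannot be guaranteed.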

