How does endianness work with SIMD registers?


Question

I'm working with integers and SSE, and I have become very confused about how endianness affects moving data in and out of registers.

Initially, my understanding was as follows. If I have an array of 4-byte integers, then since x86 architectures are little-endian, the memory would be laid out like this:

0D 0C 0B 0A 1D 1C 1B 1A 2D 2C 2B 2A .... nD nC nB nA

Here the letters A, B, C and D index the bytes within an integer element (A being the most significant byte), and the numbers index the elements.

In an XMM register, my understanding is that four integers would be laid out as follows:

0A 0B 0C 0D 1A 1B 1C 1D 2A 2B 2C 2D 3A 3B 3C 3D

However, I'm pretty sure this picture is wrong, for several reasons. The first is the documentation for the _mm_load_si128 intrinsic, which is supposed to work for any integer data but, in the picture above, would only work for one word size. Similarly, there is this (archived) thread. Finally, I see people writing code like the following:

__declspec(align(16)) int32_t A[N];
__m128i* As = (__m128i*)A;

A potentially correct picture

The Wikipedia article on endianness says I should think of memory addresses as increasing from right to left. How about the following picture for memory, then?

nA nB nC nD ... 2A 2B 2C 2D 1A 1B 1C 1D 0A 0B 0C 0D

And then in the register:

3A 3B 3C 3D 2A 2B 2C 2D 1A 1B 1C 1D 0A 0B 0C 0D

Recommended answer

It's just a question of interpretation. We read and write the digits of a number from left to right, highest digit first. So a 32-bit number whose highest byte is A, followed by B, then C, with lowest byte D, is written ABCD. We use the same notation for a 128-bit integer:

3A3B3C3D 2A2B2C2D 1A1B1C1D 0A0B0C0D

But a little-endian system stores the bytes from the lowest address to the highest, least significant byte first, like this:

0D0C0B0A 1D1C1B1A 2D2C2B2A 3D3C3B3A
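
To check this layout on a real machine, the bytes of an integer array can be dumped in address order. The following is a minimal sketch, assuming a little-endian host such as x86; the function name dump_memory_hex is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Write the 16 in-memory bytes of `a` as hex digits, lowest address first.
   `out` must hold at least 33 chars (32 hex digits plus the terminator).
   Hypothetical helper, not part of any library. */
void dump_memory_hex(const uint32_t a[4], char out[33])
{
    uint8_t bytes[16];
    assert(a != NULL && out != NULL);
    memcpy(bytes, a, sizeof bytes);   /* raw bytes, in address order */
    for (int i = 0; i < 16; i++)
        snprintf(out + 2 * i, 3, "%02X", (unsigned)bytes[i]);
}
```

Calling it on the array {0x0A0B0C0D, 0x1A1B1C1D, 0x2A2B2C2D, 0x3A3B3C3D} should reproduce the byte string shown above on a little-endian machine; a big-endian host would give a different order.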

For 16-bit integers the logic is the same. We could read/write them as

7A7B 6A6B 5A5B 4A4B 3A3B 2A2B 1A1B 0A0B

and a little-endian computer reads/stores them from the lowest to the highest address as

0B0A 1B1A 2B2A 3B3A 4B4A 5B5A 6B6A 7B7A

That's why the same instruction reads/writes 32-bit, 16-bit and 8-bit integers into and out of a 128-bit register: movdqa (or movaps, which moves the same 128 bits), or the unaligned variants movdqu and movups.
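
As a rough illustration of this point, the sketch below loads the same 16 bytes once and then views the register as 32-bit or 16-bit elements. It assumes an x86 target with SSE2; the function names element32 and low16 are hypothetical, while the intrinsics (_mm_loadu_si128, _mm_srli_si128, _mm_cvtsi128_si32, _mm_extract_epi16) are the standard SSE2 ones from emmintrin.h.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Load 16 bytes from `a`, then return 32-bit element `i` by shifting the
   whole 128-bit value right by 4*i bytes and reading the low 32 bits.
   Hypothetical helper for illustration. */
uint32_t element32(const uint32_t a[4], int i)
{
    assert(i >= 0 && i < 4);
    __m128i v = _mm_loadu_si128((const __m128i *)a);
    switch (i) {                       /* the byte-shift count must be a constant */
    case 1: v = _mm_srli_si128(v, 4);  break;
    case 2: v = _mm_srli_si128(v, 8);  break;
    case 3: v = _mm_srli_si128(v, 12); break;
    default: break;                    /* i == 0: no shift needed */
    }
    return (uint32_t)_mm_cvtsi128_si32(v);
}

/* Same load, viewed as eight 16-bit elements: return the lowest one. */
uint16_t low16(const uint32_t a[4])
{
    __m128i v = _mm_loadu_si128((const __m128i *)a);
    return (uint16_t)_mm_extract_epi16(v, 0);
}
```

With a[0] = 0x0A0B0C0D, element32(a, 0) returns the whole of element 0 and low16(a) returns its low half, 0x0C0D. The element at the lowest address ends up in the least significant bits of the register, whatever element width is used to interpret it.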
