Memory alignment on a 32-bit Intel processor


Question

Intel's 32-bit processors, such as the Pentium, have a 64-bit-wide data bus and therefore fetch 8 bytes per access. Based on this, I'm assuming that the physical addresses these processors emit on the address bus are always multiples of 8.

Firstly, is this conclusion correct?

Secondly, if it is correct, then one should align data structure members on an 8-byte boundary. But I've seen people use 4-byte alignment on these processors instead.

How can they be justified in doing so?

Recommended answer

The usual rule of thumb (straight from Intel's and AMD's optimization manuals) is that every data type should be aligned to its own size. An int32 should be aligned on a 32-bit boundary, an int64 on a 64-bit boundary, and so on. A char will fit just fine anywhere.
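To make the "aligned to its own size" rule concrete, here is a minimal sketch in C. The helper name is_naturally_aligned is made up for illustration, and the sketch assumes the size is a power of two, which holds for the usual scalar types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr is a multiple of size.
 * Assumes size is a non-zero power of two (1, 2, 4, 8, ...). */
int is_naturally_aligned(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (uintptr_t)(size - 1)) == 0;
}
```

With this check, address 0x1004 counts as aligned for a 4-byte int32 but not for an 8-byte int64, while a 1-byte char is aligned at every address.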

Another rule of thumb is, of course, "the compiler has been told about alignment requirements". You don't need to worry about it, because the compiler knows to add the right padding and offsets to allow efficient access to the data.
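A rough way to see that padding at work is sketched below in C11. The struct and the function names are invented for this example. The exact numbers depend on the compiler and ABI, so the code only checks properties that any conforming compiler must satisfy.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative struct: a 1-byte member followed by a 4-byte member. */
struct sample {
    char    tag;    /* 1 byte */
    int32_t value;  /* the compiler pads after tag so this is aligned */
    char    flag;   /* tail padding keeps arrays of the struct aligned */
};

/* The compiler, not the programmer, chose this offset. */
static_assert(offsetof(struct sample, value) % alignof(int32_t) == 0,
              "value must sit on an int32_t boundary");

size_t sample_value_offset(void) { return offsetof(struct sample, value); }
size_t sample_size(void)         { return sizeof(struct sample); }
```

On typical x86 compilers sample_value_offset() is 4 and sample_size() is 12, but those figures are an expectation about common ABIs rather than something the language guarantees.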

The only exception is when working with SIMD instructions, where on most compilers you have to ensure alignment manually.
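One possible way to request that alignment by hand in C11 is alignas, sketched below. The 16-byte figure is what aligned SSE loads and stores expect. The names SIMD_ALIGNMENT, simd_buffer, and is_simd_aligned are assumptions made for this example, and no intrinsics are used so the sketch stays portable.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Assumed requirement: 16 bytes, as for aligned SSE loads and stores. */
#define SIMD_ALIGNMENT 16

static_assert(SIMD_ALIGNMENT >= alignof(float),
              "alignas may only strengthen the natural alignment");

/* Without alignas, a float array is only guaranteed float alignment. */
alignas(SIMD_ALIGNMENT) float simd_buffer[8];

/* Hypothetical check to run before handing a pointer to aligned SIMD code. */
int is_simd_aligned(const void *p)
{
    return ((uintptr_t)p % SIMD_ALIGNMENT) == 0;
}
```

The start of simd_buffer passes the check, while a pointer 4 bytes into it does not, even though that pointer is perfectly well aligned for an ordinary float access.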

"Secondly, if it is correct, then one should align data structure members on an 8-byte boundary. But I've seen people use 4-byte alignment on these processors instead."

I don't see how that makes a difference. The CPU can simply issue a read for the 64-bit block that contains those 4 bytes. That means it gets 4 extra bytes either before or after the requested data, but in both cases it takes only a single read. Aligning 32-bit-wide data on a 32-bit boundary ensures that it never crosses a 64-bit boundary.
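The arithmetic behind that last claim can be sketched in C. This is a simplified model that treats memory as a sequence of 8-byte bus blocks and ignores caches. BUS_BYTES and blocks_touched are names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bus width from the question: 64 bits = 8 bytes per access. */
#define BUS_BYTES 8u

/* Number of 8-byte blocks that an access of `size` bytes at `addr` touches.
 * One block means a single bus read is enough in this simplified model. */
unsigned blocks_touched(uintptr_t addr, size_t size)
{
    assert(size > 0);
    uintptr_t first = addr / BUS_BYTES;
    uintptr_t last  = (addr + size - 1) / BUS_BYTES;
    return (unsigned)(last - first + 1);
}
```

A 4-byte value at 0x1000 or 0x1004 touches one block. The same value at the misaligned address 0x1006 straddles two blocks and would need two reads.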
