Why is the default alignment for `int64_t` 8 bytes on 32-bit x86?

Question

Why is the default alignment for `int64_t` (e.g. `long long`) 8 bytes in 32-bit x86 ABIs? 4-byte alignment would appear to be fine, because the value can only be accessed as two 4-byte halves.

Recommended answer

Interesting point: if you only ever load it as two halves into 32-bit GP registers, then 4-byte alignment means those operations will happen with their natural alignment.

However, it's probably best if both halves of the variable are in the same cache line, since almost all accesses will read or write both halves. Aligning the whole object to its natural 8-byte alignment takes care of that, even ignoring the other reasons below.
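The cache-line argument can be checked with a little address arithmetic. The sketch below assumes 64-byte cache lines (typical of current x86 CPUs, but an assumption here), and the helper names `crosses_cache_line` and `count_split_offsets` are made up for illustration. It counts how many start offsets within one line would make an 8-byte object straddle two lines at a given alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, as on typical current x86 CPUs. */
#define CACHE_LINE_SIZE 64u

/* Hypothetical helper: true if an object of `size` bytes starting at
 * address `addr` has its first and last byte in different cache lines. */
static bool crosses_cache_line(uintptr_t addr, size_t size)
{
    assert(size > 0 && size <= CACHE_LINE_SIZE);
    return (addr / CACHE_LINE_SIZE) != ((addr + size - 1) / CACHE_LINE_SIZE);
}

/* Hypothetical helper: for every start offset inside one cache line that
 * is a multiple of `alignment`, count how many placements of an 8-byte
 * object would be split across two lines. */
static unsigned count_split_offsets(unsigned alignment)
{
    assert(alignment > 0);
    unsigned splits = 0;
    for (unsigned off = 0; off < CACHE_LINE_SIZE; off += alignment) {
        if (crosses_cache_line(off, sizeof(int64_t)))
            splits++;
    }
    return splits;
}
```

With 8-byte alignment no placement is split. With 4-byte alignment exactly one of the sixteen possible offsets (offset 60) puts the two halves in different lines.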

32-bit x86 can load a 64-bit integer with a single 64-bit load using MMX or SSE2 `movq`. Handling 64-bit add, sub, shift, and bitwise boolean operations with vector instructions is more efficient (a single instruction each), as long as you don't need immediate constants, mul, or div. The vector instructions with 64-bit elements are still available in 32-bit mode.
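As a rough illustration of the vector path, here is a minimal sketch using the SSE2 intrinsics from `<emmintrin.h>`. The function name `add64_vector` is made up for this example. It assumes the compiler has SSE2 enabled (for 32-bit GCC or Clang builds that usually means passing `-msse2`), and it falls back to plain integer code otherwise. The instruction names in the comments are what these intrinsics usually compile to, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: add two 64-bit integers through a vector register
 * instead of an add/adc pair on two 32-bit halves. */
static int64_t add64_vector(const int64_t *a, const int64_t *b)
{
    assert(a != NULL && b != NULL);
    int64_t result;
#if defined(__SSE2__)
    __m128i va  = _mm_loadl_epi64((const __m128i *)a); /* one 64-bit load (movq) */
    __m128i vb  = _mm_loadl_epi64((const __m128i *)b);
    __m128i sum = _mm_add_epi64(va, vb);               /* 64-bit lane add (paddq) */
    _mm_storel_epi64((__m128i *)&result, sum);         /* one 64-bit store (movq) */
#else
    /* Fallback without SSE2: wrap-around add done in unsigned arithmetic. */
    result = (int64_t)((uint64_t)*a + (uint64_t)*b);
#endif
    return result;
}
```

For example, adding 1 to `0x00000001FFFFFFFF` carries from the low half into the high half and yields `0x0000000200000000`, with no separate carry handling in the source.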

Atomic 64-bit compare-and-exchange is also available in 32-bit mode (`lock cmpxchg8b m64` works just like 64-bit mode's `lock cmpxchg16b m128`, using implicit register pairs: `edx:eax` for the expected value and `ecx:ebx` for the new value). I don't know what kind of penalty it has for crossing a cache-line boundary.
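From C, the portable way to ask for this operation is C11 `<stdatomic.h>`. The wrapper name `cas64` below is made up for illustration. That a compiler targeting 32-bit x86 lowers it to `lock cmpxchg8b` is the usual behavior of GCC and Clang as far as I know, but treat it as an assumption and check the generated assembly for your target.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper: atomically replace *target with `desired` if it
 * currently holds `expected`. Returns true if the swap happened. */
static bool cas64(_Atomic int64_t *target, int64_t expected, int64_t desired)
{
    assert(target != NULL);
    /* On failure, `expected` is overwritten with the value actually seen,
     * but this sketch discards it and reports only success or failure. */
    return atomic_compare_exchange_strong(target, &expected, desired);
}
```

A caller would typically loop: read the current value, compute the new one, and retry `cas64` until it succeeds.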

Modern x86 CPUs have essentially no penalty for misaligned loads/stores unless they cross cache-line boundaries, which is why I'm only raising the cache-line issue and not saying that misaligned 64-bit accesses would be bad in general. See the links in the x86 wiki, especially Agner Fog's guides.
