Does AArch64 support unaligned access?


Question

Does AArch64 support unaligned access natively? I am asking because currently ocamlopt assumes "no".

Recommended answer

Provided the hardware bit for strict alignment checking is not turned on (and, as on x86, no general-purpose OS is realistically going to turn it on), AArch64 does permit unaligned data accesses to Normal (not Device) memory with the regular load/store instructions.
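As a rough illustration of what such an access looks like from C, here is a minimal sketch. The helper name `load_u32_unaligned` is made up for this example. Using `memcpy` keeps the code well-defined C regardless of the pointer's alignment; on targets where unaligned access is permitted, compilers typically lower it to a single ordinary load, though that is an optimization and not a guarantee.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from an address that may not be
 * 4-byte aligned. memcpy avoids the undefined behaviour of dereferencing a
 * misaligned uint32_t pointer; the compiler is free to emit one plain load
 * when the target allows unaligned accesses. */
static inline uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

A caller pulling a field out of a byte buffer would write something like `load_u32_unaligned(buf + 1)`; the result is in the host's byte order.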

However, there are several reasons why a compiler would still want to maintain aligned data:

  • Atomicity of reads and writes: naturally-aligned loads and stores are guaranteed to be atomic, i.e. if one thread reads an aligned memory location simultaneously with another thread writing the same location, the read will only ever return the old value or the new value. That guarantee does not apply if the location is not aligned to the access size - in that case the read could return some unknown mixture of the two values. If the language has a concurrency model which relies on that not happening, it's probably not going to allow unaligned data.
  • Atomic read-modify-write operations: If the language has a concurrency model in which some or all data types can be updated (not just read or written) atomically, then for those operations the code generation will involve using the load-exclusive/store-exclusive instructions to build up atomic read-modify-write sequences, rather than plain loads/stores. The exclusive instructions will always fault if the address is not aligned to the access size.
  • Efficiency: On most cores, an unaligned access takes at least 1 cycle longer than a properly-aligned one, even in the best case. In the worst case, a single unaligned access can cross a cache line boundary (which has additional overhead in itself), and generate two cache misses or even two consecutive page faults. Unless you're in an incredibly memory-constrained environment, or have no control over the data layout (e.g. pulling packets out of a network receive buffer), unaligned data is still best avoided.
  • Necessity: If the language has a suitable data model, i.e. no pointers, and any data from external sources is already marshalled into appropriate datatypes at a lower level, then there's really no need for unaligned accesses anyway, and it makes the compiler's life that much easier to simply ignore the idea altogether.
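The alignment conditions in the list above can be written down as two small checks. This is only a sketch: the helper names are invented for illustration, and the 64-byte cache line is an assumption (the real line size is implementation-defined and varies between cores).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration only: 64-byte cache lines. */
#define ASSUMED_CACHE_LINE 64u

/* True if addr is a multiple of the access size, i.e. the access is
 * naturally aligned. size must be a non-zero power of two (1, 2, 4, 8, ...). */
static inline bool is_naturally_aligned(uintptr_t addr, size_t size)
{
    return (addr & (size - 1)) == 0;
}

/* True if an access of `size` bytes starting at addr touches more than one
 * (assumed) cache line. size must be at least 1. */
static inline bool crosses_cache_line(uintptr_t addr, size_t size)
{
    uintptr_t first_line = addr / ASSUMED_CACHE_LINE;
    uintptr_t last_line  = (addr + size - 1) / ASSUMED_CACHE_LINE;
    return first_line != last_line;
}
```

For example, an 8-byte access at address 0x1004 is not naturally aligned, so it gets neither the single-copy atomicity guarantee nor the ability to use the exclusive instructions, and a 4-byte access at 0x103E straddles two lines under the 64-byte assumption.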

I have no idea what concerns OCaml in particular, but I certainly wouldn't be surprised if it were "all of the above".
