Memory ordering restrictions on x86 architecture


Problem Description


In his great book 'C++ Concurrency in Action' Anthony Williams writes the following (page 309):

For example, on x86 and x86-64 architectures, atomic load operations are always the same, whether tagged memory_order_relaxed or memory_order_seq_cst (see section 5.3.3). This means that code written using relaxed memory ordering may work on systems with an x86 architecture, where it would fail on a system with a finer-grained set of memory-ordering instructions such as SPARC.

Do I get this right that on the x86 architecture all atomic load operations are memory_order_seq_cst? In addition, the cppreference std::memory_order page mentions that on x86 release-acquire ordering is automatic.

If this restriction is valid, do the orderings still apply to compiler optimizations?

Solution

Yes, ordering still applies to compiler optimizations.
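For instance, here is a minimal sketch (the variable names are purely illustrative) of how the chosen ordering constrains the compiler, independent of what the x86 hardware does:

    #include <atomic>

    int data = 0;                    // plain, non-atomic payload
    std::atomic<bool> ready{false};

    void producer() {
        data = 42;                                     // (A)
        ready.store(true, std::memory_order_release);  // (B)
        // With release, the compiler may not move (A) below (B), even though
        // the store itself compiles to a plain mov on x86. With
        // memory_order_relaxed, the compiler would be free to reorder (A) and (B).
    }

    void consumer() {
        if (ready.load(std::memory_order_acquire)) {   // pairs with (B)
            int r = data;  // guaranteed to read 42; with relaxed this would be a data race
            (void)r;
        }
    }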

Also, it is not entirely accurate that on x86 "atomic load operations are always the same".

On x86, all loads done with mov have acquire semantics and all stores done with mov have release semantics. So acq_rel, acq and relaxed loads are simple movs, and similarly acq_rel, rel and relaxed stores (acq stores and rel loads are always equal to relaxed).
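As a rough sketch (the codegen notes in the comments describe the typical output of mainstream x86 compilers, not something the C++ standard guarantees):

    #include <atomic>

    std::atomic<int> x{0};

    int load_relaxed()        { return x.load(std::memory_order_relaxed); } // typically a plain mov
    int load_acquire()        { return x.load(std::memory_order_acquire); } // typically the same plain mov

    void store_relaxed(int v) { x.store(v, std::memory_order_relaxed); }    // typically a plain mov
    void store_release(int v) { x.store(v, std::memory_order_release); }    // typically the same plain mov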

This however is not necessarily true for seq_cst: the architecture does not guarantee seq_cst semantics for mov. In fact, the x86 instruction set does not have any specific instruction for sequentially consistent loads and stores. Only atomic read-modify-write operations on x86 will have seq_cst semantics. Hence, you could get seq_cst semantics for loads by doing a fetch_and_add operation (lock xadd instruction) with an argument of 0, and seq_cst semantics for stores by doing a seq_cst exchange operation (xchg instruction) and discarding the previous value.
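Both tricks can be expressed directly in C++; this sketch shows them (the instruction names in the comments are what compilers typically emit for these operations):

    #include <atomic>

    std::atomic<int> x{0};

    // seq_cst load expressed as a read-modify-write that adds 0;
    // typically emitted as a lock xadd.
    int seq_cst_load_via_rmw() {
        return x.fetch_add(0, std::memory_order_seq_cst);
    }

    // seq_cst store expressed as an exchange whose old value is discarded;
    // typically emitted as an xchg (which is implicitly locked).
    void seq_cst_store_via_rmw(int v) {
        (void)x.exchange(v, std::memory_order_seq_cst);
    }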

But you do not need to do both! As long as all seq_cst stores are done with xchg, seq_cst loads can be implemented simply with a mov. Dually, if all loads were done with lock xadd, seq_cst stores could be implemented simply with a mov.

xchg and lock xadd are much slower than mov. Because a program has (usually) more loads than stores, it is convenient to do seq_cst stores with xchg so that the (more frequent) seq_cst loads can simply use a mov. This implementation detail is codified in the x86 Application Binary Interface (ABI). On x86, a compliant compiler must compile seq_cst stores to xchg so that seq_cst loads (which may appear in another translation unit, compiled with a different compiler) can be done with the faster mov instruction.
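Under that mapping, a sketch of what such a compiler emits for the two seq_cst operations looks like this (the per-line instruction comments show typical output, for illustration only):

    #include <atomic>

    std::atomic<int> x{0};

    int  seq_cst_load()       { return x.load(std::memory_order_seq_cst); } // mov: a plain load suffices
    void seq_cst_store(int v) { x.store(v, std::memory_order_seq_cst); }    // xchg: the store side pays the cost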

Thus it is not true in general that seq_cst and acquire loads are done with the same instruction on x86. It is only true because the ABI specifies that seq_cst stores be compiled to an xchg.
