Memory ordering restrictions on x86 architecture

Views: 28 | Published: 2021/11/17 2:40:02 | Tags: c++ multithreading architecture c++11 memory-model


Question

In his great book 'C++ Concurrency in Action', Anthony Williams writes the following (page 309):

For example, on x86 and x86-64 architectures, atomic load operations are always the same, whether tagged memory_order_relaxed or memory_order_seq_cst (see section 5.3.3). This means that code written using relaxed memory ordering may work on systems with an x86 architecture, where it would fail on a system with a finer-grained set of memory-ordering instructions such as SPARC.

Do I understand correctly that on the x86 architecture all atomic load operations are effectively memory_order_seq_cst? In addition, the cppreference page for std::memory_order mentions that release-acquire ordering is automatic on x86.

If this restriction holds, do the orderings still apply to compiler optimizations?

Recommended answer

Yes, the orderings still apply to compiler optimizations.
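
To make the compiler side of this concrete, here is a minimal message-passing sketch. The names payload, ready, producer, consumer and run_message_passing are hypothetical, chosen only for illustration. The release store and acquire load constrain what the compiler may reorder, independently of what the x86 hardware does; with memory_order_relaxed on the flag, the compiler would be allowed to move the payload accesses across it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready{false};  // flag that publishes the payload

void producer() {
    payload = 42;
    // release: the compiler may not sink the payload write below this store.
    // With memory_order_relaxed that reordering would be permitted, even on x86.
    ready.store(true, std::memory_order_release);
}

int consumer() {
    // acquire: the compiler may not hoist the payload read above this load.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;
}

// Hypothetical helper: runs one producer/consumer round and
// returns the value the consumer observed.
int run_message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread reader([&] { seen = consumer(); });
    std::thread writer(producer);
    reader.join();
    writer.join();
    return seen;
}
```

Calling run_message_passing() returns 42 on any conforming implementation, because the acquire load that reads true synchronizes with the release store.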

Also, it is not entirely accurate that on x86 "atomic load operations are always the same".

On x86, all loads done with mov have acquire semantics and all stores done with mov have release semantics. So acq_rel, acq and relaxed loads are plain mov instructions, and so are acq_rel, rel and relaxed stores (acq stores and rel loads are always equal to relaxed).
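
As a rough illustration of the paragraph above, the following sketch shows four accessors that differ only in the requested ordering. The names value, load_relaxed, load_acquire, store_relaxed and store_release are hypothetical. The comments about generated instructions are an assumption about typical x86-64 compiler output, not a guarantee; inspecting the assembly (for example with -S) is the way to confirm it for a given compiler.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> value{0};

// Both loads are expected to become a single mov on x86-64
// (assumption about typical codegen; verify with your compiler).
int load_relaxed() { return value.load(std::memory_order_relaxed); }
int load_acquire() { return value.load(std::memory_order_acquire); }

// Both stores are likewise expected to become a single mov.
void store_relaxed(int v) { value.store(v, std::memory_order_relaxed); }
void store_release(int v) { value.store(v, std::memory_order_release); }
```

Even when the emitted instruction is identical, the ordering argument still tells the compiler which surrounding memory accesses it may move across the operation.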

This, however, is not necessarily true for seq_cst: the architecture does not guarantee seq_cst semantics for mov. In fact, the x86 instruction set does not have any specific instruction for sequentially consistent loads and stores. Only atomic read-modify-write operations on x86 will have seq_cst semantics. Hence, you could get seq_cst semantics for loads by doing a fetch_and_add operation (lock xadd instruction) with an argument of 0, and seq_cst semantics for stores by doing a seq_cst exchange operation (xchg instruction) and discarding the previous value.
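
The two read-modify-write tricks described above can be sketched in C++ as follows. The names cell, load_via_rmw and store_via_rmw are hypothetical. The instructions named in the comments are what such code commonly maps to on x86; a compiler is free to pick a different but equivalent sequence, so treat them as an assumption.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> cell{0};

// A seq_cst "load" expressed as a read-modify-write: adding 0 leaves the
// value unchanged and returns it (commonly lock xadd on x86).
int load_via_rmw() {
    return cell.fetch_add(0, std::memory_order_seq_cst);
}

// A seq_cst "store" expressed as an exchange whose previous value is
// discarded (commonly xchg on x86).
void store_via_rmw(int v) {
    (void)cell.exchange(v, std::memory_order_seq_cst);
}
```

In ordinary code there is no need to write this by hand: cell.load() and cell.store(v) default to memory_order_seq_cst and leave the instruction choice to the compiler.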

But you do not need to do both! As long as all seq_cst stores are done with xchg, seq_cst loads can be implemented simply with a mov. Dually, if all loads were done with lock xadd, seq_cst stores could be implemented simply with a mov.

xchg and lock xadd are much slower than mov. Because a program usually has more loads than stores, it is convenient to do seq_cst stores with xchg so that the (more frequent) seq_cst loads can simply use a mov. This implementation detail is codified in the x86 Application Binary Interface (ABI). On x86, a compliant compiler must compile seq_cst stores to xchg so that seq_cst loads (which may appear in another translation unit, compiled with a different compiler) can be done with the faster mov instruction.

Thus it is not true in general that seq_cst and acquire loads are done with the same instruction on x86. It is only true because the ABI specifies that seq_cst stores be compiled to an xchg.
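
The case where seq_cst differs from acquire/release is the store-then-load pattern, often called the store-buffering litmus test. The sketch below uses hypothetical names (x, y, both_saw_zero_once, count_forbidden_outcomes). Each thread stores to one variable and then loads the other. With memory_order_seq_cst the C++ memory model forbids both threads reading 0; with a plain mov store followed by a mov load, x86 hardware permits that outcome because the store can still be sitting in the store buffer.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> x{0};
std::atomic<int> y{0};

// Runs one round of the litmus test and reports whether both
// threads read 0, which seq_cst ordering forbids.
bool both_saw_zero_once() {
    x.store(0, std::memory_order_relaxed);
    y.store(0, std::memory_order_relaxed);
    int r1 = -1;
    int r2 = -1;
    std::thread a([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread b([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    a.join();
    b.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the test and counts how often the forbidden outcome appeared.
int count_forbidden_outcomes(int rounds) {
    int forbidden = 0;
    for (int i = 0; i < rounds; ++i) {
        if (both_saw_zero_once()) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

With seq_cst the count stays at 0 for any number of rounds. Changing the stores to memory_order_release and the loads to memory_order_acquire makes the both-zero outcome legal, although a test run may or may not actually observe it.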
