Acquire/Release versus Sequentially Consistent memory order


Question

For any std::atomic<T> where T is a primitive type:

If I use std::memory_order_acq_rel for fetch_xxx operations, std::memory_order_acquire for load operations, and std::memory_order_release for store operations blindly (that is, as if I were resetting the default memory ordering of those functions):

  • Will the results be the same as if I had used std::memory_order_seq_cst (the default) for each of those operations?
  • If the results are the same, does this usage differ in any way from using std::memory_order_seq_cst in terms of efficiency?
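
To make the question concrete, the "blind" substitution could look roughly like the sketch below. The wrapper name AcqRelAtomic and its member set are hypothetical and only illustrate the idea; they are not part of the standard library.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wrapper: same shape as std::atomic<T>, but every operation
// defaults to acquire/release instead of std::memory_order_seq_cst.
template <typename T>
class AcqRelAtomic {
public:
    explicit AcqRelAtomic(T initial = T{}) : value_(initial) {}

    // Loads default to acquire.
    T load() const { return value_.load(std::memory_order_acquire); }

    // Stores default to release.
    void store(T desired) { value_.store(desired, std::memory_order_release); }

    // Read-modify-write operations default to acq_rel.
    // Only usable when std::atomic<T> itself provides fetch_add.
    T fetch_add(T delta) {
        return value_.fetch_add(delta, std::memory_order_acq_rel);
    }

private:
    std::atomic<T> value_;
};
```

Used from a single thread, such a wrapper behaves exactly like std::atomic<T>; any difference can only show up in what several threads are allowed to observe.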

Solution

The C++11 memory ordering parameters for atomic operations specify constraints on the ordering. If you do a store with std::memory_order_release, and a load from another thread reads the stored value with std::memory_order_acquire, then subsequent read operations from the second thread will see any values stored to any memory location by the first thread prior to the store-release, or a later store to any of those memory locations.

If both the store and subsequent load are std::memory_order_seq_cst then the relationship between these two threads is the same. You need more threads to see the difference.
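
A minimal sketch of that two-thread guarantee, assuming a plain int payload published through an atomic flag. The names payload, ready, producer, consumer and run_message_passing are illustrative only.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready{false};  // flag used to publish the payload

void producer() {
    payload = 42;                                  // ordinary write
    ready.store(true, std::memory_order_release);  // store-release publishes it
}

int consumer() {
    // Spin until the load-acquire reads the value written by the store-release.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;  // the write of 42 happens-before this read
}

// Runs both threads once and returns what the consumer observed.
int run_message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer(producer);
    writer.join();
    reader.join();
    return seen;
}
```

Once the consumer has read true from the flag, it is guaranteed to read 42 from payload. Replacing both orderings with std::memory_order_seq_cst would not change this two-thread result.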

e.g. std::atomic<int> variables x and y, both initially 0.

Thread 1:

x.store(1,std::memory_order_release);

Thread 2:

y.store(1,std::memory_order_release);

Thread 3:

int a=x.load(std::memory_order_acquire); // x before y
int b=y.load(std::memory_order_acquire); 

Thread 4:

int c=y.load(std::memory_order_acquire); // y before x
int d=x.load(std::memory_order_acquire);

As written, there is no relationship between the stores to x and y, so it is quite possible to see a==1, b==0 in thread 3, and c==1 and d==0 in thread 4.

If all the memory orderings are changed to std::memory_order_seq_cst then this enforces an ordering between the stores to x and y. Consequently, if thread 3 sees a==1 and b==0 then that means the store to x must be before the store to y, so if thread 4 sees c==1, meaning the store to y has completed, then the store to x must also have completed, so we must have d==1.
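
The four-thread example can be assembled into a small harness along the following lines. This is a sketch: Observed, run_iriw_seq_cst, forbidden and count_forbidden are made-up names, and the harness only checks the std::memory_order_seq_cst variant, where the outcome a==1, b==0, c==1, d==0 is ruled out.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Observed { int a, b, c, d; };

// Runs threads 1 to 4 once, with every operation using seq_cst.
Observed run_iriw_seq_cst() {
    std::atomic<int> x{0}, y{0};
    Observed r{0, 0, 0, 0};

    std::thread t1([&] { x.store(1, std::memory_order_seq_cst); });
    std::thread t2([&] { y.store(1, std::memory_order_seq_cst); });
    std::thread t3([&] {
        r.a = x.load(std::memory_order_seq_cst);  // x before y
        r.b = y.load(std::memory_order_seq_cst);
    });
    std::thread t4([&] {
        r.c = y.load(std::memory_order_seq_cst);  // y before x
        r.d = x.load(std::memory_order_seq_cst);
    });

    t1.join(); t2.join(); t3.join(); t4.join();
    return r;
}

// True when threads 3 and 4 disagree about the order of the two stores.
bool forbidden(const Observed& r) {
    return r.a == 1 && r.b == 0 && r.c == 1 && r.d == 0;
}

// Counts how often the disagreeing outcome shows up over many runs.
int count_forbidden(int iterations) {
    int hits = 0;
    for (int i = 0; i < iterations; ++i) {
        if (forbidden(run_iriw_seq_cst())) ++hits;
    }
    return hits;
}
```

With seq_cst the count must stay at zero. Swapping in release stores and acquire loads makes the disagreeing outcome permitted, but a test run that never observes it proves nothing: whether it can actually appear depends on the hardware, and on x86 in particular it is not expected to show up.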

In practice, using std::memory_order_seq_cst everywhere will add overhead to loads, to stores, or to both, depending on your compiler and processor architecture. For example, a common technique on x86 processors is to use XCHG instructions rather than MOV instructions for std::memory_order_seq_cst stores in order to provide the necessary ordering guarantees, whereas for std::memory_order_release a plain MOV will suffice. On systems with more relaxed memory architectures the overhead may be greater, since plain loads and stores have fewer guarantees.
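
One way to see the difference for yourself is to compile a few tiny functions such as the ones below and inspect the generated assembly. The function names are made up for illustration, and the instruction choices in the comments are what is typical on x86-64, not something the standard guarantees. Other compilers and architectures will differ.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> flag{0};

// Typically compiles to a plain MOV on x86-64.
void store_release(int v) { flag.store(v, std::memory_order_release); }

// Typically compiles to XCHG (or MOV followed by a fence) on x86-64.
void store_seq_cst(int v) { flag.store(v, std::memory_order_seq_cst); }

// On x86-64 both loads are usually a plain MOV; the cost of seq_cst
// tends to be paid on the store side there.
int load_acquire() { return flag.load(std::memory_order_acquire); }
int load_seq_cst() { return flag.load(std::memory_order_seq_cst); }

// Single-threaded sanity check: all variants read back what was stored.
void self_check() {
    store_release(1);
    assert(load_acquire() == 1);
    store_seq_cst(2);
    assert(load_seq_cst() == 2);
}
```

Functionally the variants are indistinguishable from a single thread. The interesting part is the compiler output, for example from g++ -O2 -S.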

Memory ordering is hard. I devoted almost an entire chapter to it in my book.
