Memory ordering behavior of std::atomic::load

Question

Am I wrong to assume that atomic::load should also act as a memory barrier, ensuring that all previous non-atomic writes become visible to other threads?

Example:

volatile bool arm1 = false;
std::atomic_bool arm2{false};  // brace-init: copy-initialization from bool is ill-formed before C++17
bool triggered = false;

Thread1:

arm1 = true;
// std::atomic_thread_fence(std::memory_order_seq_cst); // this would do the trick
if (arm2.load())
    triggered = true;

Thread2:

arm2.store(true);
if (arm1)
    triggered = true;

I expected that after executing both threads, 'triggered' would be true. Please don't suggest making arm1 atomic; the point is to explore the behavior of atomic::load.
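
For reference, here is a minimal sketch of a variant where the expected outcome is actually guaranteed by the standard. Note that it deliberately departs from the question's constraint: arm1 and triggered are made atomic (with relaxed accesses) so the program has no data race, and a seq_cst fence is placed between the store and the load in both threads. The helper names run_once and always_triggered are hypothetical and exist only for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs the two-thread example once and reports
// whether at least one thread set `triggered`.
bool run_once() {
    // Assumption: all three flags are atomic here to avoid a data race.
    std::atomic<bool> arm1{false};
    std::atomic<bool> arm2{false};
    std::atomic<bool> triggered{false};

    std::thread t1([&] {
        arm1.store(true, std::memory_order_relaxed);
        // Full fence between this thread's store and its load.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        if (arm2.load(std::memory_order_relaxed))
            triggered.store(true, std::memory_order_relaxed);
    });
    std::thread t2([&] {
        arm2.store(true, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        if (arm1.load(std::memory_order_relaxed))
            triggered.store(true, std::memory_order_relaxed);
    });

    t1.join();
    t2.join();
    return triggered.load(std::memory_order_relaxed);
}

// Hypothetical helper: repeats the experiment and reports whether
// `triggered` was set in every run.
bool always_triggered(int iterations) {
    assert(iterations > 0);
    for (int i = 0; i < iterations; ++i) {
        if (!run_once())
            return false;
    }
    return true;
}
```

The two fences are totally ordered with respect to each other, so whichever thread's fence comes second must observe the other thread's store. Without the fences (or with a fence in only one thread and plain accesses in the other), both loads may read false.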

While I have to admit I don't fully understand the formal definitions of the different relaxed memory orderings, I thought that sequentially consistent ordering was pretty straightforward, in that it guarantees that "a single total order exists in which all threads observe all modifications in the same order." To me this implies that std::atomic::load with the default memory order of std::memory_order_seq_cst will also act as a memory fence. This is further corroborated by the following statement under "Sequentially-consistent ordering":

Total sequential ordering requires a full memory fence CPU instruction on all multi-core systems.

Yet my simple example below demonstrates that this is not the case with MSVC 2013, gcc 4.9 (x86) and clang 3.5.1 (x86), where the atomic load simply translates to a plain load instruction.

#include <atomic>

std::atomic_long al;

#ifdef _WIN32
__declspec(noinline)
#else
__attribute__((noinline))
#endif
long load() {
    return al.load(std::memory_order_seq_cst);
}

int main(int argc, char* argv[]) {
    long r = load();
}

With gcc this looks like:

load():
   mov  rax, QWORD PTR al[rip]   ; <--- plain load here, no fence or xchg
   ret
main:
   call load()
   xor  eax, eax
   ret
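
For comparison, it can help to look at the store side as well. The sketch below adds a hypothetical companion function (store_seq_cst is not part of the original example) that performs a seq_cst store to the same kind of variable. On x86, mainstream compilers typically emit xchg, or mov followed by mfence, for the store, which is where the ordering cost of seq_cst ends up on that architecture; the exact instructions depend on the compiler and version, so treat this as something to check in your own assembly output.

```cpp
#include <atomic>

std::atomic_long al{0};

#ifdef _WIN32
#define NOINLINE __declspec(noinline)
#else
#define NOINLINE __attribute__((noinline))
#endif

// Hypothetical companion to load(): a seq_cst store, kept out of line so
// that its code is easy to find in the disassembly.
NOINLINE void store_seq_cst(long value) {
    al.store(value, std::memory_order_seq_cst);
}

// Same load as in the example above, under a distinct name.
NOINLINE long load_seq_cst() {
    return al.load(std::memory_order_seq_cst);
}
```

Compiling with optimizations and comparing the two functions side by side shows whether your toolchain puts the barrier on the store, the load, or both.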

I'll omit the MSVC and clang output, which is essentially identical. Now with gcc for ARM we get what I expected:

load():
     dmb    sy                         ; <---- data memory barrier here
     movw   r3, #:lower16:.LANCHOR0
     movt   r3, #:upper16:.LANCHOR0
     ldr    r0, [r3]                   
     dmb    sy                         ; <----- and here
     bx lr
main:
    push    {r3, lr}
    bl  load()
    movs    r0, #0
    pop {r3, pc}

This is not an academic question: it results in a subtle race condition in our code, which called into question my understanding of the behavior of std::atomic.

Recommended answer

Sigh, this was too long for a comment:


Isn't the meaning of atomic "to appear to occur instantaneously to the rest of the system"?

I'd say yes and no to that one, depending on how you think of it. For writes with SEQ_CST, yes. But as far as how atomic loads are handled, check out 29.3 of the C++11 standard. Specifically, 29.3.3 is really good reading, and 29.3.4 might be exactly what you're looking for:


For an atomic operation B that reads the value of an atomic object M, if there is a memory_order_seq_cst fence X sequenced before B, then B observes either the last memory_order_seq_cst modification of M preceding X in the total order S or a later modification of M in its modification order.

Basically, SEQ_CST forces a global order just like the standard says, but reads can return an old value without violating the 'atomic' constraint.

To accomplish 'getting the absolute latest value' you'll need to perform an operation that forces the hardware coherency protocol to lock (the lock prefix on x86_64). This is what the atomic compare-and-exchange operations do, if you look at the assembly output.
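
As a rough illustration of that last point, a read can be expressed as a read-modify-write operation. The sketch below uses a hypothetical helper, read_via_cas, that performs a compare-and-exchange which never changes the stored value; the name and the choice of 0 as the comparison value are assumptions made for this example only.

```cpp
#include <atomic>

// Hypothetical helper: reads `a` through a read-modify-write operation
// instead of a plain load.
long read_via_cas(std::atomic<long>& a) {
    long expected = 0;
    // If a == 0, the exchange succeeds and writes 0 again (value unchanged).
    // Otherwise it fails and copies the current value into `expected`.
    a.compare_exchange_strong(expected, 0, std::memory_order_seq_cst);
    return expected;
}
```

A read-modify-write always reads the last value in the variable's modification order, which a plain load is not required to do. The price is that it needs exclusive access to the cache line, so it is noticeably more expensive than a load under contention.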
