Do atomic CAS operations on x86_64 and ARM always use std::memory_order_seq_cst?


Question

As Anthony Williams said:

some_atomic.load(std::memory_order_acquire) does just drop through to a simple load instruction, and some_atomic.store(std::memory_order_release) drops through to a simple store instruction.

It is known that on x86, load() and store() operations with the memory orders memory_order_consume, memory_order_acquire, memory_order_release and memory_order_acq_rel do not require any additional barrier instructions.
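To check this claim against a particular compiler, one could compile a few small wrappers and compare the generated instructions. This is only a sketch: the function names below are hypothetical and chosen for illustration. On x86 the acquire load and the release store are expected to compile to plain mov instructions, while a seq_cst store typically needs xchg, or mov followed by mfence.

```cpp
#include <atomic>

// Hypothetical wrappers, intended only for inspecting compiler output
// (for example with `g++ -O2 -S`); the names are not from the original post.

int load_acquire(const std::atomic<int>& a) {
    return a.load(std::memory_order_acquire);
}

void store_release(std::atomic<int>& a, int v) {
    a.store(v, std::memory_order_release);
}

void store_seq_cst(std::atomic<int>& a, int v) {
    a.store(v, std::memory_order_seq_cst);
}
```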

But on ARMv8 we know that there are memory barriers for both load() and store(): http://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-1-of-2 http://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-2-of-2

About different CPU architectures: http://g.oswego.edu/dl/jmm/cookbook.html

However, for the CAS operation on x86, these two lines with different memory orders produce identical disassembly (MSVS2012 x86_64):

    a.compare_exchange_weak(temp, 4, std::memory_order_seq_cst, std::memory_order_seq_cst);
000000013FE71A2D  mov         ebx,dword ptr [temp]  
000000013FE71A31  mov         eax,ebx  
000000013FE71A33  mov         ecx,4  
000000013FE71A38  lock cmpxchg dword ptr [temp],ecx  

    a.compare_exchange_weak(temp, 5, std::memory_order_relaxed, std::memory_order_relaxed);
000000013FE71A4D  mov         ecx,5  
000000013FE71A52  mov         eax,ebx  
000000013FE71A54  lock cmpxchg dword ptr [temp],ecx  

And the disassembly of the code compiled by GCC 4.8.1 x86_64, as shown in GDB:

a.compare_exchange_weak(temp, 4, std::memory_order_seq_cst, std::memory_order_seq_cst);
a.compare_exchange_weak(temp, 5, std::memory_order_relaxed, std::memory_order_relaxed);

0x4613b7  <+0x0027>         mov    0x2c(%rsp),%eax
0x4613bb  <+0x002b>         mov    $0x4,%edx
0x4613c0  <+0x0030>         lock cmpxchg %edx,0x20(%rsp)
0x4613c6  <+0x0036>         mov    %eax,0x2c(%rsp)
0x4613ca  <+0x003a>         lock cmpxchg %edx,0x20(%rsp)
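The two calls compared above can be reproduced in isolation by placing each one in its own function, which makes the per-call disassembly easier to read. This is a minimal sketch: the function names are hypothetical, and the expectation that both compile to the same lock cmpxchg applies to x86 only.

```cpp
#include <atomic>

// Hypothetical wrappers around the two calls from the question.
// `expected` is overwritten with the observed value when the CAS
// fails because the stored value differs.

bool cas_seq_cst(std::atomic<int>& a, int& expected, int desired) {
    return a.compare_exchange_weak(expected, desired,
                                   std::memory_order_seq_cst,
                                   std::memory_order_seq_cst);
}

bool cas_relaxed(std::atomic<int>& a, int& expected, int desired) {
    return a.compare_exchange_weak(expected, desired,
                                   std::memory_order_relaxed,
                                   std::memory_order_relaxed);
}
```

Compiling this at the same optimization level for x86_64 and for an ARM target is one way to see where the requested memory order changes the generated code.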

On x86/x86_64 platforms, does any atomic CAS operation, for example atomic_val.compare_exchange_weak(temp, 1, std::memory_order_relaxed, std::memory_order_relaxed);, always satisfy the ordering std::memory_order_seq_cst?

And if any CAS operation on x86 always runs with sequential consistency (std::memory_order_seq_cst) regardless of the requested memory order, is the same true on ARMv8?

QUESTION: Should a CAS with std::memory_order_relaxed lock the memory bus on x86 or ARM?

ANSWER: On x86, any compare_exchange_weak() operation with any std::memory_order (even std::memory_order_relaxed) always translates to LOCK CMPXCHG with a bus lock, so that it is truly atomic, and it costs the same as XCHG: "the cmpxchg is just as expensive as the xchg instruction".

(An addition: XCHG with a memory operand is equivalent to LOCK XCHG, but CMPXCHG is not equivalent to LOCK CMPXCHG; only the latter is truly atomic.)

On ARM and PowerPC, compare_exchange_weak() is implemented through LL/SC, and different std::memory_orders map to different processor instructions.
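Because a store-conditional can fail even when the value matches, compare_exchange_weak() is allowed to fail spuriously and is normally called in a retry loop. A minimal sketch, assuming a hypothetical helper fetch_max_relaxed that is not part of the standard library:

```cpp
#include <atomic>

// Hypothetical helper: atomically raises `a` to at least `candidate`.
// Returns the value that was stored before a successful update, or the
// current value if no update was needed.
int fetch_max_relaxed(std::atomic<int>& a, int candidate) {
    int observed = a.load(std::memory_order_relaxed);
    // On failure (including a spurious one) `observed` is refreshed with
    // the current value, so the loop re-checks the condition.
    while (observed < candidate &&
           !a.compare_exchange_weak(observed, candidate,
                                    std::memory_order_relaxed,
                                    std::memory_order_relaxed)) {
    }
    return observed;
}
```

The relaxed orders keep the update itself atomic but impose no ordering on surrounding memory accesses. Stronger orders would be needed if other data were published through this variable.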

Processor memory barrier instructions for x86 (except CAS), ARM and PowerPC: http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html

Recommended answer

You shouldn't worry about which instructions the compiler maps a given C11 construct to, as this doesn't capture everything. Instead, you need to develop code with respect to the guarantees of the C11 memory model. As the above comment notes, your compiler or future compilers are free to reorder relaxed memory operations as long as doing so doesn't violate the C11 memory model. It is also worthwhile to run your code through a tool like CDSChecker to see which behaviors are allowed under the memory model.
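As an illustration of coding against the memory model rather than against a particular instruction mapping, the sketch below publishes a value with a release store and reads it with an acquire load. The struct and member names are hypothetical. The guarantee that a reader who sees the flag set also sees the payload comes from the release/acquire pairing itself, on any architecture.

```cpp
#include <atomic>

// Hypothetical single-producer mailbox; names are illustrative only.
struct Mailbox {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};   // publication flag

    // Producer: write the data, then publish it with a release store.
    void publish(int value) {
        payload = value;
        ready.store(true, std::memory_order_release);
    }

    // Consumer: the acquire load synchronizes with the release store,
    // so `payload` is safe to read once `ready` is observed as true.
    bool try_consume(int& out) {
        if (!ready.load(std::memory_order_acquire)) {
            return false;
        }
        out = payload;
        return true;
    }
};
```

If both orders were relaxed, the same code might still appear to work on x86 with a given compiler, but the model would no longer guarantee it and the compiler would be free to reorder the accesses.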
