Are lock-free atomics address-free in practice?


Problem description

Boost.Interprocess is a wonderful library that simplifies the usage of shared memory amongst different processes. It provides mutexes, condition variables, and semaphores, which allow for synchronization when writing and reading from the shared memory.

However, in some situations these (relatively) performance-intensive synchronization mechanisms are not necessary - atomic operations suffice for my use case, and will likely give much better performance.

Unfortunately, Boost.Interprocess does not seem to come with atomics.


The C++ Standard Library provides the std::atomic class template, which encapsulates objects whose operations need to be atomic, and also has functions to test if atomic operations are lock-free. But it does not require lock-free atomics to be address-free as well: [atomics.lockfree]/4 merely encourages that lock-free operations be address-free, and this is in agreement with cppreference.

I cannot think of any reason why one would implement lock-free atomics in a non-address-free manner. It even appears to me to be considerably easier to implement lock-free atomics in an address-free manner.

Since I would gain significant performance benefits when using atomics instead of mutexes (from Boost.Interprocess), it seems tempting to discount standard-compliance here and store std::atomic objects in the shared memory.


There are two parts to this question:

  1. Do CPUs implement lock-free atomics in an address-free manner in practice? (I only care about CPUs that are used to run modern desktop and mobile operating systems (e.g. Windows, MacOS, Linux, Android, iOS), but not embedded systems)
  2. Why on earth would an implementation use non-address-free atomics that are lock-free?

Solution

Yes, lock-free atomics are address-free on all C++ implementations on all normal CPUs, and can safely be used in shared memory between processes. Non-lock-free atomics (see footnote 1) won't be safe between processes, though: each process will have its own hash table of locks (see "Where is the lock for a std::atomic?").
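As a rough illustration of that claim, here is a minimal sketch assuming a POSIX system (Linux, macOS) where `mmap` with `MAP_SHARED | MAP_ANONYMOUS` and `fork` are available. The helper name `count_across_processes` is made up for this example. A parent and a forked child both increment one `std::atomic<int>` placed in shared memory, and the `static_assert` refuses to build if the atomic would need a lock.

```cpp
#include <atomic>
#include <cassert>
#include <new>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Only lock-free atomics are usable across processes; fail the build otherwise.
static_assert(std::atomic<int>::is_always_lock_free,
              "atomic<int> must be lock-free to be shared between processes");

// Hypothetical helper: the parent and a forked child each perform
// `per_process` increments on one counter in shared memory.
// Returns the final count, or -1 if a system call fails.
int count_across_processes(int per_process) {
    assert(per_process >= 0);
    void* mem = mmap(nullptr, sizeof(std::atomic<int>), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    auto* counter = new (mem) std::atomic<int>(0);  // construct in place

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, sizeof(std::atomic<int>));
        return -1;
    }
    if (pid == 0) {  // child: increments the same physical memory
        for (int i = 0; i < per_process; ++i)
            counter->fetch_add(1, std::memory_order_relaxed);
        _exit(0);
    }
    for (int i = 0; i < per_process; ++i)  // parent increments concurrently
        counter->fetch_add(1, std::memory_order_relaxed);

    int status = 0;
    waitpid(pid, &status, 0);  // returns only after the child has exited
    int total = counter->load();
    munmap(mem, sizeof(std::atomic<int>));
    return total;
}
```

After `fork`, both processes happen to see the mapping at the same virtual address. The point of the sketch is only that no per-process state is involved, so no increment is lost.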

The C++ standard intends lock-free atomics to work in shared memory between processes, but it can only go as far as "should" without defining terms and so on.

C++ draft 29.5 Lock-free property

[ Note: Operations that are lock-free should also be address-free. That is, atomic operations on the same memory location via two different addresses will communicate atomically. The implementation should not depend on any per-process state. This restriction enables communication by memory that is mapped into a process more than once and by memory that is shared between two processes. — end note ]
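The "mapped into a process more than once" case from the note can be sketched as follows, assuming a POSIX system with a writable `/tmp`. The function name and the file name template are hypothetical. The C++ object model does not really describe one object visible at two addresses, so treating the second mapping as the same `std::atomic` is an assumption that relies on the implementation being address-free. The test is single-threaded, so it demonstrates communication through two addresses rather than atomicity under contention.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: maps the same file-backed memory twice and updates
// one atomic through both virtual addresses. Returns true if every update
// made through one address is visible through the other.
bool double_mapping_communicates() {
    char path[] = "/tmp/addr_free_XXXXXX";  // hypothetical name template
    int fd = mkstemp(path);
    if (fd < 0) return false;
    unlink(path);  // the file disappears once the mappings are gone
    const std::size_t len = sizeof(std::atomic<unsigned>);
    if (ftruncate(fd, static_cast<off_t>(len)) != 0) {
        close(fd);
        return false;
    }

    void* a = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void* b = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED) munmap(a, len);
        if (b != MAP_FAILED) munmap(b, len);
        return false;
    }
    assert(a != b);  // two distinct virtual addresses, one physical location

    auto* via_a = new (a) std::atomic<unsigned>(0);
    // Assumption: the second view aliases the object constructed above.
    auto* via_b = static_cast<std::atomic<unsigned>*>(b);

    via_a->fetch_add(1);
    via_b->fetch_add(2);
    unsigned expected = 3;
    bool ok = via_a->compare_exchange_strong(expected, 10) &&
              via_b->load() == 10;

    munmap(a, len);
    munmap(b, len);
    return ok;
}
```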

This is a quality-of-implementation recommendation that is very easy to implement on current hardware, and in fact you'd have to try hard to design a DeathStation 9000 C++ implementation that violates it on x86 / ARM / PowerPC / other mainstream CPUs while actually being lock-free.


The mechanism hardware exposes for atomic read-modify-write operations is based on MESI cache coherency, which only cares about physical addresses. x86 lock cmpxchg / lock add / etc. make a core hang on to a cache line in Modified state so no other core can read or write it in the middle of the atomic operation (see "Can num++ be atomic for 'int num'?").

Most non-x86 architectures use LL/SC, which lets you write a retry loop that only does the store if it will be atomic. LL/SC can emulate CAS with O(1) overhead in a wait-free manner without introducing addresses.

On those ISAs, C++ lock-free atomics compile to use LL/SC instructions directly. See my answer on the num++ question for x86 examples. See "Atomically clearing lowest non-zero bit of an unsigned integer" for some examples of AArch64 code-gen for compare_exchange_weak vs. fetch_add using LL/SC instructions.
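For readers who have not written such a retry loop, here is a minimal sketch of the kind of operation discussed in the linked question, clearing the lowest set bit with `compare_exchange_weak`. The function names are made up for this example, and the memory orders are one reasonable choice rather than a requirement. On an LL/SC machine the weak CAS is allowed to fail spuriously, which is why it sits in a loop.

```cpp
#include <atomic>
#include <cassert>

// Atomically clears the lowest set bit of `word` and returns the value
// it held before the update. A value of 0 is stored back unchanged.
unsigned clear_lowest_set_bit(std::atomic<unsigned>& word) {
    unsigned old = word.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak writes the current value into `old`,
    // so the desired value `old & (old - 1)` is recomputed on each retry.
    while (!word.compare_exchange_weak(old, old & (old - 1),
                                       std::memory_order_acq_rel,
                                       std::memory_order_relaxed)) {
    }
    return old;
}

// Usage sketch: 0x16 (binary 10110) becomes 0x14 (binary 10100).
inline void clear_lowest_set_bit_example() {
    std::atomic<unsigned> w{0x16};
    assert(clear_lowest_set_bit(w) == 0x16);
    assert(w.load() == 0x14);
}
```

Nothing in the loop depends on the virtual address beyond naming the memory location, so the same code works on an atomic that lives in memory shared with another process.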

Atomic pure loads and pure stores are easier and happen for free with aligned data. On x86, see "Why is integer assignment on a naturally aligned variable atomic on x86?". Other ISAs have similar rules.


Related: I included some comments about address-free in my answer on "Genuinely test std::atomic is lock-free or not". I'm not sure whether they're helpful or correct. :/


Footnote 1:

All mainstream CPUs have lock-free atomics for objects up to the width of a pointer. Some have wider atomics, e.g. x86 has lock cmpxchg16b, but not all implementations choose to use it for double-width lock-free atomic objects. Check C++17 std::atomic<T>::is_always_lock_free, or the ATOMIC_xxx_LOCK_FREE macros if defined, for compile-time detection.
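One way to apply that compile-time check is sketched below, assuming C++17. The helper `as_shared_atomic` is hypothetical, not part of any library: it refuses to compile for types whose atomics might use a lock, and otherwise constructs the atomic in caller-provided memory such as a shared-memory segment.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <new>

// The macro form of the check: 2 means "always lock-free".
static_assert(ATOMIC_INT_LOCK_FREE == 2, "atomic<int> may need a lock");

// Hypothetical helper: constructs a std::atomic<T> in `mem`, which is assumed
// to point at suitably sized and aligned storage (e.g. shared memory).
template <class T>
std::atomic<T>* as_shared_atomic(void* mem) {
    static_assert(std::atomic<T>::is_always_lock_free,
                  "atomic<T> may use a per-process lock; do not share it");
    assert(reinterpret_cast<std::uintptr_t>(mem) % alignof(std::atomic<T>) == 0);
    return new (mem) std::atomic<T>(T{});
}
```

With a double-width type, for example a struct of two 64-bit integers, the `static_assert` may or may not fire depending on the compiler and its flags, which matches the caveat above about cmpxchg16b.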

(Some microcontrollers can't hold a pointer in a single register (or copy it around with a single operation), but there aren't usually multi-core implementations of such ISAs.)


Why on earth would an implementation use non-address-free atomics that are lock-free?

I don't know of any plausible reason on hardware that works anything like normal modern CPUs. You could maybe imagine some architecture where you do atomic operations by submitting the address to some

I think the C++ standard wants to avoid constraining non-mainstream implementations as much as possible, e.g. C++ on top of some kind of interpreter, rather than compiled to machine code for a "normal" CPU architecture.

IDK if you could usefully implement C++ on a loosely-coupled shared memory system like a cluster with ethernet links instead of shared memory, or non-coherent shared memory (that has to be flushed explicitly for other threads to see your stores).

I think it's mostly that the C++ committee can't say much about how atomics must be implemented without assuming that implementations will run programs under an OS where multiple processes can set up shared memory.

They might be imagining some future ISA where address-free atomics aren't possible, but I think more likely they don't want to talk about shared-memory between multiple C++ programs. The standard only requires that an implementation run one program.

Apparently std::atomic_flag is actually guaranteed to be address-free (see "Why only std::atomic_flag is guaranteed to be lock-free?"), so IDK why they don't make the same requirement for any atomic<T> that the implementation chooses to implement as lock-free.
