Which std::sync::atomic::Ordering to use?

Question

All the methods of std::sync::atomic::AtomicBool take a memory ordering (Relaxed, Release, Acquire, AcqRel, and SeqCst), which I have not used before. Under what circumstances should these values be used? The documentation uses confusing "load" and "store" terms which I don’t really understand. For example:

A producer thread mutates some state held by a Mutex, then calls AtomicBool::compare_and_swap(false, true, ordering) (to coalesce invalidations), and if it swapped, posts an "invalidate" message to a concurrent queue (e.g. mpsc or a winapi PostMessage). A consumer thread resets the AtomicBool, reads from the queue, and reads the state held by the Mutex. Can the producer use Relaxed ordering because it is preceded by a mutex, or must it use Release? Can the consumer use store(false, Relaxed), or must it use compare_and_swap(true, false, Acquire) to receive the changes from the mutex?

What if the producer and consumer share a RefCell instead of a Mutex?

Recommended answer

I'm not an expert on this, and it's really complicated, so please feel free to critique my post. As pointed out by mdh.heydari, cppreference.com has much better documentation of orderings than Rust (C++ has an almost identical API).

You'd need to use "release" ordering in your producer and "acquire" ordering in your consumer. This ensures that any thread which observes the AtomicBool as true also observes the data mutation that preceded it.
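A minimal sketch of the pattern from the question may make this concrete. It assumes a `u32` as the shared state and an `mpsc` channel as the queue; the names `run_invalidate_demo`, `state`, and `dirty` are made up for illustration. It uses `compare_exchange` in place of `compare_and_swap`, which is deprecated in current Rust.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical demo: one producer, one consumer.
/// Returns the last state value the consumer read.
fn run_invalidate_demo() -> u32 {
    let state = Arc::new(Mutex::new(0u32));
    let dirty = Arc::new(AtomicBool::new(false));
    let (tx, rx) = mpsc::channel::<()>();

    let producer = {
        let (state, dirty) = (Arc::clone(&state), Arc::clone(&dirty));
        thread::spawn(move || {
            for i in 1..=100u32 {
                // Mutate the shared state; the guard is dropped at the end of the statement.
                *state.lock().unwrap() = i;
                // Release on success, Relaxed on failure. A message is posted
                // only when the flag actually flips from false to true, so
                // repeated invalidations are coalesced.
                if dirty
                    .compare_exchange(false, true, Ordering::Release, Ordering::Relaxed)
                    .is_ok()
                {
                    tx.send(()).unwrap();
                }
            }
            // `tx` is dropped here, which ends the consumer loop below.
        })
    };

    let mut last_seen = 0;
    while rx.recv().is_ok() {
        // Reset the flag with Acquire, then read the state under the lock.
        dirty.swap(false, Ordering::Acquire);
        last_seen = *state.lock().unwrap();
    }
    producer.join().unwrap();
    last_seen
}

fn main() {
    println!("consumer last saw {}", run_invalidate_demo());
}
```

The flag is reset before the state is read, so an invalidation that arrives in between posts a fresh message instead of being lost.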

If your queue is asynchronous, then the consumer will need to keep trying to read from it in a loop, since the producer could get interrupted between setting the AtomicBool and putting something in the queue.

If the producer code might run multiple times before the consumer runs, then you can't use RefCell, because the producer could mutate the data while the consumer is reading it. Otherwise it's fine in principle, although RefCell is not Sync, so safe Rust won't let you share one between threads in the first place.

There are other better and simpler ways to implement this pattern, but I assume you were just giving it as an example.

The different orderings have to do with what another thread sees happen when an atomic operation occurs. Compilers and CPUs are normally both allowed to reorder instructions in order to optimize code, and the orderings affect how much they're allowed to reorder instructions.

You could just always use SeqCst, which basically guarantees everyone will see that instruction as having occurred wherever you put it relative to other instructions, but in some cases if you specify a less restrictive ordering then LLVM and the CPU can better optimize your code.

You should think of these orderings as applying to a memory location (instead of applying to an instruction).

Relaxed imposes no constraints besides any modification to the memory location being atomic (so it either happens completely or not at all). This is fine for something like a counter, where the values retrieved or set by individual threads don't matter as long as each operation is atomic.
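As a rough illustration of the counter case, here is a small sketch. The function name `relaxed_count` and its parameters are made up for this example. Every increment is atomic, so none are lost, and joining the threads is what makes the final total visible to the caller.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: `threads` workers each bump a shared counter
/// `per_thread` times using Relaxed ordering.
fn relaxed_count(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Atomic read-modify-write; no ordering with other memory.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for handle in handles {
        // join() synchronizes with the finished thread.
        handle.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("total = {}", relaxed_count(4, 10_000));
}
```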

The Acquire constraint says that any variable reads that occur in your code after the acquire operation can't be reordered to occur before it. So, say in your code you read some shared memory location with Acquire and get value X, which was stored in that memory location at time T. Any memory locations that you read from after that will have the value they had at time T or later.

This is probably what most people would expect to happen intuitively, but because a CPU and optimizer are allowed to reorder instructions as long as they don't change the result, it isn't guaranteed.

In order for "acquire" to be useful, it has to be paired with "release", because otherwise there's no guarantee that the other thread didn't reorder its write instructions that were supposed to occur at time T to an earlier time.

Acquire-reading the flag value you're looking for means you won't see a stale value somewhere else that was actually changed by a write before the release-store to the flag.

The Release constraint says that any variable writes that occur in your code before the release operation can't be reordered to occur after it. So, say in your code you write to a few shared memory locations and then set some memory location t at time T with Release. Any writes that appear in your code before that release store are guaranteed to have occurred before it.

Again, this is what most people would expect to happen intuitively, but it isn't guaranteed without constraints.

If the other thread trying to read value X doesn't use "acquire", then it isn't guaranteed to see the new value with respect to changes in other variable values. So it could get the new value, but it might not see new values for any other shared variables. Also keep in mind that testing is hard. Some hardware won't in practice show re-ordering with some unsafe code, so problems can go undetected.
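The pairing can be sketched as follows, assuming a payload stored with Relaxed and a separate flag that carries the release/acquire edge. The names `publish_and_read`, `data`, and `ready` are illustrative only.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical demo: a writer publishes a value through a flag,
/// and the caller reads it back after observing the flag.
fn publish_and_read() -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let writer = {
        let (data, ready) = (Arc::clone(&data), Arc::clone(&ready));
        thread::spawn(move || {
            data.store(42, Ordering::Relaxed);
            // Release: the store to `data` above cannot move below this line.
            ready.store(true, Ordering::Release);
        })
    };

    // Acquire: the read of `data` below cannot move above this load.
    while !ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    let seen = data.load(Ordering::Relaxed);
    writer.join().unwrap();
    seen
}

fn main() {
    println!("reader saw {}", publish_and_read());
}
```

If either side used Relaxed on `ready`, the reader could in principle see `ready == true` and still read 0 from `data`.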

Jeff Preshing wrote a nice explanation of acquire and release semantics, so read that if this isn't clear.

AcqRel does both Acquire and Release ordering (i.e. both restrictions apply). I'm not sure when this is necessary; it might be helpful in situations with 3 or more threads if some Release, some Acquire, and some do both, but I'm not really sure.
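One situation where both halves appear useful, offered here as an assumption rather than something the answer claims, is a "last one out" countdown. Each worker publishes its own result (the release half) and, if it turns out to be the last to finish, needs to see everyone else's results (the acquire half). The names `last_finisher_sum`, `slots`, and `remaining` are hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical demo: each worker fills its own slot, then decrements a
/// shared countdown. The worker that finishes last sums all the slots.
/// `workers` must be at least 1.
fn last_finisher_sum(workers: usize) -> usize {
    let slots: Vec<AtomicUsize> = (0..workers).map(|_| AtomicUsize::new(0)).collect();
    let slots = Arc::new(slots);
    let remaining = Arc::new(AtomicUsize::new(workers));

    let handles: Vec<_> = (0..workers)
        .map(|i| {
            let (slots, remaining) = (Arc::clone(&slots), Arc::clone(&remaining));
            thread::spawn(move || {
                slots[i].store(i + 1, Ordering::Relaxed);
                // Release publishes this worker's slot; Acquire lets the
                // last worker see the slots of everyone who went before.
                if remaining.fetch_sub(1, Ordering::AcqRel) == 1 {
                    Some(slots.iter().map(|s| s.load(Ordering::Relaxed)).sum::<usize>())
                } else {
                    None
                }
            })
        })
        .collect();

    let mut result = None;
    for handle in handles {
        if let Some(sum) = handle.join().unwrap() {
            result = Some(sum);
        }
    }
    result.expect("exactly one worker observes the countdown reaching zero")
}

fn main() {
    println!("sum seen by the last worker = {}", last_finisher_sum(4));
}
```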

SeqCst is the most restrictive and, therefore, the slowest option. It forces all SeqCst operations to appear to occur in one, identical order to every thread. On x86 this requires a full memory barrier (including StoreLoad) on every write to an atomic variable, typically an MFENCE instruction or an implicitly locked XCHG, while the weaker orderings don't. (SeqCst loads don't require a barrier on x86, as you can see in this C++ compiler output.)

Read-Modify-Write accesses, like atomic increment, or compare-and-swap, are done on x86 with locked instructions, which are already full memory barriers. If you care at all about compiling to efficient code on non-x86 targets, it makes sense to avoid SeqCst when you can, even for atomic read-modify-write ops. There are cases where it's needed, though.
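A classic case where SeqCst seems to be needed is the "store, then load the other flag" pattern, sketched below with hypothetical names. With SeqCst the two stores and two loads fall into one global order, so at least one thread must see the other's store. With only Release stores and Acquire loads, both loads would be allowed to return false.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical demo: each thread sets its own flag and then reads the
/// other thread's flag. Returns what each thread observed.
fn store_then_load_other() -> (bool, bool) {
    let x = Arc::new(AtomicBool::new(false));
    let y = Arc::new(AtomicBool::new(false));

    let t1 = {
        let (x, y) = (Arc::clone(&x), Arc::clone(&y));
        thread::spawn(move || {
            x.store(true, Ordering::SeqCst);
            y.load(Ordering::SeqCst)
        })
    };
    let t2 = {
        let (x, y) = (Arc::clone(&x), Arc::clone(&y));
        thread::spawn(move || {
            y.store(true, Ordering::SeqCst);
            x.load(Ordering::SeqCst)
        })
    };

    (t1.join().unwrap(), t2.join().unwrap())
}

fn main() {
    let (saw_y, saw_x) = store_then_load_other();
    // Under SeqCst, the outcome (false, false) is not allowed.
    println!("t1 saw y = {saw_y}, t2 saw x = {saw_x}");
}
```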

For more examples of how atomic semantics turn into ASM, see this larger set of simple functions on C++ atomic variables. I know this is a Rust question, but it's supposed to have basically the same API as C++. godbolt can target x86, ARM, ARM64, and PowerPC. Interestingly, ARM64 has load-acquire (ldar) and store-release (stlr) instructions, so it doesn't always have to use separate barrier instructions.

By the way, x86 CPUs are always "strongly ordered", which means every load and store already behaves as if it had at least acquire or release semantics, respectively. So on x86, choosing between Relaxed, Acquire, Release, and AcqRel only affects how LLVM's optimizer behaves; only SeqCst stores need an extra barrier. ARM, on the other hand, is weakly ordered. Relaxed allows the compiler full freedom to reorder things and doesn't require extra barrier instructions on weakly-ordered CPUs.
