Is a memory barrier or atomic operation required in a busy-wait loop?


Question

Consider the following spin_lock() implementation, originally from this answer:

void spin_lock(volatile bool* lock) {
    for (;;) {
        // inserts an acquire memory barrier and a compiler barrier
        if (!__atomic_test_and_set(lock, __ATOMIC_ACQUIRE))
            return;

        while (*lock)  // no barriers; is it OK?
            cpu_relax();
    }
}

What I already know:

  • volatile prevents the compiler from optimizing out the re-read of *lock on each iteration of the while loop;
  • volatile inserts neither memory nor compiler barriers;
  • such an implementation actually works in GCC for x86 (e.g. in the Linux kernel) and some other architectures;
  • at least one memory and compiler barrier is required in a spin_lock() implementation for a generic architecture; this example inserts them in __atomic_test_and_set().

Questions:

  1. Is volatile enough here or are there any architectures or compilers where memory or compiler barrier or atomic operation is required in the while loop?

    1.1 According to C++ standards?

    1.2 In practice, for known architectures and compilers, specifically for GCC and platforms it supports?

  2. Is this implementation safe on all architectures supported by GCC and Linux? (It is at least inefficient on some architectures, right?)
  3. Is the while loop safe according to C++11 and its memory model?


There are several related questions, but I was unable to construct an explicit and unambiguous answer from them:

Answer

  1. Is volatile enough here or are there any architectures or compilers where memory or compiler barrier or atomic operation is required in the while loop?

Will the volatile code see the change? Yes, but not necessarily as quickly as if there were a memory barrier. At some point, some form of synchronization will occur and the new state will be read from the variable, but there are no guarantees about how much has happened elsewhere in the code by then.

1.1 According to C++ standards?

From cppreference: memory_order

It is the memory model and memory order that define the generalized hardware the code needs to work on. For a message to pass between threads of execution, an inter-thread-happens-before relationship needs to exist. Evaluation A inter-thread happens before evaluation B if any of the following holds:

  • A synchronizes-with B
  • A is dependency-ordered before B
  • A synchronizes-with some evaluation X, and X is sequenced-before B
  • A is sequenced-before some evaluation X, and X inter-thread happens-before B
  • A inter-thread happens-before some evaluation X, and X inter-thread happens-before B

Because the while loop establishes none of these relationships, there are forms of your program that may fail on some current hardware.
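To make the synchronizes-with case concrete, here is a minimal sketch of release/acquire message passing with std::atomic. The names payload, ready and message_passing_demo are made up for illustration and are not part of the code in the question.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demo: passes one int from a producer thread to a consumer thread.
int message_passing_demo() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag that carries the synchronization

    std::thread producer([&] {
        payload = 42;                                  // sequenced before the store
        ready.store(true, std::memory_order_release);  // release store
    });

    int seen = -1;
    std::thread consumer([&] {
        // The acquire load that reads `true` synchronizes-with the release store.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // happens after the write to payload
    });

    producer.join();
    consumer.join();
    assert(ready.load());
    return seen;
}
```

With a volatile bool in place of the std::atomic<bool>, the standard would give no such ordering guarantee between the write to payload and the read of it.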

In practice, the end of a time-slice will cause the memory to become coherent, or any form of barrier on the non-spinlock thread will ensure that the caches are flushed.

I am not sure what causes the volatile read to get the "current value".

1.2 In practice, for known architectures and compilers, specifically for GCC and platforms it supports?

Because the code is not consistent with the generalized machine model introduced in C++11, it is likely to fail with C++ implementations that try to adhere to the standard.

From cppreference: const volatile qualifiers

Volatile access stops optimizations from moving work from before it to after it, and from after it to before it.

"This makes volatile objects suitable for communication with a signal handler, but not with another thread of execution"

So an implementation has to ensure that values are read from the memory location rather than from any local copy. But it does not have to ensure that the volatile write is flushed through the caches to produce a coherent view across all the CPUs. In this sense, there is no time bound on how long it takes for a write to a volatile variable to become visible to another thread.

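Also see the kernel.org document on why volatile is nearly always wrong in the kernel: https://www.kernel.org/doc/html/v4.11/process/volatile-considered-harmful.html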

  2. Is this implementation safe on all architectures supported by GCC and Linux? (It is at least inefficient on some architectures, right?)

There is no guarantee that the volatile write becomes visible outside the thread that performs it, so it is not really safe. On Linux it may be safe.
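If you want to stay with the GCC builtins, one way to avoid relying on volatile is to make the inner read an explicit relaxed atomic load. The following is only a sketch under assumptions: cpu_relax_hint() is a hypothetical stand-in for cpu_relax(), and spin_unlock() and locked_counter_demo() are added for illustration. The relaxed load inserts no barrier; ordering still comes from the acquire test-and-set and the release clear.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for cpu_relax(); a no-op outside x86.
static inline void cpu_relax_hint() {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();
#endif
}

void spin_lock(bool* lock) {
    for (;;) {
        // Acquire ordering on the successful test-and-set.
        if (!__atomic_test_and_set(lock, __ATOMIC_ACQUIRE))
            return;
        // Atomic read without a barrier; re-read on every iteration.
        while (__atomic_load_n(lock, __ATOMIC_RELAXED))
            cpu_relax_hint();
    }
}

void spin_unlock(bool* lock) {
    __atomic_clear(lock, __ATOMIC_RELEASE);
}

// Hypothetical demo: four threads increment a shared counter under the lock.
long locked_counter_demo() {
    bool lock = false;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < 10000; ++i) {
                spin_lock(&lock);
                ++counter;
                spin_unlock(&lock);
            }
        });
    }
    for (auto& w : workers) w.join();
    assert(!lock);  // the lock is released once all workers are done
    return counter;
}
```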

  3. Is the while loop safe according to C++11 and its memory model?

No - as it doesn't create any of the inter-thread messaging primitives.
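For comparison, a version written purely against the C++11 memory model could look like the sketch below. SpinLock and spinlock_class_demo are hypothetical names, and std::this_thread::yield() is used here only as a portable stand-in for cpu_relax(). The inner loop is a relaxed atomic load, so it is free of data races without adding a barrier.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical class; a sketch, not a tuned production lock.
class SpinLock {
public:
    void lock() {
        for (;;) {
            // Atomic read-modify-write with acquire ordering takes the lock.
            if (!locked_.exchange(true, std::memory_order_acquire))
                return;
            // Relaxed atomic load: no barrier, but no data race either.
            while (locked_.load(std::memory_order_relaxed))
                std::this_thread::yield();
        }
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical demo: four threads increment a shared counter under the lock.
int spinlock_class_demo() {
    SpinLock lock;
    int counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < 10000; ++i) {
                std::lock_guard<SpinLock> guard(lock);
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();
    assert(counter == 4 * 10000);
    return counter;
}
```

Because the class provides lock() and unlock(), it can be used with std::lock_guard as shown.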
