How do I make memory stores in one thread "promptly" visible in other threads?

Question

Suppose I want to copy the contents of a device register into a variable that will be read by multiple threads. Is there a good general way of doing this? Here are examples of two possible methods:

#include <atomic>

volatile int * const Device_reg_ptr = reinterpret_cast<int *>(0x666);

// This variable is read by multiple threads.
std::atomic<int> device_reg_copy;

// ...

// Method 1
const_cast<volatile std::atomic<int> &>(device_reg_copy)
  .store(*Device_reg_ptr, std::memory_order_relaxed);

// Method 2
device_reg_copy.store(*Device_reg_ptr, std::memory_order_relaxed);
std::atomic_thread_fence(std::memory_order_release);

More generally, in the face of possible whole-program optimization, how does one correctly control the latency with which memory writes in one thread become visible in other threads?

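In your answer, please consider the following scenario:

  • The code is running on a CPU in an embedded system.
  • A single application is running on the CPU.
  • The application has far fewer threads than the CPU has processor cores.
  • Each core has a massive number of registers.
  • The application is small enough that whole-program optimization can be used successfully when building the executable.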

How do we make sure that a store in one thread does not remain invisible to other threads indefinitely?

Recommended answer

The C++ standard is rather vague about making atomic stores visible to other threads.

29.3.12 Implementations should make atomic stores visible to atomic loads within a reasonable amount of time.

That is as detailed as it gets: there is no definition of 'reasonable', and visibility does not have to be immediate.

Using a stand-alone fence to force a certain memory ordering is not necessary, since you can specify the ordering on the atomic operations themselves. The real question is what you expect a memory fence to do.
Fences are designed to enforce ordering of memory operations (between threads), but they do not guarantee timely visibility. You can store a value to an atomic variable with the strongest memory ordering (i.e. seq_cst), but even when another thread executes load() at a later time than the store(), you might still get an old value from the cache, and yet (surprisingly) this does not violate the happens-before relationship. Using a stronger fence might make a difference with respect to timing and visibility, but there are no guarantees.
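To illustrate the point about specifying the ordering on the operation instead of using a stand-alone fence, here is a minimal sketch. The names `payload`, `ready`, `publish_with_release_store`, `publish_with_fence`, `consume` and `run_once` are hypothetical and do not come from the question. Note the fence placement in the second writer: to pair with an acquire load, a release fence has to be sequenced before the relaxed store. Neither variant makes the store visible any sooner; both only constrain ordering.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration; none of them appear in the question.
int payload = 0;                 // ordinary (non-atomic) data
std::atomic<bool> ready{false};  // flag used to publish the payload

// Writer, variant A: the ordering is attached to the store itself.
void publish_with_release_store(int value) {
    payload = value;
    ready.store(true, std::memory_order_release);
}

// Writer, variant B: a stand-alone release fence followed by a relaxed store.
// The fence comes BEFORE the store so that the write to `payload` is ordered
// ahead of the flag becoming true.
void publish_with_fence(int value) {
    payload = value;
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
}

// Reader: waits for the flag with an acquire load, then reads the payload.
int consume() {
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;
}

// Runs one writer and one reader and returns the value the reader observed.
int run_once(void (*publish)(int), int value) {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread reader([&seen] { seen = consume(); });
    std::thread writer(publish, value);
    writer.join();
    reader.join();
    return seen;
}
```

Calling `run_once(publish_with_release_store, 42)` or `run_once(publish_with_fence, 42)` should return 42 in both cases. How long the reader spins before it sees the flag is not something either variant controls.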

If prompt visibility is important, I would consider using a read-modify-write (RMW) operation to load the value. These operations read and modify atomically (i.e. in a single call), and have the additional property that they are guaranteed to operate on the latest value. But since they have to reach a little further than the local cache, they also tend to be more expensive to execute.
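The "latest value" property of RMW operations can be sketched as follows. This is only an illustration under assumed names (`counter`, `read_latest` and `count_with_rmw` are hypothetical): several threads increment a shared counter with `fetch_add`, and because every RMW reads the last value in the variable's modification order, no increment is lost. `read_latest` shows the "add zero" form used purely as a read.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical shared variable for illustration.
std::atomic<int> counter{0};

// An RMW that adds zero: it leaves the value unchanged but, being an RMW,
// reads the last value in the modification order of `counter`.
int read_latest() {
    return counter.fetch_add(0, std::memory_order_relaxed);
}

// Starts `num_threads` threads that each perform `iterations` increments,
// then returns the final value as observed through an RMW read.
int count_with_rmw(int num_threads, int iterations) {
    counter.store(0, std::memory_order_relaxed);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([iterations] {
            for (int i = 0; i < iterations; ++i) {
                // Each increment operates on the most recent value,
                // so concurrent increments never overwrite each other.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto &w : workers) {
        w.join();
    }
    return read_latest();
}
```

With 4 threads and 10000 iterations each, `count_with_rmw(4, 10000)` returns 40000. The same loop written as a separate `load()` followed by a `store()` could lose updates, which is the difference the paragraph above relies on.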

As pointed out by Maxim Egorushkin, whether or not you can use weaker memory orderings than the default (seq_cst) depends on whether other memory operations need to be synchronized (made visible) between threads. That is not clear from your question, but it is generally considered safe to use the default (sequential consistency).
If you are on an unusually weak platform, if performance is a problem, and if you need data synchronization between threads, you could consider using acquire/release semantics:

// thread 1
device_reg_copy.store(*Device_reg_ptr, std::memory_order_release);


// thread 2
device_reg_copy.fetch_add(0, std::memory_order_acquire);

If thread 2 sees the value written by thread 1, it is guaranteed that memory operations prior to the store in thread 1 are visible after the load in thread 2. Acquire/release operations form a pair, and they synchronize based on a run-time relationship between the store and the load. In other words, if thread 2 does not see the value stored by thread 1, there are no ordering guarantees.
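The run-time nature of that pairing can be sketched like this. The sketch makes several assumptions that are not in the question: a plain `int` argument stands in for the memory-mapped register, `sample_count` is a hypothetical piece of extra data published alongside the copy, and the value 0 is reserved to mean "nothing published yet".

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for illustration only.
int sample_count = 0;                 // non-atomic data written before the store
std::atomic<int> device_reg_copy{0};  // 0 is assumed to mean "not published yet"

// Thread 1: write the extra data first, then publish with a release store.
void publish(int reg_value) {
    sample_count = 1;
    device_reg_copy.store(reg_value, std::memory_order_release);
}

// Thread 2: a single attempt, no waiting. The extra data is read only if
// the acquire RMW actually observed the published value.
bool try_read_sample_count(int &out) {
    if (device_reg_copy.fetch_add(0, std::memory_order_acquire) != 0) {
        out = sample_count;  // ordered after the write in publish()
        return true;
    }
    // The store was not observed, so there is no ordering guarantee and
    // reading sample_count here could race with publish().
    return false;
}
```

Before `publish()` has been observed, `try_read_sample_count` returns false and leaves `out` untouched. Once the published value is seen, it returns true and `out` holds the value written before the store.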

If the atomic variable has no dependencies on any other data, you can use std::memory_order_relaxed; store ordering is always guaranteed for a single atomic variable.
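As a rough illustration of the per-variable ordering guarantee, the following sketch has a writer store increasing values with `memory_order_relaxed` while a reader polls with relaxed loads and checks that the observed values never go backwards. The names `latest` and `reader_never_goes_backwards` are hypothetical. The sketch says nothing about how quickly the reader sees each store, only that it cannot see them out of order.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-alone value with no dependencies on other data.
std::atomic<int> latest{0};

// The writer stores 1..n in order; the reader polls until it sees n.
// Returns true if the reader never observed a value smaller than one
// it had already seen.
bool reader_never_goes_backwards(int n) {
    latest.store(0, std::memory_order_relaxed);
    bool monotonic = true;
    std::thread writer([n] {
        for (int i = 1; i <= n; ++i) {
            latest.store(i, std::memory_order_relaxed);
        }
    });
    std::thread reader([&monotonic, n] {
        int prev = 0;
        while (prev < n) {
            int cur = latest.load(std::memory_order_relaxed);
            if (cur < prev) {
                monotonic = false;  // would contradict the modification order
            }
            prev = cur;
        }
    });
    writer.join();
    reader.join();
    return monotonic;
}
```

The reader may skip intermediate values, since relaxed loads are not required to observe every store, but it should never see an older value after a newer one.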

As mentioned by others, there is no need for volatile when it comes to inter-thread communication with std::atomic.
