Where can we use std::barrier over std::latch?


Question

I recently heard about these new C++ standard features:

  1. std::latch
  2. std::barrier

I cannot figure out in which situations they are applicable and useful over one another.

  • It would be very helpful if someone could give an example of how each of them can be used sensibly.

Answer

Very short answer

They're really aimed at quite different goals:

  • A barrier is useful when you have a bunch of threads and you want to synchronise across all of them at once, for example to operate on all of the data at once.
  • A latch is useful if you have a bunch of work items and you want to know when they've all been handled, and aren't necessarily interested in which thread(s) handled them.

Barriers and latches are often used when you have a pool of worker threads that do some processing and a queue of work items that is shared between them. It's not the only situation where they're used, but it is a very common one and does help illustrate the differences. Here's some example code that would set up some threads like this:

#include <functional>
#include <thread>
#include <vector>

// Proc and Queue are application-specific types, described below.
const size_t worker_count = 7; // or whatever
std::vector<std::thread> workers;
std::vector<Proc> procs(worker_count);
Queue<std::function<void(Proc&)>> queue;
for (size_t i = 0; i < worker_count; ++i) {
    workers.push_back(std::thread(
        [p = &procs[i], &queue]() {
            while (auto fn = queue.pop_back()) {
                fn(*p);
            }
        }
    ));
}

There are two types that I have assumed exist in that example:

  • Proc: a type specific to your application that contains data and logic necessary to process work items. A reference to one is passed to each callback function that's run in the thread pool.
  • Queue: a thread-safe blocking queue. There is nothing like this in the C++ standard library (somewhat surprisingly) but there are a lot of open-source libraries containing them e.g. Folly MPMCQueue or moodycamel::ConcurrentQueue, or you can build a less fancy one yourself with std::mutex, std::condition_variable and std::deque (there are many examples of how to do this if you Google for them).

A latch is often used to wait until some work items you push onto the queue have all finished, typically so you can inspect the result.

std::vector<WorkItem> work = get_work();
std::latch latch(work.size());
for (WorkItem& work_item : work) {
    queue.push_back([&work_item, &latch](Proc& proc) {
        proc.do_work(work_item);
        latch.count_down();
    });
}
latch.wait();
// Inspect the completed work

How it works:

  1. The threads will - eventually - pop the work items off of the queue, possibly with multiple threads in the pool handling different work items at the same time.
  2. As each work item is finished, latch.count_down() is called, effectively decrementing an internal counter that started at work.size().
  3. When all work items have finished, that counter reaches zero, at which point latch.wait() returns and the producer thread knows that the work items have all been processed.

Notes:

  • The latch count is the number of work items that will be processed, not the number of worker threads.
  • The count_down() method could be called zero times, one time, or multiple times on each thread, and that number could be different for different threads. For example, even if you push 7 messages onto 7 threads, it might be that all 7 items are processed on the same thread (rather than one per thread) and that's fine.
  • Other unrelated work items could be interleaved with these ones (e.g. because they were pushed onto the queue by other producer threads) and again that's fine.
  • In principle, it's possible that latch.wait() won't be called until after all of the worker threads have already finished processing all of the work items. (This is the sort of odd condition you need to look out for when writing threaded code.) But that's OK, it's not a race condition: latch.wait() will just return immediately in that case.
  • An alternative to using a latch is to have another queue, in addition to the one shown here, that contains the results of the work items. The thread pool callback pushes results onto that queue while the producer thread pops results off of it. Basically, it goes in the opposite direction to the queue in this code. That's a perfectly valid strategy too, in fact if anything it's more common, but there are other situations where the latch is more useful.

A barrier is often used to make all threads wait simultaneously so that the data associated with all of the threads can be operated on simultaneously.

using Fn = std::function<void()>;
Fn completionFn = [&procs]() {
    // Do something with the whole vector of Proc objects
};
auto barrier = std::make_shared<std::barrier<Fn>>(worker_count, completionFn);
auto workerFn = [barrier](Proc&) {
    barrier->arrive_and_wait();
};
for (size_t i = 0; i < worker_count; ++i) {
    queue.push_back(workerFn);
}

How it works:

  1. All of the worker threads will pop one of these workerFn items off of the queue and call barrier->arrive_and_wait().
  2. Once all of them are waiting, one of them will call completionFn() while the others continue to wait.
  3. Once that function completes, they will all return from arrive_and_wait() and be free to pop other, unrelated, work items from the queue.

Notes:

  • Here the barrier count is the number of worker threads.
  • It is guaranteed that each thread will pop precisely one workerFn off of the queue and handle it. Once a thread has popped one off of the queue, it will wait in barrier->arrive_and_wait() until all the other copies of workerFn have been popped off by other threads, so there is no chance of it popping another one off.
  • I used a shared pointer to the barrier so that it will be destroyed automatically once all the work items are done. This wasn't an issue with the latch, because there we could just make it a local variable in the producer thread function: that function waits (by calling latch.wait()) until the worker threads have used the latch. Here the producer thread doesn't wait for the barrier, so we need to manage the memory in a different way.
  • If you do want the original producer thread to wait until the barrier has been finished, that's fine: it can call arrive_and_wait() too, but obviously you will need to pass worker_count + 1 to the barrier's constructor. (And then you wouldn't need to use a shared pointer for the barrier.)
  • If other work items are being pushed onto the queue at the same time, that's fine too, although it will potentially waste time: some threads will just sit there waiting for the barrier while other threads are distracted by other work before they reach it.

!!! DANGER !!!

The last bullet point - that other work being pushed onto the queue is "fine" - only holds if that other work doesn't also use a barrier! If you have two different producer threads putting work items with a barrier onto the same queue, and those items are interleaved, then some threads will wait on one barrier and others on the other one, and neither will ever reach the required wait count - DEADLOCK. One way to avoid this is to only ever use barriers like this from a single thread, or even to only ever use one barrier in your whole program (this sounds extreme but is actually quite a common strategy, as barriers are often used for one-time initialisation on startup). Another option, if the thread queue you're using supports it, is to atomically push all work items for the barrier onto the queue at once so they're never interleaved with any other work items. (This won't work with the moodycamel queue, which supports pushing multiple items at once but doesn't guarantee that they won't be interleaved with items pushed on by other threads.)

At the point when you asked this question, the proposed experimental API didn't support completion functions. Even the current API at least allows not using them, so I thought I should show an example of how barriers can be used like that too.

auto barrier = std::make_shared<std::barrier<>>(worker_count);
auto workerMainFn = [&procs, barrier](Proc&) {
    barrier->arrive_and_wait();
    // Do something with the whole vector of Proc objects
    barrier->arrive_and_wait();
};
auto workerOtherFn = [barrier](Proc&) {
    barrier->arrive_and_wait();  // Wait for work to start
    barrier->arrive_and_wait();  // Wait for work to finish
};
queue.push_back(std::move(workerMainFn));
for (size_t i = 0; i < worker_count - 1; ++i) {
    queue.push_back(workerOtherFn);
}

How it works:

The key idea is to wait for the barrier twice in each thread, and do the work in between. The first waits have the same purpose as the previous example: they ensure any earlier work items in the queue are finished before starting this work. The second waits ensure that any later items in the queue don't start until this work has finished.

Notes:

The notes are mostly the same as for the previous barrier example, but here are some differences:

  • One difference is that, because the barrier is not tied to a specific completion function, it's more likely that you can share it between multiple uses, like we did in the latch example, avoiding the use of a shared pointer.
  • This example makes it look like using a barrier without a completion function is much more fiddly, but that's just because this situation isn't well suited to them. Sometimes, all you need is to reach the barrier. For example, whereas we initialised a queue before the threads started, maybe you have a queue for each thread but initialised in the threads' run functions. In that case, the barrier might just signify that the queues have been initialised and are ready for other threads to pass messages to each other. There you can use a barrier with no completion function without needing to wait on it twice like this.
  • You could actually use a latch for this, calling count_down() and then wait() in place of arrive_and_wait(). But using a barrier makes more sense, both because calling the combined function is a little simpler and because using a barrier communicates your intention better to future readers of the code.
  • In any case, the "DANGER" warning from before still applies.
