并行进行广度​​优先搜索 [英] Parallelizing a Breadth-First Search

查看:71
本文介绍了并行进行广度​​优先搜索的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我刚刚自学了一些OpenMP,这可能很愚蠢.基本上,我正在尝试并行化c ++中的广度优先搜索程序,每个节点都需要花费很长时间来处理.这是一个示例代码:

I just taught myself some OpenMP and this could be stupid. Basically I'm trying to parallelize a breadth first search program in c++, with each node taking a long time to process. Here's an example code:

queue<node*> q;
q.push(head);
while (!q.empty()) {
  qSize = q.size();
  for (int i = 0; i < qSize; i++) {
    node* currNode = q.front();
    q.pop();
    doStuff(currNode);
    q.push(currNode);
  }
}

处理函数doStuff()非常昂贵,我想对其进行并行化.但是,如果我通过将#pragma omp parallel for放在for行之前来并行化for循环,则会在运行时弹出各种奇怪的错误.我猜测原因是这种方式q.front()q.push()也将并行化,并且多个线程可能会通过q.front()到达同一节点(因为它们都在处理任何q.push之前得到了处理) .

The processing function doStuff() is quite expensive and I want to parallelize it. However if I parallelize the for loop by putting #pragma omp parallel for right before the for line, all kinds of weird error pop up at runtime. I'm guessing the reason is that this way q.front() and q.push() will also get parallelized, and multiple threads will possibly get the same node through q.front() (because they all got processed before any q.push has been processed).

我该如何解决?

推荐答案

解决方案是使用关键节来保护对队列的访问.

The solution is to protect access to the queue with a critical section.

queue<node*> q;
q.push(head);
while (!q.empty()) {
  qSize = q.size();
  #pragma omp parallel for
  for (int i = 0; i < qSize; i++) {
    node* currNode;
    #pragma omp critical
    {
      currNode = q.front();
      q.pop();
    }
    doStuff(currNode);
    #pragma omp critical
    q.push(currNode);
  }
}

这类似于拥有一个通用的互斥锁并将其锁定.

This is similar to having a common mutex and locking it.

此版本在效率上有一些限制:在for循环结束时,尽管队列中有工作,但某些线程可能仍处于空闲状态.就处理队列为空但某些线程仍在计算的情况而言,创建一个可以在队列中有任何东西时使线程连续工作的版本会有些棘手.

There are some limits in efficiency with this version: At the end of the for loop, some threads may idle, despite work being in the queue. Making a version where threads continuously work whenever there is something in the queue is a bit tricky in terms of handling the situations where the queue is empty but some threads are still computing.

根据节点中涉及的数据大小,您还可能会对缓存效果和错误共享产生重大的性能影响.但这无法通过特定示例进行讨论.简单版本在许多情况下可能会足够有效,但是获得最佳性能可能会变得任意复杂.

Depending of the data size involved in a node, you may also have significant performance impact of cache-effects and false sharing. But that can't be discussed with a specific example. The simple version will probably be sufficiently efficient in many cases, but getting optimal performance may become arbitrarily complex.

无论如何,您都必须确保doStuff不会修改任何全局或共享状态.

In any case, you have to ensure that doStuff does not do modify any global or shared state.

这篇关于并行进行广度​​优先搜索的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆