openmp ordering critical sections


Problem Description

I am trying to create an OpenMP program that will sequentially iterate through a loop. I realize threads are not intended for sequential programs -- I'm trying to either get a little speedup compared to a single thread, or at least keep the execution time similar to a single-threaded program.

Inside my #pragma omp parallel section, each thread computes its own section of a large array and gets the sum of that portion. These all may run in parallel. Then I want the threads to run in order, and each sum is added to the TotalSum IN ORDER. So thread 1 has to wait for thread 0 to complete, and so on. I have this part inside a #pragma omp critical section. Everything runs fine, except that only thread 0 is completing and then the program exits. How can I ensure that the other threads will keep polling? I've tried sleep() and while loops, but it continues to exit after thread 0 completes.

I am not using #pragma omp parallel for because I need to keep track of the specific ranges of the master array that each thread accesses. Here is a shortened version of the code section in question:

// DONE and MasterArray are global arrays. DONE keeps track of all the threads that have completed

int Function()
{
    #pragma omp parallel
    {
        int ID = omp_get_thread_num();
        // start, end, i, j and temp (an array) are all declared and initialized here
        j = 0;

        // Each thread builds a running sum of its own slice of MasterArray
        for (i = start; i < end; i++)
        {
            if (i != start)
                temp[j] = temp[j-1] + MasterArray[i];
            else
                temp[j] = MasterArray[i];
            j++;
        }

        // The part that is supposed to run in thread-ID order
        #pragma omp critical
        {
            while (DONE[ID] == 0 && ERROR == 0) {

                int size = sizeof(temp) / sizeof(temp[0]);

                if (ID == 0) {
                    Sum = temp[size];
                    DONE[ID] = 1;
                    if (some situation)
                        ERROR = 1;   // there's an error and we need to exit the function and program
                }
                else if (DONE[ID-1] == 1) {
                    Sum = temp[size];
                    DONE[ID] = 1;
                    if (some situation)
                        ERROR = 1;   // there's an error and we need to exit the function and program
                }
            }
        }
    }

    if (ERROR == 1)
        return(-1);
    else
        return(0);
}

This function is called from main after the number of threads has been initialized. It seems to me that the parallel portion completes, and then we check for an error. If an error is found, the loop terminates. I realize something is wrong here, but I can't figure out what it is, and now I'm just going in circles. Any help would be great. Again, my problem is that the function exits after only thread 0 executes, but no error has been flagged. I also have this running in pthreads, but that was simpler to implement. Thanks!

Recommended Answer

Your attempt at ordering threads with #pragma omp critical is completely incorrect. Only one thread can be inside a critical section at any time, and the order in which threads arrive at the critical section is not determined. So in your code it can happen that, for example, thread #2 enters the critical section first and never leaves it, waiting for thread #1 to complete, while thread #1 and the rest wait at #pragma omp critical. And even if some threads, e.g. thread #0, are lucky enough to complete the critical section in the right order, they will wait at the implicit barrier at the end of the parallel region. In other words, a deadlock is almost guaranteed in this code.
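To make that failure mode concrete, here is a stripped-down sketch of the pattern (illustrative only; the DONE array size, its assumed thread limit and the function name are assumptions for this example, not the asker's actual code):

/* Distilled sketch of the failure mode described above. If, say, thread 2
 * happens to enter the critical section before thread 1, it spins on
 * DONE[1] while holding the lock, so thread 1 can never enter the critical
 * section to set DONE[1] -- the program hangs. */
#include <omp.h>

int DONE[64] = {0};   /* assumed shared flag array, one slot per thread (<= 64 threads) */

void broken_ordering(void)
{
    #pragma omp parallel
    {
        int id = omp_get_thread_num();
        #pragma omp critical
        {
            /* Busy-wait for the previous thread *inside* the critical
             * section: no other thread can enter to make progress. */
            while (id != 0 && DONE[id - 1] == 0)
                ;   /* spins forever while holding the lock */
            DONE[id] = 1;
        }
    }
}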

I suggest you do something much simpler and more natural to order your threads, namely an ordered section. It should look like this:

#pragma omp parallel
{
    int ID = omp_get_thread_num();

    // Computations done by each thread

    #pragma omp for ordered schedule(static,1)
    for( int t=0; t<omp_get_num_threads(); ++t )
    {
        assert( t==ID );
        #pragma omp ordered
        {
            // Do the stuff you want to be in order
        }
    }
}

So you create a parallel loop with the number of iterations equal to the number of threads in the region. The schedule(static,1) clause makes it explicit that the iterations are assigned one per thread, in the order of thread IDs; and the ordered clause allows ordered sections to be used inside the loop. Now in the body of the loop you put an ordered section (the block following #pragma omp ordered), and it will be executed in the order of the iterations, which is also the order of thread IDs (as ensured by the assertion).
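For completeness, here is a self-contained sketch of how this pattern could be applied to the partial-sum scenario from the question (the array contents, sizes and variable names are assumptions made for this example, not the asker's actual program):

/* Sketch: each thread sums its own slice of MasterArray in parallel, then
 * the partial sums are folded into TotalSum strictly in thread-ID order
 * using an ordered loop. Sizes and data are assumed for illustration. */
#include <assert.h>
#include <stdio.h>
#include <omp.h>

#define N 1000000

static double MasterArray[N];
static double TotalSum = 0.0;

int main(void)
{
    for (int i = 0; i < N; ++i)
        MasterArray[i] = 1.0;            /* dummy data */

    #pragma omp parallel
    {
        int id    = omp_get_thread_num();
        int nthr  = omp_get_num_threads();
        int chunk = (N + nthr - 1) / nthr;
        int start = id * chunk;
        int end   = (start + chunk < N) ? start + chunk : N;

        /* Parallel part: each thread sums its own slice. */
        double mysum = 0.0;
        for (int i = start; i < end; ++i)
            mysum += MasterArray[i];

        /* Ordered part: accumulate the partial sums in thread-ID order. */
        #pragma omp for ordered schedule(static,1)
        for (int t = 0; t < nthr; ++t)
        {
            assert( t == id );           /* each thread gets exactly its own iteration */
            #pragma omp ordered
            {
                TotalSum += mysum;       /* executed by thread 0, then 1, then 2, ... */
            }
        }
    }

    printf("TotalSum = %f (expected %d)\n", TotalSum, N);
    return 0;
}

Built with an OpenMP-enabled compiler (e.g. gcc -fopenmp), the slice sums run fully in parallel, and only the short accumulation into TotalSum is serialized, in thread-ID order, by the ordered region.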

For more information, you may look at this question: How does the omp ordered clause work?
