Avoid thread creation overhead in OpenMP

Question

I am using OpenMP to parallelize part of the code in HEVC. The basic structure of the code is as follows:

void function() {
    for (...) {
        #pragma omp parallel for private(....)
        for (...) {
            // do some parallel work
        } // end of inner for loop

        // other tasks
    } // end of outer for loop
} // end of function

I have now modified the inner for loop so that it is parallelized and each thread performs its task independently. I do not get any errors, but the overall processing time with multiple threads is higher than with a single thread. I guess the main reason is that every iteration of the outer loop incurs thread creation overhead for the inner loop. Is there any way to avoid this, or to create the threads only once? I cannot parallelize the outer for loop, since I modified the inner for loop so that each thread can work independently. Please suggest any possible solutions.

Recommended answer

You can use the separate directives #pragma omp parallel and #pragma omp for.

#pragma omp parallel creates the team of threads, whereas #pragma omp for distributes the loop iterations among those threads. For the sequential parts of the outer loop you can use #pragma omp single.

Here is an example:

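#include <stdio.h>
#include <omp.h>

int main(void) {
    int n = 3, m = 10;
    /* The thread team is created once, before the outer loop. */
    #pragma omp parallel
    {
        /* Every thread runs the outer loop; i is private to each thread. */
        for (int i = 0; i < n; i++) {
            /* Sequential part: one thread runs it, the others wait at the implicit barrier. */
            #pragma omp single
            {
                printf("Outer loop part 1, thread num = %d\n",
                       omp_get_thread_num());
            }
            /* The iterations of the inner loop are shared among the team. */
            #pragma omp for
            for (int j = 0; j < m; j++) {
                int thread_num = omp_get_thread_num();
                printf("j = %d, Thread num = %d\n", j, thread_num);
            }
            #pragma omp single
            {
                printf("Outer loop part 2, thread num = %d\n",
                       omp_get_thread_num());
            }
        }
    }
    return 0;
}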
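Applied to the structure in the question, the same pattern might look like the sketch below. It is only an illustration under assumptions: `process_item` and `run_passes` are hypothetical names, the per-element work is a placeholder, and the "other tasks" step is reduced to a sequential sum. The point it shows is that the loop counter of the outer loop is declared inside the parallel region, so each thread has its own copy, while the sequential step runs in a single block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real per-element work of the inner loop. */
static double process_item(double x, int pass) {
    return x + (double)pass;
}

/* Runs `passes` outer iterations over data[0..n-1] inside ONE parallel
   region and returns the sum of the array after the last pass. */
double run_passes(double *data, int n, int passes) {
    assert(data != NULL && n > 0 && passes >= 0);
    double total = 0.0;   /* shared; written only inside the single block */

    #pragma omp parallel
    {
        /* Every thread executes the outer loop with its own private i. */
        for (int i = 0; i < passes; i++) {
            /* Inner loop: iterations are divided among the team. */
            #pragma omp for
            for (int j = 0; j < n; j++) {
                data[j] = process_item(data[j], i);
            }
            /* Implicit barrier here: all elements are updated before
               the sequential step below reads them. */

            /* "Other tasks": executed by exactly one thread. */
            #pragma omp single
            {
                total = 0.0;
                for (int j = 0; j < n; j++) {
                    total += data[j];
                }
            }
            /* Implicit barrier here as well, so the next pass does not
               start while the sequential step is still running. */
        }
    }
    return total;
}
```

With GCC or Clang this needs `-fopenmp` to run in parallel; without the flag the pragmas are ignored and the function still produces the same result serially.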

That said, I am not sure whether restructuring the loops this way will help in your case. To diagnose OpenMP performance problems, it is better to use a profiler such as Scalasca or VTune.
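Before reaching for a full profiler, a rough timing comparison of the two structures can indicate whether the parallel region itself is the bottleneck. The sketch below is an assumption-laden illustration, not a benchmark: the function names and the trivial loop body are made up, and the timer falls back to `clock()` when the code is built without OpenMP. Many OpenMP runtimes reuse their worker threads between parallel regions, so the measured difference may turn out to be small; in that case the slowdown likely comes from elsewhere (for example too little work per iteration, or synchronization).

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock time via omp_get_wtime() when OpenMP is enabled,
   CPU time via clock() as a fallback otherwise. */
static double now_seconds(void) {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Variant A: a new parallel region for every outer iteration,
   as in the question. Returns the elapsed time in seconds. */
double time_region_per_iteration(double *data, int n, int outer) {
    assert(data != NULL && n > 0 && outer > 0);
    double t0 = now_seconds();
    for (int i = 0; i < outer; i++) {
        #pragma omp parallel for
        for (int j = 0; j < n; j++) {
            data[j] += 1.0;
        }
    }
    return now_seconds() - t0;
}

/* Variant B: one parallel region around the whole outer loop,
   as in the answer. Returns the elapsed time in seconds. */
double time_single_region(double *data, int n, int outer) {
    assert(data != NULL && n > 0 && outer > 0);
    double t0 = now_seconds();
    #pragma omp parallel
    {
        for (int i = 0; i < outer; i++) {
            #pragma omp for
            for (int j = 0; j < n; j++) {
                data[j] += 1.0;
            }
        }
    }
    return now_seconds() - t0;
}
```

Calling both functions on the same array with realistic values of `n` and `outer`, and repeating the measurement a few times, gives a first impression of how much the region overhead matters compared to the work itself.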
