OpenMP schedule(static) with no chunk size specified: chunk size and order of assignment

Views: 51  Posted: 2020/5/21 1:20:24  Tag: openmp


Question

I have a few questions regarding #pragma omp for schedule(static) where the chunk size is not specified.

One way to parallelize a loop in OpenMP is to do it manually like this:

#pragma omp parallel
{
    // Each thread computes its own contiguous range [start, finish).
    const int nthreads = omp_get_num_threads();
    const int ithread = omp_get_thread_num();
    const int start = ithread*N/nthreads;
    const int finish = (ithread+1)*N/nthreads;
    for(int i = start; i<finish; i++) {
        // loop body for iteration i
    }
}
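To make the arithmetic behind start and finish easy to check outside a parallel region, here is a minimal sketch of the same bounds as plain functions. The names chunk_start and chunk_finish are hypothetical, and the widening to long long is my addition (not in the snippet above) so that ithread*N cannot overflow int for large N.

```c
#include <assert.h>

/* First iteration owned by thread ithread (inclusive).
   The product is formed in long long so that ithread*N cannot
   overflow int when N is large (an addition to the original). */
static int chunk_start(int ithread, int N, int nthreads) {
    assert(nthreads > 0 && ithread >= 0 && ithread <= nthreads);
    return (int)((long long)ithread * N / nthreads);
}

/* One past the last iteration owned by thread ithread (exclusive).
   It equals the start of the next thread, so the ranges tile
   [0, N) with no gaps and no overlap. */
static int chunk_finish(int ithread, int N, int nthreads) {
    return chunk_start(ithread + 1, N, nthreads);
}
```

For N = 100 and three threads this gives the ranges [0, 33), [33, 66) and [66, 100), so the last thread picks up the extra iteration.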

Is there a good reason not to manually parallelize a loop like this in OpenMP? If I compare the values with #pragma omp for schedule(static), I see that the chunk sizes for a given thread don't always agree, so OpenMP (in GCC) is implementing the chunk sizes differently than defined by start and finish. Why is this?
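As an illustration of how two reasonable splits can disagree per thread, the sketch below compares the manual formula with one hypothetical alternative that hands the remainder to the lowest-numbered threads. The alternative is an assumption for illustration only; I am not claiming it is what GCC does. Both function names are made up.

```c
#include <assert.h>

/* Chunk size under the manual split from the question:
   finish - start for the given thread. */
static int size_manual(int ithread, int N, int nthreads) {
    assert(nthreads > 0);
    int start = (int)((long long)ithread * N / nthreads);
    int finish = (int)((long long)(ithread + 1) * N / nthreads);
    return finish - start;
}

/* Hypothetical alternative split: every thread gets N / nthreads
   iterations and the first N % nthreads threads get one extra. */
static int size_remainder_first(int ithread, int N, int nthreads) {
    assert(nthreads > 0);
    return N / nthreads + (ithread < N % nthreads ? 1 : 0);
}
```

With N = 100 and three threads, the manual split yields sizes 33, 33, 34 while the alternative yields 34, 33, 33. Both are "approximately equal" and both cover all 100 iterations, yet the per-thread sizes differ.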

The start and finish values I defined have several convenient properties.

Each thread gets at most one chunk.
The range of iteration values increases directly with thread number (i.e. for 100 iterations with two threads, the first thread will process iterations 1-50 and the second thread 51-100, and not the other way around).
For two for loops over exactly the same range, each thread will run over exactly the same iterations.

Originally I said exactly one chunk, but after thinking about it, the size of a chunk can be zero if the number of threads is much larger than N (ithread*N/nthreads = (ithread+1)*N/nthreads). The property I really want is at most one chunk.
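The empty-chunk case is easy to reproduce with the manual formula alone. This is a small sketch with a hypothetical helper, count_empty_chunks, that counts the threads whose range has start equal to finish.

```c
#include <assert.h>

/* Number of threads whose manual range [start, finish) is empty,
   i.e. ithread*N/nthreads == (ithread+1)*N/nthreads. */
static int count_empty_chunks(int N, int nthreads) {
    assert(nthreads > 0 && N >= 0);
    int empty = 0;
    for (int t = 0; t < nthreads; t++) {
        long long start = (long long)t * N / nthreads;
        long long finish = (long long)(t + 1) * N / nthreads;
        if (start == finish) {
            empty++;
        }
    }
    return empty;
}
```

For N = 2 and four threads the ranges are [0, 0), [0, 1), [1, 1) and [1, 2), so two threads do no work. Every thread still has at most one chunk.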

Are all these properties guaranteed when using #pragma omp for schedule(static)?

According to the OpenMP specifications:

Programs that depend on which thread executes a particular iteration under any other circumstances are non-conforming.

and

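Different loop regions with the same schedule and iteration count, even if they occur in the same parallel region, can distribute iterations among threads differently. The only exception is for the static schedule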

For schedule(static) the specification says:

chunks are assigned to the threads in the team in a round-robin fashion in the order of the thread number.
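When a chunk size is given, this round-robin rule can be written as a one-line mapping from logical iteration to thread. A minimal sketch, assuming iterations are numbered from 0; the function name static_chunk_owner is hypothetical.

```c
#include <assert.h>

/* Thread that owns logical iteration i under schedule(static, chunk):
   chunk number k = i / chunk is handed to thread k % nthreads. */
static int static_chunk_owner(int i, int chunk, int nthreads) {
    assert(i >= 0 && chunk > 0 && nthreads > 0);
    return (i / chunk) % nthreads;
}
```

With a chunk size of 2 and three threads, iterations 0-1 go to thread 0, 2-3 to thread 1, 4-5 to thread 2, and 6-7 wrap around to thread 0 again.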

Additionally the specification says for schedule(static):

When no chunk_size is specified, the iteration space is divided into chunks that are approximately equal in size, and at most one chunk is distributed to each thread.

Finally, the specification says for schedule(static):

A compliant implementation of the static schedule must ensure that the same assignment of logical iteration numbers to threads will be used in two loop regions if the following conditions are satisfied: 1) both loop regions have the same number of loop iterations, 2) both loop regions have the same value of chunk_size specified, or both loop regions have no chunk_size specified, 3) both loop regions bind to the same parallel region.
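One way to observe this guarantee is to record which thread runs each iteration in two schedule(static) loops inside the same parallel region and compare the records. The sketch below does that; the function name, the array names and the iteration count N_ITER are assumptions of mine, and the serial fallback for omp_get_thread_num is only there so the code still builds without OpenMP enabled. A passing check on one compiler is an observation, not a proof of conformance.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch still builds without OpenMP. */
static int omp_get_thread_num(void) { return 0; }
#endif

enum { N_ITER = 100 };  /* hypothetical iteration count */

/* Returns 1 if two schedule(static) loops in one parallel region
   assigned every iteration to the same thread, and the owning
   thread number never decreases as i grows; returns 0 otherwise. */
static int static_assignment_is_stable(void) {
    int owner_a[N_ITER];
    int owner_b[N_ITER];

    #pragma omp parallel
    {
        /* Same iteration count, no chunk_size, same parallel region. */
        #pragma omp for schedule(static)
        for (int i = 0; i < N_ITER; i++) {
            owner_a[i] = omp_get_thread_num();
        }
        #pragma omp for schedule(static)
        for (int i = 0; i < N_ITER; i++) {
            owner_b[i] = omp_get_thread_num();
        }
    }

    for (int i = 0; i < N_ITER; i++) {
        if (owner_a[i] != owner_b[i]) return 0;
        if (i > 0 && owner_a[i] < owner_a[i - 1]) return 0;
    }
    return 1;
}
```

The second check in the final loop corresponds to the property that iteration ranges increase with thread number, which follows from "at most one chunk" combined with assignment in order of thread number.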

So if I read this correctly, schedule(static) will have the same convenient properties I listed for start and finish, even though my code relies on which thread executes a particular iteration. Do I interpret this correctly? This seems to be a special case for schedule(static) when the chunk size is not specified.

It's easier to just define start and finish like I did than to try to interpret the specification for this case.
