Parallelizing Fortran 2008 `do concurrent` systematically, possibly with OpenMP


Question



The Fortran 2008 do concurrent construct is a do loop that tells the compiler that no iteration affects any other. It can thus be parallelized safely.

A valid example:

program main
  implicit none
  integer :: i
  integer, dimension(10) :: array
  do concurrent (i = 1:10)
    array(i) = i
  end do
end program main

where iterations can be done in any order. You can read more about it here.

To my knowledge, gfortran does not automatically parallelize these do concurrent loops, although I remember a gfortran mailing-list message about doing it (here). It just transforms them into classical do loops.

My question: Do you know a way to systematically parallelize do concurrent loops? For instance, with systematic OpenMP syntax?

Solution

It is not that easy to do automatically. The DO CONCURRENT construct takes a forall-header, which means it can accept multiple loops, index variable definitions, and a mask. Basically, you need to replace:

DO CONCURRENT([<type-spec> :: ]<forall-triplet-spec 1>, <forall-triplet-spec 2>, ...[, <scalar-mask-expression>])
  <block>
END DO

with:

[BLOCK
    <type-spec> :: <indexes>]

!$omp parallel do
DO <forall-triplet-spec 1>
  DO <forall-triplet-spec 2>
    ...
    [IF (<scalar-mask-expression>) THEN]
      <block>
    [END IF]
    ...
  END DO
END DO
!$omp end parallel do

[END BLOCK]

(things in square brackets are optional, based on the presence of the corresponding parts in the forall-header)
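As a concrete sketch of this transformation, consider a two-index do concurrent with index declarations in the header and a mask (array names, bounds, and the assignment are hypothetical):

```fortran
program transform_example
  implicit none
  integer, parameter :: n = 100
  real :: a(n, n)
  a = 0.0

  ! Original construct (indexes declared in the header, with a mask):
  !   do concurrent (integer :: i = 1:n, j = 1:n, i /= j)
  !     a(i, j) = real(i + j)
  !   end do

  ! Transformed equivalent:
  block
    integer :: i, j          ! indexes moved out of the header into a BLOCK
    !$omp parallel do private(j)
    do i = 1, n
      do j = 1, n
        if (i /= j) then     ! the mask becomes an IF in the innermost loop
          a(i, j) = real(i + j)
        end if
      end do
    end do
    !$omp end parallel do
  end block
end program transform_example
```

The BLOCK scopes the index variables just as the header's type-spec did, and the mask expression moves into the innermost loop body.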

Note that this would not be as effective as parallelising one big loop with <iters 1>*<iters 2>*... independent iterations, which is what DO CONCURRENT is expected to do. Note also that the forall-header permits a type-spec that allows one to define the loop indexes inside the header, in which case you will need to surround the whole thing in a BLOCK ... END BLOCK construct to preserve the semantics. You would also need to check whether a scalar-mask-expr exists at the end of the forall-header and, if it does, put that IF ... END IF inside the innermost loop.
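One way to recover that single-big-loop parallelism, at least when the loop nest is rectangular (inner bounds not depending on outer indexes), is OpenMP's collapse clause; a sketch with hypothetical bounds:

```fortran
program collapse_example
  implicit none
  integer, parameter :: n = 100, m = 200
  integer :: i, j
  real :: a(n, m)

  ! collapse(2) fuses both loops into a single iteration space of
  ! n*m independent iterations, which is closer to what DO CONCURRENT
  ! promises than parallelising only the outer loop.
  !$omp parallel do collapse(2)
  do i = 1, n
    do j = 1, m
      a(i, j) = real(i) * real(j)
    end do
  end do
  !$omp end parallel do
end program collapse_example
```

Note that collapse requires the loops to be perfectly nested, so a mask IF must stay inside the innermost loop body rather than between the two do statements.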

If you only have array assignments inside the body of the DO CONCURRENT, you could also transform it into FORALL and use the workshare OpenMP directive. That would be much easier than the above.

DO CONCURRENT <forall-header>
  <block>
END DO

would become:

!$omp parallel workshare
FORALL <forall-header>
  <block>
END FORALL
!$omp end parallel workshare
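For instance, a simple array assignment (names and bounds hypothetical) would translate as:

```fortran
program workshare_example
  implicit none
  integer :: i
  real :: a(10)

  ! Original:
  !   do concurrent (i = 1:10)
  !     a(i) = 2.0 * real(i)
  !   end do

  ! FORALL version wrapped in an OpenMP workshare construct:
  !$omp parallel workshare
  forall (i = 1:10)
    a(i) = 2.0 * real(i)
  end forall
  !$omp end parallel workshare
end program workshare_example
```

(But see the edit below on workshare performance.)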

Given all the above, the only systematic approach I can think of is to go through your source code, search for DO CONCURRENT, and replace it with one of the transformed constructs above, based on the content of the forall-header and the loop body.

Edit: Usage of the OpenMP workshare directive is currently discouraged. It turns out that at least the Intel Fortran compiler and GCC serialise FORALL statements and constructs inside OpenMP workshare directives by surrounding them with an OpenMP single directive during compilation, which brings no speedup whatsoever. Other compilers might implement it differently, but it is better to avoid workshare if portable performance is to be achieved.
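As a usage note (file names here are hypothetical): building the manually transformed sources only requires the usual OpenMP flag, and some newer compilers can parallelize do concurrent directly without any rewriting, e.g. NVIDIA's nvfortran from the HPC SDK:

```shell
# Build a manually transformed OpenMP version with gfortran
gfortran -fopenmp transform_example.f90 -o transform_example

# nvfortran can map DO CONCURRENT to multicore parallelism directly
# via its standard-parallelism option, with no source changes:
nvfortran -stdpar=multicore original.f90 -o original
```

Whether this is available depends on your toolchain; gfortran, as noted in the question, does not auto-parallelize do concurrent.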

