Do Fortran 95 constructs such as WHERE, FORALL and SPREAD generally result in faster parallel code?

Question

I have read through the Fortran 95 book by Metcalf, Reid and Cohen, and Numerical Recipes in Fortran 90. They recommend using WHERE, FORALL and SPREAD amongst other things to avoid unnecessary serialisation of your program.
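
For readers who have not met the three constructs, the following is a minimal sketch of each next to the explicit loop it stands in for. The program and variable names are illustrative only and do not come from either book; the sketch shows what the constructs mean, not how fast they run.

```fortran
program array_constructs
  implicit none
  integer, parameter :: n = 4
  real :: a(n), b(n), b_loop(n)
  real :: m(n, n), m_loop(n, n), s(n, n)
  integer :: i, j

  a = [-1.0, 4.0, 0.0, 9.0]

  ! WHERE: masked array assignment; sqrt is applied only where a > 0.
  b = 0.0
  where (a > 0.0) b = sqrt(a)

  ! Equivalent explicit loop with an IF statement.
  b_loop = 0.0
  do i = 1, n
    if (a(i) > 0.0) b_loop(i) = sqrt(a(i))
  end do

  ! FORALL: indexed assignment with no ordering between elements.
  forall (i = 1:n, j = 1:n) m(i, j) = a(i) + real(j)

  ! Equivalent nested loops (inner loop over i for column-major storage).
  do j = 1, n
    do i = 1, n
      m_loop(i, j) = a(i) + real(j)
    end do
  end do

  ! SPREAD: replicate a along a new second dimension, so s(i, j) = a(i).
  s = spread(a, dim=2, ncopies=n)

  ! Self-checks: each construct must agree with its loop form.
  if (any(b /= b_loop)) error stop "WHERE and loop disagree"
  if (any(m /= m_loop)) error stop "FORALL and loops disagree"
  if (any(s(:, 1) /= a)) error stop "unexpected SPREAD result"
  print *, "all three constructs match their loop equivalents"
end program array_constructs
```

Note that FORALL was declared obsolescent in Fortran 2018, so a compiler run in a strict standards mode may warn about it.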

However, I stumbled upon this answer, which claims that FORALL is good in theory but pointless in practice: you might as well write loops, as they parallelise just as well, and you can parallelise them explicitly using OpenMP (or the automatic features of some compilers, such as Intel's).

Can anyone verify from experience whether they have generally found these constructs to offer any advantages over explicit loops and if statements in terms of parallel performance?

And are there any other parallel features of the language which are good in principle but not worth it in practice?

I appreciate that the answers to these questions are somewhat implementation dependent, so I'm most interested in gfortran, Intel CPUs and SMP parallelism.

Recommended answer

As I said in my answer to the other question, there is a general belief that FORALL has not been as useful as was hoped when it was introduced to the language. As already explained in other answers, it has restrictive requirements and a limited role, and compilers have become quite good at optimizing regular loops. Compilers keep getting better, and capabilities vary from compiler to compiler. Another clue is that Fortran 2008 is trying again: besides adding explicit parallelization to the language (co-arrays, already mentioned), there is also "do concurrent", a new loop form whose restrictions should better allow the compiler to perform automatic parallelization optimizations, yet which should be sufficiently general to be useful. See ftp://ftp.nag.co.uk/sc22wg5/N1701-N1750/N1729.pdf.
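
As a rough illustration of the "do concurrent" form, here is a minimal sketch, assuming a compiler with Fortran 2008 support. The array names are made up for the example. The construct only lets the programmer assert that the iterations are independent; whether the compiler then vectorizes or threads the loop is up to the implementation and its options.

```fortran
program do_concurrent_demo
  implicit none
  integer, parameter :: n = 8
  real :: x(n), y(n)
  integer :: i

  x = [(real(i), i = 1, n)]

  ! Each iteration reads only x(i) and writes only y(i), so the
  ! iterations are independent, as DO CONCURRENT requires.
  do concurrent (i = 1:n)
    y(i) = 2.0 * x(i) + 1.0
  end do

  ! Self-check against the equivalent whole-array expression.
  if (any(y /= 2.0 * x + 1.0)) error stop "unexpected result"
  print *, y
end program do_concurrent_demo
```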

In terms of obtaining speed, I mostly select good algorithms and program for readability and maintainability. Only if the program is too slow do I locate the bottlenecks and recode or implement multi-threading (OpenMP). It will be a rare case where FORALL or WHERE versus an explicit do loop makes a meaningful speed difference; I'd look more to how clearly they state the intent of the program.
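
To make the OpenMP route concrete, here is a small sketch of an explicit loop with a parallel directive, assuming the loop has already been identified as a bottleneck. The array names and the problem size are hypothetical. With gfortran, OpenMP is enabled with the -fopenmp option.

```fortran
program openmp_loop
  implicit none
  integer, parameter :: n = 100000
  real, allocatable :: a(:), b(:)
  integer :: i

  allocate(a(n), b(n))
  a = [(real(i) - 0.5 * real(n), i = 1, n)]
  b = 0.0

  ! Without OpenMP enabled, the directive lines are ordinary comments
  ! and the loop runs serially with the same result.
  !$omp parallel do
  do i = 1, n
    if (a(i) > 0.0) b(i) = sqrt(a(i))
  end do
  !$omp end parallel do

  print *, "elements updated:", count(b > 0.0)
end program openmp_loop
```

The loop body is the same masked update that could be written as a single WHERE statement; the loop form is used here because the directive attaches to a do loop.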
