How can I ensure that my Fortran FORALL construct is being parallelized?

Problem description

I've been given a 2D matrix representing temperature points on the surface of a metal plate. The edges of the matrix (plate) are held constant at 20 degrees C and there is a constant heat source of 100 degrees C at one pre-defined point. All other grid points are initially set to 50 degrees C.

My goal is to take every interior grid point and compute its steady-state temperature by iteratively averaging over the four surrounding grid points (i+1, i-1, j+1, j-1) until I reach convergence (a change of less than 0.02 degrees C between iterations).

As far as I know, the order in which I iterate over the grid points is irrelevant.

To me, this sounds like a fine time to invoke the Fortran FORALL construct and explore the joys of parallelization.
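For concreteness, the update described above might be sketched with FORALL as follows. This is only an illustration under stated assumptions: the names `jacobi_sketch`, `jacobi_sweep`, `fixed`, and `max_change` are hypothetical and do not come from the original code, and the heat source is assumed to be marked by a logical mask.

```fortran
module jacobi_sketch
  implicit none
contains
  ! One averaging sweep over the interior points of t (hypothetical helper).
  ! fixed(i,j) is .true. for points that must keep their value (e.g. the heat source).
  ! max_change returns the largest absolute change, for the convergence test.
  subroutine jacobi_sweep(t, fixed, max_change)
    real, intent(inout) :: t(:,:)
    logical, intent(in) :: fixed(:,:)
    real, intent(out)   :: max_change
    real    :: t_old(size(t,1), size(t,2))
    integer :: i, j, n, m

    n = size(t,1)
    m = size(t,2)
    t_old = t   ! keep the previous iterate so the change can be measured

    ! Edges (i = 1, n and j = 1, m) are never touched, so they stay constant.
    forall (i = 2:n-1, j = 2:m-1, .not. fixed(i,j))
      t(i,j) = 0.25 * (t_old(i+1,j) + t_old(i-1,j) + t_old(i,j+1) + t_old(i,j-1))
    end forall

    max_change = maxval(abs(t - t_old))
  end subroutine jacobi_sweep
end module jacobi_sketch
```

A driver would call `jacobi_sweep` repeatedly until `max_change` drops below 0.02. Note that FORALL only specifies the semantics of the assignment (every right-hand side is evaluated before any element is stored); it does not by itself oblige the compiler to run the iterations in parallel.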

How can I ensure that the code is indeed being parallelized?

For example, I can compile this on my single-core PowerBook G4 and I would expect no improvement in speed due to parallelization. But if I compile on a Dual Core AMD Opteron, I would assume that the FORALL construct can be exploited.

Alternatively, is there a way to measure the effective parallelization of a program?

Update

In response to M.S.B's question, this is with gfortran version 4.4.0. Does gfortran support automatic multi-threading?

It's remarkable that the FORALL construct has been rendered obsolete by what is, I suppose, auto-vectorization.

Perhaps this is best for a separate question, but how does auto-vectorization work? Is the compiler able to detect that only pure functions or subroutines are being used in a loop?

Solution

If you use the Intel Fortran Compiler, you can use a command-line switch to turn on or increase the compiler's verbosity level for parallelization/vectorization. That way, during compilation/linking you will be shown something like:

FORALL loop at line X in file Y has been vectorized

I admit that it has been a few years since I last used it, so the compiler message might actually look very different, but that's the basic idea.
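Independently of compiler reports, the effect can also be checked empirically by timing the same run with and without parallelization and comparing wall-clock times. The following is a minimal sketch of that idea, not part of the original answer; the module and function names (`timing_sketch`, `wall_time`, `speedup`, `efficiency`) are hypothetical. It uses the standard `system_clock` intrinsic because `cpu_time` typically reports processor time summed over all threads, which can hide a parallel speedup.

```fortran
module timing_sketch
  implicit none
contains
  ! Wall-clock time in seconds, based on the system_clock intrinsic.
  ! Returns 0 if the processor has no clock (count rate of zero).
  function wall_time() result(seconds)
    double precision :: seconds
    integer :: ticks, rate
    call system_clock(ticks, rate)
    if (rate > 0) then
      seconds = dble(ticks) / dble(rate)
    else
      seconds = 0.0d0
    end if
  end function wall_time

  ! Speedup: serial wall time divided by parallel wall time.
  pure function speedup(t_serial, t_parallel) result(s)
    double precision, intent(in) :: t_serial, t_parallel
    double precision :: s
    s = t_serial / t_parallel
  end function speedup

  ! Parallel efficiency: speedup divided by the number of cores used.
  pure function efficiency(t_serial, t_parallel, ncores) result(e)
    double precision, intent(in) :: t_serial, t_parallel
    integer, intent(in) :: ncores
    double precision :: e
    e = speedup(t_serial, t_parallel) / dble(ncores)
  end function efficiency
end module timing_sketch
```

In use, one would record `wall_time()` before and after the iteration loop in each build. On a dual-core machine, a speedup close to 2 (efficiency close to 1) would suggest the loop really is running in parallel, while a speedup near 1 would suggest it is not.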
