How can I ensure that my Fortran FORALL construct is being parallelized?

Question

I've been given a 2D matrix representing temperature points on the surface of a metal plate. The edges of the matrix (plate) are held constant at 20 degrees C and there is a constant heat source of 100 degrees C at one pre-defined point. All other grid points are initially set to 50 degrees C.

My goal is to take all interior grid points and compute their steady-state temperatures by iteratively averaging over the surrounding four grid points (i+1, i-1, j+1, j-1) until I reach convergence (a change of less than 0.02 degrees C between iterations).
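
The iteration described above can be sketched as follows. This is only an illustration under stated assumptions: the grid size `n` and the heat-source position `src_i`, `src_j` are made up, since the question does not give them. It keeps a second array `t_new` so that each sweep reads only values from the previous iteration, which is what makes the visiting order irrelevant. FORALL itself evaluates every right-hand side before storing any result, so it has the same property.

```fortran
program plate_forall
  implicit none
  ! Hypothetical values: the question gives neither the grid size
  ! nor the location of the heat source.
  integer, parameter :: n = 50
  integer, parameter :: src_i = 25, src_j = 25
  real, parameter :: tol = 0.02        ! convergence threshold in degrees C
  real :: t(n, n), t_new(n, n)
  real :: delta
  integer :: i, j, iter

  t = 50.0                             ! interior points start at 50 C
  t(1, :) = 20.0                       ! the four edges are held at 20 C
  t(n, :) = 20.0
  t(:, 1) = 20.0
  t(:, n) = 20.0
  t(src_i, src_j) = 100.0              ! constant heat source
  t_new = t
  iter = 0

  do
    ! Average the four neighbours of every interior point,
    ! reading only values from the previous iteration.
    forall (i = 2:n-1, j = 2:n-1)
      t_new(i, j) = 0.25 * (t(i+1, j) + t(i-1, j) + t(i, j+1) + t(i, j-1))
    end forall
    t_new(src_i, src_j) = 100.0        ! re-impose the heat source

    delta = maxval(abs(t_new - t))     ! largest change in this sweep
    t = t_new
    iter = iter + 1
    if (delta < tol) exit
  end do

  print *, 'sweeps:', iter, ' largest change in last sweep:', delta
end program plate_forall
```

Whether the compiler turns the FORALL into vector or multi-threaded code is a separate matter; the construct only states that the assignments are independent.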

As far as I know, the order in which I iterate over the grid points is irrelevant.

To me, this sounds like a fine time to invoke the Fortran FORALL construct and explore the joys of parallelization.

How can I ensure that the code is indeed being parallelized?

For example, I can compile this on my single-core PowerBook G4 and I would expect no improvement in speed due to parallelization. But if I compile on a Dual Core AMD Opteron, I would assume that the FORALL construct can be exploited.

Alternatively, is there a way to measure the effective parallelization of a program?
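
One rough way to measure this is to time the same kernel in a serial build and in a parallel build and compare wall-clock times (speedup = serial time / parallel time). The sketch below assumes a made-up problem size (`n`, `sweeps`) and a reasonably recent compiler that provides `int64` in `iso_fortran_env`. It prints both wall-clock time from `system_clock` and CPU time from `cpu_time`. On many implementations CPU time is summed over all threads, so a CPU time well above the wall-clock time hints that more than one core was busy, but that behaviour is processor-dependent.

```fortran
program time_sweeps
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  ! Hypothetical problem size, chosen only so the run takes measurable time.
  integer, parameter :: n = 1000, sweeps = 100
  real, allocatable :: t(:, :), t_new(:, :)
  integer(int64) :: c0, c1, rate
  real :: cpu0, cpu1, wall
  integer :: i, j, k

  allocate(t(n, n), t_new(n, n))
  t = 50.0
  t(1, :) = 20.0
  t(n, :) = 20.0
  t(:, 1) = 20.0
  t(:, n) = 20.0
  t_new = t

  call system_clock(c0, rate)          ! wall-clock ticks and ticks per second
  call cpu_time(cpu0)                  ! processor time used so far

  do k = 1, sweeps
    forall (i = 2:n-1, j = 2:n-1)
      t_new(i, j) = 0.25 * (t(i+1, j) + t(i-1, j) + t(i, j+1) + t(i, j-1))
    end forall
    t = t_new
  end do

  call cpu_time(cpu1)
  call system_clock(c1)

  wall = real(c1 - c0) / real(rate)
  print *, 'wall-clock seconds:', wall
  print *, 'CPU seconds:       ', cpu1 - cpu0
  print *, 'checksum:          ', sum(t)   ! uses the result so the work is not optimized away
end program time_sweeps
```

Running the two builds several times on an otherwise idle machine and comparing the best wall-clock times gives a more stable estimate than a single run.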

Update

In response to M.S.B's question, this is with gfortran version 4.4.0. Does gfortran support automatic multi-threading?

It's remarkable that the FORALL construct has been rendered obsolete by what is, I suppose, auto-vectorization.

Perhaps this is best for a separate question, but how does auto-vectorization work? Is the compiler able to detect that only pure functions or subroutines are being used in a loop?

Recommended answer

If you use the Intel Fortran Compiler, you can use a command-line switch to turn on or increase the compiler's verbosity level for parallelization/vectorization. During compilation and linking, you will then be shown something like:

FORALL loop at line X in file Y has been vectorized

I admit that it has been a few years since I last used it, so the actual compiler message might look very different, but that's the basic idea.
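
As a concrete starting point, here is a small kernel that could be fed to a compiler's optimization report. The file name `sweep.f90` and the array size are made up, and the command lines in the header comment are written from memory of the GCC and Intel manuals. Flag names have changed between releases, so treat them as assumptions to check against the documentation for your compiler version.

```fortran
! sweep.f90 : a small kernel for trying out compiler optimization reports.
!
! Candidate command lines (assumptions; verify against your compiler's manual):
!   gfortran -O3 -fopt-info-vec sweep.f90                (newer GCC releases)
!   gfortran -O3 -ftree-vectorizer-verbose=2 sweep.f90   (older GCC, such as 4.4)
!   gfortran -O2 -ftree-parallelize-loops=2 sweep.f90    (GCC auto-parallelization)
!   ifort -O2 -parallel -qopt-report sweep.f90           (Intel; older releases
!                                                         used -vec-report and
!                                                         -par-report)
program sweep
  implicit none
  integer, parameter :: n = 512          ! hypothetical grid size
  real :: t(n, n), t_new(n, n)
  integer :: i, j

  t = 50.0
  t_new = t

  ! The loop the report should mention, identified by its source line.
  forall (i = 2:n-1, j = 2:n-1)
    t_new(i, j) = 0.25 * (t(i+1, j) + t(i-1, j) + t(i, j+1) + t(i, j-1))
  end forall

  print *, 'checksum:', sum(t_new)       ! uses the result so the loop is kept
end program sweep
```

A report line that names the FORALL's source line as vectorized or parallelized is the confirmation the answer describes. If no such line appears, the construct was most likely compiled as ordinary serial code.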