How can I ensure that my Fortran FORALL construct is being parallelized?
Problem description
I've been given a 2D matrix representing temperatures at grid points on the surface of a metal plate. The edges of the matrix (plate) are held constant at 20 degrees C, and there is a constant heat source of 100 degrees C at one pre-defined point. All other grid points are initially set to 50 degrees C.
My goal is to compute the steady-state temperature of every interior grid point by iteratively replacing it with the average of its four neighbors (i+1, i-1, j+1, j-1) until I reach convergence (a change of less than 0.02 degrees C between iterations).
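A minimal sketch of the iteration described above, not the asker's actual code. The grid size `n` and the heat-source position `src_i`, `src_j` are assumed values chosen for illustration. The update writes into a second array, so every average reads only values from the previous iteration; that is the condition under which the visiting order really does not matter.

```fortran
program plate
  implicit none
  ! Assumed for illustration: grid size and heat-source location.
  integer, parameter :: n = 50
  integer, parameter :: src_i = 25, src_j = 25
  real, parameter :: tol = 0.02          ! convergence threshold, degrees C
  real :: t(n, n), t_new(n, n)
  real :: delta
  integer :: i, j, iter

  t = 50.0                               ! initial temperature everywhere
  t(1, :) = 20.0                         ! edges held at 20 degrees C
  t(n, :) = 20.0
  t(:, 1) = 20.0
  t(:, n) = 20.0
  t(src_i, src_j) = 100.0                ! constant heat source

  iter = 0
  do
     t_new = t                           ! carries the fixed edges over
     ! Each interior point becomes the average of its four neighbors,
     ! read from the previous iteration's array t.
     forall (i = 2:n-1, j = 2:n-1)
        t_new(i, j) = 0.25 * (t(i+1, j) + t(i-1, j) + t(i, j+1) + t(i, j-1))
     end forall
     t_new(src_i, src_j) = 100.0         ! re-impose the heat source
     delta = maxval(abs(t_new - t))      ! largest change in this sweep
     t = t_new
     iter = iter + 1
     if (delta < tol) exit
  end do

  print *, 'iterations:', iter, ' last max change:', delta
end program plate
```

If the same update were written as an ordinary nested DO loop that overwrites `t` in place, later points would see already-updated neighbors and the result of each sweep would depend on the loop order.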
As far as I know, the order in which I iterate over the grid points is irrelevant.
To me, this sounds like a fine time to invoke the Fortran FORALL construct and explore the joys of parallelization.
How can I ensure that the code is indeed being parallelized?
For example, I can compile this on my single-core PowerBook G4 and I would expect no improvement in speed due to parallelization. But if I compile on a Dual Core AMD Opteron, I would assume that the FORALL construct can be exploited.
Alternatively, is there a way to measure the effective parallelization of a program?
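One rough way to check this, sketched under assumptions: time the same kernel with both `system_clock` (wall-clock time) and `cpu_time` (processor time). With many compilers, including gfortran as far as I know, `cpu_time` adds up the time used by all threads, so a CPU/wall ratio near 1 suggests serial execution and a ratio near 2 suggests both cores were busy. The array size `n` and sweep count `nsweeps` are arbitrary illustrative values.

```fortran
program time_kernel
  implicit none
  ! Assumed for illustration: problem size and number of sweeps.
  integer, parameter :: n = 1000
  integer, parameter :: nsweeps = 50
  real, allocatable :: a(:, :), b(:, :)
  integer :: i, j, k
  integer :: c0, c1, rate
  real :: cpu0, cpu1, wall, cpu

  allocate(a(n, n), b(n, n))
  call random_number(a)
  b = a

  call system_clock(c0, rate)            ! wall-clock ticks and tick rate
  call cpu_time(cpu0)                    ! processor time in seconds

  do k = 1, nsweeps
     forall (i = 2:n-1, j = 2:n-1)
        b(i, j) = 0.25 * (a(i+1, j) + a(i-1, j) + a(i, j+1) + a(i, j-1))
     end forall
     a = b
  end do

  call cpu_time(cpu1)
  call system_clock(c1)

  wall = real(c1 - c0) / real(rate)
  cpu = cpu1 - cpu0
  print *, 'wall-clock seconds:', wall
  print *, 'CPU seconds:       ', cpu
  if (wall > 0.0) print *, 'CPU/wall ratio:    ', cpu / wall
  print *, 'checksum:', sum(b)           ! keeps the work from being optimized away
end program time_kernel
```

Running the same executable with one thread and then with two, and dividing the two wall-clock times, gives the speedup directly.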
Update
In response to M.S.B's question, this is with gfortran version 4.4.0. Does gfortran support automatic multi-threading?
It's remarkable that the FORALL construct has been rendered obsolete by what, I suppose, is auto-vectorization.
Perhaps this is best for a separate question, but how does auto-vectorization work? Is the compiler able to detect that only pure functions or subroutines are being used in a loop?
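On the purity point: the Fortran standard only allows pure procedures to be referenced inside a FORALL, so the compiler does not have to detect this; it rejects anything else. FORALL itself only specifies assignment semantics and does not request threads. A hedged sketch follows, with the hypothetical helper `avg4` introduced for illustration, showing the same stencil once as a FORALL and once as a DO loop carrying an OpenMP directive, which gfortran honors when compiled with `-fopenmp` and treats as a comment otherwise.

```fortran
module stencil
  implicit none
contains
  ! Hypothetical helper: average of the four neighbors of (i, j).
  ! PURE promises no side effects, which FORALL requires.
  pure function avg4(t, i, j) result(v)
    real, intent(in) :: t(:, :)
    integer, intent(in) :: i, j
    real :: v
    v = 0.25 * (t(i+1, j) + t(i-1, j) + t(i, j+1) + t(i, j-1))
  end function avg4
end module stencil

program pure_demo
  use stencil
  implicit none
  integer, parameter :: n = 8            ! assumed small grid for the demo
  real :: t(n, n), a(n, n), b(n, n)
  integer :: i, j

  call random_number(t)
  a = t
  b = t

  ! FORALL version: legal only because avg4 is pure.
  forall (i = 2:n-1, j = 2:n-1)
     a(i, j) = avg4(t, i, j)
  end forall

  ! Explicitly threaded version: the directive is active with -fopenmp.
  !$omp parallel do private(i)
  do j = 2, n-1
     do i = 2, n-1
        b(i, j) = avg4(t, i, j)
     end do
  end do
  !$omp end parallel do

  print *, 'max difference between versions:', maxval(abs(a - b))
end program pure_demo
```

GCC also has an automatic loop parallelizer, enabled with `-ftree-parallelize-loops=n`, and an auto-vectorizer, enabled with `-ftree-vectorize` (part of `-O3`); whether either transforms a particular loop depends on the compiler version and the loop itself.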
Solution
If you use the Intel Fortran Compiler, you can use a command-line switch to turn on or increase the compiler's verbosity level for parallelization and vectorization. During compilation and linking you will then be shown something like:
FORALL loop at line X in file Y has been vectorized
I admit that it has been a few years since I last used it, so the compiler message might actually look very different, but that's the basic idea.