Passing a noncontiguous array section in Fortran


Question

I am using the Intel Fortran compiler and Intel MKL for a performance check. I am passing some array sections to the Fortran 77 interface with calls like

call dgemm( transa,transb,sz_s,P,P,&
            a, Ts_tilde,&
            sz_s,R_alpha,P,b,tr(:sz_s,:),sz_s)

As is evident, tr(:sz_s,:) is not contiguous in memory. The Fortran 77 interface expects a contiguous block, so the compiler creates a temporary for it.

What I am wondering is whether it makes a difference if I create the temporary array for tr explicitly in my code and copy the data into it before the operation and back out afterwards, or whether, from a performance point of view, that is the same as letting the compiler create the temporary. My guess is that the compiler will always be more efficient.

And of course, any further suggestions for eliminating these temporaries are welcome.

One more point: if I use the Fortran 95 interface of the library with a similar call on a simpler test problem, no warning about the creation of a temporary is issued. I then read in the MKL manual that the Fortran 95 interface uses assumed-shape arrays, which explains why no temporaries are created.

However, at that point I cannot seem to use some support functions such as the timing routines. Intel MKL has some timing support functions, but if I use them with the mkl_service module as shown below, I get the error 'This name does not have a type, and must have an explicit type' for dsecnd. Any ideas on this problem are also welcome. A simple example is

program dgemm95_test
! some modules for Fortran 95 interface
use mkl_service
use mkl95_precision
use mkl95_blas
!
implicit none
!
double precision, dimension(4,3) :: a
double precision, dimension(6,4) :: b
double precision, dimension(5,5) :: r ! result array
double precision, dimension(3,2) :: dummy_b
!
character(len=1) :: transa
character(len=1) :: transb
!
double precision :: alpha, beta, t1, t2, t
integer :: sz1, sz2

! initialize some variables
alpha = 1.0
beta = 0.0
a = 2.3
b = 4.5
r = 0.0
transa = 'n'
transb = 'n'
dummy_b = 0.0
! Fortran 95 interface
t1 = dsecnd()
call gemm( a, b(4:6,1:3:2), r(2:5,3:4),&
 transa, transb, alpha, beta )
t2 = dsecnd()
!
write(*,*) r
dummy_b  = r(2:4,4:5)
!
end program dgemm95_test

Recommended answer

A temporary is unavoidable when you pass this array section to an assumed-size dummy argument, which is what the old Fortran 77 routines use, because the array section is not contiguous in memory.
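As a possible way around the temporary, routines that follow the BLAS convention take a leading-dimension argument, so the whole array can be passed together with its true leading dimension instead of a section. The sketch below illustrates the idea with `fill_block`, a hypothetical stand-in written for this example (it is not dgemm), and made-up sizes. It assumes `tr` itself is an ordinary contiguous array.

```fortran
program ld_demo
  implicit none
  double precision :: tr(5,3)
  integer :: sz_s

  sz_s = 3
  tr = 0.0d0

  ! Pass the whole array and its real leading dimension, size(tr,1),
  ! instead of the section tr(:sz_s,:). No section appears in the call,
  ! so the compiler has nothing to copy into a temporary.
  call fill_block(sz_s, size(tr,2), tr, size(tr,1))

  ! Rows below sz_s must be untouched by the routine.
  if (any(tr(sz_s+1:,:) /= 0.0d0)) error stop 'rows outside the block were modified'
  print *, 'sum over tr =', sum(tr)

contains

  ! Hypothetical stand-in for a Fortran 77 style routine: it updates the
  ! leading m-by-n block of c, whose columns are ldc elements apart.
  subroutine fill_block(m, n, c, ldc)
    integer, intent(in) :: m, n, ldc
    double precision, intent(inout) :: c(ldc,*)   ! assumed-size dummy
    integer :: i, j
    do j = 1, n
      do i = 1, m
        c(i,j) = 1.0d0
      end do
    end do
  end subroutine fill_block

end program ld_demo
```

Applied to the dgemm call in the question, this would presumably mean passing tr with size(tr,1) as the last argument rather than tr(:sz_s,:) with sz_s. That only works when the block starts at the first row and column of a contiguous array, so check it against your actual data layout.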

You can of course make your own temporary arrays. Whether that is faster depends on many factors. One important factor is whether the temporary is allocated on the stack or on the heap. The Intel Fortran compiler can do either; compiler switches control the behavior (-heap-arrays n), and the choice can depend on the array size. Stack allocation is much faster and is usually the default. Automatic arrays, which you might use for your own temporary, are allocated on the stack by default too. Be careful with large arrays on the stack: you can easily overflow it and cause a crash.
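A minimal sketch of the explicit-temporary approach follows. `scale_f77` is a hypothetical stand-in for a Fortran 77 style routine, and the array sizes are made up for illustration. The temporary is declared allocatable, which compilers typically place on the heap, so a large size does not put the stack at risk.

```fortran
program explicit_temp
  implicit none
  double precision, allocatable :: tr(:,:), tmp(:,:)
  integer :: sz_s

  sz_s = 3
  allocate(tr(5,4))
  tr = 1.0d0

  ! Explicit, contiguous temporary holding only the rows we work on.
  allocate(tmp(sz_s, size(tr,2)))

  tmp = tr(:sz_s,:)                             ! copy in
  call scale_f77(sz_s, size(tr,2), tmp, sz_s)   ! contiguous actual argument
  tr(:sz_s,:) = tmp                             ! copy out

  ! Only the first sz_s rows should have been doubled.
  if (any(tr(:sz_s,:) /= 2.0d0)) error stop 'block was not updated'
  if (any(tr(sz_s+1:,:) /= 1.0d0)) error stop 'rows outside the block changed'
  print *, 'tr(1,1) =', tr(1,1), ' tr(sz_s+1,1) =', tr(sz_s+1,1)

contains

  ! Hypothetical stand-in for a Fortran 77 style routine with an
  ! assumed-size dummy argument and a leading dimension.
  subroutine scale_f77(m, n, c, ldc)
    integer, intent(in) :: m, n, ldc
    double precision, intent(inout) :: c(ldc,*)
    integer :: i, j
    do j = 1, n
      do i = 1, m
        c(i,j) = 2.0d0 * c(i,j)
      end do
    end do
  end subroutine scale_f77

end program explicit_temp
```

If the same temporary is reused across many calls, allocating it once outside the loop avoids repeated allocation cost. Whether this beats the compiler-generated temporary still has to be measured.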

I suggest running a performance test and using the simpler variant if it is not too slow. That will probably be the Fortran 95 interface, but you really should measure the times.

As for the timing, the MKL manual page for second()/dsecnd() states that you must include mkl_lapack.fi, and it does not mention any Fortran 95 interface. You could also get away with declaring dsecnd as external double precision, but I would use the include. Alternatively, use system_clock(), which is portable standard Fortran 95.
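For the system_clock() option, a minimal timing sketch might look like the following. The matmul call is only a placeholder for whatever operation is being timed (for example the gemm call), and the 100-by-100 size is arbitrary. The 64-bit integer kind from iso_fortran_env is an assumption that requires a compiler with Fortran 2008 support; default integers also work, with coarser resolution on some compilers.

```fortran
program clock_demo
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer(int64) :: count0, count1, rate
  double precision :: x(100,100), y(100,100), z(100,100)
  double precision :: elapsed

  x = 2.3d0
  y = 4.5d0

  call system_clock(count0, rate)   ! rate = clock ticks per second
  z = matmul(x, y)                  ! placeholder for the operation being timed
  call system_clock(count1)

  if (rate > 0) then
    elapsed = dble(count1 - count0) / dble(rate)
    print *, 'elapsed seconds:', elapsed
  else
    print *, 'no system clock available'
  end if

  ! Use the result so the compiler cannot discard the timed work.
  print *, 'z(1,1) =', z(1,1)
end program clock_demo
```

system_clock() reports wall-clock time, so for short operations it helps to repeat the timed call several times and average.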
