Fortran: dynamic arrays vs. automatic arrays, avoiding memory allocation


Problem description

Profiling one of our Fortran codes shows that two subroutines take up most of the computational time (22.1% and 17.2%). In each routine, ~5% of the time is spent allocating and freeing memory. These routines look like:

MODULE foo

CONTAINS

SUBROUTINE bar( ... )
...
IMPLICIT NONE
...
REAL, ALLOCATABLE, DIMENSION(:,:) :: work
...
ALLOCATE (work(size1,size2))
...
DEALLOCATE (work)
END SUBROUTINE bar
...
END MODULE foo

These subroutines are called on the order of 4000-5000 times in my benchmark, so I would like to get rid of the ALLOCATE and DEALLOCATE. My first approach was to change the work arrays to automatic arrays:

MODULE foo

CONTAINS

SUBROUTINE bar( ... )
...
IMPLICIT NONE
...
REAL, DIMENSION(size1,size2) :: work
...
END SUBROUTINE bar
...
END MODULE foo

This changes the resulting profile to:

Running Time        Symbol Name
20955.0ms  17.0%    __totzsp_mod_MOD_totzsps
    7.0ms   0.0%        malloc
    5.0ms   0.0%        free
    2.0ms   0.0%        user_trap

16192.0ms  13.2%    __tomnsp_mod_MOD_tomnsps
   20.0ms   0.0%        free
    3.0ms   0.0%        malloc
    1.0ms   0.0%        szone_size_try_large

It looks like gfortran is allocating these on the stack rather than the heap, but I'm concerned about what happens when these arrays become too large.

The second approach that I'm taking is to allocate and deallocate these arrays once.

work_array.f

MODULE work_array
IMPLICIT NONE

REAL(rprec), ALLOCATABLE, DIMENSION(:,:) :: work

END MODULE work_array

I allocate these once in a different part of the code. Now my subroutine looks like

MODULE foo

CONTAINS

SUBROUTINE bar( ... )
...
USE work_array
IMPLICIT NONE
...
END SUBROUTINE bar
...
END MODULE foo

However, when I run the code now, the profile gets worse.

Running Time        Symbol Name
30584.0ms  21.6%    __totzsp_mod_MOD_totzsps
 3494.0ms   2.4%        free
 3143.0ms   2.2%        malloc
   27.0ms   0.0%        DYLD-STUB$$malloc_zone_malloc
   19.0ms   0.0%        szone_free_definite_size
    6.0ms   0.0%        malloc_zone_malloc

24325.0ms  17.1%    __tomnsp_mod_MOD_tomnsps
 2937.0ms   2.0%        free
 2456.0ms   1.7%        malloc
   23.0ms   0.0%        DYLD-STUB$$malloc_zone_malloc
    3.0ms   0.0%        szone_free_definite_size

Where are these extra mallocs and frees coming from? How can I set this up so that I allocate these arrays only once?

Solution

Since the work array is only used inside the bar subroutine, you could give it the SAVE attribute and allocate it the first time the subroutine is called. If size1 or size2 differs from the previous call, you can simply reallocate the array in that case.
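
A minimal sketch of that idea, assuming size1 and size2 are passed in as arguments. The n_alloc counter and the total output are hypothetical additions that only make the behaviour observable; they are not part of the original code.

```fortran
MODULE foo
   IMPLICIT NONE
   PRIVATE
   PUBLIC :: bar, n_alloc

   ! Hypothetical counter: number of times bar actually called ALLOCATE.
   INTEGER :: n_alloc = 0

CONTAINS

   SUBROUTINE bar(size1, size2, total)
      INTEGER, INTENT(IN)  :: size1, size2
      REAL,    INTENT(OUT) :: total   ! hypothetical result, stands in for the real work

      ! SAVE keeps the array and its allocation alive between calls.
      REAL, ALLOCATABLE, DIMENSION(:,:), SAVE :: work

      ! Drop the saved buffer only if the requested shape has changed.
      ! The test is nested because Fortran does not guarantee short-circuit
      ! evaluation, and SIZE must not be applied to an unallocated array.
      IF (ALLOCATED(work)) THEN
         IF (SIZE(work, 1) /= size1 .OR. SIZE(work, 2) /= size2) DEALLOCATE (work)
      END IF

      ! First call, or shape changed: allocate once and reuse afterwards.
      IF (.NOT. ALLOCATED(work)) THEN
         ALLOCATE (work(size1, size2))
         n_alloc = n_alloc + 1
      END IF

      work = 1.0          ! placeholder for the real computation
      total = SUM(work)
   END SUBROUTINE bar

END MODULE foo
```

With this layout, repeated calls with the same size1 and size2 reuse the buffer, and a call with a different shape triggers exactly one reallocation. Note that a saved local array is shared state, so this sketch is probably not suitable as-is if bar is called from several threads at once.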

This does leave the problem of deallocation once the subroutine is no longer needed. If you need to call it throughout the lifetime of the program, this is no problem, since the OS should reclaim the memory when the program exits. On the other hand, if you only need it during initialization, the memory will remain allocated even when it is no longer needed. If memory usage is a problem, you could add an argument to the subroutine that tells it to deallocate the work array.
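
One possible shape for such a cleanup argument is sketched below. The release flag, the module name foo_release and the work_is_allocated query are all hypothetical names chosen for illustration. In this sketch the buffer is kept as a private, saved module variable rather than a saved local, purely so that its state can be queried from outside bar.

```fortran
MODULE foo_release
   IMPLICIT NONE
   PRIVATE
   PUBLIC :: bar, work_is_allocated

   ! Persistent work buffer, visible only inside this module.
   REAL, ALLOCATABLE, DIMENSION(:,:), SAVE :: work

CONTAINS

   SUBROUTINE bar(size1, size2, release)
      INTEGER, INTENT(IN)           :: size1, size2
      LOGICAL, INTENT(IN), OPTIONAL :: release   ! hypothetical cleanup flag

      ! Cleanup call: free the buffer and return without doing any work.
      IF (PRESENT(release)) THEN
         IF (release) THEN
            IF (ALLOCATED(work)) DEALLOCATE (work)
            RETURN
         END IF
      END IF

      ! Normal call: (re)allocate only when the shape has changed.
      IF (ALLOCATED(work)) THEN
         IF (SIZE(work, 1) /= size1 .OR. SIZE(work, 2) /= size2) DEALLOCATE (work)
      END IF
      IF (.NOT. ALLOCATED(work)) ALLOCATE (work(size1, size2))

      work = 0.0   ! placeholder for the real computation
   END SUBROUTINE bar

   ! Hypothetical helper so callers (and tests) can inspect the buffer state.
   LOGICAL FUNCTION work_is_allocated()
      work_is_allocated = ALLOCATED(work)
   END FUNCTION work_is_allocated

END MODULE foo_release
```

A caller would use CALL bar(size1, size2) during the phase that needs the buffer, then CALL bar(0, 0, release=.TRUE.) once that phase is over. The sizes are ignored in the cleanup call. A later normal call simply allocates the buffer again.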
