Fortran: dynamic arrays vs. automatic arrays for avoiding memory allocation
Problem description
Profiling one of our Fortran codes shows two subroutines taking up most of the computational time (22.1% and 17.2%). In each routine, ~5% of the time is spent allocating and freeing memory. These routines look like
MODULE foo
CONTAINS
  SUBROUTINE bar( ... )
    ...
    IMPLICIT NONE
    ...
    REAL, ALLOCATABLE, DIMENSION(:,:) :: work
    ...
    ALLOCATE (work(size1,size2))
    ...
    DEALLOCATE (work)
  END SUBROUTINE bar
  ...
END MODULE foo
These subroutines are called roughly 4000-5000 times in my benchmark, so I would like to get rid of the ALLOCATE and DEALLOCATE. As a first approach, I changed the work arrays to automatic arrays:
MODULE foo
CONTAINS
  SUBROUTINE bar( ... )
    ...
    IMPLICIT NONE
    ...
    REAL, DIMENSION(size1,size2) :: work
    ...
  END SUBROUTINE bar
  ...
END MODULE foo
This changes the resulting profile to
Running Time          Symbol Name
20955.0ms  17.0%      __totzsp_mod_MOD_totzsps
    7.0ms   0.0%        malloc
    5.0ms   0.0%        free
    2.0ms   0.0%        user_trap
16192.0ms  13.2%      __tomnsp_mod_MOD_tomnsps
   20.0ms   0.0%        free
    3.0ms   0.0%        malloc
    1.0ms   0.0%        szone_size_try_large
It looks like gfortran is allocating these on the stack and not the heap, but I'm concerned about what happens when these arrays become too large.
The second approach that I'm taking is to allocate and deallocate these arrays once.
work_array.f
MODULE work_array
  IMPLICIT NONE
  REAL(rprec), ALLOCATABLE, DIMENSION(:,:) :: work
END MODULE work_array
I allocate these once in a different part of the code. Now my subroutine looks like
MODULE foo
CONTAINS
  SUBROUTINE bar( ... )
    ...
    USE work_array
    IMPLICIT NONE
    ...
  END SUBROUTINE bar
  ...
END MODULE foo
However, when I run the code now, the profile gets worse.
Running Time          Symbol Name
30584.0ms  21.6%      __totzsp_mod_MOD_totzsps
 3494.0ms   2.4%        free
 3143.0ms   2.2%        malloc
   27.0ms   0.0%        DYLD-STUB$$malloc_zone_malloc
   19.0ms   0.0%        szone_free_definite_size
    6.0ms   0.0%        malloc_zone_malloc
24325.0ms  17.1%      __tomnsp_mod_MOD_tomnsps
 2937.0ms   2.0%        free
 2456.0ms   1.7%        malloc
   23.0ms   0.0%        DYLD-STUB$$malloc_zone_malloc
    3.0ms   0.0%        szone_free_definite_size
Where are these extra mallocs and frees coming from? How can I set this up so that I allocate these arrays only once?
Since the work array is only used inside the bar subroutine, you could add the save attribute to it and allocate it when the subroutine is called for the first time. If size1 or size2 differs from the previous call, you can just reallocate the array in that case.
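A minimal sketch of this idea follows. The argument list of bar is elided in the question, so passing size1 and size2 as dummy arguments, the total output and the demo program are assumptions made only to keep the example self-contained. A saved local is shared state, so the sketch also assumes bar is never called concurrently.

```fortran
MODULE foo
  IMPLICIT NONE
CONTAINS
  SUBROUTINE bar(size1, size2, total)
    ! size1, size2 and total are hypothetical arguments for illustration.
    INTEGER, INTENT(IN) :: size1, size2
    REAL, INTENT(OUT) :: total
    ! SAVE keeps the allocation alive between calls.
    REAL, ALLOCATABLE, DIMENSION(:,:), SAVE :: work

    ! Drop the old buffer only if the requested shape has changed.
    IF (ALLOCATED(work)) THEN
      IF (SIZE(work,1) /= size1 .OR. SIZE(work,2) /= size2) DEALLOCATE (work)
    END IF
    ! Allocate on the first call, or after a shape change.
    IF (.NOT. ALLOCATED(work)) ALLOCATE (work(size1,size2))

    ! Placeholder for the real computation.
    work = 1.0
    total = SUM(work)
  END SUBROUTINE bar
END MODULE foo

PROGRAM demo
  USE foo
  IMPLICIT NONE
  REAL :: total
  INTEGER :: i
  DO i = 1, 3
    CALL bar(4, 5, total)   ! allocates on the first iteration only
  END DO
  CALL bar(2, 3, total)     ! shape changed, so work is reallocated
  PRINT *, total
END PROGRAM demo
```

With this layout the allocator is hit once per distinct shape instead of once per call.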
This does leave the problem of deallocation once the subroutine is no longer needed. If you need to call it during the whole lifetime of the program, it's no problem, since the OS should reclaim the memory when the program quits. On the other hand, if you only need it during initialization, the memory will remain allocated even when it is no longer needed. If memory usage is a problem, you could add an argument to the subroutine that tells it to deallocate the work array.
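One way such a cleanup argument could look is sketched below. The optional release flag, the size1 and size2 dummy arguments and the demo program are all hypothetical names introduced for illustration, and the shape check is omitted for brevity.

```fortran
MODULE foo
  IMPLICIT NONE
CONTAINS
  SUBROUTINE bar(size1, size2, release)
    INTEGER, INTENT(IN) :: size1, size2
    LOGICAL, INTENT(IN), OPTIONAL :: release   ! hypothetical cleanup flag
    REAL, ALLOCATABLE, DIMENSION(:,:), SAVE :: work

    ! Cleanup request: free the buffer and return without computing.
    ! The test is nested because Fortran does not guarantee
    ! short-circuit evaluation of PRESENT(release) .AND. release.
    IF (PRESENT(release)) THEN
      IF (release) THEN
        IF (ALLOCATED(work)) DEALLOCATE (work)
        RETURN
      END IF
    END IF

    IF (.NOT. ALLOCATED(work)) ALLOCATE (work(size1,size2))
    work = 0.0   ! placeholder for the real computation
  END SUBROUTINE bar
END MODULE foo

PROGRAM demo
  USE foo
  IMPLICIT NONE
  CALL bar(4, 5)                    ! first call allocates work
  CALL bar(4, 5)                    ! later calls reuse the saved buffer
  CALL bar(0, 0, release=.TRUE.)    ! final call frees the buffer
END PROGRAM demo
```

Because bar lives in a module, callers get an explicit interface automatically, which the OPTIONAL and keyword arguments require.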