OpenMP private array - Segmentation fault: 11


Problem description


    When I try to parallelize my Fortran 90 program with OpenMP, I get a segmentation fault.

        !$OMP PARALLEL DO NUM_THREADS(4) &
        !$OMP PRIVATE(numstrain, i)
        do irep = 1, nrep
            do i=1, 10
                PRINT *, numstrain(i)
            end do
        end do
        !$OMP END PARALLEL DO
    

    If I comment out "PRINT *, numstrain(i)" or remove the OpenMP flags, the program runs without error. I suspect a memory access conflict occurs when numstrain(i) is accessed in parallel, but I have already declared i and numstrain as private variables. Could someone please explain why this happens? Thank you so much. :)

    UPDATE:

    I modified the previous version, and this version prints the correct result.

    integer, allocatable :: numstrain(:)
    integer :: allocate_status
    integer :: n
    !$OMP PARALLEL DO NUM_THREADS(4) &
    !$OMP PRIVATE(numstrain, i)
    n = 1000000
    do irep = 1, nrep
        allocate (numstrain(n), stat = allocate_status)
        do i=1, 10
            PRINT *, numstrain(i)
        end do
        deallocate (numstrain, stat = allocate_status)
    end do
    !$OMP END PARALLEL DO
    

    However, if I move the access to numstrain into another subroutine called by this one (code below), two things happen: 1. It always runs in a single thread. 2. At some point (i=4 or 5) it fails with Segmentation fault: 11. The value of i at which the fault occurs changes when I change NUM_THREADS.

    integer, allocatable :: numstrain(:)
    integer :: allocate_status
    integer :: n
    !$OMP PARALLEL DO NUM_THREADS(4) &
    !$OMP PRIVATE(numstrain, i)
    n = 1000000
    do irep = 1, nrep
        allocate (numstrain(n), stat = allocate_status)
        call anotherSubroutine(numstrain)
        deallocate (numstrain, stat = allocate_status)
    end do
    !$OMP END PARALLEL DO
    
    subroutine anotherSubroutine(numstrain)
        integer, allocatable   :: numstrain(:)
        do i=1, 10
            PRINT *, numstrain(i)
        end do
    end subroutine anotherSubroutine
    

    I also tried allocating and deallocating in both the helper subroutine and the main subroutine, and in the helper subroutine only. Nothing changed.

    Solution

    The most typical reason for this is that not enough space is available on the stack to hold the private copy of numstrain. Compute and compare the following two values:

    • the size of the array in bytes
    • the stack size limit

    There are two kinds of stack size limits. The stack size of the main thread is controlled by things like process limits on Unix systems (use ulimit -s to check and modify this limit) or is fixed at link time on Windows (recompilation or binary edit of the executable is necessary in order to change the limit). The stack size of the additional OpenMP threads is controlled by environment variables like the standard OMP_STACKSIZE, or the implementation-specific GOMP_STACKSIZE (GNU/GCC OpenMP) and KMP_STACKSIZE (Intel OpenMP).

    Note that most Fortran OpenMP implementations always put private arrays on the stack, no matter if you enable compiler options that allocate large arrays on the heap (tested with GNU's gfortran and Intel's ifort).
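To make the comparison above concrete, here is a minimal sketch that estimates how many bytes one private copy of the array would need. It assumes the element count n = 1000000 from the question and a default-kind integer element; the program name and the variable `sample` are hypothetical and only serve the illustration.

```fortran
program array_size_check
   use iso_fortran_env, only: int64
   implicit none
   integer, parameter :: n = 1000000   ! element count taken from the question
   integer :: sample                   ! stand-in for one element of numstrain
   integer(int64) :: bytes

   ! storage_size (Fortran 2008) returns the size of one element in bits
   bytes = int(n, int64) * storage_size(sample) / 8
   print *, 'one private copy needs about', bytes, 'bytes per thread'
   print *, 'that is roughly', bytes / 1024, 'KiB'
end program array_size_check
```

With a 4-byte default integer this comes to about 4,000,000 bytes per thread. That figure can be compared with the value reported by ulimit -s (which bash prints in units of 1024 bytes) for the main thread, and with OMP_STACKSIZE for the additional threads, for example a setting such as OMP_STACKSIZE=16M.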

    If you comment out the PRINT statement, you effectively remove the reference to numstrain and the compiler is free to optimise it out, e.g. it could simply not make a private copy of numstrain, thus the stack limit is not exceeded.


    Given the additional information you've provided, one can conclude that stack size is not the culprit. When dealing with private ALLOCATABLE arrays, you should know that:

    • private copies of unallocated arrays remain unallocated;
    • private copies of allocated arrays are allocated with the same bounds.
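The two rules above can be observed with a small test program. This is only a sketch: the array names `a` and `b`, the array size, and the thread count are hypothetical, and it needs to be built with OpenMP enabled (for example gfortran -fopenmp).

```fortran
program private_alloc_demo
   use omp_lib
   implicit none
   integer, allocatable :: a(:), b(:)

   ! b is allocated before the parallel region, a is deliberately left unallocated
   allocate (b(5))

   !$omp parallel num_threads(2) private(a, b)
   ! the critical section only keeps the output of different threads from interleaving
   !$omp critical
   print *, 'thread', omp_get_thread_num(), &
            ' allocated(a) =', allocated(a), &
            ' allocated(b) =', allocated(b), &
            ' size(b) =', size(b)
   !$omp end critical
   !$omp end parallel

   deallocate (b)
end program private_alloc_demo
```

On a conforming implementation each thread should report that its private a is not allocated, while its private b is allocated with the same bounds as the original. The contents of the private b are undefined until the thread assigns to it.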

    If you do not use numstrain outside of the parallel region, it is fine to do what you've done in your first case, but with some modifications:

    integer, allocatable :: numstrain(:)
    integer :: allocate_status
    integer, parameter :: n = 1000000
    interface
       subroutine anotherSubroutine(numstrain)
          integer, allocatable :: numstrain(:)
       end subroutine anotherSubroutine
    end interface
    
    !$OMP PARALLEL NUM_THREADS(4) PRIVATE(numstrain, allocate_status)
    allocate (numstrain(n), stat = allocate_status)
    !$OMP DO
    do irep = 1, nrep
       call anotherSubroutine(numstrain)
    end do
    !$OMP END DO
    deallocate (numstrain)
    !$OMP END PARALLEL
    

    If you also use numstrain outside of the parallel region, then the allocation and deallocation go outside:

    allocate (numstrain(n), stat = allocate_status)
    !$OMP PARALLEL DO NUM_THREADS(4) PRIVATE(numstrain)
    do irep = 1, nrep
       call anotherSubroutine(numstrain)
    end do
    !$OMP END PARALLEL DO
    deallocate (numstrain)
    

    You should also know that when you call a routine that takes an ALLOCATABLE array as an argument, you have to provide an explicit interface for that routine. You can either write an INTERFACE block or put the called routine in a module and then USE that module; either approach provides the explicit interface. Without an explicit interface, the compiler does not pass the array correctly and the subroutine fails to access its content.
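As an illustration of the module approach, the sketch below places anotherSubroutine in a module so that the explicit interface is generated automatically. The module name `strain_helpers`, the program name, the value of nrep, and the zero initialisation of the array are assumptions made for the example and are not part of the original code.

```fortran
module strain_helpers                  ! hypothetical module name
   implicit none
contains
   subroutine anotherSubroutine(numstrain)
      integer, allocatable, intent(in) :: numstrain(:)
      integer :: i                     ! declared locally, so each call has its own i
      do i = 1, 10
         print *, numstrain(i)
      end do
   end subroutine anotherSubroutine
end module strain_helpers

program module_interface_demo
   use strain_helpers                  ! makes the explicit interface visible
   implicit none
   integer, parameter :: n = 1000000
   integer, parameter :: nrep = 8      ! assumed repetition count
   integer, allocatable :: numstrain(:)
   integer :: irep, allocate_status

   !$omp parallel num_threads(4) private(numstrain, allocate_status)
   ! each thread allocates and fills its own private copy once
   allocate (numstrain(n), stat=allocate_status)
   numstrain = 0
   !$omp do
   do irep = 1, nrep
      call anotherSubroutine(numstrain)
   end do
   !$omp end do
   deallocate (numstrain)
   !$omp end parallel
end program module_interface_demo
```

Keeping the routine in a module also lets the compiler check the argument list at every call site, which an external routine without an interface does not allow.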
