Why is a segmentation fault happening in this OpenMP code?

Problem description

    Main program:

    program main
      use omp_lib
      use my_module
      implicit none

      integer, parameter :: nmax = 202000
      real(8) :: e_in(nmax) = 0.D0
      integer i

    call omp_set_num_threads(2)
    !$omp parallel default(firstprivate)
    !$omp do
      do i=1,2
         print *, e_in(i)
         print *, eTDSE(i)
      end do
    !$omp end do
    !$omp end parallel
    end program main
    

    Module:

    module my_module
      implicit none

      integer, parameter, private :: ntmax = 202000
      double complex :: eTDSE(ntmax) = (0.D0,0.D0)
    !$omp threadprivate(eTDSE)

    end module my_module
    

    Compiled using:

    ifort -openmp main.f90 my_module.f90
    

    It gives a segmentation fault when executed. If I remove one of the print statements in the main program, it runs fine. Also, if I remove the OpenMP calls and directives and compile without the -openmp option, it runs fine too.

    Solution

    The most probable cause of this behaviour is that your stack size limit is too small (for whatever reason). Since e_in is private to each OpenMP thread, one copy per thread is allocated on the thread stack (even if you have specified -heap-arrays!). 202000 elements of REAL(KIND=8) take 1,616,000 bytes, that is 1616 kB (just over 1578 KiB).
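The size estimate above can be checked with a little shell arithmetic. This is only a sketch: the variable names are made up for illustration, and it assumes 8 bytes per REAL(KIND=8) element as stated in the text.

```shell
#!/bin/sh
# Assumed inputs: element count from the question, 8 bytes per REAL(KIND=8).
elements=202000
bytes_per_element=8

# Size of one private copy of e_in in bytes.
bytes=$((elements * bytes_per_element))

# ulimit -s works in whole KiB, so round up to the next full KiB.
kib_rounded_up=$(( (bytes + 1023) / 1024 ))

echo "bytes: $bytes"
echo "KiB (rounded up): $kib_rounded_up"
```

The rounded-up figure is the smallest whole-KiB stack that could hold the array alone; a real stack also needs room for everything else the thread puts on it.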

    The stack size limit can be controlled by several mechanisms:

    • In standard Unix shells the stack size limit is controlled by ulimit -s <stacksize in KiB>. This is also the stack size limit for the main OpenMP thread. The value of this limit is also used by the POSIX threads (pthreads) library as the default thread stack size when creating new threads.

    • OpenMP supports control over the stack size limit of all additional threads via the environment variable OMP_STACKSIZE. Its value is a number with an optional suffix k/K for KiB, m/M for MiB, or g/G for GiB (a plain number is taken as KiB). This value does not affect the stack size of the main thread.

    • The GNU OpenMP run-time (libgomp) recognises the non-standard environment variable GOMP_STACKSIZE. If set, it overrides the value of OMP_STACKSIZE.

    • The Intel OpenMP run-time recognises the non-standard environment variable KMP_STACKSIZE. If set, it overrides the value of OMP_STACKSIZE and also overrides the value of GOMP_STACKSIZE if the compatibility OpenMP run-time is used (which is the default, as currently the only available Intel OpenMP run-time library is the compat one).

    • If none of the *_STACKSIZE variables is set, the default for the Intel OpenMP run-time is 2m on 32-bit architectures and 4m on 64-bit ones.

    • On Windows, the stack size of the main thread is part of the PE header and is embedded there by the linker. If using Microsoft's LINK to do the linking, the size is specified using the /STACK:reserve[,commit] option. The reserve argument specifies the maximum stack size in bytes, while the optional commit argument specifies the initial commit size. Both can be specified as hexadecimal values using the 0x prefix. If re-linking the executable is not an option, the stack size can be modified by editing the PE header with EDITBIN. It takes the same stack-related argument as the linker. Programs compiled with MSVC's whole program optimisation enabled (/GL) cannot be edited.

    • The GNU linker for Win32 targets supports setting the stack size via the --stack argument. To pass the option directly from GCC, the -Wl,--stack,<size in bytes> option can be used.
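The suffix rules for the *_STACKSIZE variables listed above can be sketched as a small shell helper. The function name stacksize_to_kib and the required_kib figure are hypothetical and used only for illustration; the helper handles just the k, m and g suffixes and plain numbers (taken as KiB, as the OpenMP specification does for OMP_STACKSIZE), and it is not how any OpenMP run-time actually parses the value.

```shell
#!/bin/sh
# Hypothetical helper: convert an OMP_STACKSIZE-style value to KiB.
stacksize_to_kib() {
    value=$1
    case $value in
        *[kK]) echo "${value%?}" ;;                        # already KiB
        *[mM]) echo $(( ${value%?} * 1024 )) ;;            # MiB -> KiB
        *[gG]) echo $(( ${value%?} * 1024 * 1024 )) ;;     # GiB -> KiB
        *)     echo "$value" ;;                            # no suffix: KiB
    esac
}

# Assumed figure: whole KiB needed for one private copy of e_in.
required_kib=1579

# Compare two candidate settings against the array size.
for setting in 1m 4m; do
    kib=$(stacksize_to_kib "$setting")
    if [ "$kib" -ge "$required_kib" ]; then
        echo "$setting = $kib KiB: room for the array"
    else
        echo "$setting = $kib KiB: too small for the array"
    fi
done
```

The comparison only looks at the array itself. A thread stack also holds other locals and call frames, so a setting that barely passes this check may still be too small in practice.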

    Note that thread stacks are actually allocated with the size set by *_STACKSIZE (or with the default value), unlike the stack of the main thread, which starts small and then grows on demand up to the set limit. So don't set *_STACKSIZE to an arbitrarily large value, otherwise you may hit the process virtual memory size limit.

    Here are some examples:

    $ ifort -openmp my_module.f90 main.f90
    

    Set the main stack size limit to 1 MiB (the additional OpenMP thread would get 4 MiB as per default):

    $ ulimit -s 1024
    $ ./a.out
    zsh: segmentation fault (core dumped)  ./a.out
    

    Set the main stack size limit to 1700 KiB:

    $ ulimit -s 1700
    $ ./a.out
      0.000000000000000E+000
     (0.000000000000000E+000,0.000000000000000E+000)
      0.000000000000000E+000
     (0.000000000000000E+000,0.000000000000000E+000)
    

    Set the main stack size limit to 2 MiB and the stack size of the additional thread to 1 MiB:

    $ ulimit -s 2048
    $ KMP_STACKSIZE=1m ./a.out
    zsh: segmentation fault (core dumped)  KMP_STACKSIZE=1m ./a.out
    

    On most Unix systems the stack size limit of the main thread is set by PAM or another login mechanism (see /etc/security/limits.conf). The default on Scientific Linux 6.3 is 10 MiB.
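To see what limit a login session actually received, the soft and hard stack limits can be queried from the shell. This is a minimal sketch; the variable names are illustrative, and the values are reported in KiB or as "unlimited".

```shell
#!/bin/sh
# Soft limit: the value currently enforced for processes started from this shell.
soft_limit=$(ulimit -S -s)

# Hard limit: the ceiling up to which an unprivileged user may raise the soft limit.
hard_limit=$(ulimit -H -s)

echo "soft stack limit: $soft_limit"
echo "hard stack limit: $hard_limit"
```

If the soft limit is below what the program needs, it can be raised with ulimit -s up to the hard limit before starting the executable from the same shell.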

    Another possible scenario that can lead to an error is a virtual address space limit that is set too low. For example, if the virtual address space limit is 1 GiB and the thread stack size limit is set to 512 MiB, then the OpenMP run-time would try to allocate 512 MiB for each additional thread. With two additional threads one would have 1 GiB for the stacks only, and when the space for code, shared libraries, heap, etc. is added up, the virtual memory size would grow beyond 1 GiB and an error would occur.

    Set the virtual address space limit to 1 GiB and run with two additional threads with 512 MiB stacks (I have commented out the call to omp_set_num_threads()):

    $ ulimit -v 1048576
    $ KMP_STACKSIZE=512m OMP_NUM_THREADS=3 ./a.out
    OMP: Error #34: System unable to allocate necessary resources for OMP thread:
    OMP: System error #11: Resource temporarily unavailable
    OMP: Hint: Try decreasing the value of OMP_NUM_THREADS.
    forrtl: error (76): Abort trap signal
    ... trace omitted ...
    zsh: abort (core dumped)  OMP_NUM_THREADS=3 KMP_STACKSIZE=512m ./a.out
    

    In this case the OpenMP run-time library fails to create a new thread and notifies you before aborting the program.
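The arithmetic behind this failure can be sketched in the shell. The numbers come from the example above; the variable names are illustrative, and the calculation deliberately ignores everything except the additional thread stacks.

```shell
#!/bin/sh
# Values taken from the example above.
vm_limit_kib=1048576                 # ulimit -v 1048576 (1 GiB)
thread_stack_kib=$((512 * 1024))     # KMP_STACKSIZE=512m
total_threads=3                      # OMP_NUM_THREADS=3

# The main thread uses the ulimit -s stack; only the others get KMP_STACKSIZE.
extra_threads=$((total_threads - 1))

# Address space reserved up front for the additional thread stacks.
stacks_kib=$((extra_threads * thread_stack_kib))
remaining_kib=$((vm_limit_kib - stacks_kib))

echo "additional thread stacks: $stacks_kib KiB"
echo "left for code, libraries and heap: $remaining_kib KiB"

if [ "$remaining_kib" -le 0 ]; then
    echo "the thread stacks alone exhaust the virtual address space limit"
fi
```

With nothing left over for the program itself, the second additional thread cannot be created, which matches the error shown above.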
