Segmentation fault for array, but only if a component of a derived type

Question

Pretty simple setup, using gfortran 4.8.5 on Linux (Red Hat):

  • I get a segfault if my array of reals (inside a derived type) has size > 2,000,000. This seems to be a standard stack/heap issue, as my stack size is 8 MB when I check with ulimit.

  • There is no problem if the array is NOT inside a derived type.

  • Note that, as @francescalus guessed, removing the initial value = 0.0 eliminates the problem.

Edit to add: note that I have posted a follow-up question, "Segmentation fault related to component of derived type", that represents a more realistic use case and further narrows down the conditions under which this seems to occur.

program main

    call sub1     ! seg fault  if col size >   2,100,000
    call sub2     ! works fine at col size = 100,000,000  

end program main

subroutine sub1

    type table
        real :: col(2100000) = 0.0     ! works if "= 0.0" removed
    end type table

    type(table) :: table1
    table1%col = 1.0

end subroutine sub1

subroutine sub2
    real :: col(100000000) = 0.0
    col = 1.0
end subroutine sub2

Some obvious questions here:

  • Is this expected behavior, or some bug that was fixed in newer versions of gfortran?

  • Am I following standard Fortran operating procedures here, or doing something wrong?

  • What is the recommended way to avoid this (please assume that I am unable to update to a newer version of gfortran in the near term)? I will almost certainly solve it with an allocatable array component for reasons not specific to this question, but that might not be an ideal general solution, and I would like to know all the good options I have here.

  • In particular, is initializing the components of a derived type bad practice?

Recommended answer

This is likely to be a runtime issue due to insufficient stack, rather than a bug with gfortran.

gfortran uses the stack to store automatic arrays and other initialization data. When code runs without problems while such an array is small, but segfaults once the array grows, a likely cause is running out of stack.

The issue seems to be the same in more recent versions of gfortran. I compiled and ran your program with gfortran 4.8.4, 4.9.3, 5.5.0, 6.4.0, 7.3.0 and 8.2.0. In all cases I obtained a segmentation fault with the default stack size, but no error when the stack size was slightly increased.

$  ./sfa
Segmentation fault
$ ulimit -s
8192
$ ulimit -s 8256 
$ ./sfa && echo "DONE"
DONE
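
The numbers in the session above are consistent with a simple byte count. The following is only a rough sketch, assuming that a default real occupies 4 bytes (as it does with gfortran's default settings) and that one full copy of the 2,100,000-element array ends up on the stack; the program name and variable names are illustrative. Under those assumptions the array needs 8,400,000 bytes, which is slightly more than the 8,388,608 bytes of an 8192 KiB stack but less than the 8,454,144 bytes of an 8256 KiB stack.

```fortran
program stack_budget
    use, intrinsic :: iso_fortran_env, only: int64
    implicit none

    ! Array size from sub1 and the two stack limits (in KiB) from the session above.
    integer(int64), parameter :: n_elems       = 2100000_int64
    integer(int64), parameter :: default_stack = 8192_int64 * 1024_int64
    integer(int64), parameter :: raised_stack  = 8256_int64 * 1024_int64

    real :: sample                       ! only used to query the size of a default real
    integer(int64) :: bytes_per_real, array_bytes

    ! storage_size returns bits; a default real is assumed to be 32 bits here.
    bytes_per_real = storage_size(sample) / 8
    array_bytes    = n_elems * bytes_per_real

    print *, 'bytes needed by the array :', array_bytes
    print *, 'default stack (8192 KiB)  :', default_stack
    print *, 'raised stack  (8256 KiB)  :', raised_stack
    print *, 'fits in default stack?    :', array_bytes < default_stack
    print *, 'fits in raised stack?     :', array_bytes < raised_stack
end program stack_budget
```

This does not show where gfortran actually places the data; it only illustrates why a modest increase of the limit is enough for this particular array size.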

Your problem may be solved by running

$ ulimit -s unlimited

before executing your binary. I am not aware of any particular penalty for doing this, but programmers more aware of the fine details of memory management, such as compiler developers, may think otherwise.

Initializing the components of a derived type is not bad practice, but as you can see, it can create problems with the stack if the component is a big array - be it due to the storage of the component itself, or to the temporary storage needed to evaluate the RHS of the assignment. If the component is made allocatable and allocated in a subroutine, the array is stored on the heap rather than on the stack, and this issue is usually avoided. In that case, the values of the array have to be set at run time in a subroutine rather than through default initialization in the type definition. It may be less elegant, but I think it's worth it, since it's a typical example of development work that prevents avoidable, environment-related errors when executing the binary.
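
A minimal sketch of the allocatable approach, for illustration only: the module name table_mod and the subroutine init_table are hypothetical names that do not appear in the question. Placing the type and its initializer in a module also gives the subroutine an explicit interface.

```fortran
module table_mod
    implicit none

    type table
        ! Allocatable component: the storage is obtained with allocate at
        ! run time, typically from the heap, instead of being a fixed-size
        ! component with a default initializer.
        real, allocatable :: col(:)
    end type table

contains

    ! Hypothetical helper that replaces the "= 0.0" default initialization.
    subroutine init_table(t, n)
        type(table), intent(out) :: t
        integer, intent(in)      :: n

        allocate(t%col(n))
        t%col = 0.0          ! initial value is now set at run time
    end subroutine init_table

end module table_mod

program main
    use table_mod
    implicit none

    type(table) :: table1

    call init_table(table1, 2100000)   ! same size that triggered the segfault
    table1%col = 1.0
    print *, size(table1%col), table1%col(1)

end program main
```

The component is deallocated automatically when table1 goes out of scope, so no explicit cleanup is needed in this small example.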

Your code above is standards compliant. As explained in the comments, lack of explicit interfaces for subroutines is not good practice, but for these simple subroutines it's not against the rules.

Some compilers have flags that allow you to change where some objects are allocated in memory. While such a flag may fix a particular issue, flags are compiler dependent and usually not equivalent across compilers. In my experience, using dynamic memory via allocatables is a more robust solution.

Finally, note that, if you are using OpenMP, the ulimit command above only affects the master thread - you need to set the stack size of each of the other threads via the environment variable OMP_STACKSIZE, which cannot be unlimited. And bear in mind that non-master threads running out of stack are a problem much more difficult to diagnose, since the binary may stop without a proper Segmentation fault error.
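
The same allocatable idea can be carried over to OpenMP code so that large per-thread work arrays do not depend on OMP_STACKSIZE. The sketch below is an assumption-laden illustration, not something taken from the question: the program name, the array name work and the reduction are made up, and it relies on the usual behavior that storage obtained with allocate comes from the heap. It is meant to be compiled with gfortran -fopenmp; without that flag the directives are treated as comments and the program runs serially.

```fortran
program omp_heap_demo
    implicit none

    integer, parameter :: n = 2100000
    real, allocatable  :: work(:)     ! unallocated on entry to the parallel region
    real               :: total

    total = 0.0

    ! Each thread gets its own private, initially unallocated copy of work
    ! and allocates it itself, instead of using a large fixed-size private
    ! array that would normally be placed on the thread's stack.
    !$omp parallel private(work) reduction(+:total)
    allocate(work(n))
    work  = 1.0
    total = total + sum(work)
    deallocate(work)
    !$omp end parallel

    print *, 'sum over all threads:', total

end program omp_heap_demo
```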
