segfault using numpy's lapack_lite with multiprocessing on osx, not linux

Views: 304 | Posted: 2020/5/13 19:39:59 | Tags: python numpy segmentation-fault multiprocessing

Problem description

The following test code segfaults for me on OSX 10.7.3, but not other machines:

from __future__ import print_function

import numpy as np
import multiprocessing as mp
import scipy.linalg

def f(a):
    print("about to call")

    ### these all cause crashes
    sign, x = np.linalg.slogdet(a)
    #x = np.linalg.det(a)
    #x = np.linalg.inv(a).sum()

    ### these are all fine
    #x = scipy.linalg.expm3(a).sum()
    #x = np.dot(a, a.T).sum()

    print("result:", x)
    return x

def call_proc(a):
    print("\ncalling with multiprocessing")
    p = mp.Process(target=f, args=(a,))
    p.start()
    p.join()


if __name__ == '__main__':
    import sys
    n = int(sys.argv[1]) if len(sys.argv) > 1 else 50

    a = np.random.normal(0, 2, (n, n))
    f(a)

    call_proc(a)
    call_proc(a)

Example output for one of the segfaulty ones:

$ python2.7 test.py
about to call
result: -4.96797718087

calling with multiprocessing
about to call

calling with multiprocessing
about to call

with an OSX "problem report" popping up complaining about a segfault like KERN_INVALID_ADDRESS at 0x0000000000000108; here's a full one.
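
In the output above, the missing "result:" line is the only console sign that the child died. A minimal sketch of making that explicit, assuming Python 3.5+ (for signal.Signals) and a POSIX system: multiprocessing reports a child terminated by signal N as an exitcode of -N. The helper names run_in_child and describe_exit are hypothetical and not part of the original script.

```python
import multiprocessing as mp
import signal


def run_in_child(target, *args):
    """Run target(*args) in a child process and return its exit code."""
    p = mp.Process(target=target, args=args)
    p.start()
    p.join()
    return p.exitcode


def describe_exit(exitcode):
    """Turn a Process.exitcode value into a readable string."""
    if exitcode is None:
        return "still running"
    if exitcode < 0:
        # A negative value -N means the child was terminated by signal N.
        return "killed by " + signal.Signals(-exitcode).name
    return "exited with code %d" % exitcode
```

In the test script, replacing call_proc(a) with print(describe_exit(run_in_child(f, a))) would be expected to name the signal (for example SIGSEGV) in the crashing cases, rather than leaving only the OSX problem report as evidence.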

If I run it with n <= 32, it runs fine; for any n >= 33, it crashes.

If I comment out the f(a) call that's done in the original process, both calls to call_proc are fine. It still segfaults if I call f on a different large array; if I call it on a different small array, or if I call f(large_array) and then pass off f(small_array) to a different process, it works fine. The two calls don't actually need to be the same function; np.linalg.inv(large_array) followed by passing off to np.linalg.slogdet(different_large_array) also segfaults.

All of the commented-out np.linalg things in f cause crashes; np.dot(a, a.T).sum() and scipy.linalg.expm3 work fine. As far as I can tell, the difference is that the former use numpy's lapack_lite and the latter don't.

This happens for me on my desktop with

python 2.6.7, numpy 1.5.1
python 2.7.1, numpy 1.5.1, scipy 0.10.0
python 3.2.2, numpy 1.6.1, scipy 0.10.1

The 2.6 and 2.7 installs are, I think, the system defaults; I installed the 3.2 versions manually from the source tarballs. All of those numpys are linked to the system Accelerate framework:

$ otool -L `python3.2 -c 'from numpy.core import _dotblas; print(_dotblas.__file__)'`
/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/core/_dotblas.so:
    /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate (compatibility version 1.0.0, current version 4.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.2.1)

I get the same behavior on another Mac with a similar setup.

But all of the options for f work on other machines running

OSX 10.6.8 with Python 2.6.1 and numpy 1.2.1 linked to Accelerate 4 and vecLib 268 (except that it doesn't have scipy or slogdet)
Debian 6 with Python 3.2.2, numpy 1.6.1, and scipy 0.10.1 linked to the system ATLAS
Ubuntu 11.04 with Python 2.7.1, numpy 1.5.1, and scipy 0.8.0 linked to the system ATLAS

Am I doing something wrong here? What could possibly be causing this? I don't see how running a function on a numpy array that's getting pickled and unpickled can possibly cause it to later segfault in a different process.

Update: when I do a core dump, the backtrace is inside dispatch_group_async_f, the Grand Central Dispatch interface. Presumably this is a bug in the interaction between numpy/GCD and multiprocessing. I've reported this as a numpy bug, but if anyone has any ideas about workarounds or, for that matter, how to solve the bug, it'd be greatly appreciated. :)
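
One workaround that might be worth trying is sketched below. It rests on two assumptions that the post does not confirm: that the crash comes from forking a process after the parent has already used Accelerate, and that Python 3.4 or later is available (the versions listed above predate multiprocessing.get_context). With the "spawn" start method the child starts a fresh interpreter instead of inheriting the parent's state through fork. The helper names spawn_context and call_proc_spawned are hypothetical.

```python
import multiprocessing as mp


def spawn_context():
    """Return a multiprocessing context whose children start a fresh interpreter."""
    return mp.get_context("spawn")


def call_proc_spawned(target, a):
    """Like call_proc in the test script, but without fork; returns the exit code."""
    ctx = spawn_context()
    # target must be a module-level function so the child can import it.
    p = ctx.Process(target=target, args=(a,))
    p.start()
    p.join()
    return p.exitcode
```

In the test script this would mean calling call_proc_spawned(f, a) in place of call_proc(a), keeping the existing if __name__ == '__main__' guard, which spawn requires. The trade-off is a slower child start-up, since numpy has to be imported again in every child.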
