scikit-learn's GridSearchCV stops working when n_jobs>1


Question

I previously asked a question here and came up with the following lines of code:

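from sklearn import neighbors
from sklearn.grid_search import GridSearchCV  # module location in scikit-learn 0.14.x

parameters = [{'weights': ['uniform'], 'n_neighbors': [5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]}]
clf = GridSearchCV(neighbors.KNeighborsRegressor(), parameters, n_jobs=4)
clf.fit(features, rewards)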

But when I ran this, another problem appeared that is unrelated to the previously asked question. Python ends with the following OS crash report:

Process:         Python [1327]
Path:            /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python
Identifier:      Python
Version:         2.7.2.5 (2.7.2.5.r64662-trunk)
Code Type:       X86-64 (Native)
Parent Process:  Python [1316]
Responsible:     Sublime Text 2 [308]
User ID:         501

Date/Time:       2014-08-12 10:27:24.640 +0200
OS Version:      Mac OS X 10.9.4 (13E28)
Report Version:  11
Anonymous UUID:  D10CD8B7-221F-B121-98D4-4574A1F2189F

Sleep/Wake UUID: 0B9C4AE0-26E6-4DE8-B751-665791968115

Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000110

VM Regions Near 0x110:
--> 
__TEXT                 0000000100000000-0000000100001000 [    4K] r-x/rwx SM=COW  /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python

Application Specific Information:
*** multi-threaded process forked ***
crashed on child side of fork pre-exec

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libdispatch.dylib               0x00007fff91534c90 dispatch_group_async_f + 141
1   libBLAS.dylib                   0x00007fff9413f791 APL_sgemm + 1061
2   libBLAS.dylib                   0x00007fff9413cb3f cblas_sgemm + 1267
3   _dotblas.so                     0x0000000102b0236e dotblas_matrixproduct + 5934
4   org.activestate.ActivePython27  0x00000001000c552d PyEval_EvalFrameEx + 23949
5   org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
6   org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
7   org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
8   org.activestate.ActivePython27  0x000000010003d390 function_call + 176
9   org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
10  org.activestate.ActivePython27  0x00000001000c098a PyEval_EvalFrameEx + 4586
11  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
12  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
13  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
14  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
15  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
16  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
17  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
18  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
19  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
20  org.activestate.ActivePython27  0x00000001000c098a PyEval_EvalFrameEx + 4586
21  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
22  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
23  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
24  org.activestate.ActivePython27  0x000000010001d36d instancemethod_call + 365
25  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
26  org.activestate.ActivePython27  0x0000000100077dfa slot_tp_call + 74
27  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
28  org.activestate.ActivePython27  0x00000001000c098a PyEval_EvalFrameEx + 4586
29  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
30  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
31  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
32  org.activestate.ActivePython27  0x00000001000c098a PyEval_EvalFrameEx + 4586
33  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
34  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
35  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
36  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
37  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
38  org.activestate.ActivePython27  0x000000010001d36d instancemethod_call + 365
39  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
40  org.activestate.ActivePython27  0x0000000100077a28 slot_tp_init + 88
41  org.activestate.ActivePython27  0x0000000100074e25 type_call + 245
42  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
43  org.activestate.ActivePython27  0x00000001000c267d PyEval_EvalFrameEx + 11997  
44  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
45  org.activestate.ActivePython27  0x00000001000c7137 PyEval_EvalFrameEx + 31127
46  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
47  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
48  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
49  org.activestate.ActivePython27  0x000000010001d36d instancemethod_call + 365
50  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
51  org.activestate.ActivePython27  0x0000000100077a28 slot_tp_init + 88
52  org.activestate.ActivePython27  0x0000000100074e25 type_call + 245
53  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
54  org.activestate.ActivePython27  0x00000001000c267d PyEval_EvalFrameEx + 11997
55  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
56  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
57  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
58  org.activestate.ActivePython27  0x000000010003d390 function_call + 176
59  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98   
60  org.activestate.ActivePython27  0x000000010001d36d instancemethod_call + 365
61  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
62  org.activestate.ActivePython27  0x0000000100077dfa slot_tp_call + 74
63  org.activestate.ActivePython27  0x000000010000be12 PyObject_Call + 98
64  org.activestate.ActivePython27  0x00000001000c267d PyEval_EvalFrameEx + 11997
65  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
66  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
67  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
68  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
69  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
70  org.activestate.ActivePython27  0x00000001000c5d10 PyEval_EvalFrameEx + 25968
71  org.activestate.ActivePython27  0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
72  org.activestate.ActivePython27  0x00000001000c7bf6 PyEval_EvalCode + 54
73  org.activestate.ActivePython27  0x00000001000ed31e PyRun_FileExFlags + 174
74  org.activestate.ActivePython27  0x00000001000ed5d9 PyRun_SimpleFileExFlags + 489
75  org.activestate.ActivePython27  0x00000001001041dc Py_Main + 2940
76  org.activestate.ActivePython27.app  0x0000000100000ed4 0x100000000 + 3796

Thread 0 crashed with X86 Thread State (64-bit):
rax: 0x0000000000000100  rbx: 0x00007fff7cd43640  rcx: 0x0000000000000000  rdx: 0x0000000105e00000
rdi: 0x0000000000000008  rsi: 0x0000000105e01000  rbp: 0x00007fff5fbfa370  rsp: 0x00007fff5fbfa350
r8: 0x0000000000000001   r9: 0x0000000105e00000  r10: 0x0000000105e01000  r11: 0x0000000000000000
r12: 0x000000010ba10530  r13: 0x000000010b000000  r14: 0x00000001066d1970  r15: 0x00007fff915311af
rip: 0x00007fff91534c90  rfl: 0x0000000000010206  cr2: 0x0000000000000110

Logical CPU:     2
Error Code:      0x00000006
Trap Number:     14

.........
VM Region Summary:
ReadOnly portion of Libraries: Total=183.7M resident=97.0M(53%) swapped_out_or_unallocated=86.7M(47%)
Writable regions: Total=1.3G written=142.8M(11%) resident=503.6M(39%) swapped_out=0K(0%) unallocated=791.7M(61%)

When I replace the second line of my code with:

clf = GridSearchCV(neighbors.KNeighborsRegressor(), parameters, n_jobs=1)

then everything works fine, except that I no longer use multiple threads.

My operating system is OS X 10.9.4.

My Python version is 2.7.8 |Anaconda 2.0.1 (x86_64)| (default, Jul 2 2014, 15:36:00) [GCC 4.2.1 (Apple Inc. build 5577)].

My scikit-learn version is 0.14.1.

My numpy version is 1.8.1.

My scipy version is 0.14.0.

My question: does anybody have an idea how to make GridSearchCV run on more than one thread?

I have realized that this error actually happens only for some of my input data sets. Unfortunately, the problematic data sets (their X) are too big to copy here. The input features are basically tf-idf vectors, and the y values are non-negative floats, specifically:

[60.0, 7.0, 12.0, 21.0, 5.5, 3.0, 0.0, 2.5, 11.0, 3.0, 16.0, 2.0, 0.0, 4.5, 2.5, 6.0, 9.5, 2.5, 15.0, 7.0, 8.0, 13.0, 14.0, 8.0, 3.5, 6.0, 22.5, 7.0, 4.0, 3.5, 4.5, 6.0, 5.5, 7.0, 2.0, 0.0, 0.0, 0.0, 14.5, 8.0, 7.5, 2.5, 11.5, 1.0, 3.0, 14.5, 10.0, 14.5, 8.0, 8.0, 7.0, 2.5, 3.5, 3.0, 13.5, 7.0, 6.5, 2.5, 9.0, 8.0, 11.0, 17.5, 12.5, 4.5, 5.5, 8.0, 2.0, 7.0, 4.0, 1.5, 3.0, 21.5, 4.5, 4.0, 7.0, 9.0, 13.5, 8.0, 10.5, 4.5, 1.5, 11.5, 7.5, 11.5, 4.5, 5.0, 7.0, 9.5, 4.0, 4.0, 6.0, 3.5, 4.5, 7.5, 3.5, 3.5, 3.5, 6.0, 5.0, 5.5, 25.0, 6.5, 5.0, 2.0, 2.0, 10.5, 0.0, 6.5, 19.0, 9.0, 1.0, 1.5, 1.0, 0.0, 1.0, 4.5, 2.5, 17.5, 39.5, 7.5, 5.5, 8.0, 1.0, 6.0, 12.0, 10.0, 5.5, 19.0, 4.5, 1.5, 25.5, 4.0, 10.0, 18.5, 9.5, 10.5, 2.5, 6.0, 1.0, 10.0, 8.5, 12.5, 13.5, 5.0, 6.5, 11.0, 4.5, 8.0, 7.5, 11.5, 14.5, 9.0, 3.0, 1.5, 3.5, 5.5, 2.5, 12.5, 6.5, 5.5, 5.0, 0.0, 8.0, 3.0, 14.5, 5.0, 14.0, 7.0, 13.5, 12.5, 4.0, 1.5, 6.5, 10.5, 9.0, 16.5, 4.0, 4.0, 15.0, 11.5, 2.5, 8.5, 3.0, 5.0, 4.0, 8.5, 6.0, 5.0, 5.0, 5.0, 5.5, 8.0, 11.0, 4.0, 0.0, 5.5, 0.0, 4.5, 1.5, 0.0, 6.5, 11.0, 2.5, 8.0, 15.5, 5.5, 4.5, 5.0, 4.0, 5.5, 10.5, 7.5, 6.5, 8.5, 2.5, 1.5, 1.5, 18.0, 15.0, 14.0, 9.5, 5.5, 7.5, 14.5, 2.5, 5.0, 60.0, 6.5, 14.5, 6.5, 4.0, 1.5, 2.0, 4.0, 27.0, 3.0, 5.0, 4.0, 2.5, 1.0, 1.5, 1.5, 9.0, 4.0, 8.5, 4.0, 4.0, 0.0, 1.5, 7.5, 1.5, 7.5, 1.0, 28.5, 15.5, 7.5, 1.0, 2.5, 2.5, 2.5, 16.0, 5.5, 8.5, 4.0, 2.5, 5.0, 2.5, 6.0, 11.0, 10.0, 4.5, 6.5, 8.0, 6.0, 4.5, 15.5, 4.0, 5.0]

The version with 1 job works for all of my input data sets, even for this one.

Recommended answer

libdispatch.dylib from Grand Central Dispatch (GCD) is used internally by Accelerate, OS X's built-in BLAS implementation, whenever you call numpy.dot. The GCD runtime does not work when a program calls the POSIX fork syscall without calling exec afterwards, which makes every Python program that uses the multiprocessing module prone to crash. sklearn's GridSearchCV uses the Python multiprocessing module for parallelization, which is why the crash report shows "multi-threaded process forked" and a stack that runs from dotblas_matrixproduct through cblas_sgemm into dispatch_group_async_f.
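The fork-without-exec condition described here can be observed with the standard library alone. The following is a minimal POSIX-only sketch, not a reproduction of the Accelerate crash: it starts one helper thread, forks, and reports how many threads each side sees. The function name `count_threads_across_fork` is made up for this illustration. The child is a copy of the parent's memory, but only the thread that called fork exists in it, so a runtime that depends on its own helper threads may find its state inconsistent there.

```python
import os
import threading


def count_threads_across_fork():
    """Return (threads seen by the parent, threads seen by the forked child)."""
    stop = threading.Event()
    # Hypothetical stand-in for a library's background worker thread.
    helper = threading.Thread(target=stop.wait, daemon=True)
    helper.start()
    parent_count = threading.active_count()

    read_fd, write_fd = os.pipe()
    pid = os.fork()  # POSIX only; no exec follows, as with the fork start method
    if pid == 0:
        # Child side: memory is copied, but the helper thread is not running here.
        try:
            os.close(read_fd)
            os.write(write_fd, str(threading.active_count()).encode())
        finally:
            os._exit(0)  # leave immediately without running parent cleanup code

    # Parent side: read the child's report, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_count = int(reader.read())
    os.waitpid(pid, 0)

    stop.set()
    helper.join()
    return parent_count, child_count
```

Calling `count_threads_across_fork()` in a plain script should report at least two threads on the parent side and a single thread on the child side. Recent Python versions may emit a warning about forking a multi-threaded process, which is the same hazard stated from the interpreter's point of view.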

Under Python 3.4 and later, you can force Python multiprocessing to use the forkserver start method instead of the default fork mode to work around this problem, for instance at the beginning of the main file of your program:

import multiprocessing as mp

if __name__ == "__main__":
    mp.set_start_method('forkserver')  # call once, before any worker process starts
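As a variation on the snippet above, the start method can also be selected per context rather than globally. This is a hedged sketch using only the standard library: the helper names `pick_start_method`, `make_context`, and `parallel_abs` are invented for illustration, and the fallback order (forkserver, then spawn, then the platform default) is an assumption, not something the answer prescribes. It only illustrates the multiprocessing mechanism; whether a given scikit-learn version honors it depends on how that version launches its workers.

```python
import multiprocessing as mp


def pick_start_method(preferred=("forkserver", "spawn")):
    """Return the first supported start method from `preferred`.

    Falls back to the platform default if none of them is available.
    """
    available = mp.get_all_start_methods()  # the default method is listed first
    for method in preferred:
        if method in available:
            return method
    return available[0]


def make_context(preferred=("forkserver", "spawn")):
    """Build a multiprocessing context, preferring non-fork start methods."""
    return mp.get_context(pick_start_method(preferred))


def parallel_abs(values, n_workers=2):
    """Map the built-in abs() over `values` in worker processes."""
    ctx = make_context()
    with ctx.Pool(processes=n_workers) as pool:
        return pool.map(abs, values)
```

In a script, `parallel_abs([-1, 2, -3])` would be called under an `if __name__ == "__main__":` guard, because forkserver and spawn workers re-import the main module. Unlike `set_start_method`, which may be called only once per program, a context leaves the global start method untouched.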

Alternatively, you can rebuild numpy from source and link it against ATLAS or OpenBLAS instead of OS X Accelerate. The numpy developers are working on binary distributions that include either ATLAS or OpenBLAS by default.
