scikit-learn's GridSearchCV stops working when n_jobs>1
Question
Following a question I asked here previously, I came up with the following lines of code:
parameters = [{'weights': ['uniform'], 'n_neighbors': [5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]}]
clf = GridSearchCV(neighbors.KNeighborsRegressor(), parameters, n_jobs=4)
clf.fit(features, rewards)
But when I ran this, another problem appeared that is unrelated to the previous question. Python crashes with the following OS error report:
Process: Python [1327]
Path: /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python
Identifier: Python
Version: 2.7.2.5 (2.7.2.5.r64662-trunk)
Code Type: X86-64 (Native)
Parent Process: Python [1316]
Responsible: Sublime Text 2 [308]
User ID: 501
Date/Time: 2014-08-12 10:27:24.640 +0200
OS Version: Mac OS X 10.9.4 (13E28)
Report Version: 11
Anonymous UUID: D10CD8B7-221F-B121-98D4-4574A1F2189F
Sleep/Wake UUID: 0B9C4AE0-26E6-4DE8-B751-665791968115
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000110
VM Regions Near 0x110:
-->
__TEXT 0000000100000000-0000000100001000 [ 4K] r-x/rwx SM=COW /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python
Application Specific Information:
*** multi-threaded process forked ***
crashed on child side of fork pre-exec
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libdispatch.dylib 0x00007fff91534c90 dispatch_group_async_f + 141
1 libBLAS.dylib 0x00007fff9413f791 APL_sgemm + 1061
2 libBLAS.dylib 0x00007fff9413cb3f cblas_sgemm + 1267
3 _dotblas.so 0x0000000102b0236e dotblas_matrixproduct + 5934
4 org.activestate.ActivePython27 0x00000001000c552d PyEval_EvalFrameEx + 23949
5 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
6 org.activestate.ActivePython27 0x00000001000c5d10 PyEval_EvalFrameEx + 25968
7 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
8 org.activestate.ActivePython27 0x000000010003d390 function_call + 176
9 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
10 org.activestate.ActivePython27 0x00000001000c098a PyEval_EvalFrameEx + 4586
11 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
12 org.activestate.ActivePython27 0x00000001000c5d10 PyEval_EvalFrameEx + 25968
13 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
14 org.activestate.ActivePython27 0x00000001000c5d10 PyEval_EvalFrameEx + 25968
15 org.activestate.ActivePython27 0x00000001000c7137 PyEval_EvalFrameEx + 31127
16 org.activestate.ActivePython27 0x00000001000c7137 PyEval_EvalFrameEx + 31127
17 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
18 org.activestate.ActivePython27 0x000000010003d390 function_call + 176
19 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
20 org.activestate.ActivePython27 0x00000001000c098a PyEval_EvalFrameEx + 4586
21 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
22 org.activestate.ActivePython27 0x000000010003d390 function_call + 176
23 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
24 org.activestate.ActivePython27 0x000000010001d36d instancemethod_call + 365
25 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
26 org.activestate.ActivePython27 0x0000000100077dfa slot_tp_call + 74
27 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
28 org.activestate.ActivePython27 0x00000001000c098a PyEval_EvalFrameEx + 4586
29 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
30 org.activestate.ActivePython27 0x000000010003d390 function_call + 176
31 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
32 org.activestate.ActivePython27 0x00000001000c098a PyEval_EvalFrameEx + 4586
33 org.activestate.ActivePython27 0x00000001000c7137 PyEval_EvalFrameEx + 31127
34 org.activestate.ActivePython27 0x00000001000c7137 PyEval_EvalFrameEx + 31127
35 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
36 org.activestate.ActivePython27 0x000000010003d390 function_call + 176
37 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
38 org.activestate.ActivePython27 0x000000010001d36d instancemethod_call + 365
39 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
40 org.activestate.ActivePython27 0x0000000100077a28 slot_tp_init + 88
41 org.activestate.ActivePython27 0x0000000100074e25 type_call + 245
42 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
43 org.activestate.ActivePython27 0x00000001000c267d PyEval_EvalFrameEx + 11997
44 org.activestate.ActivePython27 0x00000001000c7137 PyEval_EvalFrameEx + 31127
45 org.activestate.ActivePython27 0x00000001000c7137 PyEval_EvalFrameEx + 31127
46 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
47 org.activestate.ActivePython27 0x000000010003d390 function_call + 176
48 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
49 org.activestate.ActivePython27 0x000000010001d36d instancemethod_call + 365
50 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
51 org.activestate.ActivePython27 0x0000000100077a28 slot_tp_init + 88
52 org.activestate.ActivePython27 0x0000000100074e25 type_call + 245
53 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
54 org.activestate.ActivePython27 0x00000001000c267d PyEval_EvalFrameEx + 11997
55 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
56 org.activestate.ActivePython27 0x00000001000c5d10 PyEval_EvalFrameEx + 25968
57 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
58 org.activestate.ActivePython27 0x000000010003d390 function_call + 176
59 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
60 org.activestate.ActivePython27 0x000000010001d36d instancemethod_call + 365
61 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
62 org.activestate.ActivePython27 0x0000000100077dfa slot_tp_call + 74
63 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
64 org.activestate.ActivePython27 0x00000001000c267d PyEval_EvalFrameEx + 11997
65 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
66 org.activestate.ActivePython27 0x00000001000c5d10 PyEval_EvalFrameEx + 25968
67 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
68 org.activestate.ActivePython27 0x00000001000c5d10 PyEval_EvalFrameEx + 25968
69 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
70 org.activestate.ActivePython27 0x00000001000c5d10 PyEval_EvalFrameEx + 25968
71 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
72 org.activestate.ActivePython27 0x00000001000c7bf6 PyEval_EvalCode + 54
73 org.activestate.ActivePython27 0x00000001000ed31e PyRun_FileExFlags + 174
74 org.activestate.ActivePython27 0x00000001000ed5d9 PyRun_SimpleFileExFlags + 489
75 org.activestate.ActivePython27 0x00000001001041dc Py_Main + 2940
76 org.activestate.ActivePython27.app 0x0000000100000ed4 0x100000000 + 3796
Thread 0 crashed with X86 Thread State (64-bit):
rax: 0x0000000000000100 rbx: 0x00007fff7cd43640 rcx: 0x0000000000000000 rdx: 0x0000000105e00000
rdi: 0x0000000000000008 rsi: 0x0000000105e01000 rbp: 0x00007fff5fbfa370 rsp: 0x00007fff5fbfa350
r8: 0x0000000000000001 r9: 0x0000000105e00000 r10: 0x0000000105e01000 r11: 0x0000000000000000
r12: 0x000000010ba10530 r13: 0x000000010b000000 r14: 0x00000001066d1970 r15: 0x00007fff915311af
rip: 0x00007fff91534c90 rfl: 0x0000000000010206 cr2: 0x0000000000000110
Logical CPU: 2
Error Code: 0x00000006
Trap Number: 14
.........
VM Region Summary:
ReadOnly portion of Libraries: Total=183.7M resident=97.0M(53%) swapped_out_or_unallocated=86.7M(47%)
Writable regions: Total=1.3G written=142.8M(11%) resident=503.6M(39%) swapped_out=0K(0%) unallocated=791.7M(61%)
When I replace the second line of my code with:
clf = GridSearchCV(neighbors.KNeighborsRegressor(), parameters, n_jobs=1)
then everything works fine, except that I am no longer using multiple threads.
My operating system is OSX 10.9.4
My Python version is 2.7.8 |Anaconda 2.0.1 (x86_64)| (default, Jul 2 2014, 15:36:00) [GCC 4.2.1 (Apple Inc. build 5577)]
My scikit-learn version is 0.14.1
My numpy version is 1.8.1
My scipy version is 0.14.0
My question is: does anybody have an idea how to make GridSearchCV run on more than one thread?
I have since realized that this error happens only for some of my input data sets. Unfortunately, the problematic data sets (specifically their X) are too big to copy here. The input features are essentially tf-idf vectors, and the y values are non-negative floats, for example:
[60.0, 7.0, 12.0, 21.0, 5.5, 3.0, 0.0, 2.5, 11.0, 3.0, 16.0, 2.0, 0.0, 4.5, 2.5, 6.0, 9.5, 2.5, 15.0, 7.0, 8.0, 13.0, 14.0, 8.0, 3.5, 6.0, 22.5, 7.0, 4.0, 3.5, 4.5, 6.0, 5.5, 7.0, 2.0, 0.0, 0.0, 0.0, 14.5, 8.0, 7.5, 2.5, 11.5, 1.0, 3.0, 14.5, 10.0, 14.5, 8.0, 8.0, 7.0, 2.5, 3.5, 3.0, 13.5, 7.0, 6.5, 2.5, 9.0, 8.0, 11.0, 17.5, 12.5, 4.5, 5.5, 8.0, 2.0, 7.0, 4.0, 1.5, 3.0, 21.5, 4.5, 4.0, 7.0, 9.0, 13.5, 8.0, 10.5, 4.5, 1.5, 11.5, 7.5, 11.5, 4.5, 5.0, 7.0, 9.5, 4.0, 4.0, 6.0, 3.5, 4.5, 7.5, 3.5, 3.5, 3.5, 6.0, 5.0, 5.5, 25.0, 6.5, 5.0, 2.0, 2.0, 10.5, 0.0, 6.5, 19.0, 9.0, 1.0, 1.5, 1.0, 0.0, 1.0, 4.5, 2.5, 17.5, 39.5, 7.5, 5.5, 8.0, 1.0, 6.0, 12.0, 10.0, 5.5, 19.0, 4.5, 1.5, 25.5, 4.0, 10.0, 18.5, 9.5, 10.5, 2.5, 6.0, 1.0, 10.0, 8.5, 12.5, 13.5, 5.0, 6.5, 11.0, 4.5, 8.0, 7.5, 11.5, 14.5, 9.0, 3.0, 1.5, 3.5, 5.5, 2.5, 12.5, 6.5, 5.5, 5.0, 0.0, 8.0, 3.0, 14.5, 5.0, 14.0, 7.0, 13.5, 12.5, 4.0, 1.5, 6.5, 10.5, 9.0, 16.5, 4.0, 4.0, 15.0, 11.5, 2.5, 8.5, 3.0, 5.0, 4.0, 8.5, 6.0, 5.0, 5.0, 5.0, 5.5, 8.0, 11.0, 4.0, 0.0, 5.5, 0.0, 4.5, 1.5, 0.0, 6.5, 11.0, 2.5, 8.0, 15.5, 5.5, 4.5, 5.0, 4.0, 5.5, 10.5, 7.5, 6.5, 8.5, 2.5, 1.5, 1.5, 18.0, 15.0, 14.0, 9.5, 5.5, 7.5, 14.5, 2.5, 5.0, 60.0, 6.5, 14.5, 6.5, 4.0, 1.5, 2.0, 4.0, 27.0, 3.0, 5.0, 4.0, 2.5, 1.0, 1.5, 1.5, 9.0, 4.0, 8.5, 4.0, 4.0, 0.0, 1.5, 7.5, 1.5, 7.5, 1.0, 28.5, 15.5, 7.5, 1.0, 2.5, 2.5, 2.5, 16.0, 5.5, 8.5, 4.0, 2.5, 5.0, 2.5, 6.0, 11.0, 10.0, 4.5, 6.5, 8.0, 6.0, 4.5, 15.5, 4.0, 5.0]
The version with 1 job works for all of my input data sets, including this one.
Recommended answer
libdispatch.dylib from Grand Central Dispatch (GCD) is used internally by Accelerate, OSX's builtin implementation of BLAS, whenever you call numpy.dot. The GCD runtime does not work when a program calls the POSIX fork syscall without calling an exec syscall afterwards, which makes Python programs that use the multiprocessing module prone to crash. sklearn's GridSearchCV uses the Python multiprocessing module for parallelization.
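The crash report in the question shows this pattern directly: the "fork pre-exec" note plus libdispatch.dylib sitting above libBLAS.dylib in the crashed thread. As a rough illustration only, the check below looks for that signature in a report excerpt. The helper name looks_like_fork_after_gcd and the shortened excerpt are assumptions made for this sketch, and matching the signature is a hint, not proof of the cause.

```python
# Excerpt copied from the crash report in the question (frames 0-3 only).
REPORT_EXCERPT = """\
*** multi-threaded process forked ***
crashed on child side of fork pre-exec
0 libdispatch.dylib 0x00007fff91534c90 dispatch_group_async_f + 141
1 libBLAS.dylib 0x00007fff9413f791 APL_sgemm + 1061
2 libBLAS.dylib 0x00007fff9413cb3f cblas_sgemm + 1267
3 _dotblas.so 0x0000000102b0236e dotblas_matrixproduct + 5934
"""


def looks_like_fork_after_gcd(report):
    """Hypothetical helper: True if the report matches the fork-plus-GCD pattern."""
    forked = "crashed on child side of fork pre-exec" in report
    # Stack frame lines start with a frame index followed by the image name.
    split_lines = (line.split() for line in report.splitlines())
    images = [parts[1] for parts in split_lines
              if len(parts) > 1 and parts[0].isdigit()]
    return forked and "libdispatch.dylib" in images and "libBLAS.dylib" in images


if __name__ == "__main__":
    print(looks_like_fork_after_gcd(REPORT_EXCERPT))
```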
Under Python 3.4 and later you can force Python multiprocessing to use the forkserver start method instead of the default fork mode to work around this problem, for instance at the beginning of the main file of your program:
if __name__ == "__main__":
    import multiprocessing as mp
    mp.set_start_method('forkserver')
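For a slightly fuller picture, here is a minimal standard-library sketch (Python 3.4+) of the same idea using multiprocessing.get_context, which selects a start method for a single pool without changing the global default. The helper pick_start_method and the toy square function are hypothetical names introduced for this example; square stands in for real work such as a numpy.dot call. Whether a given scikit-learn version's n_jobs workers honor the multiprocessing start method depends on that version and is not verified here.

```python
import multiprocessing as mp


def pick_start_method(preferred="forkserver"):
    """Return `preferred` if this platform supports it, else the platform default.

    Hypothetical helper for this sketch; not part of any library.
    """
    # The first entry of get_all_start_methods() is the platform default.
    available = mp.get_all_start_methods()
    return preferred if preferred in available else available[0]


def square(x):
    # Stand-in for real work; a worker would call numpy.dot or similar here.
    return x * x


if __name__ == "__main__":
    # get_context leaves the global start method untouched,
    # unlike set_start_method, which may only be called once per program.
    ctx = mp.get_context(pick_start_method())
    with ctx.Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3]))
```

With forkserver, workers are forked from a separate single-threaded server process rather than from the main process, which is why the parent's GCD state does not carry over to them.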
Alternatively, you can rebuild numpy from source and link it against ATLAS or OpenBLAS instead of OSX Accelerate. The numpy developers are working on binary distributions that include either ATLAS or OpenBLAS by default.
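Before rebuilding, it may help to confirm which BLAS the installed numpy is linked against. numpy.show_config() prints the build configuration. The sketch below (Python 3) captures that output as text; blas_config_text is a hypothetical helper name, and the exact output format varies between numpy versions, so searching it for a library name such as Accelerate or OpenBLAS is only a heuristic.

```python
import contextlib
import io


def blas_config_text():
    """Return numpy's build configuration as text, or None if numpy is missing.

    Hypothetical helper written for this example.
    """
    try:
        import numpy as np
    except ImportError:
        return None
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # Prints the BLAS/LAPACK libraries numpy was built against.
        np.show_config()
    return buffer.getvalue()


if __name__ == "__main__":
    text = blas_config_text()
    print(text if text is not None else "numpy is not installed")
```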