keras + scikit-learn wrapper appears to hang with GridSearchCV when n_jobs > 1


Problem description


UPDATE: I have to re-write this question as, after some investigation, I realise that this is a different problem.

Context: running keras in a grid-search setting using the KerasClassifier wrapper with scikit-learn. System: Ubuntu 16.04; libraries: Anaconda distribution 5.1, keras 2.0.9, scikit-learn 0.19.1, tensorflow 1.3.0 or theano 0.9.0, using CPUs only.

Code: I simply used the code here for testing: https://machinelearningmastery.com/use-keras-deep-learning-models-scikit-learn-python/, the second example, 'Grid Search Deep Learning Model Parameters'. Pay attention to line 35, which reads:

    grid = GridSearchCV(estimator=model, param_grid=param_grid)
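
For context, here is a minimal, self-contained sketch of the kind of grid-search setup that tutorial builds; the create_model builder and the synthetic stand-in data below are assumptions, not the tutorial's exact code:

    # Sketch of a KerasClassifier + GridSearchCV setup (assumed builder and data).
    import numpy
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.wrappers.scikit_learn import KerasClassifier
    from sklearn.model_selection import GridSearchCV

    def create_model(optimizer='rmsprop', init='glorot_uniform'):
        # Small binary classifier; the grid search varies optimizer and init.
        model = Sequential()
        model.add(Dense(12, input_dim=8, kernel_initializer=init, activation='relu'))
        model.add(Dense(1, kernel_initializer=init, activation='sigmoid'))
        model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
        return model

    # Synthetic stand-in for the tutorial's CSV dataset (8 features, binary target).
    X = numpy.random.rand(100, 8)
    Y = numpy.random.randint(2, size=100)

    model = KerasClassifier(build_fn=create_model, verbose=0)
    param_grid = dict(optimizer=['rmsprop', 'adam'],
                      init=['glorot_uniform', 'uniform'],
                      epochs=[10], batch_size=[5])
    grid = GridSearchCV(estimator=model, param_grid=param_grid)  # n_jobs defaults to 1
    grid_result = grid.fit(X, Y)
    print(grid_result.best_score_, grid_result.best_params_)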
    

Symptoms: when the grid search uses more than 1 job (meaning CPUs?), e.g. setting 'n_jobs' on the GridSearchCV line above to 2, as in the line below:

    grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=2)
    

will cause the code to hang indefinitely, with either tensorflow or theano, and with no CPU usage (see the attached screenshot, where 5 python processes were created but none is using the CPU).

By debugging, it appears to be the following line in 'sklearn.model_selection._search' that causes the problem:

    line 648: for parameters, (train, test) in product(candidate_params,
                                                   cv.split(X, y, groups)))
    

The program hangs on this line and cannot continue.

I would really appreciate some insights as to what this means and why this could happen.

Thanks in advance

Solution

Are you using a GPU? If so, you can't have multiple threads running each variation of the params because they won't be able to share the GPU.

Here's a full example of how to use the keras sklearn wrapper in a Pipeline with GridSearchCV: Pipeline with a Keras Model
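
A rough sketch of that shape follows; it assumes the create_model builder and the X, Y data from the sketch above, and the 'scale'/'clf' step names are my own choice rather than the linked answer's code:

    # Sketch: KerasClassifier inside an sklearn Pipeline, tuned with GridSearchCV.
    # Assumes create_model, X and Y as defined in the earlier sketch.
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import GridSearchCV
    from keras.wrappers.scikit_learn import KerasClassifier

    pipeline = Pipeline([
        ('scale', StandardScaler()),
        ('clf', KerasClassifier(build_fn=create_model, verbose=0)),
    ])

    # Parameters of a pipeline step are addressed as '<step_name>__<param>'.
    param_grid = {
        'clf__epochs': [10, 20],
        'clf__batch_size': [5, 10],
    }
    grid = GridSearchCV(pipeline, param_grid=param_grid)  # keep n_jobs=1 if the model needs the GPU
    grid_result = grid.fit(X, Y)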

If you really want to have multiple jobs in the GridSearchCV, you can try to limit the GPU fraction used by each job (e.g. if each job only allocates 0.5 of the available GPU memory, you can run 2 jobs simultaneously).
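
A hedged sketch of how such a per-job cap can be set with the TF 1.x / Keras 2.0.x APIs (the 0.5 fraction is just the example from the sentence above):

    # Sketch: limit the GPU memory fraction a TensorFlow 1.x session may allocate,
    # then hand that session to Keras before building/fitting models.
    import tensorflow as tf
    from keras import backend as K

    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)  # ~half the GPU per job
    config = tf.ConfigProto(gpu_options=gpu_options)
    K.set_session(tf.Session(config=config))
    # Note: as reported in the issues listed below, the memory fraction reportedly
    # has no effect in keras 2.0.9 but works in 2.0.8.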

See these issues:

  • Limit tensorflow backend resource usage
  • GPU memory fraction does not work in keras 2.0.9 but works in 2.0.8
