Why does GridSearchCV spend more than 50% of its time on {method 'acquire' of 'thread.lock' objects}?

Question

Recently I have been tuning one of my machine learning pipelines. To take advantage of my multicore processor, I ran cross-validation with the parameter n_jobs=-1. I also profiled the run, and to my surprise the top function was:

{method 'acquire' of 'thread.lock' objects}

I was not sure whether this was caused by the operations I perform in my Pipeline, so I ran a small experiment:

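# Assumes numpy is imported as np, and Pipeline, SVC and GridSearchCV are imported from scikit-learn.
pp = Pipeline([('svc', SVC())])
cv = GridSearchCV(pp, {'svc__C': [1, 100, 200]}, n_jobs=-1, cv=2, refit=True)
%prun cv.fit(np.random.rand(10000, 100), np.random.randint(0, 5, 10000))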

The output is:

2691 function calls (2655 primitive calls) in 74.005 seconds
Ordered by: internal time

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   83   43.819    0.528   43.819    0.528 {method 'acquire' of 'thread.lock' objects}
    1   30.112   30.112   30.112   30.112 {sklearn.svm.libsvm.fit}

What causes this behavior, and is it possible to speed it up?

Recommended answer

The profiler only tells you what the main process is doing, while its child processes do all the work. With n_jobs=-1, the time attributed to lock.acquire is most likely the main process waiting for the workers to return their results, and the single libsvm.fit call is most likely the final refit, which runs in the main process. Setting verbose=2 on GridSearchCV may give more useful output than %prun in this case.
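The effect described in the answer can be reproduced without scikit-learn. The following is a minimal sketch, not what GridSearchCV does internally: it uses a plain `subprocess` child as a stand-in for the worker processes, and the names `CHILD_CODE` and `run_child` are hypothetical. The profile of the parent should show the time as waiting on the child (the exact entry name depends on the platform), with no trace of the computation itself.

```python
import cProfile
import pstats
import subprocess
import sys

# Hypothetical CPU-bound workload; it runs in a separate Python process.
CHILD_CODE = "sum(i * i for i in range(3_000_000))"


def run_child():
    """Start a child interpreter and block until it finishes."""
    subprocess.run([sys.executable, "-c", CHILD_CODE], check=True)


# Profile only the parent process, as %prun does.
profiler = cProfile.Profile()
profiler.enable()
run_child()
profiler.disable()

stats = pstats.Stats(profiler)
# Show the five entries with the largest internal time.
stats.sort_stats("tottime").print_stats(5)

# Names of every function the profiler recorded in the parent process.
profiled_names = {func_name for (_file, _line, func_name) in stats.stats}
```

The printed table lists `run_child` and the calls the parent makes while waiting, but nothing from the summation in `CHILD_CODE`, because cProfile does not follow work into other processes. To see where the workers spend their time, the work would have to be profiled inside the worker, or the search run with a single job.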
