Multiprocessing backed parallel loops cannot be nested below threads
Question
What causes this warning in joblib: 'Multiprocessing backed parallel loops cannot be nested below threads, setting n_jobs=1'? What should I do to avoid it?
I need to implement an XMLRPC server that runs heavy computation in a background thread and reports the current progress to a UI client through polling. It uses scikit-learn, which is built on joblib.
P.S.: I simply changed the name of the thread to "MainThread" to avoid the warning, and everything appears to work (it runs in parallel as expected, without issues). What problems might this workaround cause in the future?
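For context, the setup described above (heavy computation in a background thread, progress reported through polling) might be sketched roughly as follows. This is only an illustration using the standard library: the `Job` class, `make_server`, the method names, and the host/port values are all hypothetical and not taken from the original code, and the loop body is a placeholder for the real scikit-learn work.

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer


class Job:
    """Hypothetical holder for a background computation and its progress."""

    def __init__(self, steps):
        self.steps = steps  # assumed to be a positive integer
        self._done = 0
        self._lock = threading.Lock()
        self._thread = None

    def _run(self):
        for _ in range(self.steps):
            # Placeholder for one chunk of the heavy computation
            # (for example a scikit-learn call).
            with self._lock:
                self._done += 1

    def start(self):
        # The work runs in a non-main thread, which is the situation
        # in which the joblib warning from the question was reported.
        self._thread = threading.Thread(target=self._run, name="job-worker")
        self._thread.start()
        return True

    def progress(self):
        # Fraction of completed steps, safe to call from another thread.
        with self._lock:
            return self._done / self.steps

    def wait(self):
        self._thread.join()
        return True


def make_server(job, host="127.0.0.1", port=8000):
    """Expose start/progress over XML-RPC (host and port are assumptions)."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(job.start, "start")
    server.register_function(job.progress, "progress")
    return server
```

A client would call `start` once and then poll `progress` until it reaches 1.0; calling `make_server(job).serve_forever()` would run the server.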
Recommended answer
I had the same warning when making predictions with sklearn inside a thread, using a loaded model that had been fitted with n_jobs > 1. It appears that when you pickle a model, it is saved with its parameters, including n_jobs.
To avoid the warning (and potential serialization cost), set n_jobs to 1 on the pickled model, for example right after loading it:
clf = joblib.load(model_filename).set_params(n_jobs=1)
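The same idea can be wrapped in a small helper for use inside a worker thread. The sketch below is an assumption-laden illustration, not part of the original answer: `force_single_job` and `predict_in_thread` are hypothetical names, and the code relies only on the `get_params` / `set_params` / `predict` interface that scikit-learn estimators expose. It checks only a top-level `n_jobs` parameter, so nested estimators (for example inside a pipeline) would need their own handling.

```python
import threading


def force_single_job(estimator):
    """Set n_jobs=1 on an estimator if it exposes a top-level n_jobs parameter."""
    if "n_jobs" in estimator.get_params():
        estimator.set_params(n_jobs=1)
    return estimator


def predict_in_thread(estimator, X):
    """Run estimator.predict(X) in a background thread and return the result."""
    result = {}

    def worker():
        # n_jobs is forced to 1 before predicting, so the call made from
        # this non-main thread does not request multiprocessing at all.
        result["y"] = force_single_job(estimator).predict(X)

    thread = threading.Thread(target=worker, name="predict-worker")
    thread.start()
    thread.join()
    return result["y"]
```

With a model loaded as in the line above, this would be called as `predict_in_thread(clf, X)`, where `X` stands for whatever input the model expects.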