Scikit-learn multithreading


Question

Do you know whether scikit-learn models use multithreading automatically, or do they just run sequentially?

Thanks.

Recommended answer

No. By default, scikit-learn estimators run as a single job (n_jobs is left unset).

That said, it depends on the algorithm and the problem. If the algorithm inherently has to process the data sequentially, nothing can be done. If the dataset is multi-class or multi-label and the algorithm works on a one-vs-rest basis, then yes, the per-class models can be trained in parallel.
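As a rough illustration of the one-vs-rest case, here is a minimal sketch. It assumes the explicit OneVsRestClassifier wrapper and the iris dataset, neither of which is named in the answer; they are just a convenient way to see one independent model per class.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Iris has three classes, so one-vs-rest fits three binary models.
X, y = load_iris(return_X_y=True)

# n_jobs=-1 asks for all available cores; the per-class fits are independent.
# max_iter=1000 is an illustrative choice to avoid convergence warnings.
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000), n_jobs=-1)
ovr.fit(X, y)

n_binary_models = len(ovr.estimators_)
print(n_binary_models)
```

Each entry in estimators_ is a separate binary classifier, which is why this kind of work can be spread across cores.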

Look for an n_jobs parameter in the utility or algorithm you want to use, and set it to -1 to use all available cores.
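One quick way to check whether a given estimator exposes n_jobs is to inspect its constructor parameters. The helper name supports_n_jobs below is hypothetical and only for illustration; get_params() is the standard estimator method.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier


def supports_n_jobs(estimator):
    """Return True if the estimator has an n_jobs constructor parameter."""
    return "n_jobs" in estimator.get_params()


forest_has_n_jobs = supports_n_jobs(RandomForestClassifier())
tree_has_n_jobs = supports_n_jobs(DecisionTreeClassifier())
print(forest_has_n_jobs, tree_has_n_jobs)
```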

For example:

  • LogisticRegression: on a binary problem it trains only a single model, which has to work through the data sequentially, so n_jobs has no effect there. When it handles a multi-class problem as OvR (one-vs-rest), however, it has to train one estimator per class on the same data. In that case you can use n_jobs=-1.

  • DecisionTreeClassifier is inherently multi-class and does not need to train multiple models, so it has no n_jobs parameter.

  • Ensemble methods like RandomForestClassifier train multiple estimators (irrespective of the problem type), each working on some part of the data, so n_jobs can be used here as well.

  • Cross-validation utilities like cross_val_score or GridSearchCV work on individual folds of the data or individual parameter settings, each independent of the others, so parallelism can be used here too.
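The ensemble and cross-validation cases above might look roughly like this. It is a minimal sketch on a small synthetic dataset; the dataset size, n_estimators=50 and cv=5 are illustrative assumptions, not values from the answer.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Small synthetic binary dataset, used only for illustration.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Parallelism inside the estimator: the trees are built using all cores.
forest = RandomForestClassifier(n_estimators=50, n_jobs=-1, random_state=0)
forest.fit(X, y)

# Parallelism in the utility: the five folds are fitted and scored in
# parallel, while each forest itself keeps the default n_jobs.
scores = cross_val_score(
    RandomForestClassifier(n_estimators=50, random_state=0),
    X,
    y,
    cv=5,
    n_jobs=-1,
)
print(len(forest.estimators_), scores.shape)
```

Setting n_jobs at only one level (the estimator or the cross-validation utility) is a common way to avoid requesting more workers than there are cores.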
