Scikit-learn multithreading
Question
Do you know whether scikit-learn models use multithreading automatically, or only sequential instructions?
Thanks
Recommended answer
No. By default, scikit-learn estimators run as a single job (the underlying numerical libraries may still use threads of their own for some low-level routines).
But then again, it all depends on the algorithm and the problem. If the algorithm inherently has to process the data sequentially, nothing can be done. If the dataset is multi-class or multi-label and the algorithm works on a one-vs-rest basis, then yes, it can run in parallel.
Look for an n_jobs parameter in the utility or algorithm you want to use, and set it to -1 to run in parallel on all available cores.
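One way to check whether a given estimator exposes n_jobs is to look at its constructor parameters. The snippet below is a minimal sketch; supports_n_jobs is a hypothetical helper name, not part of scikit-learn, and it relies only on the standard get_params() method that estimators provide.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier


def supports_n_jobs(estimator):
    """Return True if the estimator accepts an n_jobs constructor parameter.

    Hypothetical helper for illustration; it only inspects get_params().
    """
    return "n_jobs" in estimator.get_params()


# An ensemble of trees exposes n_jobs; a single decision tree does not.
has_forest = supports_n_jobs(RandomForestClassifier())
has_tree = supports_n_jobs(DecisionTreeClassifier())
print("RandomForestClassifier:", has_forest)
print("DecisionTreeClassifier:", has_tree)
```

When the parameter is present, it can be passed to the constructor (n_jobs=-1) or set later with set_params(n_jobs=-1).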
For example:
LogisticRegression, when working on a binary problem, trains only a single model, which needs the data sequentially, so n_jobs has no effect there. But when it handles a multi-class problem as one-vs-rest (OvR), it has to train one estimator per class on the same data. In that case you can use n_jobs=-1.
DecisionTreeClassifier is inherently multi-class and doesn't need to train multiple models, so it has no n_jobs parameter.
Ensemble methods like RandomForestClassifier train multiple estimators (irrespective of problem type), each of which works on some part of the data, so here again we can make use of n_jobs.
Cross-validation utilities like cross_val_score or GridSearchCV likewise work on some part of the data or on individual parameter settings, each independent of the other folds, so here too we can run in parallel.
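The ensemble and cross-validation cases above can be sketched as follows. This is only an illustration under stated assumptions: the dataset is synthetic (generated with make_classification), and the sizes, the max_depth grid, and cv=5 are arbitrary choices made to keep the example quick.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Small synthetic dataset, used purely for illustration.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Ensemble: the individual trees are independent, so n_jobs=-1 lets them
# be built in parallel on all available cores.
forest = RandomForestClassifier(n_estimators=50, n_jobs=-1, random_state=0)
forest.fit(X, y)

# Cross-validation: each fold is fitted and scored independently.
# The tree itself has no n_jobs, but the utility wrapped around it does.
tree = DecisionTreeClassifier(random_state=0)
scores = cross_val_score(tree, X, y, cv=5, n_jobs=-1)

# Grid search: every (parameter setting, fold) pair is an independent fit.
search = GridSearchCV(tree, {"max_depth": [2, 4, 8]}, cv=5, n_jobs=-1)
search.fit(X, y)

print("trees in forest:", len(forest.estimators_))
print("fold scores:", scores)
print("best params:", search.best_params_)
```

Note that the parallelism here sits at different levels: inside the estimator for the forest, and around the estimator for the cross-validation utilities. Setting n_jobs=-1 at both levels at once can oversubscribe the CPU, which is why the utilities above wrap a plain decision tree.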