Train multiple models in parallel with sklearn?
Question
I want to train multiple LinearSVC models with different random states, but I would prefer to do it in parallel. Is there a mechanism supporting this in sklearn? I know GridSearchCV and some ensemble methods do this implicitly, but what is the thing under the hood?
Recommended answer
The "thing" under the hood is the library joblib, which powers, for example, the multiprocessing in GridSearchCV and some ensemble methods. Its Parallel helper class is a very handy Swiss Army knife for embarrassingly parallel for loops.
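To make the "embarrassingly parallel for loop" idea concrete, here is a minimal sketch that runs the same independent computation once as a plain list comprehension and once through Parallel and delayed. The function name `square_root` and the choice of two workers are illustrative assumptions, not something from the original answer.

```python
from math import sqrt

from joblib import Parallel, delayed


def square_root(value):
    # Hypothetical stand-in for any independent unit of work
    return sqrt(value)


inputs = [i ** 2 for i in range(10)]

# Ordinary sequential loop
sequential = [square_root(v) for v in inputs]

# Same loop, with each call dispatched to one of 2 workers;
# results come back as a list in the same order as the inputs
parallel = Parallel(n_jobs=2)(delayed(square_root)(v) for v in inputs)
```

Each call depends only on its own argument, which is what makes the loop safe to split across workers.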
Here is an example that uses joblib to train multiple LinearSVC models with different random states in parallel across 4 processes:
from joblib import Parallel, delayed
from sklearn.svm import LinearSVC
import numpy as np

def train_model(X, y, seed):
    # Fit one LinearSVC with the given random state and return the fitted model
    model = LinearSVC(random_state=seed)
    return model.fit(X, y)

# Tiny toy dataset: two samples, three features
X = np.array([[1, 2, 3], [4, 5, 6]])
y = np.array([0, 1])

# Run the 10 training jobs across 4 worker processes
result = Parallel(n_jobs=4)(delayed(train_model)(X, y, seed) for seed in range(10))
# result is a list of 10 models trained using different seeds
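As a follow-up, the returned list can be paired back with the seeds, since Parallel returns results in the order the tasks were submitted. The sketch below is only an illustration under assumptions that are not in the original answer: a synthetic dataset from make_classification, four seeds, and two workers.

```python
from joblib import Parallel, delayed
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC


def train_model(X, y, seed):
    # Fit one LinearSVC with the given random state and return the fitted model
    return LinearSVC(random_state=seed).fit(X, y)


# Synthetic binary classification data (assumed here purely for illustration)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

seeds = [0, 1, 2, 3]
models = Parallel(n_jobs=2)(delayed(train_model)(X, y, seed) for seed in seeds)

# models[i] was trained with seeds[i], so the two lists can be zipped together
for seed, model in zip(seeds, models):
    print(seed, model.score(X, y))
```

Passing n_jobs=-1 instead asks joblib to use all available CPU cores.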