How to pickle individual steps in sklearn's Pipeline?
Question
I am using Pipeline from sklearn to classify text.
In this example Pipeline, I have a TfidfVectorizer and some custom features wrapped in a FeatureUnion, followed by a classifier, as the Pipeline steps. I then fit the training data and run the prediction:
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

X = ['I am a sentence', 'an example']
Y = [1, 2]
X_dev = ['another sentence']

# classifier
LinearSVC1 = LinearSVC(tol=1e-4, C=0.1)

# CustomFeatures is a user-defined transformer (definition not shown)
pipeline = Pipeline([
    ('features', FeatureUnion([
        ('tfidf', TfidfVectorizer(ngram_range=(1, 3), max_features=4000)),
        ('custom_features', CustomFeatures())])),
    ('clf', LinearSVC1),
])
pipeline.fit(X, Y)
y_pred = pipeline.predict(X_dev)
# etc.
Here I need to pickle the TfidfVectorizer step and leave custom_features unpickled, since I am still experimenting with them. The idea is to make the pipeline faster by pickling the tfidf step.
I know I can pickle the whole Pipeline with joblib.dump, but how do I pickle individual steps?
Recommended answer
To pickle the TfidfVectorizer, you could use:
joblib.dump(pipeline.steps[0][1].transformer_list[0][1], dump_path)
or:
joblib.dump(pipeline.get_params()['features__tfidf'], dump_path)
To load the dumped object, you can use:
pipeline.steps[0][1].transformer_list[0][1] = joblib.load(dump_path)
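The dump and load calls above can be put together into a small round trip. This is only a sketch under some assumptions: TextLength is a hypothetical stand-in for the CustomFeatures class, which the question does not show, and the dump path is a temporary file chosen for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import LinearSVC


class TextLength(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for CustomFeatures: one column holding the character count."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.array([[len(doc)] for doc in X], dtype=float)


X = ['I am a sentence', 'an example']
Y = [1, 2]

pipeline = Pipeline([
    ('features', FeatureUnion([
        ('tfidf', TfidfVectorizer(ngram_range=(1, 3), max_features=4000)),
        ('custom_features', TextLength()),
    ])),
    ('clf', LinearSVC(tol=1e-4, C=0.1)),
])
pipeline.fit(X, Y)

# Illustrative location for the dump file (an assumption, not from the question).
dump_path = os.path.join(tempfile.mkdtemp(), 'tfidf.joblib')

# Dump only the fitted TfidfVectorizer, not the whole pipeline.
fitted_tfidf = pipeline.steps[0][1].transformer_list[0][1]
joblib.dump(fitted_tfidf, dump_path)

# Load it back and check that the learned vocabulary survived the round trip.
restored_tfidf = joblib.load(dump_path)
same_vocabulary = restored_tfidf.vocabulary_ == fitted_tfidf.vocabulary_
```

Note that calling pipeline.fit again refits every step, including a vectorizer loaded this way, so the loaded object only saves work if it is used for transforming rather than refitted.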
Unfortunately, at the time of writing you can't use set_params, the inverse of get_params, to insert the estimator by name. You will be able to if the changes in PR#1769: enable setting pipeline components as parameters are ever merged!
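To my knowledge, later scikit-learn releases do let set_params replace a step by name, which lifts the limitation described above. The following is a minimal sketch that assumes such a release is installed; check it against your own version. The CountVectorizer is a hypothetical stand-in for the custom features, and the dump path is illustrative.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import LinearSVC

X = ['I am a sentence', 'an example']

# Fit the vectorizer once on its own and dump it (illustrative path).
dump_path = os.path.join(tempfile.mkdtemp(), 'tfidf.joblib')
joblib.dump(TfidfVectorizer(ngram_range=(1, 3), max_features=4000).fit(X), dump_path)

pipeline = Pipeline([
    ('features', FeatureUnion([
        ('tfidf', TfidfVectorizer()),  # placeholder to be replaced below
        ('custom_features', CountVectorizer(analyzer='char')),  # hypothetical stand-in
    ])),
    ('clf', LinearSVC(tol=1e-4, C=0.1)),
])

# Insert the loaded estimator by its nested name, mirroring get_params.
loaded = joblib.load(dump_path)
pipeline.set_params(features__tfidf=loaded)

# The named parameter now refers to the loaded object itself.
replaced = pipeline.get_params()['features__tfidf'] is loaded
```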