Use the predicted probability of one model to train another model and save both as a single model
Question
I have an XGBoost model that I use for binary classification. It uses the features f1, f2, f3, f4, f5, f6, f7.
I want to add a LogisticRegression model from sklearn that predicts from the XGBoost model's output together with one of the XGBoost model's input features. That is, it must take f1 and out as inputs, where out is the prediction made by the XGBoost model.
I want to save these two models in a single file so that I can make predictions in production. How can I do that?
Recommended answer
You would need a combination of FeatureUnion and Pipeline to achieve this.
Something like this:
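from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

# FeatureSelector and XGBoostClassifierTransformer are custom wrappers (described below)
final_classifier = Pipeline([
    ('features', FeatureUnion([
        ('f1', FeatureSelector()),                 # passes only f1 through
        ('out', XGBoostClassifierTransformer()),   # emits the XGBoost prediction
    ])),
    ('clf', LogisticRegression()),                 # trained on [f1, out]
])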
Here, FeatureSelector() and XGBoostClassifierTransformer() are custom wrappers that you can easily write yourself. Each one needs a fit() method and a transform() method, and transform() must return the output you want to send to the next part of the pipeline.
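As a minimal sketch of what the two wrappers might look like (not the answerer's actual code): the `columns` and `estimator` parameters are assumptions introduced here for illustration, and f1 is assumed to be column 0 of a NumPy array. So that the sketch runs without xgboost installed, the quick check passes scikit-learn's GradientBoostingClassifier as a stand-in. When no estimator is given, the wrapper falls back to `xgboost.XGBClassifier`, which follows the same fit/predict_proba interface.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.ensemble import GradientBoostingClassifier


class FeatureSelector(TransformerMixin, BaseEstimator):
    """Pass through only the chosen columns (assumption: f1 is column 0)."""

    def __init__(self, columns=(0,)):
        self.columns = columns

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X):
        return np.asarray(X)[:, list(self.columns)]


class XGBoostClassifierTransformer(TransformerMixin, BaseEstimator):
    """Fit a classifier and expose its positive-class probability as a feature."""

    def __init__(self, estimator=None):
        self.estimator = estimator  # hypothetical parameter, not in the original answer

    def fit(self, X, y):
        if self.estimator is None:
            from xgboost import XGBClassifier  # imported lazily, only if needed
            self.model_ = XGBClassifier()
        else:
            self.model_ = clone(self.estimator)
        self.model_.fit(X, y)
        return self

    def transform(self, X):
        # Column 1 of predict_proba is P(class == 1) in a binary problem;
        # keep it 2-D so FeatureUnion can stack it next to f1.
        return self.model_.predict_proba(X)[:, [1]]


# Quick check on synthetic data; the seven columns stand in for f1..f7.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 7))
y = (X[:, 0] + X[:, 3] > 0).astype(int)

f1_only = FeatureSelector(columns=(0,)).fit_transform(X, y)
stand_in = GradientBoostingClassifier(random_state=0)  # stand-in for XGBClassifier
out = XGBoostClassifierTransformer(stand_in).fit_transform(X, y)
```

Both transformers return a 2-D array with one column, which is the shape FeatureUnion expects when it concatenates their outputs side by side.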
FeatureUnion calls transform() on each of its internal parts and then concatenates the outputs. The pipeline takes this combined output and passes it to the next step, i.e. LogisticRegression.
The data flow looks like this:
                  X --> final_classifier, Pipeline
                              |
                              |  <== X is passed to FeatureUnion
                              \/
                    features, FeatureUnion
                              |
                              |  <== X is duplicated and passed to both parts
                ______________|__________________
               |                                 |
               |                                 |
               \/                                \/
      f1, FeatureSelector          out, XGBoostClassifierTransformer
               |                                 |
               |<= Only f1 is selected from X    | <= All features are used in XGBoost
               |                                 |
               \/________________________________\/
                              |
                              |
                              \/
                    clf, LogisticRegression
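Because the whole chain is one Pipeline object, it can be fitted and then written to a single file. The sketch below assumes joblib for persistence. It repeats compact versions of the two wrappers so that it runs on its own; their `columns` and `estimator` parameters are illustrative assumptions, not part of the original answer. The file name is hypothetical, and GradientBoostingClassifier is again used as a stand-in for `xgboost.XGBClassifier`.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline


class FeatureSelector(TransformerMixin, BaseEstimator):
    def __init__(self, columns=(0,)):  # assumption: f1 is column 0
        self.columns = columns

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.asarray(X)[:, list(self.columns)]


class XGBoostClassifierTransformer(TransformerMixin, BaseEstimator):
    def __init__(self, estimator=None):  # hypothetical parameter
        self.estimator = estimator

    def fit(self, X, y):
        if self.estimator is None:
            from xgboost import XGBClassifier  # imported lazily, only if needed
            self.model_ = XGBClassifier()
        else:
            self.model_ = clone(self.estimator)
        self.model_.fit(X, y)
        return self

    def transform(self, X):
        return self.model_.predict_proba(X)[:, [1]]  # P(class == 1), kept 2-D


# Synthetic data; the seven columns stand in for f1..f7.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 7))
y = (X[:, 0] + X[:, 3] > 0).astype(int)

stand_in = GradientBoostingClassifier(random_state=0)  # stand-in for XGBClassifier
final_classifier = Pipeline([
    ('features', FeatureUnion([
        ('f1', FeatureSelector(columns=(0,))),
        ('out', XGBoostClassifierTransformer(stand_in)),
    ])),
    ('clf', LogisticRegression()),
])
final_classifier.fit(X, y)  # fits the boosted model first, then LogisticRegression on [f1, out]

# One file holds both fitted models.
path = os.path.join(tempfile.mkdtemp(), 'final_classifier.joblib')
joblib.dump(final_classifier, path)

restored = joblib.load(path)
proba = restored.predict_proba(X[:5])  # one row per sample, one column per class
```

The saved file stores references to the custom classes rather than their source code, so the process that loads it in production must be able to import FeatureSelector and XGBoostClassifierTransformer from the same module path, ideally with matching library versions.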