Use predicted probability of one model to train another model and save as one single model


Question

I have an XGBoost model that I use for binary classification. It uses the features f1, f2, f3, f4, f5, f6, f7.

I want to add a second model, a LogisticRegression from sklearn, that takes the XGBoost model's output together with one of its features: it should predict from f1 and out, where out is the prediction made by the XGBoost model.

I want to save these two models in a single file so that I can use them for prediction in production.

How can I do this?

Recommended answer

You need a combination of FeatureUnion and Pipeline to achieve this.

Something like this:

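from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

# FeatureSelector and XGBoostClassifierTransformer are custom wrappers
# that you define yourself; they are not part of sklearn or xgboost.
final_classifier = Pipeline([
    ('features', FeatureUnion([
        ('f1', FeatureSelector()),                 # passes f1 through
        ('out', XGBoostClassifierTransformer()),   # emits the XGBoost prediction
    ])),
    ('clf', LogisticRegression()),                 # predicts from [f1, out]
])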

Here, FeatureSelector() and XGBoostClassifierTransformer() are custom wrappers that you write yourself. Each one needs to implement fit() and transform(), with transform() returning the output you want to send to the next part of the pipeline.

FeatureUnion calls transform() on each of its internal parts and then combines the outputs column-wise. The Pipeline takes this combined output and sends it to the next part, i.e. LogisticRegression.
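To make the combining step concrete, here is a small sketch that is independent of XGBoost. The two helper functions, first_column and row_sum, are hypothetical stand-ins for the custom wrappers; they are wrapped in sklearn's FunctionTransformer only so that FeatureUnion has something to call.

```python
import numpy as np
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import FunctionTransformer


def first_column(X):
    # Stand-in for a feature selector: keep column 0 as a 2-D array.
    return X[:, [0]]


def row_sum(X):
    # Stand-in for a model output: one derived value per row.
    return X.sum(axis=1, keepdims=True)


X = np.arange(12, dtype=float).reshape(4, 3)

union = FeatureUnion([
    ('first_col', FunctionTransformer(first_column)),
    ('row_sum', FunctionTransformer(row_sum)),
])

# Each part sees the full X; the results are stacked side by side.
combined = union.fit_transform(X)
print(combined.shape)
```

Each part receives all three columns of X and returns one column, so the combined array has one row per sample and one column per part.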

The data flow looks like this:

X --> final_classifier, Pipeline
            |
            |  <== X is passed to FeatureUnion
            \/
      features, FeatureUnion
                      |
                      |  <== X is duplicated and passed to both parts
        ______________|__________________
       |                                 |
       |                                 |                         
       \/                               \/
   f1, FeatureSelector                out, XGBoostClassifierTransformer
           |                                          |   
           |<= Only f1 is selected from X             | <= All features are used in XGBoost
           |                                          |
           \/________________________________________\/
                                      |
                                      |
                                     \/
                                   clf, LogisticRegression
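The answer leaves the two wrappers and the saving step to the reader, so the following is one possible sketch rather than the answerer's own implementation. Assumptions made here: f1 is column 0 of the input array, out is the positive-class probability from predict_proba, the data is synthetic, the columns and n_estimators parameters are illustrative, and joblib is used for persistence. If xgboost is not installed, the sketch falls back to sklearn's GradientBoostingClassifier purely so that it still runs.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

try:
    from xgboost import XGBClassifier
except ImportError:
    # Fallback so the sketch runs without xgboost; not the model from the question.
    from sklearn.ensemble import GradientBoostingClassifier as XGBClassifier


class FeatureSelector(TransformerMixin, BaseEstimator):
    """Pass through only the chosen columns (assumed: column 0 is f1)."""

    def __init__(self, columns=(0,)):
        self.columns = columns

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X):
        return np.asarray(X)[:, list(self.columns)]


class XGBoostClassifierTransformer(TransformerMixin, BaseEstimator):
    """Fit a classifier and expose its positive-class probability as a feature."""

    def __init__(self, n_estimators=50):
        self.n_estimators = n_estimators

    def fit(self, X, y):
        self.model_ = XGBClassifier(n_estimators=self.n_estimators)
        self.model_.fit(X, y)
        return self

    def transform(self, X):
        # Column 1 of predict_proba is P(class == 1); keep it 2-D: (n_samples, 1).
        return self.model_.predict_proba(X)[:, [1]]


# Synthetic stand-in for the real data: 7 columns playing the role of f1..f7.
X, y = make_classification(n_samples=200, n_features=7, random_state=0)

final_classifier = Pipeline([
    ('features', FeatureUnion([
        ('f1', FeatureSelector(columns=(0,))),
        ('out', XGBoostClassifierTransformer()),
    ])),
    ('clf', LogisticRegression()),
])
final_classifier.fit(X, y)

# The pipeline owns both models, so a single dump stores both in one file.
path = os.path.join(tempfile.gettempdir(), 'final_classifier.joblib')
joblib.dump(final_classifier, path)

# In production: load once, then predict on rows with the same 7 columns.
loaded = joblib.load(path)
probabilities = loaded.predict_proba(X[:5])
print(probabilities.shape)
```

Because joblib pickles objects by reference to their classes, the process that loads the file must be able to import FeatureSelector and XGBoostClassifierTransformer from the same module path used when saving. Keeping the two wrappers in a small shared module is one way to arrange that.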
