scikit-learn SGDClassifier partial_fit does not learn incrementally. Returns "classes should include all valid labels"

Views: 77 | Posted: 2021/5/31 18:41:20 | Tags: python, machine-learning, scikit-learn


Problem description

I pass two streams of data to an SGDClassifier (sgd_clf), as shown in the code below. The first partial_fit call takes the first stream, x1, y1. The second partial_fit call takes the second stream, x2, y2.

The code below fails at the second partial_fit call with an error saying that the class labels must be declared beforehand. The error goes away when I add all the data from x2, y2 to x1, y1, because every class label is then known before the second partial_fit call.

However, I cannot supply the x2, y2 data in advance. If I had to provide all my data before the first partial_fit(), why would I need a second partial_fit() at all? In fact, if I knew all the data beforehand, I would not need partial_fit(); I could just call fit().

import numpy as np
from sklearn import linear_model


def train_new_data():
    sgd_clf = linear_model.SGDClassifier()

    # First stream: two samples with labels 5 and 6.
    x1 = [[8, 9], [20, 22]]
    y1 = [5, 6]

    # Only the labels seen so far: [5, 6].
    classes = np.unique(y1)

    sgd_clf.partial_fit(x1, y1, classes=classes)

    # Second stream: one sample with a new label, 8.
    x2 = [10, 12]
    y2 = 8

    # Error reported here: label 8 is not in classes.
    sgd_clf.partial_fit([x2], [y2], classes=classes)

    return sgd_clf


if __name__ == "__main__":
    print(train_new_data().predict([[20, 22]]))
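
One way to read the error is that the classes argument on the first call has to list every label the stream may ever produce, not only the labels present in the first batch. The following is a minimal sketch under that assumption: the label set [5, 6, 8] is assumed to be known before any data arrives, and all_classes is a name introduced here for illustration.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Assumption: the full label set is known up front,
# even though the samples themselves arrive later.
all_classes = np.array([5, 6, 8])

sgd_clf = SGDClassifier(random_state=0)

# First batch contains only labels 5 and 6, but declares all three classes.
x1 = [[8, 9], [20, 22]]
y1 = [5, 6]
sgd_clf.partial_fit(x1, y1, classes=all_classes)

# Second batch introduces label 8; classes can be omitted after the first call.
x2 = [[10, 12]]
y2 = [8]
sgd_clf.partial_fit(x2, y2)

print(sgd_clf.classes_)              # the three declared classes
print(sgd_clf.predict([[20, 22]]))   # one of the declared labels
```

Only the labels need to be known in advance here, not the x2, y2 samples themselves.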

Q1: Is my understanding of partial_fit() for sklearn classifiers wrong? I expected it to take data on the fly, as described here: Incremental Learning

Q2: I want to retrain or update a model with new data, without training from scratch. Will partial_fit help me with this?

Q3: I am not tied to SGDClassifier; I can use any algorithm that supports online/batch learning. My main intention is Q3. I have a model trained on thousands of images. I do not want to retrain it from scratch just because I have one or two new image samples. Nor am I interested in creating a new model for each new entry and then combining them all, since having to search across all the trained models would hurt prediction performance. I just want to add these new data instances to the trained model with partial_fit. Is this feasible?

Q4: If I cannot achieve Q2 with scikit-learn classifiers, please point me to how I can achieve this.
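
The update described in Q2 and Q3 can be sketched with any estimator that exposes partial_fit. The snippet below is a hedged illustration only: it assumes the complete label set is declared on the first call, uses made-up two-feature samples in place of real image features, and uses pickle as a stand-in for however the trained model is actually stored. The names X_old, X_new, stored, and restored are hypothetical.

```python
import pickle

import numpy as np
from sklearn.linear_model import SGDClassifier

# Hypothetical stand-in data: two features per sample instead of image features.
rng = np.random.RandomState(0)
X_old = rng.rand(100, 2)
y_old = rng.randint(0, 3, size=100)  # labels 0, 1, 2

# Initial training; every label that may ever appear is declared here.
model = SGDClassifier(random_state=0)
model.partial_fit(X_old, y_old, classes=np.array([0, 1, 2]))

# Store the trained model, then restore it later when new samples arrive.
stored = pickle.dumps(model)
restored = pickle.loads(stored)

# Update the restored model with two new samples, without retraining from scratch.
X_new = np.array([[0.2, 0.9], [0.8, 0.1]])
y_new = np.array([1, 2])
restored.partial_fit(X_new, y_new)

# The update counter advances by the number of new samples;
# the class list is unchanged.
print(model.t_, restored.t_)
print(restored.classes_)
```

The original model object is left untouched in this sketch, so the stored and updated versions can be compared before one replaces the other.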

Any suggestions, ideas, or references are much appreciated.
