scikit multilabel classification: ValueError: bad input shape

Views: 221  Posted: 2020/5/4 9:53:07  Tags: machine-learning classification scikit-learn stochastic-process

Question

I believe SGDClassifier() with loss='log' supports multilabel classification, so I should not have to use OneVsRestClassifier. Check this

Now, my dataset is quite big, so I am using HashingVectorizer and passing the result as input to SGDClassifier. My target has 42048 features.

When I run this as follows:

clf.partial_fit(X_train_batch, y)

I get: ValueError: bad input shape (300000, 42048).

I have also passed classes as a parameter, as follows, but I still get the same problem.

clf.partial_fit(X_train_batch, y, classes=np.arange(42048))

The documentation of SGDClassifier says y : numpy array of shape [n_samples]

Recommended answer

No, SGDClassifier does not do multilabel classification. It does multiclass classification, which is a different problem, although both are solved using a one-vs-all problem reduction.
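To make the multiclass versus multilabel distinction concrete, here is a minimal sketch on made-up toy data (the arrays and variable names are illustrative assumptions, not the asker's dataset). It uses the default loss because the name of the logistic loss differs between scikit-learn versions. A 1-D y with one class per sample is accepted, while a 2-D y with one column per label should be rejected with a ValueError, as in the question.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Toy data (hypothetical): 6 samples, 4 features.
rng = np.random.RandomState(0)
X = rng.rand(6, 4)

# Multiclass: exactly one class per sample, so y is 1-D with shape (6,).
y_multiclass = np.array([0, 1, 2, 0, 1, 2])
clf = SGDClassifier(random_state=0)
clf.partial_fit(X, y_multiclass, classes=np.arange(3))

# Multilabel: several labels per sample, so Y is 2-D with shape (6, 3).
Y_multilabel = np.array([
    [1, 0, 0],
    [0, 1, 1],
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
])

try:
    SGDClassifier(random_state=0).partial_fit(
        X, Y_multilabel, classes=np.arange(3)
    )
    rejected = False
except ValueError as err:
    # The exact wording of the message depends on the scikit-learn version.
    rejected = True
    print("ValueError:", err)
```

Passing classes does not change the outcome, because that argument only lists the possible class values; it does not alter the shape that y must have.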

Furthermore, neither SGDClassifier nor OneVsRestClassifier.fit will accept a sparse matrix for y. The former wants an array of labels, as you've already found out. The latter wants, for multilabel purposes, a list of lists of labels, e.g.

y = [[1], [2, 3], [1, 3]]

to denote that X[0] has label 1, X[1] has labels {2,3} and X[2] has labels {1,3}.
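As a hedged sketch of how the label lists above could be fed to a one-vs-rest model: newer scikit-learn releases no longer accept the list-of-lists format directly, so this example first converts it to a binary indicator matrix with MultiLabelBinarizer. The three documents and the n_features value are made-up assumptions for illustration, and the default SGDClassifier loss is used rather than loss='log', since the name of the logistic loss depends on the installed version.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy corpus; one document per entry in `labels`.
docs = [
    "gradient descent on sparse text features",
    "hashing trick for large vocabularies",
    "one versus rest reduction for many labels",
]

# The label lists from the answer: X[0] -> {1}, X[1] -> {2, 3}, X[2] -> {1, 3}.
labels = [[1], [2, 3], [1, 3]]

# Convert the label lists to a binary indicator matrix, one column per label.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# HashingVectorizer is stateless, so transform() can be called without fit().
X = HashingVectorizer(n_features=2 ** 10).transform(docs)

# One binary SGDClassifier is trained per label column.
clf = OneVsRestClassifier(SGDClassifier(random_state=0))
clf.fit(X, Y)

pred = clf.predict(X)                      # indicator matrix, one row per document
pred_labels = mlb.inverse_transform(pred)  # back to tuples of label values
```

Here mlb.classes_ records which label each column of Y stands for, so predictions can be mapped back to the original label values with inverse_transform.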
