SGDClassifier with class_weight=auto fails on scikit-learn 0.15 but not 0.14

Problem description

When I train a scikit-learn v0.15 SGDClassifier with these options: SGDClassifier(loss='log', class_weight=None, penalty='l2'), training completes with no error. Yet when I train the same classifier with class_weight='auto' on scikit-learn v0.15, I get this error:

  return self.model.fit(X, y)
  File "/home/rose/.local/lib/python2.7/site-packages/scikit_learn-0.15.0b1-py2.7-linux-x86_64.egg/sklearn/linear_model/stochastic_gradient.py", line 485, in fit
    sample_weight=sample_weight)
  File "/home/rose/.local/lib/python2.7/site-packages/scikit_learn-0.15.0b1-py2.7-linux-x86_64.egg/sklearn/linear_model/stochastic_gradient.py", line 389, in _fit
    classes, sample_weight, coef_init, intercept_init)
  File "/home/rose/.local/lib/python2.7/site-packages/scikit_learn-0.15.0b1-py2.7-linux-x86_64.egg/sklearn/linear_model/stochastic_gradient.py", line 336, in _partial_fit
    y_ind)
  File "/home/rose/.local/lib/python2.7/site-packages/scikit_learn-0.15.0b1-py2.7-linux-x86_64.egg/sklearn/utils/class_weight.py", line 43, in compute_class_weight
    raise ValueError("classes should have valid labels that are in y")
ValueError: classes should have valid labels that are in y

What could cause it?

For reference, here's the documentation on class_weight:

Preset for the class_weight fit parameter. Weights associated with classes. If not given, all classes are supposed to have weight one. The "auto" mode uses the values of y to automatically adjust weights inversely proportional to class frequencies.
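To make "inversely proportional to class frequencies" concrete, here is a minimal plain-Python sketch. The helper name inverse_frequency_weights is hypothetical, and the rescaling so that the weights average to 1 is an assumption for illustration; the exact normalization scikit-learn applies may differ between versions.

```python
from collections import Counter


def inverse_frequency_weights(y):
    """Return {label: weight}, with weights inversely proportional to class counts.

    Assumption: weights are rescaled so that they average to 1 across classes.
    This is an illustrative normalization, not necessarily scikit-learn's.
    """
    counts = Counter(y)
    # Raw weight of each class is the reciprocal of its frequency.
    raw = {label: 1.0 / n for label, n in counts.items()}
    # Rescale so the mean weight over classes equals 1.
    scale = len(raw) / sum(raw.values())
    return {label: w * scale for label, w in raw.items()}


# Toy labels: one "spam" sample and three "ham" samples.
y = ["spam", "ham", "ham", "ham"]
weights = inverse_frequency_weights(y)
```

With these toy labels the rare class gets three times the weight of the common one, which is the effect the "auto" mode is described as aiming for.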

Recommended answer

I think this may be a bug within scikit-learn. As a workaround, try the following:

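from sklearn.preprocessing import LabelEncoder

# Map the original labels to consecutive integers 0..n_classes-1
le = LabelEncoder()
y_encoded = le.fit_transform(y)
# Train on the encoded labels instead of the raw ones
self.model.fit(X, y_encoded)
# Convert the predictions back to the original labels
pred = le.inverse_transform(self.model.predict(X))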
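For readers unfamiliar with LabelEncoder, the round trip the workaround relies on can be sketched in plain Python. The helper names fit_label_encoder, encode, and decode are hypothetical and only mimic the behaviour (sorted unique labels mapped to integer indices); this is not scikit-learn's implementation.

```python
def fit_label_encoder(y):
    """Return (classes, index_of): sorted unique labels and a label -> index map."""
    classes = sorted(set(y))
    index_of = {label: i for i, label in enumerate(classes)}
    return classes, index_of


def encode(y, index_of):
    """Replace each label with its integer index."""
    return [index_of[label] for label in y]


def decode(indices, classes):
    """Map integer indices back to the original labels."""
    return [classes[i] for i in indices]


# Toy labels standing in for the y passed to the classifier.
y = ["cat", "dog", "cat", "bird"]
classes, index_of = fit_label_encoder(y)
y_encoded = encode(y, index_of)          # integers in 0..n_classes-1
y_decoded = decode(y_encoded, classes)   # back to the original labels
```

The classifier then only ever sees consecutive integer labels, and decoding restores the original labels for anything it predicts.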
