ValueError: Can't handle mix of multilabel-indicator and binary

Question

I am using Keras with the scikit-learn wrapper. In particular, I want to use GridSearchCV for hyper-parameter optimisation.

This is a multi-class problem, i.e. the target variable can have only one label, chosen from a set of n classes. For instance, the target variable can be 'Class1', 'Class2' ... 'Classn'.

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

# self._arch builds the Keras model
nn = KerasClassifier(build_fn=self._arch, verbose=0)
clf = GridSearchCV(
    nn,
    param_grid={ ... },
    # macro-averaged F1 score
    scoring='f1_macro',
    n_jobs=-1)

# self.fX is the data matrix
# self.fy_enc is the one-hot-encoded target variable
clf.fit(self.fX.values, self.fy_enc.values)

The problem is that, when the score is computed during cross-validation, the true labels for the validation samples are one-hot encoded, while the predictions for some reason collapse to binary labels (when the target variable has only two classes). For instance, this is the last part of the stack trace:

...........................................................................
/Users/fbrundu/.pyenv/versions/3.6.0/lib/python3.6/site-packages/sklearn/metrics/classification.py in _check_targets(y_true=array([[ 0.,  1.],
       [ 0.,  1.],
       [ 0... 0.,  1.],
       [ 0.,  1.],
       [ 0.,  1.]]), y_pred=array([1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1,...0, 1, 0, 0, 1, 0, 0, 0,
       0, 0, 0, 0, 1, 1]))
     77     if y_type == set(["binary", "multiclass"]):
     78         y_type = set(["multiclass"])
     79
     80     if len(y_type) > 1:
     81         raise ValueError("Can't handle mix of {0} and {1}"
---> 82                          "".format(type_true, type_pred))
        type_true = 'multilabel-indicator'
        type_pred = 'binary'
     83
     84     # We can't have more than one value on y_type => The set is no more needed
     85     y_type = y_type.pop()
     86

ValueError: Can't handle mix of multilabel-indicator and binary
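
The mismatch in the trace can be reproduced outside of Keras. The following is a minimal sketch, assuming a two-class problem; the arrays y_true_onehot and y_pred are made-up illustrative data, not taken from the question. It uses scikit-learn's type_of_target to show how each array is classified, then calls f1_score on the pair to trigger the same kind of error.

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.utils.multiclass import type_of_target

# Hypothetical two-class data: one-hot true labels, 1-D predicted class indices
y_true_onehot = np.array([[0, 1], [0, 1], [1, 0], [0, 1]])
y_pred = np.array([1, 1, 0, 0])

# scikit-learn treats a 2-D 0/1 matrix as a multilabel indicator,
# and a 1-D array with two distinct values as binary
true_type = type_of_target(y_true_onehot)
pred_type = type_of_target(y_pred)
print(true_type, pred_type)

# Scoring the two together fails because the target types differ
try:
    f1_score(y_true_onehot, y_pred, average="macro")
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```

The exact wording of the error message may differ between scikit-learn versions, but the cause is the same: the scorer receives two arrays that it classifies as different target types.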

How can I instruct Keras/sklearn to give back predictions in one-hot encoding?

Recommended answer

Following Vivek's comment, I used the original (not one-hot-encoded) target array and configured the loss in my Keras model as sparse_categorical_crossentropy (see the code below), as per the comments on this issue.

arch.compile(
  optimizer='sgd',
  loss='sparse_categorical_crossentropy',
  metrics=['accuracy'])
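
If only the one-hot matrix is at hand, integer class labels can be recovered from it before fitting. This is a minimal sketch, assuming each row of the matrix has exactly one 1; the name y_onehot is hypothetical and stands in for something like self.fy_enc.values in the question.

```python
import numpy as np

# Hypothetical one-hot targets for a two-class problem
y_onehot = np.array([[0., 1.], [1., 0.], [0., 1.], [0., 1.]])

# The column index of the 1 in each row is the integer class label
y_int = np.argmax(y_onehot, axis=1)
print(y_int)
```

The resulting 1-D array has the same shape as the predictions seen in the stack trace, so the scorer no longer compares a multilabel indicator against binary labels.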
