Why is binary_crossentropy more accurate than categorical_crossentropy for multiclass classification in Keras?

Problem description

I'm learning how to create convolutional neural networks using Keras. I'm trying to get a high accuracy for the MNIST dataset.

Apparently categorical_crossentropy is for more than 2 classes and binary_crossentropy is for 2 classes. Since there are 10 digits, I should be using categorical_crossentropy. However, after training and testing dozens of models, binary_crossentropy consistently outperforms categorical_crossentropy significantly.
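To see what each loss actually computes, here is a small sketch with hypothetical numbers in plain NumPy (mirroring, not calling, the Keras backend): for a one-hot target, categorical_crossentropy is just the negative log-probability of the true class, while binary_crossentropy averages ten independent binary cross-entropies, one per output unit.

```python
import numpy as np

# Hypothetical softmax output for a single sample; the true class is index 2.
y_true = np.array([0., 0., 1., 0., 0., 0., 0., 0., 0., 0.])
y_pred = np.array([0.05, 0.05, 0.60, 0.05, 0.05, 0.05, 0.05, 0.05, 0.03, 0.02])

# categorical_crossentropy: -log of the probability assigned to the true class.
cce = -np.log(y_pred[np.argmax(y_true)])

# binary_crossentropy: mean of 10 per-unit binary cross-entropies,
# treating each output as an independent yes/no problem.
bce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(cce)  # ~0.51
print(bce)  # ~0.09
```

Both are legitimate loss functions; they simply optimize different objectives, so the accuracy metric reported alongside them also needs to match the actual classification problem.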

On Kaggle, I got 99+% accuracy using binary_crossentropy and 10 epochs. Meanwhile, I can't get above 97% using categorical_crossentropy, even using 30 epochs (which isn't much, but I don't have a GPU, so training takes forever).

Here's what my model looks like now:

from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()
model.add(Convolution2D(100, 5, 5, border_mode='valid', input_shape=(28, 28, 1), init='glorot_uniform', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(100, 3, 3, init='glorot_uniform', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))
model.add(Flatten())
model.add(Dense(100, init='glorot_uniform', activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(100, init='glorot_uniform', activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(10, init='glorot_uniform', activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='adamax', metrics=['accuracy'])

Answer

Short answer: it is not.

To see that, simply try to compute the accuracy "by hand"; you will see that it differs from the one reported by Keras with the model.evaluate method:

# Keras reported accuracy:
score = model.evaluate(x_test, y_test, verbose=0) 
score[1]
# 0.99794011611938471

# Actual accuracy calculated manually:
import numpy as np
y_pred = model.predict(x_test)
acc = sum([np.argmax(y_test[i])==np.argmax(y_pred[i]) for i in range(10000)])/10000
acc
# 0.98999999999999999

The reason it seems so is a rather subtle issue in how Keras guesses which accuracy to use, depending on the loss function you have selected, when you simply include metrics=['accuracy'] in your model compilation.

If you check the source code, Keras does not define a single accuracy metric, but several different ones, among them binary_accuracy and categorical_accuracy. What happens under the hood is that, since you have selected binary cross entropy as your loss function and have not specified a particular accuracy metric, Keras (wrongly...) infers that you are interested in the binary_accuracy, and this is what it returns.
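To make the gap concrete, here is a hypothetical single-sample sketch in plain NumPy of what the two metrics compute (binary_accuracy rounds every output and compares elementwise; categorical_accuracy compares argmaxes): a misclassified one-hot sample can still score 0.8 on binary_accuracy, because most of its 10 entries correctly round to 0.

```python
import numpy as np

# Hypothetical prediction whose argmax (index 0) misses the true class (index 2).
y_true = np.array([0., 0., 1., 0., 0., 0., 0., 0., 0., 0.])
y_pred = np.array([0.60, 0.05, 0.30, 0.05, 0., 0., 0., 0., 0., 0.])

# categorical_accuracy: 1.0 only if the argmaxes match -- here they do not.
cat_acc = float(np.argmax(y_true) == np.argmax(y_pred))

# binary_accuracy: fraction of the 10 outputs whose rounded value equals the label.
bin_acc = float(np.mean(np.round(y_pred) == y_true))

print(cat_acc)  # 0.0
print(bin_acc)  # 0.8  (8 of 10 entries match, even though the class is wrong)
```

This averaging over the mostly-zero entries of a one-hot vector is exactly why the reported binary_accuracy looks several points higher than the true classification accuracy.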

To avoid that, i.e. to use indeed binary cross entropy as your loss function (nothing wrong with this, in principle) while still getting the categorical accuracy required by the problem at hand (i.e. MNIST classification), you should ask explicitly for categorical_accuracy in the model compilation as follows:

from keras.metrics import categorical_accuracy
model.compile(loss='binary_crossentropy', optimizer='adamax', metrics=[categorical_accuracy])

And after training, scoring, and predicting the test set as shown above, the two metrics are now the same, as they should be:

sum([np.argmax(y_test[i])==np.argmax(y_pred[i]) for i in range(10000)])/10000 == score[1]
# True

(HT to this great answer to a similar problem, which helped me understand the issue...)

UPDATE: After my post, I discovered that this issue had already been identified in this answer.
