Why is binary_crossentropy more accurate than categorical_crossentropy for multiclass classification in Keras?


Question

I'm learning how to create convolutional neural networks using Keras. I'm trying to get a high accuracy for the MNIST dataset.

Apparently categorical_crossentropy is for more than 2 classes and binary_crossentropy is for 2 classes. Since there are 10 digits, I should be using categorical_crossentropy. However, after training and testing dozens of models, binary_crossentropy consistently outperforms categorical_crossentropy significantly.
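
To make the distinction concrete, here is a minimal numpy sketch (my own illustration with made-up numbers, not code from the question) of what the two losses compute for the same one-hot target:

import numpy as np

# One-hot target for digit 3 and a softmax-style prediction
y_true = np.zeros(10)
y_true[3] = 1.0
y_pred = np.full(10, 0.05)
y_pred[3] = 0.55  # the ten probabilities sum to 1.0

# Categorical cross entropy: a single term, for the true class only
cce = -np.sum(y_true * np.log(y_pred))               # ~0.598

# Binary cross entropy: ten independent yes/no terms, averaged
bce = -np.mean(y_true * np.log(y_pred)
               + (1 - y_true) * np.log(1 - y_pred))  # ~0.106

Both losses are minimized by the same correct prediction; binary_crossentropy simply treats each of the 10 output units as an independent binary problem.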

On Kaggle, I got 99+% accuracy using binary_crossentropy and 10 epochs. Meanwhile, I can't get above 97% using categorical_crossentropy, even using 30 epochs (which isn't much, but I don't have a GPU, so training takes forever).

Here's what my model looks like now:

# Keras 1.x API (positional kernel sizes, border_mode, init)
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()
# Two convolution + max-pooling blocks
model.add(Convolution2D(100, 5, 5, border_mode='valid', input_shape=(28, 28, 1), init='glorot_uniform', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(100, 3, 3, init='glorot_uniform', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))
model.add(Flatten())
# Two fully connected layers with dropout, then a 10-way softmax
model.add(Dense(100, init='glorot_uniform', activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(100, init='glorot_uniform', activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(10, init='glorot_uniform', activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='adamax', metrics=['accuracy'])
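
For reference, a training call consistent with this model might look like the following; the preprocessing and batch size here are my assumptions (Keras 1.x API), not the asker's exact script:

from keras.datasets import mnist
from keras.utils import np_utils

# Load MNIST, reshape to (n, 28, 28, 1), and scale pixels to [0, 1]
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype('float32') / 255
x_test = x_test.reshape(-1, 28, 28, 1).astype('float32') / 255

# One-hot targets, as required by the 10-unit softmax output
y_train = np_utils.to_categorical(y_train, 10)
y_test = np_utils.to_categorical(y_test, 10)

model.fit(x_train, y_train, batch_size=128, nb_epoch=10, verbose=1)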

Answer

Short answer: it is not.

To see that, simply try to calculate the accuracy "by hand", and you will see that it is different from the one reported by Keras with the model.evaluate method:

# Keras reported accuracy:
score = model.evaluate(x_test, y_test, verbose=0) 
score[1]
# 0.99794011611938471

# Actual accuracy calculated manually:
import numpy as np
y_pred = model.predict(x_test)
acc = sum([np.argmax(y_test[i])==np.argmax(y_pred[i]) for i in range(10000)])/10000
acc
# 0.98999999999999999

The reason for this apparent discrepancy is a rather subtle issue with how Keras actually guesses which accuracy to use, depending on the loss function you have selected, when you include simply metrics=['accuracy'] in your model compilation.

If you check the source code, Keras does not define a single accuracy metric, but several different ones, among them binary_accuracy and categorical_accuracy. What happens under the hood is that, since you have selected binary cross entropy as your loss function and have not specified a particular accuracy metric, Keras (wrongly...) infers that you are interested in the binary_accuracy, and this is what it returns.
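
Roughly, the two metrics reduce to the following numpy logic (a simplified sketch of the Keras definitions, with made-up example data):

import numpy as np

def binary_accuracy(y_true, y_pred):
    # Element-wise: rounds each of the 10 outputs and compares it
    # to the corresponding target bit, then averages over all bits
    return np.mean(np.equal(y_true, np.round(y_pred)))

def categorical_accuracy(y_true, y_pred):
    # Row-wise: one correct/incorrect decision per sample
    return np.mean(np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1))

# With one-hot targets, binary_accuracy is inflated: a misclassified
# sample still gets 8 of its 10 bits "right"
y_true = np.eye(10)[[3, 7]]               # two one-hot labels
y_pred = np.eye(10)[[3, 1]] * 0.9 + 0.01  # one right, one wrong
print(binary_accuracy(y_true, y_pred))       # 0.9
print(categorical_accuracy(y_true, y_pred))  # 0.5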

To avoid that, i.e. to use indeed binary cross entropy as your loss function (nothing wrong with this, in principle) while still getting the categorical accuracy required by the problem at hand (i.e. MNIST classification), you should ask explicitly for categorical_accuracy in the model compilation as follows:

from keras.metrics import categorical_accuracy
model.compile(loss='binary_crossentropy', optimizer='adamax', metrics=[categorical_accuracy])

And after training, scoring, and predicting the test set as I show above, the two metrics are now the same, as they should be:

sum([np.argmax(y_test[i])==np.argmax(y_pred[i]) for i in range(10000)])/10000 == score[1]
# True

(HT to this great answer to a similar problem, which helped me understand the issue...)

UPDATE: After my post, I discovered that this issue had already been identified in this answer.
