Why binary_crossentropy and categorical_crossentropy give different performances for the same problem?


Problem Description

I'm trying to train a CNN to categorize text by topic. When I use binary cross-entropy I get ~80% accuracy; with categorical cross-entropy I get ~50% accuracy.

I don't understand why this is. It's a multiclass problem; doesn't that mean that I have to use categorical cross-entropy, and that the results with binary cross-entropy are meaningless?

from keras.models import Sequential
from keras.layers import (Activation, Conv1D, Dense, Dropout, Flatten,
                          MaxPooling1D)

model = Sequential()
model.add(embedding_layer)  # embedding_layer is defined elsewhere
model.add(Dropout(0.25))
# convolution layers
model.add(Conv1D(filters=32,
                 kernel_size=4,
                 padding='valid',
                 activation='relu'))
model.add(MaxPooling1D(pool_size=2))
# dense layers
model.add(Flatten())
model.add(Dense(256))
model.add(Dropout(0.25))
model.add(Activation('relu'))
# output layer
model.add(Dense(len(class_id_index)))
model.add(Activation('softmax'))

Then I compile it either like this, using categorical_crossentropy as the loss function:

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

or like this, using binary_crossentropy:

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

Intuitively it makes sense why I'd want to use categorical cross-entropy; what I don't understand is why I get good results with binary and poor results with categorical.

Recommended Answer

The reason for this apparent performance discrepancy between categorical and binary cross-entropy is what user xtof54 has already reported in his answer below, i.e.:

the accuracy computed with the Keras method evaluate is just plain wrong when using binary_crossentropy with more than 2 labels

I would like to elaborate on this, demonstrate the actual underlying issue, explain it, and offer a remedy.

This behavior is not a bug; the underlying reason is a rather subtle and undocumented issue in how Keras actually guesses which accuracy to use, depending on the loss function you have selected, when you simply include metrics=['accuracy'] in your model compilation. In other words, while your first compilation option

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

is valid, your second one:

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

will not produce what you expect, but the reason is not the use of binary cross-entropy (which, at least in principle, is an absolutely valid loss function).
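
To see why binary cross-entropy is still a well-defined loss for one-hot multiclass targets, here is a minimal NumPy sketch (the numbers are hypothetical, not from the model above): it simply averages an independent per-class binary term, instead of taking a single categorical term over the softmax output.

import numpy as np

# One-hot target and a softmax-like prediction for a 4-class problem
# (hypothetical values, for illustration only).
y_true = np.array([0., 0., 1., 0.])
y_pred = np.array([0.05, 0.05, 0.85, 0.05])

# Categorical cross-entropy: a single term, driven by the true class only.
cat_ce = -np.sum(y_true * np.log(y_pred))

# Binary cross-entropy as Keras computes it: the mean of an independent
# binary term for each of the 4 outputs.
bin_ce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(cat_ce)  # ~0.163
print(bin_ce)  # ~0.079

Both quantities decrease as the prediction approaches the one-hot target, so both are usable losses; the trouble lies with the accuracy metric, not the loss.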

Why is that? If you check the metrics source code, Keras does not define a single accuracy metric, but several different ones, among them binary_accuracy and categorical_accuracy. What happens under the hood is that, since you have selected binary cross-entropy as your loss function and have not specified a particular accuracy metric, Keras (wrongly...) infers that you are interested in binary_accuracy, and this is what it returns, while in fact you are interested in categorical_accuracy.
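
A quick way to see how different these two metrics are is a small sketch in plain NumPy that mimics their definitions: binary_accuracy rounds each output independently and averages per-element matches, while categorical_accuracy compares argmaxes. The sample below is hypothetical, not taken from the model above.

import numpy as np

# One sample from a hypothetical 3-class problem: the argmax prediction
# (class 1) is wrong, and no output clears the 0.5 rounding threshold.
y_true = np.array([[1., 0., 0.]])
y_pred = np.array([[0.3, 0.4, 0.3]])

# binary_accuracy: round each output, average per-element matches.
binary_acc = np.mean(np.round(y_pred) == y_true)  # 2/3 ~ 0.67

# categorical_accuracy: did the argmax pick the right class?
categorical_acc = np.mean(np.argmax(y_pred, axis=-1)
                          == np.argmax(y_true, axis=-1))  # 0.0

A completely wrong prediction still earns a binary_accuracy of 0.67, which is exactly the kind of inflation seen in the MNIST demonstration below.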

Let's verify that this is the case, using the MNIST CNN example in Keras, with the following modification:

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])  # WRONG way

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=2,  # only 2 epochs, for demonstration purposes
          verbose=1,
          validation_data=(x_test, y_test))

# Keras reported accuracy:
score = model.evaluate(x_test, y_test, verbose=0) 
score[1]
# 0.9975801164627075

# Actual accuracy calculated manually:
import numpy as np
y_pred = model.predict(x_test)
acc = sum([np.argmax(y_test[i])==np.argmax(y_pred[i]) for i in range(10000)])/10000
acc
# 0.98780000000000001

score[1]==acc
# False    
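
As an aside, the manual loop above can equivalently be written as a vectorized NumPy one-liner; the result is identical:

acc = np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_test, axis=1))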

To remedy this, i.e. to use binary cross-entropy as your loss function (as I said, nothing wrong with this, at least in principle) while still getting the categorical accuracy required by the problem at hand, you should ask explicitly for categorical_accuracy in the model compilation, as follows:

from keras.metrics import categorical_accuracy
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=[categorical_accuracy])
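
If you prefer to avoid the import, Keras should also accept the metric by its string name; to the best of my knowledge this is an equivalent spelling:

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])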

In the MNIST example, after training, scoring, and predicting the test set as I show above, the two metrics now are the same, as they should be:

# Keras reported accuracy:
score = model.evaluate(x_test, y_test, verbose=0) 
score[1]
# 0.98580000000000001

# Actual accuracy calculated manually:
y_pred = model.predict(x_test)
acc = sum([np.argmax(y_test[i])==np.argmax(y_pred[i]) for i in range(10000)])/10000
acc
# 0.98580000000000001

score[1]==acc
# True    

System setup:

Python version 3.5.3
Tensorflow version 1.2.1
Keras version 2.0.4

UPDATE: After my post, I discovered that this issue had already been identified in this answer.

