What are logits? What is the difference between softmax and softmax_cross_entropy_with_logits?


Question

In the TensorFlow API docs they use a keyword called logits. What is it? A lot of methods are written like:

tf.nn.softmax(logits, name=None)

If logits is just a generic Tensor input, why is it named logits?

Secondly, what is the difference between the following two methods?

tf.nn.softmax(logits, name=None)
tf.nn.softmax_cross_entropy_with_logits(logits, labels, name=None)

I know what tf.nn.softmax does, but not the other. An example would be really helpful.

Answer

Logits simply means that the function operates on the unscaled output of earlier layers, and that the relative scale used to understand the units is linear. It means, in particular, that the sum of the inputs may not equal 1 and that the values are not probabilities (you might have an input of 5).
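For concreteness, here is a minimal sketch (the shapes and variable names are illustrative, not from the original answer) of where logits typically come from: they are the raw output of the last linear layer, before any softmax is applied.

import tensorflow as tf  # assuming the TF 1.x API that this answer was written against

x = tf.placeholder(tf.float32, [None, 4])    # a batch of input features
W = tf.Variable(tf.random_normal([4, 3]))    # weights of the final layer (3 classes)
b = tf.Variable(tf.zeros([3]))
logits = tf.matmul(x, W) + b    # unscaled scores: can be negative, rows need not sum to 1
probs = tf.nn.softmax(logits)   # only after softmax does each row sum to 1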

tf.nn.softmax produces just the result of applying the softmax function to an input tensor. The softmax "squishes" the inputs so that the outputs sum to 1: it's a way of normalizing. The shape of the output of softmax is the same as the input: it just normalizes the values. The outputs of softmax can be interpreted as probabilities.

import numpy as np
import tensorflow as tf  # TF 1.x style

a = tf.constant(np.array([[.1, .3, .5, .9]]))
with tf.Session() as sess:
    print(sess.run(tf.nn.softmax(a)))  # [[ 0.16838508  0.205666    0.25120102  0.37474789]]

In contrast, tf.nn.softmax_cross_entropy_with_logits computes the cross entropy of the result after applying the softmax function (but it does it all together in a more mathematically careful way). It's similar to the result of:

sm = tf.nn.softmax(logits)
ce = -tf.reduce_sum(labels * tf.log(sm), axis=1)  # naive per-example cross entropy

The cross entropy is a summary metric: it sums across the class dimension for each example. The output of tf.nn.softmax_cross_entropy_with_logits on a shape [2,5] logits tensor is of shape [2]: one cross-entropy value per example (the first dimension is treated as the batch).
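A minimal shape check along those lines, assuming the TF 1.x API (the logits and one-hot labels below are made up for illustration):

import numpy as np
import tensorflow as tf

logits = tf.constant(np.random.randn(2, 5), dtype=tf.float32)   # batch of 2, 5 classes
labels = tf.constant([[0., 0., 1., 0., 0.],
                      [0., 1., 0., 0., 0.]])                    # one-hot targets
loss = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
with tf.Session() as sess:
    print(sess.run(tf.shape(loss)))   # [2] -- one cross-entropy value per example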

If you want to do optimization to minimize the cross entropy AND you're softmaxing after your last layer, you should use tf.nn.softmax_cross_entropy_with_logits instead of doing it yourself, because it covers numerically unstable corner cases in the mathematically right way. Otherwise, you'll end up hacking it by adding little epsilons here and there.
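A rough illustration of that instability, assuming TF 1.x (the logit of 1000 is chosen only to force the naive version to break down):

import tensorflow as tf

logits = tf.constant([[1000., 0., 0.]])
labels = tf.constant([[1., 0., 0.]])
naive = -tf.reduce_sum(labels * tf.log(tf.nn.softmax(logits)), axis=1)  # 0 * log(0) -> nan
fused = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
with tf.Session() as sess:
    print(sess.run([naive, fused]))   # naive gives [nan], the fused op gives roughly [0.]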

Edited 2016-02-07: If you have single-class labels, where an object can only belong to one class, you might now consider using tf.nn.sparse_softmax_cross_entropy_with_logits so that you don't have to convert your labels to a dense one-hot array. This function was added after release 0.6.0.
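A short sketch of that variant, again in TF 1.x style (the class indices are made up; each label is an integer class id rather than a one-hot row):

import tensorflow as tf

logits = tf.constant([[2.0, 0.5, -1.0],
                      [0.1, 3.0,  0.2]])
class_ids = tf.constant([0, 1])   # integer labels, no one-hot encoding needed
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=class_ids, logits=logits)
with tf.Session() as sess:
    print(sess.run(loss))   # one loss per example, same values as the dense version
                            # would give with one-hot labels [[1,0,0],[0,1,0]]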

