Loss on masked tensors


Problem description

Suppose I have

[[4.3, -0.5, -2.7, 0, 0],
[0.5, 2.3, 0, 0, 0]]

where clearly the last two in the first example and last three in the second example are masked (that is, they are zero) and should not affect loss and gradient computations.

How do I compute the cross-entropy loss between these logits and the corresponding labels? For brevity, the labels for this example can be something like

[[1, 0, 0, 0, 0],
[0, 1, 0, 0, 0]]

(One issue: softmax followed by log on the logits would be applied to the masked zeros as well, and tf's cross-entropy method would include those elements in the loss.)

(Also, you can think about the problem like this: I have logits of varying lengths in a batch, i.e. my logits have lengths 3 and 2 for example 1 and example 2 respectively. The labels follow the same lengths.)
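(A minimal sketch of this framing, assuming the TF 1.x API; the names lengths and maxlen are illustrative. Instead of reading the mask off the zeros in the logits, it can be built directly from the per-example lengths with tf.sequence_mask:)

import tensorflow as tf  # TF 1.x API assumed

lengths = tf.constant([3, 2])               # valid length of each example in the batch
mask = tf.sequence_mask(lengths, maxlen=5)  # boolean [2, 5], True at real positions
weights = tf.to_float(mask)                 # 1.0 at real positions, 0.0 at padding

with tf.Session() as sess:
    print(sess.run(weights))
    # [[1. 1. 1. 0. 0.]
    #  [1. 1. 0. 0. 0.]]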

Answer

Masking the cross-entropy loss is a common operation, covered by the library. It actually handles the more general concept of weights: provide binary weights for masking.

import tensorflow as tf  # TF 1.x API

mask = tf.not_equal(logits, 0)  # True for real entries; the zeros are the masked positions, as in the OP
weights = tf.to_float(mask)     # convert to 0/1 weights: 1.0 keeps an entry, 0.0 masks it out
loss = tf.losses.softmax_cross_entropy(labels, logits, weights)
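One caveat (my own note, not part of the original answer): the weights argument of tf.losses.softmax_cross_entropy is applied to the per-example losses, so a per-position mask like the one above does not remove the padded classes from the softmax itself. A commonly used alternative technique is to push the masked logits to a very large negative value before calling the fused loss; a hedged sketch using the numbers from the question, assuming TF 1.x:

import tensorflow as tf  # TF 1.x API assumed

logits = tf.constant([[4.3, -0.5, -2.7, 0.0, 0.0],
                      [0.5,  2.3,  0.0, 0.0, 0.0]])
labels = tf.constant([[1.0, 0.0, 0.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0, 0.0, 0.0]])

valid = tf.not_equal(logits, 0.0)                  # True at real class positions
neg_fill = tf.fill(tf.shape(logits), -1e9)         # very negative filler for the padded positions
masked_logits = tf.where(valid, logits, neg_fill)  # padded classes get ~0 softmax probability

loss = tf.losses.softmax_cross_entropy(labels, masked_logits)

with tf.Session() as sess:
    print(sess.run(loss))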

Don't compute the softmax cross-entropy by actually computing the softmax of the output and then the cross-entropy; you lose the numerical precision and stability of doing both at once.
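To make that warning concrete, a small illustration of my own, assuming TF 1.x: the manual version below is the pattern to avoid, while the fused op computes the same quantity in a numerically stable way.

import tensorflow as tf  # TF 1.x API assumed

logits = tf.constant([[4.3, -0.5, -2.7, 0.0, 0.0]])
labels = tf.constant([[1.0, 0.0, 0.0, 0.0, 0.0]])

# Pattern to avoid: an explicit softmax followed by log can under/overflow for extreme logits.
manual = -tf.reduce_sum(labels * tf.log(tf.nn.softmax(logits)), axis=-1)

# Preferred: the fused op computes log-softmax and cross-entropy together, stably.
fused = tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=logits)

with tf.Session() as sess:
    print(sess.run([manual, fused]))  # nearly identical values here; the fused op stays stable in general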
