How to modify the seq2seq cost function for padded vectors?


Problem Description

TensorFlow supports dynamic-length sequences via the sequence_length parameter when constructing an RNN layer: for a given sequence, the model does not learn beyond step sequence_length and instead returns zero vectors for the remaining steps.
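
For context, a minimal sketch of that masking behavior (using the TF 1.x-era API; the shapes and names here are illustrative, not from the question):

import tensorflow as tf

# Hypothetical shapes for illustration.
batch_size, max_steps, input_dim, hidden_size = 32, 20, 50, 128
inputs = tf.placeholder(tf.float32, [batch_size, max_steps, input_dim])
lengths = tf.placeholder(tf.int32, [batch_size])  # true length of each sequence

cell = tf.nn.rnn_cell.BasicLSTMCell(hidden_size)
# Steps beyond lengths[i] emit zero vectors and propagate no gradient.
outputs, state = tf.nn.dynamic_rnn(cell, inputs, sequence_length=lengths,
                                   dtype=tf.float32)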

However, how can the cost function at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/seq2seq.py#L890 be modified to account for the masked sequences, so that cost and perplexity are computed only over the actual sequences rather than the whole padded sequences?

def sequence_loss_by_example(logits, targets, weights,
                             average_across_timesteps=True,
                             softmax_loss_function=None, name=None):
  if len(targets) != len(logits) or len(weights) != len(logits):
    raise ValueError("Lengths of logits, weights, and targets must be the same "
                     "%d, %d, %d." % (len(logits), len(weights), len(targets)))
  with ops.op_scope(logits + targets + weights, name,
                    "sequence_loss_by_example"):
    log_perp_list = []
    for logit, target, weight in zip(logits, targets, weights):
      if softmax_loss_function is None:
        # TODO(irving,ebrevdo): This reshape is needed because
        # sequence_loss_by_example is called with scalars sometimes, which
        # violates our general scalar strictness policy.
        target = array_ops.reshape(target, [-1])
        crossent = nn_ops.sparse_softmax_cross_entropy_with_logits(
            logit, target)
      else:
        crossent = softmax_loss_function(logit, target)
      log_perp_list.append(crossent * weight)
    log_perps = math_ops.add_n(log_perp_list)
    if average_across_timesteps:
      total_size = math_ops.add_n(weights)
      total_size += 1e-12  # Just to avoid division by 0 for all-0 weights.
      log_perps /= total_size
  return log_perps
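
(For reference, the docstring of this function in the same file specifies the expected inputs: logits is a list of 2D Tensors of shape [batch_size x num_decoder_symbols], while targets and weights are lists of the same length as logits containing 1D batch-sized Tensors; targets hold int32 label ids and weights are floats.)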

Solution

This function already supports calculating costs for dynamic sequence lengths through the use of weights. As long as you ensure the weights are 0 for the "padding targets", the cross entropy will be pushed to 0 for those steps:

log_perp_list.append(crossent * weight)

and the total size will also reflect only the non-padding steps:

total_size = math_ops.add_n(weights)
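
For instance, here is a hypothetical sketch of a batch of two sequences padded to three steps, where the second sequence has true length one (the module path varies by TF version, e.g. tf.nn.seq2seq in 0.x or tf.contrib.legacy_seq2seq in 1.x):

import tensorflow as tf

num_steps, batch_size, vocab_size = 3, 2, 5
# Time-major lists, as the function expects: one entry per step.
logits = [tf.random_normal([batch_size, vocab_size]) for _ in range(num_steps)]
targets = [tf.constant([3, 1]),   # step 0: both sequences are real
           tf.constant([2, 0]),   # step 1: second sequence is padding
           tf.constant([4, 0])]   # step 2: second sequence is padding
weights = [tf.constant([1.0, 1.0]),
           tf.constant([1.0, 0.0]),
           tf.constant([1.0, 0.0])]
# Returns one loss value per batch element, averaged over real steps only.
loss = tf.nn.seq2seq.sequence_loss_by_example(logits, targets, weights)

With these weights, the second example's padded steps contribute crossent * 0 = 0 to the sum, and its total_size is 1 + 0 + 0 + 1e-12, so its loss is averaged over its single real step.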

If you're padding with zeros, one way to derive the weights is as follows:

weights = tf.sign(tf.abs(model.targets))

(Note that targets are typically integer-typed, so you might need to cast the result to a float type before using it as weights.)
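
Concretely, a sketch assuming integer targets where token id 0 is reserved for padding (the cast here is illustrative):

targets = tf.constant([[3, 1, 4],
                       [2, 0, 0]])  # second row padded with zeros
weights = tf.cast(tf.sign(tf.abs(targets)), tf.float32)
# weights is now [[1., 1., 1.], [1., 0., 0.]]

Note this only works if id 0 never appears as a real token; otherwise you would need to derive the weights from the true sequence lengths instead.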
