How to modify the seq2seq cost function for padded vectors?
TensorFlow supports dynamic-length sequences through the `sequence_length` parameter passed when constructing an RNN layer: beyond `sequence_length` steps, the model stops learning from the sequence and simply returns zero vectors.

However, how can the cost function at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/seq2seq.py#L890 be modified to account for the masked sequences, so that cost and perplexity are calculated only over the actual sequences rather than the whole padded sequence?
def sequence_loss_by_example(logits, targets, weights,
                             average_across_timesteps=True,
                             softmax_loss_function=None, name=None):
  if len(targets) != len(logits) or len(weights) != len(logits):
    raise ValueError("Lengths of logits, weights, and targets must be the same "
                     "%d, %d, %d." % (len(logits), len(weights), len(targets)))
  with ops.op_scope(logits + targets + weights, name,
                    "sequence_loss_by_example"):
    log_perp_list = []
    for logit, target, weight in zip(logits, targets, weights):
      if softmax_loss_function is None:
        # TODO(irving,ebrevdo): This reshape is needed because
        # sequence_loss_by_example is called with scalars sometimes, which
        # violates our general scalar strictness policy.
        target = array_ops.reshape(target, [-1])
        crossent = nn_ops.sparse_softmax_cross_entropy_with_logits(
            logit, target)
      else:
        crossent = softmax_loss_function(logit, target)
      log_perp_list.append(crossent * weight)
    log_perps = math_ops.add_n(log_perp_list)
    if average_across_timesteps:
      total_size = math_ops.add_n(weights)
      total_size += 1e-12  # Just to avoid division by 0 for all-0 weights.
      log_perps /= total_size
    return log_perps
This function already supports calculating costs for dynamic sequence lengths through the use of weights. As long as you ensure the weights are 0 for the "padding targets", the cross entropy will be pushed to 0 for those steps:
log_perp_list.append(crossent * weight)
and the total size will also reflect only the non-padding steps:
total_size = math_ops.add_n(weights)
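To make the masking arithmetic concrete, here is a minimal NumPy sketch of what those two lines do for a single example (the cross-entropy values below are made up for illustration; the real function operates on TensorFlow tensors across a batch):

```python
import numpy as np

# Per-timestep cross-entropy for one example: the first two steps are
# real tokens, the last two come from padding.
crossent = np.array([1.2, 0.8, 0.5, 0.5])
weights = np.array([1.0, 1.0, 0.0, 0.0])  # 0 for padded steps

masked = crossent * weights          # padding contributes 0 to the loss
total_size = weights.sum() + 1e-12   # counts only the real steps
loss = masked.sum() / total_size     # averages over real steps: (1.2 + 0.8) / 2
```

Note that without the weights, the same average would be taken over all four steps and the padded positions would drag the loss (and hence the perplexity) toward whatever the model predicts for padding.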
If you're padding with zeros, one way to derive the weights is as follows:
weights = tf.sign(tf.abs(model.targets))
(Note that you might need to cast this to the same type as your targets)